summaryrefslogtreecommitdiffstats
path: root/src/mesa
Commit message (Collapse)AuthorAgeFilesLines
* i965: Add layer_count to intel_renderbufferChris Forbes2014-04-102-0/+13
| | | | | | | | | | This is the effective layer count, for clears etc. This differs from the depth of the miptree level when views are involved. Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Acked-by: Eric Anholt <[email protected]>
* i965: Pull out layer_multiplier in intel_update_renderbuffer_wrapperChris Forbes2014-04-101-2/+5
| | | | | | | | | We're about to need this in another place. Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Acked-by: Eric Anholt <[email protected]>
* i965: Add `layered` parameter to intel_update_renderbuffer_wrapperChris Forbes2014-04-101-3/+4
| | | | | | | | | | We're about to need this so we can determine the layer count of the wrapper. Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Acked-by: Eric Anholt <[email protected]>
* i965: Adjust renderbuffer wrapper to account for MinLevel/MinLayerChris Forbes2014-04-101-0/+4
| | | | | | | Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Acked-by: Eric Anholt <[email protected]>
* i965: Enable texture upload fast path with MinLevelChris Forbes2014-04-101-1/+1
| | | | | | | | | | We'll still avoid MinLayer here since the fast path doesn't understand arrays at all, but it's straightforward to do levels. Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Acked-by: Eric Anholt <[email protected]>
* i965: Account for MinLevel in texture upload fast pathChris Forbes2014-04-101-2/+4
| | | | | | | Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Acked-by: Eric Anholt <[email protected]>
* i965: Adjust map/unmap code for MinLevel/MinLayerChris Forbes2014-04-101-3/+8
| | | | | | | | | | This allows core mesa's TexSubImage paths etc to work correctly with views which have nonzero MinLevel or MinLayer. Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Acked-by: Eric Anholt <[email protected]>
* i965: Don't try to use fast upload path for nontrivial viewsChris Forbes2014-04-101-0/+4
| | | | | | | | | | This will eventually be relaxed, but we'll get the fallback path working first. Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Acked-by: Eric Anholt <[email protected]>
* i965: Adjust surface_state emission to account for view parametersChris Forbes2014-04-101-5/+14
| | | | | | | | | V4: Comment style, remove magic shift. Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Acked-by: Eric Anholt <[email protected]>
* i965: Add _Format to intel_texobj.Chris Forbes2014-04-103-0/+16
| | | | | | | | | | | This is the actual mesa_format to use. In non-view cases this is always the same as the mt's format. V4: Comment style Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Acked-by: Eric Anholt <[email protected]>
* i965: Add driver hook for TextureViewChris Forbes2014-04-101-0/+41
| | | | | | | | | | | We need to wire the original texture's mt into the view. All the hard work of setting up an appropriate tree of gl_texture_image structures has already been done by core mesa. Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Acked-by: Eric Anholt <[email protected]>
* i965: Ensure that texture validation is skipped for immutable textures.Chris Forbes2014-04-101-0/+5
| | | | | | | | | | | If we were to relayout the miptree, we'd break any views that are sharing it. (Simplified based on suggestions from Eric) Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Acked-by: Eric Anholt <[email protected]>
* i965: refactor format selection for unsupported ETC* formatsChris Forbes2014-04-102-34/+45
| | | | | | | | We will need to call this to munge view formats. Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Acked-by: Eric Anholt <[email protected]>
* i965: refactor format munging for separate stencilChris Forbes2014-04-102-8/+26
| | | | | | | | | We will need this for munging the view's format. Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Acked-by: Eric Anholt <[email protected]>
* i965: Include #slices in miptree debugChris Forbes2014-04-101-2/+2
| | | | | | | Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Acked-by: Eric Anholt <[email protected]>
* mesa: Adjust _MaxLevel computation to account for viewsChris Forbes2014-04-101-0/+7
| | | | | | | Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Acked-by: Eric Anholt <[email protected]>
* mesa: Prefer non-swizzled formats for most sized internalformatsChris Forbes2014-04-101-4/+18
| | | | | | | | | | | | | | | | These formats can be cast to others (with different component types or sizes) via ARB_texture_view or ARB_shader_image_load_store. We want them to be laid out consistently so that we can just reinterpret the memory with a different format. In V1, this was done conditionally on a 'prefer_no_swizzle' flag which was set in TexStorage/TextureView paths, but we need the same behavior for ARB_shader_image_load_store (which also works with images created via TexImage, so we don't want it to be conditional. Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Acked-by: Eric Anholt <[email protected]>
* i965: Render R8G8B8X8 as R8G8B8A8Chris Forbes2014-04-101-0/+3
| | | | | | | | | The sampler can handle R8G8B8X8 (and substitute 1.0 for the fourth component) but we can't use it as a render target. Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Acked-by: Eric Anholt <[email protected]>
* i965: Pretend we don't support BRW_SURFACEFORMAT_R16G16B16_FLOAT for textures.Chris Forbes2014-04-101-1/+1
| | | | | | | | | | | | | | | | | | | None of the other 3-component 16bpc formats are directly supported, so they get promoted to XRGB equivalents. *Not* promoting RGB16F the same way makes texture views much more fiddly -- we don't want to have to do crazy copying behind the scenes. (with my other master + my experimental ARB_texture_view support) fixes the piglit test: `spec/ARB_texture_view/view compare 48bit formats` No regressions in gpu.tests on Haswell. V4: Don't alter the formats table -- just don't match it to a mesa_format. [Kenneth] Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Acked-by: Eric Anholt <[email protected]>
* i965: Enable R10G10B10A2_UNORM formatChris Forbes2014-04-101-0/+1
| | | | | | | | | This is supported by all generations, and is required for memory layout consistency for texture_view. Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Acked-by: Eric Anholt <[email protected]>
* i965: Enable R8G8B8A8_UNORM_SRGB formatChris Forbes2014-04-101-0/+1
| | | | | | | | Now this is the preferred format for GL_SRGB8_ALPHA8. Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Acked-by: Eric Anholt <[email protected]>
* swrast: Add support for fetching from MESA_FORMAT_R10G10B10A2_UNORMChris Forbes2014-04-102-3/+16
| | | | | | | | | | V4: Fix rebase conflicts with Brian's renaming of the texfetch functions. Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Brian Paul <[email protected]> Acked-by: Eric Anholt <[email protected]>
* mesa: fix packing of float texels to GL_SHORT/GL_BYTEChris Forbes2014-04-101-58/+58
| | | | | | | | | | | | | | | | | Previously, we would unpack the texels to floats using *_TO_FLOAT_TEX, and then pack them into the desired format using FLOAT_TO_*. Unfortunately, this isn't quite the inverse operation, and so some texel values would end up off-by-one. This fixes the GL_RGB8_SNORM and GL_RGB16_SNORM subcases in piglit's arb_texture_view-format-consistency-get test on i965. The similar 1-, 2- and 4-component cases already worked because they took the memcpy path rather than repacking. Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Acked-by: Eric Anholt <[email protected]>
* Partially revert bba9c28 "configure: use LIB_EXT rather than hardcoded .so"Jon TURNEY2014-04-091-13/+13
| | | | | | | | | | | | | | | | | Filenames passed to dlopen() don't need to use the platform's default extension for shared libraries. Using the '.so' extension when dlopen()ing DRI drivers is hardcoded into mesa and the X server, so it should be hardcoded here in the Makefile as well. A similar fix is probably also needed for gallium DRI drivers. (Consider that if we were starting from scratch, perhaps we would use a custom extension like .dri instead) Cc: Emil Velikov <[email protected]> Signed-off-by: Jon TURNEY <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* i965: Stop advertising GL_MESA_ycbcr_texture.Kenneth Graunke2014-04-091-1/+0
| | | | | | | | | | | | | | | | | The "new" fragment shader backend has never supported the necessary color conversion code for this to work. We began using the new backend in Mesa 7.10 for GLSL (commit a81d423d93f22a948f3aa4bf73, October 2010), and for ARB_fragment_program in Mesa 9.1 (commit 97615b2d8c7c3cea6fd3a4, August 2012). I haven't heard any complaints, so I don't think anyone will miss this feature. I believe mplayer used it at one point, but these days defaults to other paths anyway. Cc: [email protected] Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* tnl: Merge _tnl_vbo_draw_prims() into _tnl_draw_prims().Iago Toral Quiroga2014-04-085-48/+27
| | | | | | | | | This should help prevent situations where we render without proper index bounds. For example: https://bugs.freedesktop.org/show_bug.cgi?id=59455 Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Remove unused sampler key fieldsTopi Pohjolainen2014-04-082-16/+0
| | | | | Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: move declaration before code in etc2_unpack_rgb8()Brian Paul2014-04-081-3/+3
| | | | To fix MSVC build since cb4ad1368551b.
* i965: Delete "fast color clear unsupported" performance warning.Kenneth Graunke2014-04-081-2/+0
| | | | | | | | | | | | | | | | | | | | | | | Applications frequently clear to colors other than 0.0 or 1.0, which prevents us from doing fast color clears. In that case, we issue this performance warning on basically every glClear call, resulting in so much spam that it's nearly impossible to see any other messages. Plus, I don't think it's useful. We aren't suggesting a better way to do what the application developers want---we're just telling them it would be faster to do something they don't want. Driver developers have no control over the clear color, so this message is totally useless to them. A better alternative to get this sort of information is to use INTEL_DEBUG=blorp, which tells you whether color clears were fast, simd16 repdata, or slow. v2: Rebase on has_color_component changes. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* mesa: add bounds checking to eliminate buffer overrunCourtney Goeltzenleuchter2014-04-081-24/+54
| | | | | | | | | | | | | | | | | Decompressing ETC2 textures was causing intermitent segfault by copying resulting 4x4 texel block to the destination texture regardless of the size of the destination texture. Issue found via application crash in GLBenchmark 3.0's Manhattan test. v2: add more detail comment. Compute limit outside inner loops. v3: add bugzilla reference v4: Correct cc syntax in commit log v5: really grab the right patch Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74988 Cc: "9.2 10.0 10.1" <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]> [v1, suggested v2-3]
* i965/vec4: fix record clearing in copy propagationChia-I Wu2014-04-081-5/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Given mov vgrf7, vgrf9.xyxz add vgrf9.xyz, vgrf4.xyzw, vgrf5.xyzw add vgrf10.x, vgrf6.xyzw, vgrf7.wwww the last instruction would be wrongly changed to add vgrf10.x, vgrf6.xyzw, vgrf9.zzzz during copy propagation. The issue is that when deciding if a record should be cleared, the old code checked for inst->dst.writemask & (1 << ch) instead of inst->dst.writemask & (1 << BRW_GET_SWZ(src->swizzle, ch)) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76749 Signed-off-by: Chia-I Wu <[email protected]> Cc: Jordan Justen <[email protected]> Cc: Matt Turner <[email protected]> Reviewed-by: Ian Romainck <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Cc: "10.1" <[email protected]>
* i965/vec4: Add a test for copy propagation behavior.Eric Anholt2014-04-083-0/+164
| | | | | | | I thought I was seeing a bug in the code while reviewing, but it's not there. Reviewed-by: Ian Romanick <[email protected]>
* i965/fs: Track whether we're doing dual source in a more obvious way.Eric Anholt2014-04-083-3/+5
| | | | | | I'm going to be turning dual_src_output into an array in a moment. Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Add a couple more global special regs to special[]Eric Anholt2014-04-081-0/+2
| | | | | | | Nothing bad came of this because they weren't used after visitor running, but leaving them in a bad state seems like a recipe for pain later. Suggested-by: Kenneth Graunke <[email protected]>
* i965/fs: Handle arrays of special regs more cleanly.Eric Anholt2014-04-081-14/+22
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Fix dump_instructions() on uniforms.Eric Anholt2014-04-081-2/+2
| | | | | | All of a vec4 uniform was being printed as "u0" Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Fix vgrf0 live interval when no interpolation was done.Eric Anholt2014-04-081-2/+4
| | | | | | | | | | When you've got a simple solid-color shader that doesn't generate pixel_x/y interpolation, we were deciding that the first vgrf was both the undefined pixel_x and pixel_y, and extending its live interval to avoid the stride problem. That tricked other optimization that tries to see if a particular instruction is the last use of a variable. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Drop pointless check for variable declarations in splitting.Eric Anholt2014-04-081-10/+5
| | | | | | | We're walking the whole instruction stream, so we know the declaration will be found. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Remove stale comment.Eric Anholt2014-04-081-1/+0
| | | | | | | | We stopped doing variable index lowering for uniforms in a64c1eb9b110f29b8abf803a8256306702629bdc, 5 months after the comment was added. Reviewed-by: Kenneth Graunke <[email protected]>
* i915: Fix build error.Iago Toral Quiroga2014-04-081-6/+0
| | | | | | | is_power_of_two() is now provided by mesa so its definition must be removed from the i915 driver code. Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Pass ctx->Const.NativeIntegers to do_common_optimization().Kenneth Graunke2014-04-084-4/+8
| | | | | | | | | | | The next few patches will introduce an optimization that only works when integers are not represented as floating point values. v2: Re-word-wrap a line, as requested by Ian Romanick. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/fs: Skip emitting MACH/MOV for small integers.Kenneth Graunke2014-04-081-12/+21
| | | | | | | | | | | | | | | | | The vector backend already implemented this optimization, but surprisingly, we never bothered to implement it in the scalar backend. In addition to saving two instructions, this eliminates a use of the accumulator as an explicit source, which is unsupported in SIMD16 mode on Gen7+, which could help us gain SIMD16 programs. Cuts 19.23% of the instructions in dolphin/efb2ram.shader_test. v2: Rebase on is_16bit_integer_constant -> is_uint16_constant rename. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* glsl: Make is_16bit_constant from i965 an ir_constant method.Kenneth Graunke2014-04-081-16/+2
| | | | | | | | | | | | | | | | | | | | | | The i965 MUL instruction doesn't natively support 32-bit by 32-bit integer multiplication; additional instructions (MACH/MOV) are required. However, we can avoid those if we know one of the operands can be represented in 16 bits or less. The vector backend's is_16bit_constant static helper function checks for this. We want to be able to use it in the scalar backend as well, which means moving the function to a more generally-usable location. Since it isn't i965 specific, I decided to make it an ir_constant method, in case it ends up being useful to other people as well. v2: Rename from is_16bit_integer_constant to is_uint16_constant, as suggested by Ilia Mirkin. Update comments to clarify that it does apply to both int and uint types, as long as the value is non-negative and fits in 16-bits. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* mesa: Move is_power_of_two() function from brw_context.h to macros.h.Kenneth Graunke2014-04-082-6/+11
| | | | | | | | | This makes the function available from core Mesa code, including the GLSL compiler. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Fix "SIMD16 unsupported" messages via KHR_debug.Kenneth Graunke2014-04-081-1/+1
| | | | | | | | | | Performance warnings are logged via KHR_debug in addition to when the INTEL_DEBUG=perf environment variable is set. Without this, messages in debug contexts would have "(null)" for the reason. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Fix missing dirty bits in the gen8_sbe_state atom.Kenneth Graunke2014-04-071-2/+2
| | | | | | | | | | These are clearly needed---the comments in the function are even present for each one of them. I originally had two separate state atoms for 3DSTATE_SBE and 3DSTATE_SBE_SWIZ. When I combined the functions, I must have forgotten to add the atoms for 3DSTATE_SBE_SWIZ. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Drop BRW_NEW_RASTERIZER_DISCARD flag from Broadwell SOL atom.Kenneth Graunke2014-04-071-1/+0
| | | | | | | | Nothing actually uses this---we handle rasterizer discard in the clipper in order for statistics counters to work. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Use the correct program when uploading Broadwell SOL state.Kenneth Graunke2014-04-071-6/+2
| | | | | | | This is the equivalent of commit 43e77215b13b2f86e461cd8a62b542f. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965/fs: Remove left-over 'removed' variable.Matt Turner2014-04-071-13/+8
| | | | | | | | I think this was used for coalescing out partly dead large virtual registers, but the patch that enabled that caused regressions and didn't make it upstream. Reviewed-by: Eric Anholt <[email protected]>
* i965/fs: Check for interference after finding all channels.Matt Turner2014-04-071-11/+26
| | | | | | | | | | | | It's more likely that we won't find writes to all channels than one will interfere, and calculating interference is more expensive. This change will also help prepare for coalescing load_payload instructions' operands. Also update the live intervals for all channels, and not just the last that we saw. Reviewed-by: Eric Anholt <[email protected]>