summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* i965: Drop pointless cast of texObj to intelObj.Eric Anholt2014-05-011-3/+1
| | | | | | Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965: Move intel_region_get_aligned_offset() to be a miptree function.Eric Anholt2014-05-017-72/+69
| | | | | | | | | | All the consumers are doing it on a miptree. v2: fix a silly duplicated dereference (review by Ken) Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]> (v1) Reviewed-by: Chad Versace <[email protected]> (v1)
* i965: Move intel_region_get_tile_masks() to be a miptree function.Eric Anholt2014-05-016-48/+47
| | | | | | | | All the consumers are doing it on a miptree. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965: Fix another broken offset-aligned-to-tile test.Eric Anholt2014-05-011-3/+1
| | | | | Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965: Fix offset-aligned-to-tile test in dma_buf import.Eric Anholt2014-05-011-3/+1
| | | | | | | v1 of the patch got pushed, insted of the v2 that I had reviewed. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965: Reuse intel_miptree_get_tile_offsets().Eric Anholt2014-05-011-12/+3
| | | | | | Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* mesa: move declarations before code in texstore.cBrian Paul2014-05-011-8/+7
| | | | | | To fix MSVC build. Reviewed-by: Anuj Phogat <[email protected]>
* i965: Fix format of private renderbuffersVille Syrjälä2014-05-011-19/+33
| | | | | | | | | | | | | | | | | | | | intel_alloc_renderbuffer_storage() will clobber rb->Format which was already set up by intel_create_renderbuffer(). This causes the driver to potentially create the depth buffer in the wrong format. In practice this makes the depth buffer Z24 even if the visual has depthBits==16. The incorrect depth buffer format doesn't seem to cause any actual problems in i965, but it seems like we should fix it anyway. I see Z16 has been more or less deprecated in the driver except the for the depthBits==16 case. But if we want to use Z24 even in that case (not sure it's really legal?) it would look better if the code made that decision explicitly rather than relying on the format to get magically overwritten by the renderbuffer code. Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Ville Syrjälä <[email protected]>
* i915: Don't advertise Z formats in TextureFormatSupported on gen2Ville Syrjälä2014-05-011-13/+15
| | | | | | | Gen2 doesn't support texturing from Z formats, so state as much. Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Ville Syrjälä <[email protected]>
* i915: Fix format of private renderbuffersVille Syrjälä2014-05-011-16/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | intel_alloc_renderbuffer_storage() will clobber rb->Format which was already set up by intel_create_renderbuffer(). This causes the driver to potentially create the depth buffer in the wrong format. Long time ago things worked by accident because _mesa_choose_tex_format() checked for ARB_depth_texture and thus returned MESA_FORMAT_NONE on gen2 hardware. Somehow that ended up working when depthBits==16 because the driver would then pick DEPTH_FRMT_16_FIXED. Not sure how, but things also seemed to work with depthBits==24. Things started to go more sideways at: commit 6ae473221a53d8bcb584021483c5328797c6b67c Author: Eric Anholt <[email protected]> Date: Mon Apr 22 16:04:25 2013 -0700 intel: Fold the one last function intel_tex_format.c into the caller. since that caused intel_miptree_create_layout() to divide by zero when encoutering MESA_FORMAT_NONE (bw==0). So after this commit things were broken enough that many applications wouldn't even run. Things got a bit better at: commit c245efe7e8247ba0c845dee7b77e63fdbfc7e1b3 Author: Eric Anholt <[email protected]> Date: Thu Mar 21 09:50:45 2013 -0700 mesa: Remove extension checking from ChooseTexFormat. since now _mesa_choose_tex_format() would return MESA_FORMAT_X8_Z24 for GL_DEPTH_COMPONENT due to i915 erroneosly claiming that MESA_FORMAT_X8_S24 (and others) are supported texture formats even on gen2 hardware. So now the the div-by-zero was gone, but now the driver would pick DEPTH_FRMT_24_FIXED_8_OTHER even when depthBits==16 which caused rendering problems. If we prevent rb->Format from getting clobbered for the depth buffer things work much better. This makes the spinning title text visible again in chromium-bsu at 16bpp, for example. Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Ville Syrjälä <[email protected]>
* mesa: Allow FLOAT_32_UNSIGNED_INT_24_8_REV in get_tex_depth_stencil()Anuj Phogat2014-05-011-2/+2
| | | | | | | | Fixes a crash in Khronos OpenGL CTS packed_pixels tests. Cc: <[email protected]> Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* mesa: Add support to unpack depth-stencil texture in to ↵Anuj Phogat2014-05-012-1/+87
| | | | | | | | | | | | | | | | FLOAT_32_UNSIGNED_INT_24_8_REV V2: Follow the new naming convention for unpack functions. Use double precision for converting Z24 to a float. V3: Unpack stencil value to most significant byte. Use 'struct z32f_x24s8' type. V4: Unpack stencil value to least significant byte. Add a comment to clarify stencil packing. Cc: <[email protected]> Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* mesa: Add new helper function _mesa_unpack_depth_stencil_row()Anuj Phogat2014-05-013-6/+32
| | | | | | | | | | This patch makes non-functional changes in the code. New helper function added here will make it easier to support more data types in the following patches. Cc: <[email protected]> Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* mesa: Remove redundant if checks in _mesa_texstore_xx_xx() functionsAnuj Phogat2014-05-011-89/+80
| | | | | | | | | This patch contains non-functional changes. Assertion checks made earlier in the functions make the if checks redundant. So, remove the if checks and unindent the code in if block. Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* mesa: Allow srcFormat=GL_DEPTH_STENCIL in _mesa_texstore_xx_xx() functionsAnuj Phogat2014-05-011-2/+4
| | | | | | | | | _mesa_texstore_z24_s8() and _mesa_texstore_z32f_x24s8() are capable of handling GL_DEPTH_STENCIL format. So, allow it in both the functions. Cc: <[email protected]> Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* mesa: Add missing types in _mesa_texstore_xx_xx() functionsAnuj Phogat2014-05-011-2/+6
| | | | | | | | | | | Depth-stencil teture targets are allowed to use source data of type GL_UNSIGNED_INT_24_8_EXT and GL_FLOAT_32_UNSIGNED_INT_24_8_REV. Fixes few crashes in Khronos OpenGL CTS packed_pixels tests. Cc: <[email protected]> Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* i965: Fix crash in do_blit_readpixels()Anuj Phogat2014-05-011-0/+7
| | | | | | | | Fixes a crash in Khronos CTS packed_pixels tests. Cc: <[email protected]> Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* mesa: Add error condition for format=STENCIL_INDEX in glGetTexImage()Anuj Phogat2014-05-011-0/+5
| | | | | | | | | | From OpenGL 4.0 spec, page 306: "Calling GetTexImage with a format of STENCIL_INDEX causes the error INVALID_ENUM." Cc: <[email protected]> Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* mesa: Add entry for extension ARB_texture_stencil8Anuj Phogat2014-05-011-0/+1
| | | | | | | | V2: Alphabetize the new entry Cc: <[email protected]> Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* glsl: Apply the link error conditions to GL_ARB_fragment_coord_conventionsAnuj Phogat2014-05-013-1/+14
| | | | | | | | | | | | | | | Link error conditions added in previous patch are equally applicable to GL_ARB_fragment_coord_conventions implementation. Extension's spec says: "If gl_FragCoord is redeclared in any fragment shader in a program, it must be redeclared in all the fragment shaders in that program that have a static use of gl_FragCoord. All redeclarations of gl_FragCoord in all fragment shaders in a single program must have the same set of qualifiers." Signed-off-by: Anuj Phogat <[email protected]> Cc: <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* glsl: Link error if fs defines conflicting qualifiers for gl_FragCoordAnuj Phogat2014-05-015-0/+112
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | GLSL 1.50 spec says: "If gl_FragCoord is redeclared in any fragment shader in a program, it must be redeclared in all the fragment shaders in that program that have a static use gl_FragCoord. All redeclarations of gl_FragCoord in all fragment shaders in a single program must have the same set of qualifiers." This patch causes the shader link to fail if we have multiple fragment shaders with conflicting layout qualifiers for gl_FragCoord. V2: Restructure the code and add conditions to correctly handle the following case: fragment shader 1: layout(origin_upper_left) in vec4 gl_FragCoord; void main() { foo(); gl_FragColor = gl_FragData; } fragment shader 2: layout(pixel_center_integer) in vec4 gl_FragCoord; void foo() { } V3: Allow linking in the following case: fragment shader 1: void main() { foo(); gl_FragColor = gl_FragCoord; } fragment shader 2: in vec4 gl_FragCoord; void foo() { ... } Signed-off-by: Anuj Phogat <[email protected]> Cc: <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* glsl: Compile error if fs uses gl_FragCoord before first redeclarationAnuj Phogat2014-05-011-0/+17
| | | | | | | | | | | | | | | | | | | Section 4.3.8.1, page 39 of GLSL 1.50 spec says: "Within any shader, the first redeclarations of gl_FragCoord must appear before any use of gl_FragCoord." GLSL compiler should generate an error in following case: vec4 p = gl_FragCoord; layout(origin_upper_left) in vec4 gl_FragCoord; void main() { } Signed-off-by: Anuj Phogat <[email protected]> Cc: <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* glsl: Compile error if fs defines conflicting qualifiers for gl_FragCoordAnuj Phogat2014-05-012-0/+70
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | GLSL 1.50 spec says: "If gl_FragCoord is redeclared in any fragment shader in a program, it must be redeclared in all the fragment shaders in that program that have a static use gl_FragCoord. All redeclarations of gl_FragCoord in all fragment shaders in a single program must have the same set of qualifiers." This patch makes the glsl compiler to generate an error if we have a fragment shader defined with conflicting layout qualifier declarations for gl_FragCoord. For example: layout(origin_upper_left, pixel_center_integer) in vec4 gl_FragCoord; layout(pixel_center_integer) in vec4 gl_FragCoord; void main() { } V2: Some code refactoring for better readability. Add compiler error conditions for redeclarations like: layout(origin_upper_left) in vec4 gl_FragCoord; layout(origin_upper_left, pixel_center_integer) in vec4 gl_FragCoord; and in vec4 gl_FragCoord; layout(origin_upper_left, pixel_center_integer) in vec4 gl_FragCoord; V3: Simplify function is_conflicting_fragcoord_redeclaration() V4: Check for null pointer before doing strcmp(var->name, "gl_FragCoord"). Signed-off-by: Anuj Phogat <[email protected]> Cc: <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* mesa: Use location VERT_ATTRIB_GENERIC0 for vertex attribute 0Anuj Phogat2014-05-013-29/+40
| | | | | | | | | | | | | | | | | | | In OpenGL 3.1 attribute 0 becomes non-magic, just like in OpenGL ES 2.0. Earlier versions of OpenGL used attribute 0 exclusively for vertex position. V2: Add a utility function _mesa_attr_zero_aliases_vertex() in varray.h Fixes 4 Khronos OpenGL CTS failures: glGetVertexAttrib depth24_basic depth24_precision rgb8_rgba8_rgb Cc: <[email protected]> Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* mesa: Fix querying location of nth element of an array variableAnuj Phogat2014-05-011-5/+66
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch makes changes to the behavior of glGetAttribLocation(), glGetFragDataLocation() and glGetFragDataIndex() functions. Code changes handle a case described in following example: shader program: layout(location = 1)in vec4[4] a; void main() { } Currently, glGetAttribLocation("a") returns 1. glGetAttribLocation("a[i]"), where i = {0, 1, 2, 3}, returns -1. But the expected locations for array elements are: 1, 2, 3 and 4 respectively. This clarification came up with the addition of ARB_program_interface_query to OpenGL 4.3. From Page 326 (page 347 of the PDF) of OpenGL 4.3 spec: "Otherwise, the command is equivalent to GetProgramResourceLocation(program, PROGRAM_INPUT, name);" And, From Page 101 (page 122 of the PDF) of OpenGL 4.3 spec: "A string provided to GetProgramResourceLocation or GetProgramResourceLocationIndex is considered to match an active variable if • the string exactly matches the name of the active variable; • if the string identifies the base name of an active array, where the string would exactly match the name of the variable if the suffix "[0]" were appended to the string; or • if the string identifies an active element of the array, where the string ends with the concatenation of the "[" character, an integer (with no "+" sign, extra leading zeroes, or whitespace) identifying an array element, and the "]" character, the integer is less than the number of active elements of the array variable, and where the string would exactly match the enumerated name of the array if the decimal integer were replaced with zero." V2: Simplify get_matching_index() function. Add relevant text from OpenGL spec in commit message. Fixes failures in Khronos OpenGL CTS tests: explicit_attrib_location_room draw_instanced_max_vertex_attribs Proprietary linux drivers of NVIDIA (331.49) matches the behavior expected by OpenGL 4.3 spec. Cc: <[email protected]> Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* glsl: Allow overlapping locations for vertex input attributesAnuj Phogat2014-05-011-15/+72
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently overlapping locations of input variables are not allowed for all the shader types in OpenGL and OpenGL ES. From OpenGL ES 3.0 spec, page 56: "Binding more than one attribute name to the same location is referred to as aliasing, and is not permitted in OpenGL ES Shading Language 3.00 vertex shaders. LinkProgram will fail when this condition exists. However, aliasing is possible in OpenGL ES Shading Language 1.00 vertex shaders." Taking in to account what different versions of OpenGL and OpenGL ES specs say about aliasing: - It is allowed only on vertex shader input attributes in OpenGL (2.0 and above) and OpenGL ES 2.0. - It is explictly disallowed in OpenGL ES 3.0. Fixes Khronos CTS failing test: explicit_attrib_location_vertex_input_aliased.test See more details about this at below mentioned khronos bug. V2: Fix the case where location exceeds the maximum allowed attribute location. V3: Simplify the condition added in V2. Signed-off-by: Anuj Phogat <[email protected]> Cc: "9.2 10.0 10.1" <[email protected]> Bugzilla: Khronos #9609 Reviewed-by: Ian Romanick <[email protected]>
* glx/drisw: fix memory leak when destroying screen.Roland Scheidegger2014-05-011-0/+1
| | | | Reviewed-by: Brian Paul <[email protected]>
* gallivm: fix 2 leaks in disassembly codeRoland Scheidegger2014-05-011-4/+5
| | | | | | | | | | | | | | | | | | | | | | | | don't leak the MCSubtargetInfo (not really big, was already fixed with llvm master) and TargetMachine (big). While this is only used for debugging the leak is large enough to get you into trouble in some cases. Tested with llvm 3.1 and master. Before (llvm 3.1), GALLIVM_DEBUG=asm glxgears: ==14152== LEAK SUMMARY: ==14152== definitely lost: 105,228 bytes in 20 blocks ==14152== indirectly lost: 347,252 bytes in 261 blocks ==14152== possibly lost: 866,625 bytes in 1,453 blocks ==14152== still reachable: 7,344,677 bytes in 6,494 blocks ==14152== suppressed: 0 bytes in 0 blocks After: ==13799== LEAK SUMMARY: ==13799== definitely lost: 3,108 bytes in 6 blocks ==13799== indirectly lost: 0 bytes in 0 blocks ==13799== possibly lost: 804,143 bytes in 1,429 blocks ==13799== still reachable: 7,314,267 bytes in 6,473 blocks ==13799== suppressed: 0 bytes in 0 blocks Reviewed-by: Brian Paul <[email protected]>
* mesa: Move declaration to top of block.José Fonseca2014-05-011-1/+2
| | | | To fix MSVC build. Trivial.
* osmesa: Fix typo in _MaxEnabledTexImageUnit.José Fonseca2014-05-011-1/+1
|
* i965/vec4: Port untyped atomic message support to Broadwell.Kenneth Graunke2014-05-012-1/+38
| | | | | | Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77221 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965/vec4: Port untyped surface reads support to Broadwell.Kenneth Graunke2014-05-012-1/+28
| | | | | | Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77221 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965/fs: Port untyped atomic message support to Broadwell.Kenneth Graunke2014-05-012-1/+38
| | | | | | | | v2: Fix SIMD mode comment (caught by Eric Anholt). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77221 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965/fs: Port untyped surface read support to Broadwell.Kenneth Graunke2014-05-012-1/+30
| | | | | | | | | v2: Drop unused num_components variable; fix SIMD Mode comment (caught by Eric Anholt). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77221 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965/fs: Set fs_inst::header_present for untyped atomics/surface reads.Kenneth Graunke2014-05-011-0/+2
| | | | | | | | | | | The brw_eu_emit.c code manually forces the header present bit when used in align1 (scalar) mode. So, this has no effect currently. However, it is nice to have fs_inst::header_present reflect reality. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77221 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Disassemble atomic operations and other DP:DC1 stuff on Broadwell.Kenneth Graunke2014-05-011-0/+65
| | | | | | | | This is similar to what Eric did for Gen7 a little while ago; it also has support for untyped surface reads. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Implement the create_raw_surface() hook on Broadwell.Kenneth Graunke2014-05-011-0/+17
| | | | | | | | Otherwise we crash when setting up atomic buffer objects. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77221 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Drop mark_surface_used from gen8 generators.Kenneth Graunke2014-05-014-28/+6
| | | | | | | | Francisco made brw_mark_surface_used a freestanding function in commit a32817f3c248125fb537c3a915566445e5600d45. We should use it. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965/fs: Add support for fs_inst::force_writemask_all on Broadwell.Kenneth Graunke2014-05-011-0/+1
| | | | | | | | This must not have existed when I wrote the original code. The atomic operation header setup code uses this. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Actually emit PIPELINE_SELECT and 3DSTATE_VF_STATISTICS.Kenneth Graunke2014-05-013-15/+8
| | | | | | | | | | | | | | | | | | | | | | | | | For platforms using hardware contexts (currently Gen6+), we failed to emit PIPELINE_SELECT and 3DSTATE_VF_STATISTICS, instead emitting MI_NOOP for both. During one of the context initialization reordering patches, we accidentally moved brw_init_state before we set brw->CMD_PIPELINE_SELECT and brw->CMD_VF_STATISTICS. So, when brw_init_state uploaded initial GPU state (brw_init_state -> brw_upload_initial_gpu_state -> brw_upload_invariant_state), these would be 0 (MI_NOOP). Storing the commands in the context is not worthwhile. We have many generation checks in our state upload code, and for platforms with hardware contexts, this only gets called once per GL context anyway. The cost is negligable, and it's easy to botch context creation ordering. This may fix hangs on Gen6+ when using the media pipeline. Cc: "10.0 10.1" <[email protected]> Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Ben Widawsky <[email protected]>
* i965: Don't enable reset notification support on Gen4-5.Kenneth Graunke2014-04-301-4/+9
| | | | | | | | | | | | | arekm reported that using Chrome with GPU acceleration enabled on GM45 triggered the hw_ctx != NULL assertion in brw_get_graphics_reset_status. We definitely do not want to advertise reset notification support on Gen4-5 systems, since it needs hardware contexts, and we never even request a hardware context on those systems. Cc: "10.1" <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75723 Signed-off-by: Kenneth Graunke <[email protected]>
* doc: Add pointer to the Mesa Stable Queue page.Carl Worth2014-04-301-0/+5
| | | | Since this is now updated daily and looks to be useful.
* i965: Fix state flag comments on color_buffer_write_enabled() calls.Eric Anholt2014-04-303-1/+3
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Drop bogus state flag comment.Eric Anholt2014-04-301-1/+0
| | | | | | | This was introduced with the comment and code below it, though the code only touches prog_data (CACHE_NEW_WM_PROG). Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Track the number of samples in the drawbuffer.Eric Anholt2014-04-305-22/+29
| | | | | | | | | This keeps us from having to emit the nonpipelined state packet on every FBO binding. -4.42003% +/- 1.09961% effect on cairo-perf-trace runtime on glamor (n=110). Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: Track maximum CurrentTexUnit to reduce glDeleteTextures() overhead.Eric Anholt2014-04-303-1/+12
| | | | | | | | No more walking 96*6 pointers looking to see if they're the current texture, when we only use the first 2 out of 96 units. -6.26002% +/- 1.87817% effect on cairo runtime on no-fbo-cache glamor (n=36). Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: Rewrite shader-based texture image state updates.Eric Anholt2014-04-301-49/+73
| | | | | | | | | | | Instead of walking 6 shader stages for each of the 96 combined texture image units, now we just walk the samplers used in each shader stage. With cairo-perf-trace on Xephyr with glamor, I'm seeing a -6.50518% +/- 2.55601% effect on runtime (n=22) since the "drop _EnabledUnits" change. No significant performance difference on an apitrace of minecraft (n=442). Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: Split the shader texture update logic from fixed function.Eric Anholt2014-04-301-90/+160
| | | | | | | | | | | | I want to avoid walking the entire long array texture image units, but the obvious way to do so means walking program samplers, and thus hitting the units in a random order. This change replaces the previous behavior of only setting up the fallback texture for a fragment shader with setting up the fallback texture for any shader that's missing a complete texture of the right target in its unit. Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: Finish removing the _ReallyEnabled field.Eric Anholt2014-04-302-8/+2
| | | | | Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* radeon: Drop the remaining driver usage of _ReallyEnabled.Eric Anholt2014-04-308-29/+40
| | | | | | | | This is kind of ugly, but I think it's worth it to finish off the last consumers of _ReallyEnabled. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>