aboutsummaryrefslogtreecommitdiffstats
path: root/src/mesa
Commit message (Collapse)AuthorAgeFilesLines
* i965/fs: Refactor sampler message header to duplicate less code.Kenneth Graunke2014-01-221-25/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, the code to copy g0 to the message header existed in two places - one for the texture offset case, and one for any other case. By treating texture_offset as a special case of header_present, we can remove this duplication and shorten the code. Future patches which add new header fields also won't have to add additional duplication. This also clarifies a confusing construct. The old code contained: } else if (inst->header_present) { if (brw->gen >= 7) { ...explicit copy from g0 to the message header... } else { /* Set up an implied move from g0 to the MRF. */ } } This looks like it might set up an implied move on Sandybridge, which doesn't support those. However, Sandybridge only uses a message header for texture offsets, so it would never hit this code path. The new code avoids this implicit knowledge by only setting up an implied move on Gen4-5. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965: Use get_element_ud to shorten texture header access.Kenneth Graunke2014-01-221-2/+1
| | | | | | | | This is shorter, easier to read, and further from the 80 column limit. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* intel: Fix initial MakeCurrent for single-buffer drawablesKristian Høgsberg2014-01-221-6/+4
| | | | | | | | | | | | | | | Commit 05da4a7a5e7d5bd988cb31f94ed8e1f053d9ee39 attempts to eliminate the call to intel_update_renderbuffer() in the case where we already have a drawbuffer for the drawable. Unfortunately this only checks the back left renderbuffer, which breaks in case of single buffer drawables. This means that the initial viewport will not be set in that case. Instead, we now check whether the initial viewport has not been set, in which case we call out to intel_update_renderbuffer(). https://bugs.freedesktop.org/show_bug.cgi?id=73862 Signed-off-by: Kristian Høgsberg <[email protected]>
* meta: Move loop variable declaration outside loop.Vinson Lee2014-01-211-1/+3
| | | | | | | | | | | | | | | | Fixes MSVC build error introduced with commit 69b258cb4636315b4c1aaaceeedd1eed8af98ba8. meta.c(618) : error C2143: syntax error : missing ';' before 'type' meta.c(618) : error C2143: syntax error : missing ')' before 'type' meta.c(618) : error C2065: 'i' : undeclared identifier meta.c(618) : warning C4552: '<' : operator has no effect; expected operator with side-effect meta.c(618) : error C2059: syntax error : ')' meta.c(618) : error C2143: syntax error : missing ';' before '{' meta.c(619) : error C2065: 'i' : undeclared identifier meta.c(620) : error C2065: 'i' : undeclared identifier Signed-off-by: Vinson Lee <[email protected]>
* i965/blorp: use BRW_COMPRESSION_2NDHALF for second half LPRTopi Pohjolainen2014-01-222-28/+42
| | | | | | | | | No known bugs fixed but this is now in line with fs-generator. No regresssions on IVB. Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Paul Berry <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/blorp: patch jump counters also for endifTopi Pohjolainen2014-01-222-3/+5
| | | | | | | | | | | | | | | | | No known bugs fixed but this is now in line with fs-generator. No regresssions on IVB. Eric further explained that: "The endif jump, since it's forward, is just an optimization to have set right -- otherwise, the GPU will just step forward instruction by instruction until it hits something else that updates the per-channel PC." Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Paul Berry <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* mesa: Change redundant code into loops in texstate.c.Paul Berry2014-01-211-46/+29
| | | | | | | This is possible now that ctx->Shader.CurrentProgram is an array. Reviewed-by: Chris Forbes <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* mesa: Change redundant code into loops in shaderapi.c.Paul Berry2014-01-211-30/+9
| | | | | | | This is possible now that ctx->Shader.CurrentProgram is an array. Reviewed-by: Chris Forbes <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* mesa: Remove ad-hoc arrays of gl_shader_program.Paul Berry2014-01-213-15/+3
| | | | | | | | Now that we have a ctx->Shader.CurrentProgram array, we can just use it directly. Reviewed-by: Chris Forbes <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* meta: Replace save_state::{Vertex,Geometry,Fragment}Shader with an array.Paul Berry2014-01-211-16/+14
| | | | | | | | Since ctx->Shader.Current{Vertex,Geometry,Fragment}Program is an array, this allows some meta code to be rolled up into loops. Reviewed-by: Chris Forbes <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* i965: Fix comments to refer to the new ctx->Shader.CurrentProgram array.Paul Berry2014-01-213-6/+6
| | | | | Reviewed-by: Chris Forbes <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* mesa: Fold long lines introduced by the previous patch.Paul Berry2014-01-218-17/+32
| | | | | Reviewed-by: Chris Forbes <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* mesa: Replace ctx->Shader.Current{Vertex,Fragment,Geometry}Program with an ↵Paul Berry2014-01-2128-85/+84
| | | | | | | | | | | | | | | | | | | | | array. These are replaced with ctx->Shader.CurrentProgram[MESA_SHADER_{VERTEX,FRAGMENT,GEOMETRY}]. In patches to follow, this will allow us to replace a lot of ad-hoc logic with a variable index into the array. With the exception of the changes to mtypes.h, this patch was generated entirely by the command: find src -type f '(' -iname '*.c' -o -iname '*.cpp' ')' \ -print0 | xargs -0 sed -i \ -e 's/\.CurrentVertexProgram/.CurrentProgram[MESA_SHADER_VERTEX]/g' \ -e 's/\.CurrentGeometryProgram/.CurrentProgram[MESA_SHADER_GEOMETRY]/g' \ -e 's/\.CurrentFragmentProgram/.CurrentProgram[MESA_SHADER_FRAGMENT]/g' Reviewed-by: Chris Forbes <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* mesa: use _mesa_validate_shader_target() more frequently.Paul Berry2014-01-212-4/+4
| | | | | | | | | | | | | This patch replaces code in _mesa_new_shader() and delete_shader_cb() that checks the type of a shader with calls to _mesa_validate_shader_target(). This has two advantages: it allows for a more thorough check (since _mesa_validate_shader_target() doesn't permit shader targets that aren't supported by the back-end), and it reduces the amount of code that will need to be modified when adding new shader stages. Reviewed-by: Chris Forbes <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* main: Allow ctx == NULL in _mesa_validate_shader_target().Paul Berry2014-01-211-3/+10
| | | | | | | | | This will allow this function to be used in circumstances where there is no context available, such as when building built-in GLSL functions. Reviewed-by: Chris Forbes <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* mesa: Make validate_shader_target() non-static.Paul Berry2014-01-212-4/+7
| | | | | Reviewed-by: Chris Forbes <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* mesa: Replace _mesa_program_index_to_target with _mesa_shader_stage_to_program.Paul Berry2014-01-214-21/+4
| | | | | | | | | | | | | | | | In my recent zeal to refactor Mesa's handling of the gl_shader_stage enum, I accidentally wound up with two functions that do the same thing: _mesa_program_index_to_target(), and _mesa_shader_stage_to_program(). This patch keeps _mesa_shader_stage_to_program(), since its name is more consistent with other related functions. However, it changes the signature so that it accepts an unsigned integer instead of a gl_shader_stage--this avoids awkward casts when the function is called from C++ code. Reviewed-by: Chris Forbes <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* mesa: Generate GL_INVALID_OPERATION for unsupported DSA TexStorage functionsIan Romanick2014-01-211-3/+15
| | | | | | | | | We have to make the functions available to work around a GLEW bug (see comments already in the code), but if an application calls one of these functions we should still generate GL_INVALID_OPERATION. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* mesa: Silence many unused parameter warningsIan Romanick2014-01-211-0/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | main/texstorage.c: In function '_mesa_alloc_texture_storage': main/texstorage.c:240:53: warning: unused parameter 'width' [-Wunused-parameter] main/texstorage.c:241:37: warning: unused parameter 'height' [-Wunused-parameter] main/texstorage.c:241:53: warning: unused parameter 'depth' [-Wunused-parameter] main/texstorage.c: In function '_mesa_TextureStorage1DEXT': main/texstorage.c:464:34: warning: unused parameter 'texture' [-Wunused-parameter] main/texstorage.c:464:50: warning: unused parameter 'target' [-Wunused-parameter] main/texstorage.c:464:66: warning: unused parameter 'levels' [-Wunused-parameter] main/texstorage.c:465:34: warning: unused parameter 'internalformat' [-Wunused-parameter] main/texstorage.c:466:35: warning: unused parameter 'width' [-Wunused-parameter] main/texstorage.c: In function '_mesa_TextureStorage2DEXT': main/texstorage.c:473:34: warning: unused parameter 'texture' [-Wunused-parameter] main/texstorage.c:473:50: warning: unused parameter 'target' [-Wunused-parameter] main/texstorage.c:473:66: warning: unused parameter 'levels' [-Wunused-parameter] main/texstorage.c:474:34: warning: unused parameter 'internalformat' [-Wunused-parameter] main/texstorage.c:475:35: warning: unused parameter 'width' [-Wunused-parameter] main/texstorage.c:475:50: warning: unused parameter 'height' [-Wunused-parameter] main/texstorage.c: In function '_mesa_TextureStorage3DEXT': main/texstorage.c:483:34: warning: unused parameter 'texture' [-Wunused-parameter] main/texstorage.c:483:50: warning: unused parameter 'target' [-Wunused-parameter] main/texstorage.c:483:66: warning: unused parameter 'levels' [-Wunused-parameter] main/texstorage.c:484:34: warning: unused parameter 'internalformat' [-Wunused-parameter] main/texstorage.c:485:35: warning: unused parameter 'width' [-Wunused-parameter] main/texstorage.c:485:50: warning: unused parameter 'height' [-Wunused-parameter] main/texstorage.c:485:66: warning: unused parameter 'depth' [-Wunused-parameter] Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* i965: Ignore 'centroid' interpolation qualifier in case of persample shadingAnuj Phogat2014-01-211-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch handles the use of 'centroid' qualifier with 'in' variables in a fragment shader when persample shading is enabled. Per sample shading for the whole fragment shader can be enabled by: glEnable(GL_SAMPLE_SHADING) or using {gl_SamplePosition, gl_SampleID} builtin variables in fragment shader. Explaining it below in more detail. /* Enable sample shading using OpenGL API */ glEnable(GL_SAMPLE_SHADING); glMinSampleShading(1.0); Example fragment shader: in vec4 a; centroid in vec4 b; main() { ... } Variable 'a' will be interpolated at sample location. But, what interpolation should we use for variable 'b' ? ARB_sample_shading recommends interpolation at sample position for all the variables. GLSL 400 (and earlier) spec says that: "When an interpolation qualifier is used, it overrides settings established through the OpenGL API." But, this text got deleted in later versions of GLSL. NVIDIA's and AMD's proprietary linux drivers (at OpenGL 4.3) interpolates at sample position. This convinces me to use the similar approach on intel hardware. Cc: [email protected] Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965: Use sample barycentric coordinates with per sample shadingAnuj Phogat2014-01-214-6/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Current implementation of arb_sample_shading doesn't set 'Barycentric Interpolation Mode' correctly. We use pixel barycentric coordinates for per sample shading. Instead we should select perspective sample or non-perspective sample barycentric coordinates. It also enables using sample barycentric coordinates in case of a fragment shader variable declared with 'sample' qualifier. e.g. sample in vec4 pos; A piglit test to verify the implementation has been posted on piglit mailing list for review. V2: Do not interpolate all the 'in' variables at sample position if fragment shader uses 'sample' qualifier with one of them. For example we have a fragment shader: #version 330 #extension ARB_gpu_shader5: require sample in vec4 a; in vec4 b; main() { ... } Only 'a' should be sampled at sample location, not 'b'. Cc: [email protected] Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965: Add an option to ignore sample qualifierAnuj Phogat2014-01-215-7/+9
| | | | | | | | | | This will be useful in my next patch which depends on a functionality of _mesa_get_min_invocations_per_fragment() to ignore the sample qualifier (prog->IsSample) based on a flag passed to it. Cc: [email protected] Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* mesa/x86: Remove dead read_rgba_span_x86.h.Matt Turner2014-01-211-56/+0
| | | | Dead since 304f7a13.
* i965/fs: Optimize LRP with x == y into a MOV.Matt Turner2014-01-211-0/+10
| | | | | | | | | total instructions in shared programs: 1487331 -> 1485988 (-0.09%) instructions in affected programs: 45638 -> 44295 (-2.94%) GAINED: 7 LOST: 0 Reviewed-by: Jordan Justen <[email protected]>
* i965: Enable AOS optimizations for the geometry shader.Matt Turner2014-01-211-0/+1
| | | | | Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* mesa: rename PreferDP4 to OptimizeForAOS.Matt Turner2014-01-217-9/+12
| | | | | | | | | This flag was really just a proxy for determining whether the backend was vector (AOS) or scalar (SOA). It will be used to apply a future optimization only for vector backends. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* i965/fs: Print the maximum register pressure.Matt Turner2014-01-211-1/+3
| | | | | Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965/fs: Show register pressure in dump_instructions() output.Kenneth Graunke2014-01-213-1/+16
| | | | | | | | | | Dumping the number of live registers at each IP allows us to see register pressure and identify any local maxima. This should aid in debugging passes designed to reduce register pressure, as well as optimizations that suddenly trigger spilling. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965: Compute the number of live registers at each IP.Kenneth Graunke2014-01-213-0/+22
| | | | | | Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Signed-off-by: Kenneth Graunke <[email protected]>
* i965/fs: Call opt_peephole_sel later in the optimization loop.Matt Turner2014-01-211-1/+1
| | | | | | | | | | | | Calling it after value numbering (added in the next commit) prevents some instruction count regressions. total instructions in shared programs: 1524387 -> 1523905 (-0.03%) instructions in affected programs: 13112 -> 12630 (-3.68%) GAINED: 0 LOST: 3 Reviewed-by: Jordan Justen <[email protected]>
* i965/fs: Calculate interference better in register_coalesce.Matt Turner2014-01-211-7/+72
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously we simply considered two registers whose live ranges overlapped to interfere. Cases such as set A ------ ... | mov B, A -- | ... | B | A use B -- | ... | use A ------ would be considered to interfere, even though B is an unmodified copy of A whose live range fit wholly inside that of A. If no writes to A or B occur between the mov B, A and the use of B then we can safely coalesce them. Instead of removing MOV instructions, we make them NOPs and remove them at once after the main pass is finished in order to avoid recomputing live intervals (which are needed to perform the previous step). total instructions in shared programs: 1543768 -> 1513077 (-1.99%) instructions in affected programs: 951563 -> 920872 (-3.23%) GAINED: 46 LOST: 22 Reviewed-by: Jordan Justen <[email protected]>
* i965/fs: Support coalescing registers of size > 1.Matt Turner2014-01-211-23/+59
| | | | | | | total instructions in shared programs: 1550048 -> 1549880 (-0.01%) instructions in affected programs: 1896 -> 1728 (-8.86%) Reviewed-by: Jordan Justen <[email protected]>
* i965/fs: Assert that var < num_vars.Matt Turner2014-01-211-0/+2
| | | | | | Helped to track down a problem in a version of the next commit. Reviewed-by: Jordan Justen <[email protected]>
* i965/fs: Add a comment explaining how register coalescing works.Matt Turner2014-01-211-0/+12
| | | | Reviewed-by: Jordan Justen <[email protected]>
* i965/fs: Add and use MAX_SAMPLER_MESSAGE_SIZE definition.Matt Turner2014-01-213-5/+10
| | | | Reviewed-by: Jordan Justen <[email protected]>
* mesa: Add STRINGIFY macro.Matt Turner2014-01-211-0/+2
| | | | Reviewed-by: Jordan Justen <[email protected]>
* i965/fs: Fix the example about overwriting uniforms in SIMD16.Matt Turner2014-01-211-5/+5
| | | | | | | mov takes only a single source argument. Example instruction inexplicably changed from add to mov in commit f10f5e49. Reviewed-by: Jordan Justen <[email protected]>
* i965: Print reg_offset for vgrf of size > 1 in dump_instruction().Matt Turner2014-01-212-4/+4
| | | | | | | Previously we wouldn't print the +0 for the first part of a VGRF of size greater than 1. Reviewed-by: Jordan Justen <[email protected]>
* mesa: add missing TYPE_DOUBLEN_2 cases in get.cBrian Paul2014-01-211-0/+12
| | | | | | | | | | | | The new TYPE_DOUBLEN_2 type was added in 0e60d850 but the code to return values of that type wasn't completed. Fixes conform's default state test. glGetFloatv(GL_DEPTH_RANGE) wasn't returning anything. v2: remove stray 'break' statements. Reviewed-by: Jose Fonseca <[email protected]>
* i965: Modify some error messages to refer to "vec4" instead of "vs".Paul Berry2014-01-212-5/+5
| | | | | | | | | These messages are in code that is shared between the VS and GS back-ends, so use the terminology "vec4" to avoid confusion. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Add GS support to INTEL_DEBUG=shader_time.Paul Berry2014-01-218-10/+37
| | | | | | | Previously, time spent in geometry shaders would be counted as part of the vertex shader time. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Reserve space for "Vertex Count" in GS outputs.Kenneth Graunke2014-01-212-0/+13
| | | | | | | | v2: Also increment ir->offset in the GS visitor, rather than at the final assembly generation stage (requested by Paul). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Paul Berry <[email protected]>
* i965: Update blitter code for 48-bit addresses.Kenneth Graunke2014-01-201-16/+48
| | | | | | | v2: Rebase on Eric's SET_FIELD changes. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> [v1]
* i965: Update PIPE_CONTROL packet lengths for Broadwell.Kenneth Graunke2014-01-201-2/+20
| | | | | | | | | On Broadwell, PIPE_CONTROL needs an extra DWord to accomodate the 48-bit addressing. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Re-combine the Gen4-5 and Gen6+ write_depth_count functions.Kenneth Graunke2014-01-203-23/+10
| | | | | | | | | Now that we have a helper function that handles the PIPE_CONTROL variations between the various platforms, these are basically the same. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Create a helper function for emitting PIPE_CONTROL writes.Kenneth Graunke2014-01-204-93/+69
| | | | | | | | | | | | | | | | | | | There are a lot of places that use PIPE_CONTROL to write a value to a buffer (either an immediate write, TIMESTAMP, or PS_DEPTH_COUNT). Creating a single function to do this seems convenient. As part of this refactor, we now set the PPGTT/GTT selection bit correctly on Gen7+. Previously, we set bit 2 of DW2 on all platforms. This is correct for Sandybridge, but actually part of the address on Ivybridge and later! Broadwell will also increase the length of these packets by 1; with the refactoring, we should have to adjust that in substantially fewer places, giving us confidence that we've hit them all. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Use full-length PIPE_CONTROL packets for workaround writes.Kenneth Graunke2014-01-201-6/+9
| | | | | | | | | | | | | | | I believe that PIPE_CONTROL uses the length field to decide whether to do 32-bit or 64-bit writes. A length of 4 would do a 32-bit write, while a length of 5 would do a 64-bit write. (I haven't verified this, though.) For workaround writes, we don't care what value gets written, or how much data. We're only writing something because hardware bugs mandate that do so. So using a 64-bit write should be fine. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Emit full-length PIPE_CONTROLs for (non-write) flushes.Kenneth Graunke2014-01-201-2/+3
| | | | | | | | | | | | | | | | The PIPE_CONTROL packet actually has 5 DWords on Gen6+: 1. Header 2. Flags 3. Address 4. Immediate Data: Lower DWord 5. Immediate Data: Upper DWord We just never emitted the last one. While it appears to work, it's probably safer to emit the entire thing. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Create a helper function for emitting PIPE_CONTROL flushes.Kenneth Graunke2014-01-204-86/+66
| | | | | | | | | | | | | | | These days, we need to emit PIPE_CONTROL flushes all over the place. Being able to do that via a single function call seems convenient. Broadwell will also increase the length of these packets by 1; with the refactoring, we should have to do this in substantially fewer places. v2: Add back forgotten intel_emit_post_sync_nonzero_flush (caught by Eric Anholt). Drop unlikely() from BLT_RING check. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Fix MI_STORE_REGISTER_MEM for Broadwell.Kenneth Graunke2014-01-201-10/+23
| | | | | | | It now takes a 48-bit address. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>