aboutsummaryrefslogtreecommitdiffstats
path: root/src/mesa
Commit message (Collapse)AuthorAgeFilesLines
...
* i965: quieten compiler warning about out-of-bounds accessIlia Mirkin2016-01-051-0/+1
| | | | | | | | | | | | | | | gcc 4.9.3 shows the following error: brw_vue_map.c:260:20: warning: array subscript is above array bounds [-Warray-bounds] return brw_names[slot - VARYING_SLOT_MAX]; This is because BRW_VARYING_SLOT_COUNT is a valid value for the enum type. Adding an assert will generate no additional code but will teach the compiler to not complain. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* i965/wm: use binding size for ubo/ssbo when automatic size is unsetIlia Mirkin2016-01-051-4/+10
| | | | | | | | | | | | | | This fixes the same tests that commit 8cf2e892f was attempting to fix: ES31-CTS.shader_storage_buffer_object.advanced-unsizedArrayLength-cs-std430-vec-bindrangeOffset ES31-CTS.shader_storage_buffer_object.advanced-unsizedArrayLength-cs-std430-vec-bindrangeSize as confirmed by Samuel. Signed-off-by: Ilia Mirkin <[email protected]> Cc: Samuel Iglesias Gonsálvez <[email protected]> Cc: Marta Lofstedt <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* Revert "i965/wm: use proper API buffer size for the surfaces."Ilia Mirkin2016-01-054-13/+5
| | | | | | | | | | | This reverts commit 8cf2e892fca20c4776b4a07c39918343cb2d4e0e. It's entirely bogus to attempt to store anything about the binding in the buffer object itself, which might be bound any number of times. Signed-off-by: Ilia Mirkin <[email protected]> Cc: Samuel Iglesias Gonsálvez <[email protected]> Cc: Marta Lofstedt <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* st/mesa: make KHR_debug output independent of context creation flags (v2)Nicolai Hähnle2016-01-044-57/+98
| | | | | | | | | | | | | Instead, keep track of GL_DEBUG_OUTPUT and (un)install the pipe_debug_callback accordingly. Hardware drivers can still use the absence of the callback to skip more expensive operations in the normal case, and users can no longer be surprised by the need to set the debug flag at context creation time. v2: - re-add the proper initialization of debug contexts (Ilia Mirkin) - silence a potential warning (Ilia Mirkin) Reviewed-by: Ilia Mirkin <[email protected]>
* i965/wm: use proper API buffer size for the surfaces.Samuel Iglesias Gonsálvez2016-01-044-5/+13
| | | | | | | | | | | | | | | Commit 5bb5eeea fixes a bug indicating that the surfaces should have the API buffer size. Hovewer it picked the wrong value. This patch adds a new variable, which takes into account glBindBufferRange() values. This patch fixes the following CTS regressions: ES31-CTS.shader_storage_buffer_object.advanced-unsizedArrayLength-cs-std430-vec-bindrangeOffset ES31-CTS.shader_storage_buffer_object.advanced-unsizedArrayLength-cs-std430-vec-bindrangeSize Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Marta Lofstedt <[email protected]>
* st/mesa: use PK2H/UP2H when supportedIlia Mirkin2016-01-033-5/+14
| | | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* st/mesa: fix parameter names for tesseval/tessctrl prototypesSamuel Pitoiset2016-01-031-4/+4
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nouveau: fix double-const qualifierIlia Mirkin2016-01-032-2/+2
| | | | | | | Reported by Tom^ on IRC. The original intent was to mark the pointer constant as well as the data being pointed to, so move the *. Signed-off-by: Ilia Mirkin <[email protected]>
* nir: extract out helper macros for running passesRob Clark2016-01-031-36/+9
| | | | | | | | | Note these are a bit uglier, due to avoidance of GNU C extensions. But drivers which do not need to be built with compilers that don't support the extension can wrap these macros with their own. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Make TCS precompile use the TES primitive mode when available.Kenneth Graunke2016-01-021-1/+3
| | | | | | | | If there's a linked TES program, we should just use the actual primitive mode. If not, just guess triangles (as we did before). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965: Push most TES inputs in SIMD8 mode.Kenneth Graunke2016-01-021-12/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Using the push model for inputs is much more efficient than pulling inputs - the hardware can simply copy a large chunk into URB registers at thread creation time, rather than having the thread send messages to request data from the L3 cache. Unfortunately, it's possible to have more TES inputs than fit in registers, so we have to fall back to the pull model in some cases. However, it turns out that most tessellation evaluation shaders are fairly simple, and don't use many inputs. An arbitrary cut-off of 32 vec4 slots (16 registers) is more than sufficient to ensure that 100% of TES inputs are pushed for Shadow of Mordor, Unigine Heaven, GPUTest/TessMark, and SynMark. Note that unlike most SIMD8 stages, this actually reads packed vec4 data, since that is what our vec4 TCS programs write. Improves performance in GPUTest's tessmark_x64 microbenchmark by 93.4426% +/- 5.35541% (n = 25) on my Lenovo X250 at 1024x768. Improves performance in Synmark's Gl40TerrainFlyTess microbenchmark by 22.74% +/- 0.309394% (n = 5). Improves performance in Shadow of Mordor at low settings with tessellation enabled at 1280x720 by 2.12197% +/- 0.478553% (n = 4). shader-db statistics for files containing tessellation shaders: total instructions in shared programs: 184358 -> 181181 (-1.72%) instructions in affected programs: 27971 -> 24794 (-11.36%) helped: 226 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Use LOAD_PAYLOAD for SIMD8 TES input loads, not MOV.Kenneth Graunke2016-01-021-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | We need a MOV to replicate g0.0<0,1,0> to all 8 channels. Since the message payload is a single register, MOV seemed more sensible than LOAD_PAYLOAD. However, MOV cannot be CSE'd, while LOAD_PAYLOAD can. All input loads can use the same header - we don't need to re-expand g0 every time. CSE accomplishes this, saving instructions. shader-db statistics for files containing tessellation shaders: total instructions in shared programs: 186923 -> 184358 (-1.37%) instructions in affected programs: 30536 -> 27971 (-8.40%) helped: 226 HURT: 0 total cycles in shared programs: 1009850 -> 1005356 (-0.45%) cycles in affected programs: 168206 -> 163712 (-2.67%) helped: 226 HURT: 0 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Move 3-src subnr swizzle handling into the vec4 backend.Kenneth Graunke2016-01-022-6/+18
| | | | | | | | | | | | | | | | | | | | | | | While most align16 instructions only support a SubRegNum of 0 or 4 (using swizzling to control the other channels), 3-src instructions actually support arbitrary SubRegNums. When the RepCtrl bit is set, we believe it ignores the swizzle and uses the equivalent of a <0,1,0> region from the subnr. In the past, we adopted a vec4-centric approach of specifying subnr of 0 or 4 and a swizzle, then having brw_eu_emit.c convert that to a proper SubRegNum. This isn't a great fit for the scalar backend, where we don't set swizzles at all, and happily set subnrs in the range [0, 7]. This patch changes brw_eu_emit.c to use subnr and swizzle directly, relying on the higher levels to set them sensibly. This should fix problems where scalar sources get copy propagated into 3-src instructions in the FS backend. I've only observed this with TES push model inputs, but I suppose it could happen in other cases. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* u_upload_mgr: allow specifying PIPE_USAGE_* for the upload bufferMarek Olšák2016-01-021-3/+6
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* u_upload_mgr: remove alignment parameter from u_upload_createMarek Olšák2016-01-021-8/+4
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* u_upload_mgr: pass alignment to u_upload_data manuallyMarek Olšák2016-01-023-3/+5
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* u_upload_mgr: pass alignment to u_upload_alloc manuallyMarek Olšák2016-01-024-4/+4
| | | | | | | | | | The fixed alignment of u_upload_mgr will go away. This is the first step. The motivation is that one u_upload_mgr can have multiple users, each allocating from the same buffer, but requiring a different alignment. Reviewed-by: Nicolai Hähnle <[email protected]>
* st/mesa: fix GLSL uniform updates for glBitmap & glDrawPixels (v2)Marek Olšák2016-01-025-19/+25
| | | | | | | | | | Spotted by luck. The GLSL uniform storage is only associated once in LinkShader and can't be reallocated afterwards, because that would break the association. v2: don't remove st_upload_constants calls, clarify why they're needed Cc: 11.0 11.1 <[email protected]>
* program: add _mesa_reserve_parameter_storageMarek Olšák2016-01-022-15/+36
| | | | | | | The next commit will use this. Reviewed-by: Brian Paul <[email protected]> Cc: 11.0 11.1 <[email protected]>
* mesa: Fix warning with MESA_VERBOSE=api for BindBufferRangeJordan Justen2016-01-011-1/+1
| | | | | Reported-by: Dieter Nützel <[email protected]> Signed-off-by: Jordan Justen <[email protected]>
* st/mesa: sort extensions enablement arrayIlia Mirkin2016-01-011-11/+11
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* mesa: Add MESA_VERBOSE=api for GL_ARB_program_interface_queryJordan Justen2016-01-011-0/+39
| | | | | | | | | v2: * Add braces '{}' when the _mesa_debug call spans multiple lines (Ken) Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: Add MESA_VERBOSE=api for several indexed BindBuffer variantsJordan Justen2016-01-011-2/+25
| | | | | | | | v2: * Add braces '{}' when the _mesa_debug call spans multiple lines (Ken) Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* st/glsl_to_tgsi: fix block movs for doublesDave Airlie2016-01-011-1/+14
| | | | | | | | | | | | While playing with fp64, I disable varying packing to debug something else, and noticed we never emitted half the output movs for double matrix arrays. We should be moving the left index two slots for dual source doubles, and the right index two slots for non-vs input doubles. Signed-off-by: Dave Airlie <[email protected]>
* st/glsl_to_tgsi: handle different attrib sizeDave Airlie2016-01-011-5/+14
| | | | | | | vertex inputs are counted differently in some cases, with vertex inputs we need to make sure we don't double count them. Signed-off-by: Dave Airlie <[email protected]>
* st/glsl_to_tgsi: readd the double_reg2 for input index mappingDave Airlie2016-01-011-2/+2
| | | | | | | | | Otherwise we end up emitting the wrong index for the second double. This fixes dmat-vs-gs-tcs-tes.shader_test and dvec3-vs-gs-tcs-tes.shader_test Signed-off-by: Dave Airlie <[email protected]>
* st/glsl_to_tgsi: when doing reladdr get vec4 of correct typeDave Airlie2016-01-011-1/+1
| | | | | | | This fixes fp64 relative addressing, in the upcoming dmat-vs-gs-tcs-tes.shader_test. Signed-off-by: Dave Airlie <[email protected]>
* st/glsl_to_tgsi: handle double immediates in matrices properly.Dave Airlie2016-01-011-11/+48
| | | | | | This handles matrix initialisation properly. Signed-off-by: Dave Airlie <[email protected]>
* st/glsl_to_tgsi: setup writemask for double arrays and matricies.Dave Airlie2016-01-011-1/+20
| | | | | | | | | | It's important for the double instruction emission code that the writemasks are correct going in for double so it know which channels to replicate. This fixes it for the array and matrix cases. Signed-off-by: Dave Airlie <[email protected]>
* st/glsl_to_tgsi: handle doubles in array shrinking code.Dave Airlie2016-01-011-2/+7
| | | | | | | | This code takes into account double inputs in the array shrinking code. This fixes some issues with doubles and geom/tess inputs. Signed-off-by: Dave Airlie <[email protected]>
* st/glsl_to_tgsi: handle doubles outputs in arrays.Dave Airlie2016-01-011-4/+31
| | | | | | | | This handles the case where a double output is stored in an array, and tracks it for use in the double instruction emit code. Signed-off-by: Dave Airlie <[email protected]>
* st/glsl_to_tgsi: store if dst is double in arrayDave Airlie2016-01-011-3/+10
| | | | | | | | | | This is just a precursor patch to a fix for doubles with tessellation that I've written. We need to descend into output arrays in that case and mark dst's as double. Signed-off-by: Dave Airlie <[email protected]>
* drirc: Disable ARB_blend_func_extended for Heaven 4.0/Valley 1.0.Kenneth Graunke2015-12-301-0/+8
| | | | | | | | | | | | | | | | | | Unigine Heaven 4.0 and Valley 1.0 use dual color blending but don't specify which fragment shader output is which, so there's at best a 50/50 chance of us guessing it correctly. This is invalid. Unigine fixed this in 4.1 and 1.1 versions over a year and a half ago, but hasn't actually released them for whatever reason. So, add the workaround back so that it works for most people. Fixes Heaven 4.0/Valley 1.0 rendering on Ivybridge. For whatever reason, Broadwell worked. 4.1 and 1.1 have always worked. Signed-off-by: Kenneth Graunke <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92233 Reviewed-by: Marek Olšák <[email protected]> Cc: [email protected]
* st/mesa: add GL_ARB_shader_draw_parameters supportIlia Mirkin2015-12-303-2/+4
| | | | | | | Hooks up the new system values, passes the drawid in. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* i965/gen8: Always use BRW_REGISTER_TYPE_UW for MUL on GEN8+Marta Lofstedt2015-12-302-29/+1
| | | | | | | | | | | | | The imulExtended tests of the shader bitfield tests of the OpenGL ES 3.1 CTS, fail on gen8+, when BRW_REGISTER_TYPE_W is used for SHADER_OPECODE_MULH. Also, remove unused helper function: static inline bool type_is_signed(unsigned type) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92595 Signed-off-by: Marta Lofstedt <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* nir/builder: Add an init function that creates a simple shader for youJason Ekstrand2015-12-291-10/+3
| | | | | | | | | | | A hugely common case when using nir_builder is to have a shader with a single function called main. This adds a helper that gives you just that. This commit also makes us use it in the NIR control-flow unit tests as well as tgsi_to_nir and prog_to_nir. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Connor Abbott <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
* mesa/st: Pad out _mesa_sysval_to_semantic for new SYSTEM_VALUE_* enumsKristian Høgsberg Kristensen2015-12-291-0/+2
| | | | | | | GL_ARB_shader_draw_parameters added two new system values. This gets us back to mapping mesa system values to the right TGSI semantics. Reviewed-by: Ilia Mirkin <[email protected]>
* i965: Reemit vertex state between indirect multi drawsKristian Høgsberg Kristensen2015-12-291-2/+22
| | | | | | | | | | | | If we're doing an indirect draw, prims[i].basevertex is always 0 and the real base vertex value is in the indirect parameter buffer. We try to avoid flagging BRW_NEW_VERTICES if prims[i].basevertex doesn't change, which then breaks down for indirect draws. Thus, if a program uses base vertex or base instance, and the draw call is indirect, always flag BRW_NEW_VERTICES. A new piglit test, spec/ARB_shader_draw_parameters/drawid-indirect-vertexid tests this. Reviewed-by: Anuj Phogat <[email protected]>
* i965: Add support for gl_DrawIDARB and enable extensionKristian Høgsberg Kristensen2015-12-2912-5/+145
| | | | | | | | | | We have to break open a new vec4 for gl_DrawIDARB. We've used up all space in the vec4 we use for SGVS and gl_DrawIDARB has to come from its own separate vertex buffer anyway. This is because we point the vb for base vertex and base instance into the draw parameter BO for indirect draw calls, but the draw id is generated by mesa in a different buffer. Reviewed-by: Anuj Phogat <[email protected]>
* i965: Add support for gl_BaseVertexARB and gl_BaseInstanceARBKristian Høgsberg Kristensen2015-12-2911-37/+88
| | | | | | | We already have gl_BaseVertexARB in the .x component of the SGVS vec4 and plug gl_BaseInstanceARB into the last free component (.y). Reviewed-by: Anuj Phogat <[email protected]>
* i965: Assert that SYSTEM_VALUE_VERTEX_ID gets loweredKristian Høgsberg Kristensen2015-12-291-0/+1
| | | | | | | | | fs_visitor::emit_vs_system_value() looks like it's trying to handle SYSTEM_VALUE_VERTEX_ID, but we should never see that value in the backend. Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: Add core mesa support for GL_ARB_shader_draw_parametersKristian Høgsberg Kristensen2015-12-292-0/+2
| | | | Reviewed-by: Anuj Phogat <[email protected]>
* mesa/vbo: Add draw_id field to struct _mesa_primKristian Høgsberg Kristensen2015-12-292-0/+5
| | | | | | | | | | The drivers will need this for passing in gl_DrawIDARB. For indirect multidraw calls, we get the prim array and prim[i].draw_id == i and is redundant. But for non-indirect calls, we get one primitive at a time and need the draw_id field. Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* i965: Enable ARB_tessellation_shader on Gen7-7.5.Kenneth Graunke2015-12-282-3/+3
| | | | | | | | We've resolved all the GPU hangs, and everything seems to be working. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965: Don't set interleave or complete on TCS EOT message.Kenneth Graunke2015-12-285-5/+41
| | | | | | | | | | | | | | | | Setting interleave on the TCS EOT message causes Ivybridge hardware to GPU hang like crazy. Individual tests would pass, but running even a simple test like nop.shader_test in a loop would hang within 1-3 runs. Adding sleep delays worked around the problem, somehow. Interleave doesn't make much sense given that we only have one patch URB handle, not two. Complete doesn't seem useful either. There's no reason to actually set those bits. We were just being lazy. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965: Relase input URB Handles on Gen7/7.5 when TCS threads finish.Kenneth Graunke2015-12-285-1/+93
| | | | | | | | | | | | Pre-Broadwell hardware requires us to manually release the ICP Handles by issuing URB read messages with the "Complete" bit set. We can do this in pairs to use fewer URB read messages. Based heavily on work from Chris Forbes. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965: Use proper TCS barrier ID bits for Ivybridge/Baytrail.Kenneth Graunke2015-12-281-4/+6
| | | | | | | | Gen7 uses bits 15:12 while Gen7+ uses bits 16:13. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965: Use proper TCS Instance ID bits for Ivybridge/Baytrail.Kenneth Graunke2015-12-281-2/+5
| | | | | | | | Gen7 uses 22:16 while Gen7.5+ uses 23:17. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965: Port tessellation evaluation shaders to vec4 mode.Kenneth Graunke2015-12-288-2/+365
| | | | | | | | | This can be used on Broadwell by setting INTEL_SCALAR_TES=0. More importantly, it will be used for Ivybridge and Haswell. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965: Emit a real 3DSTATE_DS on Gen7.Kenneth Graunke2015-12-281-11/+54
| | | | | | Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Jordan Justen <[email protected]>