summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* mesa: Add MESA_VERBOSE=api for several indexed BindBuffer variantsJordan Justen2016-01-011-2/+25
| | | | | | | | v2: * Add braces '{}' when the _mesa_debug call spans multiple lines (Ken) Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* st/glsl_to_tgsi: fix block movs for doublesDave Airlie2016-01-011-1/+14
| | | | | | | | | | | | While playing with fp64, I disable varying packing to debug something else, and noticed we never emitted half the output movs for double matrix arrays. We should be moving the left index two slots for dual source doubles, and the right index two slots for non-vs input doubles. Signed-off-by: Dave Airlie <[email protected]>
* st/glsl_to_tgsi: handle different attrib sizeDave Airlie2016-01-011-5/+14
| | | | | | | vertex inputs are counted differently in some cases, with vertex inputs we need to make sure we don't double count them. Signed-off-by: Dave Airlie <[email protected]>
* st/glsl_to_tgsi: readd the double_reg2 for input index mappingDave Airlie2016-01-011-2/+2
| | | | | | | | | Otherwise we end up emitting the wrong index for the second double. This fixes dmat-vs-gs-tcs-tes.shader_test and dvec3-vs-gs-tcs-tes.shader_test Signed-off-by: Dave Airlie <[email protected]>
* st/glsl_to_tgsi: when doing reladdr get vec4 of correct typeDave Airlie2016-01-011-1/+1
| | | | | | | This fixes fp64 relative addressing, in the upcoming dmat-vs-gs-tcs-tes.shader_test. Signed-off-by: Dave Airlie <[email protected]>
* st/glsl_to_tgsi: handle double immediates in matrices properly.Dave Airlie2016-01-011-11/+48
| | | | | | This handles matrix initialisation properly. Signed-off-by: Dave Airlie <[email protected]>
* st/glsl_to_tgsi: setup writemask for double arrays and matricies.Dave Airlie2016-01-011-1/+20
| | | | | | | | | | It's important for the double instruction emission code that the writemasks are correct going in for double so it know which channels to replicate. This fixes it for the array and matrix cases. Signed-off-by: Dave Airlie <[email protected]>
* st/glsl_to_tgsi: handle doubles in array shrinking code.Dave Airlie2016-01-011-2/+7
| | | | | | | | This code takes into account double inputs in the array shrinking code. This fixes some issues with doubles and geom/tess inputs. Signed-off-by: Dave Airlie <[email protected]>
* st/glsl_to_tgsi: handle doubles outputs in arrays.Dave Airlie2016-01-011-4/+31
| | | | | | | | This handles the case where a double output is stored in an array, and tracks it for use in the double instruction emit code. Signed-off-by: Dave Airlie <[email protected]>
* st/glsl_to_tgsi: store if dst is double in arrayDave Airlie2016-01-011-3/+10
| | | | | | | | | | This is just a precursor patch to a fix for doubles with tessellation that I've written. We need to descend into output arrays in that case and mark dst's as double. Signed-off-by: Dave Airlie <[email protected]>
* nvc0: Set winding order regardless of domain.Kenneth Graunke2015-12-301-2/+4
| | | | | | | | | | Quads need to respect winding order, too - not just triangles. Fixes rendering in GFXBench 4.0's tessellation benchmark. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Cc: "11.0 11.1" <[email protected]>
* glsl: Fix varying struct locations when varying packing is disabled.Kenneth Graunke2015-12-301-13/+2
| | | | | | | | | | | | | | | | | | | | varying_matches::record tries to compute the number of components in each varying, which varying_matches::assign_locations uses to assign locations. With varying packing, it uses glsl_type::component_slots() to come up with a reasonable value. Without varying packing, it fell back to an open-coded computation that didn't bother to handle structs at all. I believe we can simply use 4 * glsl_type::count_attribute_slots(false), which already handles these cases correctly. Partially fixes rendering in GFXBench 4.0's tessellation benchmark. (NVE0 is almost right after this, but i965 is still mostly garbage.) Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> Cc: "11.0 11.1" <[email protected]>
* drirc: Disable ARB_blend_func_extended for Heaven 4.0/Valley 1.0.Kenneth Graunke2015-12-301-0/+8
| | | | | | | | | | | | | | | | | | Unigine Heaven 4.0 and Valley 1.0 use dual color blending but don't specify which fragment shader output is which, so there's at best a 50/50 chance of us guessing it correctly. This is invalid. Unigine fixed this in 4.1 and 1.1 versions over a year and a half ago, but hasn't actually released them for whatever reason. So, add the workaround back so that it works for most people. Fixes Heaven 4.0/Valley 1.0 rendering on Ivybridge. For whatever reason, Broadwell worked. 4.1 and 1.1 have always worked. Signed-off-by: Kenneth Graunke <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92233 Reviewed-by: Marek Olšák <[email protected]> Cc: [email protected]
* glsl: add GL_ARB_shader_draw_parameters defineIlia Mirkin2015-12-301-0/+3
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* nvc0: add ARB_shader_draw_parameters supportIlia Mirkin2015-12-3014-15/+74
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* st/mesa: add GL_ARB_shader_draw_parameters supportIlia Mirkin2015-12-303-2/+4
| | | | | | | Hooks up the new system values, passes the drawid in. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* gallium: add a drawid to pipe_draw_infoIlia Mirkin2015-12-301-0/+2
| | | | | | | | | This will allow the state tracker to inform the driver where in a broken-up multidraw we currently are. This can then be passed into the vertex shader. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* gallium: add PIPE_CAP_DRAW_PARAMETERSIlia Mirkin2015-12-3016-2/+19
| | | | | | | | This allows the state tracker to know that the various draw parameters are available in vertex shaders. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* gallium: add baseinstance/drawid semanticsIlia Mirkin2015-12-303-1/+18
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* nv50/ir: attempt to do more constant folding on mad -> add conversionIlia Mirkin2015-12-301-11/+10
| | | | | | | | | The add might actually have a 0 as an argument, which would convert it into a mov. Make sure to detect that. Also avoid the hack of putting the immediate directly into the instruction, instead use a mov to put it into place and let the later LoadPropagation pass place it if possible. Signed-off-by: Ilia Mirkin <[email protected]>
* i965/gen8: Always use BRW_REGISTER_TYPE_UW for MUL on GEN8+Marta Lofstedt2015-12-302-29/+1
| | | | | | | | | | | | | The imulExtended tests of the shader bitfield tests of the OpenGL ES 3.1 CTS, fail on gen8+, when BRW_REGISTER_TYPE_W is used for SHADER_OPECODE_MULH. Also, remove unused helper function: static inline bool type_is_signed(unsigned type) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92595 Signed-off-by: Marta Lofstedt <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* glsl: tidy up struct with a single memberTimothy Arceri2015-12-308-19/+15
| | | | | | | | | | | There used to be more members but they now share other fields in order to keep memory use low. Also making the naming more generic will allow us to reuse the field for explicit byte offsets within blocks for ARB_enhanced_layouts. Reviewed-by: Edward O'Callaghan <[email protected]>
* glsl/linker: annotate static functions as suchEmil Velikov2015-12-302-3/+3
| | | | | Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* glsl: annotate ast_process_struct_or_iface_block_members() as staticEmil Velikov2015-12-301-1/+1
| | | | | Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* nir/builder: Add an init function that creates a simple shader for youJason Ekstrand2015-12-294-34/+28
| | | | | | | | | | | A hugely common case when using nir_builder is to have a shader with a single function called main. This adds a helper that gives you just that. This commit also makes us use it in the NIR control-flow unit tests as well as tgsi_to_nir and prog_to_nir. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Connor Abbott <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
* mesa/st: Pad out _mesa_sysval_to_semantic for new SYSTEM_VALUE_* enumsKristian Høgsberg Kristensen2015-12-291-0/+2
| | | | | | | GL_ARB_shader_draw_parameters added two new system values. This gets us back to mapping mesa system values to the right TGSI semantics. Reviewed-by: Ilia Mirkin <[email protected]>
* nv50/ir: float(s32 & 0xff) = float(u8), not s8Ilia Mirkin2015-12-291-0/+3
| | | | | | | | Make sure to make conversion unsigned when we're ANDing the high bits away. Fixes corruption in dolphin. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "11.0 11.1" <[email protected]>
* i965: Reemit vertex state between indirect multi drawsKristian Høgsberg Kristensen2015-12-291-2/+22
| | | | | | | | | | | | If we're doing an indirect draw, prims[i].basevertex is always 0 and the real base vertex value is in the indirect parameter buffer. We try to avoid flagging BRW_NEW_VERTICES if prims[i].basevertex doesn't change, which then breaks down for indirect draws. Thus, if a program uses base vertex or base instance, and the draw call is indirect, always flag BRW_NEW_VERTICES. A new piglit test, spec/ARB_shader_draw_parameters/drawid-indirect-vertexid tests this. Reviewed-by: Anuj Phogat <[email protected]>
* nir: Teach nir_opt_algebraic about adding and subtracting the same thingKristian Høgsberg Kristensen2015-12-291-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | This optimizes a + b - b to just a. Modest shader-db results (BDW): total instructions in shared programs: 7842452 -> 7841862 (-0.01%) instructions in affected programs: 61938 -> 61348 (-0.95%) total loops in shared programs: 2131 -> 2131 (0.00%) helped: 263 HURT: 0 GAINED: 0 LOST: 0 but the optimization turns gl_VertexID - gl_BaseVertexARB into just a reference to SYSTEM_VALUE_VERTEX_ID_ZERO_BASE, which the i965 hardware supports natively. That means we can avoid using the internal vertex buffer for gl_BaseVertexARB in this case. Reviewed-by: Eduardo Lima Mitev <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Add support for gl_DrawIDARB and enable extensionKristian Høgsberg Kristensen2015-12-2912-5/+145
| | | | | | | | | | We have to break open a new vec4 for gl_DrawIDARB. We've used up all space in the vec4 we use for SGVS and gl_DrawIDARB has to come from its own separate vertex buffer anyway. This is because we point the vb for base vertex and base instance into the draw parameter BO for indirect draw calls, but the draw id is generated by mesa in a different buffer. Reviewed-by: Anuj Phogat <[email protected]>
* i965: Add support for gl_BaseVertexARB and gl_BaseInstanceARBKristian Høgsberg Kristensen2015-12-2911-37/+88
| | | | | | | We already have gl_BaseVertexARB in the .x component of the SGVS vec4 and plug gl_BaseInstanceARB into the last free component (.y). Reviewed-by: Anuj Phogat <[email protected]>
* i965: Assert that SYSTEM_VALUE_VERTEX_ID gets loweredKristian Høgsberg Kristensen2015-12-291-0/+1
| | | | | | | | | fs_visitor::emit_vs_system_value() looks like it's trying to handle SYSTEM_VALUE_VERTEX_ID, but we should never see that value in the backend. Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: Add core mesa support for GL_ARB_shader_draw_parametersKristian Høgsberg Kristensen2015-12-299-0/+41
| | | | Reviewed-by: Anuj Phogat <[email protected]>
* mesa/vbo: Add draw_id field to struct _mesa_primKristian Høgsberg Kristensen2015-12-292-0/+5
| | | | | | | | | | The drivers will need this for passing in gl_DrawIDARB. For indirect multidraw calls, we get the prim array and prim[i].draw_id == i and is redundant. But for non-indirect calls, we get one primitive at a time and need the draw_id field. Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* nir: Remove function overload in control flow testAaron Watry2015-12-291-2/+1
| | | | | | | Fixes make check. Signed-off-by: Aaron Watry <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* radeonsi: add RADEON_REPLACE_SHADERS debug optionNicolai Hähnle2015-12-293-5/+105
| | | | | | | | | | | This option allows replacing a single shader by a pre-compiled ELF object as generated by LLVM's llc, for example. This can be useful for debugging a deterministically occuring error in shaders (and has in fact helped find the causes of https://bugs.freedesktop.org/show_bug.cgi?id=93264). v2: drop the debug flag, use DEBUG_GET_ONCE_OPTION instead Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: count compilations in si_compile_llvmNicolai Hähnle2015-12-292-1/+2
| | | | | | | | This changes the count slightly (because of si_generate_gs_copy_shader), but this is only relevant for the driver-specific num-compilations query. It sets the stage for the next commit. Reviewed-by: Marek Olšák <[email protected]>
* gallium/util: add DEBUG_GET_ONCE_OPTIONNicolai Hähnle2015-12-291-0/+13
| | | | | | This is analogous to the alreading existing macros for BOOL, NUM, and FLAGS. Reviewed-by: Marek Olšák <[email protected]>
* r600: fix constant buffer size programmingGrazvydas Ignotas2015-12-292-2/+2
| | | | | | | | | | | | When buffer size is less than 16, zero ends up being programmed as size, which prevents the hardware from fetching the correct values. Fix it by combining shift and align so that the value is always rounded up. Cc: "11.1 11.0 10.6" <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92229 Signed-off-by: Grazvydas Ignotas <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* docs: Mark ARB_tessellation_shader as done on all i965 platforms.Kenneth Graunke2015-12-282-2/+2
| | | | | | | We now support all Intel GPUs which can do tessellation. Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965: Enable ARB_tessellation_shader on Gen7-7.5.Kenneth Graunke2015-12-282-3/+3
| | | | | | | | We've resolved all the GPU hangs, and everything seems to be working. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965: Don't set interleave or complete on TCS EOT message.Kenneth Graunke2015-12-285-5/+41
| | | | | | | | | | | | | | | | Setting interleave on the TCS EOT message causes Ivybridge hardware to GPU hang like crazy. Individual tests would pass, but running even a simple test like nop.shader_test in a loop would hang within 1-3 runs. Adding sleep delays worked around the problem, somehow. Interleave doesn't make much sense given that we only have one patch URB handle, not two. Complete doesn't seem useful either. There's no reason to actually set those bits. We were just being lazy. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965: Relase input URB Handles on Gen7/7.5 when TCS threads finish.Kenneth Graunke2015-12-285-1/+93
| | | | | | | | | | | | Pre-Broadwell hardware requires us to manually release the ICP Handles by issuing URB read messages with the "Complete" bit set. We can do this in pairs to use fewer URB read messages. Based heavily on work from Chris Forbes. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965: Use proper TCS barrier ID bits for Ivybridge/Baytrail.Kenneth Graunke2015-12-281-4/+6
| | | | | | | | Gen7 uses bits 15:12 while Gen7+ uses bits 16:13. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965: Use proper TCS Instance ID bits for Ivybridge/Baytrail.Kenneth Graunke2015-12-281-2/+5
| | | | | | | | Gen7 uses 22:16 while Gen7.5+ uses 23:17. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965: Port tessellation evaluation shaders to vec4 mode.Kenneth Graunke2015-12-288-2/+365
| | | | | | | | | This can be used on Broadwell by setting INTEL_SCALAR_TES=0. More importantly, it will be used for Ivybridge and Haswell. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965: Emit a real 3DSTATE_DS on Gen7.Kenneth Graunke2015-12-281-11/+54
| | | | | | Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965: Emit a real 3DSTATE_HS on Gen7.Kenneth Graunke2015-12-281-11/+47
| | | | | | Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965: Add the TCS/TES state upload atoms to the gen7_atoms list.Kenneth Graunke2015-12-283-30/+14
| | | | | | Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* nir: Get rid of function overloadsJason Ekstrand2015-12-2859-386/+313
| | | | | | | | | | | | | | | | | When Connor originally drafted NIR, he copied the same function+overload system that GLSL IR had with a few names changed. However, this double-indirection is not really needed and has only served to confuse people. Instead, let's just have functions which may not have unique names and may or may not have an implementation. If someone wants to do overload resolving, they can hav a hash table based function+overload system in the overload resolving pass. There's no good reason to keep it in core NIR. Reviewed-by: Connor Abbott <[email protected]> Acked-by: Kenneth Graunke <[email protected]> ir3 bits are Reviewed-by: Rob Clark <[email protected]>