path: root/src/compiler
Commit message | Author | Age | Files | Lines
* nir: Remove handling of dead writes from copy_prop_vars | Caio Marcelo de Oliveira Filho | 2018-10-15 | 1 | -76/+8
| | | | | | These are covered by another pass now. Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Separate dead write removal into its own pass | Caio Marcelo de Oliveira Filho | 2018-10-15 | 5 | -3/+227
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of doing this as part of the existing copy_prop_vars pass. Separation makes it easier to expand the scope of both passes to be more than per-block. For copy propagation, the information about valid copies comes from previous instructions; while the dead write removal depends on information from later instructions ("has any instruction used this deref before it is overwritten?"). Also change the tests to use this pass (instead of copy prop vars). Note that the disabled tests continue to fail, since the standalone pass is still per-block. v2: Remove entries from dynarray instead of marking items as deleted. Use foreach_reverse. (Caio) (all from Jason) Do not cache nir_deref_path. Not worth it for this patch. Clear unused writes when hitting a call instruction. Clean up enumeration of modes for barriers. Move metadata calls to the inner function. v3: For copies, use the vector length to calculate the mask. (all from Jason) Use nir_component_mask_t when applicable. Rename functions for clarity. Consider local vars used by a call to be conservative (SPIR-V has such cases). Comment and assert the assumption that stores and copies are always to a deref that ends with a vector or scalar. Reviewed-by: Jason Ekstrand <[email protected]>
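The dependency on later instructions described in this commit can be sketched as a reverse walk over one basic block. This is a toy Python model, not the C pass itself: the instruction tuples and the function name are hypothetical, variables are tracked whole rather than per component mask, and any write still pending at the end of the block is assumed live because the analysis is per-block.

```python
def remove_dead_writes(block):
    """Return the block's instructions with dead stores removed.

    Each instruction is a tuple: ("store", var), ("load", var) or ("call",).
    """
    overwritten = set()  # vars stored later on with no load in between
    kept = []
    for instr in reversed(block):
        op = instr[0]
        if op == "store":
            var = instr[1]
            if var in overwritten:
                continue  # a later store replaces this value before any read
            overwritten.add(var)
        elif op == "load":
            overwritten.discard(instr[1])  # earlier stores to var are now used
        elif op == "call":
            overwritten.clear()  # conservatively assume the callee reads everything
        kept.append(instr)
    kept.reverse()
    return kept


block = [
    ("store", "a"),  # dead: overwritten below before any load or call
    ("store", "b"),
    ("load", "b"),
    ("store", "a"),
    ("call",),
    ("store", "a"),
]
cleaned = remove_dead_writes(block)
```

Only the first store to `a` is dropped; the store just before the call survives because the call is treated as a potential reader.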
* nir: Add tests for dead write elimination | Caio Marcelo de Oliveira Filho | 2018-10-15 | 1 | -0/+241
| | | | | | | | | | | Note at the moment the pass called is nir_opt_copy_prop_vars, because dead write elimination is implemented there. Also added tests that involve identifying dead writes in multiple blocks (e.g. the overwrite happens in another block). Those currently fail as expected, so are marked to be skipped. Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Add test file for vars related passes | Caio Marcelo de Oliveira Filho | 2018-10-15 | 3 | -11/+224
| | | | | | | | | | | | Add basic helpers for doing tests on the vars related optimization passes. The main goal is to lower the barrier to create tests during development and debugging of the passes. Full coverage is not a requirement. v2: Make find_next_intrinsic() skip blocks before 'after'. (Jason) Move nir_imm_ivec2() to nir_builder.h. (Jason) Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Add nir_imm_ivec2 helper | Caio Marcelo de Oliveira Filho | 2018-10-15 | 1 | -0/+12
| | | | Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Expose nir_remove_unused_io_vars(). | Eric Anholt | 2018-10-15 | 2 | -8/+27
| | | | | | | | | | | | | | For gallium drivers where you want to do some linking at variant compile time, you don't have the other producer/consumer shader on hand to modify. By exposing the inner function, the driver can have the used varyings in the compiled shader cache key and still do linking. This is also useful for V3D, where the binning shader wants to only output position and TF varyings. We've been removing those after nir_lower_io, but this will be less driver-specific code and let more of the shader get DCEed early in NIR. Reviewed-by: Timothy Arceri <[email protected]>
* nir: Be sure to fix deref modes after demoting shader i/o vars to global. | Eric Anholt | 2018-10-15 | 1 | -0/+3
| | | | | | | Fixes assertion failures when calling nir_remove_unused_varyings() or nir_remove_unused_io_vars(). Reviewed-by: Timothy Arceri <[email protected]>
* nir: Create sampler2D variables in nir_lower_{bitmap,drawpixels}. | Kenneth Graunke | 2018-10-14 | 2 | -1/+23
| | | | | | | | This is needed for nir_gather_info to actually count the new textures, since it operates solely on variables. Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* spirv: Update SPIR-V json and headers to Khronos master | Jason Ekstrand | 2018-10-13 | 2 | -13/+604
| | | | | | This corresponds to commit 801cca8104245c07e8cc532 on GitHub. Acked-by: Bas Nieuwenhuizen <[email protected]>
* spirv/nir: handle memory access qualifiers for SSBO loads/stores | Samuel Pitoiset | 2018-10-12 | 4 | -14/+77
| | | | | | | | | | | v2: - change how the access qualifiers are accumulated v3: - duplicate members in struct_member_decoration_cb() - handle access qualifiers on variables - remove access qualifiers handling in _vtn_variable_load_store() - fix setting access qualifiers on type->array_element Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Add a bunch of b2[if] optimizations | Jason Ekstrand | 2018-10-11 | 1 | -0/+17
| | | | | | | | | | | | | | | | | | The b2f and b2i conversions always produce zero or one which are both representable in every type and size. Since b2i and b2f support all bit sizes, we can just get rid of the conversion opcode. total instructions in shared programs: 15089335 -> 15084368 (-0.03%) instructions in affected programs: 212564 -> 207597 (-2.34%) helped: 896 HURT: 0 total cycles in shared programs: 369831123 -> 369826267 (<.01%) cycles in affected programs: 2008647 -> 2003791 (-0.24%) helped: 693 HURT: 216 Reviewed-by: Ian Romanick <[email protected]>
* glsl: remove redundant es_shader checks | Timothy Arceri | 2018-10-11 | 2 | -5/+1
| | | | | | The es check is already covered by the is_version() check. Reviewed-by: Ian Romanick <[email protected]>
* glsl: ignore trailing whitespace when define redefined | Timothy Arceri | 2018-10-10 | 3 | -3/+25
| | | | | | | | | The Nvidia/AMD binary drivers allow this, as does GCC. This fixes shader compilation issues in the latest update of No Man's Sky. Reviewed-by: Ian Romanick <[email protected]>
* nir/algebraic: Simplify fsat of fsign | Ian Romanick | 2018-10-09 | 1 | -0/+1
| | | | | | | | | | This allows us to not support fsign.sat in the Intel compiler backend, and that will simplify some later changes. No shader-db changes on any Intel platform. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Thomas Helland <[email protected]>
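As a sanity check on why a saturated fsign is unnecessary: saturating a sign value leaves 1.0 for positive inputs and 0.0 for everything else, so a comparison converted to float gives the same result. The Python helpers below are illustrative stand-ins for the NIR opcodes (NaN is ignored), and the exact replacement pattern used by the algebraic pass is an assumption here.

```python
def fsign(x):
    # -1.0, 0.0 or 1.0 depending on the sign of x
    return float((x > 0) - (x < 0))


def fsat(x):
    # clamp to the range [0.0, 1.0]
    return min(max(x, 0.0), 1.0)


def b2f(flag):
    # boolean to float conversion
    return 1.0 if flag else 0.0


samples = [-7.5, -1.0, -0.0, 0.0, 0.25, 1.0, 42.0]
fsat_fsign_matches = all(fsat(fsign(x)) == b2f(x > 0.0) for x in samples)
```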
* nir/algebraic: sign(x)*x*x is abs(x)*x | Ian Romanick | 2018-10-09 | 1 | -0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | shader-db results: All Gen7+ platforms had similar results. (Skylake shown) total instructions in shared programs: 15106023 -> 15105981 (<.01%) instructions in affected programs: 300 -> 258 (-14.00%) helped: 6 HURT: 0 helped stats (abs) min: 7 max: 7 x̄: 7.00 x̃: 7 helped stats (rel) min: 14.00% max: 14.00% x̄: 14.00% x̃: 14.00% 95% mean confidence interval for instructions value: -7.00 -7.00 95% mean confidence interval for instructions %-change: -14.00% -14.00% Instructions are helped. total cycles in shared programs: 566050327 -> 566050075 (<.01%) cycles in affected programs: 2826 -> 2574 (-8.92%) helped: 6 HURT: 0 helped stats (abs) min: 40 max: 44 x̄: 42.00 x̃: 42 helped stats (rel) min: 8.89% max: 8.94% x̄: 8.92% x̃: 8.92% 95% mean confidence interval for cycles value: -44.30 -39.70 95% mean confidence interval for cycles %-change: -8.95% -8.88% Cycles are helped. No changes on Gen6 or earlier. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Thomas Helland <[email protected]>
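The identity in the subject line follows from sign(x)*x being abs(x), so one multiply can be dropped. A quick numeric check in Python, as a sketch only: `fsign` is a hypothetical stand-in for the NIR opcode, and the samples are finite values (NaN and infinities are not considered).

```python
def fsign(x):
    # -1.0, 0.0 or 1.0 depending on the sign of x
    return float((x > 0) - (x < 0))


samples = [-3.5, -1.0, -0.25, 0.0, 0.5, 2.0, 1e10]

# Left side uses two multiplies and a sign; right side uses one multiply and an abs.
identity_holds = all(fsign(x) * x * x == abs(x) * x for x in samples)
```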
* nir: Add helper functions to get the instruction that generated a nir_src | Ian Romanick | 2018-10-09 | 1 | -0/+23
| | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Thomas Helland <[email protected]>
* meson: Don't build glsl compiler tests unless OpenGL is enabled | Dylan Baker | 2018-10-09 | 2 | -2/+2
| | | | | | Since there are no other users of the glsl compiler. Reviewed-by: Eric Engestrom <[email protected]>
* glsl: fix array assignments of a swizzled vector | Ilia Mirkin | 2018-10-08 | 1 | -3/+10
| | | | | | | | | | | | | | | | | | | | | | | | This happens in situations where we might do vec.wzyx[i] = ... The swizzle would get effectively ignored because of the interaction between how ir_assignment->set_lhs works and overwriting the write_mask. There are two cases, one where i is a constant, and another where i is variable. We have to be extra-careful in both cases. Fixes the following WebGL test: https://www.khronos.org/registry/webgl/sdk/tests/conformance2/glsl3/vector-dynamic-indexing-swizzled-lvalue.html And the new piglit tests: swizzled-writemask-indexing-nonconst.shader_test swizzled-writemask-indexing.shader_test Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Cc: [email protected]
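What `vec.wzyx[i] = ...` is expected to do can be shown with a small model: the index selects a letter of the swizzle, and that letter names the component of the underlying vector that gets written. This is a Python sketch of the intended semantics only, with a hypothetical helper name; it does not mirror how ir_assignment handles write masks.

```python
COMPONENTS = "xyzw"


def assign_swizzled(vec, swizzle, index, value):
    """Model `vec.<swizzle>[index] = value` and return the updated vector."""
    result = list(vec)
    target = COMPONENTS.index(swizzle[index])  # map the swizzle letter to a slot
    result[target] = value
    return result


# vec.wzyx[0] writes w, vec.wzyx[3] writes x.
written_w = assign_swizzled([0.0, 0.0, 0.0, 0.0], "wzyx", 0, 9.0)
written_x = assign_swizzled([0.0, 0.0, 0.0, 0.0], "wzyx", 3, 9.0)
```

Ignoring the swizzle, which is the bug described above, would write component `index` directly and so touch x in the first case instead of w.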
* glsl: do not attempt assignment if operand type not parsed correctly | Tapani Pälli | 2018-10-08 | 1 | -0/+6
| | | | | | | | | v2: check types of both operands (Ian) Cc: [email protected] Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108012
* spirv: mark variables decorated with XfbBuffer as always active | Samuel Pitoiset | 2018-10-05 | 1 | -0/+1
| | | | | | | | Otherwise, they are removed during NIR linking or in some lowering passes. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir/alu_to_scalar: Use ssa_for_alu_src in hand-rolled expansions | Jason Ekstrand | 2018-10-04 | 1 | -15/+18
| | | | | | | | | | | | | | | | | The ssa_for_alu_src helper will correctly handle swizzles and other source modifiers for you. The expansions for unpack_half_2x16, pack_uvec2_to_uint, and pack_uvec4_to_uint were all broken with regards to swizzles. The brokenness of unpack_half_2x16 was causing rendering errors in Rise of the Tomb Raider on Intel ever since c11833ab24dcba26 which added an extra copy propagation to the optimization pipeline and caused us to start seeing swizzles where we hadn't seen any before. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107926 Fixes: 9ce901058f3d "nir: Add lowering of nir_op_unpack_half_2x16." Fixes: 9b8786eba955 "nir: Add lowering support for packing opcodes." Tested-by: Alex Smith <[email protected]> Tested-by: Józef Kucia <[email protected]> Reviewed-by: Matt Turner <[email protected]>
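The effect of dropping a swizzle in such an expansion can be illustrated with a toy model of unpack_half_2x16. The sketch below is Python, not the NIR lowering: `lower_unpack` and its arguments are hypothetical, and the source is modelled as a tuple of 32-bit words with a swizzle that picks which word the instruction actually reads.

```python
import struct


def unpack_half_2x16(word):
    """Split a 32-bit word into two half floats (low 16 bits first)."""
    low, high = struct.unpack("<2e", struct.pack("<I", word))
    return (low, high)


def lower_unpack(src_vec, swizzle):
    # Honour the swizzle before splitting the word; reading src_vec[0]
    # unconditionally is the kind of mistake the commit describes.
    word = src_vec[swizzle[0]]
    return unpack_half_2x16(word)


# 0x3C00 is 1.0 and 0x4000 is 2.0 in half precision.
src = (0x40003C00, 0x3C004000)
from_first = lower_unpack(src, (0,))
from_second = lower_unpack(src, (1,))
```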
* glsl/linker: Check the subroutine associated functions names | Vadym Shovkoplias | 2018-10-04 | 1 | -0/+40
| | | | | | | | | | | | | | | | | | | | | From Section 6.1.2 (Subroutines) of the GLSL 4.00 specification "A program will fail to compile or link if any shader or stage contains two or more functions with the same name if the name is associated with a subroutine type." v2: - error out earlier (Tapani) - style fixes (Iago) Fixes: * no-overloads.vert Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108109 Signed-off-by: Vadym Shovkoplias <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
* spirv: Move function call handling to vtn_cfg | Jason Ekstrand | 2018-10-02 | 3 | -63/+65
| | | | | | | It makes way more sense for it to live there with the rest of function handling. Reviewed-by: Iago Toral Quiroga <[email protected]>
* nir/from_ssa: Don't rewrite derefs destinations to registers | Jason Ekstrand | 2018-10-02 | 1 | -0/+6
| | | | | | | | | | | | We already call nir_rematerialize_derefs_in_use_blocks_impl prior to calling nir_lower_ssa_defs_to_regs_block so the assertion that all deref uses are in the block should hold. This fixes the following CTS test when dEQP is run with SPIR-V optimization recipe 1: dEQP-VK.glsl.struct.local.loop_nested_struct_array_vertex Fixes: 606eb56ab9449b "intel/nir: Only lower load/store derefs" Reviewed-by: Iago Toral Quiroga <[email protected]>
* nir/cf: Remove phi sources if needed in nir_handle_add_jump | Jason Ekstrand | 2018-10-02 | 1 | -17/+21
| | | | | | | | | | | | If the block in which the jump is inserted is the predecessor of a phi then we need to remove phi sources otherwise the phi may end up with things improperly connected. This fixes the following CTS test when dEQP is run with SPIR-V optimization recipe 1: dEQP-VK.glsl.functions.control_flow.return_in_nested_loop_vertex Cc: [email protected] Reviewed-by: Iago Toral Quiroga <[email protected]>
* glsl: Add an assert when cloning ir_dereference_record with invalid field | Danylo Piliaiev | 2018-09-20 | 1 | -0/+1
| | | | | Signed-off-by: Danylo Piliaiev <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* glsl: Avoid propagating incompatible type of initializer | Danylo Piliaiev | 2018-09-20 | 1 | -29/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | do_assignment validated the assignment but when the rhs type was not compatible it proceeded without issues and returned error_emitted = false. On the other hand process_initializer expected do_assignment to always return a compatible type and never fail. As a result when a variable was initialized with an incompatible type the type of the variable changed to the incompatible one. This manifested in unnecessary error messages and in one case in a crash. Example GLSL: vec4 tmp = vec2(0.0); tmp.z -= 1.0; Past error messages: initializer of type vec2 cannot be assigned to variable of type vec4 invalid swizzle / mask `z' type mismatch operands to arithmetic operators must be numeric After this patch: initializer of type vec2 cannot be assigned to variable of type vec4 In the other case when we initialize a variable with an incompatible struct, accessing the variable's field led to a crash. Example: uniform struct {float field;} data; ... vec4 tmp = data; tmp.x -= 1.0; After the patch there is only an error line without a crash: initializer of type #anon_struct cannot be assigned to variable of type vec4 Signed-off-by: Danylo Piliaiev <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107547
* nir: add initializer data to fix MSVC compile error | Juan A. Suarez Romero | 2018-09-19 | 1 | -1/+1
| | | | | | | CC: Jason Ekstrand <[email protected]> Fixes: 82799a5d1b8 ("nir: Add a small pass to rematerialize derefs per-block") Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* nir: Add some asserts that we don't put derefs in phis | Jason Ekstrand | 2018-09-19 | 3 | -0/+6
| | | | | | | | | The lcssa and phis_to_regs passes are used by various NIR optimizations that modify the CFG. Putting a couple of asserts will help ensure that we don't accidentally put derefs in phis as part of an optimization pass. Reviewed-by: Iago Toral Quiroga <[email protected]>
* nir/opt_if: Re-materialize derefs in use blocks before peeling loops | Jason Ekstrand | 2018-09-19 | 1 | -6/+7
| | | | | | Reviewed-by: Iago Toral Quiroga <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107879 Cc: "18.2" <[email protected]>
* nir/loop_unroll: Re-materialize derefs in use blocks before unrolling | Jason Ekstrand | 2018-09-19 | 1 | -9/+4
| | | | | | | | | | When we're about to re-arrange a bunch of blocks, it's a good idea to make sure that we don't have deref uses crossing block boundaries. Otherwise we may end up with a deref going through a phi and that would be bad. Reviewed-by: Iago Toral Quiroga <[email protected]> Cc: "18.2" <[email protected]>
* nir: Add a small pass to rematerialize derefs per-block | Jason Ekstrand | 2018-09-19 | 2 | -0/+134
| | | | | | | | | This pass re-materializes deref instructions on a per-block basis to ensure that every deref instruction lives in the same block as the instruction which uses it. Reviewed-by: Iago Toral Quiroga <[email protected]> Cc: "18.2" <[email protected]>
* nir: add loop unroll support for complex wrapper loops | Timothy Arceri | 2018-09-14 | 1 | -37/+76
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In GLSL IR we cheat with switch statements and simply convert them into loops with a single iteration. This allowed us to make use of the existing jump instruction handling provided by the loop handling code, it also allows dead code to be cleaned up once we have wrapped the code in a loop. However using loops in this way created loops that could not previously be unrolled, which limits further optimisations. Here we provide a way to unroll loops that end in a break and have multiple other exits. All shader-db changes are from the dolphin uber shaders. There is a small amount of HURT shaders but in general the improvements far exceed the HURT. shader-db results IVB: total instructions in shared programs: 10018187 -> 10016468 (-0.02%) instructions in affected programs: 104080 -> 102361 (-1.65%) helped: 36 HURT: 15 total cycles in shared programs: 220065064 -> 154529655 (-29.78%) cycles in affected programs: 126063017 -> 60527608 (-51.99%) helped: 51 HURT: 0 total loops in shared programs: 2515 -> 2308 (-8.23%) loops in affected programs: 903 -> 696 (-22.92%) helped: 51 HURT: 0 total spills in shared programs: 4370 -> 4124 (-5.63%) spills in affected programs: 1397 -> 1151 (-17.61%) helped: 9 HURT: 12 total fills in shared programs: 4581 -> 4419 (-3.54%) fills in affected programs: 2201 -> 2039 (-7.36%) helped: 9 HURT: 15 Reviewed-by: Jason Ekstrand <[email protected]>
* nir: propagates if condition evaluation down some alu chains | Timothy Arceri | 2018-09-14 | 1 | -0/+128
| | | | | | | | | | | | | | | | | | | | | | | | | v2: - only allow nir_op_inot or nir_op_b2i when alu input is 1. - use some helpers as suggested by Jason. v3: - evaluate alu op for single input alu ops - add helper function to decide if to propagate through alu - make use of nir_before_src in another spot shader-db IVB results: total instructions in shared programs: 9993483 -> 9993472 (-0.00%) instructions in affected programs: 1300 -> 1289 (-0.85%) helped: 11 HURT: 0 total cycles in shared programs: 219476091 -> 219476059 (-0.00%) cycles in affected programs: 7675 -> 7643 (-0.42%) helped: 10 HURT: 1 Reviewed-by: Jason Ekstrand <[email protected]>
* nir: evaluate if condition uses inside the if branches | Timothy Arceri | 2018-09-14 | 2 | -0/+138
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since we know what side of the branch we ended up on we can just replace the use with a constant. All the spill changes in shader-db are from Dolphin uber shaders, despite some small regressions the change is clearly positive. V2: insert new constant after any phis in the use->parent_instr->type == nir_instr_type_phi path. v3: - use nir_after_block_before_jump() for inserting const - check dominance of phi uses correctly v4: - create some helpers as suggested by Jason. v5 (Jason Ekstrand): - Use LIST_ENTRY to get the phi src shader-db results IVB: total instructions in shared programs: 9999201 -> 9993483 (-0.06%) instructions in affected programs: 163235 -> 157517 (-3.50%) helped: 132 HURT: 2 total cycles in shared programs: 231670754 -> 219476091 (-5.26%) cycles in affected programs: 143424120 -> 131229457 (-8.50%) helped: 115 HURT: 24 total spills in shared programs: 4383 -> 4370 (-0.30%) spills in affected programs: 1656 -> 1643 (-0.79%) helped: 9 HURT: 18 total fills in shared programs: 4610 -> 4581 (-0.63%) fills in affected programs: 374 -> 345 (-7.75%) helped: 6 HURT: 0 Reviewed-by: Jason Ekstrand <[email protected]>
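The idea of replacing uses of the condition with a constant on each side of the branch can be sketched on a tiny expression tree. This is a Python toy, not the NIR pass: expressions are hypothetical nested tuples, names are plain strings, and phi handling and dominance checks are left out entirely.

```python
def substitute(expr, name, value):
    """Replace every occurrence of `name` inside a nested-tuple expression."""
    if expr == name:
        return value
    if isinstance(expr, tuple):
        return tuple(substitute(part, name, value) for part in expr)
    return expr


def specialize_if(cond, then_body, else_body):
    # Inside the then-branch the condition is known to be true,
    # inside the else-branch it is known to be false.
    return (
        [substitute(stmt, cond, True) for stmt in then_body],
        [substitute(stmt, cond, False) for stmt in else_body],
    )


then_body, else_body = specialize_if(
    "cond",
    [("bcsel", "cond", "a", "b")],
    [("store", "out", ("inot", "cond"))],
)
```

Once the constant is in place, ordinary constant folding can simplify the select to `a` and the negation to true.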
* glsl/linker: Check the invariance of built-in special variables | Vadym Shovkoplias | 2018-09-12 | 1 | -0/+66
| | | | | | | | | | | | | | | | | | | | | | From Section 4.6.4 (Invariance and Linkage) of the GLSL ES 1.0 specification "The invariance of varyings that are declared in both the vertex and fragment shaders must match. For the built-in special variables, gl_FragCoord can only be declared invariant if and only if gl_Position is declared invariant. Similarly gl_PointCoord can only be declared invariant if and only if gl_PointSize is declared invariant. It is an error to declare gl_FrontFacing as invariant. The invariance of gl_FrontFacing is the same as the invariance of gl_Position." Fixes: * glsl-pcoord-invariant.shader_test * glsl-fcoord-invariant.shader_test * glsl-fface-invariant.shader_test Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107734 Signed-off-by: Vadym Shovkoplias <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
* Replace uses of _mesa_bitcount with util_bitcount | Dylan Baker | 2018-09-07 | 6 | -14/+16
| | | | | | | | | | | | | and _mesa_bitcount_64 with util_bitcount_64. This fixes a build problem in nir for platforms that don't have popcount or popcountll, such as 32bit msvc. v2: - Fix additional uses of _mesa_bitcount added after this was originally written Acked-by: Eric Engestrom <[email protected]> (v1) Acked-by: Eric Anholt <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
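For platforms without a popcount builtin, a bit count has to be computed in software. The Python sketch below shows one common fallback (clearing the lowest set bit until none remain); it is an illustration of the technique and is not claimed to be how util_bitcount is implemented.

```python
def bitcount(value):
    """Count the set bits of a non-negative integer."""
    count = 0
    while value:
        value &= value - 1  # clears the lowest set bit
        count += 1
    return count


example_32 = bitcount(0b1011)
example_64 = bitcount(2**64 - 1)
```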
* nir: Drop the vs_inputs_dual_locations option | Jason Ekstrand | 2018-09-06 | 4 | -47/+21
| | | | | | | | | | | | | It was very inconsistently handled; the only things that made use of it were glsl_to_nir, glspirv, and nir_gather_info. In particular, nir_lower_io completely ignored it so anyone using nir_lower_io on 64-bit vertex attributes was going to be in for a shock. Also, as of the previous commit, it's set by every driver that supports 64-bit vertex attributes. There's no longer any reason to have it be an option so let's just delete it. Reviewed-by: Alejandro Piñeiro <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* radeonsi/nir: Set vs_inputs_dual_locations and let NIR do the remap | Jason Ekstrand | 2018-09-06 | 1 | -4/+1
| | | | | | | | | | | | | | | | | We were going out of our way to disable dual-location re-mapping in NIR only to then do the remapping in st_glsl_to_nir.cpp. Presumably, this was so that double_inputs would be correct for the core state tracker. However, now that we've moved it to gl_program::DualSlotInputs which is unaffected by NIR lowering, we can let NIR lower things for us. The one tricky bit here is that we have to remap the inputs_read bitfield back to the single-slot convention for the gallium state tracker to use. Since radeonsi is the only NIR-capable gallium driver that also supports GL_ARB_vertex_attrib_64bit, we only have to worry about radeonsi when making core gallium state tracker changes. Acked-by: Marek Olšák <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* compiler: Move double_inputs to gl_program::DualSlotInputs | Jason Ekstrand | 2018-09-06 | 7 | -35/+49
| | | | | | | | | | | | | | | | | | | | | | | Previously, we had two fields in shader_info: double_inputs_read and double_inputs. Presumably, the one was for all double inputs that are read and the other is all that exist. However, because nir_gather_info regenerates these two values, there is a possibility, if a variable gets deleted, that the value of double_inputs could change over time. This is a problem because double_inputs is used to remap the input locations to a two-slot-per-dvec3/4 scheme for i965. If that mapping were to change between glsl_to_nir and back-end state setup, we would fall over when trying to map the NIR outputs back onto the GL location space. This commit changes the way slot re-mapping works. Instead of the double_inputs field in shader_info, it adds a DualSlotInputs bitfield to gl_program. By having it in gl_program, we more easily guarantee that NIR passes won't touch it after it's been set. It also makes more sense to put it in a GL data structure since it's really a mapping from GL slots to back-end and/or NIR slots and not really a NIR shader thing. Tested-by: Alejandro Piñeiro <[email protected]> (ARB_gl_spirv tests) Reviewed-by: Alejandro Piñeiro <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
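A stable bitfield of dual-slot inputs is enough to translate a GL location into a two-slot-per-dvec3/4 location: each dual-slot input below the one in question pushes it up by one slot. The Python sketch below models that arithmetic under this assumed convention; the function name and the example layout are hypothetical, not taken from the Mesa sources.

```python
def remap_location(location, dual_slot_mask):
    """Map a single-slot GL location to a location where dual-slot inputs take two slots."""
    lower_bits = dual_slot_mask & ((1 << location) - 1)  # dual-slot inputs below us
    return location + bin(lower_bits).count("1")


# Hypothetical layout: attributes 1 and 3 are dvec3/dvec4, the rest are single-slot.
dual_slot_mask = 0b01010
remapped = [remap_location(loc, dual_slot_mask) for loc in range(5)]
```

Because the mask lives outside the data that nir_gather_info regenerates, the same mapping can be applied at any later point and still give the same answer.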
* glsl: fixer lexer for unreachable defines | Timothy Arceri | 2018-09-06 | 2 | -23/+38
| | | | | | | | | | | | | | | | | | | | | | | | If we have something like: #ifdef NOT_DEFINED #define A_MACRO(x) \ if (x) #endif The # on the #define is not skipped but the define itself is so this then gets recognised as #if. Until 28a3731e3f this didn't happen because we ended up in <HASH>{NONSPACE} where BEGIN INITIAL was called stopping the problem from happening. This change makes sure we never call RETURN_TOKEN_NEVER_SKIP for if/else/endif when processing a define. Cc: Ian Romanick <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107772 Tested-By: Eero Tamminen <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* glsl: avoid lowering texcoord array except in simple cases | Ilia Mirkin | 2018-08-29 | 1 | -0/+6
| | | | | | | | | With compat creeping up to geometry and tess shaders, lowering texcoord accesses/writes becomes more complicated. Since it's an optimization anyways, just avoid the complication for now. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* glsl: add a mechanism to allow layout qualifiers on function params | Timothy Arceri | 2018-08-30 | 3 | -0/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The spec is quite clear this is not allowed: From Section 4.4. (Layout Qualifiers) of the GLSL 4.60 spec: "Layout qualifiers can appear in several forms of declaration. They can appear as part of an interface block definition or block member, as shown in the grammar in the previous section. They can also appear with just an interface-qualifier to establish layouts of other declarations made with that qualifier: layout-qualifier interface-qualifier ; Or, they can appear with an individual variable declared with an interface qualifier: layout-qualifier interface-qualifier declaration ;" From Section 4.10 (Memory Qualifiers) of the GLSL 4.60 spec: "Layout qualifiers cannot be used on formal function parameters, and layout qualification is not included in parameter matching." However on the Nvidia binary driver they actually fail to compile if image function params don't have a layout qualifier. This results in applications such as No Man's Sky using layout qualifiers on params. I've submitted a CTS test to expose this problem in the Nvidia driver but until that is resolved this patch will help Mesa drivers work around the issue. Reviewed-by: Marek Olšák <[email protected]>
* glsl: skip stringification in preprocessor if in unreachable branch | Timothy Arceri | 2018-08-30 | 1 | -2/+4
| | | | | | | This fixes compilation of some "No Man's Sky" shaders where the stringification happens in branches intended for DX12. Reviewed-by: Ian Romanick <[email protected]>
* anv,i965: Lower away image derefs in the driver | Jason Ekstrand | 2018-08-29 | 1 | -3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, the back-end compiler turned image access into magic uniform reads and there was a complex contract between back-end compiler and driver about setting up and filling out those params. As of this commit, both drivers now lower image_deref_load_param_intel intrinsics to load_uniform intrinsics controlled by the driver and lower the other image_deref_* intrinsics to image_* intrinsics which take an actual binding table index. There are still "magic" uniforms but they are now added and controlled entirely by the driver and that contract no longer spans components. This also has the side-effect of making most image use compile-time binding table indices. Previously, all image access pulled the binding table index from a uniform. Part of the reason for this was that the magic uniforms made it difficult to decouple binding table indices from the uniforms and, since they are indexed completely differently (especially in Vulkan), it was hard to pull them apart. Now that the driver is handling both, it's trivial to decouple the two and provide actual binding table indices. Shader-db results on Kaby Lake: total instructions in shared programs: 15166872 -> 15164293 (-0.02%) instructions in affected programs: 115834 -> 113255 (-2.23%) helped: 191 HURT: 0 total cycles in shared programs: 571311495 -> 571196465 (-0.02%) cycles in affected programs: 4757115 -> 4642085 (-2.42%) helped: 73 HURT: 67 total spills in shared programs: 10951 -> 10926 (-0.23%) spills in affected programs: 742 -> 717 (-3.37%) helped: 7 HURT: 0 total fills in shared programs: 22226 -> 22201 (-0.11%) fills in affected programs: 1146 -> 1121 (-2.18%) helped: 7 HURT: 0 Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Add handle/index-based image intrinsics | Jason Ekstrand | 2018-08-29 | 3 | -20/+82
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Use a bitfield for image access qualifiers | Jason Ekstrand | 2018-08-29 | 6 | -29/+39
| | | | | | | | | | This commit expands the current memory access enum to contain the extra two bits provided for images. We choose to follow the SPIR-V convention of NonReadable and NonWriteable because readonly implies that you *can* read so readonly + writeonly doesn't make as much sense as NonReadable + NonWriteable. Reviewed-by: Kenneth Graunke <[email protected]>
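The NonReadable/NonWriteable convention composes cleanly as bit flags: each GLSL qualifier removes a capability, so setting both simply sets both bits. The Python sketch below illustrates this with an `IntFlag`; the enum name, member names, and bit positions are illustrative assumptions and not the values used in the Mesa headers.

```python
from enum import IntFlag


class Access(IntFlag):
    # Hypothetical bit assignments, chosen only for this example.
    COHERENT = 1
    RESTRICT = 2
    VOLATILE = 4
    NON_READABLE = 8
    NON_WRITEABLE = 16


def access_from_glsl(readonly, writeonly):
    """Translate GLSL readonly/writeonly qualifiers into negative-capability flags."""
    access = Access(0)
    if readonly:
        access |= Access.NON_WRITEABLE  # readonly means writes are forbidden
    if writeonly:
        access |= Access.NON_READABLE   # writeonly means reads are forbidden
    return access


size_query_only = access_from_glsl(readonly=True, writeonly=True)
```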
* glsl/link,i965: Make ImageAccess four-state | Jason Ekstrand | 2018-08-29 | 2 | -6/+10
| | | | | | | | | | | | | | | | | The GLSL spec allows you to set both the "readonly" and "writeonly" qualifiers on images to indicate that it can only be used with imageSize. However, we had no way of representing this in the linked shader and flagged it as GL_READ_ONLY. This is good from a "does it use this buffer?" perspective but not from a format and access lowering perspective. By using GL_NONE if "readonly" and "writeonly" are both set, we can detect this case in the driver and handle it correctly. Nothing currently relies on the type of surface in the "readonly" + "writeonly" case but that's about to change. i965 is the only driver which uses the ImageAccess field and gl_bindless_image::access is currently unused. Reviewed-by: Kenneth Graunke <[email protected]>
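The four states can be written out as a small mapping from the two qualifiers to a GL access enum. This Python sketch uses the standard GL enum values; the function name is hypothetical, and the three states other than GL_NONE are inferred from the qualifier names rather than quoted from the patch.

```python
GL_NONE = 0
GL_READ_ONLY = 0x88B8
GL_WRITE_ONLY = 0x88B9
GL_READ_WRITE = 0x88BA


def image_access(readonly, writeonly):
    """Pick the access enum for an image from its GLSL qualifiers."""
    if readonly and writeonly:
        return GL_NONE  # only size queries such as imageSize remain valid
    if readonly:
        return GL_READ_ONLY
    if writeonly:
        return GL_WRITE_ONLY
    return GL_READ_WRITE


states = {
    (ro, wo): image_access(ro, wo)
    for ro in (False, True)
    for wo in (False, True)
}
```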
* intel/compiler: Do image load/store lowering to NIR | Jason Ekstrand | 2018-08-29 | 1 | -0/+9
| | | | | | | | | | | | | | | | | | | | | | | This commit moves our storage image format conversion codegen into NIR instead of doing it in the back-end. This has the advantage of letting us run it through NIR's optimizer which is pretty effective at shrinking things down. In the common case of rgba8, the number of instructions emitted after NIR is done with it is half of what it was with the lowering happening in the back-end. On the downside, the back-end's lowering is able to directly use predicates and the NIR lowering has to use IFs. Shader-db results on Kaby Lake: total instructions in shared programs: 15166910 -> 15166872 (<.01%) instructions in affected programs: 5895 -> 5857 (-0.64%) helped: 15 HURT: 0 Clearly, we don't have that much image_load_store happening in the shaders in shader-db.... Reviewed-by: Kenneth Graunke <[email protected]>
* nir/types: Add a wrapper for coordinate_components | Jason Ekstrand | 2018-08-29 | 2 | -0/+8
| | | | Reviewed-by: Kenneth Graunke <[email protected]>