summaryrefslogtreecommitdiffstats
path: root/src/compiler
Commit message (Collapse)AuthorAgeFilesLines
* nir: Allow using nir_lower_io_to_scalar_early on VS input vars.Eric Anholt2018-10-301-1/+3
| | | | | | | This will be used on V3D to cut down the size of the VS inputs in the VPM (memory area for sharing data between shader stages). Reviewed-by: Timothy Arceri <[email protected]>
* spirv: Pass SSA values through functionsJason Ekstrand2018-10-301-41/+139
| | | | | | | | Previously, we would create temporary variables and fill them out. Instead, we create as many function parameters as we need and pass them through as SSA defs. Reviewed-by: Iago Toral Quiroga <[email protected]>
* glsl: Add missing include guardsMichał Janiszewski2018-10-301-0/+5
| | | | | | Signed-off-by: Michał Janiszewski <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* glsl/linker: Fix out variables linking during single stageVadym Shovkoplias2018-10-301-1/+2
| | | | | | | | | | | | | Since out variables are copied from shader objects instruction streams to linked shader instruction steam it should be cloned at first to keep source instruction steam unaltered. Fixes: 966a797e433 ("glsl/linker: Link all out vars from a shader objects on a single stage") Signed-off-by: Vadym Shovkoplias <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105731
* nir: fix yet another MSVC build breakBrian Paul2018-10-291-1/+1
| | | | Trivial.
* nir: Add a pass for gathering transform feedback infoJason Ekstrand2018-10-294-1/+214
| | | | | | | This is different from the GL_ARB_spirv pass because it generates a much simpler data structure that isn't tied to OpenGL and mtypes.h. Reviewed-by: Samuel Pitoiset <[email protected]>
* nir: Fix array initializerBrian Paul2018-10-261-1/+1
| | | | | | Empty initializer is not standard C. This fixes MSVC build. Trivial.
* spirv: Initialize subgroup destinations with the destination typeJason Ekstrand2018-10-261-4/+8
| | | | | | | Instead of initializing them manually, just use the type that we already have sitting there. Reviewed-by: Ian Romanick <[email protected]>
* spirv: Use the right bit-size for spec constant opsJason Ekstrand2018-10-261-3/+9
| | | | | | | | | | | Previously, we would always pull the bit size from the destination which is wrong for opcodes like nir_ilt where the sources are variable-sized but the destination is a fixed size. We were getting lucky before because nir_op_ilt returns a 32-bit value and basically everyone who uses spec constants uses 32-bit ones. Cc: [email protected] Reviewed-by: Ian Romanick <[email protected]>
* glsl/nir: Use i2b instead of ine for fixing UBO/SSBO BooleansJason Ekstrand2018-10-261-19/+5
| | | | | | | | They do the same thing in the end but i2b is a bit simpler. Also, let's clean up the mess of code for SSBO handling with one line of builder. Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* nir/system_values: Use the bit size from the load_derefJason Ekstrand2018-10-261-0/+1
| | | | | | | | | This isn't a great solution for bit-sizes but we don't have a particularly convenient way to get a bit size from the system value enum and this keeps the lowering pass from changing it. Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* nir/opt_if: Rework condition propagationJason Ekstrand2018-10-261-61/+30
| | | | | | | | | | Instead of doing our own constant folding, we just emit instructions and let constant folding happen. This is substantially simpler and lets us use the nir_imm_bool helper instead of dealing with the const_value's ourselves. Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* nir/search: Use the nir_imm_* helpers from nir_builderJason Ekstrand2018-10-263-91/+43
| | | | | | | | This requires that we rework the interface a bit to use nir_builder but that's a nice little modernization anyway. Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* nir/builder: Handle 16-bit floats in nir_imm_floatN_tJason Ekstrand2018-10-261-0/+14
| | | | | Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* nir/builder: Add a nir_imm_true/false helpersJason Ekstrand2018-10-267-13/+36
| | | | | Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* nir/constant_folding: Use nir_src_as_bool for discard_ifJason Ekstrand2018-10-261-6/+7
| | | | | | | Missed one while converting to the nir_src_as_* helpers. Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* nir/constant_folding: Add an unreachable to a switchJason Ekstrand2018-10-261-0/+2
| | | | | Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* nir/validate: Print when the validation failedJason Ekstrand2018-10-263-29/+35
| | | | | Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* util: use C99 declaration in the for-loop set_foreach() macroEric Engestrom2018-10-2515-21/+0
| | | | | Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* util: use C99 declaration in the for-loop hash_table_foreach() macroEric Engestrom2018-10-2511-22/+0
| | | | | Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* nir: Fix array initializer.Jose Fonseca2018-10-241-1/+1
| | | | | | Empty initializer is not standard C. This fixes MSVC build. Trivial.
* nir: fix nir_copy_propagation testJuan A. Suarez Romero2018-10-241-2/+2
| | | | | | | | | | | | | | | Use nir_src_comp_as_uint() to read the proper second component, as nir_src_as_uint() returns the first one. v2: Use nir_src_comp_as_uint() [Jason] Fixes: 16870de8a0a ("nir: Use nir_src_is_const and nir_src_as_* in core code") Signed-off-by: Juan A. Suarez Romero <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108532 Tested-by: Michel Dänzer <[email protected]> Tested-by: Vinson Lee <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir: add linking helper nir_link_xfb_varyings()Samuel Pitoiset2018-10-242-0/+34
| | | | | | | | The linking opts shouldn't try removing or compacting XFB varyings in the consumer. To avoid this we copy the always_active_io flag from the producer. Reviewed-by: Timothy Arceri <[email protected]>
* nir/algebraic: Fix a typo in the bit size validation codeJason Ekstrand2018-10-231-2/+2
| | | | | | | | The conon_bit_class and canon_var_class variables got switched. Fixes: 932c650e0b "nir/algebraic: Loosen a restriction on variables" Reported-by: Ian Romanick <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* nir/algebraic: Provide descriptive asserts for bit size checksJason Ekstrand2018-10-221-9/+42
| | | | | | | This will hopefully make debugging opt_algebraic bit-size compile failures easier. Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* nir/algebraic: Loosen a restriction on variablesJason Ekstrand2018-10-221-2/+6
| | | | | | | | | Previously, we would fail if a variable had an assigned but unknown bit size X and we tried to assign it an actual bit size. However, this is ok because, at the time we do the search, the variable does have an actual bit size and it will match X because of the NIR rules. Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* nir/algebraic: A bit of validation refactoring'Jason Ekstrand2018-10-221-15/+15
| | | | | | | We rename some local variables in validate() to be more readable and plumb the var through to get/set_var_bit_class instead of the var index. Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* nir/algebraic: Make internal classes str-ableJason Ekstrand2018-10-221-4/+12
| | | | Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* nir/algebraic: Generalize an optimizationJason Ekstrand2018-10-221-1/+1
| | | | | | There's nothing boolean about (a | ~a) ~> -1 Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* nir/algebraic: Use bool internally instead of bool32Jason Ekstrand2018-10-222-8/+5
| | | | Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* nir: Use nir_src_is_const and nir_src_as_* in core codeJason Ekstrand2018-10-2217-76/+56
| | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir/search_helpers: Use nir_src_is_const and friendsJason Ekstrand2018-10-221-27/+22
| | | | | | | This not only makes them safe for more bit sizes but it also fixes a bug in is_zero_to_one where it would return true for constant NaN. Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir/search: Use nir_src_is_const and friendsJason Ekstrand2018-10-221-57/+13
| | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir: Add some new helpers for working with const sourcesJason Ekstrand2018-10-222-0/+108
| | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* spirv: Add no-op support for VK_GOOGLE_hlsl_functionality1Jason Ekstrand2018-10-222-0/+12
| | | | | | | | | | | This extension adds two new decorations which carry meaning only for HLSL shaders. They are expected to be handled by higher level layers and can be ignored by implementations. However, it does save the client a bit of work if the implementation safely ignores them instead of the client having to strip them out of the SPIR-V in order for it to be valid. Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* spirv: Add support for SPV_GOOGLE_decorate_stringJason Ekstrand2018-10-221-0/+8
| | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* glsl: Check the subroutine associated functions namesVadym Shovkoplias2018-10-161-0/+36
| | | | | | | | | | | | | | | | | | | Adding compile time check for subroutine functions with the same names. Similar check for intrastage linking was already landed in commit 5f0567a4f60. From Section 6.1.2 (Subroutines) of the GLSL 4.00 specification "A program will fail to compile or link if any shader or stage contains two or more functions with the same name if the name is associated with a subroutine type." Fixes: * no-overloads.vert Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108109 Signed-off-by: Vadym Shovkoplias <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
* glsl/linker: Change the format of spec quotationVadym Shovkoplias2018-10-161-6/+5
| | | | | | | | Also there is no "OpenGL ES Shading Language 4.00" spec, so change it to GLSL 4.00 spec. Signed-off-by: Vadym Shovkoplias <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
* nir: fix clip cull lowering to not assert if GLSL already lowered.Dave Airlie2018-10-151-0/+6
| | | | | | If GLSL has already done the lowering, we'd rather not crash in this pass. Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Copy propagation between blocksCaio Marcelo de Oliveira Filho2018-10-152-82/+351
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Extend the pass to propagate the copies information along the control flow graph. It performs two walks, first it collects the vars that were written inside each node. Then it walks applying the copy propagation using a list of copies previously available. At each node the list is invalidated according to results from the first walk. This approach is simpler than a full data-flow analysis, but covers various cases. If derefs are used for operating on more memory resources (e.g. SSBOs), the difference from a regular pass is expected to be more visible -- as the SSA copy propagation pass won't apply to those. A full data-flow analysis would handle more scenarios: conditional breaks in the control flow and merge equivalent effects from multiple branches (e.g. using a phi node to merge the source for writes to the same deref). However, as previous commentary in the code stated, its complexity 'rapidly get out of hand'. The current patch is a good intermediate step towards more complex analysis. The 'copies' linked list was modified to use util_dynarray to make it more convenient to clone it (to handle ifs/loops). Annotated shader-db results for Skylake: total instructions in shared programs: 15105796 -> 15105451 (<.01%) instructions in affected programs: 152293 -> 151948 (-0.23%) helped: 96 HURT: 17 All the HURTs and many HELPs are one instruction. Looking at pass by pass outputs, the copy prop kicks in removing a bunch of loads correctly, which ends up altering what other other optimizations kick. In those cases the copies would be propagated after lowering to SSA. In few HELPs we are actually helping doing more than was possible previously, e.g. consolidating load_uniforms from different blocks. Most of those are from shaders/dolphin/ubershaders/. total cycles in shared programs: 566048861 -> 565954876 (-0.02%) cycles in affected programs: 151461830 -> 151367845 (-0.06%) helped: 2933 HURT: 2950 A lot of noise on both sides. total loops in shared programs: 4603 -> 4603 (0.00%) loops in affected programs: 0 -> 0 helped: 0 HURT: 0 total spills in shared programs: 11085 -> 11073 (-0.11%) spills in affected programs: 23 -> 11 (-52.17%) helped: 1 HURT: 0 The shaders/dolphin/ubershaders/12.shader_test was able to pull a couple of loads from inside if statements and reuse them. total fills in shared programs: 23143 -> 23089 (-0.23%) fills in affected programs: 2718 -> 2664 (-1.99%) helped: 27 HURT: 0 All from shaders/dolphin/ubershaders/. LOST: 0 GAINED: 0 The other generations follow the same overall shape. The spills and fills HURTs are all from the same game. shader-db results for Broadwell. total instructions in shared programs: 15402037 -> 15401841 (<.01%) instructions in affected programs: 144386 -> 144190 (-0.14%) helped: 86 HURT: 9 total cycles in shared programs: 600912755 -> 600902486 (<.01%) cycles in affected programs: 185662820 -> 185652551 (<.01%) helped: 2598 HURT: 3053 total loops in shared programs: 4579 -> 4579 (0.00%) loops in affected programs: 0 -> 0 helped: 0 HURT: 0 total spills in shared programs: 80929 -> 80924 (<.01%) spills in affected programs: 720 -> 715 (-0.69%) helped: 1 HURT: 5 total fills in shared programs: 93057 -> 93013 (-0.05%) fills in affected programs: 3398 -> 3354 (-1.29%) helped: 27 HURT: 5 LOST: 0 GAINED: 2 shader-db results for Haswell: total instructions in shared programs: 9231975 -> 9230357 (-0.02%) instructions in affected programs: 44992 -> 43374 (-3.60%) helped: 27 HURT: 69 total cycles in shared programs: 87760587 -> 87727502 (-0.04%) cycles in affected programs: 7720673 -> 7687588 (-0.43%) helped: 1609 HURT: 1416 total loops in shared programs: 1830 -> 1830 (0.00%) loops in affected programs: 0 -> 0 helped: 0 HURT: 0 total spills in shared programs: 1988 -> 1692 (-14.89%) spills in affected programs: 296 -> 0 helped: 1 HURT: 0 total fills in shared programs: 2103 -> 1668 (-20.68%) fills in affected programs: 438 -> 3 (-99.32%) helped: 4 HURT: 0 LOST: 0 GAINED: 1 v2: Remove the DISABLE prefix from tests we now pass. v3: Add comments about missing write_mask handling. (Caio) Add unreachable when switching on cf_node type. (Jason) Properly merge the component information in written map instead of replacing. (Jason) Explain how removal from written arrays works. (Jason) Use mode directly from deref instead of getting the var. (Jason) v4: Register the local written mode for calls. (Jason) Prefer cf_node instead of node. (Jason) Clarify that remove inside iteration only works in backward iterations. (Jason) Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Take call instruction into account in copy_prop_varsCaio Marcelo de Oliveira Filho2018-10-151-6/+12
| | | | | | | | | | | | | | | | | Calls are not used yet (functions are inlined), but since new code is already taking them into account, do it here too. The convention here and in other places is that no writable memory is assumed to remain unchanged, as well as global variables. Also, explicitly state the modes affected (instead of using the reverse logic) in one of the apply_for_barrier_modes calls. Suggested by Jason. v2: Consider local vars used by a call to be conservative, SPIR-V has such cases. (Jason) Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Add tests for copy propagation of derefsCaio Marcelo de Oliveira Filho2018-10-151-0/+300
| | | | | | | | | | | | Also tests for removal of redundant loads, that we currently handle as part of the copy propagation. Note some tests involve multiple blocks and are currently DISABLED because they (expectedly) fail. v2: Add missing DISABLED prefix to "multi block" tests. (Jason) Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Remove handling of dead writes from copy_prop_varsCaio Marcelo de Oliveira Filho2018-10-151-76/+8
| | | | | | These are covered by another pass now. Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Separate dead write removal into its own passCaio Marcelo de Oliveira Filho2018-10-155-3/+227
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of doing this as part of the existing copy_prop_vars pass. Separation makes easier to expand the scope of both passes to be more than per-block. For copy propagation, the information about valid copies comes from previous instructions; while the dead write removal depends on information from later instructions ("have any instruction used this deref before overwrite it?"). Also change the tests to use this pass (instead of copy prop vars). Note that the disabled tests continue to fail, since the standalone pass is still per-block. v2: Remove entries from dynarray instead of marking items as deleted. Use foreach_reverse. (Caio) (all from Jason) Do not cache nir_deref_path. Not worthy for this patch. Clear unused writes when hitting a call instruction. Clean up enumeration of modes for barriers. Move metadata calls to the inner function. v3: For copies, use the vector length to calculate the mask. (all from Jason) Use nir_component_mask_t when applicable. Rename functions for clarity. Consider local vars used by a call to be conservative (SPIR-V has such cases). Comment and assert the assumption that stores and copies are always to a deref that ends with a vector or scalar. Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Add tests for dead write eliminationCaio Marcelo de Oliveira Filho2018-10-151-0/+241
| | | | | | | | | | | Note at the moment the pass called is nir_opt_copy_prop_vars, because dead write elimination is implemented there. Also added tests that involve identifying dead writes in multiple blocks (e.g. the overwrite happens in another block). Those currently fail as expected, so are marked to be skipped. Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Add test file for vars related passesCaio Marcelo de Oliveira Filho2018-10-153-11/+224
| | | | | | | | | | | | Add basic helpers for doing tests on the vars related optimization passes. The main goal is to lower the barrier to create tests during development and debugging of the passes. Full coverage is not a requirement. v2: Make find_next_intrinsic() skip blocks before 'after'. (Jason) Move nir_imm_ivec2() to nir_builder.h. (Jason) Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Add nir_imm_ivec2 helperCaio Marcelo de Oliveira Filho2018-10-151-0/+12
| | | | Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Expose nir_remove_unused_io_vars().Eric Anholt2018-10-152-8/+27
| | | | | | | | | | | | | | For gallium drivers where you want to do some linking at variant compile time, you don't have the other producer/consumer shader on hand to modify. By exposing the inner function, the driver can have the used varyings in the compiled shader cache key and still do linking. This is also useful for V3D, where the binning shader wants to only output position and TF varyings. We've been removing those after nir_lower_io, but this will be less driver-specific code and let more of the shader get DCEed early in NIR. Reviewed-by: Timothy Arceri <[email protected]>
* nir: Be sure to fix deref modes after demoting shader i/o vars to global.Eric Anholt2018-10-151-0/+3
| | | | | | | Fixes assertion failures when calling nir_remove_unused_varyings() or nir_remove_unused_io_vars(). Reviewed-by: Timothy Arceri <[email protected]>
* nir: Create sampler2D variables in nir_lower_{bitmap,drawpixels}.Kenneth Graunke2018-10-142-1/+23
| | | | | | | | This is needed for nir_gather_info to actually count the new textures, since it operates solely on variables. Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Dave Airlie <[email protected]>