path: root/src/compiler
Commit message  Author  Age  Files  Lines
* glsl/nir: Add and use a gl_nir_link() functionCaio Marcelo de Oliveira Filho2019-09-102-0/+24
| | | | | | | | | Perform all the NIR linking steps in order. Change iris and i965 to use it. Suggested by Alejandro. v2: Add gl_nir_linker_options struct. Reviewed-by: Alejandro Piñeiro <[email protected]> [v1]
* glsl/nir: Fill in the Parameters in NIR linkerCaio Marcelo de Oliveira Filho2019-09-103-2/+76
| | | | | | | | | | | | | | | | | | | | | The parameter lists were not being created nor filled since i965 doesn't use them. In Gallium they are used for uniform handling, so add a way to fill them. The gl_uniform_storage struct got two new fields that let us go - from a Parameter to the matching UniformStorage and, - from the variable to the *first* UniformStorage without relying on names -- since they are optional for ARB_gl_spirv. Later patches will make use of them. v2: Do not fill parameters for i965. (Timothy) Use uint32_t for the new attributes. (Marek) v3: Serialize the new fields. (Timothy) Reviewed-by: Timothy Arceri <[email protected]>
* compiler: Add glsl_contains_opaque() helperCaio Marcelo de Oliveira Filho2019-09-102-0/+7
| | | | Reviewed-by: Alejandro Piñeiro <[email protected]>
* glsl/nir: Avoid overflow when setting max_uniform_locationCaio Marcelo de Oliveira Filho2019-09-101-1/+2
| | | | | | | | | Don't use the UNMAPPED_UNIFORM_LOC (-1) to set the unsigned max_uniform_location. Those unmapped uniforms don't have to be accounted at this point. Fixes: 7a9e5cdfbb9 ("nir/linker: Add gl_nir_link_uniforms()") Reviewed-by: Alejandro Piñeiro <[email protected]>
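The overflow described above can be illustrated with a small Python model. This is only a sketch: the real code is C, and the helper names `as_uint32`, `max_location_naive` and `max_location_skipping_unmapped` are hypothetical. Only `UNMAPPED_UNIFORM_LOC` (-1) comes from the commit message.

```python
UNMAPPED_UNIFORM_LOC = -1  # sentinel value named in the commit message
UINT32_MASK = 0xFFFFFFFF


def as_uint32(value):
    """Mimic C's conversion of a signed int to a 32-bit unsigned int."""
    return value & UINT32_MASK


def max_location_naive(locations):
    # Problematic variant: the sentinel converts to 0xFFFFFFFF and wins max().
    result = 0
    for loc in locations:
        result = max(result, as_uint32(loc))
    return result


def max_location_skipping_unmapped(locations):
    # Variant in the spirit of the fix: unmapped uniforms are not accounted for.
    result = 0
    for loc in locations:
        if loc == UNMAPPED_UNIFORM_LOC:
            continue
        result = max(result, as_uint32(loc))
    return result
```

With the locations `[0, 3, -1]`, the naive variant ends up at the largest 32-bit value, while the skipping variant reports 3.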
* glsl/tests: Handle windows \r\n new linesDylan Baker2019-09-101-1/+1
| | | | | | | | | Currently the parser for s expressions assumes that newlines will be \n, resulting in incorrect parsing on windows, where the newline is \r\n. This patch just adds \r? to the regular expression used to parse the s expressions, which fixes at least 1 test on windows. Reviewed-by: Eric Engestrom <[email protected]>
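A minimal sketch of the `\r?` idea, using Python's `re` module. The two patterns are hypothetical stand-ins (the actual expression in the test script is not shown in this log); they only demonstrate why an optional carriage return makes a line-ending match work on both platforms.

```python
import re

# Hypothetical patterns: a parenthesized token list followed by a newline.
strict = re.compile(r"\(([^)]*)\)\n")      # assumes Unix newlines only
tolerant = re.compile(r"\(([^)]*)\)\r?\n")  # also accepts Windows \r\n

unix_text = "(a b c)\n"
windows_text = "(a b c)\r\n"

strict_on_windows = strict.match(windows_text)      # fails: \r precedes \n
tolerant_on_windows = tolerant.match(windows_text)  # succeeds
tolerant_on_unix = tolerant.match(unix_text)        # still succeeds
```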
* nir/dead_cf: Repair SSA if the pass makes progressJason Ekstrand2019-09-061-2/+13
| | | | | | | | | | | | | | | | | | | | The dead_cf pass calls into the CF manipulation helpers which attempt to keep NIR's SSA form sane. However, when the only break is removed from a loop, dominance gets messed up anyway because the CF SSA clean-up code only looks at phis and doesn't consider the case of code becoming unreachable. One solution to this would be to put the loop into LCSSA form before we modify any of its contents. Another (and the approach taken by this pass) is to just run the repair_ssa pass afterwards because the CF manipulation helpers are smart enough to keep all the use/def stuff sane; they just don't always preserve dominance properties. While we're here, we clean up some bogus indentation. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111405 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111069 Cc: [email protected] Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* nir/repair_ssa: Insert deref casts when neededJason Ekstrand2019-09-061-2/+29
| | | | | Cc: [email protected] Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* nir/repair_ssa: Repair dominance for unreachable blocksJason Ekstrand2019-09-061-4/+8
| | | | | | | | | | | | NIR currently assumes that unreachable blocks are trivially dominated by everything. However, when considering well-formed SSA, there is no path from any block to an unreachable block. Therefore, we can break any use-def chains where the use is in an unreachable block. This removes any dependencies on code created by uses in unreachable blocks and lets DCE do a better job of cleaning it up. Cc: [email protected] Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* nir: Add a block_is_unreachable helperJason Ekstrand2019-09-062-0/+15
| | | | | Cc: [email protected] Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* nir: Don't infinitely recurse in lower_ssa_defs_to_regs_blockJason Ekstrand2019-09-061-5/+15
| | | | | Cc: [email protected] Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* nir: Handle complex derefs in nir_split_array_varsJason Ekstrand2019-09-061-2/+5
| | | | | | | | We already bail and don't split the vars but we were passing a NULL to _mesa_hash_table_search which is not allowed. Fixes: f1cb3348f1 "nir/split_vars: Properly bail in the presence of ..." Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* nir/lower_io_to_vector: don't merge compact varyingsRhys Perry2019-09-061-0/+3
| | | | | | Signed-off-by: Rhys Perry <[email protected]> Fixes: 02bc4aabb48 ('nir/lower_io_to_vector: allow FS outputs to be vectorized') Reviewed-by: Kenneth Graunke <[email protected]>
* nir/lower_io_to_vector: add flat modeRhys Perry2019-09-061-47/+204
| | | | | | | | | | | | | | | | | | | | | This has lower_io_to_vector try to turn variables into arrays of 4-sized vectors when possible and fall back to the old approach when that isn't possible. This is so that lower_io_to_vector can guarantee that only one variable is used for each fragment shader output. v2: handle dual-source blending v3: don't try to merge structs and non-32-bit types in get_flat_type() v3: fix per-vertex inputs v3: fix and cleanup location advancement in get_flat_type() and its calling code v4: prioritize the original mode over the flat mode v4: don't create flat variables to merge only one variable v5: don't skip an entire slot when encountering structs in the old mode Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Connor Abbott <[email protected]>
* nir/lower_io_to_vector: allow FS outputs to be vectorizedRhys Perry2019-09-062-27/+33
| | | | | | | | v2: handle dual-source blending v3: use a higher MAX_SLOTS Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Connor Abbott <[email protected]>
* glsl: Fix unroll of do{} while(false) like loopsDanylo Piliaiev2019-09-062-17/+41
| | | | | | | | | | | | | | | For loops whose condition is false on the first iteration, the iteration count was wrongly calculated under the assumption that the loop's condition is true until it becomes false, meaning it is true at least once. Now such loops are reported as having 0 iterations. Similar to the fix e71fc7f2 done in NIR. Fixes tests/shaders/glsl-fs-loop-while-false-02.shader_test Signed-off-by: Danylo Piliaiev <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* nir: Carve out nir_lower_samplers from GLSL code.Timur Kristóf2019-09-065-127/+159
| | | | | | | | | | | | Lowering samplers is needed to produce NIR that can actually be consumed by some gallium drivers, so it doesn't make sense to keep it only in the GLSL code. This commit introduces nir_lower_samplers to compiler/nir, while maintaining the GL-specific function too. Signed-off-by: Timur Kristóf <[email protected]> Reviewed-by: Connor Abbott <[email protected]>
* nir/lower_explicit_io: Handle 1 bit loads and storesCaio Marcelo de Oliveira Filho2019-09-051-9/+24
| | | | | | | | | | | | | | Load a 32-bit value then convert to 1-bit. Convert 1-bit to 32-bit value, then Store it. These cases started to appear when we changed Anvil to use derefs for shared memory. v2: Use `bit_size` in a couple of places we were missing. (Jason) Reassign `value` instead of `src[0]`. (Jason) Fixes: 024a46a4079 ("anv: use derefs for shared memory access") Reviewed-by: Jason Ekstrand <[email protected]>
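The load/convert and convert/store steps can be modelled roughly as follows. This is a sketch under assumptions: memory is a Python `bytearray`, "true" is stored as the 32-bit value 1, any nonzero word loads as true, and the helper names are hypothetical rather than taken from the pass.

```python
def load_bool(memory, offset):
    """Load a 32-bit little-endian word, then narrow it to a 1-bit boolean."""
    word = int.from_bytes(memory[offset:offset + 4], "little")
    return word != 0


def store_bool(memory, offset, value):
    """Widen a 1-bit boolean to a 32-bit word (assumed 0 or 1), then store it."""
    word = 1 if value else 0
    memory[offset:offset + 4] = word.to_bytes(4, "little")


shared = bytearray(8)          # two 32-bit slots of hypothetical shared memory
store_bool(shared, 4, True)    # writes 01 00 00 00 into the second slot
```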
* nir: allow specifying filter callback in lower_alu_to_scalarVasily Khoruzhick2019-09-062-6/+16
| | | | | | | | | | | | | Set of opcodes doesn't have enough flexibility in certain cases. E.g. Utgard PP has vector conditional select operation, but condition is always scalar. Lowering all the vector selects to scalar increases instruction number, so we need a way to filter only those ops that can't be handled in hardware. Reviewed-by: Qiang Yu <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]>
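A rough model of the difference between "lower every vector op" and "lower only what a callback selects". Everything here is hypothetical: instructions are plain `(opcode, num_components)` tuples and `keep_vector_select` stands in for a backend-specific filter. The real NIR callback signature is not shown in this log.

```python
def lower_to_scalar(instructions, should_lower=None):
    """Split vector instructions into per-component scalar ones.

    instructions: list of (opcode, num_components) tuples (toy model).
    should_lower: optional predicate; None means lower every vector op.
    """
    out = []
    for opcode, width in instructions:
        if width > 1 and (should_lower is None or should_lower(opcode, width)):
            out.extend((opcode, 1) for _ in range(width))
        else:
            out.append((opcode, width))
    return out


def keep_vector_select(opcode, width):
    # Hypothetical backend rule: vector select ("bcsel") runs natively,
    # so only the other opcodes are lowered.
    return opcode != "bcsel"


program = [("fadd", 2), ("bcsel", 4)]
filtered = lower_to_scalar(program, keep_vector_select)
```

The callback keeps the 4-wide select intact, whereas calling `lower_to_scalar(program)` without a filter would split it into four scalar selects.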
* gallium: Plumb through a way to disable GLSL const loweringConnor Abbott2019-09-051-1/+2
| | | | | | | | | | For radeonsi, we will prefer the NIR pass as it'll generate better code (some index calculation and a single load vs. a load, then index calculation, then another load) and oftentimes NIR optimization can kick in and make all the access indices constant. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* glsl: Store the precision for a function return typeNeil Roberts2019-09-043-1/+30
| | | | | | | | | The precision for a function return type is now stored in ir_function_signature. This will later be useful to implement mediump to float16 lowering. In the meantime it is also useful to catch errors where a function is redeclared with a different precision. Reviewed-by: Timothy Arceri <[email protected]>
* nir: fix memleak in error pathEric Engestrom2019-09-041-1/+3
| | | | | | | Fixes: 2cf59861a8128a91bfdd ("nir: Add partial redundancy elimination for compares") Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* nir: remove unused constant_fold_stateRob Clark2019-09-031-6/+0
| | | | | Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* nir: Fix num_ssbos when lowering atomic countersConnor Abbott2019-09-031-0/+21
| | | | | | | | | | | | Otherwise it's impossible to know the maximum SSBO index for both internal TGSI shaders from TTN (which don't have any notion of atomic counters and no offset) as well as shaders from GLSL. I fixed everything I could find while grepping for num_ssbos and num_abos, which hopefully is everything (iris was the only user I could find that uses it in a meaningful way). Reviewed-by: Marek Olšák <[email protected]>
* nir: do not assume that the result of fexp2(a) is always an integralSamuel Pitoiset2019-09-021-0/+1
| | | | | | | | | It's only correct when 'a' is an integral greater or equal to 0. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111493 Fixes: 5544b2cbbd2 ("nir/algebraic: Use value range analysis to eliminate useless unary ops") Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
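A quick numeric check of the statement above, using Python floats as a stand-in for shader arithmetic (`exp2` here is just a local helper, not a NIR function):

```python
def exp2(a):
    """2 raised to the power a, computed in double precision."""
    return 2.0 ** a


integral_case = exp2(3.0)      # 8.0: integral input >= 0 gives an integral result
negative_case = exp2(-1.0)     # 0.5: integral input < 0 does not
fractional_case = exp2(0.5)    # about 1.414: non-integral input does not
```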
* glsl: replace 'x + (-x)' with constant 0Pierre-Eric Pelloux-Prayer2019-08-291-0/+12
| | | | | | | | | | | | | This fixes a hang in shadertoy for radeonsi where a buffer was initialized with: value -= value with value being undefined. In this case LLVM replaces the operation with an assignment to NaN. Cc: 19.1 19.2 <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111241 Reviewed-by: Marek Olšák <[email protected]>
* nir/range-analysis: Add a lot more assertions about the contents of tablesIan Romanick2019-08-291-6/+128
| | | | | | | | | v2: Update several of the comments. Drop some redundant uses of ASSERT_UNION_OF_OTHERS_MATCHES_UNKNOWN_*_SOURCE source. Suggested by Caio. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Suggested-by: Caio Marcelo de Oliveira Filho <[email protected]>
* nir/range-analysis: Range tracking for fpowIan Romanick2019-08-291-0/+66
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | One shader from Metro Last Light and the rest from Rochard. In the Rochard cases, something like: min(1.0, max(pow(saturate(x), y), z)) was transformed to saturate(max(pow(saturate(x), y), z)) because the result of the pow must be >= 0. The Metro Last Light case was similar. An instance of min(pow(abs(x), y), 1.0) became saturate(pow(abs(x), y)) v2: Fix some comments. Suggested by Caio. v3: Fix setting is_integral when the exponent might be negative. See also Mesa MR !1778. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> All Intel platforms had similar results. (Ice Lake shown) total instructions in shared programs: 16280670 -> 16280659 (<.01%) instructions in affected programs: 1130 -> 1119 (-0.97%) helped: 11 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.72% max: 1.43% x̄: 1.03% x̃: 0.97% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -1.19% -0.86% Instructions are helped. total cycles in shared programs: 367168430 -> 367168270 (<.01%) cycles in affected programs: 10281 -> 10121 (-1.56%) helped: 10 HURT: 1 helped stats (abs) min: 16 max: 18 x̄: 17.00 x̃: 17 helped stats (rel) min: 1.31% max: 2.43% x̄: 1.79% x̃: 1.70% HURT stats (abs) min: 10 max: 10 x̄: 10.00 x̃: 10 HURT stats (rel) min: 3.10% max: 3.10% x̄: 3.10% x̃: 3.10% 95% mean confidence interval for cycles value: -20.06 -9.04 95% mean confidence interval for cycles %-change: -2.36% -0.32% Cycles are helped.
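The Rochard rewrite can be sanity-checked numerically. The sketch below uses Python floats and a hand-picked sample grid (an assumption, chosen so that `pow` never sees a zero base with a negative exponent). It only illustrates why a non-negative `pow` result makes `min(1.0, ...)` equivalent to `saturate(...)`.

```python
def saturate(v):
    """Clamp v to the range [0, 1]."""
    return min(max(v, 0.0), 1.0)


def original(x, y, z):
    return min(1.0, max(pow(saturate(x), y), z))


def transformed(x, y, z):
    return saturate(max(pow(saturate(x), y), z))


def all_samples_agree():
    # pow(saturate(x), y) >= 0, so max(..., z) >= 0 and the lower clamp
    # inside saturate() never changes the value.
    for x in (-1.0, 0.0, 0.25, 1.0, 3.0):
        for y in (0.5, 1.0, 2.0):
            for z in (-2.0, 0.3, 5.0):
                if original(x, y, z) != transformed(x, y, z):
                    return False
    return True


# Without the >= 0 guarantee the two forms differ for a negative operand.
negative_operand_differs = min(1.0, -0.5) != saturate(-0.5)
```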
* nir/range-analysis: Handle constants in nir_op_mov just like nir_op_bcselIan Romanick2019-08-291-2/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I discovered this while looking at a shader that was hurt by some other work I'm doing. When I examined the changes, I was confused that one instance of a comparison that was used in a discard_if was (incorrectly) eliminated, while another instance used by a bcsel was (correctly) not eliminated. I had to use NIR_PRINT=true to see exactly where things went wrong. A bunch of shaders in Goat Simulator, Dungeon Defenders, Sanctum 2, and Strike Suit Zero were impacted. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Fixes: 405de7ccb6c ("nir/range-analysis: Rudimentary value range analysis pass") All Intel platforms had similar results. (Ice Lake shown) total instructions in shared programs: 16280659 -> 16281075 (<.01%) instructions in affected programs: 21042 -> 21458 (1.98%) helped: 0 HURT: 136 HURT stats (abs) min: 1 max: 9 x̄: 3.06 x̃: 3 HURT stats (rel) min: 1.16% max: 6.12% x̄: 2.23% x̃: 2.03% 95% mean confidence interval for instructions value: 2.93 3.19 95% mean confidence interval for instructions %-change: 2.08% 2.37% Instructions are HURT. total cycles in shared programs: 367168270 -> 367170313 (<.01%) cycles in affected programs: 172020 -> 174063 (1.19%) helped: 14 HURT: 111 helped stats (abs) min: 2 max: 80 x̄: 21.21 x̃: 9 helped stats (rel) min: 0.10% max: 4.47% x̄: 1.35% x̃: 0.79% HURT stats (abs) min: 2 max: 584 x̄: 21.08 x̃: 5 HURT stats (rel) min: 0.12% max: 17.28% x̄: 1.55% x̃: 0.40% 95% mean confidence interval for cycles value: 5.41 27.28 95% mean confidence interval for cycles %-change: 0.64% 1.81% Cycles are HURT.
* nir/range-analysis: Fix incorrect fadd range result for (ne_zero, ne_zero)Ian Romanick2019-08-291-3/+8
| | | | | | | | | | | | | | | Found by inspection. I tried really, really hard to make a test case that would trigger this problem, but I was unsuccessful. It's very hard to get an instruction to produce a ne_zero result without ne_zero sources. The most plausible way is using bcsel. That proves problematic because bcsel interprets its sources as integers, so it cannot currently be used to "clean" values for floating point instructions. No shader-db changes on any Intel platform. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Fixes: 405de7ccb6c ("nir/range-analysis: Rudimentary value range analysis pass")
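Why a sum of two ne_zero values cannot itself be classified as ne_zero is easy to see with a counterexample. The snippet is only an illustration with Python floats; `is_ne_zero` is a hypothetical helper, not the pass's range representation.

```python
def is_ne_zero(value):
    """True when the value is known to differ from zero."""
    return value != 0.0


a = 1.5
b = -1.5
total = a + b  # both operands are nonzero, yet the sum is exactly zero
```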
* nir/range-analysis: Adjust result range of multiplication to account for flush-to-zeroIan Romanick2019-08-291-31/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixes piglit tests (new in piglit!110): - fs-underflow-fma-compare-zero.shader_test - fs-underflow-mul-compare-zero.shader_test v2: Add back part of comment accidentally deleted. Noticed by Caio. Remove is_not_zero function as it is no longer used. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111308 Fixes: fa116ce357b ("nir/range-analysis: Range tracking for ffma and flrp") Fixes: 405de7ccb6c ("nir/range-analysis: Rudimentary value range analysis pass") Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> All Gen7+ platforms** had similar results. (Ice Lake shown) total instructions in shared programs: 16278465 -> 16279492 (<.01%) instructions in affected programs: 16765 -> 17792 (6.13%) helped: 0 HURT: 23 HURT stats (abs) min: 7 max: 275 x̄: 44.65 x̃: 8 HURT stats (rel) min: 1.15% max: 17.51% x̄: 4.23% x̃: 1.62% 95% mean confidence interval for instructions value: 9.57 79.74 95% mean confidence interval for instructions %-change: 1.85% 6.61% Instructions are HURT. total cycles in shared programs: 367135159 -> 367154270 (<.01%) cycles in affected programs: 279306 -> 298417 (6.84%) helped: 0 HURT: 23 HURT stats (abs) min: 13 max: 6029 x̄: 830.91 x̃: 54 HURT stats (rel) min: 0.17% max: 45.67% x̄: 7.33% x̃: 0.49% 95% mean confidence interval for cycles value: 100.89 1560.94 95% mean confidence interval for cycles %-change: 0.94% 13.71% Cycles are HURT. total spills in shared programs: 8870 -> 8869 (-0.01%) spills in affected programs: 19 -> 18 (-5.26%) helped: 1 HURT: 0 total fills in shared programs: 21904 -> 21901 (-0.01%) fills in affected programs: 81 -> 78 (-3.70%) helped: 1 HURT: 0 LOST: 0 GAINED: 1 ** On Broadwell, a shader was hurt for spills / fills instead of helped. No changes on any earlier platforms.
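The underflow behind these tests can be reproduced on a CPU. The sketch below uses Python doubles, which is an assumption made for convenience: GPU float32 arithmetic with denormals flushed to zero reaches zero even sooner, but the conclusion is the same, namely that a product of two strictly positive values may compare equal to zero.

```python
import sys

tiny = sys.float_info.min   # smallest positive normal double, about 2.2e-308
product = tiny * tiny       # true result is far below the smallest subnormal

both_operands_positive = tiny > 0.0
product_is_zero = product == 0.0
```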
* nir/range-analysis: Adjust result range of exp2 to account for flush-to-zeroIan Romanick2019-08-291-2/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixes piglit tests (new in piglit!110): - fs-underflow-exp2-compare-zero.shader_test Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111308 Fixes: 405de7ccb6c ("nir/range-analysis: Rudimentary value range analysis pass") Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Most of the shaders affected are, unsurprisingly, in Unigine Heaven. All Gen6+ platforms had similar results. (Ice Lake shown) total instructions in shared programs: 16278207 -> 16278465 (<.01%) instructions in affected programs: 11374 -> 11632 (2.27%) helped: 0 HURT: 58 HURT stats (abs) min: 2 max: 13 x̄: 4.45 x̃: 4 HURT stats (rel) min: 0.54% max: 4.11% x̄: 2.42% x̃: 2.82% 95% mean confidence interval for instructions value: 3.77 5.13 95% mean confidence interval for instructions %-change: 2.19% 2.64% Instructions are HURT. total cycles in shared programs: 367134284 -> 367135159 (<.01%) cycles in affected programs: 81207 -> 82082 (1.08%) helped: 17 HURT: 36 helped stats (abs) min: 6 max: 356 x̄: 90.35 x̃: 6 helped stats (rel) min: 0.69% max: 21.45% x̄: 5.71% x̃: 0.78% HURT stats (abs) min: 4 max: 235 x̄: 66.97 x̃: 16 HURT stats (rel) min: 0.35% max: 27.58% x̄: 5.34% x̃: 1.09% 95% mean confidence interval for cycles value: -20.36 53.38 95% mean confidence interval for cycles %-change: -1.08% 4.67% Inconclusive result (value mean confidence interval includes 0). No changes on any earlier platforms.
* nir/algebraic: Clean up value range analysis-based optimizationsIan Romanick2019-08-291-8/+18
| | | | | | | | | Fix the a / b ordering in some compares. Delete duplicate patterns. Add a table explaining things. While I was cleaning this up, I managed to confuse myself. The table helped sort that out. Reviewed-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* nir/algebraic: Mark some value range analysis-based optimizations impreciseIan Romanick2019-08-291-9/+13
| | | | | | | | | | | | | | | | | | This didn't fix bug #111308, but it was found while trying to find the actual cause of that bug. Fixes piglit tests (new in piglit!110): - fs-fract-of-NaN.shader_test - fs-lt-nan-tautology.shader_test - fs-ge-nan-tautology.shader_test No shader-db changes on any Intel platform. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111308 Fixes: b77070e293c ("nir/algebraic: Use value range analysis to eliminate tautological compares") Reviewed-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
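The NaN cases named by the piglit tests come down to the fact that every ordered comparison with NaN is false, so a comparison that looks tautological for ordinary numbers is not one for NaN. A small illustration in Python floats (the helper name is hypothetical):

```python
nan = float("nan")


def lt_or_ge(a, b):
    """Looks like a tautology for ordinary numbers: a < b or a >= b."""
    return (a < b) or (a >= b)


ordinary = lt_or_ge(1.0, 2.0)   # True, as expected
with_nan = lt_or_ge(nan, 2.0)   # False: both comparisons fail for NaN
```

Folding such an expression to a constant `true` is therefore only safe when NaN operands need not be preserved.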
* nir/algrbraic: Don't optimize open-coded bitfield reverse when lowering is enabledIan Romanick2019-08-281-1/+1
| | | | | | | | | | | | | | | | | | | | This caused a problem on Sandybridge where an open-coded bitfieldReverse() function could be optimized to a nir_op_bitfield_reverse that would generate an unsupported BFREV instruction in the backend. This was encountered in some Unreal4 tech demos in shader-db. The bug was not previously noticed because we don't actually try to run those demos on Sandybridge. The fixes tag is a bit of a lie. The actual bug was introduced about 26,000 commits earlier in 371c4b3c48f ("nir: Recognize open-coded bitfield_reverse."). Without the NIR lowering pass, the flag needed to avoid the optimization does not exist. Hopefully nobody will care to fix this on an earlier Mesa release. Reviewed-by: Matt Turner <[email protected]> Fixes: 7afa26d4e39 ("nir: Add lowering for nir_op_bitfield_reverse.")
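For readers unfamiliar with the term, an "open-coded" bitfield reverse is a bit reversal written out with shifts and masks instead of a single instruction. The sketch below shows the classic swap-based form for 32-bit values. It is an assumption that this resembles what shaders contain; the exact pattern the NIR rule matches is not shown in this log.

```python
def bitfield_reverse_open_coded(x):
    """Reverse the bits of a 32-bit value using shift-and-mask swaps."""
    x &= 0xFFFFFFFF
    x = ((x >> 1) & 0x55555555) | ((x & 0x55555555) << 1)   # swap adjacent bits
    x = ((x >> 2) & 0x33333333) | ((x & 0x33333333) << 2)   # swap bit pairs
    x = ((x >> 4) & 0x0F0F0F0F) | ((x & 0x0F0F0F0F) << 4)   # swap nibbles
    x = ((x >> 8) & 0x00FF00FF) | ((x & 0x00FF00FF) << 8)   # swap bytes
    x = (x >> 16) | ((x << 16) & 0xFFFFFFFF)                # swap halves
    return x


def bitfield_reverse_reference(x):
    """Reference result: reverse the 32-character binary string."""
    return int(format(x & 0xFFFFFFFF, "032b")[::-1], 2)
```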
* compiler/glsl: Fix warning about unused functionCaio Marcelo de Oliveira Filho2019-08-231-1/+3
| | | | | | | | | | The helper check_node_type() is only used when DEBUG is set (in the function below), but ASSERTED macro uses NDEBUG. So just guard the helper with #ifdef. If we see more such cases we might consider an ASSERTED-like macro for the DEBUG case. Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Remove nir_const_load_to_arrAlyssa Rosenzweig2019-08-221-5/+0
| | | | | | | There are no remaining users in-tree. Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* nir: Add explicit signs to image min/max intrinsicsJason Ekstrand2019-08-219-30/+64
| | | | | | | | | | | This better matches all the other atomic intrinsics such as those for SSBOs and shared variables where the sign is part of the intrinsic opcode. Both generators (GLSL and SPIR-V) know the sign from the type of the image variable or handle. In SPIR-V, signed min/max are separate opcodes from unsigned. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* nir/loop_analyze: Treat do{}while(false) loops as 0 iterationsDanylo Piliaiev2019-08-211-0/+49
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Loops like: block block_0: vec1 32 ssa_2 = load_const (0x00000020) vec1 32 ssa_3 = load_const (0x00000001) loop { vec1 32 ssa_7 = phi block_0: ssa_3, block_4: ssa_9 vec1 1 ssa_8 = ige ssa_2, ssa_7 if ssa_8 { break } else { } vec1 32 ssa_9 = iadd ssa_7, ssa_1 } Were treated as having more than 1 iteration and after unrolling produced wrong results, however such loop will exit during the first iteration if not unrolled. So we check if loop will actually loop. Fixes tests/shaders/glsl-fs-loop-while-false-02.shader_test Signed-off-by: Danylo Piliaiev <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
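The loop in the listing can be modelled in a few lines. This is a sketch, not the analysis itself: `limit` plays the role of `ssa_2` (0x20), `start` the role of `ssa_3` (1), and `step` stands in for `ssa_1`, whose definition is not part of the excerpt. The `cap` parameter is a hypothetical guard against loops that never break.

```python
def count_full_iterations(limit, start, step, cap=64):
    """Count complete passes through a loop that breaks at the top
    when `limit >= i` (the `ige ssa_2, ssa_7` test in the listing)."""
    i = start
    done = 0
    while done < cap:
        if limit >= i:      # break taken before any of the body runs
            break
        i += step           # models `iadd ssa_7, ssa_1`
        done += 1
    return done


listing_case = count_full_iterations(limit=32, start=1, step=1)
counting_down = count_full_iterations(limit=32, start=40, step=-1)
```

For the values in the listing the break is taken immediately, so the loop completes zero iterations, while a loop counting down from 40 completes eight before the condition holds.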
* nir/loop_unroll: Prepare loop for unrolling in wrapper_unrollDanylo Piliaiev2019-08-211-25/+1
| | | | | | | | | Without loop_prepare_for_unroll loops are losing phis. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111411 Fixes: 5db98195 "nir: add loop unroll support for wrapper loops" Signed-off-by: Danylo Piliaiev <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* nir/loop_unroll: Update the comments for loop_prepare_for_unrollDanylo Piliaiev2019-08-211-2/+2
| | | | | | | | The comments say that we should remove continue if it is the last instruction in a loop however we remove any kind of jump. Signed-off-by: Danylo Piliaiev <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* nir/algebraic: some subtraction optimizationsDaniel Schürmann2019-08-211-0/+3
| | | | | | | | | | | | | | | | | | Changes with RADV/ACO: Totals from affected shaders: SGPRS: 444087 -> 455543 (2.58 %) VGPRS: 436468 -> 436768 (0.07 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 13448928 -> 13353520 (-0.71 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 68060 -> 67979 (-0.12 %) Wait states: 0 -> 0 (0.00 %) Reviewed-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Connor Abbott <[email protected]>
* mesa/compiler: rework tear down of builtin/typesLionel Landwerlin2019-08-217-67/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | The issue we're running into when running CTS is that glsl types are deleted while builtins depending on them are not. This happens because on one hand we have glsl types ref counted, but builtins are not. Instead builtins are destroyed when unloading libGL or explicitly calling glReleaseShaderCompiler(). This change removes almost entirely any dealing with glsl types ref/unref by letting the builtins deal with it instead. In turn we introduce a builtin ref count mechanism. Each GL context takes a reference on the builtins when compiling a shader for the first time. It releases the reference when the context is destroyed. It can also explicitly release those when glReleaseShaderCompiler() is called. Finally we also take a reference on the glsl types when loading libGL to avoid recreating glsl types too often. v2: Ensure we take a reference if we don't have one in link step (Lionel) Signed-off-by: Lionel Landwerlin <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110796 Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
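The reference-counting scheme described above can be pictured with a toy model. All names here are hypothetical and the real implementation is C; the sketch only shows the lifetime rule, in which the builtins exist as long as at least one context holds a reference.

```python
class BuiltinRegistry:
    """Toy stand-in for the shared builtins and the types they depend on."""

    def __init__(self):
        self.users = 0
        self.initialized = False

    def ref(self):
        # The first user triggers creation of the builtins.
        if self.users == 0:
            self.initialized = True
        self.users += 1

    def unref(self):
        # The last user going away tears the builtins down.
        assert self.users > 0, "unref without a matching ref"
        self.users -= 1
        if self.users == 0:
            self.initialized = False


registry = BuiltinRegistry()
registry.ref()    # context A compiles its first shader
registry.ref()    # context B compiles its first shader
registry.unref()  # context A is destroyed; B still needs the builtins
still_alive_with_one_user = registry.initialized
registry.unref()  # context B is destroyed
torn_down_after_last_user = not registry.initialized
```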
* compiler: ensure glsl types are not created without a referenceLionel Landwerlin2019-08-211-1/+6
| | | | | | | | | We want to detect invalid refcounting so assert we have at least one use before creating types. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
* nir/tests: take reference on glsl typesLionel Landwerlin2019-08-214-1/+16
| | | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
* glsl/tests: take refs on glsl typesLionel Landwerlin2019-08-219-18/+64
| | | | | | | | | Much like each driver, tests as standalone entities must take references on the glsl types. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
* nir: add divergence analysis pass.Daniel Schürmann2019-08-203-0/+799
| | | | | | | | | | This pass expects the shader to be in LCSSA form. The algorithm is based on 'The Simple Divergence Analysis' from Diogo Sampaio, Rafael De Souza, Sylvain Collange, Fernando Magno Quintão Pereira. Divergence Analysis. ACM Transactions on Programming Languages and Systems (TOPLAS) Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Connor Abbott <[email protected]>
* nir/subgroups: Lower clustered reductions with cluster_size >= subgroup_size into reductionsRhys Perry2019-08-201-1/+12
| | | | | | | | The behavior for reductions with cluster_size >= subgroup_size is implementation defined. Reviewed-by: Jason Ekstrand <[email protected]>
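One way to picture the lowering: when the requested cluster covers the whole subgroup (or more), treat the operation as a plain subgroup-wide reduction. The model below is a sketch with hypothetical names. It represents a subgroup as a Python list and assumes the cluster size divides the subgroup size whenever it is smaller.

```python
import operator


def clustered_reduce(values, cluster_size, op):
    """Reduce each cluster of invocations and broadcast the result
    to every invocation in that cluster."""
    subgroup_size = len(values)
    if cluster_size >= subgroup_size:
        # Oversized cluster: fall back to a full-subgroup reduction.
        cluster_size = subgroup_size
    out = []
    for base in range(0, subgroup_size, cluster_size):
        acc = values[base]
        for value in values[base + 1:base + cluster_size]:
            acc = op(acc, value)
        out.extend([acc] * cluster_size)
    return out


pairs = clustered_reduce([1, 2, 3, 4], 2, operator.add)
oversized = clustered_reduce([1, 2, 3, 4], 8, operator.add)
```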
* nir/lcssa: allow to create LCSSA phis for loop-invariant booleansRhys Perry2019-08-202-3/+7
| | | | | | | ACO depends on LCSSA phis for divergent booleans to work correctly. Reviewed-by: Connor Abbott <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir/lcssa: Skip loop invariant variables when converting to LCSSA.Daniel Schürmann2019-08-202-14/+162
| | | | | | Co-authored-by: Rhys Perry <[email protected]> Reviewed-by: Connor Abbott <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir: make nir_to_lcssa() a general NIR pass.Rhys Perry2019-08-202-3/+42
| | | | | Reviewed-by: Connor Abbott <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>