aboutsummaryrefslogtreecommitdiffstats
path: root/src/compiler
Commit message (Collapse)AuthorAgeFilesLines
* nir: Add a bool to int32 lowering passJason Ekstrand2018-12-164-0/+163
| | | | | | | | We also enable it in all of the NIR drivers. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Tested-by: Bas Nieuwenhuizen <[email protected]>
* nir: Add 1-bit Boolean opcodesJason Ekstrand2018-12-163-3/+34
| | | | | | | | | We also have to add support for 1-bit integers while we're here so we get 1-bit variants of iand, ior, and inot. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Tested-by: Bas Nieuwenhuizen <[email protected]>
* nir/algebraic: Generalize an optimizationJason Ekstrand2018-12-161-2/+9
| | | | | | | | This just makes it nicely scale across bit sizes. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Tested-by: Bas Nieuwenhuizen <[email protected]>
* nir/large_constants: Properly handle 1-bit boolsJason Ekstrand2018-12-162-2/+31
| | | | | | Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Tested-by: Bas Nieuwenhuizen <[email protected]>
* nir: Add support for 1-bit data typesJason Ekstrand2018-12-1611-24/+92
| | | | | | | | | | | | | | This commit adds support for 1-bit Booleans and integers. Booleans obviously take a value of true or false. Because we have to define the semantics of 1-bit signed and unsigned integers, we define uint1_t to take values of 0 and 1 and int1_t to take values of 0 and -1. 1-bit arithmetic is then well-defined in the usual way, just with fewer bits. The definition of int1_t and uint1_t doesn't usually matter but we do need something for purposes of constant folding. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Tested-by: Bas Nieuwenhuizen <[email protected]>
* nir/constant_expressions: Rework Boolean handlingJason Ekstrand2018-12-161-20/+12
| | | | | | | | | | | | | | This commit contains three related changes. First, we define boolN_t for N = 8, 16, and 64 and move the definition of boolN_vec to the loop with the other vec definitions. Second, there's no reason why we need the != 0 on the source because that happens implicitly when it's converted to bool. Third, for destinations, we use a signed integer type and just do -(int)bool_val which will give us the 0/-1 behavior we want and neatly scales to all bit widths. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Tested-by: Bas Nieuwenhuizen <[email protected]>
* nir: Rename Boolean-related opcodes to include 32 in the nameJason Ekstrand2018-12-1610-80/+147
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a squash of a bunch of individual changes: nir/builder: Generate 32-bit bool opcodes transparently nir/algebraic: Remap Boolean opcodes to the 32-bit variant Use 32-bit opcodes in the NIR producers and optimizations Generated with a little hand-editing and the following sed commands: sed -i 's/nir_op_ball_fequal/nir_op_b32all_fequal/g' **/*.c sed -i 's/nir_op_bany_fnequal/nir_op_b32any_fnequal/g' **/*.c sed -i 's/nir_op_ball_iequal/nir_op_b32all_iequal/g' **/*.c sed -i 's/nir_op_bany_inequal/nir_op_b32any_inequal/g' **/*.c sed -i 's/nir_op_\([fiu]lt\)/nir_op_\132/g' **/*.c sed -i 's/nir_op_\([fiu]ge\)/nir_op_\132/g' **/*.c sed -i 's/nir_op_\([fiu]ne\)/nir_op_\132/g' **/*.c sed -i 's/nir_op_\([fiu]eq\)/nir_op_\132/g' **/*.c sed -i 's/nir_op_\([fi]\)ne32g/nir_op_\1neg/g' **/*.c sed -i 's/nir_op_bcsel/nir_op_b32csel/g' **/*.c Use 32-bit opcodes in the NIR back-ends Generated with a little hand-editing and the following sed commands: sed -i 's/nir_op_ball_fequal/nir_op_b32all_fequal/g' **/*.c sed -i 's/nir_op_bany_fnequal/nir_op_b32any_fnequal/g' **/*.c sed -i 's/nir_op_ball_iequal/nir_op_b32all_iequal/g' **/*.c sed -i 's/nir_op_bany_inequal/nir_op_b32any_inequal/g' **/*.c sed -i 's/nir_op_\([fiu]lt\)/nir_op_\132/g' **/*.c sed -i 's/nir_op_\([fiu]ge\)/nir_op_\132/g' **/*.c sed -i 's/nir_op_\([fiu]ne\)/nir_op_\132/g' **/*.c sed -i 's/nir_op_\([fiu]eq\)/nir_op_\132/g' **/*.c sed -i 's/nir_op_\([fi]\)ne32g/nir_op_\1neg/g' **/*.c sed -i 's/nir_op_bcsel/nir_op_b32csel/g' **/*.c Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Tested-by: Bas Nieuwenhuizen <[email protected]>
* nir/algebraic: Make an optimization more specificJason Ekstrand2018-12-161-1/+1
| | | | | | | | Later in this series, bool is not going to imply 32-bit. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Tested-by: Bas Nieuwenhuizen <[email protected]>
* nir: Drop support for lower_b2fJason Ekstrand2018-12-162-7/+1
| | | | | | | | | | This was originally added for the out-of-tree Mali driver but I think we've all agreed it's easy enough for them to just do in their back-end. Reviewed-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Tested-by: Bas Nieuwenhuizen <[email protected]>
* nir/algebraic: Optimize x2b(xneg(a)) -> aJason Ekstrand2018-12-161-0/+2
| | | | | | | | | | | | | | | Shader-db results on Kaby Lake: total instructions in shared programs: 15072525 -> 15072525 (0.00%) instructions in affected programs: 0 -> 0 helped: 0 HURT: 0 This helps prevent regressions in later commits. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Tested-by: Bas Nieuwenhuizen <[email protected]>
* nir/constant_folding: Fix source bit size logicJason Ekstrand2018-12-161-1/+2
| | | | | | | | | | | | | Instead of looking at input_sizes[i] which contains the number of components for each source, we look at the bit size of input_types[i]. This fixes a regression in the 1-bit boolean series though I have no idea how we haven't seen it before now. Fixes: 35baee5dce5 "nir/constant_folding: fix incorrect bit-size check" Fixes: 9076c4e289d "nir: update opcode definitions for different bit sizes" Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Tested-by: Bas Nieuwenhuizen <[email protected]>
* nir/lower_idiv: Use ilt instead of bit twiddlingJason Ekstrand2018-12-161-1/+1
| | | | | | | | | The previous code was creating a boolean by doing an arithmetic right- shift by 31 which produces a boolean which is true if the argument is negative. This is the same as the expression r < 0 which is much simpler and doesn't depend on NIR's representation of booleans. Reviewed-by: Eric Anholt <[email protected]>
* nir: fix constness in nir_intrinsic_align()Rhys Perry2018-12-161-1/+1
| | | | | Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* nir/phi_builder: Internal users should use ↵Ian Romanick2018-12-141-2/+2
| | | | | | | nir_phi_builder_value_set_block_def too Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir: fix opt_if_loop_last_continue()Timothy Arceri2018-12-141-2/+6
| | | | | | | | | | | | | | | | | | | | | | | The pass did not correctly handle loops ending in: if ssa_7 { block block_8: /* preds: block_7 */ continue /* succs: block_1 */ } else { block block_9: /* preds: block_7 */ break /* succs: block_11 */ } The break will get eliminated by another opt but if this pass gets called first (as it does on RADV) we ended up inserting instructions after the break. Fixes: 5921a19d4b0c ("nir: add if opt opt_if_loop_last_continue()") Reviewed-by: Dave Airlie <[email protected]>
* nir: Move intel's half-float image store lowering to to nir_format.h.Eric Anholt2018-12-131-0/+13
| | | | | | | | I needed the same function for v3d. This was originally in d3e046e76c06 ("nir: Pull some of intel's image load/store format conversion to nir_format.h") before we made am istake about simplifying the function. Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Print the format of image variables.Eric Anholt2018-12-131-0/+47
| | | | | | | | This helps a lot when debugging image load/store lowering on large testcases. Unfortunately the Mesa enum name stuff is under src/mesa and we can't get at it from the compiler. Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Add a pass for lowering integer division by constantsJason Ekstrand2018-12-134-0/+219
| | | | | | | | | | | It's a reasonably well-known fact in the world of compilers that integer divisions by constants can be replaced by a multiply, an add, and some shifts. This commit adds such an optimization to NIR for easiest case of udiv. Other division operations will be added in following commits. In order to provide some additional driver control, the pass takes a minimum bit size to optimize. Reviewed-by: Ian Romanick [email protected]
* nir: Add a saturated unsigned integer add opcodeIan Romanick2018-12-131-0/+2
| | | | Reviewed-by: Jason Ekstrand <[email protected]>
* nir/lower_int64: Add support for [iu]mul_highJason Ekstrand2018-12-132-0/+67
| | | | Reviewed-by: Ian Romanick [email protected]
* nir: Allow [iu]mul_high on non-32-bit typesJason Ekstrand2018-12-132-4/+40
| | | | Reviewed-by: Ian Romanick [email protected]
* nir: remove unused variableAlejandro Piñeiro2018-12-131-1/+0
| | | | | | | To avoid the following warning: ./src/compiler/nir/nir_loop_analyze.c:807:16: warning: unused variable ‘ns’ [-Wunused-variable] nir_shader *ns = impl->function->shader; Reviewed-by: Lionel Landwerlin <[email protected]>
* nir: Pull some of intel's image load/store format conversion to nir_format.hEric Anholt2018-12-121-0/+38
| | | | | | | | | I needed the same functions for v3d. Note that the color value in the Intel lowering has already been cut down to image.chans num_components. v2: Drop the half float one, since it was a 1-liner after cleanup. Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Add some more consts to the nir_format_convert.h helpers.Eric Anholt2018-12-121-7/+6
| | | | | | | Most of the bits were constant, but a few were missed. Avoids warnings from v3d's upcoming static const bits declarations. Reviewed-by: Jason Ekstrand <[email protected]>
* nir: detect more induction variablesTimothy Arceri2018-12-131-0/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This allows loop analysis to detect inductions variables that are incremented in both branches of an if rather than in a main loop block. For example: loop { block block_1: /* preds: block_0 block_7 */ vec1 32 ssa_8 = phi block_0: ssa_4, block_7: ssa_20 vec1 32 ssa_9 = phi block_0: ssa_0, block_7: ssa_4 vec1 32 ssa_10 = phi block_0: ssa_1, block_7: ssa_4 vec1 32 ssa_11 = phi block_0: ssa_2, block_7: ssa_21 vec1 32 ssa_12 = phi block_0: ssa_3, block_7: ssa_22 vec4 32 ssa_13 = vec4 ssa_12, ssa_11, ssa_10, ssa_9 vec1 32 ssa_14 = ige ssa_8, ssa_5 /* succs: block_2 block_3 */ if ssa_14 { block block_2: /* preds: block_1 */ break /* succs: block_8 */ } else { block block_3: /* preds: block_1 */ /* succs: block_4 */ } block block_4: /* preds: block_3 */ vec1 32 ssa_15 = ilt ssa_6, ssa_8 /* succs: block_5 block_6 */ if ssa_15 { block block_5: /* preds: block_4 */ vec1 32 ssa_16 = iadd ssa_8, ssa_7 vec1 32 ssa_17 = load_const (0x3f800000 /* 1.000000*/) /* succs: block_7 */ } else { block block_6: /* preds: block_4 */ vec1 32 ssa_18 = iadd ssa_8, ssa_7 vec1 32 ssa_19 = load_const (0x3f800000 /* 1.000000*/) /* succs: block_7 */ } block block_7: /* preds: block_5 block_6 */ vec1 32 ssa_20 = phi block_5: ssa_16, block_6: ssa_18 vec1 32 ssa_21 = phi block_5: ssa_17, block_6: ssa_4 vec1 32 ssa_22 = phi block_5: ssa_4, block_6: ssa_19 /* succs: block_1 */ } Unfortunatly GCM could move the addition out of the if for us (making this patch unrequired) but we still cannot enable the GCM pass without regressions. This unrolls a loop in Rise of The Tomb Raider. vkpipeline-db results (VEGA): Totals from affected shaders: SGPRS: 88 -> 96 (9.09 %) VGPRS: 56 -> 52 (-7.14 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 2168 -> 4560 (110.33 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 4 -> 4 (0.00 %) Wait states: 0 -> 0 (0.00 %) Reviewed-by: Thomas Helland <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=32211
* nir: reword code commentTimothy Arceri2018-12-131-2/+2
| | | | Reviewed-by: Thomas Helland <[email protected]>
* nir: in loop analysis track actual control flow typeTimothy Arceri2018-12-131-13/+21
| | | | | | | This will allow us to improve analysis to find more induction variables. Reviewed-by: Thomas Helland <[email protected]>
* nir: add if opt opt_if_loop_last_continue()Danylo Piliaiev2018-12-131-0/+95
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Removing the last continue can allow more loops to unroll. Also inserting code into the if branch can allow the various if opts to progress further. The insertion of some loops into the if branch also reduces VGPR use in some shaders. vkpipeline-db results (VEGA): Totals from affected shaders: SGPRS: 6552 -> 6576 (0.37 %) VGPRS: 6544 -> 6532 (-0.18 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 481952 -> 478032 (-0.81 %) bytes LDS: 13 -> 13 (0.00 %) blocks Max Waves: 241 -> 242 (0.41 %) Wait states: 0 -> 0 (0.00 %) Shader-db results radeonsi (VEGA): Totals from affected shaders: SGPRS: 168 -> 168 (0.00 %) VGPRS: 144 -> 140 (-2.78 %) Spilled SGPRs: 157 -> 157 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 8524 -> 8488 (-0.42 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 7 -> 7 (0.00 %) Wait states: 0 -> 0 (0.00 %) v2: (Timothy Arceri): - allow for continues in either branch - move any trailing loops inside the if as well as blocks. - leave nir_opt_trivial_continues() to actually remove the continue. Reviewed-by: Thomas Helland <[email protected]> Signed-off-by: Timothy Arceri <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=32211
* nir: rework force_unroll_array_access()Timothy Arceri2018-12-131-14/+35
| | | | | | | Here we rework force_unroll_array_access() so that we can reuse the induction variable detection in a following patch. Reviewed-by: Thomas Helland <[email protected]>
* nir: factor out some of the complex loop unroll code to a helperTimothy Arceri2018-12-131-51/+64
| | | | | Acked-by: Jason Ekstrand <[email protected]> Reviewed-by: Thomas Helland <[email protected]>
* nir: Document the function inlining processJason Ekstrand2018-12-121-0/+68
| | | | | | | | | This has thrown a few people off recently and it's good to have the process and all the rational for it documented somewhere. A comment at the top of nir_inline_functions seems as good a place as any. Acked-by: Karol Herbst <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* nir/lower_tex: Add lowering for some min_lod casesJason Ekstrand2018-12-112-0/+116
| | | | Reviewed-by: Ian Romanick <[email protected]>
* nir/lower_tex: Modify txd instructions instead of replacing themJason Ekstrand2018-12-111-41/+7
| | | | | | | | | I don't know if one is better than the other or not but this approach has the advantage that we never forget to copy information over and we're not hard-coding quite as many assumptions. It's also a lot simpler and much less code. Reviewed-by: Ian Romanick <[email protected]>
* nir/lower_tex: Simplify lower_gradient logicJason Ekstrand2018-12-111-12/+9
| | | | | | | | Instead of having to call two different lower_gradient functions based on whether or not it's a cube, just make lower_gradient handle cubes. This significantly simplifies some of the logic. Reviewed-by: Ian Romanick <[email protected]>
* spirv: Add support for MinLodJason Ekstrand2018-12-114-1/+16
| | | | Reviewed-by: Ian Romanick <[email protected]>
* nir: fix spelling typoRob Clark2018-12-111-1/+1
| | | | | | Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* meson: Add nir_algebraic_parser_test to suitesDylan Baker2018-12-101-0/+1
| | | | | | | | Just to make it easier to run a nir tests together. Fixes: a0ae12ca91a45f81897e774019cde9bd081f03a0 ("nir/algebraic: Add unit tests for bitsize validation") Reviewed-by: Eric Engestrom <[email protected]>
* nir: make use of new nir_cf_list_clone_and_reinsert() helperTimothy Arceri2018-12-101-48/+28
| | | | | Reviewed-by: Thomas Helland <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir: add a new nir_cf_list_clone_and_reinsert() helperTimothy Arceri2018-12-101-0/+10
| | | | | Reviewed-by: Thomas Helland <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir: clarify some nit_loop_info member namesTimothy Arceri2018-12-103-17/+19
| | | | | | | | | Following commits will introduce additional fields such as guessed_trip_count. Renaming these will help avoid confusion as our unrolling feature set grows. Reviewed-by: Thomas Helland <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir: small tidy ups for nir_loop_analyze()Timothy Arceri2018-12-101-21/+10
| | | | | Reviewed-by: Thomas Helland <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Fixup algebraic test for variable-sized conversionsConnor Abbott2018-12-071-1/+1
| | | | | | | | | | b2i can now take any size boolean in preparation for 1-bit booleans, so the error message printed is slightly different. Fixes: dca6cd9ce65 ("nir: Make boolean conversions sized just like the others") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108961 Cc: Jason Ekstrand <[email protected]> Acked-by: Jason Ekstrand <[email protected]>
* nir/algebraic: Make algebraic_parser_test.sh executable.Vinson Lee2018-12-061-0/+0
| | | | | | | | | | Fixes make check permission error. ../../bin/test-driver: line 107: ./nir/tests/algebraic_parser_test.sh: Permission denied FAIL nir/tests/algebraic_parser_test.sh (exit status: 126) Fixes: a0ae12ca91a4 ("nir/algebraic: Add unit tests for bitsize validation") Signed-off-by: Vinson Lee <[email protected]>
* nir: Make boolean conversions sized just like the othersJason Ekstrand2018-12-0513-45/+79
| | | | | | | | | Instead of a single i2b and b2i, we now have i2b32 and b2iN where N is one if 8, 16, 32, or 64. This leads to having a few more opcodes but now everything is consistent and booleans aren't a weird special case anymore. Reviewed-by: Connor Abbott <[email protected]>
* nir/opt_algebraic: Add 32-bit specifiers to a bunch of booleansJason Ekstrand2018-12-051-56/+56
| | | | Reviewed-by: Connor Abbott <[email protected]>
* nir/opt_algebraic: Drop bit-size suffixes from conversionsJason Ekstrand2018-12-051-29/+29
| | | | | | | | Suffixes are dropped from a bunch of conversion opcodes when it makes sense to do so. Others are kept if we really do want the bit-size restriction. Reviewed-by: Connor Abbott <[email protected]>
* nir/opt_algebraic: Simplify an optimization using the new search opsJason Ekstrand2018-12-051-7/+2
| | | | Reviewed-by: Connor Abbott <[email protected]>
* nir/algebraic: Add support for unsized conversion opcodesJason Ekstrand2018-12-053-10/+140
| | | | | | | | | | | | All conversion opcodes require a destination size but this makes constructing certain algebraic expressions rather cumbersome. This commit adds support to nir_search and nir_algebraic for writing conversion opcodes without a size. These meta-opcodes match any conversion of that type regardless of destination size and the size gets inferred from the sizes of the things being matched or from other opcodes in the expression. Reviewed-by: Connor Abbott <[email protected]>
* nir/algebraic: Refactor codegen a bitJason Ekstrand2018-12-051-10/+11
| | | | | | | Instead of using an OrderedDict, just have a (necessarily sorted) array of transforms and a set of opcodes. Reviewed-by: Connor Abbott <[email protected]>
* nir/algebraic: Clean up some __str__ cruftJason Ekstrand2018-12-051-4/+0
| | | | | | | Both of these things are already handled in the Value base class so we don't need to handle them explicitly in Constant. Reviewed-by: Connor Abbott <[email protected]>