path: root/src/compiler/nir
Commit message (Author, Age, Files, Lines)
* nir: consistently use ifndef guards over pragma once (Emil Velikov, 2017-03-22, 10 files, -11/+38)
| | | | | | | Signed-off-by: Emil Velikov <[email protected]> Acked-by: Vedran Miletić <[email protected]> Acked-by: Juha-Pekka Heikkila <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* nir: Add positional argument specifiers. (Vinson Lee, 2017-03-21, 2 files, -2/+2)
| Fix build with Python < 2.7.
|
|   File "src/compiler/nir/nir_builder_opcodes_h.py", line 46, in <module>
|     from nir_opcodes import opcodes
|   File "src/compiler/nir/nir_opcodes.py", line 178, in <module>
|     unop_convert("{}2{}{}".format(src_t[0], dst_t[0], bit_size),
| ValueError: zero length field name in format
|
| Fixes: 762a6333f21f ("nir: Rework conversion opcodes")
| Signed-off-by: Vinson Lee <[email protected]>
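The error above comes from auto-numbered `{}` fields, which `str.format` only accepts from Python 2.7 (and 3.1) onward; explicit positional indices also work on older interpreters. A minimal sketch, where `conversion_name` is a hypothetical helper and not code taken from nir_opcodes.py:

```python
def conversion_name(src_type, dst_type, bit_size):
    # Explicit indices ({0}, {1}, {2}) instead of bare {} fields, so the
    # format string is also valid on interpreters older than Python 2.7.
    return "{0}2{1}{2}".format(src_type, dst_type, bit_size)


print(conversion_name("f", "i", 32))  # prints "f2i32"
```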
* nir/constant_expressions: Refactor helper functions (Jason Ekstrand, 2017-03-14, 1 file, -24/+27)
| | | | | | | | Apart from avoiding some unneeded size cases, this shouldn't have any actual functional impact. Reviewed-by: Dylan Baker <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* nir: Rework conversion opcodes (Jason Ekstrand, 2017-03-14, 8 files, -189/+121)
| The NIR story on conversion opcodes is a mess. We've had way too many of
| them, naming is inconsistent, and which ones have explicit sizes was
| sort-of random. This commit re-organizes things and makes them all
| consistent:
|
|  - All non-bool conversion opcodes now have the explicit size in the
|    destination and are named <src_type>2<dst_type><size>.
|
|  - Integer <-> integer conversion opcodes now only come in i2i and u2u
|    forms (i2u and u2i have been removed) since the only difference
|    between the different integer conversions is whether or not they
|    sign-extend when up-converting.
|
|  - Boolean conversion opcodes all have the explicit size on the bool and
|    are named <src_type>2<dst_type>.
|
| Making things consistent also allows nir_type_conversion_op to be moved
| to nir_opcodes.c and auto-generated using mako. This will make adding
| int8, int16, and float16 versions much easier when the time comes.
|
| Reviewed-by: Eric Anholt <[email protected]>
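The naming rule for non-bool conversions can be sketched as a small generator. This is an illustration only: the type letters `f`, `i`, `u` and the bit sizes 32 and 64 are assumptions, `conversion_opcode_names` is a hypothetical name, and boolean conversions are left out.

```python
def conversion_opcode_names(types=("f", "i", "u"), bit_sizes=(32, 64)):
    """List names of the form <src_type>2<dst_type><size>."""
    names = []
    for src in types:
        for dst in types:
            # i2u and u2i are skipped: integer <-> integer conversions
            # only come in the same-signedness i2i and u2u forms.
            if {src, dst} == {"i", "u"}:
                continue
            for size in bit_sizes:
                names.append("{0}2{1}{2}".format(src, dst, size))
    return names


names = conversion_opcode_names()
print(names)
```

With three types and two sizes this yields 14 names, for example `f2i32`, `i2i64`, and `u2f32`, and never `i2u32` or `u2i64`.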
* nir: Rewrite nir_type_conversion_op (Jason Ekstrand, 2017-03-14, 1 file, -63/+92)
| | | | | | | | | The original version was very convoluted and tried way too hard to not just have the nested switch statement that it needs. Let's just write the obvious code and then we know it's correct. This fixes a bunch of missing cases particularly with int64. Reviewed-by: Plamena Manolova <[email protected]>
* nir: Add a get_nir_type_for_glsl_base_type helper (Jason Ekstrand, 2017-03-14, 1 file, -2/+8)
| | | | Reviewed-by: Eric Anholt <[email protected]>
* nir/validate: Rework ALU bit-size rule validation (Jason Ekstrand, 2017-03-14, 1 file, -32/+33)
| | | | | | | | | | | The original bit-size validation wasn't capable of properly dealing with instructions with variable bit sizes. An attempt was made to handle it by looking at source and destinations but, because the validation was done in validate_alu_(src|dest), it didn't really have the needed information. The new validation code is much more straightforward and should be more correct. Reviewed-by: Eric Anholt <[email protected]>
* nir/validate: Validate that bit sizes and components always match (Jason Ekstrand, 2017-03-14, 1 file, -38/+63)
| We've always required bit sizes to match but the rules for number of
| components have been a bit loose. You've never been allowed to source
| from something with fewer components than you consume, but more has
| always been fine. This changes the validator to require that they match
| exactly. The fact that they don't always match has been a source of
| confusion in NIR for quite some time and it's time we got rid of it.
|
| Reviewed-by: Eric Anholt <[email protected]>
| Reviewed-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Connor Abbott <[email protected]>
* nir: Make image_size a variable-width intrinsic (Jason Ekstrand, 2017-03-14, 1 file, -1/+1)
| | | | | | Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Connor Abbott <[email protected]>
* nir/lower_tex: Use tex_instr_dest_size for txs destinations (Jason Ekstrand, 2017-03-14, 1 file, -1/+2)
| | | | | | | | | Using coord_components of the source texture is correct for everything except cube maps where it's off by one. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Connor Abbott <[email protected]>
* nir/copy_prop: Respect the source's number of components (Jason Ekstrand, 2017-03-14, 1 file, -33/+96)
| In the near future we are going to require that the num_components in a
| src dereference match the num_components of the SSA value being
| dereferenced. To do that, we need copy_prop to not remove our MOVs from
| a larger SSA value into an instruction that uses fewer channels.
|
| Because we suddenly have to know how many components each source has,
| this makes the pass a bit more complicated. Fortunately, copy
| propagation is the only pass that cares about how many components are
| read by any given source so it's fairly contained.
|
| Shader-db results on Sky Lake:
|
|   total instructions in shared programs: 13318947 -> 13320265 (0.01%)
|   instructions in affected programs: 260633 -> 261951 (0.51%)
|   helped: 324
|   HURT: 1027
|
| Looking through the hurt programs, about a dozen are hurt by 3
| instructions and the rest are all hurt by 2 instructions. From a
| spot-check of the shaders, the story is always the same: They get a vec4
| from somewhere (frequently an input) and use the first two or three
| components as a texture coordinate. Because of the vector component
| mismatch, we have a mov or, more likely, a vecN sitting between the
| texture instruction and the input. This means that the back-end inserts
| a bunch of MOVs and split_virtual_grfs() goes to town. Because the
| texture coordinate is also used by some other calculation, register
| coalesce can't combine them back together and we end up with an extra 2
| MOV instructions in our shader.
|
| Reviewed-by: Eric Anholt <[email protected]>
| Reviewed-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Connor Abbott <[email protected]>
* nir/intrinsics: Make load_barycentric_input take a 2-component coor (Jason Ekstrand, 2017-03-14, 1 file, -1/+3)
| | | | | | Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Connor Abbott <[email protected]> Cc: "17.0 13.0" <[email protected]>
* nir: remove shebang from python scripts (Emil Velikov, 2017-03-10, 7 files, -7/+0)
| | | | | | | Analogous to earlier commit(s). Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* nir/int64: Properly handle imod/irem (Jason Ekstrand, 2017-03-03, 1 file, -3/+21)
| | | | | | | | | | | The previous implementation was fine for GLSL which doesn't really have a signed modulus/remainder. They just leave the behavior undefined whenever either source is negative. However, in SPIR-V, there is a defined behavior for negative arguments. This commit beefs up the pass so that it handles both correctly. Tested using a hacked up version of the Vulkan CTS test to get 64-bit support. Reviewed-by: Matt Turner <[email protected]>
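The two signed operations differ only when the operands have mixed signs. In SPIR-V, the sign of a non-zero OpSMod result follows the second operand and the sign of a non-zero OpSRem result follows the first. The sketch below assumes `imod` and `irem` map to those two conventions respectively; it uses Python's unbounded integers, so it shows the sign rules only, not the 64-bit lowering itself, and division by zero simply raises.

```python
def imod(a, b):
    # Python's % is a floored modulus: a non-zero result has the sign of b.
    return a % b


def irem(a, b):
    # Truncated remainder: a non-zero result has the sign of a.
    r = abs(a) % abs(b)
    return -r if a < 0 else r


for a, b in [(7, 3), (-7, 3), (7, -3), (-7, -3)]:
    print(a, b, imod(a, b), irem(a, b))
```

For `(-7, 3)` the modulus is 2 while the remainder is -1; for `(7, -3)` the modulus is -2 while the remainder is 1.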
* nir/builder: Add an int64 immediate helper (Jason Ekstrand, 2017-03-03, 1 file, -0/+11)
| | | | Reviewed-by: Matt Turner <[email protected]>
* i965: Do int64 lowering in NIR (Jason Ekstrand, 2017-03-01, 1 file, -54/+53)
| | | | Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* nir: Add a simple int64 lowering pass (Jason Ekstrand, 2017-03-01, 2 files, -0/+288)
| The algorithms used by this pass, especially for division, are heavily
| based on the work Ian Romanick did for the similar int64 lowering pass
| in the GLSL compiler.
|
| v2: Properly handle vectors
|
| v3: Get rid of log2_denom stuff. Since we're using bcsel, we do all the
|     calculations anyway and this is just extra instructions.
|
| v4:
|  - Add back in the log2_denom stuff since it's needed for ensuring that
|    the shifts don't overflow.
|  - Rework the looping part of the pass to be easier to expand.
|
| Reviewed-by: Matt Turner <[email protected]>
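The commit message does not spell out the algorithms, so the following only illustrates the general idea of int64 lowering: expressing one 64-bit operation through 32-bit halves. Multiplication is used as the example because it is short; that choice, and the names `split64` and `imul64_lowered`, are assumptions rather than a description of the pass.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def split64(x):
    """Return the (low, high) 32-bit halves of a 64-bit value."""
    return x & MASK32, (x >> 32) & MASK32


def imul64_lowered(x, y):
    """64-bit wrapping multiply built from 32-bit pieces."""
    x_lo, x_hi = split64(x)
    y_lo, y_hi = split64(y)
    full_lo = x_lo * y_lo            # 32x32 -> 64-bit product of the low halves
    lo = full_lo & MASK32            # low 32 bits of the result
    carry = full_lo >> 32            # what a 32-bit "multiply high" would produce
    # The cross terms only affect the high half; x_hi * y_hi overflows away.
    hi = (carry + x_lo * y_hi + x_hi * y_lo) & MASK32
    return (hi << 32) | lo


x, y = 0x123456789ABCDEF0, 0x0FEDCBA987654321
print(imul64_lowered(x, y) == (x * y) & MASK64)  # prints "True"
```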
* nir/lower_indirect: Use nir_builder control-flow helpers (Jason Ekstrand, 2017-03-01, 1 file, -30/+5)
| | | | Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* nir/lower_gs_intrinsics: Use nir_builder control-flow helpers (Jason Ekstrand, 2017-03-01, 1 file, -6/+3)
| | | | Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* nir/builder: Add support for easily building control-flow (Jason Ekstrand, 2017-03-01, 1 file, -0/+95)
| Each of the pop functions (and push_else) takes a control flow parameter
| as its second argument. If NULL, it assumes that the builder is in a
| block that's a direct child of the control-flow node you want to pop off
| the virtual stack. This is what 90% of consumers will want. The SPIR-V
| pass, however, is a bit more "creative" about how it walks the CFG and
| it needs to be able to pop multiple levels at a time, hence the
| argument.
|
| Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
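The optional second argument can be pictured with a toy model. Everything here is hypothetical: `ToyBuilder` and its methods are not the nir_builder API, and the explicit Python list stands in for what the commit calls a virtual stack.

```python
class ToyBuilder:
    """Toy stand-in for a builder that tracks open control-flow nodes."""

    def __init__(self):
        self.stack = []  # open control-flow nodes, innermost last

    def push_if(self, condition):
        node = {"kind": "if", "condition": condition}
        self.stack.append(node)
        return node

    def push_loop(self):
        node = {"kind": "loop"}
        self.stack.append(node)
        return node

    def pop(self, node=None):
        # node=None: close the innermost open node (the common case).
        # Otherwise: close every node up to and including `node`,
        # which pops several levels at once.
        if node is None:
            return self.stack.pop()
        while True:
            closed = self.stack.pop()
            if closed is node:
                return closed


b = ToyBuilder()
loop = b.push_loop()
b.push_if("cond")
b.pop(loop)          # closes the if and the loop in one call
print(len(b.stack))  # prints "0"
```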
* nir: Delete unused arg in get_iteration (Elie TOURNIER, 2017-02-27, 1 file, -2/+2)
| | | | | | | nir_const_value is not needed in get_iteration Signed-off-by: Elie Tournier <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* nir: delete magic number (Elie TOURNIER, 2017-02-24, 1 file, -1/+11)
| | | | | Signed-off-by: Elie Tournier <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir: do not #include util/debug.h within extern C {} (Emil Velikov, 2017-02-21, 1 file, -1/+4)
| | | | | | | | It's a problem waiting to happen. Individual headers should be annotated if needed. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* nir/algebraic: Optimize 64bit pack/unpack (Jason Ekstrand, 2017-02-16, 1 file, -0/+6)
| | | | | | This reduces the instruction count in some fp64 and int64 piglit tests Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Rename lower_double_pack to lower_64bit_pack (Jason Ekstrand, 2017-02-16, 2 files, -5/+4)
| | | | | | | There's nothing "double" about it other than, perhaps, the fact that it packs two 32-bit values. Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Combine the int and double [un]pack opcodes (Jason Ekstrand, 2017-02-16, 5 files, -64/+30)
| | | | | | | NIR is a typeless IR and the two opcodes, when considered bitwise, do exactly the same thing. There's no reason to have two versions. Reviewed-by: Kenneth Graunke <[email protected]>
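Why one opcode pair is enough can be seen by looking at the bits. In this sketch the function names and the low-word-first component order are assumptions made for illustration; `struct` is used only to obtain the bit pattern of a double.

```python
import struct

MASK32 = 0xFFFFFFFF


def pack_64_2x32(lo, hi):
    # Combine two 32-bit words into one 64-bit value; no type is involved.
    return (lo & MASK32) | ((hi & MASK32) << 32)


def unpack_64_2x32(value):
    # Split a 64-bit value back into its (low, high) 32-bit words.
    return value & MASK32, (value >> 32) & MASK32


def double_bits(x):
    # Reinterpret the 8 bytes of a double as an unsigned 64-bit integer.
    return struct.unpack("<Q", struct.pack("<d", x))[0]


bits = double_bits(1.0)
lo, hi = unpack_64_2x32(bits)
print(hex(bits), hex(lo), hex(hi))  # prints "0x3ff0000000000000 0x0 0x3ff00000"
```

The same two functions serve a 64-bit integer and the bit pattern of a double equally well, which is the point the commit makes about a typeless IR.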
* nir: handle some 64-bit integer conversions (Dave Airlie, 2017-02-16, 1 file, -7/+19)
| | | | | | | | These are enough for the spir-v generator to handle UConvert and SConvert operations, and fix the 4 tests in CTS. Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* nir: handle 64-bit integer types in glsl->nir type conversion. (Dave Airlie, 2017-02-16, 1 file, -0/+6)
| | | | | Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* nir: add opcode to perform int64 to bool conversions (Samuel Iglesias Gonsálvez, 2017-02-09, 1 file, -0/+1)
| | | | | | Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99660 Reviewed-by: Lionel Landwerlin <[email protected]>
* nir: add extra const notation in compare_blocks() (Emil Velikov, 2017-01-27, 1 file, -2/+2)
| MSVC warns about different const qualifiers. Add the extra const to
| silence it.
|
|   nir_phi_builder.c(244) : warning C4090: 'initializing' : different 'const' qualifiers
|   nir_phi_builder.c(245) : warning C4090: 'initializing' : different 'const' qualifiers
|
| Signed-off-by: Emil Velikov <[email protected]>
| Reviewed-by: Lionel Landwerlin <[email protected]>
* nir: silence implicit conversion to 64bit (Emil Velikov, 2017-01-27, 1 file, -1/+1)
| MSVC warns about implicit conversion as below. Annotate the literal
| appropriately to silence the warning.
|
|   nir_gather_info.c(249) : warning C4334: '<<' : result of 32-bit shift
|   implicitly converted to 64 bits (was 64-bit shift intended?)
|
| Signed-off-by: Emil Velikov <[email protected]>
| Reviewed-by: Lionel Landwerlin <[email protected]>
| Reviewed-by: Jason Ekstrand <[email protected]>
* nir: bump loop max unroll limit (Timothy Arceri, 2017-01-25, 1 file, -1/+1)
| The original number was chosen in an attempt to match the limits applied
| to GLSL IR. A look at the git history of why these limits were chosen
| for GLSL IR shows it was more to do with the slow speed of unrolling
| large loops in GLSL IR than anything else. The speed of loop unrolling
| in NIR is not a problem so we may wish to bump this even higher in
| future.
|
| No shader-db change; however, a future change that disables the GLSL IR
| optimisation loop in the i965 backend results in 4 loops from The Talos
| Principle failing to unroll. Bumping the limit allows them to unroll,
| which results in the instruction count matching the previous output from
| when the GLSL IR opts were still enabled.
|
| Reviewed-by: Jason Ekstrand <[email protected]>
* nir/search: Use the correct bit size for integer comparisons (Jason Ekstrand, 2017-01-21, 1 file, -32/+16)
| | | | | | | | | | | | | | | | | | The previous code always compared integers as 64-bit. Due to variations in sign-extension in the code generated by nir_opt_algebraic.py, this meant that nir_search doesn't always do what you want. Instead, 32-bit values should be matched as 32-bit and 64-bit values should be matched as 64-bit. While we're here we unify the unsigned and signed paths. Now that we're using the right bit size, they should be the same since the only difference we had before was sign extension. This gets the UE4 bitfield_extract optimization working again. It had stopped working due to the constant 0xff00ff00 getting sign-extended when it shouldn't have. Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Cc: "17.0 13.0" <[email protected]>
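The 0xff00ff00 case can be reproduced with plain integer arithmetic. A minimal sketch, with `sign_extend` and `as_unsigned` as illustrative helper names rather than anything from nir_search:

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return (value ^ sign_bit) - sign_bit


def as_unsigned(value, bits):
    """Return the low `bits` bits of value as a non-negative integer."""
    return value & ((1 << bits) - 1)


const = 0xFF00FF00
sext64 = as_unsigned(sign_extend(const, 32), 64)  # one path sign-extends
zext64 = as_unsigned(const, 64)                   # another zero-extends

print(hex(sext64))  # prints "0xffffffffff00ff00"
print(hex(zext64))  # prints "0xff00ff00"
print(sext64 == zext64)                                    # prints "False"
print(as_unsigned(sext64, 32) == as_unsigned(zext64, 32))  # prints "True"
```

Compared as 64-bit values the two constants differ, so a match fails; compared at their actual 32-bit size they are identical.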
* nir: Add support for 64-bit integer types to split_var_copies_block (Ian Romanick, 2017-01-20, 1 file, -0/+2)
| | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Connor Abbott <[email protected]>
* nir: Shift count for shift opcodes is always 32-bits (Ian Romanick, 2017-01-20, 2 files, -7/+7)
| Previously both sources were unsized. This caused problems when the
| thing being shifted was 64-bit but the shift count was 32-bit. The
| expectation in NIR is that all unsized sources (and destination) will
| ultimately have the same size.
|
| The changes in nir_opt_algebraic.py are to prevent errors like:
|
|   Failed to parse transformation:
|   03:12:25 (('extract_i8', 'a', 'b'), ('ishr', ('ishl', 'a', ('imul', ('isub', 3, 'b'), 8)), 24), 'options->lower_extract_byte')
|   03:12:25 Traceback (most recent call last):
|   03:12:25   File "/home/jenkins/workspace/Leeroy_2/repos/mesa/src/compiler/nir/nir_algebraic.py", line 610, in __init__
|   03:12:25     xform = SearchAndReplace(xform)
|   03:12:25   File "/home/jenkins/workspace/Leeroy_2/repos/mesa/src/compiler/nir/nir_algebraic.py", line 495, in __init__
|   03:12:25     BitSizeValidator(varset).validate(self.search, self.replace)
|   03:12:25   File "/home/jenkins/workspace/Leeroy_2/repos/mesa/src/compiler/nir/nir_algebraic.py", line 311, in validate
|   03:12:25     validate_dst_class = self._validate_bit_class_up(replace)
|   03:12:25   File "/home/jenkins/workspace/Leeroy_2/repos/mesa/src/compiler/nir/nir_algebraic.py", line 414, in _validate_bit_class_up
|   03:12:25     src_class = self._validate_bit_class_up(val.sources[i])
|   03:12:25   File "/home/jenkins/workspace/Leeroy_2/repos/mesa/src/compiler/nir/nir_algebraic.py", line 420, in _validate_bit_class_up
|   03:12:25     assert src_class == src_type_bits
|   03:12:25 AssertionError
|
| Signed-off-by: Ian Romanick <[email protected]>
| Suggested-by: Connor Abbott <[email protected]>
| Reviewed-by: Connor Abbott <[email protected]>
| Cc: Jason Ekstrand <[email protected]>
* nir: Lower packing and unpacking of 64-bit integer types (Ian Romanick, 2017-01-20, 1 file, -5/+37)
| | | | | | | | | This change makes me wonder whether double packing should be reimplemented as int64BitsToDouble(packInt2x32(v)). I'm a little on the fence since not all platforms that support fp64 natively support int64. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Connor Abbott <[email protected]>
* nir: Add 64-bit integer support for conversions and bitcasts (Ian Romanick, 2017-01-20, 2 files, -1/+30)
| v2 (idr): "cut them down later" => Remove ir_unop_b2u64 and
| ir_unop_u642b. Handle these with extra i2u or u2i casts just like
| uint(bool) and bool(uint) conversion is done.
|
| v3 (idr): Make the "from" type in a cast unsized. This reduces the
| number of required cast operations at the expense of slightly more
| complex code. However, this will be a dramatic improvement when other
| sized integer types are added. Suggested by Connor.
|
| Signed-off-by: Ian Romanick <[email protected]>
| Reviewed-by: Connor Abbott <[email protected]>
* nir: Add 64-bit integer constant support (Ian Romanick, 2017-01-20, 2 files, -0/+13)
| | | | | | | v2: Rebase on 19a541f (nir: Get rid of nir_constant_data) Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Connor Abbott <[email protected]> [v1]
* nir: add min/max optimisation (Elie TOURNIER, 2017-01-19, 1 file, -0/+12)
| Add the following optimisations:
|
|   min(x, -x) = -abs(x)
|   min(x, -abs(x)) = -abs(x)
|   min(x, abs(x)) = x
|   max(x, -abs(x)) = x
|   max(x, abs(x)) = abs(x)
|   max(x, -x) = abs(x)
|
| shader-db:
|
|   total instructions in shared programs: 13067779 -> 13067775 (-0.00%)
|   instructions in affected programs: 249 -> 245 (-1.61%)
|   helped: 4
|   HURT: 0
|
|   total cycles in shared programs: 252054838 -> 252054806 (-0.00%)
|   cycles in affected programs: 504 -> 472 (-6.35%)
|   helped: 2
|   HURT: 0
|
| Signed-off-by: Elie Tournier <[email protected]>
| Reviewed-by: Plamena Manolova <[email protected]>
| Reviewed-by: Jason Ekstrand <[email protected]>
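The six identities are easy to spot-check numerically. This is a quick sanity sketch over ordinary finite values, with `min_max_identities_hold` as an illustrative name; NaN is deliberately left out because comparisons against it do not behave like ordinary numbers.

```python
def min_max_identities_hold(x):
    """Check the six min/max identities for one finite value x."""
    return all([
        min(x, -x) == -abs(x),
        min(x, -abs(x)) == -abs(x),
        min(x, abs(x)) == x,
        max(x, -abs(x)) == x,
        max(x, abs(x)) == abs(x),
        max(x, -x) == abs(x),
    ])


samples = [-7.25, -3, -0.0, 0, 0.5, 2, 1e30]
print(all(min_max_identities_hold(x) for x in samples))  # prints "True"
```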
* nir/algebraic: Only include nir_search_helpers once (Jason Ekstrand, 2017-01-19, 1 file, -1/+1)
| | | | | | | We were including it once per value, so probably around 10k times. Let's not cause the compiler any more work than we have to. Reviewed-by: Iago Toral Quiroga <[email protected]>
* nir/gcm: fix a bug with metadata handling (Connor Abbott, 2017-01-14, 1 file, -3/+3)
| | | | | | | | | | | We were using impl->num_blocks, but that isn't guaranteed to be up-to-date until after the block_index metadata is required. If we were unlucky, this could lead to overwriting memory. Noticed by inspection. Signed-off-by: Connor Abbott <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir: optimise min/max fadd combos (Timothy Arceri, 2017-01-14, 2 files, -0/+26)
| shader-db results BDW:
|
|   total instructions in shared programs: 13060410 -> 13060313 (-0.00%)
|   instructions in affected programs: 24533 -> 24436 (-0.40%)
|   helped: 88
|   HURT: 0
|
|   total cycles in shared programs: 256585692 -> 256586698 (0.00%)
|   cycles in affected programs: 647290 -> 648296 (0.16%)
|   helped: 35
|   HURT: 30
|
| Reviewed-by: Matt Turner <[email protected]>
* nir/gcm: Fix a typo in a comment (Jason Ekstrand, 2017-01-12, 1 file, -1/+1)
| | | | Reported-by: Matt Turner <[email protected]>
* nir/gcm: Rework the schedule late loop (Jason Ekstrand, 2017-01-12, 1 file, -5/+6)
| | | | | | | | | | | | This fixes a bug in code motion that occurred when the best block is the same as the schedule early block. In this case, because we're checking (lca != def->parent_instr->block) at the top of the loop, we never get to the check for loop depth so we wouldn't move it out of the loop. This commit reworks the loop to be a simple for loop up the dominator chain and we place the (lca != def->parent_instr->block) check at the end of the loop. Reviewed-by: Matt Turner <[email protected]>
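The difference between the two loop shapes can be shown on a three-block dominator chain. Both functions below are reconstructions from the description above, not the original C code, and `Block`, `pick_block_old`, and `pick_block_new` are hypothetical names.

```python
class Block:
    def __init__(self, name, loop_depth, idom=None):
        self.name = name
        self.loop_depth = loop_depth
        self.idom = idom  # immediate dominator, None for the entry block


def pick_block_old(lca, early):
    # Exit test at the top: the early block itself is never examined.
    best = lca
    block = lca
    while block is not early:
        if block.loop_depth < best.loop_depth:
            best = block
        block = block.idom
    return best


def pick_block_new(lca, early):
    # Plain walk up the dominator chain; the exit test comes last,
    # so the early block also gets its loop depth compared.
    best = lca
    block = lca
    while block is not None:
        if block.loop_depth < best.loop_depth:
            best = block
        if block is early:
            break
        block = block.idom
    return best


entry = Block("entry", loop_depth=0)
header = Block("loop_header", loop_depth=1, idom=entry)
body = Block("loop_body", loop_depth=1, idom=header)

print(pick_block_old(body, entry).name)  # prints "loop_body"
print(pick_block_new(body, entry).name)  # prints "entry"
```

Here the best block is the early block itself, which is exactly the case the old loop shape skips, leaving the value inside the loop.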
* nir: don't turn ieq/ine into inot if used by an if (Timothy Arceri, 2017-01-12, 2 files, -2/+8)
| Otherwise we will end up with an extra instruction to compare the result
| of the inot.
|
| On BDW:
|
|   total instructions in shared programs: 13060620 -> 13060481 (-0.00%)
|   instructions in affected programs: 103379 -> 103240 (-0.13%)
|   helped: 127
|   HURT: 0
|
|   total cycles in shared programs: 256590950 -> 256587408 (-0.00%)
|   cycles in affected programs: 11324730 -> 11321188 (-0.03%)
|   helped: 114
|   HURT: 21
|
| Reviewed-by: Jason Ekstrand <[email protected]>
* nir: add late opt to turn inot/b2f combos back to bcsel (Timothy Arceri, 2017-01-12, 2 files, -0/+19)
| We turn these from bcsel into inot/b2f combos in order for other
| optimisation passes to get further. Once we have finished, turn the ones
| that remain and are used in more than a single expression back into a
| bcsel.
|
| On BDW:
|
|   total instructions in shared programs: 13060965 -> 13060297 (-0.01%)
|   instructions in affected programs: 835701 -> 835033 (-0.08%)
|   helped: 670
|   HURT: 2
|
|   total cycles in shared programs: 256599536 -> 256598006 (-0.00%)
|   cycles in affected programs: 114655488 -> 114653958 (-0.00%)
|   helped: 419
|   HURT: 240
|
|   LOST: 0
|   GAINED: 1
|
| The 2 HURT is because inserting bcsel creates the only use of const 1.0
| in two shaders from tri-of-friendship-and-madness.
|
| Reviewed-by: Jason Ekstrand <[email protected]>
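The two forms being swapped compute the same value. The commit message does not quote the rewrite rule, so the exact pattern below, a select between 0.0 and 1.0 versus b2f applied to an inot, is an assumption; the Python functions are stand-ins for the opcodes of the same name.

```python
def bcsel(cond, if_true, if_false):
    # Select between two values on a boolean condition.
    return if_true if cond else if_false


def inot(cond):
    # Boolean negation.
    return not cond


def b2f(cond):
    # Boolean to float: true -> 1.0, false -> 0.0.
    return 1.0 if cond else 0.0


for a in (True, False):
    print(a, bcsel(a, 0.0, 1.0), b2f(inot(a)))
```

Both columns agree for either input, so which form is preferable is purely a question of what later passes and the back-end handle best.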
* nir: add imprecise flrp optimisation (Timothy Arceri, 2017-01-12, 1 file, -0/+1)
| On BDW:
|
|   total instructions in shared programs: 13061890 -> 13061877 (-0.00%)
|   instructions in affected programs: 2441 -> 2428 (-0.53%)
|   helped: 13
|   HURT: 0
|
|   total cycles in shared programs: 256612254 -> 256611784 (-0.00%)
|   cycles in affected programs: 16418 -> 15948 (-2.86%)
|   helped: 10
|   HURT: 2
|
| V2: don't use ffma directly
|
| Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Introduce a nir_opt_move_comparisons() pass. (Kenneth Graunke, 2017-01-12, 2 files, -0/+187)
| This tries to move comparisons (a common source of boolean values)
| closer to their first use. For GPUs which use condition codes, this can
| eliminate a lot of temporary booleans and comparisons which reload the
| condition code register based on a boolean.
|
| V2: (Timothy Arceri)
|  - fix move comparison for phis so we don't end up with:
|
|      vec1 32 ssa_227 = phi block_34: ssa_1, block_38: ssa_240
|      vec1 32 ssa_235 = feq ssa_227, ssa_1
|      vec1 32 ssa_230 = phi block_34: ssa_221, block_38: ssa_235
|
|  - add nir_op_i2b/nir_op_f2b to the list of comparisons.
|
| V3: (Timothy Arceri)
|  - tidy up suggested by Jason.
|  - add inot/fnot to move comparison list
|
| V4: (Jason Ekstrand)
|  - clean up move_comparison_source
|  - get rid of the tuple
|  - rework phi handling
|
| Signed-off-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Ian Romanick <[email protected]> [v1]
| Reviewed-by: Jason Ekstrand <[email protected]>
* nir/algebraic: add support for conditional helper functions to expressions (Timothy Arceri, 2017-01-12, 3 files, -1/+15)
| | | | Reviewed-by: Jason Ekstrand <[email protected]>
* nir/search: Only allow matching SSA values (Jason Ekstrand, 2017-01-11, 1 file, -5/+11)
| | | | | | | | This is more correct and should also be a tiny bit faster since we're just comparing pointers instead of calling nir_src_equal. Reviewed-by: Timothy Arceri <[email protected]> Cc: "13.0" <[email protected]>