aboutsummaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers/freedreno/a2xx/ir2_nir.c
Commit message (Collapse)AuthorAgeFilesLines
* nir: add callback to nir_remove_dead_variables()Timothy Arceri2020-06-031-1/+1
| | | | | | | | | | | | This allows us to do API specific checks before removing variable without filling nir_remove_dead_variables() with API specific code. In the following patches we will use this to support the removal of dead uniforms in GLSL. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4797>
* freedreno/a2xx: use sysval for pointcoordJonathan Marek2019-10-271-11/+7
| | | | | | | | | Fixes a problem with shaders using gl_PointCoord. Signed-off-by: Jonathan Marek <[email protected]> Reported-by: Fabio Estevam <[email protected]> Tested-by: Fabio Estevam <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/a2xx: ir2: set lower_fdphJonathan Marek2019-09-061-0/+1
| | | | | | | | The fdph opcode is not supported. Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Rob Clark <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* freedreno/a2xx: ir2: remove pointcoord y invertJonathan Marek2019-09-061-4/+2
| | | | | | | | | Fixes the following deqp test: dEQP-GLES2.functional.shaders.builtin_variable.pointcoord Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Rob Clark <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* freedreno/a2xx: ir2: fix lowering of instructions after float loweringJonathan Marek2019-09-061-3/+2
| | | | | | | | | | | Some instructions generated by int/bool float lowering need to be lowered by opt_algebraic. Fixes: 43dbd7d6 Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Rob Clark <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* nir: allow specifying filter callback in lower_alu_to_scalarVasily Khoruzhick2019-09-061-11/+24
| | | | | | | | | | | | | Set of opcodes doesn't have enough flexibility in certain cases. E.g. Utgard PP has vector conditional select operation, but condition is always scalar. Lowering all the vector selects to scalar increases instruction number, so we need a way to filter only those ops that can't be handled in hardware. Reviewed-by: Qiang Yu <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]>
* nir: merge and extend nir_opt_move_comparisons and nir_opt_move_load_uboRhys Perry2019-08-121-1/+1
| | | | | | | | | | v2: add to series v3: update Makefile.sources v4: don't remove a comment and break statement v4: use nir_can_move_instr Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* nir: replace nir_move_load_const() with nir_opt_sink()Rhys Perry2019-08-121-1/+1
| | | | | | | | | | | | | | | | | | | This is mostly the same as nir_move_load_const() but can also move undef instructions, comparisons and some intrinsics (being careful with loops). v2: actually delete nir_move_load_const.c v3: fix nir_opt_sink() usage in freedreno v3: update Makefile.sources v4: replace get_move_def with nir_can_move_instr and nir_instr_ssa_def v4: handle if uses v4: fix handling of nested loops v5: re-write adjust_block_for_loops v5: re-write setting of use_block for if uses Signed-off-by: Rhys Perry <[email protected]> Co-authored-by: Daniel Schürmann <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* freedreno: a2xx: use nir_lower_alu_to_scalar instead of lowering passJonathan Marek2019-08-021-2/+12
| | | | | | | | | nir_lower_alu_to_scalar can now be used to only lower certain ops, so we don't need the custom pass. And we can lower fall_equal/fany_nequal with lower_vector_cmp instead. Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno: a2xx: fix fneg/fabs/fsat opcodesJonathan Marek2019-08-021-0/+12
| | | | | | | | Previously we would get a fmov with modifiers, but now that mov has no type these opcodes need to be supported. Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno: a2xx: fix order of NIR optsJonathan Marek2019-08-021-2/+2
| | | | | | | | int_to_float needs to come after bool_to_float, and lower_to_source_mods needs to come after both, since they don't deal wih source mods. Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno: a2xx: fix non-etc1 cubemapsJonathan Marek2019-08-021-1/+1
| | | | | | | Not sure how this happened, but apparently all cubemaps need swapped XY. Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* nir/algebraic: rename lower_bitshift to lower_bitopsErico Nunes2019-07-311-1/+1
| | | | | | | | | | | | | Optimizations that insert bitshift or bitwise operations should not be applied on GPUs that don't support integer operations. The .lower_bitshift could be used to control the bitshift related ones, but there was also one bitwise optimization uncovered. Since only lima and freedreno use this option and the use case is that no bit operations are wanted, let's rename it to .lower_bitops and use it to control all bitops related optimizations. Signed-off-by: Erico Nunes <[email protected]> Reviewed-by: Jonathan Marek <[email protected]>
* nir: Add lower_rotate flag and set to true in all driversSagar Ghuge2019-07-011-0/+1
| | | | | | Signed-off-by: Sagar Ghuge <[email protected]> Suggested-by: Matt Turner <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* nir: remove fnot/fxor/fand/for opcodesJonathan Marek2019-06-261-4/+0
| | | | | | | | | | There doesn't seem to be any reason to keep these opcodes around: * fnot/fxor are not used at all. * fand/for are only used in lower_alu_to_scalar, but easily replaced Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* nir: Combine lower_fmod16/32 back into a single lower_fmod.Kenneth Graunke2019-06-051-1/+1
| | | | | | | | | | | | | | We originally had a single lower_fmod option. In commit 2ab2d2e5, Sam split 32 and 64-bit lowering into separate flags, with the rationale that some drivers might want different options there. This left 16-bit unhandled, so Iago added a lower_fmod16 option in commit ca31df6f. Now that lower_fmod64 is gone (in favor of nir_lower_doubles and nir_lower_dmod), we re-combine lower_fmod16 and lower_fmod32 into a single lower_fmod flag again. I'm not aware of any hardware which need lowering for one bitsize and not the other. Reviewed-by: Marek Olšák <[email protected]>
* nir: remove bool lowering from lower_int_to_floatJonathan Marek2019-05-311-0/+1
| | | | | | | | | | | | | | Removes the bool_to_float logic from the int_to_float pass, so that both can be used separately. By having separate passes we have better validation and it makes it possible to use with the lower_ftrunc option (int lowering generates ftrunc, but lower_ftrunc generates bools, ftrunc lowering should probably be reworked). For now we always expect lower_bool to come after lower_int. Also fixes f2i32 to become ftrunc and adds u2f/f2u cases. Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir: add lower_bitshift optionJonathan Marek2019-05-311-0/+1
| | | | | | | | | Add a "lower_bitshift" option, which disables optimizations introducing bitshifts and lowers ishl by constant to a multiply, so that we don't have to deal with bitshifts in int_to_float lowering. Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Drop imov/fmov in favor of one mov instructionJason Ekstrand2019-05-241-14/+11
| | | | | | | | | | | | | | | | The difference between imov and fmov has been a constant source of confusion in NIR for years. No one really knows why we have two or when to use one vs. the other. The real reason is that they do different things in the presence of source and destination modifiers. However, without modifiers (which many back-ends don't have), they are identical. Now that we've reworked nir_lower_to_source_mods to leave one abs/neg instruction in place rather than replacing them with imov or fmov instructions, we don't need two different instructions at all anymore. Reviewed-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Vasily Khoruzhick <[email protected]> Acked-by: Rob Clark <[email protected]>
* freedreno: a2xx: use nir_lower_io for TGSI shadersJonathan Marek2019-04-231-36/+2
| | | | | | | | | Allows removing the load_deref/store_deref code in the compiler. tgsi_to_nir now uses screen instead of options so we can simplify that too. Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* nir: make nir_const_value scalarKarol Herbst2019-04-141-2/+4
| | | | | | | | | v2: remove & operator in a couple of memsets add some memsets v3: fixup lima Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> (v2)
* nir: Get rid of global registersJason Ekstrand2019-04-091-1/+0
| | | | | | | | | We have a pass to lower global registers to locals and many drivers dutifully call it. However, no one ever creates a global register ever so it's all dead code. It's time we bury it. Acked-by: Karol Herbst <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nir/radv: remove restrictions on opt_if_loop_last_continue()Timothy Arceri2019-04-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When I implemented opt_if_loop_last_continue() I had restricted this pass from moving other if-statements inside the branch opposite the continue. At the time it was causing a bunch of spilling in shader-db for i965. However Samuel Pitoiset noticed that making this pass more aggressive significantly improved the performance of Doom on RADV. Below are the statistics he gathered. 28717 shaders in 14931 tests Totals: SGPRS: 1267317 -> 1267549 (0.02 %) VGPRS: 896876 -> 895920 (-0.11 %) Spilled SGPRs: 24701 -> 26367 (6.74 %) Code Size: 48379452 -> 48507880 (0.27 %) bytes Max Waves: 241159 -> 241190 (0.01 %) Totals from affected shaders: SGPRS: 23584 -> 23816 (0.98 %) VGPRS: 25908 -> 24952 (-3.69 %) Spilled SGPRs: 503 -> 2169 (331.21 %) Code Size: 2471392 -> 2599820 (5.20 %) bytes Max Waves: 586 -> 617 (5.29 %) The codesize increases is related to Wolfenstein II it seems largely due to an increase in phis rather than the existing jumps. This gives +10% FPS with Doom on my Vega56. Rhys Perry also benchmarked Doom on his VEGA64: Before: 72.53 FPS After: 80.77 FPS v2: disable pass on non-AMD drivers Reviewed-by: Ian Romanick <[email protected]> (v1) Acked-by: Samuel Pitoiset <[email protected]>
* tgsi_to_nir: Produce optimized NIR for a given pipe_screen.Timur Kristóf2019-03-051-3/+5
| | | | | | | | | | | | | | | | | | | With this patch, tgsi_to_nir will output NIR that is tailored to the given pipe, by reading its capabilities and adjusting the NIR code to those capabilities similarly to how glsl_to_nir works. It also adds an optimization loop that brings the output NIR in line with what glsl_to_nir outputs. This is necessary for the same reason why glsl_to_nir has its own optimization loop: currently not every driver does these optimizations yet. For uses which cannot pass a pipe_screen we also keep a variant called tgsi_to_nir_noscreen which keeps the old behavior. Signed-Off-By: Timur Kristóf <[email protected]> Tested-by: Andre Heider <[email protected]> Tested-by: Rob Clark <[email protected]> Acked-By: Eric Anholt <[email protected]>
* freedreno: Plumb pipe_screen through to irX_tgsi_to_nir.Timur Kristóf2019-03-051-1/+4
| | | | | | | | | | This patch makes it possible for freedreno to pass a pipe_screen to tgsi_to_nir. This will be needed when tgsi_to_nir supports reading pipe capabilities. Signed-off-by: Timur Kristóf <[email protected]> Tested-by: Rob Clark <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno: a2xx: enable early-Z testingJonathan Marek2019-01-281-0/+1
| | | | | | | Enable earlyZ when alpha test is disabled. Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno: a2xx: ir2 cleanupJonathan Marek2019-01-281-2/+0
| | | | Reviewed-by: Rob Clark <[email protected]>
* freedreno/a2xx: fix unused variable warningRob Clark2019-01-261-1/+0
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno: a2xx: add partial lower_scalar pass for ir2Jonathan Marek2019-01-221-0/+3
| | | | | | Some instructions can only be scalar on a2xx, lower these only Signed-off-by: Jonathan Marek <[email protected]>
* freedreno: a2xx: NIR backendJonathan Marek2019-01-221-0/+1173
This patch replaces the a2xx TGSI compiler with a NIR compiler. It also adds several new features: -gl_FrontFacing, gl_FragCoord, gl_PointCoord, gl_PointSize -control flow (including loops) -texture related features (LOD/bias, cubemaps) -filling scalar ALU slot when possible Signed-off-by: Jonathan Marek <[email protected]>