aboutsummaryrefslogtreecommitdiffstats
path: root/src/intel/compiler/brw_vec4_cmod_propagation.cpp
Commit message (Collapse)AuthorAgeFilesLines
* intel/compiler: Pass detailed dependency classes to invalidate_analysis()Francisco Jerez2020-03-061-1/+1
| | | | | | | | | | | Have fun reading through the whole back-end optimizer to verify whether I've missed any dependency flags -- Or alternatively, just trust that any mistake here will trigger an assertion failure during analysis pass validation if it ever poses a problem for the consistency of any of the analysis passes managed by the framework. Reviewed-by: Matt Turner <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4012>
* intel/compiler: Introduce backend_shader method to propagate IR changes to ↵Francisco Jerez2020-03-061-1/+1
| | | | | | | | | | | | | | | | | analysis passes The invalidate_analysis() method knows what analysis passes there are in the back-end and calls their invalidate() method to report changes in the IR. For the moment it just calls invalidate_live_intervals() (which will eventually be fully replaced by this function) if anything changed. This makes all optimization passes invalidate DEPENDENCY_EVERYTHING, which is clearly far from ideal -- The dependency classes passed to invalidate_analysis() will be refined in a future commit. Reviewed-by: Matt Turner <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4012>
* intel/compiler: use correct swizzle for replacementLionel Landwerlin2019-02-271-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | The optimization in 4cd1a0be76883c introduced a replacement of : cmp(8).z.f0.0 vgrf11.y:D, vgrf10.xxxx:D, vgrf2.xyyy:D ... cmp(8).nz.f0.0 null.x:D, vgrf11.yyyy:D, 0D By : cmp(8).z.f0.0 vgrf15.x:D, vgrf10.xxxx:D, vgrf2.yyyy:D ... mov(8) vgrf11.y:D, vgrf15.yyyy:D The first cmp instruction is storing in x while the second mov is sourcing from y. We need to take into account where the replacement on the scan_inst destination is going to store thing so that the replacement mov can source things from the correct location. Signed-off-by: Lionel Landwerlin <[email protected]> Fixes: 4cd1a0be76883c ("i965/vec4: Propagate conditional modifiers from more compares to other compares") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109759 Reviewed-by: Ian Romanick <[email protected]>
* i965/vec4: Propagate conditional modifiers from more compares to other comparesIan Romanick2018-12-171-3/+100
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If there is a CMP.NZ that compares a single component (via a .zzzz swizzle, for example) with 0, it can propagate its conditional modifier back to a previous CMP that writes only that component. The specific case that I saw was: cmp.l.f0(8) g42<1>.xF g61<4>.xF (abs)g18<4>.zF ... cmp.nz.f0(8) null<1>D g42<4>.xD 0D In this case we can just delete the second CMP. No changes on Broadwell or Skylake because they do not use the vec4 backend. Also no changes on GM45 or Iron Lake. Sandy Bridge, Ivy Bridge, and Haswell had similar results. (Sandy Bridge shown) total instructions in shared programs: 10856676 -> 10852569 (-0.04%) instructions in affected programs: 228322 -> 224215 (-1.80%) helped: 1331 HURT: 0 helped stats (abs) min: 1 max: 7 x̄: 3.09 x̃: 4 helped stats (rel) min: 0.11% max: 6.67% x̄: 1.88% x̃: 1.83% 95% mean confidence interval for instructions value: -3.19 -2.99 95% mean confidence interval for instructions %-change: -1.93% -1.83% Instructions are helped. total cycles in shared programs: 154788865 -> 154732047 (-0.04%) cycles in affected programs: 2485892 -> 2429074 (-2.29%) helped: 1097 HURT: 59 helped stats (abs) min: 2 max: 168 x̄: 51.96 x̃: 64 helped stats (rel) min: 0.12% max: 12.70% x̄: 3.44% x̃: 2.22% HURT stats (abs) min: 2 max: 16 x̄: 3.02 x̃: 2 HURT stats (rel) min: 0.18% max: 0.83% x̄: 0.64% x̃: 0.71% 95% mean confidence interval for cycles value: -51.04 -47.26 95% mean confidence interval for cycles %-change: -3.40% -3.07% Cycles are helped. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* i965/vec4: Don't cmod propagate from CMP to ADD if the writemask isn't ↵Ian Romanick2018-07-021-5/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | compatible Otherwise we can incorrectly cmod propagate in situations like add(8) g10<1>.xD g2<0>.xD -16D ... cmp.ge.f0(8) null<1>D g2<0>.xD 16D ... (+f0) sel(8) g21<1>.xyUD g14<4>.xyyyUD g18<4>.xyyyUD Sadly, this change hurts quite a few shaders. v2: Refactor writemask compatibility check into a separate function. Suggested by Caio. Ivy Bridge and Haswell had similar results. (Haswell shown) total instructions in shared programs: 12968489 -> 12968738 (<.01%) instructions in affected programs: 60679 -> 60928 (0.41%) helped: 0 HURT: 249 HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.22% max: 0.81% x̄: 0.46% x̃: 0.44% 95% mean confidence interval for instructions value: 1.00 1.00 95% mean confidence interval for instructions %-change: 0.44% 0.48% Instructions are HURT. total cycles in shared programs: 409171965 -> 409172317 (<.01%) cycles in affected programs: 260056 -> 260408 (0.14%) helped: 0 HURT: 176 HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 0.04% max: 0.34% x̄: 0.17% x̃: 0.17% 95% mean confidence interval for cycles value: 2.00 2.00 95% mean confidence interval for cycles %-change: 0.16% 0.18% Cycles are HURT. Sandy Bridge total instructions in shared programs: 10423577 -> 10423753 (<.01%) instructions in affected programs: 40667 -> 40843 (0.43%) helped: 0 HURT: 176 HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.29% max: 0.79% x̄: 0.48% x̃: 0.42% 95% mean confidence interval for instructions value: 1.00 1.00 95% mean confidence interval for instructions %-change: 0.46% 0.51% Instructions are HURT. total cycles in shared programs: 146097503 -> 146097855 (<.01%) cycles in affected programs: 503990 -> 504342 (0.07%) helped: 0 HURT: 176 HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 0.02% max: 0.36% x̄: 0.12% x̃: 0.11% 95% mean confidence interval for cycles value: 2.00 2.00 95% mean confidence interval for cycles %-change: 0.11% 0.13% Cycles are HURT. No changes on any other platforms. Signed-off-by: Ian Romanick <[email protected]> Fixes: cd635d149b2 i965/vec4: Propagate conditional modifiers from compares to adds Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* i965/vec4: Propagate conditional modifiers from compares to addsIan Romanick2018-03-261-5/+65
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | No changes on Broadwell or later as those platforms do not use the vec4 backend. Ivy Bridge and Haswell had similar results. (Ivy Bridge shown) total instructions in shared programs: 11682119 -> 11681056 (<.01%) instructions in affected programs: 150403 -> 149340 (-0.71%) helped: 950 HURT: 0 helped stats (abs) min: 1 max: 16 x̄: 1.12 x̃: 1 helped stats (rel) min: 0.23% max: 2.78% x̄: 0.82% x̃: 0.71% 95% mean confidence interval for instructions value: -1.19 -1.04 95% mean confidence interval for instructions %-change: -0.84% -0.79% Instructions are helped. total cycles in shared programs: 257495842 -> 257495238 (<.01%) cycles in affected programs: 270302 -> 269698 (-0.22%) helped: 271 HURT: 13 helped stats (abs) min: 2 max: 14 x̄: 2.42 x̃: 2 helped stats (rel) min: 0.06% max: 1.13% x̄: 0.32% x̃: 0.28% HURT stats (abs) min: 2 max: 12 x̄: 4.00 x̃: 4 HURT stats (rel) min: 0.15% max: 1.18% x̄: 0.30% x̃: 0.26% 95% mean confidence interval for cycles value: -2.41 -1.84 95% mean confidence interval for cycles %-change: -0.31% -0.26% Cycles are helped. Sandy Bridge total instructions in shared programs: 10430493 -> 10429727 (<.01%) instructions in affected programs: 120860 -> 120094 (-0.63%) helped: 766 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.30% max: 2.70% x̄: 0.78% x̃: 0.73% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -0.80% -0.75% Instructions are helped. total cycles in shared programs: 146138718 -> 146138446 (<.01%) cycles in affected programs: 244114 -> 243842 (-0.11%) helped: 132 HURT: 0 helped stats (abs) min: 2 max: 4 x̄: 2.06 x̃: 2 helped stats (rel) min: 0.03% max: 0.43% x̄: 0.16% x̃: 0.19% 95% mean confidence interval for cycles value: -2.12 -2.00 95% mean confidence interval for cycles %-change: -0.18% -0.15% Cycles are helped. GM45 and Iron Lake had identical results. (Iron Lake shown) total instructions in shared programs: 7780251 -> 7780248 (<.01%) instructions in affected programs: 175 -> 172 (-1.71%) helped: 3 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 1.49% max: 2.44% x̄: 1.81% x̃: 1.49% total cycles in shared programs: 177851584 -> 177851578 (<.01%) cycles in affected programs: 9796 -> 9790 (-0.06%) helped: 3 HURT: 0 helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 helped stats (rel) min: 0.05% max: 0.08% x̄: 0.06% x̃: 0.05% Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Alejandro Piñeiro <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/vec4: Allow cmod propagation when src0 is a uniform or shader inputIan Romanick2018-03-261-1/+2
| | | | | | | | | | No shader-db changes. This source must have been written by a previous instruction, so it cannot be a uniform or a shader input. However, this change allows the next commit to help more shaders. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Alejandro Piñeiro <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* intel/compiler: Don't propagate cmod into integer multipliesJason Ekstrand2017-10-051-0/+17
| | | | | | | No shader-db change on Sky Lake. Reviewed-by: Matt Turner <[email protected]> Cc: [email protected]
* intel/compiler: Don't cmod propagate into a saturated operationJason Ekstrand2017-10-051-0/+8
| | | | | | | | | | | | Shader-db results on Sky Lake: total instructions in shared programs: 12954445 -> 12955125 (0.01%) instructions in affected programs: 141862 -> 142542 (0.48%) helped: 0 HURT: 626 Reviewed-by: Matt Turner <[email protected]> Cc: [email protected]
* i965: Move the back-end compiler to src/intel/compilerJason Ekstrand2017-03-131-0/+172
Mostly a dummy git mv with a couple of noticable parts: - With the earlier header cleanups, nothing in src/intel depends files from src/mesa/drivers/dri/i965/ - Both Autoconf and Android builds are addressed. Thanks to Mauro and Tapani for the fixups in the latter - brw_util.[ch] is not really compiler specific, so it's moved to i965. v2: - move brw_eu_defines.h instead of brw_defines.h - remove no-longer applicable includes - add missing vulkan/ prefix in the Android build (thanks Tapani) v3: - don't list brw_defines.h in src/intel/Makefile.sources (Jason) - rebase on top of the oa patches [Emil Velikov: commit message, various small fixes througout] Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>