aboutsummaryrefslogtreecommitdiffstats
path: root/src/compiler/nir
Commit message (Collapse)AuthorAgeFilesLines
* nir/copy_prop_vars: Record progress in more placesJason Ekstrand2020-05-221-0/+3
| | | | | | Fixes: 96c32d7776 "nir/copy_prop_vars: handle load/store of vector..." Reviewed-by: Dave Airlie <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5170>
* nir/opt_deref: Report progress if we remove a derefJason Ekstrand2020-05-221-1/+3
| | | | | | Fixes: a1c688517de "nir/opt_deref: Properly optimize ptr_as_array..." Reviewed-by: Dave Airlie <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5170>
* nir/lower_double_ops: Rework the if (progress) treeJason Ekstrand2020-05-221-1/+8
| | | | | | Fixes: d7d35a9522 "nir/lower_doubles: Use the new NIR lowering..." Reviewed-by: Dave Airlie <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5170>
* compiler: delete leftover autotools test wrapperEric Engestrom2020-05-201-3/+0
| | | | | | Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5114>
* nir: Add const to nir_intrinsic_src_componentsJason Ekstrand2020-05-191-1/+1
| | | | | Reviewed-by: Kenneth Graunke <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5108>
* nir: Add fclamp_pos opcodeAlyssa Rosenzweig2020-05-191-0/+1
| | | | | | | | | Corresponds to the .pos modifier on all Mali GPUs (lima and panfrost). Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Acked-by: Jason Ekstrand <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5102>
* nir: Add fsat_signed opcodeAlyssa Rosenzweig2020-05-191-0/+1
| | | | | | | | | Exists on later Mali. Equivalent to clamp(x, -1.0, 1.0) Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Acked-by: Jason Ekstrand <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5102>
* nir: Add a store_reg helper and use the builder in phis_to_regsJason Ekstrand2020-05-192-21/+25
| | | | | | Reviewed-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Rob Clark <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5094>
* nir: Add a new helper for iterating phi sources leaving a blockJason Ekstrand2020-05-193-15/+30
| | | | | | | | | This takes the same callback as nir_foreach_src except it walks all phi sources which leave a given block. Reviewed-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Rob Clark <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5094>
* nir/clone: Re-use clone_alu for nir_alu_instr_cloneJason Ekstrand2020-05-191-21/+17
| | | | | | | | | | | | All it takes are a couple small tweaks to the clone infrastructure to allow us to use it without any remap table at all. This reduces code duplication and the chances for bugs that come with it. In particular, the hand-rolled nir_alu_instr_clone didn't preserve no_[un]signed_wrap, or source/destination modifiers. Reviewed-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Rob Clark <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5094>
* nir/opt_if: use nir_src_as_bool in opt_peel_loop_initial_if helperRhys Perry2020-05-191-12/+10
| | | | | | Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4757>
* nir/opt_if: run opt_peel_loop_initial_if after all other optimizationsRhys Perry2020-05-191-5/+44
| | | | | | | | | | | | | Fixes dEQP-VK.graphicsfuzz.loops-ifs-continues-call with RADV. opt_if_loop_terminator can cause this optimization or opt_if_simplification to be run on the non-SSA code. Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Fixes: 52c8bc0130a ('nir: make opt_if_loop_terminator() less strict') Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2943 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4757>
* nir: Add documentation for each jump instruction typeJason Ekstrand2020-05-191-0/+18
| | | | | Reviewed-by: Alyssa Rosenzweig <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5101>
* nir: Use a switch statement in nir_handle_add_jumpJason Ekstrand2020-05-191-13/+20
| | | | | | | Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Reviewed-by: Karol Herbst <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5101>
* nir: Validate jump instructions as an instruction typeJason Ekstrand2020-05-191-30/+39
| | | | | | | | | | | | | | This has the downside of putting block successor validation in two places that are a bit further apart. However, handling them as a special case makes the code more confusing than needed. At least two different people have not noticed that we don't have jump instruction validation in the last week or two and added it. Being able to search for validate_jump_instr is useful. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Reviewed-by: Karol Herbst <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5101>
* nir: Consider atomic counter intrinsics when setting writes_memoryCaio Marcelo de Oliveira Filho2020-05-181-0/+22
| | | | | | | | | | | | | | In i965 these get lowered after gather info, so let's consider them too. Fixes piglit.spec.arb_framebuffer_no_attachments.arb_framebuffer_no_attachments-atomic in Gen9, HSW and IVB. Fixes: 6a6c36e9776 ("intel/fs: Use writes_memory from shader_info") Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5093>
* nir: Use deref intrinsics to set writes_memory when gathering infoCaio Marcelo de Oliveira Filho2020-05-181-0/+29
| | | | | Reviewed-by: Kenneth Graunke <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4815>
* nir: Use 8-bit types for most info fieldsJason Ekstrand2020-05-152-11/+11
| | | | | | | | | This shrinks nir_intrinsics.c.o from 73K to 35K and nir_opcodes.c.o from 64K to 31K on a release build. Reviewed-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5045>
* Revert "nir/validate: validate the stride for deref_ptr_as_array"Karol Herbst2020-05-141-1/+0
| | | This reverts commit 667e14e7bd759a77e732c4de09fb978ee3816eaf
* nir/validate: validate the stride for deref_ptr_as_arrayKarol Herbst2020-05-141-0/+1
| | | | | | Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4068>
* nir/deref: copy ptr_stride when rematerializingKarol Herbst2020-05-141-1/+4
| | | | | | Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4068>
* nir: add pass to lower disjoint wrmask'sRob Clark2020-05-133-0/+235
| | | | | | Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* nir: add helper to copy const_index[]Rob Clark2020-05-131-0/+27
| | | | | | | | | It seems less brittle to not assume they are in the same order for src and dst instructions. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* nir: fix indices for ir3 ssbo_atomic intrinsicsRob Clark2020-05-131-10/+10
| | | | | | | | | Caught by the sanity checking in nir_intrinsic_copy_const_indices() (which is introduced by the next patch). Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* freedreno/ir3: Drop wrmask for ir3 local and global store intrinsicsKristian H. Kristensen2020-05-131-2/+2
| | | | | | | These intrinsics are supposed to map to the underlying hardware instructions, which don't have wrmask. We use them when we lower store_output in the geometry pipeline and since store_output gets lowered to temps, we always see full wrmasks there.
* nir: Add some docs to the metadata typesJason Ekstrand2020-05-141-0/+51
| | | | | | Reviewed-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5028>
* nir: Include num_ubos in the printed shader (if nonzero).Eric Anholt2020-05-141-0/+2
| | | | | | I keep wanting this number for debugging shaders. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4858>
* nir: reset ssa-defs as non-divergent during divergence analysis instead of ↵Daniel Schürmann2020-05-131-21/+36
| | | | | | | upfront Reviewed-by: Jason Ekstrand <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4062>
* nir: simplify phi handling in divergence analysisDaniel Schürmann2020-05-131-113/+116
| | | | | | | | | | This patch adds some control flow information to the state to keep track whether a loop contains divergent continue or break statements to not having to recalculate this property for every phi. Reviewed-by: Jason Ekstrand <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4062>
* nir: rework phi handling in divergence analysisDaniel Schürmann2020-05-131-173/+214
| | | | | | | | | | | | This patch splits the visit_phi() function into three different ones according to the kind of phi (merge-node, loop-header or loop-exit) and calls them when visiting the cf_nodes. This allows to revisit loops if the loop header's phis have changed, only. Reviewed-by: Jason Ekstrand <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4062>
* nir: refactor divergence analysis stateDaniel Schürmann2020-05-131-35/+37
| | | | | Reviewed-by: Jason Ekstrand <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4062>
* nir: add nir_intrinsic_elect to divergence analysisDaniel Schürmann2020-05-131-0/+1
| | | | | Reviewed-by: Jason Ekstrand <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4062>
* nir: Make "divergent" a property of an SSA valueJason Ekstrand2020-05-133-64/+94
| | | | | | | v2: fix usage in ACO (by Daniel Schürmann) Reviewed-by: Rhys Perry <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4062>
* turnip: Execute ir3_nir_lower_gs pass againBrian Ho2020-05-121-2/+5
| | | | | | | | | | | | | | | This commit fixes a GS regression introduced in !4562 where ir3's GS lowering pass was moved from common code (ir3_nir) to freedreno-specific code (ir3_shader). For GS support in turnip, we need to add the GS lowering pass back in, this time in tu_shader. As for the nir_gather_info change, the GS lowering pass has always introduced a discard_if intrinsic into the GS. Previously, we simply ran nir_shader_gather_info before GS lowering, but now since we lower the GS before we need to remove the assertion that only a FS can use the discard_if intrinsic. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4892>
* nir: Fix count when we didn't lower load_uniforms but did shift load_ubos.Eric Anholt2020-05-121-1/+1
| | | | | | | | | | | | | The fixed commit was really nice in mostly fixing num_ubos to reflect the shader after lowering, but for dEQP-GLES31.functional.compute.basic.ubo_to_ssbo_single_invocation there are no default uniforms and so we skipped the increment, even though we shifted the block index up. Fixes: 4777ee1a62f0 ("nir: Always create UBO variable when lowering uniforms to ubo") Reviewed-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Erik Faye-Lund <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4992>
* nir/algebraic: Eliminate useless extract before unpackIan Romanick2020-05-111-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The shader helped for spills and fills is the big compute shader in Dirt Showdown. One of the shaders hurt for spills and fills on Broadwell is the big compute shader in Bioshock Infinite, but combined with the previous commit, it's still an impovement. Tiger Lake total instructions in shared programs: 21833218 -> 21832449 (<.01%) instructions in affected programs: 66104 -> 65335 (-1.16%) helped: 106 HURT: 14 helped stats (abs) min: 1 max: 67 x̄: 7.87 x̃: 5 helped stats (rel) min: 0.19% max: 5.76% x̄: 1.27% x̃: 0.95% HURT stats (abs) min: 1 max: 14 x̄: 4.64 x̃: 1 HURT stats (rel) min: 0.19% max: 4.12% x̄: 1.41% x̃: 0.19% 95% mean confidence interval for instructions value: -8.51 -4.30 95% mean confidence interval for instructions %-change: -1.23% -0.69% Instructions are helped. total cycles in shared programs: 506180109 -> 506196314 (<.01%) cycles in affected programs: 1671429 -> 1687634 (0.97%) helped: 37 HURT: 84 helped stats (abs) min: 1 max: 490 x̄: 73.27 x̃: 24 helped stats (rel) min: 0.02% max: 7.98% x̄: 1.25% x̃: 0.41% HURT stats (abs) min: 1 max: 5000 x̄: 225.19 x̃: 8 HURT stats (rel) min: 0.03% max: 10.22% x̄: 1.22% x̃: 0.42% 95% mean confidence interval for cycles value: 2.85 265.00 95% mean confidence interval for cycles %-change: 0.04% 0.88% Cycles are HURT. Ice Lake and Skylake had similar results. (Ice Lake shown) total instructions in shared programs: 19961317 -> 19960543 (<.01%) instructions in affected programs: 30268 -> 29494 (-2.56%) helped: 39 HURT: 0 helped stats (abs) min: 1 max: 142 x̄: 19.85 x̃: 7 helped stats (rel) min: 0.19% max: 7.87% x̄: 2.33% x̃: 2.31% 95% mean confidence interval for instructions value: -29.46 -10.23 95% mean confidence interval for instructions %-change: -2.95% -1.71% Instructions are helped. total cycles in shared programs: 498863755 -> 498865843 (<.01%) cycles in affected programs: 1831136 -> 1833224 (0.11%) helped: 57 HURT: 65 helped stats (abs) min: 1 max: 1400 x̄: 128.93 x̃: 25 helped stats (rel) min: 0.05% max: 3.49% x̄: 0.89% x̃: 0.71% HURT stats (abs) min: 1 max: 1887 x̄: 145.18 x̃: 15 HURT stats (rel) min: 0.02% max: 9.88% x̄: 1.83% x̃: 0.73% 95% mean confidence interval for cycles value: -58.30 92.53 95% mean confidence interval for cycles %-change: 0.16% 0.97% Inconclusive result (value mean confidence interval includes 0). total spills in shared programs: 8774 -> 8773 (-0.01%) spills in affected programs: 20 -> 19 (-5.00%) helped: 1 HURT: 0 total fills in shared programs: 9496 -> 9494 (-0.02%) fills in affected programs: 40 -> 38 (-5.00%) helped: 1 HURT: 0 Broadwell total instructions in shared programs: 17859373 -> 17858548 (<.01%) instructions in affected programs: 38452 -> 37627 (-2.15%) helped: 31 HURT: 0 helped stats (abs) min: 1 max: 143 x̄: 26.61 x̃: 10 helped stats (rel) min: 0.19% max: 7.87% x̄: 2.57% x̃: 2.69% 95% mean confidence interval for instructions value: -39.79 -13.44 95% mean confidence interval for instructions %-change: -3.25% -1.89% Instructions are helped. total cycles in shared programs: 525858109 -> 525869236 (<.01%) cycles in affected programs: 2058597 -> 2069724 (0.54%) helped: 44 HURT: 75 helped stats (abs) min: 2 max: 1330 x̄: 187.84 x̃: 23 helped stats (rel) min: 0.04% max: 31.31% x̄: 2.13% x̃: 0.85% HURT stats (abs) min: 1 max: 3915 x̄: 258.56 x̃: 47 HURT stats (rel) min: 0.02% max: 10.53% x̄: 2.81% x̃: 2.21% 95% mean confidence interval for cycles value: -26.06 213.07 95% mean confidence interval for cycles %-change: 0.19% 1.78% Inconclusive result (value mean confidence interval includes 0). total spills in shared programs: 25744 -> 25730 (-0.05%) spills in affected programs: 1578 -> 1564 (-0.89%) helped: 4 HURT: 2 total fills in shared programs: 31710 -> 31689 (-0.07%) fills in affected programs: 4346 -> 4325 (-0.48%) helped: 3 HURT: 3 Haswell total instructions in shared programs: 16228399 -> 16227783 (<.01%) instructions in affected programs: 22201 -> 21585 (-2.77%) helped: 27 HURT: 0 helped stats (abs) min: 1 max: 68 x̄: 22.81 x̃: 11 helped stats (rel) min: 0.19% max: 7.87% x̄: 2.92% x̃: 2.86% 95% mean confidence interval for instructions value: -31.96 -13.66 95% mean confidence interval for instructions %-change: -3.68% -2.15% Instructions are helped. total cycles in shared programs: 538613967 -> 538701354 (0.02%) cycles in affected programs: 1653044 -> 1740431 (5.29%) helped: 36 HURT: 81 helped stats (abs) min: 2 max: 708 x̄: 104.50 x̃: 17 helped stats (rel) min: <.01% max: 15.01% x̄: 1.67% x̃: 0.65% HURT stats (abs) min: 1 max: 30100 x̄: 1125.30 x̃: 304 HURT stats (rel) min: 0.02% max: 16.21% x̄: 8.98% x̃: 11.60% 95% mean confidence interval for cycles value: 23.78 1470.01 95% mean confidence interval for cycles %-change: 4.29% 7.12% Cycles are HURT. total spills in shared programs: 23418 -> 23409 (-0.04%) spills in affected programs: 177 -> 168 (-5.08%) helped: 2 HURT: 0 total fills in shared programs: 25919 -> 25896 (-0.09%) fills in affected programs: 568 -> 545 (-4.05%) helped: 3 HURT: 0 Ivy Bridge total instructions in shared programs: 15265983 -> 15265759 (<.01%) instructions in affected programs: 8418 -> 8194 (-2.66%) helped: 5 HURT: 0 helped stats (abs) min: 18 max: 99 x̄: 44.80 x̃: 26 helped stats (rel) min: 1.74% max: 4.26% x̄: 3.12% x̃: 3.00% 95% mean confidence interval for instructions value: -86.29 -3.31 95% mean confidence interval for instructions %-change: -4.43% -1.81% Instructions are helped. total cycles in shared programs: 422930336 -> 422929589 (<.01%) cycles in affected programs: 59347 -> 58600 (-1.26%) helped: 3 HURT: 2 helped stats (abs) min: 72 max: 1060 x̄: 433.33 x̃: 168 helped stats (rel) min: 1.14% max: 3.48% x̄: 2.23% x̃: 2.06% HURT stats (abs) min: 265 max: 288 x̄: 276.50 x̃: 276 HURT stats (rel) min: 4.79% max: 5.64% x̄: 5.22% x̃: 5.22% 95% mean confidence interval for cycles value: -829.08 530.28 95% mean confidence interval for cycles %-change: -4.43% 5.93% Inconclusive result (value mean confidence interval includes 0). total spills in shared programs: 4953 -> 4946 (-0.14%) spills in affected programs: 344 -> 337 (-2.03%) helped: 2 HURT: 0 total fills in shared programs: 5548 -> 5521 (-0.49%) fills in affected programs: 838 -> 811 (-3.22%) helped: 2 HURT: 0 No shader-db changes on any earlier Intel platform. Reviewed-by: Rhys Perry <[email protected]> Reviewed-by: Matt Turner <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4515>
* nir/algebraic: Add some half packing optimizations for pack_half_2x16_splitIan Romanick2020-05-111-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Like 1f72857739b ("nir/algebraic: add some half packing optimizations"), but for the pack_half_2x16_split variant. The shader helped for spills and fills is the big compute shader in Bioshock Infinite. Tiger Lake total instructions in shared programs: 21834539 -> 21833218 (<.01%) instructions in affected programs: 60119 -> 58798 (-2.20%) helped: 105 HURT: 0 helped stats (abs) min: 5 max: 50 x̄: 12.58 x̃: 9 helped stats (rel) min: 0.86% max: 26.46% x̄: 2.58% x̃: 1.70% 95% mean confidence interval for instructions value: -14.35 -10.81 95% mean confidence interval for instructions %-change: -3.20% -1.97% Instructions are helped. total cycles in shared programs: 506215169 -> 506180109 (<.01%) cycles in affected programs: 1445088 -> 1410028 (-2.43%) helped: 97 HURT: 8 helped stats (abs) min: 1 max: 16882 x̄: 387.76 x̃: 26 helped stats (rel) min: 0.05% max: 18.31% x̄: 1.77% x̃: 1.34% HURT stats (abs) min: 21 max: 635 x̄: 319.12 x̃: 212 HURT stats (rel) min: 0.39% max: 20.08% x̄: 8.96% x̃: 4.46% 95% mean confidence interval for cycles value: -782.96 115.15 95% mean confidence interval for cycles %-change: -1.74% -0.16% Inconclusive result (value mean confidence interval includes 0). Ice Lake, Skylake, and Broadwell had similar results. (Ice Lake shown) total instructions in shared programs: 19962974 -> 19961317 (<.01%) instructions in affected programs: 63471 -> 61814 (-2.61%) helped: 105 HURT: 0 helped stats (abs) min: 6 max: 82 x̄: 15.78 x̃: 11 helped stats (rel) min: 1.11% max: 28.65% x̄: 3.17% x̃: 2.16% 95% mean confidence interval for instructions value: -18.38 -13.18 95% mean confidence interval for instructions %-change: -3.86% -2.48% Instructions are helped. total cycles in shared programs: 498908953 -> 498863755 (<.01%) cycles in affected programs: 1566998 -> 1521800 (-2.88%) helped: 89 HURT: 15 helped stats (abs) min: 2 max: 17502 x̄: 532.19 x̃: 69 helped stats (rel) min: 0.07% max: 18.54% x̄: 4.71% x̃: 3.12% HURT stats (abs) min: 3 max: 661 x̄: 144.47 x̃: 16 HURT stats (rel) min: 0.14% max: 20.57% x̄: 4.29% x̃: 0.30% 95% mean confidence interval for cycles value: -903.93 34.74 95% mean confidence interval for cycles %-change: -4.50% -2.32% Inconclusive result (value mean confidence interval includes 0). total spills in shared programs: 8776 -> 8774 (-0.02%) spills in affected programs: 25 -> 23 (-8.00%) helped: 1 HURT: 0 total fills in shared programs: 9500 -> 9496 (-0.04%) fills in affected programs: 46 -> 42 (-8.70%) helped: 1 HURT: 0 Haswell total instructions in shared programs: 16229912 -> 16228399 (<.01%) instructions in affected programs: 61257 -> 59744 (-2.47%) helped: 105 HURT: 0 helped stats (abs) min: 6 max: 51 x̄: 14.41 x̃: 11 helped stats (rel) min: 0.77% max: 28.65% x̄: 3.08% x̃: 2.15% 95% mean confidence interval for instructions value: -16.14 -12.68 95% mean confidence interval for instructions %-change: -3.77% -2.40% Instructions are helped. total cycles in shared programs: 538654481 -> 538613967 (<.01%) cycles in affected programs: 1448966 -> 1408452 (-2.80%) helped: 58 HURT: 47 helped stats (abs) min: 9 max: 22604 x̄: 957.00 x̃: 74 helped stats (rel) min: 0.40% max: 18.81% x̄: 6.22% x̃: 3.03% HURT stats (abs) min: 5 max: 3720 x̄: 318.98 x̃: 49 HURT stats (rel) min: 0.20% max: 34.50% x̄: 5.05% x̃: 2.12% 95% mean confidence interval for cycles value: -999.84 228.14 95% mean confidence interval for cycles %-change: -2.86% 0.51% Inconclusive result (value mean confidence interval includes 0). Ivy Bridge total instructions in shared programs: 15266086 -> 15265983 (<.01%) instructions in affected programs: 7272 -> 7169 (-1.42%) helped: 3 HURT: 0 helped stats (abs) min: 21 max: 41 x̄: 34.33 x̃: 41 helped stats (rel) min: 0.66% max: 5.43% x̄: 2.44% x̃: 1.23% total cycles in shared programs: 422930883 -> 422930336 (<.01%) cycles in affected programs: 49259 -> 48712 (-1.11%) helped: 3 HURT: 0 helped stats (abs) min: 106 max: 221 x̄: 182.33 x̃: 220 helped stats (rel) min: 0.71% max: 5.95% x̄: 2.46% x̃: 0.72% No changes on any earilier Intel platforms. Reviewed-by: Rhys Perry <[email protected]> Reviewed-by: Matt Turner <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4515>
* nir/algebraic: Optimize ushr of pack_half, not ishrIan Romanick2020-05-111-1/+1
| | | | | | | | | | | | When a = -1.0, pack_half_2x16(vec2(0x0000, 0xBC00)) will produce 0xBC000000. The ishr will produce 0xFFFFBC00. The replacement pack_half_2x16(vec2(0xBC00, 0x0000)) will produce 0x0000BC00. Fixes: 1f72857739b ("nir/algebraic: add some half packing optimizations") Reviewed-by: Rhys Perry <[email protected]> Reviewed-by: Matt Turner <[email protected]> Cc: Connor Abbott <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4515>
* nir: do not vectorize load/store if offset can overflow and robustness enabledSamuel Pitoiset2020-05-113-5/+44
| | | | | | | | | | | | This prevents vectorization for loads/stores that can overflow if the low offset is negative and the range greater or equal than 0. The caller can pass the list of variable modes that matter for robust access. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Rhys Perry <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4881>
* nir/algebraic: Optimize some bfe patternsIan Romanick2020-05-071-0/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | v2: Use -x instead of 32-x in shift counts. Tiger Lake total instructions in shared programs: 17597691 -> 17597405 (<.01%) instructions in affected programs: 224557 -> 224271 (-0.13%) helped: 74 HURT: 17 helped stats (abs) min: 1 max: 71 x̄: 14.36 x̃: 7 helped stats (rel) min: 0.08% max: 1.80% x̄: 0.50% x̃: 0.37% HURT stats (abs) min: 1 max: 141 x̄: 45.71 x̃: 40 HURT stats (rel) min: 0.03% max: 3.55% x̄: 1.20% x̃: 1.14% 95% mean confidence interval for instructions value: -10.53 4.24 95% mean confidence interval for instructions %-change: -0.38% 0.01% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 333595656 -> 333180770 (-0.12%) cycles in affected programs: 70056467 -> 69641581 (-0.59%) helped: 91 HURT: 4 helped stats (abs) min: 1 max: 25174 x̄: 4571.40 x̃: 400 helped stats (rel) min: <.01% max: 2.23% x̄: 0.40% x̃: 0.21% HURT stats (abs) min: 1 max: 370 x̄: 277.75 x̃: 370 HURT stats (rel) min: 0.01% max: 0.04% x̄: 0.04% x̃: 0.04% 95% mean confidence interval for cycles value: -5981.55 -2752.89 95% mean confidence interval for cycles %-change: -0.48% -0.29% Cycles are helped. Ice Lake, Skylake, Broadwell, and Haswell had similar results. (Ice Lake shown) total instructions in shared programs: 16117204 -> 16116723 (<.01%) instructions in affected programs: 207109 -> 206628 (-0.23%) helped: 100 HURT: 0 helped stats (abs) min: 1 max: 9 x̄: 4.81 x̃: 7 helped stats (rel) min: 0.10% max: 1.58% x̄: 0.23% x̃: 0.20% 95% mean confidence interval for instructions value: -5.51 -4.11 95% mean confidence interval for instructions %-change: -0.27% -0.19% Instructions are helped. total cycles in shared programs: 330487341 -> 330082421 (-0.12%) cycles in affected programs: 68037050 -> 67632130 (-0.60%) helped: 89 HURT: 7 helped stats (abs) min: 2 max: 24610 x̄: 4567.07 x̃: 400 helped stats (rel) min: <.01% max: 1.52% x̄: 0.39% x̃: 0.22% HURT stats (abs) min: 1 max: 370 x̄: 221.29 x̃: 170 HURT stats (rel) min: 0.01% max: 1.66% x̄: 0.58% x̃: 0.04% 95% mean confidence interval for cycles value: -5780.79 -2655.05 95% mean confidence interval for cycles %-change: -0.42% -0.22% Cycles are helped. Ivy Bridge total instructions in shared programs: 11873641 -> 11873137 (<.01%) instructions in affected programs: 147464 -> 146960 (-0.34%) helped: 54 HURT: 0 helped stats (abs) min: 9 max: 10 x̄: 9.33 x̃: 9 helped stats (rel) min: 0.29% max: 0.41% x̄: 0.34% x̃: 0.34% 95% mean confidence interval for instructions value: -9.46 -9.20 95% mean confidence interval for instructions %-change: -0.35% -0.33% Instructions are helped. total cycles in shared programs: 175769085 -> 175549519 (-0.12%) cycles in affected programs: 60770592 -> 60551026 (-0.36%) helped: 54 HURT: 0 helped stats (abs) min: 252 max: 13434 x̄: 4066.04 x̃: 1290 helped stats (rel) min: 0.02% max: 0.74% x̄: 0.34% x̃: 0.26% 95% mean confidence interval for cycles value: -5323.59 -2808.48 95% mean confidence interval for cycles %-change: -0.41% -0.27% Cycles are helped. No changes on any earlier Intel platforms. Reviewed-by: Matt Turner <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4156>
* nir/algebraic: Split ibfe and ubfe with two constant sourcesIan Romanick2020-05-072-0/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I also tried splitting ubfe instructions with one or zero constants, and zero shaders in shader-db were affected. The "lost" shader is a compute shader that was promoted from SIMD8 to SIMD16, so is also counted as the gained shader. v2: Further restrict bfe splitting. bfe with multiple constants is better on at least some Radeon GPUs. Use -x instead of 32-x in shift counts. v3: Fix the outer shift count for ibfe lowering. Add c=0 optimizations to prevent bad lowering. Both suggested by Rhys. Add shift by -32 optimizations. Tiger Lake total instructions in shared programs: 17608764 -> 17596316 (-0.07%) instructions in affected programs: 303765 -> 291317 (-4.10%) helped: 113 HURT: 46 helped stats (abs) min: 1 max: 458 x̄: 120.67 x̃: 21 helped stats (rel) min: 0.09% max: 11.23% x̄: 3.47% x̃: 1.39% HURT stats (abs) min: 1 max: 201 x̄: 25.83 x̃: 6 HURT stats (rel) min: 0.23% max: 5.18% x̄: 1.53% x̃: 1.11% 95% mean confidence interval for instructions value: -101.13 -55.45 95% mean confidence interval for instructions %-change: -2.61% -1.44% Instructions are helped. total cycles in shared programs: 338390770 -> 333530868 (-1.44%) cycles in affected programs: 79438330 -> 74578428 (-6.12%) helped: 112 HURT: 64 helped stats (abs) min: 2 max: 268955 x̄: 44261.93 x̃: 1452 helped stats (rel) min: <.01% max: 29.51% x̄: 4.72% x̃: 2.23% HURT stats (abs) min: 2 max: 17618 x̄: 1522.41 x̃: 84 HURT stats (rel) min: <.01% max: 7.34% x̄: 1.35% x̃: 0.34% 95% mean confidence interval for cycles value: -37232.47 -17993.69 95% mean confidence interval for cycles %-change: -3.37% -1.65% Cycles are helped. total spills in shared programs: 8944 -> 8138 (-9.01%) spills in affected programs: 3240 -> 2434 (-24.88%) helped: 67 HURT: 0 total fills in shared programs: 9373 -> 7842 (-16.33%) fills in affected programs: 4736 -> 3205 (-32.33%) helped: 67 HURT: 0 LOST: 1 GAINED: 2 Ice Lake and Skylake had similar results. (Ice Lake shown) total instructions in shared programs: 16123288 -> 16116876 (-0.04%) instructions in affected programs: 241155 -> 234743 (-2.66%) helped: 126 HURT: 2 helped stats (abs) min: 1 max: 209 x̄: 50.90 x̃: 7 helped stats (rel) min: 0.07% max: 5.94% x̄: 1.76% x̃: 0.65% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.05% max: 0.24% x̄: 0.15% x̃: 0.15% 95% mean confidence interval for instructions value: -61.29 -38.89 95% mean confidence interval for instructions %-change: -2.05% -1.42% Instructions are helped. total cycles in shared programs: 335419163 -> 330438819 (-1.48%) cycles in affected programs: 77515502 -> 72535158 (-6.42%) helped: 139 HURT: 37 helped stats (abs) min: 2 max: 269140 x̄: 36374.19 x̃: 597 helped stats (rel) min: <.01% max: 28.60% x̄: 3.67% x̃: 1.31% HURT stats (abs) min: 4 max: 17618 x̄: 2045.08 x̃: 174 HURT stats (rel) min: 0.02% max: 8.32% x̄: 2.61% x̃: 0.62% 95% mean confidence interval for cycles value: -37799.30 -18795.51 95% mean confidence interval for cycles %-change: -3.13% -1.57% Cycles are helped. total spills in shared programs: 8065 -> 7306 (-9.41%) spills in affected programs: 3153 -> 2394 (-24.07%) helped: 67 HURT: 0 total fills in shared programs: 8710 -> 7412 (-14.90%) fills in affected programs: 4466 -> 3168 (-29.06%) helped: 67 HURT: 0 LOST: 1 GAINED: 1 Broadwell total instructions in shared programs: 14970538 -> 14965967 (-0.03%) instructions in affected programs: 227040 -> 222469 (-2.01%) helped: 126 HURT: 2 helped stats (abs) min: 1 max: 136 x̄: 36.29 x̃: 8 helped stats (rel) min: 0.07% max: 6.02% x̄: 1.47% x̃: 0.89% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.05% max: 0.24% x̄: 0.14% x̃: 0.14% 95% mean confidence interval for instructions value: -43.05 -28.37 95% mean confidence interval for instructions %-change: -1.69% -1.19% Instructions are helped. total cycles in shared programs: 336237662 -> 333035960 (-0.95%) cycles in affected programs: 72066394 -> 68864692 (-4.44%) helped: 134 HURT: 42 helped stats (abs) min: 4 max: 122640 x̄: 24344.54 x̃: 1833 helped stats (rel) min: <.01% max: 26.93% x̄: 4.02% x̃: 2.38% HURT stats (abs) min: 1 max: 17205 x̄: 1439.69 x̃: 92 HURT stats (rel) min: <.01% max: 7.12% x̄: 1.34% x̃: 0.62% 95% mean confidence interval for cycles value: -23753.58 -12629.40 95% mean confidence interval for cycles %-change: -3.50% -1.98% Cycles are helped. total spills in shared programs: 21122 -> 20204 (-4.35%) spills in affected programs: 3644 -> 2726 (-25.19%) helped: 67 HURT: 0 total fills in shared programs: 24879 -> 23460 (-5.70%) fills in affected programs: 4883 -> 3464 (-29.06%) helped: 67 HURT: 0 Haswell total instructions in shared programs: 13148269 -> 13145444 (-0.02%) instructions in affected programs: 137046 -> 134221 (-2.06%) helped: 97 HURT: 3 helped stats (abs) min: 1 max: 137 x̄: 30.58 x̃: 3 helped stats (rel) min: 0.14% max: 4.38% x̄: 1.38% x̃: 0.44% HURT stats (abs) min: 1 max: 70 x̄: 47.00 x̃: 70 HURT stats (rel) min: 0.05% max: 5.82% x̄: 3.90% x̃: 5.82% 95% mean confidence interval for instructions value: -37.15 -19.35 95% mean confidence interval for instructions %-change: -1.56% -0.89% Instructions are helped. total cycles in shared programs: 321221834 -> 318333159 (-0.90%) cycles in affected programs: 54932349 -> 52043674 (-5.26%) helped: 95 HURT: 53 helped stats (abs) min: 4 max: 123390 x̄: 30648.39 x̃: 702 helped stats (rel) min: <.01% max: 28.87% x̄: 4.27% x̃: 2.87% HURT stats (abs) min: 4 max: 2357 x̄: 432.49 x̃: 113 HURT stats (rel) min: <.01% max: 3.44% x̄: 1.03% x̃: 0.54% 95% mean confidence interval for cycles value: -26154.16 -12881.99 95% mean confidence interval for cycles %-change: -3.20% -1.55% Cycles are helped. total spills in shared programs: 19878 -> 19293 (-2.94%) spills in affected programs: 3020 -> 2435 (-19.37%) helped: 41 HURT: 2 total fills in shared programs: 20918 -> 19875 (-4.99%) fills in affected programs: 3968 -> 2925 (-26.29%) helped: 41 HURT: 2 LOST: 0 GAINED: 1 Ivy Bridge total instructions in shared programs: 11875585 -> 11873641 (-0.02%) instructions in affected programs: 78065 -> 76121 (-2.49%) helped: 27 HURT: 0 helped stats (abs) min: 8 max: 134 x̄: 72.00 x̃: 72 helped stats (rel) min: 0.36% max: 4.23% x̄: 2.42% x̃: 2.42% 95% mean confidence interval for instructions value: -83.68 -60.32 95% mean confidence interval for instructions %-change: -2.78% -2.07% Instructions are helped. total cycles in shared programs: 178232734 -> 175769085 (-1.38%) cycles in affected programs: 50018707 -> 47555058 (-4.93%) helped: 27 HURT: 0 helped stats (abs) min: 82035 max: 99953 x̄: 91246.26 x̃: 92278 helped stats (rel) min: 4.40% max: 5.69% x̄: 4.93% x̃: 4.95% 95% mean confidence interval for cycles value: -93674.20 -88818.32 95% mean confidence interval for cycles %-change: -5.09% -4.78% Cycles are helped. total spills in shared programs: 4182 -> 3739 (-10.59%) spills in affected programs: 1089 -> 646 (-40.68%) helped: 27 HURT: 0 total fills in shared programs: 5216 -> 4345 (-16.70%) fills in affected programs: 1874 -> 1003 (-46.48%) helped: 27 HURT: 0 No changes on any earlier Intel platforms. Reviewed-by: Matt Turner <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4156>
* nir/algebraic: Recognize open-coded byte or word extract from bfeIan Romanick2020-05-071-7/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | v2: Move word-extract patterns up near the byte-extract patterns. Suggested by Rhys. Tiger Lake total instructions in shared programs: 21369236 -> 21368712 (<.01%) instructions in affected programs: 913104 -> 912580 (-0.06%) helped: 209 HURT: 165 helped stats (abs) min: 1 max: 30 x̄: 5.35 x̃: 3 helped stats (rel) min: 0.03% max: 6.92% x̄: 0.28% x̃: 0.12% HURT stats (abs) min: 1 max: 18 x̄: 3.61 x̃: 3 HURT stats (rel) min: 0.04% max: 0.87% x̄: 0.16% x̃: 0.12% 95% mean confidence interval for instructions value: -2.04 -0.76 95% mean confidence interval for instructions %-change: -0.14% -0.04% Instructions are helped. total cycles in shared programs: 490161481 -> 490175959 (<.01%) cycles in affected programs: 72557244 -> 72571722 (0.02%) helped: 193 HURT: 189 helped stats (abs) min: 1 max: 14240 x̄: 509.16 x̃: 71 helped stats (rel) min: <.01% max: 13.71% x̄: 0.44% x̃: 0.05% HURT stats (abs) min: 2 max: 4210 x̄: 596.53 x̃: 173 HURT stats (rel) min: <.01% max: 5.59% x̄: 0.54% x̃: 0.14% 95% mean confidence interval for cycles value: -96.33 172.13 95% mean confidence interval for cycles %-change: -0.07% 0.16% Inconclusive result (value mean confidence interval includes 0). total spills in shared programs: 10780 -> 10782 (0.02%) spills in affected programs: 18 -> 20 (11.11%) helped: 0 HURT: 1 total fills in shared programs: 10396 -> 10370 (-0.25%) fills in affected programs: 2292 -> 2266 (-1.13%) helped: 27 HURT: 1 Ice Lake total instructions in shared programs: 19556356 -> 19555446 (<.01%) instructions in affected programs: 833336 -> 832426 (-0.11%) helped: 400 HURT: 0 helped stats (abs) min: 1 max: 20 x̄: 2.27 x̃: 2 helped stats (rel) min: 0.07% max: 4.42% x̄: 0.14% x̃: 0.10% 95% mean confidence interval for instructions value: -2.42 -2.13 95% mean confidence interval for instructions %-change: -0.18% -0.11% Instructions are helped. total cycles in shared programs: 488026481 -> 488008714 (<.01%) cycles in affected programs: 81581708 -> 81563941 (-0.02%) helped: 193 HURT: 206 helped stats (abs) min: 1 max: 3615 x̄: 576.35 x̃: 131 helped stats (rel) min: <.01% max: 4.50% x̄: 0.49% x̃: 0.22% HURT stats (abs) min: 1 max: 2244 x̄: 453.73 x̃: 170 HURT stats (rel) min: <.01% max: 5.71% x̄: 0.36% x̃: 0.14% 95% mean confidence interval for cycles value: -127.23 38.17 95% mean confidence interval for cycles %-change: -0.12% 0.03% Inconclusive result (value mean confidence interval includes 0). total fills in shared programs: 9935 -> 9908 (-0.27%) fills in affected programs: 2208 -> 2181 (-1.22%) helped: 27 HURT: 0 Skylake total instructions in shared programs: 17766078 -> 17765186 (<.01%) instructions in affected programs: 822017 -> 821125 (-0.11%) helped: 399 HURT: 1 helped stats (abs) min: 1 max: 20 x̄: 2.27 x̃: 2 helped stats (rel) min: 0.07% max: 4.46% x̄: 0.15% x̃: 0.10% HURT stats (abs) min: 12 max: 12 x̄: 12.00 x̃: 12 HURT stats (rel) min: 0.50% max: 0.50% x̄: 0.50% x̃: 0.50% 95% mean confidence interval for instructions value: -2.39 -2.07 95% mean confidence interval for instructions %-change: -0.18% -0.11% Instructions are helped. total cycles in shared programs: 470905548 -> 470907497 (<.01%) cycles in affected programs: 78598491 -> 78600440 (<.01%) helped: 202 HURT: 192 helped stats (abs) min: 1 max: 3690 x̄: 228.98 x̃: 60 helped stats (rel) min: <.01% max: 4.51% x̄: 0.24% x̃: 0.03% HURT stats (abs) min: 1 max: 2260 x̄: 251.05 x̃: 77 HURT stats (rel) min: <.01% max: 5.31% x̄: 0.24% x̃: 0.06% 95% mean confidence interval for cycles value: -45.01 54.90 95% mean confidence interval for cycles %-change: -0.07% 0.05% Inconclusive result (value mean confidence interval includes 0). total spills in shared programs: 9941 -> 9943 (0.02%) spills in affected programs: 26 -> 28 (7.69%) helped: 0 HURT: 1 total fills in shared programs: 10293 -> 10268 (-0.24%) fills in affected programs: 2391 -> 2366 (-1.05%) helped: 27 HURT: 1 Broadwell total instructions in shared programs: 17463211 -> 17462366 (<.01%) instructions in affected programs: 861444 -> 860599 (-0.10%) helped: 399 HURT: 1 helped stats (abs) min: 1 max: 20 x̄: 2.14 x̃: 2 helped stats (rel) min: 0.03% max: 4.46% x̄: 0.14% x̃: 0.09% HURT stats (abs) min: 7 max: 7 x̄: 7.00 x̃: 7 HURT stats (rel) min: 0.33% max: 0.33% x̄: 0.33% x̃: 0.33% 95% mean confidence interval for instructions value: -2.26 -1.97 95% mean confidence interval for instructions %-change: -0.17% -0.10% Instructions are helped. total cycles in shared programs: 507048912 -> 506898243 (-0.03%) cycles in affected programs: 79806433 -> 79655764 (-0.19%) helped: 248 HURT: 136 helped stats (abs) min: 1 max: 8450 x̄: 1124.18 x̃: 64 helped stats (rel) min: <.01% max: 5.91% x̄: 0.83% x̃: 0.05% HURT stats (abs) min: 2 max: 7632 x̄: 942.12 x̃: 103 HURT stats (rel) min: <.01% max: 5.62% x̄: 0.71% x̃: 0.08% 95% mean confidence interval for cycles value: -647.01 -137.73 95% mean confidence interval for cycles %-change: -0.47% -0.10% Cycles are helped. total spills in shared programs: 22996 -> 22998 (<.01%) spills in affected programs: 31 -> 33 (6.45%) helped: 0 HURT: 1 total fills in shared programs: 25951 -> 25923 (-0.11%) fills in affected programs: 2444 -> 2416 (-1.15%) helped: 29 HURT: 1 Haswell total instructions in shared programs: 15841325 -> 15840554 (<.01%) instructions in affected programs: 869679 -> 868908 (-0.09%) helped: 394 HURT: 6 helped stats (abs) min: 1 max: 20 x̄: 2.15 x̃: 2 helped stats (rel) min: 0.06% max: 4.46% x̄: 0.14% x̃: 0.09% HURT stats (abs) min: 7 max: 18 x̄: 12.83 x̃: 13 HURT stats (rel) min: 0.32% max: 0.82% x̄: 0.59% x̃: 0.61% 95% mean confidence interval for instructions value: -2.16 -1.69 95% mean confidence interval for instructions %-change: -0.16% -0.09% Instructions are helped. total cycles in shared programs: 520417167 -> 520279766 (-0.03%) cycles in affected programs: 80949963 -> 80812562 (-0.17%) helped: 246 HURT: 139 helped stats (abs) min: 1 max: 8152 x̄: 790.08 x̃: 129 helped stats (rel) min: <.01% max: 11.46% x̄: 0.70% x̃: 0.09% HURT stats (abs) min: 1 max: 7085 x̄: 409.78 x̃: 80 HURT stats (rel) min: <.01% max: 5.25% x̄: 0.31% x̃: 0.06% 95% mean confidence interval for cycles value: -526.34 -187.43 95% mean confidence interval for cycles %-change: -0.49% -0.18% Cycles are helped. total spills in shared programs: 21714 -> 21729 (0.07%) spills in affected programs: 174 -> 189 (8.62%) helped: 0 HURT: 6 total fills in shared programs: 22136 -> 22132 (-0.02%) fills in affected programs: 2848 -> 2844 (-0.14%) helped: 31 HURT: 6 Ivy Bridge total instructions in shared programs: 15177059 -> 15177003 (<.01%) instructions in affected programs: 79370 -> 79314 (-0.07%) helped: 29 HURT: 0 helped stats (abs) min: 1 max: 2 x̄: 1.93 x̃: 2 helped stats (rel) min: 0.06% max: 0.16% x̄: 0.08% x̃: 0.07% 95% mean confidence interval for instructions value: -2.03 -1.83 95% mean confidence interval for instructions %-change: -0.09% -0.07% Instructions are helped. total cycles in shared programs: 420424359 -> 420417254 (<.01%) cycles in affected programs: 29562648 -> 29555543 (-0.02%) helped: 23 HURT: 6 helped stats (abs) min: 2 max: 2741 x̄: 432.57 x̃: 142 helped stats (rel) min: <.01% max: 0.26% x̄: 0.04% x̃: 0.02% HURT stats (abs) min: 4 max: 1184 x̄: 474.00 x̃: 226 HURT stats (rel) min: <.01% max: 0.11% x̄: 0.05% x̃: 0.05% 95% mean confidence interval for cycles value: -553.48 63.48 95% mean confidence interval for cycles %-change: -0.05% <.01% Inconclusive result (value mean confidence interval includes 0). total fills in shared programs: 6420 -> 6393 (-0.42%) fills in affected programs: 1901 -> 1874 (-1.42%) helped: 27 HURT: 0 No changes on any earlier Intel platforms. Reviewed-by: Matt Turner <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4156>
* nir: make fsat return 0.0 with NaN instead of passing it throughRhys Perry2020-05-072-5/+14
| | | | | | | | | This is how lower_fsat and ACO implements fsat and is a more useful definition since it can be exactly created from fmin(fmax(a, 0.0), 1.0). Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3716>
* nir: add missing group_memory_barrier handlingRhys Perry2020-05-052-0/+2
| | | | | | | | | | | | | | | Totals from 2 (0.00% of 127638) affected shaders: VGPRs: 164 -> 168 (+2.44%) CodeSize: 18420 -> 18756 (+1.82%) Instrs: 3658 -> 3700 (+1.15%) Cycles: 82912 -> 83080 (+0.20%) VMEM: 70 -> 69 (-1.43%) PreVGPRs: 155 -> 168 (+8.39%) Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> CC: <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4889>
* nir: Always create UBO variable when lowering uniforms to uboLouis-Francis Ratté-Boulianne2020-05-051-2/+27
| | | | | | | | | | | | | | | | | | | Zink needs to know the sizes of UBOs, and for normal UBOs we get this from the nir_var_mem_ubo variables. This allows us to treat all of these the same way. We're about to need the same information for the in-progress D3D12 driver, so let's do this in a central location instead of in the driver. This version is also a bit more careful than the Zink version. In particular, for two reasons: 1. We increase the variable bindings when we adjust the pre-existing UBOs. 2. We increase shader->info.num_ubos when we insert a new UBO variable. Reviewed-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4734>
* compiler/nir: move tan-calculation to helperErik Faye-Lund2020-05-041-0/+6
| | | | | | Reviewed-by: Alyssa Rosenzweig <[email protected]> Reviewed-By: Karol Herbst <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4811>
* intel/fs: Add and use a new load_simd_width_intel intrinsicCaio Marcelo de Oliveira Filho2020-05-011-0/+3
| | | | | | | | | | | | | | | | | | | | | Intrinsic to get the SIMD width, which not always the same as subgroup size. Starting with a small scope (Intel), but we can rename it later to generalize if this turns out useful for other drivers. Change brw_nir_lower_cs_intrinsics() to use this intrinsic instead of a width will be passed as argument. The pass also used to optimized load_subgroup_id for the case that the workgroup fitted into a single thread (it will be constant zero). This optimization moved together with lowering of the SIMD. This is a preparation for letting the drivers call it before the brw_compile_cs() step. No shader-db changes in BDW, SKL, ICL and TGL. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4794>
* nir: Add new linking helper to set linked driver locations.Timur Kristóf2020-04-292-0/+108
| | | | | | | | | | | | | | This commit introduces a new function nir_assign_linked_io_var_locations which is intended to help with assigning driver locations to shaders during linking, primarily aimed at the VS->TCS->TES->GS stages. It ensures that the linked shaders have the same driver locations, and it also packs these as close to each other as possible. Signed-off-by: Timur Kristóf <[email protected]> Reviewed-by: Rhys Perry <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4388>
* nir/combine_stores: Handle volatileJason Ekstrand2020-04-282-1/+66
| | | | | Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4767>
* nir/dead_write_vars: Handle volatileJason Ekstrand2020-04-282-0/+61
| | | | | | | | We can't remove volatile writes and we can't combine them with other volatile writes so all we can do is clear the unused bits. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4767>