path: root/src/compiler/nir
Commit message | Author | Age | Files | Lines
...
* nir: Fold f2f16(b2f32(x)) to b2f16(x) | Alyssa Rosenzweig | 2020-06-02 | 1 | -0/+2
  By definition. This reduces register pressure on freedreno so that the noubo expected failure goes away.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
  Reviewed-by: Rob Clark <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5002>
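The "by definition" claim can be checked numerically. The sketch below is only an illustration in plain Python, not NIR code: the helper names are hypothetical, and it assumes b2f32 yields exactly 0.0 or 1.0, both of which are exactly representable in half precision.

```python
import struct

def b2f32(b):
    """Boolean to 32-bit float (assumed semantics: 1.0 if true, else 0.0)."""
    return struct.unpack('<f', struct.pack('<f', 1.0 if b else 0.0))[0]

def f2f16_bits(x):
    """Convert a float to IEEE half precision and return its raw 16 bits."""
    return struct.unpack('<H', struct.pack('<e', x))[0]

def b2f16_bits(b):
    """Boolean to half float, as raw bits: 0x3C00 is 1.0, 0x0000 is 0.0."""
    return 0x3C00 if b else 0x0000

# Both paths agree for every boolean input, so the conversion chain can be folded.
for b in (False, True):
    assert f2f16_bits(b2f32(b)) == b2f16_bits(b)
```

Because the intermediate 32-bit value is never needed, folding the pair removes a temporary, which is consistent with the register pressure reduction the commit reports.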
* meson: use gnu_symbol_visibility argument | Dylan Baker | 2020-06-01 | 1 | -8/+16
  This uses a meson builtin to handle -fvisibility=hidden. This is nice because we don't need to track which languages are used, if C++ is suddenly added meson just does the right thing.
  Acked-by: Matt Turner <[email protected]>
  Reviewed-by: Eric Engestrom <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4740>
* nir: lower_tex: Don't normalize coordinates for TXF with RECT | Gert Wollny | 2020-05-28 | 1 | -1/+2
  v2: remove the option to actually request normalization and its application in Intel < Gen6 (Jason)
  v3: Also don't lower for query operations (Jason)
  Fixes: 1ce8060c25c7f2c7a54159fab6a6974c0ba182a8 nir/lower_tex: support for lowering RECT textures
  Signed-off-by: Gert Wollny <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5105>
* nir: Fix sources for image atomic fadd | Jason Ekstrand | 2020-05-26 | 1 | -1/+1
  Somehow we ended up with an extra scalar source up-front. It doesn't look like any drivers use this opcode yet so no real harm has been done by it being wrong.
  Acked-by: Alyssa Rosenzweig <[email protected]>
  Reviewed-by: Rob Clark <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5218>
* nir: fix lowering to scratch with boolean access | Rhys Perry | 2020-05-26 | 1 | -6/+7
  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
  Fixes: 18ed82b084c79bf63666f2da22e5d675fb01aa26 ('nir: Add a pass for selectively lowering variables to scratch space')
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5214>
* nir: Add un/pack_32_4x8 opcodes | Alyssa Rosenzweig | 2020-05-25 | 1 | -0/+7
  Complement the existing un/pack_32_2x16 opcodes. These are useful for 8-bit format packing. On Midgard, they are equivalent to just a 32-bit move, but other GPUs could lower to other packs if needed.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5107>
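To make the 4x8 packing concrete, here is a minimal Python sketch of what such a pack/unpack pair computes. The function names mirror the opcode names for readability only, and the byte order (component 0 in the least significant byte) is an assumption made for this illustration, not something the commit message states.

```python
def pack_32_4x8(components):
    """Pack four 8-bit values into one 32-bit word.

    Assumption: component 0 lands in the lowest byte.
    """
    x, y, z, w = (c & 0xFF for c in components)
    return x | (y << 8) | (z << 16) | (w << 24)

def unpack_32_4x8(word):
    """Split a 32-bit word back into four 8-bit values (inverse of the above)."""
    return tuple((word >> (8 * i)) & 0xFF for i in range(4))

rgba = (0x11, 0x22, 0x33, 0x44)
word = pack_32_4x8(rgba)
assert unpack_32_4x8(word) == rgba
```

On hardware where a four-byte vector and a 32-bit scalar share the same register bits, this operation moves no data, which matches the commit's remark that it is just a 32-bit move on Midgard.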
* nir: replace fnv1a hash function with xxhash | Dmitriy Nester | 2020-05-25 | 5 | -18/+17
  xxhash is faster than fnv1a in almost all circumstances, so we're switching to it globally.
  Signed-off-by: Dmytro Nester <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4020>
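For reference, the hash being replaced is simple enough to sketch in a few lines. This is the standard 32-bit FNV-1a algorithm written in Python for illustration, not Mesa's C implementation; xxhash is not part of the Python standard library, so it is not shown.

```python
FNV_OFFSET_BASIS_32 = 0x811C9DC5  # standard 32-bit FNV offset basis
FNV_PRIME_32 = 0x01000193         # standard 32-bit FNV prime

def fnv1a_32(data: bytes) -> int:
    """32-bit FNV-1a: XOR in one byte, then multiply by the prime."""
    h = FNV_OFFSET_BASIS_32
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME_32) & 0xFFFFFFFF  # keep the state at 32 bits
    return h

print(hex(fnv1a_32(b"a")))
```

The loop consumes one byte per iteration and each multiply depends on the previous one, which is a plausible reason a hash that processes wider chunks of input at a time would be faster on long keys.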
* spirv: add ReadClockKHR support with device scope | Samuel Pitoiset | 2020-05-24 | 1 | -1/+2
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5117>
* tree-wide: fix deprecated GitLab URLs | Eric Engestrom | 2020-05-23 | 1 | -1/+1
  They will stop working in the next GitLab release, so let's update them ASAP to make sure things are propagated to everyone by then.
  See: https://about.gitlab.com/releases/2020/05/06/gitlab-com-13-0-breaking-changes/#removal-of-deprecated-project-paths
  Cc: [email protected]
  Signed-off-by: Eric Engestrom <[email protected]>
  Acked-by: Alyssa Rosenzweig <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5111>
* nir/copy_prop_vars: Record progress in more places | Jason Ekstrand | 2020-05-22 | 1 | -0/+3
  Fixes: 96c32d7776 "nir/copy_prop_vars: handle load/store of vector..."
  Reviewed-by: Dave Airlie <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5170>
* nir/opt_deref: Report progress if we remove a deref | Jason Ekstrand | 2020-05-22 | 1 | -1/+3
  Fixes: a1c688517de "nir/opt_deref: Properly optimize ptr_as_array..."
  Reviewed-by: Dave Airlie <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5170>
* nir/lower_double_ops: Rework the if (progress) tree | Jason Ekstrand | 2020-05-22 | 1 | -1/+8
  Fixes: d7d35a9522 "nir/lower_doubles: Use the new NIR lowering..."
  Reviewed-by: Dave Airlie <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5170>
* compiler: delete leftover autotools test wrapper | Eric Engestrom | 2020-05-20 | 1 | -3/+0
  Signed-off-by: Eric Engestrom <[email protected]>
  Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5114>
* nir: Add const to nir_intrinsic_src_components | Jason Ekstrand | 2020-05-19 | 1 | -1/+1
  Reviewed-by: Kenneth Graunke <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5108>
* nir: Add fclamp_pos opcode | Alyssa Rosenzweig | 2020-05-19 | 1 | -0/+1
  Corresponds to the .pos modifier on all Mali GPUs (lima and panfrost).
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
  Acked-by: Jason Ekstrand <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5102>
* nir: Add fsat_signed opcode | Alyssa Rosenzweig | 2020-05-19 | 1 | -0/+1
  Exists on later Mali. Equivalent to clamp(x, -1.0, 1.0)
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
  Acked-by: Jason Ekstrand <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5102>
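The stated equivalence for fsat_signed is easy to sketch. The Python function below is only an illustration of clamp(x, -1.0, 1.0); the name is borrowed from the opcode, and NaN handling is deliberately left unspecified because the commit message does not describe it.

```python
def fsat_signed(x: float) -> float:
    """Signed saturate: clamp x to the closed range [-1.0, 1.0].

    Behavior for NaN inputs is not modeled here (assumption: callers pass
    ordinary finite or infinite values).
    """
    return max(-1.0, min(1.0, x))

for value in (-3.0, -1.0, 0.25, 1.0, 2.5):
    print(value, "->", fsat_signed(value))
```

Having a single opcode for this lets a backend with a matching hardware modifier emit it directly instead of a separate min/max pair.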
* nir: Add a store_reg helper and use the builder in phis_to_regs | Jason Ekstrand | 2020-05-19 | 2 | -21/+25
  Reviewed-by: Alyssa Rosenzweig <[email protected]>
  Reviewed-by: Rob Clark <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5094>
* nir: Add a new helper for iterating phi sources leaving a block | Jason Ekstrand | 2020-05-19 | 3 | -15/+30
  This takes the same callback as nir_foreach_src except it walks all phi sources which leave a given block.
  Reviewed-by: Alyssa Rosenzweig <[email protected]>
  Reviewed-by: Rob Clark <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5094>
* nir/clone: Re-use clone_alu for nir_alu_instr_clone | Jason Ekstrand | 2020-05-19 | 1 | -21/+17
  All it takes are a couple small tweaks to the clone infrastructure to allow us to use it without any remap table at all. This reduces code duplication and the chances for bugs that come with it. In particular, the hand-rolled nir_alu_instr_clone didn't preserve no_[un]signed_wrap, or source/destination modifiers.
  Reviewed-by: Alyssa Rosenzweig <[email protected]>
  Reviewed-by: Rob Clark <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5094>
* nir/opt_if: use nir_src_as_bool in opt_peel_loop_initial_if helper | Rhys Perry | 2020-05-19 | 1 | -12/+10
  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4757>
* nir/opt_if: run opt_peel_loop_initial_if after all other optimizations | Rhys Perry | 2020-05-19 | 1 | -5/+44
  Fixes dEQP-VK.graphicsfuzz.loops-ifs-continues-call with RADV. opt_if_loop_terminator can cause this optimization or opt_if_simplification to be run on the non-SSA code.
  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
  Fixes: 52c8bc0130a ('nir: make opt_if_loop_terminator() less strict')
  Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2943
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4757>
* nir: Add documentation for each jump instruction type | Jason Ekstrand | 2020-05-19 | 1 | -0/+18
  Reviewed-by: Alyssa Rosenzweig <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5101>
* nir: Use a switch statement in nir_handle_add_jump | Jason Ekstrand | 2020-05-19 | 1 | -13/+20
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
  Reviewed-by: Karol Herbst <[email protected]>
  Reviewed-by: Alyssa Rosenzweig <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5101>
* nir: Validate jump instructions as an instruction type | Jason Ekstrand | 2020-05-19 | 1 | -30/+39
  This has the downside of putting block successor validation in two places that are a bit further apart. However, handling them as a special case makes the code more confusing than needed. At least two different people have not noticed that we don't have jump instruction validation in the last week or two and added it. Being able to search for validate_jump_instr is useful.
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
  Reviewed-by: Karol Herbst <[email protected]>
  Reviewed-by: Alyssa Rosenzweig <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5101>
* nir: Consider atomic counter intrinsics when setting writes_memory | Caio Marcelo de Oliveira Filho | 2020-05-18 | 1 | -0/+22
  In i965 these get lowered after gather info, so let's consider them too.
  Fixes piglit.spec.arb_framebuffer_no_attachments.arb_framebuffer_no_attachments-atomic in Gen9, HSW and IVB.
  Fixes: 6a6c36e9776 ("intel/fs: Use writes_memory from shader_info")
  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5093>
* nir: Use deref intrinsics to set writes_memory when gathering info | Caio Marcelo de Oliveira Filho | 2020-05-18 | 1 | -0/+29
  Reviewed-by: Kenneth Graunke <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4815>
* nir: Use 8-bit types for most info fields | Jason Ekstrand | 2020-05-15 | 2 | -11/+11
  This shrinks nir_intrinsics.c.o from 73K to 35K and nir_opcodes.c.o from 64K to 31K on a release build.
  Reviewed-by: Alyssa Rosenzweig <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5045>
* Revert "nir/validate: validate the stride for deref_ptr_as_array" | Karol Herbst | 2020-05-14 | 1 | -1/+0
  This reverts commit 667e14e7bd759a77e732c4de09fb978ee3816eaf
* nir/validate: validate the stride for deref_ptr_as_array | Karol Herbst | 2020-05-14 | 1 | -0/+1
  Signed-off-by: Karol Herbst <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4068>
* nir/deref: copy ptr_stride when rematerializing | Karol Herbst | 2020-05-14 | 1 | -1/+4
  Signed-off-by: Karol Herbst <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4068>
* nir: add pass to lower disjoint wrmask's | Rob Clark | 2020-05-13 | 3 | -0/+235
  Signed-off-by: Rob Clark <[email protected]>
  Reviewed-by: Kristian H. Kristensen <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* nir: add helper to copy const_index[] | Rob Clark | 2020-05-13 | 1 | -0/+27
  It seems less brittle to not assume they are in the same order for src and dst instructions.
  Signed-off-by: Rob Clark <[email protected]>
  Reviewed-by: Kristian H. Kristensen <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* nir: fix indices for ir3 ssbo_atomic intrinsics | Rob Clark | 2020-05-13 | 1 | -10/+10
  Caught by the sanity checking in nir_intrinsic_copy_const_indices() (which is introduced by the next patch).
  Signed-off-by: Rob Clark <[email protected]>
  Reviewed-by: Kristian H. Kristensen <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* freedreno/ir3: Drop wrmask for ir3 local and global store intrinsics | Kristian H. Kristensen | 2020-05-13 | 1 | -2/+2
  These intrinsics are supposed to map to the underlying hardware instructions, which don't have wrmask. We use them when we lower store_output in the geometry pipeline and since store_output gets lowered to temps, we always see full wrmasks there.
* nir: Add some docs to the metadata types | Jason Ekstrand | 2020-05-14 | 1 | -0/+51
  Reviewed-by: Alyssa Rosenzweig <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5028>
* nir: Include num_ubos in the printed shader (if nonzero). | Eric Anholt | 2020-05-14 | 1 | -0/+2
  I keep wanting this number for debugging shaders.
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4858>
* nir: reset ssa-defs as non-divergent during divergence analysis instead of upfront | Daniel Schürmann | 2020-05-13 | 1 | -21/+36
  Reviewed-by: Jason Ekstrand <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4062>
* nir: simplify phi handling in divergence analysis | Daniel Schürmann | 2020-05-13 | 1 | -113/+116
  This patch adds some control flow information to the state to keep track of whether a loop contains divergent continue or break statements, so that this property does not have to be recalculated for every phi.
  Reviewed-by: Jason Ekstrand <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4062>
* nir: rework phi handling in divergence analysis | Daniel Schürmann | 2020-05-13 | 1 | -173/+214
  This patch splits the visit_phi() function into three different ones according to the kind of phi (merge-node, loop-header or loop-exit) and calls them when visiting the cf_nodes. This allows loops to be revisited only if the loop header's phis have changed.
  Reviewed-by: Jason Ekstrand <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4062>
* nir: refactor divergence analysis state | Daniel Schürmann | 2020-05-13 | 1 | -35/+37
  Reviewed-by: Jason Ekstrand <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4062>
* nir: add nir_intrinsic_elect to divergence analysis | Daniel Schürmann | 2020-05-13 | 1 | -0/+1
  Reviewed-by: Jason Ekstrand <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4062>
* nir: Make "divergent" a property of an SSA value | Jason Ekstrand | 2020-05-13 | 3 | -64/+94
  v2: fix usage in ACO (by Daniel Schürmann)
  Reviewed-by: Rhys Perry <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4062>
* turnip: Execute ir3_nir_lower_gs pass again | Brian Ho | 2020-05-12 | 1 | -2/+5
  This commit fixes a GS regression introduced in !4562 where ir3's GS lowering pass was moved from common code (ir3_nir) to freedreno-specific code (ir3_shader). For GS support in turnip, we need to add the GS lowering pass back in, this time in tu_shader.
  As for the nir_gather_info change, the GS lowering pass has always introduced a discard_if intrinsic into the GS. Previously, we simply ran nir_shader_gather_info before GS lowering, but now since we lower the GS before we need to remove the assertion that only a FS can use the discard_if intrinsic.
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4892>
* nir: Fix count when we didn't lower load_uniforms but did shift load_ubos. | Eric Anholt | 2020-05-12 | 1 | -1/+1
  The fixed commit was really nice in mostly fixing num_ubos to reflect the shader after lowering, but for dEQP-GLES31.functional.compute.basic.ubo_to_ssbo_single_invocation there are no default uniforms and so we skipped the increment, even though we shifted the block index up.
  Fixes: 4777ee1a62f0 ("nir: Always create UBO variable when lowering uniforms to ubo")
  Reviewed-by: Kristian H. Kristensen <[email protected]>
  Reviewed-by: Erik Faye-Lund <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4992>
* nir/algebraic: Eliminate useless extract before unpack | Ian Romanick | 2020-05-11 | 1 | -0/+5
  The shader helped for spills and fills is the big compute shader in Dirt Showdown. One of the shaders hurt for spills and fills on Broadwell is the big compute shader in Bioshock Infinite, but combined with the previous commit, it's still an improvement.

  Tiger Lake
  total instructions in shared programs: 21833218 -> 21832449 (<.01%)
  instructions in affected programs: 66104 -> 65335 (-1.16%)
  helped: 106
  HURT: 14
  helped stats (abs) min: 1 max: 67 x̄: 7.87 x̃: 5
  helped stats (rel) min: 0.19% max: 5.76% x̄: 1.27% x̃: 0.95%
  HURT stats (abs) min: 1 max: 14 x̄: 4.64 x̃: 1
  HURT stats (rel) min: 0.19% max: 4.12% x̄: 1.41% x̃: 0.19%
  95% mean confidence interval for instructions value: -8.51 -4.30
  95% mean confidence interval for instructions %-change: -1.23% -0.69%
  Instructions are helped.

  total cycles in shared programs: 506180109 -> 506196314 (<.01%)
  cycles in affected programs: 1671429 -> 1687634 (0.97%)
  helped: 37
  HURT: 84
  helped stats (abs) min: 1 max: 490 x̄: 73.27 x̃: 24
  helped stats (rel) min: 0.02% max: 7.98% x̄: 1.25% x̃: 0.41%
  HURT stats (abs) min: 1 max: 5000 x̄: 225.19 x̃: 8
  HURT stats (rel) min: 0.03% max: 10.22% x̄: 1.22% x̃: 0.42%
  95% mean confidence interval for cycles value: 2.85 265.00
  95% mean confidence interval for cycles %-change: 0.04% 0.88%
  Cycles are HURT.

  Ice Lake and Skylake had similar results. (Ice Lake shown)
  total instructions in shared programs: 19961317 -> 19960543 (<.01%)
  instructions in affected programs: 30268 -> 29494 (-2.56%)
  helped: 39
  HURT: 0
  helped stats (abs) min: 1 max: 142 x̄: 19.85 x̃: 7
  helped stats (rel) min: 0.19% max: 7.87% x̄: 2.33% x̃: 2.31%
  95% mean confidence interval for instructions value: -29.46 -10.23
  95% mean confidence interval for instructions %-change: -2.95% -1.71%
  Instructions are helped.

  total cycles in shared programs: 498863755 -> 498865843 (<.01%)
  cycles in affected programs: 1831136 -> 1833224 (0.11%)
  helped: 57
  HURT: 65
  helped stats (abs) min: 1 max: 1400 x̄: 128.93 x̃: 25
  helped stats (rel) min: 0.05% max: 3.49% x̄: 0.89% x̃: 0.71%
  HURT stats (abs) min: 1 max: 1887 x̄: 145.18 x̃: 15
  HURT stats (rel) min: 0.02% max: 9.88% x̄: 1.83% x̃: 0.73%
  95% mean confidence interval for cycles value: -58.30 92.53
  95% mean confidence interval for cycles %-change: 0.16% 0.97%
  Inconclusive result (value mean confidence interval includes 0).

  total spills in shared programs: 8774 -> 8773 (-0.01%)
  spills in affected programs: 20 -> 19 (-5.00%)
  helped: 1
  HURT: 0

  total fills in shared programs: 9496 -> 9494 (-0.02%)
  fills in affected programs: 40 -> 38 (-5.00%)
  helped: 1
  HURT: 0

  Broadwell
  total instructions in shared programs: 17859373 -> 17858548 (<.01%)
  instructions in affected programs: 38452 -> 37627 (-2.15%)
  helped: 31
  HURT: 0
  helped stats (abs) min: 1 max: 143 x̄: 26.61 x̃: 10
  helped stats (rel) min: 0.19% max: 7.87% x̄: 2.57% x̃: 2.69%
  95% mean confidence interval for instructions value: -39.79 -13.44
  95% mean confidence interval for instructions %-change: -3.25% -1.89%
  Instructions are helped.

  total cycles in shared programs: 525858109 -> 525869236 (<.01%)
  cycles in affected programs: 2058597 -> 2069724 (0.54%)
  helped: 44
  HURT: 75
  helped stats (abs) min: 2 max: 1330 x̄: 187.84 x̃: 23
  helped stats (rel) min: 0.04% max: 31.31% x̄: 2.13% x̃: 0.85%
  HURT stats (abs) min: 1 max: 3915 x̄: 258.56 x̃: 47
  HURT stats (rel) min: 0.02% max: 10.53% x̄: 2.81% x̃: 2.21%
  95% mean confidence interval for cycles value: -26.06 213.07
  95% mean confidence interval for cycles %-change: 0.19% 1.78%
  Inconclusive result (value mean confidence interval includes 0).

  total spills in shared programs: 25744 -> 25730 (-0.05%)
  spills in affected programs: 1578 -> 1564 (-0.89%)
  helped: 4
  HURT: 2

  total fills in shared programs: 31710 -> 31689 (-0.07%)
  fills in affected programs: 4346 -> 4325 (-0.48%)
  helped: 3
  HURT: 3

  Haswell
  total instructions in shared programs: 16228399 -> 16227783 (<.01%)
  instructions in affected programs: 22201 -> 21585 (-2.77%)
  helped: 27
  HURT: 0
  helped stats (abs) min: 1 max: 68 x̄: 22.81 x̃: 11
  helped stats (rel) min: 0.19% max: 7.87% x̄: 2.92% x̃: 2.86%
  95% mean confidence interval for instructions value: -31.96 -13.66
  95% mean confidence interval for instructions %-change: -3.68% -2.15%
  Instructions are helped.

  total cycles in shared programs: 538613967 -> 538701354 (0.02%)
  cycles in affected programs: 1653044 -> 1740431 (5.29%)
  helped: 36
  HURT: 81
  helped stats (abs) min: 2 max: 708 x̄: 104.50 x̃: 17
  helped stats (rel) min: <.01% max: 15.01% x̄: 1.67% x̃: 0.65%
  HURT stats (abs) min: 1 max: 30100 x̄: 1125.30 x̃: 304
  HURT stats (rel) min: 0.02% max: 16.21% x̄: 8.98% x̃: 11.60%
  95% mean confidence interval for cycles value: 23.78 1470.01
  95% mean confidence interval for cycles %-change: 4.29% 7.12%
  Cycles are HURT.

  total spills in shared programs: 23418 -> 23409 (-0.04%)
  spills in affected programs: 177 -> 168 (-5.08%)
  helped: 2
  HURT: 0

  total fills in shared programs: 25919 -> 25896 (-0.09%)
  fills in affected programs: 568 -> 545 (-4.05%)
  helped: 3
  HURT: 0

  Ivy Bridge
  total instructions in shared programs: 15265983 -> 15265759 (<.01%)
  instructions in affected programs: 8418 -> 8194 (-2.66%)
  helped: 5
  HURT: 0
  helped stats (abs) min: 18 max: 99 x̄: 44.80 x̃: 26
  helped stats (rel) min: 1.74% max: 4.26% x̄: 3.12% x̃: 3.00%
  95% mean confidence interval for instructions value: -86.29 -3.31
  95% mean confidence interval for instructions %-change: -4.43% -1.81%
  Instructions are helped.

  total cycles in shared programs: 422930336 -> 422929589 (<.01%)
  cycles in affected programs: 59347 -> 58600 (-1.26%)
  helped: 3
  HURT: 2
  helped stats (abs) min: 72 max: 1060 x̄: 433.33 x̃: 168
  helped stats (rel) min: 1.14% max: 3.48% x̄: 2.23% x̃: 2.06%
  HURT stats (abs) min: 265 max: 288 x̄: 276.50 x̃: 276
  HURT stats (rel) min: 4.79% max: 5.64% x̄: 5.22% x̃: 5.22%
  95% mean confidence interval for cycles value: -829.08 530.28
  95% mean confidence interval for cycles %-change: -4.43% 5.93%
  Inconclusive result (value mean confidence interval includes 0).

  total spills in shared programs: 4953 -> 4946 (-0.14%)
  spills in affected programs: 344 -> 337 (-2.03%)
  helped: 2
  HURT: 0

  total fills in shared programs: 5548 -> 5521 (-0.49%)
  fills in affected programs: 838 -> 811 (-3.22%)
  helped: 2
  HURT: 0

  No shader-db changes on any earlier Intel platform.
  Reviewed-by: Rhys Perry <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4515>
* nir/algebraic: Add some half packing optimizations for pack_half_2x16_split | Ian Romanick | 2020-05-11 | 1 | -0/+7
  Like 1f72857739b ("nir/algebraic: add some half packing optimizations"), but for the pack_half_2x16_split variant.
  The shader helped for spills and fills is the big compute shader in Bioshock Infinite.

  Tiger Lake
  total instructions in shared programs: 21834539 -> 21833218 (<.01%)
  instructions in affected programs: 60119 -> 58798 (-2.20%)
  helped: 105
  HURT: 0
  helped stats (abs) min: 5 max: 50 x̄: 12.58 x̃: 9
  helped stats (rel) min: 0.86% max: 26.46% x̄: 2.58% x̃: 1.70%
  95% mean confidence interval for instructions value: -14.35 -10.81
  95% mean confidence interval for instructions %-change: -3.20% -1.97%
  Instructions are helped.

  total cycles in shared programs: 506215169 -> 506180109 (<.01%)
  cycles in affected programs: 1445088 -> 1410028 (-2.43%)
  helped: 97
  HURT: 8
  helped stats (abs) min: 1 max: 16882 x̄: 387.76 x̃: 26
  helped stats (rel) min: 0.05% max: 18.31% x̄: 1.77% x̃: 1.34%
  HURT stats (abs) min: 21 max: 635 x̄: 319.12 x̃: 212
  HURT stats (rel) min: 0.39% max: 20.08% x̄: 8.96% x̃: 4.46%
  95% mean confidence interval for cycles value: -782.96 115.15
  95% mean confidence interval for cycles %-change: -1.74% -0.16%
  Inconclusive result (value mean confidence interval includes 0).

  Ice Lake, Skylake, and Broadwell had similar results. (Ice Lake shown)
  total instructions in shared programs: 19962974 -> 19961317 (<.01%)
  instructions in affected programs: 63471 -> 61814 (-2.61%)
  helped: 105
  HURT: 0
  helped stats (abs) min: 6 max: 82 x̄: 15.78 x̃: 11
  helped stats (rel) min: 1.11% max: 28.65% x̄: 3.17% x̃: 2.16%
  95% mean confidence interval for instructions value: -18.38 -13.18
  95% mean confidence interval for instructions %-change: -3.86% -2.48%
  Instructions are helped.

  total cycles in shared programs: 498908953 -> 498863755 (<.01%)
  cycles in affected programs: 1566998 -> 1521800 (-2.88%)
  helped: 89
  HURT: 15
  helped stats (abs) min: 2 max: 17502 x̄: 532.19 x̃: 69
  helped stats (rel) min: 0.07% max: 18.54% x̄: 4.71% x̃: 3.12%
  HURT stats (abs) min: 3 max: 661 x̄: 144.47 x̃: 16
  HURT stats (rel) min: 0.14% max: 20.57% x̄: 4.29% x̃: 0.30%
  95% mean confidence interval for cycles value: -903.93 34.74
  95% mean confidence interval for cycles %-change: -4.50% -2.32%
  Inconclusive result (value mean confidence interval includes 0).

  total spills in shared programs: 8776 -> 8774 (-0.02%)
  spills in affected programs: 25 -> 23 (-8.00%)
  helped: 1
  HURT: 0

  total fills in shared programs: 9500 -> 9496 (-0.04%)
  fills in affected programs: 46 -> 42 (-8.70%)
  helped: 1
  HURT: 0

  Haswell
  total instructions in shared programs: 16229912 -> 16228399 (<.01%)
  instructions in affected programs: 61257 -> 59744 (-2.47%)
  helped: 105
  HURT: 0
  helped stats (abs) min: 6 max: 51 x̄: 14.41 x̃: 11
  helped stats (rel) min: 0.77% max: 28.65% x̄: 3.08% x̃: 2.15%
  95% mean confidence interval for instructions value: -16.14 -12.68
  95% mean confidence interval for instructions %-change: -3.77% -2.40%
  Instructions are helped.

  total cycles in shared programs: 538654481 -> 538613967 (<.01%)
  cycles in affected programs: 1448966 -> 1408452 (-2.80%)
  helped: 58
  HURT: 47
  helped stats (abs) min: 9 max: 22604 x̄: 957.00 x̃: 74
  helped stats (rel) min: 0.40% max: 18.81% x̄: 6.22% x̃: 3.03%
  HURT stats (abs) min: 5 max: 3720 x̄: 318.98 x̃: 49
  HURT stats (rel) min: 0.20% max: 34.50% x̄: 5.05% x̃: 2.12%
  95% mean confidence interval for cycles value: -999.84 228.14
  95% mean confidence interval for cycles %-change: -2.86% 0.51%
  Inconclusive result (value mean confidence interval includes 0).

  Ivy Bridge
  total instructions in shared programs: 15266086 -> 15265983 (<.01%)
  instructions in affected programs: 7272 -> 7169 (-1.42%)
  helped: 3
  HURT: 0
  helped stats (abs) min: 21 max: 41 x̄: 34.33 x̃: 41
  helped stats (rel) min: 0.66% max: 5.43% x̄: 2.44% x̃: 1.23%

  total cycles in shared programs: 422930883 -> 422930336 (<.01%)
  cycles in affected programs: 49259 -> 48712 (-1.11%)
  helped: 3
  HURT: 0
  helped stats (abs) min: 106 max: 221 x̄: 182.33 x̃: 220
  helped stats (rel) min: 0.71% max: 5.95% x̄: 2.46% x̃: 0.72%

  No changes on any earlier Intel platforms.
  Reviewed-by: Rhys Perry <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4515>
* nir/algebraic: Optimize ushr of pack_half, not ishr | Ian Romanick | 2020-05-11 | 1 | -1/+1
  When a = -1.0, pack_half_2x16(vec2(0x0000, 0xBC00)) will produce 0xBC000000. The ishr will produce 0xFFFFBC00. The replacement pack_half_2x16(vec2(0xBC00, 0x0000)) will produce 0x0000BC00.
  Fixes: 1f72857739b ("nir/algebraic: add some half packing optimizations")
  Reviewed-by: Rhys Perry <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
  Cc: Connor Abbott <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4515>
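The arithmetic in this message can be reproduced with a short Python sketch. The helpers below are illustrative stand-ins written for this example, not NIR code: pack_half_2x16 is assumed to place the first component in the low 16 bits, and the shifts are modeled on 32-bit values.

```python
import struct

def half_bits(x):
    """Raw 16-bit pattern of x converted to IEEE half precision."""
    return struct.unpack('<H', struct.pack('<e', x))[0]

def pack_half_2x16(x, y):
    """Assumed layout: x in bits 0..15, y in bits 16..31."""
    return half_bits(x) | (half_bits(y) << 16)

def ushr(value, count):
    """Logical (zero-filling) right shift of a 32-bit value."""
    return (value & 0xFFFFFFFF) >> (count & 31)

def ishr(value, count):
    """Arithmetic (sign-extending) right shift of a 32-bit value."""
    value &= 0xFFFFFFFF
    signed = value - (1 << 32) if value & 0x80000000 else value
    return (signed >> (count & 31)) & 0xFFFFFFFF

a = -1.0
packed = pack_half_2x16(0.0, a)       # high half holds -1.0, whose sign bit is set
replacement = pack_half_2x16(a, 0.0)  # what the optimization rewrites the shift to

print(hex(packed), hex(ishr(packed, 16)), hex(ushr(packed, 16)), hex(replacement))
```

Only the logical shift matches the replacement, because any negative value in the high half sets bit 31 and the arithmetic shift then smears ones into the upper 16 bits.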
* nir: do not vectorize load/store if offset can overflow and robustness enabled | Samuel Pitoiset | 2020-05-11 | 3 | -5/+44
  This prevents vectorization for loads/stores that can overflow if the low offset is negative and the range is greater than or equal to 0. The caller can pass the list of variable modes that matter for robust access.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Rhys Perry <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4881>
* nir/algebraic: Optimize some bfe patterns | Ian Romanick | 2020-05-07 | 1 | -0/+18
  v2: Use -x instead of 32-x in shift counts.

  Tiger Lake
  total instructions in shared programs: 17597691 -> 17597405 (<.01%)
  instructions in affected programs: 224557 -> 224271 (-0.13%)
  helped: 74
  HURT: 17
  helped stats (abs) min: 1 max: 71 x̄: 14.36 x̃: 7
  helped stats (rel) min: 0.08% max: 1.80% x̄: 0.50% x̃: 0.37%
  HURT stats (abs) min: 1 max: 141 x̄: 45.71 x̃: 40
  HURT stats (rel) min: 0.03% max: 3.55% x̄: 1.20% x̃: 1.14%
  95% mean confidence interval for instructions value: -10.53 4.24
  95% mean confidence interval for instructions %-change: -0.38% 0.01%
  Inconclusive result (value mean confidence interval includes 0).

  total cycles in shared programs: 333595656 -> 333180770 (-0.12%)
  cycles in affected programs: 70056467 -> 69641581 (-0.59%)
  helped: 91
  HURT: 4
  helped stats (abs) min: 1 max: 25174 x̄: 4571.40 x̃: 400
  helped stats (rel) min: <.01% max: 2.23% x̄: 0.40% x̃: 0.21%
  HURT stats (abs) min: 1 max: 370 x̄: 277.75 x̃: 370
  HURT stats (rel) min: 0.01% max: 0.04% x̄: 0.04% x̃: 0.04%
  95% mean confidence interval for cycles value: -5981.55 -2752.89
  95% mean confidence interval for cycles %-change: -0.48% -0.29%
  Cycles are helped.

  Ice Lake, Skylake, Broadwell, and Haswell had similar results. (Ice Lake shown)
  total instructions in shared programs: 16117204 -> 16116723 (<.01%)
  instructions in affected programs: 207109 -> 206628 (-0.23%)
  helped: 100
  HURT: 0
  helped stats (abs) min: 1 max: 9 x̄: 4.81 x̃: 7
  helped stats (rel) min: 0.10% max: 1.58% x̄: 0.23% x̃: 0.20%
  95% mean confidence interval for instructions value: -5.51 -4.11
  95% mean confidence interval for instructions %-change: -0.27% -0.19%
  Instructions are helped.

  total cycles in shared programs: 330487341 -> 330082421 (-0.12%)
  cycles in affected programs: 68037050 -> 67632130 (-0.60%)
  helped: 89
  HURT: 7
  helped stats (abs) min: 2 max: 24610 x̄: 4567.07 x̃: 400
  helped stats (rel) min: <.01% max: 1.52% x̄: 0.39% x̃: 0.22%
  HURT stats (abs) min: 1 max: 370 x̄: 221.29 x̃: 170
  HURT stats (rel) min: 0.01% max: 1.66% x̄: 0.58% x̃: 0.04%
  95% mean confidence interval for cycles value: -5780.79 -2655.05
  95% mean confidence interval for cycles %-change: -0.42% -0.22%
  Cycles are helped.

  Ivy Bridge
  total instructions in shared programs: 11873641 -> 11873137 (<.01%)
  instructions in affected programs: 147464 -> 146960 (-0.34%)
  helped: 54
  HURT: 0
  helped stats (abs) min: 9 max: 10 x̄: 9.33 x̃: 9
  helped stats (rel) min: 0.29% max: 0.41% x̄: 0.34% x̃: 0.34%
  95% mean confidence interval for instructions value: -9.46 -9.20
  95% mean confidence interval for instructions %-change: -0.35% -0.33%
  Instructions are helped.

  total cycles in shared programs: 175769085 -> 175549519 (-0.12%)
  cycles in affected programs: 60770592 -> 60551026 (-0.36%)
  helped: 54
  HURT: 0
  helped stats (abs) min: 252 max: 13434 x̄: 4066.04 x̃: 1290
  helped stats (rel) min: 0.02% max: 0.74% x̄: 0.34% x̃: 0.26%
  95% mean confidence interval for cycles value: -5323.59 -2808.48
  95% mean confidence interval for cycles %-change: -0.41% -0.27%
  Cycles are helped.

  No changes on any earlier Intel platforms.
  Reviewed-by: Matt Turner <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4156>
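The v2 note about using -x instead of 32-x in shift counts relies on the shift count being taken modulo the bit width. The Python sketch below illustrates that identity under the assumption that a 32-bit shift uses only the low five bits of its count; the helper names are hypothetical and exist only for this example.

```python
def shift_amount(count):
    """Effective shift for a 32-bit operand (assumption: low 5 bits only)."""
    return count & 31

def ishl32(value, count):
    """32-bit left shift with a masked shift count."""
    return (value << shift_amount(count)) & 0xFFFFFFFF

def ushr32(value, count):
    """32-bit logical right shift with a masked shift count."""
    return (value & 0xFFFFFFFF) >> shift_amount(count)

# -x and 32 - x differ by exactly 32, so they agree in their low five bits.
for x in range(64):
    assert shift_amount(-x) == shift_amount(32 - x)

# A negative count therefore behaves like its 32 - x counterpart.
assert ishl32(1, -4) == ishl32(1, 28)
assert ushr32(0xF0000000, -4) == ushr32(0xF0000000, 28)
```

Under that assumption a pattern written with -x needs only a negate rather than a subtract from a constant, while producing the same shifted result.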
* nir/algebraic: Split ibfe and ubfe with two constant sources  Ian Romanick  2020-05-07  2  -0/+35
    I also tried splitting ubfe instructions with one or zero constants, and
    zero shaders in shader-db were affected.

    The "lost" shader is a compute shader that was promoted from SIMD8 to
    SIMD16, so is also counted as the gained shader.

    v2: Further restrict bfe splitting. bfe with multiple constants is
    better on at least some Radeon GPUs. Use -x instead of 32-x in shift
    counts.

    v3: Fix the outer shift count for ibfe lowering. Add c=0 optimizations
    to prevent bad lowering. Both suggested by Rhys. Add shift by -32
    optimizations.

    Tiger Lake
    total instructions in shared programs: 17608764 -> 17596316 (-0.07%)
    instructions in affected programs: 303765 -> 291317 (-4.10%)
    helped: 113
    HURT: 46
    helped stats (abs) min: 1 max: 458 x̄: 120.67 x̃: 21
    helped stats (rel) min: 0.09% max: 11.23% x̄: 3.47% x̃: 1.39%
    HURT stats (abs) min: 1 max: 201 x̄: 25.83 x̃: 6
    HURT stats (rel) min: 0.23% max: 5.18% x̄: 1.53% x̃: 1.11%
    95% mean confidence interval for instructions value: -101.13 -55.45
    95% mean confidence interval for instructions %-change: -2.61% -1.44%
    Instructions are helped.

    total cycles in shared programs: 338390770 -> 333530868 (-1.44%)
    cycles in affected programs: 79438330 -> 74578428 (-6.12%)
    helped: 112
    HURT: 64
    helped stats (abs) min: 2 max: 268955 x̄: 44261.93 x̃: 1452
    helped stats (rel) min: <.01% max: 29.51% x̄: 4.72% x̃: 2.23%
    HURT stats (abs) min: 2 max: 17618 x̄: 1522.41 x̃: 84
    HURT stats (rel) min: <.01% max: 7.34% x̄: 1.35% x̃: 0.34%
    95% mean confidence interval for cycles value: -37232.47 -17993.69
    95% mean confidence interval for cycles %-change: -3.37% -1.65%
    Cycles are helped.

    total spills in shared programs: 8944 -> 8138 (-9.01%)
    spills in affected programs: 3240 -> 2434 (-24.88%)
    helped: 67
    HURT: 0

    total fills in shared programs: 9373 -> 7842 (-16.33%)
    fills in affected programs: 4736 -> 3205 (-32.33%)
    helped: 67
    HURT: 0

    LOST: 1
    GAINED: 2

    Ice Lake and Skylake had similar results. (Ice Lake shown)
    total instructions in shared programs: 16123288 -> 16116876 (-0.04%)
    instructions in affected programs: 241155 -> 234743 (-2.66%)
    helped: 126
    HURT: 2
    helped stats (abs) min: 1 max: 209 x̄: 50.90 x̃: 7
    helped stats (rel) min: 0.07% max: 5.94% x̄: 1.76% x̃: 0.65%
    HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
    HURT stats (rel) min: 0.05% max: 0.24% x̄: 0.15% x̃: 0.15%
    95% mean confidence interval for instructions value: -61.29 -38.89
    95% mean confidence interval for instructions %-change: -2.05% -1.42%
    Instructions are helped.

    total cycles in shared programs: 335419163 -> 330438819 (-1.48%)
    cycles in affected programs: 77515502 -> 72535158 (-6.42%)
    helped: 139
    HURT: 37
    helped stats (abs) min: 2 max: 269140 x̄: 36374.19 x̃: 597
    helped stats (rel) min: <.01% max: 28.60% x̄: 3.67% x̃: 1.31%
    HURT stats (abs) min: 4 max: 17618 x̄: 2045.08 x̃: 174
    HURT stats (rel) min: 0.02% max: 8.32% x̄: 2.61% x̃: 0.62%
    95% mean confidence interval for cycles value: -37799.30 -18795.51
    95% mean confidence interval for cycles %-change: -3.13% -1.57%
    Cycles are helped.

    total spills in shared programs: 8065 -> 7306 (-9.41%)
    spills in affected programs: 3153 -> 2394 (-24.07%)
    helped: 67
    HURT: 0

    total fills in shared programs: 8710 -> 7412 (-14.90%)
    fills in affected programs: 4466 -> 3168 (-29.06%)
    helped: 67
    HURT: 0

    LOST: 1
    GAINED: 1

    Broadwell
    total instructions in shared programs: 14970538 -> 14965967 (-0.03%)
    instructions in affected programs: 227040 -> 222469 (-2.01%)
    helped: 126
    HURT: 2
    helped stats (abs) min: 1 max: 136 x̄: 36.29 x̃: 8
    helped stats (rel) min: 0.07% max: 6.02% x̄: 1.47% x̃: 0.89%
    HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
    HURT stats (rel) min: 0.05% max: 0.24% x̄: 0.14% x̃: 0.14%
    95% mean confidence interval for instructions value: -43.05 -28.37
    95% mean confidence interval for instructions %-change: -1.69% -1.19%
    Instructions are helped.

    total cycles in shared programs: 336237662 -> 333035960 (-0.95%)
    cycles in affected programs: 72066394 -> 68864692 (-4.44%)
    helped: 134
    HURT: 42
    helped stats (abs) min: 4 max: 122640 x̄: 24344.54 x̃: 1833
    helped stats (rel) min: <.01% max: 26.93% x̄: 4.02% x̃: 2.38%
    HURT stats (abs) min: 1 max: 17205 x̄: 1439.69 x̃: 92
    HURT stats (rel) min: <.01% max: 7.12% x̄: 1.34% x̃: 0.62%
    95% mean confidence interval for cycles value: -23753.58 -12629.40
    95% mean confidence interval for cycles %-change: -3.50% -1.98%
    Cycles are helped.

    total spills in shared programs: 21122 -> 20204 (-4.35%)
    spills in affected programs: 3644 -> 2726 (-25.19%)
    helped: 67
    HURT: 0

    total fills in shared programs: 24879 -> 23460 (-5.70%)
    fills in affected programs: 4883 -> 3464 (-29.06%)
    helped: 67
    HURT: 0

    Haswell
    total instructions in shared programs: 13148269 -> 13145444 (-0.02%)
    instructions in affected programs: 137046 -> 134221 (-2.06%)
    helped: 97
    HURT: 3
    helped stats (abs) min: 1 max: 137 x̄: 30.58 x̃: 3
    helped stats (rel) min: 0.14% max: 4.38% x̄: 1.38% x̃: 0.44%
    HURT stats (abs) min: 1 max: 70 x̄: 47.00 x̃: 70
    HURT stats (rel) min: 0.05% max: 5.82% x̄: 3.90% x̃: 5.82%
    95% mean confidence interval for instructions value: -37.15 -19.35
    95% mean confidence interval for instructions %-change: -1.56% -0.89%
    Instructions are helped.

    total cycles in shared programs: 321221834 -> 318333159 (-0.90%)
    cycles in affected programs: 54932349 -> 52043674 (-5.26%)
    helped: 95
    HURT: 53
    helped stats (abs) min: 4 max: 123390 x̄: 30648.39 x̃: 702
    helped stats (rel) min: <.01% max: 28.87% x̄: 4.27% x̃: 2.87%
    HURT stats (abs) min: 4 max: 2357 x̄: 432.49 x̃: 113
    HURT stats (rel) min: <.01% max: 3.44% x̄: 1.03% x̃: 0.54%
    95% mean confidence interval for cycles value: -26154.16 -12881.99
    95% mean confidence interval for cycles %-change: -3.20% -1.55%
    Cycles are helped.

    total spills in shared programs: 19878 -> 19293 (-2.94%)
    spills in affected programs: 3020 -> 2435 (-19.37%)
    helped: 41
    HURT: 2

    total fills in shared programs: 20918 -> 19875 (-4.99%)
    fills in affected programs: 3968 -> 2925 (-26.29%)
    helped: 41
    HURT: 2

    LOST: 0
    GAINED: 1

    Ivy Bridge
    total instructions in shared programs: 11875585 -> 11873641 (-0.02%)
    instructions in affected programs: 78065 -> 76121 (-2.49%)
    helped: 27
    HURT: 0
    helped stats (abs) min: 8 max: 134 x̄: 72.00 x̃: 72
    helped stats (rel) min: 0.36% max: 4.23% x̄: 2.42% x̃: 2.42%
    95% mean confidence interval for instructions value: -83.68 -60.32
    95% mean confidence interval for instructions %-change: -2.78% -2.07%
    Instructions are helped.

    total cycles in shared programs: 178232734 -> 175769085 (-1.38%)
    cycles in affected programs: 50018707 -> 47555058 (-4.93%)
    helped: 27
    HURT: 0
    helped stats (abs) min: 82035 max: 99953 x̄: 91246.26 x̃: 92278
    helped stats (rel) min: 4.40% max: 5.69% x̄: 4.93% x̃: 4.95%
    95% mean confidence interval for cycles value: -93674.20 -88818.32
    95% mean confidence interval for cycles %-change: -5.09% -4.78%
    Cycles are helped.

    total spills in shared programs: 4182 -> 3739 (-10.59%)
    spills in affected programs: 1089 -> 646 (-40.68%)
    helped: 27
    HURT: 0

    total fills in shared programs: 5216 -> 4345 (-16.70%)
    fills in affected programs: 1874 -> 1003 (-46.48%)
    helped: 27
    HURT: 0

    No changes on any earlier Intel platforms.

    Reviewed-by: Matt Turner <[email protected]>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4156>
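The message above describes splitting ubfe and ibfe when both the offset and the width are constants. As an illustration only, here is a Python sketch of one way a 32-bit bitfield extract can be expressed as a left shift followed by a right shift, with negated shift counts. It is not the rule text from the commit: all function names are hypothetical, the mask-based functions are just a reference to compare against, and shifts are assumed to use only the low 5 bits of their count. The explicit zero-width case mirrors the "c=0" concern mentioned in v3, since the shift pair alone would return the wrong value there.

```python
MASK32 = 0xFFFFFFFF


def ishl32(value, count):
    """Left shift of a 32-bit value; count is taken modulo 32."""
    return ((value & MASK32) << (count & 31)) & MASK32


def ushr32(value, count):
    """Logical right shift of a 32-bit value; count is taken modulo 32."""
    return (value & MASK32) >> (count & 31)


def ishr32(value, count):
    """Arithmetic right shift of a 32-bit value; count is taken modulo 32."""
    value &= MASK32
    signed = value - (1 << 32) if value & 0x80000000 else value
    return (signed >> (count & 31)) & MASK32


def ubfe_mask(value, offset, bits):
    """Reference: unsigned extract of `bits` bits starting at `offset`."""
    return ((value & MASK32) >> offset) & ((1 << bits) - 1)


def ibfe_mask(value, offset, bits):
    """Reference: signed extract, sign-extended to 32 bits."""
    field = ubfe_mask(value, offset, bits)
    if bits and field & (1 << (bits - 1)):
        field -= 1 << bits
    return field & MASK32


def ubfe_split(value, offset, bits):
    """Unsigned extract as shl by -(offset+bits), then ushr by -bits."""
    if bits == 0:
        return 0  # zero-width field: the shift pair would be wrong here
    return ushr32(ishl32(value, -(offset + bits)), -bits)


def ibfe_split(value, offset, bits):
    """Signed extract: same left shift, arithmetic right shift."""
    if bits == 0:
        return 0
    return ishr32(ishl32(value, -(offset + bits)), -bits)


if __name__ == "__main__":
    print(hex(ubfe_split(0x12345678, 4, 8)))
    print(hex(ibfe_split(0x000000F0, 4, 4)))
```

The left shift discards the bits above the field and the right shift discards the bits below it, so the choice of logical or arithmetic right shift is the only difference between the unsigned and signed forms.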