| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
| |
By definition.
This reduces register pressure on freedreno so that the noubo expected
failure goes away.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5002>
|
|
|
|
|
|
|
|
|
|
| |
This uses a meson builtin to handle -fvisibility=hidden. This is nice
because we don't need to track which languages are used, if C++ is
suddenly added meson just does the right thing.
Acked-by: Matt Turner <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4740>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
v2: remove the option to actually request normalization and its
application in Intel < Gen6 (Jason)
v3: Also don't lower for query operations (Jason)
Fixes: 1ce8060c25c7f2c7a54159fab6a6974c0ba182a8
nir/lower_tex: support for lowering RECT textures
Signed-off-by: Gert Wollny <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5105>
|
|
|
|
|
|
|
|
|
|
| |
Somehow we ended up with an extra scalar source up-front. It doesn't
look like any drivers use this opcode yet so no real harm has been done
by it being wrong.
Acked-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5218>
|
|
|
|
|
|
|
|
|
| |
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Fixes: 18ed82b084c79bf63666f2da22e5d675fb01aa26
('nir: Add a pass for selectively lowering variables to scratch space')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5214>
|
|
|
|
|
|
|
|
|
|
| |
Complement the existing un/pack_32_2x16 opcodes. These are useful for
8-bit format packing. On Midgard, they are equivalent to just a 32-bit
move, but other GPUs could lower to other packs if needed.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5107>
|
|
|
|
|
|
|
|
|
| |
xxhash is faster than fnv1a in almost all circumstances, so we're
switching to it globally.
Signed-off-by: Dmytro Nester <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4020>
|
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5117>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
They will stop working in the next GitLab release, so let's update them
ASAP to make sure things are propagated to everyone by then.
See:
https://about.gitlab.com/releases/2020/05/06/gitlab-com-13-0-breaking-changes/#removal-of-deprecated-project-paths
Cc: [email protected]
Signed-off-by: Eric Engestrom <[email protected]>
Acked-by: Alyssa Rosenzweig <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5111>
|
|
|
|
|
|
| |
Fixes: 96c32d7776 "nir/copy_prop_vars: handle load/store of vector..."
Reviewed-by: Dave Airlie <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5170>
|
|
|
|
|
|
| |
Fixes: a1c688517de "nir/opt_deref: Properly optimize ptr_as_array..."
Reviewed-by: Dave Airlie <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5170>
|
|
|
|
|
|
| |
Fixes: d7d35a9522 "nir/lower_doubles: Use the new NIR lowering..."
Reviewed-by: Dave Airlie <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5170>
|
|
|
|
|
|
| |
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5114>
|
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5108>
|
|
|
|
|
|
|
|
|
| |
Corresponds to the .pos modifier on all Mali GPUs (lima and panfrost).
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Acked-by: Jason Ekstrand <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5102>
|
|
|
|
|
|
|
|
|
| |
Exists on later Mali. Equivalent to clamp(x, -1.0, 1.0)
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Acked-by: Jason Ekstrand <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5102>
|
|
|
|
|
|
| |
Reviewed-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5094>
|
|
|
|
|
|
|
|
|
| |
This takes the same callback as nir_foreach_src except it walks all phi
sources which leave a given block.
Reviewed-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5094>
|
|
|
|
|
|
|
|
|
|
|
|
| |
All it takes are a couple small tweaks to the clone infrastructure to
allow us to use it without any remap table at all. This reduces code
duplication and the chances for bugs that come with it. In particular,
the hand-rolled nir_alu_instr_clone didn't preserve no_[un]signed_wrap,
or source/destination modifiers.
Reviewed-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5094>
|
|
|
|
|
|
| |
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4757>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes dEQP-VK.graphicsfuzz.loops-ifs-continues-call with RADV.
opt_if_loop_terminator can cause this optimization or
opt_if_simplification to be run on the non-SSA code.
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Fixes: 52c8bc0130a ('nir: make opt_if_loop_terminator() less strict')
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2943
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4757>
|
|
|
|
|
| |
Reviewed-by: Alyssa Rosenzweig <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5101>
|
|
|
|
|
|
|
| |
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
Reviewed-by: Karol Herbst <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5101>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This has the downside of putting block successor validation in two
places that are a bit further apart. However, handling them as a
special case makes the code more confusing than needed. At least two
different people have not noticed that we don't have jump instruction
validation in the last week or two and added it. Being able to search
for validate_jump_instr is useful.
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
Reviewed-by: Karol Herbst <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5101>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In i965 these get lowered after gather info, so let's consider them
too. Fixes
piglit.spec.arb_framebuffer_no_attachments.arb_framebuffer_no_attachments-atomic
in Gen9, HSW and IVB.
Fixes: 6a6c36e9776 ("intel/fs: Use writes_memory from shader_info")
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5093>
|
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4815>
|
|
|
|
|
|
|
|
|
| |
This shrinks nir_intrinsics.c.o from 73K to 35K and nir_opcodes.c.o from
64K to 31K on a release build.
Reviewed-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5045>
|
|
|
| |
This reverts commit 667e14e7bd759a77e732c4de09fb978ee3816eaf
|
|
|
|
|
|
| |
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4068>
|
|
|
|
|
|
| |
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4068>
|
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
| |
It seems less brittle to not assume they are in the same order for src
and dst instructions.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Caught by the sanity checking in nir_intrinsic_copy_const_indices()
(which is introduced by the next patch).
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
These intrinsics are supposed to map to the underlying hardware
instructions, which don't have wrmask. We use them when we lower
store_output in the geometry pipeline and since store_output gets
lowered to temps, we always see full wrmasks there.
|
|
|
|
|
|
| |
Reviewed-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5028>
|
|
|
|
|
|
| |
I keep wanting this number for debugging shaders.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4858>
|
|
|
|
|
|
|
| |
upfront
Reviewed-by: Jason Ekstrand <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4062>
|
|
|
|
|
|
|
|
|
|
| |
This patch adds some control flow information to the
state to keep track whether a loop contains divergent
continue or break statements to not having to
recalculate this property for every phi.
Reviewed-by: Jason Ekstrand <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4062>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch splits the visit_phi() function into
three different ones according to the kind of phi
(merge-node, loop-header or loop-exit) and calls
them when visiting the cf_nodes.
This allows to revisit loops if the loop header's
phis have changed, only.
Reviewed-by: Jason Ekstrand <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4062>
|
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4062>
|
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4062>
|
|
|
|
|
|
|
| |
v2: fix usage in ACO (by Daniel Schürmann)
Reviewed-by: Rhys Perry <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4062>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit fixes a GS regression introduced in !4562 where
ir3's GS lowering pass was moved from common code (ir3_nir) to
freedreno-specific code (ir3_shader). For GS support in turnip, we
need to add the GS lowering pass back in, this time in tu_shader.
As for the nir_gather_info change, the GS lowering pass has always
introduced a discard_if intrinsic into the GS. Previously, we simply
ran nir_shader_gather_info before GS lowering, but now since we lower
the GS before we need to remove the assertion that only a FS can use
the discard_if intrinsic.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4892>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The fixed commit was really nice in mostly fixing num_ubos to reflect the
shader after lowering, but for
dEQP-GLES31.functional.compute.basic.ubo_to_ssbo_single_invocation there
are no default uniforms and so we skipped the increment, even though we
shifted the block index up.
Fixes: 4777ee1a62f0 ("nir: Always create UBO variable when lowering uniforms to ubo")
Reviewed-by: Kristian H. Kristensen <[email protected]>
Reviewed-by: Erik Faye-Lund <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4992>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The shader helped for spills and fills is the big compute shader in Dirt
Showdown. One of the shaders hurt for spills and fills on Broadwell is
the big compute shader in Bioshock Infinite, but combined with the
previous commit, it's still an impovement.
Tiger Lake
total instructions in shared programs: 21833218 -> 21832449 (<.01%)
instructions in affected programs: 66104 -> 65335 (-1.16%)
helped: 106
HURT: 14
helped stats (abs) min: 1 max: 67 x̄: 7.87 x̃: 5
helped stats (rel) min: 0.19% max: 5.76% x̄: 1.27% x̃: 0.95%
HURT stats (abs) min: 1 max: 14 x̄: 4.64 x̃: 1
HURT stats (rel) min: 0.19% max: 4.12% x̄: 1.41% x̃: 0.19%
95% mean confidence interval for instructions value: -8.51 -4.30
95% mean confidence interval for instructions %-change: -1.23% -0.69%
Instructions are helped.
total cycles in shared programs: 506180109 -> 506196314 (<.01%)
cycles in affected programs: 1671429 -> 1687634 (0.97%)
helped: 37
HURT: 84
helped stats (abs) min: 1 max: 490 x̄: 73.27 x̃: 24
helped stats (rel) min: 0.02% max: 7.98% x̄: 1.25% x̃: 0.41%
HURT stats (abs) min: 1 max: 5000 x̄: 225.19 x̃: 8
HURT stats (rel) min: 0.03% max: 10.22% x̄: 1.22% x̃: 0.42%
95% mean confidence interval for cycles value: 2.85 265.00
95% mean confidence interval for cycles %-change: 0.04% 0.88%
Cycles are HURT.
Ice Lake and Skylake had similar results. (Ice Lake shown)
total instructions in shared programs: 19961317 -> 19960543 (<.01%)
instructions in affected programs: 30268 -> 29494 (-2.56%)
helped: 39
HURT: 0
helped stats (abs) min: 1 max: 142 x̄: 19.85 x̃: 7
helped stats (rel) min: 0.19% max: 7.87% x̄: 2.33% x̃: 2.31%
95% mean confidence interval for instructions value: -29.46 -10.23
95% mean confidence interval for instructions %-change: -2.95% -1.71%
Instructions are helped.
total cycles in shared programs: 498863755 -> 498865843 (<.01%)
cycles in affected programs: 1831136 -> 1833224 (0.11%)
helped: 57
HURT: 65
helped stats (abs) min: 1 max: 1400 x̄: 128.93 x̃: 25
helped stats (rel) min: 0.05% max: 3.49% x̄: 0.89% x̃: 0.71%
HURT stats (abs) min: 1 max: 1887 x̄: 145.18 x̃: 15
HURT stats (rel) min: 0.02% max: 9.88% x̄: 1.83% x̃: 0.73%
95% mean confidence interval for cycles value: -58.30 92.53
95% mean confidence interval for cycles %-change: 0.16% 0.97%
Inconclusive result (value mean confidence interval includes 0).
total spills in shared programs: 8774 -> 8773 (-0.01%)
spills in affected programs: 20 -> 19 (-5.00%)
helped: 1
HURT: 0
total fills in shared programs: 9496 -> 9494 (-0.02%)
fills in affected programs: 40 -> 38 (-5.00%)
helped: 1
HURT: 0
Broadwell
total instructions in shared programs: 17859373 -> 17858548 (<.01%)
instructions in affected programs: 38452 -> 37627 (-2.15%)
helped: 31
HURT: 0
helped stats (abs) min: 1 max: 143 x̄: 26.61 x̃: 10
helped stats (rel) min: 0.19% max: 7.87% x̄: 2.57% x̃: 2.69%
95% mean confidence interval for instructions value: -39.79 -13.44
95% mean confidence interval for instructions %-change: -3.25% -1.89%
Instructions are helped.
total cycles in shared programs: 525858109 -> 525869236 (<.01%)
cycles in affected programs: 2058597 -> 2069724 (0.54%)
helped: 44
HURT: 75
helped stats (abs) min: 2 max: 1330 x̄: 187.84 x̃: 23
helped stats (rel) min: 0.04% max: 31.31% x̄: 2.13% x̃: 0.85%
HURT stats (abs) min: 1 max: 3915 x̄: 258.56 x̃: 47
HURT stats (rel) min: 0.02% max: 10.53% x̄: 2.81% x̃: 2.21%
95% mean confidence interval for cycles value: -26.06 213.07
95% mean confidence interval for cycles %-change: 0.19% 1.78%
Inconclusive result (value mean confidence interval includes 0).
total spills in shared programs: 25744 -> 25730 (-0.05%)
spills in affected programs: 1578 -> 1564 (-0.89%)
helped: 4
HURT: 2
total fills in shared programs: 31710 -> 31689 (-0.07%)
fills in affected programs: 4346 -> 4325 (-0.48%)
helped: 3
HURT: 3
Haswell
total instructions in shared programs: 16228399 -> 16227783 (<.01%)
instructions in affected programs: 22201 -> 21585 (-2.77%)
helped: 27
HURT: 0
helped stats (abs) min: 1 max: 68 x̄: 22.81 x̃: 11
helped stats (rel) min: 0.19% max: 7.87% x̄: 2.92% x̃: 2.86%
95% mean confidence interval for instructions value: -31.96 -13.66
95% mean confidence interval for instructions %-change: -3.68% -2.15%
Instructions are helped.
total cycles in shared programs: 538613967 -> 538701354 (0.02%)
cycles in affected programs: 1653044 -> 1740431 (5.29%)
helped: 36
HURT: 81
helped stats (abs) min: 2 max: 708 x̄: 104.50 x̃: 17
helped stats (rel) min: <.01% max: 15.01% x̄: 1.67% x̃: 0.65%
HURT stats (abs) min: 1 max: 30100 x̄: 1125.30 x̃: 304
HURT stats (rel) min: 0.02% max: 16.21% x̄: 8.98% x̃: 11.60%
95% mean confidence interval for cycles value: 23.78 1470.01
95% mean confidence interval for cycles %-change: 4.29% 7.12%
Cycles are HURT.
total spills in shared programs: 23418 -> 23409 (-0.04%)
spills in affected programs: 177 -> 168 (-5.08%)
helped: 2
HURT: 0
total fills in shared programs: 25919 -> 25896 (-0.09%)
fills in affected programs: 568 -> 545 (-4.05%)
helped: 3
HURT: 0
Ivy Bridge
total instructions in shared programs: 15265983 -> 15265759 (<.01%)
instructions in affected programs: 8418 -> 8194 (-2.66%)
helped: 5
HURT: 0
helped stats (abs) min: 18 max: 99 x̄: 44.80 x̃: 26
helped stats (rel) min: 1.74% max: 4.26% x̄: 3.12% x̃: 3.00%
95% mean confidence interval for instructions value: -86.29 -3.31
95% mean confidence interval for instructions %-change: -4.43% -1.81%
Instructions are helped.
total cycles in shared programs: 422930336 -> 422929589 (<.01%)
cycles in affected programs: 59347 -> 58600 (-1.26%)
helped: 3
HURT: 2
helped stats (abs) min: 72 max: 1060 x̄: 433.33 x̃: 168
helped stats (rel) min: 1.14% max: 3.48% x̄: 2.23% x̃: 2.06%
HURT stats (abs) min: 265 max: 288 x̄: 276.50 x̃: 276
HURT stats (rel) min: 4.79% max: 5.64% x̄: 5.22% x̃: 5.22%
95% mean confidence interval for cycles value: -829.08 530.28
95% mean confidence interval for cycles %-change: -4.43% 5.93%
Inconclusive result (value mean confidence interval includes 0).
total spills in shared programs: 4953 -> 4946 (-0.14%)
spills in affected programs: 344 -> 337 (-2.03%)
helped: 2
HURT: 0
total fills in shared programs: 5548 -> 5521 (-0.49%)
fills in affected programs: 838 -> 811 (-3.22%)
helped: 2
HURT: 0
No shader-db changes on any earlier Intel platform.
Reviewed-by: Rhys Perry <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4515>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Like 1f72857739b ("nir/algebraic: add some half packing optimizations"),
but for the pack_half_2x16_split variant.
The shader helped for spills and fills is the big compute shader in
Bioshock Infinite.
Tiger Lake
total instructions in shared programs: 21834539 -> 21833218 (<.01%)
instructions in affected programs: 60119 -> 58798 (-2.20%)
helped: 105
HURT: 0
helped stats (abs) min: 5 max: 50 x̄: 12.58 x̃: 9
helped stats (rel) min: 0.86% max: 26.46% x̄: 2.58% x̃: 1.70%
95% mean confidence interval for instructions value: -14.35 -10.81
95% mean confidence interval for instructions %-change: -3.20% -1.97%
Instructions are helped.
total cycles in shared programs: 506215169 -> 506180109 (<.01%)
cycles in affected programs: 1445088 -> 1410028 (-2.43%)
helped: 97
HURT: 8
helped stats (abs) min: 1 max: 16882 x̄: 387.76 x̃: 26
helped stats (rel) min: 0.05% max: 18.31% x̄: 1.77% x̃: 1.34%
HURT stats (abs) min: 21 max: 635 x̄: 319.12 x̃: 212
HURT stats (rel) min: 0.39% max: 20.08% x̄: 8.96% x̃: 4.46%
95% mean confidence interval for cycles value: -782.96 115.15
95% mean confidence interval for cycles %-change: -1.74% -0.16%
Inconclusive result (value mean confidence interval includes 0).
Ice Lake, Skylake, and Broadwell had similar results. (Ice Lake shown)
total instructions in shared programs: 19962974 -> 19961317 (<.01%)
instructions in affected programs: 63471 -> 61814 (-2.61%)
helped: 105
HURT: 0
helped stats (abs) min: 6 max: 82 x̄: 15.78 x̃: 11
helped stats (rel) min: 1.11% max: 28.65% x̄: 3.17% x̃: 2.16%
95% mean confidence interval for instructions value: -18.38 -13.18
95% mean confidence interval for instructions %-change: -3.86% -2.48%
Instructions are helped.
total cycles in shared programs: 498908953 -> 498863755 (<.01%)
cycles in affected programs: 1566998 -> 1521800 (-2.88%)
helped: 89
HURT: 15
helped stats (abs) min: 2 max: 17502 x̄: 532.19 x̃: 69
helped stats (rel) min: 0.07% max: 18.54% x̄: 4.71% x̃: 3.12%
HURT stats (abs) min: 3 max: 661 x̄: 144.47 x̃: 16
HURT stats (rel) min: 0.14% max: 20.57% x̄: 4.29% x̃: 0.30%
95% mean confidence interval for cycles value: -903.93 34.74
95% mean confidence interval for cycles %-change: -4.50% -2.32%
Inconclusive result (value mean confidence interval includes 0).
total spills in shared programs: 8776 -> 8774 (-0.02%)
spills in affected programs: 25 -> 23 (-8.00%)
helped: 1
HURT: 0
total fills in shared programs: 9500 -> 9496 (-0.04%)
fills in affected programs: 46 -> 42 (-8.70%)
helped: 1
HURT: 0
Haswell
total instructions in shared programs: 16229912 -> 16228399 (<.01%)
instructions in affected programs: 61257 -> 59744 (-2.47%)
helped: 105
HURT: 0
helped stats (abs) min: 6 max: 51 x̄: 14.41 x̃: 11
helped stats (rel) min: 0.77% max: 28.65% x̄: 3.08% x̃: 2.15%
95% mean confidence interval for instructions value: -16.14 -12.68
95% mean confidence interval for instructions %-change: -3.77% -2.40%
Instructions are helped.
total cycles in shared programs: 538654481 -> 538613967 (<.01%)
cycles in affected programs: 1448966 -> 1408452 (-2.80%)
helped: 58
HURT: 47
helped stats (abs) min: 9 max: 22604 x̄: 957.00 x̃: 74
helped stats (rel) min: 0.40% max: 18.81% x̄: 6.22% x̃: 3.03%
HURT stats (abs) min: 5 max: 3720 x̄: 318.98 x̃: 49
HURT stats (rel) min: 0.20% max: 34.50% x̄: 5.05% x̃: 2.12%
95% mean confidence interval for cycles value: -999.84 228.14
95% mean confidence interval for cycles %-change: -2.86% 0.51%
Inconclusive result (value mean confidence interval includes 0).
Ivy Bridge
total instructions in shared programs: 15266086 -> 15265983 (<.01%)
instructions in affected programs: 7272 -> 7169 (-1.42%)
helped: 3
HURT: 0
helped stats (abs) min: 21 max: 41 x̄: 34.33 x̃: 41
helped stats (rel) min: 0.66% max: 5.43% x̄: 2.44% x̃: 1.23%
total cycles in shared programs: 422930883 -> 422930336 (<.01%)
cycles in affected programs: 49259 -> 48712 (-1.11%)
helped: 3
HURT: 0
helped stats (abs) min: 106 max: 221 x̄: 182.33 x̃: 220
helped stats (rel) min: 0.71% max: 5.95% x̄: 2.46% x̃: 0.72%
No changes on any earilier Intel platforms.
Reviewed-by: Rhys Perry <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4515>
|
|
|
|
|
|
|
|
|
|
|
|
| |
When a = -1.0, pack_half_2x16(vec2(0x0000, 0xBC00)) will produce
0xBC000000. The ishr will produce 0xFFFFBC00. The replacement
pack_half_2x16(vec2(0xBC00, 0x0000)) will produce 0x0000BC00.
Fixes: 1f72857739b ("nir/algebraic: add some half packing optimizations")
Reviewed-by: Rhys Perry <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Cc: Connor Abbott <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4515>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This prevents vectorization for loads/stores that can overflow if
the low offset is negative and the range greater or equal than 0.
The caller can pass the list of variable modes that matter for
robust access.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Rhys Perry <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4881>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
v2: Use -x instead of 32-x in shift counts.
Tiger Lake
total instructions in shared programs: 17597691 -> 17597405 (<.01%)
instructions in affected programs: 224557 -> 224271 (-0.13%)
helped: 74
HURT: 17
helped stats (abs) min: 1 max: 71 x̄: 14.36 x̃: 7
helped stats (rel) min: 0.08% max: 1.80% x̄: 0.50% x̃: 0.37%
HURT stats (abs) min: 1 max: 141 x̄: 45.71 x̃: 40
HURT stats (rel) min: 0.03% max: 3.55% x̄: 1.20% x̃: 1.14%
95% mean confidence interval for instructions value: -10.53 4.24
95% mean confidence interval for instructions %-change: -0.38% 0.01%
Inconclusive result (value mean confidence interval includes 0).
total cycles in shared programs: 333595656 -> 333180770 (-0.12%)
cycles in affected programs: 70056467 -> 69641581 (-0.59%)
helped: 91
HURT: 4
helped stats (abs) min: 1 max: 25174 x̄: 4571.40 x̃: 400
helped stats (rel) min: <.01% max: 2.23% x̄: 0.40% x̃: 0.21%
HURT stats (abs) min: 1 max: 370 x̄: 277.75 x̃: 370
HURT stats (rel) min: 0.01% max: 0.04% x̄: 0.04% x̃: 0.04%
95% mean confidence interval for cycles value: -5981.55 -2752.89
95% mean confidence interval for cycles %-change: -0.48% -0.29%
Cycles are helped.
Ice Lake, Skylake, Broadwell, and Haswell had similar results. (Ice Lake shown)
total instructions in shared programs: 16117204 -> 16116723 (<.01%)
instructions in affected programs: 207109 -> 206628 (-0.23%)
helped: 100
HURT: 0
helped stats (abs) min: 1 max: 9 x̄: 4.81 x̃: 7
helped stats (rel) min: 0.10% max: 1.58% x̄: 0.23% x̃: 0.20%
95% mean confidence interval for instructions value: -5.51 -4.11
95% mean confidence interval for instructions %-change: -0.27% -0.19%
Instructions are helped.
total cycles in shared programs: 330487341 -> 330082421 (-0.12%)
cycles in affected programs: 68037050 -> 67632130 (-0.60%)
helped: 89
HURT: 7
helped stats (abs) min: 2 max: 24610 x̄: 4567.07 x̃: 400
helped stats (rel) min: <.01% max: 1.52% x̄: 0.39% x̃: 0.22%
HURT stats (abs) min: 1 max: 370 x̄: 221.29 x̃: 170
HURT stats (rel) min: 0.01% max: 1.66% x̄: 0.58% x̃: 0.04%
95% mean confidence interval for cycles value: -5780.79 -2655.05
95% mean confidence interval for cycles %-change: -0.42% -0.22%
Cycles are helped.
Ivy Bridge
total instructions in shared programs: 11873641 -> 11873137 (<.01%)
instructions in affected programs: 147464 -> 146960 (-0.34%)
helped: 54
HURT: 0
helped stats (abs) min: 9 max: 10 x̄: 9.33 x̃: 9
helped stats (rel) min: 0.29% max: 0.41% x̄: 0.34% x̃: 0.34%
95% mean confidence interval for instructions value: -9.46 -9.20
95% mean confidence interval for instructions %-change: -0.35% -0.33%
Instructions are helped.
total cycles in shared programs: 175769085 -> 175549519 (-0.12%)
cycles in affected programs: 60770592 -> 60551026 (-0.36%)
helped: 54
HURT: 0
helped stats (abs) min: 252 max: 13434 x̄: 4066.04 x̃: 1290
helped stats (rel) min: 0.02% max: 0.74% x̄: 0.34% x̃: 0.26%
95% mean confidence interval for cycles value: -5323.59 -2808.48
95% mean confidence interval for cycles %-change: -0.41% -0.27%
Cycles are helped.
No changes on any earlier Intel platforms.
Reviewed-by: Matt Turner <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4156>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I also tried splitting ubfe instructions with one or zero constants,
and zero shaders in shader-db were affected.
The "lost" shader is a compute shader that was promoted from SIMD8 to
SIMD16, so is also counted as the gained shader.
v2: Further restrict bfe splitting. bfe with multiple constants is
better on at least some Radeon GPUs. Use -x instead of 32-x in shift
counts.
v3: Fix the outer shift count for ibfe lowering. Add c=0 optimizations
to prevent bad lowering. Both suggested by Rhys. Add shift by -32
optimizations.
Tiger Lake
total instructions in shared programs: 17608764 -> 17596316 (-0.07%)
instructions in affected programs: 303765 -> 291317 (-4.10%)
helped: 113
HURT: 46
helped stats (abs) min: 1 max: 458 x̄: 120.67 x̃: 21
helped stats (rel) min: 0.09% max: 11.23% x̄: 3.47% x̃: 1.39%
HURT stats (abs) min: 1 max: 201 x̄: 25.83 x̃: 6
HURT stats (rel) min: 0.23% max: 5.18% x̄: 1.53% x̃: 1.11%
95% mean confidence interval for instructions value: -101.13 -55.45
95% mean confidence interval for instructions %-change: -2.61% -1.44%
Instructions are helped.
total cycles in shared programs: 338390770 -> 333530868 (-1.44%)
cycles in affected programs: 79438330 -> 74578428 (-6.12%)
helped: 112
HURT: 64
helped stats (abs) min: 2 max: 268955 x̄: 44261.93 x̃: 1452
helped stats (rel) min: <.01% max: 29.51% x̄: 4.72% x̃: 2.23%
HURT stats (abs) min: 2 max: 17618 x̄: 1522.41 x̃: 84
HURT stats (rel) min: <.01% max: 7.34% x̄: 1.35% x̃: 0.34%
95% mean confidence interval for cycles value: -37232.47 -17993.69
95% mean confidence interval for cycles %-change: -3.37% -1.65%
Cycles are helped.
total spills in shared programs: 8944 -> 8138 (-9.01%)
spills in affected programs: 3240 -> 2434 (-24.88%)
helped: 67
HURT: 0
total fills in shared programs: 9373 -> 7842 (-16.33%)
fills in affected programs: 4736 -> 3205 (-32.33%)
helped: 67
HURT: 0
LOST: 1
GAINED: 2
Ice Lake and Skylake had similar results. (Ice Lake shown)
total instructions in shared programs: 16123288 -> 16116876 (-0.04%)
instructions in affected programs: 241155 -> 234743 (-2.66%)
helped: 126
HURT: 2
helped stats (abs) min: 1 max: 209 x̄: 50.90 x̃: 7
helped stats (rel) min: 0.07% max: 5.94% x̄: 1.76% x̃: 0.65%
HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel) min: 0.05% max: 0.24% x̄: 0.15% x̃: 0.15%
95% mean confidence interval for instructions value: -61.29 -38.89
95% mean confidence interval for instructions %-change: -2.05% -1.42%
Instructions are helped.
total cycles in shared programs: 335419163 -> 330438819 (-1.48%)
cycles in affected programs: 77515502 -> 72535158 (-6.42%)
helped: 139
HURT: 37
helped stats (abs) min: 2 max: 269140 x̄: 36374.19 x̃: 597
helped stats (rel) min: <.01% max: 28.60% x̄: 3.67% x̃: 1.31%
HURT stats (abs) min: 4 max: 17618 x̄: 2045.08 x̃: 174
HURT stats (rel) min: 0.02% max: 8.32% x̄: 2.61% x̃: 0.62%
95% mean confidence interval for cycles value: -37799.30 -18795.51
95% mean confidence interval for cycles %-change: -3.13% -1.57%
Cycles are helped.
total spills in shared programs: 8065 -> 7306 (-9.41%)
spills in affected programs: 3153 -> 2394 (-24.07%)
helped: 67
HURT: 0
total fills in shared programs: 8710 -> 7412 (-14.90%)
fills in affected programs: 4466 -> 3168 (-29.06%)
helped: 67
HURT: 0
LOST: 1
GAINED: 1
Broadwell
total instructions in shared programs: 14970538 -> 14965967 (-0.03%)
instructions in affected programs: 227040 -> 222469 (-2.01%)
helped: 126
HURT: 2
helped stats (abs) min: 1 max: 136 x̄: 36.29 x̃: 8
helped stats (rel) min: 0.07% max: 6.02% x̄: 1.47% x̃: 0.89%
HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel) min: 0.05% max: 0.24% x̄: 0.14% x̃: 0.14%
95% mean confidence interval for instructions value: -43.05 -28.37
95% mean confidence interval for instructions %-change: -1.69% -1.19%
Instructions are helped.
total cycles in shared programs: 336237662 -> 333035960 (-0.95%)
cycles in affected programs: 72066394 -> 68864692 (-4.44%)
helped: 134
HURT: 42
helped stats (abs) min: 4 max: 122640 x̄: 24344.54 x̃: 1833
helped stats (rel) min: <.01% max: 26.93% x̄: 4.02% x̃: 2.38%
HURT stats (abs) min: 1 max: 17205 x̄: 1439.69 x̃: 92
HURT stats (rel) min: <.01% max: 7.12% x̄: 1.34% x̃: 0.62%
95% mean confidence interval for cycles value: -23753.58 -12629.40
95% mean confidence interval for cycles %-change: -3.50% -1.98%
Cycles are helped.
total spills in shared programs: 21122 -> 20204 (-4.35%)
spills in affected programs: 3644 -> 2726 (-25.19%)
helped: 67
HURT: 0
total fills in shared programs: 24879 -> 23460 (-5.70%)
fills in affected programs: 4883 -> 3464 (-29.06%)
helped: 67
HURT: 0
Haswell
total instructions in shared programs: 13148269 -> 13145444 (-0.02%)
instructions in affected programs: 137046 -> 134221 (-2.06%)
helped: 97
HURT: 3
helped stats (abs) min: 1 max: 137 x̄: 30.58 x̃: 3
helped stats (rel) min: 0.14% max: 4.38% x̄: 1.38% x̃: 0.44%
HURT stats (abs) min: 1 max: 70 x̄: 47.00 x̃: 70
HURT stats (rel) min: 0.05% max: 5.82% x̄: 3.90% x̃: 5.82%
95% mean confidence interval for instructions value: -37.15 -19.35
95% mean confidence interval for instructions %-change: -1.56% -0.89%
Instructions are helped.
total cycles in shared programs: 321221834 -> 318333159 (-0.90%)
cycles in affected programs: 54932349 -> 52043674 (-5.26%)
helped: 95
HURT: 53
helped stats (abs) min: 4 max: 123390 x̄: 30648.39 x̃: 702
helped stats (rel) min: <.01% max: 28.87% x̄: 4.27% x̃: 2.87%
HURT stats (abs) min: 4 max: 2357 x̄: 432.49 x̃: 113
HURT stats (rel) min: <.01% max: 3.44% x̄: 1.03% x̃: 0.54%
95% mean confidence interval for cycles value: -26154.16 -12881.99
95% mean confidence interval for cycles %-change: -3.20% -1.55%
Cycles are helped.
total spills in shared programs: 19878 -> 19293 (-2.94%)
spills in affected programs: 3020 -> 2435 (-19.37%)
helped: 41
HURT: 2
total fills in shared programs: 20918 -> 19875 (-4.99%)
fills in affected programs: 3968 -> 2925 (-26.29%)
helped: 41
HURT: 2
LOST: 0
GAINED: 1
Ivy Bridge
total instructions in shared programs: 11875585 -> 11873641 (-0.02%)
instructions in affected programs: 78065 -> 76121 (-2.49%)
helped: 27
HURT: 0
helped stats (abs) min: 8 max: 134 x̄: 72.00 x̃: 72
helped stats (rel) min: 0.36% max: 4.23% x̄: 2.42% x̃: 2.42%
95% mean confidence interval for instructions value: -83.68 -60.32
95% mean confidence interval for instructions %-change: -2.78% -2.07%
Instructions are helped.
total cycles in shared programs: 178232734 -> 175769085 (-1.38%)
cycles in affected programs: 50018707 -> 47555058 (-4.93%)
helped: 27
HURT: 0
helped stats (abs) min: 82035 max: 99953 x̄: 91246.26 x̃: 92278
helped stats (rel) min: 4.40% max: 5.69% x̄: 4.93% x̃: 4.95%
95% mean confidence interval for cycles value: -93674.20 -88818.32
95% mean confidence interval for cycles %-change: -5.09% -4.78%
Cycles are helped.
total spills in shared programs: 4182 -> 3739 (-10.59%)
spills in affected programs: 1089 -> 646 (-40.68%)
helped: 27
HURT: 0
total fills in shared programs: 5216 -> 4345 (-16.70%)
fills in affected programs: 1874 -> 1003 (-46.48%)
helped: 27
HURT: 0
No changes on any earlier Intel platforms.
Reviewed-by: Matt Turner <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4156>
|