| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
Fixes: c643979228 (intel/fs: Choose memory message type based on bit size)
Fixes: dEQP-VK.subgroups.ballot_broadcast.compute.subgroupbroadcast_i8vec2
Reviewed-by: Jason Ekstrand <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5000>
|
|
|
|
|
|
|
| |
Improves nohw drawoverhead 8-ubos update throughput by 13.493% +/-
0.391444% (n=15).
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5011>
|
|
|
|
|
|
|
|
|
| |
Looking at perf, the drawoverhead test case was now spending 13% CPU (89%
in that function) on stack management.
nohw drawoverhead throughput 1.03902% +/- 0.380257% (n=13).
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4996>
|
|
|
|
|
|
| |
nohw drawoverhead 8 UBOs test throughput 1.06093% +/- 0.363376% (n=10).
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4996>
|
|
|
|
|
|
|
|
|
|
| |
This is for an optimization I plan in a following commit. I found I had
to add likely()s to avoid a perf regression from branch prediction.
On the drawoverhead 8 UBOs test, the HW can't quite keep up with the CPU,
but if I set nohw then this change is 1.32023% +/- 0.373053% (n=10).
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4996>
|
|
|
|
|
|
|
|
| |
For some CPU-side-only optimizations, it can be nice to disable rendering
so that we can see what the impact is even on cases where the GPU can't
quite keep up.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4996>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit fixes a GS regression introduced in !4562 where
ir3's GS lowering pass was moved from common code (ir3_nir) to
freedreno-specific code (ir3_shader). For GS support in turnip, we
need to add the GS lowering pass back in, this time in tu_shader.
As for the nir_gather_info change, the GS lowering pass has always
introduced a discard_if intrinsic into the GS. Previously, we simply
ran nir_shader_gather_info before GS lowering, but now since we lower
the GS before we need to remove the assertion that only a FS can use
the discard_if intrinsic.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4892>
|
|
|
|
|
|
|
|
|
| |
And try a bit harder to find an optimal layout. Improves on a sub-
optimal layout we arrive at in the 4 MRT pass in manhattan, picking
up a bit more than 3%.
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4976>
|
|
|
|
|
|
|
|
| |
The blob only uses single page alignment, and empirically that appears
to work just fine.
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4976>
|
|
|
|
|
|
|
|
| |
A simple standalone thing to run through a bunch of GMEM layouts for a
given gpu.
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4976>
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4976>
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4976>
|
|
|
|
|
|
|
|
| |
Somehow the initialization of this got lost somewhere along the way,
resulting in assuming minx/miny are always zero.
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4976>
|
|
|
|
|
|
|
|
|
| |
We potentially don't know yet what the resulting scissor bounds are, so
we can't assume this when estimating number of bins per pipe for VSC
size calculations.
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4976>
|
|
|
|
|
|
|
|
|
|
| |
There is an annoying spec corner on Android. Because VkSwapchain is
implemented in the Vulkan loader on Android which may not know about
this extension, we have to handle it as a special case inside the
driver. We only have to do this on Android and only for VkSwapchainKHR.
Acked-by: Samuel Pitoiset <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4882>
|
|
|
|
|
| |
Reviewed-by: Samuel Pitoiset <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4882>
|
|
|
|
|
|
|
| |
Now that layout code supports this, we can enable it.
Signed-off-by: Jonathan Marek <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5009>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It seems this is actually a "minimum pitch" value. For example
TFETCH6_2_BYTE means a minimum pitch of 128 bytes for mipmap levels.
This fixes breakage with compressed formats. For example this test:
dEQP-VK.pipeline.sampler.view_type.2d.format.eac_r11_snorm_block.mipmap.linear.lod.equal_min_3_max_3
Fixes: a34b3fa198a4f ("freedreno/fdl: Align after dividing by block size")
Signed-off-by: Jonathan Marek <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5009>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The nir_analyze_ubo_ranges pass removes all UBO block 0 loads to reverse
what nir_lower_uniforms_to_ubo() had done, and we only upload UBO pointers
to the HW for UBO block 1-N, so let's just fix up the shader state.
Fixes an off by one in const state layout setup, and some really dodgy
register addressing trying to deal with dynamic UBO indices when the UBO
pointers happen to be at the start of the constbuf.
There's no fixes tag, though this fixes a bug from September, because it
would require the num_ubos fix in nir_lower_uniforms_to_ubo.
Reviewed-by: Kristian H. Kristensen <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4992>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The fixed commit was really nice in mostly fixing num_ubos to reflect the
shader after lowering, but for
dEQP-GLES31.functional.compute.basic.ubo_to_ssbo_single_invocation there
are no default uniforms and so we skipped the increment, even though we
shifted the block index up.
Fixes: 4777ee1a62f0 ("nir: Always create UBO variable when lowering uniforms to ubo")
Reviewed-by: Kristian H. Kristensen <[email protected]>
Reviewed-by: Erik Faye-Lund <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4992>
|
|
|
|
| |
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4967>
|
|
|
|
|
|
| |
Final cleanup commit now that they're the same.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4967>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Using non-write flags is pretty dubious -- it means the kernel tracking an
array of read-only consumers of the BO and having exclusive consumers wait
on each reader's fence. It allows multiple readers through dma-bufs to do
work in parallel, but at the cost of kernel CPU time and memory management
of the shared array. Other drivers have dropped this distinction since
dma-buf sharing is usually producer-consumer, not producer-two-consumers,
and the userspace and kernel space tracking is expensive.
For us, this lets us drop the flags passed in for relocs and tracked in
the ringbuffer reloc lists. The end result of the flags reduction work is
drawoverhead uniforms test throughput 2.37195% +/- 0.365579% (n=15)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4967>
|
|
|
|
|
|
|
| |
We can avoid passing these flags around in the DRM backends by just
marking ring BOs up front.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4967>
|
|
|
|
| |
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4967>
|
|
|
|
|
|
|
|
| |
It's silly to have all the reloc emitters passing around FD_RELOC_READ
when you have to have it set on all relocs (that don't include WRITE,
which implies read) for the kernel to actually track the fences on the BO.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4967>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
v2: outline into a separate function and also optimize additions (by Daniel Schürmann)
Totals from affected shaders: (VEGA)
SGPRS: 938888 -> 941496 (0.28 %)
VGPRS: 832068 -> 831532 (-0.06 %)
Spilled SGPRs: 618 -> 618 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 3696 -> 3696 (0.00 %) dwords per thread
Code Size: 72893900 -> 72558928 (-0.46 %) bytes
LDS: 18201 -> 18201 (0.00 %) blocks
Max Waves: 64256 -> 64268 (0.02 %)
Signed-off-by: Samuel Pitoiset <[email protected]>
Co-authored-by: Daniel Schürmann <[email protected]>
Reviewed-by: Rhys Perry <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4419>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These are the result of lowering to CSSA, and should be removed if possible
Totals from affected shaders: (VEGA)
SGPRS: 544544 -> 544544 (0.00 %)
VGPRS: 418224 -> 418224 (0.00 %)
Spilled SGPRs: 141826 -> 141826 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 65853740 -> 64703380 (-1.75 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 13669 -> 13669 (0.00 %)
Reviewed-by: Rhys Perry <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4952>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
../src/mesa/main/glthread_draw.c: In function ‘_mesa_marshal_MultiDrawElementsBaseVertex’:
../src/mesa/main/glthread_draw.c:812:36: error: implicit declaration of function ‘alloca’; did you mean ‘malloc’? [-Werror=implicit-function-declaration]
812 | const GLvoid **out_indices = alloca(sizeof(indices[0]) * draw_count);
| ^~~~~~
| malloc
../src/mesa/main/glthread_draw.c:812:36: error: initialization of ‘const GLvoid **’ {aka ‘const void **’} from ‘int’ makes pointer from integer without a cast [-Werror=int-conversion]
cc1: some warnings being treated as errors
Include c99_alloca.h to portably make the alloca() prototype available.
Fixes: 2840bc30
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4920>
|
|
|
|
|
|
|
|
|
|
|
| |
It seems that some of the new shader and sampler states added with
Halti0 are not self-synchronizing anymore. Make sure to stall the FE
before loading those new states to avoid corruption of the in-flight
draw state.
Signed-off-by: Lucas Stach <[email protected]>
Reviewed-by: Jonathan Marek <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3963>
|
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Acked-by: Bas Nieuwenhuizen <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4985>
|
|
|
|
|
|
|
|
| |
Loosely based on ANV.
Signed-off-by: Samuel Pitoiset <[email protected]>
Acked-by: Bas Nieuwenhuizen <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4985>
|
|
|
|
|
|
|
|
|
|
|
| |
Vulkan 1.2 seems rejected. This hardcodes the Android version to
1.1.107.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2936
Fixes: 7f5462e349a ("radv: enable Vulkan 1.2")
Signed-off-by: Samuel Pitoiset <[email protected]>
Acked-by: Bas Nieuwenhuizen <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4985>
|
|
|
|
|
|
|
|
|
|
| |
Also fix alignments and add umad24 and umul24 options.
Fixes: 6747a984f59ea9a2dd74b98d59cb8fdb028969ae
r600: Enable tesselation for NIR
Signed-off-by: Gert Wollny <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4982>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
TMUSLOD register is the same that TMUS but having the same effect that
setting disable_autolod on the TMU configuration parameter 2.
So using that register is potentially more efficient, as in several
cases we would be able to skip writing P2.
One case where we can't use it is for texture cube maps, as we need to
use TMUSCM.
v2: don't put a comment in the middle of the conditions (Iago)
Reviewed-by: Iago Toral Quiroga <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4962>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Texture access has three configuration parameters, P0 (texture), P1
(sampler) and P2(lookup). P1 and P2 are optional, but if P2 is needed
(like for example to set the offset for texelFetchOffset), then you
need to set P1.
But until now when setting up P1 we were asking the driver to fill up
the address with the shader state. But in that case we can just fill
that address with the default value NULL.
So let's avoid asking the driver to fill that default values, and do
it directly on the compiler. This is a good-to-have on OpenGL, and
likely would be needed on Vulkan.
Reviewed-by: Iago Toral Quiroga <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4962>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 1bc71e8b655f2f02b3e3a0af34c7cad12b9cb83d already did that for
the 3rd offset, but it also needs to do it for the 2nd (to handle 1d
array).
Fixes assertion failures with Vulkan CTS tests using 1darray
targets. Seems that there isn't too many 1darray tests on OpenGL CTS,
and OpenGL-ES don't support 1d arrays, but the same problem could
arise eventually on OpenGL.
Reviewed-by: Iago Toral Quiroga <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4962>
|
|
|
|
|
|
|
|
| |
Closes: #2939
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4988>
|
|
|
|
|
|
|
|
| |
This fixes the sky being red in OpenMW, as well as some of the Mesa
demos using shadows (shadowtex, shadow_sampler).
Reviewed-by: Alyssa Rosenzweig <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4997>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When the nir backend is used, the create_shader
call is supposed to release state->ir.nir.
When the cache hits, create_shader is not called,
thus state->ir.nir should be freed.
There is nothing to be done for the TGSI case as the
tokens release is done by the caller.
This fixes a leak noticed in:
https://gitlab.freedesktop.org/mesa/mesa/-/issues/2931
Fixes: 4bb919b0b8b4ed6f6a7049c3f8d294b74b50e198
Signed-off-by: Axel Davy <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4980>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The shader helped for spills and fills is the big compute shader in Dirt
Showdown. One of the shaders hurt for spills and fills on Broadwell is
the big compute shader in Bioshock Infinite, but combined with the
previous commit, it's still an impovement.
Tiger Lake
total instructions in shared programs: 21833218 -> 21832449 (<.01%)
instructions in affected programs: 66104 -> 65335 (-1.16%)
helped: 106
HURT: 14
helped stats (abs) min: 1 max: 67 x̄: 7.87 x̃: 5
helped stats (rel) min: 0.19% max: 5.76% x̄: 1.27% x̃: 0.95%
HURT stats (abs) min: 1 max: 14 x̄: 4.64 x̃: 1
HURT stats (rel) min: 0.19% max: 4.12% x̄: 1.41% x̃: 0.19%
95% mean confidence interval for instructions value: -8.51 -4.30
95% mean confidence interval for instructions %-change: -1.23% -0.69%
Instructions are helped.
total cycles in shared programs: 506180109 -> 506196314 (<.01%)
cycles in affected programs: 1671429 -> 1687634 (0.97%)
helped: 37
HURT: 84
helped stats (abs) min: 1 max: 490 x̄: 73.27 x̃: 24
helped stats (rel) min: 0.02% max: 7.98% x̄: 1.25% x̃: 0.41%
HURT stats (abs) min: 1 max: 5000 x̄: 225.19 x̃: 8
HURT stats (rel) min: 0.03% max: 10.22% x̄: 1.22% x̃: 0.42%
95% mean confidence interval for cycles value: 2.85 265.00
95% mean confidence interval for cycles %-change: 0.04% 0.88%
Cycles are HURT.
Ice Lake and Skylake had similar results. (Ice Lake shown)
total instructions in shared programs: 19961317 -> 19960543 (<.01%)
instructions in affected programs: 30268 -> 29494 (-2.56%)
helped: 39
HURT: 0
helped stats (abs) min: 1 max: 142 x̄: 19.85 x̃: 7
helped stats (rel) min: 0.19% max: 7.87% x̄: 2.33% x̃: 2.31%
95% mean confidence interval for instructions value: -29.46 -10.23
95% mean confidence interval for instructions %-change: -2.95% -1.71%
Instructions are helped.
total cycles in shared programs: 498863755 -> 498865843 (<.01%)
cycles in affected programs: 1831136 -> 1833224 (0.11%)
helped: 57
HURT: 65
helped stats (abs) min: 1 max: 1400 x̄: 128.93 x̃: 25
helped stats (rel) min: 0.05% max: 3.49% x̄: 0.89% x̃: 0.71%
HURT stats (abs) min: 1 max: 1887 x̄: 145.18 x̃: 15
HURT stats (rel) min: 0.02% max: 9.88% x̄: 1.83% x̃: 0.73%
95% mean confidence interval for cycles value: -58.30 92.53
95% mean confidence interval for cycles %-change: 0.16% 0.97%
Inconclusive result (value mean confidence interval includes 0).
total spills in shared programs: 8774 -> 8773 (-0.01%)
spills in affected programs: 20 -> 19 (-5.00%)
helped: 1
HURT: 0
total fills in shared programs: 9496 -> 9494 (-0.02%)
fills in affected programs: 40 -> 38 (-5.00%)
helped: 1
HURT: 0
Broadwell
total instructions in shared programs: 17859373 -> 17858548 (<.01%)
instructions in affected programs: 38452 -> 37627 (-2.15%)
helped: 31
HURT: 0
helped stats (abs) min: 1 max: 143 x̄: 26.61 x̃: 10
helped stats (rel) min: 0.19% max: 7.87% x̄: 2.57% x̃: 2.69%
95% mean confidence interval for instructions value: -39.79 -13.44
95% mean confidence interval for instructions %-change: -3.25% -1.89%
Instructions are helped.
total cycles in shared programs: 525858109 -> 525869236 (<.01%)
cycles in affected programs: 2058597 -> 2069724 (0.54%)
helped: 44
HURT: 75
helped stats (abs) min: 2 max: 1330 x̄: 187.84 x̃: 23
helped stats (rel) min: 0.04% max: 31.31% x̄: 2.13% x̃: 0.85%
HURT stats (abs) min: 1 max: 3915 x̄: 258.56 x̃: 47
HURT stats (rel) min: 0.02% max: 10.53% x̄: 2.81% x̃: 2.21%
95% mean confidence interval for cycles value: -26.06 213.07
95% mean confidence interval for cycles %-change: 0.19% 1.78%
Inconclusive result (value mean confidence interval includes 0).
total spills in shared programs: 25744 -> 25730 (-0.05%)
spills in affected programs: 1578 -> 1564 (-0.89%)
helped: 4
HURT: 2
total fills in shared programs: 31710 -> 31689 (-0.07%)
fills in affected programs: 4346 -> 4325 (-0.48%)
helped: 3
HURT: 3
Haswell
total instructions in shared programs: 16228399 -> 16227783 (<.01%)
instructions in affected programs: 22201 -> 21585 (-2.77%)
helped: 27
HURT: 0
helped stats (abs) min: 1 max: 68 x̄: 22.81 x̃: 11
helped stats (rel) min: 0.19% max: 7.87% x̄: 2.92% x̃: 2.86%
95% mean confidence interval for instructions value: -31.96 -13.66
95% mean confidence interval for instructions %-change: -3.68% -2.15%
Instructions are helped.
total cycles in shared programs: 538613967 -> 538701354 (0.02%)
cycles in affected programs: 1653044 -> 1740431 (5.29%)
helped: 36
HURT: 81
helped stats (abs) min: 2 max: 708 x̄: 104.50 x̃: 17
helped stats (rel) min: <.01% max: 15.01% x̄: 1.67% x̃: 0.65%
HURT stats (abs) min: 1 max: 30100 x̄: 1125.30 x̃: 304
HURT stats (rel) min: 0.02% max: 16.21% x̄: 8.98% x̃: 11.60%
95% mean confidence interval for cycles value: 23.78 1470.01
95% mean confidence interval for cycles %-change: 4.29% 7.12%
Cycles are HURT.
total spills in shared programs: 23418 -> 23409 (-0.04%)
spills in affected programs: 177 -> 168 (-5.08%)
helped: 2
HURT: 0
total fills in shared programs: 25919 -> 25896 (-0.09%)
fills in affected programs: 568 -> 545 (-4.05%)
helped: 3
HURT: 0
Ivy Bridge
total instructions in shared programs: 15265983 -> 15265759 (<.01%)
instructions in affected programs: 8418 -> 8194 (-2.66%)
helped: 5
HURT: 0
helped stats (abs) min: 18 max: 99 x̄: 44.80 x̃: 26
helped stats (rel) min: 1.74% max: 4.26% x̄: 3.12% x̃: 3.00%
95% mean confidence interval for instructions value: -86.29 -3.31
95% mean confidence interval for instructions %-change: -4.43% -1.81%
Instructions are helped.
total cycles in shared programs: 422930336 -> 422929589 (<.01%)
cycles in affected programs: 59347 -> 58600 (-1.26%)
helped: 3
HURT: 2
helped stats (abs) min: 72 max: 1060 x̄: 433.33 x̃: 168
helped stats (rel) min: 1.14% max: 3.48% x̄: 2.23% x̃: 2.06%
HURT stats (abs) min: 265 max: 288 x̄: 276.50 x̃: 276
HURT stats (rel) min: 4.79% max: 5.64% x̄: 5.22% x̃: 5.22%
95% mean confidence interval for cycles value: -829.08 530.28
95% mean confidence interval for cycles %-change: -4.43% 5.93%
Inconclusive result (value mean confidence interval includes 0).
total spills in shared programs: 4953 -> 4946 (-0.14%)
spills in affected programs: 344 -> 337 (-2.03%)
helped: 2
HURT: 0
total fills in shared programs: 5548 -> 5521 (-0.49%)
fills in affected programs: 838 -> 811 (-3.22%)
helped: 2
HURT: 0
No shader-db changes on any earlier Intel platform.
Reviewed-by: Rhys Perry <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4515>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Like 1f72857739b ("nir/algebraic: add some half packing optimizations"),
but for the pack_half_2x16_split variant.
The shader helped for spills and fills is the big compute shader in
Bioshock Infinite.
Tiger Lake
total instructions in shared programs: 21834539 -> 21833218 (<.01%)
instructions in affected programs: 60119 -> 58798 (-2.20%)
helped: 105
HURT: 0
helped stats (abs) min: 5 max: 50 x̄: 12.58 x̃: 9
helped stats (rel) min: 0.86% max: 26.46% x̄: 2.58% x̃: 1.70%
95% mean confidence interval for instructions value: -14.35 -10.81
95% mean confidence interval for instructions %-change: -3.20% -1.97%
Instructions are helped.
total cycles in shared programs: 506215169 -> 506180109 (<.01%)
cycles in affected programs: 1445088 -> 1410028 (-2.43%)
helped: 97
HURT: 8
helped stats (abs) min: 1 max: 16882 x̄: 387.76 x̃: 26
helped stats (rel) min: 0.05% max: 18.31% x̄: 1.77% x̃: 1.34%
HURT stats (abs) min: 21 max: 635 x̄: 319.12 x̃: 212
HURT stats (rel) min: 0.39% max: 20.08% x̄: 8.96% x̃: 4.46%
95% mean confidence interval for cycles value: -782.96 115.15
95% mean confidence interval for cycles %-change: -1.74% -0.16%
Inconclusive result (value mean confidence interval includes 0).
Ice Lake, Skylake, and Broadwell had similar results. (Ice Lake shown)
total instructions in shared programs: 19962974 -> 19961317 (<.01%)
instructions in affected programs: 63471 -> 61814 (-2.61%)
helped: 105
HURT: 0
helped stats (abs) min: 6 max: 82 x̄: 15.78 x̃: 11
helped stats (rel) min: 1.11% max: 28.65% x̄: 3.17% x̃: 2.16%
95% mean confidence interval for instructions value: -18.38 -13.18
95% mean confidence interval for instructions %-change: -3.86% -2.48%
Instructions are helped.
total cycles in shared programs: 498908953 -> 498863755 (<.01%)
cycles in affected programs: 1566998 -> 1521800 (-2.88%)
helped: 89
HURT: 15
helped stats (abs) min: 2 max: 17502 x̄: 532.19 x̃: 69
helped stats (rel) min: 0.07% max: 18.54% x̄: 4.71% x̃: 3.12%
HURT stats (abs) min: 3 max: 661 x̄: 144.47 x̃: 16
HURT stats (rel) min: 0.14% max: 20.57% x̄: 4.29% x̃: 0.30%
95% mean confidence interval for cycles value: -903.93 34.74
95% mean confidence interval for cycles %-change: -4.50% -2.32%
Inconclusive result (value mean confidence interval includes 0).
total spills in shared programs: 8776 -> 8774 (-0.02%)
spills in affected programs: 25 -> 23 (-8.00%)
helped: 1
HURT: 0
total fills in shared programs: 9500 -> 9496 (-0.04%)
fills in affected programs: 46 -> 42 (-8.70%)
helped: 1
HURT: 0
Haswell
total instructions in shared programs: 16229912 -> 16228399 (<.01%)
instructions in affected programs: 61257 -> 59744 (-2.47%)
helped: 105
HURT: 0
helped stats (abs) min: 6 max: 51 x̄: 14.41 x̃: 11
helped stats (rel) min: 0.77% max: 28.65% x̄: 3.08% x̃: 2.15%
95% mean confidence interval for instructions value: -16.14 -12.68
95% mean confidence interval for instructions %-change: -3.77% -2.40%
Instructions are helped.
total cycles in shared programs: 538654481 -> 538613967 (<.01%)
cycles in affected programs: 1448966 -> 1408452 (-2.80%)
helped: 58
HURT: 47
helped stats (abs) min: 9 max: 22604 x̄: 957.00 x̃: 74
helped stats (rel) min: 0.40% max: 18.81% x̄: 6.22% x̃: 3.03%
HURT stats (abs) min: 5 max: 3720 x̄: 318.98 x̃: 49
HURT stats (rel) min: 0.20% max: 34.50% x̄: 5.05% x̃: 2.12%
95% mean confidence interval for cycles value: -999.84 228.14
95% mean confidence interval for cycles %-change: -2.86% 0.51%
Inconclusive result (value mean confidence interval includes 0).
Ivy Bridge
total instructions in shared programs: 15266086 -> 15265983 (<.01%)
instructions in affected programs: 7272 -> 7169 (-1.42%)
helped: 3
HURT: 0
helped stats (abs) min: 21 max: 41 x̄: 34.33 x̃: 41
helped stats (rel) min: 0.66% max: 5.43% x̄: 2.44% x̃: 1.23%
total cycles in shared programs: 422930883 -> 422930336 (<.01%)
cycles in affected programs: 49259 -> 48712 (-1.11%)
helped: 3
HURT: 0
helped stats (abs) min: 106 max: 221 x̄: 182.33 x̃: 220
helped stats (rel) min: 0.71% max: 5.95% x̄: 2.46% x̃: 0.72%
No changes on any earilier Intel platforms.
Reviewed-by: Rhys Perry <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4515>
|
|
|
|
|
|
|
|
|
|
|
|
| |
When a = -1.0, pack_half_2x16(vec2(0x0000, 0xBC00)) will produce
0xBC000000. The ishr will produce 0xFFFFBC00. The replacement
pack_half_2x16(vec2(0xBC00, 0x0000)) will produce 0x0000BC00.
Fixes: 1f72857739b ("nir/algebraic: add some half packing optimizations")
Reviewed-by: Rhys Perry <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Cc: Connor Abbott <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4515>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On all Gen7+ platforms except DG1, the URB is a subsection of the
configurable L3 cache, and so the size can vary. The size listed
in the documentation on those platforms is an "example size", picked
by calculating it based on an arbitrarily chosen L3 config.
Hardcoding a value for those platforms provides no value and only
confuses people trying to fill out these tables when doing hardware
enabling. anv and iris never use this field. i965 uses it to
initialize brw->urb.size, but then updates that in update_urb_size()
to be the correct value, so the initial value doesn't matter.
Delete the values for Gen7+ and update the comment accordingly.
Reviewed-by: Jordan Justen <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4969>
|
|
|
|
|
|
|
|
|
|
|
| |
Android support EGL 1.5 from Q onwards,
so limit EGL ver to 1.4 for P and below.
Closes: #2892
Signed-off-by: Abhishek Kumar <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4951>
|
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
| |
This doesn't seem to work perfect, but I'm not sure what is possible
in GL vs Vulkan here
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2867
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4835>
|
|
|
|
|
|
|
| |
This doesn't do much for now, but it will keep thing cleaner in the next
commit.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4835>
|
|
|
|
|
|
|
| |
We're about to load some more extension-pointers as well, so let's
create a separate place for doing this.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4835>
|
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4401>
|