| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:
Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 841590 -> 841332 (-0.03%)
instructions in affected programs: 121957 -> 121699 (-0.21%)
helped: 7
HURT: 0
helped stats (abs) min: 15 max: 54 x̄: 36.86 x̃: 41
helped stats (rel) min: 0.16% max: 0.33% x̄: 0.23% x̃: 0.18%
95% mean confidence interval for instructions value: -49.73 -23.98
95% mean confidence interval for instructions %-change: -0.29% -0.16%
Instructions are helped.
total cycles in shared programs: 6926828 -> 6923967 (-0.04%)
cycles in affected programs: 1038569 -> 1035708 (-0.28%)
helped: 7
HURT: 0
helped stats (abs) min: 128 max: 616 x̄: 408.71 x̃: 446
helped stats (rel) min: 0.18% max: 0.44% x̄: 0.29% x̃: 0.22%
95% mean confidence interval for cycles value: -571.72 -245.70
95% mean confidence interval for cycles %-change: -0.38% -0.19%
Cycles are helped.
Reviewed-by: Matt Turner <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:
Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 841647 -> 841590 (<.01%)
instructions in affected programs: 122014 -> 121957 (-0.05%)
helped: 7
HURT: 0
helped stats (abs) min: 3 max: 12 x̄: 8.14 x̃: 9
helped stats (rel) min: 0.04% max: 0.07% x̄: 0.05% x̃: 0.04%
95% mean confidence interval for instructions value: -11.23 -5.06
95% mean confidence interval for instructions %-change: -0.06% -0.03%
Instructions are helped.
total cycles in shared programs: 6926904 -> 6926828 (<.01%)
cycles in affected programs: 1038645 -> 1038569 (<.01%)
helped: 7
HURT: 0
helped stats (abs) min: 4 max: 16 x̄: 10.86 x̃: 12
helped stats (rel) min: <.01% max: 0.01% x̄: <.01% x̃: <.01%
95% mean confidence interval for cycles value: -14.97 -6.74
95% mean confidence interval for cycles %-change: -0.01% <.01%
Cycles are helped.
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
fsat is defined as min(max(a, 0.0), 1.0), and IEEE defines both min and
max to return the non-NaN value when one value is NaN. Based on this,
fsat should definitely return 0.0 for NaN.
Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:
Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 841666 -> 841647 (<.01%)
instructions in affected programs: 122033 -> 122014 (-0.02%)
helped: 7
HURT: 0
helped stats (abs) min: 1 max: 4 x̄: 2.71 x̃: 3
helped stats (rel) min: 0.01% max: 0.02% x̄: 0.02% x̃: 0.01%
95% mean confidence interval for instructions value: -3.74 -1.69
95% mean confidence interval for instructions %-change: -0.02% -0.01%
Instructions are helped.
total cycles in shared programs: 6927246 -> 6926904 (<.01%)
cycles in affected programs: 1038987 -> 1038645 (-0.03%)
helped: 7
HURT: 0
helped stats (abs) min: 18 max: 72 x̄: 48.86 x̃: 54
helped stats (rel) min: 0.03% max: 0.05% x̄: 0.03% x̃: 0.03%
95% mean confidence interval for cycles value: -67.38 -30.33
95% mean confidence interval for cycles %-change: -0.05% -0.02%
Cycles are helped.
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Fixes: a42163cbbc1 ("compiler: Add lowering support for 64-bit saturate operations to software")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2585
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The change to nir_opt_algebraic cleans up a pattern that was never
produced before the rest of this commit was added.
Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:
Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 843005 -> 841666 (-0.16%)
instructions in affected programs: 460655 -> 459316 (-0.29%)
helped: 64
HURT: 17
helped stats (abs) min: 1 max: 72 x̄: 21.72 x̃: 20
helped stats (rel) min: 0.01% max: 28.07% x̄: 12.67% x̃: 16.07%
HURT stats (abs) min: 1 max: 7 x̄: 3.00 x̃: 2
HURT stats (rel) min: 0.01% max: 0.04% x̄: 0.02% x̃: 0.02%
95% mean confidence interval for instructions value: -20.87 -12.19
95% mean confidence interval for instructions %-change: -12.35% -7.66%
Instructions are helped.
total cycles in shared programs: 6944998 -> 6927246 (-0.26%)
cycles in affected programs: 3891872 -> 3874120 (-0.46%)
helped: 71
HURT: 10
helped stats (abs) min: 2 max: 772 x̄: 254.21 x̃: 156
helped stats (rel) min: <.01% max: 66.44% x̄: 21.72% x̃: 18.40%
HURT stats (abs) min: 18 max: 69 x̄: 29.70 x̃: 20
HURT stats (rel) min: 0.02% max: 0.04% x̄: 0.03% x̃: 0.03%
95% mean confidence interval for cycles value: -270.82 -167.50
95% mean confidence interval for cycles %-change: -24.41% -13.65%
Cycles are helped.
Reviewed-by: Matt Turner <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
__fabs64 doesn't do anything special, and the value is still NaN
regardless of the value of the MSB. In a strict sense, it's possible
that both functions should set the "signal" bit.
lts on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:
Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 844558 -> 843005 (-0.18%)
instructions in affected programs: 725975 -> 724422 (-0.21%)
helped: 53
HURT: 4
helped stats (abs) min: 1 max: 313 x̄: 29.87 x̃: 21
helped stats (rel) min: 0.01% max: 0.94% x̄: 0.30% x̃: 0.22%
HURT stats (abs) min: 4 max: 11 x̄: 7.50 x̃: 7
HURT stats (rel) min: 0.03% max: 0.09% x̄: 0.05% x̃: 0.04%
95% mean confidence interval for instructions value: -39.02 -15.47
95% mean confidence interval for instructions %-change: -0.34% -0.21%
Instructions are helped.
total cycles in shared programs: 6962024 -> 6944998 (-0.24%)
cycles in affected programs: 6185470 -> 6168444 (-0.28%)
helped: 59
HURT: 0
helped stats (abs) min: 64 max: 2863 x̄: 288.58 x̃: 208
helped stats (rel) min: 0.11% max: 0.87% x̄: 0.33% x̃: 0.27%
95% mean confidence interval for cycles value: -387.15 -190.00
95% mean confidence interval for cycles %-change: -0.38% -0.28%
Cycles are helped.
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
...instead of 0 or 1. Many places the sign bit is extracted, then later
put back in the same position. This saves some left-shift operations.
Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:
Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 848106 -> 844558 (-0.42%)
instructions in affected programs: 833480 -> 829932 (-0.43%)
helped: 106
HURT: 1
helped stats (abs) min: 1 max: 995 x̄: 33.48 x̃: 12
helped stats (rel) min: 0.15% max: 2.20% x̄: 0.60% x̃: 0.35%
HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel) min: <.01% max: <.01% x̄: <.01% x̃: <.01%
95% mean confidence interval for instructions value: -51.88 -14.43
95% mean confidence interval for instructions %-change: -0.71% -0.47%
Instructions are helped.
total cycles in shared programs: 6969125 -> 6962024 (-0.10%)
cycles in affected programs: 6717689 -> 6710588 (-0.11%)
helped: 78
HURT: 7
helped stats (abs) min: 2 max: 2083 x̄: 110.27 x̃: 56
helped stats (rel) min: <.01% max: 0.30% x̄: 0.11% x̃: 0.11%
HURT stats (abs) min: 2 max: 1340 x̄: 214.29 x̃: 4
HURT stats (rel) min: 0.01% max: 0.71% x̄: 0.13% x̃: 0.02%
95% mean confidence interval for cycles value: -144.02 -23.06
95% mean confidence interval for cycles %-change: -0.12% -0.07%
Cycles are helped.
total spills in shared programs: 814 -> 759 (-6.76%)
spills in affected programs: 814 -> 759 (-6.76%)
helped: 2
HURT: 1
total fills in shared programs: 2488 -> 2412 (-3.05%)
fills in affected programs: 2488 -> 2412 (-3.05%)
helped: 2
HURT: 1
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Replace all of the bool(qSign) with qSign != 0u. Remove unnecessary
parenthesis from around most of the existing qSign != 0u.
This dramatically simplifies the next commit.
Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:
Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 848109 -> 848106 (<.01%)
instructions in affected programs: 53 -> 50 (-5.66%)
helped: 1
HURT: 0
total cycles in shared programs: 6969145 -> 6969125 (<.01%)
cycles in affected programs: 396 -> 376 (-5.05%)
helped: 1
HURT: 0
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
findMSB returns -1 for an input of zero, so 31 - findMSB(a) is
sufficient on its own.
There's only one user of findMSB in shader-db, and it does not match
this pattern.
TODO: Add a pattern in the backend code generator that emits 31 -
nir_op_ufind_msb(a) as if it were nir_op_uclz. That should save a couple
instructions.
Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:
Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 859509 -> 848109 (-1.33%)
instructions in affected programs: 841058 -> 829658 (-1.36%)
helped: 97
HURT: 0
helped stats (abs) min: 3 max: 1161 x̄: 117.53 x̃: 72
helped stats (rel) min: 0.98% max: 6.74% x̄: 1.70% x̃: 1.35%
95% mean confidence interval for instructions value: -147.21 -87.84
95% mean confidence interval for instructions %-change: -1.94% -1.46%
Instructions are helped.
total cycles in shared programs: 7072275 -> 6969145 (-1.46%)
cycles in affected programs: 6955767 -> 6852637 (-1.48%)
helped: 97
HURT: 0
helped stats (abs) min: 32 max: 10900 x̄: 1063.20 x̃: 560
helped stats (rel) min: 1.18% max: 7.58% x̄: 1.84% x̃: 1.45%
95% mean confidence interval for cycles value: -1339.43 -786.96
95% mean confidence interval for cycles %-change: -2.11% -1.57%
Cycles are helped.
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:
Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 928859 -> 859509 (-7.47%)
instructions in affected programs: 866293 -> 796943 (-8.01%)
helped: 76
HURT: 0
helped stats (abs) min: 75 max: 8042 x̄: 912.50 x̃: 688
helped stats (rel) min: 5.35% max: 21.02% x̄: 10.35% x̃: 7.58%
95% mean confidence interval for instructions value: -1138.37 -686.63
95% mean confidence interval for instructions %-change: -11.69% -9.00%
Instructions are helped.
total cycles in shared programs: 7272912 -> 7072275 (-2.76%)
cycles in affected programs: 6763486 -> 6562849 (-2.97%)
helped: 76
HURT: 0
helped stats (abs) min: 214 max: 30136 x̄: 2639.96 x̃: 1923
helped stats (rel) min: 1.75% max: 9.20% x̄: 4.04% x̃: 2.41%
95% mean confidence interval for cycles value: -3455.29 -1824.63
95% mean confidence interval for cycles %-change: -4.69% -3.39%
Cycles are helped.
total spills in shared programs: 817 -> 814 (-0.37%)
spills in affected programs: 791 -> 788 (-0.38%)
helped: 2
HURT: 0
total fills in shared programs: 2438 -> 2488 (2.05%)
fills in affected programs: 2392 -> 2442 (2.09%)
helped: 0
HURT: 2
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This doesn't help a lot of shaders, but it helps those few a LOT.
This could also be implemented using bcsel. That version is very
slightly worse because the generated SEL instruction wants to have two
immediate sources, so one of them usually needs an extra MOV instruction
to load.
Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:
Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 929619 -> 928859 (-0.08%)
instructions in affected programs: 1651 -> 891 (-46.03%)
helped: 8
HURT: 0
helped stats (abs) min: 38 max: 152 x̄: 95.00 x̃: 95
helped stats (rel) min: 42.70% max: 86.36% x̄: 49.88% x̃: 44.66%
95% mean confidence interval for instructions value: -132.97 -57.03
95% mean confidence interval for instructions %-change: -62.28% -37.49%
Instructions are helped.
total cycles in shared programs: 7280180 -> 7272912 (-0.10%)
cycles in affected programs: 12960 -> 5692 (-56.08%)
helped: 8
HURT: 0
helped stats (abs) min: 352 max: 1456 x̄: 908.50 x̃: 910
helped stats (rel) min: 52.45% max: 91.19% x̄: 59.24% x̃: 55.15%
95% mean confidence interval for cycles value: -1274.03 -542.97
95% mean confidence interval for cycles %-change: -70.06% -48.41%
Cycles are helped.
Reviewed-by: Matt Turner <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The pattern is added to opt_algebraic because, for example, comparisons
with constant 0.0 will produce (a1 < 0).
Even with a pass that optimized Boolean expressions, I think this would
be very difficult to automatically recognize and optimize.
Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:
Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 933054 -> 929619 (-0.37%)
instructions in affected programs: 784041 -> 780606 (-0.44%)
helped: 59
HURT: 0
helped stats (abs) min: 2 max: 213 x̄: 58.22 x̃: 44
helped stats (rel) min: 0.02% max: 2.51% x̄: 0.72% x̃: 0.46%
95% mean confidence interval for instructions value: -70.80 -45.64
95% mean confidence interval for instructions %-change: -0.92% -0.53%
Instructions are helped.
total cycles in shared programs: 7304712 -> 7280180 (-0.34%)
cycles in affected programs: 7176260 -> 7151728 (-0.34%)
helped: 92
HURT: 0
helped stats (abs) min: 8 max: 1414 x̄: 266.65 x̃: 166
helped stats (rel) min: 0.04% max: 2.34% x̄: 0.43% x̃: 0.22%
95% mean confidence interval for cycles value: -333.05 -200.26
95% mean confidence interval for cycles %-change: -0.54% -0.31%
Cycles are helped.
Regular shader-db changes:
No changes on any Intel platform.
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Like 5886cd79a0e, but for iand, ior, and ixor.
Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:
Tiger Lake
total instructions in shared programs: 903108 -> 902830 (-0.03%)
instructions in affected programs: 654910 -> 654632 (-0.04%)
helped: 31
HURT: 5
helped stats (abs) min: 2 max: 31 x̄: 9.58 x̃: 7
helped stats (rel) min: 0.01% max: 0.23% x̄: 0.06% x̃: 0.04%
HURT stats (abs) min: 1 max: 10 x̄: 3.80 x̃: 3
HURT stats (rel) min: 0.01% max: 0.10% x̄: 0.03% x̃: 0.02%
95% mean confidence interval for instructions value: -10.55 -4.89
95% mean confidence interval for instructions %-change: -0.07% -0.03%
Instructions are helped.
total cycles in shared programs: 7059681 -> 7058006 (-0.02%)
cycles in affected programs: 5081309 -> 5079634 (-0.03%)
helped: 33
HURT: 12
helped stats (abs) min: 1 max: 444 x̄: 60.91 x̃: 18
helped stats (rel) min: <.01% max: 2.17% x̄: 0.25% x̃: 0.05%
HURT stats (abs) min: 1 max: 288 x̄: 27.92 x̃: 2
HURT stats (rel) min: <.01% max: 1.00% x̄: 0.23% x̃: 0.02%
95% mean confidence interval for cycles value: -68.32 -6.12
95% mean confidence interval for cycles %-change: -0.28% 0.03%
Inconclusive result (%-change mean confidence interval includes 0).
Ice Lake
total instructions in shared programs: 895384 -> 895159 (-0.03%)
instructions in affected programs: 658678 -> 658453 (-0.03%)
helped: 37
HURT: 0
helped stats (abs) min: 3 max: 16 x̄: 6.08 x̃: 4
helped stats (rel) min: <.01% max: 0.07% x̄: 0.04% x̃: 0.04%
95% mean confidence interval for instructions value: -7.46 -4.70
95% mean confidence interval for instructions %-change: -0.04% -0.03%
Instructions are helped.
total cycles in shared programs: 7092224 -> 7091195 (-0.01%)
cycles in affected programs: 5221666 -> 5220637 (-0.02%)
helped: 35
HURT: 11
helped stats (abs) min: 1 max: 247 x̄: 43.46 x̃: 12
helped stats (rel) min: <.01% max: 2.17% x̄: 0.23% x̃: 0.05%
HURT stats (abs) min: 2 max: 432 x̄: 44.73 x̃: 5
HURT stats (rel) min: <.01% max: 1.00% x̄: 0.25% x̃: 0.02%
95% mean confidence interval for cycles value: -49.00 4.26
95% mean confidence interval for cycles %-change: -0.27% 0.03%
Inconclusive result (value mean confidence interval includes 0).
Regular shader-db results:
All Haswell+ platforms had similar results. (Tiger Lake shown)
total instructions in shared programs: 17611408 -> 17611398 (<.01%)
instructions in affected programs: 1648 -> 1638 (-0.61%)
helped: 2
HURT: 0
total cycles in shared programs: 338366148 -> 338366124 (<.01%)
cycles in affected programs: 124048 -> 124024 (-0.02%)
helped: 2
HURT: 0
No changes on any earlier Intel platforms.
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Generalizes some of the patterns from 76289fbfa84a and 905ff8619824. In
particular, some of the soft-fp64 code generates (a & 0x7fffffff) << 1
when constant 0.0 is compared (flt or feq).
v2: Reduce the set of added patterns to those that actually help
something. This reduces the size of the state transition tables by
about 29k. Suggested by Jason. Remove the existing patterns that this
commit subsumes.
Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:
Tiger Lake
total instructions in shared programs: 903171 -> 903108 (<.01%)
instructions in affected programs: 635903 -> 635840 (<.01%)
helped: 25
HURT: 11
helped stats (abs) min: 1 max: 16 x̄: 5.04 x̃: 3
helped stats (rel) min: <.01% max: 0.15% x̄: 0.04% x̃: 0.03%
HURT stats (abs) min: 2 max: 14 x̄: 5.73 x̃: 5
HURT stats (rel) min: <.01% max: 0.11% x̄: 0.04% x̃: 0.02%
95% mean confidence interval for instructions value: -3.91 0.41
95% mean confidence interval for instructions %-change: -0.03% <.01%
Inconclusive result (value mean confidence interval includes 0).
total cycles in shared programs: 7059527 -> 7059681 (<.01%)
cycles in affected programs: 5249401 -> 5249555 (<.01%)
helped: 41
HURT: 9
helped stats (abs) min: 2 max: 76 x̄: 11.90 x̃: 10
helped stats (rel) min: <.01% max: 11.86% x̄: 0.99% x̃: 0.01%
HURT stats (abs) min: 2 max: 380 x̄: 71.33 x̃: 12
HURT stats (rel) min: <.01% max: 0.22% x̄: 0.04% x̃: 0.01%
95% mean confidence interval for cycles value: -14.93 21.09
95% mean confidence interval for cycles %-change: -1.40% -0.20%
Inconclusive result (value mean confidence interval includes 0).
Ice Lake
total instructions in shared programs: 895506 -> 895384 (-0.01%)
instructions in affected programs: 658800 -> 658678 (-0.02%)
helped: 37
HURT: 0
helped stats (abs) min: 2 max: 8 x̄: 3.30 x̃: 2
helped stats (rel) min: <.01% max: 0.03% x̄: 0.02% x̃: 0.02%
95% mean confidence interval for instructions value: -4.00 -2.59
95% mean confidence interval for instructions %-change: -0.02% -0.02%
Instructions are helped.
total cycles in shared programs: 7092748 -> 7092224 (<.01%)
cycles in affected programs: 5272008 -> 5271484 (<.01%)
helped: 36
HURT: 14
helped stats (abs) min: 2 max: 440 x̄: 21.67 x̃: 8
helped stats (rel) min: <.01% max: 11.86% x̄: 1.12% x̃: 0.02%
HURT stats (abs) min: 2 max: 122 x̄: 18.29 x̃: 6
HURT stats (rel) min: <.01% max: 0.07% x̄: 0.01% x̃: <.01%
95% mean confidence interval for cycles value: -29.24 8.28
95% mean confidence interval for cycles %-change: -1.40% -0.21%
Inconclusive result (value mean confidence interval includes 0).
Regular shader-db results:
All Haswell+ platforms had similar results. (Tiger Lake shown)
total instructions in shared programs: 17611489 -> 17611408 (<.01%)
instructions in affected programs: 21188 -> 21107 (-0.38%)
helped: 23
HURT: 1
helped stats (abs) min: 1 max: 16 x̄: 3.78 x̃: 3
helped stats (rel) min: 0.03% max: 5.82% x̄: 1.13% x̃: 0.85%
HURT stats (abs) min: 6 max: 6 x̄: 6.00 x̃: 6
HURT stats (rel) min: 0.60% max: 0.60% x̄: 0.60% x̃: 0.60%
95% mean confidence interval for instructions value: -5.27 -1.48
95% mean confidence interval for instructions %-change: -1.70% -0.42%
Instructions are helped.
total cycles in shared programs: 338418502 -> 338366148 (-0.02%)
cycles in affected programs: 2289052 -> 2236698 (-2.29%)
helped: 18
HURT: 3
helped stats (abs) min: 4 max: 18000 x̄: 2909.67 x̃: 38
helped stats (rel) min: 0.09% max: 4.07% x̄: 0.96% x̃: 0.43%
HURT stats (abs) min: 2 max: 14 x̄: 6.67 x̃: 4
HURT stats (rel) min: 0.22% max: 1.13% x̄: 0.66% x̃: 0.64%
95% mean confidence interval for cycles value: -5204.00 217.91
95% mean confidence interval for cycles %-change: -1.31% -0.14%
Inconclusive result (value mean confidence interval includes 0).
Ivy Bridge
total instructions in shared programs: 11875617 -> 11875615 (<.01%)
instructions in affected programs: 1339 -> 1337 (-0.15%)
helped: 2
HURT: 0
No changes on any earlier Intel platforms.
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]> [v1]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Like 70f9e2589e6b. Also scrub the unnecessary size qualifier in both
replacement patterns.
This occurs in a handful of places in the soft-fp64 code, and that is
the primary reason for the change.
Perhaps the patterns that generate umin should be conditioned on
something, but I'm not sure what. lower_bitops might cover the cases
that matter, but it seems ugly.
Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:
Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 936505 -> 933388 (-0.33%)
instructions in affected programs: 925719 -> 922602 (-0.34%)
helped: 154
HURT: 1
helped stats (abs) min: 1 max: 211 x̄: 35.45 x̃: 16
helped stats (rel) min: 0.34% max: 9.30% x̄: 2.28% x̃: 0.96%
HURT stats (abs) min: 2342 max: 2342 x̄: 2342.00 x̃: 2342
HURT stats (rel) min: 2.28% max: 2.28% x̄: 2.28% x̃: 2.28%
95% mean confidence interval for instructions value: -51.21 10.99
95% mean confidence interval for instructions %-change: -2.61% -1.89%
Inconclusive result (value mean confidence interval includes 0).
total cycles in shared programs: 7323502 -> 7306184 (-0.24%)
cycles in affected programs: 7220376 -> 7203058 (-0.24%)
helped: 126
HURT: 1
helped stats (abs) min: 2 max: 946 x̄: 159.10 x̃: 95
helped stats (rel) min: 0.01% max: 9.62% x̄: 0.80% x̃: 0.37%
HURT stats (abs) min: 2728 max: 2728 x̄: 2728.00 x̃: 2728
HURT stats (rel) min: 0.37% max: 0.37% x̄: 0.37% x̃: 0.37%
95% mean confidence interval for cycles value: -192.07 -80.66
95% mean confidence interval for cycles %-change: -1.07% -0.51%
Cycles are helped.
total spills in shared programs: 635 -> 817 (28.66%)
spills in affected programs: 635 -> 817 (28.66%)
helped: 0
HURT: 3
total fills in shared programs: 2065 -> 2438 (18.06%)
fills in affected programs: 2019 -> 2392 (18.47%)
helped: 0
HURT: 2
Regular shader-db results:
All Haswell+ platforms had similar results. (Tiger Lake shown)
total instructions in shared programs: 17611506 -> 17611489 (<.01%)
instructions in affected programs: 33442 -> 33425 (-0.05%)
helped: 32
HURT: 6
helped stats (abs) min: 1 max: 6 x̄: 1.69 x̃: 1
helped stats (rel) min: 0.08% max: 1.90% x̄: 0.27% x̃: 0.11%
HURT stats (abs) min: 1 max: 15 x̄: 6.17 x̃: 5
HURT stats (rel) min: 0.09% max: 1.50% x̄: 0.65% x̃: 0.55%
95% mean confidence interval for instructions value: -1.70 0.80
95% mean confidence interval for instructions %-change: -0.30% 0.05%
Inconclusive result (value mean confidence interval includes 0).
total cycles in shared programs: 338419218 -> 338418502 (<.01%)
cycles in affected programs: 385795 -> 385079 (-0.19%)
helped: 42
HURT: 3
helped stats (abs) min: 2 max: 192 x̄: 24.57 x̃: 16
helped stats (rel) min: 0.04% max: 2.09% x̄: 0.33% x̃: 0.22%
HURT stats (abs) min: 64 max: 164 x̄: 105.33 x̃: 88
HURT stats (rel) min: 0.77% max: 1.58% x̄: 1.09% x̃: 0.93%
95% mean confidence interval for cycles value: -29.76 -2.06
95% mean confidence interval for cycles %-change: -0.40% -0.07%
Cycles are helped.
Ivy Bridge and Sandy Bridge had similar results. (Ivy Bridge shown)
total instructions in shared programs: 11875620 -> 11875617 (<.01%)
instructions in affected programs: 421 -> 418 (-0.71%)
helped: 2
HURT: 0
total cycles in shared programs: 178245336 -> 178245326 (<.01%)
cycles in affected programs: 3425 -> 3415 (-0.29%)
helped: 2
HURT: 0
No changes on Gen4 or Gen5.
Reviewed-by: Matt Turner <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This occurs in a handful of places in the soft-fp64 code, and that is
the primary reason for the change.
v2: Fix a typo in a comment. Noticed by Matt. Copy the correct fp64
shader-db results to the commit message. I realized that I used
accidentally used the results from the next commit.
Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:
Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 906235 -> 906149 (<.01%)
instructions in affected programs: 353966 -> 353880 (-0.02%)
helped: 31
HURT: 2
helped stats (abs) min: 1 max: 8 x̄: 3.03 x̃: 3
helped stats (rel) min: 0.01% max: 1.59% x̄: 0.10% x̃: 0.04%
HURT stats (abs) min: 3 max: 5 x̄: 4.00 x̃: 4
HURT stats (rel) min: 0.02% max: 0.02% x̄: 0.02% x̃: 0.02%
95% mean confidence interval for instructions value: -3.51 -1.70
95% mean confidence interval for instructions %-change: -0.19% <.01%
Inconclusive result (%-change mean confidence interval includes 0).
total cycles in shared programs: 7076552 -> 7076173 (<.01%)
cycles in affected programs: 2878361 -> 2877982 (-0.01%)
helped: 37
HURT: 2
helped stats (abs) min: 2 max: 48 x̄: 10.81 x̃: 6
helped stats (rel) min: <.01% max: 2.17% x̄: 0.47% x̃: 0.01%
HURT stats (abs) min: 1 max: 20 x̄: 10.50 x̃: 10
HURT stats (rel) min: <.01% max: 0.01% x̄: <.01% x̃: <.01%
95% mean confidence interval for cycles value: -13.96 -5.48
95% mean confidence interval for cycles %-change: -0.72% -0.16%
Cycles are helped.
total fills in shared programs: 2064 -> 2065 (0.05%)
fills in affected programs: 45 -> 46 (2.22%)
helped: 0
HURT: 1
Regular shader-db results:
All Gen7+ platforms had similar results. (Tiger Lake shown)
total instructions in shared programs: 17611530 -> 17611506 (<.01%)
instructions in affected programs: 5934 -> 5910 (-0.40%)
helped: 10
HURT: 0
helped stats (abs) min: 1 max: 5 x̄: 2.40 x̃: 2
helped stats (rel) min: 0.14% max: 1.24% x̄: 0.47% x̃: 0.34%
95% mean confidence interval for instructions value: -3.53 -1.27
95% mean confidence interval for instructions %-change: -0.78% -0.17%
Instructions are helped.
total cycles in shared programs: 338419178 -> 338419218 (<.01%)
cycles in affected programs: 19244 -> 19284 (0.21%)
helped: 4
HURT: 2
helped stats (abs) min: 2 max: 4 x̄: 3.00 x̃: 3
helped stats (rel) min: 0.05% max: 0.11% x̄: 0.08% x̃: 0.08%
HURT stats (abs) min: 26 max: 26 x̄: 26.00 x̃: 26
HURT stats (rel) min: 1.20% max: 1.20% x̄: 1.20% x̃: 1.20%
95% mean confidence interval for cycles value: -9.08 22.41
95% mean confidence interval for cycles %-change: -0.35% 1.04%
Inconclusive result (value mean confidence interval includes 0).
No changes on any earlier Intel platform.
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
u_threaded can hang in these situation, with one context waiting on a
deferred fence from the other context.
But the other context isn't flushing its pending work (because it's waiting
for more work to pushed) so everything is stuck.
Fixes: d17b35e671a ("gallium: add PIPE_FLUSH_DEFERRED")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1430
Reviewed-by: Marek Olšák <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4213>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4213>
|
|
|
|
|
|
|
|
|
|
| |
Use it in anv_layout_to_aux_state().
Refactor only. No change in behavior.
Reviewed-by: Jason Ekstrand <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3881>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3881>
|
|
|
|
|
|
| |
Reviewed-by: Lionel Landwerlin <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4048>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4048>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The shader module name is used to compute the pipeline key. The
driver used to load the wrong pipelines because the shader names
were similar.
This should fix random failures of
dEQP-VK.pipeline.depth_range_unrestricted.*
Fixes: f11ea226664 ("radv: fix a performance regression with graphics depth/stencil clears")
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4216>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4216>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
So we could align stream outputs correctly even if unused in/outs are
removed.
Fixes:
dEQP-VK.transform_feedback.fuzz.random_vertex.scalar_types.*
dEQP-VK.transform_feedback.fuzz.random_vertex.vector_types.*
Signed-off-by: Hyunjun Ko <[email protected]>
Reviewed-by: Jonathan Marek <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4207>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4207>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Should be divided by 4 so we could calculate the offset correctly in
tu6_setup_streamout.
Fixes: 2a1d6b81ed54971d33e83b7f5545da096b13b043
Related: 374406a7c420d266f920461f904864a94dc1b8c8
Signed-off-by: Hyunjun Ko <[email protected]>
Reviewed-by: Jonathan Marek <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4207>
|
|
|
|
|
|
|
|
|
| |
v2: use ULL
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Danylo Piliaiev <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4134>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4134>
|
|
|
|
|
|
|
|
|
|
| |
Now that we know the bits that are responsible for enabling depth/stencil
writes in shader we can decode them properly.
Reviewed-by: Andreas Baierl <[email protected]>
Signed-off-by: Vasily Khoruzhick <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4197>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4197>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fragment shader can write depth and stencil if we set necessary flags
in RSW. In addition to that we need to use special format for Z24S8.
Original format is apparently Z24X8 since we can't sample stencil in GLES2.
This new format also seems to use several components for storing depth
since we saw r != g != b when sampling with this format.
[vasily: - initialize clear->depth to 0xffffff if we reload depth, just
like blob does. Reloading doesn't work otherwise
- use single bitmap for reload type]
Reviewed-by: Vasily Khoruzhick <[email protected]>
Reviewed-by: Andreas Baierl <[email protected]>
Signed-off-by: Icenowy Zheng <[email protected]>
Signed-off-by: Vasily Khoruzhick <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4197>
|
|
|
|
|
|
|
|
|
|
| |
Unfortunately we don't know how to reload Z16 buffers yet and blob
is using Z24 for dEQP tests that need depth reload.
Reviewed-by: Icenowy Zheng <[email protected]>
Reviewed-by: Andreas Baierl <[email protected]>
Signed-off-by: Vasily Khoruzhick <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4197>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The util_float_to_half() implementation was much smaller, but when trying
to switch _mesa_float_to_half to it, many testcases
(dEQP-VK.spirv_assembly.instruction.graphics.opquantize.*,
piglit.spec.arb_shading_language_packing.*packhalf2x16) start failing on
Intel. Replace the broken impl so that people don't have to debug it
later.
Acked-by: Michel Dänzer <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3699>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3699>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
piglit/bin/arb_bindless_texture-limit -auto -fbo:
Needed to deal with non-NULL dynamic_index without deref in tex instructions.
piglit/bin/shader_runner tests/spec/arb_bindless_texture/execution/images/multiple-resident-images-reading.shader_test -auto:
Need to deal with non-deref images in enter_waterfall_imae.
Fixes: b83c9aca4a5 "amd/llvm: Fix divergent descriptor indexing. (v3)"
Acked-by: Marek Olšák <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4191>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4191>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In Fedora 32 build was failing with meson-0.53.2-1.git88e40c7.fc32
and gcc-10.0.1-0.9.fc32.x86_64.
Worked with meson-0.53.1-1 and same gcc.
/usr/bin/ld: src/gallium/state_trackers/dri/libdri.a(dri2.c.o): in function `dri2_interop_export_object':
/home/airlied/devel/mesa/mesa/build/../src/gallium/state_trackers/dri/dri2.c:1813: undefined reference to `st_finalize_texture'
/usr/bin/ld: src/gallium/state_trackers/dri/libdri.a(dri_screen.c.o): in function `dri_init_screen_helper':
/home/airlied/devel/mesa/mesa/build/../src/gallium/state_trackers/dri/dri_screen.c:580: undefined reference to `st_gl_api_create'
Moving this around seems to fix it.
Cc: [email protected]
Reviewed-by: Dylan Baker <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4220>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4220>
|
|
|
|
|
|
|
| |
Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4196>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4196>
|
|
|
|
|
|
|
|
| |
See: https://reviews.llvm.org/D71358
Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4196>
|
|
|
|
|
|
| |
Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4196>
|
|
|
|
|
|
|
| |
Acked-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Mathias Fröhlich <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4153>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4153>
|
|
|
|
|
|
|
|
|
| |
This was missing for the READ case. This improves glBegin/End performance.
(vbo maps with WRITE | READ | UNSYCHRONIZED)
Acked-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Mathias Fröhlich <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4153>
|
|
|
|
|
|
|
|
|
| |
This a port of 3338d6e5f8b5 ("freedreno/a3xx: Mostly fix min-vs-mag
filtering decisions on non-mipmap tex.")
Reviewed-by: Kristian H. Kristensen <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4177>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4177>
|
|
|
|
|
|
|
|
|
| |
Following on from the db410c conversion to baremetal testing, reuse the
same scripts in the same rack to run 7 db820c boards (#4/8 is failing in
the bootloader for unknown reasons).
Reviewed-by: Kristian H. Kristensen <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4177>
|
|
|
|
|
|
|
|
|
| |
They ignore $PATH for unknown reasons, so you have to force the ccache
wrapping yourself.
Reviewed-by: Michel Dänzer <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4099>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4099>
|
|
|
|
|
|
|
|
|
| |
This should reduce our container rebuild times, particularly on the
40-minute ARM build (which is split across only 2 runners and thus likely
to have a hot cache) when working on updating containers.
Reviewed-by: Michel Dänzer <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4099>
|
|
|
|
|
|
|
|
| |
There has been a big rename of variables in the upstream repo to make it
clear what's being handed to ci-templates.
Reviewed-by: Michel Dänzer <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4099>
|
|
|
|
|
|
|
|
|
|
|
|
| |
We've found in GL that an actual end-of-pipe sync is required before
invalidating the aux tables and that a simple CS stall is insufficient.
If we're about to modify the actual AUX table entries from the GPU, we
should definitely make sure it's stopped dead before we do so.
Cc: [email protected]
Reviewed-by: Rafael Antognolli <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4206>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4206>
|
|
|
|
|
|
|
|
|
|
| |
Vulkan uses that for its own upload function -- even though for BLORP
it doesn't really currently care. Neither Iris and i965 makes use of
it at the moment.
Reviewed-by: Jason Ekstrand <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4170>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4170>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since zink_render_pass_state is used as a hash-key, the entire struct gets
compared. This means we don't want any uninitialized padding in there, or
else we risk getting false negatives. This has led to issues on macOS builds.
So let's zero out the struct before we start filling it out.
Reviewed-by: Erik Faye-Lund <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4212>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4212>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If compute shaders require a specific subgroup size (ie. Wave32),
we have to use the correct ballot size.
Fixes dEQP-VK.subgroups.ballot_other.compute.*_requiredsubgroupSize.
Fixes: fb07fd4e6cb ("radv: implement VK_EXT_subgroup_size_control")
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4215>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4215>
|
|
|
|
|
|
|
|
|
|
|
|
| |
If compute shaders require a specific subgroup size (ie. Wave32),
we have to return the correct one.
Fixes dEQP-VK.subgroups.size_control.compute.required_subgroup_size_*.
Fixes: fb07fd4e6cb ("radv: implement VK_EXT_subgroup_size_control")
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4215>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The Vulkan 1.2.134 spec update clarified when implicit subpass
dependencies should be injected by the driver. They only make
sense if automatic layout transitions are performed.
This should fix a performance regression with RPCS3 (although
they added a workaround for RADV since the regression has been found).
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2502
Fixes: e60de085473 ("radv: handle missing implicit subpass dependencies")
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4210>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4210>
|
|
|
|
|
|
|
|
|
| |
These are the ones which can be enabled with the current x86_build
docker image and which build without warnings.
Reviewed-by: Eric Engestrom <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4166>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4166>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of uint64_t. Fixes potentially writing beyond the end of the
handles pointer array on 32-bit architectures (and copying all 0s
instead of the computed pointer values to the array on big endian
ones).
Corresponding compiler warning:
../src/gallium/drivers/llvmpipe/lp_state_cs.c: In function ‘llvmpipe_set_global_binding’:
../src/gallium/drivers/llvmpipe/lp_state_cs.c:1312:12: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
1312 | va = (uint64_t)((char *)lp_res->data + offset);
| ^
Fixes: 264663d55d32 "gallivm/llvmpipe: add support for global
operations."
Reviewed-by: Roland Scheidegger <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4166>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The motivation is to allow llvmpipe to be enabled instead in the
meson-i386 job.
v2: (Eric Engestrom)
* Rename meson-main job to meson-gallium
* Remove stale comment above meson-i386 job
Reviewed-by: Eric Engestrom <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4166>
|
|
|
|
|
|
|
| |
Should be fast enough.
Acked-by: Eric Engestrom <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4166>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I wasn't able to find any CTS tests that used compute shaders with
samplers and set a border color, so I hacked one of the tests included
with amber:
https://gist.github.com/cwabbott0/e72f0ed8259b84ed6bf3920c68fefee6
The register was found via looking at dumps of the Vulkan blob, and
setting it fixes this test.
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4204>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4204>
|
|
|
|
|
|
|
|
| |
The backports repository can be temporarily inconsistent between
architectures, which can break the docker image build.
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4209>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4209>
|