Commit message | Author | Age | Files | Lines
* soft-fp64: Split a block that was missing a cast on a comparison | Ian Romanick | 2020-03-18 | 1 | -13/+15

  This function has code like:

      if (0x7FD <= zExp) {
         if ((0x7FD < zExp) ||
             ((zExp == 0x7FD) &&
              (0x001FFFFFu == zFrac0 && 0xFFFFFFFFu == zFrac1) &&
              increment)) {
            ...
            return ...;
         }

         if (zExp < 0) {

  I saw that, and I thought, "Uh... what? Dead code?" I thought it was a bit fishy, so I grabbed the Berkeley SoftFloat Library 3e code, and there is similar code in softfloat_roundPackToF64 (source/s_roundPackToF64.c), but it has an extra (uint16_t) cast in the first comparison. This is basically a shortcut for

      if (zExp < 0 || zExp >= 0x7FD) {

  So, having the nesting kind of makes sense. On a CPU, nesting the flow control can be an optimization. On a GPU, it's just fail.

  Split the block so that we don't need the uint16_t cast magic.

  Results on the 308 shaders extracted from the fp64 portion of the OpenGL CTS:

  Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
  total instructions in shared programs: 683638 -> 658127 (-3.73%)
  instructions in affected programs: 666839 -> 641328 (-3.83%)
  helped: 92
  HURT: 0
  helped stats (abs) min: 26 max: 2456 x̄: 277.29 x̃: 144
  helped stats (rel) min: 3.21% max: 4.22% x̄: 3.79% x̃: 3.90%
  95% mean confidence interval for instructions value: -345.84 -208.75
  95% mean confidence interval for instructions %-change: -3.86% -3.73%
  Instructions are helped.

  total cycles in shared programs: 5458858 -> 5344600 (-2.09%)
  cycles in affected programs: 5360114 -> 5245856 (-2.13%)
  helped: 92
  HURT: 0
  helped stats (abs) min: 126 max: 10300 x̄: 1241.93 x̃: 655
  helped stats (rel) min: 1.71% max: 2.37% x̄: 2.12% x̃: 2.17%
  95% mean confidence interval for cycles value: -1539.93 -943.94
  95% mean confidence interval for cycles %-change: -2.16% -2.08%
  Cycles are helped.

  Fixes: f111d72596c ("glsl: Add "built-in" functions to do add(fp64, fp64)")
  Reviewed-by: Matt Turner <[email protected]>
  Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
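
  The cast shortcut described in this commit message can be checked in isolation. The following is a minimal C sketch, not the Mesa or SoftFloat source: the function names are hypothetical, and it assumes zExp stays within the range of a 16-bit signed integer, so that a negative value wraps to something at or above 0x7FD when converted to uint16_t.

```c
#include <stdbool.h>
#include <stdint.h>

/* Single comparison in the SoftFloat style: converting a negative exponent
 * to uint16_t wraps it to a large value, so one test covers both ends.
 * Assumes zExp fits in a 16-bit signed integer. */
static bool exp_out_of_range_cast(int zExp)
{
    return 0x7FD <= (uint16_t)zExp;
}

/* The same condition written out explicitly, as in the commit message. */
static bool exp_out_of_range_explicit(int zExp)
{
    return zExp < 0 || zExp >= 0x7FD;
}
```

  Without the cast, the outer test `0x7FD <= zExp` can never be true for a negative zExp, which is why the inner `zExp < 0` check looked like dead code.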
* soft-fp64/fadd: Common code optimization for differing sign case | Ian Romanick | 2020-03-18 | 1 | -21/+11

  These are basically the same ideas from the previous 4 commits applied to the aSign != bSign part... and all smashed into one commit.

  The shader hurt for spills and / or fills is from KHR-GL46.gpu_shader_fp64.builtin.inverse_dmat4.

  Results on the 308 shaders extracted from the fp64 portion of the OpenGL CTS:

  Tiger Lake
  total instructions in shared programs: 787258 -> 683638 (-13.16%)
  instructions in affected programs: 725435 -> 621815 (-14.28%)
  helped: 74
  HURT: 0
  helped stats (abs) min: 152 max: 10261 x̄: 1400.27 x̃: 975
  helped stats (rel) min: 11.61% max: 20.92% x̄: 15.40% x̃: 14.86%
  95% mean confidence interval for instructions value: -1740.11 -1060.43
  95% mean confidence interval for instructions %-change: -16.01% -14.79%
  Instructions are helped.

  total cycles in shared programs: 6483227 -> 5458858 (-15.80%)
  cycles in affected programs: 6051245 -> 5026876 (-16.93%)
  helped: 74
  HURT: 0
  helped stats (abs) min: 1566 max: 95474 x̄: 13842.82 x̃: 9757
  helped stats (rel) min: 13.94% max: 23.26% x̄: 17.98% x̃: 17.57%
  95% mean confidence interval for cycles value: -17104.25 -10581.40
  95% mean confidence interval for cycles %-change: -18.61% -17.35%
  Cycles are helped.

  total spills in shared programs: 553 -> 445 (-19.53%)
  spills in affected programs: 553 -> 445 (-19.53%)
  helped: 1
  HURT: 0

  total fills in shared programs: 1307 -> 1323 (1.22%)
  fills in affected programs: 1307 -> 1323 (1.22%)
  helped: 0
  HURT: 1

  Ice Lake
  total instructions in shared programs: 781216 -> 678470 (-13.15%)
  instructions in affected programs: 720088 -> 617342 (-14.27%)
  helped: 74
  HURT: 0
  helped stats (abs) min: 153 max: 8863 x̄: 1388.46 x̃: 975
  helped stats (rel) min: 11.24% max: 21.03% x̄: 15.47% x̃: 15.01%
  95% mean confidence interval for instructions value: -1703.57 -1073.35
  95% mean confidence interval for instructions %-change: -16.09% -14.85%
  Instructions are helped.

  total cycles in shared programs: 6464085 -> 5453997 (-15.63%)
  cycles in affected programs: 6031771 -> 5021683 (-16.75%)
  helped: 74
  HURT: 0
  helped stats (abs) min: 1552 max: 90317 x̄: 13649.84 x̃: 9650
  helped stats (rel) min: 13.84% max: 23.11% x̄: 17.83% x̃: 17.41%
  95% mean confidence interval for cycles value: -16802.89 -10496.79
  95% mean confidence interval for cycles %-change: -18.46% -17.21%
  Cycles are helped.

  total spills in shared programs: 279 -> 368 (31.90%)
  spills in affected programs: 279 -> 368 (31.90%)
  helped: 0
  HURT: 1

  total fills in shared programs: 973 -> 1155 (18.71%)
  fills in affected programs: 973 -> 1155 (18.71%)
  helped: 0
  HURT: 1

  Reviewed-by: Matt Turner <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
* soft-fp64/fadd: Move common code out of both branches of an if-statement | Ian Romanick | 2020-03-18 | 1 | -22/+11

  The previous two commits were just setting the scene for this change. The mix(..., __propagateFloat64NaN(a, b), propagate) statements are not identical in the two halves, but they are equivalent. The first clause of the mix in the else-branch is trivially ±Inf. The first clause in the then-branch is __packFloat64(aSign, aExp, aFracHi, aFracLo). The preceding conditions prove that aExp=0x7ff, aFracHi=0, and aFracLo=0, so that clause is ±Inf as well.

  Results on the 308 shaders extracted from the fp64 portion of the OpenGL CTS:

  Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
  total instructions in shared programs: 819560 -> 787258 (-3.94%)
  instructions in affected programs: 757737 -> 725435 (-4.26%)
  helped: 74
  HURT: 0
  helped stats (abs) min: 43 max: 3545 x̄: 436.51 x̃: 296
  helped stats (rel) min: 3.54% max: 6.16% x̄: 4.52% x̃: 4.36%
  95% mean confidence interval for instructions value: -548.42 -324.61
  95% mean confidence interval for instructions %-change: -4.68% -4.37%
  Instructions are helped.

  total cycles in shared programs: 6817254 -> 6483227 (-4.90%)
  cycles in affected programs: 6385272 -> 6051245 (-5.23%)
  helped: 74
  HURT: 0
  helped stats (abs) min: 430 max: 33271 x̄: 4513.88 x̃: 3047
  helped stats (rel) min: 4.28% max: 7.45% x̄: 5.48% x̃: 5.31%
  95% mean confidence interval for cycles value: -5610.46 -3417.30
  95% mean confidence interval for cycles %-change: -5.65% -5.32%
  Cycles are helped.

  total spills in shared programs: 591 -> 553 (-6.43%)
  spills in affected programs: 591 -> 553 (-6.43%)
  helped: 1
  HURT: 0

  total fills in shared programs: 1353 -> 1307 (-3.40%)
  fills in affected programs: 1353 -> 1307 (-3.40%)
  helped: 1
  HURT: 0

  Reviewed-by: Matt Turner <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
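
  The equivalence argument in this commit rests on the fact that an exponent of 0x7ff with an all-zero fraction is the IEEE 754 binary64 encoding of infinity. A minimal C sketch of that fact follows. The helper pack_float64_bits is hypothetical (it is not Mesa's __packFloat64), and it assumes the sign is carried as 0 or 0x80000000, as introduced by an earlier commit in this series.

```c
#include <stdint.h>

/* Hypothetical helper: assemble the 64-bit pattern of a double from a sign
 * held in the most significant bit of a 32-bit word, an 11-bit exponent and
 * the high 20 and low 32 bits of the fraction. */
static uint64_t pack_float64_bits(uint32_t sign_msb, uint32_t exp,
                                  uint32_t frac_hi, uint32_t frac_lo)
{
    uint32_t hi = sign_msb + (exp << 20) + frac_hi;

    return ((uint64_t)hi << 32) | frac_lo;
}
```

  With exp = 0x7ff and both fraction words zero, the result is 0x7FF0000000000000 for a clear sign and 0xFFF0000000000000 for a set sign, which are the bit patterns of +Inf and -Inf.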
* soft-fp64/fadd: Use absolute value of expDiff | Ian Romanick | 2020-03-18 | 1 | -3/+4

  In one branch we know that expDiff is already positive. In the other branch we know that expDiff is negative. Previously in that branch the code was -(expDiff + 1). This is equivalent to (-expDiff) - 1, and since expDiff is negative, to abs(expDiff) - 1.

  The main purpose of this commit is to prepare for "soft-fp64/fadd: Move common code out of both branches of an if-statement".

  Results on the 308 shaders extracted from the fp64 portion of the OpenGL CTS:

  Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
  total instructions in shared programs: 818246 -> 819560 (0.16%)
  instructions in affected programs: 756423 -> 757737 (0.17%)
  helped: 1
  HURT: 73
  helped stats (abs) min: 1205 max: 1205 x̄: 1205.00 x̃: 1205
  helped stats (rel) min: 1.36% max: 1.36% x̄: 1.36% x̃: 1.36%
  HURT stats (abs) min: 2 max: 149 x̄: 34.51 x̃: 27
  HURT stats (rel) min: 0.14% max: 1.09% x̄: 0.41% x̃: 0.30%
  95% mean confidence interval for instructions value: -16.56 52.07
  95% mean confidence interval for instructions %-change: 0.30% 0.47%
  Inconclusive result (value mean confidence interval includes 0).

  total cycles in shared programs: 6816686 -> 6817254 (<.01%)
  cycles in affected programs: 6384704 -> 6385272 (<.01%)
  helped: 37
  HURT: 37
  helped stats (abs) min: 30 max: 5790 x̄: 289.05 x̃: 102
  helped stats (rel) min: 0.04% max: 0.86% x̄: 0.29% x̃: 0.31%
  HURT stats (abs) min: 2 max: 1020 x̄: 304.41 x̃: 232
  HURT stats (rel) min: <.01% max: 1.58% x̄: 0.55% x̃: 0.43%
  95% mean confidence interval for cycles value: -165.37 180.72
  95% mean confidence interval for cycles %-change: <.01% 0.27%
  Inconclusive result (value mean confidence interval includes 0).

  total spills in shared programs: 705 -> 591 (-16.17%)
  spills in affected programs: 705 -> 591 (-16.17%)
  helped: 1
  HURT: 0

  total fills in shared programs: 1501 -> 1353 (-9.86%)
  fills in affected programs: 1501 -> 1353 (-9.86%)
  helped: 1
  HURT: 0

  Reviewed-by: Matt Turner <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
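
  The identity used in this commit, -(expDiff + 1) = abs(expDiff) - 1 for negative expDiff, is easy to confirm with a small C sketch. The function names below are hypothetical and only mirror the two expressions quoted in the commit message.

```c
#include <stdlib.h>

/* Expression used before the change; only meaningful for expDiff < 0. */
static int shift_count_original(int expDiff)
{
    return -(expDiff + 1);
}

/* Expression used after the change. For expDiff < 0, abs(expDiff) equals
 * -expDiff, so this is (-expDiff) - 1, the same value as above. */
static int shift_count_rewritten(int expDiff)
{
    return abs(expDiff) - 1;
}
```

  For example, expDiff = -5 gives 4 with either expression. The rewritten form matters because the positive branch already uses expDiff directly, so both branches can now be described in terms of the absolute value.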
* soft-fp64/fadd: Rename aFrac and bFrac variables | Ian Romanick | 2020-03-18 | 1 | -6/+21

  Exchanging aFracHi / bFracHi and aFracLo / bFracLo should not affect the result of the later call to __add64.

  The main purpose of this commit is to prepare for "soft-fp64/fadd: Move common code out of both branches of an if-statement".

  v2: Fix a typo in a comment. Noticed by Matt.

  Results on the 308 shaders extracted from the fp64 portion of the OpenGL CTS:

  Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
  total instructions in shared programs: 812094 -> 818246 (0.76%)
  instructions in affected programs: 750271 -> 756423 (0.82%)
  helped: 0
  HURT: 74
  HURT stats (abs) min: 7 max: 520 x̄: 83.14 x̃: 59
  HURT stats (rel) min: 0.52% max: 1.48% x̄: 0.89% x̃: 0.84%
  95% mean confidence interval for instructions value: 63.96 102.31
  95% mean confidence interval for instructions %-change: 0.83% 0.95%
  Instructions are HURT.

  total cycles in shared programs: 6797157 -> 6816686 (0.29%)
  cycles in affected programs: 6365175 -> 6384704 (0.31%)
  helped: 0
  HURT: 74
  HURT stats (abs) min: 16 max: 1690 x̄: 263.91 x̃: 181
  HURT stats (rel) min: 0.14% max: 0.68% x̄: 0.32% x̃: 0.27%
  95% mean confidence interval for cycles value: 199.74 328.07
  95% mean confidence interval for cycles %-change: 0.29% 0.36%
  Cycles are HURT.

  total spills in shared programs: 703 -> 705 (0.28%)
  spills in affected programs: 703 -> 705 (0.28%)
  helped: 0
  HURT: 1

  total fills in shared programs: 1499 -> 1501 (0.13%)
  fills in affected programs: 1499 -> 1501 (0.13%)
  helped: 0
  HURT: 1

  Reviewed-by: Matt Turner <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
* soft-fp64/fadd: Combine an if-statement into the preceding else-clause | Ian Romanick | 2020-03-18 | 1 | -4/+2

  The main purpose of this commit is to prepare for "soft-fp64/fadd: Move common code out of both branches of an if-statement".

  Results on the 308 shaders extracted from the fp64 portion of the OpenGL CTS:

  Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
  total instructions in shared programs: 812590 -> 812094 (-0.06%)
  instructions in affected programs: 672135 -> 671639 (-0.07%)
  helped: 57
  HURT: 0
  helped stats (abs) min: 1 max: 32 x̄: 8.70 x̃: 7
  helped stats (rel) min: <.01% max: 0.49% x̄: 0.12% x̃: 0.09%
  95% mean confidence interval for instructions value: -10.46 -6.94
  95% mean confidence interval for instructions %-change: -0.15% -0.09%
  Instructions are helped.

  total cycles in shared programs: 6798039 -> 6797157 (-0.01%)
  cycles in affected programs: 5810059 -> 5809177 (-0.02%)
  helped: 54
  HURT: 2
  helped stats (abs) min: 2 max: 68 x̄: 16.44 x̃: 12
  helped stats (rel) min: <.01% max: 0.12% x̄: 0.03% x̃: 0.02%
  HURT stats (abs) min: 2 max: 4 x̄: 3.00 x̃: 3
  HURT stats (rel) min: <.01% max: <.01% x̄: <.01% x̃: <.01%
  95% mean confidence interval for cycles value: -19.50 -12.00
  95% mean confidence interval for cycles %-change: -0.03% -0.02%
  Cycles are helped.

  Reviewed-by: Matt Turner <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
* soft-fp64/fadd: Reformat after previous commit | Ian Romanick | 2020-03-18 | 1 | -19/+21

  Convert

      } else if (...) {
         ...
      } else {
         ...
      }

  to

      } else {
         if (...) {
            ...
         } else {
            ...
         }
      }

  Not doing this reformatting in the previous commit makes the previous commit easier to review, and doing it before the next commit makes the next commit easier to review.

  Reviewed-by: Matt Turner <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
* soft-fp64/fadd: Delete a redundant condition check | Ian Romanick | 2020-03-18 | 1 | -1/+1

  Previous condition checks already guarantee that expDiff != 0 and !(expDiff > 0), so expDiff < 0 is the only option left.

  The main purpose of this commit is to prepare for "soft-fp64/fadd: Move common code out of both branches of an if-statement".

  Results on the 308 shaders extracted from the fp64 portion of the OpenGL CTS:

  Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
  total instructions in shared programs: 815491 -> 812590 (-0.36%)
  instructions in affected programs: 753668 -> 750767 (-0.38%)
  helped: 74
  HURT: 0
  helped stats (abs) min: 3 max: 281 x̄: 39.20 x̃: 25
  helped stats (rel) min: 0.29% max: 0.73% x̄: 0.42% x̃: 0.40%
  95% mean confidence interval for instructions value: -48.50 -29.91
  95% mean confidence interval for instructions %-change: -0.45% -0.40%
  Instructions are helped.

  total cycles in shared programs: 6813681 -> 6798039 (-0.23%)
  cycles in affected programs: 6381699 -> 6366057 (-0.25%)
  helped: 74
  HURT: 0
  helped stats (abs) min: 24 max: 1488 x̄: 211.38 x̃: 149
  helped stats (rel) min: 0.20% max: 0.44% x̄: 0.26% x̃: 0.25%
  95% mean confidence interval for cycles value: -261.68 -161.08
  95% mean confidence interval for cycles %-change: -0.28% -0.25%
  Cycles are helped.

  Reviewed-by: Matt Turner <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
* soft-fp64/fadd: Just let the subtraction happen when the result will be zero | Ian Romanick | 2020-03-18 | 1 | -4/+5

  The main purpose of this commit is to prepare for "soft-fp64/fadd: Move common code out of both branches of an if-statement".

  Results on the 308 shaders extracted from the fp64 portion of the OpenGL CTS:

  Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
  total instructions in shared programs: 815717 -> 815491 (-0.03%)
  instructions in affected programs: 735489 -> 735263 (-0.03%)
  helped: 39
  HURT: 34
  helped stats (abs) min: 2 max: 192 x̄: 20.79 x̃: 12
  helped stats (rel) min: 0.01% max: 0.46% x̄: 0.26% x̃: 0.28%
  HURT stats (abs) min: 1 max: 65 x̄: 17.21 x̃: 11
  HURT stats (rel) min: <.01% max: 1.11% x̄: 0.35% x̃: 0.19%
  95% mean confidence interval for instructions value: -10.40 4.21
  95% mean confidence interval for instructions %-change: -0.07% 0.13%
  Inconclusive result (value mean confidence interval includes 0).

  total cycles in shared programs: 6820707 -> 6813681 (-0.10%)
  cycles in affected programs: 6388725 -> 6381699 (-0.11%)
  helped: 51
  HURT: 23
  helped stats (abs) min: 3 max: 1837 x̄: 184.76 x̃: 120
  helped stats (rel) min: <.01% max: 0.48% x̄: 0.25% x̃: 0.25%
  HURT stats (abs) min: 18 max: 216 x̄: 104.22 x̃: 98
  HURT stats (rel) min: 0.06% max: 0.73% x̄: 0.31% x̃: 0.11%
  95% mean confidence interval for cycles value: -154.67 -35.22
  95% mean confidence interval for cycles %-change: -0.15% <.01%
  Inconclusive result (%-change mean confidence interval includes 0).

  total spills in shared programs: 702 -> 703 (0.14%)
  spills in affected programs: 702 -> 703 (0.14%)
  helped: 0
  HURT: 1

  total fills in shared programs: 1497 -> 1499 (0.13%)
  fills in affected programs: 1497 -> 1499 (0.13%)
  helped: 0
  HURT: 1

  Reviewed-by: Matt Turner <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
* soft-fp64/fadd: Pick zero or non-zero result based on subtraction result | Ian Romanick | 2020-03-18 | 1 | -6/+1

  The main purpose of this commit is to prepare for "soft-fp64/fadd: Move common code out of both branches of an if-statement".

  Results on the 308 shaders extracted from the fp64 portion of the OpenGL CTS:

  Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
  total instructions in shared programs: 817327 -> 815717 (-0.20%)
  instructions in affected programs: 755504 -> 753894 (-0.21%)
  helped: 73
  HURT: 1
  helped stats (abs) min: 1 max: 159 x̄: 22.12 x̃: 14
  helped stats (rel) min: 0.05% max: 0.40% x̄: 0.22% x̃: 0.23%
  HURT stats (abs) min: 5 max: 5 x̄: 5.00 x̃: 5
  HURT stats (rel) min: 0.07% max: 0.07% x̄: 0.07% x̃: 0.07%
  95% mean confidence interval for instructions value: -27.27 -16.24
  95% mean confidence interval for instructions %-change: -0.24% -0.20%
  Instructions are helped.

  total cycles in shared programs: 6822826 -> 6820707 (-0.03%)
  cycles in affected programs: 6390844 -> 6388725 (-0.03%)
  helped: 71
  HURT: 3
  helped stats (abs) min: 2 max: 537 x̄: 30.72 x̃: 18
  helped stats (rel) min: <.01% max: 0.08% x̄: 0.03% x̃: 0.03%
  HURT stats (abs) min: 10 max: 32 x̄: 20.67 x̃: 20
  HURT stats (rel) min: 0.01% max: 0.02% x̄: 0.02% x̃: 0.02%
  95% mean confidence interval for cycles value: -43.41 -13.86
  95% mean confidence interval for cycles %-change: -0.04% -0.03%
  Cycles are helped.

  Reviewed-by: Matt Turner <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
* soft-fp64/fadd: Massively split the live range of zFrac0 and zFrac1 | Ian Romanick | 2020-03-18 | 1 | -3/+12

  The main purpose of this commit is to prepare for "soft-fp64/fadd: Move common code out of both branches of an if-statement".

  Results on the 308 shaders extracted from the fp64 portion of the OpenGL CTS:

  Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
  total instructions in shared programs: 822766 -> 817327 (-0.66%)
  instructions in affected programs: 760943 -> 755504 (-0.71%)
  helped: 74
  HURT: 0
  helped stats (abs) min: 8 max: 515 x̄: 73.50 x̃: 51
  helped stats (rel) min: 0.58% max: 1.10% x̄: 0.77% x̃: 0.73%
  95% mean confidence interval for instructions value: -91.17 -55.83
  95% mean confidence interval for instructions %-change: -0.81% -0.74%
  Instructions are helped.

  total cycles in shared programs: 6816791 -> 6822826 (0.09%)
  cycles in affected programs: 6384809 -> 6390844 (0.09%)
  helped: 0
  HURT: 74
  HURT stats (abs) min: 6 max: 1179 x̄: 81.55 x̃: 50
  HURT stats (rel) min: 0.02% max: 0.17% x̄: 0.09% x̃: 0.09%
  95% mean confidence interval for cycles value: 48.99 114.12
  95% mean confidence interval for cycles %-change: 0.09% 0.10%
  Cycles are HURT.

  Reviewed-by: Matt Turner <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
* soft-fp64/fadd: Instead of tracking "b < a", track sign of the difference | Ian Romanick | 2020-03-18 | 1 | -5/+5

  Results on the 308 shaders extracted from the fp64 portion of the OpenGL CTS:

  Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
  total instructions in shared programs: 824403 -> 822766 (-0.20%)
  instructions in affected programs: 756260 -> 754623 (-0.22%)
  helped: 68
  HURT: 1
  helped stats (abs) min: 1 max: 118 x̄: 26.26 x̃: 18
  helped stats (rel) min: 0.02% max: 0.97% x̄: 0.31% x̃: 0.23%
  HURT stats (abs) min: 149 max: 149 x̄: 149.00 x̃: 149
  HURT stats (rel) min: 0.17% max: 0.17% x̄: 0.17% x̃: 0.17%
  95% mean confidence interval for instructions value: -31.94 -15.51
  95% mean confidence interval for instructions %-change: -0.37% -0.23%
  Instructions are helped.

  total cycles in shared programs: 6828935 -> 6816791 (-0.18%)
  cycles in affected programs: 6385191 -> 6373047 (-0.19%)
  helped: 73
  HURT: 0
  helped stats (abs) min: 2 max: 852 x̄: 166.36 x̃: 120
  helped stats (rel) min: <.01% max: 0.80% x̄: 0.22% x̃: 0.17%
  95% mean confidence interval for cycles value: -210.80 -121.91
  95% mean confidence interval for cycles %-change: -0.27% -0.17%
  Cycles are helped.

  total fills in shared programs: 1442 -> 1497 (3.81%)
  fills in affected programs: 1442 -> 1497 (3.81%)
  helped: 0
  HURT: 1

  Reviewed-by: Matt Turner <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
* soft-fp64: Optimize __fmin64 and __fmax64 by using different evaluation order [v2] | Ian Romanick | 2020-03-18 | 1 | -8/+16

  v2: Go to extra effort to avoid flow control inserted to implement short-circuit evaluation rules.

  Results on the 308 shaders extracted from the fp64 portion of the OpenGL CTS:

  Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
  total instructions in shared programs: 797779 -> 796849 (-0.12%)
  instructions in affected programs: 3499 -> 2569 (-26.58%)
  helped: 21
  HURT: 0
  helped stats (abs) min: 8 max: 112 x̄: 44.29 x̃: 44
  helped stats (rel) min: 16.09% max: 33.15% x̄: 25.72% x̃: 24.62%
  95% mean confidence interval for instructions value: -55.94 -32.63
  95% mean confidence interval for instructions %-change: -28.14% -23.30%
  Instructions are helped.

  total cycles in shared programs: 6601355 -> 6588351 (-0.20%)
  cycles in affected programs: 25376 -> 12372 (-51.25%)
  helped: 21
  HURT: 0
  helped stats (abs) min: 156 max: 1410 x̄: 619.24 x̃: 526
  helped stats (rel) min: 42.39% max: 53.98% x̄: 50.12% x̃: 50.75%
  95% mean confidence interval for cycles value: -776.58 -461.89
  95% mean confidence interval for cycles %-change: -51.57% -48.67%
  Cycles are helped.

  Reviewed-by: Jason Ekstrand <[email protected]> [v1]
  Reviewed-by: Matt Turner <[email protected]> [v1]
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
* soft-fp64/ffloor: Simplify the >= 0 comparison | Ian Romanick | 2020-03-18 | 1 | -1/+13

  Results on the 308 shaders extracted from the fp64 portion of the OpenGL CTS:

  Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
  total instructions in shared programs: 797951 -> 797779 (-0.02%)
  instructions in affected programs: 126482 -> 126310 (-0.14%)
  helped: 15
  HURT: 0
  helped stats (abs) min: 1 max: 20 x̄: 11.47 x̃: 10
  helped stats (rel) min: <.01% max: 0.60% x̄: 0.28% x̃: 0.29%
  95% mean confidence interval for instructions value: -14.79 -8.14
  95% mean confidence interval for instructions %-change: -0.40% -0.16%
  Instructions are helped.

  total cycles in shared programs: 6601437 -> 6601355 (<.01%)
  cycles in affected programs: 1089336 -> 1089254 (<.01%)
  helped: 15
  HURT: 0
  helped stats (abs) min: 2 max: 12 x̄: 5.47 x̃: 6
  helped stats (rel) min: <.01% max: 0.04% x̄: 0.01% x̃: 0.01%
  95% mean confidence interval for cycles value: -7.06 -3.87
  95% mean confidence interval for cycles %-change: -0.02% <.01%
  Cycles are helped.

  Reviewed-by: Matt Turner <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
* soft-fp64: Relax the way NaN is propagated | Ian Romanick | 2020-03-18 | 1 | -2/+19

  Also reassociate a couple expressions to encourage some CSE.

  Results on the 308 shaders extracted from the fp64 portion of the OpenGL CTS:

  Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
  total instructions in shared programs: 813599 -> 797951 (-1.92%)
  instructions in affected programs: 796110 -> 780462 (-1.97%)
  helped: 92
  HURT: 0
  helped stats (abs) min: 3 max: 5198 x̄: 170.09 x̃: 83
  helped stats (rel) min: 0.36% max: 5.50% x̄: 1.57% x̃: 1.40%
  95% mean confidence interval for instructions value: -282.42 -57.75
  95% mean confidence interval for instructions %-change: -1.71% -1.42%
  Instructions are helped.

  total cycles in shared programs: 6687128 -> 6601437 (-1.28%)
  cycles in affected programs: 6582246 -> 6496555 (-1.30%)
  helped: 92
  HURT: 0
  helped stats (abs) min: 36 max: 14442 x̄: 931.42 x̃: 592
  helped stats (rel) min: 0.45% max: 3.16% x̄: 1.44% x̃: 1.23%
  95% mean confidence interval for cycles value: -1257.58 -605.27
  95% mean confidence interval for cycles %-change: -1.58% -1.30%
  Cycles are helped.

  total spills in shared programs: 759 -> 702 (-7.51%)
  spills in affected programs: 759 -> 702 (-7.51%)
  helped: 3
  HURT: 0

  total fills in shared programs: 2412 -> 1442 (-40.22%)
  fills in affected programs: 2412 -> 1442 (-40.22%)
  helped: 3
  HURT: 0

  Reviewed-by: Matt Turner <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
* soft-fp64/fsat: Micro-optimize x >= 1 test | Ian Romanick | 2020-03-18 | 1 | -1/+19

  Results on the 308 shaders extracted from the fp64 portion of the OpenGL CTS:

  Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
  total instructions in shared programs: 841590 -> 841332 (-0.03%)
  instructions in affected programs: 121957 -> 121699 (-0.21%)
  helped: 7
  HURT: 0
  helped stats (abs) min: 15 max: 54 x̄: 36.86 x̃: 41
  helped stats (rel) min: 0.16% max: 0.33% x̄: 0.23% x̃: 0.18%
  95% mean confidence interval for instructions value: -49.73 -23.98
  95% mean confidence interval for instructions %-change: -0.29% -0.16%
  Instructions are helped.

  total cycles in shared programs: 6926828 -> 6923967 (-0.04%)
  cycles in affected programs: 1038569 -> 1035708 (-0.28%)
  helped: 7
  HURT: 0
  helped stats (abs) min: 128 max: 616 x̄: 408.71 x̃: 446
  helped stats (rel) min: 0.18% max: 0.44% x̄: 0.29% x̃: 0.22%
  95% mean confidence interval for cycles value: -571.72 -245.70
  95% mean confidence interval for cycles %-change: -0.38% -0.19%
  Cycles are helped.

  Reviewed-by: Matt Turner <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
* soft-fp64/fsat: Micro-optimize x < 0 test | Ian Romanick | 2020-03-18 | 1 | -1/+3

  Results on the 308 shaders extracted from the fp64 portion of the OpenGL CTS:

  Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
  total instructions in shared programs: 841647 -> 841590 (<.01%)
  instructions in affected programs: 122014 -> 121957 (-0.05%)
  helped: 7
  HURT: 0
  helped stats (abs) min: 3 max: 12 x̄: 8.14 x̃: 9
  helped stats (rel) min: 0.04% max: 0.07% x̄: 0.05% x̃: 0.04%
  95% mean confidence interval for instructions value: -11.23 -5.06
  95% mean confidence interval for instructions %-change: -0.06% -0.03%
  Instructions are helped.

  total cycles in shared programs: 6926904 -> 6926828 (<.01%)
  cycles in affected programs: 1038645 -> 1038569 (<.01%)
  helped: 7
  HURT: 0
  helped stats (abs) min: 4 max: 16 x̄: 10.86 x̃: 12
  helped stats (rel) min: <.01% max: 0.01% x̄: <.01% x̃: <.01%
  95% mean confidence interval for cycles value: -14.97 -6.74
  95% mean confidence interval for cycles %-change: -0.01% <.01%
  Cycles are helped.

  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
* soft-fp64/fsat: Correctly handle NaN | Ian Romanick | 2020-03-18 | 1 | -2/+3

  fsat is defined as min(max(a, 0.0), 1.0), and IEEE defines both min and max to return the non-NaN value when one value is NaN. Based on this, fsat should definitely return 0.0 for NaN.

  Results on the 308 shaders extracted from the fp64 portion of the OpenGL CTS:

  Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
  total instructions in shared programs: 841666 -> 841647 (<.01%)
  instructions in affected programs: 122033 -> 122014 (-0.02%)
  helped: 7
  HURT: 0
  helped stats (abs) min: 1 max: 4 x̄: 2.71 x̃: 3
  helped stats (rel) min: 0.01% max: 0.02% x̄: 0.02% x̃: 0.01%
  95% mean confidence interval for instructions value: -3.74 -1.69
  95% mean confidence interval for instructions %-change: -0.02% -0.01%
  Instructions are helped.

  total cycles in shared programs: 6927246 -> 6926904 (<.01%)
  cycles in affected programs: 1038987 -> 1038645 (-0.03%)
  helped: 7
  HURT: 0
  helped stats (abs) min: 18 max: 72 x̄: 48.86 x̃: 54
  helped stats (rel) min: 0.03% max: 0.05% x̄: 0.03% x̃: 0.03%
  95% mean confidence interval for cycles value: -67.38 -30.33
  95% mean confidence interval for cycles %-change: -0.05% -0.02%
  Cycles are helped.

  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
  Fixes: a42163cbbc1 ("compiler: Add lowering support for 64-bit saturate operations to software")
  Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2585
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
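
  The reasoning in this commit message can be traced with a small C sketch on native doubles. This is an illustration of the min(max(a, 0.0), 1.0) definition only, not the soft-fp64 implementation. The helpers min_non_nan and max_non_nan are hypothetical and spell out the "return the non-NaN operand" rule that the message attributes to IEEE.

```c
#include <math.h>

/* max that returns the other operand when one operand is NaN. */
static double max_non_nan(double x, double y)
{
    if (isnan(x))
        return y;
    if (isnan(y))
        return x;
    return x > y ? x : y;
}

/* min that returns the other operand when one operand is NaN. */
static double min_non_nan(double x, double y)
{
    if (isnan(x))
        return y;
    if (isnan(y))
        return x;
    return x < y ? x : y;
}

/* Reference saturate: clamp to [0, 1] via min(max(a, 0.0), 1.0). */
static double fsat_reference(double a)
{
    return min_non_nan(max_non_nan(a, 0.0), 1.0);
}
```

  For a NaN input the inner max yields 0.0, and min(0.0, 1.0) is 0.0, which is the result the commit message argues for.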
* soft-fp64/flt: Perform checks in a different order | Ian Romanick | 2020-03-18 | 2 | -16/+67

  The change to nir_opt_algebraic cleans up a pattern that was never produced before the rest of this commit was added.

  Results on the 308 shaders extracted from the fp64 portion of the OpenGL CTS:

  Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
  total instructions in shared programs: 843005 -> 841666 (-0.16%)
  instructions in affected programs: 460655 -> 459316 (-0.29%)
  helped: 64
  HURT: 17
  helped stats (abs) min: 1 max: 72 x̄: 21.72 x̃: 20
  helped stats (rel) min: 0.01% max: 28.07% x̄: 12.67% x̃: 16.07%
  HURT stats (abs) min: 1 max: 7 x̄: 3.00 x̃: 2
  HURT stats (rel) min: 0.01% max: 0.04% x̄: 0.02% x̃: 0.02%
  95% mean confidence interval for instructions value: -20.87 -12.19
  95% mean confidence interval for instructions %-change: -12.35% -7.66%
  Instructions are helped.

  total cycles in shared programs: 6944998 -> 6927246 (-0.26%)
  cycles in affected programs: 3891872 -> 3874120 (-0.46%)
  helped: 71
  HURT: 10
  helped stats (abs) min: 2 max: 772 x̄: 254.21 x̃: 156
  helped stats (rel) min: <.01% max: 66.44% x̄: 21.72% x̃: 18.40%
  HURT stats (abs) min: 18 max: 69 x̄: 29.70 x̃: 20
  HURT stats (rel) min: 0.02% max: 0.04% x̄: 0.03% x̃: 0.03%
  95% mean confidence interval for cycles value: -270.82 -167.50
  95% mean confidence interval for cycles %-change: -24.41% -13.65%
  Cycles are helped.

  Reviewed-by: Matt Turner <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
* soft-fp64/fneg: Don't treat NaN specially | Ian Romanick | 2020-03-18 | 1 | -4/+1

  __fabs64 doesn't do anything special, and the value is still NaN regardless of the value of the MSB. In a strict sense, it's possible that both functions should set the "signal" bit.

  Results on the 308 shaders extracted from the fp64 portion of the OpenGL CTS:

  Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
  total instructions in shared programs: 844558 -> 843005 (-0.18%)
  instructions in affected programs: 725975 -> 724422 (-0.21%)
  helped: 53
  HURT: 4
  helped stats (abs) min: 1 max: 313 x̄: 29.87 x̃: 21
  helped stats (rel) min: 0.01% max: 0.94% x̄: 0.30% x̃: 0.22%
  HURT stats (abs) min: 4 max: 11 x̄: 7.50 x̃: 7
  HURT stats (rel) min: 0.03% max: 0.09% x̄: 0.05% x̃: 0.04%
  95% mean confidence interval for instructions value: -39.02 -15.47
  95% mean confidence interval for instructions %-change: -0.34% -0.21%
  Instructions are helped.

  total cycles in shared programs: 6962024 -> 6944998 (-0.24%)
  cycles in affected programs: 6185470 -> 6168444 (-0.28%)
  helped: 59
  HURT: 0
  helped stats (abs) min: 64 max: 2863 x̄: 288.58 x̃: 208
  helped stats (rel) min: 0.11% max: 0.87% x̄: 0.33% x̃: 0.27%
  95% mean confidence interval for cycles value: -387.15 -190.00
  95% mean confidence interval for cycles %-change: -0.38% -0.28%
  Cycles are helped.

  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
* soft-fp64: Store sign value as 0 or 0x80000000Ian Romanick2020-03-181-16/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ...instead of 0 or 1. Many places the sign bit is extracted, then later put back in the same position. This saves some left-shift operations. Results on the 308 shaders extracted from the fp64 portion of the OpenGL CTS: Tiger Lake and Ice Lake had similar results. (Tiger Lake shown) total instructions in shared programs: 848106 -> 844558 (-0.42%) instructions in affected programs: 833480 -> 829932 (-0.43%) helped: 106 HURT: 1 helped stats (abs) min: 1 max: 995 x̄: 33.48 x̃: 12 helped stats (rel) min: 0.15% max: 2.20% x̄: 0.60% x̃: 0.35% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: <.01% max: <.01% x̄: <.01% x̃: <.01% 95% mean confidence interval for instructions value: -51.88 -14.43 95% mean confidence interval for instructions %-change: -0.71% -0.47% Instructions are helped. total cycles in shared programs: 6969125 -> 6962024 (-0.10%) cycles in affected programs: 6717689 -> 6710588 (-0.11%) helped: 78 HURT: 7 helped stats (abs) min: 2 max: 2083 x̄: 110.27 x̃: 56 helped stats (rel) min: <.01% max: 0.30% x̄: 0.11% x̃: 0.11% HURT stats (abs) min: 2 max: 1340 x̄: 214.29 x̃: 4 HURT stats (rel) min: 0.01% max: 0.71% x̄: 0.13% x̃: 0.02% 95% mean confidence interval for cycles value: -144.02 -23.06 95% mean confidence interval for cycles %-change: -0.12% -0.07% Cycles are helped. total spills in shared programs: 814 -> 759 (-6.76%) spills in affected programs: 814 -> 759 (-6.76%) helped: 2 HURT: 1 total fills in shared programs: 2488 -> 2412 (-3.05%) fills in affected programs: 2488 -> 2412 (-3.05%) helped: 2 HURT: 1 Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Matt Turner <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
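A rough C sketch of why keeping the sign in place saves shifts. The function names are hypothetical and the real code is GLSL, so treat this as an illustration of the idea rather than the actual soft-fp64 routines.

```c
#include <stdint.h>

/* Older idiom: the sign is kept as 0 or 1, so putting it back
 * into bit 31 costs a left shift. */
static uint32_t repack_sign_01(uint32_t hi, uint32_t magnitude_hi)
{
    uint32_t sign = hi >> 31;            /* 0 or 1 */
    return (sign << 31) | magnitude_hi;
}

/* Idiom described by the commit: the sign is kept as 0 or 0x80000000,
 * so it can be OR'ed back in directly. */
static uint32_t repack_sign_in_place(uint32_t hi, uint32_t magnitude_hi)
{
    uint32_t sign = hi & 0x80000000u;    /* 0 or 0x80000000 */
    return sign | magnitude_hi;
}
```

Both functions return the same word; the second simply never moves the bit away from position 31.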
* soft-fp64: Pick a single idiom for treating sign value as a BooleanIan Romanick2020-03-181-10/+10
| | | | | | | | | | | | | | | | | | | | | | | | | Replace all of the bool(qSign) with qSign != 0u. Remove unnecessary parentheses from around most of the existing qSign != 0u. This dramatically simplifies the next commit. Results on the 308 shaders extracted from the fp64 portion of the OpenGL CTS: Tiger Lake and Ice Lake had similar results. (Tiger Lake shown) total instructions in shared programs: 848109 -> 848106 (<.01%) instructions in affected programs: 53 -> 50 (-5.66%) helped: 1 HURT: 0 total cycles in shared programs: 6969145 -> 6969125 (<.01%) cycles in affected programs: 396 -> 376 (-5.05%) helped: 1 HURT: 0 Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Matt Turner <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
* soft-fp64: Simplify __countLeadingZeros32 functionIan Romanick2020-03-181-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | findMSB returns -1 for an input of zero, so 31 - findMSB(a) is sufficient on its own. There's only one user of findMSB in shader-db, and it does not match this pattern. TODO: Add a pattern in the backend code generator that emits 31 - nir_op_ufind_msb(a) as if it were nir_op_uclz. That should save a couple instructions. Results on the 308 shaders extracted from the fp64 portion of the OpenGL CTS: Tiger Lake and Ice Lake had similar results. (Tiger Lake shown) total instructions in shared programs: 859509 -> 848109 (-1.33%) instructions in affected programs: 841058 -> 829658 (-1.36%) helped: 97 HURT: 0 helped stats (abs) min: 3 max: 1161 x̄: 117.53 x̃: 72 helped stats (rel) min: 0.98% max: 6.74% x̄: 1.70% x̃: 1.35% 95% mean confidence interval for instructions value: -147.21 -87.84 95% mean confidence interval for instructions %-change: -1.94% -1.46% Instructions are helped. total cycles in shared programs: 7072275 -> 6969145 (-1.46%) cycles in affected programs: 6955767 -> 6852637 (-1.48%) helped: 97 HURT: 0 helped stats (abs) min: 32 max: 10900 x̄: 1063.20 x̃: 560 helped stats (rel) min: 1.18% max: 7.58% x̄: 1.84% x̃: 1.45% 95% mean confidence interval for cycles value: -1339.43 -786.96 95% mean confidence interval for cycles %-change: -2.11% -1.57% Cycles are helped. Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Matt Turner <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
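The observation about findMSB can be checked with a small C sketch. `find_msb_sketch` is a hypothetical stand-in for GLSL's findMSB on an unsigned value, written as a plain loop; it is not how a driver would implement it.

```c
#include <stdint.h>

/* Stand-in for GLSL findMSB() on a uint: index of the highest set bit,
 * or -1 when the input is zero. */
static int find_msb_sketch(uint32_t a)
{
    int msb = -1;
    while (a != 0u) {
        a >>= 1;
        msb++;
    }
    return msb;
}

/* Because the zero case already yields -1, no extra branch is needed:
 * 31 - (-1) == 32, which is the expected count for an all-zero word. */
static int count_leading_zeros32_sketch(uint32_t a)
{
    return 31 - find_msb_sketch(a);
}
```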
* soft-fp64: Don't open-code umulExtendedIan Romanick2020-03-181-32/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Results on the 308 shaders extracted from the fp64 portion of the OpenGL CTS: Tiger Lake and Ice Lake had similar results. (Tiger Lake shown) total instructions in shared programs: 928859 -> 859509 (-7.47%) instructions in affected programs: 866293 -> 796943 (-8.01%) helped: 76 HURT: 0 helped stats (abs) min: 75 max: 8042 x̄: 912.50 x̃: 688 helped stats (rel) min: 5.35% max: 21.02% x̄: 10.35% x̃: 7.58% 95% mean confidence interval for instructions value: -1138.37 -686.63 95% mean confidence interval for instructions %-change: -11.69% -9.00% Instructions are helped. total cycles in shared programs: 7272912 -> 7072275 (-2.76%) cycles in affected programs: 6763486 -> 6562849 (-2.97%) helped: 76 HURT: 0 helped stats (abs) min: 214 max: 30136 x̄: 2639.96 x̃: 1923 helped stats (rel) min: 1.75% max: 9.20% x̄: 4.04% x̃: 2.41% 95% mean confidence interval for cycles value: -3455.29 -1824.63 95% mean confidence interval for cycles %-change: -4.69% -3.39% Cycles are helped. total spills in shared programs: 817 -> 814 (-0.37%) spills in affected programs: 791 -> 788 (-0.38%) helped: 2 HURT: 0 total fills in shared programs: 2438 -> 2488 (2.05%) fills in affected programs: 2392 -> 2442 (2.09%) helped: 0 HURT: 2 Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Matt Turner <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
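The commit message gives no detail beyond the title, so the following C sketch is only an assumption about what "open-coded" could look like: a 32x32 -> 64-bit multiply built from 16-bit partial products, next to a single widening multiply that plays the role of GLSL's umulExtended. Both function names are hypothetical.

```c
#include <stdint.h>

/* Open-coded variant: four 16x16 partial products with manual carries. */
static void mul32to64_open_coded(uint32_t a, uint32_t b,
                                 uint32_t *hi, uint32_t *lo)
{
    uint32_t a_lo = a & 0xFFFFu, a_hi = a >> 16;
    uint32_t b_lo = b & 0xFFFFu, b_hi = b >> 16;

    uint32_t z1 = a_lo * b_lo;
    uint32_t mid_a = a_lo * b_hi;
    uint32_t mid_b = a_hi * b_lo;
    uint32_t z0 = a_hi * b_hi;

    mid_a += mid_b;                                   /* may wrap */
    z0 += ((uint32_t)(mid_a < mid_b) << 16) + (mid_a >> 16);
    mid_a <<= 16;
    z1 += mid_a;                                      /* may wrap */
    z0 += (uint32_t)(z1 < mid_a);

    *hi = z0;
    *lo = z1;
}

/* Single widening multiply, standing in for umulExtended. */
static void mul32to64_widening(uint32_t a, uint32_t b,
                               uint32_t *hi, uint32_t *lo)
{
    uint64_t product = (uint64_t)a * (uint64_t)b;
    *hi = (uint32_t)(product >> 32);
    *lo = (uint32_t)product;
}
```

The two functions compute the same result; the second leaves the choice of instruction sequence to the compiler or hardware, which is consistent with the instruction-count drop reported above.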
* soft-fp64/b2f: Reimplement using bitwise logic opsIan Romanick2020-03-181-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This doesn't help a lot of shaders, but it helps those few a LOT. This could also be implemented using bcsel. That version is very slightly worse because the generated SEL instruction wants to have two immediate sources, so one of them usually needs an extra MOV instruction to load. Results on the 308 shaders extracted from the fp64 portion of the OpenGL CTS: Tiger Lake and Ice Lake had similar results. (Tiger Lake shown) total instructions in shared programs: 929619 -> 928859 (-0.08%) instructions in affected programs: 1651 -> 891 (-46.03%) helped: 8 HURT: 0 helped stats (abs) min: 38 max: 152 x̄: 95.00 x̃: 95 helped stats (rel) min: 42.70% max: 86.36% x̄: 49.88% x̃: 44.66% 95% mean confidence interval for instructions value: -132.97 -57.03 95% mean confidence interval for instructions %-change: -62.28% -37.49% Instructions are helped. total cycles in shared programs: 7280180 -> 7272912 (-0.10%) cycles in affected programs: 12960 -> 5692 (-56.08%) helped: 8 HURT: 0 helped stats (abs) min: 352 max: 1456 x̄: 908.50 x̃: 910 helped stats (rel) min: 52.45% max: 91.19% x̄: 59.24% x̃: 55.15% 95% mean confidence interval for cycles value: -1274.03 -542.97 95% mean confidence interval for cycles %-change: -70.06% -48.41% Cycles are helped. Reviewed-by: Matt Turner <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
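A minimal C sketch of the two approaches the message compares. The exact expression used in the soft-fp64 code is not shown here, so the function names and the mask construction are assumptions. What is certain is the encoding: the double 1.0 is 0x3FF0000000000000, so only the high 32-bit word depends on the Boolean.

```c
#include <stdbool.h>
#include <stdint.h>

/* High 32 bits of the double 1.0; the low 32 bits are zero. */
#define FP64_ONE_HI 0x3FF00000u

/* Bitwise variant: turn the Boolean into an all-zeros or all-ones mask
 * and AND it with the constant. */
static uint32_t b2f64_hi_bitwise(bool b)
{
    uint32_t mask = 0u - (uint32_t)b;   /* 0x00000000 or 0xFFFFFFFF */
    return mask & FP64_ONE_HI;
}

/* Select variant, analogous to the bcsel alternative mentioned above. */
static uint32_t b2f64_hi_select(bool b)
{
    return b ? FP64_ONE_HI : 0u;
}
```

Both return the same high word. The message's preference for the bitwise form is about how the select lowers on the hardware (two immediate sources), not about the result.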
* nir/algebraic: Simplify a contradiction that can occur in __flt64_nonnanIan Romanick2020-03-181-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The pattern is added to opt_algebraic because, for example, comparisons with constant 0.0 will produce (a1 < 0). Even with a pass that optimized Boolean expressions, I think this would be very difficult to automatically recognize and optimize. Results on the 308 shaders extracted from the fp64 portion of the OpenGL CTS: Tiger Lake and Ice Lake had similar results. (Tiger Lake shown) total instructions in shared programs: 933054 -> 929619 (-0.37%) instructions in affected programs: 784041 -> 780606 (-0.44%) helped: 59 HURT: 0 helped stats (abs) min: 2 max: 213 x̄: 58.22 x̃: 44 helped stats (rel) min: 0.02% max: 2.51% x̄: 0.72% x̃: 0.46% 95% mean confidence interval for instructions value: -70.80 -45.64 95% mean confidence interval for instructions %-change: -0.92% -0.53% Instructions are helped. total cycles in shared programs: 7304712 -> 7280180 (-0.34%) cycles in affected programs: 7176260 -> 7151728 (-0.34%) helped: 92 HURT: 0 helped stats (abs) min: 8 max: 1414 x̄: 266.65 x̃: 166 helped stats (rel) min: 0.04% max: 2.34% x̄: 0.43% x̃: 0.22% 95% mean confidence interval for cycles value: -333.05 -200.26 95% mean confidence interval for cycles %-change: -0.54% -0.31% Cycles are helped. Regular shader-db changes: No changes on any Intel platform. Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Matt Turner <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
* nir/algebraic: Constant reassociation for bitwise operations tooIan Romanick2020-03-181-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Like 5886cd79a0e, but for iand, ior, and ixor. Results on the 308 shaders extracted from the fp64 portion of the OpenGL CTS: Tiger Lake total instructions in shared programs: 903108 -> 902830 (-0.03%) instructions in affected programs: 654910 -> 654632 (-0.04%) helped: 31 HURT: 5 helped stats (abs) min: 2 max: 31 x̄: 9.58 x̃: 7 helped stats (rel) min: 0.01% max: 0.23% x̄: 0.06% x̃: 0.04% HURT stats (abs) min: 1 max: 10 x̄: 3.80 x̃: 3 HURT stats (rel) min: 0.01% max: 0.10% x̄: 0.03% x̃: 0.02% 95% mean confidence interval for instructions value: -10.55 -4.89 95% mean confidence interval for instructions %-change: -0.07% -0.03% Instructions are helped. total cycles in shared programs: 7059681 -> 7058006 (-0.02%) cycles in affected programs: 5081309 -> 5079634 (-0.03%) helped: 33 HURT: 12 helped stats (abs) min: 1 max: 444 x̄: 60.91 x̃: 18 helped stats (rel) min: <.01% max: 2.17% x̄: 0.25% x̃: 0.05% HURT stats (abs) min: 1 max: 288 x̄: 27.92 x̃: 2 HURT stats (rel) min: <.01% max: 1.00% x̄: 0.23% x̃: 0.02% 95% mean confidence interval for cycles value: -68.32 -6.12 95% mean confidence interval for cycles %-change: -0.28% 0.03% Inconclusive result (%-change mean confidence interval includes 0). Ice Lake total instructions in shared programs: 895384 -> 895159 (-0.03%) instructions in affected programs: 658678 -> 658453 (-0.03%) helped: 37 HURT: 0 helped stats (abs) min: 3 max: 16 x̄: 6.08 x̃: 4 helped stats (rel) min: <.01% max: 0.07% x̄: 0.04% x̃: 0.04% 95% mean confidence interval for instructions value: -7.46 -4.70 95% mean confidence interval for instructions %-change: -0.04% -0.03% Instructions are helped. 
total cycles in shared programs: 7092224 -> 7091195 (-0.01%) cycles in affected programs: 5221666 -> 5220637 (-0.02%) helped: 35 HURT: 11 helped stats (abs) min: 1 max: 247 x̄: 43.46 x̃: 12 helped stats (rel) min: <.01% max: 2.17% x̄: 0.23% x̃: 0.05% HURT stats (abs) min: 2 max: 432 x̄: 44.73 x̃: 5 HURT stats (rel) min: <.01% max: 1.00% x̄: 0.25% x̃: 0.02% 95% mean confidence interval for cycles value: -49.00 4.26 95% mean confidence interval for cycles %-change: -0.27% 0.03% Inconclusive result (value mean confidence interval includes 0). Regular shader-db results: All Haswell+ platforms had similar results. (Tiger Lake shown) total instructions in shared programs: 17611408 -> 17611398 (<.01%) instructions in affected programs: 1648 -> 1638 (-0.61%) helped: 2 HURT: 0 total cycles in shared programs: 338366148 -> 338366124 (<.01%) cycles in affected programs: 124048 -> 124024 (-0.02%) helped: 2 HURT: 0 No changes on any earlier Intel platforms. Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Matt Turner <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
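As a rough illustration of constant reassociation for iand, ior, and ixor: when two constants are applied in sequence with the same operator, they can be combined into one. The constants and function names below are hypothetical, and the actual nir_opt_algebraic patterns are not reproduced in this log.

```c
#include <stdint.h>

/* Arbitrary example constants, chosen only for the demonstration. */
#define C1 0x7FF00000u
#define C2 0x0FFFFFFFu

/* Chained forms: two operations on the variable. */
static uint32_t and_chained(uint32_t a) { return (a & C1) & C2; }
static uint32_t or_chained(uint32_t a)  { return (a | C1) | C2; }
static uint32_t xor_chained(uint32_t a) { return (a ^ C1) ^ C2; }

/* Reassociated forms: the constant part (C1 op C2) can be folded at
 * compile time, leaving one operation on the variable. */
static uint32_t and_folded(uint32_t a) { return a & (C1 & C2); }
static uint32_t or_folded(uint32_t a)  { return a | (C1 | C2); }
static uint32_t xor_folded(uint32_t a) { return a ^ (C1 ^ C2); }
```

This works because all three operators are associative and commutative. The saving is one instruction per occurrence, which matches the small deltas reported above.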
* nir/algebraic: Generalize some and-of-shift-right patterns [v2]Ian Romanick2020-03-181-2/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Generalizes some of the patterns from 76289fbfa84a and 905ff8619824. In particular, some of the soft-fp64 code generates (a & 0x7fffffff) << 1 when constant 0.0 is compared (flt or feq). v2: Reduce the set of added patterns to those that actually help something. This reduces the size of the state transition tables by about 29k. Suggested by Jason. Remove the existing patterns that this commit subsumes. Results on the 308 shaders extracted from the fp64 portion of the OpenGL CTS: Tiger Lake total instructions in shared programs: 903171 -> 903108 (<.01%) instructions in affected programs: 635903 -> 635840 (<.01%) helped: 25 HURT: 11 helped stats (abs) min: 1 max: 16 x̄: 5.04 x̃: 3 helped stats (rel) min: <.01% max: 0.15% x̄: 0.04% x̃: 0.03% HURT stats (abs) min: 2 max: 14 x̄: 5.73 x̃: 5 HURT stats (rel) min: <.01% max: 0.11% x̄: 0.04% x̃: 0.02% 95% mean confidence interval for instructions value: -3.91 0.41 95% mean confidence interval for instructions %-change: -0.03% <.01% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 7059527 -> 7059681 (<.01%) cycles in affected programs: 5249401 -> 5249555 (<.01%) helped: 41 HURT: 9 helped stats (abs) min: 2 max: 76 x̄: 11.90 x̃: 10 helped stats (rel) min: <.01% max: 11.86% x̄: 0.99% x̃: 0.01% HURT stats (abs) min: 2 max: 380 x̄: 71.33 x̃: 12 HURT stats (rel) min: <.01% max: 0.22% x̄: 0.04% x̃: 0.01% 95% mean confidence interval for cycles value: -14.93 21.09 95% mean confidence interval for cycles %-change: -1.40% -0.20% Inconclusive result (value mean confidence interval includes 0). 
Ice Lake total instructions in shared programs: 895506 -> 895384 (-0.01%) instructions in affected programs: 658800 -> 658678 (-0.02%) helped: 37 HURT: 0 helped stats (abs) min: 2 max: 8 x̄: 3.30 x̃: 2 helped stats (rel) min: <.01% max: 0.03% x̄: 0.02% x̃: 0.02% 95% mean confidence interval for instructions value: -4.00 -2.59 95% mean confidence interval for instructions %-change: -0.02% -0.02% Instructions are helped. total cycles in shared programs: 7092748 -> 7092224 (<.01%) cycles in affected programs: 5272008 -> 5271484 (<.01%) helped: 36 HURT: 14 helped stats (abs) min: 2 max: 440 x̄: 21.67 x̃: 8 helped stats (rel) min: <.01% max: 11.86% x̄: 1.12% x̃: 0.02% HURT stats (abs) min: 2 max: 122 x̄: 18.29 x̃: 6 HURT stats (rel) min: <.01% max: 0.07% x̄: 0.01% x̃: <.01% 95% mean confidence interval for cycles value: -29.24 8.28 95% mean confidence interval for cycles %-change: -1.40% -0.21% Inconclusive result (value mean confidence interval includes 0). Regular shader-db results: All Haswell+ platforms had similar results. (Tiger Lake shown) total instructions in shared programs: 17611489 -> 17611408 (<.01%) instructions in affected programs: 21188 -> 21107 (-0.38%) helped: 23 HURT: 1 helped stats (abs) min: 1 max: 16 x̄: 3.78 x̃: 3 helped stats (rel) min: 0.03% max: 5.82% x̄: 1.13% x̃: 0.85% HURT stats (abs) min: 6 max: 6 x̄: 6.00 x̃: 6 HURT stats (rel) min: 0.60% max: 0.60% x̄: 0.60% x̃: 0.60% 95% mean confidence interval for instructions value: -5.27 -1.48 95% mean confidence interval for instructions %-change: -1.70% -0.42% Instructions are helped. 
total cycles in shared programs: 338418502 -> 338366148 (-0.02%) cycles in affected programs: 2289052 -> 2236698 (-2.29%) helped: 18 HURT: 3 helped stats (abs) min: 4 max: 18000 x̄: 2909.67 x̃: 38 helped stats (rel) min: 0.09% max: 4.07% x̄: 0.96% x̃: 0.43% HURT stats (abs) min: 2 max: 14 x̄: 6.67 x̃: 4 HURT stats (rel) min: 0.22% max: 1.13% x̄: 0.66% x̃: 0.64% 95% mean confidence interval for cycles value: -5204.00 217.91 95% mean confidence interval for cycles %-change: -1.31% -0.14% Inconclusive result (value mean confidence interval includes 0). Ivy Bridge total instructions in shared programs: 11875617 -> 11875615 (<.01%) instructions in affected programs: 1339 -> 1337 (-0.15%) helped: 2 HURT: 0 No changes on any earlier Intel platforms. Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Matt Turner <[email protected]> [v1] Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
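The expression quoted in the message, (a & 0x7fffffff) << 1, shows the kind of redundancy involved. The sketch below demonstrates only that one identity on 32-bit values; the patterns the commit actually adds or removes are not listed in this log, and the function names are hypothetical.

```c
#include <stdint.h>

/* As generated: clear bit 31, then shift left by one. */
static uint32_t shl1_masked(uint32_t a)
{
    return (a & 0x7FFFFFFFu) << 1;
}

/* Simplified: the shift already discards bit 31, so the mask is redundant. */
static uint32_t shl1_plain(uint32_t a)
{
    return a << 1;
}
```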
* nir/algebraic: optimize ior(ine(a, 0), ine(b, 0)) to ine(ior(a, b), 0)Ian Romanick2020-03-181-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Like 70f9e2589e6b. Also scrub the unnecessary size qualifier in both replacement patterns. This occurs in a handful of places in the soft-fp64 code, and that is the primary reason for the change. Perhaps the patterns that generate umin should be conditioned on something, but I'm not sure what. lower_bitops might cover the cases that matter, but it seems ugly. Results on the 308 shaders extracted from the fp64 portion of the OpenGL CTS: Tiger Lake and Ice Lake had similar results. (Tiger Lake shown) total instructions in shared programs: 936505 -> 933388 (-0.33%) instructions in affected programs: 925719 -> 922602 (-0.34%) helped: 154 HURT: 1 helped stats (abs) min: 1 max: 211 x̄: 35.45 x̃: 16 helped stats (rel) min: 0.34% max: 9.30% x̄: 2.28% x̃: 0.96% HURT stats (abs) min: 2342 max: 2342 x̄: 2342.00 x̃: 2342 HURT stats (rel) min: 2.28% max: 2.28% x̄: 2.28% x̃: 2.28% 95% mean confidence interval for instructions value: -51.21 10.99 95% mean confidence interval for instructions %-change: -2.61% -1.89% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 7323502 -> 7306184 (-0.24%) cycles in affected programs: 7220376 -> 7203058 (-0.24%) helped: 126 HURT: 1 helped stats (abs) min: 2 max: 946 x̄: 159.10 x̃: 95 helped stats (rel) min: 0.01% max: 9.62% x̄: 0.80% x̃: 0.37% HURT stats (abs) min: 2728 max: 2728 x̄: 2728.00 x̃: 2728 HURT stats (rel) min: 0.37% max: 0.37% x̄: 0.37% x̃: 0.37% 95% mean confidence interval for cycles value: -192.07 -80.66 95% mean confidence interval for cycles %-change: -1.07% -0.51% Cycles are helped. 
total spills in shared programs: 635 -> 817 (28.66%) spills in affected programs: 635 -> 817 (28.66%) helped: 0 HURT: 3 total fills in shared programs: 2065 -> 2438 (18.06%) fills in affected programs: 2019 -> 2392 (18.47%) helped: 0 HURT: 2 Regular shader-db results: All Haswell+ platforms had similar results. (Tiger Lake shown) total instructions in shared programs: 17611506 -> 17611489 (<.01%) instructions in affected programs: 33442 -> 33425 (-0.05%) helped: 32 HURT: 6 helped stats (abs) min: 1 max: 6 x̄: 1.69 x̃: 1 helped stats (rel) min: 0.08% max: 1.90% x̄: 0.27% x̃: 0.11% HURT stats (abs) min: 1 max: 15 x̄: 6.17 x̃: 5 HURT stats (rel) min: 0.09% max: 1.50% x̄: 0.65% x̃: 0.55% 95% mean confidence interval for instructions value: -1.70 0.80 95% mean confidence interval for instructions %-change: -0.30% 0.05% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 338419218 -> 338418502 (<.01%) cycles in affected programs: 385795 -> 385079 (-0.19%) helped: 42 HURT: 3 helped stats (abs) min: 2 max: 192 x̄: 24.57 x̃: 16 helped stats (rel) min: 0.04% max: 2.09% x̄: 0.33% x̃: 0.22% HURT stats (abs) min: 64 max: 164 x̄: 105.33 x̃: 88 HURT stats (rel) min: 0.77% max: 1.58% x̄: 1.09% x̃: 0.93% 95% mean confidence interval for cycles value: -29.76 -2.06 95% mean confidence interval for cycles %-change: -0.40% -0.07% Cycles are helped. Ivy Bridge and Sandy Bridge had similar results. (Ivy Bridge shown) total instructions in shared programs: 11875620 -> 11875617 (<.01%) instructions in affected programs: 421 -> 418 (-0.71%) helped: 2 HURT: 0 total cycles in shared programs: 178245336 -> 178245326 (<.01%) cycles in affected programs: 3425 -> 3415 (-0.29%) helped: 2 HURT: 0 No changes on Gen4 or Gen5. Reviewed-by: Matt Turner <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
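The rewrite named in the title, ior(ine(a, 0), ine(b, 0)) to ine(ior(a, b), 0), can be sketched in C as follows. The function names are hypothetical; the sketch only shows that the two forms agree.

```c
#include <stdbool.h>
#include <stdint.h>

/* Before: two comparisons, then an OR of the Boolean results. */
static bool either_nonzero_two_compares(uint32_t a, uint32_t b)
{
    return (a != 0u) | (b != 0u);
}

/* After: OR the values first, then compare once. */
static bool either_nonzero_one_compare(uint32_t a, uint32_t b)
{
    return (a | b) != 0u;
}
```

The OR of two integers is nonzero exactly when at least one of them is nonzero, so the second form needs one comparison fewer.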
* nir/algebraic: Simplify logic to detect sign of an integerIan Romanick2020-03-181-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This occurs in a handful of places in the soft-fp64 code, and that is the primary reason for the change. v2: Fix a typo in a comment. Noticed by Matt. Copy the correct fp64 shader-db results to the commit message. I realized that I had accidentally used the results from the next commit. Results on the 308 shaders extracted from the fp64 portion of the OpenGL CTS: Tiger Lake and Ice Lake had similar results. (Tiger Lake shown) total instructions in shared programs: 906235 -> 906149 (<.01%) instructions in affected programs: 353966 -> 353880 (-0.02%) helped: 31 HURT: 2 helped stats (abs) min: 1 max: 8 x̄: 3.03 x̃: 3 helped stats (rel) min: 0.01% max: 1.59% x̄: 0.10% x̃: 0.04% HURT stats (abs) min: 3 max: 5 x̄: 4.00 x̃: 4 HURT stats (rel) min: 0.02% max: 0.02% x̄: 0.02% x̃: 0.02% 95% mean confidence interval for instructions value: -3.51 -1.70 95% mean confidence interval for instructions %-change: -0.19% <.01% Inconclusive result (%-change mean confidence interval includes 0). total cycles in shared programs: 7076552 -> 7076173 (<.01%) cycles in affected programs: 2878361 -> 2877982 (-0.01%) helped: 37 HURT: 2 helped stats (abs) min: 2 max: 48 x̄: 10.81 x̃: 6 helped stats (rel) min: <.01% max: 2.17% x̄: 0.47% x̃: 0.01% HURT stats (abs) min: 1 max: 20 x̄: 10.50 x̃: 10 HURT stats (rel) min: <.01% max: 0.01% x̄: <.01% x̃: <.01% 95% mean confidence interval for cycles value: -13.96 -5.48 95% mean confidence interval for cycles %-change: -0.72% -0.16% Cycles are helped. total fills in shared programs: 2064 -> 2065 (0.05%) fills in affected programs: 45 -> 46 (2.22%) helped: 0 HURT: 1 Regular shader-db results: All Gen7+ platforms had similar results. 
(Tiger Lake shown) total instructions in shared programs: 17611530 -> 17611506 (<.01%) instructions in affected programs: 5934 -> 5910 (-0.40%) helped: 10 HURT: 0 helped stats (abs) min: 1 max: 5 x̄: 2.40 x̃: 2 helped stats (rel) min: 0.14% max: 1.24% x̄: 0.47% x̃: 0.34% 95% mean confidence interval for instructions value: -3.53 -1.27 95% mean confidence interval for instructions %-change: -0.78% -0.17% Instructions are helped. total cycles in shared programs: 338419178 -> 338419218 (<.01%) cycles in affected programs: 19244 -> 19284 (0.21%) helped: 4 HURT: 2 helped stats (abs) min: 2 max: 4 x̄: 3.00 x̃: 3 helped stats (rel) min: 0.05% max: 0.11% x̄: 0.08% x̃: 0.08% HURT stats (abs) min: 26 max: 26 x̄: 26.00 x̃: 26 HURT stats (rel) min: 1.20% max: 1.20% x̄: 1.20% x̃: 1.20% 95% mean confidence interval for cycles value: -9.08 22.41 95% mean confidence interval for cycles %-change: -0.35% 1.04% Inconclusive result (value mean confidence interval includes 0). No changes on any earlier Intel platform. Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Matt Turner <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
* st/mesa: disallow deferred flush if there are multiple contextsPierre-Eric Pelloux-Prayer2020-03-181-1/+2
| | | | | | | | | | | | | u_threaded can hang in this situation, with one context waiting on a deferred fence from the other context. But the other context isn't flushing its pending work (because it's waiting for more work to be pushed) so everything is stuck. Fixes: d17b35e671a ("gallium: add PIPE_FLUSH_DEFERRED") Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1430 Reviewed-by: Marek Olšák <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4213> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4213>
* anv: Use isl_drm_modifier_get_default_aux_state()Chad Versace2020-03-181-18/+21
| | | | | | | | | | Use it in anv_layout_to_aux_state(). Refactor only. No change in behavior. Reviewed-by: Jason Ekstrand <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3881> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3881>
* intel/isl: Don't align linear images to 64K on Gen12+Jason Ekstrand2020-03-181-3/+12
| | | | | | Reviewed-by: Lionel Landwerlin <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4048> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4048>
* radv: fix random depth range unrestricted failures due to a cache issueSamuel Pitoiset2020-03-181-2/+6
| | | | | | | | | | | | | | | The shader module name is used to compute the pipeline key. The driver used to load the wrong pipelines because the shader names were similar. This should fix random failures of dEQP-VK.pipeline.depth_range_unrestricted.* Fixes: f11ea226664 ("radv: fix a performance regression with graphics depth/stencil clears") Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4216> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4216>
* turnip: Do gathering xfb info after nir_remove_dead_variablesHyunjun Ko2020-03-181-3/+5
| | | | | | | | | | | | | | So we could align stream outputs correctly even if unused in/outs are removed. Fixes: dEQP-VK.transform_feedback.fuzz.random_vertex.scalar_types.* dEQP-VK.transform_feedback.fuzz.random_vertex.vector_types.* Signed-off-by: Hyunjun Ko <[email protected]> Reviewed-by: Jonathan Marek <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4207> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4207>
* turnip: Fix wrong assignment of xfb output's offset.Hyunjun Ko2020-03-181-1/+1
| | | | | | | | | | | | Should be divided by 4 so we could calculate the offset correctly in tu6_setup_streamout. Fixes: 2a1d6b81ed54971d33e83b7f5545da096b13b043 Related: 374406a7c420d266f920461f904864a94dc1b8c8 Signed-off-by: Hyunjun Ko <[email protected]> Reviewed-by: Jonathan Marek <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4207>
* intel/decoder: don't consider header fields past dword0Lionel Landwerlin2020-03-181-2/+4
| | | | | | | | | v2: use ULL Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Danylo Piliaiev <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4134> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4134>
* lima: decode depth/stencil write bits in RSWVasily Khoruzhick2020-03-181-2/+10
| | | | | | | | | | Now that we know the bits that are responsible for enabling depth/stencil writes in shader we can decode them properly. Reviewed-by: Andreas Baierl <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4197> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4197>
* lima: implement zsbuf reloadIcenowy Zheng2020-03-187-38/+84
| | | | | | | | | | | | | | | | | | Fragment shader can write depth and stencil if we set necessary flags in RSW. In addition to that we need to use special format for Z24S8. Original format is apparently Z24X8 since we can't sample stencil in GLES2. This new format also seems to use several components for storing depth since we saw r != g != b when sampling with this format. [vasily: - initialize clear->depth to 0xffffff if we reload depth, just like blob does. Reloading doesn't work otherwise - use single bitmap for reload type] Reviewed-by: Vasily Khoruzhick <[email protected]> Reviewed-by: Andreas Baierl <[email protected]> Signed-off-by: Icenowy Zheng <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4197>
* lima: disable Z16 formatVasily Khoruzhick2020-03-182-3/+0
| | | | | | | | | | Unfortunately we don't know how to reload Z16 buffers yet and blob is using Z24 for dEQP tests that need depth reload. Reviewed-by: Icenowy Zheng <[email protected]> Reviewed-by: Andreas Baierl <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4197>
* gallium/util: Switch util_float_to_half to _mesa_float_to_half()'s impl.Eric Anholt2020-03-171-52/+2
| | | | | | | | | | | | | | The util_float_to_half() implementation was much smaller, but when trying to switch _mesa_float_to_half to it, many testcases (dEQP-VK.spirv_assembly.instruction.graphics.opquantize.*, piglit.spec.arb_shading_language_packing.*packhalf2x16) start failing on Intel. Replace the broken impl so that people don't have to debug it later. Acked-by: Michel Dänzer <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3699> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3699>
* amd/llvm: Fix divergent descriptor regressions with radeonsi.Bas Nieuwenhuizen2020-03-171-11/+13
| | | | | | | | | | | | | | piglit/bin/arb_bindless_texture-limit -auto -fbo: Needed to deal with non-NULL dynamic_index without deref in tex instructions. piglit/bin/shader_runner tests/spec/arb_bindless_texture/execution/images/multiple-resident-images-reading.shader_test -auto: Need to deal with non-deref images in enter_waterfall_image. Fixes: b83c9aca4a5 "amd/llvm: Fix divergent descriptor indexing. (v3)" Acked-by: Marek Olšák <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4191> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4191>
* gallium: fix build with latest meson and gcc10Dave Airlie2020-03-171-1/+1
| | | | | | | | | | | | | | | | | | | In Fedora 32 build was failing with meson-0.53.2-1.git88e40c7.fc32 and gcc-10.0.1-0.9.fc32.x86_64. Worked with meson-0.53.1-1 and same gcc. /usr/bin/ld: src/gallium/state_trackers/dri/libdri.a(dri2.c.o): in function `dri2_interop_export_object': /home/airlied/devel/mesa/mesa/build/../src/gallium/state_trackers/dri/dri2.c:1813: undefined reference to `st_finalize_texture' /usr/bin/ld: src/gallium/state_trackers/dri/libdri.a(dri_screen.c.o): in function `dri_init_screen_helper': /home/airlied/devel/mesa/mesa/build/../src/gallium/state_trackers/dri/dri_screen.c:580: undefined reference to `st_gl_api_create' Moving this around seems to fix it. Cc: [email protected] Reviewed-by: Dylan Baker <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4220> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4220>
* ac: don't set old denormals flags with LLVM >= 11Marek Olšák2020-03-171-1/+2
| | | | | | | Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4196> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4196>
* ac: set new LLVM denormal flagsMarek Olšák2020-03-171-0/+9
| | | | | | | | See: https://reviews.llvm.org/D71358 Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4196>
* ac: unify denorm setting enforcementMarek Olšák2020-03-173-32/+14
| | | | | | Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4196>
* gallium/u_vbuf: simplify the first if statement in u_vbuf_upload_buffersMarek Olšák2020-03-171-6/+8
| | | | | | | Acked-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Mathias Fröhlich <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4153> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4153>
* gallium/u_threaded: don't sync the thread for all unsychronized mappingsMarek Olšák2020-03-171-0/+3
| | | | | | | | | This was missing for the READ case. This improves glBegin/End performance. (vbo maps with WRITE | READ | UNSYNCHRONIZED) Acked-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Mathias Fröhlich <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4153>
* freedreno/a5xx: Fix min-vs-mag filtering decisions on non-mipmap tex.Eric Anholt2020-03-172-155/+10
| | | | | | | | | This is a port of 3338d6e5f8b5 ("freedreno/a3xx: Mostly fix min-vs-mag filtering decisions on non-mipmap tex.") Reviewed-by: Kristian H. Kristensen <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4177> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4177>
* ci: Enable testing GLES2-3 on a530 (Dragonboard 820c).Eric Anholt2020-03-177-8/+2636
| | | | | | | | | Following on from the db410c conversion to baremetal testing, reuse the same scripts in the same rack to run 7 db820c boards (#4/8 is failing in the bootloader for unknown reasons). Reviewed-by: Kristian H. Kristensen <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4177>