mesa.git - Unnamed repository; edit this file 'description' to name the repository.

	Commit message (Collapse)	Author	Age	Files	Lines
*	gallium/swr: Fix crash when use GL_TDFX_texture_compression_FXT1 format.	Krzysztof Raszkowski	2019-12-03	1	-1/+3
\| \| \| \| \| \| \|	Reject the new formats in swr to prevent crashes because it doesn't know how to handle the new formats. Reviewed-by: Jan Zielinski <[email protected]>
*	anv: Set up SBE_SWIZ properly for gl_Viewport	Jason Ekstrand	2019-12-03	1	-2/+2
\| \| \| \| \| \| \| \| \|	gl_Viewport is also in the VUE header so we need to whack the read offset to 0 and emit a default (no overrides) SBE_SWIZ entry in that case as well. Cc: [email protected] Reviewed-by: Lionel Landwerlin <[email protected]>
*	ac/llvm: fix atomic var operations if source isn't a deref	Samuel Pitoiset	2019-12-03	1	-7/+9
\| \| \| \| \| \| \| \|	Fixes some CTS regressions. Fixes: e61a826f396 ("ac/llvm: fix pointer type for global atomics") Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	gallivm/llvmpipe: add support for front facing in sysval.	Dave Airlie	2019-12-03	6	-1/+14
\| \| \| \| \| \| \| \|	This wires up the front facing value as a sysval, I'd like to remove the other facing code but I'd need to confirm VMware don't use it first. Reviewed-by: Marek Olšák <[email protected]>
*	llvmpipe/images: handle undefined atomic without crashing	Dave Airlie	2019-12-03	1	-2/+10
\| \| \| \| \| \|	just return 0 for unbound atomic operations. Reviewed-by: Marek Olšák <[email protected]>
*	panfrost: Remove blend shader hack	Alyssa Rosenzweig	2019-12-03	2	-5/+1
\| \| \| \| \| \|	This is no longer used. Signed-off-by: Alyssa Rosenzweig <[email protected]>
*	panfrost: White list the Mali T720	Tomeu Vizoso	2019-12-03	1	-0/+1
\| \| \| \| \| \| \|	Support for this GPU is equal now to that of T760, so whitelist it. Signed-off-by: Tomeu Vizoso <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
*	pan/midgard: Splatter on fragment out	Alyssa Rosenzweig	2019-12-03	1	-1/+20
\| \| \| \| \| \| \|	Make sure that the fragment is complete when writing it out. Signed-off-by: Alyssa Rosenzweig <[email protected]> Signed-off-by: Tomeu Vizoso <[email protected]>
*	panfrost: Simplify shader patching	Tomeu Vizoso	2019-12-03	1	-41/+19
\| \| \| \| \| \| \|	We need to always upload anyway. Signed-off-by: Tomeu Vizoso <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
*	panfrost: Simplify draw_flags	Alyssa Rosenzweig	2019-12-03	1	-11/+2
\| \| \| \| \| \| \| \| \|	Fixes dEQP-GLES3.functional.primitive_restart.*. Note the 0x18000 value is accidentally somehow enabling primitive restart for some reason. I'm not sure where this value came from but let's not. Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Tomeu Vizoso <[email protected]>
*	panfrost: Implement pan_tiler for non-hierarchy GPUs	Alyssa Rosenzweig	2019-12-03	6	-136/+106
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The algorithm is as described. Nothing fancy here, just need to add some new code paths depending on which model we're running on. Tomeu: - Also disable tiling when !hierarchy and !vertex_count - Avoid creating polygon lists smaller than the minimum when vertex_count > 0 but tile size smaller than 16 byte - Take into account tile size when calculating polygon list size for !hierarchy - Allow 0-sized tiles in a single dimension Signed-off-by: Alyssa Rosenzweig <[email protected]> Signed-off-by: Tomeu Vizoso <[email protected]>
*	panfrost: Add information about T720 tiling	Alyssa Rosenzweig	2019-12-03	1	-1/+46
\| \| \| \| \| \| \|	We've figured out most of the big pieces, and though it looks faintly like other Midgards, it's much simpler. Signed-off-by: Alyssa Rosenzweig <[email protected]>
*	panfrost: Add quirks system to cmdstream	Tomeu Vizoso	2019-12-03	7	-13/+83
\| \| \| \| \| \| \| \|	Similarly to how it's already done in the compiler, add a way to express differences between GPU models that need to be taken into account when assembling the cmdstream. Signed-off-by: Tomeu Vizoso <[email protected]>
*	nir/algebraic: Rearrange bcsel sequences generated by nir_opt_peephole_select	Ian Romanick	2019-12-02	1	-0/+53
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Reviewed-by: Matt Turner <[email protected]> All Intel platforms had similar results. (Ice Lake shown) total instructions in shared programs: 14660366 -> 14653437 (-0.05%) instructions in affected programs: 316166 -> 309237 (-2.19%) helped: 905 HURT: 10 helped stats (abs) min: 1 max: 36 x̄: 7.67 x̃: 6 helped stats (rel) min: 0.13% max: 18.75% x̄: 4.28% x̃: 3.60% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.10% max: 1.33% x̄: 0.70% x̃: 0.97% 95% mean confidence interval for instructions value: -7.91 -7.23 95% mean confidence interval for instructions %-change: -4.46% -3.99% Instructions are helped. total cycles in shared programs: 228571646 -> 228549759 (<.01%) cycles in affected programs: 56239919 -> 56218032 (-0.04%) helped: 681 HURT: 216 helped stats (abs) min: 1 max: 5156 x̄: 45.49 x̃: 10 helped stats (rel) min: <.01% max: 10.45% x̄: 1.29% x̃: 0.65% HURT stats (abs) min: 1 max: 320 x̄: 42.09 x̃: 14 HURT stats (rel) min: <.01% max: 37.04% x̄: 1.38% x̃: 0.49% 95% mean confidence interval for cycles value: -41.51 -7.29 95% mean confidence interval for cycles %-change: -0.80% -0.49% Cycles are helped. LOST: 1 GAINED: 0
*	nir/algebraic: Simplify some Inf and NaN avoidance code	Ian Romanick	2019-12-02	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since a is non-negative, neither fsqrt nor frsq should return NaN. frsq should only return Inf when fsqrt returns 0. The changes are pretty small, but this turns a few hundred hurt shaders in the next patch into helped shaders. An alternative to the intBitsToFloat is to import numpy and do np.finfo(np.float32).max. That's more explicit, but we may also want to have specific bit encodings of float values later. I could be convinced either way, but intBitsToFloat(0x7f7fffff) was what I implemented first. Reviewed-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Matt Turner <[email protected]> All Gen7+ platforms had similar results. (Ice Lake shown) total instructions in shared programs: 14661140 -> 14661104 (<.01%) instructions in affected programs: 7520 -> 7484 (-0.48%) helped: 36 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.32% max: 0.61% x̄: 0.49% x̃: 0.52% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -0.52% -0.47% Instructions are helped. total cycles in shared programs: 228585416 -> 228584806 (<.01%) cycles in affected programs: 56321 -> 55711 (-1.08%) helped: 32 HURT: 0 helped stats (abs) min: 2 max: 98 x̄: 19.06 x̃: 10 helped stats (rel) min: 0.08% max: 6.41% x̄: 1.09% x̃: 0.65% 95% mean confidence interval for cycles value: -28.32 -9.80 95% mean confidence interval for cycles %-change: -1.63% -0.54% Cycles are helped. Sandy Bridge total cycles in shared programs: 152991077 -> 152991075 (<.01%) cycles in affected programs: 11525 -> 11523 (-0.02%) helped: 2 HURT: 2 helped stats (abs) min: 2 max: 4 x̄: 3.00 x̃: 3 helped stats (rel) min: 0.07% max: 0.11% x̄: 0.09% x̃: 0.09% HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 0.08% max: 0.08% x̄: 0.08% x̃: 0.08% 95% mean confidence interval for cycles value: -5.27 4.27 95% mean confidence interval for cycles %-change: -0.16% 0.15% Inconclusive result (value mean confidence interval includes 0). No changes on Iron Lake or GM45.
*	intel/compiler: Increase nir_opt_peephole_select threshold	Ian Romanick	2019-12-02	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I tried 2, 4, 6, 8, and 10. 8 seemed to be the sweet spot across all Intel platforms. Reviewed-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Matt Turner <[email protected]> All Gen7+ platforms had similar results. (Ice Lake shown) total instructions in shared programs: 14736141 -> 14661140 (-0.51%) instructions in affected programs: 2272413 -> 2197412 (-3.30%) helped: 8416 HURT: 140 helped stats (abs) min: 1 max: 1152 x̄: 8.99 x̃: 6 helped stats (rel) min: 0.13% max: 42.55% x̄: 4.15% x̃: 3.20% HURT stats (abs) min: 1 max: 140 x̄: 4.73 x̃: 1 HURT stats (rel) min: 0.03% max: 3.44% x̄: 0.87% x̃: 0.60% 95% mean confidence interval for instructions value: -9.36 -8.17 95% mean confidence interval for instructions %-change: -4.14% -3.99% Instructions are helped. total cycles in shared programs: 231560416 -> 228585416 (-1.28%) cycles in affected programs: 126536021 -> 123561021 (-2.35%) helped: 7092 HURT: 1898 helped stats (abs) min: 1 max: 419320 x̄: 519.02 x̃: 159 helped stats (rel) min: <.01% max: 77.25% x̄: 13.52% x̃: 11.77% HURT stats (abs) min: 1 max: 14518 x̄: 371.91 x̃: 36 HURT stats (rel) min: <.01% max: 103.23% x̄: 5.92% x̃: 2.55% 95% mean confidence interval for cycles value: -514.34 -147.50 95% mean confidence interval for cycles %-change: -9.69% -9.14% Cycles are helped. total spills in shared programs: 5763 -> 5848 (1.47%) spills in affected programs: 1797 -> 1882 (4.73%) helped: 13 HURT: 13 total fills in shared programs: 17163 -> 16931 (-1.35%) fills in affected programs: 7214 -> 6982 (-3.22%) helped: 22 HURT: 19 total sends in shared programs: 730410 -> 730246 (-0.02%) sends in affected programs: 2705 -> 2541 (-6.06%) helped: 114 HURT: 0 helped stats (abs) min: 1 max: 4 x̄: 1.44 x̃: 1 helped stats (rel) min: 0.60% max: 20.00% x̄: 7.26% x̃: 5.88% 95% mean confidence interval for sends value: -1.55 -1.33 95% mean confidence interval for sends %-change: -7.90% -6.62% Sends are helped. LOST: 4 GAINED: 0 Sandy Bridge total instructions in shared programs: 10760511 -> 10724637 (-0.33%) instructions in affected programs: 961305 -> 925431 (-3.73%) helped: 3734 HURT: 110 helped stats (abs) min: 1 max: 151 x̄: 9.66 x̃: 8 helped stats (rel) min: 0.14% max: 41.21% x̄: 4.93% x̃: 3.95% HURT stats (abs) min: 1 max: 20 x̄: 1.68 x̃: 1 HURT stats (rel) min: 0.12% max: 5.41% x̄: 0.88% x̃: 0.52% 95% mean confidence interval for instructions value: -9.76 -8.91 95% mean confidence interval for instructions %-change: -4.90% -4.63% Instructions are helped. total cycles in shared programs: 153359411 -> 152991077 (-0.24%) cycles in affected programs: 11615401 -> 11247067 (-3.17%) helped: 2725 HURT: 1138 helped stats (abs) min: 1 max: 2844 x̄: 164.27 x̃: 80 helped stats (rel) min: 0.02% max: 48.60% x̄: 7.47% x̃: 3.91% HURT stats (abs) min: 1 max: 4351 x̄: 69.69 x̃: 25 HURT stats (rel) min: 0.02% max: 40.00% x̄: 3.39% x̃: 1.47% 95% mean confidence interval for cycles value: -103.18 -87.52 95% mean confidence interval for cycles %-change: -4.57% -3.97% Cycles are helped. total sends in shared programs: 584038 -> 583855 (-0.03%) sends in affected programs: 3512 -> 3329 (-5.21%) helped: 157 HURT: 0 helped stats (abs) min: 1 max: 5 x̄: 1.17 x̃: 1 helped stats (rel) min: 2.38% max: 25.00% x̄: 6.52% x̃: 6.06% 95% mean confidence interval for sends value: -1.26 -1.07 95% mean confidence interval for sends %-change: -7.17% -5.87% Sends are helped. LOST: 23 GAINED: 0 Iron Lake and GM45 had similar results. (Iron Lake shown) total instructions in shared programs: 8122617 -> 8111592 (-0.14%) instructions in affected programs: 380503 -> 369478 (-2.90%) helped: 912 HURT: 86 helped stats (abs) min: 1 max: 129 x̄: 12.19 x̃: 9 helped stats (rel) min: 0.30% max: 39.21% x̄: 3.69% x̃: 2.57% HURT stats (abs) min: 1 max: 2 x̄: 1.05 x̃: 1 HURT stats (rel) min: 0.12% max: 3.64% x̄: 0.54% x̃: 0.36% 95% mean confidence interval for instructions value: -12.00 -10.10 95% mean confidence interval for instructions %-change: -3.56% -3.10% Instructions are helped. total cycles in shared programs: 188509780 -> 188534398 (0.01%) cycles in affected programs: 7211542 -> 7236160 (0.34%) helped: 859 HURT: 132 helped stats (abs) min: 2 max: 690 x̄: 46.59 x̃: 16 helped stats (rel) min: 0.01% max: 26.76% x̄: 1.53% x̃: 0.33% HURT stats (abs) min: 2 max: 1592 x̄: 489.67 x̃: 618 HURT stats (rel) min: 0.03% max: 185.92% x̄: 23.35% x̃: 6.26% 95% mean confidence interval for cycles value: 9.58 40.10 95% mean confidence interval for cycles %-change: 0.65% 2.93% Cycles are HURT.
*	nir/opt_peephole_select: Don't count some unary operations	Ian Romanick	2019-12-02	1	-1/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In many cases, fsat, fneg, fabs, ineg, and iabs will get folded into another instruction as either source or destination modifiers. Counting them as instructions means that some if-statements won't get converted to selects. For example, vec1 32 ssa_25 = flt32 ssa_0, ssa_23.x /* succs: block_1 block_2 / if ssa_25 { block block_1: / preds: block_0 / vec1 32 ssa_26 = fabs ssa_24 vec1 32 ssa_27 = fneg ssa_26 vec1 32 ssa_28 = fabs ssa_20 vec1 32 ssa_29 = fneg ssa_28 vec1 32 ssa_30 = fmul ssa_27, ssa_29 vec1 32 ssa_31 = fsat ssa_30 / succs: block_3 / } else { block block_2: / preds: block_0 / / succs: block_3 / } block block_3: / preds: block_1 block_2 */ block_1 isn't really 6 instructions, but it will be counted that way. Most callers of the peephole_select pass use either 1 or 8. It's very easy to blow way past either of these limits with things that are really only one or two actual instructions. I also tried some fancier things like making sure the fsat was of another SSA def from the same block, but the simple test was actually better. The i965 back-end SEL peephole pass still helps ~700 shaders in shader-db with this change. Reviewed-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Matt Turner <[email protected]> All Gen6+ platforms had similar results. (Ice Lake shown) total instructions in shared programs: 14743694 -> 14738910 (-0.03%) instructions in affected programs: 156575 -> 151791 (-3.06%) helped: 1204 HURT: 0 helped stats (abs) min: 1 max: 27 x̄: 3.97 x̃: 3 helped stats (rel) min: 0.15% max: 19.57% x̄: 5.15% x̃: 4.55% 95% mean confidence interval for instructions value: -4.12 -3.82 95% mean confidence interval for instructions %-change: -5.35% -4.95% Instructions are helped. total cycles in shared programs: 231749141 -> 231602916 (-0.06%) cycles in affected programs: 2818975 -> 2672750 (-5.19%) helped: 876 HURT: 322 helped stats (abs) min: 2 max: 788 x̄: 180.99 x̃: 220 helped stats (rel) min: <.01% max: 43.82% x̄: 20.75% x̃: 19.44% HURT stats (abs) min: 1 max: 1188 x̄: 38.27 x̃: 20 HURT stats (rel) min: 0.09% max: 102.67% x̄: 5.17% x̃: 1.70% 95% mean confidence interval for cycles value: -130.47 -113.64 95% mean confidence interval for cycles %-change: -14.85% -12.72% Cycles are helped. total sends in shared programs: 730495 -> 730491 (<.01%) sends in affected programs: 46 -> 42 (-8.70%) helped: 2 HURT: 0 Iron Lake and GM45 had similar results. (Iron Lake shown) total instructions in shared programs: 8122757 -> 8122617 (<.01%) instructions in affected programs: 14716 -> 14576 (-0.95%) helped: 46 HURT: 1 helped stats (abs) min: 1 max: 8 x̄: 3.07 x̃: 3 helped stats (rel) min: 0.36% max: 10.00% x̄: 2.54% x̃: 1.06% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 1.59% max: 1.59% x̄: 1.59% x̃: 1.59% 95% mean confidence interval for instructions value: -3.42 -2.54 95% mean confidence interval for instructions %-change: -3.28% -1.62% Instructions are helped. total cycles in shared programs: 188510100 -> 188509780 (<.01%) cycles in affected programs: 58994 -> 58674 (-0.54%) helped: 32 HURT: 1 helped stats (abs) min: 2 max: 96 x̄: 10.06 x̃: 6 helped stats (rel) min: 0.05% max: 15.29% x̄: 1.37% x̃: 0.31% HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 0.68% max: 0.68% x̄: 0.68% x̃: 0.68% 95% mean confidence interval for cycles value: -16.34 -3.06 95% mean confidence interval for cycles %-change: -2.46% -0.15% Cycles are helped.
*	iris: Allow max dynamic pool size of 2GB for gen12	Jordan Justen	2019-12-02	1	-1/+9
\| \| \| \| \| \| \| \| \| \| \| \|	Reworks: * Adjust comment to list the state packets that curro found to be affected. Fixes: 8125d7960b6 ("intel/dev: Add preliminary device info for Tigerlake") Cc: 19.3 <[email protected]> Signed-off-by: Jordan Justen <[email protected]> Acked-by: Kenneth Graunke <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
*	radeonsi/gfx10: fix the vertex order for triangle strips emitted by a GS	Marek Olšák	2019-12-02	1	-3/+53
\| \| \| \|	Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
*	radeonsi/gfx10: simplify some duplicated NGG GS code	Marek Olšák	2019-12-02	1	-62/+34
\| \| \| \|	Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
*	util/u_thread: don't restrict u_thread_get_time_nano() to __linux__	Jonathan Gray	2019-12-02	1	-1/+1
\| \| \| \| \| \| \| \|	pthread_getcpuclockid() and clock_gettime() are also available on at least OpenBSD, FreeBSD, NetBSD, DragonFly, Cygwin. Signed-off-by: Jonathan Gray <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
*	util/futex: use futex syscall on OpenBSD	Jonathan Gray	2019-12-02	1	-0/+18
\| \| \| \| \| \| \|	Make use of the futex syscall added in OpenBSD 6.2. Signed-off-by: Jonathan Gray <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
*	meson: Add a "prefer_iris" build option	Kenneth Graunke	2019-12-02	1	-3/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Enabling this option makes Intel Gen8-11 hardware load the 'iris' driver by default instead of the older 'i965' driver. Regardless of how this option is set, users can still override which driver the loader selects via two methods. The first is to create a ~/.drirc or /etc/drirc file with the following snippet: <driconf> <device driver="loader" kernel_driver="i915"> <option name="dri_driver" value="i965" /> </device> </driconf> The other option is to set an environment variable: export MESA_LOADER_DRIVER_OVERRIDE=i965 For now, "prefer_iris" defaults to i965 (the historical choice). A separate future patch will change the default driver to iris. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1893 Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Tapani Pälli <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
*	turnip: fix display wsi fence timing out	Jonathan Marek	2019-12-02	1	-0/+11
\| \| \| \| \| \| \|	Fixes: df9f2adf ("turnip: add display wsi") Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
*	nir/lower_io_to_vector: don't create arrays when not needed	Rhys Perry	2019-12-02	1	-1/+7
\| \| \| \| \| \| \| \| \| \| \| \|	Some backends require that there are no array varyings. If there were no arrays in the input shader, the pass shouldn't have to create new ones. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2103 Fixes: bcd14756eec ('nir/lower_io_to_vector: add flat mode') Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Connor Abbott <[email protected]>
*	aco: fix block_kind_discard s_andn2 definition to exec	Rhys Perry	2019-12-02	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \|	Improves generated code of dEQP-VK.graphicsfuzz.disc-and-add-in-func-in-loop because a loop exit phi can then be fixed to exec, removing copies and improving jump threading. No pipeline-db changes. Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Daniel Schürmann <[email protected]>
*	aco: handle loop exit and IF merge phis with break/discard	Rhys Perry	2019-12-02	2	-54/+80
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	ACO considers discards jumps and creates edges in the CFG for them but NIR does neither of these. This can be fixed instead by keeping track of whether a side of an IF had a break/discard, but this doesn't solve the issue with discards affecting loop exit phis. So this reworks phi handling a bit. Fixes these tests: dEQP-VK.graphicsfuzz.disc-and-add-in-func-in-loop dEQP-VK.graphicsfuzz.loop-call-discard dEQP-VK.graphicsfuzz.complex-nested-loops-and-call Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Daniel Schürmann <[email protected]>
*	aco: validate the CFG	Rhys Perry	2019-12-02	1	-0/+31
\| \| \| \| \|	Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Daniel Schürmann <[email protected]>
*	mesa/main/util: moving gallium u_mm to util, remove main/mm	Alejandro Piñeiro	2019-12-02	11	-387/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Right now there are two copies of mm: * mesa/main/mm.[ch] * gallium/auxiliary/util/u_mm.[ch] At some point they splitted, and from the commit message it was not clear why it was not possible to have only one copy at a common place. Taking into account that was several years ago, Im assuming that it was not possible then. This change would allow to have one copy of the same code, and also being able to use that code out of mesa/main or gallium, if needed. This commit moves u_mm and removes mm, as u_mm has slightly more changes. Reviewed-by: Jose Fonseca <[email protected]>
*	radv: set writes_memory for global memory stores/atomics	Rhys Perry	2019-12-02	1	-8/+25
\| \| \| \| \| \|	Fixes: 13ab63bb62b ('radv: Implement VK_EXT_buffer_device_address.') Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	ac/llvm: improve sync scope for global atomics	Rhys Perry	2019-12-02	1	-0/+3
\| \| \| \| \| \| \|	Stronger ordering is implemented in SPIRV->NIR with barriers. Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
*	ac/llvm: fix pointer type for global atomics	Rhys Perry	2019-12-02	1	-0/+6
\| \| \| \| \|	Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
*	iris: Map FXT1 texture formats	Kenneth Graunke	2019-12-01	1	-0/+2
\| \| \| \| \| \| \| \| \|	This exposes GL_TDFX_texture_compression_FXT1 support. It's ancient, only Intel GPUs appear to support it, and I seriously doubt anybody uses it. But i965 supports it, and it's trivial to do, so we may as well support it in the new iris driver as well. Reviewed-by: Eric Anholt <[email protected]>
*	st/mesa: Add GL_TDFX_texture_compression_FXT1 support	Kenneth Graunke	2019-12-01	5	-5/+11
\| \| \| \| \| \| \| \| \| \| \| \| \|	Eric recently added PIPE_FORMAT_FXT1_RGB[A] as part of his format unification work. This was really most of the work of implementing the extension. We just need to handle it in a couple of places and expose the extension. v2: Reject the new formats in llvmpipe_is_format_supported to prevent crashes because it doesn't know how to handle the new formats. Reviewed-by: Marek Olšák <[email protected]> [v1] Reviewed-by: Eric Anholt <[email protected]> [v1]
*	nir/samplers: don't zero samplers_used/txf.	Dave Airlie	2019-12-02	1	-3/+0
\| \| \| \| \| \| \| \| \| \| \|	This allows this pass to be run multiple times and the results are just or'ed together. It fixes on test on llvmpipe nir, and regresses none. Suggested by Kenneth Reviewed-by: Marek Olšák <[email protected]>
*	aco: drop useless lowering of deref operations for shared memory	Samuel Pitoiset	2019-11-29	1	-5/+1
\| \| \| \| \| \| \|	Moved to RADV. No pipeline-db changes. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Daniel Schürmann <[email protected]>
*	radv,ac/nir: lower deref operations for shared memory	Samuel Pitoiset	2019-11-29	2	-22/+45
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This shouldn't introduce any functional changes for RadeonSI when NIR is enabled because these operations are already lowered. pipeline-db (NAVI10/LLVM): SGPRS: 9043 -> 9051 (0.09 %) VGPRS: 7272 -> 7292 (0.28 %) Code Size: 638892 -> 621628 (-2.70 %) bytes LDS: 1333 -> 1331 (-0.15 %) blocks Max Waves: 1614 -> 1608 (-0.37 %) Found this while glancing at some F12019 shaders. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	aco: fix a couple of value numbering issues	Daniel Schürmann	2019-11-29	1	-11/+18
\| \| \| \| \| \|	Fixes: 3a20ef4a3299fddc886f9d5908d8b3952dd63a54 'aco: refactor value numbering' Reviewed-by: Rhys Perry <[email protected]>
*	aco: don't split live-ranges of linear VGPRs	Daniel Schürmann	2019-11-29	1	-4/+11
\| \| \| \| \| \|	Fixes: 93c8ebfa780ebd1495095e794731881aef29e7d3 'aco: Initial commit of independent AMD compiler' Reviewed-by: Rhys Perry <[email protected]>
*	aco: implement global atomics	Rhys Perry	2019-11-29	2	-0/+107
\| \| \| \| \|	Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Daniel Schürmann <[email protected]>
*	aco: improve FLAT/GLOBAL scheduling	Rhys Perry	2019-11-29	5	-14/+30
\| \| \| \| \|	Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Daniel Schürmann <[email protected]>
*	aco: don't enable store_global for helper invocations	Rhys Perry	2019-11-29	4	-0/+8
\| \| \| \| \|	Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Daniel Schürmann <[email protected]>
*	aco: fix SADDR with FLAT on GFX10	Rhys Perry	2019-11-29	1	-1/+1
\| \| \| \| \| \| \| \|	The reference guide is incorrect and SADDR is actually used with FLAT on GFX10. Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Daniel Schürmann <[email protected]>
*	aco: fix assembly of FLAT/GLOBAL atomics	Rhys Perry	2019-11-29	1	-1/+1
\| \| \| \| \| \| \|	They can take both a definition and data operand Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Daniel Schürmann <[email protected]>
*	aco: fix GFX10 opcodes for some global/flat atomics	Rhys Perry	2019-11-29	1	-6/+6
\| \| \| \| \|	Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Daniel Schürmann <[email protected]>
*	aco: improve WAR hazard workaround with >64bit stores	Rhys Perry	2019-11-29	1	-9/+15
\| \| \| \| \|	Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Daniel Schürmann <[email protected]>
*	aco: add v_nop inbetween exec write and VMEM/DS/FLAT	Rhys Perry	2019-11-29	1	-5/+8
\| \| \| \| \| \| \| \|	LLVM and the proprietary compiler seem to do this Fixes: b01847bd9 ("aco/gfx10: Fix mitigation of VMEMtoScalarWriteHazard.") Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Daniel Schürmann <[email protected]>
*	aco: fix incorrect cast in parse_wait_instr()	Rhys Perry	2019-11-29	1	-1/+1
\| \| \| \| \| \| \|	s_waitcnt is SOPP, not SOPK Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Daniel Schürmann <[email protected]>
*	aco: fix i2i64	Rhys Perry	2019-11-29	1	-2/+6
\| \| \| \| \| \|	Fixes: 93c8ebfa ('aco: Initial commit of independent AMD compiler') Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Daniel Schürmann <[email protected]>
*	aco: propagate p_wqm on an image_sample's coordinate p_create_vector	Rhys Perry	2019-11-29	1	-9/+12
\| \| \| \| \| \| \|	Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2156 Fixes: 93c8ebfa780 ('aco: Initial commit of independent AMD compiler') Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Daniel Schürmann <[email protected]>