mesa.git

	Commit message	Author	Age	Files	Lines
*	glsl: replace 'x + (-x)' with constant 0	Pierre-Eric Pelloux-Prayer	2019-08-30	1	-0/+12
	This fixes a hang in shadertoy for radeonsi where a buffer was initialized with: value -= value with value being undefined. In this case LLVM replaces the operation with an assignment to NaN. Cc: 19.1 19.2 <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111241 Reviewed-by: Marek Olšák <[email protected]> (cherry picked from commit 47cc660d9c19572e5ef2dce7c8ae1766a2ac9885)
*	nir/algrbraic: Don't optimize open-coded bitfield reverse when lowering is enabled	Ian Romanick	2019-08-29	1	-1/+1
	This caused a problem on Sandybridge where an open-coded bitfieldReverse() function could be optimized to a nir_op_bitfield_reverse that would generate an unsupported BFREV instruction in the backend. This was encountered in some Unreal4 tech demos in shader-db. The bug was not previously noticed because we don't actually try to run those demos on Sandybridge. The fixes tag is a bit of a lie. The actual bug was introduced about 26,000 commits earlier in 371c4b3c48f ("nir: Recognize open-coded bitfield_reverse."). Without the NIR lowering pass, the flag needed to avoid the optimization does not exist. Hopefully nobody will care to fix this on an earlier Mesa release. Reviewed-by: Matt Turner <[email protected]> Fixes: 7afa26d4e39 ("nir: Add lowering for nir_op_bitfield_reverse.") (cherry picked from commit d3fd1c761aab01e06665180ab86c9528c0b285b2)
*	nir/loop_unroll: Prepare loop for unrolling in wrapper_unroll	Danylo Piliaiev	2019-08-23	1	-25/+1
	Without loop_prepare_for_unroll, loops lose their phis. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111411 Fixes: 5db98195 "nir: add loop unroll support for wrapper loops" Signed-off-by: Danylo Piliaiev <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> (cherry picked from commit 84b3ef6a96eabc28b18e8cdf1b0d61826b1a8a67)
*	nir/lcssa: handle deref instructions properly	Daniel Schürmann	2019-08-23	2	-14/+26
	Reviewed-by: Jason Ekstrand <[email protected]> Fixes: 414148cdc124 "nir: Support deref instructions in loop_analyze" (cherry picked from commit 204846ad062fe4e154406fa2d9093cdab4461ea2)
*	nir: remove explicit nir_intrinsic_index_flag values	Eric Engestrom	2019-08-01	1	-2/+2
	These were left after a rebase and happen to make NIR_INTRINSIC_SWIZZLE_MASK == NIR_INTRINSIC_SRC_ACCESS, which is how it was noticed. Fixes: 6f20643b471a851c936f ("nir: Allow qualifiers on copy_deref and image instructions") Cc: Connor Abbott <[email protected]> Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Reviewed-by: Eric Anholt <[email protected]> (cherry picked from commit 5d7bcac4e711bc278eabf198d7d5016b77d9eb0e)
*	spirv: don't discard access set by vtn_pointer_dereference	Lionel Landwerlin	2019-07-31	1	-1/+1
	We can have an access flag already set here, so just augment the existing ones. Signed-off-by: Lionel Landwerlin <[email protected]> Fixes: 0fb61dfdeb ("spirv: propagate access qualifiers through ssa & pointer") Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> (cherry picked from commit 7deb5ec0e89769382fb5dd86aa5305001ae413fa)
*	spirv: propagate access qualifiers through ssa & pointer	Lionel Landwerlin	2019-07-30	3	-4/+62
	Not only variables can be flagged as NonUniformEXT but also expressions. We're currently ignoring it in an expression such as: imageLoad(data[nonuniformEXT(rIndex)], 0) The associated SPIRV: OpDecorate %69 NonUniformEXT ... %69 = OpLoad %61 %68 This change propagates access qualifiers through ssa & pointers so that when it hits OpLoad/OpStore style instructions, qualifiers are not forgotten. Fixes failures in the following tests: dEQP-VK.descriptor_indexing.* Signed-off-by: Lionel Landwerlin <[email protected]> Fixes: 8ed583fe523703 ("spirv: Handle the NonUniformEXT decoration") Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> (cherry picked from commit 0fb61dfdebac802e4b4c7b5dbebf3d7ba1e60dc2)
*	spirv: wrap push ssa/pointer values	Lionel Landwerlin	2019-07-30	4	-64/+83
	This refactor allows common code to apply decorations on all ssa/pointer values. In particular, this will allow propagating access qualifiers. Signed-off-by: Lionel Landwerlin <[email protected]> Suggested-by: Caio Marcelo de Oliveira Filho <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> (cherry picked from commit 86b53770e1ea6e79452ccc97bab829ad58ffc5fd) [Lionel Landwerlin: patch adapted for 19.1 branch]
*	nir: Allow qualifiers on copy_deref and image instructions	Connor Abbott	2019-07-30	6	-12/+48
	In the next commit, we'll properly handle access qualifiers on struct members by propagating them to load/store instructions, but these instructions had no way to specify the qualifier. Reviewed-by: Timothy Arceri <[email protected]> (cherry picked from commit 6f20643b471a851c936fc8b569cf05dcd6e6e7fe)
*	nir: add access to image_deref intrinsics	Lionel Landwerlin	2019-07-29	1	-1/+3
	SPIRV added the ability to access variables and have expressions that are not dynamically uniform, and because spirv_to_nir generates deref instructions, we'll need to have that access there. Signed-off-by: Lionel Landwerlin <[email protected]> Cc: <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> (cherry picked from commit 8c330728f3094f2c836e022e57f003d0c82953ef) [Juan A. Suarez: resolve trivial conflicts] Signed-off-by: Juan A. Suarez Romero <[email protected]> Conflicts: src/compiler/nir/nir.c
*	spirv: Fix order of barriers in SpvOpControlBarrier	Daniel Schürmann	2019-07-25	1	-4/+4
	Semantically, the memory barrier has to come first to wait for the completion of pending memory requests. Afterwards, the workgroups can be synchronized. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> (cherry picked from commit e352b4d650d37730e5087792b9a74ef31d1974ab)
*	nir: don't return void	Eric Engestrom	2019-07-24	1	-1/+2
	Fixes: 14531d676b11999123c0 ("nir: make nir_const_value scalar") Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Karol Herbst <[email protected]> (cherry picked from commit 3acc4278ad4138ad3a914085aefd7c47d46e1ad4)
*	nir/loop_analyze: Properly handle swizzles in loop conditions	Jason Ekstrand	2019-07-18	1	-140/+149
	This commit re-plumbs all of nir_loop_analyze to use nir_ssa_scalar for all intermediate values so that we can properly handle swizzles. Even though if conditions are required to be scalars, they may still consume swizzles so you could have ((a.yzw < b.zzx).xz && c.xx).y == 0 as your loop termination condition. The old code would just bail the moment it saw its first non-zero swizzle but we can now properly chase the scalar from the if condition all the way to a, b, and c. Shader-db results on Kaby Lake: total loops in shared programs: 4388 -> 4364 (-0.55%) loops in affected programs: 29 -> 5 (-82.76%) helped: 29 HURT: 5 Shader-db results on Haswell: total loops in shared programs: 4370 -> 4373 (0.07%) loops in affected programs: 2 -> 5 (150.00%) helped: 2 HURT: 5 Reviewed-by: Timothy Arceri <[email protected]> (cherry picked from commit ff972c7a3a7e80a426b72f285902d35f6ca3b820)
*	nir: Add some helpers for chasing SSA values properly	Jason Ekstrand	2019-07-18	1	-0/+79
	There are various cases in which we want to chase SSA values through ALU ops ranging from hand-written optimizations to back-end translation code. In all these cases, it can be very tricky to do properly because of swizzles. This set of helpers lets you easily work with a single component of an SSA def and chase through ALU ops safely. Reviewed-by: Timothy Arceri <[email protected]> (cherry picked from commit 8f7405ed9d473c1729d48c5add4f0d9fe147c75a) [Juan A. Suarez: resolve trivial conflicts] Signed-off-by: Juan A. Suarez Romero <[email protected]> Conflicts: src/compiler/nir/nir.h
*	nir/loop_analyze: Refactor detection of limit vars	Jason Ekstrand	2019-07-18	1	-54/+51
	This commit reworks both get_induction_and_limit_vars() and try_find_trip_count_vars_in_iand to return true on success and not modify their output parameters on failure. This makes their callers significantly simpler. Reviewed-by: Timothy Arceri <[email protected]> (cherry picked from commit 0333649e638a38258957fd8b7e0367d73bbc7a80)
*	nir/regs_to_ssa: Handle regs in phi sources properly	Jason Ekstrand	2019-07-17	1	-2/+32
	Sources of phi instructions act as if they occur at the very end of the predecessor block, not the block in which the phi lives. In order to handle them correctly, we have to skip phi sources on the normal instruction walk and handle them as a separate walk over the successor phis. While registers in phi instructions are a bit of an oddity, this can happen when we temporarily go out-of-SSA for control-flow manipulations. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111075 Cc: [email protected] Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> (cherry picked from commit 6fb685fe4b762c8030f86895707516e2481e9ece)
*	spirv: Fix stride calculation when lowering Workgroup to offsets	Caio Marcelo de Oliveira Filho	2019-07-16	1	-1/+1
	Use alignment to calculate the stride associated with the pointer types. That stride is used when the pointers are cast to arrays. Note that size alone is not sufficient, e.g. struct { vec2 a; vec1 b; } will have an element size of 12 bytes, but the stride needs to be 16 bytes to respect the 8 byte alignment. Fixes: 050eb6389a8 "spirv: Ignore ArrayStride in OpPtrAccessChain for Workgroup" Reviewed-by: Jason Ekstrand <[email protected]> (cherry picked from commit 026cfa10995ff3316476fa19507fa27adc531de5)
*	nir,intel: Add support for lowering 64-bit nir_opt_extract_*	Jason Ekstrand	2019-07-16	2	-0/+39
	We need this when doing full software 64-bit emulation. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110309 Fixes: cbad201c2b3 "nir/algebraic: Add missing 64-bit extract_[iu]8..." Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]> (cherry picked from commit 0ba508d7a3b6a006b5b8db1e865d33efc8d0abd5)
*	nir/opt_if: Clean up single-src phis in opt_if_loop_terminator	Jason Ekstrand	2019-07-16	3	-0/+16
	Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111071 Fixes: 2a74296f24ba "nir: add opt_if_loop_terminator()" Reviewed-by: Timothy Arceri <[email protected]> (cherry picked from commit 7a19e05e8c84152af3a15868f5ef781142ac8e23)
*	nir/loop_analyze: Bail if we encounter swizzles	Jason Ekstrand	2019-07-15	1	-0/+22
	None of the current code knows what to do with swizzles. Take the safe option for now and bail if we see one. This does have a small shader-db impact but it is at least safe. Shader-db results on Kaby Lake: total loops in shared programs: 4364 -> 4388 (0.55%) loops in affected programs: 5 -> 29 (480.00%) helped: 5 HURT: 29 Shader-db results on Haswell: total loops in shared programs: 4373 -> 4370 (-0.07%) loops in affected programs: 5 -> 2 (-60.00%) helped: 5 HURT: 2 Fixes: 6772a17acc8ee "nir: Add a loop analysis pass" Reviewed-by: Timothy Arceri <[email protected]> (cherry picked from commit 9a3cb6f5fec040dea4a229b93f789995b36f9c09)
*	nir/loop_analyze: Handle bit sizes correctly in calculate_iterations	Jason Ekstrand	2019-07-15	1	-27/+48
	The current code assumes everything is 32-bit which is very likely true but not guaranteed by any means. Instead, use nir_eval_const_opcode to do the calculations in a bit-size-agnostic way. We also use the new constant constructors to build the correct size constants. Fixes: 6772a17acc8ee "nir: Add a loop analysis pass" Reviewed-by: Timothy Arceri <[email protected]> (cherry picked from commit 268ad47c1115be8a8444d8e0e40af71623f9d281)
*	nir: Add more helpers for working with const values	Jason Ekstrand	2019-07-15	2	-0/+135
	Reviewed-by: Timothy Arceri <[email protected]> (cherry picked from commit ce5581e23e54be91e4c1ad6a6c5990eca6677ceb)
*	nir/loop_analyze: Fix phi-of-identical-alu detection	Jason Ekstrand	2019-07-15	1	-26/+29
	One issue was that the original version didn't check that swizzles matched when comparing ALU instructions so it could end up matching very different instructions. Using the nir_instrs_equal function from nir_instr_set.c which we use for CSE should be much more reliable. Another was that the loop assumes it will only run two iterations which may not be true. If there's something which guarantees that this case only happens for phis after ifs, it wasn't documented. Fixes: 9e6b39e1d521 "nir: detect more induction variables" Reviewed-by: Timothy Arceri <[email protected]> (cherry picked from commit 9f7ffe41dd185487479ea8846df1f5cdbf1b83a6)
*	nir/instr_set: Expose nir_instrs_equal()	Jason Ekstrand	2019-07-15	2	-59/+62
	Reviewed-by: Timothy Arceri <[email protected]> (cherry picked from commit 6e984bcb92cf5e8b7da7387bc73cf6519ea2f43d)
*	nir: Add a helper to determine if an intrinsic can be reordered	Connor Abbott	2019-07-15	3	-11/+13
	This is simple now, but we're going to be adding a few more conditions to this later. Reviewed-by: Timothy Arceri <[email protected]> (cherry picked from commit a1c737927c0d96f26ce487930aa9a2ed323814c9)
*	nir: Use nir_src_bit_size instead of alu1->dest.dest.ssa.bit_size	Ian Romanick	2019-07-09	2	-1/+218
	This is important because, for example, nir_op_fne has dest.dest.ssa.bit_size == 1, but the source operands can be 16-, 32-, or 64-bits. Fixing this helps partial redundancy elimination for compares in a few more shaders. v2: Add unit tests for nir_opt_comparison_pre that are fixed by this commit. All Intel platforms had similar results. total instructions in shared programs: 17179408 -> 17179081 (<.01%) instructions in affected programs: 43958 -> 43631 (-0.74%) helped: 118 HURT: 2 helped stats (abs) min: 1 max: 5 x̄: 2.87 x̃: 2 helped stats (rel) min: 0.06% max: 4.12% x̄: 1.19% x̃: 0.81% HURT stats (abs) min: 6 max: 6 x̄: 6.00 x̃: 6 HURT stats (rel) min: 5.83% max: 6.06% x̄: 5.94% x̃: 5.94% 95% mean confidence interval for instructions value: -3.08 -2.37 95% mean confidence interval for instructions %-change: -1.30% -0.85% Instructions are helped. total cycles in shared programs: 360959066 -> 360942386 (<.01%) cycles in affected programs: 774274 -> 757594 (-2.15%) helped: 111 HURT: 4 helped stats (abs) min: 1 max: 1591 x̄: 169.49 x̃: 36 helped stats (rel) min: <.01% max: 24.43% x̄: 8.86% x̃: 2.24% HURT stats (abs) min: 1 max: 2068 x̄: 533.25 x̃: 32 HURT stats (rel) min: 0.02% max: 5.10% x̄: 3.06% x̃: 3.56% 95% mean confidence interval for cycles value: -200.61 -89.47 95% mean confidence interval for cycles %-change: -10.32% -6.58% Cycles are helped. Reviewed-by: Jason Ekstrand <[email protected]> [v1] Suggested-by: Jason Ekstrand <[email protected]> Reviewed-by: Matt Turner <[email protected]> Fixes: be1cc3552bc ("nir: Add nir_const_value_negative_equal") (cherry picked from commit 0ac5ff9ecb26ebc07a48e4f15539f975cef9b82a)
*	nir: Add unit tests for nir_opt_comparison_pre	Ian Romanick	2019-07-09	4	-1/+334
	Each test has a comment with the expected before and after NIR. The tests don't actually check this. The tests only check whether or not the optimization pass reported progress. I couldn't think of a robust, future-proof way to check the before and after code. Reviewed-by: Matt Turner <[email protected]> (cherry picked from commit b08d7040518cdf76792952ceef72cadaa54d0179)
*	spirv: Ignore ArrayStride in OpPtrAccessChain for Workgroup	Caio Marcelo de Oliveira Filho	2019-07-03	1	-4/+6
	From OpPtrAccessChain description in the SPIR-V spec (1.4 rev 1): For objects in the Uniform, StorageBuffer, or PushConstant storage classes, the element’s address or location is calculated using a stride, which will be the Base-type’s Array Stride when the Base type is decorated with ArrayStride. For all other objects, the implementation will calculate the element’s address or location. For non-CL shaders the driver should lay out the Workgroup storage class, so override any explicitly set ArrayStride in the shader. This currently fixes only the lower_workgroup_access_to_offsets case, which is used by anv. Reviewed-by: Juan A. Suarez <[email protected]> (cherry picked from commit 050eb6389a8867e6173644fbb6b2d13ad0db454b)
*	glsl: Fix round64 conversion function	Sagar Ghuge	2019-06-26	1	-9/+12
	Fix round64 function to handle round to nearest even cases specially with positive and negative numbers with fraction part 0.5. v2: 1) Simplify unused bits (Elie Tournier) Fixes: KHR-GL45.gpu_shader_fp64.builtin.round_dvec2 KHR-GL45.gpu_shader_fp64.builtin.round_dvec3 KHR-GL45.gpu_shader_fp64.builtin.round_dvec4 KHR-GL45.gpu_shader_fp64.builtin.roundeven_double KHR-GL45.gpu_shader_fp64.builtin.roundeven_dvec2 KHR-GL45.gpu_shader_fp64.builtin.roundeven_dvec3 KHR-GL45.gpu_shader_fp64.builtin.roundeven_dvec4 Signed-off-by: Sagar Ghuge <[email protected]> Reviewed-by: Elie Tournier <[email protected]> Acked-by: Anuj Phogat <[email protected]> (cherry picked from commit 06807e1948f1bced9806b00908c892f1e3c3db5b)
*	glsl: Don't increase the iteration count when there are no terminators	Ian Romanick	2019-06-25	1	-1/+7
	Incrementing the iteration count was intended to fix an off-by-one error when the first terminator was superseded by a later terminator. If there is no first terminator or later terminator, there is no off-by-one error. Incrementing the loop count creates one. This can be seen in loops like: do { if (something) { // No breaks or continues here. } } while (false); Reviewed-by: Timothy Arceri <[email protected]> Tested-by: Abel Briggs <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110953 Fixes: 646621c66da ("glsl: make loop unrolling more like the nir unrolling path") (cherry picked from commit ee1c69faddb3624ace6548dafaff50549a031380)
*	glsl: Fix out of bounds read in shader_cache_read_program_metadata	Kenneth Graunke	2019-06-18	1	-3/+2
	The VaryingNames array has NumVaryings entries. But BufferStride is a small array of MAX_FEEDBACK_BUFFERS (4) entries. Programs with more than 4 varyings would read out of bounds. Also, BufferStride is set based on the shader itself, which means that it's inherently already included in the hash, and doesn't need to be included again. At the point when shader_cache_read_program_metadata is called, the linker hasn't even set those fields yet. So, just drop it entirely. Fixes valgrind errors in KHR-GL45.transform_feedback.linking_errors_test. Fixes: 6d830940f78 glsl/shader_cache: Allow shader cache usage with transform feedback Reviewed-by: Timothy Arceri <[email protected]> (cherry picked from commit 3c10a2726bcf686f03e31e79e40786e3894ff063)
*	nir/propagate_invariant: Don't add NULL vars to the hash table	Jason Ekstrand	2019-06-06	1	-1/+10
	Fixes: 8410cf66d "nir/propagate_invariant: Skip unknown vars" Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Reviewed-by: Eric Anholt <[email protected]> (cherry picked from commit d96878a66a559f6690f01e82f06fcf92ae958d3c)
*	nir: Actually propagate progress in nir_opt_move_load_ubo.	Bas Nieuwenhuizen	2019-06-03	1	-1/+1
	Found with Jason's new metadata rework (https://gitlab.freedesktop.org/mesa/mesa/merge_requests/950). Fixes: af355aaa071 "nir: add nir_opt_move_load_ubo() optimization pass" Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> (cherry picked from commit e24a7840f60ac2290761ea2dc2831e8c3ba8bbfc)
*	nir/dead_cf: Call instructions aren't dead	Jason Ekstrand	2019-05-31	1	-1/+1
	When we inlined cf_node_has_side_effects into node_is_dead, all the conditions flipped and we forgot to flip one. Fortunately, it doesn't matter right now because no one uses this pass on shaders with more than one function. Fixes: b50465d197 "nir/dead_cf: Inline cf_node_has_side_effects" Reviewed-by: Dave Airlie <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> (cherry picked from commit 8948048c6f01209bac0051e41cd84c38853bd251)
*	nir/lower_non_uniform: safely iterate over blocks	Lionel Landwerlin	2019-05-30	1	-1/+1
	This fixes a problem where the same instruction gets replaced twice. This was happening when the replaced instruction would be at the end of a block. Replacement of: if ssa_8 { .... intrinsic bindless_image_store (ssa_44, ssa_16, ssa_0, ssa_15) (5, 0, 34836, 32) /* image_dim=Buf */ /* image_array=false */ /* format=34836 */ /* access=32 */ } Would be: if ssa_8 { loop { vec1 32 ssa_47 = intrinsic read_first_invocation (ssa_44) () vec1 1 ssa_48 = ieq ssa_47, ssa_44 if ssa_48 { loop { vec1 32 ssa_49 = intrinsic read_first_invocation (ssa_44) () vec1 1 ssa_50 = ieq ssa_49, ssa_44 if ssa_50 { intrinsic bindless_image_store (ssa_44, ssa_16, ssa_0, ssa_15) (5, 0, 34836, 32) /* image_dim=Buf */ /* image_array=false */ /* format=34836 */ /* access=32 */ break } else { .... } Signed-off-by: Lionel Landwerlin <[email protected]> Fixes: 3bd545764151 ("nir: Add a lowering pass for non-uniform resource access") Reviewed-by: Jason Ekstrand <[email protected]> (cherry picked from commit 366811bedb67ae7d31a02ea9b1f9fa942fb93602)
*	nir: Fix clone of nir_variable state slots	Caio Marcelo de Oliveira Filho	2019-05-21	1	-3/+5
	When num_state_slots is 0, don't create the array. This was triggering the following assert when running vkcube with NIR_TEST_CLONE=1 vkcube: ../src/compiler/nir/nir_split_per_member_structs.c:66: split_variable: Assertion `var->state_slots == NULL' failed. Fixes: 9fbd390dd4b "nir: Add support for cloning shaders" Reviewed-by: Jason Ekstrand <[email protected]> (cherry picked from commit 005cc9ae37ca45960d87389dc9eace5ed29d1b99)
*	glsl: init packed in more constructors.	Dave Airlie	2019-05-21	1	-6/+6
	src/compiler/glsl_types.cpp:577: uninit_member: Non-static class member "packed" is not initialized in this constructor nor in any functions that it calls. from Coverity. Fixes: 659f333b3a4 (glsl: add packed for struct types) Acked-by: Ilia Mirkin <[email protected]> (cherry picked from commit b2d4d08a5cae29759bdbd4ac4e942ea372fe7735)
*	nir: Fix nir_opt_idiv_const when negatives are involved	Caio Marcelo de Oliveira Filho	2019-05-21	1	-3/+5
	First, allow the case for negative powers of two. Then ensure that we use the absolute value of the non-constant value to calculate the quotient -- this was hinted in the code by the name 'uq'. This fixes an issue when 'd' is positive and 'n' is negative. The ishr will propagate the negative sign and we'll use nir_ineg() again, incorrectly. v2: First version used only ishr, but that isn't sufficient, since it never can produce a zero as a result. (Jason) Allow negative powers of two. (Caio) Fixes: 74492ebad94 "nir: Add a pass for lowering integer division by constants" Reviewed-by: Jason Ekstrand <[email protected]> (cherry picked from commit 8a995f2b5e1e3f2a2eafd32870ebfb43b5cfdf27)
*	nir: lower_non_uniform_access: iterate over instructions safely	Lionel Landwerlin	2019-05-16	1	-1/+1
	This pass moves instructions around and adds control-flow in the middle of blocks. We need to use nir_foreach_instr_safe to ensure that we iterate over instructions correctly anyway. Signed-off-by: Lionel Landwerlin <[email protected]> Fixes: 3bd545764151 ("nir: Add a lowering pass for non-uniform resource access") Reviewed-by: Jason Ekstrand <[email protected]> (cherry picked from commit e04cf0b61269ca60b3260d81d94e625965d39901)
*	nir: fix lower_non_uniform_access pass	Lionel Landwerlin	2019-05-16	1	-0/+1
	Obviously missing the instruction insertion into the SSA list. Signed-off-by: Lionel Landwerlin <[email protected]> Fixes: 3bd545764151 ("nir: Add a lowering pass for non-uniform resource access") Reviewed-by: Jason Ekstrand <[email protected]> (cherry picked from commit 391a836e8fb1c84170f3aa7550f0b347d31528f3)
*	Revert "nir: add late opt to turn inot/b2f combos back to bcsel"	Ian Romanick	2019-05-15	2	-19/+0
	This reverts commit 7acc8652268205a266068ea4d059eccce43e1f78. With these optimizations in place, the extra constant folding added in the next commit extends some live ranges of 0.0 and ±1.0 constants, and that causes several hundred shaders to have more spills and fills. I believe this optimization was made basically irrelevant by 7725d609387 "intel/fs: Emit better code for b2f(inot(a)) and b2i(inot(a))". All Gen7.5+ platforms had similar results. (Ice Lake shown) total instructions in shared programs: 17225303 -> 17224634 (<.01%) instructions in affected programs: 879402 -> 878733 (-0.08%) helped: 679 HURT: 1 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.03% max: 0.93% x̄: 0.24% x̃: 0.05% HURT stats (abs) min: 10 max: 10 x̄: 10.00 x̃: 10 HURT stats (rel) min: 0.45% max: 0.45% x̄: 0.45% x̃: 0.45% 95% mean confidence interval for instructions value: -1.02 -0.95 95% mean confidence interval for instructions %-change: -0.26% -0.22% Instructions are helped. total cycles in shared programs: 360842595 -> 360828542 (<.01%) cycles in affected programs: 110443594 -> 110429541 (-0.01%) helped: 389 HURT: 265 helped stats (abs) min: 1 max: 7525 x̄: 162.81 x̃: 28 helped stats (rel) min: <.01% max: 18.66% x̄: 1.11% x̃: 0.11% HURT stats (abs) min: 1 max: 7614 x̄: 185.96 x̃: 48 HURT stats (rel) min: <.01% max: 25.08% x̄: 0.95% x̃: 0.10% 95% mean confidence interval for cycles value: -75.65 32.67 95% mean confidence interval for cycles %-change: -0.49% -0.06% Inconclusive result (value mean confidence interval includes 0). total spills in shared programs: 12159 -> 12161 (0.02%) spills in affected programs: 13 -> 15 (15.38%) helped: 0 HURT: 1 total fills in shared programs: 25207 -> 25208 (<.01%) fills in affected programs: 25 -> 26 (4.00%) helped: 0 HURT: 1 Ivy Bridge total instructions in shared programs: 12082019 -> 12082013 (<.01%) instructions in affected programs: 1033 -> 1027 (-0.58%) helped: 6 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.41% max: 0.83% x̄: 0.61% x̃: 0.59% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -0.78% -0.45% Instructions are helped. total cycles in shared programs: 179849270 -> 179849157 (<.01%) cycles in affected programs: 4735 -> 4622 (-2.39%) helped: 4 HURT: 0 helped stats (abs) min: 2 max: 74 x̄: 28.25 x̃: 18 helped stats (rel) min: 0.13% max: 6.53% x̄: 2.85% x̃: 2.36% 95% mean confidence interval for cycles value: -82.73 26.23 95% mean confidence interval for cycles %-change: -7.98% 2.28% Inconclusive result (value mean confidence interval includes 0). Sandy Bridge total instructions in shared programs: 10882750 -> 10882748 (<.01%) instructions in affected programs: 266 -> 264 (-0.75%) helped: 2 HURT: 0 Iron Lake total cycles in shared programs: 188609440 -> 188609448 (<.01%) cycles in affected programs: 4320 -> 4328 (0.19%) helped: 0 HURT: 2 GM45 total cycles in shared programs: 129016868 -> 129016872 (<.01%) cycles in affected programs: 2302 -> 2306 (0.17%) helped: 0 HURT: 1 Reviewed-by: Matt Turner <[email protected]> (cherry picked from commit d2a9ba03e30602f040687da325470d72eeddef1a) [Juan: resolve trivial conflicts] Signed-off-by: Juan A. Suarez Romero <[email protected]> Conflicts: src/compiler/nir/nir_opt_algebraic.py
*	mesa: Makefile.sources: Add nir_lower_fb_read.c to Makefile.sources list	John Stultz	2019-05-06	1	-0/+1
	In commit a99c360a4630 (nir: add pass to lower fb reads), a new file was added that needs to also be added to the Makefile.sources list used by the Android and SCons build systems. Cc: Rob Clark <[email protected]> Cc: Emil Velikov <[email protected]> Cc: Amit Pundir <[email protected]> Cc: Sumit Semwal <[email protected]> Cc: Alistair Strachan <[email protected]> Cc: Greg Hartman <[email protected]> Cc: Tapani Pälli <[email protected]> Cc: Jason Ekstrand <[email protected]> Fixes: a99c360a463 ("nir: add pass to lower fb reads") Reviewed-by: Tapani Pälli <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Signed-off-by: John Stultz <[email protected]> (cherry picked from commit c7f2145b4b1551d521de2303b0dc97b56a0e3907)
*	spirv/cl: support vload/vstore	Karol Herbst	2019-05-04	1	-0/+55
	Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	nir: Add nir_op_vec helper	Karol Herbst	2019-05-04	3	-22/+14
	With that we can simplify code where nir vectors are created. v2: merge both lines in nir_vec Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	nir: Add a nir_builder_alu variant which takes an array of components	Karol Herbst	2019-05-04	1	-14/+36
	v2: rename to nir_build_alu_src_arr Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	vtn: handle bitcast with pointer src/dest	Karol Herbst	2019-05-04	3	-29/+45
	v2: use vtn_push_ssa and vtn_ssa_value Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	nir: Add a SSA type gathering pass	Jason Ekstrand	2019-05-04	4	-0/+223
	This new pass (which isn't even compile-tested) attempts to determine the ALU type of all the SSA values in a function impl. It takes a greedy approach and assigns intness or floatness to everything it thinks can possibly contain an int or a float. Some values will be labeled as both int and float and some will be labeled as neither and it is up to the caller to decide what to do with this information. However, for a "nice" shader where the original source contained no bit-casts and no implicit bit-casts were introduced by optimizations, there shouldn't be any overlap in the two sets save for the odd CSEd zero constant. Reviewed-by: Vasily Khoruzhick <[email protected]>
*	nir/algebraic: Don't emit empty initializers for MSVC	Connor Abbott	2019-05-04	1	-0/+4
	Just don't emit the transform array at all if there are no transforms. v2: - Don't use len(array) > 0 (Dylan) - Keep using ARRAY_SIZE to make the generated C code easier to read (Jason).
*	meson: Don't build glsl cache_test when shader cache is disabled	Dylan Baker	2019-05-03	1	-12/+13
	v2: - Use new with_shader_cache variable instead of host_machine.system() == 'windows' Reviewed-by: Eric Anholt <[email protected]>
*	glsl/tests: define ssize_t on windows	Dylan Baker	2019-05-03	1	-0/+4
	Reviewed-by: Eric Anholt <[email protected]>