path: root/src/compiler
Commit message | Author | Age | Files | Lines
* nir: add access to image_deref intrinsics | Lionel Landwerlin | 2019-07-29 | 1 | -1/+3
  SPIRV added the ability to access variables and have expressions non dynamically uniform and because spirv_to_nir generates deref instructions, we'll need to have that access there.
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Cc: <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
  (cherry picked from commit 8c330728f3094f2c836e022e57f003d0c82953ef)
  [Juan A. Suarez: resolve trivial conflicts]
  Signed-off-by: Juan A. Suarez Romero <[email protected]>
  Conflicts: src/compiler/nir/nir.c
* spirv: Fix order of barriers in SpvOpControlBarrier | Daniel Schürmann | 2019-07-25 | 1 | -4/+4
  Semantically, the memory barrier has to come first to wait for the completion of pending memory requests. Afterwards, the workgroups can be synchronized.
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
  (cherry picked from commit e352b4d650d37730e5087792b9a74ef31d1974ab)
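The ordering described in the commit above can be sketched as follows. This is not the spirv_to_nir code: the `emitter` type, the `barrier_kind` enum and both functions are hypothetical stand-ins that only record which barrier was emitted, so the order can be inspected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical barrier kinds; these are not NIR intrinsic names. */
enum barrier_kind { BARRIER_MEMORY, BARRIER_CONTROL };

/* Hypothetical recorder standing in for an instruction builder. */
struct emitter {
    enum barrier_kind ops[4];
    size_t count;
};

static void emit(struct emitter *e, enum barrier_kind kind)
{
    assert(e->count < 4);
    e->ops[e->count++] = kind;
}

/* Translate one control-barrier opcode. The memory barrier (when the opcode
 * carries memory semantics) is emitted first so that pending memory requests
 * complete; only afterwards is the workgroup synchronized. */
static void emit_control_barrier(struct emitter *e, bool has_memory_semantics)
{
    if (has_memory_semantics)
        emit(e, BARRIER_MEMORY);
    emit(e, BARRIER_CONTROL);
}
```

Calling `emit_control_barrier(&e, true)` on an empty recorder leaves the memory barrier in slot 0 and the control barrier in slot 1.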
* nir: don't return void | Eric Engestrom | 2019-07-24 | 1 | -1/+2
  Fixes: 14531d676b11999123c0 ("nir: make nir_const_value scalar")
  Signed-off-by: Eric Engestrom <[email protected]>
  Reviewed-by: Karol Herbst <[email protected]>
  (cherry picked from commit 3acc4278ad4138ad3a914085aefd7c47d46e1ad4)
* nir/loop_analyze: Properly handle swizzles in loop conditions | Jason Ekstrand | 2019-07-18 | 1 | -140/+149
  This commit re-plumbs all of nir_loop_analyze to use nir_ssa_scalar for all intermediate values so that we can properly handle swizzles. Even though if conditions are required to be scalars, they may still consume swizzles so you could have ((a.yzw < b.zzx).xz && c.xx).y == 0 as your loop termination condition. The old code would just bail the moment it saw its first non-zero swizzle but we can now properly chase the scalar from the if condition all the way to a, b, and c.
  Shader-db results on Kaby Lake:
    total loops in shared programs: 4388 -> 4364 (-0.55%)
    loops in affected programs: 29 -> 5 (-82.76%)
    helped: 29
    HURT: 5
  Shader-db results on Haswell:
    total loops in shared programs: 4370 -> 4373 (0.07%)
    loops in affected programs: 2 -> 5 (150.00%)
    helped: 2
    HURT: 5
  Reviewed-by: Timothy Arceri <[email protected]>
  (cherry picked from commit ff972c7a3a7e80a426b72f285902d35f6ca3b820)
* nir: Add some helpers for chasing SSA values properly | Jason Ekstrand | 2019-07-18 | 1 | -0/+79
  There are various cases in which we want to chase SSA values through ALU ops ranging from hand-written optimizations to back-end translation code. In all these cases, it can be very tricky to do properly because of swizzles. This set of helpers lets you easily work with a single component of an SSA def and chase through ALU ops safely.
  Reviewed-by: Timothy Arceri <[email protected]>
  (cherry picked from commit 8f7405ed9d473c1729d48c5add4f0d9fe147c75a)
  [Juan A. Suarez: resolve trivial conflicts]
  Signed-off-by: Juan A. Suarez Romero <[email protected]>
  Conflicts: src/compiler/nir/nir.h
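To illustrate why swizzles make chasing tricky, here is a minimal sketch of following one component through a chain of swizzled moves. It does not use the NIR API: `struct value`, `struct mov_op`, `struct scalar` and `chase_moves()` are hypothetical names invented for this example, and only plain moves are modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct value;

/* Hypothetical swizzled move: dest.comp[i] = src.comp[swizzle[i]]. */
struct mov_op {
    const struct value *src;
    uint8_t swizzle[4];
};

/* Hypothetical SSA-like value with up to four components. */
struct value {
    const struct mov_op *mov;   /* NULL when not produced by a move */
    unsigned num_components;
};

/* One component of one value: the unit that is safe to chase. */
struct scalar {
    const struct value *val;
    unsigned comp;
};

/* Follow a single component back through every move, remapping the
 * component index through each swizzle along the way. */
static struct scalar chase_moves(struct scalar s)
{
    while (s.val->mov != NULL) {
        const struct mov_op *m = s.val->mov;
        assert(s.comp < s.val->num_components);
        s.comp = m->swizzle[s.comp];
        s.val = m->src;
    }
    return s;
}
```

With `b = a.yzwx` and `c = b.zzx`, chasing `c.x` lands on `b.z` and then on `a.w`. Tracking only the value and ignoring the component index would lose that information.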
* nir/loop_analyze: Refactor detection of limit vars | Jason Ekstrand | 2019-07-18 | 1 | -54/+51
  This commit reworks both get_induction_and_limit_vars() and try_find_trip_count_vars_in_iand to return true on success and not modify their output parameters on failure. This makes their callers significantly simpler.
  Reviewed-by: Timothy Arceri <[email protected]>
  (cherry picked from commit 0333649e638a38258957fd8b7e0367d73bbc7a80)
* nir/regs_to_ssa: Handle regs in phi sources properly | Jason Ekstrand | 2019-07-17 | 1 | -2/+32
  Sources of phi instructions act as if they occur at the very end of the predecessor block, not the block in which the phi lives. In order to handle them correctly, we have to skip phi sources on the normal instruction walk and handle them as a separate walk over the successor phis. While registers in phi instructions are a bit of an oddity, they can happen when we temporarily go out-of-SSA for control-flow manipulations.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111075
  Cc: [email protected]
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
  (cherry picked from commit 6fb685fe4b762c8030f86895707516e2481e9ece)
* spirv: Fix stride calculation when lowering Workgroup to offsets | Caio Marcelo de Oliveira Filho | 2019-07-16 | 1 | -1/+1
  Use alignment to calculate the stride associated with the pointer types. That stride is used when the pointers are cast to arrays.
  Note that size alone is not sufficient, e.g. struct { vec2 a; vec1 b; } will have an element size of 12 bytes, but the stride needs to be 16 bytes to respect the 8 byte alignment.
  Fixes: 050eb6389a8 "spirv: Ignore ArrayStride in OpPtrAccessChain for Workgroup"
  Reviewed-by: Jason Ekstrand <[email protected]>
  (cherry picked from commit 026cfa10995ff3316476fa19507fa27adc531de5)
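The arithmetic in the commit above can be reproduced with a small sketch. The helper names `align_up()` and `element_stride()` are assumptions made for this example and are not taken from the Mesa source; the sketch also assumes the alignment is a power of two.

```c
#include <assert.h>
#include <stdint.h>

/* Round value up to the next multiple of alignment (a power of two). */
static uint32_t align_up(uint32_t value, uint32_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Stride between consecutive array elements: the element size padded so
 * that every element starts on an aligned address. */
static uint32_t element_stride(uint32_t size, uint32_t alignment)
{
    return align_up(size, alignment);
}
```

For the struct in the commit message, `element_stride(12, 8)` yields 16, whereas using the size alone would place the second element at offset 12 and break the 8 byte alignment of its vec2 member.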
* nir,intel: Add support for lowering 64-bit nir_opt_extract_* | Jason Ekstrand | 2019-07-16 | 2 | -0/+39
  We need this when doing full software 64-bit emulation.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110309
  Fixes: cbad201c2b3 "nir/algebraic: Add missing 64-bit extract_[iu]8..."
  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
  (cherry picked from commit 0ba508d7a3b6a006b5b8db1e865d33efc8d0abd5)
* nir/opt_if: Clean up single-src phis in opt_if_loop_terminator | Jason Ekstrand | 2019-07-16 | 3 | -0/+16
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111071
  Fixes: 2a74296f24ba "nir: add opt_if_loop_terminator()"
  Reviewed-by: Timothy Arceri <[email protected]>
  (cherry picked from commit 7a19e05e8c84152af3a15868f5ef781142ac8e23)
* nir/loop_analyze: Bail if we encounter swizzles | Jason Ekstrand | 2019-07-15 | 1 | -0/+22
  None of the current code knows what to do with swizzles. Take the safe option for now and bail if we see one. This does have a small shader-db impact but it is at least safe.
  Shader-db results on Kaby Lake:
    total loops in shared programs: 4364 -> 4388 (0.55%)
    loops in affected programs: 5 -> 29 (480.00%)
    helped: 5
    HURT: 29
  Shader-db results on Haswell:
    total loops in shared programs: 4373 -> 4370 (-0.07%)
    loops in affected programs: 5 -> 2 (-60.00%)
    helped: 5
    HURT: 2
  Fixes: 6772a17acc8ee "nir: Add a loop analysis pass"
  Reviewed-by: Timothy Arceri <[email protected]>
  (cherry picked from commit 9a3cb6f5fec040dea4a229b93f789995b36f9c09)
* nir/loop_analyze: Handle bit sizes correctly in calculate_iterations | Jason Ekstrand | 2019-07-15 | 1 | -27/+48
  The current code assumes everything is 32-bit which is very likely true but not guaranteed by any means. Instead, use nir_eval_const_opcode to do the calculations in a bit-size-agnostic way. We also use the new constant constructors to build the correct size constants.
  Fixes: 6772a17acc8ee "nir: Add a loop analysis pass"
  Reviewed-by: Timothy Arceri <[email protected]>
  (cherry picked from commit 268ad47c1115be8a8444d8e0e40af71623f9d281)
* nir: Add more helpers for working with const values | Jason Ekstrand | 2019-07-15 | 2 | -0/+135
  Reviewed-by: Timothy Arceri <[email protected]>
  (cherry picked from commit ce5581e23e54be91e4c1ad6a6c5990eca6677ceb)
* nir/loop_analyze: Fix phi-of-identical-alu detection | Jason Ekstrand | 2019-07-15 | 1 | -26/+29
  One issue was that the original version didn't check that swizzles matched when comparing ALU instructions so it could end up matching very different instructions. Using the nir_instrs_equal function from nir_instr_set.c which we use for CSE should be much more reliable.
  Another was that the loop assumes it will only run two iterations which may not be true. If there's something which guarantees that this case only happens for phis after ifs, it wasn't documented.
  Fixes: 9e6b39e1d521 "nir: detect more induction variables"
  Reviewed-by: Timothy Arceri <[email protected]>
  (cherry picked from commit 9f7ffe41dd185487479ea8846df1f5cdbf1b83a6)
* nir/instr_set: Expose nir_instrs_equal() | Jason Ekstrand | 2019-07-15 | 2 | -59/+62
  Reviewed-by: Timothy Arceri <[email protected]>
  (cherry picked from commit 6e984bcb92cf5e8b7da7387bc73cf6519ea2f43d)
* nir: Add a helper to determine if an intrinsic can be reordered | Connor Abbott | 2019-07-15 | 3 | -11/+13
  This is simple now, but we're going to be adding a few more conditions to this later.
  Reviewed-by: Timothy Arceri <[email protected]>
  (cherry picked from commit a1c737927c0d96f26ce487930aa9a2ed323814c9)
* nir: Use nir_src_bit_size instead of alu1->dest.dest.ssa.bit_size | Ian Romanick | 2019-07-09 | 2 | -1/+218
  This is important because, for example, nir_op_fne has dest.dest.ssa.bit_size == 1, but the source operands can be 16-, 32-, or 64-bits.
  Fixing this helps partial redundancy elimination for compares in a few more shaders.
  v2: Add unit tests for nir_opt_comparison_pre that are fixed by this commit.
  All Intel platforms had similar results.
    total instructions in shared programs: 17179408 -> 17179081 (<.01%)
    instructions in affected programs: 43958 -> 43631 (-0.74%)
    helped: 118
    HURT: 2
    helped stats (abs) min: 1 max: 5 x̄: 2.87 x̃: 2
    helped stats (rel) min: 0.06% max: 4.12% x̄: 1.19% x̃: 0.81%
    HURT stats (abs) min: 6 max: 6 x̄: 6.00 x̃: 6
    HURT stats (rel) min: 5.83% max: 6.06% x̄: 5.94% x̃: 5.94%
    95% mean confidence interval for instructions value: -3.08 -2.37
    95% mean confidence interval for instructions %-change: -1.30% -0.85%
    Instructions are helped.
    total cycles in shared programs: 360959066 -> 360942386 (<.01%)
    cycles in affected programs: 774274 -> 757594 (-2.15%)
    helped: 111
    HURT: 4
    helped stats (abs) min: 1 max: 1591 x̄: 169.49 x̃: 36
    helped stats (rel) min: <.01% max: 24.43% x̄: 8.86% x̃: 2.24%
    HURT stats (abs) min: 1 max: 2068 x̄: 533.25 x̃: 32
    HURT stats (rel) min: 0.02% max: 5.10% x̄: 3.06% x̃: 3.56%
    95% mean confidence interval for cycles value: -200.61 -89.47
    95% mean confidence interval for cycles %-change: -10.32% -6.58%
    Cycles are helped.
  Reviewed-by: Jason Ekstrand <[email protected]> [v1]
  Suggested-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
  Fixes: be1cc3552bc ("nir: Add nir_const_value_negative_equal")
  (cherry picked from commit 0ac5ff9ecb26ebc07a48e4f15539f975cef9b82a)
* nir: Add unit tests for nir_opt_comparison_pre | Ian Romanick | 2019-07-09 | 4 | -1/+334
  Each test has a comment with the expected before and after NIR. The tests don't actually check this. The tests only check whether or not the optimization pass reported progress. I couldn't think of a robust, future-proof way to check the before and after code.
  Reviewed-by: Matt Turner <[email protected]>
  (cherry picked from commit b08d7040518cdf76792952ceef72cadaa54d0179)
* spirv: Ignore ArrayStride in OpPtrAccessChain for Workgroup | Caio Marcelo de Oliveira Filho | 2019-07-03 | 1 | -4/+6
  From OpPtrAccessChain description in the SPIR-V spec (1.4 rev 1):
    For objects in the Uniform, StorageBuffer, or PushConstant storage classes, the element’s address or location is calculated using a stride, which will be the Base-type’s Array Stride when the Base type is decorated with ArrayStride. For all other objects, the implementation will calculate the element’s address or location.
  For non-CL shaders the driver should lay out the Workgroup storage class, so override any explicitly set ArrayStride in the shader.
  This currently fixes only the lower_workgroup_access_to_offsets case, which is used by anv.
  Reviewed-by: Juan A. Suarez <[email protected]>
  (cherry picked from commit 050eb6389a8867e6173644fbb6b2d13ad0db454b)
* glsl: Fix round64 conversion function | Sagar Ghuge | 2019-06-26 | 1 | -9/+12
  Fix round64 function to handle round-to-nearest-even cases, especially for positive and negative numbers with a fractional part of 0.5.
  v2: 1) Simplify unused bits (Elie Tournier)
  Fixes:
    KHR-GL45.gpu_shader_fp64.builtin.round_dvec2
    KHR-GL45.gpu_shader_fp64.builtin.round_dvec3
    KHR-GL45.gpu_shader_fp64.builtin.round_dvec4
    KHR-GL45.gpu_shader_fp64.builtin.roundeven_double
    KHR-GL45.gpu_shader_fp64.builtin.roundeven_dvec2
    KHR-GL45.gpu_shader_fp64.builtin.roundeven_dvec3
    KHR-GL45.gpu_shader_fp64.builtin.roundeven_dvec4
  Signed-off-by: Sagar Ghuge <[email protected]>
  Reviewed-by: Elie Tournier <[email protected]>
  Acked-by: Anuj Phogat <[email protected]>
  (cherry picked from commit 06807e1948f1bced9806b00908c892f1e3c3db5b)
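For readers unfamiliar with the tie-breaking rule, the following sketch shows round-to-nearest-even on a signed fixed-point number. It is an illustration only: the real round64 operates on the bit pattern of a double, while `round_even_fixed()` and its `frac_bits` parameter are hypothetical names chosen to keep the example free of floating-point library calls.

```c
#include <assert.h>
#include <stdint.h>

/* Round a signed fixed-point value with frac_bits fractional bits to the
 * nearest integer, resolving exact .5 ties toward the even integer. */
static int64_t round_even_fixed(int64_t v, unsigned frac_bits)
{
    assert(frac_bits >= 1 && frac_bits < 62);
    const int64_t one = (int64_t)1 << frac_bits;
    const int64_t half = one >> 1;

    /* Floor division: C division truncates toward zero, so adjust when the
     * remainder is negative. */
    int64_t whole = v / one;
    int64_t rem = v % one;
    if (rem < 0) {
        rem += one;
        whole -= 1;
    }

    if (rem > half)
        return whole + 1;
    if (rem < half)
        return whole;

    /* Exact tie: keep the floor if it is even, otherwise step up. */
    return (whole % 2 == 0) ? whole : whole + 1;
}
```

With one fractional bit, the inputs 5 and -5 encode 2.5 and -2.5, which round to 2 and -2, while 7 and -7 encode 3.5 and -3.5, which round to 4 and -4. Positive and negative ties therefore resolve symmetrically.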
* glsl: Don't increase the iteration count when there are no terminators | Ian Romanick | 2019-06-25 | 1 | -1/+7
  Incrementing the iteration count was intended to fix an off-by-one error when the first terminator was superseded by a later terminator. If there is no first terminator or later terminator, there is no off-by-one error. Incrementing the loop count creates one. This can be seen in loops like:
    do {
       if (something) {
          // No breaks or continues here.
       }
    } while (false);
  Reviewed-by: Timothy Arceri <[email protected]>
  Tested-by: Abel Briggs <[email protected]>
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110953
  Fixes: 646621c66da ("glsl: make loop unrolling more like the nir unrolling path")
  (cherry picked from commit ee1c69faddb3624ace6548dafaff50549a031380)
* glsl: Fix out of bounds read in shader_cache_read_program_metadata | Kenneth Graunke | 2019-06-18 | 1 | -3/+2
  The VaryingNames array has NumVaryings entries. But BufferStride is a small array of MAX_FEEDBACK_BUFFERS (4) entries. Programs with more than 4 varyings would read out of bounds.
  Also, BufferStride is set based on the shader itself, which means that it's inherently already included in the hash, and doesn't need to be included again. At the point when shader_cache_read_program_metadata is called, the linker hasn't even set those fields yet. So, just drop it entirely.
  Fixes valgrind errors in KHR-GL45.transform_feedback.linking_errors_test.
  Fixes: 6d830940f78 glsl/shader_cache: Allow shader cache usage with transform feedback
  Reviewed-by: Timothy Arceri <[email protected]>
  (cherry picked from commit 3c10a2726bcf686f03e31e79e40786e3894ff063)
* nir/propagate_invariant: Don't add NULL vars to the hash table | Jason Ekstrand | 2019-06-06 | 1 | -1/+10
  Fixes: 8410cf66d "nir/propagate_invariant: Skip unknown vars"
  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
  (cherry picked from commit d96878a66a559f6690f01e82f06fcf92ae958d3c)
* nir: Actually propagate progress in nir_opt_move_load_ubo. | Bas Nieuwenhuizen | 2019-06-03 | 1 | -1/+1
  Found with Jason's new metadata rework (https://gitlab.freedesktop.org/mesa/mesa/merge_requests/950).
  Fixes: af355aaa071 "nir: add nir_opt_move_load_ubo() optimization pass"
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
  (cherry picked from commit e24a7840f60ac2290761ea2dc2831e8c3ba8bbfc)
* nir/dead_cf: Call instructions aren't dead | Jason Ekstrand | 2019-05-31 | 1 | -1/+1
  When we inlined cf_node_has_side_effects into node_is_dead, all the conditions flipped and we forgot to flip one. Fortunately, it doesn't matter right now because no one uses this pass on shaders with more than one function.
  Fixes: b50465d197 "nir/dead_cf: Inline cf_node_has_side_effects"
  Reviewed-by: Dave Airlie <[email protected]>
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
  (cherry picked from commit 8948048c6f01209bac0051e41cd84c38853bd251)
* nir/lower_non_uniform: safely iterate over blocks | Lionel Landwerlin | 2019-05-30 | 1 | -1/+1
  This fixes a problem where the same instruction gets replaced twice. This was happening when the replaced instruction would be at the end of a block.
  Replacement of:
    if ssa_8 {
       ....
       intrinsic bindless_image_store (ssa_44, ssa_16, ssa_0, ssa_15) (5, 0, 34836, 32) /* image_dim=Buf */ /* image_array=false */ /* format=34836 */ /* access=32 */
    }
  Would be:
    if ssa_8 {
       loop {
          vec1 32 ssa_47 = intrinsic read_first_invocation (ssa_44) ()
          vec1 1 ssa_48 = ieq ssa_47, ssa_44
          if ssa_48 {
             loop {
                vec1 32 ssa_49 = intrinsic read_first_invocation (ssa_44) ()
                vec1 1 ssa_50 = ieq ssa_49, ssa_44
                if ssa_50 {
                   intrinsic bindless_image_store (ssa_44, ssa_16, ssa_0, ssa_15) (5, 0, 34836, 32) /* image_dim=Buf */ /* image_array=false */ /* format=34836 */ /* access=32 */
                   break
                } else {
                   ....
                }
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Fixes: 3bd545764151 ("nir: Add a lowering pass for non-uniform resource access")
  Reviewed-by: Jason Ekstrand <[email protected]>
  (cherry picked from commit 366811bedb67ae7d31a02ea9b1f9fa942fb93602)
* nir: Fix clone of nir_variable state slots | Caio Marcelo de Oliveira Filho | 2019-05-21 | 1 | -3/+5
  When num_state_slots is 0, don't create the array. This was triggering the following assert when running vkcube with NIR_TEST_CLONE=1
    vkcube: ../src/compiler/nir/nir_split_per_member_structs.c:66: split_variable: Assertion `var->state_slots == NULL' failed.
  Fixes: 9fbd390dd4b "nir: Add support for cloning shaders"
  Reviewed-by: Jason Ekstrand <[email protected]>
  (cherry picked from commit 005cc9ae37ca45960d87389dc9eace5ed29d1b99)
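The invariant behind that assert (no slots means a NULL array pointer) can be sketched as below. The `struct variable` layout and `clone_state_slots()` are hypothetical simplifications, not the nir_variable definition or the nir_clone code, and plain `malloc` stands in for whatever allocator the real code uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified variable: only the fields this sketch needs. */
struct variable {
    unsigned num_state_slots;
    int *state_slots;   /* NULL whenever num_state_slots == 0 */
};

/* Copy the slot array from src to dst. When there are no slots the array
 * is not created at all, so the clone keeps state_slots == NULL. */
static bool clone_state_slots(struct variable *dst, const struct variable *src)
{
    dst->num_state_slots = src->num_state_slots;
    dst->state_slots = NULL;

    if (src->num_state_slots == 0)
        return true;

    size_t bytes = src->num_state_slots * sizeof(*dst->state_slots);
    dst->state_slots = malloc(bytes);
    if (dst->state_slots == NULL) {
        dst->num_state_slots = 0;
        return false;
    }
    memcpy(dst->state_slots, src->state_slots, bytes);
    return true;
}
```

Allocating a zero-length array instead would typically hand back a non-NULL pointer, which is exactly what a check such as `var->state_slots == NULL` rejects.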
* glsl: init packed in more constructors. | Dave Airlie | 2019-05-21 | 1 | -6/+6
  src/compiler/glsl_types.cpp:577: uninit_member: Non-static class member "packed" is not initialized in this constructor nor in any functions that it calls.
  from Coverity.
  Fixes: 659f333b3a4 (glsl: add packed for struct types)
  Acked-by: Ilia Mirkin <[email protected]>
  (cherry picked from commit b2d4d08a5cae29759bdbd4ac4e942ea372fe7735)
* nir: Fix nir_opt_idiv_const when negatives are involved | Caio Marcelo de Oliveira Filho | 2019-05-21 | 1 | -3/+5
  First, allow the case for negative powers of two. Then ensure that we use the absolute value of the non-constant value to calculate the quotient -- this was hinted at in the code by the name 'uq'.
  This fixes an issue when 'd' is positive and 'n' is negative. The ishr will propagate the negative sign and we'll use nir_ineg() again, incorrectly.
  v2: First version used only ishr, but that isn't sufficient, since it never can produce a zero as a result. (Jason)
      Allow negative powers of two. (Caio)
  Fixes: 74492ebad94 "nir: Add a pass for lowering integer division by constants"
  Reviewed-by: Jason Ekstrand <[email protected]>
  (cherry picked from commit 8a995f2b5e1e3f2a2eafd32870ebfb43b5cfdf27)
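A scalar sketch of the approach described above: divide the absolute value of `n` by the magnitude of the power-of-two divisor with an unsigned shift, then negate when the signs differ. The function name `idiv_pow2()` is an assumption for this example; the real pass emits NIR instructions rather than computing on C integers. The sketch also excludes `n == INT32_MIN` to stay within defined signed arithmetic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Truncating signed division n / d, where |d| is a power of two. */
static int32_t idiv_pow2(int32_t n, int32_t d)
{
    assert(n != INT32_MIN);   /* limitation of this sketch */

    /* Magnitudes as unsigned values; 0u - x avoids signed overflow. */
    uint32_t abs_d = d < 0 ? 0u - (uint32_t)d : (uint32_t)d;
    uint32_t abs_n = n < 0 ? 0u - (uint32_t)n : (uint32_t)n;
    assert(abs_d != 0 && (abs_d & (abs_d - 1)) == 0);

    /* log2 of the divisor magnitude. */
    unsigned shift = 0;
    while ((abs_d >> shift) != 1u)
        shift++;

    /* Unsigned quotient of the magnitudes ('uq' in the commit message). */
    uint32_t uq = abs_n >> shift;

    /* The result is negative exactly when the operand signs differ. */
    bool negate = (n < 0) != (d < 0);
    return negate ? -(int32_t)uq : (int32_t)uq;
}
```

An arithmetic shift of a negative `n` rounds toward negative infinity, so on its own it maps -1 / 4 to -1 rather than 0, which is the "never can produce a zero" problem noted in v2. Shifting the magnitude instead gives the truncating result.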
* nir: lower_non_uniform_access: iterate over instructions safely | Lionel Landwerlin | 2019-05-16 | 1 | -1/+1
  This pass moves instructions around and adds control-flow in the middle of blocks. We need to use nir_foreach_instr_safe to ensure that we iterate over instructions correctly anyway.
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Fixes: 3bd545764151 ("nir: Add a lowering pass for non-uniform resource access")
  Reviewed-by: Jason Ekstrand <[email protected]>
  (cherry picked from commit e04cf0b61269ca60b3260d81d94e625965d39901)
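The general idea behind a "safe" iteration macro can be sketched on a plain singly linked list: read the successor before the loop body gets a chance to unlink or move the current element. The `struct node` type and `remove_negative()` below are hypothetical and unrelated to NIR's own list implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list node used only for this illustration. */
struct node {
    int value;
    struct node *next;
};

/* Unlink every node with a negative value and return how many were removed.
 * The successor is saved before the current node is touched, so unlinking
 * the current node cannot derail the walk. */
static int remove_negative(struct node **head)
{
    int removed = 0;
    struct node **link = head;   /* pointer that currently points at cur */
    struct node *cur = *head;

    while (cur != NULL) {
        struct node *next = cur->next;   /* saved first: the "safe" part */

        if (cur->value < 0) {
            *link = next;       /* unlink cur from the list */
            cur->next = NULL;   /* cur->next is no longer meaningful */
            removed++;
        } else {
            link = &cur->next;
        }
        cur = next;
    }
    return removed;
}
```

Advancing with `cur = cur->next` after the body instead would follow the cleared pointer of a removed node and stop the walk early.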
* nir: fix lower_non_uniform_access pass | Lionel Landwerlin | 2019-05-16 | 1 | -0/+1
  Obviously missing the instruction insertion into the SSA list.
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Fixes: 3bd545764151 ("nir: Add a lowering pass for non-uniform resource access")
  Reviewed-by: Jason Ekstrand <[email protected]>
  (cherry picked from commit 391a836e8fb1c84170f3aa7550f0b347d31528f3)
* Revert "nir: add late opt to turn inot/b2f combos back to bcsel" | Ian Romanick | 2019-05-15 | 2 | -19/+0
  This reverts commit 7acc8652268205a266068ea4d059eccce43e1f78.
  With these optimizations in place, the extra constant folding added in the next commit extends some live ranges of 0.0 and ±1.0 constants, and that causes several hundred shaders to have more spills and fills. I believe this optimization was made basically irrelevant by 7725d609387 "intel/fs: Emit better code for b2f(inot(a)) and b2i(inot(a))".
  All Gen7.5+ platforms had similar results. (Ice Lake shown)
    total instructions in shared programs: 17225303 -> 17224634 (<.01%)
    instructions in affected programs: 879402 -> 878733 (-0.08%)
    helped: 679
    HURT: 1
    helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
    helped stats (rel) min: 0.03% max: 0.93% x̄: 0.24% x̃: 0.05%
    HURT stats (abs) min: 10 max: 10 x̄: 10.00 x̃: 10
    HURT stats (rel) min: 0.45% max: 0.45% x̄: 0.45% x̃: 0.45%
    95% mean confidence interval for instructions value: -1.02 -0.95
    95% mean confidence interval for instructions %-change: -0.26% -0.22%
    Instructions are helped.
    total cycles in shared programs: 360842595 -> 360828542 (<.01%)
    cycles in affected programs: 110443594 -> 110429541 (-0.01%)
    helped: 389
    HURT: 265
    helped stats (abs) min: 1 max: 7525 x̄: 162.81 x̃: 28
    helped stats (rel) min: <.01% max: 18.66% x̄: 1.11% x̃: 0.11%
    HURT stats (abs) min: 1 max: 7614 x̄: 185.96 x̃: 48
    HURT stats (rel) min: <.01% max: 25.08% x̄: 0.95% x̃: 0.10%
    95% mean confidence interval for cycles value: -75.65 32.67
    95% mean confidence interval for cycles %-change: -0.49% -0.06%
    Inconclusive result (value mean confidence interval includes 0).
    total spills in shared programs: 12159 -> 12161 (0.02%)
    spills in affected programs: 13 -> 15 (15.38%)
    helped: 0
    HURT: 1
    total fills in shared programs: 25207 -> 25208 (<.01%)
    fills in affected programs: 25 -> 26 (4.00%)
    helped: 0
    HURT: 1
  Ivy Bridge
    total instructions in shared programs: 12082019 -> 12082013 (<.01%)
    instructions in affected programs: 1033 -> 1027 (-0.58%)
    helped: 6
    HURT: 0
    helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
    helped stats (rel) min: 0.41% max: 0.83% x̄: 0.61% x̃: 0.59%
    95% mean confidence interval for instructions value: -1.00 -1.00
    95% mean confidence interval for instructions %-change: -0.78% -0.45%
    Instructions are helped.
    total cycles in shared programs: 179849270 -> 179849157 (<.01%)
    cycles in affected programs: 4735 -> 4622 (-2.39%)
    helped: 4
    HURT: 0
    helped stats (abs) min: 2 max: 74 x̄: 28.25 x̃: 18
    helped stats (rel) min: 0.13% max: 6.53% x̄: 2.85% x̃: 2.36%
    95% mean confidence interval for cycles value: -82.73 26.23
    95% mean confidence interval for cycles %-change: -7.98% 2.28%
    Inconclusive result (value mean confidence interval includes 0).
  Sandy Bridge
    total instructions in shared programs: 10882750 -> 10882748 (<.01%)
    instructions in affected programs: 266 -> 264 (-0.75%)
    helped: 2
    HURT: 0
  Iron Lake
    total cycles in shared programs: 188609440 -> 188609448 (<.01%)
    cycles in affected programs: 4320 -> 4328 (0.19%)
    helped: 0
    HURT: 2
  GM45
    total cycles in shared programs: 129016868 -> 129016872 (<.01%)
    cycles in affected programs: 2302 -> 2306 (0.17%)
    helped: 0
    HURT: 1
  Reviewed-by: Matt Turner <[email protected]>
  (cherry picked from commit d2a9ba03e30602f040687da325470d72eeddef1a)
  [Juan: resolve trivial conflicts]
  Signed-off-by: Juan A. Suarez Romero <[email protected]>
  Conflicts: src/compiler/nir/nir_opt_algebraic.py
* mesa: Makefile.sources: Add nir_lower_fb_read.c to Makefile.sources list | John Stultz | 2019-05-06 | 1 | -0/+1
  In commit a99c360a4630 (nir: add pass to lower fb reads), a new file was added that needs to also be added to the Makefile.sources list used by the Android and SCons build system.
  Cc: Rob Clark <[email protected]>
  Cc: Emil Velikov <[email protected]>
  Cc: Amit Pundir <[email protected]>
  Cc: Sumit Semwal <[email protected]>
  Cc: Alistair Strachan <[email protected]>
  Cc: Greg Hartman <[email protected]>
  Cc: Tapani Pälli <[email protected]>
  Cc: Jason Ekstrand <[email protected]>
  Fixes: a99c360a463 ("nir: add pass to lower fb reads")
  Reviewed-by: Tapani Pälli <[email protected]>
  Reviewed-by: Emil Velikov <[email protected]>
  Signed-off-by: John Stultz <[email protected]>
  (cherry picked from commit c7f2145b4b1551d521de2303b0dc97b56a0e3907)
* spirv/cl: support vload/vstore | Karol Herbst | 2019-05-04 | 1 | -0/+55
  Signed-off-by: Karol Herbst <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Add nir_op_vec helper | Karol Herbst | 2019-05-04 | 3 | -22/+14
  With that we can simplify code where nir vectors are created.
  v2: merge both lines in nir_vec
  Signed-off-by: Karol Herbst <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Add a nir_builder_alu variant which takes an array of components | Karol Herbst | 2019-05-04 | 1 | -14/+36
  v2: rename to nir_build_alu_src_arr
  Signed-off-by: Karol Herbst <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* vtn: handle bitcast with pointer src/dest | Karol Herbst | 2019-05-04 | 3 | -29/+45
  v2: use vtn_push_ssa and vtn_ssa_value
  Signed-off-by: Karol Herbst <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Add a SSA type gathering pass | Jason Ekstrand | 2019-05-04 | 4 | -0/+223
  This new pass (which isn't even compile-tested) attempts to determine the ALU type of all the SSA values in a function impl. It takes a greedy approach and assigns intness or floatness to everything it thinks can possibly contain an int or a float. Some values will be labeled as both int and float and some will be labeled as neither and it is up to the caller to decide what to do with this information. However, for a "nice" shader where the original source contained no bit-casts and no implicit bit-casts were introduced by optimizations, there shouldn't be any overlap in the two sets save for the odd CSEd zero constant.
  Reviewed-by: Vasily Khoruzhick <[email protected]>
* nir/algebraic: Don't emit empty initializers for MSVC | Connor Abbott | 2019-05-04 | 1 | -0/+4
  Just don't emit the transform array at all if there are no transforms
  v2:
  - Don't use len(array) > 0 (Dylan)
  - Keep using ARRAY_SIZE to make the generated C code easier to read (Jason).
* meson: Don't build glsl cache_test when shader cache is disabled | Dylan Baker | 2019-05-03 | 1 | -12/+13
  v2:
  - Use new with_shader_cache variable instead of host_machine.system() == 'windows'
  Reviewed-by: Eric Anholt <[email protected]>
* glsl/tests: define ssize_t on windows | Dylan Baker | 2019-05-03 | 1 | -0/+4
  Reviewed-by: Eric Anholt <[email protected]>
* glsl: fix general_ir_test with mingw | Dylan Baker | 2019-05-03 | 1 | -7/+7
  Somewhere down in the depths of the mingw headers 'interface' is defined; change it to iface like a similar patch did.
  Signed-off-by: Dylan Baker <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* nir: fix lower vars to ssa for larger vector sizes. | Dave Airlie | 2019-05-03 | 1 | -4/+4
  This has a couple of hardcoded vec4 limits in it; change them to the proper sizing to avoid future issues.
  Reviewed-by: Jason Ekstrand <[email protected]>
* spirv: fix SpvOpBitSize return value. | Dave Airlie | 2019-05-03 | 1 | -3/+1
  The spir-v spec says this returns a bool.
  Reviewed-by: Jason Ekstrand <[email protected]>
* nir: fix nir tex print harder | Rob Clark | 2019-05-02 | 1 | -6/+5
  Fixes: 691d5a825a6 nir: rework tex instruction printing
  Reviewed-by: Eric Anholt <[email protected]>
  Signed-off-by: Rob Clark <[email protected]>
* glsl: fix and clean up NV_compute_shader_derivatives support | Marek Olšák | 2019-05-02 | 1 | -54/+24
  - make sure compute shader derivatives are exposed for all extensions
  - unify duplicated code
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* nir: add pass to lower fb reads | Rob Clark | 2019-05-02 | 5 | -6/+141
  Signed-off-by: Rob Clark <[email protected]>
  Reviewed-by: Kristian H. Kristensen <[email protected]>
* nir: fix lower_wpos_ytransform in load_frag_coord case | Rob Clark | 2019-05-02 | 1 | -10/+11
  Apparently we never hit this path. Or at least haven't for a rather long time.
  But in either case (load_deref or load_frag_coord), we can just directly use the intrinsic's ssa dest. So stop passing the nir_variable (which would be NULL in the load_frag_coord case) around and instead just use &intr->dest.ssa. (This ofc means we need to set up the cursor to insert *after* the instruction, which seems to be another bug of the original implementation.)
  Signed-off-by: Rob Clark <[email protected]>
  Reviewed-by: Kristian H. Kristensen <[email protected]>
* nir: rework tex instruction printing | Rob Clark | 2019-05-02 | 1 | -8/+10
  The extra comma at the end was annoying me.
  Signed-off-by: Rob Clark <[email protected]>
  Reviewed-by: Kristian H. Kristensen <[email protected]>
* nir/search: Add debugging code to dump the pattern matched | Connor Abbott | 2019-05-02 | 1 | -0/+75
  This was useful while debugging the previous commit.
  Reviewed-by: Jason Ekstrand <[email protected]>