path: root/src/compiler
Commit message | Author | Age | Files | Lines
* nir: Lock around validation fail shader dumping | Jason Ekstrand | 2019-03-29 | 1 | -0/+10
    This prevents getting mixed-up results if a multi-threaded app has two validation errors in different threads.

    Reviewed-by: Timothy Arceri <[email protected]>
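The idea of serializing the dump can be sketched as follows. This is a minimal illustration, not the Mesa code: `dump_lock` and `dump_validation_failure` are hypothetical names, and a POSIX mutex is assumed as the locking primitive.

```c
#include <pthread.h>
#include <stdio.h>

/* Hypothetical lock guarding the dump; assumed, not taken from Mesa. */
static pthread_mutex_t dump_lock = PTHREAD_MUTEX_INITIALIZER;

/* Write a multi-line failure report while holding the lock, so reports
 * produced by different threads cannot interleave line by line.
 * Returns the number of lines written. */
int
dump_validation_failure(FILE *out, const char *shader_name,
                        const char *const *errors, int num_errors)
{
   int lines = 0;

   pthread_mutex_lock(&dump_lock);
   fprintf(out, "validation failed for %s\n", shader_name);
   lines++;
   for (int i = 0; i < num_errors; i++) {
      fprintf(out, "  error: %s\n", errors[i]);
      lines++;
   }
   pthread_mutex_unlock(&dump_lock);

   return lines;
}
```

Holding one lock for the whole report, rather than per line, is what keeps each report contiguous in the output.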
* nir/validate: validate that tex deref sources are actually derefs | Karol Herbst | 2019-03-29 | 1 | -0/+11
    Signed-off-by: Karol Herbst <[email protected]>
    Reviewed-by: Timothy Arceri <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
* nir/print: fix printing the image_array intrinsic index | Karol Herbst | 2019-03-29 | 1 | -2/+2
    Fixes: 0de003be0363 ("nir: Add handle/index-based image intrinsics")
    Signed-off-by: Karol Herbst <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
* nir: use {0} initializer instead of {} to fix MSVC build | Brian Paul | 2019-03-28 | 1 | -2/+2
    Trivial change.

    Fixes: c6ee46a75 ("nir: Add nir_alu_srcs_negative_equal")
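For context, the portable spelling looks like the sketch below. `struct example_src` is a made-up type for illustration; the point is only that `{0}` is valid in every C standard, while the empty `{}` initializer was a compiler extension in C before C23.

```c
#include <stdbool.h>

/* Hypothetical aggregate, standing in for whatever the real code initializes. */
struct example_src {
   int swizzle[4];
   bool negate;
};

/* `{0}` zero-initializes the first member explicitly and every remaining
 * member implicitly, so the whole aggregate starts out as zeros. */
int
zero_initialized_sum(void)
{
   struct example_src s = {0};
   return s.swizzle[0] + s.swizzle[1] + s.swizzle[2] + s.swizzle[3] +
          (int)s.negate;
}
```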
* nir: Add partial redundancy elimination for compares | Ian Romanick | 2019-03-28 | 5 | -0/+414
    This pass attempts to detect code sequences like

        if (x < y) {
            z = y - x;
            ...
        }

    and replace them with sequences like

        t = x - y;
        if (t < 0) {
            z = -t;
            ...
        }

    On architectures where the subtract can generate the flags used by the if-statement, this saves an instruction. It's also possible that moving an instruction out of the if-statement will allow nir_opt_peephole_select to convert the whole thing to a bcsel.

    Currently only floating point compares and adds are supported. Adding support for integer will be a challenge due to integer overflow. There are a couple possible solutions, but they may not apply to all architectures.

    v2: Fix a typo in the commit message and a couple typos in comments. Fix possible NULL pointer deref from result of push_block(). Add missing (-A + B) case. Suggested by Caio.

    v3: Fix is_not_const_zero to work correctly with types other than nir_type_float32. Suggested by Ken.

    v4: Add some comments explaining how this works. Suggested by Ken.

    Reviewed-by: Kenneth Graunke <[email protected]>
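The rewrite described above can be checked on plain C floats. This is a sketch of the source-level equivalence only, not the NIR pass; the function names are hypothetical, and it assumes IEEE 754 arithmetic with gradual underflow, where `x - y` is negative exactly when `x < y` and `-(x - y)` equals `y - x`.

```c
/* Original shape: compare first, subtract inside the branch. */
float
before_pre(float x, float y, float fallback)
{
   float z = fallback;
   if (x < y)
      z = y - x;
   return z;
}

/* Rewritten shape: a single subtract feeds both the compare and the
 * value used inside the branch. */
float
after_pre(float x, float y, float fallback)
{
   float z = fallback;
   float t = x - y;
   if (t < 0.0f)
      z = -t;
   return z;
}
```

The integer case is harder because `x - y` can overflow, which is the limitation the commit message mentions.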
* nir: Add nir_alu_srcs_negative_equal | Ian Romanick | 2019-03-28 | 3 | -0/+192
    v2: Move bug fix in get_neg_instr from the next patch to this patch (where it was intended to be in the first place). Noticed by Caio.

    Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Add nir_const_value_negative_equal | Ian Romanick | 2019-03-28 | 4 | -0/+398
    v2: Rebase on 1-bit Boolean changes.

    Reviewed-by: Thomas Helland <[email protected]> [v1]
    Reviewed-by: Kenneth Graunke <[email protected]>
* nir/algebraic: Add missing 16-bit extract_[iu]8 patterns | Ian Romanick | 2019-03-28 | 1 | -0/+3
    No shader-db changes on any Intel platform.

    v2: Use a loop to generate patterns. Suggested by Jason.

    v3: Fix a copy-and-paste bug in the extract_[ui] of ishl loop that would replace an extract_i8 with an extract_u8. This broke ~180 tests. This bug was introduced in v2.

    Reviewed-by: Matt Turner <[email protected]> [v1]
    Reviewed-by: Dylan Baker <[email protected]> [v2]
    Acked-by: Jason Ekstrand <[email protected]> [v2]
* nir/algebraic: Add missing 64-bit extract_[iu]8 patterns | Ian Romanick | 2019-03-28 | 1 | -0/+3
    No shader-db changes on any Intel platform.

    v2: Use a loop to generate patterns. Suggested by Jason.

    v3: Fix a copy-and-paste bug in the extract_[ui] of ishl loop that would replace an extract_i8 with an extract_u8. This broke ~180 tests. This bug was introduced in v2.

    Reviewed-by: Matt Turner <[email protected]> [v1]
    Reviewed-by: Dylan Baker <[email protected]> [v2]
    Acked-by: Jason Ekstrand <[email protected]> [v2]
* nir/algebraic: Remove redundant extract_[iu]8 patterns | Ian Romanick | 2019-03-28 | 1 | -14/+4
    No shader-db changes on any Intel platform.

    Reviewed-by: Matt Turner <[email protected]>
    Reviewed-by: Dylan Baker <[email protected]>
    Acked-by: Jason Ekstrand <[email protected]>
* nir/algebraic: Fix up extract_[iu]8 after loop unrolling | Ian Romanick | 2019-03-28 | 1 | -2/+20
    Skylake, Broadwell, and Haswell had similar results. (Skylake shown)

        total instructions in shared programs: 15256840 -> 15256837 (<.01%)
        instructions in affected programs: 4713 -> 4710 (-0.06%)
        helped: 3
        HURT: 0
        helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
        helped stats (rel) min: 0.06% max: 0.08% x̄: 0.06% x̃: 0.06%

        total cycles in shared programs: 372286583 -> 372286583 (0.00%)
        cycles in affected programs: 198516 -> 198516 (0.00%)
        helped: 1
        HURT: 1
        helped stats (abs) min: 10 max: 10 x̄: 10.00 x̃: 10
        helped stats (rel) min: <.01% max: <.01% x̄: <.01% x̃: <.01%
        HURT stats (abs) min: 10 max: 10 x̄: 10.00 x̃: 10
        HURT stats (rel) min: 0.01% max: 0.01% x̄: 0.01% x̃: 0.01%

    No changes on any other Intel platform.

    v2: Use a loop to generate patterns. Suggested by Jason.

    v3: Fix a copy-and-paste bug in the extract_[ui] of ishl loop that would replace an extract_i8 with an extract_u8. This broke ~180 tests. This bug was introduced in v2.

    Reviewed-by: Matt Turner <[email protected]> [v1]
    Reviewed-by: Dylan Baker <[email protected]> [v2]
    Acked-by: Jason Ekstrand <[email protected]> [v2]
* nir/deref: fix struct wrapper casts. (v3) | Dave Airlie | 2019-03-29 | 1 | -2/+36
    llvm/spir-v emits nested structs such as struct a { struct b {} }, but instead of a deref it casts (struct a) to (struct b). Reconstruct struct derefs instead of casts for these.

    v2: use ssa_def_rewrite uses, rework the type restrictions (Jason)
    v3: squish more stuff into one function, drop unused temp (Jason)

    Reviewed-by: Jason Ekstrand <[email protected]>
* spirv: propagate the access flag for store and load derefs | Samuel Pitoiset | 2019-03-27 | 3 | -24/+32
    It was only propagated when UBO/SSBO accesses are lowered to offsets.

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
* nir: add nir_{load,store}_deref_with_access() helpers | Samuel Pitoiset | 2019-03-27 | 1 | -3/+21
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
* spirv: make use of the select control support in nir | Timothy Arceri | 2019-03-27 | 1 | -0/+18
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108841
* nir: add support for user defined select control | Timothy Arceri | 2019-03-27 | 4 | -1/+21
    This will allow us to make use of the selection control support in spirv and the GL support provided by EXT_control_flow_attributes.

    Note this only supports if-statements as we don't support switches in NIR.

    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108841
* spirv: make use of the loop control support in nir | Timothy Arceri | 2019-03-27 | 1 | -0/+20
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108841
* nir: add support for user defined loop control | Timothy Arceri | 2019-03-27 | 3 | -5/+22
    This will allow us to make use of the loop control support in spirv and the GL support provided by EXT_control_flow_attributes.

    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108841
* spirv: Handle the NonUniformEXT decoration | Jason Ekstrand | 2019-03-25 | 2 | -0/+28
* nir: Add access flags to deref and SSBO atomics | Jason Ekstrand | 2019-03-25 | 2 | -28/+34
    We will need them for a new ACCESS_NON_UNIFORM flag that's about to be added in the next commit.

    Reviewed-by: Lionel Landwerlin <[email protected]>
* nir: Add texture sources and intrinsics for bindless | Jason Ekstrand | 2019-03-25 | 4 | -10/+29
    On Intel, we have both bindless and bindful and we'd like to use them at the same time if we can, so we need to be able to distinguish between the two at the NIR level. This also fixes nir_lower_tex to properly handle bindless in its tex_texture_size and get_texture_lod helpers.

    Reviewed-by: Lionel Landwerlin <[email protected]>
* nir: Add a lowering pass for non-uniform resource access | Jason Ekstrand | 2019-03-25 | 5 | -0/+286
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir/lower_io: Add a bounds-checked 64-bit global address format | Jason Ekstrand | 2019-03-25 | 2 | -6/+93
    Reviewed-by: Lionel Landwerlin <[email protected]>
* compiler/nir: add lowering for 16-bit ldexp | Iago Toral Quiroga | 2019-03-25 | 1 | -2/+7
    v2 (Topi):
     - Make bit-size handling order be 16-bit, 32-bit, 64-bit
     - Clamp lower exponent range at -28 instead of -30.

    Reviewed-by: Topi Pohjolainen <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
* compiler/nir: add lowering for 16-bit flrp | Iago Toral Quiroga | 2019-03-25 | 2 | -0/+2
    And enable it on Intel.

    v2:
     - Squash the change to enable it on Intel (Jason)

    Reviewed-by: Jason Ekstrand <[email protected]>
* compiler/nir: add lowering option for 16-bit fmod | Iago Toral Quiroga | 2019-03-25 | 2 | -0/+2
    And enable it on Intel.

    v2:
     - Squash the change to enable this lowering on Intel (Jason)

    Reviewed-by: Jason Ekstrand <[email protected]>
* nir: fix a few signed/unsigned comparison warnings | Brian Paul | 2019-03-25 | 1 | -2/+2
    Reviewed-by: Jason Ekstrand <[email protected]>
* nir/split_vars: fixup some more explicit_stride related issues. | Dave Airlie | 2019-03-25 | 1 | -2/+1
    With vkpipelinedb Samuel discovered a regression since we stopped stripping types at the spir-v level.

    This adds a check to the var splitting for the case where it asserts the type hasn't changed, when it has just created a bare type, and it's different than the original type which has an explicit stride.

    This also removes a pointless assert that also triggers.

    Fixes: 3b3653c4cf (nir/spirv: don't use bare types, remove assert in split vars for testing)
    Acked-by: Jason Ekstrand <[email protected]>
* spirv: Use interface type for block and buffer block | Caio Marcelo de Oliveira Filho | 2019-03-23 | 2 | -4/+36
    Also handle GLSL_TYPE_INTERFACE the same way we do GLSL_TYPE_STRUCT in various places.

    Motivated by ARB_gl_spirv work, that will take advantage of the interface types when handling NIR coming from SPIR-V.

    Reviewed-by: Jason Ekstrand <[email protected]>
* spirv: Add an execution environment to the options | Caio Marcelo de Oliveira Filho | 2019-03-23 | 1 | -0/+8
    Also updates gl_spirv to pick the right one.

    At the moment nothing uses it, but upcoming functionality that is part of ARB_gl_spirv will use it, and later we can also assert more strictly when handling certain features for each of the execution environments.

    Reviewed-by: Alejandro Piñeiro <[email protected]>
    Acked-by: Karol Herbst <[email protected]>
* nir: Handle array-deref-of-vector case in loop analysis | Caio Marcelo de Oliveira Filho | 2019-03-22 | 1 | -3/+6
    SPIR-V can produce those for SSBO and UBO access. Found when testing the ARB_gl_spirv series.

    Reviewed-by: Timothy Arceri <[email protected]>
* spirv,nir: lower frexp_exp/frexp_sig inside a new NIR pass | Samuel Pitoiset | 2019-03-22 | 5 | -133/+216
    This lowering isn't needed for RADV because AMDGCN has two instructions. It will be disabled for RADV in an upcoming series.

    While we are at it, factorize a little bit.

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
* nir: use generic float types for frexp_exp and frexp_sig | Samuel Pitoiset | 2019-03-22 | 1 | -2/+2
    Only the exponent needs to be a 32-bit signed integer.

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Fix anonymous union initialization with older GCC. | Vinson Lee | 2019-03-22 | 1 | -4/+6
    Fix this build error with GCC 4.4.7.

        CC nir/nir_opt_copy_prop_vars.lo
        nir/nir_opt_copy_prop_vars.c: In function ‘load_element_from_ssa_entry_value’:
        nir/nir_opt_copy_prop_vars.c:454: error: unknown field ‘ssa’ specified in initializer
        nir/nir_opt_copy_prop_vars.c:455: error: unknown field ‘def’ specified in initializer
        nir/nir_opt_copy_prop_vars.c:456: error: unknown field ‘component’ specified in initializer
        nir/nir_opt_copy_prop_vars.c:456: error: extra brace group at end of initializer
        nir/nir_opt_copy_prop_vars.c:456: error: (near initialization for ‘(anonymous).<anonymous>’)
        nir/nir_opt_copy_prop_vars.c:456: warning: excess elements in union initializer
        nir/nir_opt_copy_prop_vars.c:456: warning: (near initialization for ‘(anonymous).<anonymous>’)

    Fixes: 96c32d77763c ("nir/copy_prop_vars: handle load/store of vector elements")
    Signed-off-by: Vinson Lee <[email protected]>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109810
    Reviewed-by: Andres Gomez <[email protected]>
    Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* glsl: Cross validate variable's invariance by explicit invariance only | Danylo Piliaiev | 2019-03-21 | 7 | -9/+30
    The 'invariant' qualifier is propagated to variables which are used to calculate other invariant variables. However, when we are matching variable declarations we should take into account only explicitly declared invariance, because invariance propagation is an implementation-specific detail.

    Thus a new flag is added to ir_variable_data which indicates that the 'invariant' qualifier was explicitly set in the shader.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100316
    Fixes: 89b60492 ('glsl: Add a pass to propagate the "invariant" and "precise" qualifiers')
    Signed-off-by: Danylo Piliaiev <[email protected]>
    Reviewed-by: Jordan Justen <[email protected]>
* nir: fix opt_if_loop_last_continue() | Timothy Arceri | 2019-03-22 | 1 | -10/+6
    Rather than skipping code that looked like this:

        loop {
            ...
            if (cond) {
                do_work_1();
                continue;
            } else {
                break;
            }
            do_work_2();
        }

    we would previously turn it into:

        loop {
            ...
            if (cond) {
                do_work_1();
                continue;
            } else {
                do_work_2();
                break;
            }
        }

    This was clearly wrong. This change checks for this case and makes sure we now leave it for nir_opt_dead_cf() to clean up.

    Reviewed-by: Ian Romanick <[email protected]>
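The behavioural difference between the two loop shapes can be shown in ordinary C. This is an illustrative sketch with hypothetical function names and counters standing in for do_work_1() and do_work_2(); it is not the NIR pass itself.

```c
/* Shape the pass must leave alone: both branches jump away, so the
 * trailing work_2 statement is dead code. */
int
original_loop(int iterations, int *work_1, int *work_2)
{
   int i = 0;
   for (;;) {
      if (i < iterations) {
         (*work_1)++;
         i++;
         continue;
      } else {
         break;
      }
      (*work_2)++;   /* unreachable in the original program */
   }
   return i;
}

/* What the buggy transformation produced: the dead statement is moved
 * into the else-branch, where it now runs once before the break. */
int
miscompiled_loop(int iterations, int *work_1, int *work_2)
{
   int i = 0;
   for (;;) {
      if (i < iterations) {
         (*work_1)++;
         i++;
         continue;
      } else {
         (*work_2)++;
         break;
      }
   }
   return i;
}
```

Calling both with the same iteration count yields identical `work_1` totals, but only the second one increments `work_2`, which is the observable miscompile.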
* nir: Record non-vector/scalar varyings as unmovable when compacting | Kenneth Graunke | 2019-03-21 | 1 | -1/+3
    In some cases, we can end up with varying structs that aren't split to their member variables. nir_compact_varyings attempted to record these as unmovable, so it would leave them be. Unfortunately, it didn't do it right for non-vector/scalar types. It set the mask to:

        ((1 << (elements * dmul)) - 1) << var->data.location_frac

    where elements is the number of vector elements. For structures and other non-vector/scalars, elements is 0...so the whole mask became 0. This caused nir_compact_varyings to assign other varyings on top of the structure varying's location (as it appeared to take up no space).

    To combat this, we just set elements to 4 for non-vector/scalar types, so that the entire slot gets marked as unmovable.

    Fixes KHR-GL45.tessellation_shader.tessellation_control_to_tessellation_evaluation.gl_in on iris.

    Reviewed-by: Timothy Arceri <[email protected]>
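The mask arithmetic quoted above is easy to reproduce. The sketch below uses hypothetical function names and plain parameters in place of the NIR variable fields; it only illustrates why `elements == 0` collapses the mask and how forcing `elements` to 4 covers the whole slot.

```c
/* Mask expression quoted in the commit message, with plain parameters. */
unsigned
slot_component_mask(unsigned elements, unsigned dmul, unsigned location_frac)
{
   return ((1u << (elements * dmul)) - 1u) << location_frac;
}

/* Sketch of the fix: treat anything that is not a vector or scalar as
 * occupying all four components of the slot. */
unsigned
unmovable_mask(int is_vector_or_scalar, unsigned elements, unsigned dmul,
               unsigned location_frac)
{
   if (!is_vector_or_scalar)
      elements = 4;
   return slot_component_mask(elements, dmul, location_frac);
}
```

With `elements == 0` the first function returns 0 (no components reserved); after the adjustment a struct with `dmul == 1` and `location_frac == 0` yields 0xF.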
* nir: move glsl_type_get_{sampler,image}_count() | Rob Clark | 2019-03-21 | 3 | -42/+45
    I need at least the sampler variant in ir3.

    Signed-off-by: Rob Clark <[email protected]>
    Reviewed-by: Timothy Arceri <[email protected]>
* nir: only override previous alu during loop analysis if supported | Timothy Arceri | 2019-03-21 | 1 | -2/+4
    Users of this function expect alu to be a supported comparison if the induction variable is not NULL. Since we attempt to override the return values if the first limit is not a const, we must make sure we are dealing with a valid comparison before overriding the alu instruction.

    Fixes an unreachable in inverse_comparison() with the game Assassin's Creed Odyssey.

    Fixes: 3235a942c16b ("nir: find induction/limit vars in iand instructions")
    Acked-by: Samuel Pitoiset <[email protected]>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110216
* spirv: Drop inline tg4 lowering | Jason Ekstrand | 2019-03-21 | 1 | -54/+30
    Reviewed-by: Karol Herbst <[email protected]>
* nir/lower_tex: Add support for tg4 offsets lowering | Karol Herbst | 2019-03-21 | 2 | -0/+62
    Signed-off-by: Karol Herbst <[email protected]>
* nir: add support for gather offsets | Karol Herbst | 2019-03-21 | 8 | -7/+66
    Values inside the offsets parameter of textureGatherOffsets are required to be constants in the range of [GL_MIN_PROGRAM_TEXTURE_GATHER_OFFSET, GL_MAX_PROGRAM_TEXTURE_GATHER_OFFSET].

    As this range is never outside [-32, 31] for all existing drivers inside mesa, we can simply store the offsets as a int8_t[4][2] array inside nir_tex_instr.

    Right now only Nvidia hardware supports this in hardware, so we can turn this on inside Nouveau for the NIR path as it is already enabled with the TGSI one.

    v2: use memcpy instead of for loops
        add missing bits to nir_instr_set
        don't show offsets if they are all 0
    v3: default offsets aren't all 0
    v4: rename offsets -> tg4_offsets
        rename nir_tex_instr_has_explicit_offsets -> nir_tex_instr_has_explicit_tg4_offsets

    Signed-off-by: Karol Herbst <[email protected]>
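The storage choice described above can be sketched as follows. `struct gather_tex` and `set_tg4_offsets` are hypothetical stand-ins, not the real `nir_tex_instr` API; only the `int8_t[4][2]` layout, the `tg4_offsets` field name, and the [-32, 31] range come from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the texture instruction: four (x, y) pairs. */
struct gather_tex {
   int8_t tg4_offsets[4][2];
};

/* Range-check the constant offsets, narrow them to int8_t, and copy the
 * whole table in one memcpy.  Returns false and leaves `tex` untouched
 * if any offset falls outside [-32, 31]. */
bool
set_tg4_offsets(struct gather_tex *tex, int offsets[4][2])
{
   int8_t packed[4][2];

   for (int i = 0; i < 4; i++) {
      for (int j = 0; j < 2; j++) {
         if (offsets[i][j] < -32 || offsets[i][j] > 31)
            return false;
         packed[i][j] = (int8_t)offsets[i][j];
      }
   }

   memcpy(tex->tg4_offsets, packed, sizeof(packed));
   return true;
}
```

Since every legal value fits in a signed byte, the table costs only eight bytes per instruction.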
* nir/deref: remove casts of casts which are likely redundant (v3) | Dave Airlie | 2019-03-21 | 1 | -2/+26
    Not sure how ptr_stride should be taken into account if at all here.

    v2: reorder check to avoid src walking (Jason)
    v3: remove is_cast_cast checks, keep going afterwards (Jason)

    Reviewed-by: Jason Ekstrand <[email protected]>
* nir/spirv: don't use bare types, remove assert in split vars for testing | Dave Airlie | 2019-03-21 | 2 | -4/+3
    For OpenCL we never want to strip the info from the types, and it makes type comparisons easier in later stages.

    We might later need a nir pass to strip this for GLSL, but so far the only regression is the assert and Jason said removing that is fine.

    Reviewed-by: Jason Ekstrand <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* nir: deref only for OpTypePointer | Juan A. Suarez Romero | 2019-03-20 | 1 | -8/+14
    Fixes dEQP-VK.binding_model.buffer_device_address.* and dEQP-VK.ssbo.phys.layout* Vulkan CTS tests.

    v2: set val->type->stride in the section below (Jason)
    v3: restore val->type->type to original place (Jason)

    Fixes: d0ba326f238 ("nir/spirv: support physical pointers")
    CC: Karol Herbst <[email protected]>
    CC: Jason Ekstrand <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Constant values are per-column not per-component | Jason Ekstrand | 2019-03-20 | 1 | -1/+2
    Reviewed-by: Lionel Landwerlin <[email protected]>
    Reviewed-by: Karol Herbst <[email protected]>
* Revert "glsl: relax input->output validation for SSO programs" | Andres Gomez | 2019-03-19 | 1 | -1/+1
    This reverts commit 1aa5738e666a9534c7e5b46f077327e6d647c64f.

    This patch incorrectly assumed that for SSOs no inner interface matching check was needed.

    From the ARB_separate_shader_objects spec v.25:

      " With separable program objects, interfaces between shader stages may involve the outputs from one program object and the inputs from a second program object. For such interfaces, it is not possible to detect mismatches at link time, because the programs are linked separately. When each such program is linked, all inputs or outputs interfacing with another program stage are treated as active. The linker will generate an executable that assumes the presence of a compatible program on the other side of the interface. If a mismatch between programs occurs, no GL error will be generated, but some or all of the inputs on the interface will be undefined."

    This completes the fix from commit 3be05dd2679 ("glsl/linker: don't fail non static used inputs without matching outputs").

    Fixes: 1aa5738e666 ("glsl: relax input->output validation for SSO programs")
    Cc: Tapani Pälli <[email protected]>
    Cc: Timothy Arceri <[email protected]>
    Cc: Ilia Mirkin <[email protected]>
    Cc: Samuel Iglesias Gonsálvez <[email protected]>
    Cc: Ian Romanick <[email protected]>
    Signed-off-by: Andres Gomez <[email protected]>
    Reviewed-by: Timothy Arceri <[email protected]>
* glsl/linker: simplify xfb_offset vs xfb_stride overflow check | Andres Gomez | 2019-03-19 | 1 | -2/+1
    The current implementation uses a complicated calculation which relies on an implicit conversion to check the integral part of 2 division results.

    However, the calculation actually checks that the xfb_offset is smaller than or a multiple of the xfb_stride. For example, while this is expected to fail, it actually succeeds:

      "
        ...
        layout(xfb_buffer = 2, xfb_stride = 12) out block3 {
          layout(xfb_offset = 0) vec3 c;
          layout(xfb_offset = 12) vec3 d; // ERROR, requires stride of 24
        };
        ...
      "

    Fixes: 2fab85aaea5 ("glsl: add xfb_stride link time validation")
    Cc: Timothy Arceri <[email protected]>
    Signed-off-by: Andres Gomez <[email protected]>
    Reviewed-by: Timothy Arceri <[email protected]>
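One straightforward form of such an overflow check is sketched below. The log does not show the actual replacement expression, so this is an assumption: the function name and the byte-based parameters are hypothetical, and the rule encoded is simply that a member must end within the declared stride.

```c
#include <stdbool.h>

/* Returns true when a member of `size` bytes placed at byte offset
 * `xfb_offset` would extend past the declared `xfb_stride`.
 * Hypothetical helper; the real linker code may differ. */
bool
xfb_member_overflows_stride(unsigned xfb_offset, unsigned size,
                            unsigned xfb_stride)
{
   return xfb_offset + size > xfb_stride;
}
```

For the shader in the commit message, a vec3 is 12 bytes: member `c` at offset 0 fits in a stride of 12, while member `d` at offset 12 needs a stride of 24 and is flagged.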
* glsl/linker: don't fail non static used inputs without matching outputs | Andres Gomez | 2019-03-19 | 1 | -2/+14
    If there is no Static Use of an input variable, the linker shouldn't fail whenever there is no defined matching output variable in the previous stage.

    From page 47 (page 51 of the PDF) of the GLSL 4.60 v.5 spec:

      " Only the input variables that are statically read need to be written by the previous stage; it is allowed to have superfluous declarations of input variables."

    Now, we complete this exception whenever the input variable has an explicit location. Previously, 18004c338f6 ("glsl: fail when a shader's input var has not an equivalent out var in previous") took care of the cases in which the input variable didn't have an explicit location.

    v2: do the location based interface matching check regardless of whether it is a separable program or not (Ilia).

    Fixes: 1aa5738e666 ("glsl: relax input->output validation for SSO programs")
    Cc: Timothy Arceri <[email protected]>
    Cc: Iago Toral Quiroga <[email protected]>
    Cc: Samuel Iglesias Gonsálvez <[email protected]>
    Cc: Tapani Pälli <[email protected]>
    Cc: Ian Romanick <[email protected]>
    Cc: Ilia Mirkin <[email protected]>
    Signed-off-by: Andres Gomez <[email protected]>
    Reviewed-by: Timothy Arceri <[email protected]>
* glsl/linker: always validate explicit location among inputs | Andres Gomez | 2019-03-19 | 1 | -3/+10
    Outputs are always validated when having explicit locations, and we were trusting the outcome of that validation to catch similar problems with the inputs since, in case of having undefined outputs for existing inputs, we would already be reporting a linker error.

    However, consider this case:

      " Shader stage n:
        ---------------
        ...
        layout(location = 0) out float a;
        ...

        Shader stage n+1:
        -----------------
        ...
        layout(location = 0) in float b;
        layout(location = 0) in float c;
        ...
      "

    Currently, this won't report a linker error even though location aliasing is happening for the inputs.

    Therefore, we also need to validate the inputs independently from the outcome of the outputs validation.

    Cc: Timothy Arceri <[email protected]>
    Cc: Iago Toral Quiroga <[email protected]>
    Cc: Ilia Mirkin <[email protected]>
    Signed-off-by: Andres Gomez <[email protected]>
    Reviewed-by: Timothy Arceri <[email protected]>
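An independent check on the inputs can be sketched as a duplicate-location scan. This is a deliberately simplified illustration with a hypothetical function name: it treats every input as occupying exactly one whole location and ignores the `component` qualifier and multi-slot types that a real linker must also consider.

```c
#include <stdbool.h>

/* Returns true if two inputs of one stage declare the same explicit
 * location.  Simplifying assumption: one input occupies one location. */
bool
inputs_alias_location(const unsigned *locations, int count)
{
   for (int i = 0; i < count; i++) {
      for (int j = i + 1; j < count; j++) {
         if (locations[i] == locations[j])
            return true;
      }
   }
   return false;
}
```

For stage n+1 in the commit message, inputs `b` and `c` both use location 0, so the scan reports aliasing regardless of what the previous stage's outputs look like.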