path: root/src/compiler
Commit message | Author | Age | Files | Lines
* glsl: Don't increase the iteration count when there are no terminators | Ian Romanick | 2019-06-24 | 1 | -1/+7
  Incrementing the iteration count was intended to fix an off-by-one error when the first terminator was superseded by a later terminator. If there is no first terminator or later terminator, there is no off-by-one error. Incrementing the loop count creates one. This can be seen in loops like:

      do {
         if (something) {
            // No breaks or continues here.
         }
      } while (false);

  Reviewed-by: Timothy Arceri
  Tested-by: Abel Briggs
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110953
  Fixes: 646621c66da ("glsl: make loop unrolling more like the nir unrolling path")
* glsl/nir: Fix copying 64-bit values in uniform storage | Caio Marcelo de Oliveira Filho | 2019-06-24 | 1 | -1/+1
  The iterator `i` already walks the right amount now that it is incremented by `dmul`, so there is no need to `* 2`.

  Fixes invalid memory access in upcoming ARB_gl_spirv tests.

  Failure bisected by Arcady Goldmints-Orlov.

  Fixes: b019fe8a5b6 "glsl/nir: Fix handling of 64-bit values in uniform storage"
  Reviewed-by: Jason Ekstrand
* glsl/nir: Fix copying vector constant values | Caio Marcelo de Oliveira Filho | 2019-06-24 | 1 | -1/+1
  For n_columns == 1, we have a vector which is handled by the else case.

  Fixes invalid memory access in upcoming ARB_gl_spirv tests.

  Failure bisected by Arcady Goldmints-Orlov.

  Fixes: 81e51b412e9 "nir: Make nir_constant a vector rather than a matrix"
  Reviewed-by: Jason Ekstrand
* nir: introduce lowering of bitfield_insert to bfm and a new opcode bitfield_select. | Daniel Schürmann | 2019-06-24 | 3 | -0/+11
  bitfield_select is defined as:

      bitfield_select(mask, base, insert) = (mask & base) | (~mask & insert)

  matching the behavior of AMD's BFI instruction.

  Reviewed-by: Connor Abbott
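The definition above can be written out as a minimal C sketch. The function name mirrors the opcode, but this is only an illustration of the formula quoted in the message, not the NIR implementation; the 32-bit operand width is an assumption.

```c
#include <stdint.h>

/* Illustrative only: for each bit, take `base` where the mask bit is set
 * and `insert` where it is clear, as in the formula from the message. */
static uint32_t bitfield_select(uint32_t mask, uint32_t base, uint32_t insert)
{
    return (mask & base) | (~mask & insert);
}
```

With an all-ones mask the result is `base`, and with an all-zeros mask it is `insert`.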
* nir/algebraic: Use unsigned comparison when lowering bitfield insert/extract | Daniel Schürmann | 2019-06-24 | 1 | -2/+2
  This lets us use the optimization pattern (('ult', 31, ('iand', b, 31)), False) to remove the bcsel instruction for code originating in D3D shaders.

  Reviewed-by: Connor Abbott
* nir/algebraic: Remove unnecessary iand of [iu]bfe and bfm sources | Daniel Schürmann | 2019-06-24 | 1 | -0/+8
  The [iu]bfe and bfm instructions are defined to only use the five least significant bits. This optimizes a common pattern from D3D -> SPIR-V translation.

  Reviewed-by: Connor Abbott
* nir: define behavior of nir_op_bfm and nir_op_u/ibfe according to SM5 spec. | Daniel Schürmann | 2019-06-24 | 3 | -31/+18
  That is: the five least significant bits provide the values of 'bits' and 'offset', which is the case for all hardware currently supported by NIR and using the bfm/bfe instructions.

  This patch also changes the lowering of bitfield_insert/extract using shifts to not use bfm and removes the flag 'lower_bfm'.

  Tested-by: Eric Anholt
  Reviewed-by: Connor Abbott
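To make the "five least significant bits" rule concrete, here is a hedged C sketch of a mask builder and an unsigned extract that ignore everything above bit 4 of their 'bits' and 'offset' operands. The function names follow the opcodes, but the bodies are an assumption written from the description above (including returning 0 when 'bits' masks to zero), not a copy of the NIR opcode definitions.

```c
#include <stdint.h>

/* Sketch: mask of `bits` ones starting at bit `offset`.
 * Only the low five bits of each operand are used. */
static uint32_t bfm(uint32_t bits, uint32_t offset)
{
    bits &= 31;
    offset &= 31;
    return ((UINT32_C(1) << bits) - 1) << offset;
}

/* Sketch: extract `bits` bits of `base` starting at `offset`, zero-extended.
 * Only the low five bits of `offset` and `bits` are used. */
static uint32_t ubfe(uint32_t base, uint32_t offset, uint32_t bits)
{
    offset &= 31;
    bits &= 31;
    if (bits == 0)
        return 0;
    /* bits is in 1..31 here, so the shift below is well defined. */
    return (base >> offset) & ((UINT32_C(1) << bits) - 1);
}
```

Under these assumptions, masking an operand with 31 before the call changes nothing, for example `ubfe(x, off & 31, n & 31)` equals `ubfe(x, off, n)`, which is why an explicit `iand` on those sources is redundant.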
* nir/algebraic: add optimization pattern for ('ult', a, ('and', b, a)) and friends. | Daniel Schürmann | 2019-06-24 | 1 | -0/+4
  These optimizations are based on the fact that 'and(a,b) <= umin(a,b)'.

  For AMD, this series moves the optimization from LLVM to NIR, so currently no vkpipeline-db changes here.

  Reviewed-by: Ian Romanick
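The inequality holds because a bitwise AND can only clear bits, and clearing bits never increases an unsigned value. The C sketch below checks this exhaustively for 8-bit operands, along with the consequence that `a < (b & a)` is never true. The helper names are hypothetical and chosen only for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

static uint32_t umin(uint32_t a, uint32_t b)
{
    return a < b ? a : b;
}

/* The matched expression ('ult', a, ('and', b, a)), i.e. a < (b & a). */
static bool ult_of_and(uint32_t a, uint32_t b)
{
    return a < (b & a);
}

/* Exhaustively check both claims for all pairs of 8-bit operands. */
static bool holds_for_all_bytes(void)
{
    for (uint32_t a = 0; a < 256; a++) {
        for (uint32_t b = 0; b < 256; b++) {
            if ((a & b) > umin(a, b))
                return false;
            if (ult_of_and(a, b))
                return false;
        }
    }
    return true;
}
```

The same argument covers the pattern (('ult', 31, ('iand', b, 31)), False) from the log: `b & 31` can never exceed 31.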
* nir/lower_tex: Add an assert() in nir_lower_txs_lod() | Boris Brezillon | 2019-06-20 | 1 | -0/+1
  We don't expect the output of a TXS instruction to be wider than a vec3. Add an assert() to make sure this never happens.

  Suggested-by: Jason Ekstrand
  Signed-off-by: Boris Brezillon
  Reviewed-by: Alyssa Rosenzweig
* spirv: Restrict use of descriptor intrinsics to Vulkan | Caio Marcelo de Oliveira Filho | 2019-06-19 | 1 | -1/+8
  In ARB_gl_spirv we'll be able to use variables for uniform buffers, so don't use the descriptor intrinsics to lower the block access.

  Reviewed-by: Jason Ekstrand
* nir: Make nir_constant a vector rather than a matrix | Jason Ekstrand | 2019-06-19 | 8 | -132/+136
  Most places in NIR, we treat matrices like arrays. The one annoying exception to this has been nir_constant where a matrix is a first-class thing. This commit changes that so a matrix nir_constant is the same as an array nir_constant. This makes matrix nir_constants a tiny bit more expensive but shrinks all others by 96B.

  Reviewed-by: Karol Herbst
* glsl/nir: Fix handling of 64-bit values in uniform storage | Jason Ekstrand | 2019-06-19 | 1 | -1/+2
  Reviewed-by: Karol Herbst
* spirv: Only copy needed components for OpSpecConstantOp | Jason Ekstrand | 2019-06-19 | 1 | -1/+6
  Reviewed-by: Karol Herbst
* spirv: Use a single path for OpSpecConstantOp of OpVectorShuffle | Jason Ekstrand | 2019-06-19 | 1 | -37/+19
  Now that nir_const_value is a scalar, there's no reason why we need multiple paths here and it's just extra paths to keep working. While we're here, we also add a vtn_fail_if check that component indices are in-bounds.

  Reviewed-by: Karol Herbst
* spirv: Use vtn_constant_uint() for array lengths and gather components | Jason Ekstrand | 2019-06-19 | 1 | -4/+2
  Reviewed-by: Karol Herbst
* spirv: Add a vtn_constant_int helper | Jason Ekstrand | 2019-06-19 | 2 | -17/+19
  Reviewed-by: Karol Herbst
* glsl/types: Add a real is_integer helper | Jason Ekstrand | 2019-06-19 | 3 | -2/+10
  Reviewed-by: Karol Herbst
* glsl/types: Rename is_integer to is_integer_32 | Jason Ekstrand | 2019-06-19 | 13 | -31/+32
  It only accepts 32-bit integers so it should have a more descriptive name. This patch should not be a functional change.

  Reviewed-by: Karol Herbst
* glsl/types: Ignore bit sizes in contains_integer() | Jason Ekstrand | 2019-06-19 | 1 | -1/+1
  All of the callers for this function are looking at interpolation qualifiers and want to make sure they're declared flat. Any 64-bit integer inputs need to be flat. It also makes the function make more sense since "integer" is fairly generic.

  Reviewed-by: Karol Herbst
* glsl/types: Handle all bit sizes in glsl_type_is_integer | Jason Ekstrand | 2019-06-19 | 1 | -1/+1
  All of the callers of this function really just want to know if the type is an integer and don't care about bit size.

  Reviewed-by: Karol Herbst
* glsl/nir_opt_access: Update uniforms correctly when only vars change | Caio Marcelo de Oliveira Filho | 2019-06-19 | 1 | -1/+13
  Even if only variables' access flags are changed, the existing NIR infrastructure expects metadata to be explicitly preserved, so do that. We don't bother avoiding a second call to preserve, since the cost is negligible.

  This scenario can be triggered by dead variables, and also by other intrinsics that read the variables but do not cause progress to be made when processing the intrinsics.

  Fixes: f2d0e48ddc7 "glsl/nir: Add optimization pass for access flags"
  Reviewed-by: Kenneth Graunke
* glsl/nir: Fix getting the sampler dim when arrays are involved | Caio Marcelo de Oliveira Filho | 2019-06-19 | 1 | -1/+2
  Unwrap any array in the variable type so we can get the sampler dim.

  This fixes piglit test spec/arb_arrays_of_arrays/execution/image_store/basic-imageStore-const-uniform-index.shader_test.

  Fixes: f2d0e48ddc7 "glsl/nir: Add optimization pass for access flags"
  Reviewed-by: Kenneth Graunke
* nir: Use reorderable access flag | Connor Abbott | 2019-06-19 | 1 | -4/+12
  No changes with radeonsi shader-db.

  Reviewed-by: Timothy Arceri
* nir: Add a helper to determine if an intrinsic can be reordered | Connor Abbott | 2019-06-19 | 3 | -11/+13
  This is simple now, but we're going to be adding a few more conditions to this later.

  Reviewed-by: Timothy Arceri
* glsl/nir: Add optimization pass for access flags | Connor Abbott | 2019-06-19 | 4 | -0/+324
  Right now, this just deduces when we can arbitrarily reorder SSBO and image loads, matching the existing logic in radeonsi's TGSI->LLVM pass. This approach can't handle some things that nir_opt_copy_prop_vars can, but it can handle images, and with GCM it lets us hoist reads outside of loops. We can also pass this information to LLVM which lets it do its own optimizations on it.

  This is GLSL only as I haven't tested it on Vulkan yet, and it would probably need a few changes to work there.

  Reviewed-by: Timothy Arceri
* nir: Add reorderable memory access enum | Connor Abbott | 2019-06-19 | 2 | -1/+10
  Reviewed-by: Timothy Arceri
* nir/copy_prop_vars: Ignore volatile accesses | Connor Abbott | 2019-06-19 | 1 | -0/+13
  The spec explicitly says that volatile writes can't be removed and volatile reads do not guarantee that the same value will still be around after the read, as if there were a barrier after each read/write. Just ignore them.

  Reviewed-by: Timothy Arceri
* glsl/nir: Propagate access qualifiers | Connor Abbott | 2019-06-19 | 2 | -6/+59
  We were completely ignoring these before, except for putting them on variables. While we're here, don't set access qualifiers when converting to bindless since glsl_to_nir will already have set a more accurate qualifier that includes any qualifiers on struct members that are dereferenced.

  Reviewed-by: Timothy Arceri
* nir: Allow qualifiers on copy_deref and image instructions | Connor Abbott | 2019-06-19 | 6 | -12/+48
  In the next commit, we'll properly handle access qualifiers on struct members by propagating them to load/store instructions, but these instructions had no way to specify the qualifier.

  Reviewed-by: Timothy Arceri
* nir: add a vectorization pass | Connor Abbott | 2019-06-18 | 4 | -0/+457
  This effectively does the opposite of nir_lower_alus_to_scalar, trying to combine per-component ALU operations with the same sources but different swizzles into one larger ALU operation. It uses a similar model as CSE, where we do a depth-first approach and keep around a hash set of instructions to be combined, but there are a few major differences:

  1. For now, we only support entirely per-component ALU operations.
  2. Since it's not always guaranteed that we'll be able to combine equivalent instructions, we keep a stack of equivalent instructions around, trying to combine new instructions with instructions on the stack.

  The pass isn't comprehensive by far; it can't handle operations where some of the sources are per-component and others aren't, and it can't handle phi nodes. But it should handle the more common cases, and it should be reasonably efficient.

  [Alyssa: Rebase on latest master, updating with respect to typeless moves]

  Acked-by: Alyssa Rosenzweig
  Acked-by: Jason Ekstrand
* nir/lower_tex: Add a way to lower TXS(non-0-LOD) instructions | Boris Brezillon | 2019-06-18 | 2 | -0/+52
  The V3D driver has an open-coded solution for this, and we need the same thing for Panfrost, so let's add a generic way to lower TXS(LOD) into max(TXS(0) >> LOD, 1).

  Changes in v2:
  * Use == 0 instead of !
  * Rework the minification logic as suggested by Jason
  * Assign cursor pos at the beginning of the function
  * Patch the LOD just after retrieving the old value

  Signed-off-by: Boris Brezillon
  Reviewed-by: Alyssa Rosenzweig
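The lowering formula max(TXS(0) >> LOD, 1) can be illustrated for a single size component with a short C sketch. The function name `minify` and the scalar treatment are assumptions made for this example; the actual pass operates on NIR texture instructions, and this sketch assumes the LOD is below 32 so the shift is well defined.

```c
#include <stdint.h>

/* Sketch: size of one texture dimension at mip level `lod`, computed from
 * the level-0 size. Each level halves the size, clamped to at least 1.
 * Assumes lod < 32. */
static uint32_t minify(uint32_t base_size, uint32_t lod)
{
    uint32_t size = base_size >> lod;
    return size > 1 ? size : 1;
}
```

For a 256-texel dimension, level 3 gives 32 texels; once the shift reaches zero the clamp keeps the result at 1.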
* nir/lower_tex: Update ->sampler_dim value before calling get_texture_size() | Boris Brezillon | 2019-06-18 | 1 | -2/+5
  get_texture_size() will create a txs instruction with ->sampler_dim set to the original tex->sampler_dim. The condition to call lower_rect() only checks the value of ->sampler_dim and whether lower_rect is requested or not. This leads to an infinite loop when calling nir_lower_tex() with the same options until it returns false.

  In order to avoid that, let's move the tex->sampler_dim patching before get_texture_size() is called. This way the txs instruction will have ->sampler_dim set to GLSL_SAMPLER_DIM_2D and nir_lower_tex() won't try to lower it on the subsequent passes.

  Changes in v2:
  * Add Jason R-b
  * Add a comment explaining why we patch ->sampler_dim at the beginning of the lower_rect() func

  Signed-off-by: Boris Brezillon
  Reviewed-by: Jason Ekstrand
  Reviewed-by: Alyssa Rosenzweig
* nir/lower_tex: Actually report when projector lowering happened | Boris Brezillon | 2019-06-18 | 1 | -4/+4
  The code considers that projector lowering was done even if it's not really the case. Change the project_src() prototype to return a bool encoding whether projector lowering happened or not and update the progress var accordingly in nir_lower_tex_block().

  ---
  Changes in v2:
  * Add Jason R-b
  * Drop the part suggesting that nir_lower_rect() could be called in a do-while(progress) loop.

  Signed-off-by: Boris Brezillon
  Reviewed-by: Jason Ekstrand
  Reviewed-by: Alyssa Rosenzweig
* glsl: Fix out of bounds read in shader_cache_read_program_metadata | Kenneth Graunke | 2019-06-17 | 1 | -3/+2
  The VaryingNames array has NumVaryings entries. But BufferStride is a small array of MAX_FEEDBACK_BUFFERS (4) entries. Programs with more than 4 varyings would read out of bounds.

  Also, BufferStride is set based on the shader itself, which means that it's inherently already included in the hash, and doesn't need to be included again. At the point when shader_cache_read_program_metadata is called, the linker hasn't even set those fields yet. So, just drop it entirely.

  Fixes valgrind errors in KHR-GL45.transform_feedback.linking_errors_test.

  Fixes: 6d830940f78 glsl/shader_cache: Allow shader cache usage with transform feedback
  Reviewed-by: Timothy Arceri
* glsl: Set default precision on record members | Neil Roberts | 2019-06-14 | 1 | -1/+10
  Record types have their own slot to store the precision for each member in glsl_struct_field. Previously if the member didn’t have an explicit precision qualifier this was being left as GLSL_PRECISION_NONE. This patch makes it take into account the type’s default precision qualifier like it does for regular variables in apply_type_qualifier_to_variable.

  This has the additional benefit of correctly reporting an error when a float type is used in a struct without declaring the default type.

  Reviewed-by: Eric Anholt
* glsl/linker: Make precision matching optional in intrastage_match | Neil Roberts | 2019-06-14 | 3 | -8/+24
  Confusingly, this function is used to match interstage interfaces as well as intrastage ones. In the interstage case it needs to avoid comparing the precisions. This patch adds a parameter to specify whether to take the precision into account or not so that it can be used for both cases.

  Reviewed-by: Eric Anholt
* glsl/linker: Don’t check precision for shader interface | Neil Roberts | 2019-06-14 | 1 | -2/+5
  On GLES, the interface between vertex and fragment shaders doesn’t need to have matching precision.

  Section 4.3.10 of the GLSL ES 3.00 spec:

  “The type of vertex outputs and fragment inputs with the same name must match, otherwise the link command will fail. The precision does not need to match.”

  Reviewed-by: Eric Anholt
* compiler/types: Make comparing record precision optional | Neil Roberts | 2019-06-14 | 2 | -5/+53
  On GLES, the interface between vertex and fragment shaders doesn’t need to have matching precision. This adds an extra argument to glsl_types::record_compare to disable the precision comparison. This will later be used for the shader interface check.

  In order to make this work, this patch also adds a helper function to recursively compare types while ignoring the precision.

  v2: Call record_compare from within compare_no_precision to avoid duplicating code (Eric Anholt).

  Reviewed-by: Eric Anholt
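The idea of a recursive comparison with an optional precision check can be sketched in C as follows. Every type, field, and function name here is hypothetical and much simpler than the real glsl_type machinery; the sketch only shows how a single flag can switch the precision check on or off at every nesting level.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the compiler's type structures. */
enum precision { PRECISION_NONE, PRECISION_HIGH, PRECISION_MEDIUM, PRECISION_LOW };
enum base_kind { KIND_FLOAT, KIND_INT, KIND_STRUCT };

struct type;

struct field {
    const char *name;
    const struct type *type;
    enum precision precision;
};

struct type {
    enum base_kind kind;
    size_t num_fields;          /* only meaningful for KIND_STRUCT */
    const struct field *fields; /* only meaningful for KIND_STRUCT */
};

/* Compare two types member by member; precision is compared only when
 * match_precision is true, and the flag is passed down to nested structs. */
static bool types_match(const struct type *a, const struct type *b,
                        bool match_precision)
{
    if (a->kind != b->kind)
        return false;
    if (a->kind != KIND_STRUCT)
        return true;
    if (a->num_fields != b->num_fields)
        return false;

    for (size_t i = 0; i < a->num_fields; i++) {
        const struct field *fa = &a->fields[i];
        const struct field *fb = &b->fields[i];

        if (strcmp(fa->name, fb->name) != 0)
            return false;
        if (match_precision && fa->precision != fb->precision)
            return false;
        if (!types_match(fa->type, fb->type, match_precision))
            return false;
    }
    return true;
}
```

Two structs whose members differ only in precision compare unequal with `match_precision` set and equal without it, which is the behavior an interstage interface check on GLES would want.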
* nir: detect more dynamically uniform expressions | Iago Toral Quiroga | 2019-06-14 | 1 | -0/+13
  Shader-db results for v3d:

  total instructions in shared programs: 9132728 -> 9119238 (-0.15%)
  instructions in affected programs: 596886 -> 583396 (-2.26%)
  helped: 1118
  HURT: 224

  total threads in shared programs: 234298 -> 234308 (<.01%)
  threads in affected programs: 10 -> 20 (100.00%)
  helped: 5
  HURT: 0

  total uniforms in shared programs: 3022949 -> 3022622 (-0.01%)
  uniforms in affected programs: 29163 -> 28836 (-1.12%)
  helped: 108
  HURT: 37

  total max-temps in shared programs: 1328030 -> 1327762 (-0.02%)
  max-temps in affected programs: 10097 -> 9829 (-2.65%)
  helped: 263
  HURT: 15

  total spills in shared programs: 3793 -> 3777 (-0.42%)
  spills in affected programs: 432 -> 416 (-3.70%)
  helped: 16
  HURT: 0

  total fills in shared programs: 4380 -> 4266 (-2.60%)
  fills in affected programs: 828 -> 714 (-13.77%)
  helped: 16
  HURT: 0

  Reviewed-by: Eric Anholt
* nir: Don't manually index intrinsic index enum | Connor Abbott | 2019-06-13 | 1 | -20/+20
  This fixes a rebase fail in ea51275e07b, and prevents it from happening again. There's no reason to do this manually.

  Reviewed-by: Jason Ekstrand
* spirv/nir: add support for AMD_shader_ballot and Groups capability | Daniel Schürmann | 2019-06-13 | 5 | -9/+136
  This commit also renames existing AMD capabilities:
  - gcn_shader -> amd_gcn_shader
  - trinary_minmax -> amd_trinary_minmax

  Reviewed-by: Connor Abbott
* nir: add intrinsics for AMD_shader_ballot | Daniel Schürmann | 2019-06-13 | 3 | -0/+31
  Reviewed-by: Connor Abbott
* nir/spirv: add support for the SubgroupBallotKHR SPIR-V capability | Daniel Schürmann | 2019-06-13 | 2 | -7/+13
  This capability is required for the VK_EXT_shader_subgroup_ballot extension.

  Reviewed-by: Connor Abbott
* nir/spirv: add support for the SubgroupVoteKHR SPIR-V capability | Daniel Schürmann | 2019-06-13 | 2 | -4/+20
  This capability is required for the VK_EXT_shader_subgroup_vote extension.

  Reviewed-by: Connor Abbott
* glsl: Check order and uniqueness of interlock functions | Caio Marcelo de Oliveira Filho | 2019-06-10 | 4 | -4/+35
  With this commit all remaining compilation tests in Piglit for ARB_fragment_shader_interlock will pass.

  Reviewed-by: Tapani Pälli
  Reviewed-by: Plamena Manolova
* glsl: Make interlock builtins follow same compiler rules as barriers | Caio Marcelo de Oliveira Filho | 2019-06-10 | 1 | -5/+10
  Generalize the barrier code to provide correct error messages for other builtins.

  Fixes most of the piglit compilation tests for ARB_fragment_shader_interlock.

  Reviewed-by: Tapani Pälli
  Reviewed-by: Plamena Manolova
* nir/opt_algebraic: Fix rules for imadsh_mix16 | Eduardo Lima Mitev | 2019-06-10 | 1 | -2/+2
  The rules added in patch 3addd7c are inverted: It should be:

      (al * bh) << 16 + c

  instead of:

      (ah * bl) << 16 + c

  Fixes a number of regressions under dEQP-GLES31.functional.draw_indirect.compute_interop.large.* on Freedreno.

  Reviewed-by: Rob Clark
* nir: fix s/&&/||/ typo | Eric Engestrom | 2019-06-07 | 1 | -1/+1
  Fixes: cd73b6174b093b75f581 "nir/lower_to_source_mods: Stop turning add, sat, and neg into mov"
  Signed-off-by: Eric Engestrom
  Reviewed-by: Jason Ekstrand
* nir_algebraic: Add basic optimizations for umul_low and imadsh_mix16 | Eduardo Lima Mitev | 2019-06-07 | 2 | -0/+55
  For umul_low (al * bl), zero is returned if the low 16-bit word of either source is zero.

  For imadsh_mix16 (ah * bl << 16 + c), c is returned if either 'ah' or 'bl' is zero.

  A couple of nir_search_helpers are added:

  is_upper_half_zero() returns true if the highest word of all components of an integer NIR ALU src is zero.

  is_lower_half_zero() returns true if the lowest word of all components of an integer NIR ALU src is zero.

  Reviewed-by: Eric Anholt
* nir/opcodes: Add new 'umul_low' and 'imadsh_mix16' opcodes | Eduardo Lima Mitev | 2019-06-07 | 1 | -1/+14
  'umul_low' is the low 32 bits of unsigned integer multiply. It maps directly to ir3's MULL_U.

  'imadsh_mix16' is multiply add with shift and mix, an ir3-specific instruction that maps directly to ir3's IMADSH_M16.

  Both are necessary for the lowering of integer multiplication on Freedreno, which will be introduced later in this series.

  Reviewed-by: Eric Anholt
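To show why these two operations are enough to build a full 32-bit multiply, here is a hedged C sketch. It assumes the semantics given in the optimization message in this log, namely umul_low = al * bl and imadsh_mix16 = ((ah * bl) << 16) + c, where 'l' and 'h' denote the low and high 16-bit halves. It uses unsigned arithmetic throughout to keep wraparound well defined in C. The composition in `mul32` is an illustration of the arithmetic identity, not the lowering code from the series.

```c
#include <stdint.h>

/* Assumed semantics: product of the low 16-bit halves. */
static uint32_t umul_low(uint32_t a, uint32_t b)
{
    return (a & 0xffff) * (b & 0xffff);
}

/* Assumed semantics: (high half of a * low half of b) << 16, plus c. */
static uint32_t imadsh_mix16(uint32_t a, uint32_t b, uint32_t c)
{
    return (((a >> 16) * (b & 0xffff)) << 16) + c;
}

/* a * b mod 2^32 = al*bl + ((ah*bl) << 16) + ((bh*al) << 16).
 * The ah*bh term is shifted out of the low 32 bits entirely. */
static uint32_t mul32(uint32_t a, uint32_t b)
{
    uint32_t low = umul_low(a, b);
    uint32_t mid = imadsh_mix16(a, b, low);
    return imadsh_mix16(b, a, mid);
}
```

The same sketch makes the basic optimizations visible: `umul_low` is zero whenever either low half is zero, and `imadsh_mix16` reduces to `c` whenever the high half of its first operand or the low half of its second is zero.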