aboutsummaryrefslogtreecommitdiffstats
path: root/src/compiler
Commit message (Collapse)AuthorAgeFilesLines
* nir/lower_tex: Add an assert() in nir_lower_txs_lod()Boris Brezillon2019-06-201-0/+1
| | | | | | | | | We don't expect the output of a TXS instruction to be wider than a vec3. Add an assert() to make sure this never happens. Suggested-by: Jason Ekstrand <[email protected]> Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* spirv: Restrict use of descriptor intrinsics to VulkanCaio Marcelo de Oliveira Filho2019-06-191-1/+8
| | | | | | | In ARB_gl_spirv we'll be able to use variables for uniform buffers, so don't use the descriptor intrinsics to lower the block access. Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Make nir_constant a vector rather than a matrixJason Ekstrand2019-06-198-132/+136
| | | | | | | | | | Most places in NIR, we treat matrices like arrays. The one annoying exception to this has been nir_constant where a matrix is a first-class thing. This commit changes that so a matrix nir_constant is the same as an array nir_constant. This makes matrix nir_constants a tiny bit more expensive but shrinks all others by 96B. Reviewed-by: Karol Herbst <[email protected]>
* glsl/nir: Fix handling of 64-bit values in uniform storageJason Ekstrand2019-06-191-1/+2
| | | | Reviewed-by: Karol Herbst <[email protected]>
* spirv: Only copy needed components for OpSpecConstantOpJason Ekstrand2019-06-191-1/+6
| | | | Reviewed-by: Karol Herbst <[email protected]>
* spirv: Use a single path for OpSpecConstantOp of OpVectorShuffleJason Ekstrand2019-06-191-37/+19
| | | | | | | | | Now that nir_const_value is a scalar, there's no reason why we need multiple paths here and it's just extra paths to keep working. While we're here, we also add a vtn_fail_if check that component indices are in-bounds. Reviewed-by: Karol Herbst <[email protected]>
* spirv: Use vtn_constan_uint() for array lengths and gather componentsJason Ekstrand2019-06-191-4/+2
| | | | Reviewed-by: Karol Herbst <[email protected]>
* spirv: Add a vtn_constant_int helperJason Ekstrand2019-06-192-17/+19
| | | | Reviewed-by: Karol Herbst <[email protected]>
* glsl/types: Add a real is_integer helperJason Ekstrand2019-06-193-2/+10
| | | | Reviewed-by: Karol Herbst <[email protected]>
* glsl/types: Rename is_integer to is_integer_32Jason Ekstrand2019-06-1913-31/+32
| | | | | | | It only accepts 32-bit integers so it should have a more descriptive name. This patch should not be a functional change. Reviewed-by: Karol Herbst <[email protected]>
* glsl/types: Ignore bit sizes in contains_integer()Jason Ekstrand2019-06-191-1/+1
| | | | | | | | | All of the callers for this function are looking at interpolation qualifiers and want to make sure they're declared flat. Any 64-bit integer inputs need to be flat. It's also makes the function make more sense since "integer" is fairly generic. Reviewed-by: Karol Herbst <[email protected]>
* glsl/types: Handle all bit sizes in glsl_type_is_integerJason Ekstrand2019-06-191-1/+1
| | | | | | | All of the callers of this function really just want to know if the type is an integer and don't care about bit size. Reviewed-by: Karol Herbst <[email protected]>
* glsl/nir_opt_access: Update uniforms correctly when only vars changeCaio Marcelo de Oliveira Filho2019-06-191-1/+13
| | | | | | | | | | | | | | Even if only variables access flags are changed, the existing NIR infrastructure expects metadata to be explicitly preserved, so do that. Don't care about avoiding preserve to be called twice since the cost is negligible. This scenario can be triggered by dead variables, and also by other intrinsics that read the variables -- but not cause progress to be made when processing the intrinsics. Fixes: f2d0e48ddc7 "glsl/nir: Add optimization pass for access flags" Reviewed-by: Kenneth Graunke <[email protected]>
* glsl/nir: Fix getting the sampler dim when arrays are involvedCaio Marcelo de Oliveira Filho2019-06-191-1/+2
| | | | | | | | | | Unwrap any array in the variable type so we can get the sampler dim. This fixes piglit test spec/arb_arrays_of_arrays/execution/image_store/basic-imageStore-const-uniform-index.shader_test. Fixes: f2d0e48ddc7 "glsl/nir: Add optimization pass for access flags" Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Use reorderable access flagConnor Abbott2019-06-191-4/+12
| | | | | | No changes with radeonsi shader-db. Reviewed-by: Timothy Arceri <[email protected]>
* nir: Add a helper to determine if an intrinsic can be reorderedConnor Abbott2019-06-193-11/+13
| | | | | | | This is simple now, but we're going to be adding a few more conditions to this later. Reviewed-by: Timothy Arceri <[email protected]>
* glsl/nir: Add optimization pass for access flagsConnor Abbott2019-06-194-0/+324
| | | | | | | | | | | | | | Right now, this just deduces when we can arbitrarily reorder SSBO and image loads, matching the existing logic in radeonsi's TGSI->LLVM pass. This approach can't handle some things that nir_opt_copy_prop_vars can, but it can handle images, and with GCM it lets us hoist reads outside of loops. We can also pass this information to LLVM which lets it do its own optimizations on it. This is GLSL only as I haven't tested it on Vulkan yet, and it would probably need a few changes to work there. Reviewed-by: Timothy Arceri <[email protected]>
* nir: Add reorderable memory access enumConnor Abbott2019-06-192-1/+10
| | | | Reviewed-by: Timothy Arceri <[email protected]>
* nir/copy_prop_vars: Ignore volatile accessesConnor Abbott2019-06-191-0/+13
| | | | | | | | | The spec explicitly says that volatile writes can't be removed and volatile reads do not guarantee that the same value will still be around after the read, as if there were a barrier after each read/write. Just ignore them. Reviewed-by: Timothy Arceri <[email protected]>
* glsl/nir: Propagate access qualifiersConnor Abbott2019-06-192-6/+59
| | | | | | | | | | We were completely ignoring these before, except for putting them on variables. While we're here, don't set access qualifiers when converting to bindless since glsl_to_nir will already have set a more accurate qualifier that includes any qualifiers on struct members that are dereferenced. Reviewed-by: Timothy Arceri <[email protected]>
* nir: Allow qualifiers on copy_deref and image instructionsConnor Abbott2019-06-196-12/+48
| | | | | | | | In the next commit, we'll properly handle access qualifiers on struct members by propagating them to load/store instructions, but these instructions had no way to specify the qualifier. Reviewed-by: Timothy Arceri <[email protected]>
* nir: add a vectorization passConnor Abbott2019-06-184-0/+457
| | | | | | | | | | | | | | | | | | | | | | | | | | This effectively does the opposite of nir_lower_alus_to_scalar, trying to combine per-component ALU operations with the same sources but different swizzles into one larger ALU operation. It uses a similar model as CSE, where we do a depth-first approach and keep around a hash set of instructions to be combined, but there are a few major differences: 1. For now, we only support entirely per-component ALU operations. 2. Since it's not always guaranteed that we'll be able to combine equivalent instructions, we keep a stack of equivalent instructions around, trying to combine new instructions with instructions on the stack. The pass isn't comprehensive by far; it can't handle operations where some of the sources are per-component and others aren't, and it can't handle phi nodes. But it should handle the more common cases, and it should be reasonably efficient. [Alyssa: Rebase on latest master, updating with respect to typeless moves] Acked-by: Alyssa Rosenzweig <[email protected]> Acked-by: Jason Ekstrand <[email protected]>
* nir/lower_tex: Add a way to lower TXS(non-0-LOD) instructionsBoris Brezillon2019-06-182-0/+52
| | | | | | | | | | | | | | | The V3D driver has an open-coded solution for this, and we need the same thing for Panfrost, so let's add a generic way to lower TXS(LOD) into max(TXS(0) >> LOD, 1). Changes in v2: * Use == 0 instead of ! * Rework the minification logic as suggested by Jason * Assign cursor pos at the beginning of the function * Patch the LOD just after retrieving the old value Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* nir/lower_tex: Update ->sampler_dim value before calling get_texture_size()Boris Brezillon2019-06-181-2/+5
| | | | | | | | | | | | | | | | | | | | | | get_texture_size() will create a txs instruction with ->sampler_dim set to the original tex->sampler_dim. The condition to call lower_rect() only checks the value of ->sampler_dim and whether lower_rect is requested or not. This leads to an infinite loop when calling nir_lower_tex() with the same options until it returns false. In order to avoid that, let's move the tex->sampler_dim patching before get_texture_size() is called. This way the txs instruction will have ->sampler_dim set to GLSL_SAMPLER_DIM_2D and nir_lower_tex() won't try to lower it on the subsequent passes. Changes in v2: * Add Jason R-b * Add a comment explaining why we patch ->sampler_dim at the beginning of the lower_rect() func Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* nir/lower_tex: Actually report when projector lowering happenedBoris Brezillon2019-06-181-4/+4
| | | | | | | | | | | | | | | | | The code considers that projector lowering was done even if it's not really the case. Change the project_src() prototype to return a bool encoding whether projector lowering happened or not and update the progress var accordingly in nir_lower_tex_block(). --- Changes in v2: * Add Jason R-b * Drop the part suggesting that nir_lower_rect() could be called in a do-while(progress) loop. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* glsl: Fix out of bounds read in shader_cache_read_program_metadataKenneth Graunke2019-06-171-3/+2
| | | | | | | | | | | | | | | | | | The VaryingNames array has NumVaryings entries. But BufferStride is a small array of MAX_FEEDBACK_BUFFERS (4) entries. Programs with more than 4 varyings would read out of bounds. Also, BufferStride is set based on the shader itself, which means that it's inherently already included in the hash, and doesn't need to be included again. At the point when shader_cache_read_program_metadata is called, the linker hasn't even set those fields yet. So, just drop it entirely. Fixes valgrind errors in KHR-GL45.transform_feedback.linking_errors_test. Fixes: 6d830940f78 glsl/shader_cache: Allow shader cache usage with transform feedback Reviewed-by: Timothy Arceri <[email protected]>
* glsl: Set default precision on record membersNeil Roberts2019-06-141-1/+10
| | | | | | | | | | | | | | Record types have their own slot to store the precision for each member in glsl_struct_field. Previously if the member didn’t have an explicit precision qualifier this was being left as GLSL_PRECISION_NONE. This patch makes it take into account the type’s default precision qualifier like it does for regular variables in apply_type_qualifier_to_variable. This has the additional benefit of correctly reporting an error when a float type is used in a struct without declaring the default type. Reviewed-by: Eric Anholt <[email protected]>
* glsl/linker: Make precision matching optional in intrastage_matchNeil Roberts2019-06-143-8/+24
| | | | | | | | | | This function is confusingly also used to match interstage interfaces as well as intrastage. In the interstage case it needs to avoid comparing the precisions. This patch adds a parameter to specify whether to take the precision into account or not so that it can be used for both cases. Reviewed-by: Eric Anholt <[email protected]>
* glsl/linker: Don’t check precision for shader interfaceNeil Roberts2019-06-141-2/+5
| | | | | | | | | | | | | On GLES, the interface between vertex and fragment shaders doesn’t need to have matching precision. Section 4.3.10 of the GLSL ES 3.00 spec: “The type of vertex outputs and fragment inputs with the same name must match, otherwise the link command will fail. The precision does not need to match.” Reviewed-by: Eric Anholt <[email protected]>
* compiler/types: Making comparing record precision optionalNeil Roberts2019-06-142-5/+53
| | | | | | | | | | | | | | | On GLES, the interface between vertex and fragment shaders doesn’t need to have matching precision. This adds an extra argument to glsl_types::record_compare to disable the precision comparison. This will later be used for the shader interface check. In order to make this work this patch also adds a helper function to recursively compare types while ignoring the precision. v2: Call record_compare from within compare_no_precision to avoid duplicating code (Eric Anholt). Reviewed-by: Eric Anholt <[email protected]>
* nir: detect more dynamically uniform expressionsIago Toral Quiroga2019-06-141-0/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Shader-db results for v3d: total instructions in shared programs: 9132728 -> 9119238 (-0.15%) instructions in affected programs: 596886 -> 583396 (-2.26%) helped: 1118 HURT: 224 total threads in shared programs: 234298 -> 234308 (<.01%) threads in affected programs: 10 -> 20 (100.00%) helped: 5 HURT: 0 total uniforms in shared programs: 3022949 -> 3022622 (-0.01%) uniforms in affected programs: 29163 -> 28836 (-1.12%) helped: 108 HURT: 37 total max-temps in shared programs: 1328030 -> 1327762 (-0.02%) max-temps in affected programs: 10097 -> 9829 (-2.65%) helped: 263 HURT: 15 total spills in shared programs: 3793 -> 3777 (-0.42%) spills in affected programs: 432 -> 416 (-3.70%) helped: 16 HURT: 0 total fills in shared programs: 4380 -> 4266 (-2.60%) fills in affected programs: 828 -> 714 (-13.77%) helped: 16 HURT: 0 Reviewed-by: Eric Anholt <[email protected]>
* nir: Don't manually index intrinsic index enumConnor Abbott2019-06-131-20/+20
| | | | | | | This fixes a rebase fail in ea51275e07b, and prevents it from happening again. There's no reason to do this manually. Reviewed-by: Jason Ekstrand <[email protected]>
* spirv/nir: add support for AMD_shader_ballot and Groups capabilityDaniel Schürmann2019-06-135-9/+136
| | | | | | | | This commit also renames existing AMD capabilities: - gcn_shader -> amd_gcn_shader - trinary_minmax -> amd_trinary_minmax Reviewed-by: Connor Abbott <[email protected]>
* nir: add intrinsics for AMD_shader_ballotDaniel Schürmann2019-06-133-0/+31
| | | | Reviewed-by: Connor Abbott <[email protected]>
* nir/spirv: add support for the SubgroupBallotKHR SPIR-V capabilityDaniel Schürmann2019-06-132-7/+13
| | | | | | This capability is required for the VK_EXT_shader_subgroup_ballot extension. Reviewed-by: Connor Abbott <[email protected]>
* nir/spirv: add support for the SubgroupVoteKHR SPIR-V capabilityDaniel Schürmann2019-06-132-4/+20
| | | | | | This capability is required for the VK_EXT_shader_subgroup_vote extension. Reviewed-by: Connor Abbott <[email protected]>
* glsl: Check order and uniqueness of interlock functionsCaio Marcelo de Oliveira Filho2019-06-104-4/+35
| | | | | | | | With this commit all remaining compilation tests in Piglit for ARB_fragment_shader_interlock will pass. Reviewed-by: Tapani Pälli <[email protected]> Reviewed-by: Plamena Manolova <[email protected]>
* glsl: Make interlock builtins follow same compiler rules as barriersCaio Marcelo de Oliveira Filho2019-06-101-5/+10
| | | | | | | | | | | Generalize the barrier code to provide correct error messages for other builtins. Fixes most of piglit compilation tests for ARB_fragment_shader_interlock. Reviewed-by: Tapani Pälli <[email protected]> Reviewed-by: Plamena Manolova <[email protected]>
* nir/opt_algebraic: Fix rules for imadsh_mix16Eduardo Lima Mitev2019-06-101-2/+2
| | | | | | | | | | | | | | | | | | The rules added in patch 3addd7c are inverted: It should be: (al * bh) << 16 + c instead of: (ah * bl) << 16 + c Fixes a number of regressions under dEQP-GLES31.functional.draw_indirect.compute_interop.large.* on Freedreno. Reviewed-by: Rob Clark <[email protected]>
* nir: fix s/&&/||/ typoEric Engestrom2019-06-071-1/+1
| | | | | | Fixes: cd73b6174b093b75f581 "nir/lower_to_source_mods: Stop turning add, sat, and neg into mov" Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir_algebraic: Add basic optimizations for umul_low and imadsh_mix16Eduardo Lima Mitev2019-06-072-0/+55
| | | | | | | | | | | | | | | | | | For umul_low (al * bl), zero is returned if the low 16-bits word of either source is zero. for imadsh_mix16 (ah * bl << 16 + c), c is returned if either 'ah' or 'bl' is zero. A couple of nir_search_helpers are added: is_upper_half_zero() returns true if the highest word of all components of an integer NIR alu src are zero. is_lower_half_zero() returns true if the lowest word of all components of an integer nir alu src are zero. Reviewed-by: Eric Anholt <[email protected]>
* nir/opcodes: Add new 'umul_low' and 'imadsh_mix16' opcodesEduardo Lima Mitev2019-06-071-1/+14
| | | | | | | | | | | | | 'umul_low' is the low 32-bits of unsigned integer multiply. It maps directly to ir3's MULL_U. 'imadsh_mix16' is multiply add with shift and mix, an ir3 specific instruction that maps directly to ir3's IMADSH_M16. Both are necessary for the lowering of integer multiplication on Freedreno, which will be introduced later in this series. Reviewed-by: Eric Anholt <[email protected]>
* glsl/loop_analysis: Don't search for NULL variables in the hash tableJason Ekstrand2019-06-061-0/+3
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* nir/propagate_invariant: Don't add NULL vars to the hash tableJason Ekstrand2019-06-061-1/+10
| | | | | | | Fixes: 8410cf66d "nir/propagate_invariant: Skip unknown vars" Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* nir: Combine lower_fmod16/32 back into a single lower_fmod.Kenneth Graunke2019-06-052-5/+4
| | | | | | | | | | | | | | We originally had a single lower_fmod option. In commit 2ab2d2e5, Sam split 32 and 64-bit lowering into separate flags, with the rationale that some drivers might want different options there. This left 16-bit unhandled, so Iago added a lower_fmod16 option in commit ca31df6f. Now that lower_fmod64 is gone (in favor of nir_lower_doubles and nir_lower_dmod), we re-combine lower_fmod16 and lower_fmod32 into a single lower_fmod flag again. I'm not aware of any hardware which need lowering for one bitsize and not the other. Reviewed-by: Marek Olšák <[email protected]>
* nir: Drop lower_fmod64 option.Kenneth Graunke2019-06-052-2/+0
| | | | | | | | nir_lower_doubles offers a wide variety of fp64 lowering, including lowering fmod@64. The version there also better handles imprecisions due to lowered frcp@64. Let's consolidate on one version. Reviewed-by: Marek Olšák <[email protected]>
* nir: Don't replace the nir_shader when NIR_TEST_SERIALIZE=1Jason Ekstrand2019-06-052-10/+16
| | | | | | Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108957 Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* nir: Don't replace the nir_shader when NIR_TEST_CLONE=1Jason Ekstrand2019-06-052-2/+42
| | | | | | | | | | | | Instead, we add a new helper which stomps one nir_shader and replaces it with another. The new helper effectively just changes which pointer gets used for the base nir_shader. It should be 99% as good at testing cloning but without requiring that everything handle having the shader swapped out from under it constantly. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108957 Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* nir/algebraic: Simplify max(abs(a), 0.0) -> abs(a)Alyssa Rosenzweig2019-06-041-0/+1
| | | | | | | | | | | This pattern was noticed in glmark's jellyfish scene. v2: Add inexact qualifier due to NaN behaviour. Minimal shader-db changes (slightly helped). Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Elie Tournier <[email protected]>
* spirv: Update the OpenCL.std.h headerCaio Marcelo de Oliveira Filho2019-06-042-144/+339
| | | | | | | | | | | | This corresponds to commit 8b911bd2ba37677037b38c9bd286c7c05701bcda on GitHub. We previously tweaked OpenCL.std.h from upstream to be included in C code. Now upstream header can be included, however the symbol names are slightly different (include an OpenCLstd_ prefix), so this patch also fixes vtn_opencl.c to use those. Reviewed-by: Karol Herbst <[email protected]>