path: root/src/compiler/nir
Commit message | Author | Age | Files | Lines
* nir: add tess patch support to nir_remove_unused_varyings()Timothy Arceri2017-11-031-19/+42
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* nir: Add hooks for testing serializationJason Ekstrand2017-10-312-0/+36
| | | | | Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* nir: add serialization and deserializationConnor Abbott2017-10-313-0/+1246
    v2 (Jason Ekstrand):
     - Various whitespace cleanups
     - Add helpers for reading/writing objects
     - Rework derefs
     - [de]serialize nir_shader::num_*
     - Fix uses of blob_reserve_bytes
     - Use a bitfield struct for packing tex_instr data

    v3:
     - Zero nir_variable struct on deserialization. (Jordan)
     - Allow nir_serialize.h to be included in C++. (Jordan)
     - Handle NULL info.name. (Jason)
     - Set info.name to NULL when name is NULL. (Jordan)

    Acked-by: Timothy Arceri <[email protected]>
    Reviewed-by: Jordan Justen <[email protected]>
    Acked-by: Jason Ekstrand <[email protected]>
* nir/opt_intrinsics: Fix values for gl_SubGroupG{e,t}MaskARBNeil Roberts2017-10-311-2/+22
    Previously the values were calculated by just shifting ~0 by the invocation ID. This would end up including bits that are higher than gl_SubGroupSizeARB. The corresponding CTS test effectively requires that these high bits be zero, so it was failing. There is a Piglit test as well, but it appears to be checking the wrong values, so it passes. For the two greater-than bitmasks, this patch adds an extra mask with (~0>>(64-gl_SubGroupSizeARB)) to force these bits to zero.

    Fixes: KHR-GL45.shader_ballot_tests.ShaderBallotBitmasks
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102680#c3
    Reviewed-by: Jason Ekstrand <[email protected]>
    Cc: [email protected]
    Signed-off-by: Neil Roberts <[email protected]>
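The masking described in this commit message can be sketched in plain C. This is only an illustration of the arithmetic, not the NIR builder code from the patch; the function names, the 64-bit mask width, and the exact shift used for the greater-than case are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: a mask with the low `subgroup_size` bits set.
 * This is the (~0 >> (64 - gl_SubGroupSizeARB)) term from the message. */
static uint64_t subgroup_size_mask(unsigned subgroup_size)
{
   assert(subgroup_size >= 1 && subgroup_size <= 64);
   return ~0ull >> (64 - subgroup_size);
}

/* Bits for invocations >= `invocation`, limited to the subgroup size. */
static uint64_t subgroup_ge_mask(unsigned invocation, unsigned subgroup_size)
{
   assert(invocation < subgroup_size);
   return (~0ull << invocation) & subgroup_size_mask(subgroup_size);
}

/* Bits for invocations > `invocation`, limited to the subgroup size.
 * Shifting ~1 instead of ~0 also clears the bit at `invocation`. */
static uint64_t subgroup_gt_mask(unsigned invocation, unsigned subgroup_size)
{
   assert(invocation < subgroup_size);
   return (~1ull << invocation) & subgroup_size_mask(subgroup_size);
}
```

Without the final AND, `~0ull << 3` keeps every bit from 3 to 63 set. With a subgroup size of 8, the masked results for invocation 3 are 0xF8 (greater-or-equal) and 0xF0 (greater-than).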
* nir: Make nir_gather_info collect a uses_fddx_fddy flag.Kenneth Graunke2017-10-291-0/+16
| | | | | | | | i965 turns fddx/fddy into their coarse/fine variants based on the ctx->Hint.FragmentShaderDerivative setting. It needs to know whether this can impact a shader in order to better guess NOS settings. Reviewed-by: Jason Ekstrand <[email protected]>
* nir/opt_intrinsics: Rework progressJason Ekstrand2017-10-251-5/+9
| | | | | | | | | This commit fixes two issues: First, we were returning false regardless of whether or not the function made progress. Second, we were calling nir_metadata_preserve far more often than needed; we only need to call it once per impl. Reviewed-by: Lionel Landwerlin <[email protected]>
* nir/lower_wpos_ytransform: Support system value intrinsicsJason Ekstrand2017-10-251-0/+4
| | | | Reviewed-by: Lionel Landwerlin <[email protected]>
* nir: Zero nir_load_const_instr::value for valgrind & nir_serializeJordan Justen2017-10-251-1/+1
| | | | | | | Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Zero local_size const struct for valgrind & nir_serializeJordan Justen2017-10-251-0/+1
| | | | | | | Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir/intrinsics: Set the correct num_indices for load_outputJason Ekstrand2017-10-251-1/+1
| | | | | | | Cc: [email protected] Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* meson: extract out variable for nir_algebraic.pyRob Clark2017-10-241-0/+2
| | | | | | | | Also needed in freedreno/ir3. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Dylan Baker <[email protected]>
* nir: Print the components referenced for split or packed shader in/outs.Eric Anholt2017-10-201-1/+25
    Having 4 variables all called "gl_in_TexCoord0@n" isn't very informative, much better to see:

      decl_var shader_in INTERP_MODE_NONE float gl_in_TexCoord0 (VARYING_SLOT_VAR0.x, 1, 0)
      decl_var shader_in INTERP_MODE_NONE float gl_in_TexCoord0@0 (VARYING_SLOT_VAR0.y, 1, 0)
      decl_var shader_in INTERP_MODE_NONE float gl_in_TexCoord0@1 (VARYING_SLOT_VAR0.z, 1, 0)
      decl_var shader_in INTERP_MODE_NONE float gl_in_TexCoord0@2 (VARYING_SLOT_VAR0.w, 1, 0)

    v2: Handle arrays and structs better (by Timothy)

    Reviewed-by: Timothy Arceri <[email protected]>
* nir: Add a safety check that we don't remove dead I/O vars after lowering.Eric Anholt2017-10-201-4/+14
| | | | | | | | | The pass only looks at var load/store intrinsics, not input load/store intrinsics, so assert that we don't see the other type. v2: Adjust comment indentation. Reviewed-by: Timothy Arceri <[email protected]>
* nir: Get rid of nir_shader::stageJason Ekstrand2017-10-2022-48/+51
| | | | | | | | It's redundant with nir_shader::info::stage. Acked-by: Timothy Arceri <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* nir: set default lod to texture opcodes that needed it but don't provide itSamuel Iglesias Gonsálvez2017-10-201-0/+13
    v2:
     - Use helper to add a new source to the texture instruction.

    v3:
     - Use nir_tex_instr_src_index() to simplify the patch (Jason).

    Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
    Reviewed-by: Lionel Landwerlin <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Add a helper for adding texture instruction sourcesJason Ekstrand2017-10-173-25/+28
| | | | Reviewed-by: Lionel Landwerlin <[email protected]>
* nir: add component level support to remove_unused_io_vars()Timothy Arceri2017-10-161-16/+21
| | | | Reviewed-by: Eric Anholt <[email protected]>
* nir: add variant of lower_io_to_scalar to be called earlierTimothy Arceri2017-10-162-0/+255
| | | | | | | | | | | This is intended to be called before nir_lower_io() so that we can do some linking optimisations with the results. It can also be used with drivers that don't use nir_lower_io() at all such as RADV. v2: pass mode mask rather than first and last stage integer. Reviewed-by: Eric Anholt <[email protected]>
* nir: Get rid of the variable on vote intrinsicsJason Ekstrand2017-10-121-3/+3
| | | | | | | | This looks like a copy+paste error. They don't actually write into that variable as would be implied by putting the return there. Reviewed-by: Lionel Landwerlin <[email protected]> Cc: [email protected]
* nir/opcodes: Fix constant-folding of ufind_msbJason Ekstrand2017-10-121-1/+1
| | | | | | | | | We didn't fold correctly in the case of 0x1 because we never let the loop counter hit 0. Switching it to bit >= 0 solves this problem. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Connor Abbott <[email protected]> Cc: [email protected]
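The off-by-one described here is easy to reproduce in standalone C. The sketch below is not the generated constant-folding code; the function names are hypothetical, and the loop starting at bit 31 and the `bit > 0` pre-fix condition are assumptions inferred from the message for a 32-bit source.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed pre-fix shape: the loop exits before testing bit 0,
 * so an input of 0x1 falls through to the "no bit set" result. */
static int ufind_msb32_buggy(uint32_t value)
{
   for (int bit = 31; bit > 0; bit--) {
      if ((value >> bit) & 1)
         return bit;
   }
   return -1;
}

/* Fixed shape: `bit >= 0` lets the counter reach 0. */
static int ufind_msb32(uint32_t value)
{
   for (int bit = 31; bit >= 0; bit--) {
      if ((value >> bit) & 1)
         return bit;
   }
   return -1;
}

/* Small usage check contrasting the two variants. */
static void check_ufind_msb32(void)
{
   assert(ufind_msb32_buggy(0x1) == -1); /* wrong: bit 0 is set */
   assert(ufind_msb32(0x1) == 0);        /* correct */
   assert(ufind_msb32(0) == -1);         /* no bits set */
}
```

The loop counter has to be signed for `bit >= 0` to terminate; with an unsigned counter the condition would always be true.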
* nir: Make nir_shader_gather_info() track texelFetch texture accesses.Kenneth Graunke2017-10-121-1/+13
| | | | | | | | For TGSI-based drivers, st_glsl_to_tgsi records this information. For NIR-based drivers, nir_shader_gather_info() will do so. Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir: bump loop unroll limit to 96.Dave Airlie2017-10-111-1/+3
    With the ssao demo from Vulkan demos:
      radv/rx480: 440->440fps
      anv/haswell: 24->34 fps

    The demo does a 0->32 loop across a ubo with 32 members.

    Reviewed-by: Timothy Arceri <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* nir: Move vc4's alpha test lowering to core NIR.Eric Anholt2017-10-104-0/+139
| | | | | | | | | | | | | I've been doing this inside of vc4, but vc5 wants it as well and it may be useful for other drivers (Intel has a related path for pre-gen6 with MRT, and freedreno had a TGSI path for it at one point). This required defining a common enum for the standard comparison functions, but other lowering passes are likely to also want that enum. v2: Add to meson.build as well. Acked-by: Rob Clark <[email protected]>
* meson: add nir_linking_helpers.c to libnirDylan Baker2017-10-091-0/+1
| | | | | | | This was missed in a rebase, and doesn't affect radv or anv, only i965. Signed-off-by: Dylan Baker <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* meson: convert gtest to an internal dependencyDylan Baker2017-10-031-2/+2
    In truth gtest is an external dependency that upstream expects you to "vendor" into your own tree. As such, it makes sense to treat it more like a dependency than an internal library, and collect its requirements together in a dependency object.

    v2:
     - include with -isystem instead of setting compiler args (Eric)

    Signed-off-by: Dylan Baker <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* meson: Add build Intel "anv" vulkan driverDylan Baker2017-09-271-0/+205
    This allows building and installing the Intel "anv" Vulkan driver using meson and ninja. The driver has been tested against the CTS and seems to pass the same series of tests (they both segfault when the CTS tries to run wayland wsi tests).

    There are still a mess of TODO, XXX, and FIXME comments in here. Those are mostly for meson bugs I'm trying to fix, or for additional things to implement for other drivers/features.

    I have configured all intermediate libraries and optional tools to not build by default, meaning they will only be built if they're pulled in as a dependency of a target that will actually be installed. This allows us to avoid massive if chains, while ensuring that only the bits that need to be built are.

    v2:
     - enable anv, x11, and wayland by default
     - add configure option to disable valgrind

    v3:
     - fix typo in meson_options (Nicholas)

    v4:
     - Remove dead code (Eric)
     - Remove change to generator that was from v0 (Eric)
     - replace if chain with loop (Eric)
     - Fix typos (Eric)
     - define HAVE_DLOPEN for both libdl and builtin dl cases (Eric)

    v5:
     - rebase on util string buffer implementation

    Signed-off-by: Dylan Baker <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]> (v4)
* nir: add some helpers for doing linkingTimothy Arceri2017-09-262-0/+150
    The initial helpers add support for removing unused varyings between stages.

    V2:
     - Moved the io mask helper function into this file rather than nir.h so it's not used elsewhere considering it doesn't handle all corner cases.
     - Use bitmask rather than hash table to handle tcs outputs (Ken)

    Reviewed-by: Kenneth Graunke <[email protected]>
* nir: add always_active_io to nir variableTimothy Arceri2017-09-261-0/+10
    Will be used in the nir link pass to decide whether we can remove a varying or not.

    Reviewed-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Eduardo Lima Mitev <[email protected]>
* nir: put compact into bitfields in nir_variable_dataDave Airlie2017-09-071-1/+1
| | | | | | | | | | This being declared bool means it won't get merged with the previous bitfields, this seems like an oversight rather than deliberate. Noticed when running pahole. Reviewed-by: Jason Ekstrand <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* nir: Remove series of unnecessary conversionsMatt Turner2017-08-291-1/+1
    Clang warns:

      warning: absolute value function 'fabsf' given an argument of type 'const float64_t' (aka 'const double') but has parameter of type 'float' which may cause truncation of value [-Wabsolute-value]

      float64_t dst = bit_size == 64 ? fabs(src0) : fabsf(src0);

    The type of the ternary expression will be the common type of fabs() and fabsf(): double. So fabsf(src0) will be implicitly converted to double. We may as well just convert src0 to double before a call to fabs() and remove the needless complexity, à la

      float64_t dst = fabs(src0);

    Reviewed-by: Emil Velikov <[email protected]>
    Reviewed-by: Eric Engestrom <[email protected]>
* nir: Fix system_value_from_intrinsic for subgroupsJason Ekstrand2017-08-281-4/+4
| | | | | | | A couple of the cases were backwards Reviewed-by: Matt Turner <[email protected]> Cc: [email protected]
* nir: Fix some whatespaceJason Ekstrand2017-08-281-5/+5
| | | | | | Somehow tabs got in there... Reviewed-by: Matt Turner <[email protected]>
* nir: fix algebraic optimizationsConnor Abbott2017-08-011-2/+2
| | | | | | | | The optimizations are only valid for 32-bit integers. They were mistakenly firing for 64-bit integers as well. Cc: [email protected] Reviewed-by: Matt Turner <[email protected]>
* nir: add nir_lower_uniforms_to_ubo passNicolai Hähnle2017-07-312-0/+98
| | | | | | | | | | This is a further lowering of default-block uniform loads that transforms load_uniform intrinsics into load_ubo intrinsics. This simplifies the rest of the backend. v2: transform from load_uniform instead of straight from variables Reviewed-by: Eric Anholt <[email protected]>
* nir: add nir_lower_samplers_as_deref passNicolai Hähnle2017-07-312-0/+245
| | | | | | This pass is a replacement for the nir_lower_samplers pass, which has the advantage of keeping sampler references as derefs. This allows a unified treatment of texture instructions and image intrinsics in the backend.
* nir: add load_frag_coord system value intrinsicNicolai Hähnle2017-07-313-0/+6
| | | | | | | Some drivers prefer to treat gl_FragCoord as a system value rather than a fragment shader input, see Const.GLSLFragCoordIsSysVal. Reviewed-by: Jason Ekstrand <[email protected]>
* nir: fix nir_lower_wpos_ytransform when gl_FragCoord is a system valueNicolai Hähnle2017-07-311-2/+4
| | | | Reviewed-by: Jason Ekstrand <[email protected]>
* nir: add nir_instr_rewrite_derefNicolai Hähnle2017-07-312-0/+15
| | | | | | Allows modifying a texture instruction's texture and sampler derefs. Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Optimize find_lsb/imsb/umsb error checksMatt Turner2017-07-201-0/+11
| | | | | | | | Two of the ARB_shader_ballot piglit tests hit the find_lsb case, removing some of the noise allowed me to better debug the test when it was failing. Reviewed-by: Connor Abbott <[email protected]>
* nir: Reduce destination size of ballot intrinsic when possibleMatt Turner2017-07-202-0/+20
| | | | | | | | | Some hardware, like i965, doesn't support group sizes greater than 32. In that case, we can reduce the destination size of the ballot intrinsic, which will simplify our code generation. Reviewed-by: Connor Abbott <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Add pass to scalarize read_invocation/read_first_invocationMatt Turner2017-07-202-1/+113
| | | | | | | i965 will want these to be scalar operations. Reviewed-by: Connor Abbott <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Add system values from ARB_shader_ballotMatt Turner2017-07-205-1/+81
| | | | | | | | | | | | | We already had a channel_num system value, which I'm renaming to subgroup_invocation to match the rest of the new system values. Note that while ballotARB(true) will return zeros in the high 32-bits on systems where gl_SubGroupSizeARB <= 32, the gl_SubGroup??MaskARB variables do not consider whether channels are enabled. See issue (1) of ARB_shader_ballot. Reviewed-by: Connor Abbott <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Add intrinsics from ARB_shader_ballotMatt Turner2017-07-201-0/+13
| | | | | Reviewed-by: Connor Abbott <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Support lowering vote intrinsicsMatt Turner2017-07-202-2/+4
| | | | | | | ... trivially (as allowed by the spec!) by reusing the existing nir_opt_intrinsics code. Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Add pass to optimize intrinsicsMatt Turner2017-07-202-0/+97
| | | | | | | Specifically, constant fold intrinsics from ARB_shader_group_vote, but I suspect it'll be useful for other things in the future. Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Add intrinsics from ARB_shader_group_voteMatt Turner2017-07-201-0/+5
| | | | | | | | These are intrinsics rather than opcodes, because they operate across channels. Reviewed-by: Connor Abbott <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Use nir_src_copy instead of direct assignments.Kenneth Graunke2017-07-183-9/+9
| | | | | | | | | | | | | | | If the source is an indirect register, there is ralloc'd data. Copying with a direct assignment will copy the pointer, but the data will still belong to the old instruction's memory context. Since we're lowering and throwing away instructions, that could free the data by mistake. Instead, use nir_src_copy, which properly handles this. This is admittedly not a common case, so I think the bug is real, but unlikely to be hit. Cc: [email protected] Reviewed-by: Matt Turner <[email protected]>
* nir: fix nir_opt_copy_prop_vars() for arrays of arraysTimothy Arceri2017-07-191-6/+6
| | | | | | | | | | Previously we only incremented the guide for a single dimension/wildcard. V2: rework logic to avoid code duplication Reviewed-by: Jason Ekstrand <[email protected]> Cc: [email protected]
* nir/vars_to_ssa: Handle missing struct members in foreach_deref_nodeJason Ekstrand2017-07-191-2/+6
| | | | | | | | | This can happen if, for instance, you have an array of structs and there are both direct and wildcard references to the same struct and some members only have direct or only have indirect. Reviewed-by: Timothy Arceri <[email protected]> Cc: [email protected]
* nir/lower_io_to_temporaries: don't set compact on shadow varsConnor Abbott2017-07-131-0/+1
| | | | | | | | | The compact flag doesn't make sense on local variables, since the packing on them is up to the driver. This fixes nir_validate assertions in some cases, particularly when lower_io_to_temporaries is used on per-vertex inputs/outputs. Reviewed-by: Jason Ekstrand <[email protected]>