| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
It's redundant with nir_shader::info::stage.
Acked-by: Timothy Arceri <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
| |
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This is intended to be called before nir_lower_io() so that we
can do some linking optimisations with the results. It can also
be used with drivers that don't use nir_lower_io() at all such
as RADV.
v2: pass mode mask rather than first and last stage integer.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I've been doing this inside of vc4, but vc5 wants it as well and it may be
useful for other drivers (Intel has a related path for pre-gen6 with MRT,
and freedreno had a TGSI path for it at one point).
This required defining a common enum for the standard comparison
functions, but other lowering passes are likely to also want that enum.
v2: Add to meson.build as well.
Acked-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The initial helpers add support for removing unused varyings between
stages.
V2:
- Moved the io mask helper function into this file rather than
nir.h so it's not used elsewhere considering it doesn't handle
all corner cases.
- Use bitmask rather than hash table to handle tcs outputs (Ken)
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
Will be used in nir link pass to decided if we can remove a varying
or not.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eduardo Lima Mitev <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This being declared bool means it won't get merged with the previous
bitfields, this seems like an oversight rather than deliberate.
Noticed when running pahole.
Reviewed-by: Jason Ekstrand <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This is a further lowering of default-block uniform loads that transforms
load_uniform intrinsics into load_ubo intrinsics. This simplifies the rest
of the backend.
v2: transform from load_uniform instead of straight from variables
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
| |
This pass is a replacement for the nir_lower_samplers pass, which has the
advantage of keeping sampler references as derefs. This allows a unified
treatment of texture instructions and image intrinsics in the backend.
|
|
|
|
|
|
| |
Allows modifying a texture instruction's texture and sampler derefs.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Some hardware, like i965, doesn't support group sizes greater than 32.
In that case, we can reduce the destination size of the ballot
intrinsic, which will simplify our code generation.
Reviewed-by: Connor Abbott <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
i965 will want these to be scalar operations.
Reviewed-by: Connor Abbott <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We already had a channel_num system value, which I'm renaming to
subgroup_invocation to match the rest of the new system values.
Note that while ballotARB(true) will return zeros in the high 32-bits on
systems where gl_SubGroupSizeARB <= 32, the gl_SubGroup??MaskARB
variables do not consider whether channels are enabled. See issue (1) of
ARB_shader_ballot.
Reviewed-by: Connor Abbott <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
... trivially (as allowed by the spec!) by reusing the existing
nir_opt_intrinsics code.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
Specifically, constant fold intrinsics from ARB_shader_group_vote, but I
suspect it'll be useful for other things in the future.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
Similar with support for YUYV but with byte order difference in sampler
Reviewed-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
|
| |
This will allow to constify other things.
Signed-off-by: Grazvydas Ignotas <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit e1af20f18a86f52a9640faf2d4ff8a71b0a4fa9b changed the shader_info
from being embedded into being just a pointer. The idea was that
sharing the shader_info between NIR and GLSL would be easier if it were
a pointer pointing to the same shader_info struct. This, however, has
caused a few problems:
1) There are many things which generate NIR without GLSL. This means
we have to support both NIR shaders which come from GLSL and ones
that don't and need to have an info elsewhere.
2) The solution to (1) raises all sorts of ownership issues which have
to be resolved with ralloc_parent checks.
3) Ever since 00620782c92100d77c660f9783504c6d80fa1d58, we've been
using nir_gather_info to fill out the final shader_info. Thanks to
cloning and the above ownership issues, the nir_shader::info may not
point back to the gl_shader anymore and so we have to do a copy of
the shader_info from NIR back to GLSL anyway.
All of these issues go away if we just embed the shader_info in the
nir_shader. There's a little downside of having to copy it back after
calling nir_gather_info but, as explained above, we have to do that
anyway.
Acked-by: Timothy Arceri <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Fixes: 53aa109b ("nir: add pass to lower atomic counters to SSBO")
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
This is equivalent to what mesa/st does in glsl_to_tgsi. For most hw
there isn't a particularly good reason to treat these differently.
Signed-off-by: Rob Clark <[email protected]>
Acked-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This shuffles constants down in the reverse of what the previous
patch does and applies some simpilifications that may be made
possible from doing so.
Shader-db results BDW:
total instructions in shared programs: 12980814 -> 12977822 (-0.02%)
instructions in affected programs: 281889 -> 278897 (-1.06%)
helped: 1231
HURT: 128
total cycles in shared programs: 246562852 -> 246567288 (0.00%)
cycles in affected programs: 11271524 -> 11275960 (0.04%)
helped: 1630
HURT: 1378
V2: mark float opts as inexact
Reviewed-by: Elie Tournier <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Iago Toral Quiroga <[email protected]>
Reviewed-by: Eduardo Lima Mitev <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
According to section 14.6 of the Vulkan specification:
"When sample shading is enabled, the x and y components of FragCoord
reflect the location of the sample corresponding to the shader
invocation."
So add a boolean parameter to the lowering pass to select this behavior
when we need it.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
| |
And from nir_lower_regs_to_ssa_impl() as well.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Dylan Baker <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Dylan Baker <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
| |
Signed-off-by: Emil Velikov <[email protected]>
Acked-by: Vedran Miletić <[email protected]>
Acked-by: Juha-Pekka Heikkila <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The algorithms used by this pass, especially for division, are heavily
based on the work Ian Romanick did for the similar int64 lowering pass
in the GLSL compiler.
v2: Properly handle vectors
v3: Get rid of log2_denom stuff. Since we're using bcsel, we do all the
calculations anyway and this is just extra instructions.
v4:
- Add back in the log2_denom stuff since it's needed for ensuring that
the shifts don't overflow.
- Rework the looping part of the pass to be easier to expand.
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
| |
It's a problem waiting to happen. Individual headers should be annotated
if needed.
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
| |
There's nothing "double" about it other than, perhaps, the fact that it
packs two 32-bit values.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This tries to move comparisons (a common source of boolean values)
closer to their first use. For GPUs which use condition codes,
this can eliminate a lot of temporary booleans and comparisons
which reload the condition code register based on a boolean.
V2: (Timothy Arceri)
- fix move comparision for phis so we dont end up with:
vec1 32 ssa_227 = phi block_34: ssa_1, block_38: ssa_240
vec1 32 ssa_235 = feq ssa_227, ssa_1
vec1 32 ssa_230 = phi block_34: ssa_221, block_38: ssa_235
- add nir_op_i2b/nir_op_f2b to the list of comparisons.
V3: (Timothy Arceri)
- tidy up suggested by Jason.
- add inot/fnot to move comparison list
V4: (Jason Ekstrand)
- clean up move_comparison_source
- get rid of the tuple
- rework phi handling
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]> [v1]
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In Vulkan, we always have both the TCS and TES available in the same
pipeline, so we can simply use the TCS OutputVertices execution mode
value as the TES PatchVertices built-in.
For GLSL, we handle this in the linker. But we could use this pass
in the case when both TCS and TES are linked together, if we wanted.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|