| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
Reviewed-by: Francisco Jerez <[email protected]>
|
| |
|
|
|
|
|
|
|
|
|
| |
The nir_lower_source_mods pass does a weak form of copy propagation to
clean up all of the mov-with-negate's that get generated. However, we
weren't properly checking that the sources were SSA and so we could end up
moving a register read which is not, in general, valid.
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
| |
The old code wasn't correctly handling the case where the new value of the
source contains an indirect.
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
| |
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
| |
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
| |
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
| |
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, this function returned the number of elements for structures
and arrays and 0 for everything else. In NIR, this is almost never what
you want because we also treat matricies as arrays so you have to
special-case constantly. This commit glsl_get_length treat matrices
as an array of columns by returning the number of columns instead of 0
This also fixes a bug in locals_to_regs caused by not checking for the
matrix case in one place.
v2: Only special-case for matrices and return a length of 0 for vectors as
we did before. This was needed to not break the TGSI-based drivers and
doesn't really affect NIR at the moment.
Reviewed-by: Connor Abbott <[email protected]>
Tested-by: Rob Clark <[email protected]>
|
|
|
|
| |
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
| |
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
| |
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
| |
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
| |
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
| |
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
total instructions in shared programs: 4314531 -> 4308949 (-0.13%)
instructions in affected programs: 429085 -> 423503 (-1.30%)
helped: 1680
HURT: 0
GAINED: 0
LOST: 111
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
| |
For lowering if/else, I need a way to insert at the end of the previous
block.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Suggested by Jason on a different patch after some comments /
questions by Ilia.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Shader-db results:
GM45 NIR:
total instructions in shared programs: 4082044 -> 4081919 (-0.00%)
instructions in affected programs: 27609 -> 27484 (-0.45%)
helped: 44
Iron Lake NIR:
total instructions in shared programs: 5678776 -> 5678646 (-0.00%)
instructions in affected programs: 27406 -> 27276 (-0.47%)
helped: 45
Sandy Bridge NIR:
total instructions in shared programs: 7329995 -> 7329096 (-0.01%)
instructions in affected programs: 142035 -> 141136 (-0.63%)
helped: 406
HURT: 19
Ivy Bridge NIR:
total instructions in shared programs: 6769314 -> 6768359 (-0.01%)
instructions in affected programs: 140820 -> 139865 (-0.68%)
helped: 423
HURT: 2
Haswell NIR:
total instructions in shared programs: 6183693 -> 6183298 (-0.01%)
instructions in affected programs: 96538 -> 96143 (-0.41%)
helped: 303
HURT: 4
Broadwell NIR:
total instructions in shared programs: 7501711 -> 7498170 (-0.05%)
instructions in affected programs: 266403 -> 262862 (-1.33%)
helped: 705
HURT: 5
GAINED: 4
v2: Rebase on top of Connor's fix.
v3: Convert the if-test for num_inputs == 2 to an assertion. Suggested
by Jason after some comments / questions by Ilia.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Jordan Justen <[email protected]> [v1]
Reviewed-by: Jason Ekstrand <[email protected]>
Cc: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
|
| |
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
Cc: "10.5" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
nir/nir.h: In function 'nir_validate_shader':
nir/nir.h:1567:56: warning: unused parameter 'shader' [-Wunused-parameter]
static inline void nir_validate_shader(nir_shader *shader) { }
^
nir/nir_opt_cse.c: In function 'src_is_ssa':
nir/nir_opt_cse.c:165:32: warning: unused parameter 'data' [-Wunused-parameter]
src_is_ssa(nir_src *src, void *data)
^
nir/nir_opt_cse.c: In function 'dest_is_ssa':
nir/nir_opt_cse.c:171:35: warning: unused parameter 'data' [-Wunused-parameter]
dest_is_ssa(nir_dest *dest, void *data)
^
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
We weren't comparing the right number of components when checking
swizzles. Use nir_ssa_alu_instr_num_src_components() to do the right
thing.
No piglit regressions, and no fixes either.
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Signed-off-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Storing this here is pretty sketchy - I don't know if any driver other
than i965 will want to use it. But this will make it a lot easier to
generate NIR code at link time. We'll probably rework it anyway.
(Ian suggested making nir_assign_var_locations_scalar_direct_first
simply modify the nir_shader's fields, rather than passing pointers
to them. If this stays long term, we should do that. But Jason and
I suspect we'll be reworking this area again in the near future.)
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
I guess I was looking too much at how lower_system_values worked when
writing lower_idiv.
Since ttn wasn't emitting load_var for sysvals and the only drivers
using lower_idiv were using ttn, I think nothing was broken as a result.
But might as well fix this before it becomes a problem.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Originally you had to have one or the other. But actually I don't want
either. (Or rather I want whatever is the minimum # of instructions.)
TODO: not sure where the best place to insert a check that driver hasn't
set *both* lower_negate and lower_sub?
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Now that we're not generating linker errors, we don't actually modify
this.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
| |
These should never happen. Plus, NIR passes really shouldn't be
reporting linker errors - this is past link time.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
We don't actually need a gl_program struct. We only used it to
translate prog->Target (i.e. GL_VERTEX_PROGRAM) to the gl_shader_stage
(i.e. MESA_SHADER_VERTEX). We may as well just pass that.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
| |
I want to use this in some code that doesn't currently include mtypes.h.
It seems like a better place for it anyway.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Jason Ekstrand <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This pass performs a mark and sweep pass over a nir_shader's associated
memory - anything still connected to the program will be kept, and any
dead memory we dropped on the floor will be freed.
The expectation is that this will be called when finished building and
optimizing the shader. However, it's also fine to call it earlier, and
many times, to free up memory earlier.
v2: (feedback from Jason Ekstrand)
- Skip sweeping impl->start_block, as it's already in the CF list.
- Don't sweep SSA defs (they're owned by their defining instruction)
- Don't steal phi sources (they're owned by nir_phi_instr).
- Don't steal tex->src (it's owned by the tex_inst itself)
- Don't sweep dereference chains (top-level dereferences are owned by
the instruction; sub-dereferences are owned by the parent deref).
- Don't sweep sources and destinations (SSA defs are handled as part of
the defining instruction, and registers are handled as part of
function implementations).
- Just steal instructions; don't walk them (no longer required).
v3: (feedback from Jason Ekstrand)
- Steal indirect sources from nir_src/nir_dest.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Jason pointed out that variable dereferences in NIR are really part of
their parent instruction, and should have the same lifetime.
Unlike in GLSL IR, they're not used very often - just for intrinsic
variables, call parameters & return, and indirect samplers for
texturing. Also, nir_deref_var is the top-level concept, and
nir_deref_array/nir_deref_record are child nodes.
This patch attempts to allocate nir_deref_vars out of their parent
instruction, and any sub-dereferences out of their parent deref.
It enforces these restrictions in the validator as well.
This means that freeing an instruction should free its associated
dereference chain as well. The memory sweeper pass can also happily
ignore them.
v2: Rename make_deref to evaluate_deref and make it take a nir_instr *
instead of void *. This involves adding &instr->instr everywhere.
(Requested by Jason Ekstrand.)
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
We can't allocate them out of the nir_ssa_def itself, because it may not
be ralloc'd (for example, nir_dest embeds a nir_ssa_def).
However, allocating them out of the instruction should work.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
| |
Phi sources are part of the phi instruction and should have the same
lifetime.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
| |
The lifetime of the params array needs to be match the nir_call_instr
itself. So, allocate it using the instruction itself as the context.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
| |
These don't work in MSVC or in older versions of GCC
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89899
Reviewed-by: Mark Janes <[email protected]>
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
| |
These were added in commit f2616e56, presumably in preparation for
translating ARB vp/fp into GLSL IR. That never happened, and neither did
a lowering pass that actually generated these instructions.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Based on the algo from NV50LegalizeSSA::handleDIV() and handleMOD().
See also trans_idiv() in freedreno/ir3/ir3_compiler.c (which was an
adaptation of the nv50 code from Ilia Mirkin).
A python/numpy script which implements the same algorithm (and is
possibly useful for debugging or analysis) can be found here:
http://people.freedesktop.org/~robclark/div-lowering.py
I've tested this on i965 hacked up to insert the idiv lowering pass,
and on freedreno with NIR frontend.
Signed-off-by: Rob Clark <[email protected]>
Tested-by: Eric Anholt <[email protected]> (vc4)
|
|
|
|
|
|
|
|
| |
v2: discovered that i2b/b2i are also confused
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In freedreno these get implemented as the matching f* instruction plus a
u2f to convert the result to float 1.0/0.0. But less lines of code to
just let nir_opt_algebraic handle this for us, plus opens up some small
window for other opt passes to improve (ie. if some shader ended up with
both a flt and slt with same src args, for example).
v2: use b2f rather than u2f
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
| |
As of da5ec2a, we allocate instruction sources out of the instruction
itself. When we realloc the texture sources we need to use the right
memory context or ralloc will get angry and assert-fail
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit adds a pass to L1-normalize cube-map coordinates. Some hardware
such as i965 requires that largest cube-map coordinate is +-1. We had a
pass to perform this normalization in GLSL IR but we need it in NIR for
cube maps on ARB programs to work correctly.
Reviewed-by: Jordan Justen <[email protected]>
v2 (Suggested by Eric):
- Do a vector fabs and split into components later
- Move to core NIR
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
| |
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
| |
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
| |
Not much hardware wants them these days, and it might give us a chance to
do CSE or algebraic at the NIR level.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
| |
We use nir_ssa_defs for nir_builder args, so this takes a nir_src and
makes one so it can be passed in.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
| |
So far we'd only used nir_builder to build brand new programs. But if
we're doing modifications to instructions (like in a lowering pass), then
we want to generate new stuff before the instruction we're modifying.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
| |
The lifetime of the sources array needs to be match the nir_tex_instr
itself. So, allocate it using the instruction itself as the context.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
| |
These sets are part of the block, and their lifetime needs to match the
block itself. So, allocate them using the block itself as the context.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|