| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
Signed-off-by: Emil Velikov <[email protected]>
Acked-by: Matt Turner <[email protected]>
Acked-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
At a later stage we might want to split out the NIR specific [XXX:
which one was it], as to make things move obvious and rename the files
appropriately. This patch aims to split it out of nir.
Signed-off-by: Emil Velikov <[email protected]>
Acked-by: Matt Turner <[email protected]>
Acked-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
| |
Allows us to remove the SCons workaround :-)
Signed-off-by: Emil Velikov <[email protected]>
Acked-by: Matt Turner <[email protected]>
Acked-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This way one can reuse it in glsl, nir or other infrastructure without
pulling nir as dependency.
Signed-off-by: Emil Velikov <[email protected]>
Acked-by: Matt Turner <[email protected]>
Acked-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently it's an empty library, although it'll be used to store common
code between GLSL and NIR that is compiler specific (rather than generic
as the one in src/util).
XXX: strictly speaking we could add a python/mako parser to generate the
relevant files instead including builtin_type_macros.h in such a manner.
Signed-off-by: Emil Velikov <[email protected]>
Acked-by: Matt Turner <[email protected]>
Acked-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
| |
SCons doesn't understand nir yet and doesn't want to compile the glsl to
nir pass. Move the files to their own variable so we can add it only for
automake.
Tested-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
| |
libglsl_la_SOURCES includes both NIR_FILES and LIBGLSL_FILES, so for
libglsl.la consumers, this is a no-op. libnir.la however no longer uses
any GLSL IR infrastructure and can be used without also linking to
libglsl.la.
Acked-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In this lowering pass, shared variables are decomposed into intrinsic
calls.
v2:
* Send mem_ctx as a parameter (Iago)
v3:
* Shared variables don't have an associated interface block (Iago)
* Always use 430 packing (Iago)
* Comment / whitespace cleanup (Iago)
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This class has code that will be shared by lower_ubo_reference and
lower_shared_reference. (lower_shared_reference will be used to
support compute shader shared variables.)
v2:
* Add lower_buffer_access.h to makefile (Emil)
* Remove static is_dereferenced_thing_row_major from
lower_buffer_access.cpp. This will become a lower_buffer_access
method in the next commit.
* Pass mem_ctx as parameter rather than using a member variable (Iago)
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
|
|
|
| |
Commit b9b40ef9b76 moved the file, but forgot to update the reference in
the makefile. Thus the out of tree build was busted :\
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
| |
This commit is heavily based on one by Rob Clark <[email protected]> but
reworked to re-use nir_create functions and do less hashing.
Signed-off-by: Jason Ekstrand <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Because the next patch will add an optimization that is specific to i965,
we want to move this loweing pass to that driver altogether.
This is safe because i965 is the only consumer.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We've assumed that we could lower per-component vector access from
vec[i] = scalar
to
vec = ir_triop_vector_insert(vec, scalar, i)
but with SSBOs (and compute shader SLM and tesselation outputs) this is
no longer valid. If a vector is "externally visible", multiple threads
can write independent components simultaneously. With lowering to
ir_triop_vector_insert, each thread read the entire vector, changes one
component, then writes out the entire vector. This is racy.
Instead of generating a ir_binop_vector_extract when we see v[i], we
generate ir_dereference_array. We then add a lowering pass to lower the
ir_dereference_array to ir_binop_vector_extract for rvalues and for to
vector_insert for lvalues in a separate lowering pass.
The resulting IR is the same as before, but we now have a window between
ast->ir conversion and the lowering pass where v[i] appears in the IR as
an array deref. This lets us run lowering passes that lower the vector
access to I/O (eg for SSBO load/store) before we lower the per-component
access to full vector writes.
Reviewed-by: Jordan Justen <[email protected]>
Signed-off-by: Kristian Høgsberg Kristensen <[email protected]>
|
|
|
|
|
|
| |
It doesn't actually operate on variables.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Move glsl_types into NIR, now that the dependency on glsl_symbol_table
has been split out.
Possibly makes sense to rename things at this point, but if we do that
I'd like to keep it split out into a separate patch to make git history
easier to follow (IMHO).
v2: fix android build
v3: I f***ing hate scons.. but at least it builds
Reviewed-by: Jason Ekstrand <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
First step towards inverting the dependency between glsl and nir (so nir
can be used without glsl). Also solves this issue with 'make distclean'
Making distclean in mesa
make[2]: Entering directory '/mnt/sdb1/Src64/Mesa-git/mesa/src/mesa'
Makefile:2486: ../glsl/.deps/shader_enums.Plo: No such file or directory
make[2]: *** No rule to make target '../glsl/.deps/shader_enums.Plo'. Stop.
make[2]: Leaving directory '/mnt/sdb1/Src64/Mesa-git/mesa/src/mesa'
Makefile:684: recipe for target 'distclean-recursive' failed
make[1]: *** [distclean-recursive] Error 1
make[1]: Leaving directory '/mnt/sdb1/Src64/Mesa-git/mesa/src'
Makefile:615: recipe for target 'distclean-recursive' failed
make: *** [distclean-recursive] Error 1
Reported-by: Andy Furniss <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Right now nir_instrs_equal() is tied pretty tightly to CSE, but we're
going to introduce the idea of an instruction set and tie it to that
instead. In anticipation of that, move this into its own file where
we'll add the rest of the instruction set implementation later.
v2: Rebase on texture support.
Reviewed-by: Jason Ekstrand <[email protected]>
Signed-off-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With NIR, it actually hurts things.
total instructions in shared programs: 6529329 -> 6528888 (-0.01%)
instructions in affected programs: 14833 -> 14392 (-2.97%)
helped: 299
HURT: 1
In all affected programs I inspected (including the single hurt one) the
pass CSE'd some multiplies and caused some reassociation (e.g., caused
(A * B) * C to be A * (B * C)) when the original intermediate result was
reused elsewhere.
Acked-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Some hardware (such as Broadwell) can run geometry shaders more
efficiently when the number of vertices emitted is statically known.
This pass provides a way to obtain the constant vertex count, or
-1 indicating that the vertex count is unknown/non-constant.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch also introduces a lowering pass to convert the simple GS
intrinsics to the new ones. See the comments above that for the
rationale behind the new intrinsics.
This should be useful for i965; it's a generic enough mechanism that I
could see other drivers potentially using it as well, so I don't feel
too bad about putting it in the generic code.
v2:
- Use nir_after_block_before_jump for the cursor (caught by Jason
Ekstrand - I'd mistakenly used nir_after_block when rebasing this
code onto the new NIR control flow API).
- Remove the old emit_vertex intrinsic at the end, rather than in
the middle (requested by Jason).
- Use state->... directly rather than locals (requested by Jason).
- Report progress from nir_lower_gs_intrinsics() (requested by me).
- Remove "Authors:" section from file comment (requested by
Michael Schellenberger Costa).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With the only C++ function having its own wrapper we can 'demote' this
file to a normal C one. This allows us to get rid of extern C { #include
<foo.h> } 'hacks'. Plus some of the headers may use C99 initializers,
which are not supported by the ISO standard.
This may cause build issue on incremental builds. If so run the
following:
sed -i -e 's|samplers\.cpp|samplers.c|' src/glsl/nir/.deps/nir_lower_samplers.Plo
Fixes: ef8eebc6ad5(nir: support indirect indexing samplers in struct arrays)
Signed-off-by: Emil Velikov <[email protected]>
Reported-by: Gottfried Haider <[email protected]>
Tested-by: Gottfried Haider <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Since the following patches will add additional tex-lowering related
functionality, which doesn't make sense to split out into a separate
pass (as they would require duplication of the projector lowering
logic), let's give this pass a more generic name.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The vertex shader lowering adds calculation for CLIPDIST, if needed
(ie. user-clip-planes), and the frag shader lowering adds conditional
kills based on CLIPDIST value (which should be treated as a normal
interpolated varying by the driver).
Note that this won't quite do the right thing in the face of MSAA plus
user-clip-planes, since all the samples would be killed or not (rather
than potentially only a portion of them). But it's better than no UCP
support at all for drivers that don't have this in hw.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
v2 (Jason Ekstrand):
- Handle non-SSA sources and destinations
Reviewed-by: Eduardo Lima Mitev <[email protected]>
|
|
|
|
|
|
|
| |
This pass can be used as a helper for NIR producers so they don't have to
worry about creating the temporaries themselves.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Rather than folding one variable within the other only to unwrap them,
just use the ones we need.
v2: bring back LOCAL_PATH prefix for nir_constant_expressions,h
Cc: 11.0 <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Matt Turner <[email protected]> (v1)
|
|
|
|
|
|
|
| |
v2: use nir_cf_node_remove_after() instead of our own broken thing.
v3: use the new control flow modification helpers.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We want to start reworking and expanding this code, but it'll be a lot
easier to do once we disentangle it from the rest of the stuff in nir.c.
Unfortunately, there are a few unavoidable dependencies in nir.c on
methods we'd rather not expose publicly, since if not used in very
specific situations they can cause Bad Things (tm) to happen. Namely, we
need to do some magical control flow munging when adding/removing jumps.
In the future, we may disallow adding/removing jumps in
nir_instr_insert_*() and nir_instr_remove(), and use separate functions
that are part of the control flow modification code, but for now we
expose them and put them in a separate, private header.
Signed-off-by: Connor Abbott <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We may find a cause to do more undef optimization in the future, but for
now this fixes up things after if flattening. vc4 was handling this
internally most of the time, but a GLB2.7 shader that did a conditional
discard and assign gl_FragColor in the else was still emitting some extra
code.
total instructions in shared programs: 100809 -> 100795 (-0.01%)
instructions in affected programs: 37 -> 23 (-37.84%)
v2: Use nir_instr_rewrite_src() to update def/use on src[0] (by Thomas
Helland).
v3: Make sure to flag metadata dirties, and copy the swizzle and abs/neg
over to src[0], too (by anholt).
Reviewed-by: Thomas Helland <[email protected]> (v2)
Tested-by: Thomas Helland <[email protected]> (v2)
|
|
|
|
|
|
|
|
| |
This is useful to increase the CSE opportunities for a scalar backend. It
avoids regressions when dropping vc4's custom CSE implementation.
v2: Cleanups by Matt (decl in the for loop, and unreachable()).
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This lowers the enhanced ir_call using the lookaside table
of subroutines into an if ladder. This initially was done
at the AST level but it caused some ordering issues so a separate
pass was required.
v2: clone return value derefs.
v2.1: update for subroutine->int convert.
v2.2: add a clone for the array index
Reviewed-by: Chris Forbes <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Similar to gl_ClipDistance -> gl_ClipDistanceMESA
v2: - renamed is_mesa_var to lowered_builtin_array_variable
- moved LowerTessLevel into gl_constants
- cosmetic changes in lower_tess_level.cpp
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
Otherwise `make distcheck' will fail.
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This pass performs a mark and sweep pass over a nir_shader's associated
memory - anything still connected to the program will be kept, and any
dead memory we dropped on the floor will be freed.
The expectation is that this will be called when finished building and
optimizing the shader. However, it's also fine to call it earlier, and
many times, to free up memory earlier.
v2: (feedback from Jason Ekstrand)
- Skip sweeping impl->start_block, as it's already in the CF list.
- Don't sweep SSA defs (they're owned by their defining instruction)
- Don't steal phi sources (they're owned by nir_phi_instr).
- Don't steal tex->src (it's owned by the tex_inst itself)
- Don't sweep dereference chains (top-level dereferences are owned by
the instruction; sub-dereferences are owned by the parent deref).
- Don't sweep sources and destinations (SSA defs are handled as part of
the defining instruction, and registers are handled as part of
function implementations).
- Just steal instructions; don't walk them (no longer required).
v3: (feedback from Jason Ekstrand)
- Steal indirect sources from nir_src/nir_dest.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Based on the algo from NV50LegalizeSSA::handleDIV() and handleMOD().
See also trans_idiv() in freedreno/ir3/ir3_compiler.c (which was an
adaptation of the nv50 code from Ilia Mirkin).
A python/numpy script which implements the same algorithm (and is
possibly useful for debugging or analysis) can be found here:
http://people.freedesktop.org/~robclark/div-lowering.py
I've tested this on i965 hacked up to insert the idiv lowering pass,
and on freedreno with NIR frontend.
Signed-off-by: Rob Clark <[email protected]>
Tested-by: Eric Anholt <[email protected]> (vc4)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit adds a pass to L1-normalize cube-map coordinates. Some hardware
such as i965 requires that largest cube-map coordinate is +-1. We had a
pass to perform this normalization in GLSL IR but we need it in NIR for
cube maps on ARB programs to work correctly.
Reviewed-by: Jordan Justen <[email protected]>
v2 (Suggested by Eric):
- Do a vector fabs and split into components later
- Move to core NIR
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
Not much hardware wants them these days, and it might give us a chance to
do CSE or algebraic at the NIR level.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
i965/nir: Use the dedicated ffma peephole
total instructions in shared programs: 4418748 -> 4394618 (-0.55%)
instructions in affected programs: 1292790 -> 1268660 (-1.87%)
helped: 5999
HURT: 457
GAINED: 4
LOST: 9
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
NIR uses these enums/#defines in nir_variables and associated intrinsics,
but I want to be able to use them from TGSI->NIR and NIR->TGSI.
Otherwise, we had to pull in all of mtypes.h.
This doesn't cover all of the enums we might want from a shared compiler
core (like varying slots or vert attribs), but it at least covers what I
need at the moment (system values and interp qualifiers).
v2: Move to src/glsl since util/ is really vague. Include in Makefile.am
list. Use plain bitshifts and stdint types instead of undefined
BITFIELD64_BIT.
v3: Rename to shader_enums.h. Move it into Makefile.sources.
Reviewed-by: Kenneth Graunke <[email protected]> (v2, with
recommendation to rename)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The header was added with commit 2a135c470e3(nir: Add an ALU op builder
kind of like ir_builder.h) but did not made it into to the sources list.
Fortunately it remained unused until a recent commit faf6106c6f6(nir:
Implement a Mesa IR -> NIR translator.)
v2: Remove the bogus dependency. Tweak commit message.
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
st_glsl_to_tgsi and ir_to_mesa have handled conditional discards for a
long time; the previous patch added that capability to i965.
i965 (Haswell) shader-db stats:
Without NIR:
total instructions in shared programs: 5792133 -> 5776360 (-0.27%)
instructions in affected programs: 737585 -> 721812 (-2.14%)
helped: 6300
HURT: 68
GAINED: 2
With NIR:
total instructions in shared programs: 5787538 -> 5769569 (-0.31%)
instructions in affected programs: 767843 -> 749874 (-2.34%)
helped: 6522
HURT: 35
GAINED: 6
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
| |
v2 Jason Ekstrand <[email protected]>:
- Use nir_dominance_lca for computing least common anscestors
- Use the block index for comparing dominance tree depths
- Pin things that do partial derivatives
Reviewed-by: Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
v2: Rebase on the nir_opcodes.h python code generation support.
v3: Use SSA values, and set an appropriate writemask on dot products.
v4: Make the arguments be SSA references as well. This lets you stack up
expressions in the arguments of other expressions, at the cost of
having to insert a fmov/imov if you want to swizzle. Also, add
the generated file to NIR_GENERATED_FILES.
v5: Use more pythonish style for iterating the list.
v6: Infer the size of the dest from the size of the srcs, and auto-swizzle
a single small src out to the appropriate size.
v7: Add little helpers for initializing the struct, add a typedef for the
struct like other nir types have.
Reviewed-by: Kenneth Graunke <[email protected]> (v6)
Reviewed-by: Connor Abbott <[email protected]> (v7)
|
|
|
|
|
|
|
| |
Cc: "10.5" <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This removes phi nodes whose sources all point to the same thing.
Shader-db results:
total NIR instructions in shared programs: 2045293 -> 2041209 (-0.20%)
NIR instructions in affected programs: 126564 -> 122480 (-3.23%)
helped: 615
HURT: 0
total FS instructions in shared programs: 4321840 -> 4320392 (-0.03%)
FS instructions in affected programs: 24622 -> 23174 (-5.88%)
helped: 138
HURT: 0
Reviewed-by: Jason Ekstrand <[email protected]>
Tested-by: Jason Ekstrand <[email protected]>
Signed-off-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
v2 Jason Ekstrand <[email protected]>:
- Add better comments
- Use nir_ssa_dest_init and nir_src_for_ssa more places
- Fix some void * casts
v3 Jason Ekstrand <[email protected]>:
- Rework the way we determine whether or not to sccalarize a phi node to
make the recursion non-bogus
- Treat load_const instructions as scalarizable
v4 Jason Ekstrand <[email protected]>:
- Allow uniform and input loads to be scalarizable
v5 Jason Ekstrand <[email protected]>:
- Also consider loads of inputs (varying, uniform, or ubo) to be
scalarizable. We were already doing this for load_var on uniforms and
inputs.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add a required field to the Opcode class, const_expr, that contains an
expression or statement that computes the result of the opcode given known
constant inputs. Then take those const_expr's and expand them into a function
that takes an opcode and an array of constant inputs and spits out the constant
result. This means that when adding opcodes, there's one less place to update,
and almost all the opcodes are self-documenting since the information on how to
compute the result is right next to the definition.
The helper functions in nir_constant_expressions.c were taken from
ir_constant_expressions.cpp.
v3 Jason Ekstrand <[email protected]>
- Use mako to generate one function per opcode instead of doing piles of
string splicing
v4 Jason Ekstrand <[email protected]>
- More comments and better indentation in the mako
- Add a description of the constant expression language in nir_opcodes.py
- Added nir_constant_expressions.py to EXTRA_DIST in Makefile.am
Signed-off-by: Jason Ekstrand <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
|