| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
| |
When NIR was originally drafted, there was no easy way to determine if
something was constant or not. The result was that we had lots of
special-casing for constant values such as this. Now that load_const
instructions are SSA-only, it's really easy to find constants and this
isn't really needed anymore.
Reviewed-by: Connor Abbott <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
This allows building Freedreno with clang
Signed-off-by: Bernhard Rosenkränzer <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We're about to separate the two concepts. When we do, the sampler will
become optional. Doing a rename first makes the separation a bit more
safe because drivers that depend on GLSL or TGSI behaviour will be fine to
just use the texture index all the time.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
If you're worried about the duplication of some CAPs, we can remove them
later.
v2: add fields for memory eviction stats
Reviewed-by: Ilia Mirkin <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This cap indicates whether pipe->create_surface can reinterpret a texture
as a surface with a format of different block width/height (but equal
block size).
v2: fix whitespace
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This cap indicates that the driver only supports R, RG, RGB and RGBA
formats for PIPE_BUFFER sampler views.
v2: move into "unsupported features" section for nouveau (Ilia Mirkin)
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Since we emulate clip-planes, the clip-vertex is used within the VS
itself (thanks to nir_lower_clip). So just ignore it as a VS output.
Fixes a boatload of piglit tests that were asserting on unknown
varying slot.
(Also unrelated spelling/typo fix.)
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
With glsl_to_nir we end up with local variables, instead of global, for
arrays.
Note that we'll eventually have to do something more clever, I think,
when we support multiple functions, but that will probably take some
work in a few places.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
Something we start to see with glsl_to_nir.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
With tgsi_to_nir we get this as a normal input with VARYING_SLOT_FACE.
But glsl_to_nir plus nir_lower_system_values this becomes an intrinsic.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
Experimentally derived max size.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Emil Velikov <[email protected]>
Acked-by: Matt Turner <[email protected]>
Acked-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This way one can reuse it in glsl, nir or other infrastructure without
pulling nir as dependency.
Signed-off-by: Emil Velikov <[email protected]>
Acked-by: Matt Turner <[email protected]>
Acked-by: Jose Fonseca <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Christian Gmeiner <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
Writes string to cmdstream in payload of a no-op packet.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Since the GREMEDY extensions are normally only exposed by the gremedy
debugger (and could possibly trigger debug paths in the app), we don't
expose the extension by default, but instead only with
ST_DEBUG=gremedy.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
| |
Once we go past half of the "GPR" register file, it seems like we need
to run frag shader with smaller threadsize. (The vertex shader already
runs at TWO_QUADS, which is the minimum.)
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Some a4xx firmware doesn't implement the "PFD" (prefetch-disabled)
version of the CP_INDIRECT_BUFFER packet. So allow for PFD vs PFE per
generation. Switch a3xx and a4xx over to using prefetch-enabled version
(which is also what blob does.. it seems only on a2xx we cannot use
PFE).
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
In fad158a0 ("freedreno/ir3: array rework") the src # (n) shifted by
one, but missed updating delay-slot calc.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
Detect arrays which don't conflict with each other and allow overlapping
register allocation.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It at least happens with some piglit tests, like
$piglit/bin/vp-address-01
VERT
DCL IN[0]
DCL IN[1]
DCL OUT[0], POSITION
DCL OUT[1], COLOR
DCL CONST[0..7]
DCL ADDR[0]
0: ARL ADDR[0].x, IN[1].xxxx
1: MOV_SAT OUT[1], CONST[ADDR[0].x-1]
2: DP4 OUT[0].x, CONST[4], IN[0]
3: DP4 OUT[0].y, CONST[5], IN[0]
4: DP4 OUT[0].z, CONST[6], IN[0]
5: DP4 OUT[0].w, CONST[7], IN[0]
6: END
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Seems like in certain cases, we cannot use c<a0.x+0> as the third src to
cat3 instructions. This may be slightly conservative, we may only have
this restriction when the first src is also const.
This fixes, for example, +24/-0 of the variable-indexing piglit tests.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
If we handle separately the special case of eliminating output mov
(which includes keeps and various other cases where we don't have a
consuming instruction's src register to collapse things into), we
can simplify the logic.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
Shuffle things slightly, passing instr-data to ra_name() to reduce the
number of places where we need to add support for array names.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
Collapse two nested if's into one to reduce indent level.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds a new interface to support hardware mipmap generation.
PIPE_CAP_GENERATE_MIPMAP is added to allow a driver to specify
if this new interface is supported; if not supported, the state tracker will
fallback to mipmap generation by rendering/texturing.
v2: add PIPE_CAP_GENERATE_MIPMAP to the disabled section for all drivers
v3: add format to the generate_mipmap interface to allow mipmap generation
using a format other than the resource format
v4: fix return type of trace_context_generate_mipmap()
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
| |
It makes sense to re-use pipe->invalidate_resource for the purpose of
glInvalidateBufferData, but this function is already implemented in vc4
where it doesn't have the expected behavior. So add a capability flag
to indicate that the driver supports the expected behavior.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
v2: document the integer behavior
Reviewed-by: Edward O'Callaghan <[email protected]
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
Found during NIR_TEST_CLONE=1 piglit run. We were using block->index
but forgetting to require it. Causing things to not work with a cloned
shader which didn't preserve block_index.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Immediately convert into NIR and do an initial key-agnostic lowering/
optimization pass. This should let us share most of the per-variant
transformations between each variant, and hopefully minimize the draw-
time variant creation part of the compilation process.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
It will still hit a compile_assert() in emit_tex, which has the
advantage of dumping out the offending shader.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The fixed alignment of u_upload_mgr will go away.
This is the first step.
The motivation is that one u_upload_mgr can have multiple users,
each allocating from the same buffer, but requiring a different alignment.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
| |
This allows the state tracker to know that the various draw parameters
are available in vertex shaders.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When Connor originally drafted NIR, he copied the same function+overload
system that GLSL IR had with a few names changed. However, this
double-indirection is not really needed and has only served to confuse
people. Instead, let's just have functions which may not have unique names
and may or may not have an implementation. If someone wants to do overload
resolving, they can hav a hash table based function+overload system in the
overload resolving pass. There's no good reason to keep it in core NIR.
Reviewed-by: Connor Abbott <[email protected]>
Acked-by: Kenneth Graunke <[email protected]>
ir3 bits are
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Tessellation control shaders need to be careful when writing outputs.
Because multiple threads can concurrently write the same output
variables, we need to only write the exact components we were told.
Traditionally, for sub-vector writes, we've read the whole vector,
updated the temporary, and written the whole vector back. This breaks
down with concurrent access.
This patch prepares the way for a solution by adding a writemask field
to store_var intrinsics, as well as the other store intrinsics. It then
updates all produces to emit a writemask of "all channels enabled". It
updates nir_lower_io to copy the writemask to output store intrinsics.
Finally, it updates nir_lower_vars_to_ssa to handle partial writemasks
by doing a read-modify-write cycle (which is safe, because local
variables are specific to a single thread).
This should have no functional change, since no one actually emits
partial writemasks yet.
v2: Make nir_validate momentarily assert that writemasks cover the
complete value - we shouldn't have partial writemasks yet
(requested by Jason Ekstrand).
v3: Fix accidental SSBO change that arose from merge conflicts.
v4: Don't try to handle writemasks in ir3_compiler_nir - my code
for indirects was likely wrong, and TTN doesn't generate partial
writemasks today anyway. Change them to asserts as requested by
Rob Clark.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]> [v3]
|