| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
| |
This allows dropping 1 call to st_nir_opts, because shaders are always
optimized after linking.
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The removed st_nir_opts calls are mostly redundant.
There is an improvement with shader-db on radeonsi:
Before:
real 1m54.047s
user 28m37.857s
sys 0m7.573s
After:
real 1m52.012s
user 28m3.412s
sys 0m7.808s
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
DPH isn't actually commutative, so this doesn't work. If the immediate
in src0 would be a VF candidate, we could do better. *shrug*
No shader-db changes on any Intel platform.
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
Fixes: b04beaf41d2 ("intel/vec4: Try both sources as candidates for being immediates")
|
|
|
|
|
| |
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
Fixes: 09705747d72 ("nir/algebraic: Reassociate fadd into fmul in DPH-like pattern")
|
|
|
|
|
|
| |
Signed-off-by: Jordan Justen <[email protected]>
Suggested-by: Nanley Chery <[email protected]>
Reviewed-by: Nanley Chery <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Nanley Chery <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Rework:
* Disallow linear 1D stencil buffers (Nanley)
* Force Y for gen12 stencil rather than ~W (Nanley)
Co-authored-by: Nanley Chery <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Nanley Chery <[email protected]>
|
|
|
|
|
|
| |
It's no longer supported by the hardware
Reviewed-by: Nanley Chery <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Rework:
* NULL stencil buffer path (Jason)
* genxml fixes (Nanley)
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Nanley Chery <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Reworks:
* Fix 3DSTATE_DEPTH_BUFFER "Surface Format" end in xml (Jason)
* Remove WM_HZ_OP changes (Nanley)
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Nanley Chery <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Nanley Chery <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This function was difficult to implement for new formats due to the
combination of endianness and swapbytes support. Since it's mostly
used for fast paths, bugs in it were often missed during testing.
Just reimplement it on top of the recent
_mesa_format_from_format_and_type() which can give us a canonical
MESA_FORMAT for a format and type enum (while respecting endianness).
Fixes:
- R4G4B4A4_UNORM, B4G4R4_UINT, R4G4B4A4_UINT incorrectly matched with
swapBytes (you can't just reverse the channels if the channels
aren't bytes)
- A4R4G4B4_UNORM and A4R4G4B4_UINT missing BGRA/4444_REV matches
- failing to match RGB/BGR unorm8 array formats on BE
- 2101010 formats incorrectly matching with swapBytes set.
- UINT/SINT byte formats failed to match with swapBytes set.
This deletes the part of tests/mesa_formats.cpp that called
_mesa_format_matches_format_and_type() to make sure it didn't
assertion fail, as it now would assertion fail due to the fact that we
were passing an invalid format (GL_RG) for most types.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
In desktop GL, you can specify things like GL_DEPTH_COMPONENT/GL_BYTE as a
ReadPixels format, and we need to be able to represent that to see if we
have proper MESA_FORMATs for them. That's exactly what the
mesa_array_format enum is for.
v2: Drop _mesa from static fn.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
We had missed this case where GLES3 allows glReadPixels(DEPTH, UINT_24_8),
and just got lucky by the readpixels path never asking for the matching
format from this function.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The GL spec says the 24-bit component is in the high bits, and
format_unpack.c looks at the high 24 bits in the S8Z24 case, not
Z24SS8.
Avoids a regression in the next commit.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
The unreachable() that follows isn't very useful for debug, and by adding
this here we get a nice description of the failure in debug builds.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Signed-off-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
| |
Some queries are still failing and layered rending needs more work.
Signed-off-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
| |
When we don't have streamout enabled, we have to read this register to
get the number of primitives emitted.
Signed-off-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
| |
We have GS state now.
Signed-off-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
| |
Signed-off-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
| |
Signed-off-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
| |
Signed-off-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
When used in a GS pipeline, the VS doesn't end with the END
instruction. Instead it chains to the GS, which continues running with
the same register allocation. The intended use cases seems to be that
you can compile a regular VS (ie outputs in registers and ending with
END) but then tack on link-time generated code past the END to write
the outputs using STLW, in case the VS is used with GS.
Signed-off-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
| |
We don't know what kind of loads we might have to wait on when coming
in from chsh in the VS so set both sync flags.
Signed-off-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
These sysvals have to be unclobbered by VS and in the same registers
in both VS and GS, since the chsh from VS to GS doesn't reload the
values. We use the pre-color argument to ir3_ra() to always place
these values in r0.x and r0.y.
Signed-off-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Inputs are the GS header, which contains vertex ID, local primitive ID
and thread ID as well as primitive ID. The setup is a little different
from other sysvals, since we always have to receive them in the VS so
that it can pass them on into the GS.
The vertex flag outputs from GS is set up as a proper nir output in
the lowering pass and doesn't need special handling here.
Signed-off-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
|
| |
This implements the load_vs_primitive_stride_ir3,
load_vs_vertex_stride_ir3 and load_primitive_location_ir3 intrinsics,
used for getting the primitive layout strides and locations.
Signed-off-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
|
| |
This introduces two new lowering passes. One to lower VS to explicit
outputs using STLW and one to lower GS to load input using LDLW and
implement the GS specific functionality.
Signed-off-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
| |
Since the presence of GS changes how the VS operates we need to track
that in the shader key.
Signed-off-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
| |
Signed-off-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
| |
These intrinsics will let us do all the offset calculations in nir,
which is nicer to work with and lets nir_opt_algebraic eat it all up.
Signed-off-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
| |
These access memory used for passing data between geometry stages.
Signed-off-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
| |
We'll need to pre-color certain input registers betwee VS and GS
shaders.
Signed-off-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
|
| |
Before, offset held the offset, which can be either immediate or a
register. Use a third register to hold the offset so that we can use
a register.
Signed-off-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
| |
Just add the constructors for now and special case similar to END so
we don't remove them.
Signed-off-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
| |
We know what these do an either write them in the program stateobj or
don't need to write them.
Signed-off-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
| |
Signed-off-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Daniel Stone <[email protected]>
|
|
|
|
| |
trivial
|
|
|
|
| |
trivial
|
|
|
|
| |
Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
|
|
|
|
| |
Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
|
|
|
|
|
|
| |
so that we immediately set the no-op dispatch
Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Tests the combinations of cases of RAW, WAW and WAR hazards involving
both inorder and outoforder instructions. Also tests that
dependencies combine and propagate correctly through control
flow (loops and conditionals).
v2: Add an extra test illustrating that the non-logical CFG edge
between then-block and else-block is being taking into
account. (Curro)
Reviewed-by: Francisco Jerez <[email protected]>
|