| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
Properly validate DLL matches OBJ for jitted function
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
| |
Create shader_lib during build, link with shaders at DLL generation time
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add a new define (USE_SIMD16_VS), to denote calling a 16-wide vertex shader.
This is needed because the mesa driver can do 16-wide shaders, but rasty
cannot yet, so we need to distinguish.
Create a new VertexID entry (VertexID16) for the USE_SIMD16_VS case, since
we need to format the vertex id in a way that is digestible by the 16-wide VS
Disabled for now. To be enabled in a future checkin when driver work
is complete.
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
| |
Add name argument to x86 autogenerated macros.
Add useful variable names for DCL_inputVec implementation.
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
| |
in shader and fetch dump files
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
| |
- Move debug .ll files to JIT_CACHE_DIR
- Don't link against jitter SRGBLut table, add global data to shader that needs it.
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
| |
Added support for Fetch / Sample / LD functions
Added DLL link to JitCache implementation
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Adds ability to step into jitted llvm IR in Visual Studio.
- Updated llvm type generation script to also generate corresponding debug types.
- New module pass inserts debug metadata into the IR for each function
Disabled by default.
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
| |
+ ZeroMemory() macro definition for non win32-compilation in common/os.h
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Part 2 of 2 (part 1 is autoconf changes, part 2 is C++ changes)
When only a single SWR architecture is being used, this allows that
architecture to be builtin rather than as a separate libswrARCH.so that
gets loaded via dlopen. Since there are now several different code
paths for each detected CPU architecture, the log output is also
adjusted to convey where the backend is getting loaded from.
This allows SWR to be used for static mesa builds which are still
important for large HPC environments where shared libraries can impose
unacceptable application startup times as hundreds of thousands of copies
of the libs are loaded from a shared parallel filesystem.
Based on an initial implementation by Tim Rowley.
v2: Refactor repetitive preprocessor checks to reduce code duplication
v3: Formatting changes per Bruce C. Also delay screen creation until end
to avoid leaks when failure conditions are hit.
Signed-off-by: Chuck Atkins <[email protected]>
Reviewed-by: Bruce Cherniak <[email protected]>
CC: Tim Rowley <[email protected]>
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Part 1 of 2 (part 1 is autoconf changes, part 2 is C++ changes)
When only a single SWR architecture is being used, this allows that
architecture to be builtin rather than as a separate libswrARCH.so that
gets loaded via dlopen. Since there are now several different code
paths for each detected CPU architecture, the log output is also
adjusted to convey where the backend is getting loaded from.
This allows SWR to be used for static mesa builds which are still
important for large HPC environments where shared libraries can impose
unacceptable application startup times as hundreds of thousands of copies
of the libs are loaded from a shared parallel filesystem.
Based on an initial implementation by Tim Rowley.
v2: Fix comment placement pointed out by Bruce C.
Signed-off-by: Chuck Atkins <[email protected]>
Reviewed-by: Bruce Cherniak <[email protected]>
CC: Tim Rowley <[email protected]>
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Emil Velikov <[email protected]>
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Only one piglit test fails,
sso-vs-gs-fs-array-interleave
There are 3 tests using ssbo without checking sizes failing also
but those are test bugs.
Reviewed-by: Roland Scheidegger <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
| |
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
| |
if no destination:
a) convert _RET instructions to non _RET variants if no dst
b) set src0 to undefined if it's a READ, this should get DCE then.
Acked-By: Roland Scheidegger <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The normal ssa renumbering isn't sufficient for LDS queue access,
this uses two stacks, one for the lds queue, and one for the
lds r/w ordering.
The LDS oq values are incremented in their use in a linear
fashion.
The LDS rw values are incremented in their definitions and used
in the next lds operation to ensure reordering doesn't occur.
Acked-By: Roland Scheidegger <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
So LDS ops have to be SLOT_X,
and LDS OQ reads have read port restrictions so we try
and force those into only having one per slot and avoiding
bank swizzles.
Acked-By: Roland Scheidegger <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This tries to avoid an lds queue read getting scheduled separately
from an lds ret read, the non-sb code uses the same style of hammer,
this isn't foolproof.
We can do better, but it's a bit tricky, as you have to scan ahead
and either schedule more lds oq moves and more lds reads and that
could lead to you running out of space anyways.
Acked-By: Roland Scheidegger <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This adds support for tracking the lds oq read/writes
so can avoid scheduling other things in between.
This patch just adds the tracking and assert to show
problems.
Acked-By: Roland Scheidegger <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
You have to schedule LDS_READ_RET _, x and MOV reg, LDS_OQ_A_POP
in the same basic block/clause. This makes sure once we've issues
and MOV we don't add another block until we balance it with an
LDS read.
Acked-By: Roland Scheidegger <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
This adds lds to the geom emit handling
Acked-By: Roland Scheidegger <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
Don't try and fold LDS using expressions.
Acked-By: Roland Scheidegger <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
We need to convert these to the hw special registers.
Acked-By: Roland Scheidegger <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
This handles parsing the LDS ops and queue accessess.
Acked-By: Roland Scheidegger <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
This fixes bad interactions with the LDS special values.
Acked-By: Roland Scheidegger <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
Although these are op3s they don't have a dst reg.
Acked-By: Roland Scheidegger <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For LDS read/write ordering we use the LDS_RW value, reads
will wait on previous writes.
For LDS read/read from LDS queue ordering we use the LDS_OQ
values, we define two for now, though initially we'll just
support OQA.
Also add the check for the lds oq values
Acked-By: Roland Scheidegger <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
It's rare to have a final alu clause on normal shaders (exports)
but tess shaders write to LDS as their output, so we see some
alu clauses, and the CF_END get put in the wrong place.
This makes sure to update last_cf correctly.
Acked-By: Roland Scheidegger <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This adds support for GDS ops to sb backend.
This seems to work for atomics and tess factor writes.
Acked-By: Roland Scheidegger <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
This stops them being optimised out.
Reviewed-by: Roland Scheidegger <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some tess shaders were doing MOVA_INT _, c0.x on cayman, and then
hitting an assert in sb_bc_finalize.cpp:translate_kcache.
This makes sure the toplevel kcache tracker gets updated,
and the clause gets fixed up.
Reviewed-by: Roland Scheidegger <[email protected]>
Cc: <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
Just saves a pointless a = a + 0;
Reviewed-by: Roland Scheidegger <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
This field is ignored for tf writes so should be 0.
Reviewed-by: Roland Scheidegger <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Gert Wollny <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This query shows the ratio of total commands vs. drawing commands sent
to the vgpu device. This gives some idea of how many state changes
are sent per draw call. The closer the ratio is to 1.0, the better.
Reviewed-by: Charmaine Lee <[email protected]>
Reviewed-by: Neha Bhende <[email protected]>
|
|
|
|
| |
Reviewed-by: Dylan Baker <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
We still have more work to do but piglit results are looking
pretty good.
At GLSL 1.50 we have 30647/31118 piglit tests passing.
At GLSL 4.50 we have 37927/38551 piglit tests passing.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The code to load outputs is essentially the same as load inputs
so we make the interface more generic to maximise code sharing.
We will make use of the new support in the following patch.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
Trivial. Found by Coccinelle.
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
Only update them when the pointers are valid.
Signed-off-by: Indrajit Das <[email protected]>
Reviewed-by: Christian König <[email protected]>
|