| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
Signed-off-by: Gert Wollny <[email protected]>
Reviewed-by: Reviewed-by: Dave Airlie <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5085>
|
|
|
|
|
| |
Signed-off-by: Glenn Kennard <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
the vpm bit wasn't being applied to the push/pop instructions.
Reviewed-by: Roland Scheidegger <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
This field is ignored for tf writes so should be 0.
Reviewed-by: Roland Scheidegger <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
So far on pre-cayman chipsets the CF instructions CF_OP_LOOP_END,
CF_OP_CALL_FS, CF_OP_POP, and CF_OP_GDS an extra CF_NOP instruction
was added to add the EOP flag, even though this is not actually
needed, because all these instrutions support the EOP flag.
This patch removes the fixup code, adds setting the EOP flag for the
according instructions as well as others like CF_OP_TEX and CF_OP_VTX,
and adds writing out EOP for this type of instruction in the disassembler.
This also fixes a bug where shaders were created that didn't actually have
the EOP flag set in the last CF instruction, which might have resulted
in GPU lockups.
[airlied: cleaned up a little]
Signed-off-by: Gert Wollny <[email protected]>
Cc: <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
| |
This adds support for emitting RAT instructions to the assembler.
RAT instructions are used to implement image accessors.
Reviewed-by: Nicolai Hähnle <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
| |
This adds support to the assembler for the mark bit
on the export word1.
Reviewed-by: Nicolai Hähnle <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
| |
This just adds support to the assembler for setting the valid
pixel mode on the CF clause.
Reviewed-by: Nicolai Hähnle <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
This adds support for the GDS operations needed to do atomic
counters.
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
These are used in tessellation shaders to read/write values
between VS/TCS/TES.
This splits the eg alu assembler out to handle these
instructions.
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
This just adds enough for the tessellation shaders,
which require TF_WRITE to work.
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
| |
This just adds support to the assembler dumper and allows
stream instructions to be generated. Also fix up the stream
debugging to add stream info.
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Cayman needs a different method to upload the CF IDX0/1
This fixes 31 piglits when ARB_gpu_shader5 is forced on
with cayman.
Reviewed-by: Glenn Kennard <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Caveat: Shaders using UBO/sampler indexing will
not be optimized by SB, due to SB not currently
supporting the necessary CF_INDEX_[01] index
registers.
Signed-off-by: Glenn Kennard <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is Vadim's initial work with a few regression fixes squashed in.
v2: (airlied)
fix regression in glsl-max-varyings - need to use vs and ps_dirty
fix regression in shader exports from rebasing.
whitespace fixing.
v2.1: squash fix assert
Signed-off-by: Vadim Girlin <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
|
|
|
|
|
|
|
| |
It looks like we need these for geom shaders in the future.
Signed-off-by: Dave Airlie <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
|
|
|
|
|
|
|
|
| |
v2: fix regression on r600 NOP instructions.
Signed-off-by: Vadim Girlin <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
| |
v3: added some flags including condition codes for ALU,
fixed issue with CF reverse lookup (overlapping ranges of CF_ALU_xxx
and other CF instructions)
rebased on current master
Signed-off-by: Vadim Girlin <[email protected]>
|
|
|
|
| |
Reviewed-by: Tom Stellard <[email protected]>
|
|
|
|
| |
Reviewed-by: Tom Stellard <thomas.stellard at amd.com>
|
| |
|
|
|
|
|
|
|
|
|
|
| |
LOOP_START_DX10 ignores the LOOP_CONFIG* registers, so it is not limited
to 4096 iterations like the other LOOP_* instructions. Compute shaders
need to use this instruction, and since we aren't optimizing loops with
the LOOP_CONFIG* registers for pixel and vertex shaders, it seems like
we should just use it for everything.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
This is a pseudo instruction that enables the LLVM backend to encode
instructions and pass it through r600_bytecode_build()
Signed-off-by: Tom Stellard <[email protected]>
|
| |
|
|
|
|
|
|
|
|
|
| |
Add support for multiple kcache banks (constant buffers).
Lock the required lines only.
Allow up to 4 kcache line sets in the alu clause by using ALU_EXTENDED on eg+.
Signed-off-by: Vadim Girlin <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
| |
r600: DONE.
r700: MOSTLY (done but locks up).
Evergreen: MOSTLY (done but doesn't work for an unknown reason).
The kernel support will come soon.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We need something that looks like a compiler and not like some hacker
put some functions together. /rant
This is a band-aid for these two problems:
- The R600 and EG control-flow instructions appear in switch statements
next to each other, causing conflicts when adding new instructions.
- The ALU control-flow instructions are bitshifted by 3 (from CF_INST 26:29
to CF_INST 23:29, as is defined by r600 ISA) even for EG, where CF_INST
is 22:29.
To fix this mess, the 'inst' field is bitshifted to the left either by 22, 23,
or 26 (directly in the definitions), such that it can be just or'd when making
bytecode without any shifting. All switch statements have been divided into
two, one for R600 and the other for EG.
Of course, there is a better way to do this, but that is left for future
work.
Tested on RV730 and REDWOOD with no regressions.
v2: minor cleanup as per Alex's comment.
Reviewed-by: Alex Deucher <[email protected]>
|
|
|
|
| |
It took me a while to figure out what it stands for.
|
|
|
|
| |
Signed-off-by: Henri Verbeet <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Cayman is the RadeonHD 69xx series of GPUs. This adds support for
3D acceleration to the r600g driver.
Major changes:
Some context registers moved around - mainly MSAA and clipping/guardband related.
GPR allocation is all dynamic
no vertex cache - all unified in texture cache.
5-wide to 4-wide shader engines (no scalar or trans slot)
- some changes to how instructions are placed into slots
- removal of END_OF_PROGRAM bit in favour of END flow control clause
- no vertex fetch clause - TC accepts vertex or texture
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
| |
Signed-off-by: Henri Verbeet <[email protected]>
|
| |
|
|
|
|
|
| |
Join multiple exports into just one instruction
instead of exporting each register separately.
|
|
|
|
|
| |
If last instruction is an CF_INST_ALU we don't need to emit an
additional CF_INST_POP for stack clean up after an IF ELSE ENDIF.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Vertex elements change are less frequent than draw call, those to
avoid rebuilding fetch shader to often build the fetch shader along
vertex elements. This also allow to move vertex buffer setup out
of draw path and make update to it less frequent.
Shader update can still be improved to only update SPI regs (based
on some rasterizer state like flat shading or point sprite ...).
Signed-off-by: Jerome Glisse <[email protected]>
|
|
|
|
|
|
|
|
| |
Use fetch shader instead of having fetch instruction in the vertex
shader. Allow to restrict shader update to a smaller part when
vertex buffer input layout changes.
Signed-off-by: Jerome Glisse <[email protected]>
|
| |
|
|
|
|
| |
Signed-off-by: Jerome Glisse <[email protected]>
|
|
|
|
|
|
| |
Lot of clean can now happen.
Signed-off-by: Jerome Glisse <[email protected]>
|
|
|
|
|
|
|
| |
Avoid using r600_screen structure to get ptr to radeon
winsys structure.
Signed-off-by: Jerome Glisse <[email protected]>
|
|
adds shader opcodes + assembler support (except ARL)
uses constant buffers
add interp instructions in fragment shader
adds all evergreen hw states
adds evergreen pm4 support.
this runs gears for me on my evergreen
|