| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Reduced stack size allows to run more threads in some cases,
improving performance for the shaders that use stack (that is, for the
shaders with control flow instructions). E.g. with unigine-based apps.
v4: implement exact computation taking into account wavefront size
v5: add cases for RV620, RS880
Signed-off-by: Vadim Girlin <[email protected]>
|
|
|
|
| |
Signed-off-by: Vadim Girlin <[email protected]>
|
|
|
|
|
|
|
|
| |
This should be used by both SI and R600.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
Tested-by: Michel Dänzer <[email protected]>
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
| |
between begin_query and end_query
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Virtual address is used for PIPE_QUERY_SO* queries in
r600_emit_query_begin, but not in r600_emit_query_end.
This will trigger a GPU fault when one of those queries is
made and virtual address is enabled.
Note: this is a candidate for the 9.1 branch
Signed-off-by: Alex Deucher <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The blit-based paths for TexImage, GetTexImage, and ReadPixels aren't very
fast with software rasterizer. Now Gallium drivers have the ability to turn
them off.
Reviewed-by: Brian Paul <[email protected]>
Tested-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
| |
This helps minimize confusion / effort when moving between branches or
helping others.
Reviewed-by: Alex Deucher <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This makes it possible to identify gl_TexCoord and gl_PointCoord
for drivers where sprite coordinate replacement is restricted.
The new PIPE_CAP_TGSI_TEXCOORD decides whether these varyings
should be hidden behind the GENERIC semantic or not.
With this patch only nvc0 and nv30 will request that they be used.
v2: introduce a CAP so other drivers don't have to bother with
the new semantic
v3: adapt to introduction gl_varying_slot enum
|
|
|
|
|
|
|
|
|
| |
Doesn't exist on the asic and will cause a CS rejection
if VM is disabled.
Note: this is a candidate for the 9.1 branch.
Signed-off-by: Alex Deucher <[email protected]>
|
|
|
|
|
|
| |
Not using HiS yet, but matches what we do on evergreen+.
Signed-off-by: Alex Deucher <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
To further improve the optimization of source and destination
indirect addressing we need the ability to store a reference
to the declaration of the addressed operands.
Since most of the fields in tgsi_src_register doesn't apply for
an indirect addressing operand replace it with a separate
tgsi_ind_register structure and so make room for extra information.
v2: rename Declaration to ArrayID, put the ArrayID into () instead of []
Signed-off-by: Christian König <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Needs to be set for depth, stencil, and fmask just
like other blocks.
v2: drop additional cayman bits for now
Signed-off-by: Alex Deucher <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On cayman, 128bpp surfaces require non_disp ordering for hw
access to both linear and tiled surfaces. When we use the 3D
engine we can set the non_disp ordering on both the tiled and
linear sides (via CB or texture), but when we use the DMA
engine, we can only set the non_disp ordering on the tiled
side, so after a L2T operation with the DMA engine, the data
ends up in the wrong order on the tiled side.
v2: cayman/TN only
v3: fix comments
Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=60802
Note: this is a candidate for the 9.1 branch.
Signed-off-by: Alex Deucher <[email protected]>
|
| |
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
|
| |
|
|
|
|
| |
This will be invaluable for debugging and bug reports.
|
|
|
|
|
|
| |
This avoids the kernel CS checker errors with MSAA textures.
Reviewed-by: Jerome Glisse <[email protected]>
|
|
|
|
| |
Reviewed-by: Jerome Glisse <[email protected]>
|
|
|
|
| |
Reviewed-by: Jerome Glisse <[email protected]>
|
|
|
|
|
|
| |
It's nice to see so much code that did pretty much nothing go away.
Reviewed-by: Jerome Glisse <[email protected]>
|
|
|
|
| |
Reviewed-by: Jerome Glisse <[email protected]>
|
|
|
|
| |
Reviewed-by: Jerome Glisse <[email protected]>
|
|
|
|
|
|
| |
also change names of other functions, so that they make sense
Reviewed-by: Jerome Glisse <[email protected]>
|
| |
|
|
|
|
| |
Reviewed-by: Tom Stellard <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Only the disassembler is used to dump shaders. Here's a few examples
how to use R600_DEBUG.
Log compute info:
R600_DEBUG=compute
Dump all shaders:
R600_DEBUG=fs,vs,gs,ps,cs
Dump pixel shaders only:
R600_DEBUG=ps
Disable Hyper-Z:
R600_DEBUG=nohyperz
Disable the LLVM backend:
R600_DEBUG=nollvm
Or use any combination of the above, or print all options:
R600_DEBUG=help
Reviewed-by: Tom Stellard <[email protected]>
|
|
|
|
| |
Reviewed-by: Tom Stellard <[email protected]>
|
| |
|
|
|
|
|
|
|
| |
v2: remove unrelated changes
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Tom Stellard <[email protected]>
|
|
|
|
|
|
| |
To match recent LLVM changes.
Signed-off-by: Christian König <[email protected]>
|
|
|
|
|
| |
Fixes a llvm uncovered (rare) bug where consecutive exports were
merged even if they have incompatible mask.
|
|
|
|
|
| |
Tested-by: Vincent Lejeune <vljn at ovi.com>
Reviewed-by: Vincent Lejeune <vljn at ovi.com>
|
|
|
|
|
|
| |
Tested across several 6xx parts, no piglit regressions.
Signed-off-by: Alex Deucher <[email protected]>
|
|
|
|
|
|
| |
which is a leftover from the days when we used streamout to copy buffers
Tested-by: Andreas Boll <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Any driver can implement this simple and efficient optimization.
Team Fortress 2 hits it always. The DISCARD_RANGE codepath is not even used
with TF2 anymore, so we avoid a ton of useless buffer copies.
Tested-by: Andreas Boll <[email protected]>
NOTE: This is a candidate for the 9.1 branch.
|
|
|
|
|
|
| |
These registers are either already emitted elsewhere or moved to start_cs.
Tested-by: Andreas Boll <[email protected]>
|
|
|
|
|
|
|
| |
The states were split because we thought it caused a hardlock. Now we know
the hardlock was caused by something else and has since been fixed.
Tested-by: Andreas Boll <[email protected]>
|
|
|
|
|
|
| |
Tested-by: Andreas Boll <[email protected]>
NOTE: This is a candidate for the 9.1 branch.
|
|
|
|
|
|
|
|
|
|
|
|
| |
This doesn't fix any issue we know of, but there indeed is a week spot
in draw_vbo where streamout can fail. After streamout is enabled,
the need_cs_space call can flush the context, which causes the streamout
to be disabled right after it was enabled and bad things happen.
One way to fix it is to atomize the beginning part, so that no context flush
can happen between streamout enabling and the first drawing.
Tested-by: Andreas Boll <[email protected]>
|
|
|
|
|
|
|
|
| |
probably a typo
Tested-by: Andreas Boll <[email protected]>
NOTE: This is a candidate for the 9.1 branch.
|
|
|
|
|
|
| |
Tested-by: Andreas Boll <[email protected]>
NOTE: This is a candidate for the 9.1 branch.
|
|
|
|
|
|
|
|
|
|
|
| |
This work around disable hyperz if write to zbuffer is disabled. Somehow
using hyperz when not writting to the zbuffer trigger GPU lockup. See :
https://bugs.freedesktop.org/show_bug.cgi?id=60848
Candidate for 9.1
Signed-off-by: Jerome Glisse <[email protected]>
|