| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
u_upload_mgr sets it, so that util_range_add can skip the lock.
The time spent in tc_transfer_flush_region decreases from 0.8% to 0.2%
in torcs on radeonsi.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With the current code, we didn't do the space checks prior
to atomic counter setup emission, but we also didn't add
atomic counters to the space check so we could get a flush
later as well.
These flushes would be bad, and lead to problems with
parallel tests. We have to ensure the atomic counter copy in,
draw emits and counter copy out are kept in the same command
submission unit.
This reworks the code to drop some useless masks, make the
counting separate from the emits, and make the space checker
handle atomic counter space.
[airlied: want this in 18.2]
Fixes: 06993e4ee (r600: add support for hw atomic counters. (v3))
|
| |
|
|
|
|
| |
Acked-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
Call r600_dma_emit_wait_idle only when there is a possibility of
a read-after-write hazard. Buffers not yet used by the SDMA IB don't
have to wait.
Reviewed-by: Nicolai Hähnle <[email protected]>
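The idea in this message can be sketched as follows. This is a minimal illustration only: every name here (sketch_buffer, sketch_dma_ctx, sketch_need_dma_space, the referenced_by_dma_ib flag) is hypothetical and does not come from the driver. It assumes a per-buffer flag that records whether the current SDMA IB already uses the buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the real r600 structures. */
struct sketch_buffer {
    bool referenced_by_dma_ib; /* set once the current SDMA IB uses the buffer */
};

struct sketch_dma_ctx {
    unsigned wait_idle_count;  /* how many wait-idle packets were emitted */
};

/* Models the cost being avoided: a full wait-for-idle on the DMA ring. */
static void sketch_emit_wait_idle(struct sketch_dma_ctx *ctx)
{
    ctx->wait_idle_count++;
}

/* Prepare a DMA operation on dst/src. Wait only if an earlier packet in
 * this IB may have touched either buffer (possible read-after-write). */
static void sketch_need_dma_space(struct sketch_dma_ctx *ctx,
                                  struct sketch_buffer *dst,
                                  struct sketch_buffer *src)
{
    assert(ctx != NULL);

    if ((dst && dst->referenced_by_dma_ib) ||
        (src && src->referenced_by_dma_ib))
        sketch_emit_wait_idle(ctx);

    /* From now on, both buffers count as used by this IB. */
    if (dst)
        dst->referenced_by_dma_ib = true;
    if (src)
        src->referenced_by_dma_ib = true;
}
```

With two fresh buffers, the first call emits no wait; a second call that touches either buffer again emits one.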
|
|
|
|
|
|
|
|
| |
The main impact is that fast color clear doesn't flush TC, CONST, DB.
Reviewed-by: Alex Deucher <[email protected]>
Tested-by: Grazvydas Ignotas <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
|
|
|
|
|
| |
v3: use PFP_SYNC_ME on EG-CM only when supported by the kernel,
otherwise use MEM_WRITE + WAIT_REG_MEM to emulate that
Reviewed-by: Alex Deucher <[email protected]>
Tested-by: Grazvydas Ignotas <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
|
|
|
| |
Mostly generated using a sed-script, with manual fix-up for multi-line
statements.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
This prevents IB rejections due to insane memory usage from
many consecutive texture uploads.
Reviewed-by: Alex Deucher <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Alex Deucher <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
| |
Use the priority flags and expand them.
This information will be used for debugging.
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
| |
This name should be easy to understand without other knowledge.
Reviewed-by: Alex Deucher <[email protected]>
Acked-by: Christian König <[email protected]>
|
|
|
|
| |
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Christian König <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
|
|
|
|
|
|
| |
*_dma_copy calls either *_dma_copy_buffer or *_dma_copy_tile.
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
| |
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
| |
It's flushed by calling r600_context_bo_reloc.
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Niels Ole Salscheider <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|
| |
|
|
|
|
|
|
|
| |
The DMA functions modify dst_offset and size and util_range_add gets wrong
values.
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
| |
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
operations"
This reverts commit 7948ed1250cae78ae1b22dbce4ab23aceacc6159.
It caused graphical corruption. I've got no idea why.
Bugzilla:
https://bugs.freedesktop.org/show_bug.cgi?id=70042
https://bugs.freedesktop.org/show_bug.cgi?id=68451
Conflicts:
src/gallium/drivers/r600/evergreen_hw_context.c
src/gallium/drivers/r600/r600_hw_context.c
src/gallium/drivers/r600/r600_pipe.h
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This streamout state code will be used by radeonsi.
There are new structures r600_common_context and r600_common_screen.
What is inherited by what is shown here:
pipe_context -> r600_common_context -> r600_context
pipe_screen -> r600_common_screen -> r600_screen
The common structures reside in drivers/radeon. Currently they only contain
enough functionality to be able to handle streamout. Eventually I'd like
the whole pipe_screen implementation to be shared and some of the context
stuff too.
This is quite big, but most changes are because of the new structures and
the fact r600_write_value is replaced by radeon_emit.
Thanks to Tom Stellard for fixing the build for r600g/compute.
Reviewed-by: Michel Dänzer <[email protected]>
Reviewed-by: Christian König <[email protected]>
Tested-by: Tom Stellard <[email protected]>
|
|
|
|
| |
I broke this with 7948ed1250cae78ae1b22dbce4ab23aceacc6159 for r700 at least.
|
|
|
|
|
|
|
| |
This should increase performance if constant uploads are done with the CP DMA,
because only the cache that needs to be flushed is flushed.
Reviewed-by: Alex Deucher <[email protected]>
|
|
|
|
|
|
|
| |
also flushing any cache in evergreen_emit_cs_shader seems to be superfluous
(we don't flush caches when changing the other shaders either)
Reviewed-by: Alex Deucher <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1. flush SH with read caches
2. add flag for DB flushes
3. add flag for CB flushes
v2: flush all CBs, remove redundant emit_state variable.
v3: Marek: also set the new flags in r600_context_flush, the CP dma functions,
and texture_barrier, and rename them
Signed-off-by: Marek Olšák <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Lighter weight than using streamout. Only Evergreen
and newer ASICs support embedded data as src with
CP DMA.
Reviewed-by: Jerome Glisse <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
|
|
| |
Reviewed-by: Jerome Glisse <[email protected]>
|
|
|
|
|
|
| |
It's nice to see so much code that did pretty much nothing go away.
Reviewed-by: Jerome Glisse <[email protected]>
|
|
|
|
| |
Reviewed-by: Jerome Glisse <[email protected]>
|
|
|
|
| |
Reviewed-by: Jerome Glisse <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Any driver can implement this simple and efficient optimization.
Team Fortress 2 hits it always. The DISCARD_RANGE codepath is not even used
with TF2 anymore, so we avoid a ton of useless buffer copies.
Tested-by: Andreas Boll <[email protected]>
NOTE: This is a candidate for the 9.1 branch.
|
|
|
|
|
|
| |
These registers are either already emitted elsewhere or moved to start_cs.
Tested-by: Andreas Boll <[email protected]>
|
|
|
|
| |
Signed-off-by: Jerome Glisse <[email protected]>
|
|
|
|
|
|
| |
v2: Add virtual address to dma src/dst offset for cayman
Signed-off-by: Jerome Glisse <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We keep track of ring emission order in a stack; whenever we need to
flush, we empty the stack in FIFO order. There are a few helper functions
for bo mapping and other ring activities that make sure
the ring stack is properly flushed and submitted.
v2: fix the st flush path and other flush paths to properly flush all
rings if necessary
v3: - improve the names of the ring helpers
- make sure that each time a cs is going to be written, it ends up at
the top of the stack to avoid issues such as:
STACK[0] = dma (with bo A,B)
STACK[1] = gfx (with bo C,D)
Now if the code tries to emit a dma command relative to bo C or D,
it will start writing the cmd stream into the cs, and once it
reaches the point where it adds the relocation, it will flush.
At that point the cs will have commands that don't have proper
relocations in the relocation buffer, and the kernel will just
refuse to run it.
v4: - Drop the stack idea, as it turns out there is no way to use it
or benefit from it. Any time the driver starts a command on another
ring, it always needs to flush the previous ring. So make the code
simpler by not using a stack.
Signed-off-by: Jerome Glisse <[email protected]>
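The rule the v4 note settles on (flush the previous ring whenever commands start on another ring) can be sketched like this. The names sketch_ctx, sketch_begin_ring and sketch_emit are hypothetical and only model the bookkeeping; nothing is actually submitted to hardware.

```c
#include <assert.h>

enum sketch_ring { SKETCH_RING_NONE, SKETCH_RING_GFX, SKETCH_RING_DMA };

/* Hypothetical context: tracks the last used ring and pending work. */
struct sketch_ctx {
    enum sketch_ring active; /* ring that last received commands */
    unsigned pending[3];     /* unsubmitted dwords per ring */
    unsigned flushes;        /* number of submissions made */
};

/* Submit whatever is pending on one ring. */
static void sketch_flush(struct sketch_ctx *ctx, enum sketch_ring ring)
{
    if (ctx->pending[ring]) {
        ctx->pending[ring] = 0;
        ctx->flushes++;
    }
}

/* Called before writing to a ring: if another ring was in use, flush it
 * first. No stack of rings is needed. */
static void sketch_begin_ring(struct sketch_ctx *ctx, enum sketch_ring ring)
{
    assert(ring != SKETCH_RING_NONE);

    if (ctx->active != SKETCH_RING_NONE && ctx->active != ring)
        sketch_flush(ctx, ctx->active);
    ctx->active = ring;
}

/* Queue ndw dwords of commands on the given ring. */
static void sketch_emit(struct sketch_ctx *ctx, enum sketch_ring ring,
                        unsigned ndw)
{
    sketch_begin_ring(ctx, ring);
    ctx->pending[ring] += ndw;
}
```

Two consecutive DMA emits cause no flush; a following GFX emit flushes the DMA ring once before the GFX commands are queued.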
|
|
|
|
|
|
|
| |
Also update the register value in more appropriate places
than r600_update_derived_state.
Reviewed-by: Jerome Glisse <[email protected]>
|
|
|
|
| |
Reviewed-by: Jerome Glisse <[email protected]>
|
|
|
|
| |
Reviewed-by: Jerome Glisse <[email protected]>
|
|
|
|
|
|
| |
The workaround for R600 lacking VPORT_SCISSOR_ENABLE has also been simplified.
Reviewed-by: Jerome Glisse <[email protected]>
|
|
|
|
|
|
|
| |
POLY_OFFSET_DB_FMT_CNTL is moved to the framebuffer state, because it only
depends on the zbuffer format.
Reviewed-by: Jerome Glisse <[email protected]>
|
|
|
|
|
|
|
| |
The state object is actually a buffer; it's literally a buffer containing
the shader code.
Reviewed-by: Jerome Glisse <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This is not so trivial, because we disable blending if the dual src
blending is turned on and the number of color outputs is less than 2.
I decided to create 2 command buffers in the blend state object and just
switch between them when needed, because there are other states unrelated
to blending (like the color mask) and those shouldn't be changed
(the old code had it wrong).
Reviewed-by: Jerome Glisse <[email protected]>
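The two-command-buffer approach described above might look roughly like this. The structure and function names are hypothetical, and the command buffers are reduced to small dword arrays; the only logic taken from the message is the selection condition (dual source blending on and fewer than 2 color outputs).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical pre-built command buffer. */
struct sketch_cmd_buffer {
    unsigned dw[4];
    unsigned num_dw;
};

/* Hypothetical blend state object holding both variants. Both carry the
 * same unrelated state (such as the color mask); only the blend enable
 * bits differ. */
struct sketch_blend_state {
    struct sketch_cmd_buffer with_blend;
    struct sketch_cmd_buffer no_blend;
    bool dual_src_blend;
};

/* Pick the buffer to emit: blending is disabled when dual source
 * blending is on and the shader writes fewer than 2 color outputs. */
static const struct sketch_cmd_buffer *
sketch_select_blend_cmds(const struct sketch_blend_state *blend,
                         unsigned num_color_outputs)
{
    assert(blend != NULL);

    if (blend->dual_src_blend && num_color_outputs < 2)
        return &blend->no_blend;
    return &blend->with_blend;
}
```

Since both buffers are built once when the state object is created, switching between them at draw time costs only a pointer selection.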
|
|
|
|
| |
Reviewed-by: Alex Deucher <[email protected]>
|
|
|
|
| |
Reviewed-by: Alex Deucher <[email protected]>
|
|
|
|
| |
Reviewed-by: Alex Deucher <[email protected]>
|
|
|
|
| |
Reviewed-by: Alex Deucher <[email protected]>
|
|
|
|
|
|
| |
Tested on RS880, Evergreen and Cayman.
Reviewed-by: Alex Deucher <[email protected]>
|