| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When both UTIL_QUEUE_INIT_RESIZE_IF_FULL and
UTIL_QUEUE_INIT_USE_MINIMUM_PRIORITY are set, we can get into a
situation where the queue never executes and grows to a huge size
due to all other threads being busy.
This is the case with the shader cache when attempting to compile a
huge number of shaders up front. If all threads are busy compiling
shaders the cache queues memory use can climb into the many GBs
very fast.
The use of these two flags with the shader cache is intended to
allow shaders compiled at runtime to be compiled as fast as possible.
To avoid huge memory use but still allow the queue to perform
optimally in the run time compilation case, we now add the ability
to track memory consumed by the jobs in the queue and limit it to
a hardcoded 256MB which should be more than enough.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
for ARB_parallel_shader_compile
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
| |
for ARB_parallel_shader_compile
|
|
|
|
|
| |
Initial version discussed with Rob Clark under a different patch name.
This approach leaves his driver unaffected.
|
|
|
|
|
|
|
| |
v2: simplifications
Reviewed-by: Timothy Arceri <[email protected]> (v1)
Reviewed-by: Eric Engestrom <[email protected]> (v1)
|
|
|
|
|
| |
Cc: 18.0 18.1 <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
| |
The relevant define changed in the final revision of the simple mutex
patch.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
Fixes e.g. piglit/bin/bufferstorage-persistent read -auto
Fixes: e6dbc804a87a ("winsys/amdgpu: handle cs_add_fence_dependency for deferred/unsubmitted fences")
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
v2:
- style fixes
- fix missing timeout handling in futex path
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Schedule one job for every thread, and wait on a barrier inside the job
execution function.
v2: avoid alloca (fixes Windows build error)
Reviewed-by: Marek Olšák <[email protected]> (v1)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fences are now 4 bytes instead of 96 bytes (on my 64-bit system).
Signaling a fence is a single atomic operation in the fast case plus a
syscall in the slow case.
Testing if a fence is signaled is the same as before (a simple comparison),
but waiting on a fence is now no more expensive than just testing it in
the fast (already signaled) case.
v2:
- style fixes
- use p_atomic_xxx macros with the right barriers
Acked-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Consider the following situation:
mtx_lock(mutex);
do_something();
util_queue_add_job(...);
mtx_unlock(mutex);
If the queue is full, util_queue_add_job will wait for a free slot.
If the job which is currently being executed tries to lock the mutex,
it will be stuck forever, because util_queue_add_job is stuck.
The deadlock can be trivially resolved by increasing the queue size
(reallocating the queue) in util_queue_add_job if the queue is full.
Then util_queue_add_job becomes wait-free.
radeonsi will use it.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
for HUD integration in following commits. This valuable profiling data
will allow us to see on the HUD how well glthread is able to utilize
parallelism. This is better than benchmarking, because you can see
exactly what's happening and you don't have to be CPU-bound.
u_threaded_context has the same counters.
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
| |
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
This will allow us to use it outside of gallium for things like
compressing shader cache entries.
Reviewed-by: Marek Olšák <[email protected]>
|