| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
| |
v2: fix silly typo
Cc: "17.2 17.3" <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
It's still printed after linking, but it makes more sense to
have SPIRV->NIR->LLVM IR->ASM.
Fixes: f0a2bbd1a4 (radv: move nir print after linking is done)
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
Going from binary to hex has a 2x blowup.
Fixes: 14216252923 'radv: create on-disk shader cache'
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
| |
This will allow dead components of varyings to be removed.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is needed for RADV to support explicit component packing.
This is also required to use the new NIR component splitting /
packing passes.
V2:
- add commponent packing support for interpolate_at* intrinsics
- improve store packing support when not all varyings are scalar
as spotted by Bas the store source was incorrectly offset.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
Not sure how useful this is, but it makes it more consistent.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Cc: "17.3" <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
For certain buffer meta ops we can use the CP or a compute shader,
we should use a define to rather than hardcoding 4096, allows
for easier testing and more consistency.
Reviewed-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
| |
radeonsi only emits these when dfsm is enabled, so for now
just hinge them on a flag we never set.
Reviewed-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This restores performance for the drirc workaround, i.e.
KILL_IF does:
visible = src0 >= 0;
kill_flag &= visible; // accumulate kills
amdgcn_kill(wqm_vote(visible)); // kill fully dead quads only
And all helper pixels are killed at the end of the shader:
amdgcn_kill(kill_flag);
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
| |
This will be a new LLVM intrinsic and will also work nicely with
llvm.amdgcn.wqm.vote.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
| |
We now have linking optimisations so we want to delay dumping the
nir until after these are complete.
Fixes: 06f05040eb73 (radv: Link shaders)
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The IR is reused in different pipeline combinations so we need
to clone it to avoid link time optimistaions messing up the
original copy.
Fixes: 06f05040eb73 (radv: Link shaders)
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This was the actual cause of GPU hangs fixed by 0fdd531457ec ("radv:
Fix pipeline cache locking issues"), since multiple threads would end
up trying to create the variants for a single entry.
Now that we're locking around the whole of this function, this isn't
really necessary (we either create all or none of the variants), but
fix this anyway in case things change later.
Signed-off-by: Alex Smith <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
CC: 17.3 <[email protected]>
|
|
|
|
|
|
| |
We know that num_components will be > 0, but it doesn't.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
| |
Fixes KHR-GL45.texture_swizzle.smoke and others on Vega.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102809
Cc: [email protected]
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Vulkan CTS does not expect the value to be clamped (at least for D32),
and it makes a differences even though depth is in [0,1], due
to strict inequalities.
I couldn't find anything in the Vulkan spec about this, but the test
seemed to be copied from GL tests and the GL spec only specifies
clamping for fixed point formats. Hence I expect radeonsi to run into
this at some point as well, but given that they still have a usecase
with the Z16->Z32 promotion, I'll leave that for someone else to clean
up.
This at least fixes radv dEQP-VK.texture.shadow.* on VI.
Fixes: 0f9e32519bb 'ac/nir: clamp shadow texture comparison value on VI'
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
| |
Since it also uses the output vector before writing to memory.
Fixes: e38685cc62e 'Revert "radv: disable support for VEGA for now."'
Reviewed-by: Dave Airlie <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
| |
Fixes: 1bcb953e166 'radv: handle GFX9 1D textures'
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Otherwise we just need to write them to the tf ring.
this seems to improve the tessellation demo on Bonarie
~2190->~2230 fps
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
| |
Due to LLVM bugs. Fixes a bunch of dEQP-VK.glsl.indexing.*
tests.
Fixes: e38685cc62e 'Revert "radv: disable support for VEGA for now."'
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
Mirrors the vram path.
Fixes: d4ecc3c9299 'ac/nir: Add loading from LDS for merged GS.'
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
When the gs_copy_shader is NULL (due to an incomplete cache), but
the main shaders are found, we still do the nir, but we shouldn't
compile the shaders again. For merged shaders we should also account
for the missing shaders.
Fixes: ce03c119ce0 'radv: Add code to compile merged shaders.'
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
We specify 127 instead of 32 as the limit in vulkan.
Fixes: 6bc42855f92 'radv: enable GS on GFX9'
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
| |
With merged shaders the vertex shader may not exist. This got in
because the offending patch was written before merged shaders were
upstream, but committed after.
Fixes: 75dfab24a2c 'radv: refactor indirect draws with radv_draw_info'
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Otherwise for non-indexed draws we set and immediately unset
RADV_CMD_DIRTY_INDEX_BUFFER. As all the set functions should
clear their own bit, this is unnecessary.
Fixes: 341529dbee5 'radv: use optimal packet order for draws'
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
| |
As they were emitted after the new pipeline, the changed pipeline
detection was not working anymore.
Fixes: 341529dbee5 'radv: use optimal packet order for draws'
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
| |
They don't take a single wave anymore and we need the barriers.
Fixes: 6bc42855f92 'radv: enable GS on GFX9'
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
| |
Fixes: ffaf4d608a1 'radv: Enable tessellation shaders for GFX9.'
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Need to lock around the whole process of retrieving cached shaders, and
around GetPipelineCacheData.
This fixes GPU hangs observed when creating multiple pipelines in
parallel, which appeared to be due to invalid shader code being pulled
from the cache.
Signed-off-by: Alex Smith <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implicit sync kicks in when a buffer is used by two different amdgpu
contexts simultaneously. Jobs that use explicit synchronization
mechanisms end up needlessly waiting to be scheduled for long periods
of time in order to achieve serialized execution.
This patch disables implicit synchronization for all radv allocations
except for wsi bos. The only systems that require implicit
synchronization are DRI2/3 and PRIME.
v2: mark wsi bos as RADV_MEM_IMPLICIT_SYNC
v3: Add drm version check (Bas)
Signed-off-by: Andres Rodriguez <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This allows us to pass extra parameters to the memory allocation
operation that are not defined in the vulkan spec. This is useful for
internal usage.
Signed-off-by: Andres Rodriguez <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
Expose the extension string as supported
Signed-off-by: Andres Rodriguez <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch helps lower high priority compute latency. Found by
bisecting a perf regression on computeparticles with high priority
compute queues enabled.
Reverting this micro-optimization doesn't seem to have any negative
effect on performance on Dota2 or ssao.
Signed-off-by: Andres Rodriguez <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This extension allows the caller to change a queue's system wide
priority. This is useful for applications with specific
latency constraints.
Signed-off-by: Andres Rodriguez <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When WAVE_LIMIT is set, a submission will opt-in for SPI based resource
scheduling. Because this mechanism is cooperative, we must ensure that
all submissions have this field set, otherwise they will bypass resource
arbitration.
We always hardcode the field to its maximum value, instead of attempting
to calculate an approximate usage. In testing, there were no benefits to
using anything other than the maximum.
Signed-off-by: Andres Rodriguez <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
It's redundant with nir_shader::info::stage.
Acked-by: Timothy Arceri <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Ported from RadeonSI. The time where shaders are idle should
be shorter now. This can give a little boost, like +6% with
the dynamicubo Vulkan demo.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The Vulkan specification says:
"... an execution dependency with only VK_PIPELINE_STAGE_TOP_OF_-
PIPE_BIT in the source stage mask will effectively not wait for
any prior commands to complete."
Signed-off-by: Fredrik Höglund <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
| |
Fixes two compilation warnings in release build. Trivial.
Signed-off-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
| |
Similar to the dispatch codepath.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
Indirect draws with a count buffer will be refactored in a
separate patch.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|