| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
| |
This uses the new kernel interfaces for reduced cs overhead,
We only set the local flag for memory allocations that don't have
a dedicated allocation and ones that aren't imports.
v2: add to all the internal buffer creation paths.
v3: missed some command submission paths, handle 0/empty bo lists.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When binding a new pipeline, we applied all dynamic states
without checking if they really need to be re-emitted. This
doesn't seem to be useful for the meta operations because only
the viewports/scissors are updated.
This should reduce the number of commands added to the IB
when a new graphics pipeline is bound.
Also, rename radv_dynamic_state_copy() to radv_bind_dynamic_state()
and set the dirty flags directly there.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
| |
The depth bounds test values are either set at pipeline
creation or dynamically using vkCmdSetDepthBounds().
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
This was the same between si and ac.
Reviewed-by: Timothy Arceri <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This was unused.
v2: drop args.
Reviewed-by: Timothy Arceri <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
These get used in fair few places.
Reviewed-by: Timothy Arceri <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
This just avoids having two copies of these.
Reviewed-by: Timothy Arceri <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
This was duplicated between both drivers, share here.
Reviewed-by: Timothy Arceri <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
| |
The beginning of the end for the shader keys. Not entirely sure
what I'm going to replace them with for the compiler though, so this
is the first step.
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
| |
To decouple the key used for info gathering and the cache from
whatever we pass to the compiler.
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
| |
v2: fix silly typo
Cc: "17.2 17.3" <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
It's still printed after linking, but it makes more sense to
have SPIRV->NIR->LLVM IR->ASM.
Fixes: f0a2bbd1a4 (radv: move nir print after linking is done)
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
Going from binary to hex has a 2x blowup.
Fixes: 14216252923 'radv: create on-disk shader cache'
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
| |
This will allow dead components of varyings to be removed.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is needed for RADV to support explicit component packing.
This is also required to use the new NIR component splitting /
packing passes.
V2:
- add commponent packing support for interpolate_at* intrinsics
- improve store packing support when not all varyings are scalar
as spotted by Bas the store source was incorrectly offset.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
Not sure how useful this is, but it makes it more consistent.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Cc: "17.3" <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
For certain buffer meta ops we can use the CP or a compute shader,
we should use a define to rather than hardcoding 4096, allows
for easier testing and more consistency.
Reviewed-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
| |
radeonsi only emits these when dfsm is enabled, so for now
just hinge them on a flag we never set.
Reviewed-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This restores performance for the drirc workaround, i.e.
KILL_IF does:
visible = src0 >= 0;
kill_flag &= visible; // accumulate kills
amdgcn_kill(wqm_vote(visible)); // kill fully dead quads only
And all helper pixels are killed at the end of the shader:
amdgcn_kill(kill_flag);
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
| |
This will be a new LLVM intrinsic and will also work nicely with
llvm.amdgcn.wqm.vote.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
| |
We now have linking optimisations so we want to delay dumping the
nir until after these are complete.
Fixes: 06f05040eb73 (radv: Link shaders)
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The IR is reused in different pipeline combinations so we need
to clone it to avoid link time optimistaions messing up the
original copy.
Fixes: 06f05040eb73 (radv: Link shaders)
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This was the actual cause of GPU hangs fixed by 0fdd531457ec ("radv:
Fix pipeline cache locking issues"), since multiple threads would end
up trying to create the variants for a single entry.
Now that we're locking around the whole of this function, this isn't
really necessary (we either create all or none of the variants), but
fix this anyway in case things change later.
Signed-off-by: Alex Smith <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
CC: 17.3 <[email protected]>
|
|
|
|
|
|
| |
We know that num_components will be > 0, but it doesn't.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
| |
Fixes KHR-GL45.texture_swizzle.smoke and others on Vega.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102809
Cc: [email protected]
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Vulkan CTS does not expect the value to be clamped (at least for D32),
and it makes a differences even though depth is in [0,1], due
to strict inequalities.
I couldn't find anything in the Vulkan spec about this, but the test
seemed to be copied from GL tests and the GL spec only specifies
clamping for fixed point formats. Hence I expect radeonsi to run into
this at some point as well, but given that they still have a usecase
with the Z16->Z32 promotion, I'll leave that for someone else to clean
up.
This at least fixes radv dEQP-VK.texture.shadow.* on VI.
Fixes: 0f9e32519bb 'ac/nir: clamp shadow texture comparison value on VI'
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
| |
Since it also uses the output vector before writing to memory.
Fixes: e38685cc62e 'Revert "radv: disable support for VEGA for now."'
Reviewed-by: Dave Airlie <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
| |
Fixes: 1bcb953e166 'radv: handle GFX9 1D textures'
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Otherwise we just need to write them to the tf ring.
this seems to improve the tessellation demo on Bonarie
~2190->~2230 fps
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
| |
Due to LLVM bugs. Fixes a bunch of dEQP-VK.glsl.indexing.*
tests.
Fixes: e38685cc62e 'Revert "radv: disable support for VEGA for now."'
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
Mirrors the vram path.
Fixes: d4ecc3c9299 'ac/nir: Add loading from LDS for merged GS.'
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
When the gs_copy_shader is NULL (due to an incomplete cache), but
the main shaders are found, we still do the nir, but we shouldn't
compile the shaders again. For merged shaders we should also account
for the missing shaders.
Fixes: ce03c119ce0 'radv: Add code to compile merged shaders.'
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
We specify 127 instead of 32 as the limit in vulkan.
Fixes: 6bc42855f92 'radv: enable GS on GFX9'
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
| |
With merged shaders the vertex shader may not exist. This got in
because the offending patch was written before merged shaders were
upstream, but committed after.
Fixes: 75dfab24a2c 'radv: refactor indirect draws with radv_draw_info'
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Otherwise for non-indexed draws we set and immediately unset
RADV_CMD_DIRTY_INDEX_BUFFER. As all the set functions should
clear their own bit, this is unnecessary.
Fixes: 341529dbee5 'radv: use optimal packet order for draws'
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
| |
As they were emitted after the new pipeline, the changed pipeline
detection was not working anymore.
Fixes: 341529dbee5 'radv: use optimal packet order for draws'
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
| |
They don't take a single wave anymore and we need the barriers.
Fixes: 6bc42855f92 'radv: enable GS on GFX9'
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
| |
Fixes: ffaf4d608a1 'radv: Enable tessellation shaders for GFX9.'
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Need to lock around the whole process of retrieving cached shaders, and
around GetPipelineCacheData.
This fixes GPU hangs observed when creating multiple pipelines in
parallel, which appeared to be due to invalid shader code being pulled
from the cache.
Signed-off-by: Alex Smith <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implicit sync kicks in when a buffer is used by two different amdgpu
contexts simultaneously. Jobs that use explicit synchronization
mechanisms end up needlessly waiting to be scheduled for long periods
of time in order to achieve serialized execution.
This patch disables implicit synchronization for all radv allocations
except for wsi bos. The only systems that require implicit
synchronization are DRI2/3 and PRIME.
v2: mark wsi bos as RADV_MEM_IMPLICIT_SYNC
v3: Add drm version check (Bas)
Signed-off-by: Andres Rodriguez <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This allows us to pass extra parameters to the memory allocation
operation that are not defined in the vulkan spec. This is useful for
internal usage.
Signed-off-by: Andres Rodriguez <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
Expose the extension string as supported
Signed-off-by: Andres Rodriguez <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch helps lower high priority compute latency. Found by
bisecting a perf regression on computeparticles with high priority
compute queues enabled.
Reverting this micro-optimization doesn't seem to have any negative
effect on performance on Dota2 or ssao.
Signed-off-by: Andres Rodriguez <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This extension allows the caller to change a queue's system wide
priority. This is useful for applications with specific
latency constraints.
Signed-off-by: Andres Rodriguez <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|