| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
('shader_ballot')
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
| |
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
|
| |
This commit also renames existing AMD capabilities:
- gcn_shader -> amd_gcn_shader
- trinary_minmax -> amd_trinary_minmax
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
| |
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
We have to emit a CACHE_FLUSH_AND_INV_TS_EVENT to be sure all
prior GPU work is done. While we are at it, also flush and
invalidate DB.
This fixes the following CTS (when the small hint is disabled):
dEQP-VK.query_pool.statistics_query.reset_before_copy.*
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-By: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
Do not want it for perf reasons. Always have to disable DCC when
transferring to external queue.
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
| |
Transitions to external queue should do the transition & make sure
it works on all queues.
Fixes: 8ebc7dcb59a "radv: Allow fast clears with concurrent queue mask for some layouts."
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Upcoming changes to LLVM will emit LDS objects as symbols in the ELF
symbol table, with relocations that will be resolved with this change.
Callers will also be able to define LDS symbols that are shared between
shader parts. This will be used by radeonsi for the ESGS ring in gfx9+
merged shaders.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
This is more convenient for changing it around during debug.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
A new variant of ac_compile_module_to_binary that allows us to
keep the entire ELF around.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Using an explicit linker instead of just concatenating .text
sections will allow us to start using .rodata sections and
explicit descriptions of data on LDS that is shared between
stages.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
When the visible VRAM size is equal to the VRAM size only two
heaps are exposed.
This fixes dEQP-VK.api.info.device.memory_budget.
Cc: 19.0 19.1 <[email protected]>
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-By: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The number of render backends is 16 but the enabled mask is 0xaaaa.
As noticed by Bas, allowing disabled render backends might break
the OCCLUSION_QUERY packet. We don't use it yet but keep this in
mind.
This fixes dEQP-VK.query_pool.* and dEQP-VK.multiview.*.
Cc: 19.0 19.1 <[email protected]>
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-By: Bas Nieuwenhuizen <[email protected]>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
According to the Vulkan spec, inline uniform blocks are not allowed
to be updated through vkCmdPushDescriptorSetKHR().
These are the spec quotes from "13.2.1. Descriptor Set Layout"
that are relevant for this case:
"VK_DESCRIPTOR_SET_LAYOUT_CREATE_PUSH_DESCRIPTOR_BIT_KHR specifies
that descriptor sets must not be allocated using this layout, and
descriptors are instead pushed by vkCmdPushDescriptorSetKHR."
"If flags contains
VK_DESCRIPTOR_SET_LAYOUT_CREATE_PUSH_DESCRIPTOR_BIT_KHR, then all
elements of pBindings must not have a descriptorType of
VK_DESCRIPTOR_TYPE_INLINE_UNIFORM_BLOCK_EXT".
There is no explicit mention in vkCmdPushDescriptorSetKHR() to forbid
this case but it is implied in the creation of the descriptor set
layout as aforementioned.
Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
| |
baseArrayLayer is defined twice, trivial.
Signed-off-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-By: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
When decompressing resolve source images, we should rely on the
framebuffer layer count instead of resolving all images layers.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-By: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-By: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
Use an explicit pipeline barrier for doing layout transitions
instead of duplicating some code.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-By: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
When resolving inside a subpass, we should rely on the framebuffer
layer count instead of resolving all images layers. This should
improve performance of layered resolves a bit.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-By: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Was watching a presentation on YT where this was used and it turns
out it is not invalid.
The only case it is actually valid as format in the creation of an
image or image view is with Android Hardware Buffers which have
their format specified externally.
So we can just ignore all entries with VK_FORMAT_UNDEFINED.
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
| |
uintptr_t is 32-bits then and shifting it by 32 bits results in undefined
behavior IIRC.
Fixes: b3c8de1c55c "radv: save all descriptor pointers into the trace BO"
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
CB_SHADER_MASK was computed without the second color buffer
format which looks totally wrong to me.
While we are at it, copy a comment from RadeonSI.
Cc: 19.0 19.1 <[email protected]>
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-By: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
When alphaToCoverage is enabled, we should always write the alpha
channel of MRT0 if it's unused. This now matches RadeonSI.
This fixes the new CTS:
dEQP-VK.pipeline.multisample.alpha_to_coverage_unused_attachment.samples_*.alpha_invisible
Cc: 19.0 19.1 <[email protected]>
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-By: Bas Nieuwenhuizen <[email protected]
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-By: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
This is now supported.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-By: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
From the Vulkan spec 1.1.109:
"Some implementations may need to evaluate depth image values
while performing image layout transitions. To accommodate this,
instances of the VkSampleLocationsInfoEXT structure can be
specified for each situation where an explicit or automatic
layout transition has to take place. [...] and
VkRenderPassSampleLocationsBeginInfoEXT can be chained from
VkRenderPassBeginInfo to provide sample locations for layout
transitions performed implicitly by a render pass instance."
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-By: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-By: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
From the Vulkan spec 1.1.109,
"Some implementations may need to evaluate depth image values
while performing image layout transitions. To accommodate this,
instances of the VkSampleLocationsInfoEXT structure can be
specified for each situation where an explicit or automatic
layout transition has to take place. VkSampleLocationsInfoEXT
can be chained from VkImageMemoryBarrier structures to provide
sample locations for layout transitions performed by
vkCmdWaitEvents and vkCmdPipelineBarrier calls."
This handles explicit depth/stencil layout transitions performed
with CmdWaitEvents() or CmdPipelineBarrier().
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-By: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-By: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
If VK_EXT_sample_locations is used, the driver might need to emit
the sample locations specified during layout transitions.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-By: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
This will be used for the depth decompress pass that might need
to emit variable sample locations during layout transitions.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-By: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
| |
While we're here, copy the comment explaining this from radeonsi.
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
| |
This might fix initial subpass transitions when multiview is used.
Noticed while implementing sample locations during layout transitions.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-By: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
This removes one useless SMEM load operations which pointed to
the same descriptor anyway.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-By: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
The driver will emit GLC=1.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-By: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
Both input and output images use the same type.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-By: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
| |
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
| |
Cc: 19.1 <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
| |
This way we handle linear images etc. correctly.
Acked-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Otherwise LLVM can sink them and their texture coordinate calculations
into divergent branches.
v2: simplify the conditions on which the intrinsic is marked as convergent
v3: only mark as convergent in FS and CS with derivative groups
Cc: <[email protected]>
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Fixes -Woverflow warnings with GCC 9.1.1
v2: use a cast instead of a bitwise and
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-By: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The driver should only fast depth clears with the graphics path
when the view covers all image layers, otherwise this might
corrupt layers when HTILE is enabled.
Cc: 19.0 19.1 [email protected]
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
It's unsupported, only load/store format with vec3 are supported.
Fixes: 6970a9a6ca9 ("ac,radv: remove the vec3 restriction with LLVM 9+")"
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
trivial
|
| |
|