| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
| |
For gfx9 the addressing for images has changed, so we need to
provide the hw with the level0. However, we still need to scale
for format block differences (so our compressed upload paths still
work).
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Cc: "17.2" <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
(cherry picked from commit bae7723e132d3177697606c799eabbb7cdde2f38)
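A minimal sketch of the block-size scaling mentioned above, assuming the view and image formats differ only in block width (for example an uncompressed view of a compressed image). The helper names `div_round_up_u32` and `rescale_extent` are hypothetical and are not taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Integer division that rounds up (hypothetical helper name). */
static uint32_t div_round_up_u32(uint32_t value, uint32_t divisor)
{
    assert(divisor != 0);
    return (value + divisor - 1) / divisor;
}

/*
 * Rescale an extent given in the image format's texels into the view
 * format's units. view_blk and img_blk are the block widths (in texels)
 * of the view format and the image format respectively.
 */
static uint32_t rescale_extent(uint32_t extent, uint32_t view_blk,
                               uint32_t img_blk)
{
    return div_round_up_u32(extent * view_blk, img_blk);
}
```

With equal block sizes the extent comes back unchanged, which is why no rescale is needed when the view has the same format as the image.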
|
|
|
|
|
|
|
|
|
|
| |
If the image view has the same format, we don't need to rescale
the w/h.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Cc: "17.2" <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
(cherry picked from commit a74d98743115b928eaeabc0d58b63174158aa209)
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Avoid passing the vulkan image creation into the image view descriptor
setup. This cleans up the usage of range inside the init, instead
using the properly inited values in the image view.
This is just a cleanup but some future vega changes will depend on it.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Cc: "17.2" <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
(cherry picked from commit 5378b5d0710be00d1316e42e692a52d4bc5d92fe)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
GFX9 needs the SX MRT blend registers programmed. Port over
the code from radeonsi to work out the values from the blend
state, and program the registers on rbplus systems.
This fixes lots of:
dEQP-VK.pipeline.blend.*
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Cc: "17.2" <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
(cherry picked from commit 9c080100d336e4f90575d5138508b519ed334eef)
|
|
|
|
|
|
|
|
|
|
|
|
| |
For the GFX9 packet we need one more dword.
Fixes an assert in:
dEQP-VK.draw.shader_draw_parameters.base_vertex.draw_indexed
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Cc: "17.2" <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
(cherry picked from commit 864eb1852778abaa6f63ca106216001c9f375f05)
|
|
|
|
|
|
|
|
|
| |
This fixes disabled Z/stencil.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Cc: "17.2" <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
(cherry picked from commit d987b4ab9e240b479c71129c3c261982112c57d8)
|
|
|
|
|
|
|
|
|
| |
There was an off by one here.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Cc: "17.2" <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
(cherry picked from commit 11834195e9c276e1f3756cf8f6161be14124261b)
|
|
|
|
|
|
|
|
|
|
| |
We need to use all the levels when filling out the gfx9
descriptor.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Cc: "17.2" <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
(cherry picked from commit df09f1f3cd5110874899ed0f4b4c33ba9b006c50)
|
|
|
|
|
|
|
|
|
|
| |
I'm working on this, but I'm not sure I'll make 17.2 at this stage,
maybe 17.2.1.
Cc: "17.2" <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
(cherry picked from commit 611076a41aac3095a82dff2432943d7f8d429822)
|
|
|
|
|
|
|
|
|
| |
The legacy test won't work on gfx9.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Cc: "17.2" <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
(cherry picked from commit 694d59fbaf4bc85daaff6cc411162dd6d1232968)
|
|
|
|
|
|
|
|
|
| |
Port the opaque metadata changes from radeonsi for gfx9.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Cc: "17.2" <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
(cherry picked from commit e43cc3e3afc98783310f81f8c0151a8314044739)
|
|
|
|
|
|
|
|
|
| |
This is also a GFX9 register.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Cc: "17.2" <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
(cherry picked from commit 674ecbfef2acb17be363867425a013ca151e16b2)
|
|
|
|
|
|
|
|
|
|
| |
We set this later in the non-gfx9 path, just remove these
bits from here.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Cc: "17.2" <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
(cherry picked from commit fc600eb98d5846fe59f4a79ed1c7ad2a0667e927)
|
|
|
|
|
|
|
|
|
| |
The predication packet changed format on GFX9, update the driver.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Cc: "17.2" <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
(cherry picked from commit 5247b311e9b348fedd74980a34c4b6542d85b07b)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This seems like a workaround, but we don't see the bug on CIK/VI.
On SI with the dEQP-VK.memory.pipeline_barrier.host_read_transfer_dst.*
tests, when one test completes, the first flush at the start of the next
test causes a VM fault as we've destroyed the VM, but we end up flushing
the compute shader then, and it must still be in the process of doing
something.
Could also be a kernel difference between SI and CIK.
v2: hit this with a bigger hammer. This fixes a bunch of hangs
in the vk cts with the robustness tests.
Fixes: f4e499ec791 ("radv: add initial non-conformant radv vulkan driver")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101334
Acked-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
(cherry picked from commit 82ba384c10d598bee4786ef5f79e92a0e7b53892)
|
|
|
|
|
|
|
|
|
|
|
| |
This ports the workaround from radeonsi, that was missing in radv.
This fixes Talos rendering when MSAA is enabled on my Tahiti card.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Fixes: f4e499ec7 (radv: add initial non-conformant radv vulkan driver)
Signed-off-by: Dave Airlie <[email protected]>
(cherry picked from commit 8bf39307517a04263532e3c5a49b5be1f4a99032)
|
|
|
|
|
|
|
|
|
|
| |
The argument here is a bitmask, so the old code selected .xy, which
got silently truncated to .x when constructing the vec4 from components,
instead of using .w.
Fixes: 588185eb6b7 "radv/meta: add srgb conversion to end of resolve shader."
Reviewed-by: Dave Airlie <[email protected]>
(cherry picked from commit acba3a3151dbbba0ab834e062e0feb12af4873de)
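To illustrate the bitmask-versus-index confusion described above, here is a small sketch that decodes a four-channel mask into component names. The function name `channel_mask_to_string` is hypothetical; it only demonstrates that a mask value of 3 means `.xy`, while selecting `.w` needs bit 3 set (a value of 8).

```c
#include <assert.h>

/* One bit per vector component: x = bit 0, y = bit 1, z = bit 2, w = bit 3. */
enum {
    CH_X = 1u << 0,
    CH_Y = 1u << 1,
    CH_Z = 1u << 2,
    CH_W = 1u << 3,
};

/* Write the names of the channels selected by mask into out (5 bytes). */
static void channel_mask_to_string(unsigned mask, char out[5])
{
    static const char names[4] = { 'x', 'y', 'z', 'w' };
    int n = 0;

    assert((mask & ~0xFu) == 0);
    for (int i = 0; i < 4; i++) {
        if (mask & (1u << i))
            out[n++] = names[i];
    }
    out[n] = '\0';
}
```

Passing the component index 3 where a mask is expected therefore selects two channels rather than the fourth one.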
|
|
|
|
|
|
|
|
|
|
| |
It just works with the fragment shader resolve, so no need to do
a custom conversion. In fact with SRGB dest, it actually gives
wrong results.
Fixes: 69136f4e633 "radv/meta: add resolve pass using fragment/vertex shaders"
Reviewed-by: Dave Airlie <[email protected]>
(cherry picked from commit 15e5a7a6832bba011564bfa2045fba9e833eede2)
|
|
|
|
|
|
|
|
|
|
| |
These seem to store very bogus results. Luckily there is some code
that converts srgb->linear already, so just making the descriptor
format UNORM should work.
Fixes: 588185eb6b7 "radv/meta: add srgb conversion to end of resolve shader."
Reviewed-by: Dave Airlie <[email protected]>
(cherry picked from commit 8286c3a49f03dc219e57d4a9ec27a4d840c5f603)
|
|
|
|
|
|
|
|
|
|
|
| |
Need to take the sample count into account in the depth decompress and
resummarize pipelines and render pass.
Fixes: f4e499ec791 ("radv: add initial non-conformant radv vulkan driver")
Signed-off-by: Alex Smith <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Cc: "17.2" <[email protected]>
(cherry picked from commit 2e9a13bf2205b6e96cba408e3f48f1c3fe49634a)
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a bug in the app, but I'd rather avoid hanging the GPU,
especially if someone is running in validation and it takes out their
development environment.
v2: get it right, reverse the polarity.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Cc: <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
(cherry picked from commit 36a1b61321561634c6b243cf876c347fef73dfa4)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes:
dEQP-VK.api.copy_and_blit.core.blit_image.all_formats.*
for a2r10g10b10 formats as destination on SI/CIK hardware.
This adds support to the meta program for emitting 10-bit
outputs, and adds 10-bit support to the fragment shader key.
It also only does the int8/10 on SI/CIK.
Fixes: f4e499ec7 (radv: add initial non-conformant radv vulkan driver)
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
(cherry picked from commit df61a05019d5c7479d4b29d251af4231f125e61c)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In some APU situations the reported visible size can be larger than
VRAM size. This properly clamps the value.
Surprisingly both CTS and spec seem to allow a heap type with size 0,
so this seemed like the easiest option to me.
Signed-off-by: Bas Nieuwenhuizen <[email protected]>
Fixes: 4ae84efbc5c "radv: Use enum for memory heaps."
Reviewed-by: Dave Airlie <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
Tested-by: Michel Dänzer <[email protected]>
(cherry picked from commit 8229706ad86b27ed571f17872006a488fcd35378)
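The clamp described above can be sketched as follows, assuming two VRAM heaps (CPU-visible and not CPU-visible) whose sizes are derived from the reported totals. The struct and function names are hypothetical and only illustrate the arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* Sizes of the two VRAM heaps; hypothetical struct for illustration. */
struct vram_heaps {
    uint64_t invisible; /* VRAM not mappable by the CPU */
    uint64_t visible;   /* VRAM mappable by the CPU */
};

static struct vram_heaps compute_vram_heaps(uint64_t vram_size,
                                            uint64_t reported_visible)
{
    struct vram_heaps heaps;

    /* Clamp: the visible part can never exceed total VRAM. */
    heaps.visible = reported_visible < vram_size ? reported_visible
                                                 : vram_size;
    /* Cannot underflow after the clamp; may legitimately be 0. */
    heaps.invisible = vram_size - heaps.visible;
    assert(heaps.visible + heaps.invisible == vram_size);
    return heaps;
}
```

When the reported visible size exceeds VRAM, the invisible heap simply ends up with size 0, which is the case the message says CTS and the spec appear to allow.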
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On SI this was causing a hang in
dEQP-VK.pipeline.render_to_image.core.2d_array.mipmap.r16g16_sint_s8_uint
This was due to not handling the tile mode index for depth like
I fixed previously for new GPUs.
Fixes: 01d0c5a9 (radv: fix stencil regression since new addrlib import)
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
(cherry picked from commit 800d1622096ca52b955bdfc20eb770b80ef15221)
|
|
|
|
|
|
|
|
|
|
|
| |
Until we support sync fd, don't report the info.
Fixes crashes in CTS dEQP-VK.api.external.semaphore.sync_fd.*
Fixes: eaa56eab6 (radv: initial support for shared semaphores (v2))
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
(cherry picked from commit 6cbc8cf178e4984e464c9fe19434d1514d2ae37d)
|
|
|
|
|
|
|
|
|
|
| |
Fixes CTS dEQP-VK.memory.pipeline_barrier.host_write_uniform_texel_buffer.1024
on SI/CIK with radv.
Fixes: f4e499ec (radv: add initial non-conformant radv vulkan driver)
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
(cherry picked from commit ca82ef5ac75e50abb109986b55002cca24f7c0fb)
|
|
|
|
|
|
|
| |
This calculates ps_iter_samples from the minSampleShading input.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
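A sketch of one way to derive a per-sample iteration count from minSampleShading, following the Vulkan rule that the minimum number of unique samples is max(ceil(minSampleShading * rasterizationSamples), 1). The function name is hypothetical, and whether the driver applies any further rounding is not stated in the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: minimum number of samples to shade per pixel. */
static uint32_t calc_ps_iter_samples(float min_sample_shading,
                                     uint32_t rasterization_samples)
{
    assert(min_sample_shading >= 0.0f && min_sample_shading <= 1.0f);
    assert(rasterization_samples >= 1);

    float scaled = min_sample_shading * (float)rasterization_samples;

    /* Manual ceil() so the sketch does not depend on libm. */
    uint32_t samples = (uint32_t)scaled;
    if ((float)samples < scaled)
        samples++;

    /* At least one sample is always shaded. */
    if (samples < 1)
        samples = 1;
    if (samples > rasterization_samples)
        samples = rasterization_samples;
    return samples;
}
```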
|
|
|
|
|
|
|
|
|
|
| |
This is an alternate fix for the buffer export dedicated interaction.
Fixes CTS dEQP-VK.api.external.memory.opaque_fd.dedicated.buffer.info
Fixes: b70829708a (radv: Implement VK_KHR_external_memory)
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
If the layer base was > 0, it wasn't getting passed as the start
instance or getting added in the shaders.
Fixes CTS dEQP-VK.api.image_clearing.core.clear_color_attachment.2d_r8_uint_multiple_layers
Fixes: 7e0382fb (radv: add support for layered clears (v2))
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
The spec says we should return VK_ERROR_FEATURE_NOT_PRESENT.
Ported from anv.
Fixes CTS test dEQP-VK.api.device_init.create_device_unsupported_features
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
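The check can be sketched as a comparison of two boolean arrays, relying on the fact that a Vulkan feature struct consists solely of 32-bit boolean fields. The types and names below are stand-ins so the example compiles without the Vulkan headers; only the value -8 for VK_ERROR_FEATURE_NOT_PRESENT is taken from the Vulkan API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t bool32; /* stand-in for VkBool32 */

enum {
    RESULT_SUCCESS = 0,              /* VK_SUCCESS */
    RESULT_FEATURE_NOT_PRESENT = -8, /* VK_ERROR_FEATURE_NOT_PRESENT */
};

/* Fail device creation if any requested feature is not supported. */
static int check_requested_features(const bool32 *supported,
                                    const bool32 *requested,
                                    size_t count)
{
    assert(supported != NULL && requested != NULL);
    for (size_t i = 0; i < count; i++) {
        if (requested[i] && !supported[i])
            return RESULT_FEATURE_NOT_PRESENT;
    }
    return RESULT_SUCCESS;
}
```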
|
|
|
|
|
|
|
|
|
|
| |
If we get an fd, we need to close it before returning.
Fixes CTS test dEQP-VK.api.external.memory.opaque_fd.dedicated.device_only.import_multiple_times
Fixes: b70829708a (radv: Implement VK_KHR_external_memory)
Reviewed-by: Jason Ekstrand <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The image is set on memory allocation already, but BindImageMemory
doesn't have to have been called on the image yet. Luckily, we know the
offset within a BO has to be 0 for dedicated allocations, so we can just
use the dummy 0 in the address calculations.
Fixes CTS test dEQP-VK.api.external.memory.opaque_fd.dedicated.image.export_bind_import_bind
Signed-off-by: Bas Nieuwenhuizen <[email protected]>
Fixes: b70829708ac "radv: Implement VK_KHR_external_memory"
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This just sets them to INVALID COLOR, instead of shifting the
attachments together.
This also fixes a number of cases where we use it first and only
then check if it is VK_ATTACHMENT_UNUSED.
Signed-off-by: Bas Nieuwenhuizen <[email protected]>
Fixes: f4e499ec791 "radv: add initial non-conformant radv vulkan driver"
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
When I ported from libdrm, I forgot to add the line to reset
the sem, we just need to reset the context.
This fixes a regression in DOOM.
Fixes: 9ac1432a571 ("radv: port to new libdrm API.")
Reported-by: Grazvydas Ignotas <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The two generators forked from each other, and they remain basically the
same. This rebases the radv version on the anv version, but with the
radv changes ported over. The result is that we get rid of the "cat |"
madness and gain mako, correct "generated by" attributions, and write
files out directly.
The only differences between the outputs are whitespace and comments.
Signed-off-by: Dylan Baker <[email protected]>
Acked-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We can also use storage images internally for resolves, which don't
require TRANSFER_DST usage on the image, so currently we may not create
the needed descriptors.
Just create these descriptors unconditionally.
Fixes: 0e1886efb9e ("radv: Fix descriptors for cube images with VK_IMAGE_USAGE_STORAGE_BIT")
Reported-by: Grazvydas Ignotas <[email protected]>
Signed-off-by: Alex Smith <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds support for sharing semaphores using kernel syncobjects.
Syncobj backed semaphores are used for any semaphore which is
created with external flags, and when a semaphore is imported,
otherwise we use the current non-kernel semaphores.
Temporary imports from syncobj fd are also available, these
just override the current user until the next wait, when the
temp syncobj is dropped.
v2: allocate more chunks upfront, fix off by one after
previous refactor of syncobj setup, remove unnecessary null
check.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
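The temporary-import behaviour described above can be sketched as a semaphore that holds a permanent handle plus an optional temporary one. This is an illustration under assumed names (`struct sem_sketch`, `sem_take_wait_handle`), not the driver's data structure, and a real implementation would also destroy the dropped temporary syncobj.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical semaphore: 0 means "no handle". */
struct sem_sketch {
    uint32_t syncobj;      /* permanent kernel syncobj handle */
    uint32_t temp_syncobj; /* temporary import, overrides until next wait */
};

/* Return the handle to wait on; a temporary import is consumed by the wait. */
static uint32_t sem_take_wait_handle(struct sem_sketch *sem)
{
    assert(sem != NULL);
    if (sem->temp_syncobj != 0) {
        uint32_t handle = sem->temp_syncobj;
        sem->temp_syncobj = 0; /* dropped after this wait */
        return handle;
    }
    return sem->syncobj;
}
```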
|
|
|
|
|
|
|
|
| |
This just adds syncobj create/destroy/export/import paths into
the winsys interface.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
| |
Just a trivial enable.
Signed-off-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
Acked-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
Acked-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
This bumps the libdrm requirement for amdgpu to 2.4.82.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This just introduces a central semaphore info struct, and passes it around,
and introduces some wrappers that will make porting off libdrm_amdgpu easier.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This looks like a regression from df301237940 ("radv: use
ac_compute_surface"). Before that, the opt4Space addrlib flag was set
to true unless the image has FMASK (ac_compute_surface will similarly
only set that flag for images without FMASK).
This saves multiple gigabytes of VRAM on one of our games, and brings
its VRAM utilisation on RADV in line with AMDGPU-PRO and NVIDIA.
Signed-off-by: Alex Smith <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
| |
Coverity warned about dead code below, as meta_va was being shadowed.
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Jason updated the Khronos spec to explicitly state that Wayland surfaces
must support VK_PRESENT_MODE_MAILBOX_KHR.
ANV has done so since day one (back in 2015).
Cc: [email protected]
Cc: Bas Nieuwenhuizen <[email protected]>
Cc: Dave Airlie <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When using DCC some clear values don't require a cmask eliminate
step. This patch adds support for black and black with alpha 1,
there are other values, but I don't have access to a comprehensive list.
This works by setting the cmask eliminate predicate when doing the
fast clear, and later when doing the cmask elimination making sure
the draws are predicated.
This increases the fps on the Sascha Willems deferred demo.
Tonga: 580fps->670fps on a Tonga PRO card.
Polaris: 730fps->850fps
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
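A sketch of the clear-value test, reading "black and black with alpha 1" as the float colours (0, 0, 0, 0) and (0, 0, 0, 1). That reading and the function name are assumptions; the commit message notes that other qualifying values exist but are not listed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* True if this clear colour needs no eliminate pass afterwards. */
static bool clear_skips_eliminate(const float rgba[4])
{
    assert(rgba != NULL);

    bool rgb_is_black = rgba[0] == 0.0f && rgba[1] == 0.0f &&
                        rgba[2] == 0.0f;

    /* Only alpha 0 and alpha 1 are handled in this sketch. */
    return rgb_is_black && (rgba[3] == 0.0f || rgba[3] == 1.0f);
}
```

The result of such a check would be what gets stored as the predicate at fast-clear time, so that the later elimination draws can be skipped.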
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We can only fast clear 128-bit images if the r/g/b channels
are the same, and we are using DCC.
For DCC we'll bail out on translate if this isn't true,
and we catch cmask clears explicitly.
v2: remove 64-bit block (Bas), add uint32 as well.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
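The condition stated above can be written as a small predicate. The function name and the `has_dcc` parameter are hypothetical; the sketch compares raw 32-bit channel values, so it applies equally to float and uint32 clear colours.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* 128-bit formats: fast clear only with DCC and identical r/g/b channels. */
static bool can_fast_clear_128bit(const uint32_t color[4], bool has_dcc)
{
    assert(color != NULL);
    if (!has_dcc)
        return false;
    return color[0] == color[1] && color[1] == color[2];
}
```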
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch uses addrlib to work out the tile swizzles according
to the surface index. It seems to produce the same values as
amdgpu-pro for the deferred test.
v2: don't apply swizzle to CMASK. the eg docs don't mention
it, and we clearly don't align cmask for that.
v3: disable surf index for dedicated images, as these will
most likely be shared, and I don't think the metadata has
space for this info in it yet.
v4: update for shareable images, rename combined_swizzle
to tile_swizzle
This gets the deferred demo from 730->950fps on my rx480.
(dcc cmask elim predication patches get it further)
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some of the Sascha Willems demos pick a D32/S8 format for the depth
buffer, then do a LOAD_OP_CLEAR/LOAD_OP_DONT_CARE on it, which means
we don't get to merge the undefined->depth and clear htile transitions.
This adds the stencil aspect to the pending clears if there is a depth
clear pending and the stencil aspect is don't care.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
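A sketch of the aspect merging described above. The enum values mirror the Vulkan constants for the depth and stencil aspect bits and the attachment load ops, but the names and the function itself are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum {
    ASPECT_DEPTH = 0x2,   /* VK_IMAGE_ASPECT_DEPTH_BIT */
    ASPECT_STENCIL = 0x4, /* VK_IMAGE_ASPECT_STENCIL_BIT */
};

enum load_op {
    LOAD_OP_LOAD = 0,
    LOAD_OP_CLEAR = 1,
    LOAD_OP_DONT_CARE = 2,
};

/* Compute which aspects of a depth/stencil attachment get cleared. */
static uint32_t pending_clear_aspects(enum load_op depth_op,
                                      enum load_op stencil_op)
{
    uint32_t aspects = 0;

    if (depth_op == LOAD_OP_CLEAR)
        aspects |= ASPECT_DEPTH;
    if (stencil_op == LOAD_OP_CLEAR)
        aspects |= ASPECT_STENCIL;

    /* Stencil contents are undefined anyway, so clear them with depth. */
    if ((aspects & ASPECT_DEPTH) && stencil_op == LOAD_OP_DONT_CARE)
        aspects |= ASPECT_STENCIL;

    assert((aspects & ~(uint32_t)(ASPECT_DEPTH | ASPECT_STENCIL)) == 0);
    return aspects;
}
```

A stencil aspect that must be loaded is left alone, so only the don't-care case is folded into the depth clear.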
|
|
|
|
|
|
|
| |
So as not to confuse apps into thinking it might be faster.
Signed-off-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Andres Rodriguez <[email protected]>
|