| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
Set 3DSTATE_WM/ThreadDispatchEnable bit on/off based on the same
conditions as used in the GL version.
Signed-off-by: Juan A. Suarez Romero <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Nanley Chery <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Compressing a render target and decompressing it in the same
single-subpass render pass may waste bandwidth. While this may be
beneficial in some circumstances, it does not help in all. Reclaims
about 1.95% FPS for Dota 2 on some configurations.
v2 (Jason Ekstrand):
- Provide a more thorough comment
- Enable CCS_D for input attachments
v3 (Jason Ekstrand):
- Provide performance numbers
Signed-off-by: Nanley Chery <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Chad Versace <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The data port can't handle CCS at all so replace the finishme with
better comments.
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
I enabled CCS for storage images in the Vulkan driver and ran it through
the CTS. It didn't result in any hangs but it demonstrated that the data
port cannot handle CCS.
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Nothing uses this yet but it serves as a nice bit of documentation
that's relatively easy to find.
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The term "lossless compression" could potentially mean multisample
color compression, single-sample color compression or HiZ because they
are all lossless. The term CCS_E, however, has a very precise meaning;
in ISL and is only used to refer to single-sample color compression.
It's also much shorter which is nice.
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
Reviewed-by: Nanley Chery <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 968ffd6c868af7226e8f889573eef709888151cb stored the last subpass
index of all the attachments but that of the depth-stencil attachment.
This could cause depth buffers used in multiple subpasses not to be in
the requested final layout. Fix this error.
Cc: "17.0" <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Signed-off-by: Nanley Chery <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Enables 10 tests from:
dEQP-VK.draw.shader_draw_parameters.*
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
| |
v2: use define for buffer ID (Jason)
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Up to now on Gen8+ we only allocated a vertex element for
gl_InstanceIndex or gl_VertexIndex when a vertex shader uses
gl_BaseInstanceARB or gl_BaseVertexARB. This is because we would
configure the VF_SGVS packet to make the VF unit write the
gl_InstanceIndex & gl_VertexIndex values right behind the values
computed from the vertex buffers.
In the next commit we will also write the gl_DrawIDARB value. Our
backend expects to pull the gl_DrawIDARB value from the element
following the element containing gl_InstanceIndex, gl_VertexIndex,
gl_BaseInstanceARB and gl_BaseVertexARB (see
vec4_vs_visitor::setup_attributes). Therefore we need to allocate an
element for the SGVS elements as long as at least one of the SGVS
element is read by the shader. Otherwise our shader will use a
gl_DrawIDARB value pulled from the URB one element too far (most
likely garbage).
v2: Fix my english (Lionel)
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
| |
v2: use define for buffer ID (Jason)
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
For RGB formats in Vulkan, we use the corresponding RGBA format with a
swizzle of RGB1. While this swizzle is exactly what we want for
texturing, it's not allowed for rendering according to the docs. While
we haven't been getting hangs or anything, we should probably obey the
docs. This commit just sanitizes all render swizzles so that the alpha
channel maps to ALPHA.
Reviewed-by: Anuj Phogat <[email protected]>
|
|
|
|
| |
Reviewed-by: Anuj Phogat <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
fixes to issues spotted by Emil Velikov:
- set ANV_TIMESTAMP corretly
- fix typo with VULKAN_GEM_FILES
v2: update to use Makefile.sources under vulkan
instead of having own
v3: update to changes to generate from vk.xml
(commit c7fc310)
v4: remove 'hw' relative path
cleanups, remove unnecessary cruft
review from Emil Velikov:
- move to vulkan folder
- remove timestamp gen, no longer necessary
- more cleanups
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It is not clear from the docs exactly how pipelined STATE_BASE_ADDRESS
actually is. We know from experimentation that we need to flush the
render cache prior to emitting STATE_BASE_ADDRESS and invalidate the
texture cache afterwards. The only thing the PRM says is that, on gen8+
we're supposed to invalidate the state cache after STATE_BASE_ADDRESS
but experimentation has indicated that doing so does nothing whatsoever.
Since we don't really know, let's do just a bit more flushing in the
hopes that this won't be a problem again. In particular:
1) Do a CS stall before we emit STATE_BASE_ADDRESS since we don't
really know whether or not it's pipelined.
2) Do a data cache flush in case what runs before STATE_BASE_ADDRESS
is a compute shader.
3) Invalidate the state and constant caches after STATE_BASE_ADDRESS
because the state may be getting cached there (we don't really know).
Reported-by: Mark Janes <[email protected]>
Tested-by: Mark Janes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Cc: "13.0 17.0" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
We had no good reason for *not* doing this on gen7 before but we didn't
know it was needed. Recently, when trying update to Vulkan CTS version
1.0.2 in our CI system, Mark discovered GPU hangs on Haswell that appear
to be STATE_BASE_ADDRESS related. This commit fixes them.
Reported-by: Mark Janes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Cc: "13.0 17.0" <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This causes hangs on Broadwell if you try to render to it. I have no
idea how we managed to not hit this earlier.
Tested-by: Mark Janes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Cc: "13.0 17.0" <[email protected]>
|
|
|
|
|
|
| |
Tested-by: Mark Janes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Cc: "13.0 17.0" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 2852efcda40274acf3272611c6a3b7731523a72d moved the location of
the depth input attachment surface state from the render pass to the
image view, but failed to update the surface state location used when
emitting the binding table. Fix this by loading the surface state from
the correct location.
Fixes:
dEQP-VK.renderpass.formats.d16_unorm.input.*
dEQP-VK.renderpass.formats.d24_unorm_s8_uint.input.*
dEQP-VK.renderpass.formats.d32_sfloat.input.*
dEQP-VK.renderpass.formats.x8_d24_unorm_pack32.input.*
dEQP-VK.renderpass.attachment_allocation.input_output.93
dEQP-VK.renderpass.attachment_allocation.input_output.92
dEQP-VK.renderpass.attachment_allocation.input_output.82
dEQP-VK.renderpass.attachment_allocation.input_output.46
Cc: "17.0" <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Signed-off-by: Nanley Chery <[email protected]>
|
|
|
|
|
|
| |
I'm pretty sure we've kept up with the bug fixes.
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
| |
Fixes:
dEQP-VK.api.descriptor_pool.out_of_pool_memory
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
|
| |
Blorp clears already have an equivalent.
Reviewed-by: Anuj Phogat <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The size of the pool is slightly smaller than the size of the
structure containing the whole pool. We need to take that into account
on when setting up the internals.
Fixes a crash due to out of bound memory access in:
dEQP-VK.api.descriptor_pool.out_of_pool_memory
v2: Drop debug traces (Lionel)
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Cc: "17.0 13.0" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
According to GL_KHR_vulkan_glsl, the signature of subpassLoad() is:
gvec4 subpassLoad(gsubpassInput subpass);
gvec4 subpassLoad(gsubpassInputMS subpass, int sample);
So the multisampled case always receives an explicit sample index that we
should use. The current implementation was ignoring this parameter
and using gl_SampleID value instead.
Fixes:
dEQP-VK.pipeline.multisample_shader_builtin.sample_id.*
Reviewed-by: Jason Ekstrand <[email protected]>
Cc: "17.0" <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jason Ekstranad <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Add a helper function, anv_get_queue_family_properties(), which fills the
struct. This patch reduces churn in the following patch that implements
vkGetPhysicalDeviceQueueFamilyProperties2KHR.
Reviewed-by: Jason Ekstranad <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Add a helper function, anv_get_image_format_properties(), which does all
the work and has a VkPhysicalDeviceImageFormatInfo2KHR parameter. This
patch reduces churn in the following patch that implements
vkGetPhysicalDeviceImageFormatProperties2KHR.
Reviewed-by: Jason Ekstranad <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The struct was deleted by:
commit efe9d1cde3340d3a9d17e5560b609a4fb839d43d
Author: Edward O'Callaghan <[email protected]>
Subject: anv: Clean up some unused variables
Unlike the original anv_common, the new one has a non-const pNext
pointer because we will use it for the output structs of
VK_KHR_get_physical_device_properties2.
v2:
- Retype pNext from void* to struct anv_common*.
Reviewed-by: Jason Ekstranad <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
| |
This is a printf-like macro that prints a debug message to stderr when
built with DEBUG. If no DEBUG, then do nothing.
Reviewed-by: Jason Ekstranad <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The spec section 5.2 says:
"vkAllocateCommandBuffers can be used to create multiple command
buffers. If the creation of any of those command buffers fails, the
implementation must destroy all successfully created command buffer
objects from this command, set all entries of the pCommandBuffers
array to VK_NULL_HANDLE and return the error."
Fixes:
dEQP-VK.api.object_management.alloc_callback_fail_multiple.command_buffer_primary
dEQP-VK.api.object_management.alloc_callback_fail_multiple.command_buffer_secondary
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Cc: "13.0 17.0" <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Iago Toral Quiroga <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Iago Toral Quiroga <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
| |
As per VK_KHR_maintenance1, clients can render to a slice of a 3D image
by creating a VK_IMAGE_VIEW_TYPE_2D view of it.
Reviewed-by: Iago Toral Quiroga <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
| |
As of VK_KHR_maintenance1, these are supposed to be reported for any
formats on which we support transfer operations. For us, this is
anything that we can texture from.
Reviewed-by: Iago Toral Quiroga <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
| |
Our command buffers already efficiently use a global pool so trimming
doesn't really need to do anything.
Reviewed-by: Iago Toral Quiroga <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
As per VK_KHR_maintenance1, setting a negative height in the viewport
can be used to get flipped coordinates. This is, aparently, very useful
when porting D3D apps to Vulkan. All we need to do to support this is
to make sure we actually set the min and max correctly.
Reviewed-by: Iago Toral Quiroga <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
| |
These files belong to the vulkan loader.
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
| |
immutable value
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In order to handle CCS_E, we stomp the image format to a UINT format and
then do some bitcasting logic in the shader. This works fine since SKL
render compression only considers the channel layout of the format and
not the format itself. In order for this to work on images that have
been fast-cleared, we need to also convert the clear color so that, when
interpreted as UINT, it provides the same bit value as it would have in
the original format. This fixes a bunch of OpenGL ES CTS tests for
copy_image when we start using CCS more aggressively.
Reviewed-by: Topi Pohjolainen <[email protected]>
Cc: "17.0" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Blorp can deal with depth/stencil surfaces blits/copies without the
render target requirement. Also having both render target and
depth/stencil requirement is incompatible from isl's point of view.
This fixes an image creation issue in the high level quality settings
of the Unity3D player, which requires a depth texture with src/dst
transfer & 4x multisampling.
v2: Simply aspect checking condition (Jason)
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Cc: 13.0 17.0 <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The same we do in the OpenGL driver (comment copied from there).
This is required to ensure that we execute the fragment shader stage when
side-effects (such as image or ssbo stores) are present but there are no
color writes.
I found this while writing a test to check rendering to a framebuffer
without attachments where the fragment shader does not produce any
color outputs but writes to an image via imageStore(). Without this patch
the fragment shader does not execute and the image is not written,
which is not correct.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes crash in dEQP-VK.ubo.random.all_shared_buffer.48 due to a
fragment shader code bigger than 128 kB.
This patch increases the allocation size limit to 1 MB.
v2:
- Increase it to 1 MB (Jason)
- Increase device->instruction_block_pool allocation size in
anv_device.c (Jason)
Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|