| Commit message | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On one side, when emitting 3DSTATE_SF, VertexSubPixelPrecisionSelect
selects between 8-bit subpixel precision (value 0) and 4-bit subpixel
precision (value 1). Since this field is never set, it takes the value 0,
so 8 bits are used.
On the other side, the Vulkan CTS reference rasterizer, which is used to
compute the expected values for the tests, uses 8-bit precision. If it is
changed to use 4 bits, as ANV has been advertising so far, some of the
tests fail.
So it seems ANV is actually using 8 bits.
v2: explicitly set 3DSTATE_SF::VertexSubPixelPrecisionSelect (Jason)
v3: use _8Bit definition as value (Jason)
v4: (by Jason)
anv: Explicitly set 3DSTATE_CLIP::VertexSubPixelPrecisionSelect
This field was added on gen8 even though there's an identically defined
one in 3DSTATE_SF.
CC: Jason Ekstrand <[email protected]>
CC: Kenneth Graunke <[email protected]>
CC: 18.3 19.0 <[email protected]>
Signed-off-by: Juan A. Suarez Romero <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
| |
A new extension allowing the user to explicitly specify the clipping
behavior.
Signed-off-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
| |
There was an issue recently caused by the system header being included
by mistake, so let's just get rid of this include path and always
explicitly #include "drm-uapi/FOO.h"
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
|
| |
It is always false on Gen8+. Also, move the variable definition near
its use.
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
| |
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
| |
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
| |
Annoyingly, this requires that we implement integer division on the
command streamer. Fortunately, we're only ever dividing by constants so
we can use the mulh+add+shift trick and it's not as bad as it sounds.
Reviewed-by: Lionel Landwerlin <[email protected]>
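As a rough illustration of the mulh+add+shift trick mentioned in the commit
above, here is a CPU-side sketch of unsigned 32-bit division by a constant.
It uses one well-known formulation (Granlund-Montgomery style magic numbers);
the variant actually emitted on the command streamer may differ. The names
udiv_magic, udiv_magic_for and udiv_by_magic are hypothetical and do not come
from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; this is an illustration, not the driver's code. */
struct udiv_magic {
    uint32_t mul;  /* low 32 bits of the magic multiplier */
    unsigned sh1;  /* shift applied inside the "add" step (0 or 1) */
    unsigned sh2;  /* final shift */
};

/* Precompute the constants for dividing by d (d must be non-zero). */
static struct udiv_magic udiv_magic_for(uint32_t d)
{
    assert(d != 0);

    /* l = ceil(log2(d)) */
    unsigned l = 0;
    while (l < 32 && ((uint64_t)1 << l) < d)
        l++;

    struct udiv_magic m;
    /* floor(2^32 * (2^l - d) / d) + 1, which always fits in 32 bits */
    m.mul = (uint32_t)((((((uint64_t)1 << l) - d) << 32) / d) + 1);
    m.sh1 = l < 1 ? l : 1;
    m.sh2 = l > 1 ? l - 1 : 0;
    return m;
}

/* n / d using only a high multiply, an add and shifts. */
static uint32_t udiv_by_magic(uint32_t n, struct udiv_magic m)
{
    uint32_t q = (uint32_t)(((uint64_t)n * m.mul) >> 32); /* mulh  */
    uint32_t t = q + ((n - q) >> m.sh1);                  /* add   */
    return t >> m.sh2;                                    /* shift */
}
```

Because the divisor is a constant, udiv_magic_for() would run once at command
building time, and only the three steps in udiv_by_magic() would need an
equivalent on the hardware side.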
|
|
|
|
| |
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Conditional rendering affects the following functions:
- vkCmdDraw, vkCmdDrawIndexed, vkCmdDrawIndirect, vkCmdDrawIndexedIndirect
- vkCmdDrawIndirectCountKHR, vkCmdDrawIndexedIndirectCountKHR
- vkCmdDispatch, vkCmdDispatchIndirect, vkCmdDispatchBase
- vkCmdClearAttachments
The value from the conditional buffer is cached in a designated register;
MI_PREDICATE is emitted every time conditional rendering is enabled
and a command requires it.
v2: by Jason Ekstrand
- Use vk_find_struct_const instead of manually looping
- Move draw count loading to prepare function
- Zero the top 32-bits of MI_ALU_REG15
v3: Apply pipeline flush before accessing conditional buffer
(The issue was found by Samuel Iglesias)
v4: - Remove support of Haswell due to possible hardware bug
- Made TMP_REG_PREDICATE and TMP_REG_DRAW_COUNT defines to
define registers in one place.
v5: thanks to Jason Ekstrand and Lionel Landwerlin
- Workaround the fact that MI_PREDICATE_RESULT is not
accessible on Haswell by manually calculating
MI_PREDICATE_RESULT and re-emitting MI_PREDICATE
when necessary.
v6: suggested by Lionel Landwerlin
- Instead of calculating the result of predicate once - re-emit
MI_PREDICATE to make it easier to investigate error states.
v7: suggested by Jason
- Make anv_pipe_invalidate_bits_for_access_flag add CS_STALL
if VK_ACCESS_CONDITIONAL_RENDERING_READ_BIT is set.
v8: suggested by Lionel
- Precompute conditional predicate's result to
support secondary command buffers.
- Make prepare_for_draw_count_predicate more readable.
Signed-off-by: Danylo Piliaiev <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
| |
We have all the state buffers snooped, so we don't need to clflush
everything anymore.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We had defined MAX_IMAGES as 8, which we used to size the array for
image push constant data. The comment there stated that this was for
gen8, but anv_nir_apply_pipeline_layout runs for all gens and writes
that array, asserting that we don't exceed that number of images,
which imposes a limit of MAX_IMAGES on all gens.
Furthermore, despite this, we are exposing up to 64 images per shader
stage on all gens, gen8 included.
This patch lowers the number of images we expose in gen8 to 8 and
keeps 64 images for gen9+ while making sure that only pre-SKL gens
use push constant space to handle images.
v2:
- <= instead of < in the assert (Eric, Lionel)
- Change the way the assertion is written (Eric)
v3:
- Revert the way the assertion is written to the form it had in v1,
the version in v2 was not equivalent and was incorrect. (Lionel)
v4:
- gen9+ doesn't need push constants for images at all (Jason)
Cc: [email protected]
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]> (v3)
|
| |
|
|
|
|
|
| |
Signed-off-by: Eric Engestrom <[email protected]>
Acked-by: Jason Ekstrand <[email protected]>
|
|
|
|
| |
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since we don't know the exact format at creation time, some initialization
is done only when bound with memory in vkBindImageMemory.
v2: demand dedicated allocation in vkGetImageMemoryRequirements2 if
image has external format
v3: refactor prepare_ahw_image, support vkBindImageMemory2,
calculate stride correctly for rgb(x) surfaces, rename as
'resolve_ahw_image'
v4: rebase to b43f955037c changes
v5: add some assertions to verify input correctness (Lionel)
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
v2: add support for non-image buffers (AHARDWAREBUFFER_FORMAT_BLOB)
v3: properly handle usage bits when creating from image
v4: refactor, code cleanup (Jason)
v5: rebase to b43f955037c changes,
initialize bo flags as ANV_BO_EXTERNAL (Lionel)
v6: add assert that anv_bo_cache_import succeeds, add comment
about multi-bo support to clarify current implementation (Lionel)
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
| |
This makes it cleaner to introduce more cases where we import memory
from different types of external memory buffers.
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
| |
These are usually used for dealing with sparse resources but there's no
reason why we can't hook them up before we have sparse. We have the
hardware; let's light it up.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Per chapter 3.2 "Instances":
> Providing a NULL VkInstanceCreateInfo::pApplicationInfo or providing
> an apiVersion of 0 is equivalent to providing an apiVersion of
> VK_MAKE_VERSION(1,0,0).
Reported-by: Niklas Haas <[email protected]>
Fixes: 8c048af5890d43578ca4 "anv: Copy the appliation info into the instance"
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
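The rule quoted from the spec in the commit above can be sketched as a small
helper. This is a self-contained illustration that avoids the Vulkan headers:
SKETCH_MAKE_VERSION mirrors the bit layout of VK_MAKE_VERSION, while
sketch_app_info and effective_api_version are hypothetical names, not the
driver's.

```c
#include <stddef.h>
#include <stdint.h>

/* Same bit layout as VK_MAKE_VERSION: major << 22 | minor << 12 | patch. */
#define SKETCH_MAKE_VERSION(major, minor, patch) \
    ((((uint32_t)(major)) << 22) | (((uint32_t)(minor)) << 12) | ((uint32_t)(patch)))

/* Hypothetical stand-in for the relevant part of VkApplicationInfo. */
struct sketch_app_info {
    uint32_t api_version;
};

/* A NULL application info or an apiVersion of 0 is treated as Vulkan 1.0.0. */
static uint32_t effective_api_version(const struct sketch_app_info *app_info)
{
    if (app_info == NULL || app_info->api_version == 0)
        return SKETCH_MAKE_VERSION(1, 0, 0);
    return app_info->api_version;
}
```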
|
|
|
|
|
|
|
|
| |
Our compiler already splits UBO loads into scalars and the untyped
surface read messages we use for SSBO reads and writes only require
dword alignment.
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
|
|
|
|
|
|
|
| |
This lets us get rid of a bunch of duplicated error messages.
Reviewed-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
|
|
| |
snprintf() guarantees that it will not write more chars than allowed,
and that the string will be null-terminated, without the need to fill
the whole thing with zeroes to begin with.
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
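A minimal sketch of the snprintf() guarantee the commit above relies on: the
destination is never overrun and is always null-terminated when the size is
non-zero, so it does not need to be zeroed first. The helper name format_label
and its format string are made up for illustration.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: formats "<name> (id <id>)" into buf.
 * No memset() of buf is needed beforehand; snprintf() writes at most
 * size bytes including the terminating NUL. The return value is the
 * length the full string would have had, so a value >= size means
 * the output was truncated.
 */
static int format_label(char *buf, size_t size, const char *name, int id)
{
    return snprintf(buf, size, "%s (id %d)", name, id);
}
```

Even when the buffer is too small, the result is a valid, terminated string;
only the tail of the text is lost.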
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of having weak references to the anv functions and separate
trampoline functions with their own dispatch table, just make the
trampoline functions weak. This gets rid of a dispatch table and
potentially lets the compiler delete the unused weak function. The
end result is a reduction in the .text section of 5.7K and a reduction
in the .data section of 1.4K.
Before:
   text    data     bss     dec     hex filename
3190329  282232    8960 3481521  351fb1 _install/lib64/libvulkan_intel.so
After:
   text    data     bss     dec     hex filename
3184548  280792    8960 3474300  35037c _install/lib64/libvulkan_intel.so
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Offers three clocks: device, clock monotonic and clock monotonic
raw. Could use some kernel support to reduce the deviation between
clock values.
v2:
Ensure deviation is at least as big as the GPU time interval.
v3:
Set device->lost when returning DEVICE_LOST.
Use MAX2 and DIV_ROUND_UP instead of open coding these.
Delete spurious TIMESTAMP in radv version.
Suggested-by: Jason Ekstrand <[email protected]>
Suggested-by: Lionel Landwerlin <[email protected]>
v4:
Add anv_gem_reg_read to anv_gem_stubs.c
Suggested-by: Jason Ekstrand <[email protected]>
v5:
Adjust maxDeviation computation to max(sampled_clock_period) +
sample_interval.
Suggested-by: Bas Nieuwenhuizen <[email protected]>
Suggested-by: Jason Ekstrand <[email protected]>
Signed-off-by: Keith Packard <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
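The v5 note above describes maxDeviation as max(sampled_clock_period) +
sample_interval. A small sketch of that arithmetic follows, assuming all
values are already in nanoseconds; the function name and parameters are
hypothetical and not taken from the driver.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: begin_ns and end_ns bracket the window in which
 * all clocks were sampled, and periods_ns holds the tick period of each
 * sampled clock. The result is the largest period plus the length of
 * the sampling window.
 */
static uint64_t max_deviation_ns(uint64_t begin_ns, uint64_t end_ns,
                                 const uint64_t *periods_ns, size_t count)
{
    uint64_t max_period = 0;
    for (size_t i = 0; i < count; i++) {
        if (periods_ns[i] > max_period)
            max_period = periods_ns[i];
    }

    uint64_t sample_interval = end_ns - begin_ns;
    return max_period + sample_interval;
}
```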
|
|
|
|
|
|
|
|
| |
Even though Intel GPUs are always at the same PCI location, all the
info we need is already provided by libdrm. Let's be future proof.
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There's no reason why we need to generate trampoline functions for
instance functions or carry N copies of the instance dispatch table around
for every hardware generation. Splitting the tables and being more
conservative shaves about 34K off .text and about 4K off .data when
built with clang.
Before splitting dispatch tables:
   text    data     bss     dec     hex filename
3224305  286216    8960 3519481  35b3f9 _install/lib64/libvulkan_intel.so
After splitting dispatch tables:
   text    data     bss     dec     hex filename
3190325  282232    8960 3481517  351fad _install/lib64/libvulkan_intel.so
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On Broadwell and above, we have to use different MOCS settings to allow
the kernel to take over and disable caching when needed for external
buffers. On Broadwell, this is especially important because the kernel
can't disable eLLC so we have to do it in userspace. We very badly
don't want to do that on everything so we need separate MOCS for
external and internal BOs.
In order to do this, we add an anv-specific BO flag for "external" and
use that to distinguish between buffers which may be shared with other
processes and/or display and those which are entirely internal. That,
together with an anv_mocs_for_bo helper lets us choose the right MOCS
settings for each BO use.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99507
Cc: [email protected]
Reviewed-by: Lionel Landwerlin <[email protected]>
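The selection described in the commit above, one MOCS setting for external
BOs and another for internal ones, might look roughly like the sketch below.
The struct layouts, the flag bit and all names are placeholders for
illustration; only the idea of keying the choice on an "external" BO flag
comes from the commit message.

```c
#include <stdint.h>

/* Hypothetical flag bit marking a BO that may be shared with other
 * processes and/or display. */
#define SKETCH_BO_EXTERNAL (1u << 0)

struct sketch_bo {
    uint32_t flags;
};

struct sketch_device {
    uint32_t default_mocs;   /* used for purely internal BOs */
    uint32_t external_mocs;  /* lets the kernel control caching */
};

/* Pick the MOCS value for a given BO based on its external flag. */
static uint32_t sketch_mocs_for_bo(const struct sketch_device *dev,
                                   const struct sketch_bo *bo)
{
    if (bo->flags & SKETCH_BO_EXTERNAL)
        return dev->external_mocs;
    return dev->default_mocs;
}
```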
|
|
|
|
|
|
|
| |
This is the minimum value according to the spec.
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
|
|
|
| |
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Nanley Chery <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Nanley Chery <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I was about to make the claim to someone that every field in isl_surf
is either an enum or has explicit units. Then I looked at isl_surf and
discovered this claim was wrong. We should fix that. This commit does
a few refactors:
* Add _B suffixes to some struct fields
* Add _B to some variables and parameters
* Rename row_pitch_tiles -> row_pitch_tl
Reviewed-by: Nanley Chery <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[63/93] Compiling C object 'src/intel/vulkan/...intel@vulkan@@anv_common@sta/anv_device.c.o'.
../src/intel/vulkan/anv_device.c:685:30: warning: passing 'const char *' to parameter of type 'void *' discards qualifiers [-Wincompatible-pointer-types-discards-qualifiers]
vk_free(&instance->alloc, instance->app_info.app_name);
^~~~~~~~~~~~~~~~~~~~~~~~~~~
../src/vulkan/util/vk_alloc.h:62:51: note: passing argument to parameter 'data' here
vk_free(const VkAllocationCallbacks *alloc, void *data)
^
../src/intel/vulkan/anv_device.c:686:30: warning: passing 'const char *' to parameter of type 'void *' discards qualifiers [-Wincompatible-pointer-types-discards-qualifiers]
vk_free(&instance->alloc, instance->app_info.engine_name);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../src/vulkan/util/vk_alloc.h:62:51: note: passing argument to parameter 'data' here
vk_free(const VkAllocationCallbacks *alloc, void *data)
^
[65/93] Compiling C object 'src/intel/vulkan/...ommon@sta/anv_nir_apply_pipeline_layout.c.o'.
../src/intel/vulkan/anv_nir_apply_pipeline_layout.c:519:13: warning: unused variable 'image_uniform' [-Wunused-variable]
unsigned image_uniform;
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
| |
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Fixes: 8c048af5890d4 "anv: Copy the appliation info into the instance"
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
| |
Cc: "18.2" <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
| |
Cc: "18.2" <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
anv_GetPhysicalDeviceProperties2()
The VkPhysicalDeviceProtectedMemoryProperties structure is new in Vulkan 1.1.
Fixes Vulkan CTS CL#2849.
Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This extension can be supported on SKL+. With this patch,
all corresponding tests (6K+) in CTS can pass. No test fails.
I verified CTS with the command below:
deqp-vk --deqp-case=dEQP-VK.pipeline.sampler.view_type.*reduce*
v2: 1) support all depth formats, not depth-only formats, 2) fix
a wrong indentation (Jason).
v3: fix a few nits (Lionel).
v4: fix failures in CI: disable sampler reduction when sampler
reduction mode is not specified via this extension (Lionel).
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
| |
`device` is used 2 lines below, even visible in the diff context printed.
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
During code review, Jason pointed out that
2b3064c0731 "i965, anv: Use INTEL_DEBUG for disk_cache driver flags"
didn't account for the INTEL_SCALER_* environment variables.
To fix this, let the compiler return the disk_cache driver flags.
Another possible fix would be to pull the INTEL_SCALER_* into
INTEL_DEBUG bits, but as we are currently using 41 of 64 bits, I
didn't think it was a good use of 4 more of these bits. (5 since
INTEL_PRECISE_TRIG needs to be accounted for as well.)
Cc: Jason Ekstrand <[email protected]>
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
CovID: 1438132
Fixes: a99c9e63a07477634ab73 "anv: finish the binding_table_pool on
destroyDevice when use_softpin"
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Jose Maria Casanova Crespo <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since various options within INTEL_DEBUG could impact code generation,
we need to set the disk cache driver_flags parameter based on the
INTEL_DEBUG flags in use.
An example that will affect the program generated by i965 is the
INTEL_DEBUG=nocompact option.
The DEBUG_DISK_CACHE_MASK value is added to mask the settings of
INTEL_DEBUG that can affect program generation.
v2:
* Use driver_flags (Tim)
* Also update Anvil (Jason)
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
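To illustrate the masking idea from the commit above: only the debug bits
that can change generated code should feed into the disk cache key. The bit
assignments and names below are invented for this sketch; the real
INTEL_DEBUG bit positions and the contents of DEBUG_DISK_CACHE_MASK are not
shown in this log.

```c
#include <stdint.h>

/* Hypothetical debug bits; the real positions are not given here. */
#define SKETCH_DEBUG_NOCOMPACT (1ull << 0) /* affects generated code */
#define SKETCH_DEBUG_PERF      (1ull << 1) /* only adds diagnostics  */
#define SKETCH_DEBUG_SPILL     (1ull << 2) /* affects generated code */

/* Subset of debug bits that can alter the compiled program. */
#define SKETCH_DEBUG_DISK_CACHE_MASK \
    (SKETCH_DEBUG_NOCOMPACT | SKETCH_DEBUG_SPILL)

/* Reduce the active debug flags to the ones relevant for the cache key. */
static uint64_t sketch_disk_cache_driver_flags(uint64_t debug_flags)
{
    return debug_flags & SKETCH_DEBUG_DISK_CACHE_MASK;
}
```

With this shape, two runs that differ only in a diagnostics-only flag would
share cache entries, while toggling a code-affecting flag would yield a
different driver_flags value.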
|
|
|
|
|
|
|
|
|
|
|
|
| |
This extra character should not be used by snprintf, but we make it
available to verify that we printed the exact number we wanted, and
didn't overflow.
v2:
* Also update Anvil
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
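A sketch of the spare-character check described in the commit above, under
the assumption that the string being formatted has a fixed expected length.
The "drv_" prefix, the buffer size and the function name are placeholders
chosen for this example.

```c
#include <assert.h>
#include <stdio.h>

/* "drv_" plus 4 hex digits is 8 characters; add 1 for the NUL and
 * 1 spare character that a correct format never touches. */
#define SKETCH_NAME_SIZE 10

/* Hypothetical helper: formats a 16-bit id and checks the length. */
static int sketch_format_name(char out[SKETCH_NAME_SIZE], unsigned id)
{
    int len = snprintf(out, SKETCH_NAME_SIZE, "drv_%04x", id);

    /* If exactly 8 characters were printed, the spare byte went unused
     * and nothing was truncated. Any other length trips the assert. */
    assert(len == SKETCH_NAME_SIZE - 2);
    return len;
}
```

The spare byte means an output that is one character too long would still be
stored in full, so the mismatch shows up through the assert rather than as a
silently truncated string.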
|
|
|
|
| |
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
|