| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
| |
v2 (Jason): Adjust directly in surf_fake_rgb_with_red()
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101910
CC: [email protected]
Reviewed-by: Jason Ekstrand <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
(cherry picked from commit 393ec1a5071263d300e91f43058ed3b594d41418)
|
|
|
|
|
|
|
|
|
|
| |
We can't sample from depth-stencil formats but on gen7 but we can sample
from depth-only formats.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102024
Reviewed-by: Juan A. Suarez Romero <[email protected]>
Cc: [email protected]
(cherry picked from commit 06d3115bb97740a4c8f36c645944a8bd0bde3f68)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If dual object compile fails (as seems to happen with virgl a
fair bit, and does piglit even have any tests for it?), we end up
not restarting the pull params, so we call
vec4_visitor::move_uniform_array_access_to_pull_constant
a second time and it runs over the ends of the alloc.
Fixes: tests/spec/glsl-1.50/execution/geometry/max-input-components.shader_test
running inside virgl on ivybridge.
Reviewed-by: Kenneth Graunke <[email protected]>
Cc: <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
(cherry picked from commit 271fa3a684ef0eefe99087c13d1abb099784163f)
|
|
|
|
|
|
|
|
|
|
| |
In the previous commit, forgot to apply v2 suggestions.
Fixes: 28d0c38 (anv/pipeline: use unsigned long long constant to check
enable vertex inputs)
Signed-off-by: Juan A. Suarez Romero <[email protected]>
(cherry picked from commit 5cd4ece34ebdc1383f1e2376c88097d06544e2f6)
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The EU limit of 128 GRFs should allow 32 vertex elements of 4 GRFs.
However, the maximum allowed value of "Vertex URB Entry Read Length"
in SIMD8 is 15. And 15 * 8 = 120 gives us a limit of 30 vertex elements.
Because we also need to reserve a vertex buffer to upload
VertexIndex/InstanceIndex and another to upload DrawID when needed,
we can only expose 28.
Cc: "17.2" <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
(cherry picked from commit 31f1863ace73d31a579e5c36252a957818ad09cf)
|
|
|
|
|
|
| |
Cc: "17.2" <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
(cherry picked from commit a848e693efc8e2a1d355dc1076409968b374153f)
|
|
|
|
|
|
|
|
|
|
|
|
| |
We incorrectly detected VK_IMAGE_CREATE_CUBE_COMPATIBLE_BIT. We looked
for the bit in VkImageCreateInfo::usage, but it's actually in
VkImageCreateInfo::flags.
Found by assertion failures while enabling VK_ANDROID_native_buffer.
Cc: [email protected]
Reviewed-by: Lionel Landwerlin <[email protected]>
(cherry picked from commit 5d6905211355464de4885492511e5f9d936cc058)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
From the Vulkan 1.0.53 spec VU for vkCreateImageView:
"image must have been created with a usage value containing at least
one of VK_IMAGE_USAGE_SAMPLED_BIT, VK_IMAGE_USAGE_STORAGE_BIT,
VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT,
VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT, or
VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT"
We were missing VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT from out list.
Reviewed-by: Lionel Landwerlin <[email protected]>
Cc: [email protected]
(cherry picked from commit c5700ed72e765043bb1c8523a05ade235496e053)
|
|
|
|
|
|
| |
Reviewed-by: Lionel Landwerlin <[email protected]>
Cc: [email protected]
(cherry picked from commit cbdfd1daa24ee9a7a612f7b0e9aa4610af05e211)
|
|
|
|
|
|
|
|
|
| |
We were early returning and never created the NULL surface state.
Reviewed-by: Lionel Landwerlin <[email protected]>
Tested-by: James Legg <[email protected]>
Cc: [email protected]
(cherry picked from commit bd41564746ca4f4bd46185b99754eaa012c359e5)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Jason updated the Khronos spec to explicitly state that Wayland surfaces
must support VK_PRESENT_MODE_MAILBOX_KHR.
ANV did so since day one (back in 2015)
Cc: [email protected]
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
(cherry picked from commit 43c188f9708b3e80b9f1c9c4c6bb16ac94b5ce5e)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <[email protected]>
Conflicts:
src/intel/vulkan/anv_device.c
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When initializing the ANV pipeline, one of the tasks is checking which
vertex inputs are enabled. This is done by checking if the enabled bits
in inputs_read.
But the mask to use is computed doing `(1 << (VERT_ATTRIB_GENERIC0 +
desc->location))`. The problem here is that if location is 15 or
greater, the sum is 32 or greater. But C is handling 1 as a 32-bit
integer, which means the displaced bit is out of range and thus the full
value is 0.
Thus, use 1ull, which is an unsigned long long value.
This fixes:
dEQP-VK.pipeline.vertex_input.max_attributes.16_attributes.binding_one_to_one.interleaved
v2: use 1ull instead of BITFIELD64_BIT() (Matt Turner)
Reviewed-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Signed-off-by: Juan A. Suarez Romero <[email protected]>
Cc: [email protected]
(cherry picked from commit 28d0c38d85d94cab23667049f03ea072b8e7907c)
|
|
|
|
|
|
|
|
|
| |
V2: Use 2^31 bytes (2GB) surface size limit on pre-gen9 and
2^38 bytes for gen9+.
Signed-off-by: Anuj Phogat <[email protected]>
Reviewed-by: Nanley Chery <[email protected]>
(cherry picked from commit c07271fef095164c8bcfb54fdc95567c3774a866)
|
|
|
|
|
|
| |
Signed-off-by: Anuj Phogat <[email protected]>
Reviewed-by: Nanley Chery <[email protected]>
(cherry picked from commit 70229782370c7ed9a63e05689f4d8bfc80128dd9)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some reshuffle in the Makefiles under src/intel resulted in Android
libraries being no longer linked with code using
src/intel/common/gen_debug.h that contains references to functions
exported by those libraries (namely ALOGW macro, which is currently
resolved into a call to __android_log_print() from cutils).
Fix the build by taking into account ANDROID_CFLAGS and ANDROID_LIBS for
affected module on Android NDK builds.
Fixes: d5b355ce5fd ("i965: Move intel_debug.h to intel/common/gen_debug.h")
Signed-off-by: Tomasz Figa <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
(cherry picked from commit 50a8a7377ae071d5b4b927e9055a7ec8391acc59)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The new table added in this patch matches with the table
in gfxspecs. We were programming the wrong values earlier.
V2: Update the comment.
Cc: "17.1" <[email protected]>
Signed-off-by: Anuj Phogat <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
(cherry picked from commit 8521559e086a3d56f549962ab8e9f45a6a5989d8)
[Andres Gomez: gen 10 was not still there on 17.1]
Signed-off-by: Andres Gomez <[email protected]>
Conflicts:
src/intel/common/gen_l3_config.c
|
|
|
|
|
|
|
|
|
|
|
|
| |
This new field helps simplify l3 way size computations
in next patch.
V2: Initialize the l3_banks to 0 in macros.
Suggested-by: Francisco Jerez <[email protected]>
Signed-off-by: Anuj Phogat <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
(cherry picked from commit eb23be1d97da290073d76c2510b8999b250f0139)
|
|
|
|
|
|
|
|
|
|
|
|
| |
Valid values for URBAllocation start at 32, so substract that
before programming the register.
This was missed when porting from the GL driver.
Cc: "17.1" <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
(cherry picked from commit a52ee32a9a49b48c51a80b8a35aa26bd583cabb7)
|
|
|
|
|
|
|
| |
Reviewed-by: Nanley Chery <[email protected]>
Cc: "17.1" <[email protected]>
(cherry picked from commit 39adea9330376a64a4b5e8da98f5e055ebd3331e)
Signed-off-by: Juan A. Suarez Romero <[email protected]>
|
|
|
|
|
|
|
| |
Reviewed-by: Nanley Chery <[email protected]>
Cc: "17.1" <[email protected]>
(cherry picked from commit 50d0eb5096bd9514821a641f25c0b3455c0f8a88)
Signed-off-by: Juan A. Suarez Romero <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This makes us walk over the heaps one at a time and add the types for
LLC and !LLC to each heap.
Reviewed-by: Nanley Chery <[email protected]>
Cc: "17.1" <[email protected]>
(cherry picked from commit 34581fdd4f149894dfa51777a2f7eb289bd08b71)
Signed-off-by: Juan A. Suarez Romero <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Reviewed-by: Nanley Chery <[email protected]>
Cc: "17.1" <[email protected]>
(cherry picked from commit b83b1af6f6936f36db42a8f8b8e0854d0f9491fd)
[Juan A. Suarez: resolve trivial conflicts]
Signed-off-by: Juan A. Suarez Romero <[email protected]>
Conflicts:
src/intel/vulkan/anv_device.c
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The idea behind doing this was to make it easier to set various flags.
However, we have enough custom flag settings floating around the driver
that this is more of a nuisance than a help. This commit has the
following functional changes:
1) The workaround_bo created in anv_CreateDevice loses both flags.
This shouldn't matter because it's very small and entirely internal
to the driver.
2) The bo created in anv_CreateDmaBufImageINTEL loses the
EXEC_OBJECT_ASYNC flag. In retrospect, it never should have gotten
EXEC_OBJECT_ASYNC in the first place.
Reviewed-by: Nanley Chery <[email protected]>
Cc: "17.1" <[email protected]>
(cherry picked from commit 00df1cd9d6234cdfc9fb2bf3615196ff83a3c956)
[Juan A. Suarez: resolve trivial conflicts]
Signed-off-by: Juan A. Suarez Romero <[email protected]>
Conflicts:
src/intel/vulkan/anv_allocator.c
src/intel/vulkan/anv_device.c
src/intel/vulkan/anv_queue.c
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of returning valid types as just a number, we now walk the list
and check the buffer's usage against the usage flags we store in the new
anv_memory_type structure. Currently, valid_buffer_usage == ~0.
Reviewed-by: Nanley Chery <[email protected]>
Cc: "17.1" <[email protected]>
(cherry picked from commit f7736ccf53eaeb66c4270afe0916e2cb29ab8667)
[Juan A. Suarez: resolve trivial conflicts]
Signed-off-by: Juan A. Suarez Romero <[email protected]>
Conflicts:
src/intel/vulkan/anv_device.c
src/intel/vulkan/anv_private.h
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Before, we were just comparing the type index to 0. Now we actually
look the type up in the table and check its properties to determine what
kind of mapping we want to do.
Reviewed-by: Nanley Chery <[email protected]>
Cc: "17.1" <[email protected]>
(cherry picked from commit 92325a7efc769c32e03031323e21700dc55171e4)
[Juan A. Suarez: resolve trivial conflicts]
Signed-off-by: Juan A. Suarez Romero <[email protected]>
Conflicts:
src/intel/vulkan/anv_device.c
src/intel/vulkan/anv_private.h
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reviewed-by: Chad Versace <[email protected]>
(cherry picked from commit 35e626bd0e59e7ce9fd97ccef66b2468c09206a4)
Signed-off-by: Juan A. Suarez Romero <[email protected]>
Squashed with:
anv/tests: Create a dummy instance as well as device
This fixes crashes caused by 35e626bd0e59e7ce9fd97ccef66b2468c09206a4
which made us start referencing the instance in the allocators. With
this commit, the tests now happily pass again.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100877
Tested-by: Vinson Lee <[email protected]>
(cherry picked from commit 6ef1bd4fa57b36efc7919773fd26c36fd43d2ea9)
Signed-off-by: Juan A. Suarez Romero <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The compiler can discard the shared ones from the link chain, since
there is no user (the static libraries) before it on the command line.
Cc: [email protected]
Reported-by: Laurent Carlier <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Eduardo Lima Mitev <[email protected]>
(cherry picked from commit 3e8790bff096a1a56bd1a3046c556a7f93b68ca8)
Signed-off-by: Juan A. Suarez Romero <[email protected]>
|
|
|
|
|
|
|
| |
Reviewed-by: Nanley Chery <[email protected]>
Cc: "17.1" <[email protected]>
(cherry picked from commit 10fad58b31ee2354330152ca4072327d228fc2e7)
Signed-off-by: Juan A. Suarez Romero <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reviewed-by: Nanley Chery <[email protected]>
Cc: "17.1" <[email protected]>
(cherry picked from commit c1f4343807d1040bd7b5440aa2f5fccf5f12842d)
[Juan A. Suarez: resolve trivial conflicts]
Signed-off-by: Juan A. Suarez Romero <[email protected]>
Conflicts:
src/intel/vulkan/anv_device.c
src/intel/vulkan/anv_private.h
|
|
|
|
|
|
|
|
|
|
|
| |
This doesn't matter right now since it only affects whether or not we
set the kernel bit but, if we ever do anything else based on it, we'll
want it to be correct per-gen.
Reviewed-by: Nanley Chery <[email protected]>
Cc: "17.1" <[email protected]>
(cherry picked from commit eceaf7e2340fca0079300692733206b2af555bd9)
Signed-off-by: Juan A. Suarez Romero <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Up until now, we've been memsetting the auxiliary surface to 0 at
BindImageMemory time to ensure that it is properly initialized.
However, this isn't correct because apps are allowed to freely alias
memory between different images and buffers so long as they properly
track whether or not a particular image is valid and, if it isn't,
transition from UNINITIALIZED to something else before using it. We
now implement those transitions so we can drop the hack.
Reviewed-by: Nanley Chery <[email protected]>
Cc: "17.1" <[email protected]>
(cherry picked from commit 4eecd534f0544b62ae831a97708ade007541bd32)
Signed-off-by: Juan A. Suarez Romero <[email protected]>
|
|
|
|
|
|
|
| |
Reviewed-by: Nanley Chery <[email protected]>
Cc: "17.1" <[email protected]>
(cherry picked from commit cc45c4bb8072b6593812f9b68a7b3d2d00bfb9f0)
Signed-off-by: Juan A. Suarez Romero <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This causes dEQP-VK.api.copy_and_blit.resolve_image.partial.* to start
failing due to test bugs. See CL 1031 for a test fix.
Reviewed-by: Nanley Chery <[email protected]>
Cc: "17.1" <[email protected]>
(cherry picked from commit 75edecf5020a9b833ff7e2929f64ceb11c9df679)
Signed-off-by: Juan A. Suarez Romero <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently we are having the XCB_DRI3 dependencies duplicated,
partially.
Just do a once-off check and add all of the respective CFLAGS/LIBS
where needed.
As a nice side effect this helps us solve a couple of FIXMEs.
DRI3 is not a thing w/o X11 so disable it in such cases.
Cc: [email protected]
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
(cherry picked from commit acf3d2afab0571b74c0c0d1aee0f631b33fdc7da)
Signed-off-by: Juan A. Suarez Romero <[email protected]>
squashed with:
configure.ac: add xcb-fixes to the XCB DRI3 list
The XCB module is used by the VL targets. Thus omitting it can lead to
link-time errors due to unresolved symbols.
Other DRI3 users such as the Vulkan WSI and the dri3 loader helper do
not use an update region in their xcb_present_pixmap() call. We will
look into that at a later stage.
Fixes: acf3d2afab0 ("configure: check once for DRI3 dependencies")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101110
Signed-off-by: Emil Velikov <[email protected]>
(cherry picked from commit 9a90d6a9d4ee1632aa357a2ac9be150e058e2c10)
Signed-off-by: Juan A. Suarez Romero <[email protected]>
squashed with:
configure.ac: s/xcb-fixes/xcb-xfixes/
Former is not a thing, even if I have a hacked xcb-fixes.pc on my system.
Thanks for spotting it Mark!
Fixes: 9a90d6a9d4e ("configure.ac: add xcb-fixes to the XCB DRI3 list")
Signed-off-by: Emil Velikov <[email protected]>
(cherry picked from commit 48cd1919ff1584c211ec7958864cac2e1cb347cf)
Signed-off-by: Juan A. Suarez Romero <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The procedure for decompressing an opaque BC1 Vulkan format is dependant on the
comparison of two colors stored in the first 32 bits of the compressed block.
Here's the specified OpenGL (and Vulkan) behavior for reference:
The RGB color for a texel at location (x,y) in the block is given by:
RGB0, if color0 > color1 and code(x,y) == 0
RGB1, if color0 > color1 and code(x,y) == 1
(2*RGB0+RGB1)/3, if color0 > color1 and code(x,y) == 2
(RGB0+2*RGB1)/3, if color0 > color1 and code(x,y) == 3
RGB0, if color0 <= color1 and code(x,y) == 0
RGB1, if color0 <= color1 and code(x,y) == 1
(RGB0+RGB1)/2, if color0 <= color1 and code(x,y) == 2
BLACK, if color0 <= color1 and code(x,y) == 3
The sampling operation performed on an opaque DXT1 Intel format essentially
hard-codes the comparison result of the two colors as color0 > color1. This
means that the behavior is incompatible with OpenGL and Vulkan. This is stated
in the SKL PRM, Vol 5: Memory Views:
Opaque Textures (DXT1_RGB)
Texture format DXT1_RGB is identical to DXT1, with the exception that the
One-bit Alpha encoding is removed. Color 0 and Color 1 are not compared, and
the resulting texel color is derived strictly from the Opaque Color Encoding.
The alpha channel defaults to 1.0.
Programming Note
Context: Opaque Textures (DXT1_RGB)
The behavior of this format is not compliant with the OGL spec.
The opaque and non-opaque BC1 Vulkan formats are specified to be decoded in
exactly the same way except the BLACK value must have a transparent alpha
channel in the latter. Use the four-channel BC1 Intel formats with the alpha
set to 1 to provide the behavior required by the spec.
v2 (Kenneth Graunke):
- Provide a more detailed commit message.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100925
Cc: <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Signed-off-by: Nanley Chery <[email protected]>
(cherry picked from commit 56458cb168bf79ae51ba1efc3acec15874cc34a9)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 6facb0c0 ("android: fix libz dynamic library dependencies")
unconditionally adds libz as a dependency to all shared libraries.
That is unnecessary.
Commit 85a9b1b5 introduced libz as a dependency to libmesa_util.
So only the shared libraries that use libmesa_util need libz.
Fix Android Lollipop build by adding the include path of zlib to
libmesa_util explicitly instead of getting the path implicitly
from zlib since it doesn't export the include path in Lollipop.
Fixes: 6facb0c0 "android: fix libz dynamic library dependencies"
Signed-off-by: Chih-Wei Huang <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
Reviewed-by: Rob Herring <[email protected]>
(cherry picked from commit bfc0c23843008fd510afa263ebe371bef3346445)
|
|
|
|
|
|
|
|
|
|
| |
After successful drmGetDevices2() call, drmFreeDevices() needs to be
called.
Fixes: b1fb6e8d "anv: do not open random render node(s)"
Signed-off-by: Grazvydas Ignotas <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]> # radv version
(cherry picked from commit 0ef302638f2883789a3b39c2b6cfd20814efa0bb)
|
|
|
|
|
|
|
|
|
|
|
| |
drmGetDevices2 takes count and not size. Probably hasn't caused problems
yet in practice and was missed as setups with more than 8 DRM devices
are not very common.
Fixes: b1fb6e8d "anv: do not open random render node(s)"
Signed-off-by: Grazvydas Ignotas <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
(cherry picked from commit e0aee8b667955675e2e6c647a88048b64bc2796e)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reorder the uniforms to load first the dvec4-aligned variables in the
push constant buffer and then push the vec4-aligned ones. It takes
into account that the relocated uniforms should be aligned to their
channel size.
This fixes a bug were the dvec3/4 might be loaded one part on a GRF and
the rest in next GRF, so the region parameters to read that could break
the HW rules.
v2:
- Fix broken logic.
- Add a comment to explain what should be needed to optimise the usage
of the push constant buffer slots, as this patch does not pack the
uniforms.
v3:
- Implemented the push constant buffer usage optimization.
Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
Cc: "17.1" <[email protected]>
Acked-by: Francisco Jerez <[email protected]>
(cherry picked from commit e69e5c7006da80af62c9ef08dec215b3b4b30946)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
offset
It was setting XYWZ swizzle and writemask to all uniforms, no matter if they
were a vector or scalar, so this can lead to problems when loading them
to the push constant buffer.
Moreover, 'shift' calculation was designed to calculate the offset in
DWORDS, but it doesn't take into account DFs, so the calculated swizzle
for the later ones was wrong.
The indirect case is not changed because MOV INDIRECT will write
to all components. Added an assert to verify that these uniforms
are aligned.
v2:
- Fix 'shift' calculation (Curro)
- Set both swizzle and writemask.
- Add assert(shift == 0) for the indirect case.
Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
Cc: "17.1" <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
(cherry picked from commit 8aa6ada8384a961b37dfefec7f9e40e5a4e27ce7)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
vec4_gs_visitor execution
We are going to add a packing feature to reduce the usage of the push
constant buffer. One of the consequences is that 'nr_params' would be
modified by vec4_visitor's run call, so we need to restore it if one of
them failed before executing the fallback ones. Same thing happens to the
uniforms values that would be reordered afterwards.
Fixes GL45-CTS.arrays_of_arrays_gl.InteractionFunctionCalls2 when
the dvec4 alignment and packing patch is applied.
Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
Cc: "17.1" <[email protected]>
Acked-by: Francisco Jerez <[email protected]>
(cherry picked from commit 354f7f2cb9c7206e12646c79d8ff5becbaffa61b)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The reasoning Chad gave in the comment for choosing a valign of 4 is
entirely bunk. The fact that you have to multiply pitch by 2 is
completely unrelated to the halign/valign parameters used for texture
layout. (Not completely unrelated. W-tiling is just Y-tiling with a
bit of extra swizzling which turns 8x8 W-tiled chunks into 16x4 y-tiled
chunks so it makes everything easier if miplevels are always aligned to
8x8.) The fact that RENDER_SURFACE_STATE::SurfaceVerticalAlignmet
doesn't have a VALIGN_8 option doesn't matter since this is gen7 and you
can't do stencil texturing anyway.
v2 (Jason Ekstrand):
- Delete most of Chad's comment and add a more descriptive commit
message.
Signed-off-by: Topi Pohjolainen <[email protected]>
Cc: "17.0 17.1" <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
(cherry picked from commit 236f17a9f73935db6cddafd91e53a5fae34aae6e)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Take it into account when checking if the mapping failed.
v2:
- Remove map == NULL and its related comment (Emil)
Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
Fixes: 6f3e3c715a7 ("vk/allocator: Add a BO pool")
Fixes: 9919a2d34de ("anv/image: Memset hiz surfaces to 0 when binding memory")
Cc: "17.0 17.1" <[email protected]>
(cherry picked from commit b546c9d318731b988aa3d8c4e4735cdbb596cfbf)
Squashed with:
anv: vkBindImageMemory() should return VK_ERROR_OUT_OF_{HOST,DEVICE}_MEMORY on failure
According to the spec we get VK_ERROR_OUT_OF_HOST_MEMORY or
VK_ERROR_OUT_OF_DEVICE_MEMORY on vkBindImageMemory failure.
Fixes returned value changed by b546c9d.
Fixes: b546c9d ("anv: anv_gem_mmap() returns MAP_FAILED as mapping error")
Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
Cc: "17.0 17.1" <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
(cherry picked from commit 939b015736d5091faeabde4f5a373e6a1612c5ed)
Squashed with:
anv: fix anv_gem_mmap comment to not mention NULL
The function cannot return NULL, update the comment accordingly.
Fixes: b546c9d ("anv: anv_gem_mmap() returns MAP_FAILED as mapping error")
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
(cherry picked from commit 9d2aa6e5067752efbc0acbd728bc0bde49aefb61)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
instructions
The regioning parameters are now properly set by convert_to_hw_regs()
and we don't need to fix them in the generator. That latter fix
previously done in the generator was strictly speaking wrong for any
non-identity regions.
Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
Cc: "17.1" <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
(cherry picked from commit f57e234fdd52331d0aa6656a36efdebea9d11e9d)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On gen7, the swizzles used in DF align16 instructions works for element
size of 32 bits, so we can address only 2 consecutive DFs. As we assumed that
in the rest of the code and prepare the instructions for this (scalarize_df()),
we need to set it to two again.
However, for DF align1 instructions, a width of 2 is wrong as we are not
reading the data we want. For example, an uniform would have a region of
<0, 2, 1> so it would repeat the first 2 DFs, when we wanted to access
to the first 4.
This patch sets the default one to 4 and then modifies the width of
align16 instruction's DF sources when we translate the logical swizzle
to the physical one.
v2:
- Remove conditional (Curro).
Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
Cc: "17.1" <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
(cherry picked from commit aaeb1c99beed39d85c300ebdb8a7bf056ee6717c)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
From IVB PRM, vol4, part3, "General Restrictions on Regioning
Parameters":
"If ExecSize = Width and HorzStride ≠ 0, VertStride must
be set to Width * HorzStride."
In next patch, we are going to modify the region parameter for
uniforms and vgrf. For uniforms that are the source of
DF align1 instructions, they will have <0, 4, 1> regioning and
the execsize for those instructions will be 4, so they will break
the regioning rule. This will be the same for VGRF sources where
we use the vstride == 0 exploit.
As we know we are not going to cross the GRF boundary with that
execsize and parameters (not even with the exploit), we just fix
the vstride here.
v2:
- Move is_align1_df() (Curro)
- Refactor exec_size == width calculation (Curro)
Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
Cc: "17.1" <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
(cherry picked from commit 7f728bce811fc283e672e3a07b008bb7b52de35e)
|
|
|
|
|
|
|
|
|
| |
The command is really operating on a Queue not a command buffer and the
nearest object to that with an allocator is VkDevice.
Reviewed-by: Chad Versace <[email protected]>
Cc: "17.0 17.1" <[email protected]>
(cherry picked from commit bd3a9813b92bd2e116b58f0932bc7f1f722a9f63)
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes rendering corruptions in DOOM. Hopefully, it will also make
Jenkins a bit more stable as we've been seeing some random failures and
GPU hangs ever since turning on 48bit.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100620
Fixes: 651ec926fc1 "anv: Add support for 48-bit addresses"
Tested-by: Grazvydas Ignotas <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Cc: "17.1" <[email protected]>
(cherry picked from commit c43b4bc85eddba8bc31665cfee5928bed8343516)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Until now the spilling cost calculation was neglecting the amount of
data read from the register during the spilling cost calculation.
This caused it to make suboptimal decisions in some cases leading to
higher memory bandwidth usage than necessary.
Improves Unigine Heaven performance by ~4% on BDW, reversing an
unintended FPS regression from my previous commit
147e71242ce539ff28e282f009c332818c35f5ac with n=12 and statistical
significance 5%. In addition SynMark2 OglCSDof performance is
improved by an additional ~5% on SKL, and a Kerbal Space Program
apitrace around the Moho planet I can provide on request improves by
~20%.
Cc: <[email protected]>
Reviewed-by: Plamena Manolova <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
(cherry picked from commit 58324389be7bc7c5e10093b9cc0a8efa9b4c93a9)
|
|
|
|
|
|
|
|
|
|
|
| |
This is what we use later on to compute the number of registers that
will actually get spilled to memory, so it's more likely to match
reality than the current open-coded approximation.
Cc: <[email protected]>
Reviewed-by: Plamena Manolova <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
(cherry picked from commit ecc19e12dca95d2571d3761dea6dec24b061013c)
|