path: root/src/intel
Commit message | Author | Age | Files | Lines
* intel/blorp: Adjust intra-tile x when faking rgb with red-only | Topi Pohjolainen | 2017-08-25 | 1 | -0/+1
  v2 (Jason): Adjust directly in surf_fake_rgb_with_red()
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101910
  CC: [email protected]
  Reviewed-by: Jason Ekstrand <[email protected]>
  Signed-off-by: Topi Pohjolainen <[email protected]>
  (cherry picked from commit 393ec1a5071263d300e91f43058ed3b594d41418)
* anv/formats: Allow sampling on depth-only formats on gen7 | Jason Ekstrand | 2017-08-19 | 1 | -1/+2
  We can't sample from depth-stencil formats on gen7, but we can sample from depth-only formats.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102024
  Reviewed-by: Juan A. Suarez Romero <[email protected]>
  Cc: [email protected]
  (cherry picked from commit 06d3115bb97740a4c8f36c645944a8bd0bde3f68)
* intel/vec4/gs: reset nr_pull_param if DUAL_INSTANCED compile failed. | Dave Airlie | 2017-08-19 | 1 | -0/+1
  If dual object compile fails (as seems to happen with virgl a fair bit, and does piglit even have any tests for it?), we end up not restarting the pull params, so we call vec4_visitor::move_uniform_array_access_to_pull_constant a second time and it runs over the ends of the alloc.
  Fixes: tests/spec/glsl-1.50/execution/geometry/max-input-components.shader_test running inside virgl on ivybridge.
  Reviewed-by: Kenneth Graunke <[email protected]>
  Cc: <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
  (cherry picked from commit 271fa3a684ef0eefe99087c13d1abb099784163f)
* anv/pipeline: do not use BITFIELD64_BIT() | Juan A. Suarez Romero | 2017-08-03 | 1 | -1/+1
  The previous commit forgot to apply the v2 suggestions.
  Fixes: 28d0c38 (anv/pipeline: use unsigned long long constant to check enable vertex inputs)
  Signed-off-by: Juan A. Suarez Romero <[email protected]>
  (cherry picked from commit 5cd4ece34ebdc1383f1e2376c88097d06544e2f6)
* anv: only expose up to 28 vertex attributes | Iago Toral Quiroga | 2017-08-03 | 1 | -1/+1
  The EU limit of 128 GRFs should allow 32 vertex elements of 4 GRFs. However, the maximum allowed value of "Vertex URB Entry Read Length" in SIMD8 is 15. And 15 * 8 = 120 gives us a limit of 30 vertex elements. Because we also need to reserve a vertex buffer to upload VertexIndex/InstanceIndex and another to upload DrawID when needed, we can only expose 28.
  Cc: "17.2" <[email protected]>
  Reviewed-by: Lionel Landwerlin <[email protected]>
  (cherry picked from commit 31f1863ace73d31a579e5c36252a957818ad09cf)
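The arithmetic in the message above can be restated as a small C sketch. The helper and constant names are hypothetical (they are not taken from the driver), and the meaning given to the factor of 8 is simply the one implied by "15 * 8 = 120" in the message.

```c
#include <assert.h>

/* Values quoted in the commit message; the names are illustrative only. */
enum {
   MAX_URB_READ_LENGTH_SIMD8 = 15, /* max "Vertex URB Entry Read Length" */
   READ_LENGTH_FACTOR        = 8,  /* factor used in the message: 15 * 8 */
   GRFS_PER_VERTEX_ELEMENT   = 4,
   RESERVED_ELEMENTS         = 2   /* VertexIndex/InstanceIndex and DrawID */
};

/* Hypothetical helper: how many vertex attributes can be exposed. */
static unsigned max_exposed_vertex_attribs(void)
{
   unsigned grfs = MAX_URB_READ_LENGTH_SIMD8 * READ_LENGTH_FACTOR; /* 120 */
   unsigned elements = grfs / GRFS_PER_VERTEX_ELEMENT;             /* 30 */

   assert(elements >= (unsigned) RESERVED_ELEMENTS);
   return elements - RESERVED_ELEMENTS;                            /* 28 */
}
```

Calling max_exposed_vertex_attribs() yields 28, the limit named in the commit title.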
* anv/cmd_buffer: fix off by one error in assertion | Iago Toral Quiroga | 2017-08-03 | 1 | -1/+1
  Cc: "17.2" <[email protected]>
  Reviewed-by: Lionel Landwerlin <[email protected]>
  (cherry picked from commit a848e693efc8e2a1d355dc1076409968b374153f)
* anv/image: Fix VK_IMAGE_CREATE_CUBE_COMPATIBLE_BIT | Chad Versace | 2017-08-03 | 1 | -3/+4
  We incorrectly detected VK_IMAGE_CREATE_CUBE_COMPATIBLE_BIT. We looked for the bit in VkImageCreateInfo::usage, but it's actually in VkImageCreateInfo::flags.
  Found by assertion failures while enabling VK_ANDROID_native_buffer.
  Cc: [email protected]
  Reviewed-by: Lionel Landwerlin <[email protected]>
  (cherry picked from commit 5d6905211355464de4885492511e5f9d936cc058)
* anv/image: Add INPUT_ATTACHMENT to the list of required usages | Jason Ekstrand | 2017-08-03 | 1 | -0/+1
  From the Vulkan 1.0.53 spec VU for vkCreateImageView:
    "image must have been created with a usage value containing at least one of VK_IMAGE_USAGE_SAMPLED_BIT, VK_IMAGE_USAGE_STORAGE_BIT, VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT, VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT, or VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT"
  We were missing VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT from our list.
  Reviewed-by: Lionel Landwerlin <[email protected]>
  Cc: [email protected]
  (cherry picked from commit c5700ed72e765043bb1c8523a05ade235496e053)
* anv: Stop leaking the no_aux sampler surface state | Jason Ekstrand | 2017-08-03 | 1 | -0/+5
  Reviewed-by: Lionel Landwerlin <[email protected]>
  Cc: [email protected]
  (cherry picked from commit cbdfd1daa24ee9a7a612f7b0e9aa4610af05e211)
* anv/cmd_buffer: Properly handle render passes with 0 attachments | Jason Ekstrand | 2017-08-03 | 1 | -12/+11
  We were returning early and never created the NULL surface state.
  Reviewed-by: Lionel Landwerlin <[email protected]>
  Tested-by: James Legg <[email protected]>
  Cc: [email protected]
  (cherry picked from commit bd41564746ca4f4bd46185b99754eaa012c359e5)
* anv: advertise v6 of the wayland surface extension | Emil Velikov | 2017-08-03 | 1 | -1/+1
  Jason updated the Khronos spec to explicitly state that Wayland surfaces must support VK_PRESENT_MODE_MAILBOX_KHR. ANV has done so since day one (back in 2015).
  Cc: [email protected]
  Signed-off-by: Emil Velikov <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
  (cherry picked from commit 43c188f9708b3e80b9f1c9c4c6bb16ac94b5ce5e)
  [Emil Velikov: resolve trivial conflicts]
  Signed-off-by: Emil Velikov <[email protected]>
  Conflicts: src/intel/vulkan/anv_device.c
* anv/pipeline: use unsigned long long constant to check enable vertex inputs | Juan A. Suarez Romero | 2017-08-03 | 1 | -1/+1
  When initializing the ANV pipeline, one of the tasks is checking which vertex inputs are enabled. This is done by checking the enabled bits in inputs_read.
  But the mask to use is computed as `(1 << (VERT_ATTRIB_GENERIC0 + desc->location))`. The problem is that if location is 15 or greater, the sum is 32 or greater. But C treats 1 as a 32-bit integer, which means the shifted bit is out of range and thus the full value is 0.
  Thus, use 1ull, which is an unsigned long long value.
  This fixes: dEQP-VK.pipeline.vertex_input.max_attributes.16_attributes.binding_one_to_one.interleaved
  v2: use 1ull instead of BITFIELD64_BIT() (Matt Turner)
  Reviewed-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
  Signed-off-by: Juan A. Suarez Romero <[email protected]>
  Cc: [email protected]
  (cherry picked from commit 28d0c38d85d94cab23667049f03ea072b8e7907c)
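A minimal C sketch of the fix described above: build the mask from a 64-bit constant so that bit 32 and above can be tested. The function name is hypothetical, and GENERIC0_BASE is an assumed stand-in for VERT_ATTRIB_GENERIC0, with its value picked only so that location 15 lands on bit 32 as the message describes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed value for illustration; stands in for VERT_ATTRIB_GENERIC0. */
#define GENERIC0_BASE 17

/* Returns true if the generic attribute at `location` is set in the
 * 64-bit `inputs_read` bitfield. */
static bool vertex_input_enabled(uint64_t inputs_read, unsigned location)
{
   assert(GENERIC0_BASE + location < 64);

   /* 1ull makes the shift happen in 64 bits. With a plain int `1`,
    * a shift count of 32 or more cannot produce the intended bit. */
   return (inputs_read & (1ull << (GENERIC0_BASE + location))) != 0;
}
```

With these assumptions, location 15 tests bit 32 of inputs_read, which a 32-bit mask could never select.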
* intel/isl: Add the maximum surface size limit | Anuj Phogat | 2017-07-12 | 1 | -0/+22
  V2: Use 2^31 bytes (2GB) surface size limit on pre-gen9 and 2^38 bytes for gen9+.
  Signed-off-by: Anuj Phogat <[email protected]>
  Reviewed-by: Nanley Chery <[email protected]>
  (cherry picked from commit c07271fef095164c8bcfb54fdc95567c3774a866)
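The limits quoted above could be checked roughly as follows. This is a sketch, not the isl code: the function name is hypothetical, and treating a size exactly equal to the limit as valid is an assumption. Note that 2^38 does not fit in 32 bits, so the size has to be held in a 64-bit type.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: checks a surface size against the limits quoted
 * in the message (2^31 bytes before gen9, 2^38 bytes on gen9+). */
static bool surface_size_ok(unsigned gen, uint64_t size_bytes)
{
   const uint64_t max_size = (gen >= 9) ? (1ull << 38) : (1ull << 31);

   /* Assumption: a surface of exactly max_size bytes is still allowed. */
   return size_bytes <= max_size;
}
```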
* intel/isl: Use uint64_t to store total surface size | Anuj Phogat | 2017-07-12 | 2 | -2/+3
  Signed-off-by: Anuj Phogat <[email protected]>
  Reviewed-by: Nanley Chery <[email protected]>
  (cherry picked from commit 70229782370c7ed9a63e05689f4d8bfc80128dd9)
* intel: common: Fix link failure with standalone Android build | Tomasz Figa | 2017-07-12 | 1 | -0/+5
  A reshuffle of the Makefiles under src/intel resulted in Android libraries no longer being linked with code using src/intel/common/gen_debug.h, which contains references to functions exported by those libraries (namely the ALOGW macro, which currently resolves into a call to __android_log_print() from cutils).
  Fix the build by taking into account ANDROID_CFLAGS and ANDROID_LIBS for the affected module on Android NDK builds.
  Fixes: d5b355ce5fd ("i965: Move intel_debug.h to intel/common/gen_debug.h")
  Signed-off-by: Tomasz Figa <[email protected]>
  Reviewed-by: Emil Velikov <[email protected]>
  (cherry picked from commit 50a8a7377ae071d5b4b927e9055a7ec8391acc59)
* i965: Fix broxton 2x6 l3 config | Anuj Phogat | 2017-06-28 | 1 | -0/+16
  The new table added in this patch matches the table in gfxspecs. We were programming the wrong values earlier.
  V2: Update the comment.
  Cc: "17.1" <[email protected]>
  Signed-off-by: Anuj Phogat <[email protected]>
  Reviewed-by: Francisco Jerez <[email protected]>
  (cherry picked from commit 8521559e086a3d56f549962ab8e9f45a6a5989d8)
  [Andres Gomez: gen 10 was not there yet on 17.1]
  Signed-off-by: Andres Gomez <[email protected]>
  Conflicts: src/intel/common/gen_l3_config.c
* i965: Add and initialize l3_banks field for gen7+ | Anuj Phogat | 2017-06-28 | 2 | -3/+27
  This new field helps simplify l3 way size computations in the next patch.
  V2: Initialize the l3_banks to 0 in macros.
  Suggested-by: Francisco Jerez <[email protected]>
  Signed-off-by: Anuj Phogat <[email protected]>
  Reviewed-by: Francisco Jerez <[email protected]>
  (cherry picked from commit eb23be1d97da290073d76c2510b8999b250f0139)
* anv: Fix L3 cache programming on Bay Trail | Jonas Kulla | 2017-06-28 | 1 | -1/+1
  Valid values for URBAllocation start at 32, so subtract that before programming the register. This was missed when porting from the GL driver.
  Cc: "17.1" <[email protected]>
  Reviewed-by: Francisco Jerez <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
  (cherry picked from commit a52ee32a9a49b48c51a80b8a35aa26bd583cabb7)
* anv: Require vertex buffers to come from a 32-bit heap | Jason Ekstrand | 2017-06-03 | 1 | -0/+12
  Reviewed-by: Nanley Chery <[email protected]>
  Cc: "17.1" <[email protected]>
  (cherry picked from commit 39adea9330376a64a4b5e8da98f5e055ebd3331e)
  Signed-off-by: Juan A. Suarez Romero <[email protected]>
* anv: Advertise both 32-bit and 48-bit heaps when we have enough memory | Jason Ekstrand | 2017-06-02 | 1 | -6/+36
  Reviewed-by: Nanley Chery <[email protected]>
  Cc: "17.1" <[email protected]>
  (cherry picked from commit 50d0eb5096bd9514821a641f25c0b3455c0f8a88)
  Signed-off-by: Juan A. Suarez Romero <[email protected]>
* anv: Refactor memory type setup | Jason Ekstrand | 2017-06-02 | 1 | -36/+40
  This makes us walk over the heaps one at a time and add the types for LLC and !LLC to each heap.
  Reviewed-by: Nanley Chery <[email protected]>
  Cc: "17.1" <[email protected]>
  (cherry picked from commit 34581fdd4f149894dfa51777a2f7eb289bd08b71)
  Signed-off-by: Juan A. Suarez Romero <[email protected]>
* anv: Make supports_48bit_addresses a heap property | Jason Ekstrand | 2017-06-02 | 2 | -3/+14
  Reviewed-by: Nanley Chery <[email protected]>
  Cc: "17.1" <[email protected]>
  (cherry picked from commit b83b1af6f6936f36db42a8f8b8e0854d0f9491fd)
  [Juan A. Suarez: resolve trivial conflicts]
  Signed-off-by: Juan A. Suarez Romero <[email protected]>
  Conflicts: src/intel/vulkan/anv_device.c
* anv: Stop setting BO flags in bo_init_new | Jason Ekstrand | 2017-06-02 | 3 | -7/+23
  The idea behind doing this was to make it easier to set various flags. However, we have enough custom flag settings floating around the driver that this is more of a nuisance than a help. This commit has the following functional changes:
  1) The workaround_bo created in anv_CreateDevice loses both flags. This shouldn't matter because it's very small and entirely internal to the driver.
  2) The bo created in anv_CreateDmaBufImageINTEL loses the EXEC_OBJECT_ASYNC flag. In retrospect, it never should have gotten EXEC_OBJECT_ASYNC in the first place.
  Reviewed-by: Nanley Chery <[email protected]>
  Cc: "17.1" <[email protected]>
  (cherry picked from commit 00df1cd9d6234cdfc9fb2bf3615196ff83a3c956)
  [Juan A. Suarez: resolve trivial conflicts]
  Signed-off-by: Juan A. Suarez Romero <[email protected]>
  Conflicts: src/intel/vulkan/anv_allocator.c src/intel/vulkan/anv_device.c src/intel/vulkan/anv_queue.c
* anv: Add valid_bufer_usage to the memory type metadata | Jason Ekstrand | 2017-06-02 | 2 | -8/+26
  Instead of returning valid types as just a number, we now walk the list and check the buffer's usage against the usage flags we store in the new anv_memory_type structure. Currently, valid_buffer_usage == ~0.
  Reviewed-by: Nanley Chery <[email protected]>
  Cc: "17.1" <[email protected]>
  (cherry picked from commit f7736ccf53eaeb66c4270afe0916e2cb29ab8667)
  [Juan A. Suarez: resolve trivial conflicts]
  Signed-off-by: Juan A. Suarez Romero <[email protected]>
  Conflicts: src/intel/vulkan/anv_device.c src/intel/vulkan/anv_private.h
* anv: Determine the type of mapping based on type metadata | Jason Ekstrand | 2017-06-02 | 2 | -7/+7
  Before, we were just comparing the type index to 0. Now we actually look the type up in the table and check its properties to determine what kind of mapping we want to do.
  Reviewed-by: Nanley Chery <[email protected]>
  Cc: "17.1" <[email protected]>
  (cherry picked from commit 92325a7efc769c32e03031323e21700dc55171e4)
  [Juan A. Suarez: resolve trivial conflicts]
  Signed-off-by: Juan A. Suarez Romero <[email protected]>
  Conflicts: src/intel/vulkan/anv_device.c src/intel/vulkan/anv_private.h
* anv: Set EXEC_OBJECT_ASYNC when available | Jason Ekstrand | 2017-06-02 | 8 | -4/+26
  Reviewed-by: Chad Versace <[email protected]>
  (cherry picked from commit 35e626bd0e59e7ce9fd97ccef66b2468c09206a4)
  Signed-off-by: Juan A. Suarez Romero <[email protected]>

  Squashed with:
  anv/tests: Create a dummy instance as well as device
  This fixes crashes caused by 35e626bd0e59e7ce9fd97ccef66b2468c09206a4 which made us start referencing the instance in the allocators. With this commit, the tests now happily pass again.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100877
  Tested-by: Vinson Lee <[email protected]>
  (cherry picked from commit 6ef1bd4fa57b36efc7919773fd26c36fd43d2ea9)
  Signed-off-by: Juan A. Suarez Romero <[email protected]>
* anv: automake: list shared libraries after the static ones | Emil Velikov | 2017-06-01 | 1 | -16/+15
  The compiler can discard the shared ones from the link chain, since there is no user (the static libraries) before it on the command line.
  Cc: [email protected]
  Reported-by: Laurent Carlier <[email protected]>
  Signed-off-by: Emil Velikov <[email protected]>
  Reviewed-by: Eduardo Lima Mitev <[email protected]>
  (cherry picked from commit 3e8790bff096a1a56bd1a3046c556a7f93b68ca8)
  Signed-off-by: Juan A. Suarez Romero <[email protected]>
* anv: Set image memory types based on the type count | Jason Ekstrand | 2017-06-01 | 1 | -2/+4
  Reviewed-by: Nanley Chery <[email protected]>
  Cc: "17.1" <[email protected]>
  (cherry picked from commit 10fad58b31ee2354330152ca4072327d228fc2e7)
  Signed-off-by: Juan A. Suarez Romero <[email protected]>
* anv: Set up memory types and heaps during physical device init | Jason Ekstrand | 2017-06-01 | 2 | -44/+81
  Reviewed-by: Nanley Chery <[email protected]>
  Cc: "17.1" <[email protected]>
  (cherry picked from commit c1f4343807d1040bd7b5440aa2f5fccf5f12842d)
  [Juan A. Suarez: resolve trivial conflicts]
  Signed-off-by: Juan A. Suarez Romero <[email protected]>
  Conflicts: src/intel/vulkan/anv_device.c src/intel/vulkan/anv_private.h
* anv: Predicate 48bit support on gen >= 8 | Jason Ekstrand | 2017-05-31 | 1 | -1/+6
  This doesn't matter right now since it only affects whether or not we set the kernel bit but, if we ever do anything else based on it, we'll want it to be correct per-gen.
  Reviewed-by: Nanley Chery <[email protected]>
  Cc: "17.1" <[email protected]>
  (cherry picked from commit eceaf7e2340fca0079300692733206b2af555bd9)
  Signed-off-by: Juan A. Suarez Romero <[email protected]>
* anv/image: Get rid of the memset(aux, 0, sizeof(aux)) hack | Jason Ekstrand | 2017-05-31 | 1 | -28/+0
  Up until now, we've been memsetting the auxiliary surface to 0 at BindImageMemory time to ensure that it is properly initialized. However, this isn't correct because apps are allowed to freely alias memory between different images and buffers so long as they properly track whether or not a particular image is valid and, if it isn't, transition from UNINITIALIZED to something else before using it. We now implement those transitions so we can drop the hack.
  Reviewed-by: Nanley Chery <[email protected]>
  Cc: "17.1" <[email protected]>
  (cherry picked from commit 4eecd534f0544b62ae831a97708ade007541bd32)
  Signed-off-by: Juan A. Suarez Romero <[email protected]>
* anv: Handle transitioning depth from UNDEFINED to other layouts | Jason Ekstrand | 2017-05-31 | 2 | -19/+19
  Reviewed-by: Nanley Chery <[email protected]>
  Cc: "17.1" <[email protected]>
  (cherry picked from commit cc45c4bb8072b6593812f9b68a7b3d2d00bfb9f0)
  Signed-off-by: Juan A. Suarez Romero <[email protected]>
* anv: Handle color layout transitions from the UNINITIALIZED layout | Jason Ekstrand | 2017-05-31 | 3 | -2/+108
  This causes dEQP-VK.api.copy_and_blit.resolve_image.partial.* to start failing due to test bugs. See CL 1031 for a test fix.
  Reviewed-by: Nanley Chery <[email protected]>
  Cc: "17.1" <[email protected]>
  (cherry picked from commit 75edecf5020a9b833ff7e2929f64ceb11c9df679)
  Signed-off-by: Juan A. Suarez Romero <[email protected]>
* configure: check once for DRI3 dependencies | Emil Velikov | 2017-05-31 | 1 | -2/+1
  Currently the XCB_DRI3 dependencies are partially duplicated. Just do a one-off check and add all of the respective CFLAGS/LIBS where needed.
  As a nice side effect this helps us solve a couple of FIXMEs.
  DRI3 is not a thing w/o X11 so disable it in such cases.
  Cc: [email protected]
  Signed-off-by: Emil Velikov <[email protected]>
  Reviewed-by: Eric Engestrom <[email protected]>
  (cherry picked from commit acf3d2afab0571b74c0c0d1aee0f631b33fdc7da)
  Signed-off-by: Juan A. Suarez Romero <[email protected]>

  Squashed with:
  configure.ac: add xcb-fixes to the XCB DRI3 list
  The XCB module is used by the VL targets. Thus omitting it can lead to link-time errors due to unresolved symbols.
  Other DRI3 users such as the Vulkan WSI and the dri3 loader helper do not use an update region in their xcb_present_pixmap() call. We will look into that at a later stage.
  Fixes: acf3d2afab0 ("configure: check once for DRI3 dependencies")
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101110
  Signed-off-by: Emil Velikov <[email protected]>
  (cherry picked from commit 9a90d6a9d4ee1632aa357a2ac9be150e058e2c10)
  Signed-off-by: Juan A. Suarez Romero <[email protected]>

  Squashed with:
  configure.ac: s/xcb-fixes/xcb-xfixes/
  Former is not a thing, even if I have a hacked xcb-fixes.pc on my system. Thanks for spotting it Mark!
  Fixes: 9a90d6a9d4e ("configure.ac: add xcb-fixes to the XCB DRI3 list")
  Signed-off-by: Emil Velikov <[email protected]>
  (cherry picked from commit 48cd1919ff1584c211ec7958864cac2e1cb347cf)
  Signed-off-by: Juan A. Suarez Romero <[email protected]>
* anv/formats: Update the three-channel BC1 mappings | Nanley Chery | 2017-05-19 | 1 | -2/+2
  The procedure for decompressing an opaque BC1 Vulkan format is dependent on the comparison of two colors stored in the first 32 bits of the compressed block. Here's the specified OpenGL (and Vulkan) behavior for reference:

    The RGB color for a texel at location (x,y) in the block is given by:

      RGB0,              if color0 > color1 and code(x,y) == 0
      RGB1,              if color0 > color1 and code(x,y) == 1
      (2*RGB0+RGB1)/3,   if color0 > color1 and code(x,y) == 2
      (RGB0+2*RGB1)/3,   if color0 > color1 and code(x,y) == 3

      RGB0,              if color0 <= color1 and code(x,y) == 0
      RGB1,              if color0 <= color1 and code(x,y) == 1
      (RGB0+RGB1)/2,     if color0 <= color1 and code(x,y) == 2
      BLACK,             if color0 <= color1 and code(x,y) == 3

  The sampling operation performed on an opaque DXT1 Intel format essentially hard-codes the comparison result of the two colors as color0 > color1. This means that the behavior is incompatible with OpenGL and Vulkan. This is stated in the SKL PRM, Vol 5: Memory Views:

    Opaque Textures (DXT1_RGB)
    Texture format DXT1_RGB is identical to DXT1, with the exception that the One-bit Alpha encoding is removed. Color 0 and Color 1 are not compared, and the resulting texel color is derived strictly from the Opaque Color Encoding. The alpha channel defaults to 1.0.

    Programming Note
    Context: Opaque Textures (DXT1_RGB)
    The behavior of this format is not compliant with the OGL spec.

  The opaque and non-opaque BC1 Vulkan formats are specified to be decoded in exactly the same way except the BLACK value must have a transparent alpha channel in the latter. Use the four-channel BC1 Intel formats with the alpha set to 1 to provide the behavior required by the spec.
  v2 (Kenneth Graunke):
  - Provide a more detailed commit message.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100925
  Cc: <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
  Signed-off-by: Nanley Chery <[email protected]>
  (cherry picked from commit 56458cb168bf79ae51ba1efc3acec15874cc34a9)
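The decoding table quoted in the BC1 message above can be written out as a short C sketch. The struct and function names are hypothetical, and truncating integer division is an assumption made for brevity; it is not a statement about how the specification or the hardware rounds.

```c
#include <assert.h>
#include <stdint.h>

struct rgb { int r, g, b; };

/* Weighted blend (w0*a + w1*b) / div, per channel. */
static struct rgb blend(struct rgb a, struct rgb b, int w0, int w1, int div)
{
   struct rgb out = {
      (w0 * a.r + w1 * b.r) / div,
      (w0 * a.g + w1 * b.g) / div,
      (w0 * a.b + w1 * b.b) / div,
   };
   return out;
}

/* color0/color1 are the raw 16-bit endpoint values that get compared;
 * rgb0/rgb1 are the same endpoints already expanded to RGB. */
static struct rgb bc1_texel_rgb(uint16_t color0, uint16_t color1,
                                struct rgb rgb0, struct rgb rgb1,
                                unsigned code)
{
   const struct rgb black = { 0, 0, 0 };

   assert(code < 4);
   if (code == 0)
      return rgb0;
   if (code == 1)
      return rgb1;

   if (color0 > color1) /* four-color mode */
      return code == 2 ? blend(rgb0, rgb1, 2, 1, 3)
                       : blend(rgb0, rgb1, 1, 2, 3);

   /* color0 <= color1: three colors plus BLACK */
   return code == 2 ? blend(rgb0, rgb1, 1, 1, 2) : black;
}
```

A decoder that always takes the color0 > color1 branch, as the message says the opaque DXT1 Intel format does, would return a blend instead of BLACK for code 3 when color0 <= color1.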
* Android: correct libz dependency | Chih-Wei Huang | 2017-05-18 | 1 | -1/+1
  Commit 6facb0c0 ("android: fix libz dynamic library dependencies") unconditionally adds libz as a dependency to all shared libraries. That is unnecessary.
  Commit 85a9b1b5 introduced libz as a dependency to libmesa_util. So only the shared libraries that use libmesa_util need libz.
  Fix Android Lollipop build by adding the include path of zlib to libmesa_util explicitly instead of getting the path implicitly from zlib since it doesn't export the include path in Lollipop.
  Fixes: 6facb0c0 "android: fix libz dynamic library dependencies"
  Signed-off-by: Chih-Wei Huang <[email protected]>
  Reviewed-by: Tapani Pälli <[email protected]>
  Reviewed-by: Rob Herring <[email protected]>
  (cherry picked from commit bfc0c23843008fd510afa263ebe371bef3346445)
* anv: don't leak DRM devices | Grazvydas Ignotas | 2017-05-18 | 1 | -0/+1
  After a successful drmGetDevices2() call, drmFreeDevices() needs to be called.
  Fixes: b1fb6e8d "anv: do not open random render node(s)"
  Signed-off-by: Grazvydas Ignotas <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]> # radv version
  (cherry picked from commit 0ef302638f2883789a3b39c2b6cfd20814efa0bb)
* anv: fix possible stack corruption | Grazvydas Ignotas | 2017-05-18 | 1 | -1/+1
  drmGetDevices2 takes a count and not a size. This probably hasn't caused problems in practice yet, and was missed because setups with more than 8 DRM devices are not very common.
  Fixes: b1fb6e8d "anv: do not open random render node(s)"
  Signed-off-by: Grazvydas Ignotas <[email protected]>
  Reviewed-by: Lionel Landwerlin <[email protected]>
  (cherry picked from commit e0aee8b667955675e2e6c647a88048b64bc2796e)
* i965/vec4: load dvec3/4 uniforms first in the push constant buffer | Samuel Iglesias Gonsálvez | 2017-05-18 | 1 | -27/+80
  Reorder the uniforms to load the dvec4-aligned variables first in the push constant buffer and then push the vec4-aligned ones. It takes into account that the relocated uniforms should be aligned to their channel size.
  This fixes a bug where a dvec3/4 might be loaded partly in one GRF and partly in the next GRF, so the region parameters needed to read it could break the HW rules.
  v2:
  - Fix broken logic.
  - Add a comment to explain what should be needed to optimise the usage of the push constant buffer slots, as this patch does not pack the uniforms.
  v3:
  - Implemented the push constant buffer usage optimization.
  Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
  Cc: "17.1" <[email protected]>
  Acked-by: Francisco Jerez <[email protected]>
  (cherry picked from commit e69e5c7006da80af62c9ef08dec215b3b4b30946)
* i965/vec4: fix swizzle and writemask when loading an uniform with constant offset | Samuel Iglesias Gonsálvez | 2017-05-18 | 1 | -4/+11
  It was setting XYZW swizzle and writemask on all uniforms, no matter if they were a vector or scalar, so this can lead to problems when loading them to the push constant buffer.
  Moreover, the 'shift' calculation was designed to calculate the offset in DWORDS, but it doesn't take into account DFs, so the calculated swizzle for the latter was wrong.
  The indirect case is not changed because MOV INDIRECT will write to all components. Added an assert to verify that these uniforms are aligned.
  v2:
  - Fix 'shift' calculation (Curro)
  - Set both swizzle and writemask.
  - Add assert(shift == 0) for the indirect case.
  Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
  Cc: "17.1" <[email protected]>
  Reviewed-by: Francisco Jerez <[email protected]>
  (cherry picked from commit 8aa6ada8384a961b37dfefec7f9e40e5a4e27ce7)
* i965/vec4/gs: restore the uniform values which was overwritten by failed vec4_gs_visitor execution | Samuel Iglesias Gonsálvez | 2017-05-18 | 1 | -0/+26
  We are going to add a packing feature to reduce the usage of the push constant buffer. One of the consequences is that 'nr_params' would be modified by vec4_visitor's run call, so we need to restore it if one of the runs failed before executing the fallback ones. The same thing happens to the uniform values, which would be reordered afterwards.
  Fixes GL45-CTS.arrays_of_arrays_gl.InteractionFunctionCalls2 when the dvec4 alignment and packing patch is applied.
  Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
  Cc: "17.1" <[email protected]>
  Acked-by: Francisco Jerez <[email protected]>
  (cherry picked from commit 354f7f2cb9c7206e12646c79d8ff5becbaffa61b)
* intel/isl/gen7: Use stencil vertical alignment of 8 instead of 4 | Pohjolainen, Topi | 2017-05-18 | 1 | -23/+5
  The reasoning Chad gave in the comment for choosing a valign of 4 is entirely bunk. The fact that you have to multiply pitch by 2 is completely unrelated to the halign/valign parameters used for texture layout. (Not completely unrelated. W-tiling is just Y-tiling with a bit of extra swizzling which turns 8x8 W-tiled chunks into 16x4 y-tiled chunks so it makes everything easier if miplevels are always aligned to 8x8.) The fact that RENDER_SURFACE_STATE::SurfaceVerticalAlignment doesn't have a VALIGN_8 option doesn't matter since this is gen7 and you can't do stencil texturing anyway.
  v2 (Jason Ekstrand):
  - Delete most of Chad's comment and add a more descriptive commit message.
  Signed-off-by: Topi Pohjolainen <[email protected]>
  Cc: "17.0 17.1" <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Chad Versace <[email protected]>
  (cherry picked from commit 236f17a9f73935db6cddafd91e53a5fae34aae6e)
* anv: anv_gem_mmap() returns MAP_FAILED as mapping error | Samuel Iglesias Gonsálvez | 2017-05-08 | 2 | -6/+4
  Take it into account when checking if the mapping failed.
  v2:
  - Remove map == NULL and its related comment (Emil)
  Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
  Reviewed-by: Emil Velikov <[email protected]>
  Fixes: 6f3e3c715a7 ("vk/allocator: Add a BO pool")
  Fixes: 9919a2d34de ("anv/image: Memset hiz surfaces to 0 when binding memory")
  Cc: "17.0 17.1" <[email protected]>
  (cherry picked from commit b546c9d318731b988aa3d8c4e4735cdbb596cfbf)

  Squashed with:
  anv: vkBindImageMemory() should return VK_ERROR_OUT_OF_{HOST,DEVICE}_MEMORY on failure
  According to the spec we get VK_ERROR_OUT_OF_HOST_MEMORY or VK_ERROR_OUT_OF_DEVICE_MEMORY on vkBindImageMemory failure.
  Fixes returned value changed by b546c9d.
  Fixes: b546c9d ("anv: anv_gem_mmap() returns MAP_FAILED as mapping error")
  Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
  Cc: "17.0 17.1" <[email protected]>
  Reviewed-by: Emil Velikov <[email protected]>
  (cherry picked from commit 939b015736d5091faeabde4f5a373e6a1612c5ed)

  Squashed with:
  anv: fix anv_gem_mmap comment to not mention NULL
  The function cannot return NULL, update the comment accordingly.
  Fixes: b546c9d ("anv: anv_gem_mmap() returns MAP_FAILED as mapping error")
  Signed-off-by: Emil Velikov <[email protected]>
  Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
  (cherry picked from commit 9d2aa6e5067752efbc0acbd728bc0bde49aefb61)
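The MAP_FAILED convention that the commit above relies on can be illustrated with plain POSIX mmap(). This is only a sketch of the convention: the wrapper name is hypothetical and it does not reproduce anv_gem_mmap(), which goes through the kernel driver rather than calling mmap() like this.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical wrapper: maps `size` bytes of `fd` and returns NULL on
 * failure so that callers can use an ordinary NULL check. */
static void *map_or_null(int fd, size_t size)
{
   assert(size > 0);

   void *map = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

   /* mmap() reports failure as MAP_FAILED, which is (void *) -1 and
    * therefore not caught by a `map == NULL` test. */
   if (map == MAP_FAILED)
      return NULL;

   return map;
}
```

Passing an invalid descriptor such as -1 makes mmap() fail, so the wrapper returns NULL, whereas a bare `map == NULL` check on the raw result would have treated the failure as success.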
* i965/vec4: don't modify regioning parameters to the sources of DF align1 instructions | Samuel Iglesias Gonsálvez | 2017-05-05 | 1 | -8/+1
  The regioning parameters are now properly set by convert_to_hw_regs() and we don't need to fix them in the generator. That latter fix previously done in the generator was strictly speaking wrong for any non-identity regions.
  Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
  Cc: "17.1" <[email protected]>
  Reviewed-by: Francisco Jerez <[email protected]>
  (cherry picked from commit f57e234fdd52331d0aa6656a36efdebea9d11e9d)
* i965/vec4: fix register width for DF VGRF and UNIFORM | Samuel Iglesias Gonsálvez | 2017-05-05 | 1 | -5/+7
  On gen7, the swizzles used in DF align16 instructions work for an element size of 32 bits, so we can address only 2 consecutive DFs. As we assumed that in the rest of the code and prepare the instructions for this (scalarize_df()), we need to set it to two again.
  However, for DF align1 instructions, a width of 2 is wrong as we are not reading the data we want. For example, a uniform would have a region of <0, 2, 1> so it would repeat the first 2 DFs, when we wanted to access the first 4.
  This patch sets the default one to 4 and then modifies the width of align16 instruction's DF sources when we translate the logical swizzle to the physical one.
  v2:
  - Remove conditional (Curro).
  Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
  Cc: "17.1" <[email protected]>
  Reviewed-by: Francisco Jerez <[email protected]>
  (cherry picked from commit aaeb1c99beed39d85c300ebdb8a7bf056ee6717c)
* i965/vec4: fix vertical stride to avoid breaking region parameter rule | Samuel Iglesias Gonsálvez | 2017-05-05 | 1 | -18/+32
  From IVB PRM, vol4, part3, "General Restrictions on Regioning Parameters":
    "If ExecSize = Width and HorzStride ≠ 0, VertStride must be set to Width * HorzStride."
  In the next patch, we are going to modify the region parameter for uniforms and vgrf. For uniforms that are the source of DF align1 instructions, they will have <0, 4, 1> regioning and the execsize for those instructions will be 4, so they will break the regioning rule. This will be the same for VGRF sources where we use the vstride == 0 exploit.
  As we know we are not going to cross the GRF boundary with that execsize and parameters (not even with the exploit), we just fix the vstride here.
  v2:
  - Move is_align1_df() (Curro)
  - Refactor exec_size == width calculation (Curro)
  Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
  Cc: "17.1" <[email protected]>
  Reviewed-by: Francisco Jerez <[email protected]>
  (cherry picked from commit 7f728bce811fc283e672e3a07b008bb7b52de35e)
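The PRM restriction quoted above can be expressed as a small predicate. This is a sketch with a hypothetical function name; the strides are taken as logical element counts, not as the encoded hardware field values.

```c
#include <assert.h>
#include <stdbool.h>

/* Checks the quoted rule for a region <vstride, width, hstride> read by
 * an instruction with the given exec_size. */
static bool region_vstride_ok(unsigned exec_size, unsigned width,
                              unsigned hstride, unsigned vstride)
{
   assert(width != 0);

   if (exec_size == width && hstride != 0)
      return vstride == width * hstride;

   return true; /* the quoted restriction does not apply */
}
```

Under this reading, a <0, 4, 1> region with an exec size of 4 fails the check, while <4, 4, 1> passes, which matches the vstride adjustment the message describes.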
* anv/cmd_buffer: Use the device allocator for QueueSubmit | Jason Ekstrand | 2017-04-30 | 1 | -3/+3
  The command is really operating on a Queue not a command buffer and the nearest object to that with an allocator is VkDevice.
  Reviewed-by: Chad Versace <[email protected]>
  Cc: "17.0 17.1" <[email protected]>
  (cherry picked from commit bd3a9813b92bd2e116b58f0932bc7f1f722a9f63)
* anv: Don't place scratch buffers above the 32-bit boundary | Jason Ekstrand | 2017-04-30 | 1 | -0/+19
  This fixes rendering corruptions in DOOM. Hopefully, it will also make Jenkins a bit more stable as we've been seeing some random failures and GPU hangs ever since turning on 48bit.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100620
  Fixes: 651ec926fc1 "anv: Add support for 48-bit addresses"
  Tested-by: Grazvydas Ignotas <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
  Cc: "17.1" <[email protected]>
  (cherry picked from commit c43b4bc85eddba8bc31665cfee5928bed8343516)
* intel/fs: Take into account amount of data read in spilling cost heuristic. | Francisco Jerez | 2017-04-30 | 1 | -1/+1
  Until now the spilling cost calculation was neglecting the amount of data read from the register. This caused it to make suboptimal decisions in some cases, leading to higher memory bandwidth usage than necessary.
  Improves Unigine Heaven performance by ~4% on BDW, reversing an unintended FPS regression from my previous commit 147e71242ce539ff28e282f009c332818c35f5ac with n=12 and statistical significance 5%. In addition SynMark2 OglCSDof performance is improved by an additional ~5% on SKL, and a Kerbal Space Program apitrace around the Moho planet I can provide on request improves by ~20%.
  Cc: <[email protected]>
  Reviewed-by: Plamena Manolova <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
  (cherry picked from commit 58324389be7bc7c5e10093b9cc0a8efa9b4c93a9)
* intel/fs: Use regs_written() in spilling cost heuristic for improved accuracy. | Francisco Jerez | 2017-04-30 | 1 | -2/+1
  This is what we use later on to compute the number of registers that will actually get spilled to memory, so it's more likely to match reality than the current open-coded approximation.
  Cc: <[email protected]>
  Reviewed-by: Plamena Manolova <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
  (cherry picked from commit ecc19e12dca95d2571d3761dea6dec24b061013c)