summaryrefslogtreecommitdiffstats
path: root/src/intel
Commit message (Collapse)AuthorAgeFilesLines
* anv: Enable VK_KHR_16bit_storage for PushConstantJose Maria Casanova Crespo2018-02-281-1/+1
| | | | | | Enables storagePushConstant16 features of VK_KHR_16bit_storage for Gen8+. Reviewed-by: Jason Ekstrand <[email protected]>
* spirv/i965/anv: Relax push constant offset assertions being 32-bit alignedJose Maria Casanova Crespo2018-02-282-7/+10
| | | | | | | | | | | | | | | | The introduction of 16-bit types with VK_KHR_16bit_storages implies that push constant offsets could be multiple of 2-bytes. Some assertions are updated so offsets should be just multiple of size of the base type but in some cases we can not assume it as doubles aren't aligned to 8 bytes in some cases. For 16-bit types, the push constant offset takes into account the internal offset in the 32-bit uniform bucket adding 2-bytes when we access not 32-bit aligned elements. In all 32-bit aligned cases it just becomes 0. v2: Assert offsets to be aligned to the dest type size. (Jason Ekstrand) Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Enable VK_KHR_16bit_storage for SSBO and UBOJose Maria Casanova Crespo2018-02-282-3/+4
| | | | | | | Enables storageBuffer16BitAccess and uniformAndStorageBuffer16BitAccesss features of VK_KHR_16bit_storage for Gen8+. Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Support 16-bit store_ssbo with VK_KHR_relaxed_block_layoutJose Maria Casanova Crespo2018-02-281-7/+15
| | | | | | | | | | | | | | | | Restrict the use of untyped_surface_write with 16-bit pairs in ssbo to the cases where we can guarantee that offset is multiple of 4. Taking into account that VK_KHR_relaxed_block_layout is available in ANV we can only guarantee that when we have a constant offset that is multiple of 4. For non constant offsets we will always use byte_scattered_write. v2: (Jason Ekstrand) - Assert offset_reg to be multiple of 4 if it is immediate. Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Support 16-bit do_read_vector with VK_KHR_relaxed_block_layoutJose Maria Casanova Crespo2018-02-281-14/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | 16-bit load_ubo/ssbo operations that call do_untyped_read_vector don't guarantee that offsets are multiple of 4-bytes as required by untyped_read message. This happens for example in the case of f16mat3x3 when then VK_KHR_relaxed_block_layout is enabled. Vectors reads when we have non-constant offsets are implemented with multiple byte_scattered_read messages that not require 32-bit aligned offsets. Now for all constant offsets we can use the untyped_read_surface message. In the case of constant offsets not aligned to 32-bits, we calculate a start offset 32-bit aligned and use the shuffle_32bit_load_result_to_16bit_data function and the first_component parameter to skip the copy of the unneeded component. v2: (Jason Ekstrand) Use untyped_read_surface messages always we have constant offsets. v3: (Jason Ekstrand) Simplify loop for reads with non constant offsets. Use end - start to calculate the number of 32-bit components to read with constant offsets. Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: shuffle_32bit_load_result_to_16bit_data now skips componentsJose Maria Casanova Crespo2018-02-283-3/+6
| | | | | | | | | | | | | | | | This helper used to load 16bit components from 32-bits read now allows skipping components with the new parameter first_component. The semantics now skip components until we reach the first_component, and then reads the number of components passed to the function. All previous uses of the helper are updated to use 0 as first_component. This will allow read 16-bit components when the first one is not aligned 32-bit. Enabling more usages of untyped_reads with 16-bit types. v2: (Jason Ektrand) Change parameters order to first_component, num_components Reviewed-by: Jason Ekstrand <[email protected]>
* isl/i965/fs: SSBO/UBO buffers need size padding if not multiple of 32-bitJose Maria Casanova Crespo2018-02-283-2/+62
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The surfaces that backup the GPU buffers have a boundary check that considers that access to partial dwords are considered out-of-bounds. For example, buffers with 1,3 16-bit elements has size 2 or 6 and the last two bytes would always be read as 0 or its writting ignored. The introduction of 16-bit types implies that we need to align the size to 4-bytew multiples so that partial dwords could be read/written. Adding an inconditional +2 size to buffers not being multiple of 2 solves this issue for the general cases of UBO or SSBO. But, when unsized arrays of 16-bit elements are used it is not possible to know if the size was padded or not. To solve this issue the implementation calculates the needed size of the buffer surfaces, as suggested by Jason: surface_size = isl_align(buffer_size, 4) + (isl_align(buffer_size, 4) - buffer_size) So when we calculate backwards the buffer_size in the backend we update the resinfo return value with: buffer_size = (surface_size & ~3) - (surface_size & 3) It is also exposed this buffer requirements when robust buffer access is enabled so these buffer sizes recommend being multiple of 4. v2: (Jason Ekstrand) Move padding logic fron anv to isl_surface_state. Move calculus of original size from spirv to driver backend. v3: (Jason Ekstrand) Rename some variables and use a similar expresion when calculating. padding than when obtaining the original buffer size. Avoid use of unnecesary component call at brw_fs_nir. v4: (Jason Ekstrand) Complete comment with buffer size calculus explanation in brw_fs_nir. Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Always set has_context_priorityJason Ekstrand2018-02-281-3/+1
| | | | | | | | | | We don't zalloc the physical device so we need to unconditionally set everything. Crucible helpfully initializes all allocations to 139 so it was getting true regardless of whether or not the kernel actually supports context priorities. Fixes: 6d8ab53303331 "anv: implement VK_EXT_global_priority extension" Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Don't emit MOVs with undefined registers for Gen4 point clipping.Kenneth Graunke2018-02-281-1/+1
| | | | | | | | | | | | Gen4 point clipping calls brw_clip_tri_alloc_regs with nr_verts == 0, which means that c->reg.vertex[] isn't initialized. It then emits MOVs to stomp components of those uninitialized registers to 0. This started causing assertions after Matt's recent series, when those uninitialized registers started getting BRW_REGISTER_TYPE_NF, which definitely doesn't exist on Gen4-5. Reviewed-by: Matt Turner <[email protected]>
* intel/compiler: Re-add .vs_inputs_dual_locations = trueMatt Turner2018-02-281-0/+1
| | | | | | Looks like a rebase mistake. Fixes: 89fe5190a256 ("intel/compiler: Lower flrp32 on Gen11+")
* intel/compiler: Add ICL to test_eu_validate.cppMatt Turner2018-02-281-0/+1
| | | | | | | With the Align16 tests now disabled, we can run the rest of the tests in ICL mode (and see them pass!) Reviewed-by: Kenneth Graunke <[email protected]>
* intel/compiler: Disable Align16 tests on Gen11+Matt Turner2018-02-281-0/+16
| | | | | | Align16 is no more. Reviewed-by: Kenneth Graunke <[email protected]>
* intel/compiler: Add instruction compaction support on Gen11Matt Turner2018-02-281-0/+42
| | | | | | Gen11 only differs from SKL+ in that it uses a new datatype index table. Reviewed-by: Kenneth Graunke <[email protected]>
* intel/compiler: Mark line, pln, and lrp as removed on Gen11+Matt Turner2018-02-281-4/+6
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* intel/compiler: Lower flrp32 on Gen11+Matt Turner2018-02-285-17/+26
| | | | | | The LRP instruction is no more. Reviewed-by: Kenneth Graunke <[email protected]>
* intel/compiler/fs: Implement ddy without using align16 for Gen11+Matt Turner2018-02-281-8/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Align16 is no more. We previously generated an align16 ADD instruction to calculate DDY: add(16) g25<1>F -g23<4>.xyxyF g23<4>.zwzwF { align16 1H }; Without align16, we now implement it as: add(4) g25<1>F -g23<0,2,1>F g23.2<0,2,1>F { align1 1N }; add(4) g25.4<1>F -g23.4<0,2,1>F g23.6<0,2,1>F { align1 1N }; add(4) g26<1>F -g24<0,2,1>F g24.2<0,2,1>F { align1 1N }; add(4) g26.4<1>F -g24.4<0,2,1>F g24.6<0,2,1>F { align1 1N }; where only the first two instructions are needed in SIMD8 mode. Note: an earlier version of the patch implemented this in two instructions in SIMD16: add(8) g25<2>F -g23<4,2,0>F g23.2<4,2,0>F { align1 1N }; add(8) g25.1<2>F -g23.1<4,2,0>F g23.3<4,2,0>F { align1 1N }; but I realized that the channel enable bits will not be correct. If we knew we were under uniform control flow, we could emit only those two instructions however. Reviewed-by: Kenneth Graunke <[email protected]>
* intel/compiler/fs: Simplify ddx/ddy code generationMatt Turner2018-02-281-42/+21
| | | | | | The brw_reg() constructor just obfuscates things here, in my opinion. Reviewed-by: Kenneth Graunke <[email protected]>
* intel/compiler/fs: Pass fs_inst to generate_ddx/ddy instead of opcodeMatt Turner2018-02-282-8/+10
| | | | | | | In a future patch, generate_ddy will want to inspect inst->exec_size. Change generate_ddx as well for consistency. Reviewed-by: Kenneth Graunke <[email protected]>
* intel/compiler/fs: Don't generate integer DWord multiply on Gen11Matt Turner2018-02-283-5/+6
| | | | | | | Like CHV et al., Gen11 does not support 32x32 -> 32/64-bit integer multiplies. Reviewed-by: Kenneth Graunke <[email protected]>
* intel/compiler/fs: Implement FS_OPCODE_LINTERP with MADs on Gen11+Matt Turner2018-02-282-4/+46
| | | | | | | | | | | | | | | | | | | The PLN instruction is no more. Its functionality is now implemented using two MAD instructions with the new native-float type. Instead of pln(16) r20.0<1>:F r10.4<0;1,0>:F r4.0<8;8,1>:F we now have mad(8) acc0<1>:NF r10.7<0;1,0>:F r4.0<8;8,1>:F r10.4<0;1,0>:F mad(8) r20.0<1>:F acc0<8;8,1>:NF r5.0<8;8,1>:F r10.5<0;1,0>:F mad(8) acc0<1>:NF r10.7<0;1,0>:F r6.0<8;8,1>:F r10.4<0;1,0>:F mad(8) r21.0<1>:F acc0<8;8,1>:NF r7.0<8;8,1>:F r10.5<0;1,0>:F ... and in the case of SIMD8 only the first pair of MAD instructions is used. Reviewed-by: Kenneth Graunke <[email protected]>
* intel/compiler/fs: Return multiple_instructions_emitted from generate_linterpMatt Turner2018-02-282-4/+8
| | | | | | | If multiple instructions are emitted, special handling of things like conditional mod and NoDDClr/NoDDChk need to be performed. Reviewed-by: Kenneth Graunke <[email protected]>
* intel/compiler/fs: Fix application of cmod and saturate to LINE/MAC pairMatt Turner2018-02-281-2/+11
| | | | | | | | | This isn't technically broken, but the next patch will make this function report whether it generated multiple instructions, and that information will be used to disable the application of conditional mod by the generic code. Reviewed-by: Kenneth Graunke <[email protected]>
* intel/compiler: Add Gen11+ native float typeMatt Turner2018-02-286-2/+32
| | | | | | | | | | | | This new type exposes the additional precision offered by the accumulator register and will be used in the next patch to implement the functionality of the PLN instruction using a pair of MAD instructions. One weird thing to note: align1 ternary instructions may only have an accumulator in the dst or src1 normally, but when src0's type is :NF the accumulator is read. Reviewed-by: Kenneth Graunke <[email protected]>
* intel/compiler: Add Gen11 register typesMatt Turner2018-02-281-8/+65
| | | | | | | The hardware register types' encodings have changed on Gen11. Good thing we have that superfluous looking brw_reg_type abstraction lying around! Reviewed-by: Kenneth Graunke <[email protected]>
* intel: Disable 64-bit extensions on platforms without 64-bit typesMatt Turner2018-02-282-0/+4
| | | | | | | | Gen11 does not support DF, Q, UQ types in hardware. As a result, we have to disable some GL extensions until they can be reimplemented. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* intel: Add icl pci id for INTEL_DEVID_OVERRIDEAnuj Phogat2018-02-281-0/+1
| | | | | Reviewed-by: Matt Turner <[email protected]> Signed-off-by: Anuj Phogat <[email protected]>
* intel: Add a preliminary device for Ice LakeAnuj Phogat2018-02-281-1/+56
| | | | | Reviewed-by: Kenneth Graunke <[email protected]> Signed-off-by: Anuj Phogat <[email protected]>
* anv: remove anv_gem_set_context_priority helperTapani Pälli2018-02-283-12/+3
| | | | | | | | anv_gem_set_context_param is to be used directly instead! Fixes: 6d8ab53303 "anv: implement VK_EXT_global_priority extension" Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: implement VK_EXT_global_priority extensionTapani Pälli2018-02-285-0/+95
| | | | | | | | | | | | | | | | v2: add ANV_CONTEXT_REALTIME_PRIORITY (Chris) use unreachable with unknown priority (Samuel) v3: add stubs in gem_stubs.c (Emil) use priority defines from gen_defines.h v4: cleanup, add anv_gem_set_context_param (Jason) Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> (v2) Reviewed-by: Chris Wilson <[email protected]> (v2) Reviewed-by: Emil Velikov <[email protected]> (v3) Reviewed-by: Jason Ekstrand <[email protected]>
* intel: add new common header gen_defines.hTapani Pälli2018-02-282-0/+55
| | | | | | Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Chris Wilson <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* anv: set maxResourceSize to the respective value for each generationSamuel Iglesias Gonsálvez2018-02-281-1/+14
| | | | | | | | v2: - Add the proper values to gen9+ (Jason) Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir: add lower_ldexp to nir compiler optionsTimothy Arceri2018-02-281-0/+1
| | | | Reviewed-by: Marek Olšák <[email protected]>
* intel/ir: Fix invalid type aliasing with undefined behavior in test_eu_compact.Francisco Jerez2018-02-271-3/+3
| | | | | | | | | | | | | | | | | test_fuzz_compact_instruction() was attempting to modify the uint64_t data array of a brw_inst through a pointer to uint32_t, which has undefined behavior. This was causing the test_eu_compact unit test to fail mysteriously for me on GCC 7 with some additional harmless-looking changes I had applied to my tree, which happened to affect the order instructions are emitted by GCC causing the bit twiddling to be done after the clear_pad_bits() call which is supposed to overwrite the same data through a pointer of different type, leading to data corruption. A similar failure has been reported by Vinson Lee on the master branch built with GCC 8. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105052 Tested-by: Vinson Lee <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* intel/tools: Use gen_device_name_to_pci_device_id in aubinatorJordan Justen2018-02-271-24/+6
| | | | | | Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Scott D Phillips <[email protected]>
* intel/common: Add gen_device_name_to_pci_device_idJordan Justen2018-02-272-6/+14
| | | | | | Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Scott D Phillips <[email protected]>
* intel/vulkan: Support INTEL_DEVID_OVERRIDE environment variableJordan Justen2018-02-271-4/+10
| | | | | | Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Scott D Phillips <[email protected]>
* intel/common: Add gen_get_pci_device_id_overrideJordan Justen2018-02-272-0/+52
| | | | | | Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Scott D Phillips <[email protected]>
* intel/vulkan: Support INTEL_NO_HW environment variableJordan Justen2018-02-273-1/+6
| | | | | | Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Scott D Phillips <[email protected]>
* android: fix source files path for libmesa_anv_gen11Harish Krupo2018-02-271-1/+1
| | | | | Signed-off-by: Harish Krupo <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
* intel: aubinator_error_decode: fix segfault on missing registerLionel Landwerlin2018-02-261-1/+2
| | | | | | | | Some register might be missing in our genxmls. Don't try to decode them. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* android: anv: add dependency on libnativewindow for O and laterMauro Rossi2018-02-261-8/+14
| | | | | | | | | | | | | | | | | | | | | | | | Similar to 90dd6e5 ("Android: egl: add dependency on libnativewindow") Fixes the following building errors: In file included from external/mesa/src/intel/vulkan/gen7_cmd_buffer.c:30: In file included from external/mesa/src/intel/vulkan/anv_private.h:72: external/mesa/include/vulkan/vk_android_native_buffer.h:22:10: fatal error: 'system/window.h' file not found ^~~~~~~~~~~~~~~~~ 1 error generated. ... In file included from external/mesa/src/intel/vulkan/anv_gem.c:32: In file included from external/mesa/src/intel/vulkan/anv_private.h:72: external/mesa/include/vulkan/vk_android_native_buffer.h:22:10: fatal error: 'system/window.h' file not found ^~~~~~~~~~~~~~~~~ 1 error generated. Cc: "18.0" <[email protected]> Reviewed-by: Tapani Pälli <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* android: anv/extensions: fix generated sources buildMauro Rossi2018-02-261-3/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | Building rules are aligned to automake ones The correct script to build anv_extensions.{c,h} is anv_extensions_gen.py Generation rules for anv_extensions.c requires --out-c option Generation rules for anv_extensions.h were missing Necessary include paths are added to avoid following build errors: cp: cannot stat '.../gen/STATIC_LIBRARIES/libmesa_vulkan_common_intermediates/vulkan/anv_extensions.c': No such file or directory In file included from external/mesa/src/intel/vulkan/anv_gem.c:32: external/mesa/src/intel/vulkan/anv_private.h:75:10: fatal error: 'anv_extensions.h' file not found ^~~~~~~~~~~~~~~~~~ 1 error generated. In file included from external/mesa/src/intel/vulkan/anv_batch_chain.c:30: external/mesa/src/intel/vulkan/anv_private.h:75:10: fatal error: 'anv_extensions.h' file not found ^~~~~~~~~~~~~~~~~~ 1 error generated. Fixes: dd088d4bec7 ("anv/extensions: Generate a header file with extension tables") Cc: "18.0" <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
* anv/blorp: multisample resolve all attachment layersIago Toral Quiroga2018-02-221-11/+20
| | | | | | | | | | | | | | | | | | | | | | | | | We were only resolving the first. v2: - Do not require that the number of layers on dst and src are an exact match, it is okay if the dst has more layers so long as it has at least the same that we are going to resolve. - Do not always resolve array_len layers, we should resolve only from base_array_layer to array_len. v3: - v2 was assuming that array_len represented the total number of layers in the image, but it represents the number of layers starting at the base array ayer. v4: - The number of layers to resolve should be taken from the framebuffer (Nanley). Fixes new CTS tests for multisampled layered rendering: dEQP-VK.renderpass.multisample_resolve.layers_* Reviewed-by: Nanley Chery <[email protected]>
* intel/isl: Improve the documentation on get_default_aux_stateJason Ekstrand2018-02-211-4/+20
| | | | Reviewed-by: Chad Versace <[email protected]>
* anv/image: Add support for modifiers for WSIJason Ekstrand2018-02-214-4/+104
| | | | | | This adds support for the modifiers portion of the WSI "extension". Reviewed-by: Daniel Stone <[email protected]>
* anv/image: Separate modifiers from legacy scanoutJason Ekstrand2018-02-213-38/+21
| | | | | | | | | | | | | | | | | | | For a bit there, we had a bug in i965 where it ignored the tiling of the modifier and used the one from the BO instead. At one point, we though this was best fixed by setting a tiling from Vulkan. However, we've decided that i965 was just doing the wrong thing and have fixed it as of 50485723523d2948a44570ba110f02f726f86a54. The old assumptions also affected the solution we used for legacy scanout in Vulkan. Instead of treating it specially, we just treated it like a modifier like we do in GL. This commit goes back to making it it's own thing so that it's clear in the driver when we're using modifiers and when we're using legacy paths. v2 (Jason Ekstrand): - Rename legacy_scanout to needs_set_tiling Reviewed-by: Daniel Stone <[email protected]>
* anv: Don't assert that stencil HiZ clears are single-sliceJason Ekstrand2018-02-211-3/+6
| | | | | | | | | It's true for depth HiZ clears because we only have HiZ on single-slice images right now. However, for stencil-only clears there is no such restriction. Tested-by: Rafael Antognolli <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* anv: Only copy clear dwords if we're rendering to the first sliceJason Ekstrand2018-02-211-1/+4
| | | | Reviewed-by: Rafael Antognolli <[email protected]>
* i965: Fix compiler warning about write being undefined.Eric Anholt2018-02-201-1/+1
| | | | | | | This looks like it should be protected by the assume() about nr_color_regions, but my compiler warns anyway. Reviewed-by: Matt Turner <[email protected]>
* anv/blorp: Use layout_to_aux_usage when a layout is providedJason Ekstrand2018-02-201-25/+46
| | | | | | | | | | Instead of having aux usage and ANV_AUX_USAGE_DEFAULT to mean "give me something reasonable" we now use anv_layout_to_aux_usage whenever a layout is available. If a layout is available, we ignore the aux_usage parameter. For the cases where we have an explicit aux usage such as clears and aux ops, we have a new ANV_IMAGE_LAYOUT_EXPLICIT_AUX layout. Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>