summaryrefslogtreecommitdiffstats
path: root/src/intel
Commit message (Collapse)AuthorAgeFilesLines
* anv/pipeline: set active_stages earlyCaio Marcelo de Oliveira Filho2018-03-192-3/+10
| | | | | | | | | | | | | | Since the intermediate states of active_stages are not used, i.e. active_stages is read only after all stages were set into it, just set its value before compiling the shaders. This will allow to conditionally run certain passes based on what other shaders are being used, e.g. a certain pass might only be applicable to the vertex shader if there's no geometry or tessellation shader being used. v2: Use vk_to_mesa_shader_stage. (Lionel) Reviewed-by: Lionel Landwerlin <[email protected]>
* anv/pipeline: fail if TCS/TES compile failCaio Marcelo de Oliveira Filho2018-03-191-7/+9
| | | | | | | v2: Add Fixes tag. (Lionel) Fixes: e50d4807a35e679 ("anv: Compile TCS/TES shaders.") Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Silence warning about heap_size.Eric Anholt2018-03-161-1/+1
| | | | | | | We only get VK_SUCCESS if it was initialized, but apparently my compiler doesn't track that far. Reviewed-by: Lionel Landwerlin <[email protected]>
* i965: Silence compiler warning about promoted_constants.Eric Anholt2018-03-161-1/+1
| | | | | | | | We only have a cfg != NULL if we went through one of the paths that set it, but my compiler doesn't figure that out. Reviewed-by: Lionel Landwerlin <[email protected]> Fixes: 6411defdcd6f ("intel/cs: Re-run final NIR optimizations for each SIMD size")
* anv: Silence compiler warnings about uninitialized bind_offset.Eric Anholt2018-03-161-1/+1
| | | | | | | | | This is a legitimate warning: if anv's blorp_alloc_binding_table() throws an error from anv_cmd_buffer_alloc_blorp_binding_table(), we silently continue to use this undefined value. The rest of this code doesn't seem very allocation-error-proof, though, either. Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/compiler: Use gen_get_device_info() in test_eu_validateMatt Turner2018-03-163-39/+19
| | | | | | | | | | | | | Previously the unit test filled out a minimal devinfo struct. A previous patch caused the test to begin assert failing because the devinfo was not complete. Avoid this by using the real mechanism to create devinfo. Note that we have to drop icl from the table, since we now rely on the name -> PCI ID translation done by gen_device_name_to_pci_device_id(), and ICL's PCI IDs are not upstream yet. Fixes: f89e735719a6 ("intel/compiler: Check for unsupported register sizes.") Reviewed-by: Rafael Antognolli <[email protected]>
* intel: Add cfl to gen_device_name_to_pci_device_id()Matt Turner2018-03-161-0/+1
| | | | Reviewed-by: Rafael Antognolli <[email protected]>
* intel/compiler: Check for unsupported register sizes.Rafael Antognolli2018-03-161-0/+3
| | | | | | | | | Make sure we don't emit 64 bit types if the hardware doesn't support them. Signed-off-by: Rafael Antognolli <[email protected]> Suggested-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* anv: silence unused variable warningLionel Landwerlin2018-03-151-7/+0
| | | | | | Fixes: 59b0ea0c748 ("anv: Stop returning VK_ERROR_INCOMPATIBLE_DRIVER") Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* anv: silence unused function warning on gen11Lionel Landwerlin2018-03-152-1/+3
| | | | | | | | | | [84/227] Compiling C object 'src/intel/vulkan/libanv_gen110@sta/genX_blorp_exec.c.o'. ../src/intel/vulkan/genX_blorp_exec.c:68:1: warning: ‘blorp_get_surface_base_address’ defined but not used [-Wunused-function] blorp_get_surface_base_address(struct blorp_batch *batch) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* compiler: int8/uint8 supportKarol Herbst2018-03-143-0/+9
| | | | | | | | | | OpenCL kernels also have int8/uint8. v2: remove changes in nir_search as Jason posted a patch for that Reviewed-by: Jason Ekstrand <[email protected]> Signed-off-by: Rob Clark <[email protected]> Signed-off-by: Karol Herbst <[email protected]>
* anv/entrypoints: VkGetDeviceProcAddr returns NULL for core instance commandsIago Toral Quiroga2018-03-141-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | af5f2322d0c64 addressed this for extension commands, but the spec mandates this behavior also for core API commands. From the Vulkan spec, Table 2. vkGetDeviceProcAddr behavior: device pname return ---------------------------------------------------------- (..) device core device-level command fp (...) See that it specifically states "device-level". Since the vk.xml file doesn't state if core commands are instance or device level, we identify device level commands as the ones that take a VkDevice, VkQueue or VkCommandBuffer as their first parameter. Fixes test failures in new work-in-progress CTS tests. Also see the public issue: https://github.com/KhronosGroup/Vulkan-LoaderAndValidationLayers/issues/2323 v2: - Include reference to github issue (Emil) - Rebased on top of Vulkan 1.1 changes. v3: - Remove the not in the condition and switch the then/else cases (Jason) Reviewed-by: Emil Velikov <[email protected]> (v1) Reviewed-by: Jason Ekstrand <[email protected]>
* anv/entrypoints: dispatches to VkQueue are device-levelIago Toral Quiroga2018-03-141-2/+7
| | | | | | | | v2: - Add trampoline functions (Jason) - Add an assertion for unhandled trampoline cases Reviewed-by: Jason Ekstrand <[email protected]>
* intel/vulkan: Hard code CS scratch_ids_per_subslice for CherryviewJordan Justen2018-03-091-17/+28
| | | | | | | | | | Ken suggested that we might be underallocating scratch space on HD 400. Allocating scratch space as though there was actually 8 EUs seems to help with a GPU hang seen on synmark CSDof. Cc: <[email protected]> Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/vec4: Allow CSE on subset VF constant loadsIan Romanick2018-03-081-0/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | v2: Rewrite the code that generates the VF mask. Suggested by Ken. No changes on other platforms. Haswell, Ivy Bridge, and Sandy Bridge had similar results. (Haswell shown) total instructions in shared programs: 13059891 -> 13059884 (<.01%) instructions in affected programs: 431 -> 424 (-1.62%) helped: 7 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 1.19% max: 5.26% x̄: 2.05% x̃: 1.49% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -3.39% -0.71% Instructions are helped. total cycles in shared programs: 409260032 -> 409260018 (<.01%) cycles in affected programs: 4228 -> 4214 (-0.33%) helped: 7 HURT: 0 helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 helped stats (rel) min: 0.28% max: 2.04% x̄: 0.54% x̃: 0.28% 95% mean confidence interval for cycles value: -2.00 -2.00 95% mean confidence interval for cycles %-change: -1.15% 0.07% Inconclusive result (%-change mean confidence interval includes 0). Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/vec4: Relax writemask condition in CSEIan Romanick2018-03-081-1/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the previously seen instruction generates more fields than the new instruction, still allow CSE to happen. This doesn't do much, but it also enables a couple more shaders in the next patch. It helped quite a bit in another change series that I have (at least for now) abandoned. v2: Add some extra comentary about the parameters to instructions_match. Suggested by Ken. No changes on Skylake, Broadwell, Iron Lake or GM45. Ivy Bridge and Haswell had similar results. (Ivy Bridge shown) total instructions in shared programs: 11780295 -> 11780294 (<.01%) instructions in affected programs: 302 -> 301 (-0.33%) helped: 1 HURT: 0 total cycles in shared programs: 257308315 -> 257308313 (<.01%) cycles in affected programs: 2074 -> 2072 (-0.10%) helped: 1 HURT: 0 Sandy Bridge total instructions in shared programs: 10506687 -> 10506686 (<.01%) instructions in affected programs: 335 -> 334 (-0.30%) helped: 1 HURT: 0 Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Merge CMP and SEL into CSEL on Gen8+Ian Romanick2018-03-082-0/+107
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | v2: Fix several problems handling inverted predicates. Add a much bigger comment around the BRW_CONDITIONAL_NZ case. v3: Allow uniforms and shader inputs as sources for the original SEL and CMP instructions. This enables a LOT more shaders to receive CSEL merging (5816 vs 8564 on SKL). v4: Report progress. Broadwell and Skylake had similar results. (Broadwell shown) helped: 8527 HURT: 0 helped stats (abs) min: 1 max: 27 x̄: 2.44 x̃: 1 helped stats (rel) min: 0.03% max: 17.80% x̄: 1.12% x̃: 0.70% 95% mean confidence interval for instructions value: -2.51 -2.36 95% mean confidence interval for instructions %-change: -1.15% -1.10% Instructions are helped. total cycles in shared programs: 559442317 -> 558288357 (-0.21%) cycles in affected programs: 372699860 -> 371545900 (-0.31%) helped: 6748 HURT: 1450 helped stats (abs) min: 1 max: 32000 x̄: 182.41 x̃: 12 helped stats (rel) min: <.01% max: 66.08% x̄: 3.42% x̃: 0.70% HURT stats (abs) min: 1 max: 2538 x̄: 53.08 x̃: 14 HURT stats (rel) min: <.01% max: 96.72% x̄: 3.32% x̃: 0.90% 95% mean confidence interval for cycles value: -179.01 -102.51 95% mean confidence interval for cycles %-change: -2.37% -2.08% Cycles are helped. LOST: 0 GAINED: 6 No changes on earlier platforms. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> [v1] Reviewed-by: Kenneth Graunke <[email protected]> [v3] Reviewed-by: Matt Turner <[email protected]>
* i965/fs: Add infrastructure for generating CSEL instructions.Kenneth Graunke2018-03-088-1/+34
| | | | | | | | | | | | | | | | v2 (idr): Don't allow CSEL with a non-float src2. v3 (idr): Add CSEL to fs_inst::flags_written. Suggested by Matt. v4 (idr): Only set BRW_ALIGN_16 on Gen < 10 (suggested by Matt). Don't reset the access mode afterwards (suggested by Samuel and Matt). Add support for CSEL not modifying the flags to more places (requested by Matt). Signed-off-by: Kenneth Graunke <[email protected]> Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> [v3] Reviewed-by: Matt Turner <[email protected]>
* anv: Support version overridesJason Ekstrand2018-03-071-1/+7
| | | | | | While always sketchy to do, this is useful for debugging. Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* anv: Enable Vulkan 1.1Jason Ekstrand2018-03-071-1/+4
| | | | | Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* anv: Add support for SPIR-V 1.3 subgroup operationsJason Ekstrand2018-03-074-2/+39
| | | | | | | This requires us to bump the subgroup size to 32 for all shader stages because Vulkan requires that to be a physical device query. Reviewed-by: Iago Toral Quiroga <[email protected]>
* intel/fs: Add support for subgroup quad operationsJason Ekstrand2018-03-075-0/+124
| | | | | | | | NIR has code to lower these away for us but we can do significantly better in many cases with register regioning and SIMD4x2. Acked-by: Lionel Landwerlin <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* intel/fs: Implement reduce and scan opeprationsJason Ekstrand2018-03-072-0/+162
| | | | | Acked-by: Lionel Landwerlin <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* intel/fs: Add a helper for emitting scan operationsJason Ekstrand2018-03-072-0/+148
| | | | | | | | | | | | | This commit adds a helper to the builder for emitting "scan" operations. Given a binary operation #, a scan takes the vector [a0, a1, ..., aN] and returns the vector [a0, a0 # a1, ..., a0 # a1 # ... # aN] where each channel contains the combination of all previous channels. The sequence of instructions to perform the scan is fairly optimal; a 16-wide scan on a 32-bit type is only 6 instructions. The subgroup scan and reduction operations will be implemented in terms of this. Acked-by: Lionel Landwerlin <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* intel/fs: Add a couple of simple helper opcodesJason Ekstrand2018-03-074-0/+76
| | | | | Acked-by: Lionel Landwerlin <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* i965/fs: Add support for nir_intrinsic_shuffleJason Ekstrand2018-03-077-0/+150
| | | | Reviewed-by: Iago Toral Quiroga <[email protected]>
* i965/fs: Support nir_intrinsic_vote_feqJason Ekstrand2018-03-071-0/+6
| | | | | Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* nir: Generalize nir_intrinsic_vote_eqJason Ekstrand2018-03-071-1/+1
| | | | | | | | | | | | | The SPIR-V extension wants us to be able to do an AllEqual on any vector or scalar type. This has two implications: 1) We need to be able to handle vectors so we switch the vote_eq intrinsics to be vectorized intrinsics. 2) We need to handle floats which have different behavior with respect to +-0, NaN, etc. than the integer variant so we need two variants. Reviewed-by: Lionel Landwerlin <[email protected]>
* i965/fs: Implement basic SPIR-V subgroup intrinsicsJason Ekstrand2018-03-072-0/+26
| | | | | Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* anv: Stop returning VK_ERROR_INCOMPATIBLE_DRIVERJason Ekstrand2018-03-071-29/+7
| | | | | | | | | | | From the Vulkan 1.1 spec: "Vulkan 1.0 implementations were required to return VK_ERROR_INCOMPATIBLE_DRIVER if apiVersion was larger than 1.0. Implementations that support Vulkan 1.1 or later must not return VK_ERROR_INCOMPATIBLE_DRIVER for any value of apiVersion." Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* anv: Implement vkEnumerateInstanceVersionJason Ekstrand2018-03-071-0/+7
| | | | Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* anv/device: fail to initialize device if we have queues with unsupported flagsIago Toral Quiroga2018-03-071-0/+9
| | | | | | | | | | This is not strictly necessary since users should not be requesting any flags that are not valid for the list of enabled features requested and we already fail if they attempt to use an unsupported feature, however it is an easy to implement sanity check that would help developes realize that they are doing things wrong, so we might as well do it. Reviewed-by: Jason Ekstrand <[email protected]>
* anv/device: GetDeviceQueue2 should only return queues with matching flagsIago Toral Quiroga2018-03-072-1/+7
| | | | | | | | | | | | | From the Vulkan 1.1 spec, VkDeviceQueueInfo2 structure: "The queue returned by vkGetDeviceQueue2 must have the same flags value from this structure as that used at device creation time in a VkDeviceQueueCreateInfo instance. If no matching flags were specified at device creation time then pQueue will return VK_NULL_HANDLE." For us this means no flags at all since we don't support any. Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Support querying for protected memoryJason Ekstrand2018-03-071-0/+6
| | | | | Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Implement GetDeviceQueue2Jason Ekstrand2018-03-071-0/+12
| | | | | | | This belongs to the protected memory feature but there's nothing about it that's specific to protected memory. Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Trivially implement VK_KHR_device_groupJason Ekstrand2018-03-076-11/+100
| | | | Reviewed-by: Iago Toral Quiroga <[email protected]>
* anv: Implement vkCmdDispatchBaseJason Ekstrand2018-03-079-6/+165
| | | | | | | | | | This is part of the device groups extension/feature but it's a decent chunk of work in its own right so it's worth breaking into its own patch. The mechanism we use is fairly straightforward: we just push the base work group id into the shader and add it to the work group id we get from dispatch. Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* anv: Implement VK_KHR_maintenance3Jason Ekstrand2018-03-074-18/+77
| | | | Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* anv: Support VkPhysicalDeviceShaderDrawParameterFeaturesJason Ekstrand2018-03-071-0/+6
| | | | | | | This advertises the VK_KHR_shader_draw_parameters functionality as a "core optimal feature" in Vulkan 1.1. Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* anv/entrypoints: Drop support for protect attributesJason Ekstrand2018-03-071-7/+0
| | | | Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* Get rid of a bunch of KHR suffixesJason Ekstrand2018-03-0711-248/+248
| | | | | Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* anv: Add version 1.1.0 but leave it disabledJason Ekstrand2018-03-077-21/+22
| | | | | | | | This requires us to rename any Vulkan API entrypoints which became core in 1.1 to no longer have the KHR suffix. Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* anv/entrypoints: Generate #ifdef guards from platform attributesJason Ekstrand2018-03-071-0/+8
| | | | Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* anv/extensions: Add support for multiple API versionsJason Ekstrand2018-03-072-10/+40
| | | | | Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* anv/entrypoints_gen: Add support for aliases in the XMLJason Ekstrand2018-03-071-19/+46
| | | | | Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* anv/entrypoints: Allow an entrypoint to require multiple extensionsJason Ekstrand2018-03-071-9/+12
| | | | | | | | In this case, we say an entrypoint is supported if ANY of the extensions is supported. This is because, in the XML, entrypoints don't require extensions so much as extensions require entrypoints. Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* anv/entrypoints: Add an is_device_entrypoint helperJason Ekstrand2018-03-071-2/+5
| | | | Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* anv/entrypoints_gen: Allow the string map to growJason Ekstrand2018-03-071-4/+6
| | | | | Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* anv/entrypoints_gen: A bit of refactoringJason Ekstrand2018-03-071-15/+7
| | | | | Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* anv/entrypoints: Generalize the string map a bitJason Ekstrand2018-03-071-85/+103
| | | | | | | | | | | | | The original string map assumed that the mapping from strings to entrypoints was a bijection. This will not be true the moment we add entrypoint aliasing. This reworks things to be an arbitrary map from strings to non-negative signed integers. The old one also had a potential bug if we ever had a hash collision because it didn't do the strcmp inside the lookup loop. While we're at it, we break things out into a helpful class. Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>