summaryrefslogtreecommitdiffstats
path: root/src/intel
Commit message (Collapse)AuthorAgeFilesLines
* anv: Support version overridesJason Ekstrand2018-03-071-1/+7
| | | | | | While always sketchy to do, this is useful for debugging. Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* anv: Enable Vulkan 1.1Jason Ekstrand2018-03-071-1/+4
| | | | | Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* anv: Add support for SPIR-V 1.3 subgroup operationsJason Ekstrand2018-03-074-2/+39
| | | | | | | This requires us to bump the subgroup size to 32 for all shader stages because Vulkan requires that to be a physical device query. Reviewed-by: Iago Toral Quiroga <[email protected]>
* intel/fs: Add support for subgroup quad operationsJason Ekstrand2018-03-075-0/+124
| | | | | | | | NIR has code to lower these away for us but we can do significantly better in many cases with register regioning and SIMD4x2. Acked-by: Lionel Landwerlin <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* intel/fs: Implement reduce and scan opeprationsJason Ekstrand2018-03-072-0/+162
| | | | | Acked-by: Lionel Landwerlin <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* intel/fs: Add a helper for emitting scan operationsJason Ekstrand2018-03-072-0/+148
| | | | | | | | | | | | | This commit adds a helper to the builder for emitting "scan" operations. Given a binary operation #, a scan takes the vector [a0, a1, ..., aN] and returns the vector [a0, a0 # a1, ..., a0 # a1 # ... # aN] where each channel contains the combination of all previous channels. The sequence of instructions to perform the scan is fairly optimal; a 16-wide scan on a 32-bit type is only 6 instructions. The subgroup scan and reduction operations will be implemented in terms of this. Acked-by: Lionel Landwerlin <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* intel/fs: Add a couple of simple helper opcodesJason Ekstrand2018-03-074-0/+76
| | | | | Acked-by: Lionel Landwerlin <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* i965/fs: Add support for nir_intrinsic_shuffleJason Ekstrand2018-03-077-0/+150
| | | | Reviewed-by: Iago Toral Quiroga <[email protected]>
* i965/fs: Support nir_intrinsic_vote_feqJason Ekstrand2018-03-071-0/+6
| | | | | Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* nir: Generalize nir_intrinsic_vote_eqJason Ekstrand2018-03-071-1/+1
| | | | | | | | | | | | | The SPIR-V extension wants us to be able to do an AllEqual on any vector or scalar type. This has two implications: 1) We need to be able to handle vectors so we switch the vote_eq intrinsics to be vectorized intrinsics. 2) We need to handle floats which have different behavior with respect to +-0, NaN, etc. than the integer variant so we need two variants. Reviewed-by: Lionel Landwerlin <[email protected]>
* i965/fs: Implement basic SPIR-V subgroup intrinsicsJason Ekstrand2018-03-072-0/+26
| | | | | Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* anv: Stop returning VK_ERROR_INCOMPATIBLE_DRIVERJason Ekstrand2018-03-071-29/+7
| | | | | | | | | | | From the Vulkan 1.1 spec: "Vulkan 1.0 implementations were required to return VK_ERROR_INCOMPATIBLE_DRIVER if apiVersion was larger than 1.0. Implementations that support Vulkan 1.1 or later must not return VK_ERROR_INCOMPATIBLE_DRIVER for any value of apiVersion." Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* anv: Implement vkEnumerateInstanceVersionJason Ekstrand2018-03-071-0/+7
| | | | Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* anv/device: fail to initialize device if we have queues with unsupported flagsIago Toral Quiroga2018-03-071-0/+9
| | | | | | | | | | This is not strictly necessary since users should not be requesting any flags that are not valid for the list of enabled features requested and we already fail if they attempt to use an unsupported feature, however it is an easy to implement sanity check that would help developes realize that they are doing things wrong, so we might as well do it. Reviewed-by: Jason Ekstrand <[email protected]>
* anv/device: GetDeviceQueue2 should only return queues with matching flagsIago Toral Quiroga2018-03-072-1/+7
| | | | | | | | | | | | | From the Vulkan 1.1 spec, VkDeviceQueueInfo2 structure: "The queue returned by vkGetDeviceQueue2 must have the same flags value from this structure as that used at device creation time in a VkDeviceQueueCreateInfo instance. If no matching flags were specified at device creation time then pQueue will return VK_NULL_HANDLE." For us this means no flags at all since we don't support any. Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Support querying for protected memoryJason Ekstrand2018-03-071-0/+6
| | | | | Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Implement GetDeviceQueue2Jason Ekstrand2018-03-071-0/+12
| | | | | | | This belongs to the protected memory feature but there's nothing about it that's specific to protected memory. Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Trivially implement VK_KHR_device_groupJason Ekstrand2018-03-076-11/+100
| | | | Reviewed-by: Iago Toral Quiroga <[email protected]>
* anv: Implement vkCmdDispatchBaseJason Ekstrand2018-03-079-6/+165
| | | | | | | | | | This is part of the device groups extension/feature but it's a decent chunk of work in its own right so it's worth breaking into its own patch. The mechanism we use is fairly straightforward: we just push the base work group id into the shader and add it to the work group id we get from dispatch. Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* anv: Implement VK_KHR_maintenance3Jason Ekstrand2018-03-074-18/+77
| | | | Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* anv: Support VkPhysicalDeviceShaderDrawParameterFeaturesJason Ekstrand2018-03-071-0/+6
| | | | | | | This advertises the VK_KHR_shader_draw_parameters functionality as a "core optimal feature" in Vulkan 1.1. Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* anv/entrypoints: Drop support for protect attributesJason Ekstrand2018-03-071-7/+0
| | | | Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* Get rid of a bunch of KHR suffixesJason Ekstrand2018-03-0711-248/+248
| | | | | Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* anv: Add version 1.1.0 but leave it disabledJason Ekstrand2018-03-077-21/+22
| | | | | | | | This requires us to rename any Vulkan API entrypoints which became core in 1.1 to no longer have the KHR suffix. Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* anv/entrypoints: Generate #ifdef guards from platform attributesJason Ekstrand2018-03-071-0/+8
| | | | Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* anv/extensions: Add support for multiple API versionsJason Ekstrand2018-03-072-10/+40
| | | | | Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* anv/entrypoints_gen: Add support for aliases in the XMLJason Ekstrand2018-03-071-19/+46
| | | | | Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* anv/entrypoints: Allow an entrypoint to require multiple extensionsJason Ekstrand2018-03-071-9/+12
| | | | | | | | In this case, we say an entrypoint is supported if ANY of the extensions is supported. This is because, in the XML, entrypoints don't require extensions so much as extensions require entrypoints. Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* anv/entrypoints: Add an is_device_entrypoint helperJason Ekstrand2018-03-071-2/+5
| | | | Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* anv/entrypoints_gen: Allow the string map to growJason Ekstrand2018-03-071-4/+6
| | | | | Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* anv/entrypoints_gen: A bit of refactoringJason Ekstrand2018-03-071-15/+7
| | | | | Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* anv/entrypoints: Generalize the string map a bitJason Ekstrand2018-03-071-85/+103
| | | | | | | | | | | | | The original string map assumed that the mapping from strings to entrypoints was a bijection. This will not be true the moment we add entrypoint aliasing. This reworks things to be an arbitrary map from strings to non-negative signed integers. The old one also had a potential bug if we ever had a hash collision because it didn't do the strcmp inside the lookup loop. While we're at it, we break things out into a helpful class. Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* vulkan: Rename multiview from KHX to KHRJason Ekstrand2018-03-074-10/+10
| | | | | | Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* android: anv: add libmesa_intel_dev static dependencyMauro Rossi2018-03-071-0/+1
| | | | | | | | | | | | Fixes the following building errors: external/mesa/src/intel/vulkan/anv_device.c:300: error: undefined reference to 'gen_get_pci_device_id_override' external/mesa/src/intel/vulkan/anv_device.c:312: error: undefined reference to 'gen_get_device_name' external/mesa/src/intel/vulkan/anv_device.c:313: error: undefined reference to 'gen_get_device_info' clang.real: error: linker command failed with exit code 1 (use -v to see invocation) Fixes: 272bef0601a "intel: Split gen_device_info out into libintel_dev" Reviewed-by: Tapani Pälli <[email protected]>
* intel: Add missing includes for building on AndroidClayton Craft2018-03-061-0/+1
| | | | | | | | | | | This adds a missing library to the i965/Android.mk file, and updates intel/Android.mk to include the new library. Without this, mesa does not build on Android. Fixes: 272bef0601a "intel: Split gen_device_info out into libintel_dev" Reviewed-by: Kenneth Graunke <[email protected]>
* vulkan: do not expose surface/swapchain extensions on AndroidTapani Pälli2018-03-061-1/+1
| | | | | | | | | On Android surface/swapchain extensions are implemented by the loader. Patch modifies both anv and radv extension scripts disabling currently exposed ones. See also earlier commit 9f763c1f9b. Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Don't expose VK_KHX_multiview on android.Tapani Pälli2018-03-061-1/+1
| | | | | | | | | | | | | | Just like commit 2ffe395 does for radv. Fixes following dEQP test on i965: dEQP-VK.api.info.android.no_unknown_extensions v2: make it !ANDROID since this extension is not about surfaces/swapchain Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* intel: Drop SURFACE_FORMAT enum from genxml.Kenneth Graunke2018-03-0514-2265/+31
| | | | | | | | | | | We want people to be using ISL_FORMAT_*, rather than the genxml format enumerations. This patch drops 10 separate copies, and drops a bunch of ugly casting. Reviewed-by: Jordan Justen <[email protected]> [[email protected]: Minor changes for rebase] Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* intel/common: Use isl for decoder surface formatsJordan Justen2018-03-053-1/+10
| | | | | Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* intel/isl: Add isl_format_is_validJordan Justen2018-03-052-0/+10
| | | | | Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* intel: Split gen_device_info out into libintel_devJordan Justen2018-03-0528-25/+131
| | | | | | | | | | | | Split out the device info so isl doesn't depend on intel/common. Now it will depend on the new intel/dev device info lib. This will allow the decoder in intel/common to use isl, allowing us to apply Ken's patch that removes the genxml duplication of surface formats. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* isl: Silence unused parameter warnings in __gen_combine_address implementationsIan Romanick2018-03-022-2/+6
| | | | | | | | | | | | | | | | Reduces my build from 1808 warnings to 1772 warnings by silencing 36 instances of things like ../../SOURCE/master/src/intel/isl/isl_emit_depth_stencil.c: In function ‘__gen_combine_address’: ../../SOURCE/master/src/intel/isl/isl_emit_depth_stencil.c:30:29: warning: unused parameter ‘data’ [-Wunused-parameter] __gen_combine_address(void *data, void *loc, uint64_t addr, uint32_t delta) ^~~~ ../../SOURCE/master/src/intel/isl/isl_emit_depth_stencil.c:30:41: warning: unused parameter ‘loc’ [-Wunused-parameter] __gen_combine_address(void *data, void *loc, uint64_t addr, uint32_t delta) ^~~ Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* genxml: Silence unused parameter warnings in generated pack codeIan Romanick2018-03-021-3/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | Reduces my build from 1960 warnings to 1808 warnings by silencing 152 instances of things like In file included from ../../SOURCE/master/src/intel/genxml/genX_pack.h:32:0, from ../../SOURCE/master/src/intel/isl/isl_emit_depth_stencil.c:36: src/intel/genxml/gen4_pack.h: In function ‘__gen_uint’: src/intel/genxml/gen4_pack.h:58:49: warning: unused parameter ‘end’ [-Wunused-parameter] __gen_uint(uint64_t v, uint32_t start, uint32_t end) ^~~ src/intel/genxml/gen4_pack.h: In function ‘__gen_offset’: src/intel/genxml/gen4_pack.h:94:35: warning: unused parameter ‘start’ [-Wunused-parameter] __gen_offset(uint64_t v, uint32_t start, uint32_t end) ^~~~~ src/intel/genxml/gen4_pack.h:94:51: warning: unused parameter ‘end’ [-Wunused-parameter] __gen_offset(uint64_t v, uint32_t start, uint32_t end) ^~~ src/intel/genxml/gen4_pack.h: In function ‘__gen_ufixed’: src/intel/genxml/gen4_pack.h:133:48: warning: unused parameter ‘end’ [-Wunused-parameter] __gen_ufixed(float v, uint32_t start, uint32_t end, uint32_t fract_bits) ^~~ Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* i965: Silence unused parameter warnings in blorpIan Romanick2018-03-021-7/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Reduces my build from 2023 warnings to 1960 warnings by silencing 63 instances of things like In file included from ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_blorp_exec.c:33:0: ../../SOURCE/master/src/intel/blorp/blorp_genX_exec.h: In function ‘blorp_emit_cc_viewport’: ../../SOURCE/master/src/intel/blorp/blorp_genX_exec.h:500:51: warning: unused parameter ‘params’ [-Wunused-parameter] const struct blorp_params *params) ^~~~~~ ../../SOURCE/master/src/intel/blorp/blorp_genX_exec.h: In function ‘blorp_emit_sampler_state’: ../../SOURCE/master/src/intel/blorp/blorp_genX_exec.h:524:53: warning: unused parameter ‘params’ [-Wunused-parameter] const struct blorp_params *params) ^~~~~~ In file included from ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_blorp_exec.c:36:0: ../../SOURCE/master/src/mesa/drivers/dri/i965/gen4_blorp_exec.h: In function ‘blorp_emit_vs_state’: ../../SOURCE/master/src/mesa/drivers/dri/i965/gen4_blorp_exec.h:50:48: warning: unused parameter ‘params’ [-Wunused-parameter] const struct blorp_params *params) ^~~~~~ ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_blorp_exec.c: In function ‘blorp_flush_range’: ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_blorp_exec.c:197:39: warning: unused parameter ‘batch’ [-Wunused-parameter] blorp_flush_range(struct blorp_batch *batch, void *start, size_t size) ^~~~~ ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_blorp_exec.c:197:52: warning: unused parameter ‘start’ [-Wunused-parameter] blorp_flush_range(struct blorp_batch *batch, void *start, size_t size) ^~~~~ ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_blorp_exec.c:197:66: warning: unused parameter ‘size’ [-Wunused-parameter] blorp_flush_range(struct blorp_batch *batch, void *start, size_t size) ^~~~ Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* i965: Silence warnings about mixing enum and non-enum in conditionalIan Romanick2018-03-021-1/+1
| | | | | | | | | | | | | | | | | | | | Reduces my build from 6451 warnings to 6301 warnings by silencing 150 instances of ../../SOURCE/master/src/intel/compiler/brw_inst.h: In function ‘brw_reg_type brw_inst_src1_type(const gen_device_info*, const brw_inst*)’: ../../SOURCE/master/src/intel/compiler/brw_inst.h:802:55: warning: enumeral and non-enumeral type in conditional expression [-Wextra] unsigned file = __builtin_strcmp("dst", #reg) == 0 ? \ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~ BRW_GENERAL_REGISTER_FILE : \ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ brw_inst_##reg##_reg_file(devinfo, inst); \ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../../SOURCE/master/src/intel/compiler/brw_inst.h:811:1: note: in expansion of macro ‘REG_TYPE’ REG_TYPE(src1) ^~~~~~~~ Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* intel/compiler: Silence unused parameter warnings in release buildsIan Romanick2018-03-022-6/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Reduces my build from 7005 warnings to 6451 warnings by silencing 554 instances of In file included from ../../SOURCE/master/src/intel/compiler/brw_disasm.c:28:0: ../../SOURCE/master/src/intel/compiler/brw_inst.h: In function ‘brw_inst_3src_a1_src0_imm’: ../../SOURCE/master/src/intel/compiler/brw_inst.h:346:57: warning: unused parameter ‘devinfo’ [-Wunused-parameter] brw_inst_3src_a1_src0_imm(const struct gen_device_info *devinfo, ^~~~~~~ ../../SOURCE/master/src/intel/compiler/brw_inst.h: In function ‘brw_inst_3src_a1_src2_imm’: ../../SOURCE/master/src/intel/compiler/brw_inst.h:354:57: warning: unused parameter ‘devinfo’ [-Wunused-parameter] brw_inst_3src_a1_src2_imm(const struct gen_device_info *devinfo, ^~~~~~~ ../../SOURCE/master/src/intel/compiler/brw_inst.h: In function ‘brw_inst_set_3src_a1_src0_imm’: ../../SOURCE/master/src/intel/compiler/brw_inst.h:362:61: warning: unused parameter ‘devinfo’ [-Wunused-parameter] brw_inst_set_3src_a1_src0_imm(const struct gen_device_info *devinfo, ^~~~~~~ ../../SOURCE/master/src/intel/compiler/brw_inst.h: In function ‘brw_inst_set_3src_a1_src2_imm’: ../../SOURCE/master/src/intel/compiler/brw_inst.h:370:61: warning: unused parameter ‘devinfo’ [-Wunused-parameter] brw_inst_set_3src_a1_src2_imm(const struct gen_device_info *devinfo, ^~~~~~~ ../../SOURCE/master/src/intel/compiler/brw_inst.h: In function ‘brw_inst_imm_uq’: ../../SOURCE/master/src/intel/compiler/brw_inst.h:703:47: warning: unused parameter ‘devinfo’ [-Wunused-parameter] brw_inst_imm_uq(const struct gen_device_info *devinfo, const brw_inst *insn) ^~~~~~~ In file included from ../../SOURCE/master/src/intel/compiler/brw_shader.h:29:0, from ../../SOURCE/master/src/intel/compiler/brw_disasm.c:29: ../../SOURCE/master/src/intel/compiler/brw_compiler.h: In function ‘brw_stage_has_packed_dispatch’: ../../SOURCE/master/src/intel/compiler/brw_compiler.h:1277:61: warning: unused parameter ‘devinfo’ [-Wunused-parameter] brw_stage_has_packed_dispatch(const struct gen_device_info *devinfo, ^~~~~~~ ../../SOURCE/master/src/intel/compiler/brw_disasm.c: In function ‘src_ia1’: ../../SOURCE/master/src/intel/compiler/brw_disasm.c:849:18: warning: unused parameter ‘_reg_file’ [-Wunused-parameter] unsigned _reg_file, ^~~~~~~~~ Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* intel: Drop program size pointer from vec4/fs assembly getters.Kenneth Graunke2018-03-029-25/+17
| | | | | | | | | These days, we're just passing a pointer to a prog_data field, which we already have access to. We can just use it directly. (In the past, it was a pointer to a separate value.) Reviewed-by: Iago Toral Quiroga <[email protected]>
* intel/compiler: Memory fence commit must always be enabled for gen10+Anuj Phogat2018-03-021-1/+3
| | | | | | | | | | | | Commit bit in the message descriptor (Bit 13) must be always set to true in CNL+ for memory fence messages. It also fixes a piglit GPU hang on cnl+ in simulation environment. Piglit test: arb_shader_image_load_store-shader-mem-barrier See HSD ES # 1404612949 Signed-off-by: Anuj Phogat <[email protected]> Cc: [email protected] Reviewed-by: Francisco Jerez <[email protected]>
* Revert "i965/fs: Predicate byte scattered writes if needed"Francisco Jerez2018-03-021-14/+1
| | | | | | | | | | | | | This reverts commit a4031bdfa927fb4c3c5d0bdadc70634f3c1a5eac. It's redundant with the sample mask predication done at this point by the common logical send lowering infrastructure, and rather buggy because it wasn't applying the correct sample mask in shaders using discard, since the dispatch mask returned by FS_OPCODE_MOV_DISPATCH_TO_FLAGS doesn't reflect samples discarded by the shader, so it could have led to data corruption in fragment shader invocations that execute discard based on a non-dynamically uniform condition. Reviewed-by: Kenneth Graunke <[email protected]>
* intel/fs: Handle surface opcode sample masks via predication.Francisco Jerez2018-03-021-1/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The main motivation is to enable HDC surface opcodes on ICL which no longer allows the sample mask to be provided in a message header, but this is enabled all the way back to IVB when possible because it decreases the instruction count of some shaders using HDC messages significantly, e.g. one of the SynMark2 CSDof compute shaders decreases instruction count by about 40% due to the removal of header setup boilerplate which in turn makes a number of send message payloads more easily CSE-able. Shader-db results on SKL: total instructions in shared programs: 15325319 -> 15314384 (-0.07%) instructions in affected programs: 311532 -> 300597 (-3.51%) helped: 491 HURT: 1 Shader-db results on BDW where the optimization needs to be disabled in some cases due to hardware restrictions: total instructions in shared programs: 15604794 -> 15598028 (-0.04%) instructions in affected programs: 220863 -> 214097 (-3.06%) helped: 351 HURT: 0 The FPS of SynMark2 CSDof improves by 5.09% ±0.36% (n=10) on my SKL laptop with this change. According to Eero this improves performance of the same test by 9% on BYT and by 7-8% on BXT J4205 and on SKL GT2 desktop. Reviewed-by: Kenneth Graunke <[email protected]> Tested-By: Eero Tamminen <[email protected]>