path: root/src/intel
Commit message | Author | Age | Files | Lines
* nir: Add system values from ARB_shader_ballot | Matt Turner | 2017-07-20 | 2 | -3/+3
    We already had a channel_num system value, which I'm renaming to subgroup_invocation to match the rest of the new system values.

    Note that while ballotARB(true) will return zeros in the high 32-bits on systems where gl_SubGroupSizeARB <= 32, the gl_SubGroup??MaskARB variables do not consider whether channels are enabled. See issue (1) of ARB_shader_ballot.

    Reviewed-by: Connor Abbott <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Implement ARB_shader_group_vote operations | Matt Turner | 2017-07-20 | 1 | -0/+50
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Handle explicit flag destinations in flags_written() | Francisco Jerez | 2017-07-20 | 1 | -4/+19
    The implementations of the ARB_shader_group_vote intrinsics will explicitly write the flag as the destination register.

    Reviewed-by: Matt Turner <[email protected]>
* i965/vec4: Lower ARB_shader_group_vote intrinsics | Matt Turner | 2017-07-20 | 1 | -0/+1
    I don't expect anyone is going to care about using this in vec4 programs (vertex/tessellation/geometry on Gen6/7), and no one has come up with a good way to implement it, much less test it.

    Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Add pass to optimize intrinsics | Matt Turner | 2017-07-20 | 1 | -0/+1
    Specifically, constant fold intrinsics from ARB_shader_group_vote, but I suspect it'll be useful for other things in the future.

    Reviewed-by: Kenneth Graunke <[email protected]>
* intel/isl/gen4: Represent cube maps with 3D layout | Topi Pohjolainen | 2017-07-20 | 1 | -6/+35
    v2 (Jason): Check for !ISL_SURF_DIM_3D instead of CUBE_BIT.

    Reviewed-by: Jason Ekstrand <[email protected]>
    Signed-off-by: Topi Pohjolainen <[email protected]>
* intel/isl: Add i915 to isl_tiling converter | Topi Pohjolainen | 2017-07-20 | 2 | -0/+20
    v2: s/i915_tiling_to_isl_tiling(/isl_tiling_from_i915_tiling/

    Reviewed-by: Daniel Stone <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
    Signed-off-by: Topi Pohjolainen <[email protected]>
* anv/image: Fix VK_IMAGE_CREATE_CUBE_COMPATIBLE_BIT | Chad Versace | 2017-07-19 | 1 | -3/+4
    We incorrectly detected VK_IMAGE_CREATE_CUBE_COMPATIBLE_BIT. We looked for the bit in VkImageCreateInfo::usage, but it's actually in VkImageCreateInfo::flags.

    Found by assertion failures while enabling VK_ANDROID_native_buffer.

    Cc: [email protected]
    Reviewed-by: Lionel Landwerlin <[email protected]>
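The mix-up described above is easy to make because both fields are plain 32-bit masks. A minimal sketch of why testing the wrong field gives a wrong answer follows. It assumes stand-in names (`CREATE_CUBE_COMPATIBLE_BIT`, `USAGE_COLOR_ATTACHMENT_BIT`, `struct image_create_info`, `is_cube_compatible`) that mirror the Vulkan enum values rather than including the Vulkan headers; it is not the driver's code.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for illustration only. In Vulkan 1.0 both
 * VK_IMAGE_CREATE_CUBE_COMPATIBLE_BIT and
 * VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT have the numeric value 0x10,
 * but they belong to different fields. */
#define CREATE_CUBE_COMPATIBLE_BIT 0x00000010u
#define USAGE_COLOR_ATTACHMENT_BIT 0x00000010u

/* Hypothetical, trimmed-down stand-in for VkImageCreateInfo. */
struct image_create_info {
    uint32_t flags; /* creation flags */
    uint32_t usage; /* usage flags */
};

/* Correct check: the cube-compatible bit lives in `flags`. */
static int is_cube_compatible(const struct image_create_info *info)
{
    return (info->flags & CREATE_CUBE_COMPATIBLE_BIT) != 0;
}

/* Buggy variant: tests `usage`, so any color-attachment image
 * is misreported as cube compatible. */
static int is_cube_compatible_buggy(const struct image_create_info *info)
{
    return (info->usage & CREATE_CUBE_COMPATIBLE_BIT) != 0;
}
```

With `flags = 0` and `usage = USAGE_COLOR_ATTACHMENT_BIT`, the correct check reports false while the buggy one reports true.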
* intel/blorp/gen4: Drop cube map flag for single face copy | Topi Pohjolainen | 2017-07-18 | 1 | -1/+7
    This will falsely trigger an assert on number of layers once isl is used for 3D layouts of Gen4 cube maps.

    Reviewed-by: Jason Ekstrand <[email protected]>
    Signed-off-by: Topi Pohjolainen <[email protected]>
* intel/isl: Take 3D surfaces into account in image params | Topi Pohjolainen | 2017-07-18 | 1 | -2/+6
    Reviewed-by: Jason Ekstrand <[email protected]>
    Signed-off-by: Topi Pohjolainen <[email protected]>
* anv: Advertise support for VK_KHR_variable_pointers | Jason Ekstrand | 2017-07-18 | 3 | -0/+13
    We don't support the general version yet because that requires us to lower shared variables up-front in SPIR-V -> NIR. This shouldn't be a whole lot of work but it's not something we support today.

    Reviewed-by: Iago Toral Quiroga <[email protected]>
* anv: Advertise support for VK_KHR_storage_buffer_storage_class | Jason Ekstrand | 2017-07-18 | 2 | -0/+5
    Reviewed-by: Iago Toral Quiroga <[email protected]>
* intel/isl: Add a row_pitch parameter to surf_get_ccs_surf | Jason Ekstrand | 2017-07-17 | 3 | -3/+6
    Reviewed-by: Topi Pohjolainen <[email protected]>
    Reviewed-by: Chad Versace <[email protected]>
* anv/image: Add INPUT_ATTACHMENT to the list of required usages | Jason Ekstrand | 2017-07-17 | 1 | -0/+1
    From the Vulkan 1.0.53 spec VU for vkCreateImageView:

        "image must have been created with a usage value containing at least one of VK_IMAGE_USAGE_SAMPLED_BIT, VK_IMAGE_USAGE_STORAGE_BIT, VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT, VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT, or VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT"

    We were missing VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT from our list.

    Reviewed-by: Lionel Landwerlin <[email protected]>
    Cc: [email protected]
* anv: Stop leaking the no_aux sampler surface state | Jason Ekstrand | 2017-07-17 | 1 | -0/+5
    Reviewed-by: Lionel Landwerlin <[email protected]>
    Cc: [email protected]
* anv/cmd_buffer: Properly handle render passes with 0 attachments | Jason Ekstrand | 2017-07-17 | 1 | -12/+11
    We were early returning and never created the NULL surface state.

    Reviewed-by: Lionel Landwerlin <[email protected]>
    Tested-by: James Legg <[email protected]>
    Cc: [email protected]
* anv: advertise v6 of the wayland surface extension | Emil Velikov | 2017-07-17 | 1 | -1/+1
    Jason updated the Khronos spec to explicitly state that Wayland surfaces must support VK_PRESENT_MODE_MAILBOX_KHR. ANV has done so since day one (back in 2015).

    Cc: [email protected]
    Signed-off-by: Emil Velikov <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
* anv: ensure device name contains terminating character | Lionel Landwerlin | 2017-07-17 | 1 | -2/+2
    v2: Use sizeof() (Chris)

    CID: 1415113
    Reported-by: Grazvydas Ignotas <[email protected]>
    Signed-off-by: Lionel Landwerlin <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
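One common way to guarantee a terminating character when copying into a fixed-size array is to bound the copy with `sizeof()` on the destination. The sketch below assumes a hypothetical `struct device_props` with a deliberately tiny `name` field and a hypothetical `set_device_name` helper; the log above does not show which call the actual patch uses.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Deliberately small so truncation is easy to observe (assumption). */
#define NAME_LEN 8

/* Hypothetical stand-in for a properties struct with a fixed-size name. */
struct device_props {
    char name[NAME_LEN];
};

/* snprintf() writes at most sizeof(p->name) bytes, including the
 * terminating NUL, so the result is always a valid C string even
 * when the source is longer than the destination. */
static void set_device_name(struct device_props *p, const char *src)
{
    snprintf(p->name, sizeof(p->name), "%s", src);
}
```

Using `sizeof(p->name)` rather than a separate length constant keeps the bound correct if the field size ever changes.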
* anv: Implement VK_KHR_external_memory_* | Jason Ekstrand | 2017-07-15 | 3 | -5/+163
    Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
    Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Implement VK_KHR_dedicated_allocation | Jason Ekstrand | 2017-07-15 | 2 | -0/+19
    We always recommend sub-allocation and don't do anything special for dedicated allocations.

    Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
    Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Implement VK_KHR_get_memory_requirements2 | Jason Ekstrand | 2017-07-15 | 2 | -0/+48
    Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
    Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Advertise version 1.0.54 | Jason Ekstrand | 2017-07-15 | 3 | -3/+3
    Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
    Reviewed-by: Lionel Landwerlin <[email protected]>
* vulkan: Update to the new 1.0.54 spec XML and headers | Jason Ekstrand | 2017-07-15 | 1 | -3/+3
    There is one small ANV change here because we used the VK_ERROR_INVALID_EXTERNAL_HANDLE_KHX enum in the BO cache and that had to be updated to have the _KHR suffix.

    Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
    Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Drop support for VK_KHX_external_semaphore_* | Jason Ekstrand | 2017-07-15 | 3 | -125/+5
    These have been formally deprecated by Khronos never to be shipped again. The KHR versions should be implemented/used instead.

    Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
    Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Drop support for VK_KHX_external_memory_* | Jason Ekstrand | 2017-07-14 | 3 | -161/+5
    These have been formally deprecated by Khronos never to be shipped again. The KHR versions should be implemented/used instead.

    Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
    Reviewed-by: Lionel Landwerlin <[email protected]>
* anv/pipeline: do not use BITFIELD64_BIT() | Juan A. Suarez Romero | 2017-07-14 | 1 | -1/+1
    The previous commit forgot to apply the v2 suggestions.

    Fixes: 28d0c38 (anv/pipeline: use unsigned long long constant to check enable vertex inputs)
    Signed-off-by: Juan A. Suarez Romero <[email protected]>
* anv/pipeline: use unsigned long long constant to check enable vertex inputs | Juan A. Suarez Romero | 2017-07-14 | 1 | -1/+1
    When initializing the ANV pipeline, one of the tasks is checking which vertex inputs are enabled. This is done by checking the enabled bits in inputs_read.

    But the mask to use is computed doing `(1 << (VERT_ATTRIB_GENERIC0 + desc->location))`. The problem here is that if location is 15 or greater, the sum is 32 or greater. But C treats 1 as a 32-bit integer, which means the shifted bit is out of range and thus the full value is 0.

    Thus, use 1ull, which is an unsigned long long value.

    This fixes: dEQP-VK.pipeline.vertex_input.max_attributes.16_attributes.binding_one_to_one.interleaved

    v2: use 1ull instead of BITFIELD64_BIT() (Matt Turner)

    Reviewed-by: Lionel Landwerlin <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
    Signed-off-by: Juan A. Suarez Romero <[email protected]>
    Cc: [email protected]
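The shift-width problem in this commit can be sketched in a few lines of C. `GENERIC0` stands in for `VERT_ATTRIB_GENERIC0` and its value of 16 is an assumption made for illustration; `attrib_mask` and `input_enabled` are hypothetical helpers, not the driver's functions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value for illustration; stands in for VERT_ATTRIB_GENERIC0. */
#define GENERIC0 16

/* Build the 64-bit mask bit for a vertex input location.
 * 1ull is an unsigned long long, so the shift is carried out in at
 * least 64 bits and bit positions 32..63 are representable.
 * Writing (1 << n) would shift a plain int instead, which is
 * undefined behavior once n reaches the width of int. */
static uint64_t attrib_mask(unsigned location)
{
    return 1ull << (GENERIC0 + location);
}

/* Test whether the given location is set in a 64-bit inputs mask. */
static int input_enabled(uint64_t inputs_read, unsigned location)
{
    return (inputs_read & attrib_mask(location)) != 0;
}
```

The sketch is only valid while `GENERIC0 + location` stays below 64, the width of the mask.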
* i965: Use pushed UBO data in the scalar backend. | Kenneth Graunke | 2017-07-13 | 3 | -1/+64
    This actually takes advantage of the newly pushed UBO data, avoiding pull loads.

    Improves performance in GLBenchmark Manhattan 3.1 by: HSW: ~1%, BDW/SKL/KBL GT2: 3-4%, SKL GT4: 7-8%, APL: 4-5%. (thanks to Eero Tamminen for these numbers)

    shader-db results on Skylake, ignoring programs with spill/fill changes:

    total instructions in shared programs: 13963994 -> 13651893 (-2.24%)
    instructions in affected programs: 4250328 -> 3938227 (-7.34%)
    helped: 28527
    HURT: 0

    total cycles in shared programs: 179808608 -> 172535170 (-4.05%)
    cycles in affected programs: 79720410 -> 72446972 (-9.12%)
    helped: 26951
    HURT: 1248

    LOST: 46
    GAINED: 21

    Many "Deus Ex: Mankind Divided" shaders which already spilled end up spilling a lot more (about 240 programs hurt, 9 helped). The cycle estimator suggests this is still overall a win (-0.23% in cycle counts), presumably because we trade pull loads for fills.

    v2: Drop "PULL" environment variable left in for initial debugging (caught by Matt).

    Reviewed-by: Matt Turner <[email protected]>
* i965: Factor out push locations. | Kenneth Graunke | 2017-07-13 | 2 | -16/+25
    With UBOs, the answer of "have we decided to push this uniform" gets a bit more complicated - for one, we have multiple surfaces.

    This patch refactors things so we can add the new code in a single place.

    Reviewed-by: Matt Turner <[email protected]>
* i965: Push UBO data, but don't use it just yet. | Kenneth Graunke | 2017-07-13 | 2 | -1/+11
    This patch starts uploading UBO data via 3DSTATE_CONSTANT_* packets, and updates the compiler to know that there's extra payload data, so things continue working. However, it still issues pull loads for all data.

    I wanted to separate the two aspects for greater bisectability.

    v2: Update for new intel_bufferobj_buffer parameter.

    Reviewed-by: Matt Turner <[email protected]>
* i965: Select ranges of UBO data to be uploaded as push constants. | Kenneth Graunke | 2017-07-13 | 4 | -0/+312
    This adds a NIR pass that decides which portions of UBOs we should upload as push constants, rather than pull constants.

    v2: Switch to uint16_t for the UBO block number, because we may have a lot of them in Vulkan (suggested by Jason). Add more comments about bitfield trickery (requested by Matt).

    v3: Skip vec4 stages for now. I haven't finished wiring up support in the vec4 backend, and so pushing the data but not using it will just be wasteful.

    Reviewed-by: Matt Turner <[email protected]>
* i965: Switch to absolute addressing for constant buffer 0. | Kenneth Graunke | 2017-07-13 | 1 | -0/+6
    By default, 3DSTATE_CONSTANT_* Constant Buffer 0 is relative to dynamic state base address. This makes it unusable for pushing UBOs. I'd like to be able to use all four push buffers.

    There is a bit in the INSTPM register (or CS_DEBUG_MODE2 on Skylake) which controls whether buffer 0 is relative to dynamic state base address, or simply a normal pointer. Setting that gives us full flexibility.

    We can't currently write this on Haswell and earlier, and will need to update the kernel command parser, and then do the whole version checking song and dance.

    Reviewed-by: Matt Turner <[email protected]>
* aubinator: don't leak fd of opened aubfile | Lionel Landwerlin | 2017-07-13 | 1 | -0/+2
    CID: 1373563
    Signed-off-by: Lionel Landwerlin <[email protected]>
    Reviewed-by: Anuj Phogat <[email protected]>
* anv: don't use strcpy for copying strings | Lionel Landwerlin | 2017-07-13 | 1 | -1/+2
    CID: 1358935
    Signed-off-by: Lionel Landwerlin <[email protected]>
    Reviewed-by: Anuj Phogat <[email protected]>
* intel/compiler: no need to check unsigned is >= 0 | Lionel Landwerlin | 2017-07-13 | 1 | -1/+1
    CID: 1338342
    Signed-off-by: Lionel Landwerlin <[email protected]>
    Reviewed-by: Anuj Phogat <[email protected]>
* intel/compiler: don't check unsigned is >= 0 | Lionel Landwerlin | 2017-07-13 | 1 | -1/+1
    CID: 1224468
    Signed-off-by: Lionel Landwerlin <[email protected]>
    Reviewed-by: Anuj Phogat <[email protected]>
* intel/compiler: remove check unsigned is >= 0 | Lionel Landwerlin | 2017-07-13 | 1 | -1/+1
    By definition, unsigned values are always >= 0.

    CID: 742212
    Signed-off-by: Lionel Landwerlin <[email protected]>
    Reviewed-by: Anuj Phogat <[email protected]>
* isl: use 64bit arithmetic to compute size | Lionel Landwerlin | 2017-07-13 | 1 | -2/+2
    If we allow the size to be more than 2^32, then we should compute it in 64bit arithmetic; otherwise we might run into overflow issues.

    CID: 1412892, 1412891
    Signed-off-by: Lionel Landwerlin <[email protected]>
    Reviewed-by: Anuj Phogat <[email protected]>
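The overflow in question comes from multiplying two 32-bit operands before the result is widened. A minimal sketch, assuming a 32-bit `unsigned int` and hypothetical helper names (`surf_size_bytes`, `surf_size_bytes_wrapping`) that are not taken from isl:

```c
#include <assert.h>
#include <stdint.h>

/* Total size in bytes for a given row pitch and row count.
 * Casting one operand to uint64_t first makes the whole multiply
 * happen in 64-bit arithmetic. */
static uint64_t surf_size_bytes(uint32_t row_pitch, uint32_t rows)
{
    return (uint64_t)row_pitch * rows;
}

/* For contrast: here the multiply is done in 32 bits (assuming a
 * 32-bit unsigned int) and wraps modulo 2^32 before the result is
 * widened for the return value. */
static uint64_t surf_size_bytes_wrapping(uint32_t row_pitch, uint32_t rows)
{
    return row_pitch * rows;
}
```

With a pitch of 65536 bytes and 65536 rows, the first helper yields 2^32 while the second wraps to 0.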
* intel/isl: Add a helper to convert tilings from ISL to i915 | Jason Ekstrand | 2017-07-12 | 2 | -0/+28
    Reviewed-by: Topi Pohjolainen <[email protected]>
    Reviewed-by: Chad Versace <[email protected]>
* intel/isl: Add basic modifier introspection | Jason Ekstrand | 2017-07-12 | 4 | -0/+83
    Reviewed-by: Topi Pohjolainen <[email protected]>
    Reviewed-by: Chad Versace <[email protected]>
* intel/compiler: Don't use opt_sampler_eot() optimization on gen10+ | Anuj Phogat | 2017-07-12 | 1 | -1/+1
    This optimization has been removed on gen10+.

    Signed-off-by: Anuj Phogat <[email protected]>
    Reviewed-by: Lionel Landwerlin <[email protected]>
* intel: Move the DRM uapi headers to a non-Intel location. | Eric Anholt | 2017-07-12 | 10 | -3566/+2
    I want to remove vc4's dependency on headers from libdrm as well, but storing multiple copies of drm_fourcc.h in our tree would be silly.

    v2: Update Android.mk as well, move distcheck drm*.h references to top-level noinst_HEADERS.

    Reviewed-by: Lionel Landwerlin <[email protected]> (v1)
    Reviewed-by: Daniel Stone <[email protected]> (v1)
    Reviewed-by: Rob Herring <[email protected]>
* anv: Round u_vector element sizes to a power of two | Jason Ekstrand | 2017-07-12 | 1 | -2/+3
    This fixes 32-bit builds of the driver. Commit 08413a81b93dc537fb0c3 changed things so that we now put struct anv_states in the u_vector for binding tables. On 64-bit builds, sizeof(struct anv_state) is a power of two but it isn't on 32-bit builds.

    Fixes: 08413a81b93dc537fb0c34327ad162f07e8c3427
    Reviewed-by: Kenneth Graunke <[email protected]>
    Cc: [email protected]
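Rounding an element size up to the next power of two can be done with a standard bit-smearing trick. The sketch below is a stand-alone illustration; `round_up_pow2` and `is_pow2` are hypothetical names, and the actual patch may rely on an existing Mesa utility instead.

```c
#include <assert.h>
#include <stdint.h>

/* Return 1 if v is a power of two, 0 otherwise. */
static int is_pow2(uint32_t v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

/* Round v up to the next power of two. Valid for 1 <= v <= 2^31.
 * After the decrement, the shifts copy the highest set bit into
 * every lower position, so adding 1 yields a single set bit. */
static uint32_t round_up_pow2(uint32_t v)
{
    v--;
    v |= v >> 1;
    v |= v >> 2;
    v |= v >> 4;
    v |= v >> 8;
    v |= v >> 16;
    return v + 1;
}
```

A value that is already a power of two is returned unchanged, so the rounding is safe to apply unconditionally.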
* intel: add number of subslices to device info | Lionel Landwerlin | 2017-07-11 | 2 | -8/+54
    We could have used a single integer to store that value, but Cannonlake has a different number of subslices per slice depending on the GT.

    v2: Add CFL subslice numbers (Lionel)

    Signed-off-by: Lionel Landwerlin <[email protected]>
    Reviewed-by: Ben Widawsky <[email protected]>
* intel: Fix clflushing on modern (Baytrail+) Atom CPUs. | Kenneth Graunke | 2017-07-10 | 1 | -0/+12
    Thanks to Chris Wilson for pointing this out.

    Reviewed-by: Daniel Vetter <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
    Acked-by: Lionel Landwerlin <[email protected]>
* intel: Move clflush helpers from anv to common/gen_clflush.h. | Kenneth Graunke | 2017-07-10 | 7 | -34/+63
    I want to use these in the OpenGL driver as well.

    v2: Add to COMMON_FILES in Makefile.sources (caught by Emil)

    Reviewed-by: Daniel Vetter <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* anv: Stop setting domains to RENDER on EXEC_OBJECT_WRITE | Jason Ekstrand | 2017-07-10 | 1 | -5/+2
    The reason we were doing this was to ensure that the kernel did the appropriate cross-ring synchronization and flushing. However, the kernel only looks at EXEC_OBJECT_WRITE to determine whether or not to insert a fence. It only cares about the domain for determining whether or not it needs to clflush the BO before using it for scanout, but the domain automatically gets set to RENDER internally by the kernel if EXEC_OBJECT_WRITE is set.

    Reviewed-by: Chris Wilson <[email protected]>
* Revert "intel/isl: Only create a CCS buffer if the image supports rendering" | Nanley Chery | 2017-07-07 | 1 | -1/+1
    This reverts commit 8aaa13467dc289d35dc7900ab9fab9a7689c4178, which was based on an incorrect assumption. Unlike the restriction placed on image views in the Vulkan API, OpenGL allows you to render to texture views whose formats differ from the originals.

    Bugzilla: https://bugzilla.freedesktop.org/show_bug.cgi?id=101677
* intel: common: Fix link failure with standalone Android build | Tomasz Figa | 2017-07-05 | 1 | -0/+5
    Some reshuffling in the Makefiles under src/intel resulted in Android libraries being no longer linked with code using src/intel/common/gen_debug.h that contains references to functions exported by those libraries (namely ALOGW macro, which is currently resolved into a call to __android_log_print() from cutils).

    Fix the build by taking into account ANDROID_CFLAGS and ANDROID_LIBS for the affected module on Android NDK builds.

    Fixes: d5b355ce5fd ("i965: Move intel_debug.h to intel/common/gen_debug.h")
    Signed-off-by: Tomasz Figa <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
* anv: check support for enabled features in vkCreateDevice() | Samuel Iglesias Gonsálvez | 2017-07-03 | 1 | -0/+13
    From Vulkan spec, 4.2.1. "Device Creation":

        "vkCreateDevice verifies that extensions and features requested in the ppEnabledExtensionNames and pEnabledFeatures members of pCreateInfo, respectively, are supported by the implementation."

    Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
    Reviewed-by: Lionel Landwerlin <[email protected]>
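A feature check of the kind the spec quote describes can be sketched as a field-by-field comparison of two boolean structs. Everything below is an assumption made for illustration: `bool32`, `struct features` and its three fields, and `features_supported` are hypothetical stand-ins, and the log above does not show how the driver performs the check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a 32-bit boolean type. */
typedef uint32_t bool32;

/* Hypothetical, tiny stand-in for a features struct whose members
 * are all 32-bit booleans. */
struct features {
    bool32 robust_buffer_access;
    bool32 geometry_shader;
    bool32 sparse_binding;
};

/* Return 1 if every feature requested in `enabled` is also set in
 * `supported`, 0 otherwise. Because all members share one type, the
 * struct can be walked as a flat array of bool32 values. */
static int features_supported(const struct features *supported,
                              const struct features *enabled)
{
    const bool32 *s = (const bool32 *)supported;
    const bool32 *e = (const bool32 *)enabled;
    size_t count = sizeof(struct features) / sizeof(bool32);

    for (size_t i = 0; i < count; i++) {
        if (e[i] && !s[i])
            return 0; /* requested but not supported */
    }
    return 1;
}
```

A caller would typically reject device creation with an error when this returns 0, and treat a missing features pointer as "nothing requested".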