summaryrefslogtreecommitdiffstats
path: root/src/intel
Commit message (Collapse)AuthorAgeFilesLines
* anv: Use build-id for pipeline cache UUID.Matt Turner2017-02-152-21/+8
| | | | | | | | The --build-id=... ld flag has been present since binutils-2.18, released 28 Aug 2007. Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* anv: wsi: report presentation error per image requestLionel Landwerlin2017-02-151-8/+15
| | | | | | | | | | | | | | | | vkQueuePresentKHR() takes VkPresentInfoKHR pointer and includes a pResults fields which must holds the results of all the images requested to be presented. Currently we're not filling this field. Also as a side effect we probably want to go through all the images rather than stopping on the first error. This commit also makes the QueuePresentKHR() implementation return the first error encountered. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Cc: "17.0" <[email protected]>
* anv: Use vk_foreach_struct for handling extension structsJason Ekstrand2017-02-143-27/+21
| | | | Reviewed-by: Dave Airlie <[email protected]>
* anv: Implement the Skylake stencil PMA optimizationJason Ekstrand2017-02-143-6/+158
| | | | | | | | | | Unfortunately, this doesn't substantially improve the performance of any known apps. With Dota 2 on my Sky Lake gt4, it seems help by somewhere between 0% and 1% but there's enough noise that it's hard to get a clear picture. Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* genxml: Add the CACHE_MODE_0 register on gen9Jason Ekstrand2017-02-141-0/+28
| | | | Reviewed-by: Lionel Landwerlin <[email protected]>
* anv/pipeline: Be smarter about depth/stencil stateJason Ekstrand2017-02-141-34/+141
| | | | | | | | | It's a bit hard to measure because it almost gets lost in the noise, but this seemed to help Dota 2 by a percent or two on my Broadwell GT3e desktop. Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* anv/pipeline: Make a copy of VkPipelineDepthStencilStateCreateinfoJason Ekstrand2017-02-141-16/+18
| | | | | Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* anv: Add support for the PMA fix on BroadwellJason Ekstrand2017-02-149-2/+221
| | | | | | | | | | This helps Dota 2 on Broadwell by 8-9%. I also hacked up the driver and used the Sascha "shadowmapping" demo to get some results. Setting uses_kill to true dropped the framerate on the demo by 25-30%. Enabling the PMA fix brought it back up to around 90% of the original framerate. Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* genxml: Add the CACHE_MODE_1 register on gen8Jason Ekstrand2017-02-141-0/+21
| | | | Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Disable stencil writes when both write masks are zeroJason Ekstrand2017-02-144-2/+17
| | | | | | | | | | | | Vulkan doesn't have a stencilWriteEnable bit like it does for depth. Instead, you have a stencil mask. Since the stencil mask is handled as dynamic state, we have to handle it later during command buffer construction. This, combined with a later commit, seems to help Dota2 on my Broadwell GT3e desktop by a couple percent because it allows the hardware to move the depth and stencil writes to early in more cases. Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* anv/entrypoints: Only generate entrypoints for supported featuresJason Ekstrand2017-02-141-3/+41
| | | | | | | | | This changes the way anv_entrypoints_gen.py works from generating a table containing every single entrypoint in the XML to just the ones that we actually need. There's no reason for us to burn entrypoint table space on a bunch of NV extensions we never plan to implement. Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: fix Get*MemoryRequirements for !LLCConnor Abbott2017-02-141-6/+8
| | | | | | | | | | Even though we supported both coherent and non-coherent memory types, we effectively forced apps to use the coherent types by accident. Found by inspection, only compile tested. Signed-off-by: Connor Abbott <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Cc: "17.0" <[email protected]>
* anv: Add support for shaderStorageImageWriteWithoutFormatAlex Smith2017-02-146-23/+65
| | | | | | | | | | | | | | | | | | | | | | This allows shaders to write to storage images declared with unknown format if they are decorated with NonReadable ("writeonly" in GLSL). Previously an image view would always use a lowered format for its surface state, however when a shader declares a write-only image, we should use the real format. Since we don't know at view creation time whether it will be used with only write-only images in shaders, create two surface states using both the original format and the lowered format. When emitting the binding table, choose between the states based on whether the image is declared write-only in the shader. Tested on both Sascha Willems' computeshader sample (with the original shaders and ones modified to declare images writeonly and omit their format qualifiers) and on our own shaders for which we need support for this. Signed-off-by: Alex Smith <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv/apply_pipeline_layout: Set image.write_only to falseJason Ekstrand2017-02-141-0/+12
| | | | | | | | | | This makes our driver robust to changes in spirv_to_nir which would set this flag on the variable. Right now, our driver relies on spirv_to_nir *not* setting var->data.image.write_only for correctness. Any patch which implements the shaderStorageImageWriteWithoutFormat will need to effectively revert this commit. Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/isl: Add format metadata for typed reads/writesJason Ekstrand2017-02-142-251/+291
| | | | | | | | This adds two columns to the format table as well as two helpers for determining whether or not a given format is supported for typed reads and writes. Reviewed-by: Lionel Landwerlin <[email protected]>
* anv/cmd_buffer: Return a VkResult from verify_cmd_parserJason Ekstrand2017-02-141-7/+7
| | | | | | This fixes a "statement with no effect" compiler warning Reviewed-by: Lionel Landwerlin <[email protected]>
* anv/blorp: Don't sanitize the swizzle for blorp_clearJason Ekstrand2017-02-131-2/+1
| | | | | | | | | | | | | BLORP is now smart enough to handle any swizzle (even those that contain ZERO or ONE) in a reasonable manner. Just let BLORP handle it. This fixes the following Vulkan CTS tests on Haswell: - dEQP-VK.api.image_clearing.clear_color_image.1d_b4g4r4a4_unorm_pack16 - dEQP-VK.api.image_clearing.clear_color_image.2d_b4g4r4a4_unorm_pack16 - dEQP-VK.api.image_clearing.clear_color_image.3d_b4g4r4a4_unorm_pack16 Reviewed-by: Juan A. Suarez Romero <[email protected]> Cc: "17.0" <[email protected]>
* intel/blorp: Swizzle clear colors on the CPUJason Ekstrand2017-02-131-18/+30
| | | | | | | | | It's trivial to swizzle clear colors on the CPU, easily deals with the hardware restrictions for render target swizzles, and makes swizzled clears work on all hardware as opposed to just HSW+. Reviewed-by: Juan A. Suarez Romero <[email protected]> Cc: "17.0" <[email protected]>
* intel/blorp: do not return const data by get_px_size_sa()Emil Velikov2017-02-101-1/+1
| | | | | | | | | | | | | Not much point in the const qualifier since we provide a copy to the user. Resolves the following -Wignored-qualifiers warning. src/intel/blorp/blorp_blit.c:1857:8: warning: 'const' type qualifier on return type has no effect [-Wignored-qualifiers] v2: keep const qualifier of local variable. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv/pipeline: set ThreadDispatchEnable conditionallyJuan A. Suarez Romero2017-02-061-23/+26
| | | | | | | | Set 3DSTATE_WM/ThreadDispatchEnable bit on/off based on the same conditions as used in the GL version. Signed-off-by: Juan A. Suarez Romero <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* anv/blorp: Disable resolves for transparent black clearsNanley Chery2017-02-031-2/+8
| | | | | Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv/cmd_buffer: Don't temporarily enable CCS_E within a render passNanley Chery2017-02-031-2/+13
| | | | | | | | | | | | | | | | Compressing a render target and decompressing it in the same single-subpass render pass may waste bandwidth. While this may be beneficial in some circumstances, it does not help in all. Reclaims about 1.95% FPS for Dota 2 on some configurations. v2 (Jason Ekstrand): - Provide a more thorough comment - Enable CCS_D for input attachments v3 (Jason Ekstrand): - Provide performance numbers Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* intel/isl: Add a better comment for format_supports_ccs_eJason Ekstrand2017-02-021-0/+6
| | | | | Reviewed-by: Chad Versace <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
* anv: Remove the finishme for CCS_E with storage imagesJason Ekstrand2017-02-021-14/+7
| | | | | | | | | The data port can't handle CCS at all so replace the finishme with better comments. Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Chad Versace <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
* intel/isl: Assert that we don't use CCS for storage imagesJason Ekstrand2017-02-021-0/+6
| | | | | | | | | | I enabled CCS for storage images in the Vulkan driver and ran it through the CTS. It didn't result in any hangs but it demonstrated that the data port cannot handle CCS. Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Chad Versace <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
* intel/isl: Add a formats_are_ccs_e_compatible helperJason Ekstrand2017-02-023-0/+41
| | | | | | Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Chad Versace <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
* intel/isl: Add a format_supports_ccs_d helperJason Ekstrand2017-02-022-0/+24
| | | | | | | | | Nothing uses this yet but it serves as a nice bit of documentation that's relatively easy to find. Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Chad Versace <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
* intel/isl: Rename supports_lossless_compression to supports_ccs_eJason Ekstrand2017-02-024-9/+8
| | | | | | | | | | | | | The term "lossless compression" could potentially mean multisample color compression, single-sample color compression or HiZ because they are all lossless. The term CCS_E, however, has a very precise meaning; in ISL and is only used to refer to single-sample color compression. It's also much shorter which is nice. Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Chad Versace <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* anv/pass: Store the depth-stencil attachment's last subpass indexNanley Chery2017-02-021-0/+1
| | | | | | | | | | | Commit 968ffd6c868af7226e8f889573eef709888151cb stored the last subpass index of all the attachments but that of the depth-stencil attachment. This could cause depth buffers used in multiple subpasses not to be in the requested final layout. Fix this error. Cc: "17.0" <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Signed-off-by: Nanley Chery <[email protected]>
* anv: enable VK_KHR_shader_draw_parametersLionel Landwerlin2017-02-022-0/+5
| | | | | | | | | | Enables 10 tests from: dEQP-VK.draw.shader_draw_parameters.* Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: emit DrawID if neededLionel Landwerlin2017-02-023-7/+63
| | | | | | | | v2: use define for buffer ID (Jason) Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: always allocate a vertex element with vertexid or instanceidLionel Landwerlin2017-02-021-12/+0
| | | | | | | | | | | | | | | | | | | | | | | | | Up to now on Gen8+ we only allocated a vertex element for gl_InstanceIndex or gl_VertexIndex when a vertex shader uses gl_BaseInstanceARB or gl_BaseVertexARB. This is because we would configure the VF_SGVS packet to make the VF unit write the gl_InstanceIndex & gl_VertexIndex values right behind the values computed from the vertex buffers. In the next commit we will also write the gl_DrawIDARB value. Our backend expects to pull the gl_DrawIDARB value from the element following the element containing gl_InstanceIndex, gl_VertexIndex, gl_BaseInstanceARB and gl_BaseVertexARB (see vec4_vs_visitor::setup_attributes). Therefore we need to allocate an element for the SGVS elements as long as at least one of the SGVS element is read by the shader. Otherwise our shader will use a gl_DrawIDARB value pulled from the URB one element too far (most likely garbage). v2: Fix my english (Lionel) Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: move BaseVertexID/BaseInstanceID vertex buffer index to 31Lionel Landwerlin2017-02-023-2/+4
| | | | | | | | v2: use define for buffer ID (Jason) Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: limit vertex buffers to 31Lionel Landwerlin2017-02-023-4/+4
| | | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Don't use bogus alpha swizzlesJason Ekstrand2017-02-013-3/+22
| | | | | | | | | | | For RGB formats in Vulkan, we use the corresponding RGBA format with a swizzle of RGB1. While this swizzle is exactly what we want for texturing, it's not allowed for rendering according to the docs. While we haven't been getting hangs or anything, we should probably obey the docs. This commit just sanitizes all render swizzles so that the alpha channel maps to ALPHA. Reviewed-by: Anuj Phogat <[email protected]>
* isl: Add assertions for render target swizzle restrictionsJason Ekstrand2017-02-011-0/+32
| | | | Reviewed-by: Anuj Phogat <[email protected]>
* android: add vulkan build for intelTapani Pälli2017-02-012-0/+227
| | | | | | | | | | | | | | | | | | | | | | | | | fixes to issues spotted by Emil Velikov: - set ANV_TIMESTAMP corretly - fix typo with VULKAN_GEM_FILES v2: update to use Makefile.sources under vulkan instead of having own v3: update to changes to generate from vk.xml (commit c7fc310) v4: remove 'hw' relative path cleanups, remove unnecessary cruft review from Emil Velikov: - move to vulkan folder - remove timestamp gen, no longer necessary - more cleanups Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* anv: Improve flushing around STATE_BASE_ADDRESSJason Ekstrand2017-01-311-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | It is not clear from the docs exactly how pipelined STATE_BASE_ADDRESS actually is. We know from experimentation that we need to flush the render cache prior to emitting STATE_BASE_ADDRESS and invalidate the texture cache afterwards. The only thing the PRM says is that, on gen8+ we're supposed to invalidate the state cache after STATE_BASE_ADDRESS but experimentation has indicated that doing so does nothing whatsoever. Since we don't really know, let's do just a bit more flushing in the hopes that this won't be a problem again. In particular: 1) Do a CS stall before we emit STATE_BASE_ADDRESS since we don't really know whether or not it's pipelined. 2) Do a data cache flush in case what runs before STATE_BASE_ADDRESS is a compute shader. 3) Invalidate the state and constant caches after STATE_BASE_ADDRESS because the state may be getting cached there (we don't really know). Reported-by: Mark Janes <[email protected]> Tested-by: Mark Janes <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: "13.0 17.0" <[email protected]>
* anv: Flush render cache before STATE_BASE_ADDRESS on gen7Jason Ekstrand2017-01-311-3/+0
| | | | | | | | | | | We had no good reason for *not* doing this on gen7 before but we didn't know it was needed. Recently, when trying update to Vulkan CTS version 1.0.2 in our CI system, Mark discovered GPU hangs on Haswell that appear to be STATE_BASE_ADDRESS related. This commit fixes them. Reported-by: Mark Janes <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: "13.0 17.0" <[email protected]>
* isl/formats: Only advertise sampling for A4B4G4R4 on BroadwellJason Ekstrand2017-01-311-2/+3
| | | | | | | | | This causes hangs on Broadwell if you try to render to it. I have no idea how we managed to not hit this earlier. Tested-by: Mark Janes <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: "13.0 17.0" <[email protected]>
* intel/blorp: Handle clearing of A4B4G4R4 on all platformsJason Ekstrand2017-01-311-0/+23
| | | | | | Tested-by: Mark Janes <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: "13.0 17.0" <[email protected]>
* anv/cmd_buffer: Use the proper depth input attachment surface stateNanley Chery2017-01-311-6/+6
| | | | | | | | | | | | | | | | | | | | | | | Commit 2852efcda40274acf3272611c6a3b7731523a72d moved the location of the depth input attachment surface state from the render pass to the image view, but failed to update the surface state location used when emitting the binding table. Fix this by loading the surface state from the correct location. Fixes: dEQP-VK.renderpass.formats.d16_unorm.input.* dEQP-VK.renderpass.formats.d24_unorm_s8_uint.input.* dEQP-VK.renderpass.formats.d32_sfloat.input.* dEQP-VK.renderpass.formats.x8_d24_unorm_pack32.input.* dEQP-VK.renderpass.attachment_allocation.input_output.93 dEQP-VK.renderpass.attachment_allocation.input_output.92 dEQP-VK.renderpass.attachment_allocation.input_output.82 dEQP-VK.renderpass.attachment_allocation.input_output.46 Cc: "17.0" <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Signed-off-by: Nanley Chery <[email protected]>
* anv: Advertise API version 1.0.39Jason Ekstrand2017-01-271-1/+1
| | | | | | I'm pretty sure we've kept up with the bug fixes. Reviewed-by: Iago Toral Quiroga <[email protected]>
* anv: add missing extension errors in vk_errorf()Eric Engestrom2017-01-271-0/+5
| | | | | Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: add missing core errors in vk_errorf()Eric Engestrom2017-01-271-0/+3
| | | | | Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: don't assert on out of memory descriptor pool in debug modeLionel Landwerlin2017-01-271-0/+2
| | | | | | | | Fixes: dEQP-VK.api.descriptor_pool.out_of_pool_memory Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* intel/blorp/dbg: Name blit shaders for easy recognition in dumpsTopi Pohjolainen2017-01-271-0/+2
| | | | | | | | Blorp clears already have an equivalent. Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Signed-off-by: Topi Pohjolainen <[email protected]>
* anv: fix descriptor pool internal size allocationLionel Landwerlin2017-01-261-4/+4
| | | | | | | | | | | | | | | The size of the pool is slightly smaller than the size of the structure containing the whole pool. We need to take that into account on when setting up the internals. Fixes a crash due to out of bound memory access in: dEQP-VK.api.descriptor_pool.out_of_pool_memory v2: Drop debug traces (Lionel) Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Cc: "17.0 13.0" <[email protected]>
* anv/lower_input_attachments: honor sample index parameter to subpassLoad()Iago Toral Quiroga2017-01-261-4/+1
| | | | | | | | | | | | | | | | | According to GL_KHR_vulkan_glsl, the signature of subpassLoad() is: gvec4 subpassLoad(gsubpassInput subpass); gvec4 subpassLoad(gsubpassInputMS subpass, int sample); So the multisampled case always receives an explicit sample index that we should use. The current implementation was ignoring this parameter and using gl_SampleID value instead. Fixes: dEQP-VK.pipeline.multisample_shader_builtin.sample_id.* Reviewed-by: Jason Ekstrand <[email protected]> Cc: "17.0" <[email protected]>
* anv: Implement VK_KHR_get_physical_device_properties2Chad Versace2017-01-253-0/+161
| | | | | Reviewed-by: Jason Ekstranad <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>