path: root/src/intel/vulkan
Commit message    Author    Age    Files    Lines
* anv/cmd_buffer: Disable CCS on BDW input attachmentsNanley Chery2017-04-172-30/+13
  The description under RENDER_SURFACE_STATE::RedClearColor says,

    For Sampling Engine Multisampled Surfaces and Render Targets:
    Specifies the clear value for the red channel.

    For Other Surfaces:
    This field is ignored.

  This means that the sampler on BDW doesn't support CCS.

  Cc: Samuel Iglesias Gonsálvez <[email protected]>
  Cc: Jordan Justen <[email protected]>
  Cc: <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
  Signed-off-by: Nanley Chery <[email protected]>
* anv: blorp: flush memory after copyLionel Landwerlin2017-04-171-2/+2
| | | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Cc: "13.0 17.0" <[email protected]>
* anv: Add the pci_id into the shader cache UUIDJason Ekstrand2017-04-141-5/+15
| | | | | | | | | | This prevents a user from using a cache created on one hardware generation on a different one. Of course, with Intel hardware, this requires moving their drive from one machine to another but it's still possible and we should prevent it. Reviewed-by: Chad Versace <[email protected]> Cc: [email protected]
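A minimal sketch of the idea, not the driver's actual code: the cache UUID is derived from both a build identifier and the device's PCI ID, so a cache written on one hardware generation never matches on another. The names `make_cache_uuid` and `fnv1a64` are hypothetical, and FNV-1a is only a dependency-free stand-in for whatever hash the driver really uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CACHE_UUID_SIZE 16

/* FNV-1a: a simple stand-in hash, chosen only to keep the sketch
 * self-contained. It is not a claim about the driver's real hash. */
static uint64_t fnv1a64(uint64_t h, const void *data, size_t len)
{
    const uint8_t *p = data;
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 0x100000001b3ULL;
    }
    return h;
}

/* Hypothetical helper: mix the build identifier and the PCI ID into one
 * 16-byte UUID. Changing either input changes the UUID. */
static void make_cache_uuid(const void *build_id, size_t build_id_len,
                            uint16_t pci_id, uint8_t uuid[CACHE_UUID_SIZE])
{
    assert(build_id != NULL && build_id_len > 0);

    uint64_t lo = fnv1a64(0xcbf29ce484222325ULL, build_id, build_id_len);
    lo = fnv1a64(lo, &pci_id, sizeof(pci_id));
    uint64_t hi = fnv1a64(lo, "uuid-hi", 7); /* second half, derived from the first */

    memcpy(uuid, &lo, sizeof(lo));
    memcpy(uuid + sizeof(lo), &hi, sizeof(hi));
}
```

A cache loader would compare the stored UUID against the one computed for the running device and discard the cache on any mismatch.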
* anv/blorp: Properly handle VK_ATTACHMENT_UNUSEDJason Ekstrand2017-04-141-5/+22
| | | | | | | | | | | | | The Vulkan driver was originally written under the assumption that VK_ATTACHMENT_UNUSED was basically just for depth-stencil attachments. However, the way things fell together, VK_ATTACHMENT_UNUSED can be used anywhere in the subpass description. The blorp-based clear and resolve code has a bunch of places where we walk lists of attachments and we weren't handling VK_ATTACHMENT_UNUSED everywhere. This commit should fix all of them. Reviewed-by: Nanley Chery <[email protected]> Cc: <[email protected]>
* anv/cmd_buffer: Use the null surface state for ATTACHMENT_UNUSEDJason Ekstrand2017-04-141-2/+14
| | | | | Reviewed-by: Nanley Chery <[email protected]> Cc: <[email protected]>
* anv/cmd_buffer: Always set up a null surface stateJason Ekstrand2017-04-141-31/+19
| | | | | | | | | | We're about to start requiring it in yet another case and calculating exactly when one is needed is starting to get prohibitively expensive. A single surface state doesn't take up that much space so we may as well create one all the time. Reviewed-by: Nanley Chery <[email protected]> Cc: <[email protected]>
* anv/cmd_buffer: Flush the VF cache at the top of all primariesJason Ekstrand2017-04-141-0/+12
| | | | | Reviewed-by: Lionel Landwerlin <[email protected]> Cc: "13.0 17.0" <[email protected]>
* anv/blorp: Flush the texture cache in UpdateBufferJason Ekstrand2017-04-141-0/+7
| | | | | Reviewed-by: Lionel Landwerlin <[email protected]> Cc: "13.0 17.0" <[email protected]>
* anv: Limit VkDeviceMemory objects to 2GBJason Ekstrand2017-04-141-0/+20
| | | | Reviewed-by: Juan A. Suarez Romero <[email protected]>
* anv: Only define wsi_cbs when VK_USE_PLATFORM_WAYLAND_KHR definedMatt Turner2017-04-121-0/+2
* anv: remove needless VALGRIND_MAKE_MEM_DEFINEDJuan A. Suarez Romero2017-04-111-1/+0
| | | | | | This is already invoked in the following VG_NOACCESS_READ() call. Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Use ISL for emitting depth/stencil/hizJason Ekstrand2017-04-101-179/+39
| | | | Reviewed-by: Topi Pohjolainen <[email protected]>
* anv: Use subpass dependencies for flushesJason Ekstrand2017-04-072-80/+18
| | | | | | | Instead of figuring it all out ourselves, just use the information given to us by the client. Reviewed-by: Nanley Chery <[email protected]>
* anv/pass: Record required pipe flushesJason Ekstrand2017-04-072-0/+90
| | | | Reviewed-by: Nanley Chery <[email protected]>
* anv/pass: Use anv_multialloc for allocating the anv_passJason Ekstrand2017-04-072-63/+44
| | | | Reviewed-by: Nanley Chery <[email protected]>
* anv/descriptor_set: Use anv_multialloc for descriptor set layoutsJason Ekstrand2017-04-071-11/+10
| | | | Reviewed-by: Nanley Chery <[email protected]>
* anv: Add a helper for doing mass allocationsJason Ekstrand2017-04-071-0/+96
| | | | | | | | | | | We tend to try to reduce the number of allocation calls the Vulkan driver uses by doing a single allocation whenever possible for a data structure. While this has certain downsides (usually code complexity), it does mean error handling and cleanup is much easier. This commit adds a nice little helper struct for getting rid of some of that complexity. Reviewed-by: Nanley Chery <[email protected]>
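The single-allocation pattern described above can be sketched roughly as follows. This is an illustration of the technique under assumed names (`struct multialloc`, `multialloc_add`, `multialloc_alloc`), not the helper the commit adds: each requested sub-object records an offset, and one `malloc` call then serves all of them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MULTIALLOC_MAX 8

/* Hypothetical helper struct: collects sub-allocations, then satisfies
 * them all from a single block. */
struct multialloc {
    size_t size;       /* running total, including alignment padding */
    unsigned count;    /* number of recorded sub-allocations */
    struct {
        void **ptr;    /* where to store the sub-object's address */
        size_t offset; /* offset of the sub-object inside the block */
    } items[MULTIALLOC_MAX];
};

/* Record one sub-allocation. `align` must be a power of two and is
 * assumed not to exceed the alignment malloc() already guarantees. */
static void multialloc_add(struct multialloc *ma, void **ptr,
                           size_t size, size_t align)
{
    assert(ma->count < MULTIALLOC_MAX);
    assert(align != 0 && (align & (align - 1)) == 0);

    size_t offset = (ma->size + align - 1) & ~(align - 1);
    ma->items[ma->count].ptr = ptr;
    ma->items[ma->count].offset = offset;
    ma->count++;
    ma->size = offset + size;
}

/* Allocate everything at once and patch up the recorded pointers.
 * Returns the base of the block (free it with free()), or NULL. */
static void *multialloc_alloc(struct multialloc *ma)
{
    char *base = malloc(ma->size ? ma->size : 1);
    if (base == NULL)
        return NULL;

    for (unsigned i = 0; i < ma->count; i++)
        *ma->items[i].ptr = base + ma->items[i].offset;
    return base;
}
```

With this shape there is exactly one failure point to check and one `free()` on cleanup, which is the error-handling benefit the commit message mentions.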
* anv: Add helpers for converting access flags to pipe bitsJason Ekstrand2017-04-072-45/+62
| | | | Reviewed-by: Nanley Chery <[email protected]>
* anv/query: Use snooping on !LLC platformsJason Ekstrand2017-04-071-13/+11
  Commit b2c97bc789198427043cd902bc76e194e7e81c7d, which made us start using a busy-wait for individual query results, also messed up cache flushing on !LLC platforms. For one thing, I forgot the mfence after the clflush, so memory access wasn't properly getting fenced. More importantly, we were clflushing the whole query range, then waiting for individual queries, and then trying to read the results without clflushing again. Getting the clflushing both correct and efficient is very subtle and painful. Instead, let's side-step the problem by just snooping.

  Reviewed-by: Chris Wilson <[email protected]>
* anv: provide anv_gem_busy() stub for the testsEmil Velikov2017-04-071-0/+6
  Otherwise linking may fail.

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100600
  Fixes: f195d40eca4 ("anv/device: Add a helper for querying whether a BO is busy")
  Signed-off-by: Emil Velikov <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Juan A. Suarez Romero <[email protected]>
  Tested-by: Vinson Lee <[email protected]>
* anv/blorp: sample input attachments with resolves on BDWSamuel Iglesias Gonsálvez2017-04-071-0/+11
  On Broadwell we still need to do a resolve between the subpass that writes and the subpass that reads when there is a self-dependency, because the HW cannot see fast-clears and works on the render cache as if it were a regular, non-fast-cleared surface.

  Fixes 16 tests on BDW: dEQP-VK.renderpass.formats.*.input.clear.store.self_dep*

  Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* anv/query: Busy-wait for available query entriesJason Ekstrand2017-04-051-6/+56
| | | | | | | | | | | | | | | | | | | Before, we were just looking at whether or not the user wanted us to wait and waiting on the BO. Some clients, such as the Serious engine, use a single query pool for hundreds of individual query results where the writes for those queries may be split across several command buffers. In this scenario, the individual query we're looking for may become available long before the BO is idle so waiting on the query pool BO to be finished is wasteful. This commit makes us instead busy-loop on each query until it's available. This significantly reduces pipeline bubbles and improves performance of The Talos Principle on medium settings (where the GPU isn't overloaded with drawing) by around 20% on my SkyLake gt4. Reviewed-by: Chris Wilson <[email protected]> Tested-by: Eero Tamminen <[email protected]> Tested-by: Grazvydas Ignotas <[email protected]>
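As a rough sketch of per-query polling (not the driver's implementation), the loop below spins on one slot's availability word instead of waiting for the whole pool's BO to go idle. The slot layout, the `wait_for_query` name, and the `max_spins` bound are assumptions made for illustration; the bound exists only so the sketch always terminates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout of one query slot in the pool's buffer. */
struct query_slot {
    volatile uint64_t available; /* set non-zero once the result is written */
    uint64_t result;
};

/* Spin on a single slot until it becomes available or the spin budget
 * runs out. Returns true and stores the result when it is available. */
static bool wait_for_query(const struct query_slot *slot,
                           uint64_t max_spins, uint64_t *result)
{
    assert(slot != NULL && result != NULL);

    for (uint64_t i = 0; i <= max_spins; i++) {
        if (slot->available) {
            *result = slot->result;
            return true;
        }
    }
    return false; /* still pending: caller may report "not ready" */
}
```

The point of the design is that a query written early in a long chain of command buffers can be returned as soon as its own slot is marked available.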
* anv/device: Add a helper for querying whether a BO is busyJason Ekstrand2017-04-053-6/+47
| | | | | | This is a bit more efficient than using GEM_WAIT with a timeout of 0. Reviewed-by: Chris Wilson <[email protected]>
* anv: provide required gem stubs for the testsEmil Velikov2017-04-051-0/+19
  Introduce stubs to anv_gem_stub.c that match the anv_gem.c ones. Otherwise we may get link-time errors when building the tests.

  v2: Introduce all the missing stubs at once.

  Cc: Jason Ekstrand <[email protected]>
  Cc: Vinson Lee <[email protected]>
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100574
  Fixes: c964f0e485d ("anv: Query the kernel for reset status")
  Fixes: 651ec926fc1 ("anv: Add support for 48-bit addresses")
  Fixes: 060a6434eca ("anv: Advertise larger heap sizes")
  Signed-off-by: Emil Velikov <[email protected]>
  ---
  I've intentionally kept the order identical to that in anv_gem.c. This way we can easily grep & diff in the future ;-)
* anv: Advertise larger heap sizesJason Ekstrand2017-04-043-14/+75
| | | | | | | | | | | Instead of just advertising the aperture size, we do something more intelligent. On systems with a full 48-bit PPGTT, we can address 100% of the available system RAM from the GPU. In order to keep clients from burning 100% of your available RAM for graphics resources, we have a nice little heuristic (which has received exactly zero tuning) to keep things under a reasonable level of control. Reviewed-by: Kristian H. Kristensen <[email protected]>
* anv: Add support for 48-bit addressesJason Ekstrand2017-04-045-0/+54
| | | | | | | | | | | | | | | | | | This commit adds support for using the full 48-bit address space on Broadwell and newer hardware. Thanks to certain limitations, not all objects can be placed above the 32-bit boundary. In particular, general and state base address need to live within 32 bits. (See also Wa32bitGeneralStateOffset and Wa32bitInstructionBaseOffset.) In order to handle this, we add a supports_48bit_address field to anv_bo and only set EXEC_OBJECT_SUPPORTS_48B_ADDRESS if that bit is set. We set the bit for all client-allocated memory objects but leave it false for driver-allocated objects. While this is more conservative than needed, all driver allocations should easily fit in the first 32 bits of address space and keeps things simple because we don't have to think about whether or not any given one of our allocation data structures will be used in a 48-bit-unsafe way. Reviewed-by: Kristian H. Kristensen <[email protected]>
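The per-BO opt-in described above might look roughly like this. The struct and function names are hypothetical, and `SKETCH_EXEC_OBJECT_SUPPORTS_48B_ADDRESS` is a local stand-in for the kernel's `EXEC_OBJECT_SUPPORTS_48B_ADDRESS`, which real code would take from the i915 uAPI header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the kernel flag so the sketch compiles on its own. */
#define SKETCH_EXEC_OBJECT_SUPPORTS_48B_ADDRESS (1ull << 3)

/* Hypothetical, trimmed-down buffer object. */
struct sketch_bo {
    uint64_t size;
    bool supports_48bit_address;
};

/* Client-allocated memory opts in to high addresses; driver-internal
 * allocations conservatively stay below the 32-bit boundary. */
static void sketch_bo_init(struct sketch_bo *bo, uint64_t size,
                           bool client_allocated)
{
    assert(bo != NULL);
    bo->size = size;
    bo->supports_48bit_address = client_allocated;
}

/* Flags to pass for this BO at submission time. */
static uint64_t sketch_exec_object_flags(const struct sketch_bo *bo)
{
    assert(bo != NULL);
    return bo->supports_48bit_address
               ? SKETCH_EXEC_OBJECT_SUPPORTS_48B_ADDRESS
               : 0;
}
```

Defaulting driver allocations to "no" means none of the driver's own allocators has to be audited for 48-bit safety.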
* anv: Replace anv_bo::is_winsys_bo with a uint32_t flagsJason Ekstrand2017-04-043-9/+11
| | | | Reviewed-by: Kristian H. Kristensen <[email protected]>
* anv/blorp: Align vertex buffers to 64BJason Ekstrand2017-04-041-1/+14
| | | | | | | | | | This fixes issues seen when adding support for full 48-bit addresses. The 48-bit addresses themselves have nothing to do with it other than that it caused the kernel to place buffers slightly differently so they interacted differently with the caches. Reviewed-by: Kenneth Graunke <[email protected]> Cc: "13.0 17.0" <[email protected]>
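A minimal sketch of 64-byte alignment, assuming the goal is that no two vertex buffers share a 64B cache line. Whether the driver aligns the offset, the size, or both is not stated in the message, and the helper names here are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Round `value` up to the next multiple of `alignment` (a power of two). */
static uint32_t align_u32(uint32_t value, uint32_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical use: pick the start offset of the next vertex buffer so
 * that it begins on a fresh 64-byte boundary. */
static uint32_t next_vertex_buffer_offset(uint32_t cursor)
{
    return align_u32(cursor, 64);
}
```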
* anv: Query the kernel for reset statusJason Ekstrand2017-04-044-40/+107
  When a client causes a GPU hang (or experiences issues due to a hang in another client) we want to let it know as soon as possible. In particular, if it submits work with a fence and calls vkWaitForFences or vkQueueWaitIdle and it returns VK_SUCCESS, then the client should be able to trust the results of that rendering. In order to provide this guarantee, we have to ask the kernel for context status in a few key locations.

  Reviewed-by: Kenneth Graunke <[email protected]>
* anv: Check for device loss at the end of WaitForFencesJason Ekstrand2017-04-041-5/+14
| | | | | | | It's possible that the device could have been lost while we were waiting. We should let the user know if this has happened. Reviewed-by: Kenneth Graunke <[email protected]>
* anv/pipeline: Properly handle unset gl_Layer and gl_ViewportIndexJason Ekstrand2017-04-041-3/+24
| | | | | | | | | | When the shader does not set one of these values, they are supposed to get a default value of 0. We have hardware bits in 3DSTATE_CLIP for this but haven't been setting them. This fixes the intermittent failure of dEQP-VK.geometry.layered.3d.render_to_default_layer. Reviewed-by: Kenneth Graunke <[email protected]> Cc: "13.0 17.0" <[email protected]>
* anv: Implement VK_KHR_incremental_presentJason Ekstrand2017-04-033-2/+15
| | | | | Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Daniel Stone <[email protected]>
* vulkan/wsi: Plumb present regions through the common codeJason Ekstrand2017-04-031-1/+2
| | | | | | Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Daniel Stone <[email protected]> Acked-by: Dave Airlie <[email protected]>
* anv: change BLOCK_POOL_MEMFD_SIZE to 1GBTapani Pälli2017-03-311-2/+2
  This allows us to run 32-bit Vulkan apps on Android; the ftruncate call would fail on 2GB (the max size being 2GB - 1).

  Signed-off-by: Tapani Pälli <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
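The arithmetic behind the limit can be checked with a short sketch. It assumes a 32-bit signed file offset type, as on a 32-bit build without large-file support; the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A 32-bit signed offset tops out at 2^31 - 1 bytes, i.e. 2GB - 1. */
static bool fits_in_32bit_off_t(uint64_t size)
{
    return size <= (uint64_t)INT32_MAX;
}

/* Largest power-of-two size that still fits in such an offset. */
static uint64_t largest_pow2_size_for_32bit_off_t(void)
{
    uint64_t size = 1;
    while (fits_in_32bit_off_t(size * 2))
        size *= 2;
    assert(fits_in_32bit_off_t(size));
    return size;
}
```

Under that assumption 2GB (2^31) is one byte too large, and 1GB (2^30) is the largest power-of-two size that fits.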
* anv/cmd_buffer: fix host memory leakCraig Stout2017-03-291-1/+9
  push_constants must be freed.

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100452
  Reviewed-by: Jason Ekstrand <[email protected]>
  Cc: "17.0 13.0" <[email protected]>
* anv/batch_chain: Handle another OOM in cmd_buffer_execbufJason Ekstrand2017-03-291-2/+4
| | | | | | Found by inspection while rebasing other patches. Reviewed-by: Topi Pohjolainen <[email protected]>
* anv/cmd_buffer: Refactor flush_pipeline_select_*Jason Ekstrand2017-03-281-26/+16
| | | | | | | While having the _3d and _gpgpu versions is nice, there's no reason why we need to have duplicated logic for tracking the current pipeline. Reviewed-by: Iago Toral Quiroga <[email protected]>
* anv: Flush caches prior to PIPELINE_SELECT on all gensJason Ekstrand2017-03-281-2/+1
| | | | | | | | | | | | | | | | The programming note that says we need to do this still exists in the SkyLake PRM and, from looking at the bspec, seems like it may apply to all hardware generations SNB+. Unfortunately, this isn't particularly clear cut since there is also language in the bspec that says you can skip the flushing and stall to get better throughput. Experimentation with the "Car Chase" benchmark in GL seems to indicate that some form of flushing is still needed. This commit makes us do the full set of flushes regardless of hardware generation. We can always reduce the flushing later. Reported-by: Topi Pohjolainen <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Cc: "17.0 13.0" <[email protected]>
* anv/cmd_buffer: Fix bad indentationJason Ekstrand2017-03-281-24/+25
| | | | | | | | A bunch of code was indented in such a way that it looked like it went with the if statement above but it definitely didn't. Reviewed-by: Iago Toral Quiroga <[email protected]> Cc: "17.0 13.0" <[email protected]>
* anv/cmd_buffer: Apply flush operations prior to executing secondariesJason Ekstrand2017-03-281-0/+5
| | | | | | | This fixes rendering issues in the Vulkan port of skia on some hardware. Reviewed-by: Lionel Landwerlin <[email protected]> Cc: "13.0 17.0" <[email protected]>
* anv/blorp: Use anv_get_layerCount everywhereJason Ekstrand2017-03-281-8/+12
| | | | | Reviewed-by: Lionel Landwerlin <[email protected]> Cc: "13.0 17.0" <[email protected]>
* anv: Make anv_get_layerCount a macroJason Ekstrand2017-03-281-7/+7
| | | | | Reviewed-by: Lionel Landwerlin <[email protected]> Cc: "13.0 17.0" <[email protected]>
* intel: Fix requests for exact surface row pitch (v2)Chad Versace2017-03-282-15/+16
| | | | | | | | | | | | | | All callers of isl_surf_init() that set 'min_row_pitch' wanted to request an *exact* row pitch, as evidenced by nearby asserts, but isl lacked API for doing so. Now that isl has an API for that, update the code to use it. v2: Assert that isl_surf_init() succeeds because the callers assume it. [for jekstrand] Reviewed-by: Nanley Chery <[email protected]> (v1) Reviewed-by: Anuj Phogat <[email protected]> (v1) Reviewed-by: Jason Ekstrand <[email protected]> (v2)
* anv/blorp: Fix a crash in CmdClearColorImageXu Randy2017-03-271-2/+2
  We should use anv_get_layerCount() to access layerCount of VkImageSubresourceRange in anv_CmdClearColorImage and anv_CmdClearDepthStencilImage, which handles the VK_REMAINING_ARRAY_LAYERS (~0) case.

  Test: Sample multithreadcmdbuf from LunarG can run without crash

  Signed-off-by: Xu Randy <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Lionel Landwerlin <[email protected]>
  Cc: "13.0 17.0" <[email protected]>
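A sketch of what such a layer-count helper has to do, written with stand-in names rather than the driver's macro: when the range asks for "all remaining layers" (the ~0 sentinel), the count is derived from the image's array size and the base layer instead of being used directly.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for VK_REMAINING_ARRAY_LAYERS (~0), so the sketch
 * compiles without the Vulkan headers. */
#define SKETCH_REMAINING_ARRAY_LAYERS (~0u)

/* Hypothetical helper: turn a requested layer count into a real one. */
static uint32_t resolve_layer_count(uint32_t image_array_size,
                                    uint32_t base_array_layer,
                                    uint32_t layer_count)
{
    assert(base_array_layer < image_array_size);

    if (layer_count == SKETCH_REMAINING_ARRAY_LAYERS)
        return image_array_size - base_array_layer;
    return layer_count;
}
```

Using the raw field as a loop bound would iterate roughly four billion times with the sentinel, which is consistent with the crash the commit describes.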
* anv: enable sampling from fast-cleared images on SKLSamuel Iglesias Gonsálvez2017-03-271-2/+2
| | | | | | | | A resolve is not needed on Skylake in this case. We were forcing a resolve because we set the input_aux_usage to ISL_AUX_USAGE_NONE. Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* anv/query: handle out of host memory without crashing in compute_query_result()Iago Toral Quiroga2017-03-241-0/+5
| | | | | | | | | | | | We don't need to make the caller (CmdCopyQueryPoolResults) aware of the problem since compute_query_result() only emits state. The caller is also expected to hit OOM in this scenario right after calling this function, but it is already handling it safely. Fixes: dEQP-VK.api.out_of_host_memory.cmd_copy_query_pool_results Reviewed-by: Topi Pohjolainen <[email protected]>
* anv/pipeline: make FragCoord include sample positions when sample shadingIago Toral Quiroga2017-03-242-8/+19
  We need to know if sample shading has been requested during shader compilation since that affects the way fragment coordinates are computed.

  Notice that the semantics of fragment coordinates only depend on whether sample shading has been requested, not on whether more than one sample will actually be produced (that is, minSampleShading and rasterizationSamples do not affect this behavior).

  Because this setting affects the code we generate for the shader, we also need to include it in the WM prog key. Notice we don't need to alter the OpenGL code because it doesn't ever use this behavior, so the key's value is always false (the default).

  Fixes: dEQP-VK.glsl.builtin_var.fragcoord_msaa.*

  Reviewed-by: Jason Ekstrand <[email protected]>
* nir/lower_wpos_center: support adding sample position to fragment coordinateIago Toral Quiroga2017-03-241-1/+1
  According to section 14.6 of the Vulkan specification:

    "When sample shading is enabled, the x and y components of FragCoord reflect the location of the sample corresponding to the shader invocation."

  So add a boolean parameter to the lowering pass to select this behavior when we need it.

  Reviewed-by: Jason Ekstrand <[email protected]>
* anv: return VK_ERROR_DEVICE_LOST immeditely when device is known to be lostIago Toral Quiroga2017-03-242-1/+24
| | | | | | | | If we know the device has been lost we should return this error code for any command that can report it before we attempt to do anything with the device. Reviewed-by: Jason Ekstrand <[email protected]>
* anv/device: keep track of 'device lost' stateIago Toral Quiroga2017-03-242-0/+6
  The Vulkan specs say:

    "A logical device may become lost because of hardware errors, execution timeouts, power management events and/or platform-specific events. This may cause pending and future command execution to fail and cause hardware resources to be corrupted. When this happens, certain commands will return VK_ERROR_DEVICE_LOST (see Error Codes for a list of such commands). After any such event, the logical device is considered lost. It is not possible to reset the logical device to a non-lost state, however the lost state is specific to a logical device (VkDevice), and the corresponding physical device (VkPhysicalDevice) may be otherwise unaffected. In some cases, the physical device may also be lost, and attempting to create a new logical device will fail, returning VK_ERROR_DEVICE_LOST."

  This means that we need to track if a logical device has been lost so we can have the commands referenced by the spec return VK_ERROR_DEVICE_LOST immediately.

  Reviewed-by: Jason Ekstrand <[email protected]>
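The tracking described above could be sketched as a sticky flag checked at the top of each entry point that is allowed to report device loss. Everything here is a stand-in (`struct sketch_device`, `enum sketch_result`, the function names); it only illustrates the early-return pattern, not the driver's actual structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for VkResult values, to avoid depending on Vulkan headers. */
enum sketch_result {
    SKETCH_SUCCESS,
    SKETCH_ERROR_DEVICE_LOST
};

/* Hypothetical logical device with a sticky "lost" flag. */
struct sketch_device {
    bool lost;            /* never cleared once set */
    unsigned submissions; /* counts work that actually reached the device */
};

/* Called wherever the driver detects a hang or similar fatal event. */
static void sketch_device_mark_lost(struct sketch_device *dev)
{
    assert(dev != NULL);
    dev->lost = true;
}

/* Entry points check the flag before touching the device at all. */
static enum sketch_result sketch_queue_submit(struct sketch_device *dev)
{
    assert(dev != NULL);
    if (dev->lost)
        return SKETCH_ERROR_DEVICE_LOST;

    dev->submissions++;
    return SKETCH_SUCCESS;
}
```

Because the spec says a lost logical device cannot be reset, the flag has no "clear" operation; recovery means creating a new logical device.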