path: root/src/amd
Commit message | Author | Age | Files | Lines
* radv: add support for local bos. (v3)Dave Airlie2017-10-2611-22/+51
| | | | | | | | | | | | This uses the new kernel interfaces for reduced cs overhead. We only set the local flag for memory allocations that don't have a dedicated allocation and that aren't imports. v2: add to all the internal buffer creation paths. v3: missed some command submission paths, handle 0/empty bo lists. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: only copy the dynamic states that changedSamuel Pitoiset2017-10-261-23/+69
| | | | | | | | | | | | | | | | When binding a new pipeline, we applied all dynamic states without checking if they really need to be re-emitted. This doesn't seem to be useful for the meta operations because only the viewports/scissors are updated. This should reduce the number of commands added to the IB when a new graphics pipeline is bound. Also, rename radv_dynamic_state_copy() to radv_bind_dynamic_state() and set the dirty flags directly there. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
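The idea of copying only the dynamic states that changed can be sketched as follows. This is a minimal illustration, not the radv code: the `dyn_state` struct, the `DYN_*` bits, and `bind_dynamic_state()` are hypothetical names, and only three states are modelled. The function compares each state selected by the source mask, copies it only when it differs, and returns the dirty bits that would need re-emitting.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical dynamic-state bits, for illustration only. */
#define DYN_VIEWPORT   (1u << 0)
#define DYN_SCISSOR    (1u << 1)
#define DYN_LINE_WIDTH (1u << 2)

/* Hypothetical, much reduced dynamic state. */
struct dyn_state {
    uint32_t mask;        /* which states this object provides */
    float    viewport[4];
    int32_t  scissor[4];
    float    line_width;
};

/* Copy only the states from src that differ from dest.
 * Returns the set of states that changed (the dirty flags). */
static uint32_t bind_dynamic_state(struct dyn_state *dest,
                                   const struct dyn_state *src)
{
    uint32_t dirty = 0;

    if ((src->mask & DYN_VIEWPORT) &&
        memcmp(dest->viewport, src->viewport, sizeof(dest->viewport)) != 0) {
        memcpy(dest->viewport, src->viewport, sizeof(dest->viewport));
        dirty |= DYN_VIEWPORT;
    }
    if ((src->mask & DYN_SCISSOR) &&
        memcmp(dest->scissor, src->scissor, sizeof(dest->scissor)) != 0) {
        memcpy(dest->scissor, src->scissor, sizeof(dest->scissor));
        dirty |= DYN_SCISSOR;
    }
    if ((src->mask & DYN_LINE_WIDTH) && dest->line_width != src->line_width) {
        dest->line_width = src->line_width;
        dirty |= DYN_LINE_WIDTH;
    }
    return dirty;
}
```

Binding the same state twice returns 0 the second time, so nothing would be re-emitted.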
* radv: store the dynamic state mask into radv_dynamic_stateSamuel Pitoiset2017-10-263-7/+12
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: only emit the depth bounds test values when set dynamicallySamuel Pitoiset2017-10-261-2/+1
| | | | | | | | The depth bounds test values are either set at pipeline creation or dynamically using vkCmdSetDepthBounds(). Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* ac/llvm: drop pointless wrappers around umsb/imsbDave Airlie2017-10-261-14/+2
| | | | | Reviewed-by: Timothy Arceri <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* ac/llvm: consolidate find lsb function.Dave Airlie2017-10-263-29/+36
| | | | | | | This was the same between si and ac. Reviewed-by: Timothy Arceri <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* ac/llvm: drop v4f32empty. (v2)Dave Airlie2017-10-261-12/+0
| | | | | | | | | This was unused. v2: drop args. Reviewed-by: Timothy Arceri <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* ac/llvm: add i1false/i1true to common code.Dave Airlie2017-10-263-41/+33
| | | | | | | These get used in a fair few places. Reviewed-by: Timothy Arceri <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* ac/llvm: use the ac i32 0/1 and f32 0/1 llvm types.Dave Airlie2017-10-261-60/+52
| | | | | | | This just avoids having two copies of these. Reviewed-by: Timothy Arceri <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* ac/nir: move lds declaration/load/store into shared code.Dave Airlie2017-10-263-41/+50
| | | | | | | This was duplicated between both drivers, share here. Reviewed-by: Timothy Arceri <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: Compute ac keys from pipeline key.Bas Nieuwenhuizen2017-10-261-72/+41
| | | | | | | | The beginning of the end for the shader keys. Not entirely sure what I'm going to replace them with for the compiler though, so this is the first step. Reviewed-by: Timothy Arceri <[email protected]>
* radv: Add single pipeline cache key.Bas Nieuwenhuizen2017-10-263-8/+55
| | | | | | | To decouple the key used for info gathering and the cache from whatever we pass to the compiler. Reviewed-by: Timothy Arceri <[email protected]>
* radv: Don't compute as_ls/as_es before hashing.Bas Nieuwenhuizen2017-10-261-14/+12
| | | | Reviewed-by: Timothy Arceri <[email protected]>
* ac/nir: generate correct instruction for atomic min/max on unsigned imagesMatthew Nicholls2017-10-251-2/+4
| | | | | | | | v2: fix silly typo Cc: "17.2 17.3" <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: print NIR before LLVM IR and disassemblySamuel Pitoiset2017-10-251-7/+10
| | | | | | | | | It's still printed after linking, but it makes more sense to have SPIRV->NIR->LLVM IR->ASM. Fixes: f0a2bbd1a4 (radv: move nir print after linking is done) Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Fix truncation issue hexifying the cache uuid for the disk cache.Bas Nieuwenhuizen2017-10-251-2/+2
| | | | | | | Going from binary to hex has a 2x blowup. Fixes: 14216252923 'radv: create on-disk shader cache' Reviewed-by: Dave Airlie <[email protected]>
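The 2x blowup mentioned above is easy to get wrong when sizing the output buffer. A minimal sketch, assuming a hypothetical helper `bytes_to_hex()` (not the function radv uses): every input byte becomes two hex characters, and the string needs one more byte for the terminator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only: write `len` bytes from `in`
 * as a NUL-terminated lowercase hex string into `out`. */
static void bytes_to_hex(char *out, size_t out_size,
                         const uint8_t *in, size_t len)
{
    static const char digits[] = "0123456789abcdef";

    /* Two characters per byte, plus the terminator. A buffer sized for
     * `len` characters would truncate the result. */
    assert(out_size >= 2 * len + 1);

    for (size_t i = 0; i < len; i++) {
        out[2 * i]     = digits[in[i] >> 4];   /* high nibble */
        out[2 * i + 1] = digits[in[i] & 0xf];  /* low nibble */
    }
    out[2 * len] = '\0';
}
```

For a 16-byte UUID the caller would therefore need a 33-byte buffer.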
* radv: enable lower to scalar nir passTimothy Arceri2017-10-251-0/+24
| | | | | | This will allow dead components of varyings to be removed. Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: add support for explicit component packingTimothy Arceri2017-10-251-16/+52
| | | | | | | | | | | | | | This is needed for RADV to support explicit component packing. This is also required to use the new NIR component splitting / packing passes. V2: - add component packing support for interpolate_at* intrinsics - improve store packing support when not all varyings are scalar; as spotted by Bas, the store source was incorrectly offset. Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: use device name in cache creation like radeonsi.Dave Airlie2017-10-251-2/+3
| | | | | | | | Not sure how useful this is, but it makes it more consistent. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Cc: "17.3" <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: use a define for the transition point between cp and compute shaderDave Airlie2017-10-251-3/+9
| | | | | | | | | | For certain buffer meta ops we can use either the CP or a compute shader. We should use a define rather than hardcoding 4096, which allows for easier testing and more consistency. Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
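A sketch of what replacing the hardcoded value with a define might look like. The macro name `BUFFER_OPS_CS_THRESHOLD`, the helper name, and the choice of `>=` for the comparison are assumptions made for illustration; only the value 4096 comes from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name: size in bytes at which a buffer meta op switches
 * from the CP to a compute shader. Only the value 4096 is from the log. */
#define BUFFER_OPS_CS_THRESHOLD 4096

/* Assumed policy: small operations go through the CP, larger ones
 * through a compute shader. Whether the boundary itself belongs to the
 * compute path is an assumption of this sketch. */
static bool buffer_op_uses_compute(uint64_t size)
{
    return size >= BUFFER_OPS_CS_THRESHOLD;
}
```

With every call site going through one macro, the transition point can be changed in a single place when testing either path.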
* radv: only emit dfsm packets if dfsm is allowed.Dave Airlie2017-10-242-3/+4
| | | | | | | | | radeonsi only emits these when dfsm is enabled, so for now just hinge them on a flag we never set. Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radeonsi: postponed KILL isn't postponed anymore, but maintains WQMMarek Olšák2017-10-242-0/+8
| | | | | | | | | | | This restores performance for the drirc workaround, i.e. KILL_IF does:
    visible = src0 >= 0;
    kill_flag &= visible; // accumulate kills
    amdgcn_kill(wqm_vote(visible)); // kill fully dead quads only
And all helper pixels are killed at the end of the shader:
    amdgcn_kill(kill_flag);
Reviewed-by: Nicolai Hähnle <[email protected]>
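The KILL_IF scheme described above can be modelled on the CPU for a single 2x2 quad. This is a simulation sketch, not shader or driver code: `struct quad` and the `quad_*` functions are hypothetical, and the vote is modelled as "true for the whole quad if any live lane is visible", which is an assumption about how `wqm_vote` behaves.

```c
#include <stdbool.h>

#define QUAD_LANES 4

/* Hypothetical CPU model of one 2x2 pixel quad. */
struct quad {
    bool alive[QUAD_LANES];     /* lane is still executing */
    bool kill_flag[QUAD_LANES]; /* false means a kill is pending */
};

static void quad_init(struct quad *q)
{
    for (int i = 0; i < QUAD_LANES; i++) {
        q->alive[i] = true;
        q->kill_flag[i] = true;
    }
}

/* KILL_IF: accumulate per-lane kills, but only stop executing when the
 * whole quad is dead, so derivatives stay valid for surviving lanes. */
static void quad_kill_if(struct quad *q, const float src0[QUAD_LANES])
{
    bool any_visible = false;

    for (int i = 0; i < QUAD_LANES; i++) {
        if (!q->alive[i])
            continue;
        bool visible = src0[i] >= 0.0f;
        q->kill_flag[i] = q->kill_flag[i] && visible; /* accumulate kills */
        any_visible = any_visible || visible;         /* quad-wide vote */
    }
    if (!any_visible) {
        for (int i = 0; i < QUAD_LANES; i++)
            q->alive[i] = false; /* kill fully dead quads only */
    }
}

/* End of shader: lanes that were kept alive only as helpers are killed. */
static void quad_end_of_shader(struct quad *q)
{
    for (int i = 0; i < QUAD_LANES; i++) {
        if (!q->kill_flag[i])
            q->alive[i] = false;
    }
}
```

A lane with a negative `src0` keeps running as a helper while any neighbour in its quad is visible, and is discarded only by `quad_end_of_shader()`.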
* ac: use llvm.amdgcn.kill with LLVM 6.0Marek Olšák2017-10-241-0/+6
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* ac: replace ac_build_kill with ac_build_kill_if_falseMarek Olšák2017-10-243-26/+11
| | | | | | | This will be a new LLVM intrinsic and will also work nicely with llvm.amdgcn.wqm.vote. Reviewed-by: Nicolai Hähnle <[email protected]>
* radv: move nir print after linking is doneTimothy Arceri2017-10-242-5/+7
| | | | | | | | We now have linking optimisations so we want to delay dumping the nir until after these are complete. Fixes: 06f05040eb73 (radv: Link shaders) Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: clone meta shaders before linkingTimothy Arceri2017-10-241-1/+8
| | | | | | | | The IR is reused in different pipeline combinations so we need to clone it to avoid link time optimisations messing up the original copy. Fixes: 06f05040eb73 (radv: Link shaders) Reviewed-by: Dave Airlie <[email protected]>
* radv: Update code pointer correctly if a variant is already createdAlex Smith2017-10-231-0/+2
| | | | | | | | | | | | | | This was the actual cause of GPU hangs fixed by 0fdd531457ec ("radv: Fix pipeline cache locking issues"), since multiple threads would end up trying to create the variants for a single entry. Now that we're locking around the whole of this function, this isn't really necessary (we either create all or none of the variants), but fix this anyway in case things change later. Signed-off-by: Alex Smith <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> CC: 17.3 <[email protected]>
* ac: Silence a compiler warning about results[0].Eric Anholt2017-10-231-0/+1
| | | | | | We know that num_components will be > 0, but the compiler doesn't. Reviewed-by: Nicolai Hähnle <[email protected]>
* ac: Fix a compiler warning for possibly undefined "name"Eric Anholt2017-10-231-1/+1
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* amd/common/gfx9: workaround DCC corruption more conservativelyNicolai Hähnle2017-10-231-7/+25
| | | | | | | | Fixes KHR-GL45.texture_swizzle.smoke and others on Vega. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102809 Cc: [email protected] Reviewed-by: Marek Olšák <[email protected]>
* radv: automake: include radv_extensions.py in the tarball [17.3-branchpoint] Juan A. Suarez Romero2017-10-231-0/+1
| | | | | Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* ac/nir: Only clamp shadow reference on radeonsi.Bas Nieuwenhuizen2017-10-233-2/+8
| | | | | | | | | | | | | | | | | | Vulkan CTS does not expect the value to be clamped (at least for D32), and it makes a difference even though depth is in [0,1], due to strict inequalities. I couldn't find anything in the Vulkan spec about this, but the test seemed to be copied from GL tests and the GL spec only specifies clamping for fixed point formats. Hence I expect radeonsi to run into this at some point as well, but given that they still have a usecase with the Z16->Z32 promotion, I'll leave that for someone else to clean up. This at least fixes radv dEQP-VK.texture.shadow.* on VI. Fixes: 0f9e32519bb 'ac/nir: clamp shadow texture comparison value on VI' Reviewed-by: Dave Airlie <[email protected]>
* radv: Disallow indirect outputs for GS on GFX9 as well.Bas Nieuwenhuizen2017-10-231-3/+1
| | | | | | | | Since it also uses the output vector before writing to memory. Fixes: e38685cc62e 'Revert "radv: disable support for VEGA for now."' Reviewed-by: Dave Airlie <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* ac/nir: Fix nir_texop_lod on GFX for 1D arrays.Bas Nieuwenhuizen2017-10-231-1/+3
| | | | | Fixes: 1bcb953e166 'radv: handle GFX9 1D textures' Reviewed-by: Dave Airlie <[email protected]>
* radv/ac/nir: only emit tess factors to storage if tes reads themDave Airlie2017-10-233-2/+4
| | | | | | | | | | Otherwise we just need to write them to the tf ring. This seems to improve the tessellation demo on Bonaire: ~2190 -> ~2230 fps. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: Don't use vgpr indexing for outputs on GFX9.Bas Nieuwenhuizen2017-10-221-0/+5
| | | | | | | | Due to LLVM bugs. Fixes a bunch of dEQP-VK.glsl.indexing.* tests. Fixes: e38685cc62e 'Revert "radv: disable support for VEGA for now."' Reviewed-by: Dave Airlie <[email protected]>
* ac/nir: Account for compact array index in GS input load from LDS.Bas Nieuwenhuizen2017-10-211-1/+1
| | | | | | | Mirrors the vram path. Fixes: d4ecc3c9299 'ac/nir: Add loading from LDS for merged GS.' Reviewed-by: Dave Airlie <[email protected]>
* radv: Don't compile shaders when they are cached already.Bas Nieuwenhuizen2017-10-211-19/+23
| | | | | | | | | | When the gs_copy_shader is NULL (due to an incomplete cache), but the main shaders are found, we still do the nir, but we shouldn't compile the shaders again. For merged shaders we should also account for the missing shaders. Fixes: ce03c119ce0 'radv: Add code to compile merged shaders.' Reviewed-by: Dave Airlie <[email protected]>
* radv: Don't check for max GL GS invocations.Bas Nieuwenhuizen2017-10-211-2/+0
| | | | | | | We specify 127 instead of 32 as the limit in Vulkan. Fixes: 6bc42855f92 'radv: enable GS on GFX9' Reviewed-by: Dave Airlie <[email protected]>
* radv: Don't explicitly reference vertex shader for draw_id.Bas Nieuwenhuizen2017-10-211-1/+1
| | | | | | | | | With merged shaders the vertex shader may not exist. This got in because the offending patch was written before merged shaders were upstream, but committed after. Fixes: 75dfab24a2c 'radv: refactor indirect draws with radv_draw_info' Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Don't reset cmd_buffer->state.dirty.Bas Nieuwenhuizen2017-10-211-2/+0
| | | | | | | | | Otherwise for non-indexed draws we set and immediately unset RADV_CMD_DIRTY_INDEX_BUFFER. As all the set functions should clear their own bit, this is unnecessary. Fixes: 341529dbee5 'radv: use optimal packet order for draws' Reviewed-by: Samuel Pitoiset <[email protected]>
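Why a blanket reset of the dirty flags loses state can be shown with a small model. All names here (`cmd_state`, `DIRTY_*`, `emit_*`, `flush_for_draw`) are hypothetical stand-ins rather than radv identifiers: each emit function clears only its own bit, so a flag set before a non-indexed draw survives until an indexed draw consumes it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical dirty bits, for illustration only. */
#define DIRTY_PIPELINE     (1u << 0)
#define DIRTY_INDEX_BUFFER (1u << 1)

struct cmd_state {
    uint32_t dirty;
    unsigned pipeline_emits;
    unsigned index_buffer_emits;
};

/* Each emit function clears its own bit and nothing else. */
static void emit_pipeline(struct cmd_state *s)
{
    s->pipeline_emits++;
    s->dirty &= ~DIRTY_PIPELINE;
}

static void emit_index_buffer(struct cmd_state *s)
{
    s->index_buffer_emits++;
    s->dirty &= ~DIRTY_INDEX_BUFFER;
}

static void flush_for_draw(struct cmd_state *s, bool indexed)
{
    if (s->dirty & DIRTY_PIPELINE)
        emit_pipeline(s);
    if (indexed && (s->dirty & DIRTY_INDEX_BUFFER))
        emit_index_buffer(s);
    /* Deliberately no `s->dirty = 0` here: a non-indexed draw would
     * otherwise drop a pending index buffer update. */
}
```

In this model, a non-indexed draw followed by an indexed draw still emits the index buffer exactly once.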
* radv: Correctly detect changed shaders for vertex descriptors.Bas Nieuwenhuizen2017-10-211-6/+6
| | | | | | | | As they were emitted after the new pipeline, the changed pipeline detection was not working anymore. Fixes: 341529dbee5 'radv: use optimal packet order for draws' Reviewed-by: Samuel Pitoiset <[email protected]>
* ac/nir: Set larger workgroup size for GS on GFX9.Bas Nieuwenhuizen2017-10-211-1/+1
| | | | | | | They don't take a single wave anymore and we need the barriers. Fixes: 6bc42855f92 'radv: enable GS on GFX9' Reviewed-by: Dave Airlie <[email protected]>
* ac/nir: Take the max workgroup size of all provided shaders.Bas Nieuwenhuizen2017-10-211-1/+6
| | | | | Fixes: ffaf4d608a1 'radv: Enable tessellation shaders for GFX9.' Reviewed-by: Dave Airlie <[email protected]>
* radv: Fix pipeline cache locking issuesAlex Smith2017-10-211-7/+23
| | | | | | | | | | | | Need to lock around the whole process of retrieving cached shaders, and around GetPipelineCacheData. This fixes GPU hangs observed when creating multiple pipelines in parallel, which appeared to be due to invalid shader code being pulled from the cache. Signed-off-by: Alex Smith <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: disable implicit sync for radv allocated bos v3Andres Rodriguez2017-10-214-1/+8
| | | | | | | | | | | | | | | | | Implicit sync kicks in when a buffer is used by two different amdgpu contexts simultaneously. Jobs that use explicit synchronization mechanisms end up needlessly waiting to be scheduled for long periods of time in order to achieve serialized execution. This patch disables implicit synchronization for all radv allocations except for wsi bos. The only systems that require implicit synchronization are DRI2/3 and PRIME. v2: mark wsi bos as RADV_MEM_IMPLICIT_SYNC v3: Add drm version check (Bas) Signed-off-by: Andres Rodriguez <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: factor out radv_alloc_memoryAndres Rodriguez2017-10-212-5/+25
| | | | | | | | | This allows us to pass extra parameters to the memory allocation operation that are not defined in the vulkan spec. This is useful for internal usage. Signed-off-by: Andres Rodriguez <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Expose VK_EXT_global_priorityAndres Rodriguez2017-10-214-0/+5
| | | | | | | Expose the extension string as supported Signed-off-by: Andres Rodriguez <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: don't skip PS/VS partial flushAndres Rodriguez2017-10-211-8/+6
| | | | | | | | | | | | This patch helps lower high priority compute latency. Found by bisecting a perf regression on computeparticles with high priority compute queues enabled. Reverting this micro-optimization doesn't seem to have any negative effect on performance on Dota2 or ssao. Signed-off-by: Andres Rodriguez <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Implement VK_EXT_global_priorityAndres Rodriguez2017-10-214-8/+62
| | | | | | | | | This extension allows the caller to change a queue's system wide priority. This is useful for applications with specific latency constraints. Signed-off-by: Andres Rodriguez <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>