summaryrefslogtreecommitdiffstats
path: root/src/amd
Commit message (Collapse)AuthorAgeFilesLines
* radeon/ac: remove assert causing regressionDave Airlie2017-04-271-1/+0
| | | | | | | | | | This assert wasn't in the original radeonsi code but I added it without totally understanding the original code, it caused some regressions in variable-indexing tessellation shaders. Fixes: e2659176 radeonsi/ac: move vertex export remove to common code. Reported-by: Michel Dänzer <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radeon/ac: fix build on llvm 3.8.1Dave Airlie2017-04-271-0/+1
| | | | | | Add missing include to fix build. Signed-off-by: Dave Airlie <[email protected]>
* radv/ac: eliminate unused vertex shader outputs. (v2)Dave Airlie2017-04-273-21/+45
| | | | | | | | | | | This is ported from radeonsi, and I can see at least one Talos shader drops an export due to this, and saves some VGPR usage. v2: use shared code. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radeonsi/ac: move vertex export remove to common code.Dave Airlie2017-04-275-1/+221
| | | | | | | | | | | This code can be shared by radv, we bump the max to VARYING_SLOT_MAX here, but that shouldn't have too much fallout. Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: fix regression in descriptor set freeing.Dave Airlie2017-04-271-1/+1
| | | | | | | | | | | Since the host pool changes, Fixes: dEQP-VK.api.descriptor_pool.out_of_pool_memory Fixes: 126d5ad "radv: Use host memory pool for non-freeable descriptors." Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: Enable userspace fence checking.Bas Nieuwenhuizen2017-04-263-3/+36
| | | | | | | | | | v2: - Added some error handling. - memset the buffer to 0. v3: Added assert for buffer size. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv/ac: setup mrt exports then export them in one go. (v2)Dave Airlie2017-04-251-15/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Noticed while looking at Sascha Willems deferred shaders. This is a bit of an llvm workaround, llvm was producing this: v_cvt_pkrtz_f16_f32_e64 v4, v7, v8 ; D2960004 00021107 v_cvt_pkrtz_f16_f32_e64 v6, v9, 1.0 ; D2960006 0001E509 s_waitcnt vmcnt(0) ; BF8C0F70 exp mrt0 v4, v4, v6, v6 compr ; C400040F 00000604 s_waitcnt expcnt(0) ; BF8C0F0F v_cvt_pkrtz_f16_f32_e64 v4, v12, v5 ; D2960004 00020B0C v_cvt_pkrtz_f16_f32_e64 v5, v14, 1.0 ; D2960005 0001E50E exp mrt1 v4, v4, v5, v5 compr ; C400041F 00000504 s_waitcnt expcnt(0) ; BF8C0F0F v_cvt_pkrtz_f16_f32_e64 v0, v0, v1 ; D2960000 00020300 v_cvt_pkrtz_f16_f32_e64 v1, v2, v3 ; D2960001 00020702 exp mrt2 v0, v0, v1, v1 done compr vm ; C4001C2F 00000100 After this change: v_cvt_pkrtz_f16_f32_e64 v4, v7, v8 ; D2960004 00021107 s_waitcnt vmcnt(0) ; BF8C0F70 v_cvt_pkrtz_f16_f32_e64 v0, v0, v1 ; D2960000 00020300 v_cvt_pkrtz_f16_f32_e64 v6, v9, 1.0 ; D2960006 0001E509 v_cvt_pkrtz_f16_f32_e64 v5, v12, v5 ; D2960005 00020B0C v_cvt_pkrtz_f16_f32_e64 v7, v14, 1.0 ; D2960007 0001E50E exp mrt0 v4, v4, v6, v6 compr ; C400040F 00000604 v_cvt_pkrtz_f16_f32_e64 v1, v2, v3 ; D2960001 00020702 exp mrt1 v5, v5, v7, v7 compr ; C400041F 00000705 exp mrt2 v0, v0, v1, v1 done compr vm ; C4001C2F 00000100 No waitcnt for exports are emitted. v2: fixup index->mrt mapping (Bas). Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv/ac: overhaul vs output/ps input routingDave Airlie2017-04-253-37/+55
| | | | | | | | In order to cleanly eliminate exports rewrite the code first to mirror how radeonsi works for now. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv/ac: move point coord after layer/viewport.Dave Airlie2017-04-251-6/+7
| | | | | | | | These need to be ordered as per shader enum ordering, I'll rewrite this soon, but this is a bug fix. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* android: radv/ac: Fix nir.h includeMauro Rossi2017-04-241-0/+2
| | | | | | | | | | | | | | Fixes following building errors due to missing include paths: external/mesa/src/amd/common/ac_shader_info.c:23:10: fatal error: 'nir/nir.h' file not found ^ external/mesa/src/compiler/nir/nir.h:48:10: fatal error: 'nir_opcodes.h' file not found ^ Fixes: 224cf29 "radv/ac: add initial pre-pass for shader info gathering" Acked-by: Dave Airlie <[email protected]> Acked-by: Emil Velikov <[email protected]>
* radv/ac: copy llvm machine feature flags from radeonsi.Dave Airlie2017-04-241-1/+1
| | | | | | | | This just updates this to use the same flags as radeonsi for consistency. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: Enable lowering fdiv in nir.Bas Nieuwenhuizen2017-04-231-0/+1
| | | | | | | Results in faster code than the lowering by LLVM. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: Use the correct pipeline for dispatches.Bas Nieuwenhuizen2017-04-221-3/+3
| | | | | | | Signed-off-by: Bas Nieuwenhuizen <[email protected]> Fixes: ec15e0d30 "radv: optimise compute shader grid size emission." Tested-by: Grazvydas Ignotas <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: Prefetch compute shader too.Bas Nieuwenhuizen2017-04-211-0/+1
| | | | | | | For consistency, doesn't really impact performance. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv/ac: use tex_lz if we can.Dave Airlie2017-04-201-6/+16
| | | | | | | | Looking at some Talos shaders vs radeonsi, I noticed they use tex_lz in a few places, so we should be able to as well. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* ac: fix build after LLVM 5.0 SVN r300718Christoph Haag2017-04-201-0/+4
| | | | | | | | v2: previously getWithDereferenceableBytes() exists, but addAttr() doesn't take that type Signed-off-by: Christoph Haag <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]> Tested-and-reviewed-by: Mike Lothian <[email protected]>
* radv: Set variant code_size when created from the cache.Bas Nieuwenhuizen2017-04-201-0/+1
| | | | | Signed-off-by: Bas Nieeuwenhuizen <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: Add shader prefetch.Bas Nieuwenhuizen2017-04-193-0/+18
| | | | | | | | | | Gives me approximately a 2% perf increase in bot dota2 & talos. Having descriptors (both sets and vertex buffers) prefetched didn't help so I didn't include that. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: Remove binding buffer count.Bas Nieuwenhuizen2017-04-193-13/+10
| | | | | | | In cases where it is used it is always 1. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Don't try to find gaps for non-freeable descriptors.Bas Nieuwenhuizen2017-04-191-2/+3
| | | | | | | | With this we don't have any operations on a pool with non-freeable descriptors left that have O(#descriptors) complexity. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Use host memory pool for non-freeable descriptors.Bas Nieuwenhuizen2017-04-192-16/+57
| | | | | | | | v2: Handle out of pool memory error. v3: Actually use VK_ERROR_OUT_OF_POOL_MEMORY_KHR for the error condition. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Don't allocate dynamic descriptors separately.Bas Nieuwenhuizen2017-04-191-12/+4
| | | | | Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/meta: Fix nir_builder.h includeMike Lothian2017-04-191-1/+1
| | | | | | | | | | | | | This fixes the build after: commit 399ebd2a84a133bd2ca3da388a059fd3bafe33f5 Author: Dave Airlie <[email protected]> Date: Wed Apr 19 06:18:23 2017 +1000 radv/meta: add common shader vertex generation function Signed-off-by: Mike Lothian <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv/ac: Fix nir.h includeMike Lothian2017-04-191-1/+1
| | | | | | | | | | | | | This fixes the build after: commit 224cf2906a8f38ce47411afc93a223ac0e41795f Author: Dave Airlie <[email protected]> Date: Mon Apr 17 13:01:52 2017 +1000 radv/ac: add initial pre-pass for shader info gathering Signed-off-by: Mike Lothian <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv/meta: refactor out some common shaders.Dave Airlie2017-04-195-104/+43
| | | | | | | | The vs vertex generate and fs noop shaders are used in a few places, so refactor them out. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv/meta: generate position for blit shaders.Dave Airlie2017-04-191-51/+16
| | | | | | | This generates the position info using the vertex shader. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv/meta: reduce vertex buffer in blit2d.Dave Airlie2017-04-191-28/+7
| | | | | | | Generate the position vertices. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv/meta: reduce vertex buffer usage in clear shadersDave Airlie2017-04-193-53/+25
| | | | | | | | | For depth clears we have to pass the depth in the 2nd component, we can use push constants for some of this later to drop the vertex buffer completely Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv/meta: avoid using vertex buffer for resolve shader.Dave Airlie2017-04-191-67/+7
| | | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv/meta: move depth decompress to using inline vertex dataDave Airlie2017-04-191-69/+6
| | | | | | | | This removes the vertex buffer, and just generates the values in the shader. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv/meta: move fast clear to generate vertices in shader.Dave Airlie2017-04-191-68/+6
| | | | | | | Avoids having to setup vertex buffers. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv/meta: add common shader vertex generation functionDave Airlie2017-04-192-0/+39
| | | | | | | | | | Instead of passing in the same 1.0, -1.0 combinations via vertex buffers, we can just use vertex id to have the vertex shader build them. This function introduces the generator code needed, later patches will use this. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv/meta: add support for save/restore meta without vertex data.Dave Airlie2017-04-192-8/+38
| | | | | | | | Some of the shaders could just generate the vertex data in the shader, so add helpers to allow us to move to doing that. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: drop debugging leftovers code in descriptor set patches.Dave Airlie2017-04-191-3/+0
| | | | Signed-off-by: Dave Airlie <[email protected]>
* radv: add support for 32 descriptor sets.Dave Airlie2017-04-192-7/+7
| | | | | | | | | This bumps the limit to the number of sets to 32, now that we have proper support for it. It also uses 1u in a few places to make things a bit safer. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv/ac: add support for indirect access of descriptor sets.Dave Airlie2017-04-195-18/+105
| | | | | | | | | | | | | | | | | | | We want to expose more descriptor sets to the applications, but currently we have a 1:1 mapping between shader descriptor sets and 2 user sgprs, limiting us to 4 per stage. This commit check if we don't have enough user sgprs for the number of bound sets for this shader, we can ask for them to be indirected. Two sgprs are then used to point to a buffer or 64-bit pointers to the number of allocated descriptor sets. All shaders point to the same buffer. We can use some user sgprs to inline one or two descriptor sets in future, but until we have a workload that needs this I don't think we should spend too much time on it. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: start allocating user sgprsDave Airlie2017-04-191-13/+74
| | | | | | | | | | | | This adds an initial implementation to allocate the user sgprs and make sure we don't run out if we try to bind a bunch of descriptor sets. This can be enhanced further in the future if we add support for inlining push constants. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv/ac: mark used descriptor sets in shader info.Dave Airlie2017-04-192-0/+35
| | | | | | | This pre calculates the used descriptor sets. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv/ac: frag shader only needs ring offsets if sample positions enabledDave Airlie2017-04-191-1/+4
| | | | | | | | mostly documenting things, since with modern llvm we always have the spill enabled. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv/ac: move needs_push_constants to shader info.Dave Airlie2017-04-193-10/+11
| | | | | | | First step to optimising push constants. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: optimise compute shader grid size emission.Dave Airlie2017-04-194-13/+29
| | | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: start conditionalising vertex inputs. (v2)Dave Airlie2017-04-194-14/+63
| | | | | | | | | | | In practice this will probably just drop draw id in a few places. v2: just do draw_id for now. (Bas) it might be possible to do something more if we need it in the future. (nha) Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv/ac: add initial pre-pass for shader info gatheringDave Airlie2017-04-196-9/+116
| | | | | | | | | | | | There is some radv specific info we need to gather from shaders before we get into converting nir->llvm, so we can make better decisions especially around user sgpr allocation. This is just an initial placeholder to gather if sample positions are required in the frag shader. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: enable timestampComputeAndGraphicsGrazvydas Ignotas2017-04-171-1/+1
| | | | | | | | | | | | Commit bfee9866 "radv: Use RELEASE_MEM packet for MEC timestamp query." added WriteTimestamp handling for compute queues but forgot to flip the flag. Tested with DOOM (by me) and CTS (by Bas), but without verification that these tests actually use timestamps on compute queues. Signed-off-by: Grazvydas Ignotas <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* android: amd/addrlib: trivial fix for gfx9 supportMauro Rossi2017-04-171-0/+2
| | | | | | | | | | | | Fixes the following build error: external/mesa/src/amd/addrlib/gfx9/gfx9addrlib.cpp:36:10: fatal error: 'gfx9_gb_reg.h' file not found ^ 1 error generated. Fixes: 7f160ef "amd/addrlib: import gfx9 support" Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* radv: remove the temp descriptor set infrastructureFredrik Höglund2017-04-142-76/+28
| | | | | | | It is no longer used. Signed-off-by: Fredrik Höglund <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: use push descriptors in metaFredrik Höglund2017-04-146-416/+301
| | | | | | | Use push descriptors instead of temp descriptor sets. Signed-off-by: Fredrik Höglund <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add private push descriptors for metaFredrik Höglund2017-04-142-0/+41
| | | | | | | | | | | | This allows meta to use push descriptors without disturbing user push descriptors. radv_meta_push_descriptor_set differs from vkCmdPushDescriptorSetKHR in that partial updates are not supported; all descriptors used in subsequent draw commands must be pushed at the same time. Signed-off-by: Fredrik Höglund <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: remove irrelevant commentGrazvydas Ignotas2017-04-141-1/+1
| | | | | | | A leftover from anv. Signed-off-by: Grazvydas Ignotas <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: report timestampPeriod correctlyGrazvydas Ignotas2017-04-142-2/+2
| | | | | | | | | | | | The kernel returns frequency in kHz, so to convert to nanosecond interval that Vulkan uses the dividend should be 1000000.0 and not 100000.0. This fixes the GPU graph in DOOM and matches the amdgpu-pro blob. Fixes: f4e499ec791 "radv: add initial non-conformant radv vulkan driver" Signed-off-by: Grazvydas Ignotas <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>