path: root/src/amd/common
Commit message | Author | Age | Files | Lines
...
* ac: add missing extern "C" guards | Nicolai Hähnle | 2017-05-18 | 2 | -0/+16
    Reviewed-by: Marek Olšák <[email protected]>
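As a hedged illustration of what such guards look like, the sketch below wraps a C header's declarations so that a C++ translation unit sees them with C linkage. The header name, struct, and function are hypothetical and are not taken from the tree.

```c
/* example_info.h: hypothetical header, for illustration only. */
#ifndef EXAMPLE_INFO_H
#define EXAMPLE_INFO_H

#include <stdint.h>

#ifdef __cplusplus
extern "C" { /* C++ callers must not name-mangle these symbols */
#endif

struct example_info {
    uint32_t num_rings;
};

uint32_t example_total_rings(const struct example_info *info, uint32_t extra);

#ifdef __cplusplus
}
#endif

#endif /* EXAMPLE_INFO_H */

/* example_info.c: the definition is compiled as plain C. */
uint32_t example_total_rings(const struct example_info *info, uint32_t extra)
{
    return info->num_rings + extra;
}
```

Without the guards, a C++ file including the header would look for a mangled symbol and fail at link time.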
* ac: add radeon_info::num_{sdma,compute}_rings | Nicolai Hähnle | 2017-05-18 | 2 | -3/+15
    Vulkan needs them.
    Reviewed-by: Marek Olšák <[email protected]>
* ac: add radeon_surf::htile_slice_size | Nicolai Hähnle | 2017-05-18 | 2 | -0/+6
    Vulkan needs it.
    Reviewed-by: Marek Olšák <[email protected]>
* ac_surface: use radeon_info from ac_gpu_info | Nicolai Hähnle | 2017-05-18 | 2 | -29/+29
    Reviewed-by: Marek Olšák <[email protected]>
* ac/radeonsi: move radeon_info initialization to amd/common | Nicolai Hähnle | 2017-05-18 | 2 | -0/+284
    v2: update Android.common.mk (Emil)
    Reviewed-by: Marek Olšák <[email protected]>
* ac/radeonsi: move struct radeon_info to ac_gpu_info.h | Nicolai Hähnle | 2017-05-18 | 1 | -0/+93
    Reviewed-by: Marek Olšák <[email protected]>
* ac/radeonsi: move some aspects of sanity checking to ac_surface | Nicolai Hähnle | 2017-05-18 | 1 | -0/+33
    Reviewed-by: Marek Olšák <[email protected]>
* ac/radeonsi: add ac_compute_surface to automatically switch gfx6 vs. gfx9 | Nicolai Hähnle | 2017-05-18 | 2 | -16/+23
    Reviewed-by: Marek Olšák <[email protected]>
* ac/radeonsi: move the bulk of gfx9_surface_init to ac_surface | Nicolai Hähnle | 2017-05-18 | 2 | -0/+384
    We can now merge the two *_surface_init functions.
    Reviewed-by: Marek Olšák <[email protected]>
* ac/radeonsi: move the bulk of gfx6_surface_init to ac_surface | Nicolai Hähnle | 2017-05-18 | 2 | -0/+454
    Reviewed-by: Marek Olšák <[email protected]>
* ac/radeonsi: move amdgpu_addr_create to ac_surface | Nicolai Hähnle | 2017-05-18 | 2 | -0/+212
    v2:
    - update Android.common.mk (Emil)
    - rebase on top of Raven support
    Reviewed-by: Marek Olšák <[email protected]> (v1)
* ac/radeonsi: move surface definitions to new header ac_surface.h | Nicolai Hähnle | 2017-05-18 | 1 | -0/+178
    Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: use a top-of-pipe timestamp for the start of TIME_ELAPSED | Marek Olšák | 2017-05-17 | 1 | -0/+11
    Reviewed-by: Nicolai Hähnle <[email protected]>
* ac/debug: handle index field in SET_*_REG correctly | Nicolai Hähnle | 2017-05-16 | 1 | -1/+7
    Reviewed-by: Marek Olšák <[email protected]>
* radeonsi/gfx9: add support for Raven | Marek Olšák | 2017-05-15 | 1 | -0/+1
    Cc: 17.1 <[email protected]>
    Reviewed-by: Alex Deucher <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
* amd/addrlib: import Raven support | Marek Olšák | 2017-05-15 | 1 | -0/+10
    Cc: 17.1 <[email protected]>
    Reviewed-by: Alex Deucher <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
* nir: Embed the shader_info in the nir_shader again | Jason Ekstrand | 2017-05-09 | 1 | -19/+19
    Commit e1af20f18a86f52a9640faf2d4ff8a71b0a4fa9b changed the shader_info
    from being embedded into being just a pointer. The idea was that sharing
    the shader_info between NIR and GLSL would be easier if it were a pointer
    pointing to the same shader_info struct. This, however, has caused a few
    problems:

     1) There are many things which generate NIR without GLSL. This means we
        have to support both NIR shaders which come from GLSL and ones that
        don't and need to have an info elsewhere.

     2) The solution to (1) raises all sorts of ownership issues which have
        to be resolved with ralloc_parent checks.

     3) Ever since 00620782c92100d77c660f9783504c6d80fa1d58, we've been using
        nir_gather_info to fill out the final shader_info. Thanks to cloning
        and the above ownership issues, the nir_shader::info may not point
        back to the gl_shader anymore and so we have to do a copy of the
        shader_info from NIR back to GLSL anyway.

    All of these issues go away if we just embed the shader_info in the
    nir_shader. There's a little downside of having to copy it back after
    calling nir_gather_info but, as explained above, we have to do that
    anyway.

    Acked-by: Timothy Arceri <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
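The trade-off described in this message can be sketched in a few lines of C. The struct and function names below are hypothetical, heavily trimmed stand-ins and do not reproduce the real nir_shader or shader_info definitions; the point is only to contrast a borrowed pointer with an embedded member.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-shader metadata. */
struct example_shader_info {
    unsigned num_textures;
    bool uses_discard;
};

/* Pointer variant: someone else owns the info, so every clone has to
 * decide who the owner is and how long it lives. */
struct example_shader_ptr {
    struct example_shader_info *info;
};

/* Embedded variant: the info lives and dies with the shader. */
struct example_shader {
    struct example_shader_info info;
};

/* Cloning is a plain struct copy; the clone's info is independent. */
static struct example_shader example_shader_clone(const struct example_shader *src)
{
    return *src;
}

/* After gathering, the frontend copies the final info back explicitly. */
static void example_copy_info_back(struct example_shader_info *dst,
                                   const struct example_shader *shader)
{
    *dst = shader->info;
}
```

With the embedded layout a clone can be modified freely, and the only cost is the explicit copy-back step that the message says is needed anyway.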
* ac: fix broken elimination of duplicated VS exports | Marek Olšák | 2017-05-08 | 1 | -14/+14
    The renumbering code didn't take into account that multiple VS exports
    can have the same PARAM index. This also significantly simplifies the
    renumbering.

    Thankfully, we have piglits for this:
      spec@arb_gpu_shader5@arb_gpu_shader5-interpolateatcentroid-packing
      [email protected]@execution@interface-blocks-complex-vs-fs

    Reported by Michel Dänzer.

    Fixes: b08715499e61 ("ac: eliminate duplicated VS exports")
    Reviewed-by: Nicolai Hähnle <[email protected]>
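One way to renumber PARAM indices so that exports sharing an index stay together is to build a remap table keyed by the old index. The sketch below is an assumption-laden illustration of that idea, not the code from this commit; the names, the 32-entry limit, and the 0xff "eliminated" marker are all hypothetical.

```c
#include <stdint.h>

#define EXAMPLE_MAX_PARAMS 32
#define EXAMPLE_UNUSED 0xff /* hypothetical marker for an eliminated export */

/* Compact the PARAM indices in param[] to 0..n-1 and return n.
 * Exports that shared an old index receive the same new index. */
static unsigned example_compact_params(uint8_t *param, unsigned num_exports)
{
    uint8_t remap[EXAMPLE_MAX_PARAMS];
    unsigned i, next = 0;

    for (i = 0; i < EXAMPLE_MAX_PARAMS; i++)
        remap[i] = EXAMPLE_UNUSED;

    /* Pass 1: mark every old index still referenced by some export. */
    for (i = 0; i < num_exports; i++)
        if (param[i] < EXAMPLE_MAX_PARAMS)
            remap[param[i]] = 0;

    /* Pass 2: hand out new indices in ascending order of old index. */
    for (i = 0; i < EXAMPLE_MAX_PARAMS; i++)
        if (remap[i] != EXAMPLE_UNUSED)
            remap[i] = (uint8_t)next++;

    /* Pass 3: rewrite each export through the table. */
    for (i = 0; i < num_exports; i++)
        if (param[i] < EXAMPLE_MAX_PARAMS)
            param[i] = remap[param[i]];

    return next;
}
```

Because the table is indexed by the old PARAM index rather than by export position, two exports that both used index 2 end up with the same new index.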
* radv: enable POLARIS12 support. | Dave Airlie | 2017-05-05 | 1 | -0/+1
    This just adds the chip in the right places.

    We don't set the partial_vs_wave workaround, as radeonsi doesn't, but
    have to confirm it's not required.

    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Cc: "17.1" <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* radeonsi: drop support for LLVM 3.8 | Marek Olšák | 2017-05-05 | 2 | -133/+53
    LLVM 3.8:
    - had broken indirect resource indexing
    - didn't have scratch coalescing
    - was the last user of problematic v16i8
    - only supported OpenGL 4.1

    This leaves us with LLVM 3.9 and LLVM 4.0 support for Mesa 17.2.

    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: stop using v16i8 | Marek Olšák | 2017-05-05 | 1 | -1/+1
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi/gfx9: make some PA & DB registers match the closed Vulkan driver | Marek Olšák | 2017-05-05 | 1 | -0/+4
    Cc: 17.1 <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
* ac: eliminate duplicated VS exports | Marek Olšák | 2017-05-03 | 1 | -2/+79
    Only very few shaders have them (from 48486 shaders):
      shaders/private/left_4_dead_2/765.shader_test - ac: 1 matches 2
      shaders/private/left_4_dead_2/877.shader_test - ac: 1 matches 6
      shaders/private/left_4_dead_2/2141.shader_test - ac: 1 matches 6
      shaders/private/ue4_effects_cave/11.shader_test - ac: 4 matches 5
      shaders/private/ue4_effects_cave/14.shader_test - ac: 5 matches 6
      shaders/private/ue4_effects_cave/46.shader_test - ac: 5 matches 6
      shaders/private/ue4_effects_cave/42.shader_test - ac: 4 matches 5
      shaders/private/ue4_effects_cave/104.shader_test - ac: 4 matches 5
      shaders/private/f1-2015/336.shader_test - ac: 3 matches 4
      shaders/private/f1-2015/948.shader_test - ac: 6 matches 7
      shaders/private/f1-2015/602.shader_test - ac: 0 matches 3
      shaders/private/f1-2015/600.shader_test - ac: 0 matches 3
      shaders/private/f1-2015/1214.shader_test - ac: 0 matches 1
      shaders/private/f1-2015/988.shader_test - ac: 4 matches 5
      shaders/private/ue4_elemental/149.shader_test - ac: 3 matches 4
      shaders/private/ue4_elemental/346.shader_test - ac: 4 matches 5
      shaders/private/ue4_elemental/178.shader_test - ac: 3 matches 4
      shaders/private/ue4_elemental/136.shader_test - ac: 4 matches 5
      shaders/private/ue4_elemental/168.shader_test - ac: 4 matches 5
      shaders/private/ue4_elemental/690.shader_test - ac: 3 matches 4
      shaders/private/ue4_elemental/19.shader_test - ac: 5 matches 6
      shaders/private/dota2/1901.shader_test - ac: 0 matches 5
      shaders/private/dota2/1357.shader_test - ac: 0 matches 5
      shaders/private/dota2/1375.shader_test - ac: 0 matches 5
      shaders/private/dota2/1369.shader_test - ac: 0 matches 5
      shaders/private/dota2/1583.shader_test - ac: 0 matches 5
      shaders/private/dota2/1811.shader_test - ac: 0 matches 5
      shaders/private/dota2/1893.shader_test - ac: 0 matches 5
      shaders/private/dota2/1533.shader_test - ac: 0 matches 5
      shaders/private/dota2/1951.shader_test - ac: 0 matches 5
      shaders/private/dota2/1361.shader_test - ac: 0 matches 5
      shaders/private/mad_max/2792.shader_test - ac: 0 matches 1
      shaders/private/mad_max/2794.shader_test - ac: 0 matches 1
      shaders/private/mad_max/2780.shader_test - ac: 0 matches 1
      shaders/private/mad_max/2902.shader_test - ac: 0 matches 1
      shaders/private/bioshock-infinite/3050.shader_test - ac: 3 matches 7
      shaders/private/bioshock-infinite/2544.shader_test - ac: 3 matches 6
      shaders/private/bioshock-infinite/3062.shader_test - ac: 3 matches 8
      shaders/private/bioshock-infinite/2012.shader_test - ac: 3 matches 7
      shaders/private/bioshock-infinite/3058.shader_test - ac: 3 matches 7
      shaders/private/bioshock-infinite/3270.shader_test - ac: 3 matches 7
      shaders/private/bioshock-infinite/732.shader_test - ac: 3 matches 7
      shaders/private/bioshock-infinite/3026.shader_test - ac: 3 matches 7
      shaders/private/bioshock-infinite/3258.shader_test - ac: 3 matches 6
      shaders/private/bioshock-infinite/3198.shader_test - ac: 3 matches 6
      shaders/private/bioshock-infinite/3046.shader_test - ac: 3 matches 7
      shaders/private/bioshock-infinite/3168.shader_test - ac: 3 matches 6
      shaders/private/bioshock-infinite/2550.shader_test - ac: 3 matches 6
      shaders/private/bioshock-infinite/3210.shader_test - ac: 3 matches 6
      shaders/private/bioshock-infinite/3032.shader_test - ac: 3 matches 6
      shaders/private/bioshock-infinite/668.shader_test - ac: 3 matches 7

    Reviewed-by: Dave Airlie <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
* ac: rename ac_eliminate_const_vs_outputs -> ac_optimize_vs_outputs | Marek Olšák | 2017-05-03 | 3 | -15/+15
    Reviewed-by: Dave Airlie <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
* ac: first parse VS exports before eliminating constant ones | Marek Olšák | 2017-05-03 | 1 | -24/+58
    A later commit will make use of this.

    Reviewed-by: Dave Airlie <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radv/ac: canonicalize the output for 32-bit float min/max. | Dave Airlie | 2017-05-03 | 1 | -0/+8
    This fixes:
      dEQP-VK.glsl.builtin.precision.min.*
      dEQP-VK.glsl.builtin.precision.max.*
      dEQP-VK.glsl.builtin.precision.clamp.*

    The problem is that the hw doesn't compare denorms properly, so we have
    to flush them. The spec says flushing is optional, but if you don't
    flush, the results should still be correct.

    The -pro driver changes the shader float mode; it would be nice if llvm
    could grow that, perhaps.

    Acked-by: Bas Nieuwenhuizen <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
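To make "flushing" concrete, here is a minimal CPU-side model in plain C. It assumes IEEE-754 binary32 floats and hypothetical function names; the driver itself works on LLVM IR, so this only illustrates the arithmetic effect of canonicalizing a min result, not the actual change.

```c
#include <stdint.h>
#include <string.h>

/* Replace a denormal with a zero of the same sign. Assumes that float
 * is IEEE-754 binary32. */
static float example_flush_denorm(float x)
{
    uint32_t bits;

    memcpy(&bits, &x, sizeof(bits));
    /* An all-zero exponent field means zero or denormal. */
    if ((bits & 0x7f800000u) == 0)
        bits &= 0x80000000u; /* keep only the sign bit */
    memcpy(&x, &bits, sizeof(x));
    return x;
}

/* min() whose output is canonicalized. NaN handling is deliberately
 * left out of this sketch. */
static float example_min_flushed(float a, float b)
{
    float r = a < b ? a : b;

    return example_flush_denorm(r);
}
```

For instance, `example_min_flushed(1e-40f, 1.0f)` yields zero because 1e-40 is below the smallest normal binary32 value.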
* radv: flush f32->f16 conversion denormals to zero. (v2) | Dave Airlie | 2017-05-03 | 2 | -4/+41
    SPIR-V defines the f32->f16 operation as flushing denormals to 0; this
    compares the class using the amd class opcode.

    Thanks to Matt Arsenault for figuring it out.

    This fix is VI+ only; add a TODO for SI/CIK.

    This fixes:
      dEQP-VK.spirv_assembly.instruction.compute.opquantize.flush_to_zero

    Acked-by: Bas Nieuwenhuizen <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
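The flush step can be modelled on the CPU as follows. This is a sketch under stated assumptions: floats are IEEE-754 binary32, the function name is hypothetical, and only the flush-to-zero part is shown (rounding the surviving values to half precision is omitted). It uses a bit test where the commit uses the hardware class instruction.

```c
#include <stdint.h>
#include <string.h>

/* Flush any value too small to be a normalized 16-bit float to a zero of
 * the same sign. The smallest normal half is 2^-14. */
static float example_quantize_flush(float x)
{
    uint32_t bits, magnitude;

    memcpy(&bits, &x, sizeof(bits));
    magnitude = bits & 0x7fffffffu;
    /* 0x38800000 is the binary32 encoding of 2^-14. */
    if (magnitude < 0x38800000u)
        bits &= 0x80000000u; /* keep only the sign bit */
    memcpy(&x, &bits, sizeof(x));
    return x;
}
```

Comparing the magnitude bits as an unsigned integer works because positive binary32 values are ordered the same way as their encodings.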
* radv: Add top of pipe timestamp queries. | Bas Nieuwenhuizen | 2017-05-02 | 1 | -0/+1
    Does not fix brokenness with the ready bit.

    Signed-off-by: Bas Nieuwenhuizen <[email protected]>
    Reviewed-by: Dave Airlie <[email protected]>
* radeon/ac: remove assert causing regression | Dave Airlie | 2017-04-27 | 1 | -1/+0
    This assert wasn't in the original radeonsi code but I added it without
    totally understanding the original code, it caused some regressions in
    variable-indexing tessellation shaders.

    Fixes: e2659176 radeonsi/ac: move vertex export remove to common code.
    Reported-by: Michel Dänzer <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* radeon/ac: fix build on llvm 3.8.1 | Dave Airlie | 2017-04-27 | 1 | -0/+1
    Add missing include to fix build.

    Signed-off-by: Dave Airlie <[email protected]>
* radv/ac: eliminate unused vertex shader outputs. (v2) | Dave Airlie | 2017-04-27 | 2 | -14/+37
    This is ported from radeonsi, and I can see at least one Talos shader
    drops an export due to this, and saves some VGPR usage.

    v2: use shared code.

    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* radeonsi/ac: move vertex export remove to common code. | Dave Airlie | 2017-04-27 | 5 | -1/+221
    This code can be shared by radv, we bump the max to VARYING_SLOT_MAX
    here, but that shouldn't have too much fallout.

    Reviewed-by: Marek Olšák <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* radv/ac: setup mrt exports then export them in one go. (v2) | Dave Airlie | 2017-04-25 | 1 | -15/+19
    Noticed while looking at Sascha Willems deferred shaders.

    This is a bit of an llvm workaround, llvm was producing this:
        v_cvt_pkrtz_f16_f32_e64 v4, v7, v8 ; D2960004 00021107
        v_cvt_pkrtz_f16_f32_e64 v6, v9, 1.0 ; D2960006 0001E509
        s_waitcnt vmcnt(0) ; BF8C0F70
        exp mrt0 v4, v4, v6, v6 compr ; C400040F 00000604
        s_waitcnt expcnt(0) ; BF8C0F0F
        v_cvt_pkrtz_f16_f32_e64 v4, v12, v5 ; D2960004 00020B0C
        v_cvt_pkrtz_f16_f32_e64 v5, v14, 1.0 ; D2960005 0001E50E
        exp mrt1 v4, v4, v5, v5 compr ; C400041F 00000504
        s_waitcnt expcnt(0) ; BF8C0F0F
        v_cvt_pkrtz_f16_f32_e64 v0, v0, v1 ; D2960000 00020300
        v_cvt_pkrtz_f16_f32_e64 v1, v2, v3 ; D2960001 00020702
        exp mrt2 v0, v0, v1, v1 done compr vm ; C4001C2F 00000100

    After this change:
        v_cvt_pkrtz_f16_f32_e64 v4, v7, v8 ; D2960004 00021107
        s_waitcnt vmcnt(0) ; BF8C0F70
        v_cvt_pkrtz_f16_f32_e64 v0, v0, v1 ; D2960000 00020300
        v_cvt_pkrtz_f16_f32_e64 v6, v9, 1.0 ; D2960006 0001E509
        v_cvt_pkrtz_f16_f32_e64 v5, v12, v5 ; D2960005 00020B0C
        v_cvt_pkrtz_f16_f32_e64 v7, v14, 1.0 ; D2960007 0001E50E
        exp mrt0 v4, v4, v6, v6 compr ; C400040F 00000604
        v_cvt_pkrtz_f16_f32_e64 v1, v2, v3 ; D2960001 00020702
        exp mrt1 v5, v5, v7, v7 compr ; C400041F 00000705
        exp mrt2 v0, v0, v1, v1 done compr vm ; C4001C2F 00000100

    No waitcnt for exports are emitted.

    v2: fixup index->mrt mapping (Bas).

    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* radv/ac: overhaul vs output/ps input routing | Dave Airlie | 2017-04-25 | 2 | -6/+19
    In order to cleanly eliminate exports rewrite the code first to mirror
    how radeonsi works for now.

    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* radv/ac: copy llvm machine feature flags from radeonsi. | Dave Airlie | 2017-04-24 | 1 | -1/+1
    This just updates this to use the same flags as radeonsi for
    consistency.

    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* radv/ac: use tex_lz if we can. | Dave Airlie | 2017-04-20 | 1 | -6/+16
    Looking at some Talos shaders vs radeonsi, I noticed they use tex_lz in
    a few places, so we should be able to as well.

    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* ac: fix build after LLVM 5.0 SVN r300718 | Christoph Haag | 2017-04-20 | 1 | -0/+4
    v2: previously getWithDereferenceableBytes() exists, but addAttr()
    doesn't take that type

    Signed-off-by: Christoph Haag <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
    Tested-and-reviewed-by: Mike Lothian <[email protected]>
* radv/ac: Fix nir.h include | Mike Lothian | 2017-04-19 | 1 | -1/+1
    This fixes the build after:

        commit 224cf2906a8f38ce47411afc93a223ac0e41795f
        Author: Dave Airlie <[email protected]>
        Date: Mon Apr 17 13:01:52 2017 +1000

            radv/ac: add initial pre-pass for shader info gathering

    Signed-off-by: Mike Lothian <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* radv: drop debugging leftovers code in descriptor set patches. | Dave Airlie | 2017-04-19 | 1 | -3/+0
    Signed-off-by: Dave Airlie <[email protected]>
* radv/ac: add support for indirect access of descriptor sets. | Dave Airlie | 2017-04-19 | 2 | -16/+41
    We want to expose more descriptor sets to the applications, but
    currently we have a 1:1 mapping between shader descriptor sets and
    2 user sgprs, limiting us to 4 per stage. This commit checks whether we
    have enough user sgprs for the number of bound sets for this shader; if
    not, we can ask for them to be indirected.

    Two sgprs are then used to point to a buffer of 64-bit pointers, one per
    allocated descriptor set. All shaders point to the same buffer.

    We can use some user sgprs to inline one or two descriptor sets in
    future, but until we have a workload that needs this I don't think we
    should spend too much time on it.

    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
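The decision described here can be sketched as a small planning function. Everything below is hypothetical (struct, function name, and the example register counts); it assumes, as the message suggests, that each directly bound set costs two user sgprs and that the indirect path costs two sgprs in total for a pointer to a table of per-set pointers.

```c
#include <stdbool.h>

/* Hypothetical result of planning user sgprs for descriptor sets. */
struct example_sgpr_plan {
    bool indirect_sets;      /* true: sets are reached through one table */
    unsigned sgprs_for_sets; /* user sgprs consumed by descriptor sets */
};

static struct example_sgpr_plan
example_plan_sets(unsigned free_user_sgprs, unsigned num_bound_sets)
{
    struct example_sgpr_plan plan;
    /* Direct mapping: one 64-bit pointer, i.e. two sgprs, per set. */
    unsigned direct = 2 * num_bound_sets;

    if (direct <= free_user_sgprs) {
        plan.indirect_sets = false;
        plan.sgprs_for_sets = direct;
    } else {
        /* Indirect mapping: two sgprs hold the address of a buffer that
         * stores the per-set pointers. */
        plan.indirect_sets = true;
        plan.sgprs_for_sets = 2;
    }
    return plan;
}
```

With 8 free user sgprs, 4 bound sets fit directly, while 6 bound sets fall back to the indirect table at the price of one extra memory load per set access.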
* radv: start allocating user sgprs | Dave Airlie | 2017-04-19 | 1 | -13/+74
    This adds an initial implementation to allocate the user sgprs and make
    sure we don't run out if we try to bind a bunch of descriptor sets.

    This can be enhanced further in the future if we add support for
    inlining push constants.

    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* radv/ac: mark used descriptor sets in shader info. | Dave Airlie | 2017-04-19 | 2 | -0/+35
    This pre calculates the used descriptor sets.

    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* radv/ac: frag shader only needs ring offsets if sample positions enabled | Dave Airlie | 2017-04-19 | 1 | -1/+4
    mostly documenting things, since with modern llvm we always have the
    spill enabled.

    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* radv/ac: move needs_push_constants to shader info. | Dave Airlie | 2017-04-19 | 3 | -10/+11
    First step to optimising push constants.

    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* radv: optimise compute shader grid size emission. | Dave Airlie | 2017-04-19 | 3 | -5/+14
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* radv: start conditionalising vertex inputs. (v2) | Dave Airlie | 2017-04-19 | 3 | -8/+43
    In practice this will probably just drop draw id in a few places.

    v2: just do draw_id for now. (Bas)
    it might be possible to do something more if we need it in the future.
    (nha)

    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* radv/ac: add initial pre-pass for shader info gathering | Dave Airlie | 2017-04-19 | 4 | -8/+113
    There is some radv specific info we need to gather from shaders before
    we get into converting nir->llvm, so we can make better decisions
    especially around user sgpr allocation.

    This is just an initial placeholder to gather if sample positions are
    required in the frag shader.

    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* radv: Use an offset instead of pointers for immutable samplers. | Bas Nieuwenhuizen | 2017-04-12 | 1 | -5/+7
    Makes more sense when we hash the layout for the pipeline cache.

    Signed-off-by: Bas Nieuwenhuizen <[email protected]>
    Reviewed-by: Dave Airlie <[email protected]>
* ac: add unreachable() in ac_build_image_opcode() | Samuel Pitoiset | 2017-04-10 | 1 | -0/+2
    To silence the following compiler warning:

    common/ac_llvm_build.c: In function ‘ac_build_image_opcode’:
    common/ac_llvm_build.c:1080:3: warning: ‘name’ may be used uninitialized in this function [-Wmaybe-uninitialized]
       snprintf(intr_name, sizeof(intr_name), "%s%s%s%s.v4f32.%s.v8i32",
       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         name,
         ~~~~~
         a->compare ? ".c" : "",
         ~~~~~~~~~~~~~~~~~~~~~~~
         a->bias ? ".b" :
         ~~~~~~~~~~~~~~~~
         a->lod ? ".l" :
         ~~~~~~~~~~~~~~~
         a->deriv ? ".d" :
         ~~~~~~~~~~~~~~~~~
         a->level_zero ? ".lz" : "",
         ~~~~~~~~~~~~~~~~~~~~~~~~~~~
         a->offset ? ".o" : "",
         ~~~~~~~~~~~~~~~~~~~~~~
         type);
         ~~~~~

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
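The pattern behind this kind of fix can be shown in isolation. The sketch below is not the code from ac_llvm_build.c: the enum, the names, and the `EXAMPLE_UNREACHABLE` macro are hypothetical, and the macro assumes a GCC- or Clang-compatible compiler for `__builtin_unreachable()`.

```c
#include <assert.h>

/* Hypothetical local macro: abort in debug builds and tell the optimizer
 * that control never gets here. */
#if defined(__GNUC__) || defined(__clang__)
#define EXAMPLE_UNREACHABLE(msg) \
    do { assert(!msg); __builtin_unreachable(); } while (0)
#else
#define EXAMPLE_UNREACHABLE(msg) assert(!msg)
#endif

enum example_opcode { EXAMPLE_SAMPLE, EXAMPLE_LOAD, EXAMPLE_GATHER4 };

static const char *example_opcode_name(enum example_opcode op)
{
    const char *name;

    switch (op) {
    case EXAMPLE_SAMPLE:
        name = "sample";
        break;
    case EXAMPLE_LOAD:
        name = "load";
        break;
    case EXAMPLE_GATHER4:
        name = "gather4";
        break;
    default:
        /* Without this, the compiler sees a path where 'name' is read
         * before being assigned and may warn about it. */
        EXAMPLE_UNREACHABLE("invalid opcode");
    }
    return name;
}
```

Marking the default case as unreachable removes the only path on which `name` would be uninitialized, so the warning has nothing left to report.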
* radv: fix interp at sample code. | Dave Airlie | 2017-04-04 | 1 | -3/+1
    Interp at sample needs to use the center, since the sample positions it
    retrieves are relative to the center.

    This fixes a bunch of CTS tests with multisample_interpolation.

    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
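Why the base point matters can be seen from a first-order extrapolation. The sketch below is a simplified scalar model with hypothetical names, not the driver's code: it assumes the attribute varies linearly across the pixel and that the sample offset is expressed relative to the pixel center.

```c
/* Hypothetical 2D offset of a sample, relative to the pixel center. */
struct example_vec2 {
    float x, y;
};

/* Extrapolate an attribute from its value at the pixel center using its
 * screen-space derivatives. Starting from any other base point (for
 * example the centroid) would apply the center-relative offset to the
 * wrong origin. */
static float example_interp_at_offset(float center_value, float ddx, float ddy,
                                      struct example_vec2 offset)
{
    return center_value + ddx * offset.x + ddy * offset.y;
}
```

For a center value of 1.0 with derivatives (2.0, 4.0), a sample at offset (0.25, -0.25) evaluates to 1.0 + 0.5 - 1.0 = 0.5.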