path: root/src/amd/common
Commit message | Author | Age | Files | Lines
* radv/gfx10: set llvm_has_working_vgpr_indexing | Samuel Pitoiset | 2019-07-07 | 1 | -3/+2
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: unpacked GS invocation ID on GFX10+ | Samuel Pitoiset | 2019-07-07 | 1 | -3/+10
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: add missing formats to ac_get_tbuffer_format() for GFX10 | Samuel Pitoiset | 2019-07-07 | 1 | -0/+2
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: destroy passes in ac_destroy_llvm_compiler | Marek Olšák | 2019-07-04 | 1 | -0/+3
| Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* ac: use an LLVM fence instead of s.waitcnt when possible | Marek Olšák | 2019-07-04 | 1 | -9/+9
| Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* ac: remove unused AC_WAIT_EXP | Marek Olšák | 2019-07-04 | 2 | -7/+3
| Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* ac: only set ac_dlc in ac_llvm_build.c | Marek Olšák | 2019-07-04 | 2 | -8/+12
| Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* ac: replace glc,slc with cache_policy for loads | Marek Olšák | 2019-07-04 | 3 | -77/+60
| cosmetic change
|
| Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* ac: replace glc,slc with cache_policy for stores | Marek Olšák | 2019-07-04 | 3 | -55/+40
| cosmetic change
|
| Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* amd/common: move ac_shader_{binary,reloc} into r600 and rename | Nicolai Hähnle | 2019-07-04 | 2 | -39/+0
| They are no longer used by radeonsi or radv.
|
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
| Reviewed-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Marek Olšák <[email protected]>
* amd/common: removed unused ac_shader_binary functions | Nicolai Hähnle | 2019-07-04 | 2 | -240/+0
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
| Reviewed-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Marek Olšák <[email protected]>
* amd/common: remove unused ac_compile_module_to_binary | Nicolai Hähnle | 2019-07-04 | 2 | -16/+0
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
| Reviewed-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Marek Olšák <[email protected]>
* ac: rework ac_build_waitcnt for gfx10 | Marek Olšák | 2019-07-03 | 3 | -14/+49
| Acked-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi/gfx10: set PA_SC_TILE_STEERING_OVERRIDE | Marek Olšák | 2019-07-03 | 2 | -0/+3
| Acked-by: Bas Nieuwenhuizen <[email protected]>
* amd/common/gfx10: set DLC for llvm.amdgcn.s.buffer.load | Nicolai Hähnle | 2019-07-03 | 1 | -3/+1
| Acked-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi/gfx10: set DLC for loads when GLC is set | Marek Olšák | 2019-07-03 | 3 | -12/+26
| This fixes L1 shader array cache coherency.
|
| Acked-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi/gfx10: implement hardware MSAA resolve | Nicolai Hähnle | 2019-07-03 | 2 | -1/+2
| MSAA is only supported for 64KB_{R,Z}_X modes, so the micro tile optimization that we use on gfx9 and earlier does not work.
|
| Be very explicit about how the swizzle mode of the temporary surface is selected.
|
| Acked-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi/gfx10: implement gfx10_shader_ngg | Nicolai Hähnle | 2019-07-03 | 1 | -0/+1
| For pipelines without API GS. We will later expand this to cover NGG geometry shaders as well.
|
| Note that the vtx offset passed into the GS part is just the vertex index multiplied by VGT_ESGS_RING_ITEMSIZE.
|
| Acked-by: Bas Nieuwenhuizen <[email protected]>
* ac/surface/gfx10: allow "rotated" micro mode | Nicolai Hähnle | 2019-07-03 | 2 | -8/+8
| Standard mode does not support DCC. The R is retconned to "render target" on gfx10.
|
| Acked-by: Bas Nieuwenhuizen <[email protected]>
* ac/surface/gfx10: DCC is only supported with SW_64KB_{Z,R}_X modes | Nicolai Hähnle | 2019-07-03 | 1 | -3/+10
| Acked-by: Bas Nieuwenhuizen <[email protected]>
* amd/common/gfx10: print gfx10 registers in debug dumps | Nicolai Hähnle | 2019-07-03 | 1 | -1/+3
| Acked-by: Bas Nieuwenhuizen <[email protected]>
* amd/common/gfx10: CMASK is only used for FMASK | Nicolai Hähnle | 2019-07-03 | 1 | -2/+3
| All regular color compression is done via DCC.
|
| Acked-by: Bas Nieuwenhuizen <[email protected]>
* amd/common/gfx10: support new tbuffer encoding | Nicolai Hähnle | 2019-07-03 | 1 | -2/+45
| Acked-by: Bas Nieuwenhuizen <[email protected]>
* amd/common/gfx10: pad shader buffers for instruction prefetch | Nicolai Hähnle | 2019-07-03 | 1 | -0/+19
| Acked-by: Bas Nieuwenhuizen <[email protected]>
* amd/common/gfx10: implement scan & reduce operations | Nicolai Hähnle | 2019-07-03 | 1 | -8/+104
| Acked-by: Bas Nieuwenhuizen <[email protected]>
* amd/common/gfx10: add GS_ALLOC_REQ message define | Nicolai Hähnle | 2019-07-03 | 1 | -0/+1
| Acked-by: Bas Nieuwenhuizen <[email protected]>
* amd/common/gfx10: print out GCR_CNTL as part of {ACQUIRE,RELEASE}_MEM | Nicolai Hähnle | 2019-07-03 | 1 | -11/+17
| Acked-by: Bas Nieuwenhuizen <[email protected]>
* amd/common/gfx10: add register JSON | Nicolai Hähnle | 2019-07-03 | 1 | -2/+4
| A small number of fields now need new disambiguation.
|
| Acked-by: Bas Nieuwenhuizen <[email protected]>
* amd/common: add GFX10 chips | Nicolai Hähnle | 2019-07-03 | 3 | -3/+20
| Acked-by: Bas Nieuwenhuizen <[email protected]>
* amd/addrlib: add gfx10 support | Marek Olšák | 2019-07-03 | 1 | -1/+1
| Acked-by: Bas Nieuwenhuizen <[email protected]>
* ac: compute the DCC fast clear size per slice on GFX8 | Samuel Pitoiset | 2019-07-02 | 2 | -0/+28
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: compute the size of one DCC slice on GFX8 | Samuel Pitoiset | 2019-07-02 | 2 | -0/+7
| Addrlib doesn't provide this info. Because DCC is linear, at least on GFX8, it's easy to compute the size of one slice.
|
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
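The commit above does not spell out the computation. As a rough sketch of one way a "linear" layout makes the per-slice size easy to derive, the C below assumes that all slices are the same size and stored back to back, so the slice size is the total size divided by the slice count. The function names and parameters are hypothetical and are not taken from Mesa or addrlib.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only; these are not Mesa functions. */

/* Size in bytes of one DCC slice, ASSUMING all slices are equal-sized and
 * stored back to back. Returns 0 when the inputs do not fit that assumption
 * (no slices, or a total size that does not divide evenly). */
static uint64_t demo_dcc_slice_size(uint64_t dcc_total_size, uint32_t num_slices)
{
    if (num_slices == 0 || dcc_total_size % num_slices != 0)
        return 0;
    return dcc_total_size / num_slices;
}

/* Byte offset of a given slice inside the DCC buffer, under the same
 * assumption. A zero slice size means the inputs were rejected above. */
static uint64_t demo_dcc_slice_offset(uint64_t slice_size, uint32_t slice_index)
{
    assert(slice_size != 0);
    return slice_size * slice_index;
}
```

With a per-slice size and offset available, a clear can target a range of layers instead of the whole buffer, which is the kind of use the neighbouring per-slice clear commits describe.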
* ac: change ac_query_gpu_info() signature | Emil Velikov | 2019-06-28 | 2 | -4/+3
| Currently libdrm_amdgpu provides a typedef of the various handles. While the goal was to make those opaque, it effectively became part of the API.
|
| To the best of my knowledge there are two ways to have opaque handles:
| - "typedef void *foo;" - rather messy IMHO
| - "struct foo;" and use "struct foo *" through the API
|
| In our case amdgpu_device_handle is used only internally, plus respective code is not used or applicable for r300 and r600. Hence we copied the typedef.
|
| Seemingly this will be a problem since libdrm_amdgpu wants to change the API, while not updating the code(?).
|
| Either way, we can safely s/amdgpu_device_handle/void */ and carry on.
|
| Cc: Michel Dänzer <[email protected]>
| Signed-off-by: Emil Velikov <[email protected]>
| Reviewed-by: Marek Olšák <marek.olsak at amd.com>
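The two opaque-handle styles named in the message can be sketched in C as follows. Every identifier here is hypothetical and chosen for illustration; none of it is the libdrm_amdgpu API, and the real `ac_query_gpu_info()` is not reproduced.

```c
#include <assert.h>
#include <stdlib.h>

/* All names below are hypothetical; this is not the libdrm_amdgpu API. */

/* Public view: an incomplete type. Callers can hold a pointer to it but
 * cannot look inside it or depend on its layout. */
struct demo_device;

/* Private view: the full definition, which would normally live only in the
 * library's own .c file. */
struct demo_device {
    int fd;
};

static struct demo_device *demo_device_create(int fd)
{
    struct demo_device *dev = malloc(sizeof(*dev));
    if (dev)
        dev->fd = fd;
    return dev;
}

static int demo_device_fd(const struct demo_device *dev)
{
    assert(dev != NULL);
    return dev->fd;
}

static void demo_device_destroy(struct demo_device *dev)
{
    free(dev);
}

/* A consumer that only forwards the handle can accept it as void *, which
 * needs no typedef from the library header. In C the pointer converts back
 * implicitly when passed on. */
static int demo_query_fd(void *dev)
{
    return demo_device_fd(dev);
}
```

The `struct demo_device *` form keeps type checking at call sites, while the `void *` form trades that checking for independence from the library's typedef, which matches the trade-off the message accepts.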
* radv: clear CMASK layers instead of the whole buffer on GFX8 | Samuel Pitoiset | 2019-06-25 | 2 | -1/+3
| This reduces the size of fill operations needed to clear CMASK for layered color textures.
|
| GFX9 unsupported for now.
|
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: clear FMASK layers instead of the whole buffer on GFX8 | Samuel Pitoiset | 2019-06-25 | 2 | -0/+2
| This reduces the size of fill operations needed to clear FMASK for layered color textures.
|
| GFX9 unsupported for now.
|
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: set the calling convention for inlined function calls | Marek Olšák | 2019-06-24 | 2 | -0/+11
| otherwise the behavior is undefined
|
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
| Tested-by: Dieter Nützel <[email protected]>
* amd/rtld: update the ELF representation of LDS symbols | Nicolai Hähnle | 2019-06-24 | 1 | -7/+27
| The initial prototype used a processor-specific symbol type, but feedback suggests that an approach using a processor-specific section name that encodes the alignment, analogous to SHN_COMMON symbols, is preferred.
|
| This patch keeps both variants around for now to reduce problems with LLVM compatibility as we switch branches around.
|
| This also cleans up the error reporting in this function.
|
| Tested-by: Dieter Nützel <[email protected]>
| Reviewed-by: Marek Olšák <[email protected]>
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/surface: remove addrlib_family_rev_id | Marek Olšák | 2019-06-24 | 3 | -108/+7
| Tested-by: Dieter Nützel <[email protected]>
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* amd/common: lower bitfield_extract to ubfe/ibfe. | Daniel Schürmann | 2019-06-24 | 2 | -35/+20
| Reviewed-by: Connor Abbott <[email protected]>
* amd/common: lower bitfield_insert to bfm & bitfield_select | Daniel Schürmann | 2019-06-24 | 1 | -26/+25
| Reviewed-by: Connor Abbott <[email protected]>
* ac/rtld: report better error messages for LDS overallocation | Nicolai Hähnle | 2019-06-19 | 1 | -2/+11
| Tested-by: Dieter Nützel <[email protected]>
* ac/rtld: check correct LDS max size | Marek Olšák | 2019-06-19 | 2 | -1/+9
| Tested-by: Dieter Nützel <[email protected]>
* radeonsi: add s_sethalt to shaders for debugging | Nicolai Hähnle | 2019-06-19 | 2 | -0/+18
| Tested-by: Dieter Nützel <[email protected]>
* ac/rtld: fix sorting of LDS symbols by alignment | Nicolai Hähnle | 2019-06-19 | 1 | -2/+2
| Tested-by: Dieter Nützel <[email protected]>
* ac/nir: Set speculatable for buffer loads where allowed | Connor Abbott | 2019-06-19 | 1 | -3/+4
| This brings the nir path in line with the TGSI path.
|
| Totals from affected shaders:
| SGPRS: 2984 -> 2984 (0.00 %)
| VGPRS: 2792 -> 2652 (-5.01 %)
| Spilled SGPRs: 0 -> 0 (0.00 %)
| Spilled VGPRs: 0 -> 0 (0.00 %)
| Private memory VGPRs: 0 -> 0 (0.00 %)
| Scratch size: 0 -> 0 (0.00 %) dwords per thread
| Code Size: 247380 -> 248072 (0.28 %) bytes
| LDS: 0 -> 0 (0.00 %) blocks
| Max Waves: 121 -> 132 (9.09 %)
| Wait states: 0 -> 0 (0.00 %)
|
| Most of the change came from DiRT: Showdown, and came from sinking SSBO loads.
|
| Reviewed-by: Timothy Arceri <[email protected]>
* ac,radeonsi: Always mark buffer stores as inaccessiblememonly | Connor Abbott | 2019-06-19 | 4 | -64/+32
| inaccessiblememonly means that it doesn't modify memory accessible via normal LLVM pointers. This lets LLVM's dead store elimination, memcpy forwarding, etc. ignore functions with this attribute. We don't represent descriptors as pointers, so this property is always true of buffer and image stores. There are plans to represent descriptors via pointers, but this just means that now nothing is inaccessiblememonly, as LLVM will then understand loads/stores via its usual alias analysis.
|
| Radeonsi was mistakenly only setting it if the driver could prove that there were no reads, and then it was cargo-culted into ac_llvm_build and ac_llvm_to_nir. Rip it out of everything.
|
| statistics with nir enabled:
|
| Totals from affected shaders:
| SGPRS: 152 -> 152 (0.00 %)
| VGPRS: 128 -> 132 (3.12 %)
| Spilled SGPRs: 0 -> 0 (0.00 %)
| Spilled VGPRs: 0 -> 0 (0.00 %)
| Private memory VGPRs: 0 -> 0 (0.00 %)
| Scratch size: 0 -> 0 (0.00 %) dwords per thread
| Code Size: 9324 -> 9244 (-0.86 %) bytes
| LDS: 2 -> 2 (0.00 %) blocks
| Max Waves: 17 -> 17 (0.00 %)
| Wait states: 0 -> 0 (0.00 %)
|
| The only difference was a manhattan31 shader.
|
| Acked-by: Timothy Arceri <[email protected]>
| Acked-by: Nicolai Hähnle <[email protected]>
| Reviewed-by: Marek Olšák <[email protected]>
* ac: make ac_compute_cmask() a static function | Samuel Pitoiset | 2019-06-17 | 2 | -7/+3
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Marek Olšák <[email protected]>
| Reviewed-By: Bas Nieuwenhuizen <[email protected]>
* ac: update llvm.amdgcn.icmp intrinsic name for LLVM 9+ | Samuel Pitoiset | 2019-06-17 | 1 | -3/+4
| LLVM r363339 changed llvm.amdgcn.icmp.i* to llvm.amdgcn.icmp.i64.i*.
|
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Marek Olšák <[email protected]>
| Reviewed-By: Bas Nieuwenhuizen <[email protected]>
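The rename described above can be illustrated with a small name-building helper. This is a sketch only: the helper name and its parameters are hypothetical, it is not the code from the commit, and it assumes the new spelling applies from LLVM major version 9 onward, as the subject line suggests.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not Mesa's code. Writes the intrinsic name for the
 * given operand width into buf and returns 0 on success, or -1 if the
 * buffer is too small. ASSUMPTION: the ".i64" return-type component is
 * present from LLVM major version 9 onward. */
static int demo_icmp_intrinsic_name(char *buf, size_t size,
                                    unsigned llvm_major, unsigned operand_bits)
{
    int len;

    assert(buf != NULL && size > 0);

    if (llvm_major >= 9)
        len = snprintf(buf, size, "llvm.amdgcn.icmp.i64.i%u", operand_bits);
    else
        len = snprintf(buf, size, "llvm.amdgcn.icmp.i%u", operand_bits);

    return (len >= 0 && (size_t)len < size) ? 0 : -1;
}
```

For a 32-bit operand, a call with `llvm_major` 9 fills the buffer with `llvm.amdgcn.icmp.i64.i32`, while `llvm_major` 8 yields `llvm.amdgcn.icmp.i32`.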
* ac: add radeon_info::is_amdgpu instead of checking drm_major == 3 | Marek Olšák | 2019-06-14 | 2 | -5/+9
| and clean up
|
| Reviewed-by: Samuel Pitoiset <[email protected]>
* amd/common: add support for AMD_shader_ballot functions | Daniel Schürmann | 2019-06-13 | 1 | -0/+20
| Reviewed-by: Connor Abbott <[email protected]>