summaryrefslogtreecommitdiffstats
path: root/src/amd
Commit message (Collapse)AuthorAgeFilesLines
* radeonsi/nir: implement default tess level system valuesMarek Olšák2019-08-122-3/+10
| | | | Reviewed-by: Connor Abbott <[email protected]>
* compiler: add SYSTEM_VALUE_USER_DATA_AMDMarek Olšák2019-08-122-0/+5
| | | | for internal radeonsi shaders
* compiler: add ACCESS_STREAM_CACHE_POLICYMarek Olšák2019-08-121-0/+3
| | | | | | radeonsi will use this. Reviewed-by: Connor Abbott <[email protected]>
* radv: Do not setup attachments without a framebuffer.Bas Nieuwenhuizen2019-08-121-3/+5
| | | | | | | Test that found this: dEQP-VK.geometry.layered.1d_array.secondary_cmd_buffer Fixes: 49e6c2fb78c "radv: Store color/depth surface info in attachment info instead of framebuffer." Reviewed-by: Dave Airlie <[email protected]>
* radv: Hash Wave32 settings in shader key.Bas Nieuwenhuizen2019-08-122-0/+9
| | | | | | | Can result in different shaders. Fixes: 8a86908e9a7 "radv/gfx10: add Wave32 support for vertex, tessellation and geometry shaders" Reviewed-by: Dave Airlie <[email protected]>
* radv: Properly use Wave64 for non-NGG GS and copy shader.Bas Nieuwenhuizen2019-08-121-1/+4
| | | | | Fixes: 8a86908e9a7 "radv/gfx10: add Wave32 support for vertex, tessellation and geometry shaders" Reviewed-by: Dave Airlie <[email protected]>
* radv: Put wave size in shader options/info.Bas Nieuwenhuizen2019-08-124-48/+38
| | | | | | | Instead of having the three values everywhere. This is also more future proof if we want the driver to make those decisions eventually. Reviewed-by: Dave Airlie <[email protected]>
* amd: prepare dropping include of p_compiler.hLionel Landwerlin2019-08-093-4/+5
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Acked-by: Eric Engestrom <[email protected]>
* radv: Avoid VEGA/RAVEN scissor bug in binning.Bas Nieuwenhuizen2019-08-081-1/+2
| | | | | CC: <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: Avoid binning RAVEN hangs.Bas Nieuwenhuizen2019-08-081-1/+2
| | | | | | | Mirroring radeonsi. CC: <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: Fix off by one for S_028C48_MAX_ALLOC_COUNT.Bas Nieuwenhuizen2019-08-081-1/+1
| | | | Reviewed-by: Dave Airlie <[email protected]>
* radv/gfx10: Enable DCC for storage images.Bas Nieuwenhuizen2019-08-073-4/+18
| | | | | | v2: Hide it behind a perftest flag. Reviewed-by: Dave Airlie <[email protected]>
* radv: Add device argument for dcc compression check.Bas Nieuwenhuizen2019-08-075-18/+24
| | | | | | Because it is about to be generation dependent. Reviewed-by: Dave Airlie <[email protected]>
* radv: Disable compression for compute DCC decompress store.Bas Nieuwenhuizen2019-08-073-13/+41
| | | | | | | Previously we relied on stores not using DCC but that is going to change, so disable compression explicitly. Reviewed-by: Dave Airlie <[email protected]>
* radv: Add extra struct to image view creation.Bas Nieuwenhuizen2019-08-0712-22/+27
| | | | | | | For extra args. Unlike image creation, I'm not embedding the vk struct in there, so all the inline structs can be kept. Reviewed-by: Dave Airlie <[email protected]>
* radv: Do not decompress on LAYOUT_GENERAL.Bas Nieuwenhuizen2019-08-071-3/+3
| | | | | | | We handle render loops properly now and STORAGE still disables DCC/TC-compat HTILE in general. Reviewed-by: Dave Airlie <[email protected]>
* radv: Pass through render loop detection to internal layout decisions.Bas Nieuwenhuizen2019-08-076-46/+100
| | | | | | | | And do nothing with it yet. Everything outside a renderpass has no render loop. Reviewed-by: Dave Airlie <[email protected]>
* radv: Add render loop detection in renderpass.Bas Nieuwenhuizen2019-08-072-0/+19
| | | | | | | | | | | | | | VK spec 7.3: "Applications must ensure that all accesses to memory that backs image subresources used as attachments in a given renderpass instance either happen-before the load operations for those attachments, or happen-after the store operations for those attachments." So the only renderloops we can have is with input attachments. Detect these. Reviewed-by: Dave Airlie <[email protected]>
* radv: Fix config reg assert.Bas Nieuwenhuizen2019-08-071-1/+1
| | | | | | | | Using the wrong bounds Fixes: "219d6939df8 radv: add more assertions to make sure packets are correctly emitted" Reviewed-by: Andres Rodriguez <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radeonsi: add support for nir atomic_inc_wrap/atomic_dec_wrapPierre-Eric Pelloux-Prayer2019-08-061-0/+25
| | | | Reviewed-by: Marek Olšák <[email protected]>
* ac: add ac_atomic_inc_wrap / ac_atomic_dec_wrap supportPierre-Eric Pelloux-Prayer2019-08-062-0/+4
| | | | Reviewed-by: Marek Olšák <[email protected]>
* radeonsi/nir: handle key.mono.u.ps.interpolate_at_sample_force_centerMarek Olšák2019-08-062-0/+4
| | | | Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* ac/nir: Use correct cast for readfirstlane and ptrs.Bas Nieuwenhuizen2019-08-061-0/+2
| | | | | | Fixes: 028ce527 "radv: Add non-uniform indexing lowering." Reviewed-by: Dave Airlie <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Do non-uniform lowering before bool lowering.Bas Nieuwenhuizen2019-08-061-1/+1
| | | | | | | | Since it can introduce comparisons. Fixes: 028ce527395 "radv: Add non-uniform indexing lowering." Reviewed-by: Dave Airlie <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* ac/nir: Lower large indirect variables to scratchConnor Abbott2019-08-051-0/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | results from radeonsi NIR: Totals from affected shaders: SGPRS: 704 -> 464 (-34.09 %) VGPRS: 2056 -> 672 (-67.32 %) Spilled SGPRs: 24 -> 0 (-100.00 %) Spilled VGPRs: 28406 -> 0 (-100.00 %) Private memory VGPRs: 0 -> 3182 (0.00 %) Scratch size: 1064 -> 3228 (203.38 %) dwords per thread Code Size: 935260 -> 40180 (-95.70 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 28 -> 70 (150.00 %) Wait states: 0 -> 0 (0.00 %) results from radv: Totals from affected shaders: SGPRS: 80 -> 48 (-40.00 %) VGPRS: 204 -> 108 (-47.06 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 256 (0.00 %) dwords per thread Code Size: 15792 -> 9504 (-39.82 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 1 -> 2 (100.00 %) Wait states: 0 -> 0 (0.00 %) Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* meson: replace last uses of libxmlconfig with idep_xmlconfigEric Engestrom2019-08-031-2/+1
| | | | | | Signed-off-by: Eric Engestrom <[email protected]> Acked-by: Eric Anholt <[email protected]> Tested-by: Vinson Lee <[email protected]>
* meson: replace libmesa_util with idep_mesautilEric Engestrom2019-08-031-2/+2
| | | | | | | | | | | This automates the include_directories and dependencies tracking so that all users of libmesa_util don't need to add them manually. Next commit will remove the ones that were only added for that reason. Signed-off-by: Eric Engestrom <[email protected]> Acked-by: Eric Anholt <[email protected]> Tested-by: Vinson Lee <[email protected]>
* radv: Expose VK_KHR_imageless_framebuffer.Bas Nieuwenhuizen2019-08-022-0/+7
| | | | Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Implement VK_KHR_imageless_framebuffer.Bas Nieuwenhuizen2019-08-022-10/+38
| | | | Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Store image view also outside framebuffer.Bas Nieuwenhuizen2019-08-026-33/+31
| | | | | | So we can use it with imageless framebuffers. Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Store color/depth surface info in attachment info instead of framebuffer.Bas Nieuwenhuizen2019-08-027-104/+102
| | | | | | That way we can use it for imageless framebuffers. Reviewed-by: Samuel Pitoiset <[email protected]>
* ac/nir,radv: Optimize bounds check for 64 bit CAS.Bas Nieuwenhuizen2019-08-027-17/+36
| | | | | | | | When the application does not ask for robust buffer access. Only implemented the check in radv. Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: fix image_has_{cmask,fmask}() helpersSamuel Pitoiset2019-08-021-2/+2
| | | | | | | | | | | The driver should now rely on cmask_offset because CMASK can be disabled by the driver for some reasons (eg. mipmaps). Apply the same change for FMASK, although it should be useless. Fixes: ad1bc8621df ("radv: remove radv_get_image_fmask_info()") Fixes: 10d08da52c6 ("radv/gfx10: add missing dcc_tile_swizzle tweak") Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: remove radv_get_image_fmask_info()Samuel Pitoiset2019-08-024-59/+25
| | | | | | | It's unnecessary to duplicate fields in another struct. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/gfx10: add missing dcc_tile_swizzle tweakSamuel Pitoiset2019-08-021-1/+3
| | | | | | Fixes: c90f46700dd ("radv/gfx10: mask DCC tile swizzle by alignment") Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: remove radv_get_image_cmask_info()Samuel Pitoiset2019-08-024-45/+21
| | | | | | | It's unnecessary to duplicate fields in another struct. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: only account for tile_swizzle for color surfaces with DCCSamuel Pitoiset2019-08-021-3/+3
| | | | | | | It's 0 for depth surfaces with TC compat HTILE enabled. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Enable VK_KHR_shader_atomic_int64Bas Nieuwenhuizen2019-08-022-6/+3
| | | | Reviewed-by: Samuel Pitoiset <[email protected]>
* ac/nir: Implement LLVM9 64-bit buffer compare & exchange.Bas Nieuwenhuizen2019-08-021-4/+64
| | | | | | | | | | | LLVM 9 does not have a 64-bit buffer compswap intrinsic, so this extracts the ptr, does a bound check and then uses a cmpxchg LLVM instruction. Not ideal, but the earliest release we're going to get a proper intrinsic is LLVM 10. Reviewed-by: Samuel Pitoiset <[email protected]>
* Revert "ac/nir: handle negate modifier"Connor Abbott2019-08-021-12/+1
| | | | This reverts commit bfea7e4d2965269bff8f1f6449cb99c312fd7384.
* Revert "ac/nir: handle abs modifier"Connor Abbott2019-08-021-30/+11
| | | | | | This reverts commit d3c80733cdfe8552b2f447ec8ed62465d0f2af1a. These were only appearing due to memory corruption.
* radv: re-apply "Optimize rebinding the same descriptor set."Samuel Pitoiset2019-08-021-1/+7
| | | | | | | | | | | | | This makes it cheaper to just change the dynamic offsets with the same descriptor sets. This optimization has been reverted a while back because of random GPU hangs on GFX9, no it looks fine, at least CTS no longer hangs on GFX9 and it doesn't hang on GFX10 as well. It fixes a performance problem with Wolfenstein Youngblood. Suggested-by: Philip Rebohle <[email protected]>
* radv/gfx10: use the correct target machine for Wave32Samuel Pitoiset2019-08-023-10/+26
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/gfx10: add Wave32 support for vertex, tessellation and geometry shadersSamuel Pitoiset2019-08-027-8/+26
| | | | | | | It can be enabled with RADV_PERFTEST=gewave32. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/gfx10: add Wave32 support for fragment shadersSamuel Pitoiset2019-08-027-2/+16
| | | | | | | It can be enabled with RADV_PERFTEST=pswave32. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/gfx10: implement a GE bug workaroundSamuel Pitoiset2019-07-311-4/+23
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/gfx10: remove an obsolete VGT_REUSE_OFF workaroundSamuel Pitoiset2019-07-311-6/+0
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/gfx10: disable LATE_ALLOC_GS on Navi14Samuel Pitoiset2019-07-311-1/+8
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/gfx10: implement a bug workaround for GE_PC_ALLOCSamuel Pitoiset2019-07-312-17/+13
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/gfx10: implement a bug workaround for NGG -> legacy transitionsSamuel Pitoiset2019-07-312-2/+21
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>