summaryrefslogtreecommitdiffstats
path: root/src/amd/vulkan
Commit message (Collapse)AuthorAgeFilesLines
* radv: Use correct flush bits for flushing L2 during CB/DB flushes.Bas Nieuwenhuizen2018-01-041-13/+16
| | | | | | | | | | | | Copied from radeonsi. Putting in the correct metadata flush commands for eventually not flushing L2 on CB/DB switch. Does not remove the need for V_028A90_CACHE_FLUSH_AND_INV_TS_EVENT at the moment. Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Invalidate L1 for VK_ACCESS_VERTEX_ATTRIBUTE_READ_BIT.Bas Nieuwenhuizen2018-01-041-1/+1
| | | | | | | These are just shaders reads, so we need to invalidate L1. Fixes: 6dbb0eaccc "radv: handle subpass cache flushes" Reviewed-by: Samuel Pitoiset <[email protected]>
* radv/gfx9: reduce the number of input VGPRs for the GS stageSamuel Pitoiset2018-01-041-1/+14
| | | | | | | This can still be improved, but let's start with this. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* amd/common: scan if gl_PrimitiveID is used before translating to LLVMSamuel Pitoiset2018-01-041-3/+3
| | | | | | | | | It makes more sense to move all scan stuff in the same place. Also, we don't really need to duplicate the uses_primid field for each stages. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radv: Allow writing 0 scissors.Bas Nieuwenhuizen2018-01-041-1/+2
| | | | | | | When rasterization is disabled we can have that few. Fixes: 76603aa90b8 "radv: Drop the default viewport when 0 viewports are given." Reviewed-by: Dave Airlie <[email protected]>
* radv: Use correct HTILE expanded words.Bas Nieuwenhuizen2018-01-041-2/+4
| | | | | | | | | | | Seems like users are actually hitting 0xFFFFFFFF actually making things broken for them, and the mad max regression is fixed, so lets put this in once more. v2: Use 0xf for depth-only htile. (Dave) Fixes: af2844116fd "radv: Revert HTILE reset word to 0xFFFFFFFF." Reviewed-by: Dave Airlie <[email protected]>
* ac: rename has_syncobj_wait -> has_syncobj_wait_for_submitMarek Olšák2018-01-042-5/+5
| | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Implement binning on GFX9.Bas Nieuwenhuizen2017-12-314-6/+348
| | | | | | | | | | | | Overall it does not really help or hurt. The deferred demo gets 1% improvement and some games a 3% decrease, so I don't think this should be enabled by default. But with the code upstream it is easier to experiment with it. v2: Remove initializing the registers from si_emit_config. Reviewed-by: Dave Airlie <[email protected]>
* radv: Add flag for enabling binning.Bas Nieuwenhuizen2017-12-312-0/+9
| | | | | | Letting it be disabled by default. Reviewed-by: Dave Airlie <[email protected]>
* radv: Also set DCC params for sampling for input attachment usage.Bas Nieuwenhuizen2017-12-291-1/+2
| | | | | | | | Those are implemented as texture sampling, so we need to make the texture TC-compatible too. Fixes: 34d23e82ca9 "radv: set some dcc parameters depending on if texture will be sampled" Reviewed-by: Fredrik Höglund <[email protected]>
* radv: Enable DCC with transfers.Bas Nieuwenhuizen2017-12-291-2/+1
| | | | | | | | | Before this DCC was in practice disabled for most games. This enables practical DCC use. Expect a 5-10% perf increase on a bunch of games on vega @ 4k. Reviewed-by: Dave Airlie <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* radv: Decompress copy destination if formats are incompatible.Bas Nieuwenhuizen2017-12-291-2/+25
| | | | | | | | | | If both source and destination are DCC compressed, and their formats are not compatible, we need to decompress one of them to make sure we can do reinterpretation (which needs src format == dst format) . Reviewed-by: Dave Airlie <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* radv: Disable DCC for GENERAL layout and compute transfer dest.Bas Nieuwenhuizen2017-12-294-8/+47
| | | | | | | | | | | | | | | Apps can use this for render feedback loops, where things are defined if they render each pixel only once. However, DCC fails here, as the level of coherence is a block not a pixel, so disable it. This is also going to help implementing other stuff. Even if we optimize this later to only happen if there actually is a loop (if possible at all ...), then the machinery is still useful to exclude images accessible by the SDMA queue when that is implemented. Reviewed-by: Dave Airlie <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* radv: Don't init DCC metadata during FS resolve.Bas Nieuwenhuizen2017-12-291-5/+0
| | | | | | | | It should already be valid there + the RB will update it during rendering. Reviewed-by: Dave Airlie <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* radv: Make color meta operations layout aware.Bas Nieuwenhuizen2017-12-295-110/+145
| | | | | | | | | | | | | For fast clear eliminate and decompressions, we always use the most compressed format. For clears, the code already creates a renderpass on demand with the exact same layout as specified. Otherwise we start distinguishing between GENERAL and TRANSFER_DST_OPTIMAL. Reviewed-by: Dave Airlie <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* radv: Add compute DCC decompress.Bas Nieuwenhuizen2017-12-293-0/+275
| | | | | | | | | | | We do an in place copy where we read compressed and write decompressed. By doing this in sizes that cover entire DCC blocks and waiting for all reads in the block before starting to write we avoid corruption. In the end we clear the DCC metadata to 0xffffffff. Reviewed-by: Dave Airlie <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* radv: Use the meta fast clear destructor on construction failure.Bas Nieuwenhuizen2017-12-291-6/+3
| | | | | | | | Simplifies failure paths. The caller already calls radv_device_finish_meta_fast_clear_flush_state on failure. Reviewed-by: Dave Airlie <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* radv: Add GFX DCC decompress.Bas Nieuwenhuizen2017-12-292-12/+83
| | | | | Reviewed-by: Dave Airlie <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* radv: Don't enable DCC / TC compat HTILE for storage images.Bas Nieuwenhuizen2017-12-291-5/+6
| | | | | | | | | | | We don't get a layout when binding to a descriptor set, but can assume that the LAYOUT is GENERAL. For DCC stores with the DCC bits set will result in a hang, so better be safe than sorry. Reviewed-by: Dave Airlie <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* Revert "radv/gfx9: fix block compression texture views."Bas Nieuwenhuizen2017-12-291-35/+0
| | | | | | | | | This reverts commit 59515780433837ad3975f8ed20b93cf2fe6870e5. The mentioned commit causes a hang in DoW3 on Vega. Fixes: 59515780433 "radv/gfx9: fix block compression texture views." Acked-by: Dave Airlie <[email protected]>
* radv/gfx9: use correct swizzle parameter to work out border swizzle.Dave Airlie2017-12-291-2/+2
| | | | | | | | | | This should fix: dEQP-VK.pipeline.sampler.view_type.*.format.b4g4r4a4_unorm_pack16.address_modes.all_mode_clamp_to_border_opaque_black and a few others in that area. Fixes: b11c4a5546 (radv: add texture descriptor/fmask/cmask support for GFX9) Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv/gfx9: use a bigger hammer to flush cb/db caches.Dave Airlie2017-12-291-1/+8
| | | | | | | | | | | | amdvlk is probably more subtle than this but it never uses the inv cb/db variants, we fail some CTS tests without this. Fixes: dEQP-VK.renderpass.dedicated_allocation.formats.d32_sfloat_s8_uint.input*. Fixes: c2fbeb7ca05 (radv: add GFX9 cache flushing support.) Reviewed-by: Bas Nieuwenhuizen <[email protected]> (for now :-) Signed-off-by: Dave Airlie <[email protected]>
* radv/gfx9: fix block compression texture views.Dave Airlie2017-12-291-0/+35
| | | | | | | | | | | | This ports a fix from amdvlk, to fix the sizing for mip levels when block compressed images are viewed using uncompressed views. Fixes: dEQP-VK.image.texel_view_compatible.graphic.extended*bc* Fixes: e38685cc62e 'Revert "radv: disable support for VEGA for now."' Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv/gfx9: fix buffer to image for 3d images on compute queuesDave Airlie2017-12-292-15/+48
| | | | | | | | | This fixes some of the broken: dEQP-VK.synchronization.op.multi_queue.*64x64x8* tests. Fixes: e38685cc62e 'Revert "radv: disable support for VEGA for now."' Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv/gfx9: fix 3d image clears on compute queuesDave Airlie2017-12-292-9/+65
| | | | | | | | | This fixes some of the broken: dEQP-VK.synchronization.op.multi_queue.*64x64x8* tests. Fixes: e38685cc62e 'Revert "radv: disable support for VEGA for now."' Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv/gfx9: fix 3d image to image transfers on compute queues.Dave Airlie2017-12-292-20/+56
| | | | | | | | | This fixes some of the broken: dEQP-VK.synchronization.op.multi_queue.*64x64x8* tests. Fixes: e38685cc62e 'Revert "radv: disable support for VEGA for now."' Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: fix pipeline statistics end query on compute queueDave Airlie2017-12-281-1/+1
| | | | | | | | | | | | It's legal to a pipeline stat query on a compute queue, but we'd emit the wrong packet here. This should fix it to emit the correct packet. Noticed while inspecting the mpv hang. Fixes: ad61eac250 (radv: factor out eop event writing code. (v2)) Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: fix events on compute queues.Dave Airlie2017-12-281-1/+1
| | | | | | | | | | | The event emission wasn't sending the correct packet for gfx8 compute queues, which explains why it works on vega fine. This fixes the mpv vulkan hang. Fixes: ad61eac250 (radv: factor out eop event writing code. (v2)) Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: move local bos usage to a perftest flag.Dave Airlie2017-12-285-1/+5
| | | | | | | | | | These seem mildly unstable on vega, crashing CTS in various fun ways, and looks like leaking memory. Disable for now, but leave the option to enable them. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: Always use fragment resolve if dest uses DCC.Bas Nieuwenhuizen2017-12-281-5/+4
| | | | | | | HW resolve does not support it either. Fixes: 2a04f5481df "radv/meta: select resolve paths" Reviewed-by: Dave Airlie <[email protected]>
* radv: Use correct framebuffer size for partial FS resolves.Bas Nieuwenhuizen2017-12-281-2/+2
| | | | | | | Framebuffer is from 0,0, not (dst.x, dst.y). Fixes: 69136f4e633 "radv/meta: add resolve pass using fragment/vertex shaders" Reviewed-by: Dave Airlie <[email protected]>
* radv: Fix fragment resolve destination offset.Bas Nieuwenhuizen2017-12-281-2/+2
| | | | | | | | | | The position start at (dst.x, dst.y), so if we want the source to start at (src.x, src.y), we have to offset by (src.x-dst.x,src.y-dst.y). Haven't tested that this fixed anything yet, but found by inspection. Fixes: 69136f4e633 "radv/meta: add resolve pass using fragment/vertex shaders" Reviewed-by: Dave Airlie <[email protected]>
* radv: Don't handle DCC in compute resolve.Bas Nieuwenhuizen2017-12-281-4/+1
| | | | | | If the destination has DCC, we will use the FS resolve. Reviewed-by: Dave Airlie <[email protected]>
* radv: Flush caches before subpass resolve.Bas Nieuwenhuizen2017-12-282-0/+18
| | | | | Fixes: f4e499ec791 "radv: add initial non-conformant radv vulkan driver" Reviewed-by: Dave Airlie <[email protected]>
* radv: Invert condition for all samples identical during resolve.Bas Nieuwenhuizen2017-12-281-1/+1
| | | | | | | | the samples_identical instruction returns 0 if they are differet, so we have to do the extra work if the result is 0, not if it is != 0. Fixes: f4e499ec791 "radv: add initial non-conformant radv vulkan driver" Reviewed-by: Dave Airlie <[email protected]>
* radv: don't do format replacement on tc compat htile surfaces.Dave Airlie2017-12-281-1/+2
| | | | | | | | | | | | For copies the texture unit needs to know the depth format so it can read the htile data properly. This fixes: dEQP-VK.renderpass.suballocation.formats.d32_sfloat_s8_uint.load.clear Fixes: ad3d98da9f (radv: enable tc compatible htile for d32s8 also.) Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv/gfx9: use correct stencil format for tc compat htile.Dave Airlie2017-12-281-4/+7
| | | | | | | | | | | This needs to correspond to the bit depth of the Z plane. noticed in passing reading amdvlk. Fixes: fc6c77e162df3 (radv: fix TC-compat HTILE with VK_FORMAT_D32_SFLOAT_S8_UINT on Vega) Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: set some dcc parameters depending on if texture will be sampledDave Airlie2017-12-271-1/+10
| | | | | | | | | | This is ported from amdvlk which sets the independent 64b blocks only for image which will sample dcc. I'm not sure how to port this to radeonsi. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv/radeonsi: set dcc min uncompressed properly for APUs.Dave Airlie2017-12-271-0/+10
| | | | | | | This is ported from amdvlk. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* amd/common/radv/radeonsi: use register defines for dcc block sizes.Dave Airlie2017-12-271-3/+3
| | | | | | | | These are just taken from amdvlk, we probably knew these already, but may as well port them now. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: Fix DCC compatible formats.Bas Nieuwenhuizen2017-12-231-1/+1
| | | | | | | | | DCC was disabled when the image format is !!supported, which is one ! too many. Ironically the commit that introduced it was supposed to lead to more DCC use ... Fixes: 969537d9358 "radv: Add support for more DCC compression with VK_KHR_image_format_list." Reviewed-by: Dave Airlie <[email protected]>
* radv: reduce the number of small surfaces that need CMASK or DCCSamuel Pitoiset2017-12-221-1/+1
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/meta: fix blit paths for depth/stencil (v2.1)Dave Airlie2017-12-222-64/+80
| | | | | | | | | | | | | | This fixes the layout issue for the blit path as well. This fixes: dEQP-VK.api.copy_and_blit.core.blit_image.all_formats.depth_stencil.d32_sfloat_s8_uint_d32_sfloat_s8_uint* v2: use compatible render passes. v2.1: use enum Reviewed-by: Bas Nieuwenhuizen <[email protected]> Cc: "17.2 17.3" <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: handle depth/stencil image copy with layouts better. (v3.1)Dave Airlie2017-12-224-65/+106
| | | | | | | | | | | | | | | | | | | If we are doing a general->general transfer with HIZ enabled, we want to hit the tile surface disable bits in radv_emit_fb_ds_state, however we never get the current layout to know we are in general and meta hardcoded the transfer layout which is always tile enabled. This fixes: dEQP-VK.api.copy_and_blit.core.image_to_image.all_formats.depth_stencil.d32_sfloat_s8_uint_d32_sfloat_s8_uint.optimal_general dEQP-VK.api.copy_and_blit.core.image_to_image.all_formats.depth_stencil.d32_sfloat_s8_uint_d32_sfloat_s8_uint.general_general v2: refactor some shared helpers for blit patches v3: we only need multiple render passes as they should be compatible. v3.1: use enum (Bas) Reviewed-by: Bas Nieuwenhuizen <[email protected]> Cc: "17.2 17.3" <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: refactor blit2d pipeline creationDave Airlie2017-12-221-78/+33
| | | | | | | | This just refactors the gfx9 blit2d pipeline creation to be less lines of code. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv/gfx9: add support for 3d images to blit 2d pathsDave Airlie2017-12-222-23/+97
| | | | | | | | | | This add support for a 3D image reading path to the blit 2d paths, like I did for the clear paths. Fixes: e38685cc62e 'Revert "radv: disable support for VEGA for now."' Reviewed-by: Bas Nieuwenhuizen <[email protected]> Tested-by: Alex Smith <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv/gfx9: add 3d sampler image->buffer copy shader. (v3)Dave Airlie2017-12-222-18/+59
| | | | | | | | | | | | | | | | | | On GFX9 we must access 3D textures with 3D samplers AFAICS. This fixes: dEQP-VK.api.image_clearing.core.clear_color_image.3d.single_layer on GFX9 for me. v1.1: fix tex->sampler_dim to dim v2: send layer in from outside v3: don't regress on pre-gfx9 Fixes: e38685cc62e 'Revert "radv: disable support for VEGA for now."' Reviewed-by: Bas Nieuwenhuizen <[email protected]> Tested-by: Alex Smith <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: fix surface max layer count (v2)Dave Airlie2017-12-221-7/+7
| | | | | | | | | | | looking at traces I noticed we'd set slice_max too large sometimes. This should fix it. v2: fix missing - 1 Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: fix issue with multisample positions and interp_var_at_sample.Dave Airlie2017-12-221-1/+2
| | | | | | | | | | | | | | | | This fixes vmfaults seen on vega with: dEQP-VK.pipeline.multisample_interpolation.sample_interpolate_at_single_sample_.128_128_1.samples_1 These were caused by the don't allocate cmask but it was just accidental. The actual problem was the shader was trying to get the sample positions from a buffer, but the buffer was never getting configured to contain them, as the previous shader never needed them. Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Fixes: 1171b304f3 (radv: overhaul fragment shader sample positions.) Signed-off-by: Dave Airlie <[email protected]>
* radv/gfx9: fix primitive topology when adjacency is usedSamuel Pitoiset2017-12-211-1/+1
| | | | | | | | Found by inspection. Cc: 17.3 <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>