summaryrefslogtreecommitdiffstats
path: root/src/amd/vulkan
Commit message (Collapse)AuthorAgeFilesLines
* ac/shader: gather If TES reads TESSINNER or TESSOUTERSamuel Pitoiset2018-01-151-2/+0
| | | | | | | This shouldn't be scanned in the pipeline. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* radv/radeonsi/nir: lower 64bit flrpTimothy Arceri2018-01-131-0/+1
| | | | | | | | Fixes a bunch of arb_gpu_shader_fp64 piglit tests for example: generated_tests/spec/arb_gpu_shader_fp64/execution/built-in-functions/fs-mix-double-double-double.shader_test Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: don't emit unneeded vertex state.Dave Airlie2018-01-122-8/+49
| | | | | | | | | | | | | | | | | | If the number of instances hasn't changed and we've already emitted it, don't emit it again. If the vertex shader is the same and the first_instance, vertex_offset haven't changed don't emit them again. This increases the fps in GL_vs_VK -t 1 -m -api vk from around 40 to around 60 here, it may not impact anything else. Dieter also reported smoketest going from 1060->1200 fps. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Tested-by: Dieter Nützel <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* meson: Use dependencies for nirDylan Baker2018-01-111-3/+3
| | | | | | | | | | | | | | | | | This creates two new internal dependencies, idep_nir_headers and idep_nir. The former encapsulates the generation of nir_opcodes.h and nir_builder_opcodes.h and adding src/compiler/nir as an include path. This ensures that any target that needs nir headers will have the includes and that the generated headers will be generated before the target is build. The second, idep_nir, includes the first and additionally links to libnir. This is intended to make it easier to avoid race conditions in the build when using nir, since the number of consumers for libnir and it's headers are quite high. Acked-by: Eric Engestrom <[email protected]> Signed-off-by: Dylan Baker <[email protected]>
* meson: Use consistent styleDylan Baker2018-01-111-11/+21
| | | | | | | | | | | | | | | | | | | | Currently the meosn build has a mix of two styles: arg : [foo, ... bar], and arg : [ foo, ..., bar, ] For consistency let's pick one. I've picked the later style, which I think is more readable, and is more common in the mesa code base. v2: - fix commit message Acked-by: Eric Engestrom <[email protected]> Signed-off-by: Dylan Baker <[email protected]>
* radv: reset semaphores & fences on sync_file export.Bas Nieuwenhuizen2018-01-111-0/+16
| | | | | | | | | | | | | | | | | | Per spec: "Additionally, exporting a fence payload to a handle with copy transference has the same side effects on the source fence’s payload as executing a fence reset operation. If the fence was using a temporarily imported payload, the fence’s prior permanent payload will be restored." And similar for semaphores: "Additionally, exporting a semaphore payload to a handle with copy transference has the same side effects on the source semaphore’s payload as executing a semaphore wait operation. If the semaphore was using a temporarily imported payload, the semaphore’s prior permanent payload will be restored." Fixes: 42bc25a79c "radv: Advertise sync fd import and export." Reviewed-by: Dave Airlie <[email protected]>
* radv: Remove some typos.Bas Nieuwenhuizen2018-01-102-4/+4
| | | | Trivial.
* radv: Implement VK_EXT_discard_rectangles.Bas Nieuwenhuizen2018-01-105-6/+110
| | | | | | | | Tested with a modified deferred demo and no regressions in a 1.0.2 mustpass run. Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: Add mapping between dynamic state mask and external enum.Bas Nieuwenhuizen2018-01-103-38/+79
| | | | | | | | | The EXT values are really large, e.g. VK_DYNAMIC_STATE_DISCARD_RECTANGLE_EXT = 1000099000, so 1 << value is not going to fit into a 32-bit mask. Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: remove radv_pipeline_layout::push_constant_stages fieldSamuel Pitoiset2018-01-102-3/+0
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/gfx9: calculate the number of ES VGPRs for merged shadersSamuel Pitoiset2018-01-101-3/+10
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/gfx9: enable LDS for GS only if the ES type is TESSamuel Pitoiset2018-01-101-1/+2
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: avoid PS partial flushes when viewports/scissors don't changeSamuel Pitoiset2018-01-081-6/+33
| | | | | | | | | | | | | For Vega10 and Raven that need a special workaround for the scissor bug. This seems to give a minor boost for Talos and Dota 2, at least. To reduce the cost of memcmp, the driver checks if it's really useful to do the comparison. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add has_scissor_bug for Vega10 and RavenSamuel Pitoiset2018-01-083-2/+6
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/gfx9: do not load VGPR1 when GS uses points or linesSamuel Pitoiset2018-01-081-1/+3
| | | | | | | | VGPR1 is only needed for topology that needs 3 offsets like triangles or quads. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: make shader BOs read-only for the GPUSamuel Pitoiset2018-01-083-1/+7
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: make descriptor BOs read-only for the GPUSamuel Pitoiset2018-01-082-3/+7
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: make the indirect GFX config BO read-only for the GPUSamuel Pitoiset2018-01-081-1/+2
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/winsys: make IBs read-only for the GPUSamuel Pitoiset2018-01-081-6/+11
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/winsys: add RADEON_FLAG_READ_ONLYSamuel Pitoiset2018-01-082-1/+4
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/winsys: rework radv_amdgpu_bo_va_op()Samuel Pitoiset2018-01-081-17/+23
| | | | | | | Needed for the following commit. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: remove unused radv_color_buffer_info::cb_clear_valueXSamuel Pitoiset2018-01-051-2/+0
| | | | | | | Found by inspection. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas nieuwenhuizen <[email protected]>
* radv: limit the scissor bug workaround to Vega 10 and RavenSamuel Pitoiset2018-01-051-1/+6
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Use correct flush bits for flushing L2 during CB/DB flushes.Bas Nieuwenhuizen2018-01-041-13/+16
| | | | | | | | | | | | Copied from radeonsi. Putting in the correct metadata flush commands for eventually not flushing L2 on CB/DB switch. Does not remove the need for V_028A90_CACHE_FLUSH_AND_INV_TS_EVENT at the moment. Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Invalidate L1 for VK_ACCESS_VERTEX_ATTRIBUTE_READ_BIT.Bas Nieuwenhuizen2018-01-041-1/+1
| | | | | | | These are just shaders reads, so we need to invalidate L1. Fixes: 6dbb0eaccc "radv: handle subpass cache flushes" Reviewed-by: Samuel Pitoiset <[email protected]>
* radv/gfx9: reduce the number of input VGPRs for the GS stageSamuel Pitoiset2018-01-041-1/+14
| | | | | | | This can still be improved, but let's start with this. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* amd/common: scan if gl_PrimitiveID is used before translating to LLVMSamuel Pitoiset2018-01-041-3/+3
| | | | | | | | | It makes more sense to move all scan stuff in the same place. Also, we don't really need to duplicate the uses_primid field for each stages. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radv: Allow writing 0 scissors.Bas Nieuwenhuizen2018-01-041-1/+2
| | | | | | | When rasterization is disabled we can have that few. Fixes: 76603aa90b8 "radv: Drop the default viewport when 0 viewports are given." Reviewed-by: Dave Airlie <[email protected]>
* radv: Use correct HTILE expanded words.Bas Nieuwenhuizen2018-01-041-2/+4
| | | | | | | | | | | Seems like users are actually hitting 0xFFFFFFFF actually making things broken for them, and the mad max regression is fixed, so lets put this in once more. v2: Use 0xf for depth-only htile. (Dave) Fixes: af2844116fd "radv: Revert HTILE reset word to 0xFFFFFFFF." Reviewed-by: Dave Airlie <[email protected]>
* ac: rename has_syncobj_wait -> has_syncobj_wait_for_submitMarek Olšák2018-01-042-5/+5
| | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Implement binning on GFX9.Bas Nieuwenhuizen2017-12-314-6/+348
| | | | | | | | | | | | Overall it does not really help or hurt. The deferred demo gets 1% improvement and some games a 3% decrease, so I don't think this should be enabled by default. But with the code upstream it is easier to experiment with it. v2: Remove initializing the registers from si_emit_config. Reviewed-by: Dave Airlie <[email protected]>
* radv: Add flag for enabling binning.Bas Nieuwenhuizen2017-12-312-0/+9
| | | | | | Letting it be disabled by default. Reviewed-by: Dave Airlie <[email protected]>
* radv: Also set DCC params for sampling for input attachment usage.Bas Nieuwenhuizen2017-12-291-1/+2
| | | | | | | | Those are implemented as texture sampling, so we need to make the texture TC-compatible too. Fixes: 34d23e82ca9 "radv: set some dcc parameters depending on if texture will be sampled" Reviewed-by: Fredrik Höglund <[email protected]>
* radv: Enable DCC with transfers.Bas Nieuwenhuizen2017-12-291-2/+1
| | | | | | | | | Before this DCC was in practice disabled for most games. This enables practical DCC use. Expect a 5-10% perf increase on a bunch of games on vega @ 4k. Reviewed-by: Dave Airlie <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* radv: Decompress copy destination if formats are incompatible.Bas Nieuwenhuizen2017-12-291-2/+25
| | | | | | | | | | If both source and destination are DCC compressed, and their formats are not compatible, we need to decompress one of them to make sure we can do reinterpretation (which needs src format == dst format) . Reviewed-by: Dave Airlie <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* radv: Disable DCC for GENERAL layout and compute transfer dest.Bas Nieuwenhuizen2017-12-294-8/+47
| | | | | | | | | | | | | | | Apps can use this for render feedback loops, where things are defined if they render each pixel only once. However, DCC fails here, as the level of coherence is a block not a pixel, so disable it. This is also going to help implementing other stuff. Even if we optimize this later to only happen if there actually is a loop (if possible at all ...), then the machinery is still useful to exclude images accessible by the SDMA queue when that is implemented. Reviewed-by: Dave Airlie <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* radv: Don't init DCC metadata during FS resolve.Bas Nieuwenhuizen2017-12-291-5/+0
| | | | | | | | It should already be valid there + the RB will update it during rendering. Reviewed-by: Dave Airlie <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* radv: Make color meta operations layout aware.Bas Nieuwenhuizen2017-12-295-110/+145
| | | | | | | | | | | | | For fast clear eliminate and decompressions, we always use the most compressed format. For clears, the code already creates a renderpass on demand with the exact same layout as specified. Otherwise we start distinguishing between GENERAL and TRANSFER_DST_OPTIMAL. Reviewed-by: Dave Airlie <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* radv: Add compute DCC decompress.Bas Nieuwenhuizen2017-12-293-0/+275
| | | | | | | | | | | We do an in place copy where we read compressed and write decompressed. By doing this in sizes that cover entire DCC blocks and waiting for all reads in the block before starting to write we avoid corruption. In the end we clear the DCC metadata to 0xffffffff. Reviewed-by: Dave Airlie <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* radv: Use the meta fast clear destructor on construction failure.Bas Nieuwenhuizen2017-12-291-6/+3
| | | | | | | | Simplifies failure paths. The caller already calls radv_device_finish_meta_fast_clear_flush_state on failure. Reviewed-by: Dave Airlie <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* radv: Add GFX DCC decompress.Bas Nieuwenhuizen2017-12-292-12/+83
| | | | | Reviewed-by: Dave Airlie <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* radv: Don't enable DCC / TC compat HTILE for storage images.Bas Nieuwenhuizen2017-12-291-5/+6
| | | | | | | | | | | We don't get a layout when binding to a descriptor set, but can assume that the LAYOUT is GENERAL. For DCC stores with the DCC bits set will result in a hang, so better be safe than sorry. Reviewed-by: Dave Airlie <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* Revert "radv/gfx9: fix block compression texture views."Bas Nieuwenhuizen2017-12-291-35/+0
| | | | | | | | | This reverts commit 59515780433837ad3975f8ed20b93cf2fe6870e5. The mentioned commit causes a hang in DoW3 on Vega. Fixes: 59515780433 "radv/gfx9: fix block compression texture views." Acked-by: Dave Airlie <[email protected]>
* radv/gfx9: use correct swizzle parameter to work out border swizzle.Dave Airlie2017-12-291-2/+2
| | | | | | | | | | This should fix: dEQP-VK.pipeline.sampler.view_type.*.format.b4g4r4a4_unorm_pack16.address_modes.all_mode_clamp_to_border_opaque_black and a few others in that area. Fixes: b11c4a5546 (radv: add texture descriptor/fmask/cmask support for GFX9) Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv/gfx9: use a bigger hammer to flush cb/db caches.Dave Airlie2017-12-291-1/+8
| | | | | | | | | | | | amdvlk is probably more subtle than this but it never uses the inv cb/db variants, we fail some CTS tests without this. Fixes: dEQP-VK.renderpass.dedicated_allocation.formats.d32_sfloat_s8_uint.input*. Fixes: c2fbeb7ca05 (radv: add GFX9 cache flushing support.) Reviewed-by: Bas Nieuwenhuizen <[email protected]> (for now :-) Signed-off-by: Dave Airlie <[email protected]>
* radv/gfx9: fix block compression texture views.Dave Airlie2017-12-291-0/+35
| | | | | | | | | | | | This ports a fix from amdvlk, to fix the sizing for mip levels when block compressed images are viewed using uncompressed views. Fixes: dEQP-VK.image.texel_view_compatible.graphic.extended*bc* Fixes: e38685cc62e 'Revert "radv: disable support for VEGA for now."' Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv/gfx9: fix buffer to image for 3d images on compute queuesDave Airlie2017-12-292-15/+48
| | | | | | | | | This fixes some of the broken: dEQP-VK.synchronization.op.multi_queue.*64x64x8* tests. Fixes: e38685cc62e 'Revert "radv: disable support for VEGA for now."' Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv/gfx9: fix 3d image clears on compute queuesDave Airlie2017-12-292-9/+65
| | | | | | | | | This fixes some of the broken: dEQP-VK.synchronization.op.multi_queue.*64x64x8* tests. Fixes: e38685cc62e 'Revert "radv: disable support for VEGA for now."' Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv/gfx9: fix 3d image to image transfers on compute queues.Dave Airlie2017-12-292-20/+56
| | | | | | | | | This fixes some of the broken: dEQP-VK.synchronization.op.multi_queue.*64x64x8* tests. Fixes: e38685cc62e 'Revert "radv: disable support for VEGA for now."' Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: fix pipeline statistics end query on compute queueDave Airlie2017-12-281-1/+1
| | | | | | | | | | | | It's legal to a pipeline stat query on a compute queue, but we'd emit the wrong packet here. This should fix it to emit the correct packet. Noticed while inspecting the mpv hang. Fixes: ad61eac250 (radv: factor out eop event writing code. (v2)) Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>