path: root/src/amd
Commit message | Author | Age | Files | Lines
* radv: remove useless checks around radv_CmdBindPipeline() | Samuel Pitoiset | 2017-10-04 | 8 | -97/+34
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: check that pipeline is different before binding it | Samuel Pitoiset | 2017-10-04 | 1 | -2/+8
  We only need to dirty the descriptors when the pipeline is a new one, because user SGPRs can be potentially different.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
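The check described in this commit can be illustrated with a minimal sketch. The types and names below (`sketch_pipeline`, `sketch_cmd_state`, `sketch_bind_pipeline`) are hypothetical stand-ins, not the actual radv structures or the code this commit touches. The sketch only shows the idea: skip redundant binds, and mark descriptors dirty when the pipeline really changes.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the real radv structures. */
struct sketch_pipeline {
    int id;
};

struct sketch_cmd_state {
    const struct sketch_pipeline *bound; /* currently bound pipeline, or NULL */
    bool descriptors_dirty;              /* descriptors must be re-emitted */
};

/* Bind `pipeline`; returns true only if the binding actually changed. */
static bool sketch_bind_pipeline(struct sketch_cmd_state *state,
                                 const struct sketch_pipeline *pipeline)
{
    if (state->bound == pipeline)
        return false; /* same pipeline: nothing needs to be re-emitted */

    state->bound = pipeline;
    /* A new pipeline may lay out user SGPRs differently, so the
     * descriptors have to be emitted again. */
    state->descriptors_dirty = true;
    return true;
}
```

Binding the same pipeline twice in a row leaves `descriptors_dirty` untouched on the second call, which is the saving the commit message refers to.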
* radv: enable tc compatible htile for d32s8 also. | Dave Airlie | 2017-10-04 | 1 | -1/+2
  This enables tc compatible htile for stencil surfaces as well.
  This gives a 3-5fps boost on Mad Max on high@4k.
  It also depends on Bas's tc-compat htile patch.
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* radv: dump SPIRV when a GPU hang is detected | Samuel Pitoiset | 2017-10-04 | 4 | -4/+13
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: dump NIR when a GPU hang is detected | Samuel Pitoiset | 2017-10-04 | 4 | -11/+27
  This looks a bit ugly to me, but the existing codepath is not terribly elegant either.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: silence a warning | Marek Olšák | 2017-10-04 | 1 | -1/+3
* radv: Implement TC compatible HTILE. | Bas Nieuwenhuizen | 2017-10-04 | 4 | -6/+62
  The situations where we enable it are quite limited, but it works, even for Mad Max, so let's just enable it.
  Reviewed-by: Dave Airlie <[email protected]>
* radv: emit fmuladd instead of fma to llvm. | Dave Airlie | 2017-10-04 | 1 | -1/+1
  For Vulkan SPIR-V, the spec states that the precision of fma() is "Inherited from OpFMul followed by OpFAdd".
  Matt says the backend will do the right thing depending on the hardware being compiled for, if you use the fmuladd intrinsic.
  Using the Mad Max pts test, on high settings at 4K:
    CHP: 55->60
    HGDD: 46->50
    LM: 55->60
  No change on Stronghold.
  Thanks to Feral for spending the time to track this down.
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Cc: "17.2" <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* radv: make radv_dynamic_state_copy() static | Samuel Pitoiset | 2017-10-02 | 2 | -5/+1
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* amd/common: move ac_build_phi from radeonsi | Nicolai Hähnle | 2017-10-02 | 2 | -0/+19
  Reviewed-by: Marek Olšák <[email protected]>
* radv: remove unused radv_meta_state::btoi::render_pass handle | Samuel Pitoiset | 2017-10-02 | 1 | -1/+0
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: do not check the number of levels when doing fast htile | Samuel Pitoiset | 2017-10-02 | 1 | -3/+0
  We shouldn't reach this point because HTILE is only enabled when the number of levels is 1.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: cleanup radv_device_finish_meta_XXX() helpers | Samuel Pitoiset | 2017-10-02 | 8 | -219/+136
  Unnecessary to double check that handles are not NULL.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: select the pipeline outside of emit_fast_clear_flush() | Samuel Pitoiset | 2017-10-02 | 1 | -12/+11
  It can't change during the decompression pass.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: drop useless param in emit_depth_decomp() | Samuel Pitoiset | 2017-10-02 | 1 | -5/+4
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: drop useless check in depth_view_can_fast_clear() | Samuel Pitoiset | 2017-10-02 | 1 | -2/+1
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add radv_subpass_clear_attachment() helper | Samuel Pitoiset | 2017-10-02 | 1 | -20/+32
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add radv_attachment_needs_clear() helper | Samuel Pitoiset | 2017-10-02 | 1 | -39/+31
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: remove unused param in radv_handle_{cmask,dcc}_image_transition() | Samuel Pitoiset | 2017-10-02 | 1 | -8/+4
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add radv_vi_dcc_enabled() helper | Samuel Pitoiset | 2017-10-02 | 3 | -2/+7
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: do not need to double zero-init the meta state structures | Samuel Pitoiset | 2017-10-02 | 12 | -28/+0
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: inline destroy_render_pass() | Samuel Pitoiset | 2017-10-02 | 1 | -9/+6
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: use pipeline handles instead of objects for meta clear operations | Samuel Pitoiset | 2017-10-02 | 2 | -44/+36
  To be consistent with other meta operations.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: inline blit2d_unbind_dst() | Samuel Pitoiset | 2017-10-02 | 1 | -9/+3
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: rework DCC/CMASK/FMASK/HTILE allocations | Samuel Pitoiset | 2017-10-02 | 1 | -27/+56
  Add helpers and some comments to make the thing more readable.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: clamp depth comparison value only for fixed point formats | Nicolai Hähnle | 2017-09-29 | 1 | -0/+2
  The hardware usually does this automatically. However, we upgrade depth to Z32_FLOAT to enable TC-compatible HTILE, which means the hardware no longer clamps the comparison value for us.
  The only way to tell in the shader whether a clamp is required seems to be to communicate an additional bit in the descriptor table. While VI has some unused bits in the resource descriptor, those bits have unfortunately all been used in gfx9. So we use an unused bit in the sampler state instead.
  Fixes dEQP-GLES3.functional.texture.shadow.2d.linear.equal_depth_component32f and many other tests in dEQP-GLES3.functional.texture.shadow.*
  Fixes: d4d9ec55c589 ("radeonsi: implement TC-compatible HTILE")
  Reviewed-by: Marek Olšák <[email protected]>
  Tested-by: Dieter Nützel <[email protected]>
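The decision this commit describes can be sketched on the CPU side as follows. This is an illustration only: the real change lives in shader code generation, and the `upgraded_from_fixed_point` flag is a hypothetical stand-in for the spare sampler-state bit mentioned in the message.

```c
#include <stdbool.h>

/* Clamp a value to the [0, 1] range. */
static float sketch_saturate(float v)
{
    if (v < 0.0f)
        return 0.0f;
    if (v > 1.0f)
        return 1.0f;
    return v;
}

/* Returns the reference value to use for a shadow comparison.
 * `upgraded_from_fixed_point` is a hypothetical flag: true when the
 * texture was originally a fixed point depth format that has been
 * upgraded to Z32_FLOAT, so the hardware no longer clamps for us. */
static float sketch_shadow_ref(float ref, bool upgraded_from_fixed_point)
{
    return upgraded_from_fixed_point ? sketch_saturate(ref) : ref;
}
```

A genuine float depth texture passes the reference value through unchanged, while an upgraded fixed point texture gets the clamp the hardware would otherwise have applied.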
* amd/common: save an instruction in the build_cube_select sequence | Nicolai Hähnle | 2017-09-29 | 1 | -5/+6
  Avoid a v_cndmask: the absolute value is free due to input modifiers.
  Reviewed-by: Marek Olšák <[email protected]>
  Tested-by: Dieter Nützel <[email protected]>
* amd/common: fix build_cube_select | Nicolai Hähnle | 2017-09-29 | 1 | -3/+3
  Fix the custom cube coord selection sequence to be identical to the hardware v_cubesc/tc and OpenGL spec.
  Affects texture sampling with user-provided derivatives.
  Fixes dEQP-GLES3.functional.shaders.texture_functions.texturegrad.*
  Cc: [email protected]
  Reviewed-by: Marek Olšák <[email protected]>
  Tested-by: Dieter Nützel <[email protected]>
* amd/common: remove ac_shader_abi::chip_class | Nicolai Hähnle | 2017-09-29 | 2 | -13/+10
  Redundant with the recently added ac_llvm_context::chip_class.
  Reviewed-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* radv: add an assertion in radv_BeginCommandBuffer() | Gwan-gyeong Mun | 2017-09-28 | 1 | -0/+1
  To check a valid usage requirement.
  CID: 1401616
  Signed-off-by: Mun Gwan-gyeong <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: set image view type when decompressing depth surfaces | Samuel Pitoiset | 2017-09-28 | 1 | -0/+1
  This was missing.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* meson: build "radv" vulkan driver for radeon hardware | Dylan Baker | 2017-09-27 | 4 | -0/+277
  This builds, installs, and has been tested on a r290x (Hawaii) with the Vulkan CTS. It dies horribly in a fire at the same point for the meson build as the autotools build.
  v2: - enable radv by default
      - add shader cache support and enforce that it's built for radv
  v3: - Fix typo in meson_options (Nicholas)
      - strip trailing 'svn' from llvm version before setting the version preprocessor flag (Bas)
      - Check for LLVM module requirements
  Signed-off-by: Dylan Baker <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: store the amount of saved constants in the compute state | Samuel Pitoiset | 2017-09-27 | 7 | -17/+20
  It's safer and more elegant.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: remove useless radv_meta_{begin,end}_XXX() helpers | Samuel Pitoiset | 2017-09-27 | 4 | -62/+9
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: fix saved compute state when doing statistics/occlusion queries | Samuel Pitoiset | 2017-09-26 | 1 | -2/+2
  We are pushing 16 bytes of constants, so we have to save/restore the same amount of data to avoid data corruption.
  Cc: 17.2 <[email protected]>
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Dave Airlie <[email protected]>
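The invariant behind this fix is that the number of bytes saved must match the number of bytes the meta operation overwrites. A minimal sketch of that idea follows. The structure, the function names, and the 128-byte capacity are assumptions made for illustration, not the radv save/restore code itself.

```c
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_PUSH_CONSTANTS 128 /* assumed capacity, illustration only */

/* Hypothetical saved state: the bytes plus how many of them were saved. */
struct sketch_saved_compute {
    uint8_t push_constants[SKETCH_MAX_PUSH_CONSTANTS];
    unsigned size;
};

/* Save `size` bytes of live push constants before a meta operation. */
static void sketch_save_push_constants(struct sketch_saved_compute *saved,
                                       const uint8_t *live, unsigned size)
{
    if (size > SKETCH_MAX_PUSH_CONSTANTS)
        size = SKETCH_MAX_PUSH_CONSTANTS; /* never overrun the save area */
    saved->size = size;
    memcpy(saved->push_constants, live, size);
}

/* Restore exactly the amount that was saved. */
static void sketch_restore_push_constants(const struct sketch_saved_compute *saved,
                                          uint8_t *live)
{
    memcpy(live, saved->push_constants, saved->size);
}
```

Recording the size next to the data means the restore path cannot drift out of sync with the save path, which is roughly what the later "store the amount of saved constants in the compute state" commit aims for.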
* radv: save/restore all viewports/scissors for meta operations | Samuel Pitoiset | 2017-09-25 | 3 | -25/+43
  This is needed since we don't update the number of viewports/scissors when they are set dynamically (according to the spec).
  In the following scenario:
    * vkCmdSetViewport()
    * vkCmdClearColorImage() (or any other meta operations)
  The viewports/scissors weren't saved correctly because no pipeline was bound before, and thus the number of viewports/scissors were 0.
  This fixes a regression with: dEQP-VK.draw.negative_viewport_height.front_ccw_cull_back
  Fixes: 60878dd00c ("radv: do not update the number of viewports in vkCmdSetViewport()")
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Fix VK_KHR_image_format_list. | Bas Nieuwenhuizen | 2017-09-25 | 1 | -1/+3
  Spec adding corner cases ...
  Fixes: 969537d9358 "radv: Add support for more DCC compression with VK_KHR_image_format_list."
  Reviewed-by: Dave Airlie <[email protected]>
* Revert "Revert "radv: fallback to an in-memory cache when no pipline cache is provided"" | Bas Nieuwenhuizen | 2017-09-25 | 3 | -8/+15
  I tested this 10 times with ./deqp-vk --deqp-case=dEQP-VK.texture.filtering.3d.formats.r4g4b4a4* and one full run of CTS, seems the issue is gone.
  Also reduces CTS runtime by 30% or so.
  Reviewed-by: Timothy Arceri <[email protected]>
* radv: init the trace BO before compiling meta shaders | Samuel Pitoiset | 2017-09-25 | 1 | -5/+5
  Otherwise, the disasm string is NULL for meta shaders.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: make radv_pipeline_init() static | Samuel Pitoiset | 2017-09-25 | 2 | -8/+1
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: remove unused variable in radv_dump_annotated_shader() | Samuel Pitoiset | 2017-09-25 | 1 | -1/+0
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: make use of ATI_VENDOR_ID everywhere | Samuel Pitoiset | 2017-09-25 | 4 | -5/+7
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add gfx9 scissor workaround | David Airlie | 2017-09-24 | 1 | -0/+5
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Cc: 17.2 <[email protected]>
* radv: Implement VK_AMD_rasterization_order | Nicholas Miell | 2017-09-21 | 2 | -1/+26
  Tested with AMD's Anvil OutOfOrderRasterization demo on a RX 560.
  Signed-off-by: Nicholas Miell <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/surface: handle error when choosing preferred swizzle mode | Nicolai Hähnle | 2017-09-21 | 1 | -2/+4
  CID: 1418140
  Fixes: c4ac522511d2 ("ac/surface: handle S8 on gfx9")
  Reviewed-by: Marek Olšák <[email protected]>
  Reviewed-by: Dave Airlie <[email protected]>
* amd/addrlib: fix missing va_end() after va_copy() | Nicolai Hähnle | 2017-09-21 | 1 | -6/+2
  There's no reason to use va_copy here.
  CID: 1418113
  Reviewed-by: Eric Engestrom <[email protected]>
  Fixes: e7fc664b91a5d886c270 ("winsys/amdgpu: add addrlib - texture addressing and alignment calculator")
  Reviewed-by: Marek Olšák <[email protected]>
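The C rule behind this fix is that every `va_start` or `va_copy` needs a matching `va_end`, and a `va_list` that is consumed only once does not need to be copied at all. Below is a minimal sketch of that pattern. The function names are hypothetical and this is not the addrlib code, whose details are not shown in the log.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: formats into `buf` from a caller-owned va_list.
 * `ap` is consumed exactly once, so no va_copy is needed here, and
 * therefore there is no extra va_end to forget. */
static int sketch_vformat(char *buf, size_t size, const char *fmt, va_list ap)
{
    return vsnprintf(buf, size, fmt, ap);
}

/* Variadic front end: owns the va_list and is responsible for va_end. */
static int sketch_format(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    int written;

    va_start(ap, fmt);
    written = sketch_vformat(buf, size, fmt, ap);
    va_end(ap); /* pairs with the va_start above */

    return written;
}
```

A copy is only required when the same argument list must be walked twice, for example to measure the output length first and format afterwards. In that case the copy needs its own `va_end`.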
* radv: copy the number of viewports/scissors at pipeline bind time | Samuel Pitoiset | 2017-09-21 | 1 | -2/+6
  The number of viewports/scissors can only be specified at pipeline creation time, so make sure to copy them when binding a new one because the dynamic state is cleared in BeginCommandBuffer().
  Fixes: dcf46e995d ("radv: do not update the number of scissors in vkCmdSetScissor()")
  Fixes: 60878dd00c ("radv: do not update the number of viewports in vkCmdSetViewport()")
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Dave Airlie <[email protected]>
* spirv: Flip the tessellation winding order | Jason Ekstrand | 2017-09-20 | 1 | -0/+1
  It's not SPIR-V that's backwards from GLSL, it's Vulkan that's backwards from GL. Let's make NIR consistent with the source language and do the flipping inside the Vulkan driver instead.
  Reviewed-by: Kenneth Graunke <[email protected]>
* radv: Don't use a virtual function for getting the buffer virtual address. | Bas Nieuwenhuizen | 2017-09-20 | 13 | -89/+87
  We are really not going to use a winsys which does not need to store the va, so might as well store it in a standard field.
  Not sure this helps perf much though, as most of the cost is in the cache miss accessing the bo anyway, which we still need to do.
  Reviewed-by: Dave Airlie <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
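The design change can be pictured with a small sketch: instead of asking the winsys for the address through a function pointer, the address sits in a common base structure that every winsys-specific buffer embeds. All names below are hypothetical and the layout is an assumption made for illustration, not the actual radv winsys definitions.

```c
#include <stdint.h>

/* Hypothetical common base: the GPU virtual address is a plain field. */
struct sketch_bo {
    uint64_t va;
};

/* A winsys-specific buffer embeds the base as its first member and
 * fills in `va` once, when the buffer is created. */
struct sketch_amdgpu_bo {
    struct sketch_bo base;
    uint32_t size;
};

/* Reading the address is a single load, with no indirect call. */
static inline uint64_t sketch_bo_get_va(const struct sketch_bo *bo)
{
    return bo->va;
}
```

As the message notes, the load still touches the buffer object, so the cache miss remains; the sketch only removes the indirect call.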
* radv: Only enter the immutable samplers init loop when we have some. | Bas Nieuwenhuizen | 2017-09-20 | 2 | -12/+18
  Reviewed-by: Dave Airlie <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>