aboutsummaryrefslogtreecommitdiffstats
path: root/src/amd/common
Commit message (Collapse)AuthorAgeFilesLines
* ac/llvm: drop pointless wrappers around umsb/imsbDave Airlie2017-10-261-14/+2
| | | | | Reviewed-by: Timothy Arceri <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* ac/llvm: consolidate find lsb function.Dave Airlie2017-10-263-29/+36
| | | | | | | This was the same between si and ac. Reviewed-by: Timothy Arceri <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* ac/llvm: drop v4f32empty. (v2)Dave Airlie2017-10-261-12/+0
| | | | | | | | | This was unused. v2: drop args. Reviewed-by: Timothy Arceri <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* ac/llvm: add i1false/i1true to common code.Dave Airlie2017-10-263-41/+33
| | | | | | | These get used in fair few places. Reviewed-by: Timothy Arceri <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* ac/llvm: use the ac i32 0/1 and f32 0/1 llvm types.Dave Airlie2017-10-261-60/+52
| | | | | | | This just avoids having two copies of these. Reviewed-by: Timothy Arceri <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* ac/nir: move lds declaration/load/store into shared code.Dave Airlie2017-10-263-41/+50
| | | | | | | This was duplicated between both drivers, share here. Reviewed-by: Timothy Arceri <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* ac/nir: generate correct instruction for atomic min/max on unsigned imagesMatthew Nicholls2017-10-251-2/+4
| | | | | | | | v2: fix silly typo Cc: "17.2 17.3" <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: add support for explicit component packingTimothy Arceri2017-10-251-16/+52
| | | | | | | | | | | | | | This is needed for RADV to support explicit component packing. This is also required to use the new NIR component splitting / packing passes. V2: - add commponent packing support for interpolate_at* intrinsics - improve store packing support when not all varyings are scalar as spotted by Bas the store source was incorrectly offset. Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: postponed KILL isn't postponed anymore, but maintains WQMMarek Olšák2017-10-242-0/+8
| | | | | | | | | | | | | This restores performance for the drirc workaround, i.e. KILL_IF does: visible = src0 >= 0; kill_flag &= visible; // accumulate kills amdgcn_kill(wqm_vote(visible)); // kill fully dead quads only And all helper pixels are killed at the end of the shader: amdgcn_kill(kill_flag); Reviewed-by: Nicolai Hähnle <[email protected]>
* ac: use llvm.amdgcn.kill with LLVM 6.0Marek Olšák2017-10-241-0/+6
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* ac: replace ac_build_kill with ac_build_kill_if_falseMarek Olšák2017-10-243-26/+11
| | | | | | | This will be a new LLVM intrinsic and will also work nicely with llvm.amdgcn.wqm.vote. Reviewed-by: Nicolai Hähnle <[email protected]>
* ac: Silence a compiler warning about results[0].Eric Anholt2017-10-231-0/+1
| | | | | | We know that num_components will be > 0, but it doesn't. Reviewed-by: Nicolai Hähnle <[email protected]>
* ac: Fix a compiler warning for possibly undefined "name"Eric Anholt2017-10-231-1/+1
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* amd/common/gfx9: workaround DCC corruption more conservativelyNicolai Hähnle2017-10-231-7/+25
| | | | | | | | Fixes KHR-GL45.texture_swizzle.smoke and others on Vega. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102809 Cc: [email protected] Reviewed-by: Marek Olšák <[email protected]>
* ac/nir: Only clamp shadow reference on radeonsi.Bas Nieuwenhuizen2017-10-233-2/+8
| | | | | | | | | | | | | | | | | | Vulkan CTS does not expect the value to be clamped (at least for D32), and it makes a differences even though depth is in [0,1], due to strict inequalities. I couldn't find anything in the Vulkan spec about this, but the test seemed to be copied from GL tests and the GL spec only specifies clamping for fixed point formats. Hence I expect radeonsi to run into this at some point as well, but given that they still have a usecase with the Z16->Z32 promotion, I'll leave that for someone else to clean up. This at least fixes radv dEQP-VK.texture.shadow.* on VI. Fixes: 0f9e32519bb 'ac/nir: clamp shadow texture comparison value on VI' Reviewed-by: Dave Airlie <[email protected]>
* ac/nir: Fix nir_texop_lod on GFX for 1D arrays.Bas Nieuwenhuizen2017-10-231-1/+3
| | | | | Fixes: 1bcb953e166 'radv: handle GFX9 1D textures' Reviewed-by: Dave Airlie <[email protected]>
* radv/ac/nir: only emit tess factors to storage if tes reads themDave Airlie2017-10-232-2/+3
| | | | | | | | | | Otherwise we just need to write them to the tf ring. this seems to improve the tessellation demo on Bonarie ~2190->~2230 fps Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* ac/nir: Account for compact array index in GS input load from LDS.Bas Nieuwenhuizen2017-10-211-1/+1
| | | | | | | Mirrors the vram path. Fixes: d4ecc3c9299 'ac/nir: Add loading from LDS for merged GS.' Reviewed-by: Dave Airlie <[email protected]>
* ac/nir: Set larged wrokgroup size for GS on GFX9.Bas Nieuwenhuizen2017-10-211-1/+1
| | | | | | | They don't take a single wave anymore and we need the barriers. Fixes: 6bc42855f92 'radv: enable GS on GFX9' Reviewed-by: Dave Airlie <[email protected]>
* ac/nir: Take the max workgroup size of all provided shaders.Bas Nieuwenhuizen2017-10-211-1/+6
| | | | | Fixes: ffaf4d608a1 'radv: Enable tessellation shaders for GFX9.' Reviewed-by: Dave Airlie <[email protected]>
* radv: Expose VK_EXT_global_priorityAndres Rodriguez2017-10-212-0/+2
| | | | | | | Expose the extension string as supported Signed-off-by: Andres Rodriguez <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir: Get rid of nir_shader::stageJason Ekstrand2017-10-202-21/+21
| | | | | | | | It's redundant with nir_shader::info::stage. Acked-by: Timothy Arceri <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* ac/nir: Fix up GS input vgprs.Bas Nieuwenhuizen2017-10-201-0/+15
| | | | Reviewed-by: Dave Airlie <[email protected]>
* ac/nir: Add loading from LDS for merged GS.Bas Nieuwenhuizen2017-10-201-15/+21
| | | | Reviewed-by: Dave Airlie <[email protected]>
* ac/nir: Add ES output to LDS for GFX9.Bas Nieuwenhuizen2017-10-201-8/+49
| | | | Reviewed-by: Dave Airlie <[email protected]>
* ac/nir: Add merged GS function.Bas Nieuwenhuizen2017-10-201-17/+63
| | | | | | [airlied: merged fixup + and fixed up a couple more bits]. Reviewed-by: Dave Airlie <[email protected]>
* ac/nir: init full exec mask for merged shaders.Dave Airlie2017-10-203-0/+12
| | | | | Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* ac: move some code out of loop in store_tcs_output()Timothy Arceri2017-10-201-5/+5
| | | | Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Add code to compile merged shaders.Bas Nieuwenhuizen2017-10-191-0/+1
| | | | Reviewed-by: Dave Airlie <[email protected]>
* ac/nir: Add LS-HS input VGPR workaround.Bas Nieuwenhuizen2017-10-191-0/+18
| | | | Reviewed-by: Dave Airlie <[email protected]>
* ac/nir: Compile the bodies of multiple shaders.Bas Nieuwenhuizen2017-10-191-50/+83
| | | | Reviewed-by: Dave Airlie <[email protected]>
* ac/nir: Expand user SGPR descriptions a bit.Bas Nieuwenhuizen2017-10-191-3/+3
| | | | | | To prevent VS/TCS collisions in merged shaders. Reviewed-by: Dave Airlie <[email protected]>
* ac/nir: Don't write to the dynamic HS word on GFX9.Bas Nieuwenhuizen2017-10-191-11/+16
| | | | Reviewed-by: Dave Airlie <[email protected]>
* ac/nir: Add function creation for merged LS+HS.Bas Nieuwenhuizen2017-10-191-76/+178
| | | | Reviewed-by: Dave Airlie <[email protected]>
* ac/nir: Make scan_shader_output_decl less dependent on the context.Bas Nieuwenhuizen2017-10-191-14/+17
| | | | Reviewed-by: Dave Airlie <[email protected]>
* ac/nir: Allow ac_shader_variant_info to contain info about multiple stages.Bas Nieuwenhuizen2017-10-191-1/+1
| | | | Reviewed-by: Dave Airlie <[email protected]>
* ac/nir: Change interface to allow multiple source shaders.Bas Nieuwenhuizen2017-10-192-38/+47
| | | | Reviewed-by: Dave Airlie <[email protected]>
* ac/nir: Add HS calling convention.Bas Nieuwenhuizen2017-10-191-1/+4
| | | | | | Needed for GFX9 merged shaders. Reviewed-by: Dave Airlie <[email protected]>
* ac: Parse the new HS RSRC1 register.Bas Nieuwenhuizen2017-10-191-0/+1
| | | | Reviewed-by: Dave Airlie <[email protected]>
* ac: clean up ac_build_indexed_load function interfacesMarek Olšák2017-10-173-36/+42
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* ac: add radeon_info::has_sync_filecros-mesa-17.2.3-vanillachadv/cros-mesa-17.2.3-vanillaMarek Olšák2017-10-122-0/+2
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* ac/surface: add ac_surface::is_displayableMarek Olšák2017-10-122-0/+13
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* amd: move r600d_common.h into r600gMarek Olšák2017-10-091-135/+0
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: shrink r600d_common.h and stop using itMarek Olšák2017-10-092-165/+17
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* ac: properly document a buffer.store LLVM workaroundMarek Olšák2017-10-062-6/+9
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* ac: silence a warningMarek Olšák2017-10-041-1/+3
|
* radv: emit fmuladd instead of fma to llvm.Dave Airlie2017-10-041-1/+1
| | | | | | | | | | | | | | | | | | | | For Vulkan SPIR-V the spec states fma() Inherited from OpFMul followed by OpFAdd. Matt says the backend will do the right thing depending on the hardware being compiled for, if you use the fmuladd intrinsic. Using the Mad Max pts test, on high settings at 4K: CHP: 55->60 HGDD: 46->50 LM: 55->60 No change on Stronghold. Thanks to Feral for spending the time to track this down. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Cc: "17.2" <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* amd/common: move ac_build_phi from radeonsiNicolai Hähnle2017-10-022-0/+19
| | | | Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: clamp depth comparison value only for fixed point formatsNicolai Hähnle2017-09-291-0/+2
| | | | | | | | | | | | | | | | | | | The hardware usually does this automatically. However, we upgrade depth to Z32_FLOAT to enable TC-compatible HTILE, which means the hardware no longer clamps the comparison value for us. The only way to tell in the shader whether a clamp is required seems to be to communicate an additional bit in the descriptor table. While VI has some unused bits in the resource descriptor, those bits have unfortunately all been used in gfx9. So we use an unused bit in the sampler state instead. Fixes dEQP-GLES3.functional.texture.shadow.2d.linear.equal_depth_component32f and many other tests in dEQP-GLES3.functional.texture.shadow.* Fixes: d4d9ec55c589 ("radeonsi: implement TC-compatible HTILE") Reviewed-by: Marek Olšák <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* amd/common: save an instruction in the build_cube_select sequenceNicolai Hähnle2017-09-291-5/+6
| | | | | | | Avoid a v_cndmask: the absolute value is free due to input modifiers. Reviewed-by: Marek Olšák <[email protected]> Tested-by: Dieter Nützel <[email protected]>