path: root/src/amd/common
Commit message | Author | Age | Files | Lines
* radv: export SampleMask from pixel shaders at full rate | Samuel Pitoiset | 2017-12-14 | 1 | -11/+35
    Use 16_ABGR instead of 32_ABGR if Z isn't written. Ported from RadeonSI.
    No CTS regressions on Polaris.
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
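The rule in the message above (16_ABGR rather than 32_ABGR when Z isn't written) can be sketched as a small selection function. This is an illustration only: the enum names, the function name `pick_z_export_format` and its three-flag signature are hypothetical, and the branches taken when Z is written are an assumption about the remaining cases rather than something the commit message states.

```c
#include <stdbool.h>

/* Hypothetical export formats for this sketch; not the real register values. */
enum z_export_format {
    Z_EXPORT_ZERO,          /* nothing is exported */
    Z_EXPORT_32_R,
    Z_EXPORT_32_GR,
    Z_EXPORT_32_ABGR,
    Z_EXPORT_UINT16_ABGR,
};

static enum z_export_format
pick_z_export_format(bool writes_z, bool writes_stencil, bool writes_samplemask)
{
    if (writes_z) {
        /* Assumption: depth needs a full 32-bit channel. */
        if (writes_samplemask)
            return Z_EXPORT_32_ABGR;
        if (writes_stencil)
            return Z_EXPORT_32_GR;
        return Z_EXPORT_32_R;
    }
    if (writes_stencil || writes_samplemask) {
        /* From the commit message: 16_ABGR is enough when Z isn't written. */
        return Z_EXPORT_UINT16_ABGR;
    }
    return Z_EXPORT_ZERO;
}
```

With these assumptions, a shader that writes only the sample mask selects the 16-bit format, while one that also writes depth stays on the 32-bit one.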
* amd/common: add ac_get_spi_shader_z_format() | Samuel Pitoiset | 2017-12-14 | 3 | -0/+80
    ac_shader_util.c will contain shader helpers for RadeonSI and RADV.
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: do not load the local invocation index when it's unused | Samuel Pitoiset | 2017-12-14 | 3 | -1/+6
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* amd/common: scan which components of gl_LocalInvocationID are used | Samuel Pitoiset | 2017-12-14 | 2 | -1/+7
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* amd/common: scan which components of gl_WorkGroupID are used | Samuel Pitoiset | 2017-12-14 | 2 | -0/+9
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: replace grid_components_used by uses_grid_size | Samuel Pitoiset | 2017-12-14 | 3 | -5/+6
    Use a boolean instead because the number of needed SGPRs is always 3.
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: always emit all compute block components | Samuel Pitoiset | 2017-12-14 | 1 | -3/+6
    The number of grid components is always 3 when gl_NumWorkGroups is
    declared, because it relies on the number of components of
    nir_intrinsic_load_num_work_groups.
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: fix nir_op_f2f64 | Timothy Arceri | 2017-12-13 | 1 | -0/+1
    Without this we get the error "FPExt only operates on FP" when
    converting the following:

        vec1 32 ssa_5 = b2f ssa_4
        vec1 64 ssa_6 = f2f64 ssa_5

    Which results in:

        %44 = and i32 %43, 1065353216
        %45 = fpext i32 %44 to double

    With this patch we now get:

        %44 = and i32 %43, 1065353216
        %45 = bitcast i32 %44 to float
        %46 = fpext float %45 to double

    Reviewed-by: Nicolai Hähnle <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
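The fix above comes down to reinterpreting the integer bits as a float before widening, instead of widening the integer directly. A rough C analogue of the three IR instructions is sketched below; the helper names are hypothetical, and it assumes the boolean arrives as an all-ones or all-zeros 32-bit mask.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Mirrors "and i32 %43, 1065353216": 0x3f800000 is the bit pattern of 1.0f.
 * Assumption: bool_mask is either 0 or 0xffffffff. */
static uint32_t b2f_bits(uint32_t bool_mask)
{
    return bool_mask & 0x3f800000u;
}

/* Counterpart of "bitcast i32 ... to float": reinterpret the bits, no conversion. */
static float bitcast_u32_to_f32(uint32_t bits)
{
    float f;
    assert(sizeof(f) == sizeof(bits));
    memcpy(&f, &bits, sizeof(f));
    return f;
}

/* Counterpart of "fpext float ... to double", applied after the bitcast. */
static double f2f64(uint32_t bits)
{
    return (double)bitcast_u32_to_f32(bits);
}
```

Converting the integer value itself would yield 1065353216.0 rather than 1.0, which is why the bitcast has to come first.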
* ac/nir: Support vulkan_resource_reindex. | Bas Nieuwenhuizen | 2017-12-12 | 1 | -0/+14
    Fixes: 93b4cb61eb2 "spirv: Allow OpPtrAccessChain for block indices"
    Reviewed-by: Dave Airlie <[email protected]>
* ac/nir: Don't load the descriptor in vulkan_resource_index. | Bas Nieuwenhuizen | 2017-12-12 | 1 | -5/+13
    To support the reindex intrinsic, we need the result to be something
    on which we can adjust the index/address. Since it is all within a
    basic block, the compiler should be able to merge any extra loads.
    v2: Change visit_get_buffer_size too.
    Reviewed-by: Dave Airlie <[email protected]>
* radv: use a faster version for nir_op_pack_half_2x16 | Samuel Pitoiset | 2017-12-07 | 1 | -11/+1
    This patch is ported from RadeonSI and it has two effects.
    It fixes a rendering issue which affects F1 2017 and Dawn of War 3
    (Vega only) because LLVM ended up generating the new
    v_mad_mix_{hi,lo} instructions, which appear to be buggy in some way.
    Not sure if Mesa is generating something wrong or if the issue is in
    LLVM only. Anyway, that explains why the DOW3 issue can't be
    reproduced with GL on Vega.
    It also improves performance because v_cvt_pkrtz_f16 is faster, and
    because I guess the rounding mode behaviour is similar between GL and
    VK, we can use it. About performance, it improves Talos by +3/4% but
    I don't see any other impacts.
    No CTS regressions on Polaris.
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Dave Airlie <[email protected]>
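For readers unfamiliar with what a round-toward-zero pack does, here is a software sketch of packing two floats into two halves with truncation. It illustrates the rounding mode only: the function names are hypothetical, and the handling of overflow, NaN and denormals is an assumption of this sketch, not a statement about what v_cvt_pkrtz_f16 does in hardware.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Convert one float to a 16-bit half, discarding the extra mantissa bits. */
static uint16_t f32_to_f16_rtz(float f)
{
    uint32_t x;
    assert(sizeof(x) == sizeof(f));
    memcpy(&x, &f, sizeof(x));

    uint16_t sign = (uint16_t)((x >> 16) & 0x8000u);
    uint32_t exp = (x >> 23) & 0xffu;
    uint32_t man = x & 0x7fffffu;

    if (exp == 0xffu)                   /* Inf or NaN */
        return (uint16_t)(sign | 0x7c00u | (man ? 0x200u : 0u));

    int32_t e = (int32_t)exp - 127 + 15;    /* rebias the exponent for half */
    if (e >= 31)                        /* too large: truncate to max finite */
        return (uint16_t)(sign | 0x7bffu);
    if (e <= 0) {                       /* half denormal or zero */
        if (e < -10)
            return sign;
        man |= 0x800000u;               /* make the implicit leading 1 explicit */
        return (uint16_t)(sign | (man >> (uint32_t)(14 - e)));
    }
    return (uint16_t)(sign | ((uint32_t)e << 10) | (man >> 13));
}

/* First component goes in the low 16 bits, as in GLSL packHalf2x16. */
static uint32_t pack_half_2x16_rtz(float x, float y)
{
    return (uint32_t)f32_to_f16_rtz(x) | ((uint32_t)f32_to_f16_rtz(y) << 16);
}
```

The difference from round-to-nearest shows up on values such as 1.000732421875f, which truncates to the half 1.0 (0x3c00) here but would round up to 0x3c01 under round-to-nearest-even.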
* ac: add si_nir_load_input_gs() to the abi | Timothy Arceri | 2017-12-04 | 2 | -14/+34
    V2: make use of driver_location and don't expose NIR to the ABI.
    Reviewed-by: Nicolai Hähnle <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* ac: move build_varying_gather_values() to ac_llvm_build.h and expose | Timothy Arceri | 2017-12-04 | 3 | -28/+32
    Reviewed-by: Nicolai Hähnle <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* ac: add basic nir -> llvm type helper | Timothy Arceri | 2017-12-04 | 1 | -0/+22
    Reviewed-by: Nicolai Hähnle <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* ac/surface: always compute DCC info when DCC is possible on GFX9 | Marek Olšák | 2017-11-30 | 1 | -1/+0
    The same code for VI doesn't check for scanout either.
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: dismantle si_common_screen_init/destroy | Marek Olšák | 2017-11-29 | 2 | -0/+56
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: move/remove ac_shader_binary helpers | Marek Olšák | 2017-11-29 | 2 | -0/+14
    Reviewed-by: Nicolai Hähnle <[email protected]>
* ac/surface: enable DCC computation for MSAA | Marek Olšák | 2017-11-29 | 1 | -4/+2
    Tested-by: Dieter Nützel <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
* meson: build r600 driver | Dylan Baker | 2017-11-28 | 1 | -2/+0
    v4: - Ensure inc_amd_common defined when radeonsi is disabled (needed by r600)
    Signed-off-by: Dylan Baker <[email protected]>
    Tested-by: Aaron Watry <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* ac/surface: fix indentation | Nicolai Hähnle | 2017-11-28 | 1 | -1/+1
    Reviewed-by: Marek Olšák <[email protected]>
* amd/common: sid.h cleanups | Nicolai Hähnle | 2017-11-28 | 1 | -13/+29
    Fix a bunch of labels indicating when registers were added/removed
    and normalize the SI-class GRBM_GFX_INDEX.
    Reviewed-by: Marek Olšák <[email protected]>
* ac: pack legacy_surf_level better | Marek Olšák | 2017-11-27 | 1 | -3/+3
    r600_texture: 1488 -> 1248 bytes
    Reviewed-by: Nicolai Hähnle <[email protected]>
* ac: change legacy_surf_level::slice_size to dword units | Marek Olšák | 2017-11-27 | 2 | -2/+2
    The next commit will reduce the size even more.
    v2: typecast to uint64_t manually
    v3: add more typecasts, add asserts
    Reviewed-by: Nicolai Hähnle <[email protected]>
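The v2/v3 notes above (manual uint64_t typecasts, added asserts) reflect a general hazard of storing a size in dword units: converting back to bytes has to widen before multiplying. A minimal sketch follows, assuming a 32-bit field; the struct `level_info` and both helpers are hypothetical stand-ins, not the real legacy_surf_level code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a per-level record. */
struct level_info {
    uint32_t slice_size_dw;     /* slice size in dwords (4-byte units) */
};

static void set_slice_size(struct level_info *lvl, uint64_t bytes)
{
    assert(bytes % 4 == 0);             /* must be dword aligned */
    assert(bytes / 4 <= UINT32_MAX);    /* must fit the narrower field */
    lvl->slice_size_dw = (uint32_t)(bytes / 4);
}

static uint64_t slice_size_bytes(const struct level_info *lvl)
{
    /* Widen first; a 32-bit multiply would wrap for slices of 4 GiB or more. */
    return (uint64_t)lvl->slice_size_dw * 4;
}
```

Under these assumptions a 4 GiB slice is stored as 0x40000000 dwords and reads back correctly, whereas multiplying in 32 bits would give 0.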
* ac: pack ac_surface better | Marek Olšák | 2017-11-27 | 1 | -4/+5
    r600_texture: 1736 -> 1488 bytes
    Reviewed-by: Nicolai Hähnle <[email protected]>
* ac/nir: don't write tcs outputs to LDS that aren't read back. | Dave Airlie | 2017-11-27 | 1 | -1/+16
    If the TCS doesn't read back the outputs, there is no need to store
    them to LDS in the first place (except for tess factors).
    This seems to give about 50fps (3290->3330) with tessellation demo.
    I haven't tested if it impacts DoW3 at all.
    Reviewed-by: Marek Olšák <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
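The condition described above can be sketched as a bitmask test. The function name, the 64-bit `outputs_read_mask` and the slot numbering are all hypothetical choices for this illustration; the actual patch may track read-back outputs differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Decide whether a TCS output slot still needs its LDS store. */
static bool tcs_output_needs_lds_store(uint64_t outputs_read_mask,
                                       unsigned slot, bool is_tess_factor)
{
    assert(slot < 64);
    if (is_tess_factor)
        return true;    /* tess factors are always stored */
    /* Store only if the shader reads this output back. */
    return ((outputs_read_mask >> slot) & 1u) != 0;
}
```

A caller would skip emitting the store whenever this returns false.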
* radeon/common: add vcn enc ip info query | Boyuan Zhang | 2017-11-17 | 1 | -1/+9
    New ip info query is needed for vcn encode
    Signed-off-by: Boyuan Zhang <[email protected]>
    Acked-by: Christian König <[email protected]>
* ac: add gs_{prim,invocation}_id to the abi | Timothy Arceri | 2017-11-16 | 2 | -8/+10
    Reviewed-by: Nicolai Hähnle <[email protected]>
* meson: Remove build_by_default from amd code | Dylan Baker | 2017-11-13 | 1 | -1/+0
    This is the same logic as the previous two patches.
    Signed-off-by: Dylan Baker <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* ac: add emit_vertex to the abi | Timothy Arceri | 2017-11-12 | 2 | -5/+10
    Reviewed-by: Marek Olšák <[email protected]>
* ac/nir: add support for all intrinsics. (v2) | Dave Airlie | 2017-11-09 | 1 | -1/+31
    This is derived from tgsi/radeonsi code from the GLSL intrinsics.
    This should pre-fix radv for the upcoming spirv patches.
    v2: actually use wait_cnt, sleep deprived dad time! (Bas)
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* amd/addrlib: update to latest version | Marek Olšák | 2017-11-08 | 2 | -222/+29
    This uses C++11 initializer lists.
    I just overwrote all Mesa files with internal addrlib and discarded
    hunks that we should probably keep, but I might have missed something.
    The code depending on ADDR_AM_BUILD is removed. We can add it back
    next time if needed.
    Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: use ac_create_target_machine | Marek Olšák | 2017-11-07 | 2 | -2/+8
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: use ac_get_llvm_processor_name | Marek Olšák | 2017-11-07 | 2 | -1/+3
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: remove unused field in the PCI ID table | Marek Olšák | 2017-11-07 | 1 | -1/+1
    Reviewed-by: Alex Deucher <[email protected]>
* ac/nir: for ubo load use correct num_components | Dave Airlie | 2017-11-07 | 1 | -1/+1
    I was hacking something stupid in doom, and hit an assert for the
    bitcast following this, it definitely looks like this should be the
    number of 32-bit components, not the instr level ones.
    Reviewed-by: Nicolai Hähnle <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* ac: remove the remaining duplicate llvm types | Timothy Arceri | 2017-11-03 | 1 | -12/+1
    Reviewed-by: Marek Olšák <[email protected]>
    Acked-by: Nicolai Hähnle <[email protected]>
* ac: remove unused v4f32 | Timothy Arceri | 2017-11-03 | 1 | -4/+0
    Reviewed-by: Marek Olšák <[email protected]>
    Acked-by: Nicolai Hähnle <[email protected]>
* ac: add v2f32 to the common code and make use of it | Timothy Arceri | 2017-11-03 | 3 | -10/+7
    Reviewed-by: Marek Olšák <[email protected]>
    Acked-by: Nicolai Hähnle <[email protected]>
* ac: use the ac f16 llvm type | Timothy Arceri | 2017-11-03 | 1 | -3/+1
    Reviewed-by: Marek Olšák <[email protected]>
    Acked-by: Nicolai Hähnle <[email protected]>
* ac: use the ac f32 llvm type | Timothy Arceri | 2017-11-03 | 1 | -35/+33
    Reviewed-by: Marek Olšák <[email protected]>
    Acked-by: Nicolai Hähnle <[email protected]>
* ac: use the ac f64 llvm type | Timothy Arceri | 2017-11-03 | 1 | -3/+1
    Reviewed-by: Marek Olšák <[email protected]>
    Acked-by: Nicolai Hähnle <[email protected]>
* ac: use the common v8i32 llvm type | Timothy Arceri | 2017-11-03 | 1 | -4/+2
    Reviewed-by: Marek Olšák <[email protected]>
    Acked-by: Nicolai Hähnle <[email protected]>
* ac: use the common v4i32 llvm type | Timothy Arceri | 2017-11-03 | 1 | -9/+7
    Reviewed-by: Marek Olšák <[email protected]>
    Acked-by: Nicolai Hähnle <[email protected]>
* ac: add v3i32 to the common code and make use of it | Timothy Arceri | 2017-11-03 | 3 | -5/+5
    Reviewed-by: Marek Olšák <[email protected]>
    Acked-by: Nicolai Hähnle <[email protected]>
* ac: add v2i32 to the common code and use it | Timothy Arceri | 2017-11-03 | 3 | -11/+11
    Reviewed-by: Marek Olšák <[email protected]>
    Acked-by: Nicolai Hähnle <[email protected]>
* ac: use the ac i64 llvm type | Timothy Arceri | 2017-11-03 | 1 | -3/+1
    Reviewed-by: Dave Airlie <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
    Acked-by: Nicolai Hähnle <[email protected]>
* ac: remove unused i16 llvm type | Timothy Arceri | 2017-11-03 | 1 | -2/+0
    Reviewed-by: Dave Airlie <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
    Acked-by: Nicolai Hähnle <[email protected]>
* ac: use the ac ivoidt llvm type | Timothy Arceri | 2017-11-03 | 1 | -4/+2
    Reviewed-by: Dave Airlie <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
    Acked-by: Nicolai Hähnle <[email protected]>
* ac: use the ac i8 llvm type | Timothy Arceri | 2017-11-03 | 1 | -6/+4
    Reviewed-by: Dave Airlie <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
    Acked-by: Nicolai Hähnle <[email protected]>
* ac: use the ac i1 llvm type | Timothy Arceri | 2017-11-03 | 1 | -3/+1
    Reviewed-by: Dave Airlie <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
    Acked-by: Nicolai Hähnle <[email protected]>