summaryrefslogtreecommitdiffstats
path: root/src/amd/common
Commit message (Collapse)AuthorAgeFilesLines
* ac/radv/radeonsi: refactor harvest config register getters.Dave Airlie2018-04-242-0/+117
| | | | | | | | This refactors the code out to share it between radv and radeonsi. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Nicolai Hähnle <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* ac/radv/radeonsi: refactor max simd waves into common code.Dave Airlie2018-04-241-0/+16
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/radv/radeonsi: refactor raster_config default values getters.Dave Airlie2018-04-242-1/+96
| | | | | | | This just makes this common code between the two drivers. Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/info: move gs table depth to common code.Dave Airlie2018-04-242-0/+34
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: fix the number of coordinates for ac_image_get_lod and arraysSamuel Pitoiset2018-04-231-0/+14
| | | | | | | | | | | | | This fixes crashes for the following CTS: dEQP-VK.glsl.texture_functions.query.texturequerylod.* Cubemaps are the same as 2D arrays. Fixes: 625dcbbc456 ("amd/common: pass address components individually to ac_build_image_intrinsic") Cc: 18.1 <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* ac: teach get_ac_sampler_dim() about subpass attachmentsSamuel Pitoiset2018-04-231-17/+7
| | | | | | | | Suggested by Nicolai. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* ac/nir: add missing round_slice for 1D arraysSamuel Pitoiset2018-04-231-0/+7
| | | | | | | | | | | | | This fixes a bunch of CTS fails with 1D arrays: dEQP-VK.glsl.texture_functions.texture*.sampler1darray_* Fixes: 625dcbbc456 ("amd/common: pass address components individually to ac_build_image_intrinsic") Cc: 18.1 <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* ac/nir: fix image dimension for subpass attachmentsSamuel Pitoiset2018-04-201-3/+15
| | | | | | | | | | | For subpass attachments we need one more coordinate with the layer, so make them array types. This fixes a bunch of CTS fails with RADV. Fixes: 24fb3e6aa1 ("ac/nir: use ac_build_image_opcode for image intrinsics") Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: handle nir_intrinsic_load_first_vertex like base_vertexSamuel Pitoiset2018-04-201-2/+2
| | | | | | | | This fixes a ton of CTS crashes. Fixes: c366f422f0 ("nir: Offset vertex_id by first_vertex instead of base_vertex") Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: use ac_build_image_opcode for image intrinsicsNicolai Hähnle2018-04-203-140/+78
| | | | | | So that we'll use the dimension-aware intrinsics in the future. Acked-by: Marek Olšák <[email protected]>
* radeonsi: generate image load/store/atomic ops using ac_build_image_opcodeNicolai Hähnle2018-04-202-32/+110
| | | | | | In preparation of dimension-aware LLVM image intrinsics. Acked-by: Marek Olšák <[email protected]>
* amd/common: pass address components individually to ac_build_image_intrinsicNicolai Hähnle2018-04-203-264/+216
| | | | | | This is in preparation for the new image intrinsics. Acked-by: Marek Olšák <[email protected]>
* amd/common: pass new enum ac_image_dim to ac_build_image_opcodeNicolai Hähnle2018-04-203-11/+66
| | | | | | | This is in preparation for the new, dimension-aware LLVM image intrinsics. Acked-by: Marek Olšák <[email protected]>
* ac/nir: fix atomic compare-and-swapNicolai Hähnle2018-04-201-0/+1
| | | | | | | | | The LLVM instruction returns { i32, i1 }, where the i1 indicates success. We're only interested in the first part, which is the loaded value. Fixes dEQP-GLES31.functional.compute.shared_var.atomic.compswap.* Reviewed-by: Timothy Arceri <[email protected]>
* radeonsi: add support for VegaMMarek Olšák2018-04-183-0/+6
| | | | Acked-by: Nicolai Hähnle <[email protected]>
* ac/nir: Make the GFX9 buffer size fix apply to image loads/atomics too.Bas Nieuwenhuizen2018-04-161-17/+22
| | | | | | | | | No clue how I missed those ... Fixes: 4503ff760c "ac/nir: Add workaround for GFX9 buffer views." CC: <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105320 Reviewed-by: Nicolai Hähnle <[email protected]>
* ac: handle subgroup intrinsicsDaniel Schürmann2018-04-141-29/+40
| | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: add LLVM build functions for subgroup instrinsicsDaniel Schürmann2018-04-142-1/+485
| | | | | Co-authored-by: Connor Abbott <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: make ballot and umsb capable of 64bit inputsDaniel Schürmann2018-04-141-9/+25
| | | | | Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* winsys/amdgpu: allow local BOs on APUsMarek Olšák2018-04-131-1/+2
| | | | | | | Local BOs ignore BO priorities, and we don't need those on APUs. Reviewed-by: Samuel Pitoiset <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* ac/surface: Allow S swizzle for displayable surfaces.Bas Nieuwenhuizen2018-04-121-2/+5
| | | | | | | | | | | For dcn1 && < 64 bpp displayable surfaces, addrlib only accepts S swizzles. At the same time addrlib prefers D swizzles is allowed, so we can just allow S swizzles as fallback. Fixes: b64b712558 "ac/surface/gfx9: request desired micro tile mode explicitly" Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: pass -O halt_waves to umr for hang debuggingNicolai Hähnle2018-04-111-1/+1
| | | | | | | | | | | This will give us meaningful wave information in the case of a hang where shaders are still running in an infinite loop. Note that we call umr multiple times for different sections of the ddebug hang dump, and so the wave information will not necessarily match up between sections. Reviewed-by: Marek Olšák <[email protected]>
* ac/surface: don't set the display flag for obviously unsupported cases (v2)Marek Olšák2018-04-102-4/+31
| | | | | | | This enables the tile swizzle for some cases of the displayable micro mode, and it also fixes an addrlib assertion failure on Vega. Reviewed-by: Michel Dänzer <[email protected]>
* ac/surface/gfx9: request desired micro tile mode explicitlyMarek Olšák2018-04-101-4/+16
| | | | Tested-by: Dieter Nützel <[email protected]>
* ac/nir: Use an array instead of hashtable for SSA defs.Bas Nieuwenhuizen2018-04-101-9/+13
| | | | | | | | | Saves about 2% of compile time for F1 2017, as well as reduce code size of an optimized libvulkan_radeon.so by about 1 KiB. This still keeps the hashtable, as we also stored blocks in there. Reviewed-by: Samuel Pitoiset <[email protected]>
* radeonsi: move FMASK shader logic to shared codeMarek Olšák2018-04-022-0/+59
| | | | | | We'll need it for FBFETCH in both TGSI and NIR paths. Tested-by: Dieter Nützel <[email protected]>
* ac/gpu_info: print GB_ADDR_CONFIGMarek Olšák2018-04-022-0/+51
|
* ac/gpu_info: reorder the fields and print them nicelyMarek Olšák2018-04-022-55/+76
|
* ac/gpu_info: rename has_virtual_memory -> r600_has_virtual_memoryMarek Olšák2018-04-022-2/+2
|
* ac/gpu_info: don't print irrelevant fieldsMarek Olšák2018-04-021-5/+0
|
* util: Include bitscan.h directlyIan Romanick2018-03-291-0/+1
| | | | | | | | | | | | | | | Previously bitset.h would include u_math.h to get bitscan.h. u_math.h lives in src/gallium/auxiliary/util while both bitset.h and bitscan.h live in src/util. Having the one file directly include another file that lives in the same directory makes much more sense. As a side-effect, several files need to directly include standard header files that were previously indirectly included. v2: Fix build break in src/amd/common/ac_nir_to_llvm.c. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Eduardo Lima Mitev <[email protected]>
* util: Move util_is_power_of_two to bitscan.h and rename to ↵Ian Romanick2018-03-292-3/+3
| | | | | | | | | | | util_is_power_of_two_or_zero The new name make the zero-input behavior more obvious. The next patch adds a new function with different zero-input behavior. Signed-off-by: Ian Romanick <[email protected]> Suggested-by: Matt Turner <[email protected]> Reviewed-by: Alejandro Piñeiro <[email protected]>
* ac: add support for trinary_minmax instructionsDaniel Schürmann2018-03-291-0/+54
| | | | | | | v2: Add missing break (Bas) Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* ac/nir: Add workaround for GFX9 buffer views.Bas Nieuwenhuizen2018-03-294-7/+69
| | | | | | | | | | | | | | | | | | | | | On GFX9 whether the buffer size is interpreted as elements or bytes depends on whether IDXEN is enabled in the instruction. If the index is a constant zero, LLVM optimizes IDXEN to 0. Now the size in elements is interpreted in bytes which of course results in out of bounds accesses. The correct fix is most likely to disable the LLVM optimization, but we need something to work with LLVM <= 6.0. radeonsi does the max between stride and element count on the CPU but that results in the size intrinsics returning the wrong size for the buffer. This would cause CTS errors for radv. v2: Also include the store changes. Fixes: e38685cc62e 'Revert "radv: disable support for VEGA for now."' Reviewed-by: Samuel Pitoiset <[email protected]>
* ac/surface: set AddrSurfInfoIn.format = ADDR_FMT_8 for stencil, add assertionsMarek Olšák2018-03-281-0/+8
| | | | | | | Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105738 Tested-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: add support for Vega12Marek Olšák2018-03-283-7/+28
| | | | Reviewed-by: Alex Deucher <[email protected]>
* amd/addrlib: update to the latest version for Vega12Marek Olšák2018-03-281-1/+1
| | | | Reviewed-by: Alex Deucher <[email protected]>
* ac/radeonsi: pass bindless bool to load_sampler_desc()Timothy Arceri2018-03-282-3/+11
| | | | | | | | We also fix the base_index for bindless by using the driver location. Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* ac/nir_to_llvm: fix component packing for double outputsTimothy Arceri2018-03-281-1/+3
| | | | | | | | | | We need to wait until after the writemask is widened before we adjust it for component packing. Together with the previous patch this fixes a number of arb_enhanced_layouts component layout piglit tests. Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: don't reallocate on DMABUF export if local BOs are disabledMarek Olšák2018-03-262-0/+3
|
* nir: Rename image intrinsics to image_varJason Ekstrand2018-03-231-21/+21
| | | | | | | | | | | Generated with git grep -l nir_intrinsic_image | xargs \ sed -i 's/nir_intrinsic_image/nir_intrinsic_image_var/g' and some manual fixing in nir_intrinsics.h Reviewed-by: Timothy Arceri <[email protected]>
* ac/nir_to_llvm: add frexp supportTimothy Arceri2018-03-221-0/+11
| | | | | | | | | | | | | | Fixes CTS tests: KHR-GL40.gpu_shader_fp64.builtin.frexp_double KHR-GL40.gpu_shader_fp64.builtin.frexp_dvec2 KHR-GL40.gpu_shader_fp64.builtin.frexp_dvec3 KHR-GL40.gpu_shader_fp64.builtin.frexp_dvec4 And piglit test: tests/spec/arb_gpu_shader_fp64/execution/built-in-functions/fs-frexp-dvec4.shader_test Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* ac/surface: compute tile swizzle for GFX9Marek Olšák2018-03-212-3/+88
| | | | Tested-by: Dieter Nützel <[email protected]>
* ac/nir: pass the nir variable through tcs loading.Dave Airlie2018-03-142-12/+6
| | | | | | | | | | | | I was going to have to add another parameter to this monster, so we should just pass the nir_variable in, I can't find any reason this would be a bad idea. This needed for the next fix. Fixes: 94f9591995 (radv/ac: add support for TCS/TES inputs/outputs.) Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* ac/nir: Use lower_vote_eq_to_ballot instead of ac_nir_lower_subgroupsJason Ekstrand2018-03-134-98/+0
| | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: rename radeon_llvm_reg_index_soa() to ac_llvm_reg_index_soa()Samuel Pitoiset2018-03-132-4/+4
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: remove some unnecessary includes and declarationsSamuel Pitoiset2018-03-132-9/+1
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: drop radv prefix from radv_lower_gather4_integer()Samuel Pitoiset2018-03-131-4/+4
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: move ac_nir_compiler_options and friends to radv folderSamuel Pitoiset2018-03-131-72/+0
| | | | | | | Also replace ac_ by radv_. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: move ac_shader_info to radv folderSamuel Pitoiset2018-03-134-388/+0
| | | | | | | This is RADV specific code. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>