path: root/src/amd/common
Commit message | Author | Age | Files | Lines
* ac: add 16-bit support to ac_build_isign()Samuel Pitoiset2018-09-171-5/+16
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: add 16-bit constant values for zero and oneSamuel Pitoiset2018-09-172-0/+4
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: add ac_build_bifield_reverse() helperSamuel Pitoiset2018-09-173-1/+26
| | | | | | | Are we missing 64-bit support? Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: add ac_build_bit_count() helperSamuel Pitoiset2018-09-173-6/+31
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: fix get_image_coords() for radeonsiTimothy Arceri2018-09-151-1/+2
| | | | | | | | | | Because this was setting image to true we would end up calling si_load_image_desc() when we should be calling si_load_sampler_desc(). This fixes an assert() in Deus Ex: MD Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: adjust and simplify max_alloc_size determinationMarek Olšák2018-09-101-8/+8
| | | | Tested-by: Dieter Nützel <[email protected]>
* radeonsi: fix GPU hangs with bindless textures and LLVM 7.0Marek Olšák2018-09-102-5/+51
| | | | Tested-by: Dieter Nützel <[email protected]>
* ac: remove deprecated use of LLVMInt1Type()Marek Olšák2018-09-101-1/+1
| | | | Tested-by: Dieter Nützel <[email protected]>
* ac: use iN_0/1 constantsMarek Olšák2018-09-102-14/+13
| | | | Tested-by: Dieter Nützel <[email protected]>
* ac: add radeon_info::num_good_cu_per_shMarek Olšák2018-09-102-0/+4
| | | | Tested-by: Dieter Nützel <[email protected]>
* ac: revert new LLVM 7.0 behavior for fdivMarek Olšák2018-09-101-1/+8
| | | | | Cc: 18.2 <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* ac/radeonsi: fix CIK copy max sizeDave Airlie2018-08-311-1/+3
| | | | | | | | | | | | While adding transfer queues to radv, I started writing some tests, the first test I wrote fell over copying a buffer larger than this limit. Checked AMDVLK and found the correct limit. Cc: <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: add SI_QUERY_TIME_ELAPSED_SDMA for measuring SDMA performanceMarek Olšák2018-08-291-0/+4
|
* radeonsi: add flag L2_STREAM for minimal cache usageMarek Olšák2018-08-291-0/+2
|
* nir: Use a bitfield for image access qualifiersJason Ekstrand2018-08-291-2/+2
| | | | | | | | | | This commit expands the current memory access enum to contain the extra two bits provided for images. We choose to follow the SPIR-V convention of NonReadable and NonWriteable because readonly implies that you *can* read so readonly + writeonly doesn't make as much sense as NonReadable + NonWriteable. Reviewed-by: Kenneth Graunke <[email protected]>
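The qualifier scheme described in this commit message can be modelled with a small flag set. This is only a sketch: the flag names and bit positions below are hypothetical and are not taken from the NIR headers. It shows why the negative qualifiers (NonReadable, NonWriteable) combine cleanly, since setting both simply means neither access is allowed.

```python
import enum


class Access(enum.IntFlag):
    # Hypothetical names and bit positions, for illustration only.
    COHERENT = 1 << 0
    RESTRICT = 1 << 1
    VOLATILE = 1 << 2
    NON_READABLE = 1 << 3
    NON_WRITEABLE = 1 << 4


def can_read(access):
    # Reading is allowed unless the NonReadable bit is set.
    return not (access & Access.NON_READABLE)


def can_write(access):
    # Writing is allowed unless the NonWriteable bit is set.
    return not (access & Access.NON_WRITEABLE)
```

With this encoding, an image with no qualifier bits set is both readable and writable, which matches the default the message implies.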
* ac/surface: fix CMASK fast clear for NPOT textures with mipmapping on SI/CI/VIMarek Olšák2018-08-281-2/+2
| | | | | | | This fixes VM faults and corruption. Cc: 18.1 18.2 <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: fix WAITCNT flags for GFX9Marek Olšák2018-08-222-4/+6
| | | | Reviewed-by: Samuel Pitoiset <[email protected]>
* ac/nir: fix getting GLSL type of array of samplers for TG4Samuel Pitoiset2018-08-221-2/+4
| | | | | | | | | | | This fixes a crash in build_tex_intrinsic() when trying to launch the Basemark GPU benchmark on GFX8. It looks like there is still something wrong because some frames are black. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106980 CC: 18.2 <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac,radeonsi: use ac_build_gather_values moreMarek Olšák2018-08-211-11/+3
| | | | Reviewed-by: Samuel Pitoiset <[email protected]>
* ac,radeonsi: use ac_build_fmadMarek Olšák2018-08-212-7/+3
| | | | Reviewed-by: Samuel Pitoiset <[email protected]>
* ac: add imad & fmad helpersMarek Olšák2018-08-212-0/+18
| | | | Reviewed-by: Samuel Pitoiset <[email protected]>
* ac: add ac_build_s_barrierMarek Olšák2018-08-213-2/+8
| | | | Reviewed-by: Samuel Pitoiset <[email protected]>
* ac: completely remove +auto-waitcnt-before-barrierMarek Olšák2018-08-212-6/+2
| | | | | | | it causes corruption on several different GPU generations. Cc: 18.2 <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: disable the auto-waitcnt-before-barrier LLVM optionSamuel Pitoiset2018-08-152-1/+3
| | | | | | | | | | | | | | This option allows us to remove additional s_waitcnt instructions because s_barrier internally does s_waitcnt 0. Though, apparently there is a problem with LDS accesses that causes rendering issues with FFXV and DXVK. Disable this optimization for now (RadeonSI still uses it). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107460 CC: 18.2 <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: add radeon_info::nameMarek Olšák2018-08-142-1/+6
| | | | Reviewed-by: Timothy Arceri <[email protected]>
* radeonsi: implement EXT_window_rectanglesMarek Olšák2018-08-141-0/+16
| | | | Reviewed-by: Samuel Pitoiset <[email protected]>
* meson: Build with Python 3Mathieu Bridon2018-08-101-1/+1
| | | | | | | | | | | | Now that all the build scripts are compatible with both Python 2 and 3, we can flip the switch and tell Meson to use the latter. Since Meson already depends on Python 3 anyway, this means we don't need two different Python stacks to build Mesa. Signed-off-by: Mathieu Bridon <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Dylan Baker <[email protected]>
* amd: remove support for LLVM 5.0Marek Olšák2018-08-032-108/+25
| | | | | | Users are encouraged to switch to LLVM 6.0 released in March 2018. Reviewed-by: Timothy Arceri <[email protected]>
* ac,radeonsi: reduce optimizations for complex compute shaders on older APUs (v2)Marek Olšák2018-08-012-4/+25
| | | | | | | | To make dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.23 finish sooner on the older CPUs. (otherwise it gets killed and we fail the test) Acked-by: Dave Airlie <[email protected]>
* python: Use the unicode_escape codecMathieu Bridon2018-08-011-1/+1
| | | | | | | | | | Python 2 had string_escape and unicode_escape codecs. Python 3 only has the latter. These work the same as far as we're concerned, so let's use the future-proof one. However, the rest of the code expects unicode strings, so we need to decode them again. Signed-off-by: Mathieu Bridon <[email protected]> Reviewed-by: Dylan Baker <[email protected]>
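A minimal sketch of the encode-then-decode round trip this message describes, assuming Python 3. The helper name `escape_text` is hypothetical and is not taken from the Mesa scripts. Encoding with `unicode_escape` yields bytes, so the result is decoded back to a text string for code that expects unicode.

```python
def escape_text(text):
    # unicode_escape produces ASCII-only bytes with backslash escapes;
    # decoding as ASCII turns them back into a text string.
    return text.encode("unicode_escape").decode("ascii")
```

Control characters and non-ASCII characters both come out as backslash escapes, so the result is safe to paste into generated source.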
* ac/surface: fix MSAA corruption on Vega due to FMASK tile swizzleMarek Olšák2018-07-311-1/+1
| | | | | | | a needle in the haystack? Cc: 18.1 <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: use storage_samples instead of color_samples in most placesMarek Olšák2018-07-312-4/+4
| | | | | | | and use pipe_resource::nr_storage_samples instead of r600_texture::num_color_samples. Tested-by: Dieter Nützel <[email protected]>
* ac: pass write param to get_sampler_desc() from get_image_descriptor()Timothy Arceri2018-07-281-1/+1
| | | | | | | Looks like a mistake from when the deref stuff landed. Fixes: 506a07e4e3a4 ("ac/nir: Add deref support to image intrinsics.") Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: fix typo DSL_SEL -> DST_SELMarek Olšák2018-07-261-2/+2
|
* nir: rename f2f16_undef to f2f16Karol Herbst2018-07-241-1/+1
| | | | | | | | | | | we need rounding modes on other conversions involving floats and it is easier to rename f2f16_undef than renaming all the other ones. v2: rebased on master Reviewed-by: Jason Ekstrand <[email protected]> Acked-by: Rob Clark <[email protected]> Signed-off-by: Karol Herbst <[email protected]>
* radeonsi: Add debug option to enable LLVM GlobalISel (v2)Tom Stellard2018-07-233-2/+18
| | | | | | | | | | R600_DEBUG=gisel will tell LLVM to use GlobalISel rather than SelectionDAG for instruction selection. v2: mareko: move the helper to src/amd/common Signed-off-by: Marek Olšák <[email protected]> Reviewed-by: Tom Stellard <[email protected]>
* ac: add support for 16bit load_push_constantDaniel Schürmann2018-07-231-0/+20
| | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add support for 16bit input/outputDaniel Schürmann2018-07-231-1/+7
| | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: add support for 16bit buffer loadsDaniel Schürmann2018-07-231-40/+55
| | | | | | v2: Fixed dvec3 loads (bas) Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: add support for 16bit UBO loadsDaniel Schürmann2018-07-233-3/+51
| | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: add support for 16bit ssbo storesDaniel Schürmann2018-07-231-60/+84
| | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: add 16bit conversion operationsDaniel Schürmann2018-07-232-9/+31
| | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: add a workaround for bitfield_extract when count is 0Samuel Pitoiset2018-07-191-3/+17
| | | | | | | | | | | | LLVM 7 returns incorrect results when count is 0, something has been broken since LLVM 6. Of course, the best solution is to fix LLVM but this workaround works as expected for now. Original workaround by Philippe Rebohle. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107276 Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: run LLVM optimization passes only on the final function after inliningMarek Olšák2018-07-193-0/+14
|
* radeonsi: add support for Vega20Marek Olšák2018-07-124-1/+9
| | | | Reviewed-by: Alex Deucher <[email protected]>
* python: Use the print functionMathieu Bridon2018-07-061-19/+20
| | | | | | | | | | | | In Python 2, `print` was a statement, but it became a function in Python 3. Using print functions everywhere makes the script compatible with Python versions >= 2.6, including Python 3. Signed-off-by: Mathieu Bridon <[email protected]> Acked-by: Eric Engestrom <[email protected]> Acked-by: Dylan Baker <[email protected]>
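A short sketch of the function form of `print` writing to an explicit stream. The helper name `render_defines` and the `#define` output format are assumptions made for illustration, not the actual script's output. On Python 2.6 and 2.7, the `file=` keyword additionally needs `from __future__ import print_function` as the first statement of the file.

```python
import io


def render_defines(pairs):
    # Collect the generated lines in memory instead of writing to stdout.
    out = io.StringIO()
    for name, value in pairs:
        # print() as a function accepts the target stream via file=.
        print("#define %s 0x%x" % (name, value), file=out)
    return out.getvalue()
```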
* python: Stabilize some script outputsMathieu Bridon2018-07-051-1/+1
| | | | | | | | | | In Python, dictionaries and sets are unordered, and as a result there is no guarantee that running this script twice will produce the same output. Using ordered dicts and explicitly sorting items makes the build more reproducible, and will make it possible to verify that we're not breaking anything when we move the build scripts to Python 3. Reviewed-by: Eric Engestrom <[email protected]>
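The explicit-sorting approach mentioned above can be sketched as follows. The data layout (a dict mapping a register name to a set of field names) and the helper name `render_table` are hypothetical. The point is only that iterating over `sorted(...)` gives the same output on every run, regardless of how the containers order their items.

```python
def render_table(registers):
    lines = []
    # Sort the dict keys so register order does not depend on insertion
    # or hash order.
    for name in sorted(registers):
        # Sets have no defined order, so sort the fields as well.
        fields = ", ".join(sorted(registers[name]))
        lines.append("%s: %s" % (name, fields))
    return "\n".join(lines)
```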
* ac: fold LLVMContext creation into ac_llvm_context_initMarek Olšák2018-07-042-4/+4
| | | | Reviewed-by: Dave Airlie <[email protected]>
* ac: add reusable helpers for direct LLVM compilationMarek Olšák2018-07-043-4/+76
| | | | | | | | | | | | | | | This is basically LLVMTargetMachineEmitToMemoryBuffer inlined and reworked. struct ac_compiler_passes (opaque type) contains the main pass manager. ac_create_llvm_passes -- the result can go to thread local storage ac_destroy_llvm_passes -- can be called by a destructor in TLS ac_compile_module_to_binary -- from LLVMModuleRef to ac_shader_binary The motivation is to do the expensive call addPassesToEmitFile once per context or thread. Reviewed-by: Dave Airlie <[email protected]>
* ac: make some fns staticDave Airlie2018-07-042-13/+6
| | | | | | | Some of the compiler functions are no longer called outside the util file. Reviewed-by: Marek Olšák <[email protected]>