path: root/src/amd/common
Commit message | Author | Age | Files | Lines
* radv: fix mask attribs properly. | Dave Airlie | 2017-03-30 | 1 | -2/+2
  some days it just doesn't pay to get out of bed.
  Signed-off-by: Dave Airlie <[email protected]>
* radv: fix regression with mask attrib setting code. | Dave Airlie | 2017-03-30 | 1 | -3/+3
  Signed-off-by: Dave Airlie <[email protected]>
* radv: move to using nir clip/cull merge pass. | Dave Airlie | 2017-03-30 | 1 | -112/+39
  Doing this before tessellation makes doing some bits of tessellation a bit cleaner. It also cleans up a bit of the llvm generator code.
  Reviewed-by: Edward O'Callaghan <[email protected]>
  Acked-by: Bas Nieuwenhuizen <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* radv: add parameter to emit_waitcnt. | Dave Airlie | 2017-03-28 | 1 | -3/+8
  This is just a precursor for tess support, which needs to pass different values here.
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* radv: rework vertex/export shader output handling | Dave Airlie | 2017-03-28 | 2 | -37/+47
  In order to facilitate adding tess support, split the vs/es output info into a separate block, so that it is easier to have the tess shaders export the same info.
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* ac: consistently use ifndef guards over pragma once | Emil Velikov | 2017-03-22 | 3 | -3/+12
  Signed-off-by: Emil Velikov <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
  Acked-by: Vedran Miletić <[email protected]>
  Acked-by: Juha-Pekka Heikkila <[email protected]>
  Reviewed-by: Edward O'Callaghan <[email protected]>
* ac: fix build with LLVM 5.0svn | Marek Olšák | 2017-03-22 | 1 | -2/+8
  Reviewed-by: Nicolai Hähnle <[email protected]>
  Reviewed-by: Dave Airlie <[email protected]>
* radv/ac: Fix shared memory offset calculation | Alex Smith | 2017-03-17 | 1 | -1/+1
  The index passed to get_shared_memory_ptr is an attribute slot index, i.e. the index of a vec4 within LDS. Therefore this must be scaled by sizeof(vec4) to give the LDS byte offset.
  Fixes: f4e499ec791 ("radv: add initial non-conformant radv vulkan driver")
  Signed-off-by: Alex Smith <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  CC: <[email protected]>
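The scaling described in this commit message can be illustrated with a minimal sketch. It assumes a vec4 of four 32-bit components (16 bytes); the helper name `lds_byte_offset_from_slot` is hypothetical and is not the function touched by the commit.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: a vec4 is four 32-bit components, i.e. 16 bytes. */
#define VEC4_SIZE_BYTES 16u

/* Hypothetical helper (not the Mesa function): converts an attribute
 * slot index into a byte offset within LDS. */
static uint32_t lds_byte_offset_from_slot(uint32_t slot_index)
{
    /* Guard against 32-bit overflow in the multiplication. */
    assert(slot_index <= UINT32_MAX / VEC4_SIZE_BYTES);
    return slot_index * VEC4_SIZE_BYTES;
}
```

Under these assumptions, slot 3 maps to byte offset 48, whereas using the unscaled index would address byte 3.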
* radv: Fix using more than 4 bound descriptor sets | James Legg | 2017-03-17 | 1 | -1/+3
  Avoid a buffer overflow in ac_nir_to_llvm.c's create_function when using more than 4 descriptor sets. radv claims support for 8.
  Cc: 17.0 <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/ac: workaround regression in llvm 4.0 release | Dave Airlie | 2017-03-15 | 1 | -1/+12
  LLVM 4.0 was released with a pretty messy regression that will hopefully get fixed in the future. This workaround was proposed by Tom, and it fixes the CTS regressions here at least. I'm not sure if this will cause any major side effects, but correctness over speed and all that.
  radeonsi should possibly consider the same workaround until an llvm fix can be found.
  Acked-by: Bas Nieuwenhuizen <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* radv/ac: gather4 cube workaround integer | Dave Airlie | 2017-03-15 | 1 | -1/+71
  This fix is extracted from amdgpu-pro shader traces. It appears the gather4 workaround for integer types doesn't work for cubes, so instead it forces a float scaled sample, then converts to integer. It modifies the descriptor before calling the gather.
  This also produces some ugly asm code for reasons specified in the patch; llvm could probably do better than dumping sgprs to vgprs.
  This fixes: dEQP-VK.glsl.texture_gather.basic.cube.rgba8*
  Acked-by: Bas Nieuwenhuizen <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* nir: Rework conversion opcodes | Jason Ekstrand | 2017-03-14 | 1 | -14/+10
  The NIR story on conversion opcodes is a mess. We've had way too many of them, naming is inconsistent, and which ones have explicit sizes was sort-of random. This commit re-organizes things and makes them all consistent:
   - All non-bool conversion opcodes now have the explicit size in the destination and are named <src_type>2<dst_type><size>.
   - Integer <-> integer conversion opcodes now only come in i2i and u2u forms (i2u and u2i have been removed) since the only difference between the different integer conversions is whether or not they sign-extend when up-converting.
   - Boolean conversion opcodes all have the explicit size on the bool and are named <src_type>2<dst_type>.
  Making things consistent also allows nir_type_conversion_op to be moved to nir_opcodes.c and auto-generated using mako. This will make adding int8, int16, and float16 versions much easier when the time comes.
  Reviewed-by: Eric Anholt <[email protected]>
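The naming scheme listed in this commit message can be sketched as a small string builder. This is only an illustration of the pattern: `conversion_op_name` and its single-letter type codes (`i`, `u`, `f`, `b`) are assumptions made for the example, not NIR's generated code or API.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper illustrating the naming pattern only.
 * Type codes (an assumption of this sketch): 'i' signed int,
 * 'u' unsigned int, 'f' float, 'b' bool.
 * Returns a pointer to a static buffer (not thread-safe), or NULL
 * for combinations the commit message says no longer exist. */
static const char *conversion_op_name(char src, char dst, unsigned bit_size)
{
    static char name[16];
    int n;

    /* Reject unknown type codes (strchr would also match '\0'). */
    if (src == '\0' || dst == '\0' ||
        !strchr("iufb", src) || !strchr("iufb", dst))
        return NULL;

    /* i2u and u2i were removed; only i2i and u2u remain. */
    if ((src == 'i' && dst == 'u') || (src == 'u' && dst == 'i'))
        return NULL;

    if (src == 'b' || dst == 'b')
        /* Boolean conversions: <src_type>2<dst_type>, no size suffix. */
        n = snprintf(name, sizeof(name), "%c2%c", src, dst);
    else
        /* Other conversions: <src_type>2<dst_type><size>. */
        n = snprintf(name, sizeof(name), "%c2%c%u", src, dst, bit_size);

    return (n > 0 && (size_t)n < sizeof(name)) ? name : NULL;
}
```

With these assumptions, a float to 32-bit signed integer conversion is spelled `f2i32`, and a widening signed conversion to 64 bits is `i2i64`.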
* radv: setup llvm target data layout | Dave Airlie | 2017-03-14 | 1 | -0/+7
  Ported from radeonsi, pointed out by Tom.
  "This prevents LLVM from using sext instructions for local memory offsets and allows the backend to fold immediate offsets into the instruction. This also prevents some incorrect code generation for ptrtoint and inttoptr instructions."
  Cc: "13.0 17.0" <[email protected]>
  Reviewed-by: Tom Stellard <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* radv/ac: move to new image intrinsics. | Dave Airlie | 2017-03-13 | 1 | -145/+77
  This hooks up radv to the new image intrinsic builders.
  Acked-by: Bas Nieuwenhuizen <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* amd: remove shebang from python scripts | Emil Velikov | 2017-03-10 | 1 | -1/+0
  Analogous to earlier commit(s).
  Signed-off-by: Emil Velikov <[email protected]>
  Reviewed-by: Eric Engestrom <[email protected]>
* amd: remove execute bit from python scripts | Emil Velikov | 2017-03-10 | 1 | -0/+0
  Analogous to earlier commit(s).
  Signed-off-by: Emil Velikov <[email protected]>
  Reviewed-by: Eric Engestrom <[email protected]>
* radv/ac: fix multiple descriptor sets with dynamic buffers | Fredrik Höglund | 2017-03-07 | 1 | -3/+5
  The dynamic_offset_offset in the descriptor set binding layout is relative to the dynamic_offset_start for the set in the pipeline layout.
  Cc: 17.0 <[email protected]>
  Signed-off-by: Fredrik Höglund <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
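The relationship stated in this commit message can be written out as a short sketch. The field names `dynamic_offset_start` and `dynamic_offset_offset` come from the message; the function `dynamic_offset_index` and its plain-integer parameters are hypothetical and do not reproduce the radv data structures.

```c
#include <stdint.h>

/* Hypothetical helper: the binding's offset is relative to its set,
 * so the index into the pipeline-wide dynamic offset array is the
 * set's start plus the binding's offset within that set. */
static uint32_t dynamic_offset_index(uint32_t set_dynamic_offset_start,
                                     uint32_t binding_dynamic_offset_offset)
{
    return set_dynamic_offset_start + binding_dynamic_offset_offset;
}
```

For the first set the start is presumably 0, which would explain why omitting it only misbehaves once a later set also has dynamic buffers.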
* amd/common: document PREDICATION OP 3 as 64-bit bool. | Dave Airlie | 2017-03-07 | 1 | -0/+1
  This just documents some info for possible future use.
  Reviewed-by: Marek Olšák <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* radv/ac: introduce i1true/i1false to context. | Dave Airlie | 2017-03-07 | 1 | -32/+33
  This uses these in a few places, and fixes one or two cases which were using da as 32-bit instead of bool.
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* radv/ac: handle Z export using new builder. | Dave Airlie | 2017-03-07 | 1 | -22/+19
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* radv/ac: move to using common ac_get_image_intr_name. | Dave Airlie | 2017-03-07 | 1 | -40/+15
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* radeonsi/ac: move get_image_intr_name to common | Dave Airlie | 2017-03-07 | 2 | -0/+31
  This code is used in radv, so move to common build code.
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* radeonsi: drop support for LLVM 3.6 & 3.7 | Marek Olšák | 2017-03-06 | 3 | -38/+14
  They are too old.
  Reviewed-by: Dave Airlie <[email protected]>
* radeonsi: set the convergent attribute where needed | Marek Olšák | 2017-03-06 | 1 | -2/+6
  Reviewed-by: Dave Airlie <[email protected]>
* gallivm,ac: add LP_FUNC_ATTR_CONVERGENT | Marek Olšák | 2017-03-06 | 2 | -0/+2
  Reviewed-by: Dave Airlie <[email protected]>
* radeonsi: fix LLVM 3.9 - don't use non-matching attributes on declarations | Marek Olšák | 2017-03-06 | 1 | -3/+3
  Call site attributes are used since LLVM 4.0.
  This also reverts commit b19caecbd6f310c1663b0cfe483d113ae3bd5fe2 "radeon/ac: fix intrinsic version check", because this is the correct fix.
  Reviewed-by: Dave Airlie <[email protected]>
* radv/ac: use bitfield extract new intrinsics. | Dave Airlie | 2017-03-06 | 1 | -7/+4
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* radv/ac: move to new kill build. | Dave Airlie | 2017-03-06 | 1 | -5/+2
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* radv/ac: move to using new export intrinsics. | Dave Airlie | 2017-03-06 | 1 | -93/+86
  This uses the new code in build to do exports.
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* radv/ac: switch to new intrinsics for pkrtz and clamp. | Dave Airlie | 2017-03-06 | 1 | -5/+2
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* radeon/ac: fix intrinsic version check | Dave Airlie | 2017-03-06 | 1 | -1/+1
  Reported-by: [email protected]
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100068
  Signed-off-by: Dave Airlie <[email protected]>
* ac: normalize build helper names | Marek Olšák | 2017-03-03 | 3 | -283/+282
  s/emit/build/
  Reviewed-by: Dave Airlie <[email protected]>
* ac: replace SI.vs.load.input with amdgcn.buffer.load.format | Marek Olšák | 2017-03-03 | 1 | -0/+20
  Reviewed-by: Dave Airlie <[email protected]>
* radeonsi: move SI.vs.load.input building into amd/common | Marek Olšák | 2017-03-03 | 2 | -0/+23
  Reviewed-by: Dave Airlie <[email protected]>
* ac: replace llvm.SI.tbuffer.store with llvm.amdgcn.buffer.store if ADD_TID=0 | Marek Olšák | 2017-03-03 | 3 | -4/+62
  ADD_TID doesn't work. Needs more investigation.
  v2: remove leftover dead code
  Reviewed-by: Dave Airlie <[email protected]> (v1)
* ac: remove offen parameter from ac_build_buffer_store_dword | Marek Olšák | 2017-03-03 | 3 | -10/+8
  Reviewed-by: Dave Airlie <[email protected]>
* radeonsi: merge and simplify tbuffer_store functions | Marek Olšák | 2017-03-03 | 3 | -74/+38
  Reviewed-by: Dave Airlie <[email protected]>
* radeonsi: replace AMDGPU.bfe.* with amdgcn.*bfe | Marek Olšák | 2017-03-03 | 2 | -0/+29
  Reviewed-by: Dave Airlie <[email protected]>
* radeonsi: move kill intrinsic building into amd/common | Marek Olšák | 2017-03-03 | 2 | -0/+17
  just a cleanup
  Reviewed-by: Dave Airlie <[email protected]>
* radeonsi: set readnone on reads from read-only memory | Marek Olšák | 2017-03-03 | 2 | -3/+11
* radeonsi: replace SI.packf16 with amdgcn.cvt.pkrtz | Marek Olšák | 2017-03-03 | 2 | -0/+20
* ac: replace old image intrinsics with new ones | Marek Olšák | 2017-03-03 | 1 | -0/+80
  Reviewed-by: Dave Airlie <[email protected]>
* radeonsi: move image intrinsic building to amd/common | Marek Olšák | 2017-03-03 | 2 | -0/+97
  Reviewed-by: Dave Airlie <[email protected]>
* ac: replace SI.export with amdgcn.exp.* | Marek Olšák | 2017-03-03 | 1 | -0/+31
  Reviewed-by: Dave Airlie <[email protected]>
* radeonsi: move llvm.SI.export building to amd/common | Marek Olšák | 2017-03-03 | 2 | -0/+26
  Reviewed-by: Dave Airlie <[email protected]>
* ac: unify build_type_name_for_intr functions | Marek Olšák | 2017-03-03 | 3 | -38/+42
  Reviewed-by: Dave Airlie <[email protected]>
* gallivm, ac: add writeonly and inaccessiblememonly attributes | Marek Olšák | 2017-03-03 | 2 | -0/+4
  Reviewed-by: Dave Airlie <[email protected]>
* amd/common: Fix build with new ac_add_function_attr() | Tobias Klausmann | 2017-03-01 | 3 | -3/+5
  Fix usage of ac_add_function_attr() and make it known!
    common/ac_nir_to_llvm.c: In function 'create_llvm_function':
    common/ac_nir_to_llvm.c:265:4: error: implicit declaration of function 'ac_add_function_attr' [-Werror=implicit-function-declaration]
        ac_add_function_attr(main_function, i + 1, AC_FUNC_ATTR_BYVAL);
        ^~~~~~~~~~~~~~~~~~~~
  Signed-off-by: Tobias Klausmann <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* gallivm,ac: add function attributes at call sites instead of declarations | Marek Olšák | 2017-03-01 | 4 | -48/+86
  They can vary at call sites if the intrinsic is NOT a legacy SI intrinsic. We need this to force readnone or inaccessiblememonly on some amdgcn intrinsics.
  This is only used with LLVM 4.0 and later. Intrinsics only used with LLVM <= 3.9 don't need the LEGACY flag.
  gallivm and ac code is in the same patch, because splitting would be more complicated with all the LEGACY uses all over the place.
  v2: don't change the prototype of lp_add_function_attr.
  Reviewed-by: Jose Fonseca <[email protected]> (v1)
* gallivm,ac: remove unused FUNC_ATTR_LAST enums | Marek Olšák | 2017-03-01 | 1 | -1/+0
  Reviewed-by: Jose Fonseca <[email protected]>