Commit message | Author | Age | Files | Lines
* docs: fixup typo | Erik Faye-Lund | 2019-06-13 | 1 | -1/+1
  Signed-off-by: Erik Faye-Lund <[email protected]>
  Reviewed-by: Eric Engestrom <[email protected]>
* radv: enable AMD_shader_ballot with RADV_PERFTEST_SHADER_BALLOT ('shader_ballot') | Daniel Schürmann | 2019-06-13 | 5 | -1/+9
  Reviewed-by: Connor Abbott <[email protected]>
* amd/common: add support for AMD_shader_ballot functions | Daniel Schürmann | 2019-06-13 | 1 | -0/+20
  Reviewed-by: Connor Abbott <[email protected]>
* spirv/nir: add support for AMD_shader_ballot and Groups capability | Daniel Schürmann | 2019-06-13 | 6 | -11/+139
  This commit also renames existing AMD capabilities:
  - gcn_shader -> amd_gcn_shader
  - trinary_minmax -> amd_trinary_minmax

  Reviewed-by: Connor Abbott <[email protected]>
* nir: add intrinsics for AMD_shader_ballot | Daniel Schürmann | 2019-06-13 | 3 | -0/+31
  Reviewed-by: Connor Abbott <[email protected]>
* radv: enable shader_subgroup_vote & shader_subgroup_ballot extensions | Daniel Schürmann | 2019-06-13 | 1 | -0/+2
  Reviewed-by: Connor Abbott <[email protected]>
* nir/spirv: add support for the SubgroupBallotKHR SPIR-V capability | Daniel Schürmann | 2019-06-13 | 2 | -7/+13
  This capability is required for the VK_EXT_shader_subgroup_ballot
  extension.

  Reviewed-by: Connor Abbott <[email protected]>
* nir/spirv: add support for the SubgroupVoteKHR SPIR-V capability | Daniel Schürmann | 2019-06-13 | 2 | -4/+20
  This capability is required for the VK_EXT_shader_subgroup_vote
  extension.

  Reviewed-by: Connor Abbott <[email protected]>
* v3d: fix checking twice auf flag | Alejandro Piñeiro | 2019-06-13 | 1 | -1/+1
  This seems to be a copy-and-paste error; the code should check for
  auf/muf.

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110902
  Fixes: 8f065596d22ab000c53f "v3d: Add an optimization pass for redundant flags updates."
  Reviewed-by: Eric Anholt <[email protected]>
* radv: flush and invalidate CB before resetting query pools on GFX9 | Samuel Pitoiset | 2019-06-13 | 1 | -0/+4
  We have to emit a CACHE_FLUSH_AND_INV_TS_EVENT to be sure all prior
  GPU work is done. While we are at it, also flush and invalidate DB.

  This fixes the following CTS (when the small hint is disabled):
  dEQP-VK.query_pool.statistics_query.reset_before_copy.*

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-By: Bas Nieuwenhuizen <[email protected]>
* vl: Always enable drm winsys. | Bas Nieuwenhuizen | 2019-06-13 | 3 | -21/+3
  The dri2 winsys also uses libdrm (and you can only enable dri3 if you
  enable dri2), and the drm winsys only requires libdrm. So if any
  winsys is enabled you can also enable the drm winsys, and since we
  always want at least one winsys we can always enable it.

  I removed the check for the drm platform for VA and OMX since they do
  not care anymore. Since we still check for one of r600g, nouveau or
  radeonsi, we are guaranteed to still only enable it by default in a
  configuration that requires libdrm anyway. So for people using
  va=auto, we don't suddenly start requiring libdrm where we did not
  before.

  This supersedes "vl: Enable DRM by default.", which I pushed, but
  rolled back because it used dep_libdrm before its definition.

  Reviewed-by: Emil Velikov <[email protected]>
* radv: Always disable DCC on shareable images. | Bas Nieuwenhuizen | 2019-06-13 | 1 | -3/+1
  We do not want it, for performance reasons. DCC always has to be
  disabled when transferring to an external queue.

  Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Skip transitions coming from external queue. | Bas Nieuwenhuizen | 2019-06-13 | 1 | -0/+3
  Transitions to the external queue should perform the transition and
  make sure it works on all queues.

  Fixes: 8ebc7dcb59a "radv: Allow fast clears with concurrent queue mask for some layouts."
  Reviewed-by: Samuel Pitoiset <[email protected]>
* lima/ppir: change offset type to int | Mateusz Krzak | 2019-06-13 | 2 | -2/+2
  The offset doesn't need to be 64-bit. This fixes a compilation error
  with 64-bit off_t.

  Fixes: af0de6b9 lima/ppir: implement discard and discard_if
  Suggested-by: Qiang Yu <[email protected]>
  Signed-off-by: Mateusz Krzak <[email protected]>
  Reviewed-by: Qiang Yu <[email protected]>
  Tested-by: Andreas Baierl <[email protected]>
* virgl: virgl_transfer should own its virgl_resource | Chia-I Wu | 2019-06-12 | 2 | -8/+6
  We should avoid having potentially dangling pointers to
  pipe_resources in general.

  Signed-off-by: Chia-I Wu <[email protected]>
  Reviewed-by: Alexandros Frantzis <[email protected]>
* virgl: pass virgl_context to transfer create/destroy | Chia-I Wu | 2019-06-12 | 5 | -22/+21
  A pipe_transfer is a context object. It is fine for the
  constructor/destructor to have access to the context.

  Signed-off-by: Chia-I Wu <[email protected]>
  Reviewed-by: Alexandros Frantzis <[email protected]>
* virgl: init transfer queue from virgl_context | Chia-I Wu | 2019-06-12 | 3 | -10/+11
  A pipe_transfer is a context object. It is fine for
  virgl_transfer_queue to have access to the context.

  Signed-off-by: Chia-I Wu <[email protected]>
  Reviewed-by: Alexandros Frantzis <[email protected]>
* virgl: clean up virgl_transfer_queue.h | Chia-I Wu | 2019-06-12 | 2 | -1/+13
  Add header guard and forward declare structs. Move virgl_resource.h
  inclusion to the C file.

  Signed-off-by: Chia-I Wu <[email protected]>
  Reviewed-by: Alexandros Frantzis <[email protected]>
* radeonsi: add radeonsi_debug_disassembly option | Nicolai Hähnle | 2019-06-12 | 2 | -6/+10
  This dumps disassembly to the pipe_debug_callback together with
  shader stats.

  Can be used together with shader-db to get full disassembly of all
  shaders in the database.

  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: fix line splitting in si_shader_dump_assembly | Nicolai Hähnle | 2019-06-12 | 1 | -1/+1
  Compute the count since the start of the current line instead of the
  count since the start of the disassembly.

  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: raise the alignment of LDS memory for compute shaders | Nicolai Hähnle | 2019-06-12 | 1 | -1/+1
  This implies that the memory will always be at address 0, which
  allows LLVM to generate slightly better code.

  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: use an explicit symbol for the LSHS LDS memory | Nicolai Hähnle | 2019-06-12 | 2 | -2/+20
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: rename lds_{load,store} to lshs_lds_{load,store} | Nicolai Hähnle | 2019-06-12 | 1 | -17/+16
  These functions are now only used in LS/HS shaders (both separate and
  merged).

  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi/gfx9: declare LDS ESGS ring as an explicit symbol on LLVM >= 9 | Nicolai Hähnle | 2019-06-12 | 3 | -36/+94
  This will make it easier to use LDS for other purposes in geometry
  shaders in the future.

  The lifetime of the esgs_ring variable is as follows:
  - declared as [0 x i32] while compiling shader parts or monolithic
    shaders
  - just before uploading, gfx9_get_gs_info computes (among other
    things) the final ESGS ring size (this depends on both the ES and
    the GS shader)
  - during upload, the "esgs_ring" symbol is given to ac_rtld as a
    shared LDS symbol, which will lead to correctly laying out the LDS
    including other LDS objects that may be defined in the future
  - si_shader_gs uses shader->config.lds_size as the LDS size

  This change depends on the LLVM changes for emitting LDS symbols into
  the ELF file.

  Reviewed-by: Marek Olšák <[email protected]>
* amd/rtld: layout and relocate LDS symbols | Nicolai Hähnle | 2019-06-12 | 7 | -52/+301
  Upcoming changes to LLVM will emit LDS objects as symbols in the ELF
  symbol table, with relocations that will be resolved with this
  change.

  Callers will also be able to define LDS symbols that are shared
  between shader parts. This will be used by radeonsi for the ESGS ring
  in gfx9+ merged shaders.

  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: cleanup some #includes | Nicolai Hähnle | 2019-06-12 | 1 | -2/+2
  Reviewed-by: Marek Olšák <[email protected]>
* amd/common: use ARRAY_SIZE for the LLVM command line options | Nicolai Hähnle | 2019-06-12 | 1 | -2/+2
  This is more convenient for changing it around during debug.

  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: inline si_shader_binary_read_config into its only caller | Nicolai Hähnle | 2019-06-12 | 2 | -16/+7
  Since it can only be used for reading the config of an individual,
  non-combined shader, it is not very reusable anyway.

  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: use the new run-time linker for shaders | Nicolai Hähnle | 2019-06-12 | 9 | -237/+272
  v2:
  - fix a memory leak

  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: don't declare pointers to static strings | Nicolai Hähnle | 2019-06-12 | 1 | -2/+2
  The compiler should be able to optimize them away, but still. There's
  no point in declaring those as pointers, and if the compiler
  *doesn't* optimize them away, they add unnecessary load-time
  relocations.

  Reviewed-by: Marek Olšák <[email protected]>
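The difference the commit above describes can be sketched as follows. This is only an illustration: the names driver_name_ptr, driver_name_arr and names_match are hypothetical and do not come from the radeonsi sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Pointer form: a separate pointer object that refers to the string.
 * In position-independent code the pointer may need a load-time
 * relocation unless the compiler optimizes it away. */
static const char *const driver_name_ptr = "radeonsi";

/* Array form: only the string bytes exist, so there is no pointer
 * object to relocate. */
static const char driver_name_arr[] = "radeonsi";

/* Both forms hold the same text; the array also knows its own size. */
static bool names_match(void)
{
   assert(sizeof(driver_name_arr) == strlen(driver_name_ptr) + 1);
   return strcmp(driver_name_ptr, driver_name_arr) == 0;
}
```

Both declarations can be used the same way at call sites that expect a const char *, since the array decays to a pointer when passed.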
* amd/common: add ac_compile_module_to_elf | Nicolai Hähnle | 2019-06-12 | 2 | -7/+83
  A new variant of ac_compile_module_to_binary that allows us to keep
  the entire ELF around.

  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: dump shader binary buffer contents | Nicolai Hähnle | 2019-06-12 | 2 | -0/+19
  Help identify bugs related to corruption of shaders in memory, or
  errors in shader upload / rtld.

  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: return bool from si_shader_binary_upload | Nicolai Hähnle | 2019-06-12 | 4 | -19/+16
  We didn't really use error codes anyway.

  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: let si_shader_create return a boolean | Nicolai Hähnle | 2019-06-12 | 4 | -16/+14
  We didn't really use error codes anyway.

  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: use ac_shader_config | Nicolai Hähnle | 2019-06-12 | 4 | -126/+27
  Reviewed-by: Marek Olšák <[email protected]>
* amd/common: add a more powerful runtime linker | Nicolai Hähnle | 2019-06-12 | 5 | -0/+655
  Using an explicit linker instead of just concatenating .text sections
  will allow us to start using .rodata sections and explicit
  descriptions of data on LDS that is shared between stages.

  Reviewed-by: Marek Olšák <[email protected]>
* i965: Fix INTEL_DEBUG=bat | Caio Marcelo de Oliveira Filho | 2019-06-12 | 4 | -25/+26
  Use hash_table_u64 instead of hash_table directly, since the former
  will also handle the special keys (deleted and freed) and allow use
  of the whole u64 space.

  Fixes a crash in INTEL_DEBUG=bat when using a key with value 0, which
  is the current value for a freed key.

  Fixes: b38dab101ca "util/hash_table: Assert that keys are not reserved pointers"
  Reviewed-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* util/hash_table: Properly handle the NULL key in hash_table_u64 | Caio Marcelo de Oliveira Filho | 2019-06-12 | 2 | -5/+37
  The hash_table_u64 should support any uint64_t as input. It does
  special handling for the "deleted" key, storing the data in the table
  itself; do the same for the "freed" key.

  Fixes: b38dab101ca "util/hash_table: Assert that keys are not reserved pointers"
  Reviewed-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
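The idea of keeping reserved key values usable can be sketched with a small, fixed-size open-addressing map. This is a simplified illustration and not Mesa's implementation: the type u64_map, the sentinel values 0 and 1, the slot count and all function names are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinels: these slot values are reserved by the table. */
#define KEY_FREED   UINT64_C(0)  /* slot never used */
#define KEY_DELETED UINT64_C(1)  /* tombstone left by a removal */
#define MAP_SLOTS   64u          /* fixed capacity, power of two */

struct u64_map {
   uint64_t keys[MAP_SLOTS];
   void *vals[MAP_SLOTS];
   /* Side storage in the table itself, so callers may still use the
    * reserved values as ordinary keys. */
   void *freed_val, *deleted_val;
   bool has_freed, has_deleted;
};

/* Multiplicative hash; the top 6 bits select one of the 64 slots. */
static unsigned slot_of(uint64_t key)
{
   return (unsigned)((key * UINT64_C(0x9E3779B97F4A7C15)) >> 58);
}

static bool u64_map_insert(struct u64_map *m, uint64_t key, void *val)
{
   assert(m != NULL);
   if (key == KEY_FREED) { m->freed_val = val; m->has_freed = true; return true; }
   if (key == KEY_DELETED) { m->deleted_val = val; m->has_deleted = true; return true; }

   unsigned start = slot_of(key);
   int reuse = -1; /* first reusable slot seen while probing */
   for (unsigned n = 0; n < MAP_SLOTS; n++) {
      unsigned i = (start + n) & (MAP_SLOTS - 1);
      if (m->keys[i] == key) { m->vals[i] = val; return true; }
      if (m->keys[i] == KEY_DELETED && reuse < 0)
         reuse = (int)i;
      if (m->keys[i] == KEY_FREED) {
         if (reuse < 0)
            reuse = (int)i;
         break; /* key cannot be stored past a never-used slot */
      }
   }
   if (reuse < 0)
      return false; /* table is full */
   m->keys[reuse] = key;
   m->vals[reuse] = val;
   return true;
}

static void *u64_map_search(const struct u64_map *m, uint64_t key)
{
   assert(m != NULL);
   if (key == KEY_FREED) return m->has_freed ? m->freed_val : NULL;
   if (key == KEY_DELETED) return m->has_deleted ? m->deleted_val : NULL;

   unsigned start = slot_of(key);
   for (unsigned n = 0; n < MAP_SLOTS; n++) {
      unsigned i = (start + n) & (MAP_SLOTS - 1);
      if (m->keys[i] == key) return m->vals[i];
      if (m->keys[i] == KEY_FREED) return NULL;
   }
   return NULL;
}

static void u64_map_remove(struct u64_map *m, uint64_t key)
{
   assert(m != NULL);
   if (key == KEY_FREED) { m->has_freed = false; m->freed_val = NULL; return; }
   if (key == KEY_DELETED) { m->has_deleted = false; m->deleted_val = NULL; return; }

   unsigned start = slot_of(key);
   for (unsigned n = 0; n < MAP_SLOTS; n++) {
      unsigned i = (start + n) & (MAP_SLOTS - 1);
      if (m->keys[i] == key) { m->keys[i] = KEY_DELETED; m->vals[i] = NULL; return; }
      if (m->keys[i] == KEY_FREED) return;
   }
}
```

A zero-initialized struct u64_map is an empty map in this sketch. The point of the side storage is that a caller inserting key 0 never collides with the marker used for unused slots.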
* amd/common: clarify ac_shader_binary::lds_size | Nicolai Hähnle | 2019-06-12 | 1 | -1/+1
  Reviewed-by: Marek Olšák <[email protected]>
* amd/common: extract ac_parse_shader_binary_config | Nicolai Hähnle | 2019-06-12 | 2 | -34/+47
  Reviewed-by: Marek Olšák <[email protected]>
* u_dynarray: turn util_dynarray_{grow, resize} into element-oriented macros | Nicolai Hähnle | 2019-06-12 | 10 | -33/+49
  The main motivation for this change is API ergonomics: most
  operations on dynarrays are really on elements, not on bytes, so it's
  weird to have grow and resize as the odd operations out.

  The secondary motivation is memory safety. Users of the old
  byte-oriented functions would often multiply a number of elements
  with the element size, which could overflow, and checking for
  overflow is tedious.

  With this change, we only need to implement the overflow checks once.
  The checks are cheap: since eltsize is a compile-time constant and
  the functions should be inlined, they only add a single comparison
  and an unlikely branch.

  v2:
  - ensure operations are no-op when allocation fails
  - in util_dynarray_clone, call resize_bytes with a compile-time
    constant element size

  v3:
  - fix iris, lima, panfrost

  Reviewed-by: Marek Olšák <[email protected]>
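The element-oriented pattern described above can be sketched as follows. This is a minimal illustration under assumptions, not the u_dynarray code: struct byte_array, byte_array_grow_bytes and the byte_array_grow macro are hypothetical names, and the initial capacity of 64 bytes is arbitrary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical byte-backed dynamic array; size and capacity in bytes. */
struct byte_array {
   void *data;
   size_t size;
   size_t capacity;
};

/* Append room for nelts elements of eltsize bytes each. Returns a
 * pointer to the first new element, or NULL on overflow or allocation
 * failure, in which case the array is left untouched. */
static void *byte_array_grow_bytes(struct byte_array *a, size_t nelts,
                                   size_t eltsize)
{
   assert(a != NULL);

   /* The overflow checks live here, once, instead of at every caller. */
   if (eltsize != 0 && nelts > SIZE_MAX / eltsize)
      return NULL;
   size_t add = nelts * eltsize;
   if (add > SIZE_MAX - a->size)
      return NULL;

   size_t newsize = a->size + add;
   if (newsize > a->capacity) {
      size_t newcap = a->capacity ? a->capacity : 64;
      while (newcap < newsize) {
         if (newcap > SIZE_MAX / 2) { newcap = newsize; break; }
         newcap *= 2;
      }
      void *p = realloc(a->data, newcap);
      if (!p)
         return NULL; /* no-op on failure: old buffer is still valid */
      a->data = p;
      a->capacity = newcap;
   }

   void *first = (char *)a->data + a->size;
   a->size = newsize;
   return first;
}

/* Element-oriented wrapper: callers name a type and a count, never a
 * byte size, so they cannot get the multiplication wrong. */
#define byte_array_grow(a, type, nelts) \
   ((type *)byte_array_grow_bytes((a), (nelts), sizeof(type)))
```

A caller would write something like `uint32_t *p = byte_array_grow(&arr, uint32_t, 4);` and check p for NULL. Because sizeof(type) is a compile-time constant, an inlining compiler can reduce the first check to a single comparison, which matches the cost argument in the commit message.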
* u_dynarray: return 0 on realloc failure and ensure no-op | Nicolai Hähnle | 2019-06-12 | 1 | -8/+10
  We're not very good at handling out-of-memory conditions in general,
  but this change at least gives the caller the option of handling it
  gracefully and without memory leaks.

  This happens to fix an error in out-of-memory handling in i965, which
  has the following code in brw_bufmgr.c:

      node = util_dynarray_grow(vma_list, sizeof(struct vma_bucket_node));
      if (unlikely(!node))
         return 0ull;

  Previously, allocation failure for util_dynarray_grow wouldn't
  actually return NULL when the dynarray was previously non-empty.

  v2:
  - make util_dynarray_ensure_cap a no-op on failure, add MUST_CHECK
    attribute
  - simplify the new capacity calculation: aside from avoiding a
    useless loop when newcap is very large, this also avoids an
    infinite loop when newcap is larger than 1 << 31

  Reviewed-by: Marek Olšák <[email protected]>
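The capacity issue mentioned in the v2 notes can be illustrated with a loop-free calculation. With a 32-bit capacity, a loop that doubles until it reaches the request wraps to 0 and never terminates once the request exceeds 1 << 31. The sketch below takes a maximum instead; the names next_capacity, max_u32 and INITIAL_CAP, and the saturating doubling, are assumptions for illustration rather than the actual u_dynarray code.

```c
#include <assert.h>
#include <stdint.h>

#define INITIAL_CAP 64u /* hypothetical minimum capacity in bytes */

static uint32_t max_u32(uint32_t a, uint32_t b)
{
   return a > b ? a : b;
}

/* Pick the next capacity without looping: at least the minimum, at
 * least double the current capacity, and at least what was requested. */
static uint32_t next_capacity(uint32_t cap, uint32_t needed)
{
   /* Saturate instead of wrapping when doubling would overflow. */
   uint32_t doubled = cap > UINT32_MAX / 2 ? UINT32_MAX : cap * 2;
   uint32_t result = max_u32(INITIAL_CAP, max_u32(doubled, needed));
   assert(result >= needed);
   return result;
}
```

Doubling still gives amortized growth for small requests, while a single very large request is satisfied in one step.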
* freedreno: use util_dynarray_clear instead of util_dynarray_resize(_, 0) | Nicolai Hähnle | 2019-06-12 | 5 | -12/+12
  This is more expressive and simplifies a subsequent change.

  v2:
  - fix one more call-site after rebase

  Reviewed-by: Marek Olšák <[email protected]>
* panfrost/midgard: Differentiate vertex/fragment texture tags | Alyssa Rosenzweig | 2019-06-12 | 3 | -4/+15
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Assert on unknown texture source | Alyssa Rosenzweig | 2019-06-12 | 1 | -5/+2
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Set minimal swizzle on texture input | Alyssa Rosenzweig | 2019-06-12 | 1 | -1/+2
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Lower texture projectors | Alyssa Rosenzweig | 2019-06-12 | 1 | -1/+2
  We do have native support for perspective division on the load/store
  unit, but this is for the future, something ideally we would select
  generally, not just for textures. Meanwhile, flipping on projector
  lowering works now.

  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Implement txl | Alyssa Rosenzweig | 2019-06-12 | 2 | -8/+11
  This follows the txb implementation, but requires an adjustment to
  how the cont/last flags are set.

  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Implement txb op | Alyssa Rosenzweig | 2019-06-12 | 1 | -10/+55
  We refactor the main tex handling to fit a bias argument in as well.

  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Unify bind_vs/fs_state | Alyssa Rosenzweig | 2019-06-12 | 1 | -57/+49
  This replaces the bind_vs/fs_state calls with a unified
  bind_shader_state call, removing a great deal of duplicated logic
  related to variant selection.

  Signed-off-by: Alyssa Rosenzweig <[email protected]>