mesa.git

	Commit message	Author	Age	Files	Lines
*	radeonsi: fix line splitting in si_shader_dump_assembly	Nicolai Hähnle	2019-06-12	1	-1/+1
\| \| \| \| \| \| \|	Compute the count since the start of the current line instead of the count since the start of the disassembly. Reviewed-by: Marek Olšák <[email protected]>
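The fix above comes down to measuring each line from its own start. A minimal sketch of that idea follows; it is not the radeonsi code, and `for_each_line` and `record_max_len` are hypothetical names introduced only for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type: receives one line (without its '\n'). */
typedef void (*line_cb)(const char *line, size_t len, void *user);

/* Walk a text buffer line by line and return the number of lines seen. */
static size_t for_each_line(const char *text, size_t nbytes,
                            line_cb emit, void *user)
{
    size_t nlines = 0;
    size_t line = 0; /* byte offset of the start of the current line */

    while (line < nbytes) {
        /* Search from the start of the current line, and compute the
         * length relative to that start, not relative to text. */
        const char *nl = memchr(text + line, '\n', nbytes - line);
        size_t count = nl ? (size_t)(nl - (text + line)) : nbytes - line;

        if (emit)
            emit(text + line, count, user);
        nlines++;
        line += count + 1; /* skip the line and its newline */
    }
    return nlines;
}

/* Example callback: track the longest line length in *user. */
static void record_max_len(const char *line, size_t len, void *user)
{
    size_t *max = user;
    (void)line;
    if (len > *max)
        *max = len;
}
```

Using `nl - text` instead of `nl - (text + line)` would reproduce the class of bug described: every line after the first would be reported too long.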
*	radeonsi: raise the alignment of LDS memory for compute shaders	Nicolai Hähnle	2019-06-12	1	-1/+1
\| \| \| \| \| \| \|	This implies that the memory will always be at address 0, which allows LLVM to generate slightly better code. Reviewed-by: Marek Olšák <[email protected]>
*	radeonsi: use an explicit symbol for the LSHS LDS memory	Nicolai Hähnle	2019-06-12	2	-2/+20
\| \| \| \|	Reviewed-by: Marek Olšák <[email protected]>
*	radeonsi: rename lds_{load,store} to lshs_lds_{load,store}	Nicolai Hähnle	2019-06-12	1	-17/+16
\| \| \| \| \| \| \|	These functions are now only used in LS/HS shaders (both separate and merged). Reviewed-by: Marek Olšák <[email protected]>
*	radeonsi/gfx9: declare LDS ESGS ring as an explicit symbol on LLVM >= 9	Nicolai Hähnle	2019-06-12	3	-36/+94
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This will make it easier to use LDS for other purposes in geometry shaders in the future. The lifetime of the esgs_ring variable is as follows: - declared as [0 x i32] while compiling shader parts or monolithic shaders - just before uploading, gfx9_get_gs_info computes (among other things) the final ESGS ring size (this depends on both the ES and the GS shader) - during upload, the "esgs_ring" symbol is given to ac_rtld as a shared LDS symbol, which will lead to correctly laying out the LDS including other LDS objects that may be defined in the future - si_shader_gs uses shader->config.lds_size as the LDS size This change depends on the LLVM changes for emitting LDS symbols into the ELF file. Reviewed-by: Marek Olšák <[email protected]>
*	amd/rtld: layout and relocate LDS symbols	Nicolai Hähnle	2019-06-12	7	-52/+301
\| \| \| \| \| \| \| \| \| \| \|	Upcoming changes to LLVM will emit LDS objects as symbols in the ELF symbol table, with relocations that will be resolved with this change. Callers will also be able to define LDS symbols that are shared between shader parts. This will be used by radeonsi for the ESGS ring in gfx9+ merged shaders. Reviewed-by: Marek Olšák <[email protected]>
*	radeonsi: cleanup some #includes	Nicolai Hähnle	2019-06-12	1	-2/+2
\| \| \| \|	Reviewed-by: Marek Olšák <[email protected]>
*	amd/common: use ARRAY_SIZE for the LLVM command line options	Nicolai Hähnle	2019-06-12	1	-2/+2
\| \| \| \| \| \|	This is more convenient for changing it around during debug. Reviewed-by: Marek Olšák <[email protected]>
*	radeonsi: inline si_shader_binary_read_config into its only caller	Nicolai Hähnle	2019-06-12	2	-16/+7
\| \| \| \| \| \| \|	Since it can only be used for reading the config of an individual, non-combined shader, it is not very reusable anyway. Reviewed-by: Marek Olšák <[email protected]>
*	radeonsi: use the new run-time linker for shaders	Nicolai Hähnle	2019-06-12	9	-237/+272
\| \| \| \| \| \| \|	v2: - fix a memory leak Reviewed-by: Marek Olšák <[email protected]>
*	radeonsi: don't declare pointers to static strings	Nicolai Hähnle	2019-06-12	1	-2/+2
\| \| \| \| \| \| \| \|	The compiler should be able to optimize them away, but still. There's no point in declaring those as pointers, and if the compiler doesn't optimize them away, they add unnecessary load-time relocations. Reviewed-by: Marek Olšák <[email protected]>
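To illustrate the difference this message describes, here is a small sketch with hypothetical names (`name_ptr`, `name_arr`); it does not show the strings actually touched by the commit. The pointer form creates a separate pointer object that may need a load-time relocation in position-independent code, while the array form is only the bytes of the string.

```c
#include <stddef.h>

/* Pointer form: a pointer object plus the string it points to. */
static const char *const name_ptr = "radeonsi";

/* Array form: the string bytes themselves, no pointer object. */
static const char name_arr[] = "radeonsi";

/* sizeof reveals the difference: pointer size vs. string length + NUL. */
size_t ptr_object_size(void) { return sizeof(name_ptr); }
size_t arr_object_size(void) { return sizeof(name_arr); }
```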
*	amd/common: add ac_compile_module_to_elf	Nicolai Hähnle	2019-06-12	2	-7/+83
\| \| \| \| \| \| \|	A new variant of ac_compile_module_to_binary that allows us to keep the entire ELF around. Reviewed-by: Marek Olšák <[email protected]>
*	radeonsi: dump shader binary buffer contents	Nicolai Hähnle	2019-06-12	2	-0/+19
\| \| \| \| \| \| \|	Help identify bugs related to corruption of shaders in memory, or errors in shader upload / rtld. Reviewed-by: Marek Olšák <[email protected]>
*	radeonsi: return bool from si_shader_binary_upload	Nicolai Hähnle	2019-06-12	4	-19/+16
\| \| \| \| \| \|	We didn't really use error codes anyway. Reviewed-by: Marek Olšák <[email protected]>
*	radeonsi: let si_shader_create return a boolean	Nicolai Hähnle	2019-06-12	4	-16/+14
\| \| \| \| \| \|	We didn't really use error codes anyway. Reviewed-by: Marek Olšák <[email protected]>
*	radeonsi: use ac_shader_config	Nicolai Hähnle	2019-06-12	4	-126/+27
\| \| \| \|	Reviewed-by: Marek Olšák <[email protected]>
*	amd/common: add a more powerful runtime linker	Nicolai Hähnle	2019-06-12	5	-0/+655
\| \| \| \| \| \| \| \| \|	Using an explicit linker instead of just concatenating .text sections will allow us to start using .rodata sections and explicit descriptions of data on LDS that is shared between stages. Reviewed-by: Marek Olšák <[email protected]>
*	i965: Fix INTEL_DEBUG=bat	Caio Marcelo de Oliveira Filho	2019-06-12	4	-25/+26
\| \| \| \| \| \| \| \| \| \| \| \| \|	Use hash_table_u64 instead of hash_table directly, since the former will also handle the special keys (deleted and freed) and allow use of the whole u64 space. Fixes crash in INTEL_DEBUG=bat when using a key with value 0 -- the current value for a freed key. Fixes: b38dab101ca "util/hash_table: Assert that keys are not reserved pointers" Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	util/hash_table: Properly handle the NULL key in hash_table_u64	Caio Marcelo de Oliveira Filho	2019-06-12	2	-5/+37
\| \| \| \| \| \| \| \| \| \|	The hash_table_u64 should support any uint64_t as input. It does special handling for the "deleted" key, storing the data in the table itself; do the same for the "freed" key. Fixes: b38dab101ca "util/hash_table: Assert that keys are not reserved pointers" Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
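The approach described here, keeping data for reserved keys outside the probing table, can be sketched as follows. This is not Mesa's hash_table_u64. The struct, the function names, the fixed capacity and the value chosen for the "deleted" sentinel are all assumptions made for illustration; only the value 0 for the "freed" key is taken from the log above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FREED_KEY   UINT64_C(0) /* slot never used */
#define DELETED_KEY UINT64_C(1) /* tombstone; value assumed for this sketch */
#define CAP 64                  /* power of two; no resizing in this sketch */

struct u64_map {
    uint64_t keys[CAP];         /* zero-initialised: every slot is FREED */
    void *vals[CAP];
    void *freed_val, *deleted_val; /* storage for the two reserved keys */
    bool has_freed, has_deleted;
};

static bool u64_map_insert(struct u64_map *m, uint64_t key, void *val)
{
    /* Reserved keys never enter the probing table. */
    if (key == FREED_KEY) { m->freed_val = val; m->has_freed = true; return true; }
    if (key == DELETED_KEY) { m->deleted_val = val; m->has_deleted = true; return true; }

    size_t tomb = CAP; /* first tombstone seen, if any */
    for (size_t i = 0; i < CAP; i++) {
        size_t s = (size_t)((key + i) & (CAP - 1));
        if (m->keys[s] == key) { m->vals[s] = val; return true; }
        if (m->keys[s] == DELETED_KEY && tomb == CAP) tomb = s;
        if (m->keys[s] == FREED_KEY) {
            if (tomb != CAP) s = tomb; /* reuse a tombstone if we passed one */
            m->keys[s] = key; m->vals[s] = val;
            return true;
        }
    }
    if (tomb != CAP) { m->keys[tomb] = key; m->vals[tomb] = val; return true; }
    return false; /* table full */
}

static void *u64_map_search(const struct u64_map *m, uint64_t key)
{
    if (key == FREED_KEY) return m->has_freed ? m->freed_val : NULL;
    if (key == DELETED_KEY) return m->has_deleted ? m->deleted_val : NULL;
    for (size_t i = 0; i < CAP; i++) {
        size_t s = (size_t)((key + i) & (CAP - 1));
        if (m->keys[s] == key) return m->vals[s];
        if (m->keys[s] == FREED_KEY) return NULL;
    }
    return NULL;
}

static void u64_map_remove(struct u64_map *m, uint64_t key)
{
    if (key == FREED_KEY) { m->has_freed = false; return; }
    if (key == DELETED_KEY) { m->has_deleted = false; return; }
    for (size_t i = 0; i < CAP; i++) {
        size_t s = (size_t)((key + i) & (CAP - 1));
        if (m->keys[s] == key) { m->keys[s] = DELETED_KEY; m->vals[s] = NULL; return; }
        if (m->keys[s] == FREED_KEY) return;
    }
}
```

With this split, a caller can use any uint64_t as a key, including 0, without colliding with the table's internal sentinels.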
*	amd/common: clarify ac_shader_binary::lds_size	Nicolai Hähnle	2019-06-12	1	-1/+1
\| \| \| \|	Reviewed-by: Marek Olšák <[email protected]>
*	amd/common: extract ac_parse_shader_binary_config	Nicolai Hähnle	2019-06-12	2	-34/+47
\| \| \| \|	Reviewed-by: Marek Olšák <[email protected]>
*	u_dynarray: turn util_dynarray_{grow, resize} into element-oriented macros	Nicolai Hähnle	2019-06-12	10	-33/+49
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The main motivation for this change is API ergonomics: most operations on dynarrays are really on elements, not on bytes, so it's weird to have grow and resize as the odd operations out. The secondary motivation is memory safety. Users of the old byte-oriented functions would often multiply a number of elements with the element size, which could overflow, and checking for overflow is tedious. With this change, we only need to implement the overflow checks once. The checks are cheap: since eltsize is a compile-time constant and the functions should be inlined, they only add a single comparison and an unlikely branch. v2: - ensure operations are no-op when allocation fails - in util_dynarray_clone, call resize_bytes with a compile-time constant element size v3: - fix iris, lima, panfrost Reviewed-by: Marek Olšák <[email protected]>
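A minimal sketch of an element-oriented grow with a single, central overflow check, in the spirit of the message above. This is not the u_dynarray implementation: `struct dynarray`, `dynarray_grow_bytes` and the `dynarray_grow` macro are hypothetical, and the sketch reallocates to the exact size instead of keeping spare capacity.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct dynarray {
    void *data;
    size_t size; /* bytes in use */
};

/* Append room for nelts elements of eltsize bytes each.
 * Returns a pointer to the new space, or NULL with the array unchanged. */
static void *dynarray_grow_bytes(struct dynarray *a, size_t nelts, size_t eltsize)
{
    /* The one overflow check: reject if size + nelts * eltsize would wrap. */
    if (eltsize != 0 && nelts > (SIZE_MAX - a->size) / eltsize)
        return NULL;

    size_t newsize = a->size + nelts * eltsize;
    void *p = realloc(a->data, newsize ? newsize : 1);
    if (!p)
        return NULL; /* old allocation is still valid and untouched */

    void *ret = (char *)p + a->size;
    a->data = p;
    a->size = newsize;
    return ret;
}

/* Element-oriented wrapper: callers name a type and a count, never bytes. */
#define dynarray_grow(a, type, nelts) \
    ((type *)dynarray_grow_bytes((a), (nelts), sizeof(type)))
```

Because `sizeof(type)` is a compile-time constant, an inlined version of the check reduces to one comparison against a constant bound, which matches the cost argument made in the message.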
*	u_dynarray: return 0 on realloc failure and ensure no-op	Nicolai Hähnle	2019-06-12	1	-8/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We're not very good at handling out-of-memory conditions in general, but this change at least gives the caller the option of handling it gracefully and without memory leaks. This happens to fix an error in out-of-memory handling in i965, which has the following code in brw_bufmgr.c: node = util_dynarray_grow(vma_list, sizeof(struct vma_bucket_node)); if (unlikely(!node)) return 0ull; Previously, allocation failure for util_dynarray_grow wouldn't actually return NULL when the dynarray was previously non-empty. v2: - make util_dynarray_ensure_cap a no-op on failure, add MUST_CHECK attribute - simplify the new capacity calculation: aside from avoiding a useless loop when newcap is very large, this also avoids an infinite loop when newcap is larger than 1 << 31 Reviewed-by: Marek Olšák <[email protected]>
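The two v2 points above (no-op on failure, loop-free capacity calculation) can be sketched like this. The names `struct dynarray`, `INITIAL_CAP` and `dynarray_ensure_cap` are assumptions for illustration and the code is not copied from u_dynarray.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define INITIAL_CAP 64u /* assumed minimum capacity for this sketch */

struct dynarray {
    void *data;
    size_t size;     /* bytes in use */
    size_t capacity; /* bytes allocated */
};

/* Make sure at least newcap bytes are allocated.
 * On failure, returns false and leaves the array exactly as it was. */
static bool dynarray_ensure_cap(struct dynarray *a, size_t newcap)
{
    if (newcap <= a->capacity)
        return true;

    /* No doubling loop: take the maximum of the minimum capacity,
     * twice the old capacity (saturating), and the requested capacity. */
    size_t doubled = a->capacity > SIZE_MAX / 2 ? SIZE_MAX : a->capacity * 2;
    size_t cap = newcap;
    if (cap < doubled)
        cap = doubled;
    if (cap < INITIAL_CAP)
        cap = INITIAL_CAP;

    void *p = realloc(a->data, cap);
    if (!p)
        return false; /* a->data is still valid; nothing leaked or changed */

    a->data = p;
    a->capacity = cap;
    return true;
}
```

Assigning the result of `realloc` to a temporary first is what keeps the failure path leak-free: the old pointer is only overwritten once the new allocation is known to be good.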
*	freedreno: use util_dynarray_clear instead of util_dynarray_resize(_, 0)	Nicolai Hähnle	2019-06-12	5	-12/+12
\| \| \| \| \| \| \| \| \|	This is more expressive and simplifies a subsequent change. v2: - fix one more call-site after rebase Reviewed-by: Marek Olšák <[email protected]>
*	panfrost/midgard: Differentiate vertex/fragment texture tags	Alyssa Rosenzweig	2019-06-12	3	-4/+15
\| \| \| \|	Signed-off-by: Alyssa Rosenzweig <[email protected]>
*	panfrost/midgard: Assert on unknown texture source	Alyssa Rosenzweig	2019-06-12	1	-5/+2
\| \| \| \|	Signed-off-by: Alyssa Rosenzweig <[email protected]>
*	panfrost/midgard: Set minimal swizzle on texture input	Alyssa Rosenzweig	2019-06-12	1	-1/+2
\| \| \| \|	Signed-off-by: Alyssa Rosenzweig <[email protected]>
*	panfrost/midgard: Lower texture projectors	Alyssa Rosenzweig	2019-06-12	1	-1/+2
\| \| \| \| \| \| \| \| \|	We do have native support for perspective division on the load/store unit, but this is for the future, something ideally we would select generally, not just for textures. Meanwhile, flipping on projector lowering works now. Signed-off-by: Alyssa Rosenzweig <[email protected]>
*	panfrost/midgard: Implement txl	Alyssa Rosenzweig	2019-06-12	2	-8/+11
\| \| \| \| \| \| \|	This follows the txb implementation, but requires an adjustment to how the cont/last flags are set. Signed-off-by: Alyssa Rosenzweig <[email protected]>
*	panfrost/midgard: Implement txb op	Alyssa Rosenzweig	2019-06-12	1	-10/+55
\| \| \| \| \| \|	We refactor the main tex handling to fit a bias argument in as well. Signed-off-by: Alyssa Rosenzweig <[email protected]>
*	panfrost: Unify bind_vs/fs_state	Alyssa Rosenzweig	2019-06-12	1	-57/+49
\| \| \| \| \| \| \| \|	This replaces bind_vs/fs_state calls with a unified bind_shader_state call, removing a great deal of duplicated logic related to variant selection. Signed-off-by: Alyssa Rosenzweig <[email protected]>
*	panfrost: Add panfrost_job_type_for_pipe helper	Alyssa Rosenzweig	2019-06-12	1	-2/+29
\| \| \| \| \| \| \|	This logic is repeated in a bunch of places and will only grow worse as we support more job types; collect it. Signed-off-by: Alyssa Rosenzweig <[email protected]>
*	panfrost/midgard: Extract emit_varying_read	Alyssa Rosenzweig	2019-06-12	1	-28/+37
\| \| \| \| \| \| \|	Paralleling emit_uniform_read, this allows varying reads to be emitted independent of an honest-to-goodness load vary instruction in the NIR. Signed-off-by: Alyssa Rosenzweig <[email protected]>
*	panfrost: Remove "vertex/tiler render target" silliness	Alyssa Rosenzweig	2019-06-12	2	-90/+77
\| \| \| \| \| \| \| \|	I don't think these are actual structures, just figments of cargo-culting dumped memory without making any sense of it. Nothing seems to break if the region is zeroed out, anyway. Signed-off-by: Alyssa Rosenzweig <[email protected]>
*	panfrost/decode: Print line number of bad memory access	Alyssa Rosenzweig	2019-06-12	1	-0/+6
\| \| \| \|	Signed-off-by: Alyssa Rosenzweig <[email protected]>
*	panfrost: Replace pantrace with direct decoding	Alyssa Rosenzweig	2019-06-12	11	-259/+140
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	History lesson! In the early days of Panfrost, we had a library independent of the driver called `panwrap` which would be LD_PRELOAD'ed into a driver to decode its cmdstream in real-time. When upstreaming Panfrost, we realized that we would much rather have this decode functionality maintained in-tree to avoid divergence, but that we could not upstream panwrap because of its use with the legacy API. So we instead dumped GPU memory to the filesystem with an out-of-tree panwrap, and decoded that with the in-tree pandecode module. When we migrated to the new kernel, we just added support for doing this memory dump directly from the driver (via a module "pantrace"). This works, but dumping memory every frame is sloooooooooooooow and error-prone. I figured if we have pandecode in-tree, we might as well link to it directly in the driver, allowing us to decode Panfrost's command streams without dumping memory to the filesystem first. This cleans up the code substantially and improves dumping performance by a HUGE margin. I'm talking "several seconds per frame" to "dumping in real-time" kind of jump. Note to users: this removes the environmental option "PANTRACE_BASE". Instead, for equivalent functionality set "PAN_MESA_DEBUG=trace" and redirect stdout to the file of your choosing. This should make debugging Panfrost much more pleasant. Signed-off-by: Alyssa Rosenzweig <[email protected]>
*	st/mesa: Add rgbx handling for fp formats	Kevin Strasser	2019-06-12	1	-0/+6
\| \| \| \| \| \| \| \| \|	Add missing cases for fp32 and fp16 formats. Fixes: c68334ffc0a9 "st/mesa: add floating point formats in st_new_renderbuffer_fb()" Signed-off-by: Kevin Strasser <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
*	gallium/winsys/kms: Fix dumb buffer bpp	Kevin Strasser	2019-06-12	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	The bpp in the dumb buffer creation request is hardcoded to 32, which is an incorrect assumption as the caller is free to pick any pipe format. Use the bpp supplied to us through util_format_get_blocksizebits(). Fixes: 3b176c441b "gallium: Add a dumb drm/kms winsys backed swrast provider" Signed-off-by: Kevin Strasser <[email protected]> Reviewed-by: Adam Jackson <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
*	util/futex: fix dangling pointer use	Eric Engestrom	2019-06-12	1	-5/+5
\| \| \| \| \| \| \|	Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110901 Fixes: 7dc2f4788288ec9c7ab6 "util: emulate futex on FreeBSD using umtx" Cc: Greg V <[email protected]> Signed-off-by: Eric Engestrom <[email protected]>
*	radv: fix VK_EXT_memory_budget if one heap isn't available	Samuel Pitoiset	2019-06-12	1	-27/+33
\| \| \| \| \| \| \| \| \| \| \|	When the visible VRAM size is equal to the VRAM size only two heaps are exposed. This fixes dEQP-VK.api.info.device.memory_budget. Cc: 19.0 19.1 <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-By: Bas Nieuwenhuizen <[email protected]>
*	radv: fix occlusion queries on VegaM	Samuel Pitoiset	2019-06-12	1	-21/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The number of render backends is 16 but the enabled mask is 0xaaaa. As noticed by Bas, allowing disabled render backends might break the OCCLUSION_QUERY packet. We don't use it yet but keep this in mind. This fixes dEQP-VK.query_pool.* and dEQP-VK.multiview.*. Cc: 19.0 19.1 <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-By: Bas Nieuwenhuizen <[email protected]>
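As a rough illustration of the situation described (16 render backends, enabled mask 0xaaaa), the sketch below accumulates per-backend counters while skipping disabled backends. It is not the radv fix itself; `sum_enabled_backends` and the layout of `counts` (one slot per backend index, disabled ones included) are assumptions.

```c
#include <stdint.h>

/* Sum per-backend counters, visiting only the backends whose bit is set
 * in enabled_mask. counts[] has one slot per backend index. */
static uint64_t sum_enabled_backends(const uint64_t *counts,
                                     unsigned num_backends,
                                     uint32_t enabled_mask)
{
    uint64_t total = 0;

    for (unsigned i = 0; i < num_backends && i < 32; i++) {
        if (enabled_mask & (UINT32_C(1) << i))
            total += counts[i];
    }
    return total;
}
```

With a mask of 0xaaaa only the odd-numbered backends contribute, so code that assumes the first N slots are the live ones reads the wrong data.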
*	anv: do not parse genxml data without INTEL_DEBUG=bat	Lionel Landwerlin	2019-06-12	1	-10/+13
\| \| \| \| \| \| \| \|	This significantly slows down the CTS runs. Signed-off-by: Lionel Landwerlin <[email protected]> Fixes: 32ffd90002b04b ("anv: add support for INTEL_DEBUG=bat") Reviewed-by: Jordan Justen <[email protected]>
*	intel/dump: fix segfault when the app hasn't accessed the device	Lionel Landwerlin	2019-06-12	1	-3/+5
\| \| \| \| \|	Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	iris: Only upload surface state for grid info when needed	Caio Marcelo de Oliveira Filho	2019-06-11	1	-8/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Special care is needed to ensure that when we have two consecutive calls with the same grid size, we only bail in the second one if it either doesn't need the surface state or the surface state was already uploaded. v2: Instead of having a new bool in ice->state to know whether we had a surface, check whether we have state->ref. (Ken) Clean up the logic a little bit by adding 'grid_updated' local. (Ken) Reviewed-by: Sagar Ghuge <[email protected]> [v1] Reviewed-by: Kenneth Graunke <[email protected]>
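The early-out condition described above can be sketched in isolation. This is a simplified model, not the iris code: `struct grid_state`, `update_grid` and the `surface_uploaded` flag (standing in for the "do we have state->ref" check) are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

struct grid_state {
    uint32_t last_grid[3];
    bool valid;            /* last_grid holds a real value */
    bool surface_uploaded; /* stand-in for "state->ref is set" */
};

/* Returns true if any work was done, false if the call could bail early. */
static bool update_grid(struct grid_state *s, const uint32_t grid[3],
                        bool needs_surface)
{
    bool grid_updated =
        !s->valid || memcmp(s->last_grid, grid, sizeof(s->last_grid)) != 0;

    /* Same grid as last time: bail only if no surface state is needed,
     * or if it has already been uploaded for this grid. */
    if (!grid_updated && (!needs_surface || s->surface_uploaded))
        return false;

    if (grid_updated) {
        memcpy(s->last_grid, grid, sizeof(s->last_grid));
        s->valid = true;
        s->surface_uploaded = false; /* old surface state is stale */
    }
    if (needs_surface)
        s->surface_uploaded = true;  /* the upload would happen here */
    return true;
}
```

The case that needs care is the third call in a sequence like "no surface, no surface, surface" with an unchanged grid: the grid did not change, yet the surface state still has to be uploaded once.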
*	iris: Create binding table slot for num_work_groups only when needed	Caio Marcelo de Oliveira Filho	2019-06-11	2	-2/+6
\| \| \| \| \|	Reviewed-by: Sagar Ghuge <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	r300g: implement GLSL disk shader caching	Rui Salvaterra	2019-06-11	2	-1/+40
\| \| \| \| \| \| \|	This implements GLSL disk shader caching for the R300-R500 series of AMD GPUs. Signed-off-by: Rui Salvaterra <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
*	r300g: restore performance after RADEON_FLAG_NO_INTERPROCESS_SHARING was added	Richard Thier	2019-06-11	5	-6/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	v1: Fix skipped slab allocators and the buffer cache. v2: Use only 1 domain for texture allocation v3: Added flag for the create_fence call too Based on Marek v1 and v2 proposed fixes. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=1107812.patch Cc: 19.1 <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
*	radeonsi: don't test SDMA perf if SDMA is disabled/unsupported	Marek Olšák	2019-06-11	1	-0/+3
*	radeonsi: always interpolate PrimID as flat	Marek Olšák	2019-06-11	1	-1/+2
*	radeonsi: move color clamping to si_llvm_export_vs to unify the code	Marek Olšák	2019-06-11	1	-80/+67