mesa.git

	Commit message	Author	Age	Files	Lines
*	r600/ssbo: refactor out buffer coord calcs and use for atomic path.	Dave Airlie	2017-12-05	1	-34/+37
	The atomic RAT path has a bug in the SSBO path. Refactor the address calculations out of the load/store paths and reuse them to fix the bug in the buffer RAT atomic path.
	Signed-off-by: Dave Airlie <[email protected]>
*	r600/ssbo: fix multi-dword buffer loads.	Dave Airlie	2017-12-05	1	-5/+7
	This fixes loading from different channels.
	Signed-off-by: Dave Airlie <[email protected]>
*	r600/ssbo: use r32ui format for ssbo resources.	Dave Airlie	2017-12-05	1	-3/+3
	This works best for returning the correct values and sizes in tests.
	Signed-off-by: Dave Airlie <[email protected]>
*	r600: refactor out the immediate setup code.	Dave Airlie	2017-12-05	1	-38/+28
	This just refactors the same code out of the images/buffers paths.
	Signed-off-by: Dave Airlie <[email protected]>
*	r600/shader: fix ssbo atomic operations formats.	Dave Airlie	2017-12-05	1	-4/+12
	Don't try to use the image format for SSBOs; just use 32-bit uint.
	Signed-off-by: Dave Airlie <[email protected]>
*	r600/shader: fix thread id loading.	Dave Airlie	2017-12-05	1	-9/+18
	This changes how thread ID loading is done; it produces smaller shaders when the thread ID GPRs are not used.
	Signed-off-by: Dave Airlie <[email protected]>
*	gallium/u_upload_mgr: allow drivers to specify pipe_resource::flags	Marek Olšák	2017-12-05	9	-13/+13
	Reviewed-by: Nicolai Hähnle <[email protected]>
*	winsys/amdgpu: add RADEON_FLAG_READ_ONLY	Marek Olšák	2017-12-05	1	-6/+41
	Reviewed-by: Nicolai Hähnle <[email protected]>
*	gallium/radeon: remove RADEON_HEAP_VRAM_GTT	Marek Olšák	2017-12-05	1	-8/+2
	Only winsyses can set VRAM|GTT. Drivers shouldn't if they want to use winsys allocators.
	Reviewed-by: Nicolai Hähnle <[email protected]>
*	gallium/radeon: move setting VRAM|GTT into winsyses	Marek Olšák	2017-12-05	2	-28/+0
	The combined VRAM|GTT heap will be removed.
	Reviewed-by: Nicolai Hähnle <[email protected]>
*	radeonsi: flush the context after resource_copy_region for buffer exports	Marek Olšák	2017-12-05	1	-2/+12
	Cc: 17.2 17.3 <[email protected]>
	Reviewed-by: Nicolai Hähnle <[email protected]>
*	Android: gallium/radeon: fix libmesa_amd_common dependency	Mauro Rossi	2017-12-05	1	-0/+1
	The libmesa_amd_common static dependency is added to the Android build to avoid the following build errors:
	In file included from external/mesa/src/gallium/drivers/radeon/r600_buffer_common.c:24:
	In file included from external/mesa/src/gallium/drivers/radeonsi/si_pipe.h:26:
	external/mesa/src/gallium/drivers/radeonsi/si_shader.h:138:10: fatal error: 'ac_binary.h' file not found
	^~~~~~~~~~~~~
	1 error generated.
	...
	In file included from external/mesa/src/gallium/drivers/radeon/r600_gpu_load.c:34:
	In file included from external/mesa/src/gallium/drivers/radeonsi/si_pipe.h:26:
	external/mesa/src/gallium/drivers/radeonsi/si_shader.h:138:10: fatal error: 'ac_binary.h' file not found
	^~~~~~~~~~~~~
	1 error generated.
	Fixes: 950221f923 ("radeonsi: remove r600_common_screen")
	Reviewed-by: Marek Olšák <[email protected]>
	Reviewed-by: Emil Velikov <[email protected]>
*	r600/atomic: add cayman version of atomic save/restore from GDS (v2)	Dave Airlie	2017-12-05	2	-24/+126
	On Cayman we don't use the append/consume counters (fglrx doesn't) and they don't seem to work well with compute shaders. This just uses GDS instead to do the atomic operations.
	v1.1: remove unused line.
	v2: use EOS on cayman, it appears to work.
	Acked-by: Nicolai Hähnle <[email protected]>
	Signed-off-by: Dave Airlie <[email protected]>
*	r600/atomic: refactor out evergreen atomic setup/save code.	Dave Airlie	2017-12-05	1	-30/+50
	For cayman we want to use different code paths.
	Signed-off-by: Dave Airlie <[email protected]>
*	radeonsi: pass llvm type directly to buffer_load()	Timothy Arceri	2017-12-05	1	-8/+7
	Reviewed-by: Nicolai Hähnle <[email protected]>
*	meson: define driver dependencies	Dylan Baker	2017-12-04	13	-0/+72
	This allows us to encapsulate the compiler and linkage requirements of each driver in a reusable way. The result will be that each target that needs a specific driver can simply add `driver_<name>` to its dependencies line and the necessary libraries and compiler args will be added. This will allow for a lot of code de-duplication between gallium targets.
	Signed-off-by: Dylan Baker <[email protected]>
	Reviewed-by: Eric Engestrom <[email protected]>
*	freedreno: mark stencil buffer valid too in case of z32x24s8	Rob Clark	2017-12-04	2	-2/+4
	The separate stencil buffer was not also getting marked as valid if written by a draw/clear, resulting in gmem2mem getting skipped. Move this into fd_batch_resource_used() which also handles the separate stencil case.
	Also fix restore_buffers typo.
	Fixes: 4ab6ab80365 freedreno: avoid mem2gmem for invalidated buffers
	Signed-off-by: Rob Clark <[email protected]>
*	freedreno: remove use of u_transfer	Rob Clark	2017-12-04	11	-41/+30
	Freedreno doesn't treat buffers and images differently, so its use was kind of pointless.
	Signed-off-by: Rob Clark <[email protected]>
*	freedreno: add -Wno-packed-bitfield-compat for meson build	Eric Engestrom	2017-12-04	1	-2/+12
	Otherwise there is a huge amount of warning spam from instr-a2xx.h. gcc has no way to know that freedreno was never built with such an old gcc version to care about the bugs in old gcc ;-)
	Reported-by: Rob Clark <[email protected]>
	Signed-off-by: Eric Engestrom <[email protected]>
	[added commit message]
	Signed-off-by: Rob Clark <[email protected]>
*	nvc0/ir: Properly lower 64-bit shifts when the shift value is >32	Pierre Moreau	2017-12-04	1	-1/+1
	Fixes: 61d7676df77 "nvc0/ir: add support for 64-bit shift lowering on SM20/SM30"
	Fixes fs-shift-scalar-by-scalar.shader_test from piglit for the current set-up:
	uniform int64_t ival -0x7dfcfefbdf6536ff # bit pattern: 0x82030104209ac901
	uniform uint64_t uval 0x1400000085010203
	uniform int shl 36
	uniform int shr 36
	uniform int64_t iexpected_shl 0x09ac901000000000
	uniform int64_t iexpected_shr -0x7dfcff0 # bit pattern: 0xfffffffff8203010
	uniform uint64_t uexpected_shl 0x5010203000000000
	uniform uint64_t uexpected_shr 0x0000000001400000
	draw rect ortho 12 0 4 4
	Signed-off-by: Pierre Moreau <[email protected]>
	Reviewed-by: Ilia Mirkin <[email protected]>
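For readers unfamiliar with 64-bit shift lowering, the following is a minimal sketch in C of how a 64-bit shift can be built from 32-bit halves, including the case where the shift amount is 32 or more. It is not the nvc0 code generator; the function names `shl64_lowered`, `shr64_lowered` and `sr32` are hypothetical and chosen only for this illustration. The values in the usage note are the ones quoted in the commit message above.

```c
#include <stdint.h>

/* Hypothetical helper: 32-bit right shift by n (0..31), optionally
 * sign-filling, written without relying on signed right shift. */
static uint32_t sr32(uint32_t x, unsigned n, int is_signed)
{
    uint32_t r = x >> n;
    if (is_signed && (x >> 31))
        r |= ~(0xffffffffu >> n);   /* set the vacated high bits */
    return r;
}

/* Hypothetical sketch: 64-bit left shift expressed on 32-bit halves. */
static uint64_t shl64_lowered(uint32_t lo, uint32_t hi, unsigned shift)
{
    uint32_t out_lo, out_hi;

    shift &= 63;
    if (shift == 0) {
        out_lo = lo;
        out_hi = hi;
    } else if (shift < 32) {
        out_lo = lo << shift;
        out_hi = (hi << shift) | (lo >> (32 - shift));
    } else {
        /* Shift of 32 or more: the low half moves into the high half. */
        out_lo = 0;
        out_hi = lo << (shift - 32);
    }
    return ((uint64_t)out_hi << 32) | out_lo;
}

/* Hypothetical sketch: 64-bit right shift (logical or arithmetic). */
static uint64_t shr64_lowered(uint32_t lo, uint32_t hi, unsigned shift,
                              int is_signed)
{
    uint32_t out_lo, out_hi;

    shift &= 63;
    if (shift == 0) {
        out_lo = lo;
        out_hi = hi;
    } else if (shift < 32) {
        out_lo = (lo >> shift) | (hi << (32 - shift));
        out_hi = sr32(hi, shift, is_signed);
    } else {
        /* Shift of 32 or more: the high half moves into the low half,
         * and the new high half is all sign bits or all zeros. */
        out_lo = sr32(hi, shift - 32, is_signed);
        out_hi = (is_signed && (hi >> 31)) ? 0xffffffffu : 0u;
    }
    return ((uint64_t)out_hi << 32) | out_lo;
}
```

With the test's inputs and a shift of 36, `shl64_lowered(0x85010203, 0x14000000, 36)` gives `0x5010203000000000` and `shr64_lowered(0x209ac901, 0x82030104, 36, 1)` gives `0xfffffffff8203010`, matching the expected values listed in the commit message.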
*	st/glsl_to_nir/radeonsi: enable gs support for nir backend	Timothy Arceri	2017-12-04	2	-29/+35
	Reviewed-by: Nicolai Hähnle <[email protected]>
	Reviewed-by: Marek Olšák <[email protected]>
*	ac: add si_nir_load_input_gs() to the abi	Timothy Arceri	2017-12-04	3	-0/+30
	V2: make use of driver_location and don't expose NIR to the ABI.
	Reviewed-by: Nicolai Hähnle <[email protected]>
	Reviewed-by: Marek Olšák <[email protected]>
*	radeonsi: create si_llvm_load_input_gs()	Timothy Arceri	2017-12-04	2	-23/+44
	This creates a common function that can be shared by the tgsi and nir backends.
	v2: use LLVMBuildBitCast() directly
	Reviewed-by: Nicolai Hähnle <[email protected]>
	Reviewed-by: Marek Olšák <[email protected]>
*	radeonsi: pass llvm type to lds_load()	Timothy Arceri	2017-12-04	1	-13/+13
	v2: use LLVMBuildBitCast() directly
	Reviewed-by: Nicolai Hähnle <[email protected]>
	Reviewed-by: Marek Olšák <[email protected]>
*	radeonsi: add llvm_type_is_64bit() helper	Timothy Arceri	2017-12-04	1	-0/+9
	Reviewed-by: Nicolai Hähnle <[email protected]>
	Reviewed-by: Marek Olšák <[email protected]>
*	radeonsi: pass llvm type to si_llvm_emit_fetch_64bit()	Timothy Arceri	2017-12-04	3	-12/+18
	v2: use LLVMBuildBitCast() directly
	Reviewed-by: Nicolai Hähnle <[email protected]>
	Reviewed-by: Marek Olšák <[email protected]>
*	radeonsi: add nir support for gs epilogue	Timothy Arceri	2017-12-04	1	-4/+21
	v2: add emit_gs_epilogue() helper function to reduce duplication.
	Reviewed-by: Nicolai Hähnle <[email protected]>
	Reviewed-by: Marek Olšák <[email protected]>
*	radeonsi: add nir support for es epilogue	Timothy Arceri	2017-12-04	1	-16/+13
	v2: make use of existing si_tgsi_emit_epilogue()
	Reviewed-by: Nicolai Hähnle <[email protected]>
	Reviewed-by: Marek Olšák <[email protected]>
*	radeonsi: add nir support for ls epilogue	Timothy Arceri	2017-12-04	1	-15/+14
	v2: make use of existing si_tgsi_emit_epilogue()
	Reviewed-by: Nicolai Hähnle <[email protected]>
	Reviewed-by: Marek Olšák <[email protected]>
*	st/glsl_to_nir: enable NIR link time opts	Timothy Arceri	2017-12-04	1	-7/+27
	Reviewed-by: Nicolai Hähnle <[email protected]>
*	radeonsi/nir: add support for packed inputs	Timothy Arceri	2017-12-04	1	-21/+25
	Because NIR can create non-vec4 variables when implementing component packing, we need to make sure not to reprocess the same slot again.
	Also we can drop the fs_attr_idx counter and just use driver_location.
	Reviewed-by: Nicolai Hähnle <[email protected]>
*	freedreno/ir3: relax barriers	Rob Clark	2017-12-03	1	-2/+2
	Instructions with no barrier_class can move wrt. an EVERYTHING barrier.
	Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: all mem instructions have WAR hazzard	Rob Clark	2017-12-03	1	-1/+1
	It isn't just load instructions that have a write-after-read hazard.
	Fixes stk gaussian blur compute shaders.
	Signed-off-by: Rob Clark <[email protected]>
*	freedreno: add debug option to force emulated indirect	Rob Clark	2017-12-03	3	-0/+12
	Useful mostly for debugging indirect draw.
	Signed-off-by: Rob Clark <[email protected]>
*	freedreno: also mark draw-indirect buffer as read	Rob Clark	2017-12-03	1	-0/+7
	Signed-off-by: Rob Clark <[email protected]>
*	freedreno: small cleanups	Rob Clark	2017-12-03	1	-17/+8
	Signed-off-by: Rob Clark <[email protected]>
*	freedreno: avoid unneccessary batch flush	Rob Clark	2017-12-03	1	-0/+2
	In some cases we can end up trying to add a write dependency on ourselves, which shouldn't trigger a flush.
	Avoids an extra couple of flushes per frame in stk.
	Signed-off-by: Rob Clark <[email protected]>
*	freedreno: avoid mem2gmem for invalidated buffers	Rob Clark	2017-12-03	3	-2/+17
	Signed-off-by: Rob Clark <[email protected]>
*	freedreno: deferred flush support	Rob Clark	2017-12-03	5	-4/+32
	Signed-off-by: Rob Clark <[email protected]>
*	freedreno: rework fence tracking	Rob Clark	2017-12-03	12	-61/+109
	ctx->last_fence isn't such a terribly clever idea, if batches can be flushed out of order. Instead, each batch now holds a fence, which is created before the batch is flushed (useful for next patch), that later gets populated after the batch is actually flushed.
	Signed-off-by: Rob Clark <[email protected]>
*	freedreno: proper locking for iterating dependent batches	Rob Clark	2017-12-03	2	-8/+20
	In transfer_map(), when we need to flush batches that read from a resource, we should be holding screen->lock to guard against race conditions. Somehow deferred flush seems to make this existing race more obvious.
	Signed-off-by: Rob Clark <[email protected]>
*	freedreno/a5xx: correct max_indicies for indirect draws	Rob Clark	2017-12-03	1	-1/+2
	Signed-off-by: Rob Clark <[email protected]>
*	broadcom/vc4: Use a single-entry cached last_hindex value.	Eric Anholt	2017-12-01	2	-2/+20
	Since almost all BOs will be in one CL at a time, this cache will almost always hit except for the first usage of the BO in each CL.
	This didn't show up as statistically significant on the minetest trace (n=340), but if I lop off the throttled lobe of the bimodal distribution, it very clearly does (0.74731% +/- 0.162093%, n=269).
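The single-entry cache idea described above can be sketched as follows. This is an illustration only, under the assumption that each buffer object remembers the index it last occupied in a command list's handle table; the types `struct bo` and `struct handle_list`, the `scans` counter, and the function `get_hindex` are hypothetical and simplified, not the vc4 driver's actual structures.

```c
#include <stdint.h>

#define MAX_HANDLES 64

/* Hypothetical buffer object: remembers where it was last found. */
struct bo {
    uint32_t handle;
    uint32_t last_hindex;
};

/* Hypothetical per-command-list handle table. */
struct handle_list {
    uint32_t handles[MAX_HANDLES];
    uint32_t count;
    uint32_t scans;   /* number of slow-path linear scans, for illustration */
};

/* Return the index of bo in the list, appending it if it is missing.
 * Returns UINT32_MAX if the table is full. */
static uint32_t get_hindex(struct handle_list *list, struct bo *bo)
{
    uint32_t last = bo->last_hindex;

    /* Fast path: the cached index is still valid for this list. */
    if (last < list->count && list->handles[last] == bo->handle)
        return last;

    /* Slow path: linear search, then remember the result. */
    list->scans++;
    for (uint32_t i = 0; i < list->count; i++) {
        if (list->handles[i] == bo->handle) {
            bo->last_hindex = i;
            return i;
        }
    }

    if (list->count == MAX_HANDLES)
        return UINT32_MAX;

    list->handles[list->count] = bo->handle;
    bo->last_hindex = list->count;
    return list->count++;
}
```

In this sketch the first lookup of a buffer in a given list takes the slow path, and every later lookup in the same list hits the cached index, which is the behaviour the commit message describes.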
*	broadcom/vc4: Decompose single QUADs to a TRIANGLE_FAN.	Eric Anholt	2017-12-01	1	-5/+14
	No significant difference in the minetest replay, but it should reduce overhead by not requiring that we write quad indices to index buffers that we repeatedly re-upload (and making the draw packet smaller, as well).
	Over the course of the series the actual game seems to be up by 1-2 fps.
*	broadcom/vc4: Skip emitting redundant VC4_PACKET_GEM_HANDLES.	Eric Anholt	2017-12-01	3	-3/+12
	Now that there's only one user of it, it's pretty obvious how to avoid emitting redundant ones. This should save a bunch of kernel validation overhead.
	No statistically significant difference on the minetest trace I was looking at (n=169), but the maximum FPS is up by .3%.
*	broadcom/vc4: Simplify the relocation handling for index buffers.	Eric Anholt	2017-12-01	2	-17/+17
	Originally there was CL code for handling various relocations back when I had relocs for the TSDA/TA buffers. Now that the kernel handles those entirely on its own, I can inline that code into the one place using it.
*	broadcom/vc4: Fix handling of GFXH-515 workaround with a start vertex count.	Eric Anholt	2017-12-01	1	-16/+27
	We failed to take the start into account for how many vertices to draw in this round, so we would end up decrementing count below 0, which as an unsigned number meant we would loop until the CLs soon ran out of space.
	When I wrote the code I was thinking about how to use the previously emitted shader state (no index bias baked into the elements) by emitting up to 65535 and then only re-emitting with bias for the second round, but that doesn't work if the start is over 65535. Instead, just delay emitting shader state until we get into the drawarrays GFXH-515 loop and always bake the bias in when we're doing the workaround.
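The unsigned underflow described above can be avoided by never subtracting more than the remaining count. The following is a minimal sketch of such a chunking loop for primitives that do not share vertices between chunks (points, for example). It is an assumption-laden illustration, not the vc4 draw path: `struct chunk` and `split_draw` are hypothetical names, and the per-chunk limit is passed in as a parameter.

```c
#include <stdint.h>

/* Hypothetical description of one sub-draw. */
struct chunk {
    uint32_t start;
    uint32_t count;
};

/* Split a draw of `count` vertices beginning at `start` into sub-draws
 * of at most `max_verts` vertices each. Returns the number of chunks
 * written to `out` (at most `max_chunks`). */
static unsigned split_draw(uint32_t start, uint32_t count, uint32_t max_verts,
                           struct chunk *out, unsigned max_chunks)
{
    unsigned n = 0;

    while (count > 0 && n < max_chunks) {
        /* Clamp to what is left, so the subtraction below cannot wrap. */
        uint32_t this_count = count < max_verts ? count : max_verts;

        out[n].start = start;   /* each chunk carries its own start */
        out[n].count = this_count;
        n++;

        start += this_count;
        count -= this_count;
    }
    return n;
}
```

For a draw of 100000 vertices starting at vertex 70000 with a limit of 65535, this produces two chunks: 65535 vertices from 70000, then 34465 vertices from 135535. The start of every chunk is carried explicitly, which corresponds to the commit's choice to always bake the bias in while the workaround is active.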
*	broadcom/vc4: Fix the scaling factor for the GFXH-515 workaround.	Eric Anholt	2017-12-01	1	-1/+1
	For triangle strips, we step by max_verts - 2.
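Why the step is max_verts - 2 can be checked with a small counting sketch: a strip of N vertices contains N - 2 triangles, and consecutive chunks must share two vertices so that the triangles spanning a chunk boundary are still drawn. The function below is a hypothetical illustration (`strip_triangles_chunked` is not a driver function) that counts the triangles produced when a strip is split this way. It assumes max_verts is at least 3 and ignores winding order.

```c
/* Count the triangles drawn when a triangle strip of `count` vertices
 * is split into chunks of at most `max_verts` vertices, advancing by
 * max_verts - 2 so adjacent chunks overlap by two vertices.
 * Assumes max_verts >= 3. */
static unsigned strip_triangles_chunked(unsigned count, unsigned max_verts)
{
    unsigned step = max_verts - 2;
    unsigned remaining = count;
    unsigned tris = 0;

    while (remaining >= 3) {
        unsigned this_count = remaining < max_verts ? remaining : max_verts;

        tris += this_count - 2;       /* triangles in this chunk */
        if (this_count == remaining)
            break;                    /* last chunk */
        remaining -= step;            /* keep two vertices for the next chunk */
    }
    return tris;
}
```

With a limit of 5 vertices per chunk, a 10-vertex strip is drawn as chunks of 5, 5 and 4 vertices, giving 3 + 3 + 2 = 8 triangles, the same as the unsplit strip.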
*	meson: use dep_thread instead of dependency('threads') in freedreno	Dylan Baker	2017-12-01	1	-1/+1
	They are the same thing, but this is more consistent with the rest of the project.
	Signed-off-by: Dylan Baker <[email protected]>
	Reviewed-by: Eric Engestrom <[email protected]>
*	meson: Add lmsensors support	Dylan Baker	2017-12-01	4	-3/+6
	v2:
	- Make -Dlmsensors=false work
	- Simplify auto and true cases
	Signed-off-by: Dylan Baker <[email protected]>
	Reviewed-by: Eric Engestrom <[email protected]>