mesa.git

	Commit message	Author	Age	Files	Lines
*	radeonsi/compute: Use the HSA abi for non-TGSI compute shaders v3	Tom Stellard	2016-09-16	1	-1/+5
	This patch switches non-TGSI compute shaders over to using the HSA ABI described here:

	https://github.com/RadeonOpenCompute/ROCm-Docs/blob/master/AMDGPU-ABI.md

	The HSA ABI provides a much cleaner interface for compute shaders and allows us to share more code in the compiler with the HSA stack.

	The main changes in this patch are:

	- We now pass the scratch buffer resource into the shader via user sgprs rather than using relocations.
	- Grid/Block sizes are now passed to the shader via the dispatch packet rather than at the beginning of the kernel arguments.

	Typically for HSA, the CP firmware will create the dispatch packet and set up the user sgprs automatically. However, in Mesa we let the driver do this work. The main reason for this is that I haven't researched how to get the CP to do all these things, and I'm not sure if it is supported for all GPUs.

	v2:
	- Add comments explaining why we are setting certain bits of the scratch resource descriptor.

	v3:
	- Use amdgcn-mesa-mesa3d triple instead of amdgcn--mesa3d.

	Reviewed-by: Nicolai Hähnle <[email protected]>
*	radeonsi: reload PS inputs with direct indexing at each use (v2)	Marek Olšák	2016-09-14	2	-6/+30
	The LLVM compiler can CSE interp intrinsics thanks to LLVMReadNoneAttribute.

	26011 shaders in 14651 tests
	Totals:
	SGPRS: 1146340 -> 1132676 (-1.19 %)
	VGPRS: 727371 -> 711730 (-2.15 %)
	Spilled SGPRs: 2218 -> 2078 (-6.31 %)
	Spilled VGPRs: 369 -> 369 (0.00 %)
	Scratch VGPRs: 1344 -> 1344 (0.00 %) dwords per thread
	Code Size: 35841268 -> 36009732 (0.47 %) bytes
	LDS: 767 -> 767 (0.00 %) blocks
	Max Waves: 222559 -> 224779 (1.00 %)
	Wait states: 0 -> 0 (0.00 %)

	v2: don't call load_input for fragment shaders in emit_declaration

	Reviewed-by: Nicolai Hähnle <[email protected]>
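The percentages in the shader statistics above appear to be plain relative changes between the "before" and "after" totals. A minimal sketch of that arithmetic follows; `percent_change` is a hypothetical helper written for illustration and is not part of Mesa or of the tool that produced the report.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: relative change from `before` to `after`, in percent.
 * A negative result means the total went down (e.g. fewer registers used). */
static inline double percent_change(uint64_t before, uint64_t after)
{
	assert(before != 0); /* a relative change is undefined for a zero baseline */
	return ((double)after - (double)before) * 100.0 / (double)before;
}
```

For the SGPRS row, `percent_change(1146340, 1132676)` is roughly -1.19, which matches the figure quoted in the commit message.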
*	gallium/radeon: set new r600_resource fields correctly in other places too	Marek Olšák	2016-09-13	1	-0/+11
	This was missed in:

	    commit 0d2e43fcb1198a6e67c85feadb1ca8c360ddc284
	    Author: Marek Olšák <[email protected]>
	    Date: Thu Aug 18 16:30:00 2016 +0200

	        gallium/radeon: derive buffer placement and flags only at initialization

	Tested-by: Michel Dänzer <[email protected]>
	Reviewed-by: Nicolai Hähnle <[email protected]>
*	radeon: Don't check DCC on pipe buffers	Jan Vesely	2016-09-13	1	-3/+4
	Fixes segfaults in EG compute since:

	commit 21de3be8e62b2b093569a99550e6356ed2f106b4
	radeonsi: fix texture format reinterpretation with DCC

	Signed-off-by: Jan Vesely <[email protected]>
	Reviewed-by: Marek Olšák <[email protected]>
*	winsys/amdgpu: remove amdgpu_cs_lookup_buffer	Nicolai Hähnle	2016-09-12	1	-0/+3
\| \| \| \| \| \| \|	The radeonsi driver doesn't and shouldn't care about the buffer index. Only the virtual addresses matter. Reviewed-by: Marek Olšák <[email protected]>
*	gallium/radeon: page alignment for buffers is unnecessary	Nicolai Hähnle	2016-09-12	1	-4/+1
\| \| \| \| \| \|	In some places (e.g. shader program pointers) we require 256 bytes alignment. Reviewed-by: Marek Olšák <[email protected]>
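To illustrate the 256-byte requirement mentioned in the message above, here is a minimal sketch of rounding a size or offset up to a power-of-two alignment. `SHADER_ALIGNMENT` and `align_up_pot` are hypothetical names chosen for this example; they are not taken from the Mesa source.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constant: the 256-byte alignment the commit message mentions. */
#define SHADER_ALIGNMENT 256u

/* Round `value` up to the next multiple of `alignment`.
 * `alignment` must be a non-zero power of two for the mask trick to work. */
static inline uint64_t align_up_pot(uint64_t value, uint64_t alignment)
{
	assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
	return (value + alignment - 1) & ~(alignment - 1);
}
```

With this helper a 257-byte buffer would be padded to 512 bytes, far less than rounding up to a full page.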
*	gallium: remove PIPE_BIND_TRANSFER_READ/WRITE	Marek Olšák	2016-09-08	1	-1/+1
\| \| \| \| \| \| \| \|	not used in any useful way Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
*	radeonsi: unify si_set_optimal_micro_tile_mode call sites	Marek Olšák	2016-09-08	1	-8/+4
\| \| \| \| \| \| \|	There is nothing special happening in those code blocks. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
*	radeonsi: fix texture reinterpretation after DCC fast clear	Marek Olšák	2016-09-08	1	-12/+20
\| \| \| \| \| \| \| \| \| \|	The problem is that TC-compatible DCC clear codes translate into different clear values when you change the format. I have a new piglit reproducing the issue. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
*	radeonsi: enable DCC fast clear for 128-bit formats	Marek Olšák	2016-09-08	1	-13/+32
\| \| \| \| \|	Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
*	radeonsi: clamp integer clear color values for DCC fast clear	Marek Olšák	2016-09-08	1	-4/+12
\| \| \| \| \| \| \|	It should be possible to get TC-compatible fast clear more often now. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
*	gallium: switch drivers to the slab allocator in src/util	Marek Olšák	2016-09-06	3	-8/+7
*	radeon: move radeon_family/chip_class defintions to common	Dave Airlie	2016-09-06	1	-84/+2
\| \| \| \| \| \| \|	This just moves these to a common header file. Acked-by: Marek Olšák <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
*	radeonsi: move sid.h/r600d_common.h to a common place.	Dave Airlie	2016-09-06	4	-255/+2
\| \| \| \| \| \| \| \| \| \|	Step one to merging radv would be to move some files around. This only adds the include path to r600/radeonsi, because later we want to avoid having to add it to the generic target paths. Acked-by: Marek Olšák <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
*	gallium/radeon: set VPORT_ZMIN/MAX registers correctly	Marek Olšák	2016-09-05	3	-9/+71
\| \| \| \| \| \| \| \| \| \| \| \|	Calculate depth ranges from viewport states and pipe_rasterizer_state::clip_halfz. The evergreend.h change is required to silence a warning. This fixes this recently updated piglit: arb_depth_clamp/depth-clamp-range Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
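The depth-range calculation described above can be sketched as follows. This assumes the usual viewport transform z_window = scale * z_ndc + translate, with z_ndc in [-1, 1] by default and in [0, 1] when clip_halfz is set. `struct viewport_z` and `viewport_depth_range` are hypothetical stand-ins for illustration, not the driver's actual code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the Z components of a viewport state. */
struct viewport_z {
	float scale;
	float translate;
};

/* Compute the window-space depth range covered by the viewport.
 * With clip_halfz the near plane maps to `translate`, otherwise to
 * `translate - scale`; the far plane maps to `translate + scale`. */
static inline void viewport_depth_range(const struct viewport_z *vp,
					bool clip_halfz,
					float *zmin, float *zmax)
{
	assert(vp && zmin && zmax);

	float a = clip_halfz ? vp->translate : vp->translate - vp->scale;
	float b = vp->translate + vp->scale;

	/* A negative scale flips near and far, so order the results. */
	*zmin = a < b ? a : b;
	*zmax = a < b ? b : a;
}
```

Ordering the two endpoints keeps the result valid even when an application sets up a reversed depth range.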
*	gallium/radeon: unify viewport emission code	Marek Olšák	2016-09-05	1	-14/+16
\| \| \| \| \| \|	Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
*	radeonsi: add HUD queries for counting VS/PS/CS partial flushes	Marek Olšák	2016-09-05	3	-0/+27
\| \| \| \| \| \|	Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
*	gallium/radeon: rename the num-cs-flushes query to num-ctx-flushes	Marek Olšák	2016-09-05	2	-5/+5
\| \| \| \| \| \| \|	num-cs-flushes will mean compute shader flushes Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
*	radeonsi: fix texture format reinterpretation with DCC	Marek Olšák	2016-09-05	2	-0/+102
\| \| \| \| \| \| \| \| \| \| \| \|	DCC is limited in how texture formats can be reinterpreted using texture views. If we get a view format that is incompatible with the initial texture format with respect to DCC, disable DCC. There is a new piglit which tests all format combinations. What works and what doesn't was deduced by looking at the piglit failures. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
*	radeonsi: return correct eviction stats for NVX_gpu_memory_info	Marek Olšák	2016-09-05	1	-2/+7
\| \| \| \| \| \|	Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
*	gallium/radeon: also eliminate DCC fast clear in resource_get_handle	Marek Olšák	2016-09-05	1	-2/+3
\| \| \| \| \| \| \|	just do what the comment says Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
*	gallium/radeon: use the current ctx for CMASK elimination in resource_get_handle	Marek Olšák	2016-09-05	1	-6/+11
\| \| \| \| \| \| \|	For coherency with the current context. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
*	gallium/radeon: use the current ctx for DCC decompression in resource_get_handle	Marek Olšák	2016-09-05	1	-3/+3
\| \| \| \| \| \| \|	For coherency with the current context. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
*	gallium/radeon: derive buffer placement and flags only at initialization	Marek Olšák	2016-09-05	3	-43/+63
\| \| \| \| \| \| \| \| \| \|	Invalidated buffers don't have to go through it. Split r600_init_resource into r600_init_resource_fields and r600_alloc_resource. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
*	Introduce .editorconfig	Eric Engestrom	2016-08-31	1	-0/+2
	A few weeks ago, Jose Fonseca suggested [0] we use .editorconfig files to try and enforce the formatting of the code, to which Michel Dänzer suggested [1] we start by importing the existing .dir-locals.el settings. The first draft was discussed in the RFC [2].

	These .editorconfig are a first step, one that has the advantage of requiring little to no intervention from the devs once the settings files are in place, but the settings are very limited. This does have the advantage of applying while the code is being written.
	This doesn't replace the need for more comprehensive formatting tools such as clang-format & clang-tidy, but those reformat the code after the fact.

	[0] https://lists.freedesktop.org/archives/mesa-dev/2016-June/121545.html
	[1] https://lists.freedesktop.org/archives/mesa-dev/2016-June/121639.html
	[2] https://lists.freedesktop.org/archives/mesa-dev/2016-July/123431.html

	Acked-by: Nicolai Hähnle <[email protected]>
	Acked-by: Eric Anholt <[email protected]>
	Signed-off-by: Eric Engestrom <[email protected]>
	Reviewed-by: Jose Fonseca <[email protected]>
*	gallium: add cap to export device pointer size	Jan Vesely	2016-08-29	1	-0/+8
	v2: document the new cap
	v3: fix 80 char limit in screen.rst

	Signed-off-by: Jan Vesely <[email protected]>
	Reviewed-by: Marek Olšák <[email protected]>
	Acked-by: Ilia Mirkin <[email protected]>
*	gallium/radeon: clear dirty_level_mask when discarding CMASK	Marek Olšák	2016-08-29	1	-0/+1
\| \| \| \| \| \|	This fixes: GL45-CTS.texture_barrier.* Tested-by: Michel Dänzer <[email protected]>
*	gallium/radeon: add a driver query for AMDGPU_INFO_NUM_EVICTIONS	Marek Olšák	2016-08-26	3	-2/+8
\| \| \| \| \| \|	If the kernel driver doesn't support it, it returns 0. Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	gallium/radeon: increase priority for shader binaries	Marek Olšák	2016-08-26	1	-1/+1
\| \| \| \|	Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	gallium/radeon: merge USER_SHADER and INTERNAL_SHADER priority flags	Marek Olšák	2016-08-26	1	-2/+1
\| \| \| \| \| \|	there's no reason to separate these Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	radeonsi: don't use allocas for arrays with LLVM 3.8	Marek Olšák	2016-08-25	1	-1/+3
\| \| \| \| \| \|	It crashes. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97413
*	gallium/radeon: unify and simplify checking for an empty gfx IB	Marek Olšák	2016-08-25	1	-10/+19
\| \| \| \| \| \| \|	We can take advantage of the fact that multi_fence does the obvious thing with NULL fences. This fixes unflushed fences that can get stuck due to empty IBs.
*	gallium: add a pipe_context parameter to resource_get_handle	Marek Olšák	2016-08-25	1	-0/+1
\| \| \| \| \| \| \| \|	radeonsi needs to do some operations (DCC decompression) for OpenGL-OpenCL interop and this is the only way to make it coherent with the current context. It can optionally be set to NULL. Reviewed-by: Brian Paul <[email protected]>
*	radeon/vce: set flag based on dual instance enablement	Boyuan Zhang	2016-08-19	1	-2/+4
\| \| \| \| \| \| \|	Set the flag on when dual instance encoding is supported, otherwise set it to off. Signed-off-by: Boyuan Zhang <[email protected]>
*	radeonsi: initialize and finalize the LLVM function pass manager	Marek Olšák	2016-08-18	1	-0/+2
\| \| \| \|	Reviewed-by: Tom Stellard <[email protected]>
*	gallium/radeon: assign the highest priority to scratch; make rings second	Marek Olšák	2016-08-17	1	-2/+4
\| \| \| \| \| \| \|	just FYI, the kernel receives priority/4 Acked-by: Edward O'Callaghan <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
*	gallium/winsys: re-number winsys priority flags	Marek Olšák	2016-08-17	1	-16/+13
\| \| \| \| \| \| \|	free 60..63, move CP_DMA up Acked-by: Edward O'Callaghan <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
*	gallium/radeon: mark shader rings as highest-priority buffers	Marek Olšák	2016-08-17	1	-1/+1
\| \| \| \| \| \| \|	and rename the enum Acked-by: Edward O'Callaghan <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
*	gallium/radeon: set SHADER_RW_BUFFER priority for streamout buffers	Marek Olšák	2016-08-17	1	-2/+2
\| \| \| \| \|	Acked-by: Edward O'Callaghan <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
*	radeonsi: use current context for DCC feedback-loop decompress, fixes Elemental	Marek Olšák	2016-08-17	2	-7/+33
	This is just a workaround. The problem is described in the code.

	Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96541

	v2: say that it's only between the current context and aux_context

	Reviewed-by: Nicolai Hähnle <[email protected]> (v1)
*	gallium/radeon: use unflushed fences for PIPE_QUERY_GPU_FINISHED	Marek Olšák	2016-08-17	1	-2/+2
\| \| \| \|	Reviewed-by: Nicolai Hähnle <[email protected]>
*	gallium/radeon: use lp_build_alloca_undef	Nicolai Hähnle	2016-08-17	1	-13/+4
\| \| \| \| \| \| \| \|	Avoid building all those store 0 / store undef instruction pairs that end up getting removed anyway. Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
*	gallium/radeon: protect against out of bounds temporary array accesses	Nicolai Hähnle	2016-08-17	1	-0/+15
\| \| \| \| \| \| \|	They can lead to VM faults and worse, which goes against the GL robustness promises. Reviewed-by: Marek Olšák <[email protected]>
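One common way to protect against out-of-bounds array indices, as the message above describes, is to force every index into the valid range before the access. The sketch below shows the idea in plain C. It is an assumption about the general technique, not the code from these commits: the real driver helper emits LLVM IR, and `bound_index` is a hypothetical name.

```c
#include <assert.h>

/* Hypothetical helper: force `index` into [0, num - 1] so that an array
 * of `num` elements can never be accessed out of bounds. */
static inline unsigned bound_index(unsigned index, unsigned num)
{
	assert(num > 0);
	unsigned max = num - 1;

	/* Power-of-two sizes can be handled with a cheap bit mask (wraps). */
	if ((num & max) == 0)
		return index & max;

	/* Otherwise clamp to the last valid element. */
	return index <= max ? index : max;
}
```

An out-of-range index then reads or writes some valid element instead of faulting. The result for such an index is arbitrary but harmless, which is what robustness guarantees typically require.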
*	gallium/radeon: add radeon_llvm_bound_index for bounds checking	Nicolai Hähnle	2016-08-17	2	-0/+33
\| \| \| \|	Reviewed-by: Marek Olšák <[email protected]>
*	gallium/radeon: reduce alloca of temporaries based on usagemask	Nicolai Hähnle	2016-08-17	2	-10/+54
\| \| \| \| \| \|	v2: take actual writemasks into account Reviewed-by: Marek Olšák <[email protected]>
*	gallium/radeon: use tgsi_scan_arrays for temp arrays	Nicolai Hähnle	2016-08-17	2	-4/+8
\| \| \| \|	Reviewed-by: Marek Olšák <[email protected]>
*	gallium/radeon: allocate temps array info in radeon_llvm_context_init	Nicolai Hähnle	2016-08-17	2	-33/+44
\| \| \| \| \| \| \| \| \|	Also, prepare for using tgsi_array_info. This also opens the door for properly handling allocation failures, but I'm leaving that for a separate change. Reviewed-by: Marek Olšák <[email protected]>
*	gallium/radeon: always do the full store in store_value_to_array	Nicolai Hähnle	2016-08-17	1	-49/+28
\| \| \| \| \| \| \| \| \|	Doing the write-back of the temporary vector in radeon_llvm_emit_store makes no sense. This also allows us to get rid of get_alloca_for_array. Reviewed-by: Marek Olšák <[email protected]>
*	gallium/radeon: extract common getelementptr logic into get_pointer_into_array	Nicolai Hähnle	2016-08-17	1	-39/+66
\| \| \| \|	Reviewed-by: Marek Olšák <[email protected]>
*	gallium/radeon: pass indirect register info into get_alloca_for_array	Nicolai Hähnle	2016-08-17	1	-5/+6
\| \| \| \| \| \|	To have the same signature as get_array_range. Reviewed-by: Marek Olšák <[email protected]>