mesa.git

	Commit message	Author	Age	Files	Lines
*	radeonsi: don't use allocas for arrays with LLVM 3.8	Marek Olšák	2016-08-25	1	-1/+3
	It crashes.
	Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97413
*	gallium/radeon: unify and simplify checking for an empty gfx IB	Marek Olšák	2016-08-25	1	-10/+19
	We can take advantage of the fact that multi_fence does the obvious thing with NULL fences.
	This fixes unflushed fences that can get stuck due to empty IBs.
*	gallium: add a pipe_context parameter to resource_get_handle	Marek Olšák	2016-08-25	1	-0/+1
	radeonsi needs to do some operations (DCC decompression) for OpenGL-OpenCL interop and this is the only way to make it coherent with the current context.
	It can optionally be set to NULL.
	Reviewed-by: Brian Paul <[email protected]>
*	radeon/vce: set flag based on dual instance enablement	Boyuan Zhang	2016-08-19	1	-2/+4
	Set the flag on when dual instance encoding is supported, otherwise set it to off.
	Signed-off-by: Boyuan Zhang <[email protected]>
*	radeonsi: initialize and finalize the LLVM function pass manager	Marek Olšák	2016-08-18	1	-0/+2
	Reviewed-by: Tom Stellard <[email protected]>
*	gallium/radeon: assign the highest priority to scratch; make rings second	Marek Olšák	2016-08-17	1	-2/+4
	just FYI, the kernel receives priority/4
	Acked-by: Edward O'Callaghan <[email protected]>
	Reviewed-by: Nicolai Hähnle <[email protected]>
*	gallium/winsys: re-number winsys priority flags	Marek Olšák	2016-08-17	1	-16/+13
	free 60..63, move CP_DMA up
	Acked-by: Edward O'Callaghan <[email protected]>
	Reviewed-by: Nicolai Hähnle <[email protected]>
*	gallium/radeon: mark shader rings as highest-priority buffers	Marek Olšák	2016-08-17	1	-1/+1
	and rename the enum
	Acked-by: Edward O'Callaghan <[email protected]>
	Reviewed-by: Nicolai Hähnle <[email protected]>
*	gallium/radeon: set SHADER_RW_BUFFER priority for streamout buffers	Marek Olšák	2016-08-17	1	-2/+2
	Acked-by: Edward O'Callaghan <[email protected]>
	Reviewed-by: Nicolai Hähnle <[email protected]>
*	radeonsi: use current context for DCC feedback-loop decompress, fixes Elemental	Marek Olšák	2016-08-17	2	-7/+33
	This is just a workaround. The problem is described in the code.
	Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96541
	v2: say that it's only between the current context and aux_context
	Reviewed-by: Nicolai Hähnle <[email protected]> (v1)
*	gallium/radeon: use unflushed fences for PIPE_QUERY_GPU_FINISHED	Marek Olšák	2016-08-17	1	-2/+2
	Reviewed-by: Nicolai Hähnle <[email protected]>
*	gallium/radeon: use lp_build_alloca_undef	Nicolai Hähnle	2016-08-17	1	-13/+4
	Avoid building all those store 0 / store undef instruction pairs that end up getting removed anyway.
	Reviewed-by: Roland Scheidegger <[email protected]>
	Reviewed-by: Marek Olšák <[email protected]>
*	gallium/radeon: protect against out of bounds temporary array accesses	Nicolai Hähnle	2016-08-17	1	-0/+15
	They can lead to VM faults and worse, which goes against the GL robustness promises.
	Reviewed-by: Marek Olšák <[email protected]>
*	gallium/radeon: add radeon_llvm_bound_index for bounds checking	Nicolai Hähnle	2016-08-17	2	-0/+33
	Reviewed-by: Marek Olšák <[email protected]>
*	gallium/radeon: reduce alloca of temporaries based on usagemask	Nicolai Hähnle	2016-08-17	2	-10/+54
	v2: take actual writemasks into account
	Reviewed-by: Marek Olšák <[email protected]>
*	gallium/radeon: use tgsi_scan_arrays for temp arrays	Nicolai Hähnle	2016-08-17	2	-4/+8
	Reviewed-by: Marek Olšák <[email protected]>
*	gallium/radeon: allocate temps array info in radeon_llvm_context_init	Nicolai Hähnle	2016-08-17	2	-33/+44
	Also, prepare for using tgsi_array_info.
	This also opens the door for properly handling allocation failures, but I'm leaving that for a separate change.
	Reviewed-by: Marek Olšák <[email protected]>
*	gallium/radeon: always do the full store in store_value_to_array	Nicolai Hähnle	2016-08-17	1	-49/+28
	Doing the write-back of the temporary vector in radeon_llvm_emit_store makes no sense.
	This also allows us to get rid of get_alloca_for_array.
	Reviewed-by: Marek Olšák <[email protected]>
*	gallium/radeon: extract common getelementptr logic into get_pointer_into_array	Nicolai Hähnle	2016-08-17	1	-39/+66
	Reviewed-by: Marek Olšák <[email protected]>
*	gallium/radeon: pass indirect register info into get_alloca_for_array	Nicolai Hähnle	2016-08-17	1	-5/+6
	To have the same signature as get_array_range.
	Reviewed-by: Marek Olšák <[email protected]>
*	gallium/radeon: extract common lookup code into get_temp_array function	Nicolai Hähnle	2016-08-17	1	-33/+40
	Reviewed-by: Marek Olšák <[email protected]>
*	gallium/radeon: clarify the comment on the array alloca heuristic	Nicolai Hähnle	2016-08-17	1	-10/+19
	Reviewed-by: Marek Olšák <[email protected]>
*	gallium/radeon: more descriptive names for LLVM temporaries in debug builds	Nicolai Hähnle	2016-08-17	1	-2/+12
	Reviewed-by: Tom Stellard <[email protected]>
	Reviewed-by: Marek Olšák <[email protected]>
*	gallium/radeon: simplify radeon_llvm_emit_store for direct array addressing	Nicolai Hähnle	2016-08-17	1	-7/+0
	We can use the pointer stored in the temps array directly.
	Reviewed-by: Tom Stellard <[email protected]>
	Reviewed-by: Marek Olšák <[email protected]>
*	gallium/radeon: simplify radeon_llvm_emit_fetch for direct array addressing	Nicolai Hähnle	2016-08-17	1	-5/+0
	We can use the pointer stored in the temps array directly.
	Reviewed-by: Tom Stellard <[email protected]>
	Reviewed-by: Marek Olšák <[email protected]>
*	gallium/radeon: clean up emit_declaration for temporaries	Nicolai Hähnle	2016-08-17	1	-9/+18
	In the alloca'd array case, no longer create redundant and unused allocas for the individual elements; create getelementptrs instead.
	Reviewed-by: Tom Stellard <[email protected]>
	Reviewed-by: Marek Olšák <[email protected]>
*	gallium/radeon: use unflushed fences for deferred flushes (v2)	Marek Olšák	2016-08-10	1	-1/+43
	+23% Bioshock Infinite performance.
	v2:
	- use the new fence_finish interface
	- allow deferred fences with multiple contexts
	- clear the ctx pointer after a deferred flush
	Reviewed-by: Nicolai Hähnle <[email protected]>
*	gallium: add a pipe_context parameter to fence_finish	Marek Olšák	2016-08-10	2	-1/+2
	required by glClientWaitSync (GL 4.5 Core spec) that can optionally flush the context
	Reviewed-by: Rob Clark <[email protected]>
	Reviewed-by: Nicolai Hähnle <[email protected]>
*	gallium/radeon: add HUD queries for mapped VRAM/GTT	Marek Olšák	2016-08-10	2	-0/+12
	mainly for monitoring visible VRAM congestion
	Reviewed-by: Nicolai Hähnle <[email protected]>
*	winsys/amdgpu: track the amount of mapped memory	Marek Olšák	2016-08-10	1	-0/+2
	Reviewed-by: Nicolai Hähnle <[email protected]>
*	gallium/radeon: increase the size of the renderer string	Marek Olšák	2016-08-10	1	-1/+1
	Mine is longer than 64 bytes.
	Reviewed-by: Nicolai Hähnle <[email protected]>
*	gallium/radeon: implement ARB_clear_texture (v3)	Marek Olšák	2016-08-10	1	-0/+67
	Some ideas copied from Jakob Sinclair's implementation, but the color clearing is completely different.
	v2: remove leftover code, disable conditional rendering disable render condition cleanly
	Reviewed-by: Nicolai Hähnle <[email protected]>
*	winsys/amdgpu: query ME/PFP/CE firmware versions	Nicolai Hähnle	2016-08-08	2	-0/+6
	The radeon kernel module doesn't have the firmware query interface, so the corresponding values will remain 0.
	Reviewed-by: Marek Olšák <[email protected]>
*	Revert "gallium/radeon: count contexts"	Marek Olšák	2016-08-06	2	-4/+0
	This reverts commit b403eb338533894ee012a96bf55653996c92ec7c.
	Not needed.
*	gallium/radeon: add cs_get_next_fence winsys callback	Marek Olšák	2016-08-06	1	-0/+7
	Reviewed-by: Nicolai Hähnle <[email protected]>
*	gallium/radeon: count contexts	Marek Olšák	2016-08-06	2	-0/+4
	We don't wanna use unflushed fences when we have multiple contexts.
	Reviewed-by: Nicolai Hähnle <[email protected]>
*	gallium/radeon: count gfx IB flushes	Marek Olšák	2016-08-06	1	-0/+1
	This will be used as a counter for whether fence_finish needs to flush the IB.
	Reviewed-by: Nicolai Hähnle <[email protected]>
*	gallium/radeon: move radeon_winsys::cs_memory_below_limit to drivers	Marek Olšák	2016-08-06	3	-15/+27
	Reviewed-by: Nicolai Hähnle <[email protected]>
*	gallium/radeon: inline radeon_winsys::query_memory_usage	Marek Olšák	2016-08-06	2	-3/+1
	Reviewed-by: Nicolai Hähnle <[email protected]>
*	gallium/radeon/winsyses: expose per-IB used_vram and used_gart to drivers	Marek Olšák	2016-08-06	1	-0/+5
	The following patches will use this.
	Reviewed-by: Nicolai Hähnle <[email protected]>
*	radeonsi: flush if sampler views and images use too much memory	Marek Olšák	2016-08-06	1	-0/+34
	Reviewed-by: Nicolai Hähnle <[email protected]>
*	gallium/radeon: add r600_resource::vram_usage and gart_usage	Marek Olšák	2016-08-06	3	-12/+19
	Reviewed-by: Nicolai Hähnle <[email protected]>
*	gallium/radeon: move last_gfx_fence from radeonsi to common code	Marek Olšák	2016-08-03	2	-0/+2
	Reviewed-by: Nicolai Hähnle <[email protected]>
*	radeonsi: don't set the last parameter component of llvm.AMDGPU.cube	Marek Olšák	2016-08-03	1	-2/+8
	LLVM doesn't use it.
	Reviewed-by: Nicolai Hähnle <[email protected]>
*	radeonsi: use llvm.amdgcn.cube* if available	Marek Olšák	2016-08-03	1	-4/+28
	Reviewed-by: Nicolai Hähnle <[email protected]>
*	radeonsi: use llvm.amdgcn.rsq.f64 if available	Marek Olšák	2016-08-03	1	-1/+2
	Reviewed-by: Nicolai Hähnle <[email protected]>
*	radeonsi: use v_mad_f32 for fma	Marek Olšák	2016-08-03	1	-2/+2
	v_fma_f32 runs at FP64 rate (= slow). Alien Isolation and F1 2015 seem to use fma for all d3d multiply-add instructions, which is silly. This tries to restore performance for those games.
	The main difference between v_mad_f32 and v_fma_f32 is that v_mad doesn't support denormals, which we don't enable anyway, because they are slow too.
	Also, there is code size reduction:
	Totals from affected shaders:
	VGPRS: 109796 -> 109808 (0.01 %)
	Spilled SGPRs: 29995 -> 30022 (0.09 %)
	Spilled VGPRs: 12 -> 13 (8.33 %) <-- it's just one shader going from 12 to 13
	Code Size: 6667596 -> 6476356 (-2.87 %) bytes
	Max Waves: 26931 -> 26899 (-0.12 %)
	I've not actually tested real performance.
	Reviewed-by: Nicolai Hähnle <[email protected]>
*	radeon/llvm: Use alloca instructions for larger arrays [revert a revert]	Marek Olšák	2016-07-26	2	-25/+149
	This reverts commit f84e9d749fbb6da73a60fb70e6725db773c9b8f8.
	Bioshock Infinite no longer hangs.
*	radeonsi: implement buffer_subdata without indirect calls	Marek Olšák	2016-07-23	3	-3/+39
	There is less noise in CPU profile data now.
	Reviewed-by: Nicolai Hähnle <[email protected]>
*	gallium: split transfer_inline_write into buffer and texture callbacks	Marek Olšák	2016-07-23	3	-4/+3
	to reduce the call indirections with u_resource_vtbl.
	The worst call tree you could get was:
	- u_transfer_inline_write_vtbl
	- u_default_transfer_inline_write
	- u_transfer_map_vtbl
	- driver_transfer_map
	- u_transfer_unmap_vtbl
	- driver_transfer_unmap
	That's 6 indirect calls. Some drivers only had 5.
	The goal is to have 1 indirect call for drivers that care. The resource type can be determined statically at most call sites.
	The new interface is:
	pipe_context::buffer_subdata(ctx, resource, usage, offset, size, data)
	pipe_context::texture_subdata(ctx, resource, level, usage, box, data, stride, layer_stride)
	v2: fix whitespace, correct ilo's behavior
	Reviewed-by: Nicolai Hähnle <[email protected]>
	Acked-by: Roland Scheidegger <[email protected]>
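
The last entry above (gallium: split transfer_inline_write into buffer and texture callbacks) describes going from up to 6 indirect calls to 1 for a buffer upload. The following is a minimal sketch of that single-hop shape only, not the Mesa implementation. Every toy_* name, the struct layouts, and the indirect_calls counter are hypothetical stand-ins; only the argument order (ctx, resource, usage, offset, size, data) is taken from the commit message.

```c
#include <string.h>

/* Hypothetical stand-ins for illustration; these are NOT the Gallium structs. */
struct toy_resource {
    unsigned char storage[64];
};

struct toy_context {
    /* The one function pointer the caller goes through for a buffer upload.
     * Argument order mirrors the buffer_subdata signature quoted in the log. */
    void (*buffer_subdata)(struct toy_context *ctx, struct toy_resource *res,
                           unsigned usage, unsigned offset, unsigned size,
                           const void *data);
    unsigned indirect_calls; /* counts function-pointer hops (illustration only) */
};

/* Hypothetical driver callback: copies the data directly, with no further
 * vtable dispatch (no map/unmap callbacks behind it). */
static void toy_driver_buffer_subdata(struct toy_context *ctx,
                                      struct toy_resource *res,
                                      unsigned usage, unsigned offset,
                                      unsigned size, const void *data)
{
    (void)usage; /* unused in this sketch */
    ctx->indirect_calls++;
    /* Reject writes that would run past the end of the toy buffer. */
    if (offset > sizeof(res->storage) || size > sizeof(res->storage) - offset)
        return;
    memcpy(res->storage + offset, data, size);
}

/* Performs one upload and returns how many indirect calls it took,
 * or -1 if the data did not land where expected. */
int toy_upload_demo(void)
{
    struct toy_context ctx = { toy_driver_buffer_subdata, 0 };
    struct toy_resource res;
    const char payload[] = "abcd";

    memset(&res, 0, sizeof(res));
    ctx.buffer_subdata(&ctx, &res, 0, 8, 4, payload);

    if (memcmp(res.storage + 8, "abcd", 4) != 0)
        return -1;
    return (int)ctx.indirect_calls;
}
```

In this sketch the caller already knows it is writing to a buffer, so it picks the buffer callback directly. That corresponds to the commit's remark that the resource type can be determined statically at most call sites.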