mesa.git

	Commit message	Author	Age	Files	Lines
*	radeonsi/gfx10: add a workaround for stencil HTILE with mipmapping	Marek Olšák	2019-07-03	6	-12/+28
	Acked-by: Bas Nieuwenhuizen <[email protected]>
*	radeonsi/gfx10: disable DCC with MSAA	Marek Olšák	2019-07-03	1	-1/+6
	It was only enabled for 2x MSAA anyway. Acked-by: Bas Nieuwenhuizen <[email protected]>
*	radeonsi/gfx10: fix GL_LINE polygon mode for decomposed primitives	Marek Olšák	2019-07-03	5	-3/+24
	We need to tell PA to accept edge flags generated by the input assembler, because decomposed primitives shouldn't draw inner edges. Acked-by: Bas Nieuwenhuizen <[email protected]>
*	radeonsi/gfx10: fix NGG GS color clamping	Marek Olšák	2019-07-03	1	-0/+4
	Just need to pass the input from ES to GS. Everything else is done. Acked-by: Bas Nieuwenhuizen <[email protected]>
*	radeonsi/gfx10: fix vertex color clamping for TES	Marek Olšák	2019-07-03	1	-5/+18
	Acked-by: Bas Nieuwenhuizen <[email protected]>
*	radeonsi/gfx10: unbind NGG shaders when destroyed	Marek Olšák	2019-07-03	1	-0/+9
	This fixes glsl-max-varyings, which creates shaders, draws, and then destroys them. Acked-by: Bas Nieuwenhuizen <[email protected]>
*	radeonsi/gfx10: don't use the GS workaround for triangle strips w/ adjacency	Marek Olšák	2019-07-03	1	-1/+1
	Acked-by: Bas Nieuwenhuizen <[email protected]>
*	radeonsi/gfx10: don't do the query buffer atomic for blit shaders	Marek Olšák	2019-07-03	1	-23/+26
	Acked-by: Bas Nieuwenhuizen <[email protected]>
*	radeonsi/gfx10: update spi_map if API VS (as NGG) changes and PS doesn't	Marek Olšák	2019-07-03	1	-1/+3
	Acked-by: Bas Nieuwenhuizen <[email protected]>
*	radeonsi/gfx10: fix a possible hang with exp pos0 with done=0 and exec=0	Marek Olšák	2019-07-03	1	-0/+8
	Acked-by: Bas Nieuwenhuizen <[email protected]>
*	radeonsi/gfx10: prefetch HW GS when NGG is used	Marek Olšák	2019-07-03	1	-2/+2
	Acked-by: Bas Nieuwenhuizen <[email protected]>
*	amd/common/gfx10: set DLC for llvm.amdgcn.s.buffer.load	Nicolai Hähnle	2019-07-03	1	-3/+1
	Acked-by: Bas Nieuwenhuizen <[email protected]>
*	radeonsi/gfx10: fix PS exports for SPI_SHADER_32_AR	Marek Olšák	2019-07-03	1	-3/+9
	Acked-by: Bas Nieuwenhuizen <[email protected]>
*	radeonsi/gfx10: set DLC for loads when GLC is set	Marek Olšák	2019-07-03	4	-17/+34
	This fixes L1 shader array cache coherency. Acked-by: Bas Nieuwenhuizen <[email protected]>
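The rule in the commit title above (a load that sets GLC also sets DLC on gfx10) can be sketched as a small flag transformation. This is an illustrative sketch only: the enum values, the `is_gfx10` parameter, and the function name `load_cache_policy` are hypothetical and are not taken from the Mesa sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical cache-policy bits; the real driver defines its own. */
enum cache_policy {
    CACHE_GLC = 1u << 0, /* globally coherent */
    CACHE_SLC = 1u << 1, /* system-level coherent */
    CACHE_DLC = 1u << 2  /* device-level coherent (gfx10 only) */
};

/* Return the policy to use for a load. On gfx10, a load that asks for
 * GLC also gets DLC; all other policies pass through unchanged. */
uint32_t load_cache_policy(uint32_t policy, bool is_gfx10)
{
    /* Reject bits this sketch does not know about. */
    assert((policy & ~(uint32_t)(CACHE_GLC | CACHE_SLC | CACHE_DLC)) == 0);

    if (is_gfx10 && (policy & CACHE_GLC))
        policy |= CACHE_DLC;
    return policy;
}
```

Under these assumptions, `load_cache_policy(CACHE_GLC, true)` yields `CACHE_GLC | CACHE_DLC`, while the same call with `is_gfx10 == false` leaves the policy untouched.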
*	radeonsi/gfx10: fix shader images	Marek Olšák	2019-07-03	1	-2/+3
	Don't promote 2D image instructions to 3D, and don't set z=BASE_ARRAY. Acked-by: Bas Nieuwenhuizen <[email protected]>
*	radeonsi/gfx10: set the DCC constant encoding flag	Marek Olšák	2019-07-03	1	-1/+2
	Acked-by: Bas Nieuwenhuizen <[email protected]>
*	radeonsi/gfx10: fix intensity formats	Marek Olšák	2019-07-03	5	-14/+23
	Move the ALPHA_IS_ON_MSB fixup into vi_alpha_is_on_msb. Acked-by: Bas Nieuwenhuizen <[email protected]>
*	radeonsi/gfx10: allocate GDS BOs for streamout	Marek Olšák	2019-07-03	4	-10/+40
	Acked-by: Bas Nieuwenhuizen <[email protected]>
*	radeonsi/gfx10: make sure GDS is idle between IBs	Marek Olšák	2019-07-03	2	-9/+28
	Acked-by: Bas Nieuwenhuizen <[email protected]>
*	radeonsi/gfx10: implement streamout	Nicolai Hähnle	2019-07-03	4	-33/+618
	Acked-by: Bas Nieuwenhuizen <[email protected]>
*	radeonsi/gfx10: implement streamout-related queries	Nicolai Hähnle	2019-07-03	12	-3/+903
	The NGG hardware pipeline doesn't track these statistics automatically, and in fact cannot track them automatically when API geometry shaders are involved, so we accumulate statistics in the shader using atomic adds. This implementation accumulates statistics via the memory system and the RW buffer descriptor setup. We could use GDS, but since these atomics aren't latency-sensitive, that basically just trades off L2$ bandwidth vs. export bus bandwidth. One single memory transaction per shader workgroup doesn't seem too bad. The result ring buffer in memory is needed either way to avoid pipeline stalls. The shader code contains the atomic unconditionally, though the GFX10_GS_QUERY_BUF is a null buffer when no queries are active. The atomic is simply discarded by the shader hardware in that case. Acked-by: Bas Nieuwenhuizen <[email protected]>
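The accumulation scheme described in the commit message above can be modelled on the CPU with C11 atomics. This is a minimal sketch, not the shader code from the commit: `struct query_buf`, its fields, and `accumulate_workgroup_stats` are hypothetical names, and the explicit NULL check only stands in for the hardware discarding an atomic that targets a null buffer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout of the query result buffer. */
struct query_buf {
    _Atomic uint64_t generated_prims; /* primitives generated */
    _Atomic uint64_t emitted_prims;   /* primitives written by streamout */
};

/* Called once per workgroup with that workgroup's totals, so each
 * workgroup performs a single accumulation into the shared buffer. */
void accumulate_workgroup_stats(struct query_buf *buf,
                                uint64_t generated, uint64_t emitted)
{
    /* A workgroup cannot write more primitives than it generated. */
    assert(emitted <= generated);

    /* Models the null-buffer case: when no query is active the
     * hardware drops the atomic, so nothing is accumulated here. */
    if (buf == NULL)
        return;

    /* Relaxed ordering is enough: only the final sums matter. */
    atomic_fetch_add_explicit(&buf->generated_prims, generated,
                              memory_order_relaxed);
    atomic_fetch_add_explicit(&buf->emitted_prims, emitted,
                              memory_order_relaxed);
}
```

Summing per workgroup instead of per primitive is what keeps the cost at roughly one memory transaction per workgroup, as the commit message notes.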
*	radeonsi/gfx10: enable the workaround for unaligned vertex fetch	Nicolai Hähnle	2019-07-03	1	-1/+3
	Yes, really. Note that non-format buffer loads are unaffected and work just fine with unaligned pointers (as long as SH_MEM_CONFIG is set up correctly, which amdgpu ensures). Fixes e.g. KHR-GL45.vertex_attrib_64bit.vao Acked-by: Bas Nieuwenhuizen <[email protected]>
*	radeonsi/gfx10: re-order the initialization order in si_compile_tgsi_main	Nicolai Hähnle	2019-07-03	1	-32/+32
	It's useful to be able to access gs_ngg_scratch before creating the main wrapping branch. Acked-by: Bas Nieuwenhuizen <[email protected]>
*	radeonsi/gfx10: apply DCC MSAA blend workaround	Nicolai Hähnle	2019-07-03	1	-3/+1
	Acked-by: Bas Nieuwenhuizen <[email protected]>
*	radeonsi/gfx10: implement si_emit_global_shader_pointers	Nicolai Hähnle	2019-07-03	1	-1/+12
	Acked-by: Bas Nieuwenhuizen <[email protected]>
*	radeonsi/gfx10: implement si_init_tess_factor_ring	Nicolai Hähnle	2019-07-03	1	-1/+4
	Acked-by: Bas Nieuwenhuizen <[email protected]>
*	radeonsi/gfx10: initialize EXEC for TES-as-NGG (without geometry shader)	Nicolai Hähnle	2019-07-03	1	-1/+3
	Acked-by: Bas Nieuwenhuizen <[email protected]>
*	radeonsi/gfx10: use correct VGPR for instance ID in LS shader	Nicolai Hähnle	2019-07-03	1	-2/+7
	Acked-by: Bas Nieuwenhuizen <[email protected]>
*	radeonsi/gfx10: implement si_shader_hs	Nicolai Hähnle	2019-07-03	1	-7/+26
	Acked-by: Bas Nieuwenhuizen <[email protected]>
*	radeonsi/gfx10: implement si_create_sampler_state	Nicolai Hähnle	2019-07-03	1	-5/+10
	Acked-by: Bas Nieuwenhuizen <[email protected]>
*	radeonsi/gfx10: double the number of tessellation offchip buffers per SE	Nicolai Hähnle	2019-07-03	1	-2/+4
	Each gfx10 shader engine corresponds to two gfx9 shader engines, so scale the number of offchip buffers accordingly. Acked-by: Bas Nieuwenhuizen <[email protected]>
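The scaling described above amounts to doubling the per-SE buffer count on gfx10 before multiplying by the number of shader engines. The sketch below is hypothetical: the function name, its parameters, and any base count passed in are illustrative and not the values used by the driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Total number of tessellation offchip buffers. buffers_per_se is the
 * per-shader-engine count used on gfx9 and earlier (caller-supplied,
 * hypothetical). One gfx10 SE corresponds to two gfx9 SEs, so the
 * per-SE count is doubled there. */
unsigned max_offchip_buffers(unsigned buffers_per_se, unsigned num_se,
                             bool is_gfx10)
{
    assert(num_se > 0);

    if (is_gfx10)
        buffers_per_se *= 2;
    return buffers_per_se * num_se;
}
```

With an illustrative base of 64 buffers per SE, a two-SE gfx10 part would get 64 * 2 * 2 = 256 buffers, the same total as a four-SE gfx9 part.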
*	radeonsi/gfx10: implement get_tess_ring_descriptor	Nicolai Hähnle	2019-07-03	1	-7/+14
	Acked-by: Bas Nieuwenhuizen <[email protected]>
*	radeonsi/gfx10: mask DCC tile swizzle by alignment	Nicolai Hähnle	2019-07-03	2	-2/+7
	DCC alignment can be less than the alignment of the main surface. In that case, the DCC tile swizzle needs to be masked accordingly. Should have no impact on pre-gfx10. Acked-by: Bas Nieuwenhuizen <[email protected]>
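Masking the swizzle by the DCC alignment, as described above, can be sketched as follows. The assumptions here are not confirmed by the listing: the alignment is taken to be a power of two, and the swizzle is shifted left by 8 bits to line up with a 256-byte address granularity. `masked_dcc_swizzle` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

/* Return the tile-swizzle bits that may be OR'ed into a DCC address.
 * Bits at or above the DCC alignment are dropped, because they would
 * move the address out of the aligned DCC block. */
uint64_t masked_dcc_swizzle(uint32_t tile_swizzle, uint64_t dcc_alignment)
{
    /* The alignment must be a non-zero power of two for the mask. */
    assert(dcc_alignment != 0 &&
           (dcc_alignment & (dcc_alignment - 1)) == 0);

    /* Assumed: the swizzle occupies address bits 8 and up. */
    uint64_t swizzle = (uint64_t)tile_swizzle << 8;
    return swizzle & (dcc_alignment - 1);
}
```

When the DCC alignment is at least as large as the main surface alignment, the mask keeps every swizzle bit, which is consistent with the commit's note that pre-gfx10 behavior should not change.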
*	radeonsi/gfx10: implement hardware MSAA resolve	Nicolai Hähnle	2019-07-03	5	-2/+17
	MSAA is only supported for 64KB_{R,Z}_X modes, so the micro tile optimization that we use on gfx9 and earlier does not work. Be very explicit about how the swizzle mode of the temporary surface is selected. Acked-by: Bas Nieuwenhuizen <[email protected]>
*	radeonsi/gfx10: fix binding on si_update_scratch_relocs	Nicolai Hähnle	2019-07-03	1	-3/+7
	Acked-by: Bas Nieuwenhuizen <[email protected]>
*	radeonsi/gfx10: set llvm_has_working_vgpr_indexing	Nicolai Hähnle	2019-07-03	1	-3/+2
	Acked-by: Bas Nieuwenhuizen <[email protected]>
*	radeonsi/gfx10: implement load_const_buffer_desc_fast_path	Nicolai Hähnle	2019-07-03	1	-7/+14
	Acked-by: Bas Nieuwenhuizen <[email protected]>
*	radeonsi/gfx10: take PRIMID from the correct output when exported by GS	Nicolai Hähnle	2019-07-03	1	-2/+2
	Acked-by: Bas Nieuwenhuizen <[email protected]>
*	radeonsi/gfx10: change location of instance ID shader input	Nicolai Hähnle	2019-07-03	1	-2/+11
	Acked-by: Bas Nieuwenhuizen <[email protected]>
*	radeonsi/gfx10: set USER_DATA_ADDR offset for geometry shaders	Nicolai Hähnle	2019-07-03	1	-2/+8
	Acked-by: Bas Nieuwenhuizen <[email protected]>
*	radeonsi/gfx10: implement si_emit_derived_tess_state	Nicolai Hähnle	2019-07-03	1	-2/+6
	Acked-by: Bas Nieuwenhuizen <[email protected]>
*	radeonsi/gfx10: implement si_shader_gs	Nicolai Hähnle	2019-07-03	1	-15/+29
	This is only used in the legacy, non-NGG path. Acked-by: Bas Nieuwenhuizen <[email protected]>
*	radeonsi/gfx10: implement preload_ring_buffers	Nicolai Hähnle	2019-07-03	1	-11/+20
	Acked-by: Bas Nieuwenhuizen <[email protected]>
*	radeonsi/gfx10: implement si_set_ring_buffer	Nicolai Hähnle	2019-07-03	1	-2/+9
	Acked-by: Bas Nieuwenhuizen <[email protected]>
*	radeonsi/gfx10: allow rectangle outputs from NGG primitive shader	Nicolai Hähnle	2019-07-03	1	-0/+1
	Acked-by: Bas Nieuwenhuizen <[email protected]>
*	radeonsi/gfx10: emit VGT_GS_OUT_PRIM_TYPE from draw and add it to VS_STATE	Nicolai Hähnle	2019-07-03	5	-48/+52
	With NGG, the VGT_GS_OUT_PRIM_TYPE can change without a shader change. The VS_STATE is required for both streamout and culling from a vertex shader without pre-compiling outprim-specific variants. We could consider compiling specialized variants in the future. We could also consider compiling the NGG logic as an epilog. Acked-by: Bas Nieuwenhuizen <[email protected]>
*	radeonsi/gfx10: NGG geometry shader PM4 and upload	Nicolai Hähnle	2019-07-03	5	-29/+316
	Acked-by: Bas Nieuwenhuizen <[email protected]>
*	radeonsi/gfx10: generate geometry shaders for NGG	Nicolai Hähnle	2019-07-03	4	-4/+439
	Acked-by: Bas Nieuwenhuizen <[email protected]>
*	radeonsi/gfx10: use the correct register for image descriptor dumping	Nicolai Hähnle	2019-07-03	1	-3/+5
	Acked-by: Bas Nieuwenhuizen <[email protected]>
*	radeonsi/gfx10: emit GE_CNTL instead of IA_MULTI_VGT_PARAM for legacy mode	Nicolai Hähnle	2019-07-03	1	-7/+60
	Acked-by: Bas Nieuwenhuizen <[email protected]>