mesa.git

	Commit message	Author	Age	Files	Lines
*	swr/rast: add memory api to SwrGetInterface()	Tim Rowley	2017-04-28	6	-28/+54
\| \| \| \|	Reviewed-by: Bruce Cherniak <[email protected]>
*	swr/rast: use gather instruction for odd format fetch	Tim Rowley	2017-04-28	1	-46/+9
\| \| \| \| \| \| \|	Small fetch performance optimization - use gather instruction for odd format fetch instead of slow emulated code. Reviewed-by: Bruce Cherniak <[email protected]>
*	swr/rast: enable SIMD16 8x2 tile backend	Tim Rowley	2017-04-28	1	-1/+1
\| \| \| \|	Reviewed-by: Bruce Cherniak <[email protected]>
*	swr/rast: add SwrInit() to init backend/memory tables	Tim Rowley	2017-04-28	5	-22/+26
\| \| \| \|	Reviewed-by: Bruce Cherniak <[email protected]>
*	swr/rast: increment depth/stencil tile pointer in SIMD16 BE	Tim Rowley	2017-04-28	1	-1/+1
\| \| \| \| \| \| \|	A misplaced #endif prevented the depth and stencil hot tile pointers from incrementing in the SIMD16 8x2 configuration of BackendPixelRate. Reviewed-by: Bruce Cherniak <[email protected]>
*	swr/rast: add SwrGetInterface() function to return api	Tim Rowley	2017-04-28	3	-44/+151
\| \| \| \|	Reviewed-by: Bruce Cherniak <[email protected]>
*	swr/rast: enable per-warp scratch space for CS	Tim Rowley	2017-04-28	8	-8/+33
\| \| \| \|	Reviewed-by: Bruce Cherniak <[email protected]>
*	swr/rast: reduce simd{16}vertex stack for VS output	Tim Rowley	2017-04-28	2	-16/+54
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Frontend: reduce simdvertex/simd16vertex stack usage for VS output in ProcessDraw; fixes a stack overflow in some of the deeper call stacks under SIMD16. (1) Move the vertex store out of PA_FACTORY and off the stack. (2) Allocate the vertex store from the aligned heap (the pointer is temporarily stored in TLS, but will be migrated to the thread pool along with other frontend temporary buffers). (3) Grow the vertex store as necessary for the number of verts per primitive, in chunks of 8/4 simdvertex/simd16vertex. Reviewed-by: Bruce Cherniak <[email protected]>
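The chunked growth described in the commit above can be sketched roughly as follows. This is only an illustration of the idea, not the swr code: `fake_vertex`, `vertex_store`, `vertex_store_reserve`, the 64-byte alignment, and the use of C11 `aligned_alloc` are all assumptions made for the example; only the chunk size of 8 comes from the commit message.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define STORE_ALIGN 64 /* assumed alignment, not taken from swr */
#define STORE_CHUNK 8  /* grow in chunks of 8 elements, as for simdvertex */

/* Hypothetical stand-in for a simdvertex: 16 floats = 64 bytes. */
typedef struct { float v[16]; } fake_vertex;

/* Hypothetical heap-backed store that replaces a large stack array. */
typedef struct {
    fake_vertex *data;
    size_t capacity; /* in elements, always a multiple of STORE_CHUNK */
} vertex_store;

/* Round an element count up to the next multiple of STORE_CHUNK. */
static size_t round_up_to_chunk(size_t n)
{
    return (n + STORE_CHUNK - 1) / STORE_CHUNK * STORE_CHUNK;
}

/* Ensure room for `needed` elements; returns 1 on success, 0 on failure. */
static int vertex_store_reserve(vertex_store *s, size_t needed)
{
    /* aligned_alloc requires the size to be a multiple of the alignment. */
    assert(sizeof(fake_vertex) % STORE_ALIGN == 0);

    if (needed <= s->capacity)
        return 1;

    size_t new_cap = round_up_to_chunk(needed);
    fake_vertex *p = aligned_alloc(STORE_ALIGN, new_cap * sizeof(fake_vertex));
    if (!p)
        return 0;

    if (s->data) {
        /* Preserve existing contents before releasing the old block. */
        memcpy(p, s->data, s->capacity * sizeof(fake_vertex));
        free(s->data);
    }
    s->data = p;
    s->capacity = new_cap;
    return 1;
}
```

A caller would keep one zero-initialized `vertex_store` per thread and call `vertex_store_reserve` with the number of vertices the current primitive type needs before writing VS output into it.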
*	swr/rast: remove default argument from SwrSync()	Tim Rowley	2017-04-28	1	-1/+1
\| \| \| \|	Reviewed-by: Bruce Cherniak <[email protected]>
*	swr/rast: remove unused variables in the SIMD16 FE	Tim Rowley	2017-04-28	3	-14/+2
\| \| \| \|	Reviewed-by: Bruce Cherniak <[email protected]>
*	swr/rast: move construction of const above goto	Tim Rowley	2017-04-28	1	-2/+2
\| \| \| \| \| \|	Fixes gcc error for SIMD16 FE. Reviewed-by: Bruce Cherniak <[email protected]>
*	swr/rast: name threads to aid debugging	Tim Rowley	2017-04-28	4	-2/+126
\| \| \| \|	Reviewed-by: Bruce Cherniak <[email protected]>
*	swr/rast: disable buffer overrun warning for Assemble()	Tim Rowley	2017-04-28	1	-2/+4
\| \| \| \| \| \| \| \|	Disabling buffer overrun warning for Assemble(uint32_t slot, simdvector *verts) due to what looks like a MSVC compiler bug when compiling the SIMD16 FE. Reviewed-by: Bruce Cherniak <[email protected]>
*	swr/rast: clean up clipper comments	Tim Rowley	2017-04-28	1	-2/+2
\| \| \| \|	Reviewed-by: Bruce Cherniak <[email protected]>
*	swr/rast: add SIMDAPI decorators in binner/clipper	Tim Rowley	2017-04-28	2	-6/+6
\| \| \| \| \| \|	Fixes MSVC errors with SIMD16 FE. Reviewed-by: Bruce Cherniak <[email protected]>
*	swr/rast: add additional jit utility functions	Tim Rowley	2017-04-28	4	-1/+76
\| \| \| \| \| \|	Not used yet. Reviewed-by: Bruce Cherniak <[email protected]>
*	swr/rast: more flexible max attribute slots	Tim Rowley	2017-04-28	7	-27/+30
\| \| \| \| \| \| \| \| \| \| \|	Ability to allocate space for an arbitrary number (at compile time) of positions in the vertex layout. Removes KNOB_NUM_ATTRIBUTES from knobs.h, replaces the VTX slot number #defines with the SWR_VTX_SLOTS enum (which contains replacement for NUM_ATTRIBUTES: SWR_VTX_NUM_SLOTS) Reviewed-by: Bruce Cherniak <[email protected]>
*	i965: Drop BRW_NEW_CONTEXT from 3DSTATE_DS/GS on Gen7-7.5.	Kenneth Graunke	2017-04-28	2	-2/+0
\| \| \| \| \| \| \|	We already have BRW_NEW_BATCH, which completely covers all the cases that BRW_NEW_CONTEXT would handle. Drop it. Reviewed-by: Topi Pohjolainen <[email protected]>
*	i965: Drop _NEW_TRANSFORM from 3DSTATE_DS/GS on Gen7-7.5.	Kenneth Graunke	2017-04-28	2	-2/+2
\| \| \| \| \| \|	There's no reason for this as far as I can tell. Reviewed-by: Topi Pohjolainen <[email protected]>
*	i965: Set point rasterization rule to UPPER_RIGHT on Gen6-7.5.	Kenneth Graunke	2017-04-28	2	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Gen4-5 and Gen8+ already set this, but Gen6-7.5 did not. We ought to be consistent - the answer depends on the API, not the hardware generation. The Sandybridge PRM says about RASTRULE_UPPER_RIGHT: "To match OpenGL point rasterization rules (round to +infinity, where this is the upper right direction wrt OpenGL screen origin of lower left)." So this is likely the one we should use. Reviewed-by: Rafael Antognolli <[email protected]>
*	i965: Always set AALINEDISTANCE_TRUE on Sandybridge.	Kenneth Graunke	2017-04-28	1	-2/+1
\| \| \| \| \| \| \| \|	We set this unconditionally on every other platform. Zero (Manhattan) isn't even listed as an option in the Sandybridge docs - only "true". Reviewed-by: Plamena Manolova <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
*	i965: Use true AA line distance on G45/Ironlake.	Kenneth Graunke	2017-04-28	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The original Broadwater and Crestline platforms computed antialiased line distances using "manhattan" distance, aka a + b = c. Eaglelake and Cantiga added "true" distance, which apparently does something like max(a, b) + min(a, b) / 4. Not exactly "true", but at least more accurate. The G45 documentation indicates that the old manhattan distance setting is "only for debug purposes" and should never be used. The Ironlake documentation no longer mentions AALINEDISTANCE_MANHATTAN, though it does still contain the narrative about the feature. At any rate, we should use the more accurate mode. Reviewed-by: Plamena Manolova <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
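For comparison, the two distance estimates named in the commit above can be written out directly. This is a sketch under assumptions: the "true" formula is the one the commit message itself only describes as "something like" the hardware behaviour, and the function names are made up for this example. Both functions take non-negative component magnitudes.

```c
#include <assert.h>

/* "Manhattan" distance: the sum of the two component magnitudes. */
static double manhattan_distance(double a, double b)
{
    assert(a >= 0.0 && b >= 0.0);
    return a + b;
}

/* Approximation the commit message attributes to the "true" mode:
 * max(a, b) + min(a, b) / 4. Not the exact Euclidean length. */
static double approx_true_distance(double a, double b)
{
    assert(a >= 0.0 && b >= 0.0);
    double hi = a > b ? a : b;
    double lo = a > b ? b : a;
    return hi + lo / 4.0;
}
```

For a = 3 and b = 4, where the Euclidean length is 5, the manhattan estimate is 7 and the approximation gives 4.75, which illustrates why the commit calls the second mode "more accurate" without being exact.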
*	docs: add news item and link release notes for 17.0.5	Andres Gomez	2017-04-29	2	-0/+7
\| \| \| \|	Signed-off-by: Andres Gomez <[email protected]>
*	docs: add sha256 checksums for 17.0.5	Andres Gomez	2017-04-29	1	-1/+2
\| \| \| \| \|	Signed-off-by: Andres Gomez <[email protected]> (cherry picked from commit 6cb65ce2d3689ae7f692f8cf08559109037dd74e)
*	docs: add release notes for 17.0.5	Andres Gomez	2017-04-29	1	-0/+143
\| \| \| \| \|	Signed-off-by: Andres Gomez <[email protected]> (cherry picked from commit 61b134a862ecc1877bbe2f2c14e493b5fb607e04)
*	radeonsi: don't load unused compute shader input SGPRs and VGPRs	Marek Olšák	2017-04-28	4	-48/+76
\| \| \| \| \| \| \| \| \|	Basically, don't load GRID_SIZE or BLOCK_SIZE if they are unused, determine whether to load BLOCK_ID for each component separately, and set the number of THREAD_ID VGPRs to load. Now we should get the maximum CS launch wave rate in most cases. Reviewed-by: Nicolai Hähnle <[email protected]>
*	tgsi/scan: record compute shader system value usage	Marek Olšák	2017-04-28	2	-0/+37
\| \| \| \| \| \|	v2: just do indexing with swizzle[i] Reviewed-by: Nicolai Hähnle <[email protected]>
*	radeonsi: add a HUD query for draw calls with primitive restart	Marek Olšák	2017-04-28	4	-0/+11
\| \| \| \| \|	Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
*	radeonsi: tell LLVM not to remove s_barrier instructions	Marek Olšák	2017-04-28	1	-12/+33
\| \| \| \| \| \| \|	LLVM 5.0 removes s_barrier instructions if the max-work-group-size attribute is not set. What a surprise. Reviewed-by: Nicolai Hähnle <[email protected]>
*	radeonsi: fix tess offchip offset for per-patch attributes	Marek Olšák	2017-04-28	3	-12/+18
\| \| \| \| \| \|	We need 4 more bits there. I don't know what is fixed by this. Reviewed-by: Nicolai Hähnle <[email protected]>
*	radeonsi: pass tessellation ring addresses via user SGPRs	Marek Olšák	2017-04-28	7	-56/+112
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This removes s_load_dword latency for tess rings. We need just 1 SGPR for the address if we use 64K alignment. The final asm for recreating the descriptor, where s2 is (address >> 16), is: s_mov_b32 s3, 0; s_lshl_b64 s[4:5], s[2:3], 16; s_mov_b32 s6, -1; s_mov_b32 s7, 0x27fac. v2: bitcast the descriptor type from v2i64 to v4i32. Reviewed-by: Nicolai Hähnle <[email protected]>
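The arithmetic performed by the asm in the commit above can be mirrored on the CPU as a rough sketch. The function names `pack_address` and `rebuild_descriptor` are hypothetical; the constants -1 and 0x27fac are copied from the commit message and their field meanings are not decoded here.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a 64K-aligned address into one 32-bit value (the role of s2). */
static uint32_t pack_address(uint64_t address)
{
    assert((address & 0xffff) == 0);        /* 64K alignment is required */
    assert((address >> 16) <= UINT32_MAX);  /* must fit in one SGPR */
    return (uint32_t)(address >> 16);
}

/* Mirror of the asm sequence: rebuild four descriptor dwords (s4..s7). */
static void rebuild_descriptor(uint32_t addr_hi16, uint32_t desc[4])
{
    /* s_mov_b32 s3, 0 ; s_lshl_b64 s[4:5], s[2:3], 16 */
    uint64_t base = (uint64_t)addr_hi16 << 16;
    desc[0] = (uint32_t)base;
    desc[1] = (uint32_t)(base >> 32);
    desc[2] = 0xffffffffu; /* s_mov_b32 s6, -1 */
    desc[3] = 0x27fac;     /* s_mov_b32 s7, 0x27fac */
}
```

Because the low 16 address bits are known to be zero, they do not need to be passed, which is why a single SGPR is enough.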
*	radeonsi: use si_insert_input_ret in si_llvm_emit_tcs_epilogue	Marek Olšák	2017-04-28	1	-19/+10
\| \| \| \|	Reviewed-by: Nicolai Hähnle <[email protected]>
*	radeonsi: remove VS epilog code, compile VS with PrimID export on demand	Marek Olšák	2017-04-28	5	-210/+31
\| \| \| \| \| \| \| \| \| \| \| \|	The use of PrimID in the pixel shader is too rare to justify such sizable support code. The initial idea of the VS epilog was to move the clipping code there and remove it based on states, but optimized variants are now used to do that and are easier to support, so the VS epilog has turned out to be not so useful. Reviewed-by: Nicolai Hähnle <[email protected]>
*	radeonsi: get InstanceID from VGPR1 (or VGPR2 for tess) instead of VGPR3	Marek Olšák	2017-04-28	4	-13/+33
\| \| \| \| \| \|	VGPR1 = InstanceID / StepRate0; // StepRate0 can be set to 1 Reviewed-by: Nicolai Hähnle <[email protected]>
*	radeonsi: don't load PrimID in TES if it's not used	Marek Olšák	2017-04-28	1	-3/+3
\| \| \| \|	Reviewed-by: Nicolai Hähnle <[email protected]>
*	radeonsi: explain (non-)monolithic shaders	Marek Olšák	2017-04-28	1	-0/+67
\| \| \| \|	Reviewed-by: Nicolai Hähnle <[email protected]>
*	radeonsi/gfx9: enable OpenGL 4.5	Marek Olšák	2017-04-28	1	-5/+0
\| \| \| \| \| \| \|	Tentatively enable it, expecting the scratch buffer support to be done before the next Mesa release. Reviewed-by: Nicolai Hähnle <[email protected]>
*	radeonsi/gfx9: 2nd shader of merged shaders should hold a reference of the 1st	Marek Olšák	2017-04-28	2	-10/+26
\| \| \| \|	Reviewed-by: Nicolai Hähnle <[email protected]>
*	radeonsi: add reference counting for shader selectors	Marek Olšák	2017-04-28	2	-3/+25
\| \| \| \| \| \| \|	The 2nd shader of merged shaders should take a reference of the 1st shader. The next commit will do that. Reviewed-by: Nicolai Hähnle <[email protected]>
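The ownership pattern described in the commit above, where the second shader of a merged pair keeps the first one alive, could look roughly like this. It is a generic, non-atomic sketch: `selector`, `selector_create`, `selector_reference`, and `live_selectors` are hypothetical names for this example and do not reflect the actual radeonsi structures or Mesa's reference helpers.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical selector: `first` is the 1st shader of a merged pair. */
typedef struct selector {
    int refcount;
    struct selector *first;
} selector;

static int live_selectors; /* bookkeeping for the example only */

static selector *selector_create(void)
{
    selector *s = calloc(1, sizeof(*s));
    assert(s);
    s->refcount = 1; /* the creator holds the initial reference */
    live_selectors++;
    return s;
}

/* Point *dst at src, adjusting counts; frees the old object at zero. */
static void selector_reference(selector **dst, selector *src)
{
    selector *old = *dst;

    if (src)
        src->refcount++;
    *dst = src;

    if (old && --old->refcount == 0) {
        /* Drop the reference this selector held on its 1st shader. */
        selector_reference(&old->first, NULL);
        free(old);
        live_selectors--;
    }
}
```

With this shape, releasing the creator's reference to the first shader does not destroy it while the second shader still points at it; it is freed only when the second shader itself goes away.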
*	radeonsi/gfx9: set VGT_VERTEX_REUSE for ES in ES-GS	Marek Olšák	2017-04-28	1	-6/+12
\| \| \| \|	Reviewed-by: Nicolai Hähnle <[email protected]>
*	radeonsi/gfx9: set TES registers for merged ES-GS	Marek Olšák	2017-04-28	1	-4/+7
\| \| \| \|	Reviewed-by: Nicolai Hähnle <[email protected]>
*	radeonsi/gfx9: disallow scratch buffer for LS-HS and ES-GS	Marek Olšák	2017-04-28	1	-0/+10
\| \| \| \| \| \|	not implemented yet Reviewed-by: Nicolai Hähnle <[email protected]>
*	radeonsi/gfx9: always compile monolithic ES-GS (asynchronously)	Marek Olšák	2017-04-28	2	-1/+28
\| \| \| \| \| \|	In addition to the non-monolithic variant. Reviewed-by: Nicolai Hähnle <[email protected]>
*	radeonsi/gfx9: add support for monolithic ES-GS	Marek Olšák	2017-04-28	2	-9/+72
\| \| \| \|	Reviewed-by: Nicolai Hähnle <[email protected]>
*	radeonsi/gfx9: make sure the 1st shader's main part exists for merged shaders	Marek Olšák	2017-04-28	1	-18/+60
\| \| \| \|	Reviewed-by: Nicolai Hähnle <[email protected]>
*	radeonsi/gfx9: select shader parts for non-monolithic ES-GS	Marek Olšák	2017-04-28	1	-3/+14
\| \| \| \|	Reviewed-by: Nicolai Hähnle <[email protected]>
*	radeonsi/gfx9: add GS prolog support for merged ES-GS	Marek Olšák	2017-04-28	1	-17/+70
\| \| \| \|	Reviewed-by: Nicolai Hähnle <[email protected]>
*	radeonsi/gfx9: add VS prolog support for merged ES-GS	Marek Olšák	2017-04-28	1	-0/+2
\| \| \| \|	Reviewed-by: Nicolai Hähnle <[email protected]>
*	radeonsi/gfx9: pass GS input SGPRs and VGPRs from the ES part to GS	Marek Olšák	2017-04-28	1	-0/+32
\| \| \| \|	Reviewed-by: Nicolai Hähnle <[email protected]>
*	radeonsi/gfx9: store ES outputs to LDS	Marek Olšák	2017-04-28	1	-4/+17
\| \| \| \|	Reviewed-by: Nicolai Hähnle <[email protected]>