mesa.git

	Commit message	Author	Age	Files	Lines
*	intel/eu/validate/gen12: Add TGL to eu_validate tests.	Jordan Justen	2019-10-30	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \|	These reworks were combined into this patch: * Matt Turner: i965: Disable NoDDChk/NoDDClr test on Gen12+ * Francisco Jerez: intel/eu/validate/gen12: Disable qword_low_power_no_depctrl eu_validate test. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Francisco Jerez <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	intel/dev: Add preliminary device info for Tigerlake	Jordan Justen	2019-10-30	1	-0/+49
\| \| \| \| \| \| \| \| \| \| \|	Reworks: * adjust 64-bit support, hiz (Jason Ekstrand) * sim-id (Lionel Landwerlin) * adjust threads, urb size (Rafael Antognolli) * adjust urb size (Kenneth Graunke) Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	intel/dump_gpu: handle context create extended ioctl	Lionel Landwerlin	2019-10-30	1	-0/+15
\| \| \| \| \|	Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
*	radv: Allocate space for temp. semaphore parts.	Bas Nieuwenhuizen	2019-10-30	1	-0/+1
\| \| \| \| \| \| \| \|	Calculated the number for allocation and did not reserve space .... Fixes: 2117c53b723 "radv: Add temporary datastructure for submissions." Reviewed-by: Samuel Pitoiset <[email protected]>
*	anv: Add Tile Cache Flush for Unified Cache.	Rafael Antognolli	2019-10-30	3	-1/+45
\|
*	blorp: Add Tile Cache Flush for Unified Cache.	Rafael Antognolli	2019-10-30	1	-0/+3
\|
*	iris: Add Tile Cache Flush for Unified Cache.	Rafael Antognolli	2019-10-30	2	-0/+21
\|
*	intel/genxml: Add gen12 tile cache flush bit	Jordan Justen	2019-10-30	1	-0/+1
\| \| \| \|	Signed-off-by: Jordan Justen <[email protected]>
*	aco: implement VGPR spilling	Daniel Schürmann	2019-10-30	1	-7/+162
\| \| \| \| \| \|	VGPR spilling is implemented via MUBUF instructions and scratch memory. Reviewed-by: Rhys Perry <[email protected]>
*	aco: always set scratch_offset in startpgm	Daniel Schürmann	2019-10-30	3	-23/+22
\| \| \| \| \| \| \|	This patch also moves private_segment_buffer and scratch_offset to Program to easily access it. Reviewed-by: Rhys Perry <[email protected]>
*	aco: omit linear VGPRs as spill variables	Daniel Schürmann	2019-10-30	1	-4/+8
\| \| \| \|	Reviewed-by: Rhys Perry <[email protected]>
*	aco: ensure that spilled VGPR reloads are done after p_logical_start	Daniel Schürmann	2019-10-30	1	-34/+43
\| \| \| \|	Reviewed-by: Rhys Perry <[email protected]>
*	aco: simplify calculation of target register pressure when spilling	Daniel Schürmann	2019-10-30	1	-39/+12
\| \| \| \|	Reviewed-by: Rhys Perry <[email protected]>
*	aco: fix new_demand calculation for first instructions	Rhys Perry	2019-10-30	1	-4/+7
\| \| \| \|	Reviewed-by: Daniel Schürmann <[email protected]>
*	aco: don't add interferences between spilled phi operands	Daniel Schürmann	2019-10-30	1	-8/+8
\| \| \| \|	Reviewed-by: Rhys Perry <[email protected]>
*	aco: consider loop_exit blocks like merge blocks, even if they have only one predecessor	Daniel Schürmann	2019-10-30	1	-2/+2
\| \| \| \| \| \|	Reviewed-by: Rhys Perry <[email protected]>
*	aco: don't insert the exec mask into set of live-out variables when spilling	Daniel Schürmann	2019-10-30	1	-14/+6
\| \| \| \|	Reviewed-by: Rhys Perry <[email protected]>
*	aco: fix transitive affinities of spilled variables	Daniel Schürmann	2019-10-30	1	-25/+79
\| \| \| \| \| \| \|	Variables spilled on both branch legs need to be assigned to the same spilling slot. These affinities can be transitive through multiple merge blocks. Reviewed-by: Rhys Perry <[email protected]>
*	aco: fix live-range splits of phis	Daniel Schürmann	2019-10-30	1	-14/+23
\| \| \| \|	Reviewed-by: Rhys Perry <[email protected]>
*	aco: remove potential critical edge on loops.	Daniel Schürmann	2019-10-30	2	-18/+23
\| \| \| \|	Reviewed-by: Rhys Perry <[email protected]>
*	aco: improve live variable analysis	Daniel Schürmann	2019-10-30	1	-25/+64
\| \| \| \| \| \| \|	This patch makes the live variable analysis more precise w.r.t. killed phi operands and the block's register pressure. Reviewed-by: Rhys Perry <[email protected]>
*	aco: Lower to CSSA	Daniel Schürmann	2019-10-30	4	-41/+268
\| \| \| \| \| \| \| \| \| \|	Converting to 'Conventional SSA Form' ensures correctness w.r.t. spilling of phi nodes. Previously, it was possible that phi operands have intersecting live-ranges, and thus, couldn't get spilled to the same spilling slot. For this reason, ACO tried to avoid to spill phis, even if it was beneficial. This patch implements a conversion pass which is currently only called if spilling is necessary. Reviewed-by: Rhys Perry <[email protected]>
*	etnaviv: fix non-pointsprite points on GC7000L	Jonathan Marek	2019-10-30	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \|	Fixes these deqp tests (and more): dEQP-GLES2.functional.draw.draw_arrays.points.single_attribute dEQP-GLES2.functional.draw.draw_arrays.points.multiple_attributes dEQP-GLES2.functional.draw.draw_arrays.points.default_attribute dEQP-GLES2.functional.draw.draw_elements.points.single_attribute dEQP-GLES2.functional.draw.draw_elements.points.multiple_attributes dEQP-GLES2.functional.draw.draw_elements.points.default_attribute Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Christian Gmeiner <[email protected]>
*	etnaviv: stencil fix	Jonathan Marek	2019-10-30	1	-13/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The final version of previous stencil fix patch ended up breaking one-sided stencil. Fixes remaining failures in these deqp tests (tested on GC3000/GC7000L): dEQP-GLES2.functional.fragment_ops.depth_stencil.* Note: deqp tests require --deqp-gl-config-name=rgba8888d24s8ms0 Fixes: 05da025f ("etnaviv: fix two-sided stencil") Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Christian Gmeiner <[email protected]>
*	etnaviv: fix depth bias	Jonathan Marek	2019-10-30	2	-1/+2
\| \| \| \| \| \| \| \| \| \|	Fixes remaining failures in these deqp tests (tested on GC3000/GC7000L): dEQP-GLES2.functional.polygon_offset.* Fixes: 6c3c05dc ("etnaviv: fix polygon offset") Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Christian Gmeiner <[email protected]>
*	iris: Set MOCS for external surfaces to uncached	Jordan Justen	2019-10-30	1	-4/+8
\| \| \| \| \| \|	Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
*	iris: Align fast clear color state buffer to a page.	Rafael Antognolli	2019-10-30	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On gen11 and older, compressed images are tiled and aligned to 4K. On gen12 this 4K alignment restriction was removed. However, only aligning the fast clear color buffer to 64B (a cacheline, as it's on the documentation) is causing some bugs where the fast clear color is not converted during the fast clear operation. Aligning things to 4K seems to fix it. v2: Fix typo case in the comment (Nanley) v3: Rebase and fix conflicts. v4: Fix rebase mistake (Nanley). Reviewed-by: Nanley Chery <[email protected]>
*	anv: Align fast clear color state buffer to a page.	Rafael Antognolli	2019-10-30	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \| \|	On gen11 and older, compressed images are tiled and aligned to 4K. On gen12 this 4K alignment restriction was removed. However, only aligning the fast clear color buffer to 64B (a cacheline, as it's on the documentation) is causing some bugs where the fast clear color is not converted during the fast clear operation. Aligning things to 4K seems to fix it. v2: Assert that image->planes[plane].offset is 4K aligned (Nanley) Reviewed-by: Nanley Chery <[email protected]>
*	zink: only enable KHR_external_memory_fd if supported	Erik Faye-Lund	2019-10-30	3	-7/+28
\| \| \| \| \| \| \| \|	While we're at it, make sure we error out if it's not supported when required. This brings us a bit closer to being able to test on SwiftShader, which doesn't currently support KHR_external_memory_fd.
*	radv: Start signalling semaphores in WSI acquire.	Bas Nieuwenhuizen	2019-10-30	1	-7/+27
\| \| \| \| \| \| \| \| \| \|	Winsys semaphores without signal operation get silently ignored. Not so for syncobjs, so actually signal them. Fixes: 84d9551b232 "radv: Always enable syncobj when supported for all fences/semaphores." Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2030 Reviewed-by: Samuel Pitoiset <[email protected]>
*	aco: rename README to README.md	Rhys Perry	2019-10-30	1	-0/+0
\| \| \| \| \|	Closes: #1974 Signed-off-by: Rhys Perry <[email protected]>
*	aco: a couple loop handling fixes for GFX10 hazard pass	Rhys Perry	2019-10-30	1	-3/+3
\| \| \| \| \| \| \|	It was joining from the wrong blocks and block.kind is a bitmask instead of an enum. Reviewed-By: Timur Kristóf <[email protected]>
*	intel/compiler: Add instruction compaction support on Gen12	Matt Turner	2019-10-30	2	-184/+868
\| \| \| \|	Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
*	intel/compiler: Make separate src0/src1 index tables	Matt Turner	2019-10-30	1	-11/+18
\| \| \| \| \| \|	TGL uses different data (and even a different format!) for each source. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
*	intel/compiler: Inline get_src_index()	Matt Turner	2019-10-30	1	-26/+15
\| \| \| \| \| \| \|	TGL will have separate tables for src0 and src1, so the shared function will no longer make sense. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
*	intel/compiler: Restructure instruction compaction in preparation for Gen12	Matt Turner	2019-10-30	1	-20/+28
\| \| \| \|	Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
*	intel/compiler: Remove unreachable() from brw_reg_type.c	Matt Turner	2019-10-30	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The EU compaction unit test fuzzes the compaction code by flipping bits. We use a simple skip_bits() function with a list of reserved bits to ignore, but for more complex cases like invalid combinations of register file:type, we need either machinery to check validity or for these functions to simply inform us whether a combination was valid. enum brw_reg_type is a 4-bit field in brw_reg, so rather than expanding it with an "INVALID" value, just return -1 and let the caller check for that. Scott suggested redefining unreachable() within the unit test to longjmp() which would allow driver code like this to still use it and allow the test to handle expected failures like this. If that plan works out, I plan to revert this.
*	freedreno/a2xx: add missing vertex formats (SSCALE/USCALE/FIXED)	Jonathan Marek	2019-10-30	8	-50/+83
\| \| \| \| \| \| \| \|	Mostly for vertex formats, but they are supported as texture formats too (untested however). Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Rob Clark <[email protected]>
*	radeonsi: disable sdma for gfx10	Pierre-Eric Pelloux-Prayer	2019-10-30	1	-1/+7
\| \| \| \| \| \| \| \| \| \|	Disable sdma on gfx10 until all timeouts bugs are fixed. See: https://gitlab.freedesktop.org/mesa/mesa/issues/1907 https://bugs.freedesktop.org/show_bug.cgi?id=111481 Reviewed-by: Marek Olšák <[email protected]>
*	radeonsi: sdma misc fixes	Pierre-Eric Pelloux-Prayer	2019-10-30	2	-4/+2
\| \| \| \| \| \|	SDMA IB doesn't need to be padded for SDMA. Reviewed-by: Marek Olšák <[email protected]>
*	radeonsi: align sdma byte count to dw	Pierre-Eric Pelloux-Prayer	2019-10-30	1	-1/+12
\| \| \| \| \| \| \| \| \|	If src/dst addresses are dw aligned and size is > 4 then we align byte count to dw as well. PAL implementation works like this. Reviewed-by: Marek Olšák <[email protected]>
*	radv: Enable ACO on Navi.	Timur Kristóf	2019-10-30	1	-2/+1
\| \| \| \| \|	Signed-off-by: Timur Kristóf <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
*	radeonsi: enable 8K video decode support for HEVC and VP9	Leo Liu	2019-10-30	1	-2/+18
\| \| \| \| \| \| \|	HW 8K decode support starts at Renoir Signed-off-by: Leo Liu <[email protected]> Reviewed-by: Boyuan Zhang <[email protected]>
*	radeon/vcn: Add VP9 8K decode support	Leo Liu	2019-10-30	1	-1/+1
\| \| \| \| \| \| \|	Require increase of context buffer size Signed-off-by: Leo Liu <[email protected]> Reviewed-by: Boyuan Zhang <[email protected]>
*	aco: try to group together VMEM loads of the same resource	Rhys Perry	2019-10-30	1	-10/+56
\| \| \| \| \| \| \| \|	v2: remove accidental shaderInt16 change v2: simplify can_move_down initialization v2: simplify VMEM_CLAUSE_MAX_GRAB_DIST Reviewed-by: Daniel Schürmann <[email protected]>
*	aco: don't schedule instructions through depending VMEM instructions	Daniel Schürmann	2019-10-30	1	-0/+3
\| \| \| \| \| \| \| \|	Previously, the scheduler tried to move up instructions from below depending VMEM instructions only to move them down again when scheduling the VMEM instruction. Reviewed-by: Rhys Perry <[email protected]>
*	aco: add can_reorder flags to load_ubo and load_constant	Daniel Schürmann	2019-10-30	1	-5/+9
\| \| \| \| \| \| \| \|	These got lost due to some refactoring. Due to the way our scheduler works currently, for now we add back the reorder flag for divergent loads only. Reviewed-by: Rhys Perry <[email protected]>
*	aco: only skip RAR dependencies if the variable is killed somewhere	Daniel Schürmann	2019-10-30	1	-21/+46
\| \| \| \| \| \| \| \| \|	This patch changes VMEM scheduling in a way that they can only be moved upwards by previous VMEM instructions but not downwards. This way, it improves the order of VMEM instructions in relation to their users. Reviewed-by: Rhys Perry <[email protected]>
*	aco: restrict scheduling depending on max_waves	Daniel Schürmann	2019-10-30	1	-9/+15
\| \| \| \| \| \| \| \| \|	Previously, we allowed all shaders to reduce the number of max_waves to as low as 5. Restricting this on shaders with low register demand, increases the total number of waves while the VMEM def-use distances hardly change. This patch also changes the max number of move operations per MEM instruction. Reviewed-by: Rhys Perry <[email protected]>
*	anv: Avoid emitting UBO surface states that won't be used	Jason Ekstrand	2019-10-30	1	-1/+12
\| \| \| \| \| \| \| \|	This shaves around 4-5% off of a CPU-limited example running with the Dawn WebGPU implementation. Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>