mesa.git

	Commit message	Author	Age	Files	Lines
*	mesa: add xbgr support adjacent to xrgb	Ilia Mirkin	2018-02-19	7	-2/+74
\| \| \| \| \|	Signed-off-by: Ilia Mirkin <[email protected]> Acked-by: Daniel Stone <[email protected]>
*	st/shader_cache: copy nir pointer to gl_program after deserializing	Timothy Arceri	2018-02-20	1	-0/+6
\| \| \| \| \| \| \|	This fixes a crash when running the arb_get_program_binary-api-errors piglit test twice. Reviewed-by: Marek Olšák <[email protected]>
*	radeonsi: add nir shader cache support	Timothy Arceri	2018-02-20	1	-11/+30
	In future we might want to try to avoid calling nir_serialize() but this works for now. Reviewed-by: Marek Olšák <[email protected]>
*	radeonsi: rename variables tgsi_binary -> ir_binary	Timothy Arceri	2018-02-20	1	-21/+21
\| \| \| \| \| \|	This better represents that the ir could be either tgsi or nir. Reviewed-by: Marek Olšák <[email protected]>
*	radeonsi: fix regression from 32-bit pointers on CI	Marek Olšák	2018-02-19	1	-1/+1
\| \| \| \|	Tested-by: Michel Dänzer <[email protected]>
*	radv: compact varyings after removing unused ones	Samuel Pitoiset	2018-02-19	1	-6/+3
	It makes no sense to compact before, and the description of nir_compact_varyings() confirms that.
	Polaris10: Totals from affected shaders:
	SGPRS: 108528 -> 108128 (-0.37 %)
	VGPRS: 74548 -> 74500 (-0.06 %)
	Spilled SGPRs: 844 -> 814 (-3.55 %)
	Code Size: 3007328 -> 2992932 (-0.48 %) bytes
	Max Waves: 16019 -> 16009 (-0.06 %)
	Vega10: Totals from affected shaders:
	SGPRS: 106088 -> 106232 (0.14 %)
	VGPRS: 74652 -> 74700 (0.06 %)
	Spilled SGPRs: 692 -> 658 (-4.91 %)
	Code Size: 2967708 -> 2953028 (-0.49 %) bytes
	Max Waves: 18178 -> 18162 (-0.09 %)
	Signed-off-by: Samuel Pitoiset <[email protected]>
	Reviewed-by: Timothy Arceri <[email protected]>
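The percentages in the report above are relative changes against the "before" value. A minimal Python sketch of that arithmetic, using the Polaris10 totals quoted in the commit message; the helper name `pct_change` is hypothetical and is not part of Mesa or of whatever tool produced the report:

```python
def pct_change(before: int, after: int) -> float:
    """Relative change from `before` to `after`, in percent, rounded to 2 decimals."""
    return round((after - before) / before * 100.0, 2)


# Polaris10 totals as quoted in the commit message (before, after).
polaris10 = {
    "SGPRS": (108528, 108128),
    "VGPRS": (74548, 74500),
    "Spilled SGPRs": (844, 814),
    "Code Size": (3007328, 2992932),
    "Max Waves": (16019, 16009),
}

for name, (before, after) in polaris10.items():
    # Print each metric in roughly the same shape as the report.
    print(f"{name}: {before} -> {after} ({pct_change(before, after):.2f} %)")
```

A negative value means the metric shrank. For register counts, spills, and code size that is an improvement, while for Max Waves a lower number means less available parallelism.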
*	radeonsi/nir: fix gl_FragCoord for pixel_center_integer	Timothy Arceri	2018-02-19	1	-0/+5
\| \| \| \| \| \|	Fixes piglit test glsl-arb-fragment-coord-conventions Reviewed-by: Kenneth Graunke <[email protected]>
*	glsl/nir: add pixel_center_integer to shader info	Timothy Arceri	2018-02-19	2	-0/+7
\| \| \| \|	Reviewed-by: Kenneth Graunke <[email protected]>
*	gm107/ir: avoid using kepler instruction capabilities	Ilia Mirkin	2018-02-17	2	-21/+45
\| \| \| \| \| \| \| \|	Split up the op properties table into generation-specific bits, and only use the kepler ones on kepler. Fixes some CTS images tests. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Karol Herbst <[email protected]>
*	nvc0: add support for bindless on maxwell+	Ilia Mirkin	2018-02-17	3	-14/+116
\| \| \| \|	Signed-off-by: Ilia Mirkin <[email protected]>
*	gm107/ir: change how SUQ works in preparation for bindless	Ilia Mirkin	2018-02-17	3	-1/+61
\| \| \| \| \| \| \|	All this information can be retrieved from the TIC directly. Avoid having to dip into the constbuf information about the image. Signed-off-by: Ilia Mirkin <[email protected]>
*	i965: Use absolute addressing for constant buffer 0 on Kernel 4.16+.	Kenneth Graunke	2018-02-17	2	-1/+32
	By default, 3DSTATE_CONSTANT_* Constant Buffer 0 is relative to dynamic state base address. This makes it unusable for pushing UBOs.
	There is a bit in the INSTPM register (or CS_DEBUG_MODE2 on Skylake) which controls whether buffer 0 is relative to dynamic state base address, or simply a normal pointer. Setting that gives us full flexibility. This lets us push up to 4 UBO ranges.
	We can't currently write this on Haswell and earlier, and will need to update the kernel command parser, and then do the whole version checking song and dance.
	We also need a brand new kernel that supports context isolation - on older kernels, newly created contexts inherit register state from whatever happened to be running. So, setting this would have catastrophic impact on other drivers such as libva, Beignet, or older Mesa.
	See commit 8ec5a4e4a4a32f4de351c5fc2bf0eb615b6eef1b where we did this once before, but had to revert it in commit 013d33122028f2492da90a03a.
	Reviewed-by: Francisco Jerez <[email protected]>
*	i965: Stop restoring the default L3 configuration on Kernel 4.16+.	Kenneth Graunke	2018-02-17	3	-2/+7
\| \| \| \| \| \| \| \| \| \|	Kernel 4.16 has proper context isolation, which means we can change the L3 configuration without worrying about that leaking to other newly created contexts, breaking the assumptions of other userspace. So, disable our workaround to reprogram it back to the default. Reviewed-by: Francisco Jerez <[email protected]>
*	nvc0: Use GP100_COMPUTE_CLASS on GP10B	Mikko Perttunen	2018-02-17	1	-1/+2
\| \| \| \| \| \| \| \|	GP10B requires the use of GP100_COMPUTE_CLASS instead of GP104_COMPUTE_CLASS as is used for other non-GP100 chips. Signed-off-by: Mikko Perttunen <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
*	i965: Fix aux-surface size check	Daniel Stone	2018-02-17	2	-3/+12
	The previous commit reworked the checks in intel_from_planar() to check the right individual cases for regular/planar/aux buffers, and do size checks in all cases.
	Unfortunately, the aux size check was broken, and required the aux surface to be allocated with the correct aux stride, but full image height (!).
	As the ISL aux surface is not recorded in the DRIimage, we cannot easily access it to check. Instead, store the aux size from when we do have the ISL surface to hand, and check against that later when we go to access the aux surface.
	Signed-off-by: Daniel Stone <[email protected]>
	Fixes: c2c4e5bae3ba ("i965: Fix bugs in intel_from_planar")
	Reviewed-by: Jason Ekstrand <[email protected]>
*	radeonsi: implement 32-bit pointers in user data SGPRs (v2)	Marek Olšák	2018-02-17	7	-59/+141
	User SGPRs changes:
	VS: 14 -> 9
	TCS: 14 -> 10
	TES: 10 -> 6
	GS: 8 -> 4
	GSCOPY: 2 -> 1
	PS: 9 -> 5
	Merged VS-TCS: 24 -> 16
	Merged VS-GS: 18 -> 11
	Merged TES-GS: 18 -> 11
	SGPRS: 2170102 -> 2158430 (-0.54 %)
	VGPRS: 1645656 -> 1641516 (-0.25 %)
	Spilled SGPRs: 9078 -> 8810 (-2.95 %)
	Spilled VGPRs: 130 -> 114 (-12.31 %)
	Scratch size: 1508 -> 1492 (-1.06 %) dwords per thread
	Code Size: 52094872 -> 52692540 (1.15 %) bytes
	Max Waves: 371848 -> 372723 (0.24 %)
	v2:
	- the shader cache needs to take address32_hi into account
	- set amdgpu-32bit-address-high-bits
	Reviewed-by: Samuel Pitoiset <[email protected]> (v1)
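The commit titles in this series suggest the general idea: when every buffer of a given kind lives in one 4 GiB window, a pointer can be passed as its low 32 bits only, and the shared high bits are supplied once. A minimal Python sketch of that idea follows. The name `address32_hi` comes from the commit message; the functions `compress_pointer` and `expand_pointer` and the example addresses are hypothetical and are not Mesa code:

```python
MASK32 = 0xFFFFFFFF


def compress_pointer(addr: int, address32_hi: int) -> int:
    """Keep only the low 32 bits of a 64-bit address inside the shared window."""
    if (addr >> 32) != address32_hi:
        raise ValueError("address is outside the 32-bit window")
    return addr & MASK32


def expand_pointer(low32: int, address32_hi: int) -> int:
    """Rebuild the full 64-bit address from the low half and the shared high bits."""
    return (address32_hi << 32) | (low32 & MASK32)


# Hypothetical window and address, chosen only for illustration.
address32_hi = 0x0000FFFF
addr = (address32_hi << 32) | 0x00123000

low = compress_pointer(addr, address32_hi)
assert expand_pointer(low, address32_hi) == addr
```

Under this assumption each pointer costs one 32-bit register instead of two, which is consistent with the lower user SGPR counts listed above. It also shows why anything keyed on the compiled shader, such as the shader cache mentioned in the v2 notes, has to account for `address32_hi`.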
*	radeonsi: disallow constant buffers with a 64-bit address in slot 0	Marek Olšák	2018-02-17	2	-1/+9
\| \| \| \| \| \| \|	State trackers must use a user buffer or const_uploader, or set pipe_resource::flags same as const_uploader->flags. Reviewed-by: Samuel Pitoiset <[email protected]>
*	radeonsi: move const_uploader allocations to 32-bit address space	Marek Olšák	2018-02-17	3	-2/+7
\| \| \| \|	Reviewed-by: Samuel Pitoiset <[email protected]>
*	winsys/radeon: implement and enable 32-bit VM allocations	Marek Olšák	2018-02-17	3	-8/+64
\| \| \| \|	Reviewed-by: Samuel Pitoiset <[email protected]>
*	winsys/radeon: add struct radeon_vm_heap	Marek Olšák	2018-02-17	3	-36/+47
\| \| \| \|	Reviewed-by: Samuel Pitoiset <[email protected]>
*	winsys/amdgpu: enable 32-bit VM allocations	Marek Olšák	2018-02-17	1	-1/+2
\| \| \| \|	Reviewed-by: Samuel Pitoiset <[email protected]>
*	gallium/radeon: add 32-bit address space heaps	Marek Olšák	2018-02-17	1	-3/+44
\| \| \| \|	Reviewed-by: Samuel Pitoiset <[email protected]>
*	ac: query high bits of 32-bit address space	Marek Olšák	2018-02-17	2	-0/+8
\|
*	gallium: use PIPE_CAP_CONSTBUF0_FLAGS	Marek Olšák	2018-02-17	4	-5/+27
\|
*	gallium: allow drivers to impose BO flags restrictions on constant buffer 0	Marek Olšák	2018-02-17	18	-0/+21
\| \| \| \|	Required by radeonsi for optimal behavior.
*	meson: Add Haiku platform support v4	Alexander von Gluck IV	2018-02-16	9	-13/+189
\| \| \| \|	Reviewed-by: Dylan Baker <[email protected]>
*	anv/icl: Add render target flush after uploading binding table	Anuj Phogat	2018-02-16	1	-0/+20
	The PIPE_CONTROL command description says: "Whenever a Binding Table Index (BTI) used by a Render Target Message points to a different RENDER_SURFACE_STATE, SW must issue a Render Target Cache Flush by enabling this bit. When render target flush is set due to new association of BTI, PS Scoreboard Stall bit must be set in this packet."
	Signed-off-by: Anuj Phogat <[email protected]>
	Reviewed-by: Jason Ekstrand <[email protected]>
*	anv/icl: Enable float blend optimization	Anuj Phogat	2018-02-16	1	-1/+1
\| \| \| \| \|	Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	anv/icl: Use gen11 functions	Anuj Phogat	2018-02-16	2	-0/+6
\| \| \| \| \|	Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	anv/icl: Build anv libs for gen11	Anuj Phogat	2018-02-16	4	-2/+32
\| \| \| \| \|	Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	anv/icl: Generate gen11 entry point functions	Anuj Phogat	2018-02-16	1	-1/+5
\| \| \| \| \|	Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	anv/icl: Don't use DISPATCH_MODE_SIMD4X2	Anuj Phogat	2018-02-16	1	-0/+5
\| \| \| \| \|	Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	anv/icl: Don't use SingleVertexDispatch	Anuj Phogat	2018-02-16	1	-0/+2
\| \| \| \| \|	Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	anv/icl: Don't set ResetGatewayTimer	Anuj Phogat	2018-02-16	1	-0/+2
\| \| \| \| \|	Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	anv/icl: Add #define genX	Anuj Phogat	2018-02-16	1	-0/+3
\| \| \| \| \|	Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	anv/icl: Add gen11 mocs defines	Anuj Phogat	2018-02-16	1	-0/+11
\| \| \| \| \|	Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	i965: Implement GenerateMipmap directly, rather than using Meta.	Kenneth Graunke	2018-02-16	5	-0/+135
	Meta is awful and we'd like to stop using it. Implementing this using BLORP allows us to stop trashing a bunch of GL state every time.
	This follows the structure of st_generate_mipmap(). compute_num_levels is lifted directly from there.
	Improves performance in Gl41HdrBloom by about 11.794% +/- 1.01919% (n=3) on Kabylake GT2 at 1280x720 (the difference seems much smaller at higher resolutions).
	v2 (idr): Don't try depth or depth-stencil blorp blits on Gen4 or Gen5 because it's not implemented yet.
	Signed-off-by: Ian Romanick <[email protected]>
	Reviewed-by: Topi Pohjolainen <[email protected]>
*	mesa: Move compute_num_levels from st_gen_mipmap.c to mipmap.c.	Kenneth Graunke	2018-02-16	3	-27/+29
\| \| \| \| \| \| \|	I want to use compute_num_levels inside i965. Rather than duplicating it, move it from mesa/st to core Mesa, and make it non-static. Reviewed-by: Marek Olšák <[email protected]>
*	meson: freedreno depends on nir	Dylan Baker	2018-02-16	1	-0/+1
\| \| \| \| \| \| \| \| \|	This fixes a race condition in building targets that link in freedreno. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105120 Fixes: 0bbecc5a8548883f76a7 ("meson: define driver dependencies") Signed-off-by: Dylan Baker <[email protected]> Acked-by: Mark Janes <[email protected]>
*	swr/rast: blend_epi32() should return Integer, not Float	George Kyriazis	2018-02-16	1	-1/+1
	Fix gcc8 compiler error for KNL. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105029 Reviewed-by: Bruce Cherniak <[email protected]>
*	swr/rast: Normalize path for debug metadata	George Kyriazis	2018-02-16	1	-1/+2
\| \| \| \| \| \|	in template gen_llvm.hpp Reviewed-by: Bruce Cherniak <[email protected]>
*	swr/rast: Consolidate archrast Draw events	George Kyriazis	2018-02-16	4	-26/+79
	Consolidate archrast draw events into a single draw event with an attribute that represents the type of draw
	- Add handlers for new private proto versions of DrawInstancedEvent, DrawIndexedInstancedEvent, DrawInstancedSplitEvent, and DrawIndexedInstancedSplitEvent
	- Convert the draw events to generic DrawInfoEvents
	- parse_proto_event_fields() replaces 'AR_DRAW_TYPE' as a field type with 'uint32_t'. This draw type is actually an enum, but can be represented as an unsigned integer.
	- is_draw_or_dispatch() recognizes DrawInfoEvent as a draw event
	Reviewed-by: Bruce Cherniak <[email protected]>
*	swr/rast: Add semantics for translating address	George Kyriazis	2018-02-16	2	-0/+5
\| \| \| \| \| \|	Added support for another full translation path in fetch jitter. Reviewed-by: Bruce Cherniak <[email protected]>
*	swr/rast: Convert C Sampler intrinsics	George Kyriazis	2018-02-16	2	-0/+19
\| \| \| \| \| \| \| \| \| \|	Convert portions of the C sampler to the rasty SIMD lib. Also fix SRL call with a non-immediate. Don't count on the compiler automagically converting an srli call to srl if the shift count isn't an immediate. Reviewed-by: Bruce Cherniak <[email protected]>
*	swr/rast: Make SIMDLib templated types easier to use	George Kyriazis	2018-02-16	5	-298/+307
\| \| \| \| \| \|	"typename SIMD_T::TypeName" --> "TypeName<SIMD_T>" Reviewed-by: Bruce Cherniak <[email protected]>
*	swr/rast: Be more explicit when fetching next component	George Kyriazis	2018-02-16	2	-4/+11
\| \| \| \| \| \| \|	Use a new function to denote that we want to get offset to next component and hide the fact that GEP is used underneath. Reviewed-by: Bruce Cherniak <[email protected]>
*	swr/rast: Fix bug related to passing AR handle	George Kyriazis	2018-02-16	1	-1/+1
\| \| \| \| \| \|	We were passing a garbage handle. Let's not do that. Reviewed-by: Bruce Cherniak <[email protected]>
*	swr/rast: Fix primitive replication issue in tesselation PA.	George Kyriazis	2018-02-16	2	-2/+3
\| \| \| \|	Reviewed-by: Bruce Cherniak <[email protected]>
*	swr/rast: Use llvm intrinsic masked gather	George Kyriazis	2018-02-16	2	-0/+14
\| \| \| \| \| \| \| \| \|	Use llvm intrinsic masked.gather instead of manual unroll for the cases where we have vector of pointers. Improves llvm IR debug experience by reducing a ton of IR to a single intrinsic call. Also seems to reduce overall stack use considerably. Reviewed-by: Bruce Cherniak <[email protected]>
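For readers unfamiliar with the operation named above, a masked gather loads one element per lane from a vector of addresses, but only for lanes whose mask bit is set. Disabled lanes take a pass-through value and do not access memory. A minimal Python sketch of those semantics follows; it models pointers as list indices, and the function name and data are hypothetical, not the swr or LLVM API:

```python
from typing import List, Sequence


def masked_gather(memory: Sequence[int],
                  indices: Sequence[int],
                  mask: Sequence[bool],
                  passthru: Sequence[int]) -> List[int]:
    """Per-lane load: enabled lanes read memory[index], disabled lanes keep passthru."""
    if not (len(indices) == len(mask) == len(passthru)):
        raise ValueError("all lane vectors must have the same length")
    # The conditional expression only evaluates memory[i] for enabled lanes,
    # so a disabled lane may carry an invalid index without faulting.
    return [memory[i] if m else p for i, m, p in zip(indices, mask, passthru)]


memory = [10, 20, 30, 40]
result = masked_gather(memory,
                       indices=[3, 0, 99, 1],          # lane 2 holds an invalid index
                       mask=[True, True, False, True],  # lane 2 is disabled
                       passthru=[-1, -1, -1, -1])
assert result == [40, 10, -1, 20]
```

Expressing this as one operation instead of a hand-unrolled per-lane loop matches what the commit describes: the generated IR contains a single call in place of many scalar loads and inserts.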
*	swr/rast: Misc cleanup	George Kyriazis	2018-02-16	3	-49/+60
\| \| \| \| \| \|	Together with correct detection of clipDistance NaNs when no cullDistance is set Reviewed-by: Bruce Cherniak <[email protected]>