mesa.git

	Commit message	Author	Age	Files	Lines
*	swr/rast: Replace INSERT2 vextract/vinsert with JOIN2 vshuffle	Tim Rowley	2017-12-15	3	-105/+30
\| \| \| \|	Reviewed-by: Bruce Cherniak <[email protected]>
*	swr/rast: SIMD16 Fetch - Fully widen 16-bit float vertex components	Tim Rowley	2017-12-15	1	-7/+48
\| \| \| \|	Reviewed-by: Bruce Cherniak <[email protected]>
*	swr/rast: SIMD16 Fetch - Fully widen 32-bit float vertex components	Tim Rowley	2017-12-15	4	-32/+194
\| \| \| \|	Reviewed-by: Bruce Cherniak <[email protected]>
*	swr/rast: Pass prim to ClipSimd	Tim Rowley	2017-12-15	1	-5/+5
\| \| \| \|	Reviewed-by: Bruce Cherniak <[email protected]>
*	swr/rast: Pull most of the VPAI manipulation out of the binner/clipper	Tim Rowley	2017-12-15	7	-158/+177
\| \| \| \| \| \|	Move out of binner/clipper; hand them down from the frontend code instead. Reviewed-by: Bruce Cherniak <[email protected]>
*	swr/rast: Move GatherScissors to header	Tim Rowley	2017-12-15	2	-127/+127
\| \| \| \|	Reviewed-by: Bruce Cherniak <[email protected]>
*	swr/rast: Rewrite Shuffle8bpcGatherd using shuffle	Tim Rowley	2017-12-15	1	-182/+62
\| \| \| \| \| \|	Ease future code maintenance, prepare for folding simd8 and simd16 versions. Reviewed-by: Bruce Cherniak <[email protected]>
*	swr/rast: Convert gather masks to Nx1bit	Tim Rowley	2017-12-15	2	-40/+14
\| \| \| \| \| \| \|	Simplifies calling code, gets gather function interface closer to llvm's masked_gather. Reviewed-by: Bruce Cherniak <[email protected]>
*	swr/rast: WIP - Widen fetch shader to SIMD16	Tim Rowley	2017-12-15	1	-27/+689
\| \| \| \| \| \|	Widen vertex gather/storage to SIMD16 for all component types. Reviewed-by: Bruce Cherniak <[email protected]>
*	swr/rast: Corrections to multi-scissor handling	Tim Rowley	2017-12-15	1	-88/+88
\| \| \| \| \| \| \|	binner's GatherScissors() will be turned into a real gather in the not too distant future. Reviewed-by: Bruce Cherniak <[email protected]>
*	swr/rast: Binner fixes for viewport index offset handling	Tim Rowley	2017-12-15	2	-2/+12
\| \| \| \|	Reviewed-by: Bruce Cherniak <[email protected]>
*	swr/rast: Remove unneeded copy of gather mask	Tim Rowley	2017-12-15	2	-79/+23
\| \| \| \|	Reviewed-by: Bruce Cherniak <[email protected]>
*	i965: Allow old begin/end queryobj for gen4/5 with HW contexts	Chris Wilson	2017-12-15	1	-6/+0
\| \| \| \| \| \| \| \| \| \| \| \| \|	Since we have HW contexts on gen4/5, we could take advantage of them, as done for gen6+ in commit e32cd5ffbb72 ("i965: Rely on hardware contexts for query objects on Gen6+."), to only emit a pair of counters at begin/end queryobj, rather than around every primitive. However, to keep queryobj working in the meantime as we bring up support for HW ctx on gen4/5, we can keep using the existing code. References: e32cd5ffbb72 ("i965: Rely on hardware contexts for query objects on Gen6+.") Cc: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
*	freedreno: use u_transfer_helper	Rob Clark	2017-12-15	2	-229/+44
\| \| \| \|	Signed-off-by: Rob Clark <[email protected]>
*	gallium/util: add u_transfer_helper	Rob Clark	2017-12-15	5	-1/+649
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add a new helper that drivers can use to emulate various things that need special handling in particular in transfer_map: 1) z32_s8x24.. gl/gallium treats this as a single buffer with depth and stencil interleaved but hardware frequently treats this as separate z32 and s8 buffers. Special pack/unpack handling is needed in transfer_map/unmap to pack/unpack the exposed buffer 2) fake RGTC.. GPUs designed with GLES in mind, but which can otherwise do GL3, if native RGTC is not supported it can be emulated by converting to uncompressed internally, but needs pack/unpack in transfer_map/unmap 3) MSAA resolves in the transfer_map() case v2: add MSAA resolve based on Eric's "gallium: Add helpers for MSAA resolves in pipe_transfer_map()/unmap()." patch; avoid wrapping pipe_resource, to make it possible for drivers to use both this and threaded_context. Signed-off-by: Rob Clark <[email protected]>
*	i965: enable EXT_disjoint_timer_query extension	Tapani Pälli	2017-12-15	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \|	Following dEQP cases pass: dEQP-EGL.functional.get_proc_address.extension.gl_ext_disjoint_timer_query dEQP-EGL.functional.client_extensions.disjoint Piglit test 'ext_disjoint_timer_query-simple' passes with these changes. No changes/regression observed in Intel CI. Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
*	mesa: GL_EXT_disjoint_timer_query extension API bits	Tapani Pälli	2017-12-15	6	-1/+30
\| \| \| \| \| \| \| \| \| \| \|	Patch adds GL_GPU_DISJOINT_EXT and enables the use of timer queries when EXT_disjoint_timer_query is enabled. v2: enable extension only when EXT_disjoint_timer_query set Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]> (v1) Reviewed-by: Ian Romanick <[email protected]>
*	glapi: add GL_EXT_disjoint_timer_query	Tapani Pälli	2017-12-15	3	-2/+23
\| \| \| \| \| \| \| \| \|	Most entrypoints already available via other extensions like GL_EXT_occlusion_query_boolean, GL_EXT_timer_query. Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
*	mesa: add DisjointOperation to gl_shared_state	Tapani Pälli	2017-12-15	2	-0/+9
\| \| \| \| \| \| \| \| \|	This state will be used by EXT_disjoint_timer_query. As first usage, patch sets DisjointOperation true when gpu reset happens. Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
*	broadcom/vc5: Fix a typo in memcmp for sig unpack checking.	Eric Anholt	2017-12-14	1	-1/+1
\| \| \| \| \| \| \|	This shockingly ended up working out, because only the first byte of sig is used and (sizeof(sig) != 0) == 1. Fixes a compiler warning. Link: https://bugs.freedesktop.org/show_bug.cgi?id=104183
*	broadcom/vc5: Enable NIR txd lowering on all txd instructions.	Eric Anholt	2017-12-14	1	-0/+1
\| \| \| \| \| \| \| \|	Fixes almost all of piglit's arb_shader_texture_lod grad tests, except for the base -texgrad/texgradcube ones which fail on what appear to be precision problems. Reviewed-by: Ian Romanick <[email protected]>
*	nir: Add a new lowering option to lower all txd to txl.	Eric Anholt	2017-12-14	2	-6/+14
\| \| \| \| \| \|	VC5 requires that all txd are lowered in the shader. Reviewed-by: Ian Romanick <[email protected]>
*	nir: Fix interaction of GL_CLAMP lowering with texture offsets.	Eric Anholt	2017-12-14	1	-33/+42
\| \| \| \| \| \| \| \| \| \| \|	We want the clamping of the coordinate to apply after the offset, so we need to do math to lower the offset out of the instruction. Fixes texwrap offset cases for GL_CLAMP with GL_NEAREST on vc5. Note: I moved the get_texture_size() verbatim, so that it was defined before use. Reviewed-by: Ian Romanick <[email protected]>
*	broadcom/vc5: Fix shader input/outputs for gallium's new NIR linking.	Eric Anholt	2017-12-14	1	-4/+8
\|
*	gallivm: implement accurate corner behavior for textureGather with cube maps	Roland Scheidegger	2017-12-14	1	-103/+201
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The spec says the missing texel (when we wrap around both x and y axis) should be synthesized as the average of the 3 other texels. For bilinear filtering however we instead adjusted the filter weights (because, while the complexity looks similar, there would be 4 times as many color values to fix up than weights). Obviously this could not work for gather (hence accurate corner filtering was disabled with gather). Implement this by just doing it as the spec implies - calculate the 4th texel as the average of the other 3. With gather of course there's only one color to worry about, so it's not all that many instructions either (albeit surely the whole cube map filtering is hilariously complex). Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
*	gallivm: fix an issue with NaNs with seamless cube filtering	Roland Scheidegger	2017-12-14	1	-0/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Cube texture wrapping is a bit special since the values (post face projection) always are within [0,1], so we took advantage of that and omitted some clamps. However, we can still get NaNs (either because the coords already had NaNs, or the face projection generated them), and in fact we didn't handle them quite safely. I've seen -INT_MAX + 1 being propagated through as the final int coord value, albeit I didn't observe a crash. (Not quite a coincidence, since any stride mul with -INT_MAX or -INT_MAX+1 will turn up as a small positive number - nevertheless, I'd rather not try my luck, I'm not entirely sure it can't really turn up negative either due to seamless coord swapping, plus ifloor of a NaN is not guaranteed to return -INT_MAX by any standard. And we kill off NaNs similarly with ordinary texture wrapping too.) So kill off the NaNs by using the common max against zero method. Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
*	intel/tools: Convert aubinator over to the common framework	Jason Ekstrand	2017-12-14	3	-690/+33
\| \| \| \|	Reviewed-by: Lionel Landwerlin <[email protected]>
*	intel/batch-decoder: Decode registers	Jason Ekstrand	2017-12-14	1	-0/+13
\| \| \| \|	Reviewed-by: Lionel Landwerlin <[email protected]>
*	intel/batch-decoder: Decode dynamic state	Jason Ekstrand	2017-12-14	1	-0/+81
\| \| \| \| \| \| \| \|	Unfortunately, in aubinator and aubinator_error_decode we don't always know how many of a given state we have, so we must guess. One day, we'll come up with a way to annotate the batch to solve this problem. Reviewed-by: Lionel Landwerlin <[email protected]>
*	intel/batch-decoder: Decode constants, binding tables, and samplers	Jason Ekstrand	2017-12-14	1	-0/+73
\| \| \| \|	Reviewed-by: Lionel Landwerlin <[email protected]>
*	intel/tools: Switch aubinator_error_decode over to the gen_print_batch	Jason Ekstrand	2017-12-14	3	-205/+37
\| \| \| \| \| \| \|	The shared framework can now do everything that aubinator_error_decode ever did and more. It's time to make the switch. Reviewed-by: Lionel Landwerlin <[email protected]>
*	intel/batch-decoder: Decode graphics shaders	Jason Ekstrand	2017-12-14	1	-0/+95
\| \| \| \|	Reviewed-by: Lionel Landwerlin <[email protected]>
*	intel/batch-decoder: Decode vertex and index buffers	Jason Ekstrand	2017-12-14	2	-0/+161
\| \| \| \|	Reviewed-by: Lionel Landwerlin <[email protected]>
*	intel/batch-decoder: Decode MEDIA_INTERFACE_DESCRIPTOR_LOAD	Jason Ekstrand	2017-12-14	1	-0/+145
\| \| \| \|	Reviewed-by: Lionel Landwerlin <[email protected]>
*	intel/tools: Add the start of a generic batch decoder	Jason Ekstrand	2017-12-14	2	-0/+306
\| \| \| \|	Reviewed-by: Lionel Landwerlin <[email protected]>
*	intel/decoder: Expose the raw field value in the iterator	Jason Ekstrand	2017-12-14	2	-1/+3
\| \| \| \|	Reviewed-by: Lionel Landwerlin <[email protected]>
*	intel/disasm: Take a devinfo in gen_disasm_create	Jason Ekstrand	2017-12-14	4	-8/+7
\| \| \| \|	Reviewed-by: Lionel Landwerlin <[email protected]>
*	intel/decoder: Take a bit offset in gen_print_group	Jason Ekstrand	2017-12-14	5	-23/+27
\| \| \| \| \| \| \| \| \| \|	Previously, if a group was nested in another group such that it didn't start on a dword boundary, we would decode it as if it started at the start of its first dword. This changes things to work even more in terms of bits so that we can properly decode these structs. This affects MOCS, attribute swizzles, and several other things. Reviewed-by: Lionel Landwerlin <[email protected]>
*	intel/decoder: Stop rounding down to the nearest dword	Jason Ekstrand	2017-12-14	1	-11/+12
\| \| \| \|	Reviewed-by: Lionel Landwerlin <[email protected]>
*	intel/decoder: Convert the iterator to work entirely in bits	Jason Ekstrand	2017-12-14	2	-12/+9
\| \| \| \|	Reviewed-by: Lionel Landwerlin <[email protected]>
*	intel/decoder: Drop gen_field_decode helper	Jason Ekstrand	2017-12-14	2	-11/+0
\| \| \| \| \| \|	It's unused Reviewed-by: Lionel Landwerlin <[email protected]>
*	amd/common: add ac_build_waitcnt()	Samuel Pitoiset	2017-12-14	6	-27/+17
\| \| \| \| \|	Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	amd/common: more use of i32_1	Samuel Pitoiset	2017-12-14	1	-4/+4
\| \| \| \| \|	Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	amd/common: more use of i32_0	Samuel Pitoiset	2017-12-14	1	-9/+9
\| \| \| \| \|	Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	radeonsi: make use of ac_build_fdiv()	Samuel Pitoiset	2017-12-14	2	-7/+2
\| \| \| \| \| \| \|	And move the comment to amd/common. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	radv: export SampleMask from pixel shaders at full rate	Samuel Pitoiset	2017-12-14	2	-16/+41
\| \| \| \| \| \| \| \| \| \| \|	Use 16_ABGR instead of 32_ABGR if Z isn't written. Ported from RadeonSI. No CTS regressions on Polaris. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	radeonsi: make use of ac_get_spi_shader_z_format()	Samuel Pitoiset	2017-12-14	3	-23/+4
\| \| \| \| \|	Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	amd/common: add ac_get_spi_shader_z_format()	Samuel Pitoiset	2017-12-14	4	-1/+84
\| \| \| \| \| \| \| \|	ac_shader_util.c will contain shader helpers for RadeonSI and RADV. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	radv: do not load the local invocation index when it's unused	Samuel Pitoiset	2017-12-14	4	-2/+7
\| \| \| \| \|	Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	radv: do not load unused gl_LocalInvocationID/gl_WorkGroupID components	Samuel Pitoiset	2017-12-14	1	-3/+8
\| \| \| \| \| \| \| \|	We should also not load the input SGPRs and VGPRS, but let's start with this for now. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>