mesa.git

	Commit message	Author	Age	Files	Lines
*	v3d: Ask the state tracker to lower image accesses off of derefs.	Eric Anholt	2020-02-24	3	-71/+48
\| \| \| \| \| \| \| \|	This saves a bunch of hassle in handling derefs in the backend, and would be needed for reasonable handling of dynamic indexing of image arrays. Reviewed-by: Kenneth Graunke <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3728>
*	broadcom: Fix implicit declaration of ffs for Android build	Jose Maria Casanova Crespo	2020-02-06	1	-0/+1
\| \| \| \| \| \| \| \| \| \|	Include util/bitscan.h to ensure ffs is available when there is no glibc like in Android. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1983 Reviewed-by: Eric Anholt <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2554> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2554>
*	glsl,nir: Switch the enum representing shader image formats to PIPE_FORMAT.	Eric Anholt	2020-02-05	1	-220/+63
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This means you can directly use format utils on it without having to have your own GL enum to number-of-components switch statement (or whatever) in your vulkan backend. Thanks to imirkin for fixing up the nouveau driver (and a couple of core details). This fixes the computed qualifiers for EXT_shader_image_load_store's non-integer sizeNxM qualifiers, which we don't have tests for. Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> (v3d) Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3355> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3355>
*	util/hash_table: update users to use new optimal integer hash functions	Anthony Pesch	2020-01-23	1	-13/+1
\| \| \| \| \| \| \|	Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3475> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3475>
*	nir/lower_atomics_to_ssbo: Also lower barriers	Jason Ekstrand	2020-01-13	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \|	This is more correct for a pass which is supposed to completely lower away atomic counters. It also lets us stop supporting atomic counter barriers in most of the drivers. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>
*	nir: Rename nir_intrinsic_barrier to control_barrier	Jason Ekstrand	2020-01-13	1	-1/+1
\| \| \| \| \| \| \| \|	This is a more explicit name now that we don't want it to be doing any memory barrier stuff for us. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>
*	nir: Add a new memory_barrier_tcs_patch intrinsic	Jason Ekstrand	2020-01-13	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \|	Right now, it's implemented as a no-op for everyone. For most drivers, it's a switch case in the NIR -> whatever which just breaks. For ir3, they already have code to delete tessellation barriers so we just add a case to also delete memory_barrier_tcs_patch. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>
*	v3d: handle writes to gl_Layer from geometry shaders	Iago Toral Quiroga	2019-12-16	3	-0/+53
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When geometry shaders write a value to gl_Layer that doesn't correspond to an existing layer in the target framebuffer the rendering behavior is undefined according to the spec, however, there are CTS tests that trigger this scenario on purpose, probably to ensure that nothing terrible happens. For V3D, this situation is problematic because the binner uses the layer index to select the offset to write into the tile state data, and we only allocate tile state for MAX2(num_layers, 1), so we want to make sure we don't produce values that would lead to out of bounds writes. The simulator has an assert to catch this, although we haven't observed issues in actual hardware it is probably best to play safe. Reviewed-by: Alejandro Piñeiro <[email protected]>
*	v3d: predicate geometry shader outputs inside non-uniform control flow	Iago Toral Quiroga	2019-12-16	1	-0/+15
\| \| \| \|	Reviewed-by: Alejandro Piñeiro <[email protected]>
*	v3d: we always have at least one output segment	Iago Toral Quiroga	2019-12-16	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	If we program an output size of 0 the simulator asserts. This was not a problem until now because our VS would always have to emit fixed function outputs, however, now that it can be paired with a GS we can end up with a VS shader that no longer emits any outputs. Reviewed-by: Alejandro Piñeiro <[email protected]>
*	v3d: compute appropriate VPM memory configuration for geometry shader workloads	Iago Toral Quiroga	2019-12-16	2	-0/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Geometry shaders can output many vertices and thus have higher VPM memory pressure as a result. It is possible that geometry shader dispatches that are too wide exceed the maximum available VPM output allocation, in which case we need to reduce the dispatch width until we can fit the VPM memory requirements. Supported dispatch widths for geometry shaders are 16, 8, 4, 1. There is a limit on the number of VPM output sectors that can be used by a geometry shader that we can meet by lowering the dispatch width at compile time, however, at draw time we need to revisit this number and, together with other elements that can contribute to total VPM memory requirements, decide on a configuration that can fit the program into the available VPM memory. Ideally, we also want to aim for not using more than half of the available memory so that we can run a pair of bin and render programs in parallel. v2: fixed language in comment and typo in commit log. (Alejandro) Reviewed-by: Alejandro Piñeiro <[email protected]>
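The compile-time half of this (lowering the dispatch width until the output fits) can be sketched roughly as below. This is only an illustration under stated assumptions: the cost model in `sketch_output_sectors`, the sector size, the sector limit, and every `sketch_` name are hypothetical and are not taken from the v3d driver. Only the list of widths (16, 8, 4, 1) comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Dispatch widths named in the commit message, widest first. */
static const uint32_t sketch_gs_widths[] = { 16, 8, 4, 1 };

/* HYPOTHETICAL cost model: every invocation in the dispatch needs
 * bytes_per_invocation bytes of VPM output, rounded up to whole sectors. */
static uint32_t
sketch_output_sectors(uint32_t width, uint32_t bytes_per_invocation,
                      uint32_t sector_bytes)
{
        assert(sector_bytes > 0);
        uint64_t total = (uint64_t)width * bytes_per_invocation;
        return (uint32_t)((total + sector_bytes - 1) / sector_bytes);
}

/* Returns the widest dispatch width whose output fits in max_sectors,
 * or 0 if even the 1-way dispatch does not fit. */
static uint32_t
sketch_pick_gs_dispatch_width(uint32_t bytes_per_invocation,
                              uint32_t sector_bytes, uint32_t max_sectors)
{
        size_t count = sizeof(sketch_gs_widths) / sizeof(sketch_gs_widths[0]);

        for (size_t i = 0; i < count; i++) {
                uint32_t width = sketch_gs_widths[i];
                if (sketch_output_sectors(width, bytes_per_invocation,
                                          sector_bytes) <= max_sectors)
                        return width;
        }
        return 0;
}
```

The draw-time step described in the message would re-run a similar check with the other VPM consumers added in, which this sketch does not attempt to model.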
*	v3d: add 1-way SIMD packing definition	Iago Toral Quiroga	2019-12-16	1	-0/+1
\| \| \| \| \| \| \|	According to the documentation, the 1-way dispatch width is only supported with geometry shaders. Reviewed-by: Alejandro Piñeiro <[email protected]>
*	v3d: implement geometry shader instancing	Iago Toral Quiroga	2019-12-16	3	-0/+9
\| \| \| \| \| \| \|	v2: - Remove unused field uses_iid from v3d_gs_prog_data (Alejandro) Reviewed-by: Alejandro Piñeiro <[email protected]>
*	v3d: fix packet descriptions for geometry and tessellation shaders	Iago Toral Quiroga	2019-12-16	1	-10/+30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Every code address starts at bit 3 (addresses must be 64-bit aligned), with the first 3 bits used to specify threading and NaN propagation parameters for the shader program. We generally skip "reserved" bits, however, doing this when the reserved field is the last in a struct and it is large enough can make us compute incorrect (smaller) struct sizes which can lead to corrupt CLs. In particular, the "Tess/Geom Common Params" struct has a reserved field at the end that is 8-bit, so if we don't include this we compute a packet size that is 1 byte smaller than it should be, making the next packet we emit start 1 byte earlier and therefore leading to incorrect CL data from that point forward. The name of one of the fields was not correct. Reviewed-by: Alejandro Piñeiro <[email protected]>
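Both points can be illustrated with a small sketch. It assumes that a packet's byte size is derived from the highest bit covered by any described field, which is how the size error above would arise. The `sketch_` names and the field layout used in the usage note are hypothetical and do not reproduce the real genxml description.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* HYPOTHETICAL description of one packet field. */
struct sketch_field {
        uint32_t start_bit;
        uint32_t num_bits;
};

/* Packet size in bytes, derived from the last bit any field covers.
 * Leaving a trailing reserved field out of the list shrinks the result. */
static uint32_t
sketch_packet_bytes(const struct sketch_field *fields, size_t count)
{
        uint32_t end_bit = 0;

        for (size_t i = 0; i < count; i++) {
                assert(fields[i].num_bits > 0);
                uint32_t field_end = fields[i].start_bit + fields[i].num_bits;
                if (field_end > end_bit)
                        end_bit = field_end;
        }
        return (end_bit + 7) / 8;
}

/* A 64-bit aligned address has its low 3 bits clear, so those bits can
 * carry the threading / NaN propagation flags in the same word. */
static uint64_t
sketch_pack_code_address(uint64_t address, uint32_t low_flags)
{
        assert((address & 7) == 0);
        assert(low_flags < 8);
        return address | low_flags;
}
```

With a made-up layout of one 24-bit field followed by an 8-bit reserved field, describing both fields gives 4 bytes, while skipping the reserved one gives 3, which is the off-by-one-byte error the commit fixes.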
*	v3d: add initial compiler plumbing for geometry shaders	Iago Toral Quiroga	2019-12-16	5	-79/+610
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Most of the relevant work happens in the v3d_nir_lower_io. Since geometry shaders can write any number of output vertices, this pass injects a few variables into the shader code to keep track of things like the number of vertices emitted or the offsets into the VPM of the current vertex output, etc. This is also where we handle EmitVertex() and EmitPrimitive() intrinsics. The geometry shader VPM output layout has a specific structure with a 32-bit general header, then another 32-bit header slot for each output vertex, and finally the actual vertex data. When vertex shaders are paired with geometry shaders we also need to consider the following: - Only geometry shaders emit fixed function outputs. - The coordinate shader used for the vertex stage during binning must not drop varyings other than those used by transform feedback, since these may be read by the binning GS. v2: - Use MAX3 instead of a chain of MAX2 (Alejandro). - Make all loop variables unsigned in ntq_setup_gs_inputs (Alejandro) - Update comment in IO lowering so it includes the GS stage (Alejandro) Reviewed-by: Alejandro Piñeiro <[email protected]>
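The output layout described above (general header, per-vertex header slots, then vertex data) can be sketched as offset arithmetic in 32-bit words. This assumes the three regions are packed back to back, that one header slot is reserved for each of the maximum output vertices, and that every vertex uses the same number of data words. Those assumptions and all `sketch_` names are illustrative and are not taken from v3d_nir_lower_io.

```c
#include <assert.h>
#include <stdint.h>

/* Word 0 holds the general header; vertex header slots follow it. */
static uint32_t
sketch_gs_vertex_header_offset(uint32_t vertex)
{
        return 1 + vertex;
}

/* Vertex data starts after the general header and all header slots. */
static uint32_t
sketch_gs_vertex_data_offset(uint32_t max_vertices, uint32_t words_per_vertex,
                             uint32_t vertex)
{
        assert(vertex < max_vertices);
        return 1 + max_vertices + vertex * words_per_vertex;
}

/* Total output size in 32-bit words for one geometry shader invocation. */
static uint32_t
sketch_gs_output_words(uint32_t max_vertices, uint32_t words_per_vertex)
{
        return 1 + max_vertices + max_vertices * words_per_vertex;
}
```

For example, with 4 output vertices of 8 words each, the data for vertex 2 would begin at word 21 and the whole output would span 37 words under these assumptions.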
*	v3d: remove unused variable	Iago Toral Quiroga	2019-12-16	1	-4/+1
\| \| \| \|	Reviewed-by: Alejandro Piñeiro <[email protected]>
*	v3d: enable debug options for geometry shader dumps	Iago Toral Quiroga	2019-12-16	2	-10/+12
\| \| \| \|	Reviewed-by: Alejandro Piñeiro <[email protected]>
*	v3d: add debug assert	Iago Toral Quiroga	2019-12-16	1	-0/+1
\| \| \| \| \| \| \| \|	While lowering vpm outputs we look for the NIR variables matching particular store output instructions and we expect to find a match, so assert on that. Reviewed-by: Alejandro Piñeiro <[email protected]>
*	v3d: add missing plumbing for VPM load instructions	Iago Toral Quiroga	2019-12-16	2	-0/+7
\| \| \| \| \| \| \|	We will need to use LDVPMG_IN specifically to read VPM inputs in geometry shaders. Reviewed-by: Alejandro Piñeiro <[email protected]>
*	meson/broadcom: libbroadcom_cle also needs zlib	Dylan Baker	2019-12-11	1	-1/+1
\| \| \| \| \| \|	Fixes: 1ae8018a6af81eec4832a57d9d0346aa3dd98d28 ("meson: Add support for the vc4 driver.") Reviewed-by: Eric Anholt <[email protected]>
*	meson/broadcom: libbroadcom_cle needs expat headers	Dylan Baker	2019-12-10	1	-1/+1
\| \| \| \| \| \|	Fixes: 1ae8018a6af81eec4832a57d9d0346aa3dd98d28 ("meson: Add support for the vc4 driver.") Reviewed-by: Eric Anholt <[email protected]>
*	nir: Add a scheduler pass to reduce maximum register pressure.	Eric Anholt	2019-11-25	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is similar to a scheduler I've written for vc4 and i965, but this time written at the NIR level so that hopefully it's reusable. A notable new feature it has is Goodman/Hsu's heuristic of "once we've started processing the uses of a value, prioritize processing the rest of their uses", which should help avoid the heuristic otherwise making such systematically bad choices around getting texture results consumed. Results for v3d: total instructions in shared programs: 6497588 -> 6518242 (0.32%) total threads in shared programs: 154000 -> 152828 (-0.76%) total uniforms in shared programs: 2119629 -> 2068681 (-2.40%) total spills in shared programs: 4984 -> 472 (-90.53%) total fills in shared programs: 6418 -> 1546 (-75.91%) Acked-by: Alyssa Rosenzweig <[email protected]> (v1) Reviewed-by: Alejandro Piñeiro <[email protected]> (v2) v2: Use the DAG datastructure, fold in the scheduling-for-parallelism patch, include SSA defs in live values so we can switch to bottom-up if we want. v3: Squash in improvements from Alejandro Piñeiro for getting V3D to successfully register allocate on GLES3.1 dEQP. Make sure that discards don't move after store_output. Comment spelling fix.
*	v3d: adds an extra MOV for any sig.ld*	Alejandro Piñeiro	2019-11-20	1	-4/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Specifically when we are in non-uniform control flow, as we would need to set the condition for the last instruction. If (for example) an image atomic load stores its value directly in a NIR register, last_inst would be a nop, and setting the condition on it would fail. Fixes piglit test: spec/glsl-es-3.10/execution/cs-ssbo-atomic-if-else-2.shader_test Fixes: 6281f26f064ada ("v3d: Add support for shader_image_load_store.") v2: (Changes suggested by Eric Anholt) * Cover all sig.ld* signals, not just ldunif and ldtmu, as all of them have the same restriction. * Update comment explaining why we add a MOV in that case * Tweak commit message. v3: * Drop extra set of parens (Eric) * Add missing ld signal to is_ld_signal to fix shader-db regression. Reviewed-by: Eric Anholt <[email protected]>
*	v3d: Fix predication with atomic image operations	Jose Maria Casanova Crespo	2019-11-20	1	-0/+12
\| \| \| \| \| \| \| \| \| \| \| \| \|	Fixes dEQP test: dEQP-GLES31.functional.synchronization.inter_call.with_memory_barrier.image_atomic_multiple_interleaved_write_read Fixes piglit test: spec/glsl-es-3.10/execution/cs-image-atomic-if-else.shader_test Fixes: 6281f26f064ada ("v3d: Add support for shader_image_load_store.") Reviewed-by: Alejandro Piñeiro <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
*	util: Move gallium's PIPE_FORMAT utils to /util/format/	Eric Anholt	2019-11-14	2	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	To make PIPE_FORMATs usable from non-gallium parts of Mesa, I want to move their helpers out of gallium. Since u_format used util_copy_rect(), I moved that in there, too. I've put it in a separate directory in util/ because it's a big chunk of related code, and it's not clear to me whether we might want it as a separate library from libmesa_util at some point. Closes: #1905 Acked-by: Marek Olšák <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
*	v3d: rename vertex shader key (num)_fs_inputs fields	Iago Toral Quiroga	2019-10-31	4	-10/+11
\| \| \| \| \| \| \| \| \| \| \| \|	Until now this made sense because we always paired vertex shaders with fragment shaders, but as soon as we implement geometry and tessellation shaders that will no longer be the case, so rename this to (num_)used_outputs. v2: Use 'used_outputs' instead of ns_outputs, which is more explicit (Eric). Reviewed-by: Alejandro Piñeiro <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
*	util: rename list_empty() to list_is_empty()	Timothy Arceri	2019-10-28	3	-4/+4
\| \| \| \| \| \| \|	This makes it clear that it's a boolean test and not an action (eg. "empty the list"). Reviewed-by: Eric Engestrom <[email protected]>
*	v3d: fix empty-body instruction	Eric Engestrom	2019-10-27	1	-1/+1
\| \| \| \| \| \| \|	Fixes: 8d43e2b2ded0fe3c82d4 ("meson: add -Werror=empty-body to disallow `if(x);`") Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	Revert "v3d: do not report alpha-test as supported"	Erik Faye-Lund	2019-10-23	2	-0/+11
\| \| \| \| \| \| \|	This reverts commit 9d0523b569bb7208c6e74cafc0f3945415d94336. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Jose Maria Casanova <[email protected]>
*	nir/lower_idiv: add new llvm-based path	Rhys Perry	2019-10-21	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	v2: make variable names snake_case v2: minor cleanups in emit_udiv() v2: fix Panfrost build failure v3: use an enum instead of a boolean flag in nir_lower_idiv()'s signature v4: remove nir_op_urcp v5: drop nv50 path v5: rebase v6: add back nv50 path v6: add comment for nir_lower_idiv_path enum v7: rename _nv50/_llvm to _fast/_precise v8: fix etnaviv build failure Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Daniel Schürmann <[email protected]>
*	broadcom: document known hardware issues for L2T flush command	Iago Toral Quiroga	2019-10-18	1	-0/+35
\| \| \| \| \|	Suggested-by: Eric Anholt <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
*	v3d: add new flag dirty TMU cache at v3d_compiler	Iago Toral Quiroga	2019-10-18	5	-0/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The flag is set for any TMU write, both for spills and for general TMU accesses. It is then used later as part of v3d_emit_gl_shader_state. v2: add a new flag at v3d_compiler instead of dirtying the flag at v3dx if there is any spill (change suggested by Eric, added by Alejandro) v3: set this for anything that is not a load and do it also in v3d40_vir_emit_image_load_store (Eric) Reviewed-by: Eric Anholt <[email protected]>
*	v3d: do not report alpha-test as supported	Erik Faye-Lund	2019-10-17	2	-11/+0
\| \| \| \| \| \| \|	This triggers lowering in the state-tracker, which makes things a bit simpler. Reviewed-by: Marek Olšák <[email protected]>
*	nir: support feeding state to nir_lower_clip_[vg]s	Erik Faye-Lund	2019-10-17	1	-1/+1
\| \| \| \|	Reviewed-by: Marek Olšák <[email protected]>
*	nir: support lowering clipdist to arrays	Erik Faye-Lund	2019-10-17	1	-2/+3
\| \| \| \| \| \| \| \|	This allows us to make sure clipdist is emitted as a scalar array rather than two vec4s. This matches SPIR-V semantics, and will be useful for Zink. Reviewed-by: Marek Olšák <[email protected]>
*	nir: allow passing alpha-ref state to lowering-code	Erik Faye-Lund	2019-10-17	1	-1/+1
\| \| \| \|	Reviewed-by: Marek Olšák <[email protected]>
*	nir: add nir_shader_compiler_options::lower_to_scalar	Marek Olšák	2019-10-10	1	-0/+1
\| \| \| \| \| \| \| \|	This will replace PIPE_SHADER_CAP_SCALAR_ISA. Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Christian Gmeiner <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	v3d: Enable the late algebraic optimizations to get real subs.	Eric Anholt	2019-09-30	1	-0/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This worked better than my original v3d-local pass for just subs, and is a huge win over not producing subs. total instructions in shared programs: 6408469 -> 6167932 (-3.75%) total threads in shared programs: 153784 -> 154104 (0.21%) total uniforms in shared programs: 2157078 -> 1905823 (-11.65%) total max-temps in shared programs: 904546 -> 895796 (-0.97%) total spills in shared programs: 4959 -> 4993 (0.69%) total fills in shared programs: 6558 -> 6670 (1.71%) total sfu-stalls in shared programs: 25845 -> 25175 (-2.59%) total inst-and-stalls in shared programs: 6434314 -> 6193107 (-3.75%) Reviewed-by: Daniel Schürmann <[email protected]> Reviewed-by: Connor Abbott <[email protected]>
*	broadcom/genxml: Stop manually scrubbing 'α' -> "alpha"	Kenneth Graunke	2019-09-23	1	-1/+0
\| \| \| \| \| \| \|	'α' has never appeared in any genxml files, so there's no need to replace it with the word "alpha". Reviewed-by: Eric Anholt <[email protected]>
*	nir: allow specifying filter callback in lower_alu_to_scalar	Vasily Khoruzhick	2019-09-06	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	Set of opcodes doesn't have enough flexibility in certain cases. E.g. Utgard PP has vector conditional select operation, but condition is always scalar. Lowering all the vector selects to scalar increases instruction number, so we need a way to filter only those ops that can't be handled in hardware. Reviewed-by: Qiang Yu <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]>
*	v3d: writes to magic registers aren't RF writes after THREND	Jose Maria Casanova Crespo	2019-09-05	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Shaders must not attempt to write to the register files in the last three instructions, but that doesn't include the magic registers: nop ; nop ; thrsw; ldtmu.- * ERROR * nop ; nop nop ; nop v2: Simplify validation rules. (Eric Anholt) v3: Adjust validation even more. (Eric Anholt) Reviewed-by: Eric Anholt <[email protected]>
*	nir: Fix num_ssbos when lowering atomic counters	Connor Abbott	2019-09-03	1	-2/+1
\| \| \| \| \| \| \| \| \| \| \| \|	Otherwise it's impossible to know the maximum SSBO index for both internal TGSI shaders from TTN (which don't have any notion of atomic counters and no offset) as well as shaders from GLSL. I fixed everything I could find while grepping for num_ssbos and num_abos, which hopefully is everything (iris was the only user I could find that uses it in a meaningful way). Reviewed-by: Marek Olšák <[email protected]>
*	v3d: Use the correct opcodes for signed image min/max	Jason Ekstrand	2019-08-21	1	-0/+2
\| \| \| \|	Reviewed-by: Eric Anholt <[email protected]>
*	nir: Add explicit signs to image min/max intrinsics	Jason Ekstrand	2019-08-21	2	-4/+8
\| \| \| \| \| \| \| \| \| \| \|	This better matches all the other atomic intrinsics such as those for SSBOs and shared variables where the sign is part of the intrinsic opcode. Both generators (GLSL and SPIR-V) know the sign from the type of the image variable or handle. In SPIR-V, signed min/max are separate opcodes from unsigned. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
*	v3d: clamp gl_PointSize to a minimum of 1.0	Iago Toral Quiroga	2019-08-13	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The OpenGL ES spec requires that the value of gl_PointSize is clamped to an implementation-dependent range matching what is advertised by GL_ALIASED_POINT_SIZE_RANGE. For V3D this is [1.0, 512.0], but the hardware won't clamp to the minimum side of the range and won't render points with a size strictly smaller than 1.0 either, so we need to clamp manually. For points larger than the maximum size of the range the hardware clamps automatically. Fixes piglit test: spec/!opengl 2.0/vs-point_size-zero Reviewed-by: Eric Anholt <[email protected]>
*	v3d: line length style fixes	Iago Toral Quiroga	2019-08-13	1	-26/+33
\| \| \| \|	Reviewed-by: Eric Anholt <[email protected]>
*	v3d: honor the write mask on store operations	Iago Toral Quiroga	2019-08-13	1	-85/+120
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	v2: - Fix incremental update of the const offset when we need to emit a sequence with more than one write because of the writemask. - Do not move the tmu write emission to a separate helper. v3: - Get the store writemask before the loop, use ffs to get the first component to write and clear writemask bits as we process the components (Eric). - Simplified the code that figured out the number of components for the TMU config based on the number of tmu writes for stores and atomics. v4: - Code clean-ups (Eric). Fixes: KHR-GLES31.core.shader_image_load_store.advanced-cast-cs KHR-GLES31.core.shader_image_load_store.advanced-cast-fs KHR-GLES31.core.shader_storage_buffer_object.advanced-switchBuffers-cs KHR-GLES31.core.shader_storage_buffer_object.advanced-switchPrograms-cs KHR-GLES31.core.shader_storage_buffer_object.basic-operations-case1-cs Reviewed-by: Eric Anholt <[email protected]>
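The v3 note (read the writemask once, use ffs to find the first component, clear bits as components are processed) can be sketched as a loop that splits a 4-component writemask into write sequences. Grouping consecutive enabled components into one sequence is an assumption about how the sequences are formed, and `sketch_write_run` and `sketch_split_writemask` are hypothetical names, not the driver's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <strings.h> /* ffs() */

/* HYPOTHETICAL record of one write sequence. */
struct sketch_write_run {
        uint32_t first_component;
        uint32_t num_components;
};

/* Splits a 4-bit writemask into runs of consecutive enabled components.
 * Returns the number of runs stored in runs[]. */
static size_t
sketch_split_writemask(uint32_t writemask, struct sketch_write_run runs[4])
{
        size_t num_runs = 0;

        assert((writemask & ~0xfu) == 0);

        while (writemask) {
                /* ffs() is 1-based, so subtract 1 to get the bit index. */
                uint32_t first = (uint32_t)ffs((int)writemask) - 1;
                uint32_t count = 0;

                /* Consume consecutive set bits, clearing them as we go. */
                while (writemask & (1u << (first + count))) {
                        writemask &= ~(1u << (first + count));
                        count++;
                }

                runs[num_runs].first_component = first;
                runs[num_runs].num_components = count;
                num_runs++;
        }
        return num_runs;
}
```

A writemask of 0xb (components x, y and w) would yield two runs under this scheme: one covering x and y, and one covering w, each needing its own offset update as the v2 note mentions.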
*	v3d: refactor ntq_emit_tmu_general() slightly	Iago Toral Quiroga	2019-08-13	1	-24/+36
\| \| \| \| \| \| \| \| \| \|	When we implement write masks on store operations we might need to emit multiple write sequences for a given store intrinsic. To make that easier, let's split the emission of the tmud instructions to their own block after we are done with the code that only needs to run once no matter how many write sequences we need to emit. Reviewed-by: Eric Anholt <[email protected]>
*	nir: merge and extend nir_opt_move_comparisons and nir_opt_move_load_ubo	Rhys Perry	2019-08-12	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	v2: add to series v3: update Makefile.sources v4: don't remove a comment and break statement v4: use nir_can_move_instr Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
*	v3d: use the GPU to record primitives written to transform feedback	Iago Toral Quiroga	2019-08-08	1	-0/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We can use the PRIMITIVE_COUNTS_FEEDBACK packet to write various primitive counts to a buffer, including the number of primitives written to transform feedback buffers, which will handle buffer overflow correctly. There are a couple of caveats with this: Primitive counters are reset when we emit a 'Tile Binning Mode Configuration' packet, which can happen in the middle of a primitives query, so we need to read the buffer when we submit a job and accumulate the counts in the context so we don't lose them. We also need to do the same when we switch primitive type during transform feedback so we can compute the correct number of recorded vertices from the number of primitives. This is necessary so we can provide an accurate vertex count for draw from transform feedback. v2: - When computing the number of vertices for a primitive, pass in the base primitive, since that is what the hardware will count. - No need to update primitive counts when switching primitive types if the base primitives are the same. - Log perf warning when mapping the primitive counts BO for readback (Eric). - Only emit the primitive counts packet once at job end (Eric). - Use u_upload mechanism for the primitive counts buffer (Eric). - Use the XML to generate indices into the primitive counters buffer (Eric).
Fixes piglit tests: spec/ext_transform_feedback/overflow-edge-cases spec/ext_transform_feedback/query-primitives_written-bufferrange spec/ext_transform_feedback/query-primitives_written-bufferrange-discard spec/ext_transform_feedback/change-size base-shrink spec/ext_transform_feedback/change-size base-grow spec/ext_transform_feedback/change-size offset-shrink spec/ext_transform_feedback/change-size offset-grow spec/ext_transform_feedback/change-size range-shrink spec/ext_transform_feedback/change-size range-grow spec/ext_transform_feedback/intervening-read prims-written Reviewed-by: Eric Anholt <[email protected]>