mesa.git

	Commit message	Author	Age	Files	Lines
*	freedreno/ir3: use lower_global_vars_to_local in cmdline compiler	Rob Clark	2018-04-07	1	-0/+1
\| \| \| \| \| \| \| \|	tgsi_to_nir emits things with arrays as global vars.. and nir->ir3 does lower_locals_to_regs. But nothing was lowering global to local, which breaks compiling tgsi shaders Signed-off-by: Rob Clark <[email protected]>
*	nir+drivers: add helpers to get # of src/dest components	Rob Clark	2018-04-03	1	-5/+1
\| \| \| \| \| \| \| \| \|	Add helpers to get the number of src/dest components for an intrinsic, and update spots that were open-coding this logic to use the helpers instead. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
*	freedreno/ir3: fix fallout of unused false-depth elimination	Rob Clark	2018-04-03	2	-17/+19
\| \| \| \| \| \| \| \| \| \|	Since we were using the MARK flag both for preventing loops and for tracking whether instructions were used, we could end up in an infinite loop due to bd2ca2bcdd. Instead invert the logic.. mark all instructions UNUSED up front and clear the flag as we visit them. Fixes: bd2ca2bcdd freedreno/ir3: eliminate unused false-deps Signed-off-by: Rob Clark <[email protected]>
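The inverted marking described in this commit can be illustrated with a minimal sketch. It is not the ir3 code: the struct, the `SKETCH_UNUSED` flag, and both function names are hypothetical, and instructions are reduced to indices into an array. Every instruction starts out flagged UNUSED, and the walk from the shader outputs clears the flag on each instruction it reaches, so a cleared flag means both "used" and "already visited" and a cyclic dependency cannot recurse forever.

```c
/* Hypothetical flag and types for illustration; not the real ir3 definitions. */
#define SKETCH_UNUSED 0x1u
#define SKETCH_MAX_SRCS 2

struct sketch_instr {
   unsigned flags;
   int num_srcs;
   int srcs[SKETCH_MAX_SRCS];   /* indices of the instructions this one reads */
};

static void
sketch_visit(struct sketch_instr *instrs, int idx)
{
   struct sketch_instr *instr = &instrs[idx];

   /* Flag already cleared: instruction was visited, so stop (this ends cycles). */
   if (!(instr->flags & SKETCH_UNUSED))
      return;

   /* Clearing the flag records the instruction as used before recursing. */
   instr->flags &= ~SKETCH_UNUSED;

   for (int i = 0; i < instr->num_srcs; i++)
      sketch_visit(instrs, instr->srcs[i]);
}

static void
sketch_mark_used(struct sketch_instr *instrs, int n,
                 const int *outputs, int num_outputs)
{
   /* Mark everything UNUSED up front... */
   for (int i = 0; i < n; i++)
      instrs[i].flags |= SKETCH_UNUSED;

   /* ...then clear the flag on everything reachable from the outputs. */
   for (int i = 0; i < num_outputs; i++)
      sketch_visit(instrs, outputs[i]);
}
```

After `sketch_mark_used()` returns, any instruction that still carries `SKETCH_UNUSED` was never reached from an output and is a candidate for removal.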
*	freedreno/a5xx: don't align height for PIPE_BUFFER	Rob Clark	2018-04-01	1	-1/+1
\| \| \| \| \| \| \| \| \|	Buffers can be large, so we probably don't want to make them all 32x bigger. But they can't be rendered to (at least in GL) so we don't need this workaround to prevent page faults on mem<->gmem. Cc: "18.0" <[email protected]> Signed-off-by: Rob Clark <[email protected]>
*	freedreno/a5xx: fix page faults on last level	Rob Clark	2018-04-01	1	-0/+10
\| \| \| \| \| \| \| \| \| \|	We could alternatively fall back to using "old style" draw's for mem<->gmem (ie. what <= a4xx do) when height is not aligned to 32, but that is somewhat more work (and not really something that could be applied to stable) Cc: "18.0" <[email protected]> Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: fix issue w/ glamor composite shaders	Rob Clark	2018-03-31	2	-2/+36
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fixes an issue that became possible when we started lowering phi webs to regs (a7ea2b4e) (although was not really seen until we also switched to using peephole select pass (ec8bc54a) instead of lowering all if/else to select). If texture coord (or anything else that uses create_collect() to collect scalar values in a sequence of scalar registers) was consuming a value produced on either side of an if/else (ie. a phi lowered to nir reg, which in ir3 is an "array" of length 1) then register allocation would happen incorrectly and we'd end up sampling from garbage coordinates. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: more half-precision fixes	Rob Clark	2018-03-31	2	-8/+37
\| \| \| \| \| \| \| \|	Some instructions require src/dst to be in full or half precision register depending on src/dst type. So do a better job of propagating register type. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: add helper to create immed of specified size	Rob Clark	2018-03-31	1	-4/+11
\| \| \| \| \| \| \|	We'll also need to be able to create a half-precision immediate. So re-work create_immed(). Prep work for following patch. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: pass ctx instead of block to create_collect()	Rob Clark	2018-03-31	1	-18/+19
\| \| \| \| \| \|	Prep work for following patch. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: eliminate unused false-deps	Rob Clark	2018-03-31	2	-11/+31
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously false-dependencies would get flagged as used, even if the only "use" was a false dep to (for example) prevent a load from being scheduled after a store. In addition to being pointless instructions, in some cases they can cause problems. For example, ldg (and similar instructions) depend on an immed arg getting CP'd into the instruction, but this doesn't happen if an instruction is otherwise unused. Which can result in undefined results (overwriting unintended registers). Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: add local_group_size	Rob Clark	2018-03-31	3	-2/+12
\| \| \| \|	Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: clear SSA flag when assigning "ARRAY" regs too	Rob Clark	2018-03-31	1	-0/+1
\| \| \| \| \| \|	Avoids a misleading "INVALID FLAGS" warning in debug builds. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: print array live ranges	Rob Clark	2018-03-31	1	-4/+10
\| \| \| \| \| \|	This is also useful to see if optmsgs are enabled. Signed-off-by: Rob Clark <[email protected]>
*	freedreno: a2xx: Implement DP2 instruction	Wladimir J. van der Laan	2018-03-31	1	-0/+21
\| \| \| \| \| \| \| \|	Use DOT2ADDv instruction with 0.0f constant add. Signed-off-by: Wladimir J. van der Laan <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Rob Clark <[email protected]>
*	freedreno: a2xx: implement SEQ/SNE instructions	Wladimir J. van der Laan	2018-03-31	1	-3/+20
\| \| \| \| \| \| \| \| \|	Extend translate_sge_slt to emit these, in analogous fashion but using CNDEv. Signed-off-by: Wladimir J. van der Laan <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Rob Clark <[email protected]>
*	freedreno: a2xx: Compressed textures support	Wladimir J. van der Laan	2018-03-31	1	-0/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add support for: - PIPE_FORMAT_ETC1_RGB8 - PIPE_FORMAT_DXT1_RGB - PIPE_FORMAT_DXT1_RGBA - PIPE_FORMAT_DXT3_RGBA - PIPE_FORMAT_DXT5_RGBA Signed-off-by: Wladimir J. van der Laan <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Rob Clark <[email protected]>
*	freedreno: a2xx: Support TEXTURE_RECT	Wladimir J. van der Laan	2018-03-31	3	-1/+4
\| \| \| \| \| \| \| \| \|	Denormalized texture coordinates are required for text rendering in GALLIUM_HUD. Signed-off-by: Wladimir J. van der Laan <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Rob Clark <[email protected]>
*	freedreno: a2xx: Prevent crash in emit_texture if view is not set	Wladimir J. van der Laan	2018-03-31	1	-3/+10
\| \| \| \| \| \| \| \| \| \|	Textures will sometimes be updated if texture view state was un-set, without this change that causes an assertion crash or segfault. Signed-off-by: Wladimir J. van der Laan <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Rob Clark <[email protected]>
*	freedreno: a2xx: Fix fd2_tex_swiz	Wladimir J. van der Laan	2018-03-31	1	-9/+9
\| \| \| \| \| \| \| \| \| \| \|	Compose swizzles using util_format_compose_swizzles instead of the custom code (which somehow had a bug). This makes the GL_ALPHA internal format work. Signed-off-by: Wladimir J. van der Laan <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Rob Clark <[email protected]>
*	freedreno: a2xx: Change use of BLEND_ to BLEND2_	Wladimir J. van der Laan	2018-03-31	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Change use of BLEND_ to BLEND2_. BLEND_* is a3xx_rb_blend_opcode; BLEND2_* is a2xx_rb_blend_opcode. This makes no effective difference as the used enumerant has the same value (0), but the other enumerants do not match 1-to-1 so this will avoid future problems. Signed-off-by: Wladimir J. van der Laan <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Rob Clark <[email protected]>
*	freedreno: a2xx: Update rnndb header for formats enumeration	Wladimir J. van der Laan	2018-03-31	1	-20/+13
\| \| \| \| \| \| \| \| \| \|	The format enumeration comes from the yamato register headers that are part of the amd-gpu kernel driver. (see freedreno envytools commit b8fb7978e7ae106d0d11d0b238ab2ba2d4dd9d43) Signed-off-by: Wladimir J. van der Laan <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Rob Clark <[email protected]>
*	util: Move util_is_power_of_two to bitscan.h and rename to util_is_power_of_two_or_zero	Ian Romanick	2018-03-29	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	The new name makes the zero-input behavior more obvious. The next patch adds a new function with different zero-input behavior. Signed-off-by: Ian Romanick <[email protected]> Suggested-by: Matt Turner <[email protected]> Reviewed-by: Alejandro Piñeiro <[email protected]>
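The zero-input distinction this rename points at can be shown with a small sketch. The two functions below are hypothetical stand-ins written for illustration, not copied from bitscan.h, and the name and exact form of the second variant are assumptions. The usual `v & (v - 1)` bit trick returns true for an input of 0, which is the behavior the `_or_zero` suffix spells out; a stricter variant has to reject 0 explicitly.

```c
#include <stdbool.h>

/* Hypothetical stand-in: true for powers of two, and also for 0. */
static inline bool
sketch_is_power_of_two_or_zero(unsigned v)
{
   /* v & (v - 1) clears the lowest set bit; the result is 0 when at most
    * one bit was set, which includes the case v == 0. */
   return (v & (v - 1)) == 0;
}

/* Hypothetical stricter variant (assumed, not from the commit): rejects 0. */
static inline bool
sketch_is_power_of_two_nonzero(unsigned v)
{
   return v != 0 && (v & (v - 1)) == 0;
}
```

A caller that treats 0 as "no alignment requirement" may want the first form, while a caller computing a shift or log2 generally needs the second.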
*	nir: Rename image intrinsics to image_var	Jason Ekstrand	2018-03-23	2	-20/+20
\| \| \| \| \| \| \| \| \| \| \|	Generated with git grep -l nir_intrinsic_image \| xargs \ sed -i 's/nir_intrinsic_image/nir_intrinsic_image_var/g' and some manual fixing in nir_intrinsics.h Reviewed-by: Timothy Arceri <[email protected]>
*	gallium: add packed uniform CAP	Timothy Arceri	2018-03-20	1	-0/+1
\| \| \| \|	Reviewed-by: Marek Olšák <[email protected]>
*	freedreno/ir3: start dealing with half-precision	Rob Clark	2018-03-05	3	-30/+81
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some instructions assume src and/or dst is half-precision based on a type field (ie. f32/s32/u32 are full precision but others are half precision). So add some code to sanity check the src/dst registers to catch mixups. Also propagate half-precision flag for SSA sources. The instruction consuming a SSA value needs to be of the same type as the one producing it. This is probably not complete half-precision support, but a useful first step. We do still need to add support for nir alu instructions for converting between half/full precision. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: fix fixing-up register footprint	Rob Clark	2018-03-05	2	-18/+27
\| \| \| \| \| \| \| \| \| \| \| \| \|	It isn't just vertex shaders that need to fixup reg footprint for inputs populated before shader starts. This problem showed up with compute shaders. If you have (for example) a localregid sysval, but only the .x component is used, the hw still writes the .yz components, which could overflow into other threads causing corruption. Showed up in cl cts 'basic/test_basic intmath_int'. But in theory the same problem could crop up elsewhere. Signed-off-by: Rob Clark <[email protected]>
*	freedreno: surfaces can be PIPE_BUFFER	Rob Clark	2018-03-05	1	-4/+10
\| \| \| \| \| \|	At least for clover. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/a5xx: handle compute resources	Rob Clark	2018-03-05	1	-2/+4
\| \| \| \| \| \|	Not entirely sure why this is a different BIND bit, but it is. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: ignore return jump	Rob Clark	2018-03-05	1	-0/+1
\| \| \| \| \| \| \|	I think this should also always only occur at the end of a BB (by definition), and the BB successor should be the end block. Signed-off-by: Rob Clark <[email protected]>
*	freedreno: add some more compute caps	Rob Clark	2018-03-05	2	-4/+21
\| \| \| \|	Signed-off-by: Rob Clark <[email protected]>
*	freedreno/a5xx: don't expose 64b pointers yet	Rob Clark	2018-03-05	1	-2/+5
\| \| \| \| \| \| \|	Temporary hack, but since we can't do 64b math yet in ir3, pretend that we don't support 64b pointers. Signed-off-by: Rob Clark <[email protected]>
*	freedreno: steal handy macro for compute caps from nouveau	Rob Clark	2018-03-05	1	-42/+17
\| \| \| \|	Signed-off-by: Rob Clark <[email protected]>
*	freedreno: add global_bindings state	Rob Clark	2018-03-05	4	-4/+85
\| \| \| \|	Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: small cleanup	Rob Clark	2018-03-05	1	-3/+3
\| \| \| \|	Signed-off-by: Rob Clark <[email protected]>
*	freedreno: add pctx->memory_barrier()	Rob Clark	2018-03-05	1	-0/+8
\| \| \| \|	Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: cmdline compiler updates for spv shaders	Rob Clark	2018-03-05	1	-0/+7
\| \| \| \|	Signed-off-by: Rob Clark <[email protected]>
*	nir: add lower_ldexp to nir compiler options	Timothy Arceri	2018-02-28	1	-0/+1
\| \| \| \|	Reviewed-by: Marek Olšák <[email protected]>
*	freedreno/ir3: fix use_count refcnt'ing issue	Rob Clark	2018-02-20	1	-0/+1
\| \| \| \| \| \| \| \| \| \|	Was hitting an assert with vs-varying-array-mat4-index-col-row-wr.shader_test. When eliminating a copy, we were dropping the use_count of the mov that is skipped, but not increasing the use_count of its src instruction. Fixes: 76440fcca91 freedreno/ir3: clean up dangling false-dep's Signed-off-by: Rob Clark <[email protected]>
*	gallium: allow drivers to impose BO flags restrictions on constant buffer 0	Marek Olšák	2018-02-17	1	-0/+1
\| \| \| \|	Required by radeonsi for optimal behavior.
*	meson: freedreno depends on nir	Dylan Baker	2018-02-16	1	-0/+1
\| \| \| \| \| \| \| \| \|	This fixes a race condition in building targets that link in freedreno. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105120 Fixes: 0bbecc5a8548883f76a7 ("meson: define driver dependencies") Signed-off-by: Dylan Baker <[email protected]> Acked-by: Mark Janes <[email protected]>
*	gallium: drop all the guard band float caps.	Dave Airlie	2018-02-14	1	-5/+0
\| \| \| \| \| \| \| \| \| \|	Nobody queries these and nobody sets them to anything useful, the docs say TODO. Drop them until a use appears. Reviewed-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
*	freedreno: small fix for flushing dependent batches	Rob Clark	2018-02-10	1	-0/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Flush a resource's previous write_batch synchronously. Because a resource's associated batches are not updated until after the flush thread submits rendering to the kernel, this was causing a bit of confusion in the following loop. This fixes a bug that appeared with recent stk. Perhaps we need to re-work things a bit to clear out dependent batches in the ctx's thread and use a fence to deal with the period between when a flush is queued and when it is submitted to the kernel. But this will do until time permits a larger refactor. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: intra-block scheduling	Rob Clark	2018-02-10	1	-22/+104
\| \| \| \| \| \| \| \| \| \| \| \|	Because of loops, we can't schedule all of a block's predecessors first. Instead just assume that the result consumed in a block was written far enough away in all paths into a block. And do an intra-block scheduling pass to figure out if there are any cases where we need to insert extra nop's. This works out better than always assuming the worst case (ie. that a value live into a block was written in the last instruction in the predecessor block). Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: "boost" the depth of if/else condition	Rob Clark	2018-02-10	1	-5/+6
\| \| \| \| \| \| \|	Account for the move to predicate register, to try to avoid needing to insert extra NOPs later. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: account for arrays in delayslot calc	Rob Clark	2018-02-10	1	-2/+30
\| \| \| \| \| \| \| \| \| \| \| \| \|	Normally false-deps are not something to consider, since they mostly exist for delay-slot related reasons: * barriers * ordering writes after read * SSBO/image access ordering The exception is a false-dependency on an array store. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: more clever legalize algorithm	Rob Clark	2018-02-10	1	-42/+96
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously we didn't handle flow control in legalize, and instead just set (ss)(sy) on the first instruction in every block. Which isn't very clever. Instead, consider output state of all predecessor blocks, so we only set a sync bit if needed for any possible path leading into a block. Because of loops, we can't require that all successor blocks are legalized before a given block, so instead run in a loop until results converge. Signed-off-by: Rob Clark <[email protected]>
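The "run in a loop until results converge" approach from this commit can be sketched as a small fixed-point iteration. This is a simplified model under stated assumptions, not the ir3 legalize pass: the struct, its fields, and the function name are hypothetical, and the (ss)/(sy) state is collapsed into a single "pending result" bit. Each block's input state is the union of its predecessors' output states, a sync is needed only if the block consumes a value that may still be pending on some incoming path, and the pass repeats until no block's state changes, which is what lets loop back-edges be handled.

```c
#include <stdbool.h>

#define SKETCH_MAX_PREDS 4

/* Hypothetical, simplified block; the real ir3 structures differ. */
struct sketch_block {
   int num_preds;
   int preds[SKETCH_MAX_PREDS]; /* indices of predecessor blocks */
   bool produces;     /* block leaves a not-yet-synced result behind */
   bool consumes;     /* block reads such a result at its start */
   bool needs_sync;   /* result: set a sync bit on the first instruction */
   bool out_pending;  /* output state seen by successor blocks */
};

/* Returns the number of passes taken to converge. */
static int
sketch_legalize(struct sketch_block *blocks, int n)
{
   int passes = 0;
   bool progress;

   do {
      progress = false;
      passes++;

      for (int i = 0; i < n; i++) {
         struct sketch_block *b = &blocks[i];

         /* Input state: pending if any predecessor may leave a result pending. */
         bool in_pending = false;
         for (int p = 0; p < b->num_preds; p++)
            in_pending |= blocks[b->preds[p]].out_pending;

         /* Sync only when a pending value is actually consumed here. */
         bool needs_sync = b->consumes && in_pending;

         /* A sync drains the incoming pending state; new results re-arm it. */
         bool out_pending = b->produces || (in_pending && !b->consumes);

         if (needs_sync != b->needs_sync || out_pending != b->out_pending) {
            b->needs_sync = needs_sync;
            b->out_pending = out_pending;
            progress = true;
         }
      }
   } while (progress);

   return passes;
}
```

In this model the output state only ever moves from "not pending" to "pending", so the iteration is guaranteed to terminate; blocks should start with `needs_sync` and `out_pending` both false.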
*	freedreno/ir3: track block predecessors	Rob Clark	2018-02-10	2	-7/+25
\| \| \| \| \| \|	Useful in the following patches. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: clean up dangling false-dep's	Rob Clark	2018-02-10	2	-0/+46
\| \| \| \| \| \| \| \| \| \| \| \| \|	Maybe there is a better way for this.. where it comes useful is "array" loads, which end up as a false-dep for a later array store. If all the uses of an array load are CP'd into their consumer, it still leaves the dangling array load, leading to funny things like: mov.u32u32 r5.y, r0.y mov.u32u32 r5.y, r0.z Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: handle IMMED for mad 2nd src special case	Rob Clark	2018-02-10	1	-2/+4
\| \| \| \| \| \| \|	Consider also immediates for swapping the first two srcs, because they can be lowered to constant. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: remove ir3 phi instruction	Rob Clark	2018-02-10	8	-205/+16
\| \| \| \| \| \|	Now that we convert phi webs to ssa, we can drop all this. Signed-off-by: Rob Clark <[email protected]>