mesa.git

	Commit message	Author	Age	Files	Lines
*	ac/nir: drop nir_to_llvm_context from glsl_to_llvm_type()	Samuel Pitoiset	2018-02-12	1	-13/+13
\| \| \| \| \|	Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	ac/nir: drop nir_to_llvm_context from visit_var_atomic()	Samuel Pitoiset	2018-02-12	1	-7/+7
\| \| \| \| \|	Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	ac/nir: drop nir_to_llvm_context from visit_vulkan_resource_reindex()	Samuel Pitoiset	2018-02-12	1	-5/+5
\| \| \| \| \|	Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	ac/nir: drop nir_to_llvm_context from visit_load_push_constant()	Samuel Pitoiset	2018-02-12	1	-6/+7
\| \| \| \| \|	Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	ac/nir: drop nir_to_llvm_context from cast_ptr()	Samuel Pitoiset	2018-02-12	1	-3/+3
\| \| \| \| \|	Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	ac/nir: drop nir_to_llvm_context from visit_load_local_invocation_index()	Samuel Pitoiset	2018-02-12	1	-4/+4
\| \| \| \| \|	Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	ac/nir: drop nir_to_llvm_context from emit_f2f16()	Samuel Pitoiset	2018-02-12	1	-15/+14
\| \| \| \| \|	Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	ac: remove unused parameters in abi::load_tess_coord()	Samuel Pitoiset	2018-02-12	3	-10/+5
\| \| \| \| \|	Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	ac/nir: remove useless bitcast in load_tess_coord()	Samuel Pitoiset	2018-02-12	1	-8/+3
\| \| \| \| \| \| \|	nir_intrinsic_load_tess_coord always returns a v3i32. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	ac: add load_resource() to the ABI	Samuel Pitoiset	2018-02-12	2	-7/+25
\| \| \| \| \|	Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	ac: add load_sample_mask_in() to the ABI	Samuel Pitoiset	2018-02-12	3	-8/+17
\| \| \| \| \|	Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	ac: move view_index to the ABI	Samuel Pitoiset	2018-02-12	2	-15/+18
\| \| \| \| \|	Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	ac: move push_constants to the ABI	Samuel Pitoiset	2018-02-12	2	-4/+5
\| \| \| \| \|	Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	ac: move tg_size to the ABI	Samuel Pitoiset	2018-02-12	2	-3/+3
\| \| \| \| \|	Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	ac/nir: remove unused nir_to_llvm_context:{defs,phis}	Samuel Pitoiset	2018-02-12	1	-3/+0
\| \| \| \| \|	Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	egl/gbm: Fix compiler warning about visual matching.	Eric Anholt	2018-02-12	1	-1/+1
\| \| \| \| \| \| \|	The compiler doesn't know that num_visuals > 0. Fixes: 37a8d907cc16 ("egl/gbm: Ensure EGLConfigs match GBM surface format") Reviewed-by: Daniel Stone <[email protected]>
*	freedreno: small fix for flushing dependent batches	Rob Clark	2018-02-10	1	-0/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Flush a resource's previous write_batch synchronously. Because a resource's associated batches are not updated until after the flush thread submits rendering to the kernel, this was causing a bit of confusion in the following loop. This fixes a bug that appeared with recent stk. Perhaps we need to re-work things a bit to clear out dependent batches in the ctx's thread and use a fence to deal with the period between when a flush is queued and when it is submitted to the kernel. But this will do until time permits a larger refactor. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: intra-block scheduling	Rob Clark	2018-02-10	1	-22/+104
\| \| \| \| \| \| \| \| \| \| \| \|	Because of loops, we can't schedule all of a block's predecessors first. Instead just assume that the result consumed in a block was written far enough away in all paths into a block. And do an intra-block scheduling pass to figure out if there are any cases where we need to insert extra nop's. This works out better than always assuming the worst case (ie. that a value live into a block was written in the last instruction in the predecessor block). Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: "boost" the depth of if/else condition	Rob Clark	2018-02-10	1	-5/+6
\| \| \| \| \| \| \|	Account for the move to predicate register, to try to avoid needing to insert extra NOPs later. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: account for arrays in delayslot calc	Rob Clark	2018-02-10	1	-2/+30
\| \| \| \| \| \| \| \| \| \| \| \| \|	Normally false-deps are not something to consider, since they mostly exist for delay-slot related reasons: barriers, ordering writes after read, and SSBO/image access ordering. The exception is a false-dependency on an array store. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: more clever legalize algorithm	Rob Clark	2018-02-10	1	-42/+96
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously we didn't handle flow control in legalize, and instead just set (ss)(sy) on the first instruction in every block. Which isn't very clever. Instead, consider output state of all predecessor blocks, so we only set a sync bit if needed for any possible path leading into a block. Because of loops, we can't require that all successor blocks are legalized before a given block, so instead run in a loop until results converge. Signed-off-by: Rob Clark <[email protected]>
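The convergence loop described above can be sketched roughly as follows. This is a minimal illustration, not the ir3 code: `struct sketch_block`, its `gen`/`kill` flags, and `propagate_pending()` are hypothetical names, and the sync state is reduced to a single boolean per block, whereas the real pass presumably tracks more.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_PREDS 4

/* Hypothetical, simplified block: one "result still pending" flag. */
struct sketch_block {
    size_t num_preds;
    size_t preds[SKETCH_MAX_PREDS]; /* indices of predecessor blocks */
    bool gen;   /* block ends with a result still pending */
    bool kill;  /* block waits for pending results before its end */
    bool in;    /* computed: pending on at least one path into the block */
    bool out;   /* computed: pending when leaving the block */
};

/*
 * Recompute every block's state from its predecessors' output state
 * until nothing changes. Back-edges from loops are the reason a single
 * pass is not enough. Returns the number of passes that were needed.
 */
unsigned propagate_pending(struct sketch_block *blocks, size_t n)
{
    unsigned passes = 0;
    bool progress;

    do {
        progress = false;
        passes++;
        for (size_t i = 0; i < n; i++) {
            struct sketch_block *b = &blocks[i];
            bool in = false;

            /* A sync is needed if ANY path into the block needs it. */
            for (size_t p = 0; p < b->num_preds; p++)
                in = in || blocks[b->preds[p]].out;

            bool out = b->gen || (in && !b->kill);

            if (in != b->in || out != b->out) {
                b->in = in;
                b->out = out;
                progress = true;
            }
        }
    } while (progress);

    return passes;
}
```

In this sketch a block would get a sync bit on its first instruction only when `in` is true. For a loop whose body sets `gen`, the loop header picks up the pending state through the back-edge on the second pass, which is why the iteration is repeated until the results are stable.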
*	freedreno/ir3: track block predecessors	Rob Clark	2018-02-10	2	-7/+25
\| \| \| \| \| \|	Useful in the following patches. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: clean up dangling false-dep's	Rob Clark	2018-02-10	2	-0/+46
\| \| \| \| \| \| \| \| \| \| \| \| \|	Maybe there is a better way to do this. Where it is useful is "array" loads, which end up as a false-dep for a later array store. If all the uses of an array load are CP'd into their consumer, it still leaves the dangling array load, leading to funny things like: mov.u32u32 r5.y, r0.y; mov.u32u32 r5.y, r0.z Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: handle IMMED for mad 2nd src special case	Rob Clark	2018-02-10	1	-2/+4
\| \| \| \| \| \| \|	Consider also immediates for swapping the first two srcs, because they can be lowered to constant. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: remove ir3 phi instruction	Rob Clark	2018-02-10	8	-205/+16
\| \| \| \| \| \|	Now that we convert phi webs to ssa, we can drop all this. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: remove lower_if_else pass	Rob Clark	2018-02-10	4	-328/+0
\| \| \| \| \| \|	Now that it is unused. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: add experimental GCM pass	Rob Clark	2018-02-10	1	-0/+7
\| \| \| \| \| \| \| \| \| \|	Generally seems to do worse on instruction count and register usage, according to shader-db. But shader-db also doesn't do a very good job of weighting loop bodies, so that might not be totally valid. So add an env variable to enable GCM pass for easier experimentation. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: change opt passes	Rob Clark	2018-02-10	1	-0/+14
\| \| \| \| \| \| \|	There are more useful nir passes added since initial conversion to nir. But ir3 was never updated to use them. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: use peephole select pass	Rob Clark	2018-02-10	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Aggressively lowering all if/else to selects in some extreme cases results in much higher register pressure. Using peephole select instead with a modest threshold speeds up alu2 4x! 16 seems like a good limit, low enough to help alu2 but not so low that it penalizes everything else. With a bit better scheduling of the instruction that moves a value into a predicate register, we might be able to lower this limit a bit more in the future, but since we need 6 cycles from the move to predicate register to predicated branch, that puts some sort of lower bound on how far we can lower this threshold. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: lower phi webs to regs	Rob Clark	2018-02-10	1	-2/+1
\| \| \| \| \| \| \| \| \| \|	nir's from_ssa pass is much better at avoiding inserting extra moves than our logic is. And lowering phi webs to regs just treats anything involved in a phi web as an array of length=1. Which with previous array related fixes in RA/etc ends up working out quite well. This cuts down on extra instructions and also helps with register pressure. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: separate arrays from groups	Rob Clark	2018-02-10	1	-0/+8
\| \| \| \|	Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: make block/instruction serialno per-shader	Rob Clark	2018-02-10	2	-4/+6
\| \| \| \| \| \| \|	Makes it easier to compare values seen in-game (where there are many shaders) to cmdline standalone compiler. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: add spirv support to cmdline compiler	Rob Clark	2018-02-10	1	-3/+60
\| \| \| \|	Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: don't lower fsat	Rob Clark	2018-02-10	3	-1/+23
\| \| \| \| \| \| \| \|	Instead, if possible fold (sat) flag into src, otherwise use: (sat)max.f rD, rS, rS Signed-off-by: Rob Clark <[email protected]>
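Why "(sat)max.f rD, rS, rS" works as a fallback can be shown with a small scalar model. This is only a sketch: `clamp01()` and `sat_max_f()` are hypothetical helpers that model the (sat) flag as a clamp to [0, 1] applied to the instruction result, and NaN handling is not modeled.

```c
/* Hypothetical model of the (sat) flag: clamp the result to [0, 1]. */
float clamp01(float x)
{
    return x < 0.0f ? 0.0f : (x > 1.0f ? 1.0f : x);
}

/* Model of "(sat)max.f rD, a, b": take the maximum, then saturate. */
float sat_max_f(float a, float b)
{
    float m = a > b ? a : b;
    return clamp01(m);
}
```

Because max(x, x) is just x, calling `sat_max_f(x, x)` gives the same value as `clamp01(x)`, so a single instruction can stand in for the lowered fsat when the flag cannot be folded into the instruction that produced the source.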
*	freedreno/ir3: add encoding/decoding for (sat) bit	Rob Clark	2018-02-10	4	-12/+42
\| \| \| \| \| \| \| \|	Seems to be there since a3xx, but we always lowered fsat. But we can shave some instructions, especially in shaders that use lots of clamp(foo, 0.0, 1.0) by not lowering fsat. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: extend liverange of arrays	Rob Clark	2018-02-10	1	-0/+11
\| \| \| \| \| \| \|	Use livein state of other blocks to extend liverange of arrays when they are still needed by successor blocks. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: avoid extra mov's for "arrays"	Rob Clark	2018-02-10	1	-3/+23
\| \| \| \|	Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: a couple more array fixes	Rob Clark	2018-02-10	1	-2/+15
\| \| \| \| \| \|	(Plus a couple TODOs) Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: keep array stores	Rob Clark	2018-02-10	1	-0/+6
\| \| \| \| \| \| \|	Since these are not in SSA form, add to block's keeps so it doesn't appear unused. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: propagate barrier information	Rob Clark	2018-02-10	1	-0/+5
\| \| \| \| \| \| \| \|	When eliminating movs, the instruction that is now directly using the src of the mov has the same scheduling order constraints as the original mov instruction. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: remove pointless statement	Rob Clark	2018-02-10	1	-3/+0
\| \| \| \| \| \|	Function ends after this if/else ladder, so it was pointless. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: some more debug prints	Rob Clark	2018-02-10	2	-0/+36
\| \| \| \|	Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: fix printing of relative branch offsets	Rob Clark	2018-02-10	2	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \|	The number of bits depends on generation. Printing with the a5xx encoding (largest size) while compiling for a3xx or a4xx would result in negative values being printed as large positive values. I guess in practice huge negative branch offsets aren't likely (and if that is the case, the shader is probably too big to grok by reading the assembly). So just print using the smallest bitfield size. Signed-off-by: Rob Clark <[email protected]>
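The effect described above can be reproduced with a generic sign-extension helper. This is a sketch under assumptions: `sign_extend()` is a hypothetical helper, and the 16-bit and 20-bit widths in the note below are picked only for illustration, since the log does not state the real field sizes.

```c
#include <stdint.h>

/*
 * Interpret the low `bits` bits of `field` as a two's-complement value.
 * `bits` must be in the range 1..32.
 */
int32_t sign_extend(uint32_t field, unsigned bits)
{
    uint32_t mask = (bits < 32) ? ((1u << bits) - 1u) : 0xffffffffu;
    uint32_t sign = 1u << (bits - 1);

    field &= mask;
    /* Flip the sign bit, then subtract its weight to extend it. */
    return (int32_t)((field ^ sign) - sign);
}
```

With these illustrative widths, an offset of -2 stored in a 16-bit field is the bit pattern 0xfffe. Reading it back as 16 bits gives -2, but reading the same bits as a 20-bit field gives 65534, which is the "large positive value" symptom.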
*	freedreno/ir3: be more clever with if/else jumps	Rob Clark	2018-02-10	1	-1/+16
\| \| \| \| \| \| \| \| \| \| \|	Try to clean up things like: br !p0.x #2; br p0.x #something to eliminate the first branch. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: avoid some spurious sync bits	Rob Clark	2018-02-10	1	-1/+3
\| \| \| \|	Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: print # of sync bits for shaderdb	Rob Clark	2018-02-10	3	-2/+18
\| \| \| \| \| \|	When trying to optimize to reduce stalls, it is nice to see this info. Signed-off-by: Rob Clark <[email protected]>
*	freedreno: add debug trace for flush	Rob Clark	2018-02-10	1	-0/+2
\| \| \| \|	Signed-off-by: Rob Clark <[email protected]>
*	intel/compiler: fix 64bit value prints on 32bit	Grazvydas Ignotas	2018-02-10	2	-3/+3
\| \| \| \| \| \| \| \|	Fix the following: warning: format ‘%lx’ expects argument of type ‘long unsigned int’, but argument 3 has type ‘uint64_t {aka long long unsigned int}’. Reviewed-by: Lionel Landwerlin <[email protected]>
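One portable way to avoid this class of warning is the `PRIx64` macro from `<inttypes.h>`, which expands to the correct conversion for `uint64_t` on both 32-bit and 64-bit targets. The log does not show the actual change, so the sketch below is only an illustration, and `format_u64_hex()` is a hypothetical helper.

```c
#include <inttypes.h>
#include <stdio.h>

/*
 * Format a 64-bit value as hex without assuming that `long` is 64 bits
 * wide. Returns the number of characters written, as snprintf() does.
 */
int format_u64_hex(char *buf, size_t len, uint64_t value)
{
    /* PRIx64 is a string literal, so it concatenates with "0x%". */
    return snprintf(buf, len, "0x%" PRIx64, value);
}
```

Casting the argument to `unsigned long long` and printing with `%llx` is another common option; either way the format and the argument type agree regardless of the width of `long`.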
*	st/glsl_to_nir: remove unused options variable	Timothy Arceri	2018-02-10	1	-2/+0
\|
*	st/radeonsi: enable disk cache for nir	Timothy Arceri	2018-02-10	2	-8/+11
\| \| \| \|	Reviewed-by: Marek Olšák <[email protected]>