mesa.git

	Commit message	Author	Age	Files	Lines
*	nv50/ir: pre-compute BFE arg when both bits and offset are imm	Ilia Mirkin	2015-08-20	1	-3/+9
	Due to a quirk in how the nv50 opt passes run, the algebraic optimization that looks for these BFE's happens before the constant folding pass. Rearranging these passes isn't a great idea, but this is easy enough to fix. Allows a following cvt to eliminate the bfe in certain situations. Signed-off-by: Ilia Mirkin <[email protected]>
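A minimal sketch of what "pre-computing the BFE arg" could look like when both the offset and the bit count are immediates. The packed layout used here (`bits << 8 | offset`) and the helper names `pack_bfe_arg` and `bfe_u32` are assumptions made for illustration; they are not taken from the commit itself.

```python
def pack_bfe_arg(offset, bits):
    """Fold two immediates into one operand.

    Assumed layout for illustration: bit count in bits 8..15, offset in bits 0..7.
    """
    return ((bits & 0xFF) << 8) | (offset & 0xFF)


def bfe_u32(value, packed):
    """Unsigned bitfield extract on a 32-bit value, driven by the packed operand."""
    offset = packed & 0xFF
    bits = (packed >> 8) & 0xFF
    return ((value & 0xFFFFFFFF) >> offset) & ((1 << bits) - 1)


# Both operands are known at compile time, so the packed value can be an immediate.
packed = pack_bfe_arg(offset=8, bits=8)
print(hex(packed), hex(bfe_u32(0x12345678, packed)))
```

Once the operand is a single immediate, a later pass can see the exact field being extracted, which is presumably what lets a following cvt absorb the extract.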
*	nv50/ir: Handle OP_CVT when folding constant expressions	Tobias Klausmann	2015-08-20	1	-0/+78
	[imirkin: handle more type combinations, use macro] Signed-off-by: Ilia Mirkin <[email protected]>
*	nvc0/ir: undo more shifts still by allowing a pre-SHL to occur	Ilia Mirkin	2015-08-20	1	-15/+33
	This happens with unpackSnorm lowering. There's yet another bitfield-extract behind it, but there's too much variation to be worth cutting through. Signed-off-by: Ilia Mirkin <[email protected]>
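To illustrate the kind of shift pair involved, the sketch below models a left shift followed by an arithmetic right shift on a 32-bit value and checks that it equals a sign-extending field extract. The helper names are hypothetical, and the exact shape of the unpackSnorm lowering is assumed rather than quoted from the source.

```python
def to_s32(v):
    """Interpret the low 32 bits of v as a signed 32-bit integer."""
    v &= 0xFFFFFFFF
    return v - (1 << 32) if v & 0x80000000 else v


def shl_then_ashr(value, shl, shr):
    """(value << shl) >> shr with 32-bit wraparound and an arithmetic right shift."""
    return to_s32(value << shl) >> shr


def extract_signed(value, offset, width):
    """Sign-extending extract of `width` bits starting at bit `offset`."""
    field = ((value & 0xFFFFFFFF) >> offset) & ((1 << width) - 1)
    return field - (1 << width) if field & (1 << (width - 1)) else field


# (x << 16) >> 24 picks out bits 8..15 and sign-extends them:
# offset = shr - shl = 8, width = 32 - shr = 8.
x = 0x0000F000
assert shl_then_ashr(x, 16, 24) == extract_signed(x, 8, 8)
```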
*	nvc0/ir: don't require AND when the high byte is being addressed	Ilia Mirkin	2015-08-20	1	-0/+12
	unpackUnorm* lowering doesn't AND the high byte/word as it's unnecessary. Detect that situation as well. Signed-off-by: Ilia Mirkin <[email protected]>
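A small sketch of why the mask is redundant for the topmost byte or word: a logical right shift of a 32-bit value already clears everything above the field. The function names are made up for this example.

```python
MASK32 = 0xFFFFFFFF


def high_byte(x):
    """Top byte via a plain logical shift, no AND."""
    return (x & MASK32) >> 24


def high_byte_masked(x):
    """Top byte with the (redundant) mask applied."""
    return ((x & MASK32) >> 24) & 0xFF


def high_word(x):
    """Top 16-bit word via a plain logical shift, no AND."""
    return (x & MASK32) >> 16


def high_word_masked(x):
    """Top 16-bit word with the (redundant) mask applied."""
    return ((x & MASK32) >> 16) & 0xFFFF


x = 0xAABBCCDD
assert high_byte(x) == high_byte_masked(x)
assert high_word(x) == high_word_masked(x)
# Lower fields still need the mask: (x >> 8) keeps the bytes above it.
assert (x >> 8) != ((x >> 8) & 0xFF)
```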
*	nvc0/ir: detect i2f/i2i which operate on specific bytes/words	Ilia Mirkin	2015-08-20	4	-4/+82
	Some Unigine shaders have been observed to unpack bytes out of 32-bit integers and convert them to floats. I2F/I2I can handle this sort of thing directly. Detect the handleable situations. This misses 16-bit word capabilities in nv50, but I haven't seen shaders that would actually make use of that. Signed-off-by: Ilia Mirkin <[email protected]>
*	nvc0/ir: detect AND/SHR pairs and convert into EXTBF	Ilia Mirkin	2015-08-20	1	-20/+46
	Some shaders appear to extract bits using shift/and combos. Detect (some) of those and convert to EXTBF instead. Signed-off-by: Ilia Mirkin <[email protected]>
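As a rough illustration of the pattern this commit targets (not the actual pass), the shift/and idiom below computes the same value as a bitfield extract. `extbf_u32` is a hypothetical helper that only models an unsigned extract of `width` bits starting at `offset`.

```python
MASK32 = 0xFFFFFFFF


def extbf_u32(value, offset, width):
    """Model of an unsigned bitfield extract on a 32-bit value (hypothetical helper)."""
    return ((value & MASK32) >> offset) & ((1 << width) - 1)


def shift_then_and(value, shift, mask):
    """The shader-style idiom: (value >> shift) & mask."""
    return ((value & MASK32) >> shift) & mask


x = 0x12345678
# (x >> 8) & 0xff extracts 8 bits starting at bit 8.
assert shift_then_and(x, 8, 0xFF) == extbf_u32(x, 8, 8)
```

The equivalence only holds when the mask is a contiguous run of low bits, which is presumably part of why the commit says it detects "(some)" of these combos.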
*	nv50/ir: support different unordered_set implementations	Chih-Wei Huang	2015-08-20	5	-12/+57
	If build with C++11 standard, use std::unordered_set. Otherwise if build on old Android version with stlport, use std::tr1::unordered_set with a wrapper class. Otherwise use std::tr1::unordered_set. Signed-off-by: Chih-Wei Huang <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
*	gk110/ir: fix sched calculator to consider all registers in the ISA	Ilia Mirkin	2015-08-17	1	-7/+10
	GK110/GK208 have 256 registers, not 64. Find out the number of registers from the target to avoid unnecessary iteration for pre-GK110. Signed-off-by: Ilia Mirkin <[email protected]>
*	gm107/ir: avoid letting the lowering pass get out of sync	Ilia Mirkin	2015-08-17	2	-88/+5
	There's a lot of functionality duplicated in the gm107 lowering pass from the nvc0 pass. As that one gets updated, the gm107 one falls behind. Avoid this by sharing the code. Signed-off-by: Ilia Mirkin <[email protected]>
*	gm107/ir: indirect handle goes first on maxwell also	Ilia Mirkin	2015-08-14	1	-8/+4
	Fixes fs-simple-texture-size.shader_test Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.6" <[email protected]>
*	nvc0/ir: cache vertex out base so that we don't recompute again	Ilia Mirkin	2015-07-29	1	-8/+15
	The global CSE pass stinks and is unable to pull this out. Easy enough to handle it here and avoid generating unnecessary special register loads (which can allegedly be quite slow). Signed-off-by: Ilia Mirkin <[email protected]>
*	nvc0/ir: output base for reading is based on laneid	Ilia Mirkin	2015-07-29	1	-0/+25
	PFETCH retrieves the address for incoming vertices, not output vertices in TCS. For output vertices, we must use the laneid as a base. Fixes barrier piglit test, which was failing for entirely non-barrier reasons, but rather that it was (a) trying to draw multiple patches and (b) the incoming patch size was not the same as the outgoing patch size. Signed-off-by: Ilia Mirkin <[email protected]>
*	nvc0/ir: trim out barrier sync for non-compute shaders	Ilia Mirkin	2015-07-28	1	-0/+6
	It seems like they're never necessary, and actively cause harm. This fixes some of the barrier-related piglits. Signed-off-by: Ilia Mirkin <[email protected]>
*	nvc0/ir: fix barrier emission	Ilia Mirkin	2015-07-28	1	-0/+2
	immediate arguments require a flag to be set for each one Signed-off-by: Ilia Mirkin <[email protected]>
*	nvc0/ir: per-patch vars are in a separate address space	Ilia Mirkin	2015-07-24	1	-0/+2
	There's no need to attempt to avoid overlapping generic i/o with patch i/o. By the same token, we can't merge patch and non-patch loads/stores. This fixes at least the tes-both-input-array-*-index-rd tessellation variable-indexing tests. Signed-off-by: Ilia Mirkin <[email protected]>
*	nvc0/ir: kepler can't do indirect shader input/output loads directly	Ilia Mirkin	2015-07-23	8	-6/+75
	There's a special AL2P instruction (called AFETCH in nv50 ir) which computes a "physical" value to be used with indirect addressing with ALD. Fixes tcs-input-array-*-index-rd tcs-output-array-*-index-wr varying-indexing tessellation tests on Kepler. Signed-off-by: Ilia Mirkin <[email protected]>
*	nvc0/ir: tess factors are now sysvals, adapt codegen to expect that	Ilia Mirkin	2015-07-23	6	-11/+24
	Signed-off-by: Ilia Mirkin <[email protected]>
*	gk110/ir: fake BAR support	Ilia Mirkin	2015-07-23	1	-0/+12
	Makes things sorta work until we figure out the real way to do this. Signed-off-by: Ilia Mirkin <[email protected]>
*	nvc0/ir: cleanup private enums that have graduated to gallium	Ilia Mirkin	2015-07-23	1	-5/+0
	Signed-off-by: Ilia Mirkin <[email protected]>
*	nvc0/ir: allow tess eval output loads to be CSE'd	Ilia Mirkin	2015-07-23	1	-0/+2
	These only happen for gl_TessCoord which are constant. Signed-off-by: Ilia Mirkin <[email protected]>
*	nvc0/ir: add hazard for 2nd dim of vfetch/load indirect argument	Ilia Mirkin	2015-07-23	1	-0/+2
	Apparently a multi-word load can potentially overwrite the indirect sources, so make sure that RA picks different registers for those. Signed-off-by: Ilia Mirkin <[email protected]>
*	nvc0/ir: patch vertex count is stored in the upper bits	Ilia Mirkin	2015-07-23	1	-0/+4
*	nvc0/ir: add support for reading outputs in tess control shaders	Ilia Mirkin	2015-07-23	2	-2/+18
	Signed-off-by: Ilia Mirkin <[email protected]>
*	nvc0/ir: set perPatch flag on load/stores to per-patch varyings	Ilia Mirkin	2015-07-23	1	-2/+6
	Signed-off-by: Ilia Mirkin <[email protected]>
*	nvc0/ir: populate info structure based on new tess properties	Ilia Mirkin	2015-07-23	1	-0/+18
	Signed-off-by: Ilia Mirkin <[email protected]>
*	nvc0/ir: mark varyings as per-patch based on semantic name	Ilia Mirkin	2015-07-23	1	-0/+14
	Also add proper handling for PATCH semantics Signed-off-by: Ilia Mirkin <[email protected]>
*	nvc0: TESSCOORD comes in as a sysval, not an input	Ilia Mirkin	2015-07-23	1	-2/+0
	Signed-off-by: Ilia Mirkin <[email protected]>
*	nvc0: preliminary tess support	Ilia Mirkin	2015-07-23	3	-7/+4
	Uncomment the various functionality that was already there and add in obvious missing bits that parallel vp/gp/fp functionality. Signed-off-by: Ilia Mirkin <[email protected]>
*	nouveau: use bool instead of boolean	Samuel Pitoiset	2015-07-21	3	-12/+12
	Signed-off-by: Samuel Pitoiset <[email protected]> Acked-by: Ilia Mirkin <[email protected]>
*	gm107/ir: fix indirect txq emission	Ilia Mirkin	2015-07-18	1	-2/+8
	Signed-off-by: Ilia Mirkin <[email protected]> Cc: [email protected]
*	nvc0/ir: don't worry about sampler in txq handling	Ilia Mirkin	2015-07-18	1	-22/+8
	There's no need to deal with samplers for texture size queries. That code also was accidentally setting an invalid sIndirectSrc position, but it can now just be removed. Signed-off-by: Ilia Mirkin <[email protected]> Cc: [email protected]
*	nvc0/ir: fix txq on indirect samplers	Ilia Mirkin	2015-07-18	2	-2/+56
	Signed-off-by: Ilia Mirkin <[email protected]> Cc: [email protected]
*	nv50/ir: UCMP arguments are float, so make sure modifiers are applied	Ilia Mirkin	2015-07-03	1	-1/+2
	The first argument to UCMP needs to be compared against 0, but the latter arguments are treated as float and need to be able to properly apply neg/abs arguments. Adjust the inferSrcType function accordingly. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.5 10.6" <[email protected]>
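The reason the inferred source type matters for modifiers can be sketched on raw 32-bit patterns: negating a value as a float flips the sign bit, while negating it as an integer takes the two's complement, and the two results differ. The helper names below are illustrative only and do not correspond to functions in the codebase.

```python
def neg_as_f32(bits):
    """Float negation on a raw 32-bit pattern: flip the sign bit."""
    return (bits ^ 0x80000000) & 0xFFFFFFFF


def neg_as_s32(bits):
    """Integer negation on a raw 32-bit pattern: two's complement."""
    return (-bits) & 0xFFFFFFFF


def abs_as_f32(bits):
    """Float absolute value on a raw 32-bit pattern: clear the sign bit."""
    return bits & 0x7FFFFFFF


one = 0x3F800000  # bit pattern of 1.0f
assert neg_as_f32(one) == 0xBF800000  # bit pattern of -1.0f
assert neg_as_s32(one) != neg_as_f32(one)
```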
*	nv50/ir: don't emit src2 in immediate form	Ilia Mirkin	2015-07-02	1	-2/+2
	In the immediate form, src2 == dst, so it does not need to be emitted. Otherwise it overlaps with the immediate value's low bits. Fixes: 09ee907266 (nv50/ir: Fold IMM into MAD) Cc: "10.6" <[email protected]> Signed-off-by: Ilia Mirkin <[email protected]>
*	nv50/ir: copy joinAt when splitting both before and after	Ilia Mirkin	2015-07-01	3	-0/+5
	The current implementation only moves the joinAt when splitting after the given instruction, not before it. So if you have a BB with
	  foo
	  instr
	  bar
	  joinat
	and thus with joinAt set, we end up first splitting before instr, at which point the instr's bb is updated to the new bb. Since that bb doesn't have a joinAt set (despite containing one), when splitting after the instr, there is nothing to copy over. Since the joinat will be in the "split" bb irrespective of whether we're splitting before or after the instruction, move it over in either case.
	Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91124 Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.5 10.6" <[email protected]>
*	nv50/ir: fix emission of address reg in 3rd source	Ilia Mirkin	2015-06-30	1	-2/+6
	Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91056 Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.5 10.6" <[email protected]>
*	nv50/ir: propagate modifier to right arg when const-folding mad	Ilia Mirkin	2015-06-26	1	-1/+4
	An immediate has to be the second arg of an ADD operation. However we were mistakenly propagating the modifier of the non-folded value to the folded immediate argument. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91117 Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.5 10.6" <[email protected]>
*	nvc0/ir: can't have a join on a load with an indirect source	Ilia Mirkin	2015-06-17	1	-1/+1
	Triggers an INVALID_OPCODE warning on GK208. Seems rare enough to not warrant verification on other chips. Fixes the new piglits:
	  ubo_array_indexing/fs-nonuniform-control-flow.shader_test
	  ubo_array_indexing/vs-nonuniform-control-flow.shader_test
	Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.5 10.6" <[email protected]>
*	nvc0/ir: fix collection of first uses for texture barrier insertion	Ilia Mirkin	2015-06-15	1	-5/+11
	One of the places we have to insert texbars is in situations where the result of the tex gets overwritten by a different instruction (e.g. in a conditional statement). However in some situations it can actually appear as though the original tex itself is an overwriting instruction. This can naturally never really happen, so just ignore the tex instruction when it comes up. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90347 Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.5 10.6" <[email protected]>
*	nv50/ir: OP_JOIN is a flow instruction	Jürgen Rühle	2015-06-15	1	-1/+1
	OP_JOIN instructions are assumed to be flow instructions and mercilessly casted to FlowInstruction. This patch fixes an instance where an OP_JOIN is created as a plain instruction. This can cause crashes in the ir printer. [imirkin: add ->fixed = 1] Reviewed-by: Ilia Mirkin <[email protected]>
*	nv50/ir: avoid messing up arg1 of PFETCH	Ilia Mirkin	2015-05-23	1	-2/+18
	There can be scenarios where the "indirect" arg of a PFETCH becomes known, and so the code will attempt to propagate it. Use this opportunity to just fold it into the first argument, and prevent the load propagation pass from touching PFETCH further. This fixes gs-input-array-vec4-index-rd.shader_test and vs-output-array-vec4-index-wr-before-gs.shader_test on nvc0 at least. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tobias Klausmann <[email protected]> Cc: "10.5 10.6" <[email protected]>
*	nvc0/ir: LOAD's can't be used for shader inputs	Ilia Mirkin	2015-05-22	2	-0/+2
	We forgot to convert to VFETCH in case of indirect access. Fix that. This avoids crashes on the new gs-input-array-vec4-index-rd and vs-output-array-vec4-index-wr-before-gs but they still fail. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.5 10.6" <[email protected]>
*	nv50/ir: guess that the constant offset is the starting slot of array	Ilia Mirkin	2015-05-22	1	-2/+4
	When we get something like IN[ADDR[0].x+5], we will now guess that we should look at IN[5] for the "base" information. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.5 10.6" <[email protected]>
*	nvc0/ir: set ftz when sources are floats, not just destinations	Ilia Mirkin	2015-05-22	1	-3/+2
	In the case of a compare, the destination might be a predicate, but we still want to flush denorms. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.5 10.6" <[email protected]>
*	nv50/ir: allow OP_SET to merge with OP_SET_AND/etc as well as a neg	Ilia Mirkin	2015-05-22	1	-26/+55
	This covers the pattern where a KILL_IF is used, which triggers a comparison of -x to 0. This can usually be folded into the comparison whose result is being compared to 0, however it may, itself, have already been combined with another comparison. That shouldn't impact the logic of this pass however. With this and the & 1.0 change, code like
	  00000020: 001c0001 80081df4 set b32 $r0 lt f32 $r0 0x3e800000
	  00000028: 001c0000 201fc000 and b32 $r0 $r0 0x3f800000
	  00000030: 7f9c001e dd885c00 set $p0 0x1 lt f32 neg $r0 0x0
	  00000038: 0000003c 19800000 $p0 discard
	becomes
	  00000020: 001c001d b5881df4 set $p0 0x1 lt f32 $r0 0x3e800000
	  00000028: 0000003c 19800000 $p0 discard
	Signed-off-by: Ilia Mirkin <[email protected]>
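The equivalence of the two instruction sequences in this commit message can be checked with a small model. This is only a sketch of the arithmetic, assuming `set b32` writes all ones for true and zero for false; the function names `kill_before` and `kill_after` are invented for the example.

```python
import struct


def bits_to_f32(bits):
    """Reinterpret a 32-bit pattern as an IEEE-754 single-precision float."""
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFFFFFF))[0]


def kill_before(x):
    """Model of the four-instruction sequence before the optimization."""
    r0 = 0xFFFFFFFF if x < 0.25 else 0  # set b32 $r0 lt f32 $r0 0x3e800000 (0.25)
    r0 &= 0x3F800000                    # and b32 $r0 $r0 0x3f800000 -> 1.0 or 0.0
    return -bits_to_f32(r0) < 0.0       # set $p0 lt f32 neg $r0 0x0


def kill_after(x):
    """Model of the folded sequence: the predicate comes from the first compare."""
    return x < 0.25


for sample in (-1.0, 0.0, 0.2, 0.25, 0.3, 1.0, float("nan")):
    assert kill_before(sample) == kill_after(sample)
```

In the false case the intermediate value is 0.0, its negation is -0.0, and -0.0 is not less than 0, so the predicate stays false in both versions.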
*	nvc0/ir: optimize set & 1.0 to produce boolean-float sets	Ilia Mirkin	2015-05-22	2	-0/+29
	This has started to happen more now that the backend is producing KILL_IF more often. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tobias Klausmann <[email protected]>
*	nvc0/ir: allow iset to produce a boolean float	Ilia Mirkin	2015-05-22	3	-5/+16
	Signed-off-by: Ilia Mirkin <[email protected]>
*	nvc0/ir: avoid jumping to a sched instruction	Ilia Mirkin	2015-05-22	3	-2/+9
	Signed-off-by: Ilia Mirkin <[email protected]>
*	gallium: remove TGSI_SAT_MINUS_PLUS_ONE	Marek Olšák	2015-05-20	1	-12/+1
	It's a remnant of some old NV extension. Unused. I also have a patch that removes predicates if anyone is interested. Reviewed-by: Roland Scheidegger <[email protected]>
*	gk110/ir: switch to gk104-style sched codes rather than all-in-one	Ilia Mirkin	2015-05-18	1	-9/+9
	Matches change to envydis/envyas tools. Signed-off-by: Ilia Mirkin <[email protected]>