mesa.git - Unnamed repository; edit this file 'description' to name the repository.

	Commit message	Author	Age	Files	Lines
*	nv50/ir: fix emission of s[] args in certain situations	Ilia Mirkin	2015-11-07	1	-2/+2
\| \| \| \| \| \| \| \| \| \|	There might only be a single arg (e.g. cvt), so use mode rather than looking at the source directly. Also we don't want to rely on the type of the value, which can be unreliable, but instead use the instruction's. This works out well since mkSplit doesn't adjust the type. Signed-off-by: Ilia Mirkin <[email protected]>
*	nv50/ir: only take abs value when computing high result	Ilia Mirkin	2015-11-07	1	-1/+1
\| \| \| \| \| \| \| \|	Not reachable from TGSI since it only has UMUL, no IMUL. However it's surprising that setting argument types to s32 will cause sign to get lost. Signed-off-by: Ilia Mirkin <[email protected]>
*	nv50/ir: allow emission of immediates in imul/imad ops	Ilia Mirkin	2015-11-07	1	-2/+8
\| \| \| \| \| \| \|	Nothing actually uses this yet (due to complications), but the emission logic is right. Signed-off-by: Ilia Mirkin <[email protected]>
*	nv50/ir: properly set the type of the constant folding result	Ilia Mirkin	2015-11-06	1	-4/+4
\| \| \| \| \| \| \|	This removes the hack used for merge, which only covers a fraction of the cases. Signed-off-by: Ilia Mirkin <[email protected]>
*	nv50/ir: add support for const-folding OP_CVT with F64 source/dest	Ilia Mirkin	2015-11-06	3	-0/+45
\| \| \| \|	Signed-off-by: Ilia Mirkin <[email protected]>
*	nv50/ir: add fp64 opcode emission support for G200 (NVA0)	Ilia Mirkin	2015-11-06	1	-10/+84
\| \| \| \| \| \|	Need to emulate rcp/rsq before providing full fp64 support Signed-off-by: Ilia Mirkin <[email protected]>
*	nv50/ir: Add support for 64bit immediates to checkSwapSrc01	Hans de Goede	2015-11-06	1	-5/+6
\| \| \| \| \| \| \| \|	Now that we support 64 bit immediates in insnCanLoad, we need to swap 64 bit immediate sources too for optimal effect. Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
*	nvc0/ir: Teach insnCanLoad about double immediates	Hans de Goede	2015-11-06	1	-6/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Teach insnCanLoad about double immediates. Together with the "Add support for merge-s to the ConstantFolding pass" commit, this turns the following (nvc0) code: 1: mov u32 $r2 0x00000000 (8) 2: mov u32 $r3 0x3fe00000 (8) 3: add f64 $r0d $r0d $r2d (8) into: 1: add f64 $r0d $r0d 0.500000 (8) Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
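The merge of the two 32-bit halves in the listing above can be checked by hand. This is a minimal sketch in Python, not the compiler's own C++: the helper names are hypothetical, and the short-immediate test assumes that the 20 encodable bits are the topmost bits of the IEEE-754 double (the related commit messages only say "up to 20 bits of precision").

```python
import struct

def merge_halves(lo: int, hi: int) -> float:
    """Reinterpret two 32-bit words (low, high) as one IEEE-754 double."""
    return struct.unpack("<d", struct.pack("<II", lo, hi))[0]

def fits_short_immediate(value: float, kept_bits: int = 20) -> bool:
    """Assumption: only the top `kept_bits` bits of the double can be encoded."""
    bits = struct.unpack("<Q", struct.pack("<d", value))[0]
    dropped_mask = (1 << (64 - kept_bits)) - 1
    return (bits & dropped_mask) == 0

half = merge_halves(0x00000000, 0x3FE00000)
print(half)                        # 0.5
print(fits_short_immediate(half))  # True
print(fits_short_immediate(0.1))   # False
```

Under that assumption, 0.5 can be folded into the add directly, while a value such as 0.1 would still need a full 64-bit load.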
*	nv50/ir: Add support for merge-s to the ConstantFolding pass	Hans de Goede	2015-11-06	1	-0/+15
\| \| \| \| \| \| \| \| \| \| \|	This allows later passes like LoadPropagation to properly deal with 64 bit immediates. If the new 64 bit load this introduces does not get optimized away then split64BitOpPostRA() will split this into 2 instructions again. Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
*	nv50/ir: disallow 64-bit immediates on nv50 targets	Ilia Mirkin	2015-11-06	1	-1/+1
\| \| \| \| \| \|	No instructions are able to load short immediates like nvc0 can. Signed-off-by: Ilia Mirkin <[email protected]>
*	nv50/ir: allow movs with TYPE_F64 destinations to be split	Ilia Mirkin	2015-11-06	1	-0/+6
\| \| \| \|	Signed-off-by: Ilia Mirkin <[email protected]>
*	gm107/ir: Add support for double immediates	Hans de Goede	2015-11-06	1	-1/+4
\| \| \| \| \| \| \| \|	Add support for encoding double immediates (up to 20 bits of precision) into the generated gm107 machine-code. Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
*	nvc0/ir: Add support for double immediates	Hans de Goede	2015-11-06	1	-0/+8
\| \| \| \| \| \| \| \|	Add support for encoding double immediates (up to 20 bits of precision) into the generated nvc0 machine-code. Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
*	nv50,nvc0: provide debug messages with shader compilation stats	Ilia Mirkin	2015-11-05	2	-0/+3
\| \| \| \|	Signed-off-by: Ilia Mirkin <[email protected]>
*	nouveau: get rid of tabs	Ilia Mirkin	2015-10-31	3	-4/+4
\| \| \| \|	Signed-off-by: Ilia Mirkin <[email protected]>
*	nv50: allow per-sample interpolation to be forced via rast	Ilia Mirkin	2015-10-29	3	-3/+29
\| \| \| \| \| \| \| \|	Uses the same technique as for nvc0 of fixups before upload, and evicting in case of state change. Removes one source of variants kept by st/mesa. Signed-off-by: Ilia Mirkin <[email protected]>
*	nv50/ir: adapt to new method for passing in cull/clip distance masks	Ilia Mirkin	2015-10-29	3	-8/+10
\| \| \| \|	Signed-off-by: Ilia Mirkin <[email protected]>
*	nvc0: do upload-time fixups for interpolation parameters	Ilia Mirkin	2015-10-29	7	-9/+157
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Unfortunately flatshading is an all-or-nothing proposition on nvc0, while GL 3.0 calls for the ability to selectively specify explicit interpolation parameters on gl_Color/gl_SecondaryColor which would override the flatshading setting. This allows us to fix up the interpolation settings after shader generation based on rasterizer settings. While we're at it, we can add support for dynamically forcing all (non-flat) shader inputs to be interpolated per-sample, which allows st/mesa to not generate variants for these. Fixes the remaining failing glsl-1.30/execution/interpolation piglits. Signed-off-by: Ilia Mirkin <[email protected]>
*	nv50/ir: use C++11 standard std::unordered_map if possible	Chih-Wei Huang	2015-10-15	1	-3/+17
\| \| \| \| \| \| \| \|	Note that Android versions before Lollipop are not supported. Signed-off-by: Chih-Wei Huang <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Cc: [email protected]
*	nvc0/ir: start offset at texBindBase for txq, like regular texturing	Ilia Mirkin	2015-09-14	1	-1/+4
\| \| \| \| \| \| \| \| \|	Curiously this has no actual effect. I think it's because the first 8 textures are bound in multiple slots for some reason. However, it seems prudent to use these the same way as regular texturing, especially in the case where there are more than 8 textures bound. Signed-off-by: Ilia Mirkin <[email protected]>
*	nv50/ir: add support for TXQS tgsi opcode	Ilia Mirkin	2015-09-13	3	-7/+39
\| \| \| \|	Signed-off-by: Ilia Mirkin <[email protected]>
*	nv50/ir: don't fold immediate into mad if registers are too high	Ilia Mirkin	2015-09-10	1	-0/+4
\| \| \| \| \| \|	Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91551 Signed-off-by: Ilia Mirkin <[email protected]> Cc: "11.0" <[email protected]>
*	nv50/ir: fix emission of 8-byte wide interp instruction	Ilia Mirkin	2015-09-10	1	-5/+6
\| \| \| \| \| \| \| \| \|	This can come up if the target register number is > 63, which is fairly rare. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91551 Signed-off-by: Ilia Mirkin <[email protected]> Cc: "11.0" <[email protected]>
*	nv50/ir: r63 is only 0 if we are using less than 63 registers	Ilia Mirkin	2015-09-10	1	-1/+4
\| \| \| \| \| \| \| \| \|	It is advantageous to use r63 instead of r127 since r63 can fit into the shorter encoding. However if we've RA'd over 63 registers, we must use r127 as the replacement instead. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "11.0" <[email protected]>
*	nv50/ir: make edge splitting fix up phi node sources	Ilia Mirkin	2015-09-10	1	-13/+77
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Unfortunately nv50_ir phi nodes aren't directly connected to the CFG, so the mapping between source and the actual BB is by inbound edge order. So when manipulating edges one has to be extremely careful. We were insufficiently careful when splitting critical edges which resulted in the phi nodes being confused as to where their sources were coming from. This primarily manifests itself with the TXL-lowering logic on nv50, when it is inside of a conditional. I've been unable to trigger the issue anywhere else so far. This resolves rendering failures in a number of games like Two Worlds 2, Trine: Enchanted Edition, Trine 2, XCOM:Enemy Unknown, Stacking. It also improves the situation in Hearthstone, Sonic Generations, and The Raven: Legacy of a Master Thief. However more work needs to be done there (splitting a lot more edges solves it, so it's some other sort of RA-related issue). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90887 Signed-off-by: Ilia Mirkin <[email protected]> Cc: "11.0" <[email protected]>
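The hazard described above, where phi sources are matched to predecessor blocks purely by inbound edge order, can be shown with a toy model. This is a sketch only: the `Block` class and both split functions are hypothetical and far simpler than nv50_ir's CFG. It illustrates why the sources must be fixed up, not how the commit does it.

```python
class Block:
    def __init__(self, name):
        self.name = name
        self.preds = []   # inbound edges, in order
        self.phis = []    # each phi: list of sources, phi[i] arrives via preds[i]

def split_edge_naive(src, dst, name):
    """Detach src -> dst, then attach src -> mid -> dst at the end of dst.preds."""
    mid = Block(name)
    mid.preds.append(src)
    dst.preds.remove(src)
    dst.preds.append(mid)          # edge order changed, phis not updated
    return mid

def split_edge_fixed(src, dst, name):
    """Same split, but move each phi source along with its edge."""
    mid = Block(name)
    mid.preds.append(src)
    old = dst.preds.index(src)
    dst.preds.pop(old)
    dst.preds.append(mid)
    for phi in dst.phis:
        phi.append(phi.pop(old))   # keep phi[i] paired with preds[i]
    return mid

def source_via(block, phi, pred):
    """Look up which phi source arrives through the given predecessor."""
    return phi[block.preds.index(pred)]

def build():
    a, b, join = Block("a"), Block("b"), Block("join")
    join.preds = [a, b]
    join.phis = [["x_from_a", "x_from_b"]]
    return a, b, join

a, b, join = build()
mid = split_edge_naive(a, join, "a_join")
print(source_via(join, join.phis[0], mid))   # x_from_b (wrong)

a, b, join = build()
mid = split_edge_fixed(a, join, "a_join")
print(source_via(join, join.phis[0], mid))   # x_from_a
```

In the naive version the new block inherits the value meant for the other predecessor, which is the kind of confusion the commit message describes.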
*	nouveau: android: add space before PRIx64 macro	Mauro Rossi	2015-09-09	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Otherwise the Android build fails with the error: unable to find string literal operator ‘operator"" PRIx64’. There are several resources referring to the problem, which is related to C++11, in our case used when building mesa for Lollipop. http://comments.gmane.org/gmane.comp.graphics.opensg.user/5883 I've not investigated all the semantics; some people even suggested a bug in the gcc compiler. I just saw that the build error was solved with one little space for Lollipop, with no side effect when C++11 is not used. v2: [Emil Velikov] add an alternative commit message from Mauro. Cc: 11.0 <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
*	nv50/ir: pre-compute BFE arg when both bits and offset are imm	Ilia Mirkin	2015-08-20	1	-3/+9
\| \| \| \| \| \| \| \| \| \|	Due to a quirk in how the nv50 opt passes run, the algebraic optimization that looks for these BFE's happens before the constant folding pass. Rearranging these passes isn't a great idea, but this is easy enough to fix. Allows a following cvt to eliminate the bfe in certain situations. Signed-off-by: Ilia Mirkin <[email protected]>
*	nv50/ir: Handle OP_CVT when folding constant expressions	Tobias Klausmann	2015-08-20	1	-0/+78
\| \| \| \| \|	[imirkin: handle more type combinations, use macro] Signed-off-by: Ilia Mirkin <[email protected]>
*	nvc0/ir: undo more shifts still by allowing a pre-SHL to occur	Ilia Mirkin	2015-08-20	1	-15/+33
\| \| \| \| \| \| \| \|	This happens with unpackSnorm lowering. There's yet another bitfield-extract behind it, but there's too much variation to be worth cutting through. Signed-off-by: Ilia Mirkin <[email protected]>
*	nvc0/ir: don't require AND when the high byte is being addressed	Ilia Mirkin	2015-08-20	1	-0/+12
\| \| \| \| \| \| \|	unpackUnorm* lowering doesn't AND the high byte/word as it's unnecessary. Detect that situation as well. Signed-off-by: Ilia Mirkin <[email protected]>
*	nvc0/ir: detect i2f/i2i which operate on specific bytes/words	Ilia Mirkin	2015-08-20	4	-4/+82
\| \| \| \| \| \| \| \| \| \| \|	Some Unigine shaders have been observed to unpack bytes out of 32-bit integers and convert them to floats. I2F/I2I can handle this sort of thing directly. Detect the handleable situations. This misses 16-bit word capabilities in nv50, but I haven't seen shaders that would actually make use of that. Signed-off-by: Ilia Mirkin <[email protected]>
*	nvc0/ir: detect AND/SHR pairs and convert into EXTBF	Ilia Mirkin	2015-08-20	1	-20/+46
\| \| \| \| \| \| \|	Some shaders appear to extract bits using shift/and combos. Detect some of those and convert to EXTBF instead. Signed-off-by: Ilia Mirkin <[email protected]>
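The equivalence behind this conversion can be sketched in a few lines. The function names are hypothetical, and only the simplest pattern is covered (a right shift followed by an AND with a contiguous low mask, on unsigned values). The pass's actual matching conditions are not stated in the commit message.

```python
def extract_shift_and(x, offset, mask):
    """The shift/and combo as a shader might write it."""
    return (x >> offset) & mask

def extbf(x, offset, width):
    """Unsigned bitfield extract: `width` bits starting at bit `offset`."""
    return (x >> offset) & ((1 << width) - 1)

def mask_to_width(mask):
    """Return the field width if mask is 2**n - 1 (a contiguous low mask), else None."""
    if mask != 0 and (mask & (mask + 1)) == 0:
        return mask.bit_length()
    return None

x = 0xDEADBEEF
width = mask_to_width(0xFF)
print(width)                                                        # 8
print(hex(extract_shift_and(x, 8, 0xFF)), hex(extbf(x, 8, width)))  # 0xbe 0xbe
print(mask_to_width(0xF0))                                          # None
```

A mask such as 0xF0 is not of the form 2**n - 1, so in this simplified model the pair would be left alone.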
*	nv50/ir: support different unordered_set implementations	Chih-Wei Huang	2015-08-20	5	-12/+57
\| \| \| \| \| \| \| \| \| \| \| \|	If built with the C++11 standard, use std::unordered_set. Otherwise, if built on an old Android version with stlport, use std::tr1::unordered_set with a wrapper class. Otherwise use std::tr1::unordered_set. Signed-off-by: Chih-Wei Huang <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
*	gk110/ir: fix sched calculator to consider all registers in the ISA	Ilia Mirkin	2015-08-17	1	-7/+10
\| \| \| \| \| \| \|	GK110/GK208 have 256 registers, not 64. Find out the number of registers from the target to avoid unnecessary iteration for pre-GK110. Signed-off-by: Ilia Mirkin <[email protected]>
*	gm107/ir: avoid letting the lowering pass get out of sync	Ilia Mirkin	2015-08-17	2	-88/+5
\| \| \| \| \| \| \| \|	There's a lot of functionality duplicated in the gm107 lowering pass from the nvc0 pass. As that one gets updated, the gm107 one falls behind. Avoid this by sharing the code. Signed-off-by: Ilia Mirkin <[email protected]>
*	gm107/ir: indirect handle goes first on maxwell also	Ilia Mirkin	2015-08-14	1	-8/+4
\| \| \| \| \| \| \|	Fixes fs-simple-texture-size.shader_test Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.6" <[email protected]>
*	nvc0/ir: cache vertex out base so that we don't recompute again	Ilia Mirkin	2015-07-29	1	-8/+15
\| \| \| \| \| \| \| \|	The global CSE pass stinks and is unable to pull this out. Easy enough to handle it here and avoid generating unnecessary special register loads (which can allegedly be quite slow). Signed-off-by: Ilia Mirkin <[email protected]>
*	nvc0/ir: output base for reading is based on laneid	Ilia Mirkin	2015-07-29	1	-0/+25
\| \| \| \| \| \| \| \| \| \| \|	PFETCH retrieves the address for incoming vertices, not output vertices in TCS. For output vertices, we must use the laneid as a base. Fixes the barrier piglit test, which was failing not for barrier-related reasons but because (a) it was trying to draw multiple patches and (b) the incoming patch size was not the same as the outgoing patch size. Signed-off-by: Ilia Mirkin <[email protected]>
*	nvc0/ir: trim out barrier sync for non-compute shaders	Ilia Mirkin	2015-07-28	1	-0/+6
\| \| \| \| \| \| \|	It seems like they're never necessary, and actively cause harm. This fixes some of the barrier-related piglits. Signed-off-by: Ilia Mirkin <[email protected]>
*	nvc0/ir: fix barrier emission	Ilia Mirkin	2015-07-28	1	-0/+2
\| \| \| \| \| \|	Immediate arguments require a flag to be set for each one. Signed-off-by: Ilia Mirkin <[email protected]>
*	nvc0/ir: per-patch vars are in a separate address space	Ilia Mirkin	2015-07-24	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \|	There's no need to attempt to avoid overlapping generic i/o with patch i/o. By the same token, we can't merge patch and non-patch loads/stores. This fixes at least the tes-both-input-array-*-index-rd tessellation variable-indexing tests. Signed-off-by: Ilia Mirkin <[email protected]>
*	nvc0/ir: kepler can't do indirect shader input/output loads directly	Ilia Mirkin	2015-07-23	8	-6/+75
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	There's a special AL2P instruction (called AFETCH in nv50 ir) which computes a "physical" value to be used with indirect addressing with ALD. Fixes tcs-input-array--index-rd tcs-output-array--index-wr varying-indexing tessellation tests on Kepler. Signed-off-by: Ilia Mirkin <[email protected]>
*	nvc0/ir: tess factors are now sysvals, adapt codegen to expect that	Ilia Mirkin	2015-07-23	6	-11/+24
\| \| \| \|	Signed-off-by: Ilia Mirkin <[email protected]>
*	gk110/ir: fake BAR support	Ilia Mirkin	2015-07-23	1	-0/+12
\| \| \| \| \| \|	Makes things sorta work until we figure out the real way to do this. Signed-off-by: Ilia Mirkin <[email protected]>
*	nvc0/ir: cleanup private enums that have graduated to gallium	Ilia Mirkin	2015-07-23	1	-5/+0
\| \| \| \|	Signed-off-by: Ilia Mirkin <[email protected]>
*	nvc0/ir: allow tess eval output loads to be CSE'd	Ilia Mirkin	2015-07-23	1	-0/+2
\| \| \| \| \| \|	These only happen for gl_TessCoord which are constant. Signed-off-by: Ilia Mirkin <[email protected]>
*	nvc0/ir: add hazard for 2nd dim of vfetch/load indirect argument	Ilia Mirkin	2015-07-23	1	-0/+2
\| \| \| \| \| \| \|	Apparently a multi-word load can potentially overwrite the indirect sources, so make sure that RA picks different registers for those. Signed-off-by: Ilia Mirkin <[email protected]>
*	nvc0/ir: patch vertex count is stored in the upper bits	Ilia Mirkin	2015-07-23	1	-0/+4
\|
*	nvc0/ir: add support for reading outputs in tess control shaders	Ilia Mirkin	2015-07-23	2	-2/+18
\| \| \| \|	Signed-off-by: Ilia Mirkin <[email protected]>
*	nvc0/ir: set perPatch flag on load/stores to per-patch varyings	Ilia Mirkin	2015-07-23	1	-2/+6
\| \| \| \|	Signed-off-by: Ilia Mirkin <[email protected]>