mesa.git

	Commit message	Author	Age	Files	Lines
*	nvc0: enable compute support on Fermi	Samuel Pitoiset	2015-11-08	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \|	Although compute support is still incomplete because textures and surfaces need to be implemented, it allows launching very simple compute kernels, such as the one which reads MP performance counters. This turns on PIPE_CAP_COMPUTE and PIPE_SHADER_COMPUTE. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
*	nv50/ir: fix emission of s[] args in certain situations	Ilia Mirkin	2015-11-07	1	-2/+2
\| \| \| \| \| \| \| \| \| \|	There might only be a single arg (e.g. cvt), so use mode rather than looking at the source directly. Also we don't want to rely on the type of the value, which can be unreliable, but instead use the instruction's. This works out well since mkSplit doesn't adjust the type. Signed-off-by: Ilia Mirkin <[email protected]>
*	nv50/ir: only take abs value when computing high result	Ilia Mirkin	2015-11-07	1	-1/+1
\| \| \| \| \| \| \| \|	Not reachable from TGSI since it only has UMUL, no IMUL. However it's surprising that setting argument types to s32 will cause sign to get lost. Signed-off-by: Ilia Mirkin <[email protected]>
*	nouveau: avoid queueing too much work onto a single fence	Ilia Mirkin	2015-11-07	2	-26/+43
\| \| \| \| \| \| \| \| \| \|	Force the fence to get kicked off, which won't actually wait for its completion, but any additional work will be put onto a fresh list. This fixes crashes in teximage-colors --benchmark with too many active maps. Signed-off-by: Ilia Mirkin <[email protected]>
*	nv50/ir: allow emission of immediates in imul/imad ops	Ilia Mirkin	2015-11-07	1	-2/+8
\| \| \| \| \| \| \|	Nothing actually uses this yet (due to complications), but the emission logic is right. Signed-off-by: Ilia Mirkin <[email protected]>
*	nv50/ir: properly set the type of the constant folding result	Ilia Mirkin	2015-11-06	1	-4/+4
\| \| \| \| \| \| \|	This removes the hack used for merge, which only covers a fraction of the cases. Signed-off-by: Ilia Mirkin <[email protected]>
*	nv50/ir: add support for const-folding OP_CVT with F64 source/dest	Ilia Mirkin	2015-11-06	3	-0/+45
\| \| \| \|	Signed-off-by: Ilia Mirkin <[email protected]>
*	nv50/ir: add fp64 opcode emission support for G200 (NVA0)	Ilia Mirkin	2015-11-06	1	-10/+84
\| \| \| \| \| \|	Need to emulate rcp/rsq before providing full fp64 support Signed-off-by: Ilia Mirkin <[email protected]>
*	nv50/ir: Add support for 64bit immediates to checkSwapSrc01	Hans de Goede	2015-11-06	1	-5/+6
\| \| \| \| \| \| \| \|	Now that we support 64 bit immediates in insnCanLoad, we need to swap 64 bit immediate sources too for optimal effect. Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
*	nvc0/ir: Teach insnCanLoad about double immediates	Hans de Goede	2015-11-06	1	-6/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Teach insnCanLoad about double immediates. Together with "Add support for merge-s to the ConstantFolding pass", this turns the following (nvc0) code: 1: mov u32 $r2 0x00000000 (8) 2: mov u32 $r3 0x3fe00000 (8) 3: add f64 $r0d $r0d $r2d (8) into: 1: add f64 $r0d $r0d 0.500000 (8) Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
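To see why the two 32-bit moves in the commit above fold into the single immediate 0.500000, the register pair can be reinterpreted as one IEEE 754 binary64 value. This is only an illustrative sketch outside of Mesa: the helper name `merge_u32_pair` is hypothetical, and it assumes `$r2` holds the low word and `$r3` the high word of `$r2d`.

```python
import struct


def merge_u32_pair(lo: int, hi: int) -> float:
    """Reinterpret two 32-bit words (low, high) as one IEEE 754 binary64 value."""
    # Pack both words little-endian, low word first, then read them back as a double.
    return struct.unpack("<d", struct.pack("<II", lo, hi))[0]


if __name__ == "__main__":
    # Values taken from the IR listing: $r2 = 0x00000000, $r3 = 0x3fe00000.
    print(merge_u32_pair(0x00000000, 0x3FE00000))  # 0.5
```

The high word 0x3fe00000 carries a zero sign bit, the exponent field 0x3fe (2^-1) and an all-zero mantissa prefix, which is exactly the encoding of 0.5.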
*	nv50/ir: Add support for merge-s to the ConstantFolding pass	Hans de Goede	2015-11-06	1	-0/+15
\| \| \| \| \| \| \| \| \| \| \|	This allows later passes like LoadPropagation to properly deal with 64 bit immediates. If the new 64 bit load this introduces does not get optimized away then split64BitOpPostRA() will split this into 2 instructions again. Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
*	nv50/ir: disallow 64-bit immediates on nv50 targets	Ilia Mirkin	2015-11-06	1	-1/+1
\| \| \| \| \| \|	No instructions are able to load short immediates like nvc0 can. Signed-off-by: Ilia Mirkin <[email protected]>
*	nv50/ir: allow movs with TYPE_F64 destinations to be split	Ilia Mirkin	2015-11-06	1	-0/+6
\| \| \| \|	Signed-off-by: Ilia Mirkin <[email protected]>
*	gm107/ir: Add support for double immediates	Hans de Goede	2015-11-06	1	-1/+4
\| \| \| \| \| \| \| \|	Add support for encoding double immediates (up to 20 bits of precision) into the generated gm107 machine-code. Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
*	nvc0/ir: Add support for double immediates	Hans de Goede	2015-11-06	1	-0/+8
\| \| \| \| \| \| \| \|	Add support for encoding double immediates (up to 20 bits of precision) into the generated nvc0 machine-code. Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
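The "up to 20 bits of precision" limit in the two commits above (gm107 and nvc0) can be illustrated with a small check. This sketch assumes the short immediate field keeps only the 20 most significant bits of the binary64 encoding (sign, 11 exponent bits, top 8 mantissa bits), so a double fits only when its low 44 bits are zero. The commit messages do not spell this out, and the function name `fits_short_f64_immediate` is hypothetical, not a Mesa API.

```python
import struct

# Assumption: the instruction keeps bits 63..44 of the double; bits 43..0 must be zero.
LOW_44_BITS = (1 << 44) - 1


def fits_short_f64_immediate(value: float) -> bool:
    """Return True if the double's encoding survives truncation to its top 20 bits."""
    bits = struct.unpack("<Q", struct.pack("<d", value))[0]
    return (bits & LOW_44_BITS) == 0


if __name__ == "__main__":
    for v in (0.5, 1.5, 0.1):
        print(v, fits_short_f64_immediate(v))
```

Under this assumption, values such as 0.5 or 1.5 can be folded into the instruction, while 0.1 (whose mantissa has non-zero low bits) would still need a separate load.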
*	nvc0: reintroduce BGRA4 format support	Ilia Mirkin	2015-11-06	2	-3/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 342e68dc60 (nvc0: remove BGRA4 format support) removed the support to fix a WoW trace. However after further experimentation, I was able to get the blit to work by using a different "fake" format in the 2d engine. The reason why this worked on nv50 is that nv50 falls back to the 3d blit path in case either the src or the dst aren't "faithfully" supported, while nvc0 only does it for the dst format. RG8 is better supported by the nvc0 2d engine than R16. Signed-off-by: Ilia Mirkin <[email protected]>
*	nouveau: send back a debug message when waiting for a fence to complete	Ilia Mirkin	2015-11-05	10	-16/+30
\| \| \| \|	Signed-off-by: Ilia Mirkin <[email protected]>
*	nv50,nvc0: provide debug messages with shader compilation stats	Ilia Mirkin	2015-11-05	11	-9/+28
\| \| \| \|	Signed-off-by: Ilia Mirkin <[email protected]>
*	nouveau: add support for sending debug messages via KHR_debug	Ilia Mirkin	2015-11-05	5	-0/+26
\| \| \| \|	Signed-off-by: Ilia Mirkin <[email protected]>
*	nouveau: relax fence emit space assert	Ilia Mirkin	2015-11-04	3	-3/+3
\| \| \| \| \| \| \| \| \|	We also have the "reserved for kick" space available. Some of my earlier changes can probably be removed, but this is a quick fix for some of the rarer fallout. Signed-off-by: Ilia Mirkin <[email protected]> Cc: <[email protected]>
*	nvc0: add missing compute parameters required by clover	Samuel Pitoiset	2015-11-03	1	-1/+10
\| \| \| \| \| \| \|	This fixes crashes with some piglit OpenCL tests. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
*	nvc0: handle NULL pointer in nvc0_get_compute_param()	Samuel Pitoiset	2015-11-03	1	-24/+21
\| \| \| \| \| \| \| \| \|	To get the size (in bytes) of a compute parameter, clover first calls get_compute_param() with a NULL data pointer. The RET() macro is based on nv50. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
*	nv50: use correct heaps for FP and GP code segments	Samuel Pitoiset	2015-11-01	1	-2/+2
\| \| \| \| \| \|	This is just a cosmetic change. Trivial. Signed-off-by: Samuel Pitoiset <[email protected]>
*	nouveau: get rid of tabs	Ilia Mirkin	2015-10-31	19	-607/+607
\| \| \| \|	Signed-off-by: Ilia Mirkin <[email protected]>
*	nv50: do not create an invalid HW query type	Samuel Pitoiset	2015-10-30	2	-12/+30
\| \| \| \| \| \| \| \|	While we are at it, store the rotate offset for occlusion queries to nv50_hw_query like on nvc0. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Pierre Moreau <[email protected]>
*	nv50: move HW queries to nv50_query_hw.c/h files	Samuel Pitoiset	2015-10-30	8	-349/+476
\| \| \| \| \|	Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Pierre Moreau <[email protected]>
*	nv50: move nva0_so_target_save_offset() to its correct location	Samuel Pitoiset	2015-10-30	3	-21/+18
\| \| \| \| \|	Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Pierre Moreau <[email protected]>
*	nv50: add a header file for nv50_query	Samuel Pitoiset	2015-10-30	6	-40/+49
\| \| \| \| \| \| \| \| \|	As for nvc0, this will allow splitting the different types of queries and prepares the way for both global performance counters and MP counters. While we are at it, make use of the nv50_query struct instead of pipe_query. Signed-off-by: Samuel Pitoiset <[email protected]>
*	nv50: mark contexts shareable, compile at creation time	Ilia Mirkin	2015-10-29	2	-1/+4
\| \| \| \|	Signed-off-by: Ilia Mirkin <[email protected]>
*	nv50: allow per-sample interpolation to be forced via rast	Ilia Mirkin	2015-10-29	8	-9/+52
\| \| \| \| \| \| \| \|	Uses the same technique as for nvc0 of fixups before upload, and evicting in case of state change. Removes one source of variants kept by st/mesa. Signed-off-by: Ilia Mirkin <[email protected]>
*	nvc0: expose a group of performance metrics on Fermi	Samuel Pitoiset	2015-10-29	3	-3/+16
\| \| \| \| \| \| \|	This allows monitoring these performance metrics through GL_AMD_performance_monitor. Signed-off-by: Samuel Pitoiset <[email protected]>
*	nv50/ir: adapt to new method for passing in cull/clip distance masks	Ilia Mirkin	2015-10-29	4	-14/+14
\| \| \| \|	Signed-off-by: Ilia Mirkin <[email protected]>
*	nvc0: share shaders between contexts and build immediately	Ilia Mirkin	2015-10-29	3	-1/+7
\| \| \| \| \| \| \|	Avoid deferring building shaders until draw time, should hopefully reduce any stuttering, as well as enable shader-db style analysis. Signed-off-by: Ilia Mirkin <[email protected]>
*	nvc0: do upload-time fixups for interpolation parameters	Ilia Mirkin	2015-10-29	15	-19/+239
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Unfortunately flatshading is an all-or-nothing proposition on nvc0, while GL 3.0 calls for the ability to selectively specify explicit interpolation parameters on gl_Color/gl_SecondaryColor which would override the flatshading setting. This allows us to fix up the interpolation settings after shader generation based on rasterizer settings. While we're at it, we can add support for dynamically forcing all (non-flat) shader inputs to be interpolated per-sample, which allows st/mesa to not generate variants for these. Fixes the remaining failing glsl-1.30/execution/interpolation piglits. Signed-off-by: Ilia Mirkin <[email protected]>
*	nv50: add ARB_copy_image support	Ilia Mirkin	2015-10-28	2	-7/+11
\| \| \| \|	Signed-off-by: Ilia Mirkin <[email protected]>
*	nvc0: add ARB_copy_image support	Ilia Mirkin	2015-10-28	2	-7/+11
\| \| \| \|	Signed-off-by: Ilia Mirkin <[email protected]>
*	nvc0: fix crash when nv50_miptree_from_handle fails	Julien Isorce	2015-10-28	1	-1/+2
\| \| \| \| \|	Signed-off-by: Julien Isorce <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
*	gallium: add PIPE_CAP_COPY_BETWEEN_COMPRESSED_AND_PLAIN_FORMATS	Marek Olšák	2015-10-28	3	-0/+3
\| \| \| \| \| \|	For ARB_copy_image. Reviewed-by: Brian Paul <[email protected]>
*	nvc0: respect edgeflag attribute width	Ilia Mirkin	2015-10-23	1	-7/+33
\| \| \| \| \| \| \| \| \| \| \| \|	The edgeflag comes in as ubyte with glEdgeFlagPointer but as float with plain immediate glEdgeFlag. Avoid reading bytes that weren't meant for the edgeflag in the pointer case. Fixes intermittent failures with gl-2.0-edgeflag piglit (and valgrind complaints about reading uninitialized memory). Signed-off-by: Ilia Mirkin <[email protected]> Cc: [email protected]
*	gallium: add PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT	Marek Olšák	2015-10-20	3	-0/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This avoids a serious r600g bug leading to a GPU hang. The chances this bug will get fixed are pretty low now. I deeply regret listening to others and not pushing this patch, leaving other users with a GPU-crashing driver. Yes, it should be fixed in the compiler and it's ugly, but users couldn't care less about that. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86720 Cc: 11.0 10.6 <[email protected]> Reviewed-by: Brian Paul <[email protected]>
*	gallium: add PIPE_CAP_SHAREABLE_SHADERS	Marek Olšák	2015-10-20	3	-0/+3
\| \| \| \| \| \|	I'll let drivers figure out how to do it. Reviewed-by: Ilia Mirkin <[email protected]>
*	nvc0: do not bind input params at compute state init on Fermi	Samuel Pitoiset	2015-10-18	1	-8/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It looks like binding a constant buffer on compute overwrites the 3D state. To avoid that, we already re-bind all the 3D constant buffers after launching a compute grid but this is not enough. Binding the constant buffer of input parameters for the compute state at initialization corrupts the 3D constant buffers, and it's just useless to bind it because this is not needed until we really launch a grid. This fixes some piglit regressions related to interpolation tests introduced in "nvc0: enable compute support by default on Fermi". Fixes: 00d6186 (nvc0: enable compute support by default on Fermi) Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
*	nvc0: add support for performance monitoring metrics on Fermi	Samuel Pitoiset	2015-10-17	4	-3/+500
\| \| \| \| \| \| \| \| \|	As explained in the CUDA toolkit documentation, "a metric is a characteristic of an application that is calculated from one or more event values." Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
*	nvc0: add a note about MP counters on GF100/GF110	Samuel Pitoiset	2015-10-16	1	-0/+5
\| \| \| \| \| \| \| \| \|	MP counters on GF100/GF110 (compute capability 2.0) are buggy because there is a context-switch problem that we need to fix. Results might be wrong sometimes, be careful! Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
*	nvc0: add MP counters variants for GF100/GF110	Samuel Pitoiset	2015-10-16	2	-77/+483
\| \| \| \| \| \| \| \| \|	GF100 and GF110 chipsets are compute capability 2.0, while the other Fermi chipsets are compute capability 2.1. That is why some MP counters differ between these chipsets and we need to handle variants. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
*	nvc0: move SW/HW queries info to their respective files	Samuel Pitoiset	2015-10-16	7	-178/+228
\| \| \| \| \| \| \|	This will help for handling HW SM queries variants on Fermi. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
*	nvc0: enable compute support by default on Fermi	Samuel Pitoiset	2015-10-16	2	-8/+2
\| \| \| \| \| \| \| \| \| \|	Compute support was not enabled by default because weird effects on 3D state happened, but I can't reproduce them anymore. This also enables MP performance counters by default on Fermi. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
*	nvc0: allow only one active query for the MP counters group	Samuel Pitoiset	2015-10-16	1	-11/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Because we can't expose the number of hardware counters needed for each different query, we don't want to allow more than one active query simultaneously to avoid failure when the maximum number of counters is reached. Note that these groups of GPU counters are currently only used by AMD_performance_monitor. Like for Kepler, this limits the maximum number of active queries to 1 on Fermi. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
*	nvc0: read MP counters of all GPCs on Fermi	Samuel Pitoiset	2015-10-16	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	When a card has more than one GPC, the grid used by the compute kernel which reads MP performance counters seems to be too small. The consequence is that the kernel is not launched on all TPCs. Increasing the grid size using the number of GPCs now launches enough blocks and we can read MP performance counters of all TPCs. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
*	nvc0: store the number of GPCs to nvc0_screen	Samuel Pitoiset	2015-10-16	2	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \|	NOUVEAU_GETPARAM_GRAPH_UNITS param returns the number of GPCs, the total number of TPCs and the number of ROP units. Note that when the DRM version is too old the default number of GPCs is fixed to 4. This will be used to launch the compute kernel which is used to read MP performance counters over all GPCs. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>