mesa.git

	Commit message	Author	Age	Files	Lines
*	freedreno/ir3: move some helpers	Rob Clark	2014-11-14	2	-65/+71
	Split out a few helpers from fd3_program so we don't have to duplicate them for fd4_program. Signed-off-by: Rob Clark <[email protected]>
*	freedreno: rename draw->draw_vbo	Rob Clark	2014-11-14	4	-6/+6
	Gets rid of a namespace conflict w/ a4xx, which wants an fd4_draw() version of fd_draw(). Signed-off-by: Rob Clark <[email protected]>
*	freedreno/a3xx: missing u_upload_destroy	Rob Clark	2014-11-14	1	-0/+2
	Signed-off-by: Rob Clark <[email protected]>
*	freedreno: fix borked check for a320.0	Rob Clark	2014-11-14	1	-1/+1
	Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: half vs full reg in standalone compiler output	Rob Clark	2014-11-14	1	-6/+10
	Handle hrN.c in printing outputs/inputs. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: consider instruction neighbors in cp	Rob Clark	2014-10-25	2	-11/+178
	Fanin (merge) nodes require their srcs to be "adjacent" in consecutive scalar registers. Keep track of instruction neighbors in the copy-propagation step and avoid eliminating mov's which would cause an instruction to need multiple distinct left and/or right neighbors. This lets us not fall on our face when we encounter things like:
	  1: MOV TEMP[2], IN[0].xyzw
	  2: TEX OUT[0].xy, TEMP[2], SAMP[0], SHADOW2D
	  3: MOV TEMP[2].xy, IN[0].yxzz
	  4: TEX OUT[0].zw, TEMP[2], SAMP[0], SHADOW2D
	  5: END
	Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: always mov tex coords	Rob Clark	2014-10-25	1	-54/+30
	Always insert extra mov's for the tex coord into the fanin. This simplifies things a bit, and avoids a scenario where multiple sam instructions can have mutually exclusive inputs to their fanin, for example:
	  1: TEX OUT[0].xy, IN[0].xyxx, SAMP[0], 2D
	  2: TEX OUT[0].zw, IN[0].yxxx, SAMP[0], 2D
	The CP pass can always remove the mov's that are not actually needed, so it is better to start out with too many mov's in the front end than not enough. Signed-off-by: Rob Clark <[email protected]>
*	freedreno: rename a couple debug flags	Rob Clark	2014-10-25	3	-7/+7
	dscis -> noscis, dbypass -> nobypass: a bit more consistent w/ nobin, etc. And IMO a bit more sensible names. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: skip virtual outputs in standalone compiler	Rob Clark	2014-10-25	1	-0/+3
	Kills get added to the outputs list, to ensure they get scheduled. But they aren't really outputs, so skip them in the header comment block. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: standalone compiler updates for ir3test	Rob Clark	2014-10-25	4	-18/+51
	In order to test compiler changes more easily, spit out the assembled shader with some header information so that we can know about inputs/outputs more easily. See: git://people.freedesktop.org/~robclark/ir3test In ir3test we have a big collection of tgsi shaders and reference ir3_compiler outputs. When making compiler changes, regenerate the compiler outputs and feed to ir3test to compare the new vs reference shader. Signed-off-by: Rob Clark <[email protected]>
*	gallium: introduce PIPE_CAP_CLIP_HALFZ.	Mathias Fröhlich	2014-10-24	1	-0/+1
	In preparation of ARB_clip_control. Let the driver decide if it supports pipe_rasterizer_state::clip_halfz being set to true. v3: Initially enable on ilo. Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Brian Paul <[email protected]> Signed-off-by: Mathias Froehlich <[email protected]>
*	Revert "freedreno/a3xx: only emit dirty consts"	Rob Clark	2014-10-23	2	-9/+5
	This reverts commit 94bb33617d1e8978dc52b8aaa4eb41bfb6703f79, which somehow broke gnome-shell and needs more investigation. For now, revert.
*	freedreno: fix PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE	Rob Clark	2014-10-23	1	-7/+6
	fd_bo_cpu_prep() doesn't realize the bo is already referenced in unflushed cmdstream. It could be made to do so (but would have to be implemented twice, i.e. both for msm and kgsl). But we still can't do the expected thing if the caller isn't using _NOSYNC. Because of the way the tiling works, we need to build quite a bit of cmdstream at flush time, which is not possible to do at the libdrm level. So rather than trying to make fd_bo_cpu_prep() smarter than it can possibly be, just always discard and reallocate if the PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE flag is set. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/a3xx: fix depth/stencil restore format	Rob Clark	2014-10-21	1	-1/+5
	Also fix z16 restore format which was completely wrong. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/a3xx: fix viewport state during clear	Rob Clark	2014-10-21	1	-1/+19
	Signed-off-by: Rob Clark <[email protected]>
*	freedreno: mark scissor state dirty when enable bit changes	Rob Clark	2014-10-21	1	-0/+10
	We don't have a scissor enable bit in hw, so when a raster state change results in scissor enable bit changing, we need to also mark scissor state as dirty. Signed-off-by: Rob Clark <[email protected]>
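A minimal sketch of the idea described above, not the driver's actual code. The struct, field, and flag names (`sketch_context`, `sketch_rasterizer`, `SKETCH_DIRTY_*`) are hypothetical stand-ins. When binding a new rasterizer state flips the scissor enable bit, the scissor state is flagged dirty too so that it gets re-emitted:

```c
#include <stdbool.h>

/* Hypothetical dirty bits; the real driver has its own flag names. */
enum {
	SKETCH_DIRTY_RASTERIZER = 1 << 0,
	SKETCH_DIRTY_SCISSOR    = 1 << 1,
};

/* Hypothetical rasterizer state: only the bit we care about here. */
struct sketch_rasterizer {
	bool scissor;   /* scissor test enabled? */
};

struct sketch_context {
	const struct sketch_rasterizer *rasterizer;
	unsigned dirty;
};

/* No rasterizer bound yet counts as "scissor disabled". */
static bool
scissor_enabled(const struct sketch_context *ctx)
{
	return ctx->rasterizer && ctx->rasterizer->scissor;
}

static void
sketch_bind_rasterizer(struct sketch_context *ctx,
		const struct sketch_rasterizer *rast)
{
	bool old_enable = scissor_enabled(ctx);

	ctx->rasterizer = rast;
	ctx->dirty |= SKETCH_DIRTY_RASTERIZER;

	/* The enable bit lives in rasterizer state, but it changes which
	 * scissor rectangle must be programmed, so flag that too.
	 */
	if (old_enable != scissor_enabled(ctx))
		ctx->dirty |= SKETCH_DIRTY_SCISSOR;
}
```

Rebinding a rasterizer state with the same enable bit leaves the scissor flag untouched, which keeps redundant state emission down.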
*	freedreno: clear vs scissor	Rob Clark	2014-10-21	7	-13/+96
	The optimization of avoiding restore (mem2gmem) if there was a clear falls down a bit if you don't have a fullscreen scissor. We need to make the decision logic a bit more clever to keep track of what was cleared, so that we can (a) completely skip mem2gmem if entire buffer was cleared, or (b) skip mem2gmem on a per-tile basis for tiles that were completely cleared. Signed-off-by: Rob Clark <[email protected]>
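The two cases (a) and (b) above reduce to rectangle containment tests. The following is a rough sketch under simplifying assumptions: a single cleared rectangle is tracked (the scissor in effect at clear time), and per-buffer tracking for color, depth, and stencil is omitted. All names here (`sketch_rect`, `needs_restore_anywhere`, `tile_needs_restore`) are hypothetical and not taken from the driver:

```c
#include <stdbool.h>

/* Hypothetical rectangle: min is inclusive, max is exclusive. */
struct sketch_rect {
	unsigned minx, miny, maxx, maxy;
};

/* True if 'outer' completely covers 'inner'. */
static bool
rect_contains(const struct sketch_rect *outer,
		const struct sketch_rect *inner)
{
	return outer->minx <= inner->minx && outer->miny <= inner->miny &&
	       outer->maxx >= inner->maxx && outer->maxy >= inner->maxy;
}

/* Case (a): if the clear covered the whole buffer, no tile needs a
 * restore (mem2gmem) at all.
 */
static bool
needs_restore_anywhere(const struct sketch_rect *cleared,
		unsigned width, unsigned height)
{
	struct sketch_rect full = { 0, 0, width, height };
	return !rect_contains(cleared, &full);
}

/* Case (b): a tile needs a restore only if some part of it lies
 * outside the cleared region.
 */
static bool
tile_needs_restore(const struct sketch_rect *cleared,
		const struct sketch_rect *tile)
{
	return !rect_contains(cleared, tile);
}
```

A tile that straddles the edge of the cleared region still has to be restored, since part of its previous contents remain visible.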
*	gallium: add PIPE_SHADER_CAP_MAX_OUTPUTS and use it in st/mesa	Marek Olšák	2014-10-21	1	-0/+1
	With 5 shader stages and various combinations of enabled and disabled shaders, the maximum number of outputs in one shader doesn't have to be equal to the maximum number of inputs in the following shader. v2: return 32 for softpipe and llvmpipe
*	freedreno/ir3: add debug flag to disable cp	Rob Clark	2014-10-20	4	-1/+10
	FD_MESA_DEBUG=nocp will disable the copy propagation pass. Signed-off-by: Rob Clark <[email protected]>
*	freedreno: positions come out as integers, not half-integers	Ilia Mirkin	2014-10-20	1	-2/+2
	Signed-off-by: Ilia Mirkin <[email protected]> Signed-off-by: Rob Clark <[email protected]>
*	freedreno/a3xx: disable early-z when we have kill's	Rob Clark	2014-10-20	3	-0/+10
	Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: fix potential gpu lockup with kill	Rob Clark	2014-10-20	4	-2/+61
	It seems like the hardware is unhappy if we execute a kill instruction prior to last input (ei). Probably the shader thread stops executing and the end-input flag is never set. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: comment + better fxn name	Rob Clark	2014-10-20	1	-3/+5
	Signed-off-by: Rob Clark <[email protected]>
*	freedreno/a3xx: only emit dirty consts	Rob Clark	2014-10-20	2	-5/+9
	If app only updates (for example) vertex uniforms, it would be nice to only re-emit those and not also frag uniforms. Means we need to mark the first frag shader const buffer dirty after a clear. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/a3xx: more layer/level fixes	Rob Clark	2014-10-20	3	-8/+14
	Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: large const support	Rob Clark	2014-10-15	5	-13/+33
	Signed-off-by: Rob Clark <[email protected]>
*	freedreno: update generated headers	Rob Clark	2014-10-15	4	-5/+10
	Signed-off-by: Rob Clark <[email protected]>
*	freedreno: fix layer_stride	Rob Clark	2014-10-15	1	-1/+1
	Signed-off-by: Rob Clark <[email protected]>
*	freedreno: inline fd_draw_emit()	Rob Clark	2014-10-15	2	-49/+47
	Manual LTO. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: optimize shader key comparision	Rob Clark	2014-10-15	5	-40/+79
	Signed-off-by: Rob Clark <[email protected]>
*	freedreno/a3xx: refactor/optimize emit	Rob Clark	2014-10-15	7	-83/+125
	Because we reuse various bits of emit code (for state/vertex/prog/etc) for both regular draws and internal draws (gmem<->mem, clear, etc), the number of parameters getting passed around has been growing. Refactor to group these into fd3_emit. This simplifies fxn signatures, avoids passing around shader key on the stack, etc. It also gives us a nice place to cache shader-variant lookup to avoid looking up shader variants multiple times per draw (without having to also pass them around as fxn args everywhere). Signed-off-by: Rob Clark <[email protected]>
*	freedreno/a3xx: refactor vertex state emit	Rob Clark	2014-10-15	11	-79/+83
	Get rid of fd3_vertex_buf and use fd_vertex_state directly for all draws. Removes a tiny bit of CPU overhead for munging around the vertex state every time it is emitted, but more importantly it cleans things up for later optimizations, so the emit paths don't have to special case internal draws (gmem<->mem, clears, etc) vs regular draws. Instead of constructing the fd3_vertex_buf array each time for internal draws, at context init time pre-create solid_vbuf_state and blit_vbuf_state. Signed-off-by: Rob Clark <[email protected]>
*	freedreno: use tgsi_lowering	Rob Clark	2014-10-14	8	-1673/+6
	Now that the freedreno_lowering code is moved to tgsi_lowering, remove our private copy and switch over to using the common version. Signed-off-by: Rob Clark <[email protected]>
*	freedreno: query fixes	Rob Clark	2014-10-03	3	-8/+13
	Fixes a few issues, including a potential empty-IB (which triggers gpu hangs in piglit occlusion_query_meta_no_fragments). Signed-off-by: Rob Clark <[email protected]>
*	freedreno/a3xx: handle VS only outputting BCOLOR	Rob Clark	2014-10-03	1	-2/+10
	Possibly we should map the front color to black (zeroes). But not sure there is a way to do that without generating a shader variant. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: fix lockups with lame FRAG shaders	Rob Clark	2014-10-03	4	-6/+17
	Shaders like:
	  FRAG
	  PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1
	  DCL IN[0], GENERIC[0], PERSPECTIVE
	  DCL OUT[0], COLOR
	  DCL SAMP[0]
	  DCL TEMP[0], LOCAL
	  IMM[0] FLT32 { 0.0000, 1.0000, 0.0000, 0.0000}
	    0: TEX TEMP[0], IN[0].xyyy, SAMP[0], 2D
	    1: MOV OUT[0], IMM[0].xyxx
	    2: END
	cause unhappiness. They have an IN[], but once this is compiled the useless TEX instruction goes away, leaving a varying that is never fetched, which makes the hw unhappy. In the process fix a signed vs unsigned compare. If the vertex shader has max_reg=-1, MAX2() vs an unsigned would not give the desired result. Signed-off-by: Rob Clark <[email protected]>
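The signed vs unsigned MAX2() pitfall mentioned above can be reproduced in isolation. This is an illustrative sketch only: the macro has the usual MAX2() shape, while the function and parameter names (`max_reg_mixed`, `max_reg_signed`, `vs_max_reg`, `fs_max_reg`) are hypothetical and not taken from the driver. In C, comparing an `int` with an `unsigned` converts the `int` to `unsigned`, so -1 becomes the largest unsigned value and wins the comparison:

```c
/* Same shape as the usual MAX2() helper macro. */
#define MAX2(a, b) ((a) > (b) ? (a) : (b))

/* Mixed signedness: vs_max_reg is implicitly converted to unsigned
 * for both the comparison and the result, so -1 turns into UINT_MAX.
 */
static unsigned
max_reg_mixed(int vs_max_reg, unsigned fs_max_reg)
{
	return MAX2(vs_max_reg, fs_max_reg);
}

/* Both operands signed: -1 compares as less than any valid register
 * index, which is the desired result. Assumes fs_max_reg fits in int.
 */
static int
max_reg_signed(int vs_max_reg, unsigned fs_max_reg)
{
	return MAX2(vs_max_reg, (int)fs_max_reg);
}
```

With `vs_max_reg = -1` and `fs_max_reg = 3`, the mixed version yields the maximum unsigned value rather than 3, while the all-signed version yields 3.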
*	freedreno/ir3: add TXF support	Ilia Mirkin	2014-10-02	1	-1/+39
	Still failing a bunch of the fairly picky texelFetch tests, but the 1D(Array) ones are full passes. Signed-off-by: Ilia Mirkin <[email protected]>
*	freedreno/ir3: add TXD support and expose ARB_shader_texture_lod	Ilia Mirkin	2014-10-02	3	-9/+56
	Signed-off-by: Ilia Mirkin <[email protected]>
*	freedreno/ir3: add texture offset support	Ilia Mirkin	2014-10-02	1	-4/+45
	Signed-off-by: Ilia Mirkin <[email protected]>
*	freedreno/ir3: shadow comes before array	Ilia Mirkin	2014-10-02	1	-2/+2
	Experimentally, this makes *ArrayShadow tex-miplevel-selection tests pass. Signed-off-by: Ilia Mirkin <[email protected]>
*	freedreno/ir3: make TXQ return integers, not floats	Ilia Mirkin	2014-10-02	1	-1/+1
	We're still doing something wrong for array textures. Signed-off-by: Ilia Mirkin <[email protected]>
*	freedreno/ir3: add UMAD support	Ilia Mirkin	2014-10-02	1	-4/+15
	Signed-off-by: Ilia Mirkin <[email protected]>
*	freedreno/ir3: add ISSG support	Ilia Mirkin	2014-10-02	1	-0/+39
	Signed-off-by: Ilia Mirkin <[email protected]>
*	freedreno/ir3: add MOD support	Ilia Mirkin	2014-10-02	1	-8/+12
	Signed-off-by: Ilia Mirkin <[email protected]>
*	freedreno/ir3: add UMOD support, based on UDIV	Ilia Mirkin	2014-10-02	1	-6/+31
	Signed-off-by: Ilia Mirkin <[email protected]>
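The commit body does not describe how UMOD is built on UDIV. One standard way to get an unsigned remainder from an unsigned divide is the identity a % b = a - (a / b) * b; whether the driver emits exactly this sequence is an assumption. The sketch below uses hypothetical names (`sketch_udiv`, `sketch_umod`) and plain C division as a stand-in for the lowered divide:

```c
#include <stdint.h>

/* Stand-in for the lowered unsigned divide; here just C division.
 * The caller must ensure b != 0.
 */
static uint32_t
sketch_udiv(uint32_t a, uint32_t b)
{
	return a / b;
}

/* Remainder derived from the quotient: a - (a / b) * b.
 * q * b never exceeds a, so the subtraction cannot wrap.
 */
static uint32_t
sketch_umod(uint32_t a, uint32_t b)
{
	uint32_t q = sketch_udiv(a, b);
	return a - q * b;
}
```

The appeal of this form is that the remainder costs only one extra multiply and subtract on top of the existing divide sequence.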
*	freedreno/ir3: add IDIV/UDIV support	Ilia Mirkin	2014-10-02	1	-3/+197
	Logic shamelessly copied from nv50 lowering pass. Signed-off-by: Ilia Mirkin <[email protected]>
*	freedreno/ir3: avoid fan-in sources referring to same instruction	Ilia Mirkin	2014-10-02	1	-2/+10
	Since the RA has to be done s.t. each one gets its own (adjacent) register, it would complicate matters if instructions were allowed to be repeated. This enables copy-propagation use in situations where previously that might have happened. Signed-off-by: Ilia Mirkin <[email protected]> Signed-off-by: Rob Clark <[email protected]>
*	freedreno/a3xx: emit all immediates in one shot	Rob Clark	2014-10-02	1	-8/+16
	Makes the command stream a bit tighter when there are lots of immediates. Signed-off-by: Rob Clark <[email protected]>
*	freedreno: instanced drawing/compute not yet supported	Ilia Mirkin	2014-10-02	1	-3/+3
	Signed-off-by: Ilia Mirkin <[email protected]> Signed-off-by: Rob Clark <[email protected]>
*	freedreno/a3xx: handle large shader program sizes	Rob Clark	2014-10-02	1	-11/+63
	Above a certain limit use CACHE mode instead of BUFFER mode. This should solve gpu hangs with large shader programs. Signed-off-by: Rob Clark <[email protected]>