summaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers/freedreno
Commit message (Collapse)AuthorAgeFilesLines
* freedreno/ir3: move some helpersRob Clark2014-11-142-65/+71
| | | | | | | Split out a few helpers from fd3_program so we don't have to duplicate for fd4_program. Signed-off-by: Rob Clark <[email protected]>
* freedreno: rename draw->draw_vboRob Clark2014-11-144-6/+6
| | | | | | | Gets rid of a namespace conflict w/ a4xx which wants an fd4_draw() version of fd_draw().. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx: missing u_upload_destroyRob Clark2014-11-141-0/+2
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno: fix borked check for a320.0Rob Clark2014-11-141-1/+1
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: half vs full reg in standalone compiler outputRob Clark2014-11-141-6/+10
| | | | | | Handle hrN.c in printing outputs/inputs. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: consider instruction neighbors in cpRob Clark2014-10-252-11/+178
| | | | | | | | | | | | | | | | | Fanin (merge) nodes require it's srcs to be "adjacent" in consecutive scalar registers. Keep track of instruction neighbors in copy- propagation step and avoid eliminating mov's which would cause an instruction to need multiple distinct left and/or right neighbors. This lets us not fall on our face when we encounter things like: 1: MOV TEMP[2], IN[0].xyzw 2: TEX OUT[0].xy, TEMP[2], SAMP[0], SHADOW2D 3: MOV TEMP[2].xy, IN[0].yxzz 4: TEX OUT[0].zw, TEMP[2], SAMP[0], SHADOW2D 5: END Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: always mov tex coordsRob Clark2014-10-251-54/+30
| | | | | | | | | | | | | | | | Always insert extra mov's for the tex coord into the fanin. This simplifies things a bit, and avoids a scenario where multiple sam instructions can have mutually exclusive input's to it's fanin, for example: 1: TEX OUT[0].xy, IN[0].xyxx, SAMP[0], 2D 2: TEX OUT[0].zw, IN[0].yxxx, SAMP[0], 2D The CP pass can always remove the mov's that are not actually needed, so better to start out with too many mov's in the front end, than not enough. Signed-off-by: Rob Clark <[email protected]>
* freedreno: rename a couple debug flagsRob Clark2014-10-253-7/+7
| | | | | | | | | dscis -> noscis dbypass -> nobypass a bit more consistant w/ nobin, etc. And IMO a bit more sensible names. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: skip virtual outputs in standalone compilerRob Clark2014-10-251-0/+3
| | | | | | | Kills get added to the outputs list, to ensure they get scheduled. But they aren't *really* outputs so skip them in the header comment block. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: standalone compiler updates for ir3testRob Clark2014-10-254-18/+51
| | | | | | | | | | | | | | | In order to test compiler changes more easily, spit out the assembled shader with some header information so that we can know about inputs/outputs more easily. See: git://people.freedesktop.org/~robclark/ir3test In ir3test we have a big collection of tgsi shaders and reference ir3_compiler outputs. When making compiler changes, regenerate the compiler outputs and feed to ir3test to compare the new vs reference shader. Signed-off-by: Rob Clark <[email protected]>
* gallium: introduce PIPE_CAP_CLIP_HALFZ.Mathias Fröhlich2014-10-241-0/+1
| | | | | | | | | | | | In preparation of ARB_clip_control. Let the driver decide if it supports pipe_rasterizer_state::clip_halfz being set to true. v3: Initially enable on ilo. Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Brian Paul <[email protected]> Signed-off-by: Mathias Froehlich <[email protected]
* Revert "freedreno/a3xx: only emit dirty consts"Rob Clark2014-10-232-9/+5
| | | | | | | This reverts commit 94bb33617d1e8978dc52b8aaa4eb41bfb6703f79. Which somehow broke gnome-shell.. and needs more investigation. For now, revert..
* freedreno: fix PIPE_TRANSFER_DISCARD_WHOLE_RESOURCERob Clark2014-10-231-7/+6
| | | | | | | | | | | | | | | fd_bo_cpu_prep() doesn't realize the bo is already referenced in unflushed cmdstream. It could be made to do so (but would have to be implemented twice, ie. both for msm and kgsl). But we still can't do the expected thing if the caller isn't using _NOSYNC. Because of the way the tiling works, we need to build quite a bit of cmdstream at flush time, which is not possible to do at the libdrm level. So rather than trying to make fd_bo_cpu_prep() smarter than it can possibly be, just *always* discard and reallocate if the PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE flag is set. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx: fix depth/stencil restore formatRob Clark2014-10-211-1/+5
| | | | | | Also fix z16 restore format which was completely wrong. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx: fix viewport state during clearRob Clark2014-10-211-1/+19
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno: mark scissor state dirty when enable bit changesRob Clark2014-10-211-0/+10
| | | | | | | | We don't have a scissor enable bit in hw, so when a raster state change results in scissor enable bit changing, we need to also mark scissor state as dirty. Signed-off-by: Rob Clark <[email protected]>
* freedreno: clear vs scissorRob Clark2014-10-217-13/+96
| | | | | | | | | | | The optimization of avoiding restore (mem2gmem) if there was a clear falls down a bit if you don't have a fullscreen scissor. We need to make the decision logic a bit more clever to keep track of *what* was cleared, so that we can (a) completely skip mem2gmem if entire buffer was cleared, or (b) skip mem2gmem on a per-tile basis for tiles that were completely cleared. Signed-off-by: Rob Clark <[email protected]>
* gallium: add PIPE_SHADER_CAP_MAX_OUTPUTS and use it in st/mesaMarek Olšák2014-10-211-0/+1
| | | | | | | | With 5 shader stages and various combinations of enabled and disabled shaders, the maximum number of outputs in one shader doesn't have to be equal to the maximum number of inputs in the following shader. v2: return 32 for softpipe and llvmpipe
* freedreno/ir3: add debug flag to disable cpRob Clark2014-10-204-1/+10
| | | | | | FD_MESA_DEBUG=nocp will disable copy propagation pass. Signed-off-by: Rob Clark <[email protected]>
* freedreno: positions come out as integers, not half-integersIlia Mirkin2014-10-201-2/+2
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx: disable early-z when we have kill'sRob Clark2014-10-203-0/+10
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix potential gpu lockup with killRob Clark2014-10-204-2/+61
| | | | | | | | It seems like the hardware is unhappy if we execute a kill instruction prior to last input (ei). Probably the shader thread stops executing and the end-input flag is never set. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: comment + better fxn nameRob Clark2014-10-201-3/+5
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx: only emit dirty constsRob Clark2014-10-202-5/+9
| | | | | | | | If app only updates (for example) vertex uniforms, it would be nice to only re-emit those and not also frag uniforms. Means we need to mark the first frag shader const buffer dirty after a clear. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx: more layer/level fixesRob Clark2014-10-203-8/+14
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: large const supportRob Clark2014-10-155-13/+33
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno: update generated headersRob Clark2014-10-154-5/+10
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno: fix layer_strideRob Clark2014-10-151-1/+1
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno: inline fd_draw_emit()Rob Clark2014-10-152-49/+47
| | | | | | Manual LTO Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: optimize shader key comparisionRob Clark2014-10-155-40/+79
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx: refactor/optimize emitRob Clark2014-10-157-83/+125
| | | | | | | | | | | | | Because we reuse various bits of emit code (for state/vertex/prog/etc) for both regular draws and internal draws (gmem<->mem, clear, etc), the number of parameters getting passed around has been growing. Refactor to group these into fd3_emit. This simplifies fxn signatures, avoids passing around shader key on the stack, etc. It also gives us a nice place to cache shader-variant lookup to avoid looking up shader variants multiple times per draw (without having to *also* pass them around as fxn args everywhere). Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx: refactor vertex state emitRob Clark2014-10-1511-79/+83
| | | | | | | | | | | | | | Get rid of fd3_vertex_buf and use fd_vertex_state directly for all draws. Removes a tiny bit of CPU overhead for munging around the vertex state every time it is emitted, but more importantly it cleans things up for later optimizations, so the emit paths don't have to special case internal draws (gmem<->mem, clears, etc) with regular draws. Instead of constructing fd3_vertex_buf array each time for internal draws, and context init time pre-create solid_vbuf_state and blit_vbuf_state. Signed-off-by: Rob Clark <[email protected]>
* freedreno: use tgsi_loweringRob Clark2014-10-148-1673/+6
| | | | | | | Now that the freedreno_lowering code is moved to tgsi_lowering, remove our private copy and switch over to using the common version. Signed-off-by: Rob Clark <[email protected]>
* freedreno: query fixesRob Clark2014-10-033-8/+13
| | | | | | | Fixes a few issues, including a potential empty-IB (which triggers gpu hangs in piglit occlusion_query_meta_no_fragments) Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx: handle VS only outputting BCOLORRob Clark2014-10-031-2/+10
| | | | | | | Possibly we should map the front color to black (zeroes). But not sure there is a way to do that without generating a shader variant. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix lockups with lame FRAG shadersRob Clark2014-10-034-6/+17
| | | | | | | | | | | | | | | | | | | | | | | | Shaders like: FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], GENERIC[0], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL TEMP[0], LOCAL IMM[0] FLT32 { 0.0000, 1.0000, 0.0000, 0.0000} 0: TEX TEMP[0], IN[0].xyyy, SAMP[0], 2D 1: MOV OUT[0], IMM[0].xyxx 2: END cause unhappyness. They have an IN[], but once this is compiled the useless TEX instruction goes away. Leaving a varying that is never fetched, which makes the hw unhappy. In the process fix a signed vs unsigned compare. If the vertex shader has max_reg=-1, MAX2() vs an unsigned would not give the desired result. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: add TXF supportIlia Mirkin2014-10-021-1/+39
| | | | | | | Still failing a bunch of the fairly picky texelFetch tests, but the 1D(Array) ones are full passes. Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/ir3: add TXD support and expose ARB_shader_texture_lodIlia Mirkin2014-10-023-9/+56
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/ir3: add texture offset supportIlia Mirkin2014-10-021-4/+45
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/ir3: shadow comes before arrayIlia Mirkin2014-10-021-2/+2
| | | | | | | Experimentally, this makes *ArrayShadow tex-miplevel-selection tests pass. Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/ir3: make TXQ return integers, not floatsIlia Mirkin2014-10-021-1/+1
| | | | | | We're still doing something wrong for array textures. Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/ir3: add UMAD supportIlia Mirkin2014-10-021-4/+15
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/ir3: add ISSG supportIlia Mirkin2014-10-021-0/+39
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/ir3: add MOD supportIlia Mirkin2014-10-021-8/+12
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/ir3: add UMOD support, based on UDIVIlia Mirkin2014-10-021-6/+31
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/ir3: add IDIV/UDIV supportIlia Mirkin2014-10-021-3/+197
| | | | | | Logic shamelessly copied from nv50 lowering pass. Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/ir3: avoid fan-in sources referring to same instructionIlia Mirkin2014-10-021-2/+10
| | | | | | | | | | Since the RA has to be done s.t. each one gets its own (adjacent) register, it would complicate matters if instructions were allowed to be repeated. This enables copy-propagation use in situations where previously that might have happened. Signed-off-by: Ilia Mirkin <[email protected]> Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx: emit all immediates in one shotRob Clark2014-10-021-8/+16
| | | | | | | Makes the command stream a bit tighter when there are lots of immediates. Signed-off-by: Rob Clark <[email protected]>
* freedreno: instanced drawing/compute not yet supportedIlia Mirkin2014-10-021-3/+3
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx: handle large shader program sizesRob Clark2014-10-021-11/+63
| | | | | | | Above a certain limit use CACHE mode instead of BUFFER mode. This should solve gpu hangs with large shader programs. Signed-off-by: Rob Clark <[email protected]>