aboutsummaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers/freedreno/freedreno_util.h
Commit message (Collapse)AuthorAgeFilesLines
* freedreno: Add a6xx backendKristian H. Kristensen2018-08-161-1/+8
| | | | | | | | | | This adds a freedreno backend for the a6xx generation GPUs, which at the time of this commit is about 98% GLES2 conformant. Much remains to be done - both performance work and feature work towards more recent GLES versions, but this is a good start. Signed-off-by: Kristian H. Kristensen <[email protected]> Signed-off-by: Rob Clark <[email protected]>
* freedreno/a5xx: perfmance countersRob Clark2018-07-181-0/+1
| | | | | | AMD_performance_monitor support Signed-off-by: Rob Clark <[email protected]>
* freedreno: get rid of noop renderRob Clark2018-07-171-0/+6
| | | | | | | This was basically to avoid a zero-dword IB (indirect-branch), but instead just don't emit the IB packet in that case. Signed-off-by: Rob Clark <[email protected]>
* freedreno: add a20xJonathan Marek2018-06-221-0/+13
| | | | | | | | | | | | | this patch adds support for a20x, which has some differences with a220: -no VGT_MAX_VTX_INDX register -no CLEAR_COLOR register -set RB_BC_CONTROL in restore (hangs without) -different CP_DRAW_INDX format tested with kmscube and glmark2 scenes, on par with a220 Signed-off-by: Jonathan Marek <[email protected]> Signed-off-by: Rob Clark <[email protected]>
* freedreno/a5xx: MSAARob Clark2018-06-211-0/+16
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/a5xx: move emit_marker5() into a5xx backendRob Clark2018-06-191-21/+0
| | | | | | | | | The scratch registers move again in a6xx.. so for post-a4xx let's just move this into the backend, and move the one place it used to be needed in core into fd5_emit_ib(). For a6xx we will do similar, calling emit_marker6() from fd6_emit_ib(). Signed-off-by: Rob Clark <[email protected]>
* freedreno/a5xx: texture tilingRob Clark2018-01-141-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | Overall a nice 5-10% gain for most games. And more for things like glmark2 texture benchmark. There are some rough edges. In particular, the hardware seems to only support tiling or component swap. (Ie. from hw PoV, ARGB/ABGR/RGBA/ BGRA are all the same format but with different component swap.) For tiled formats, only ARGB is possible. This isn't a big problem for *sampling* since we also have swizzle state there (and since util_format_compose_swizzles() already takes into account the component order, we didn't use COLOR_SWAP for sampling). But it is a problem if you try to render to a tiled BGRA (for example) surface. The next patch introduces a workaround for blitter, so we can generate tiled textures in ABGR/RGBA/BGRA, but that doesn't help the render- target case. To handle that, I think we'd need to keep track that the tiled format is different from the linear format, which seems like it would get extra fun with sampler views/etc. So for now, disabled by default, enable with FD_MESA_DEBUG=ttile. In practice it works fine for all the games I've tried, but makes piglit grumpy. Signed-off-by: Rob Clark <[email protected]>
* freedreno: add debug flag to force high priority contextRob Clark2017-12-191-0/+1
| | | | | | | Mainly for testing, FD_MESA_DEBUG=hiprio will force high priority contexts. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a5xx: add a5xx blitterRob Clark2017-12-171-0/+1
| | | | | | FD_MESA_DEBUG=noblit to disable Signed-off-by: Rob Clark <[email protected]>
* freedreno: add debug option to force emulated indirectRob Clark2017-12-031-0/+1
| | | | | | Useful mostly for debugging indirect draw. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a5xx: LRZ supportRob Clark2017-06-071-0/+1
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno: reshuffle FD_MESA_DEBUG bitmaskRob Clark2017-06-071-3/+3
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/a5xx: drop WFIs in emit_marker5()Rob Clark2017-05-301-5/+0
| | | | | | | | | Results in always having at least one WFI between draws, which was slowing stk down by ~5% and ~10% in xonotic. (also drop bogus assert while we're at it.) Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: switch to NIR by defaultRob Clark2017-05-231-1/+0
| | | | | | | | | | | Now that we lower vars to regs, we no longer regress for anything that does complex dereferences. (With tgsi, derefers are already lowered before tgsi_to_nir, but not with glsl_to_nir.) In fact it actually fixes a few things to bypass tgsi. So make NIR the default (finally!) Signed-off-by: Rob Clark <[email protected]>
* freedreno: extract helper for stage->sb for a4xx+Rob Clark2017-04-171-0/+23
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno: enable draw/batch reordering by defaultRob Clark2017-04-141-1/+1
| | | | | | | | Probably should have flipped the switch a long time ago, since it doesn't seem to cause any problems and is a nice perf boost in a number of cases. Signed-off-by: Rob Clark <[email protected]>
* freedreno: add "nogrow" debug paramRob Clark2017-01-101-0/+1
| | | | | | | Sometimes it is useful to disable the "growable" cmdstream buffers for debugging. (See 419a154d in libdrm) Signed-off-by: Rob Clark <[email protected]>
* freedreno: fdN_gmem_restore_format() is not gen specificRob Clark2016-12-181-0/+1
| | | | | | | | | | Refactor out into a common helper, since this is the same across generations when we need equiv z/s gmem restore format. Next patch needs this in a5xx, rather than creating yet another helper push this into core. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a5xx: initial supportRob Clark2016-11-301-5/+111
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno: a bit of micro-optimizationRob Clark2016-07-301-0/+3
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno: shadow textures if possible to avoid stall/flushRob Clark2016-07-301-0/+6
| | | | | | | | | | | | | | To make batch re-ordering useful, we need to be able to create shadow resources to avoid a flush/stall in transfer_map(). For example, uploading new texture contents or updating a UBO mid-batch. In these cases, we want to clone the buffer, and update the new buffer, leaving the old buffer (whose reference is held by cmdstream) as a shadow. This is done by blitting the remaining other levels (and whatever part of current level that is not discarded) from the old/shadow buffer to the new one. Signed-off-by: Rob Clark <[email protected]>
* freedreno: spiff up some debug tracesRob Clark2016-07-301-0/+1
| | | | | | | Make it easier to track batches, to ensure things happen properly when they are reordered. Signed-off-by: Rob Clark <[email protected]>
* freedreno: add batch-cache and batch reorderingRob Clark2016-07-301-0/+1
| | | | | | | | | | | | Note that I originally also had a entry-point that would construct a key and do lookup from a pipe_surface. I ended up not needing that (yet?) but it is easy-enough to re-introduce later if we need it for the blit path. For now, not enabled by default, but can be enabled (on a3xx/a4xx) with FD_MESA_DEBUG=reorder. Signed-off-by: Rob Clark <[email protected]>
* freedreno: dynamically sized/growable cmd buffersRob Clark2016-07-301-13/+18
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno: introduce fd_batchRob Clark2016-07-301-5/+4
| | | | | | | | | | | | | | | | | | | Introduce the batch object, to track a batch/submit's worth of ringbuffers and other bookkeeping. In this first step, just move the ringbuffers into batch, since that is mostly uninteresting churn. For now there is just a single batch at a time. Note that one outcome of this change is that rb's are allocated/freed on each use. But the expectation is that the bo pool in libdrm_freedreno will save us the GEM bo alloc/free which was the initial reason to implement a rb pool in gallium. The purpose of the batch is to eventually facilitate out-of-order rendering, with batches associated to framebuffer state, and tracking the dependencies on other batches. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: add support for NIR as preferred IRRob Clark2016-05-151-0/+1
| | | | | | For now under debug flag, since only suitable for debugging/testing. Signed-off-by: Rob Clark <[email protected]>
* freedreno: add some debug_asserts() to catch insane offsetsRob Clark2016-05-041-0/+2
| | | | | | | Ofc won't catch *all* faults, but at least helpful for catching offsets which are completely bogus. Signed-off-by: Rob Clark <[email protected]>
* freedreno: add flag to enable dEQP hacksRob Clark2016-04-131-0/+1
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno: per-generation OUT_IB packetRob Clark2016-01-181-3/+3
| | | | | | | | | | Some a4xx firmware doesn't implement the "PFD" (prefetch-disabled) version of the CP_INDIRECT_BUFFER packet. So allow for PFD vs PFE per generation. Switch a3xx and a4xx over to using prefetch-enabled version (which is also what blob does.. it seems only on a2xx we cannot use PFE). Signed-off-by: Rob Clark <[email protected]>
* freedreno: add debug option to dirty state after drawRob Clark2015-10-151-1/+2
| | | | | | Similar to "dclear", "ddraw" will mark all state dirty after each draw. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx+a4xx: add texture buffer object supportRob Clark2015-08-121-0/+16
| | | | | | | | | Basic texture buffer support. Should be straightforward to add first/ last_element support. And with a bit of work in ir3 emulate larger texture buffer sizes. But this seems to be enough for stk gl31 render paths. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a4xx: add s8/z32/z32_s8x24 supportRob Clark2015-08-101-1/+1
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/a4xx: MRT supportRob Clark2015-08-041-1/+1
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno: move the half-precision logic into coreRob Clark2015-08-041-0/+36
| | | | | | | | Both a3xx and a4xx need the same logic to decide if half-precision can be used for blit shaders. So move it to core and simplify things a bit with a helper that considers all render targets. Signed-off-by: Rob Clark <[email protected]>
* freedreno: small bit of cleanup about max rendertargetsRob Clark2015-08-041-0/+6
| | | | | | | | We hard-coded 4 or 8 as the max in various places. Switch it all to a define since the limit will go up with a4xx (and maybe even again in the future?) Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: move emit_const to ir3Rob Clark2015-07-271-0/+1
| | | | | | | | | | | | | | | Details of the cmdstream packets are different between a3xx and a4xx, but the logic about the layout of const registers is the same, as that is dictated by the ir3 shader compiler. So rather than duplicating logic that is tightly coupled to ir3 between a3xx and a4xx, move this into ir3 and use per-generation callbacks for to build the cmdstream packets. This should make it easier to pass additional const regs (such as for transform feedback). And it also keeps the layout internal to ir3 in case we want to make the layout more dynamic some day. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: shader-db tracesRob Clark2015-07-101-0/+1
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: move inputs/outputs to shaderRob Clark2015-06-211-2/+2
| | | | | | | | | These belong in the shader, rather than the block. Mostly a lot of churn and nothing too interesting. But splitting this out from the rest of ir3_block reshuffling to cut down the noise in the later patch. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: dump nocp optionRob Clark2015-06-211-1/+0
| | | | | | No longer used, or even possible, with NIR frontend. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: remove tgsi f/eRob Clark2015-06-211-1/+0
| | | | | | Also remove ir3_flatten which was only used by tgsi f/e. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: drop dot graph dumpingRob Clark2015-06-211-1/+0
| | | | | | | | At least for now.. right now the instruction and instruction list printing should suffice, and the re-working of ir3_block would require a lot of changes in that code. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: add NIR compilerRob Clark2015-04-051-0/+1
| | | | | | | | | | | | | | | | The NIR compiler frontend is an alternative to the TGSI f/e, producing the same ir3 IR and using the same backend passes for scheduling, etc. It is not enabled by default yet, as there are still some regressions. To enable, use 'FD_MESA_DEBUG=nir'. It is enough to use with, for example, xonotic or supertuxkart. With the NIR f/e, scalarizing and a number of other lowering steps happen in NIR, so we don't have to do them in ir3. Which simplifies the f/e and allows the lowered instructions to pass through other optimization stages. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: remove old compilerRob Clark2015-03-151-1/+0
| | | | | | | Now that piglit is no longer falling back to old compiler for any tests, we can remove it. Hurray \o/ Signed-off-by: Rob Clark <[email protected]>
* freedreno: replace glsl130 debug flag with glsl120Rob Clark2015-03-081-1/+1
| | | | | | | | | | | | | | Now that relative-dst works, we should never fall back to the old compiler. (Which is almost true, other than a couple edge case sched fails in piglit). So replace glsl130 flag to force GLSL 130 and integers on a3xx/a4xx with a glsl120 flag to force GLSL 120 and !integers. If this commit breaks any game/app/etc use FD_MESA_DEBUG=glsl120 as a workaround and please let me know. Signed-off-by: Rob Clark <[email protected]>
* freedreno: drop ARRAY_SIZE macroRob Clark2015-02-251-2/+0
| | | | | | | | | | | | | | | | | Since now ARRAY_SIZE has been added to util/macros.h. Fixes a bunch of: freedreno_util.h:79:0: warning: "ARRAY_SIZE" redefined #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0])) ^ In file included from ../../../../src/gallium/include/pipe/p_compiler.h:36:0, from ../../../../src/gallium/include/pipe/p_context.h:31, from freedreno_context.h:32, from freedreno_context.c:29: ../../../../src/util/macros.h:29:0: note: this is the location of the previous definition # define ARRAY_SIZE(x) (sizeof(x) / sizeof(*(x))) ^ Signed-off-by: Rob Clark <[email protected]>
* freedreno: pass number of instances to drawIlia Mirkin2015-02-191-2/+4
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno: fix signed vs unsigned lolsRob Clark2014-12-031-1/+1
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno: rename a couple debug flagsRob Clark2014-10-251-2/+2
| | | | | | | | | dscis -> noscis dbypass -> nobypass a bit more consistant w/ nobin, etc. And IMO a bit more sensible names. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: add debug flag to disable cpRob Clark2014-10-201-0/+1
| | | | | | FD_MESA_DEBUG=nocp will disable copy propagation pass. Signed-off-by: Rob Clark <[email protected]>
* freedreno: query fixesRob Clark2014-10-031-1/+5
| | | | | | | Fixes a few issues, including a potential empty-IB (which triggers gpu hangs in piglit occlusion_query_meta_no_fragments) Signed-off-by: Rob Clark <[email protected]>