aboutsummaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers/freedreno/Makefile.sources
Commit message (Collapse)AuthorAgeFilesLines
* freedreno/ir3: introduce ir3_compiler objectRob Clark2015-06-211-0/+1
| | | | | | | | Right now, just provides a cleaner way to get at the gpu-id, given the separation between compiler and context. But we will need this also to hold the reg-set for new register allocation. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: remove tgsi f/eRob Clark2015-06-211-2/+0
| | | | | | Also remove ir3_flatten which was only used by tgsi f/e. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: drop dot graph dumpingRob Clark2015-06-211-1/+1
| | | | | | | | At least for now.. right now the instruction and instruction list printing should suffice, and the re-working of ir3_block would require a lot of changes in that code. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3/nir: lower if/elseRob Clark2015-04-171-0/+2
| | | | | | | For now, completely flatten if/else blocks. That will almost certainly change once we have flow control. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: add NIR compilerRob Clark2015-04-051-0/+1
| | | | | | | | | | | | | | | | The NIR compiler frontend is an alternative to the TGSI f/e, producing the same ir3 IR and using the same backend passes for scheduling, etc. It is not enabled by default yet, as there are still some regressions. To enable, use 'FD_MESA_DEBUG=nir'. It is enough to use with, for example, xonotic or supertuxkart. With the NIR f/e, scalarizing and a number of other lowering steps happen in NIR, so we don't have to do them in ir3. Which simplifies the f/e and allows the lowered instructions to pass through other optimization stages. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: remove old compilerRob Clark2015-03-151-1/+0
| | | | | | | Now that piglit is no longer falling back to old compiler for any tests, we can remove it. Hurray \o/ Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: simplify RARob Clark2015-01-071-2/+2
| | | | | | | | | | | | | | | | | | | | | | Group inputs/outputs, in addition to fanin/fanout, as they must also exist in sequential scalar registers. This lets us simplify RA by working in terms of neighbor groups. NOTE: has the slight problem that it can't optimize out mov's for things like: MOV OUT[n], IN[m] To avoid this, instead of trying to figure out what mov's we can eliminate, we first remove all mov's prior to grouping, and then re-insert mov's as needed while grouping inputs/outputs/fanins. Eventually we'd prefer the frontend to not insert extra mov's in the first place (so we don't have to bother removing them). This is the plan for an eventual NIR based frontend, so separate out the instr grouping (which will still be needed for NIR frontend) from the mov elimination (which won't). Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: split out legalize passRob Clark2014-12-231-0/+1
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/a4xx: fd4_util -> fd4_formatRob Clark2014-12-041-2/+2
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx: fd3_util -> fd3_formatIlia Mirkin2014-11-291-2/+2
| | | | | | | All the "util" helpers are actually format-related Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno: add missing headers in Makefile.sourcesEmil Velikov2014-11-161-1/+14
| | | | | | ... or autotools will fail to pick them up for the distribution tarball. Signed-off-by: Emil Velikov <[email protected]>
* freedreno: add adreno 420 supportRob Clark2014-11-151-0/+14
| | | | | | | | Very initial support. Basic stuff working (es2gears, es2tri, and maybe about half of glmark2). Expect broken stuff. Still missing: mem->gmem (restore), queries, mipmaps (blob segfaults!), hw binning, etc. Signed-off-by: Rob Clark <[email protected]>
* freedreno: use tgsi_loweringRob Clark2014-10-141-2/+0
| | | | | | | Now that the freedreno_lowering code is moved to tgsi_lowering, remove our private copy and switch over to using the common version. Signed-off-by: Rob Clark <[email protected]>
* gallium/freedreno: ship all files in the tarballEmil Velikov2014-09-051-12/+63
| | | | | | | | | | | - include all headers in Makefile.sources - sort the list(s) - bundle the android build Cc: [email protected] Cc: Rob Clark <[email protected]> Signed-off-by: Emil Velikov <[email protected]> Acked-by: Matt Turner <[email protected]>
* freedreno/ir3: split out shader compiler from a3xxRob Clark2014-07-251-11/+14
| | | | | | | | | | | | | | | | | | | | | | Move the bits we want to share between generations from fd3_program to ir3_shader. So overall structure is: fdN_shader_stateobj -> ir3_shader -> ir3_shader_variant -> ir3 |- ... \- ir3_shader_variant -> ir3 So the ir3_shader becomes the topmost generation neutral object, which manages the set of variants each of which generates, compiles, and assembles it's own ir. There is a bit of additional renaming to s/fd3_compiler/ir3_compiler/, etc. Keep the split between the gallium level stateobj and the shader helper object because it might be a good idea to pre-compute some generation specific register values (ie. anything that is independent of linking). Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx: occlusion query supportRob Clark2014-05-131-0/+1
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno: add support for hw queriesRob Clark2014-05-131-0/+1
| | | | | | | | | | | Real GPU queries need some infrastructure to track samples per tile and accumulate the results. But fortunately this can be shared across GPU generation. See: https://github.com/freedreno/freedreno/wiki/Queries#hardware-queries Signed-off-by: Rob Clark <[email protected]>
* freedreno/query: allow multiple query implementationsRob Clark2014-05-131-0/+1
| | | | | | | | Split out fd_query into an abstract base class, to allow multiple implementations. The current sw based queries are moved into fd_sw_query. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx: drop hand-coded blit/solid shadersRob Clark2014-02-231-0/+1
| | | | | | | | | Instead in the common code, construct these shaders from TGSI. For now we let a2xx keep it's hand coded shaders, as it's compiler isn't quite up to the job yet. All the same it is a net drop in code size and gets rid of special cases. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx/compiler: new compilerRob Clark2014-02-031-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The new compiler generates a dependency graph of instructions, including a few meta-instructions to handle PHI and preserve some extra information needed for register assignment, etc. The depth pass assigned a weight/depth to each node (based on sum of instruction cycles of a given node and all it's dependent nodes), which is used to schedule instructions. The scheduling takes into account the minimum number of cycles/slots between dependent instructions, etc. Which was something that could not be handled properly with the original compiler (which was more of a naive TGSI translator than an actual compiler). The register assignment is currently split out as a standalone pass. I expect that it will be replaced at some point, once I figure out what to do about relative addressing (which is currently the only thing that should cause fallback to old compiler). There are a couple new debug options for FD_MESA_DEBUG env var: optmsgs - enable debug prints in optimizer optdump - dump instruction graph in .dot format, for example: http://people.freedesktop.org/~robclark/a3xx/frag-0000.dot.png http://people.freedesktop.org/~robclark/a3xx/frag-0000.dot At this point, thanks to proper handling of instruction scheduling, the new compiler fixes a lot of things that were broken before, and does not appear to break anything that was working before[1]. So even though it is not finished, it seems useful to merge it in it's current state. [1] Not merged in this commit, because I'm not sure if it really belongs in mesa tree, but the following commit implements a simple shader emulator, which I've used to compare the output of the new compiler to the original compiler (ie. run it on all the TGSI shaders dumped out via ST_DEBUG=tgsi with various games/apps): https://github.com/freedreno/mesa/commit/163b6306b1660e05ece2f00d264a8393d99b6f12 Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx/compiler: split out old compilerRob Clark2014-02-031-0/+1
| | | | | | | | For the time being, keep old compiler as fallback for things that the new compiler does not support yet. Split out as it's own commit to make the later new-compiler commits easier to follow. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx/compiler: prepare for new compilerRob Clark2014-02-031-1/+1
| | | | | | Shuffle things around to prepare for new compiler. Signed-off-by: Rob Clark <[email protected]>
* freedreno: add tgsi lowering passRob Clark2014-02-011-0/+1
| | | | | | | | | | | | | | | | Currently lowers the following instructions: DST, XPD, SCS, LRP, FRC, POW, LIT, EXP, LOG, DP4, DP3, DPH, DP2 translating these into equivalent simpler TGSI instructions. This probably should be moved to util so other drivers can use it, but just adding under freedreno for now so that I can clear out a lot of the lowering code in a3xx compiler before beginning to add new compiler. Signed-off-by: Rob Clark <[email protected]>
* freedreno: add basic query supportRob Clark2014-01-081-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add for now some simple/basic query support (ie. things not actually requiring the GPU). Might change around a bit when I actually add GPU queries, but for now this enables some useful performance info in the GALLIUM_HUD. For example: GALLIUM_HUD=fps+batches+batches-sysmem+batches-gmem+restores,draw-calls The driver specific specific queries are: + draw-calls + batches - number of batches per second, sum of batches-sysmem plus batches-gmem + batches-gmem - render a set of tiles in GMEM, for each tile (optionally) system mem -> gmem (restore), plus N draws, plus gmem -> system mem (resolve) per second + batches-sysmem - N draws to system memory (GMEM bypass) per second + restores - number of GMEM batches that required restore per second Ideally for GMEM rendering, you want batches-gmem to equal fps. If the app is doing something that triggers multiple passes (ie. requires extra round trip gmem <-> system memory) then the # of batches per second will go up relative to fps. Signed-off-by: Rob Clark <[email protected]>
* freedreno: compact a2xx and a3xx makefiles into parent onesJohannes Obermayr2013-11-161-0/+32
| | | | | | | | | Nearly everything within the three Makefile.am's is identical. Let's simplify things a little. v2: Rebase and rewrite the commit message (Emil Velikov) Signed-off-by: Emil Velikov <[email protected]>
* freedreno: consolidate C sources list into Makefile.sourcesEmil Velikov2013-10-011-0/+11
Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Tom Stellard <[email protected]>