summaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers/freedreno
Commit message (Collapse)AuthorAgeFilesLines
* freedreno/a4xx: add independent blend function supportRob Clark2015-08-042-8/+10
| | | | | | needed for MRT Signed-off-by: Rob Clark <[email protected]>
* freedreno/a4xx: MRT supportRob Clark2015-08-0412-132/+212
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno: move the half-precision logic into coreRob Clark2015-08-044-31/+38
| | | | | | | | Both a3xx and a4xx need the same logic to decide if half-precision can be used for blit shaders. So move it to core and simplify things a bit with a helper that considers all render targets. Signed-off-by: Rob Clark <[email protected]>
* freedreno: simplify/cleanup resource status trackingRob Clark2015-08-044-48/+71
| | | | | | | | | Collapse dirty/reading bools into status bitmask (and drop writing which should really be the same as dirty). And use 'used_resources' list for all tracking, including zsbuf/cbufs, rather than special casing the color and depth/stencil buffers. Signed-off-by: Rob Clark <[email protected]>
* freedreno: fix stream-out caps vec4->componentsRob Clark2015-08-041-2/+2
| | | | | | Should be in units of components, not vec4's Signed-off-by: Rob Clark <[email protected]>
* freedreno: small bit of cleanup about max rendertargetsRob Clark2015-08-0413-17/+40
| | | | | | | | We hard-coded 4 or 8 as the max in various places. Switch it all to a define since the limit will go up with a4xx (and maybe even again in the future?) Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: add transform-feedback supportRob Clark2015-07-274-4/+230
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: track "keeps" in irRob Clark2015-07-274-23/+17
| | | | | | | | | | | | Previously we had a fixed array to track kills, since they don't generate an SSA value, and then cheated by stuffing them in the outputs array before sending things through depth/sched/etc. But store instructions will need similar treatment. So convert this over to a more general array of instructions that must be kept and fix up the places that were previously relying on kills being in the output array. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: add support for store instructionsRob Clark2015-07-273-6/+43
| | | | | | | | | | For store instructions, the "dst" register is a read register, not a written register. (Ie. it is the address to store to.) Lets not confuse register allocation, scheduling, etc, with these details. Instead just leave a dummy instr->regs[0], and take "dst" from instr->regs[1] and srcs following. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: cleanup driver-param stuffRob Clark2015-07-273-10/+31
| | | | | | | | | Add 'enum ir3_driver_param' to track driver-param slots, and a create_driver_param() helper to avoid having the knowledge about where driver params are placed in const regs spread throughout the code as we add additional driver-params. Signed-off-by: Rob Clark <[email protected]>
* freedreno: add transform-feedback stateRob Clark2015-07-274-3/+95
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno: add resource tracking support for written buffersRob Clark2015-07-274-12/+17
| | | | | | | | With stream-out (transform-feedback) we have the case where resources are *written* by the gpu, which needs basically the same tracking to figure out when rendering must be flushed. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx+a4xx: add support for vtxcnt semanticRob Clark2015-07-274-14/+31
| | | | | | This will be used for stream-out (transform-feedback) Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: add stream-output support to cmdline compilerRob Clark2015-07-271-4/+22
| | | | | | A bit hard-coded configuration at the moment, but sufficient for now. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: drop unused create_input() argRob Clark2015-07-271-11/+8
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: move emit_const to ir3Rob Clark2015-07-2712-229/+263
| | | | | | | | | | | | | | | Details of the cmdstream packets are different between a3xx and a4xx, but the logic about the layout of const registers is the same, as that is dictated by the ir3 shader compiler. So rather than duplicating logic that is tightly coupled to ir3 between a3xx and a4xx, move this into ir3 and use per-generation callbacks for to build the cmdstream packets. This should make it easier to pass additional const regs (such as for transform feedback). And it also keeps the layout internal to ir3 in case we want to make the layout more dynamic some day. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: bit of shader API refactoringRob Clark2015-07-277-22/+25
| | | | | | | | | | | | Since for transform-feedback, we'll need more than just the TGSI tokens from the state object, just pass the entire state object to ir3_shader_create(). This also cleans things up a bit for some day in the future when we could take shader either as TGSI or directly NIR (for ex, glsl2nir or spirv2nir paths). In the same spirit, drop extra args from ir3_compile_shader_nir() (since it can anyways get what it needs from the ir3_shader_variant). Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: updated cat6 encodingRob Clark2015-07-275-113/+230
| | | | | | | Sync updated cat6 encoding from freedreno.git, needed to properly encode store instructions. Signed-off-by: Rob Clark <[email protected]>
* gallium: replace INLINE with inlineIlia Mirkin2015-07-2120-29/+29
| | | | | | | | | | | | | | | | Generated by running: git grep -l INLINE src/gallium/ | xargs sed -i 's/\bINLINE\b/inline/g' git grep -l INLINE src/mesa/state_tracker/ | xargs sed -i 's/\bINLINE\b/inline/g' git checkout src/gallium/state_trackers/clover/Doxyfile and manual edits to src/gallium/include/pipe/p_compiler.h src/gallium/README.portability to remove mentions of the inline define. Signed-off-by: Ilia Mirkin <[email protected]> Acked-by: Marek Olšák <[email protected]>
* gallium: add PIPE_CAP_MAX_SHADER_PATCH_VARYINGSMarek Olšák2015-07-161-0/+1
| | | | Reviewed-by: Ilia Mirkin <[email protected]>
* freedreno/a4xx: occlusion query supportRob Clark2015-07-101-1/+85
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno: update generated headersRob Clark2015-07-1011-57/+116
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3/sched: fixup new instr's blockRob Clark2015-07-101-0/+4
| | | | | | | | | | | | If we split addr/pred, the original instruction could have originated from a different block. If we don't fixup the block ptr we hit asserts later (in debug builds). NOTE: perhaps we don't want to try to preserve addr/pred reg's across block boundaries.. this at least needs some thought in case addr/pred writes end up inside a conditional block.. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3/ra: fix failed assert for a0/p0Rob Clark2015-07-101-0/+5
| | | | | | | | The address and predicate register are special, they don't get assigned in RA. So do a better job of ignoring them rather than hitting later asserts. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: shader-db tracesRob Clark2015-07-107-8/+67
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno: fix crash in fd_invalidate_resource()Rob Clark2015-07-101-2/+2
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno: unref old fenceRob Clark2015-07-101-1/+3
| | | | | | | | Some, but not all, state trackers will explicitly unref (and set to NULL) the previous *fence before calling pipe->flush(). So driver should use fence_ref() which will unref the old fence if not NULL. Signed-off-by: Rob Clark <[email protected]>
* android: freedreno: add missing components to the buildVarad Gautam2015-07-081-1/+4
| | | | | | | Freedreno requires {a4xx,ir3}_SOURCES and NIR to build. Signed-off-by: Varad Gautam <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* gallium: remove redundant pipe_context::fence_signalledMarek Olšák2015-07-051-1/+0
| | | | | | fence_finish(timeout=0) does the same thing Reviewed-by: Brian Paul <[email protected]>
* gallium: handle fence_finish timeout in various driversMarek Olšák2015-07-051-0/+3
| | | | | | I copied what fence_signalled does. Reviewed-by: Brian Paul <[email protected]>
* freedreno/ir3: don't be confused by eliminated indirectsRob Clark2015-07-032-0/+14
| | | | | | | | If an instruction using address register value gets eliminated, we need to remove it from the indirects list, otherwise it causes mayhem in sched for scheduling address register usage. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: sched fixes for addr register usageRob Clark2015-07-031-12/+65
| | | | | | | | | | | | | | | | A handful of fixes and cleanups: 1) If we split addr/pred, we need the newly created instruction to end up in the unscheduled_list 2) Avoid scheduling a write to the address register if there is no instruction using the address register that is otherwise ready to schedule. Note that I currently don't bother with the same logic for predicate register, since the only instructions using predicate (br/kill) don't take any other src registers, so this situation should not arise. 3) few other cosmetic cleanups Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix indirects trackingRob Clark2015-07-035-10/+23
| | | | | | | | | | cp would update instr->address but not update the indirects array resulting in sched getting confused when it had to 'spill' the address register. Add an ir3_instr_set_address() helper to set instr->address and also update ir->indirects, and update all places that were writing instr->address to use helper instead. Signed-off-by: Rob Clark <[email protected]>
* gallium/ttn: mark location specially in nir for color0-writes-allIlia Mirkin2015-07-031-0/+4
| | | | | | | | | | We need to distinguish a shader that has separate writes to each MRT from one which is supposed to write the data from MRT 0 to all the MRTs. In TGSI this is done with a property. NIR doesn't have that, so encode it as a funny location and decode on the other end. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* freedreno: use consistent version string formatTimothy Arceri2015-07-011-1/+1
| | | | Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* freedreno/ir3: cache defining instructionRob Clark2015-06-303-69/+91
| | | | | | | | | It is silly to traverse back to find first instruction that writes part of a larger "virtual" register many times per instruction (plus per use as a src to later instructions). Cache this information so we only figure it out once. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix RA issue with faninRob Clark2015-06-301-5/+15
| | | | | | | | | | | | | | | | | | | | | | | | The fanin source could be grouped, for example with shaders like: VERT DCL IN[0] DCL IN[1] DCL OUT[0], POSITION DCL OUT[1], GENERIC[9] DCL SAMP[0] DCL SVIEW[0], 2D, FLOAT DCL TEMP[0], LOCAL 0: MOV TEMP[0].xy, IN[1].xyyy 1: MOV TEMP[0].w, IN[1].wwww 2: TXF TEMP[0], TEMP[0], SAMP[0], 2D 3: MOV OUT[1], TEMP[0] 4: MOV OUT[0], IN[0] 5: END The second arg to the isaml is IN[1].w, so we need to look at the fanin source to get the correct offset. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: add ir3_shader_disasm()Rob Clark2015-06-303-120/+124
| | | | | | | | | Split out most of dump_info() from ir3_cmdline compiler into a function that can be used both by cmdline compiler and also for the disasm debug option. This way, for FD_MESA_DEBUG=disasm we also get to see intput/ output registers, etc. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a4xx: fix for sparse-samplersRob Clark2015-06-301-3/+7
| | | | | | | | | Some piglit tests, like arb_fragment_program-sparse-samplers, result in having a null samp#0 but valid samp#1. TODO: a3xx probably needs similar fix Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix crash in fail pathRob Clark2015-06-303-3/+12
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix crash in RARob Clark2015-06-301-2/+5
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fixes for indirect writesRob Clark2015-06-303-4/+12
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix constlen in case of load_uniform_indirectRob Clark2015-06-301-0/+5
| | | | | | | | We can't rely on what we get from the assembler if we have indirect addressing of constant file, since the assembler doesn't know the array index. This got lost in the transition to NIR. Signed-off-by: Rob Clark <[email protected]>
* mesa: Enable subdir-objects globally.Matt Turner2015-06-261-2/+0
| | | | Reviewed-by: Emil Velikov <[email protected]>
* freedreno/ir3: pass sz to split_dest()Rob Clark2015-06-212-5/+7
| | | | | | | | | | | | | | | | | | | | For query_levels, we generate a getinfo with writemask of (z), which RA will consider as size==3. But we were still generating four fanouts. Which meant that RA would see it as two different register classes, depending on the path to definer. Ie. on the getinfo instruction itself it would see size==3, but when chasing back through the fanouts it would see size==4. Easiest way to solve that is to just generate the chain of neighboring fanouts to have the correct size in the first place. Note: we may eventually want split_dest() to take start/end or wrmask instead, since really we only need size==1. But RA is not clever enough for that, query_levels is not that common, and the other two registers that get allocated are never used so those register slots can be immediately re-used. So bunch of work for probably no real gain. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3/nir: add more opcodesRob Clark2015-06-211-1/+8
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: only unminify txf coords on a3xxRob Clark2015-06-211-1/+9
| | | | | | Seems like a4xx gets this right. Signed-off-by: Rob Clark <[email protected]>
* freedreno: remove int sampler shader variantsRob Clark2015-06-218-104/+7
| | | | | | | | We get this information from NIR (which gets it from sview decl in tgsi when translating from tgsi), so no need to maintain shader variants for this. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: block reshuffling and loops!Rob Clark2015-06-2110-126/+1025
| | | | | | | | | | | | | | | | This shuffles things around to allow the shader to have multiple basic blocks. We drop the entire CFG structure from nir and just preserve the blocks. At scheduling we know whether to schedule conditional branches or unconditional jumps at the end of the block based on the # of block successors. (Dropping jumps to the following instruction, etc.) One slight complication is that variables (load_var/store_var, ie. arrays) are not in SSA form, so we have to figure out where to put the phi's ourself. For this, we use the predecessor set information from nir_block. (We could perhaps use NIR's dominance frontier information to help with this?) Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: a4xx encodes larger immed offsetRob Clark2015-06-214-7/+21
| | | | | | | Without this, negative branch/jump offsets look like very large positive offsets. Signed-off-by: Rob Clark <[email protected]>