summaryrefslogtreecommitdiffstats
path: root/src/freedreno
Commit message (Collapse)AuthorAgeFilesLines
* freedreno/ir3: remove unused parameterRob Clark2019-11-121-4/+4
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: legalize cleanupsRob Clark2019-11-121-1/+7
| | | | | | | | | We can clear the "needs" flags once we emit a flag. And also, don't open-code the opcode name. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* freedreno/ir3: fix gpu hang with pre-fs-tex-fetchRob Clark2019-11-122-10/+32
| | | | | | | | | | | | | | | | | | | | | | For pre-fs-dispatch texture fetch, we need to assign bary_ij to r0.x, even if it is not used in the shader (ie. only varying use is for tex coords). But if, for example, gl_FragCoord is used, it could get assigned on top of bary_ij, resulting in a GPU hang. The solution to this is two-fold: (1) the inputs/outputs rework has the benefit of making RA realize bary_ij is a vec2, even if there are no split/collect instructions (due to no varying fetches in the shader itself). And (2) extend the live ranges of meta:input instructions to the first non-input, to prevent RA from assigning the same register to multiple inputs. Backport note: because of (1) above, a better solution for 19.3 would be to revert f30c256ec05. Fixes: f30c256ec05 ("freedreno/ir3: enable pre-fs texture fetch for a6xx") Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* freedreno/ir3: only tex instructions have wrmaskRob Clark2019-11-121-6/+3
| | | | | | | | | | | At the ir3 level, we would assume that we could use wrmask to mask off other components of an instruction returning a vecN when they are not used. Which would let RA use components not written for other live values. But this is only true for tex instructions. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* freedreno/ir3: re-work shader inputs/outputsRob Clark2019-11-125-270/+192
| | | | | | | | | | | | | | | | | | | Allow inputs/outputs to be vecN (ie. whatever their actual size is), and use split to get scalar components of inputs, and collect to gather up scalar components of outputs. The main motivation is to simplify RA, by only having to consider split/ collect to figure out where values need to land in consecutive scalar registers, rather than having to also deal with left/right neighbors. Because of varying packing, and the resulting fractional location (location_frac), to implement load_input/store_output, it is still convenient to have a table of scalar inputs/outputs. We move this to the compile ctx (since it is only needed for nir->ir3). Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* freedreno/ir3: simplify creating sysval inputsRob Clark2019-11-121-90/+58
| | | | | | | | | | | | In almost all places, the add_sysval_input() is paired directly with a create_input(). (The one exception is frag shader ij bary coord, and this exception will go away in a later patch.) So go ahead and clean this up before reworking input/output handling. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* freedreno/ir3: remove first-vertex sysvalRob Clark2019-11-121-1/+0
| | | | | | | | | This is a driver-param (loaded from uniform), not a sysval (populated by hw into a register). So it has no value to having a sysval slot. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* freedreno/ir3: helper to print ir if debug enabledRob Clark2019-11-122-28/+16
| | | | | | Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* freedreno/ir3: show input/output wrmask's in disasmRob Clark2019-11-121-2/+9
| | | | | | | | | Currently it is always 0x1 (scalar), but that will change in a later patch. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* freedreno/ir3: add input/output iteratorsRob Clark2019-11-1211-63/+49
| | | | | | | | We can at least get rid of the if-not-NULL check in a bunch of places. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* freedreno/ir3: remove impossible conditionRob Clark2019-11-121-3/+0
| | | | | | | | | We keep kill's alive w/ keeps these days, rather than a fake output. This condition was left over from prior to that change. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* freedreno/ir3: rename fanin/fanout to collect/splitRob Clark2019-11-1210-44/+48
| | | | | | | | | | | If I'm going to refactor a bit to use these meta instructions to also handle input/output, then might as well cleanup the names first. Nouveau also uses collect/split for names of these meta instructions, and I like those names better. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* freedreno/ir3: remove half-precision outputRob Clark2019-11-121-30/+0
| | | | | | | | | | | | | This doesn't really work, we can't necessarily just change the outputs to half-precision like this in anything but simple cases. Keep the shader key entry around though, eventually with proper mediump support we could use this with a nir pass to use lower precision frag shader outputs when the render target format has <= 16b/component. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* freedreno/ir3: fix valgrind complaint with STLWRob Clark2019-11-121-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The instruction has 3 src regs, so `instr->regs[0..3]` are valid, but `instr->regs[4]` is not. ``` Test case 'dEQP-GLES31.functional.shaders.linkage.es31.tessellation.varying.rules.output_superfluous_declaration'.. ==29239== Invalid read of size 8 ==29239== at 0x5BE9CDC: emit_cat6 (ir3.c:841) ==29239== by 0x5BEA1BF: ir3_assemble (ir3.c:921) ==29239== by 0x5BDF0A7: ir3_shader_assemble (ir3_shader.c:133) ==29239== by 0x5BDF193: assemble_variant (ir3_shader.c:162) ==29239== by 0x5BDF407: create_variant (ir3_shader.c:215) ==29239== by 0x5BDF4DB: shader_variant (ir3_shader.c:241) ==29239== by 0x5BDF553: ir3_shader_get_variant (ir3_shader.c:257) ==29239== by 0x5BA85F7: ir3_shader_variant (ir3_gallium.c:80) ==29239== by 0x5BA7703: ir3_cache_lookup (ir3_cache.c:96) ==29239== by 0x5B8B8B3: fd6_emit_get_prog (fd6_emit.h:119) ==29239== by 0x5B8C137: fd6_draw_vbo (fd6_draw.c:186) ==29239== by 0x5BB1FBB: fd_draw_vbo (freedreno_draw.c:290) ==29239== Address 0xb97f2d0 is 0 bytes after a block of size 240 alloc'd ==29239== at 0x4848D54: malloc (in /usr/lib/aarch64-linux-gnu/valgrind/vgpreload_memcheck-arm64-linux.so) ==29239== by 0x61BD35B: ralloc_size (ralloc.c:119) ==29239== by 0x61BD41B: rzalloc_size (ralloc.c:151) ==29239== by 0x5BE599B: ir3_alloc (ir3.c:45) ==29239== by 0x5BEA583: instr_create (ir3.c:984) ==29239== by 0x5BEA5DF: ir3_instr_create2 (ir3.c:1000) ==29239== by 0x5BEE317: ir3_STLW (ir3.h:1431) ==29239== by 0x5BF12D3: emit_intrinsic_store_shared_ir3 (ir3_compiler_nir.c:903) ==29239== by 0x5BF418B: emit_intrinsic (ir3_compiler_nir.c:1802) ==29239== by 0x5BF5D07: emit_instr (ir3_compiler_nir.c:2339) ==29239== by 0x5BF603F: emit_block (ir3_compiler_nir.c:2426) ==29239== by 0x5BF624B: emit_cf_list (ir3_compiler_nir.c:2474) ==29239== ``` Probably this only triggers in non-optimized builds? Fixes: 1f3b52ce503 ("freedreno/a6xx: Add register offset for STG/LDG") Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* freedreno: add Adreno 640 IDJonathan Marek2019-11-111-0/+1
| | | | | | | A640 seems to work without any other changes (glmark and vkcube). Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno/ir3: also track # of nops for shader-dbRob Clark2019-11-092-0/+4
| | | | | | | | | | | | | The instruction count is (mostly) a measure of what optimization passes can do, while # of nops is more an indication of how effectively the scheduler is balancing register pressure vs instruction count. So track these independently. (There could be opportunities to rematerialize values to reduce register pressure, swapping some nop's with other alu instructions, so nothing is truely independent.. but it is still useful to break these stats out.) Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: sync disasm changes from envytoolsRob Clark2019-11-092-24/+94
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: remove obsolete commentRob Clark2019-11-091-4/+0
| | | | | | | | The meta PHI instruction was removed long ago. And fanin/fanout themselves to not contribute actual instructions (at least not by the time you get to sched, they may prevent copy-propagating away a mov) Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3/ra: remove ir print after livein/outRob Clark2019-11-091-1/+0
| | | | | | | The IR hasn't changed at this point, so it isn't really adding any value. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3/ra: move regs_count==0 checkRob Clark2019-11-091-9/+2
| | | | | | | | Fold it in to writes_gpr() (since a register that does not reference any registers by definition does not write a register). This lets us avoid having to handle this case in a few other places. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: ir3_print tweaksRob Clark2019-11-092-47/+102
| | | | | | Handle HALF/HIGH flags in all cases, and colorize SSA src notation. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: use SSA flag on dest register tooRob Clark2019-11-094-45/+48
| | | | | | | | | | | | We did this in some places before, but not consistantly. But it will be useful for two-pass RA, to identify which registers have already been assigned. While we are cleaning this up, use __ssa_src() and new __ssa_dst() helper more consistently. (If nothing else, this reduces the # of callers of ir3_reg_create() to audit that we didn't miss something) Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: split pre-coloring to it's own functionRob Clark2019-11-091-3/+12
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: Use regid() helper when setting up precolor regsKristian H. Kristensen2019-11-071-4/+4
| | | | | Signed-off-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/a6xx: Program state for tessellation stagesKristian H. Kristensen2019-11-071-0/+5
| | | | | | Signed-off-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/ir3: Allocate const space for tessellation parametersKristian H. Kristensen2019-11-071-0/+7
| | | | | | | | | | The tessellation stages need size and stride or the patch layout as well as locations of attributes in the patch. The tesselation stages also use two system memory BOs and need the iovas of those. Signed-off-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/ir3: Pre-color TCS header and primitive ID inputsKristian H. Kristensen2019-11-071-2/+12
| | | | | | | | | | Similar to GS, the registers are shared and not reinitialized betewen VS and TCS, so we need to make sure to allocate the same registers for the system values between stages. Signed-off-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/ir3: Don't assume binning shader is always VSKristian H. Kristensen2019-11-071-2/+2
| | | | | | | | In tessellation mode, the TES is (probably) the binning shader. Signed-off-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/ir3: Setup inputs and outputs for tessellation stagesKristian H. Kristensen2019-11-071-7/+52
| | | | | | | | | | Similar to GS, some inputs are reused when the chsh from VS to TCS or TES to GS, so we need to make sure we setup the right inputs and make the shared system values outputs so they don't get clobbered. Signed-off-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/ir3: Implement TCS synchronization intrinsicsKristian H. Kristensen2019-11-071-0/+33
| | | | | | | | | We add two new IR3 specific nir intrinsics that map to the new condend and endpatch instructions. Signed-off-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/ir3: Implement tess coord intrinsicKristian H. Kristensen2019-11-071-0/+12
| | | | | | | | | | Our lowering pass made the z component unused by replacing its uses by 1 - x - y. The intrinsic implementation then just need to return the x and y components. Signed-off-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/ir3: End TES with chsh when using GSKristian H. Kristensen2019-11-071-1/+3
| | | | | | | | | When we have both TES and GS, the TES needs to chain to the VS with chmask and chsh GS just like the VS does to either TCS or GS. Signed-off-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/ir3: Add new synchronization opcodesKristian H. Kristensen2019-11-075-1/+15
| | | | | | | | | | | | | There are two new opcodes in use in tesselation control shaders: category 0, opcodes 13 and 15. unk13 is a kill type of instruction that terminates threads where !p0.x and it used to narrow down a patch wavefront to just thread 0. Then, once thread 0 has written the tess levels, it issues unk15, which might signal the TE that another patch has been fully written. Signed-off-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/ir3: Extend geometry lowering pass to handle tessellationKristian H. Kristensen2019-11-073-8/+520
| | | | | | | | | | | | | | | | VS and TCS pass varyings the same way as VS and GS does. TCS then writes entire patch to a system memory BO and TES eventually reads back from the BO once the TE starts generating vertices. TES outputs vertices the same way as VS and GS, except when there's a GS as well, in which case TES passes varyings to GS same way the VS would. In addition, the TCS needs a little bit of control flow massaging so that it only runs for valid invocations needs a couple of unknown instructions to synchronize with the TE. Signed-off-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/ir3: Add tessellation field to shader keyKristian H. Kristensen2019-11-072-1/+34
| | | | | | | | | | Whether we're tessellating and which primitives the TES outputs affects the entire pipeline so let's add a field to the key to track that. Signed-off-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/ir3: Use imul24 in offset calculationsKristian H. Kristensen2019-11-071-2/+2
| | | | | | | | | With the imul24 opcode in place, we can now use it for computing local offsets (ie for ldlw/stlw). Signed-off-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/ir3: Add ir3 intrinsics for tessellationKristian H. Kristensen2019-11-074-0/+26
| | | | | | | | | These provide the iovas for system memory buffers used for tessellation as well as a new HW specific system value. Signed-off-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/ir3: Add load and store intrinsics for global ioKristian H. Kristensen2019-11-071-0/+49
| | | | | | | | | These intrinsics take a ivec2 for the 64 bit base address and a integer offset. Signed-off-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/a6xx: Add register offset for STG/LDGKristian H. Kristensen2019-11-075-9/+64
| | | | | | | | | | These instructions take a 64 bit iova as two conescutive registers and a immediate offset. This patch adds support for the offset to be a single register, which is added to the 64 bit iova. Signed-off-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/a6x: Rename z/s formatsKristian H. Kristensen2019-11-073-10/+10
| | | | | | | | | | What we call eRB6_Z24_UNORM_S8_UINT now is actually RB6_Z24_UNORM_S8_UINT_AS_R8G8B8A8 and RB6_X8Z24_UNORM is actually RB6_Z24_UNORM_S8_UINT. Signed-off-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/a6xx: Fix layered texture type enumKristian H. Kristensen2019-11-071-3/+4
| | | | | | | | 2D array textures and 3D textures are different enum values after all. Signed-off-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/a6xx: Clear sysmem with CP_BLITKristian H. Kristensen2019-11-071-0/+4
| | | | | | Signed-off-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/registers: Add comments about primitive countersKristian H. Kristensen2019-11-071-12/+10
| | | | | | | | Adding comments about best guess at what the counters count. Signed-off-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/registers: Move SP_PRIMITIVE_CNTL and SP_VS_VPC_DSTKristian H. Kristensen2019-11-071-28/+28
| | | | | | | | Move these two to be in order with the other VS regs. Signed-off-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/registers: Fix typoKristian H. Kristensen2019-11-071-1/+1
| | | | | | Signed-off-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* meson: move the generic symbols check arguments to a common variableEric Engestrom2019-11-051-1/+1
| | | | | Signed-off-by: Eric Engestrom <[email protected]> Reviwed-by: Dylan Baker <dylan@pnwbakers>
* meson: add variable to control the symbols checksEric Engestrom2019-11-051-1/+1
| | | | | Signed-off-by: Eric Engestrom <[email protected]> Reviwed-by: Dylan Baker <dylan@pnwbakers>
* util: rename PIPE_ARCH_*_ENDIAN to UTIL_ARCH_*_ENDIANDylan Baker2019-11-051-1/+1
| | | | | | | | | | | As requested by Tim. This was generated with: grep 'PIPE_ARCH_.*_ENDIAN' -rIl | xargs sed -ie 's@PIPE_ARCH_\(.*\)_ENDIAN@UTIL_ARCH_\1_ENDIAN@'g v2: - add this patch Reviewed-by: Eric Engestrom <[email protected]>
* util/u_endian: set PIPE_ARCH_*_ENDIAN to 1Dylan Baker2019-11-051-1/+1
| | | | | | | | | | | | This will allow it to be used as a drop in replacement for _mesa_little_endian in a number of cases. v2: - Always define PIPE_ARCH_LITTLE_ENDIAN and PIPE_ARCH_BIG_ENDIAN, define the one that reflects the host system to 1 and the other to 0 - replace all uses of #ifdef, #ifndef, and #if defined() with #if and #if ! with PIPE_ARCH_*_ENDIAN Reviewed-by: Eric Engestrom <[email protected]>
* turnip: Remove _mesa_locale_init/fini calls.Bas Nieuwenhuizen2019-10-311-3/+0
| | | | | | | | | The resulting locale is not used for Vulkan, and it is not reference counted, giving issues when multiple instances are created. CC: 19.2 19.3 <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>