path: root/src/broadcom/compiler/vir_dump.c
Commit message | Author | Age | Files | Lines
* v3d: Add missing dumping for the spill offset/size uniforms. | Eric Anholt | 2019-04-12 | 1 | -0/+8
* v3d: Move constant offsets to UBO addresses into the main uniform stream. | Eric Anholt | 2019-03-21 | 1 | -1/+3
    We'd end up with the constant offset in the uniform stream anyway, since they're bigger than small immediates. Avoids the extra uniforms and adds in the shader in favor of just adding once on the CPU.

    shader-db:
    total instructions in shared programs: 6496865 -> 6494851 (-0.03%)
    total uniforms in shared programs: 2119511 -> 2117243 (-0.11%)
* v3d: Rename v3d_tmu_config_data to v3d_unit_data. | Eric Anholt | 2019-03-21 | 1 | -6/+6
    I want to reuse this for encoding small constant UBO/SSBO offsets into the uniform stream to reduce the extra uniform loads and adds for the small constant offsets.
* v3d: Include a count of register pressure in the RA failure dumps. | Eric Anholt | 2019-03-06 | 1 | -1/+13
    You usually want to go find the highest pressure and figure out why you couldn't spill or what pattern led to a bunch of pressure leading to that point.
* v3d: Eliminate the TLB and TLBU files. | Eric Anholt | 2019-03-05 | 1 | -13/+2
    We can just use the magic register file like we do for other magic waddrs.
* v3d: Use ldunif instructions for uniforms. | Eric Anholt | 2019-03-05 | 1 | -9/+0
    The idea is that for repeated use of the same uniform, we could avoid loading it on each consumer. The results look pretty good.

    total instructions in shared programs: 6413571 -> 6521464 (1.68%)
    total threads in shared programs: 154214 -> 154000 (-0.14%)
    total uniforms in shared programs: 2393604 -> 2119629 (-11.45%)
    total spills in shared programs: 4960 -> 4984 (0.48%)
    total fills in shared programs: 6350 -> 6418 (1.07%)

    Once we do scheduling at the NIR level, the register pressure (and thus also instructions) issues we see here will drop back down.
* v3d: Switch implicit uniforms over to being any qinst->uniform != ~0. | Eric Anholt | 2019-03-05 | 1 | -11/+10
    I'm not sure why I didn't do this before -- it's clearly much simpler to add dumping of the extra thing than to have it as another implicit source.
* v3d: Fix dumping of shaders with alpha test. | Eric Anholt | 2019-02-05 | 1 | -1/+3
    We were trying to print a NULL entry from the table.
* v3d: Add support for CS shared variable load/store/atomics. | Eric Anholt | 2019-01-14 | 1 | -0/+1
    CS shared variables are handled effectively as SSBO access to a temporary buffer that will be allocated at CS dispatch time.
* v3d: Add support for CS workgroup/invocation id intrinsics. | Eric Anholt | 2019-01-14 | 1 | -0/+4
    We get a payload for the ivec3 workgroup and an int local invocation index, and we use the core lowering to turn into the global invocation id and the local invocation id ivec3s.
* v3d: Add support for shader_image_load_store. | Eric Anholt | 2019-01-14 | 1 | -0/+19
    This is only exposed on V3D 4.1+, because we didn't have the TMU write operations for images on 3.3 (To do GLES 3.1 there, you have to lower it to SSBO load/stores, which is a problem to solve later).
* v3d: Add SSBO/atomic counters support. | Eric Anholt | 2019-01-14 | 1 | -0/+8
    So far I assume that all the buffers get written. If they weren't, you'd probably be using UBOs instead.
* v3d: Move uniform pretty-printing to its own helper function. | Eric Anholt | 2018-12-14 | 1 | -71/+76
    I want to reuse it in the QPU dump.
* v3d: Add VIR dumping of TMU config p0/p1. | Eric Anholt | 2018-12-07 | 1 | -0/+12
    I had a bit of it for V3D 3.x, but didn't update it for 4.x.
* v3d: Simplify VIR uniform dumping using a temporary. | Eric Anholt | 2018-12-07 | 1 | -19/+10
* v3d: Implement a small immediates optimization, based on VC4's. | Eric Anholt | 2018-07-23 | 1 | -9/+18
    We can do one per instruction, and we have to be careful not to overwrite raddr_b, but this greatly reduces the pressure on uniform loads (particularly around ldvpm/stvpm instructions).

    total instructions in shared programs: 90768 -> 88220 (-2.81%)
    instructions in affected programs: 82711 -> 80163 (-3.08%)
* broadcom/vc5: Fix extraneous register index in QIR dumping of TLBU writes. | Eric Anholt | 2018-03-26 | 1 | -0/+1
    Just like TLB without a config uniform, we don't have a register index.
* broadcom/vc5: Don't annotate dumps with stale live intervals. | Eric Anholt | 2018-03-19 | 1 | -2/+2
    As you're debugging register allocation, you may have changed the intervals and not recomputed yet. Just skip the dump in that case.
* broadcom/vc5: Add support for loading varyings in V3D 4.1. | Eric Anholt | 2018-01-12 | 1 | -1/+0
    The LDVARY signal now writes an arbitrary register, so I took out the magic src register file and replaced it with an instruction with LDVARY set so we have somewhere to hang a QFILE_TEMP destination for register allocation.
* broadcom/vc5: Add support for V3Dv4 signal bits. | Eric Anholt | 2018-01-12 | 1 | -2/+43
    The WRTMUC replaces the implicit uniform loads in the first two texture instructions. LDVPM disappears in favor of an ALU op. LDVARY, LDTMU, LDTLB, and LDUNIF*RF now write to arbitrary registers, which required passing the devinfo through to a few more functions.
* broadcom: Add VC5 NIR compiler. | Eric Anholt | 2017-10-10 | 1 | -0/+339
    This is a pretty straightforward fork of VC4's NIR compiler to VC5. The condition codes, registers, and I/O have all changed, making the backend hard to share, though their heritage is still recognizable.

    v2: Move to src/broadcom/compiler to match intel's layout, rename more "vc5" to "v3d", rename QIR to VIR ("V3D IR") to avoid symbol conflicts with vc4, use new v3d_debug header, add compiler init/free functions, do texture swizzling in NIR to allow optimization.