path: root/src/broadcom/Makefile.sources
Commit message [Author, Age, Files, Lines]
* v3d: Use the new lower_to_scratch implementation for indirects on temps. [Eric Anholt, 2019-04-12, 1 file, -0/+1]
  We can use the same register spilling infrastructure for our loads/stores of indirect access of temp variables, instead of doing an if ladder. Cuts 50% of instructions and max-temps from 2 KSP shaders in shader-db. Also causes several other KSP shaders with large bodies and large loop counts to not be force-unrolled.
  The change was originally motivated by NOLTIS slightly modifying register pressure in piglit temp mat4 array read/write tests, triggering register allocation failures.
* v3d: Add an optimization pass for redundant flags updates. [Eric Anholt, 2019-04-11, 1 file, -0/+1]
  Our exec masking introduces lots of redundant flags updates, and even without that there will be cases where NIR comparisons on the same sources for different reasons may generate the same comparison instruction before the selection.
  total instructions in shared programs: 6492930 -> 6460934 (-0.49%)
  total uniforms in shared programs: 2117460 -> 2115106 (-0.11%)
  total spills in shared programs: 4983 -> 4987 (0.08%)
  total fills in shared programs: 6408 -> 6416 (0.12%)
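The idea in the commit above can be illustrated with a minimal sketch. This is not the actual v3d pass: the `Inst` record, the opcode names, and the single-block scope are assumptions made for illustration. It drops a flags-setting comparison when the flags already hold the result of the same comparison on the same, unmodified sources.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Inst:
    # Hypothetical instruction record, not the real VIR structure.
    op: str
    dst: Optional[str]
    srcs: Tuple[str, ...]
    sets_flags: bool = False


def drop_redundant_flag_updates(block: List[Inst]) -> List[Inst]:
    """Remove flags updates that recompute what the flags already hold."""
    out: List[Inst] = []
    live = None  # (op, srcs) of the comparison currently held in the flags
    for inst in block:
        if inst.sets_flags:
            key = (inst.op, inst.srcs)
            # Only drop pure flag updates; keep anything that also writes a value.
            if key == live and inst.dst is None:
                continue
            live = key
        out.append(inst)
        # Overwriting a source of the live comparison invalidates it.
        if inst.dst is not None and live is not None and inst.dst in live[1]:
            live = None
    return out


block = [
    Inst("fcmp", None, ("a", "b"), sets_flags=True),
    Inst("sel", "x", ("c", "d")),
    Inst("fcmp", None, ("a", "b"), sets_flags=True),  # redundant
    Inst("sel", "y", ("c", "d")),
    Inst("add", "a", ("a", "c")),                     # clobbers a source
    Inst("fcmp", None, ("a", "b"), sets_flags=True),  # must be kept
]
optimized = drop_redundant_flag_updates(block)
```

A real pass would also have to account for control flow and for instructions that consume or conditionally update the flags; the sketch only handles straight-line code.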
* v3d: Use ldunif instructions for uniforms. [Eric Anholt, 2019-03-05, 1 file, -1/+0]
  The idea is that for repeated use of the same uniform, we could avoid loading it on each consumer. The results look pretty good.
  total instructions in shared programs: 6413571 -> 6521464 (1.68%)
  total threads in shared programs: 154214 -> 154000 (-0.14%)
  total uniforms in shared programs: 2393604 -> 2119629 (-11.45%)
  total spills in shared programs: 4960 -> 4984 (0.48%)
  total fills in shared programs: 6350 -> 6418 (1.07%)
  Once we do scheduling at the NIR level, the register pressure (and thus also instructions) issues we see here will drop back down.
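The trade-off described in the commit above (fewer uniform loads, more instructions and live temps) can be sketched as follows. This is an illustrative model only: the tuple-based instruction format and the `lower_uniforms` helper are hypothetical, and the real compiler has to weigh reuse against register pressure.

```python
def lower_uniforms(consumers):
    """Emit one ldunif per distinct uniform and reuse its temp afterwards.

    consumers: list of (op, uniform_index) in program order.
    Returns (instructions, uniform_stream).
    """
    temp_for = {}   # uniform index -> temp holding the loaded value
    insts = []
    stream = []     # uniforms actually read, in order
    for op, index in consumers:
        if index not in temp_for:
            temp_for[index] = f"t{len(temp_for)}"
            insts.append(("ldunif", temp_for[index]))
            stream.append(index)
        insts.append((op, temp_for[index]))
    return insts, stream


consumers = [("fmul", 0), ("fadd", 1), ("fmul", 0), ("fadd", 0)]
insts, stream = lower_uniforms(consumers)
```

In this example four consumers need only two uniform reads, at the cost of two extra `ldunif` instructions and two temps that stay live, which matches the direction of the shader-db numbers quoted above.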
* v3d: Avoid duplicating limits defines between gallium and v3d core. [Eric Anholt, 2019-01-27, 1 file, -0/+1]
  We don't want to pull the compiler into every include in the gallium driver, so just make a new little header to store the limits.
* v3d: Add support for shader_image_load_store. [Eric Anholt, 2019-01-14, 1 file, -0/+1]
  This is only exposed on V3D 4.1+, because we didn't have the TMU write operations for images on 3.3 (To do GLES 3.1 there, you have to lower it to SSBO load/stores, which is a problem to solve later).
* vc4: Move the utile load/store functions to a header for reuse by v3d. [Eric Anholt, 2018-12-19, 1 file, -0/+1]
  These implementations of whole-utile load/stores would be the same for v3d, though the layouts of blocks of utiles have changed.
* v3d: Implement a small immediates optimization, based on VC4's. [Eric Anholt, 2018-07-23, 1 file, -0/+1]
  We can do one per instruction, and we have to be careful not to overwrite raddr_b, but this greatly reduces the pressure on uniform loads (particularly around ldvpm/stvpm instructions).
  total instructions in shared programs: 90768 -> 88220 (-2.81%)
  instructions in affected programs: 82711 -> 80163 (-3.08%)
* v3d: Merge the V3D 4.1 and 4.2 XML into V3D 3.3's XML. [Eric Anholt, 2018-06-29, 1 file, -2/+0]
  The XML ends up noisier if you're only looking at one version, but from the diffstat there are obvious wins in terms of deduplication. This will get even more significant if we ever support 3.2 or 4.0.
* broadcom/vc5: Add XML for V3D 4.2. [Eric Anholt, 2018-01-27, 1 file, -0/+2]
* broadcom: add missing headers to the tarball [Emil Velikov, 2018-01-18, 1 file, -2/+5]
  Signed-off-by: Emil Velikov <[email protected]>
* broadcom/vc5: Add compiler support for V3D 4.x texturing. [Eric Anholt, 2018-01-12, 1 file, -0/+1]
* broadcom/vc5: Move V3D 3.3 texturing to a separate file. [Eric Anholt, 2018-01-12, 1 file, -0/+1]
  V3D 4.x texturing changes enough that #ifdefs would just make a mess of it.
* broadcom/vc5: Move V3D 3.3 VPM write setup to a separate file. [Eric Anholt, 2018-01-12, 1 file, -0/+1]
  For V4.1 texturing, I need the V4.1 XML, so the main compiler needs to stop including V3.3 XML.
* broadcom/vc5: Move the body of CLIF dumping to a per-version file. [Eric Anholt, 2018-01-12, 1 file, -0/+4]
  I want the library's entrypoints to still be unversioned, but the actual packet dumping needs to be per-version.
* broadcom/vc5: Add XML for V3D v4.1 (BCM7278) [Eric Anholt, 2018-01-12, 1 file, -0/+2]
* broadcom/vc5: Add lowering for txf_ms to a txf on a 2x2-scaled texture. [Eric Anholt, 2017-10-30, 1 file, -0/+1]
  The HW has no native sampler support for multisample textures, but since we only need to support txf_ms and the layout is UIF, we just need to scale up the texcoords and then add in the sample.
  This drops the old TEXTURE_MSAA_ADDR special uniform, since we're treating MSAA textures as textures, rather than basically texbos like VC4 had to.
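The coordinate math behind "scale up the texcoords and then add in the sample" can be sketched as below. The commit only states the 2x2 scaling; the assumption here is that bit 0 of the sample index selects the column and bit 1 the row inside each 2x2 block, and `lower_txf_ms_coord` is a hypothetical helper name.

```python
def lower_txf_ms_coord(x: int, y: int, sample: int) -> tuple:
    """Map a 4x MSAA texel fetch to a plain fetch on a 2x2-scaled texture.

    Assumed sample placement: bit 0 -> x offset, bit 1 -> y offset.
    """
    return (2 * x + (sample & 1), 2 * y + ((sample >> 1) & 1))


# The four samples of pixel (3, 5) land on four distinct texels of one 2x2 block.
coords = [lower_txf_ms_coord(3, 5, s) for s in range(4)]
```

With this mapping a regular `txf` on the scaled texture can fetch any individual sample, so no special MSAA address uniform is needed.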
* broadcom: Add VC5 NIR compiler. [Eric Anholt, 2017-10-10, 1 file, -0/+13]
  This is a pretty straightforward fork of VC4's NIR compiler to VC5. The condition codes, registers, and I/O have all changed, making the backend hard to share, though their heritage is still recognizable.
  v2: Move to src/broadcom/compiler to match intel's layout, rename more "vc5" to "v3d", rename QIR to VIR ("V3D IR") to avoid symbol conflicts with vc4, use new v3d_debug header, add compiler init/free functions, do texture swizzling in NIR to allow optimization.
* broadcom: Add vc5 CLIF dumping [Eric Anholt, 2017-10-10, 1 file, -0/+2]
  This will be usable with "VC5_DEBUG=cl" on the vc5 driver to stream a CLIF file (the Broadcom equivalent of i965's AUB) to stderr. I haven't tested that this is actually usable with the internal CLIF-consuming tools, but it is close enough as a baseline and is useful for visually inspecting the command stream.
* broadcom: Add V3D 3.3 QPU instruction pack, unpack, and disasm. [Eric Anholt, 2017-10-10, 1 file, -0/+5]
  Unlike VC4, I've defined an unpacked instruction format with pack/unpack functions to convert to 64-bit encoded instructions. This will let us incrementally put together our instructions and validate them in a more natural way than the QPU_GET_FIELD/QPU_SET_FIELD used to.
  The pack/unpack unfortunately are written by hand. While I could define genxml for parts of it, there are many special cases (like operand order of commutative binops choosing which binop is being performed!) and it probably wouldn't come out much cleaner.
  The disasm unit test ensures that we have the same assembly format as Broadcom's internal tools, other than whitespace changes.
  v2: Fix automake variable redefinition complaints, add test to .gitignore
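A minimal sketch of the unpacked-struct-plus-pack/unpack approach described above. The field names, positions, and widths in `LAYOUT` are entirely hypothetical and do not reflect the real V3D 3.3 encoding; the point is only that validation happens on the unpacked form and that the two functions round-trip through a 64-bit word.

```python
from dataclasses import dataclass

# Hypothetical layout for illustration only: field name -> (shift, width in bits).
LAYOUT = {"src_b": (0, 6), "src_a": (6, 6), "dst": (32, 6), "op": (56, 8)}


@dataclass
class Instr:
    op: int
    dst: int
    src_a: int
    src_b: int


def pack(instr: Instr) -> int:
    """Encode an unpacked instruction into a 64-bit word, validating each field."""
    word = 0
    for name, (shift, width) in LAYOUT.items():
        value = getattr(instr, name)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word |= value << shift
    return word


def unpack(word: int) -> Instr:
    """Decode a 64-bit word back into the unpacked form."""
    return Instr(**{name: (word >> shift) & ((1 << width) - 1)
                    for name, (shift, width) in LAYOUT.items()})


example = Instr(op=0x2A, dst=5, src_a=1, src_b=63)
encoded = pack(example)
```

A table-driven layout like this cannot express the special cases the commit mentions (such as operand order selecting the operation), which is one reason the real pack/unpack code is written by hand.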
* broadcom: Introduce a v3d_debug.h header for vc5 and broadcom Vulkan. [Eric Anholt, 2017-10-10, 1 file, -0/+2]
  Unlike vc4, where the compiler and gallium driver live together, for vc5 the compiler will live up in the shared broadcom directory, and need access to the debug flags. Define a set of debug flags and helpers there, so it can be shared between compiler, vc5, and vulkan.
* broadcom/genxml: Add V3D 3.3 packet definitions. [Eric Anholt, 2017-08-18, 1 file, -0/+2]
  This will be used by the new vc5 gallium driver, and a future Vulkan driver.
* broadcom/genxml: Introduce a V3D packet/struct decoder. [Eric Anholt, 2017-07-25, 1 file, -0/+6]
  This is copied from Intel's XML decoder, modified to handle V3D's byte-oriented packets.
  v2: Squash in robher's fixes for Android
* broadcom: correct header file in BROADCOM_FILES [Andres Gomez, 2017-07-24, 1 file, -1/+1]
  This fixes `make distcheck`
  > make[3]: *** No rule to make target 'common/v3d_devinfo.h', needed by 'distdir'. Stop.
  > make[3]: Leaving directory '/home/local/mesa/src/broadcom'
  > Makefile:945: recipe for target 'distdir' failed
  > make[2]: Leaving directory '/home/local/mesa/src'
  > make[2]: *** [distdir] Error 1
  > make[1]: *** [distdir] Error 1
  Fixes: 427bbbb99c ("broadcom: Introduce a header for talking about chip revisions.")
  Cc: Emil Velikov <[email protected]>
  Signed-off-by: Andres Gomez <[email protected]>
  Reviewed-by: Emil Velikov <[email protected]>
* broadcom: Introduce a header for talking about chip revisions. [Eric Anholt, 2017-07-13, 1 file, -0/+1]
  This will be used by the VC5 driver and various shared VC4/VC5 tooling, like the XML decoder.
* vc4: Introduce XML-based packet header generation like Intel's. [Eric Anholt, 2017-06-30, 1 file, -0/+12]
  I really liked this idea, as it should help with management of packet parsing tools like the CL dump. The python script is forked off of theirs because our packets are byte-based instead of dwords, and the changes to do so while avoiding performance regressions due to unaligned accesses were quite invasive.
  v2: Fix Android.mk paths, drop shebang for python script, fix overlap detection.
  Acked-by: Jason Ekstrand <[email protected]>
  Acked-by: Kenneth Graunke <[email protected]>
  Tested-by: Rob Herring <[email protected]>
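To illustrate why byte-based packets differ from dword-based ones, here is a minimal sketch of packing bit-range fields into a byte buffer. The packet shape (an 8-bit opcode followed by a 32-bit address at byte offset 1), the LSB-first bit ordering, and both helper names are assumptions for illustration, not the generated code from the XML.

```python
def pack_field(buf: bytearray, start: int, end: int, value: int) -> None:
    """OR `value` into bits start..end (inclusive) of buf, LSB first."""
    width = end - start + 1
    if not 0 <= value < (1 << width):
        raise ValueError(f"value {value:#x} does not fit in {width} bits")
    for i in range(width):
        if (value >> i) & 1:
            bit = start + i
            buf[bit // 8] |= 1 << (bit % 8)


def pack_example_packet(opcode: int, address: int) -> bytes:
    """Hypothetical 5-byte packet: opcode in bits 0..7, address in bits 8..39."""
    buf = bytearray(5)
    pack_field(buf, 0, 7, opcode)
    pack_field(buf, 8, 39, address)
    return bytes(buf)


packet = pack_example_packet(0x1A, 0x12345678)
```

Because the 32-bit address starts at byte 1, it is never dword-aligned; writing it one byte at a time (here, one bit at a time for simplicity) sidesteps the unaligned accesses the commit message mentions.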