author | Eric Anholt <[email protected]> | 2017-02-24 12:57:03 -0800 |
---|---|---|
committer | Eric Anholt <[email protected]> | 2017-02-24 17:01:29 -0800 |
commit | 292c24ddac5acc35676424f05291c101fcd47b3e (patch) | |
tree | 1cc326dc2c1dd5c8abd664dae0b4e1fcfa4bf373 /src/gallium/drivers/vc4/vc4_qir.h | |
parent | f06915d7b71eb955cc0db4b5555f5c6474926a01 (diff) |
vc4: Lazily emit our FS/VS input loads.
This reduces register pressure in both types of shaders, by reordering the
input loads from the var->data.driver_location order to whatever order
they appear first in the NIR shader. These instructions aren't
reorderable at our QIR scheduling level because the FS takes two in
lockstep to do an interpolation, and the VS takes multiple read
instructions in a row to get a whole vec4-level attribute read.
shader-db impact:
total instructions in shared programs: 76666 -> 76590 (-0.10%)
instructions in affected programs: 42945 -> 42869 (-0.18%)
total max temps in shared programs: 9395 -> 9208 (-1.99%)
max temps in affected programs: 2951 -> 2764 (-6.34%)
Some programs get their max temps hurt, depending on the order that the
load_input intrinsics appear, because we end up being unable to copy
propagate an older VPM read into its only use.
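The bookkeeping behind this change can be sketched in a few lines of C. This is only an illustration under stated assumptions, not the driver's actual code: `struct vs_inputs`, `vs_inputs_init()`, `note_input_load()`, `layout_offsets()`, the `emitted[]` flag array, and the `0xff` "never loaded" marker are hypothetical names invented here. Only the field names `vattr_sizes`, `vpm_input_order`, and `next_vpm_input` come from the patch. The idea shown is that each attribute is recorded the first time the shader reads it, and offsets are then assigned in that first-use order rather than in driver_location order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAX_VATTRS 8

/* Hypothetical stand-in for the relevant fields of struct vc4_compile. */
struct vs_inputs {
        uint8_t vattr_sizes[MAX_VATTRS];     /* size of each attribute, in bytes */
        uint8_t vpm_input_order[MAX_VATTRS]; /* attributes in first-use order */
        uint8_t next_vpm_input;              /* number of entries recorded so far */
        bool emitted[MAX_VATTRS];            /* assumption: tracks "already loaded" */
};

static void vs_inputs_init(struct vs_inputs *in)
{
        memset(in, 0, sizeof(*in));
}

/* Called when the shader first needs attribute `attr`. A repeated request
 * for the same attribute is a no-op, so each one is recorded exactly once.
 */
static void note_input_load(struct vs_inputs *in, uint8_t attr)
{
        assert(attr < MAX_VATTRS);
        if (in->emitted[attr])
                return;
        in->emitted[attr] = true;
        in->vpm_input_order[in->next_vpm_input++] = attr;
}

/* Assigns each loaded attribute a byte offset following first-use order.
 * Attributes that were never loaded get 0xff (an arbitrary marker chosen
 * for this sketch). Returns the total size of all loaded attributes.
 */
static uint8_t layout_offsets(const struct vs_inputs *in,
                              uint8_t offsets[MAX_VATTRS])
{
        uint8_t offset = 0;

        memset(offsets, 0xff, MAX_VATTRS);
        for (uint8_t i = 0; i < in->next_vpm_input; i++) {
                uint8_t attr = in->vpm_input_order[i];
                offsets[attr] = offset;
                offset += in->vattr_sizes[attr];
        }
        return offset;
}
```

With attribute sizes of 16, 8, and 12 bytes for attributes 0, 1, and 2, a shader that reads attribute 2 first and attribute 0 second would place attribute 2 at offset 0 and attribute 0 at offset 12, and attribute 1 would not be laid out at all.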
Diffstat (limited to 'src/gallium/drivers/vc4/vc4_qir.h')
-rw-r--r-- | src/gallium/drivers/vc4/vc4_qir.h | 7 |
1 files changed, 7 insertions, 0 deletions
diff --git a/src/gallium/drivers/vc4/vc4_qir.h b/src/gallium/drivers/vc4/vc4_qir.h
index 6469e51b051..fe86232aeb2 100644
--- a/src/gallium/drivers/vc4/vc4_qir.h
+++ b/src/gallium/drivers/vc4/vc4_qir.h
@@ -462,6 +462,13 @@ struct vc4_compile {
         uint8_t vattr_sizes[8];
 
         /**
+         * Order in which the vattrs were loaded by the program, to arrange
+         * vattr_offsets[] in the program data appropriately.
+         */
+        uint8_t vpm_input_order[8];
+        uint8_t next_vpm_input;
+
+        /**
          * Array of the VARYING_SLOT_* of all FS QFILE_VARY reads.
          *
          * This includes those that aren't part of the VPM varyings, like