author Eric Anholt <[email protected]> 2014-12-10 14:56:46 -0800
committer Eric Anholt <[email protected]> 2014-12-17 19:35:13 -0800
commit e473fbe4690b5cbe3769042a4917f22559e2ba8d
tree d2c2a467d69a4713651b40bf269db9691544baab /src/gallium/drivers/vc4/vc4_qpu.h
parent ff266483fb61fd69775daf5c931ca7a56a26f4ac
vc4: Add support for turning constant uniforms into small immediates.

Small immediates have the downside of taking over the raddr B field, so you might have less chance to pack instructions together thanks to raddr B conflicts. However, they also reduce some register pressure, since they let you load 2 "uniform" values in one instruction (avoiding a previous load of the constant value to a register), and they increase some pairing for the same reason.

total uniforms in shared programs: 16231 -> 13374 (-17.60%)
uniforms in affected programs: 10280 -> 7423 (-27.79%)
total instructions in shared programs: 40795 -> 41168 (0.91%)
instructions in affected programs: 25551 -> 25924 (1.46%)

In a previous version of this patch I had a reduction in instruction count by forcing the other args alongside a SMALL_IMM to be in the A file or accumulators, but that increased register pressure and had a bug in handling FRAG_Z. In this patch I just use raddr conflict resolution, which is more expensive. I think I'd rather tweak allocation to have some way to slightly prefer good choices for files in general, rather than risk failing to register allocate by forcing things into register classes.
Diffstat (limited to 'src/gallium/drivers/vc4/vc4_qpu.h')
-rw-r--r-- src/gallium/drivers/vc4/vc4_qpu.h 1
1 files changed, 1 insertions, 0 deletions
diff --git a/src/gallium/drivers/vc4/vc4_qpu.h b/src/gallium/drivers/vc4/vc4_qpu.h
index e1307ebb57b..c9ab6344589 100644
--- a/src/gallium/drivers/vc4/vc4_qpu.h
+++ b/src/gallium/drivers/vc4/vc4_qpu.h
@@ -134,6 +134,7 @@ uint64_t qpu_load_imm_ui(struct qpu_reg dst, uint32_t val);
 uint64_t qpu_set_sig(uint64_t inst, uint32_t sig);
 uint64_t qpu_set_cond_add(uint64_t inst, uint32_t cond);
 uint64_t qpu_set_cond_mul(uint64_t inst, uint32_t cond);
+uint32_t qpu_encode_small_immediate(uint32_t i);
 bool qpu_waddr_is_tlb(uint32_t waddr);
 bool qpu_inst_is_tlb(uint64_t inst);
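
The header hunk above only adds a declaration, so the following is a minimal sketch of what an encoder with that signature might do, not the body from this commit. It assumes the small-immediate table as I understand it from the VideoCore IV documentation: codes 0..15 for the integers 0..15, codes 16..31 for -16..-1, codes 32..39 for the floats 1.0 through 128.0, and codes 40..47 for 1/256 through 1/2. The function name `sketch_encode_small_immediate` and the sentinel `SKETCH_NO_SMALL_IMM` are hypothetical names introduced for illustration.

```c
#include <stdint.h>

/* Hypothetical sentinel meaning "this constant has no small-immediate code". */
#define SKETCH_NO_SMALL_IMM (~0u)

/*
 * Map the raw 32-bit value of a constant uniform to a 6-bit small-immediate
 * code, or return SKETCH_NO_SMALL_IMM if the value is not representable.
 * The table layout is an assumption, as described in the text above.
 */
uint32_t
sketch_encode_small_immediate(uint32_t bits)
{
        /* Integers 0..15 map directly to codes 0..15. */
        if (bits <= 15)
                return bits;

        /* Integers -16..-1 (two's complement) map to codes 16..31.
         * The unsigned addition wraps: 0xfffffff0 + 32 == 16. */
        if (bits >= 0xfffffff0u)
                return bits + 32;

        /* Float powers of two have a clear sign bit and a zero mantissa,
         * so only the biased exponent in bits 23..30 needs checking. */
        if ((bits & 0x807fffffu) == 0) {
                uint32_t exp = bits >> 23;

                if (exp >= 127 && exp <= 134)   /* 1.0 .. 128.0 */
                        return 32 + (exp - 127);
                if (exp >= 119 && exp <= 126)   /* 1/256 .. 1/2 */
                        return 40 + (exp - 119);
        }

        return SKETCH_NO_SMALL_IMM;
}
```

Under these assumptions, a caller would try the encoder on each constant uniform and keep the ordinary uniform load whenever the sentinel comes back, which fits the commit message's point that only some constants can move into the raddr B field.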