author | Kenneth Graunke <[email protected]> | 2018-06-08 14:24:16 -0700 |
---|---|---|
committer | Kenneth Graunke <[email protected]> | 2018-06-13 02:07:58 -0700 |
commit | b8fa847c2ed9c7c743f31e57560a09fae3992f46 (patch) | |
tree | c360bd6dd8c6a19cdb9bf23e7691cb90729a572f /src | |
parent | 3c288da5eec81ee58b85927df18d9194ead8f5c2 (diff) |
intel/compiler: Properly consider UBO loads that cross 32B boundaries.

The UBO push analysis pass incorrectly assumed that all values would fit
within a 32B chunk, and only recorded a bit for the 32B chunk containing
the starting offset.

For example, if a UBO contained the following, tightly packed:

   vec4 a;  // [0, 16)
   float b; // [16, 20)
   vec4 c;  // [20, 36)

then, c would start at offset 20 / 32 = 0 and end at 36 / 32 = 1,
which means that we ought to record two 32B chunks in the bitfield.

Similarly, dvec4s would suffer from the same problem.

Reviewed-by: Rafael Antognolli <[email protected]>
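
The chunk arithmetic from the example in the commit message can be sketched as a small standalone helper. This is only an illustration, not code from the Mesa tree: `CHUNK_SIZE`, `first_chunk`, `last_chunk` and `chunk_count` are hypothetical names, and the byte range is treated as half-open, `[start, start + size)`, matching the comments in the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: push analysis tracks UBO data in 32-byte chunks. */
#define CHUNK_SIZE 32u

/* Index of the 32B chunk that contains the first byte of the value. */
static unsigned first_chunk(uint32_t start)
{
   return start / CHUNK_SIZE;
}

/* Index of the 32B chunk that contains the last byte of the value.
 * The range is [start, start + size), so the last byte is at
 * start + size - 1; size must be nonzero.
 */
static unsigned last_chunk(uint32_t start, uint32_t size)
{
   assert(size > 0);
   return (start + size - 1) / CHUNK_SIZE;
}

/* Number of 32B chunks touched by the value. */
static unsigned chunk_count(uint32_t start, uint32_t size)
{
   return last_chunk(start, size) - first_chunk(start) + 1;
}
```

With these helpers, `vec4 c` at `[20, 36)` gives `chunk_count(20, 16) == 2`, while `vec4 a` at `[0, 16)` stays within a single chunk. A 32-byte dvec4 starting at offset 16 would also touch two chunks.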
Diffstat (limited to 'src')
-rw-r--r-- | src/intel/compiler/brw_nir_analyze_ubo_ranges.c | 8 |
1 files changed, 7 insertions, 1 deletions
diff --git a/src/intel/compiler/brw_nir_analyze_ubo_ranges.c b/src/intel/compiler/brw_nir_analyze_ubo_ranges.c
index d58fe3dd2e3..6d6ccf73ade 100644
--- a/src/intel/compiler/brw_nir_analyze_ubo_ranges.c
+++ b/src/intel/compiler/brw_nir_analyze_ubo_ranges.c
@@ -141,10 +141,16 @@ analyze_ubos_block(struct ubo_analysis_state *state, nir_block *block)
          if (offset >= 64)
             continue;
 
+         /* The value might span multiple 32-byte chunks. */
+         const int bytes = nir_intrinsic_dest_components(intrin) *
+                           (nir_dest_bit_size(intrin->dest) / 8);
+         const int end = DIV_ROUND_UP(offset_const->u32[0] + bytes, 32);
+         const int regs = end - offset + 1;
+
          /* TODO: should we count uses in loops as higher benefit? */
 
          struct ubo_block_info *info = get_block_info(state, block);
-         info->offsets |= 1ull << offset;
+         info->offsets |= ((1ull << regs) - 1) << offset;
          info->uses[offset]++;
       }
    }
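
The mask expression in the patch, `((1ull << regs) - 1) << offset`, sets a run of consecutive bits starting at the first chunk. A minimal standalone sketch of that idea follows; `chunk_mask` is a hypothetical helper, not a Mesa function, and it takes the run length as an explicit chunk count. Note that in the patch as written, `end` is a rounded-up (exclusive) chunk bound, so `end - offset + 1` appears to mark one chunk more than the value strictly touches; the sketch leaves the choice of count to the caller.

```c
#include <stdint.h>

/* Bitmask with `count` consecutive bits set, starting at bit `first`.
 * Bits that would land above bit 63 are dropped.
 */
static uint64_t chunk_mask(unsigned first, unsigned count)
{
   if (count == 0 || first >= 64)
      return 0;

   /* Shifting a 64-bit value by 64 or more is undefined in C, so a
    * full-width run is handled separately.
    */
   const uint64_t run =
      count >= 64 ? UINT64_MAX : (UINT64_C(1) << count) - 1;

   return run << first;
}
```

For a value that starts in chunk 0 and spans two chunks, `chunk_mask(0, 2)` yields `0x3`, whereas the pre-patch expression `1ull << offset` would have yielded only `0x1`.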