author | Kenneth Graunke <[email protected]> | 2015-11-07 18:58:34 -0800 |
---|---|---|
committer | Kenneth Graunke <[email protected]> | 2015-11-14 16:41:37 -0800 |
commit | d2f089ba17c6b17823fc3d244e15c0a18108d5ce (patch) | |
tree | 9832b89b85298b9223b22e1647467ea62d1a154e /src/mesa/drivers/dri/i965/brw_fs.cpp | |
parent | 848fa3101d5077b1aecfb0886c69a7d0dd7f75bc (diff) |
i965: Introduce a MOV_INDIRECT opcode.

The geometry and tessellation control shader stages both read from
multiple URB entries (one per vertex). The thread payload contains
several URB handles which reference these separate memory segments.

In GLSL, these inputs are represented as per-vertex arrays; the
outermost array index selects which vertex's inputs to read. This
array index does not necessarily need to be constant.

To handle that, we need to use indirect addressing on GRFs to select
which of the thread payload registers has the appropriate URB handle.
(This is before we can even think about applying the pull model!)

This patch introduces a new opcode which performs a MOV from a
source using VxH indirect addressing (which allows each of the 8
SIMD channels to select distinct data.)

Based on a patch by Jason Ekstrand.

v2: Rename from INDIRECT_THREAD_PAYLOAD_MOV to MOV_INDIRECT; make it
    a bit more generic. Use regs_read() instead of hacking up the
    register allocator. (Suggested by Jason Ekstrand.)

v3: Fix regs_read() to be more accurate for small unaligned regions.
    Also rebase on Matt's work.

Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]> [v3]
Reviewed-by: Abdiel Janulgue <[email protected]> [v1]
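
Illustration (not part of the commit): the message describes a MOV whose source is selected per channel through VxH indirect addressing. The sketch below is a plain software model of that idea, not the hardware behaviour or the driver's code generation. The function name `mov_indirect`, the byte-array register file, the dword element size, and the parameter layout are all assumptions made for the example; the diff only shows that src[0] is the base register and src[2] is an immediate region length.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical model: 8 SIMD channels, as in the commit message.
constexpr unsigned SIMD_WIDTH = 8;

// Each channel reads one dword from `grf`, starting at `base_byte` plus that
// channel's own byte offset. `region_length` (bytes) bounds how far past the
// base any channel may read, which is what lets a register allocator know how
// many registers the instruction can touch.
std::array<uint32_t, SIMD_WIDTH>
mov_indirect(const std::vector<uint8_t> &grf, unsigned base_byte,
             const std::array<uint32_t, SIMD_WIDTH> &byte_offset,
             unsigned region_length)
{
    std::array<uint32_t, SIMD_WIDTH> dst{};
    for (unsigned ch = 0; ch < SIMD_WIDTH; ch++) {
        // Every channel must stay inside the declared region and the file.
        assert(byte_offset[ch] + sizeof(uint32_t) <= region_length);
        assert(base_byte + byte_offset[ch] + sizeof(uint32_t) <= grf.size());
        std::memcpy(&dst[ch], &grf[base_byte + byte_offset[ch]],
                    sizeof(uint32_t));
    }
    return dst;
}
```

In this model, a per-vertex array index would be turned into the per-channel byte offset, so channels indexing different vertices pick up different payload dwords in a single instruction.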
Diffstat (limited to 'src/mesa/drivers/dri/i965/brw_fs.cpp')
-rw-r--r-- | src/mesa/drivers/dri/i965/brw_fs.cpp | 28 |
1 files changed, 28 insertions, 0 deletions
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 80b8c8e1207..84b5920d4f5 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -840,6 +840,34 @@ fs_inst::regs_read(int arg) const
    case SHADER_OPCODE_BARRIER:
       return 1;
 
+   case SHADER_OPCODE_MOV_INDIRECT:
+      if (arg == 0) {
+         assert(src[2].file == IMM);
+         unsigned region_length = src[2].ud;
+
+         if (src[0].file == FIXED_GRF) {
+            /* If the start of the region is not register aligned, then
+             * there's some portion of the register that's technically
+             * unread at the beginning.
+             *
+             * However, the register allocator works in terms of whole
+             * registers, and does not use subnr. It assumes that the
+             * read starts at the beginning of the register, and extends
+             * regs_read() whole registers beyond that.
+             *
+             * To compensate, we extend the region length to include this
+             * unread portion at the beginning.
+             */
+            if (src[0].subnr)
+               region_length += src[0].subnr * type_sz(src[0].type);
+
+            return DIV_ROUND_UP(region_length, REG_SIZE);
+         } else {
+            assert(!"Invalid register file");
+         }
+      }
+      break;
+
    default:
       if (is_tex() && arg == 0 && src[0].file == VGRF)
          return mlen;
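
Illustration (not part of the commit): the arithmetic in the new regs_read() case can be checked in isolation. The sketch below copies the expression from the hunk above into a standalone function. The names `kRegSize`, `div_round_up`, and `mov_indirect_regs_read` are hypothetical stand-ins for REG_SIZE, DIV_ROUND_UP, and the switch case. It assumes a 32-byte GRF and, as the patch's expression does, multiplies subnr by the element size to get the skipped bytes.

```cpp
#include <cassert>

// Assumed for illustration: one GRF holds 32 bytes.
constexpr unsigned kRegSize = 32;

// Integer division rounding up, standing in for DIV_ROUND_UP.
constexpr unsigned div_round_up(unsigned n, unsigned d)
{
    return (n + d - 1) / d;
}

// Mirrors the hunk: pad the region by the unread bytes in front of an
// unaligned start, then count whole registers.
constexpr unsigned mov_indirect_regs_read(unsigned region_length,
                                          unsigned subnr,
                                          unsigned type_size)
{
    return div_round_up(region_length + subnr * type_size, kRegSize);
}

// Aligned 32-byte region: exactly one register.
static_assert(mov_indirect_regs_read(32, 0, 4) == 1, "aligned, one GRF");
// Same region starting 2 dwords in: 32 + 8 = 40 bytes, spills into a second.
static_assert(mov_indirect_regs_read(32, 2, 4) == 2, "unaligned, two GRFs");
// Aligned 64-byte region: two registers.
static_assert(mov_indirect_regs_read(64, 0, 4) == 2, "aligned, two GRFs");
```

The second case is the one the v3 note targets: without the padding, a small region that straddles a register boundary would be reported as reading a single register.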