summaryrefslogtreecommitdiffstats
diff options
context:
space:
mode:
authorJason Ekstrand <[email protected]>2016-10-28 14:48:53 -0700
committerJason Ekstrand <[email protected]>2016-10-28 17:11:16 -0700
commit2a4a86862c949055c71637429f6d5f2e725d07d8 (patch)
tree243e8cfae6d5b705987f452d3b5acb13f33d3b1a
parent4bf45a6079b5cc6b0360b637c0c7baa456b8257d (diff)
i965/fs/generator: Don't use the address immediate for MOV_INDIRECT
The address immediate field is only 9 bits and, since the value is in bytes, the highest GRF we can point to with it is g15. This makes it pretty close to useless for MOV_INDIRECT. There were already piles of restrictions preventing us from using it prior to Broadwell, so let's get rid of the gen8+ code path entirely. Signed-off-by: Jason Ekstrand <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97779 Cc: "12.0 13.0" <[email protected]> Reviewed-by: Matt Turner <[email protected]>
-rw-r--r--src/mesa/drivers/dri/i965/brw_fs_generator.cpp55
1 files changed, 27 insertions, 28 deletions
diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
index d7b5cc66584..c5b50e1f73e 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
@@ -400,34 +400,33 @@ fs_generator::generate_mov_indirect(fs_inst *inst,
indirect_byte_offset =
retype(spread(indirect_byte_offset, 2), BRW_REGISTER_TYPE_UW);
- struct brw_reg ind_src;
- if (devinfo->gen < 8) {
- /* From the Haswell PRM section "Register Region Restrictions":
- *
- * "The lower bits of the AddressImmediate must not overflow to
- * change the register address. The lower 5 bits of Address
- * Immediate when added to lower 5 bits of address register gives
- * the sub-register offset. The upper bits of Address Immediate
- * when added to upper bits of address register gives the register
- * address. Any overflow from sub-register offset is dropped."
- *
- * This restriction is only listed in the Haswell PRM but emperical
- * testing indicates that it applies on all older generations and is
- * lifted on Broadwell.
- *
- * Since the indirect may cause us to cross a register boundary, this
- * makes the base offset almost useless. We could try and do
- * something clever where we use a actual base offset if
- * base_offset % 32 == 0 but that would mean we were generating
- * different code depending on the base offset. Instead, for the
- * sake of consistency, we'll just do the add ourselves.
- */
- brw_ADD(p, addr, indirect_byte_offset, brw_imm_uw(imm_byte_offset));
- ind_src = brw_VxH_indirect(0, 0);
- } else {
- brw_MOV(p, addr, indirect_byte_offset);
- ind_src = brw_VxH_indirect(0, imm_byte_offset);
- }
+ /* There are a number of reasons why we don't use the base offset here.
+ * One reason is that the field is only 9 bits which means we can only
+ * use it to access the first 16 GRFs. Also, from the Haswell PRM
+ * section "Register Region Restrictions":
+ *
+ * "The lower bits of the AddressImmediate must not overflow to
+ * change the register address. The lower 5 bits of Address
+ * Immediate when added to lower 5 bits of address register gives
+ * the sub-register offset. The upper bits of Address Immediate
+ * when added to upper bits of address register gives the register
+ * address. Any overflow from sub-register offset is dropped."
+ *
+ * Since the indirect may cause us to cross a register boundary, this
+ * makes the base offset almost useless. We could try and do something
+ * clever where we use a actual base offset if base_offset % 32 == 0 but
+ * that would mean we were generating different code depending on the
+ * base offset. Instead, for the sake of consistency, we'll just do the
+ * add ourselves. This restriction is only listed in the Haswell PRM
+ * but empirical testing indicates that it applies on all older
+ * generations and is lifted on Broadwell.
+ *
+ * In the end, while base_offset is nice to look at in the generated
+ * code, using it saves us 0 instructions and would require quite a bit
+ * of case-by-case work. It's just not worth it.
+ */
+ brw_ADD(p, addr, indirect_byte_offset, brw_imm_uw(imm_byte_offset));
+ struct brw_reg ind_src = brw_VxH_indirect(0, 0);
brw_inst *mov = brw_MOV(p, dst, retype(ind_src, dst.type));