summaryrefslogtreecommitdiffstats
diff options
context:
space:
mode:
authorNeil Roberts <[email protected]>2015-07-02 17:49:19 +0100
committerNeil Roberts <[email protected]>2015-07-03 09:39:09 +0100
commit7abc1e3286bc4729e144d3a247c2a275e46aaf53 (patch)
treee23da24325549eb09e1f3b5c4781434fca66c4b6
parent89bd5ee64c5aa1b977f4ba832cf7772e81ee286d (diff)
i965/fs: Don't disable SIMD16 when using the pixel interpolator
There was a comment saying that in SIMD16 mode the pixel interpolator returns coords interleaved 8 channels at a time and that this requires extra work to support. However, this interleaved format is exactly what the PLN instruction requires so I don't think anything needs to be done to support it apart from removing the line to disable it and to ensure that the message lengths for the send message are correct. I am more convinced that this is correct because as it says in the comment this interleaved output is identical to what is given in the thread payload. The code generated to apply the plane equation to these coordinates is identical on SIMD16 and SIMD8 except that the dispatch width is larger which implies no special unmangling is needed. Perhaps the confusion stems from the fact that the description of the PLN instruction in the IVB PRM seems to imply that the src1 inputs are not interleaved so it wouldn't work. However, in the HSW and BDW PRMs, the pseudo-code is different and looks like it expects the interleaved format. Mesa doesn't seem to generate different code on IVB to uninterleave the payload registers and everything is working so I can only assume that the PRM is wrong. I tested the interpolateAt tests on HSW and did a full Piglit run on IVB on there were no regressions. Reviewed-by: Chris Forbes <[email protected]>
-rw-r--r--src/mesa/drivers/dri/i965/brw_fs_nir.cpp11
1 files changed, 3 insertions, 8 deletions
diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index caf1300d71b..bd71404ef8d 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -1481,12 +1481,6 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
case nir_intrinsic_interp_var_at_centroid:
case nir_intrinsic_interp_var_at_sample:
case nir_intrinsic_interp_var_at_offset: {
- /* in SIMD16 mode, the pixel interpolator returns coords interleaved
- * 8 channels at a time, same as the barycentric coords presented in
- * the FS payload. this requires a bit of extra work to support.
- */
- no16("interpolate_at_* not yet supported in SIMD16 mode.");
-
fs_reg dst_xy = bld.vgrf(BRW_REGISTER_TYPE_F, 2);
/* For most messages, we need one reg of ignored data; the hardware
@@ -1551,7 +1545,7 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
bld.SEL(offset(src, bld, i), itemp, fs_reg(7)));
}
- mlen = 2;
+ mlen = 2 * dispatch_width / 8;
inst = bld.emit(FS_OPCODE_INTERPOLATE_AT_PER_SLOT_OFFSET, dst_xy, src,
fs_reg(0u));
}
@@ -1563,7 +1557,8 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
}
inst->mlen = mlen;
- inst->regs_written = 2; /* 2 floats per slot returned */
+ /* 2 floats per slot returned */
+ inst->regs_written = 2 * dispatch_width / 8;
inst->pi_noperspective = instr->variables[0]->var->data.interpolation ==
INTERP_QUALIFIER_NOPERSPECTIVE;