author | Samuel Iglesias Gonsálvez <[email protected]> | 2016-08-29 10:10:30 +0200 |
---|---|---|
committer | Francisco Jerez <[email protected]> | 2017-04-14 14:56:08 -0700 |
commit | a21dc2b500cff6e0aaf31867c5b42651306ddaf1 (patch) | |
tree | 38ba3688df51336582c62cf7ad9470de7ec7eaf0 /src/intel/compiler/brw_vec4.cpp | |
parent | a5399e8b1cc3e2e12b8aa067e8380d1b088c35ca (diff) |
i965/vec4: split DF instructions and later double its execsize in IVB/BYT
We need to split DF instructions in two on IVB/BYT, because the hardware
needs an execsize of 8 to process 4 DF values (one GRF in total).
v2:
- Rename helper and make it static inline function (Matt).
- Fix indentation and add braces (Matt).
v3:
- Don't edit IR instruction when doubling exec_size (Curro)
- Add comment into the code (Curro).
- Manage ARF registers like the others (Curro)
v4:
- Add get_exec_type() function and use it to calculate the execution
size.
Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
[ Francisco Jerez: Fix bogus 'type != BAD_FILE' check. Take
destination type as execution type where there is no valid source.
Assert-fail if the deduced execution type is byte. Clarify comment
in get_lowered_simd_width(). Move SIMD width workaround outside of
'if (...inst->size_written > REG_SIZE)' conditional block, since the
problem should be independent of whether the amount of data written
by the instruction is greater or lower than a GRF. Drop redundant
is_ivb_df definition. Drop bogus inst->exec_size < 8 check.
Simplify channel group assertion. ]
Reviewed-by: Francisco Jerez <[email protected]>
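
The v4 entry and the bracketed note above describe how the execution type is deduced: invalid sources are skipped, the destination type is used when no source is valid, and a byte result trips an assertion. The following is a minimal standalone sketch of that rule, not the Mesa implementation. `RegFile`, `RegType`, `Reg`, `Inst`, `type_size()` and `exec_type_size()` are hypothetical stand-ins, and picking the widest valid source type is an assumption that the commit message does not spell out.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified stand-ins for the vec4 IR; not the Mesa types.
enum class RegFile { BAD_FILE, VGRF, ARF, IMM };
enum class RegType { B, W, D, F, DF };

struct Reg {
   RegFile file;
   RegType type;
};

struct Inst {
   Reg dst;
   std::vector<Reg> src;
};

// Size in bytes of one channel of the given type.
static inline unsigned
type_size(RegType t)
{
   switch (t) {
   case RegType::B:  return 1;
   case RegType::W:  return 2;
   case RegType::D:  return 4;
   case RegType::F:  return 4;
   case RegType::DF: return 8;
   }
   return 0;
}

// Deduce the execution type size in bytes.
// Assumption: the widest valid source type wins.
static inline unsigned
exec_type_size(const Inst &inst)
{
   unsigned size = 0;

   for (const Reg &s : inst.src) {
      if (s.file == RegFile::BAD_FILE)
         continue;               // Unused source slot, ignore it.
      if (type_size(s.type) > size)
         size = type_size(s.type);
   }

   // No valid source: take the destination type as the execution type.
   if (size == 0)
      size = type_size(inst.dst.type);

   // A byte execution type is not expected here.
   assert(size != 1);
   return size;
}
```

In this model, an instruction with a DF destination and no valid sources reports an execution type size of 8, which is the case the destination fallback is meant to catch.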
Diffstat (limited to 'src/intel/compiler/brw_vec4.cpp')
-rw-r--r-- | src/intel/compiler/brw_vec4.cpp | 10 |
1 files changed, 10 insertions, 0 deletions
diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp
index d7c09093032..adbd85036e0 100644
--- a/src/intel/compiler/brw_vec4.cpp
+++ b/src/intel/compiler/brw_vec4.cpp
@@ -2115,6 +2115,16 @@ get_lowered_simd_width(const struct gen_device_info *devinfo,
       }
    }
 
+   /* IvyBridge can manage a maximum of 4 DFs per SIMD4x2 instruction, since
+    * it doesn't support compression in Align16 mode, no matter if it has
+    * force_writemask_all enabled or disabled (the latter is affected by the
+    * compressed instruction bug in gen7, which is another reason to enforce
+    * this limit).
+    */
+   if (devinfo->gen == 7 && !devinfo->is_haswell &&
+       (get_exec_type_size(inst) == 8 || type_sz(inst->dst.type) == 8))
+      lowered_width = MIN2(lowered_width, 4);
+
    return lowered_width;
 }
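
To make the effect of the added condition easier to check in isolation, here is a standalone sketch of the same clamp. `DeviceInfo` and `clamp_df_width()` are hypothetical names introduced for illustration. The type sizes are passed in as plain byte counts rather than computed from an instruction, so this models only the predicate in the hunk, not the rest of `get_lowered_simd_width()`.

```cpp
#include <algorithm>

// Hypothetical, trimmed-down device description. The real
// gen_device_info structure carries many more fields.
struct DeviceInfo {
   int gen;
   bool is_haswell;
};

// Limit the SIMD width to 4 on gen7 parts other than Haswell when either
// the execution type or the destination type is 64 bits (8 bytes) wide.
// Sizes are in bytes.
static inline unsigned
clamp_df_width(const DeviceInfo &devinfo, unsigned exec_type_size,
               unsigned dst_type_size, unsigned lowered_width)
{
   if (devinfo.gen == 7 && !devinfo.is_haswell &&
       (exec_type_size == 8 || dst_type_size == 8))
      lowered_width = std::min(lowered_width, 4u);

   return lowered_width;
}
```

With a device described as `{7, false}`, a width of 8 is reduced to 4 as soon as either size is 8, including a conversion with a 4-byte execution type and an 8-byte destination. With `{7, true}` (Haswell) the width is returned unchanged.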