author | Kenneth Graunke <[email protected]> | 2015-02-20 15:11:49 -0800 |
---|---|---|
committer | Kenneth Graunke <[email protected]> | 2015-04-06 13:49:02 -0700 |
commit | 797d606127c131a6ccff28150495d2b1f3f7e46e | |
tree | 376706ada2b695f57eaa4dd5c96d7068f8383861 /src/mesa/drivers/dri/i965/brw_fs.h | |
parent | 8aee87fe4cce0a883867df3546db0e0a36908086 |
i965: Implement SIMD16 texturing on Gen4.
This allows SIMD16 mode to work for a lot more programs. Texturing is
also more efficient in SIMD16 mode than SIMD8. Several messages don't
actually exist in SIMD8 mode, so we did SIMD16 messages and threw away
half of the data. Now we compute real data in both halves.
Also, the SIMD16 "sample" message doesn't require all three coordinate
components to exist (like the SIMD8 one), so we can shorten the message
lengths, cutting register usage a bit.
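The register saving described above can be illustrated with a small back-of-the-envelope calculation. This is only a sketch under stated assumptions: the helper names are hypothetical and do not appear in the driver, each register is assumed to hold 8 float channels (so a component costs one register in SIMD8 and two in SIMD16), and message headers, shadow comparators and LOD values are ignored.

```cpp
#include <cassert>

// Hypothetical helper: payload registers for `components` values at a given
// dispatch width. Assumption: one register holds 8 float channels.
constexpr int payload_regs(int dispatch_width, int components)
{
    return components * (dispatch_width / 8);
}

// Registers used to sample 16 pixels with two SIMD8 "sample" messages.
// Per the commit message, the SIMD8 message needs all three coordinate
// components, so the actual coordinate count does not shorten it.
constexpr int regs_two_simd8(int /*coord_components*/)
{
    return 2 * payload_regs(8, 3);
}

// Registers used to sample the same 16 pixels with one SIMD16 "sample"
// message, which only carries the components the texture actually uses.
constexpr int regs_one_simd16(int coord_components)
{
    return payload_regs(16, coord_components);
}

// Worked example for a 2D texture (two coordinate components).
inline void example_2d()
{
    assert(regs_two_simd8(2) == 6);   // 2 messages x 3 components x 1 register
    assert(regs_one_simd16(2) == 4);  // 1 message  x 2 components x 2 registers
}
```

Under these assumptions, a 2D lookup for 16 pixels needs 4 payload registers in SIMD16 instead of 6 across two SIMD8 messages. The real message layouts may differ in details this sketch leaves out.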
I chose to implement the visitor functionality in a separate function,
since mixing true SIMD16 with SIMD8 code that uses SIMD16 fallbacks
seemed like a mess. The new code bails on a few cases where we'd
have to do two SIMD8 messages - we just fall back to SIMD8 for now.
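The fallback behaviour described above might be pictured roughly as follows. This is a simplified sketch, not the driver's code: `TexOp`, `try_emit_simd16` and `choose_width` are hypothetical names, and the `needs_two_simd8_messages` flag stands in for whatever cases the new function actually bails on, which the commit message does not list.

```cpp
#include <cassert>
#include <vector>

// Hypothetical description of one texture operation. The real visitor works
// on IR nodes, not on a struct like this.
struct TexOp {
    // Assumed flag: true when no single SIMD16 message covers the operation,
    // so it would have to be split into two SIMD8 messages.
    bool needs_two_simd8_messages;
};

enum class Width { Simd8 = 8, Simd16 = 16 };

// Hypothetical stand-in for the separate SIMD16 emit path: it reports
// failure instead of emitting a pair of SIMD8 messages.
inline bool try_emit_simd16(const TexOp &op)
{
    return !op.needs_two_simd8_messages;
}

// Choose the dispatch width for a whole shader: one unsupported texture
// operation is enough to fall back to SIMD8 for the entire program.
inline Width choose_width(const std::vector<TexOp> &ops)
{
    for (const TexOp &op : ops) {
        if (!try_emit_simd16(op))
            return Width::Simd8;
    }
    return Width::Simd16;
}
```

Keeping the SIMD16 path in its own function, as the commit message describes, means the bail-out is a single early return rather than a branch threaded through the SIMD8 code.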
Improves performance in "Shadowrun: Dragonfall - Director's Cut" by
about 20% on GM45 (measured with LIBGL_SHOW_FPS=1 while standing around
in the first mission).
v2: Add ir_txf to the has_lod case (caught by Jordan Justen).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
Diffstat (limited to 'src/mesa/drivers/dri/i965/brw_fs.h')
-rw-r--r-- | src/mesa/drivers/dri/i965/brw_fs.h | 4 |
1 files changed, 4 insertions, 0 deletions
diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
index 278a8eed76d..cfdbf555d62 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -271,6 +271,10 @@ public:
                               fs_reg shadow_comp,
                               fs_reg lod, fs_reg lod2, int grad_components,
                               uint32_t sampler);
+   fs_inst *emit_texture_gen4_simd16(ir_texture_opcode op, fs_reg dst,
+                                     fs_reg coordinate, int vector_elements,
+                                     fs_reg shadow_c, fs_reg lod,
+                                     uint32_t sampler);
    fs_inst *emit_texture_gen5(ir_texture_opcode op, fs_reg dst,
                               fs_reg coordinate, int coord_components,
                               fs_reg shadow_comp,