gallivm: optimize lp_build_minify for sse

SSE can't handle true vector shifts (with a per-element variable shift count), so LLVM turns them into a mess of extracts, scalar shifts and inserts. It is however possible to emulate them in lp_build_minify with float muls, which should be way faster (saves over 20 instructions per 8-wide lp_build_minify).

This wouldn't work for "generic" 32-bit shifts though, since we've only got 24 bits of mantissa (for left shifts it would actually work by using an SSE4.1 int mul instead of a float mul, but not for right shifts).

Note that this has very limited scope for now, since it is only used with per-pixel lod (otherwise we avoid the non-constant shift count by doing per-quad shifts manually), and only for 1d textures even then (though the latter should change).

Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
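The float-mul emulation described in the message can be illustrated with a scalar sketch. This is not the gallivm code from the commit: the helper names (pow2_neg, minify_float_mul, minify_shift) are hypothetical, and the sketch assumes IEEE-754 single precision, a non-negative size below 2^24 and a level small enough that 2^(-level) is a normal float. It builds 2^(-level) directly from its exponent bits, multiplies, clamps to 1 and truncates, which should match max(size >> level, 1) within that range.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: build the float 2^(-level) from its IEEE-754 bit
 * pattern (biased exponent 127 - level, zero mantissa).
 * Assumes 0 <= level <= 126 so the result is a normal float. */
static float pow2_neg(uint32_t level)
{
   uint32_t bits = (127u - level) << 23;
   float f;
   memcpy(&f, &bits, sizeof f);   /* bitcast int -> float */
   return f;
}

/* Emulated minify: max(base_size >> level, 1) using a float multiply.
 * Only exact while base_size fits in the 24-bit float mantissa. */
static int32_t minify_float_mul(int32_t base_size, uint32_t level)
{
   float size = (float)base_size * pow2_neg(level);
   if (size < 1.0f)
      size = 1.0f;                /* the max is done on floats too */
   return (int32_t)size;          /* truncate toward zero */
}

/* Reference: the plain integer version with a real shift. */
static int32_t minify_shift(int32_t base_size, uint32_t level)
{
   int32_t size = base_size >> level;
   return size < 1 ? 1 : size;
}
```

For example, minify_float_mul(256, 3) gives 32, the same as minify_shift(256, 3). The 24-bit mantissa limit shows up as soon as the input needs more bits: minify_float_mul(16777217, 0) gives 16777216, because 2^24 + 1 is not representable as a float, which is why the trick does not extend to generic 32-bit shifts.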
author: Roland Scheidegger <[email protected]> 2013-11-05 19:21:25 +0100
committer: Roland Scheidegger <[email protected]> 2013-11-05 23:32:24 +0100
commit: 5ae31d7e1d3d51c7843571c63aa228f8ca9b879f (patch)
tree: 0627f4c46df7c88b89171458bd0f8201f47bca14 /src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c
parent: 7df7e730fb71249993c9dcabff4b5e7075a775f6 (diff)
1 files changed, 1 insertions, 1 deletions
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c b/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c
index 2d833318aee..e8c04d1e6c5 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c
@@ -2940,7 +2940,7 @@ lp_build_size_query_soa(struct gallivm_state *gallivm,
                                     lp_build_const_int32(gallivm, 2), "");
    }
 
-   size = lp_build_minify(&bld_int_vec4, size, lod);
+   size = lp_build_minify(&bld_int_vec4, size, lod, TRUE);
 
    if (has_array)
       size = LLVMBuildInsertElement(gallivm->builder, size,