author | Jason Ekstrand <[email protected]> | 2014-07-21 16:46:39 -0700 |
---|---|---|
committer | Jason Ekstrand <[email protected]> | 2014-07-24 12:44:56 -0700 |
commit | 989d2e370993c87d1bbda4950657bfcc5b0a58dd (patch) | |
tree | 1b9a048a63e291fb68d7c1a2bf00e08d69d7e672 /src | |
parent | 2a33510f1649f2ef5c5b2d693aa89ef0efc5dcfb (diff) |
Add an accelerated version of F_TO_I for x86_64

According to a quick micro-benchmark, this new version is 20% faster on my
Haswell laptop.

v2: Removed the XXX note about x86_64 from the comment
v3: Use an intrinsic instead of an __asm__ block. This should give us MSVC
    support for free.
v4: Enable it for all x86_64 builds, not just with USE_X86_64_ASM

Signed-off-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Diffstat (limited to 'src')
-rw-r--r-- | src/mesa/main/imports.h | 6 |
1 files changed, 5 insertions, 1 deletions
diff --git a/src/mesa/main/imports.h b/src/mesa/main/imports.h
index af780b2498f..09e55ebf0ff 100644
--- a/src/mesa/main/imports.h
+++ b/src/mesa/main/imports.h
@@ -274,10 +274,12 @@ static inline int IROUND_POS(float f)
    return (int) (f + 0.5F);
 }
 
+#ifdef __x86_64__
+# include <xmmintrin.h>
+#endif
 
 /**
  * Convert float to int using a fast method. The rounding mode may vary.
- * XXX We could use an x86-64/SSE2 version here.
  */
 static inline int F_TO_I(float f)
 {
@@ -292,6 +294,8 @@ static inline int F_TO_I(float f)
          fistp r
    }
    return r;
+#elif defined(__x86_64__)
+   return _mm_cvt_ss2si(_mm_load_ss(&f));
 #else
    return IROUND(f);
 #endif
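The comment in the patch notes that "the rounding mode may vary". A minimal standalone sketch of what that means for the new x86_64 path is given here. The names `f_to_i_sketch`, `iround_sketch` and `F_TO_I_HAVE_SSE` are hypothetical and exist only for this illustration, and `iround_sketch` is an assumed stand-in for Mesa's `IROUND` (round half away from zero), not a copy of it. Only `_mm_load_ss` and `_mm_cvt_ss2si` come from the patch itself.

```c
#ifdef __x86_64__
#  include <xmmintrin.h>   /* SSE intrinsics: _mm_load_ss, _mm_cvt_ss2si */
#  define F_TO_I_HAVE_SSE 1   /* hypothetical flag, for illustration only */
#else
#  define F_TO_I_HAVE_SSE 0
#endif

/* Assumed stand-in for the generic fallback: round half away from zero. */
static inline int iround_sketch(float f)
{
   return (int) ((f >= 0.0F) ? (f + 0.5F) : (f - 0.5F));
}

/* Same shape as the new branch in the patch: load the float into the low
 * lane of an SSE register and convert it with CVTSS2SI, which rounds
 * according to the current MXCSR rounding mode. */
static inline int f_to_i_sketch(float f)
{
#if F_TO_I_HAVE_SSE
   return _mm_cvt_ss2si(_mm_load_ss(&f));
#else
   return iround_sketch(f);
#endif
}
```

With the default MXCSR setting (round to nearest, ties to even), `f_to_i_sketch(2.5f)` yields 2 on the SSE path, whereas `iround_sketch(2.5f)` yields 3. Non-tie inputs such as `2.7f` give the same result on both paths, so callers that need a specific tie-breaking rule should probably not rely on this helper.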