util: add avx2 and xop detection to cpu detection code

Going to need this soon (not going to bother with avx2 intrinsics at this time, but don't want to do workarounds for true vector shifts if llvm itself can use them just fine and won't need the gazillion-instruction emulation). Not really tested, other than that my cpu returns 0 for these features... (I have no idea whether llvm would actually emit avx2/xop instructions either...)

Reviewed-by: Jose Fonseca <[email protected]>
author: Roland Scheidegger <[email protected]> 2013-08-20 04:20:33 +0200
committer: Roland Scheidegger <[email protected]> 2013-08-20 23:00:24 +0200
commit: 4b45b61fef6e0f3325888c190e6e557d8948b31a (patch)
tree: 0d75ae0bb86cbc59fb114a8a2cca817f8b31d96a /src/gallium/auxiliary/gallivm
parent: 9299128bf297263144654abd0fc596f64dc13436 (diff)
1 files changed, 9 insertions, 2 deletions
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_init.c b/src/gallium/auxiliary/gallivm/lp_bld_init.c
index 61eadb838dc..61b561f9343 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_init.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_init.c
@@ -461,12 +461,15 @@ lp_build_init(void)
                                                  lp_native_vector_width);
 
    if (lp_native_vector_width <= 128) {
-      /* Hide AVX support, as often LLVM AVX instrinsics are only guarded by
+      /* Hide AVX support, as often LLVM AVX intrinsics are only guarded by
        * "util_cpu_caps.has_avx" predicate, and lack the
        * "lp_native_vector_width > 128" predicate. And also to ensure a more
        * consistent behavior, allowing one to test SSE2 on AVX machines.
+       * XXX: should not play games with util_cpu_caps directly as it might
+       * get used for other things outside llvm too.
        */
       util_cpu_caps.has_avx = 0;
+      util_cpu_caps.has_avx2 = 0;
    }
 
    if (!HAVE_AVX) {
@@ -476,13 +479,17 @@ lp_build_init(void)
        * omit it unnecessarily on amd cpus, see above).
        */
       util_cpu_caps.has_f16c = 0;
+      util_cpu_caps.has_xop = 0;
    }
 
 #ifdef PIPE_ARCH_PPC_64
    /* Set the NJ bit in VSCR to 0 so denormalized values are handled as
-    * specified by IEEE standard (PowerISA 2.06 - Section 6.3). This garantees
+    * specified by IEEE standard (PowerISA 2.06 - Section 6.3). This guarantees
     * that some rounding and half-float to float handling does not round
     * incorrectly to 0.
+    * XXX: should eventually follow same logic on all platforms.
+    * Right now denorms get explicitly disabled (but elsewhere) for x86,
+    * whereas ppc64 explicitly enables them...
     */
    if (util_cpu_caps.has_altivec) {
       unsigned short mask[] = { 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF,
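
The hunks above only clear the has_avx2 and has_xop flags; the detection code named in the subject line is not part of this diff. As a rough illustration of what such detection usually involves, here is a minimal sketch, assuming GCC or clang on x86 with <cpuid.h> available. The struct and every sketch_* name are hypothetical and are not the actual util_cpu_caps code. The bit positions are the documented ones: AVX2 is CPUID leaf 7 (subleaf 0) EBX bit 5, and XOP is CPUID leaf 0x80000001 ECX bit 11.

```c
#include <stdint.h>
#include <stdio.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>
#define SKETCH_HAVE_CPUID 1
#else
#define SKETCH_HAVE_CPUID 0
#endif

/* Hypothetical stand-in for the two new capability flags. */
struct sketch_cpu_caps {
   unsigned has_avx2:1;
   unsigned has_xop:1;
};

/* AVX2 is reported in CPUID.(EAX=7,ECX=0):EBX bit 5. */
static unsigned
sketch_avx2_from_leaf7_ebx(uint32_t ebx)
{
   return (ebx >> 5) & 1;
}

/* XOP (AMD) is reported in CPUID.(EAX=0x80000001):ECX bit 11. */
static unsigned
sketch_xop_from_ext_ecx(uint32_t ecx)
{
   return (ecx >> 11) & 1;
}

/* Query the CPU. Returns all-zero caps when cpuid is unavailable.
 * Note: this only reads the CPUID feature bits. It does not check
 * that the OS has enabled AVX state saving, which real code must
 * also verify before using 256-bit instructions.
 */
static struct sketch_cpu_caps
sketch_detect(void)
{
   struct sketch_cpu_caps caps = { 0, 0 };
#if SKETCH_HAVE_CPUID
   unsigned eax, ebx, ecx, edx;

   /* Only read leaf 7 if the CPU reports it as supported. */
   if (__get_cpuid_max(0, NULL) >= 7) {
      __cpuid_count(7, 0, eax, ebx, ecx, edx);
      caps.has_avx2 = sketch_avx2_from_leaf7_ebx(ebx);
   }

   /* Same for the extended leaf that carries the XOP bit. */
   if (__get_cpuid_max(0x80000000, NULL) >= 0x80000001) {
      __cpuid(0x80000001, eax, ebx, ecx, edx);
      caps.has_xop = sketch_xop_from_ext_ecx(ecx);
   }
#endif
   return caps;
}
```

Keeping the bit extraction in separate helpers means the decoding can be checked with fixed register values on any machine, which matters here given that the author's own CPU reported 0 for both features.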