path: root/src/gallium/auxiliary/gallivm
Commit message | Author | Age | Files | Lines
* gallivm: Introduce lp_format_intrinsic.Jose Fonseca2016-04-043-14/+54
| | | | | | | | | | For adding .v4f32 like suffixes to intrinsics, taking special care for scalar case, which was being often neglected. This fixes invalid IR when doing mipmap filtering on SSE2 (the only case where we'd use intrinsics with scalars.) Reviewed-by: Roland Scheidegger <[email protected]>
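The suffix rule described in this commit can be sketched roughly as follows. This is a hypothetical helper, not the real lp_format_intrinsic signature: the name format_intrinsic_name and all of its parameters are assumptions made for illustration. It assumes LLVM's naming for overloaded intrinsics, where a scalar float is suffixed .f32 and a 4-wide float vector .v4f32.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper (not the actual gallivm API): build an overloaded
 * intrinsic name such as "llvm.fabs.v4f32" from a root name, an element
 * kind ('f' or 'i'), the element width in bits and the vector length.
 * A length of 1 means scalar, which must not get a "vN" part.
 */
static int
format_intrinsic_name(char *out, size_t size, const char *root,
                      char kind, unsigned width, unsigned length)
{
   assert(out && root && size > 0);
   if (length <= 1) {
      /* Scalar case: "llvm.fabs.f32", not "llvm.fabs.v1f32". */
      return snprintf(out, size, "%s.%c%u", root, kind, width);
   }
   /* Vector case: "llvm.fabs.v4f32". */
   return snprintf(out, size, "%s.v%u%c%u", root, length, kind, width);
}
```

Treating the scalar case separately is the point of the sketch: a single format string with a "v%u" part would produce a name LLVM does not recognise for scalars.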
* gallivm: Use llvm.fabs.Jose Fonseca2016-04-031-8/+3
| | | | | | Exactly the same code. Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: Prefer backend agnostic intrinsic for rounding.Jose Fonseca2016-04-031-7/+39
| | | | | | | | | We could unconditionally use these intrinsics, but performance with SSE2 would suck, as LLVM falls back to calling libm. lp_test_arit. Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: Add debug option to force SSE2.Jose Fonseca2016-04-031-11/+14
| | | | | | For simulating less capable machines. Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: Fix performance regressions due to vector selects.Jose Fonseca2016-04-031-22/+18
| | | | | | | | | LLVM often can't determine the mask elements are all ones/zeros, and there doesn't seem to be a good way to hint that. Thanks to Roland Scheidegger for spotting and analyzing the issue. Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: Remove lp_build_load_volatile.Jose Fonseca2016-04-032-12/+0
| | | | | | | No longer needed. Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: Use standard LLVMSetAlignment from LLVM 3.4 onwards.Jose Fonseca2016-04-037-23/+35
| | | | | | | | | Only provide a fallback for LLVM 3.3. One less dependency on LLVM C++ interface. Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: Prevent disassembly debug output from being truncated.Jose Fonseca2016-04-011-9/+9
| | | | | | | | | | | By using os_log_message directly, as _debug_vprintf truncates messages to 4K. Also clean up the disassemble interface. Spotted by Roland. Trivial.
* gallivm: Use vector selects on LLVM 3.3+.Jose Fonseca2016-04-011-3/+5
| | | | | | | | | | | | This is an old patch I had around. Vector selects seem to work well from LLVM 3.3. Using them should improve code quality, as it might make constant propagation pass more effective. Tested lp_test_* Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: special case TGSI_OPCODE_STORENicolai Hähnle2016-03-091-1/+1
| | | | | | | | This instruction has the resource (buffer or image) as a destination to represent the writemask for SSBO writes. However, this is obviously not a "real" destination for the purpose of emitting LLVM IR. Reviewed-by: Marek Olšák <[email protected]>
* gallium/auxilary: more __cplusplus exportsTim Rowley2016-03-024-0/+28
| | | | | | | | swr driver which is written in C++ needs access to some more gallium utility functions than are currently exposed. Reviewed-by: Roland Scheidegger <[email protected]> Acked-by: Jose Fonseca <[email protected]>
* gallivm: Check whether to stop disassemble only for x86Oded Gabbay2016-02-191-0/+2
| | | | | | | | | | Because the if statement that checks whether we have a return statement is valid only on x86, surround it with X86 or X86-64 arch defines Signed-off-by: Oded Gabbay <[email protected]> Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: use sstream for dissasemblingOded Gabbay2016-02-191-21/+30
| | | | | | | | | | | | | Currently, disassemble() directly prints to stdout. This has broken the profiling support for llvmpipe JIT code. This patch redirects the output to an sstream object, which then either gets printed to stdout (for assembly debugging) or gets written to a file in /tmp/ (for profiling support). Signed-off-by: Oded Gabbay <[email protected]> Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm, tgsi: provide fake sample_i_ms implementationsRoland Scheidegger2016-02-181-1/+6
| | | | | | | Just like the rest of the msaa "implementation" it's just fake for now... Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: Add helpers for creating and destroying TargetLibraryInfoTom Stellard2016-02-172-0/+37
| | | | | | | This functionality is not exposed via the LLVM C API. Tested-by: Michel Dänzer <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* Handle removal of LLVMAddTargetData in SVN revision 260919Matthew Dawson2016-02-161-0/+2
| | | | | | | | | | | | | | | | | | | LLVM removed LLVMAddTargetData for the 3.9 release in r260919. For the two places in mesa where this is called, only enable the lines when compiling for less than 3.9. For the radeon driver, I'm not sure how to check if any other LLVM calls need to be adjusted. I think since the target data used is extracted from the LLVMModule, it isn't necessary to pass it back to LLVM again. The code does compile, and at least for radeonsi does run OpenGL games. [ Michel Dänzer: Move #if closer to LLVMAddTargetData in lp_bld_init.c, and add HAVE_LLVM < 0x0309 guards around now unused occurrences of TD and data_layout ] Signed-off-by: Matthew Dawson <[email protected]> Reviewed-and-Tested-by: Michel Dänzer <[email protected]>
* gallium: add PIPE_SHADER_CAP_MAX_SHADER_IMAGESIlia Mirkin2016-02-151-0/+1
| | | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* gallium: add PIPE_SHADER_CAP_SUPPORTED_IRSSamuel Pitoiset2016-02-131-0/+2
| | | | | | | | | | | | This cap indicates the supported representations of programs. It should be a mask of pipe_shader_ir bits. It will allow to enable ARB_compute_shader if the underlying driver supports TGSI. Changes from v2: - improve description of PIPE_SHADER_CAP_SUPPORTED_IRS Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallivm: add PK2H/UP2H supportRoland Scheidegger2016-02-022-7/+9
| | | | | | | | | | | | Add support for these opcodes, the conversion functions were already there albeit need some new packing stuff. Just like the tgsi version, piglit won't like it for all the same reasons, so it's disabled (UP2H passes piglit arb_shader_language_packing tests, albeit since PK2H won't due to those rounding differences I don't know if that one works or not as the piglit test is rather difficult to deal with). Reviewed-by: Brian Paul <[email protected]>
* gallivm: add PK2H/UP2H supportRoland Scheidegger2016-02-025-2/+119
| | | | | | | | | | Add support for these opcodes, the conversion functions were already there albeit need some new packing stuff. Just like the tgsi version, piglit won't like it for all the same reasons, so it's disabled (UP2H passes piglit arb_shader_language_packing tests, albeit since PK2H won't due to those rounding differences I don't know if that one works or not as the piglit test is rather difficult to deal with).
* llvmpipe: use vpkswss when dst is signedOded Gabbay2016-01-181-16/+15
| | | | | | | | | | | | | | | | | | | This patch fixes a bug when building a pack instruction. For POWER (altivec), in case the destination is signed and the src width is 32, we need to use vpkswss. The original code used vpkuwus, which emits an unsigned result. This fixes the following piglit tests on ppc64le: - spec@arb_color_buffer_float@gl_rgba8-drawpixels - shaders@glsl-fs-fogscale I've also corrected some coding style issues in the function. v2: Returned else statements to vmware style Signed-off-by: Oded Gabbay <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
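The difference between the two pack instructions can be modelled per lane in scalar C. This is only a sketch under the assumption that vpkswss saturates signed 32-bit words to signed 16-bit values and vpkuwus saturates unsigned 32-bit words to unsigned 16-bit values. The function names are hypothetical and are not taken from the patch.

```c
#include <stdint.h>

/* Signed saturation: clamp a signed 32-bit value into the int16_t range. */
static int16_t
pack_s32_to_s16_sat(int32_t v)
{
   if (v > INT16_MAX)
      return INT16_MAX;
   if (v < INT16_MIN)
      return INT16_MIN;
   return (int16_t)v;
}

/* Unsigned saturation: clamp an unsigned 32-bit value into the uint16_t range. */
static uint16_t
pack_u32_to_u16_sat(uint32_t v)
{
   return v > UINT16_MAX ? UINT16_MAX : (uint16_t)v;
}
```

With an input of -70000, the signed variant clamps to -32768, while the unsigned variant sees the same bits as a large positive number and clamps to 65535. That mismatch is the kind of wrong result the patch describes for signed destinations.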
* gallivm: avoid crashing in mod by 0 with llvmpipeJeff Muizelaar2016-01-161-2/+16
| | | | | | | This adds code that is basically the same as the code in umod, udiv and idiv. However, unlike idiv we return -1. Reviewed-by: Roland Scheidegger <[email protected]>
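As a scalar illustration of the behaviour described here, a guarded signed modulo might look like the sketch below. The commit works on LLVM IR vectors, whereas this version uses plain branches, and the function name is hypothetical. The extra check for a divisor of -1 is an addition needed for well-defined C and is not taken from the commit.

```c
#include <stdint.h>

/*
 * Hypothetical scalar model: signed modulo that never divides by zero.
 * A zero divisor yields -1, matching the result the commit describes.
 */
static int32_t
mod_or_minus_one(int32_t a, int32_t b)
{
   if (b == 0)
      return -1;   /* avoid the division entirely */
   if (b == -1)
      return 0;    /* INT32_MIN % -1 overflows in C; the remainder is 0 */
   return a % b;
}
```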
* gallium: add PIPE_SHADER_CAP_MAX_SHADER_BUFFERSIlia Mirkin2016-01-081-0/+1
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* llvmpipe: use sse2 conv code for altivecOded Gabbay2016-01-071-2/+2
| | | | | | | | | | | | | | | In lp_build_conv() and lp_build_conv_auto(), there is a special case of conversion when sse2 is present. That code path is suitable without any changes to altivec, because all the functions that are called in that code path already support altivec. This patch increases the FPS in POWER arch across the board by between 10% and 25%. I checked ipers, glxgears, glxspheres64, openarena, xonotic and glmark2. Signed-off-by: Oded Gabbay <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* gallium: Use unsigned for loop indexEdward O'Callaghan2016-01-061-3/+3
| | | | | | Found-by: Coccinelle Signed-off-by: Edward O'Callaghan <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* gallium: Remove unnecessary semicolonsEdward O'Callaghan2016-01-061-1/+1
| | | | | | | | | Fix silly issue with MSVC case fall-through support to need an extra 'break;' Found-by: Coccinelle Signed-off-by: Edward O'Callaghan <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* llvmpipe: fix fp64 inputs to geom shader.Dave Airlie2015-12-091-4/+12
| | | | | | | | | | | This fixes the fetching of fp64 inputs to the geometry shader, this fixes the recently posted piglit's arb_gpu_shader_fp64/execution/gs-fs-vs-double-array.shader_test arb_vertex_attrib_64bit/execution/gs-fs-vs-attrib-double-array.shader_test Reviewed-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* gallium/auxiliary: Sanitize NULL checks into canonical formEdward O'Callaghan2015-12-061-1/+1
| | | | | | | | | | Use NULL tests of the form `if (ptr)' or `if (!ptr)'. They do not depend on the definition of the symbol NULL. Further, they avoid the opportunity for accidental assignment, and are clear and succinct. Signed-off-by: Edward O'Callaghan <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* gallium/auxiliary: Trivial code style cleanupEdward O'Callaghan2015-12-062-11/+11
| | | | | Signed-off-by: Edward O'Callaghan <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* gallivm: use sampler index 0 for texel fetchesRoland Scheidegger2015-11-201-1/+6
| | | | | | | | | | | | | texel fetches don't use any samplers. Previously we just set the same number for both texture and sampler unit (as per "ordinary" gl style sampling where the numbers are always the same) however this would trigger some assertions checking that the sampler index isn't over PIPE_MAX_SAMPLERS limit elsewhere with d3d10, so just set to 0. (Fixing the assertion instead isn't really an option, the sampler isn't really used but might still pass an out-of-bound pointer around and even copy some things from it.) Reviewed-by: Jose Fonseca <[email protected]>
* llvmpipe: disable VSX in ppc due to LLVM PPC bugOded Gabbay2015-11-181-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | This patch disables the use of VSX instructions, as they cause some piglit tests to fail For more details, see: https://llvm.org/bugs/show_bug.cgi?id=25503#c7 With this patch, ppc64le reaches parity with x86-64 as far as piglit test suite is concerned. v2: - Added check that we have at least LLVM 3.4 - Added the LLVM bug URL as a comment in the code v3: - Only disable VSX if Altivec is supported, because if Altivec support is missing, then VSX support doesn't exist anyway. - Change original patch description. Signed-off-by: Oded Gabbay <[email protected]> Cc: "11.0" <[email protected]> Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: fix sampling for s3tc srgb formats when using texture cacheRoland Scheidegger2015-11-041-1/+3
| | | | | | | | | This actually stored the values as 8bit linear values in the cache, then did another srgb->linear conversion... We don't want to do the former (decoding 8bit srgb values to 8bit linear completely defeats the purpose of srgb in the first place), so just decode to 8bit srgb. Fixes piglit.spec.ext_texture_srgb.texwrap formats-s3tc tests.
* llvmpipe: add cache for compressed texturesRoland Scheidegger2015-11-0410-7/+615
| | | | | | | | | | | | | | | | | | | | | | compressed textures are very slow because decoding is rather complex (and because there's no jit code to decode them too for non-technical reasons). Thus, add some texture cache which holds a couple of decoded blocks. Right now this handles only s3tc format albeit it could be extended to work with other formats rather trivially as long as the result of decode fits into 32bit per texel (ideally, rgtc actually would decode to more than 8 bits per channel, but even then making it work for it shouldn't be too difficult). This can improve performance noticeably but don't expect wonders (uncompressed is unsurprisingly still faster). It's also possible it might be slower in some cases (using nearest filtering for example or if there's otherwise not many cache hits, the cache is only direct mapped which isn't great). Also, actual decode of a block relies on util code, thus even though always full blocks are decoded it is done texel by texel - this could obviously benefit greatly from simd-optimized code decoding full blocks at once... Note the cache is per (raster) thread, and currently only used for fragment shaders. Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: disable f16c when not using AVXRoland Scheidegger2015-10-261-0/+3
| | | | | | | | | | | | | | | | | f16c intrinsic can only be emitted when AVX is used. So when we disable AVX due to forcing 128bit vectors we must not use this intrinsic (depending on llvm version, this worked previously because llvm used AVX even when we didn't tell it to, however I've seen this fail with llvm 3.3 since 718249843b915decf8fccec92e466ac1a6219934 which seems to have the side effect of disabling avx in llvm albeit it only touches sse flags really, but with ea421e919ae6e72e1319fb205c42a6fb53ca2f82 it's now really disabled). Albeit being able to use AVX with 128bit vectors also would have its uses, the code as is really was meant to emulate jit code creation for less capable cpus. v2: add some (ifdefed out) missing de-featuring options for simulating less capable cpus. Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: fix tex offsets with mirror repeat linearRoland Scheidegger2015-10-241-4/+5
| | | | | | | | Can't see why anyone would ever want to use this, but it was clearly broken. This fixes the piglit texwrap offset test using this combination. Reviewed-by: Dave Airlie <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: fix sampling with texture offsets in SoA pathRoland Scheidegger2015-10-241-3/+8
| | | | | | | | | | | | | When using nearest filtering and clamp / clamp to edge wrapping results could be wrong for negative offsets. Fix this by adding the offset before doing the conversion to int coords (could also use floor instead of trunc int conversion but probably more complex on "typical" cpu). This fixes the piglit texwrap offset failures with this filter/wrap combo (which only leaves the linear/mirror repeat combination broken). Reviewed-by: Dave Airlie <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: Explicitly disable unsupported CPU features.Jose Fonseca2015-10-231-38/+34
| | | | | | Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92214 CC: "10.6 11.0" <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: Translate all util_cpu_caps bits to LLVM attributes.Jose Fonseca2015-10-221-2/+34
| | | | | | | | | | | | | | This should prevent disparity between features Mesa and LLVM believe are supported by the CPU. http://lists.freedesktop.org/archives/mesa-dev/2015-October/thread.html#96990 Tested on a i7-3720QM w/ LLVM 3.3 and 3.6. v2: Increase SmallVector initial size as suggested by Gustaw Smolarczyk. Reviewed-by: Roland Scheidegger <[email protected]> CC: "10.6 11.0" <[email protected]>
* gallium: add PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINTMarek Olšák2015-10-201-0/+2
| | | | | | | | | | | | | | This avoids a serious r600g bug leading to a GPU hang. The chances this bug will get fixed are pretty low now. I deeply regret listening to others and not pushing this patch, leaving other users with a GPU-crashing driver. Yes, it should be fixed in the compiler and it's ugly, but users couldn't care less about that. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86720 Cc: 11.0 10.6 <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* gallivm: implement the correct version of LRPMarek Olšák2015-10-171-6/+13
| | | | | | | | | The previous version has precision issues. This can be a problem with tessellation. Sadly, I can't find the article where I read it anymore. I'm not sure if the unsafe-fp-math flag would be enough to revert this. v2: added the comment
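The commit message does not spell out the two formulas. One plausible reading, stated here as an assumption, is that a naive a + t*(b - a) form is replaced by (1 - t)*a + t*b, which returns the endpoints exactly. The sketch below uses hypothetical names and generic operands rather than the TGSI operand order.

```c
/* Naive form: may fail to return b at t == 1, because (b - a) is rounded. */
static float
lerp_naive(float a, float b, float t)
{
   return a + t * (b - a);
}

/* Endpoint-exact form: gives a at t == 0 and b at t == 1 for finite inputs. */
static float
lerp_exact(float a, float b, float t)
{
   return (1.0f - t) * a + t * b;
}
```

The exact form costs one extra multiply. Where neighbouring primitives must evaluate to identical endpoint values, as may be the case with tessellation, that trade-off is likely worthwhile.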
* gallivm: set correct opcode info from unary/binary/ternary emitsMarek Olšák2015-10-171-3/+6
| | | | | | | | and clear the emit_data structure. The new radeonsi min/max opcode implementation requires this. (it looks good according to Roland S.)
* gallivm: Allow drivers and state trackers to initialize gallivm LLVM targets v2Tom Stellard2015-10-022-7/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Drivers and state trackers that use LLVM for generating code, must register the targets they use with LLVM's global TargetRegistry. The TargetRegistry is not thread-safe, so all targets must be added to the registry before it can be queried for target information. When drivers and state trackers initialize their own targets, they need a way to force gallivm to initialize its targets at the same time. Otherwise, there can be a race condition in some multi-threaded applications (e.g. glx-multihreaded-shader-compile in piglit), when one thread creates a context for a driver that uses LLVM (e.g. radeonsi) and another thread creates a gallivm context (glxContextCreate does this). The race happens when the driver thread initializes its LLVM targets and then starts using the registry before the gallivm thread has a chance to register its targets. This patch allows users to force gallivm to register its targets by calling the gallivm_init_llvm_targets() function. v2: - Use call_once and remove mutexes and static initializations. - Replace gallivm_init_llvm_{begin,end}() with gallivm_init_llvm_targets(). Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Mathias Fröhlich <[email protected]> Reviewed-by: Emil Velikov <[email protected]> CC: "10.6 11.0" <[email protected]>
* llvmpipe: convert double to long long instead of unsigned long longOded Gabbay2015-09-041-1/+1
| | | | | | | | | | | | | | | | | round(val*dscale) produces a double result, as val and dscale are double. However, LLVMConstInt receives unsigned long long, so there is an implicit conversion from double to unsigned long long. This is an undefined behavior. Therefore, we need to first explicitly convert the round result to long long, and then let the compiler handle conversion from that to unsigned long long. This bug manifests itself in POWER, where all IMM values of -1 are being converted to 0 implicitly, causing a wrong LLVM IR output. Signed-off-by: Oded Gabbay <[email protected]> CC: "10.6 11.0" <[email protected]> Reviewed-by: Tom Stellard <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
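The two-step conversion described above can be sketched in plain C. The helper name is hypothetical. In the patch the input comes from round(val*dscale) and the result is passed to LLVMConstInt, neither of which appears in this sketch.

```c
/*
 * Convert an already-rounded double to unsigned long long by going
 * through long long first. A negative double converted directly to an
 * unsigned type is undefined behaviour in C, whereas signed-to-unsigned
 * integer conversion is well defined (it wraps modulo 2^64).
 * Assumes the value fits in the range of long long.
 */
static unsigned long long
rounded_double_to_ull(double rounded)
{
   long long as_signed = (long long)rounded;
   return (unsigned long long)as_signed;
}
```

For an input of -1.0 this yields the all-ones bit pattern, which is what an immediate of -1 needs, rather than whatever a given platform happens to produce for the undefined direct conversion.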
* gallivm: Fix GCC unused-variable warning.Vinson Lee2015-07-311-2/+1
| | | | | | | | | | lp_bld_tgsi_soa.c: In function 'lp_emit_immediate_soa': lp_bld_tgsi_soa.c:3065:18: warning: unused variable 'size' [-Wunused-variable] const uint size = imm->Immediate.NrTokens - 1; ^ Signed-off-by: Vinson Lee <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* gallivm: add LLVMAttribute parameter to lp_build_intrinsicMarek Olšák2015-07-315-10/+15
| | | | | | This will help remove some duplicated code from radeon. Reviewed-by: Dave Airlie <[email protected]>
* gallivm: Fix profile build.Jose Fonseca2015-07-231-1/+1
|
* gallivm: Add ifdefs so raw_debug_stream is only defined when usedTom Stellard2015-07-231-0/+2
| | | | | | | Its only use is to implement a custom version of LLVMDumpValue on some Windows and embedded platforms. Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: Don't use raw_debug_ostream for dissasemblingTom Stellard2015-07-231-14/+13
| | | | | | | | All LLVM API calls that require an ostream object have been removed from the disassemble() function, so we don't need to use this class to wrap _debug_printf(); we can just call this function directly. Reviewed-by: Jose Fonseca <[email protected]>
* gallium: replace INLINE with inlineIlia Mirkin2015-07-2111-41/+41
| | | | | | | | | | | | | | | | Generated by running: git grep -l INLINE src/gallium/ | xargs sed -i 's/\bINLINE\b/inline/g' git grep -l INLINE src/mesa/state_tracker/ | xargs sed -i 's/\bINLINE\b/inline/g' git checkout src/gallium/state_trackers/clover/Doxyfile and manual edits to src/gallium/include/pipe/p_compiler.h src/gallium/README.portability to remove mentions of the inline define. Signed-off-by: Ilia Mirkin <[email protected]> Acked-by: Marek Olšák <[email protected]>
* gallivm: Initialize LLVM Modules's DataLayout to an empty string.Tom Stellard2015-07-201-5/+23
| | | | | | | | | | | | | | This fixes crashes in llvmpipe with LLVM 3.8 and also some piglit tests on radeonsi that use the draw module. This is just a temporary solution. The correct solution will require creating a TargetMachine during gallivm initialization and pulling the DataLayout from there. This will be a somewhat invasive change, and it will need to be validated on multiple LLVM versions. https://llvm.org/bugs/show_bug.cgi?id=24172 Reviewed-by: Roland Scheidegger <[email protected]>