aboutsummaryrefslogtreecommitdiffstats
path: root/src/gallium/auxiliary/gallivm
Commit message (Collapse)AuthorAgeFilesLines
* llvmpipe: fix fp64 inputs to geom shader.Dave Airlie2015-12-091-4/+12
| | | | | | | | | | | This fixes the fetching of fp64 inputs to the geometry shader, this fixes the recently posted piglit's arb_gpu_shader_fp64/execution/gs-fs-vs-double-array.shader_test arb_vertex_attrib_64bit/execution/gs-fs-vs-attrib-double-array.shader_test Reviewed-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* gallium/auxiliary: Sanitize NULL checks into canonical formEdward O'Callaghan2015-12-061-1/+1
| | | | | | | | | | Use NULL tests of the form `if (ptr)' or `if (!ptr)'. They do not depend on the definition of the symbol NULL. Further, they provide the opportunity for the accidental assignment, are clear and succinct. Signed-off-by: Edward O'Callaghan <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* gallium/auxiliary: Trivial code style cleanupEdward O'Callaghan2015-12-062-11/+11
| | | | | Signed-off-by: Edward O'Callaghan <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* gallivm: use sampler index 0 for texel fetchesRoland Scheidegger2015-11-201-1/+6
| | | | | | | | | | | | | texel fetches don't use any samplers. Previously we just set the same number for both texture and sampler unit (as per "ordinary" gl style sampling where the numbers are always the same) however this would trigger some assertions checking that the sampler index isn't over PIPE_MAX_SAMPLERS limit elsewhere with d3d10, so just set to 0. (Fixing the assertion instead isn't really an option, the sampler isn't really used but might still pass an out-of-bound pointer around and even copy some things from it.) Reviewed-by: Jose Fonseca <[email protected]>
* llvmpipe: disable VSX in ppc due to LLVM PPC bugOded Gabbay2015-11-181-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | This patch disables the use of VSX instructions, as they cause some piglit tests to fail For more details, see: https://llvm.org/bugs/show_bug.cgi?id=25503#c7 With this patch, ppc64le reaches parity with x86-64 as far as piglit test suite is concerned. v2: - Added check that we have at least LLVM 3.4 - Added the LLVM bug URL as a comment in the code v3: - Only disable VSX if Altivec is supported, because if Altivec support is missing, then VSX support doesn't exist anyway. - Change original patch description. Signed-off-by: Oded Gabbay <[email protected]> Cc: "11.0" <[email protected]> Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: fix sampling for s3tc srgb formats when using texture cacheRoland Scheidegger2015-11-041-1/+3
| | | | | | | | | This actually stored the values as 8bit linear values in the cache, then did another srgb->linear conversion... We don't want to do the former (decoding 8bit srgb values to 8bit linear completely defeats the purpose of srgb in the first place), so just decode to 8bit srgb. Fixes piglit.spec.ext_texture_srgb.texwrap formats-s3tc tests.
* llvmpipe: add cache for compressed texturesRoland Scheidegger2015-11-0410-7/+615
| | | | | | | | | | | | | | | | | | | | | | compressed textures are very slow because decoding is rather complex (and because there's no jit code code to decode them too for non-technical reasons). Thus, add some texture cache which holds a couple of decoded blocks. Right now this handles only s3tc format albeit it could be extended to work with other formats rather trivially as long as the result of decode fits into 32bit per texel (ideally, rgtc actually would decode to more than 8 bits per channel, but even then making it work for it shouldn't be too difficult). This can improve performance noticeably but don't expect wonders (uncompressed is unsurprisingly still faster). It's also possible it might be slower in some cases (using nearest filtering for example or if there's otherwise not many cache hits, the cache is only direct mapped which isn't great). Also, actual decode of a block relies on util code, thus even though always full blocks are decoded it is done texel by texel - this could obviously benefit greatly from simd-optimized code decoding full blocks at once... Note the cache is per (raster) thread, and currently only used for fragment shaders. Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: disable f16c when not using AVXRoland Scheidegger2015-10-261-0/+3
| | | | | | | | | | | | | | | | | f16c intrinsic can only be emitted when AVX is used. So when we disable AVX due to forcing 128bit vectors we must not use this intrinsic (depending on llvm version, this worked previously because llvm used AVX even when we didn't tell it to, however I've seen this fail with llvm 3.3 since 718249843b915decf8fccec92e466ac1a6219934 which seems to have the side effect of disabling avx in llvm albeit it only touches sse flags really, but with ea421e919ae6e72e1319fb205c42a6fb53ca2f82 it's now really disabled). Albeit being able to use AVX with 128bit vectors also would have its uses, the code as is really was meant to emulate jit code creation for less capable cpus. v2: add some (ifdefed out) missing de-featuring options for simulating less capable cpus. Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: fix tex offsets with mirror repeat linearRoland Scheidegger2015-10-241-4/+5
| | | | | | | | Can't see why anyone would ever want to use this, but it was clearly broken. This fixes the piglit texwrap offset test using this combination. Reviewed-by: Dave Airlie <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: fix sampling with texture offsets in SoA pathRoland Scheidegger2015-10-241-3/+8
| | | | | | | | | | | | | When using nearest filtering and clamp / clamp to edge wrapping results could be wrong for negative offsets. Fix this by adding the offset before doing the conversion to int coords (could also use floor instead of trunc int conversion but probably more complex on "typical" cpu). This fixes the piglit texwrap offset failures with this filter/wrap combo (which only leaves the linear/mirror repeat combination broken). Reviewed-by: Dave Airlie <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: Explicitly disable unsupported CPU features.Jose Fonseca2015-10-231-38/+34
| | | | | | Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92214 CC: "10.6 11.0" <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: Translate all util_cpu_caps bits to LLVM attributes.Jose Fonseca2015-10-221-2/+34
| | | | | | | | | | | | | | This should prevent disparity between features Mesa and LLVM believe are supported by the CPU. http://lists.freedesktop.org/archives/mesa-dev/2015-October/thread.html#96990 Tested on a i7-3720QM w/ LLVM 3.3 and 3.6. v2: Increase SmallVector initial size as suggested by Gustaw Smolarczyk. Reviewed-by: Roland Scheidegger <[email protected]> CC: "10.6 11.0" <[email protected]>
* gallium: add PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINTMarek Olšák2015-10-201-0/+2
| | | | | | | | | | | | | | This avoids a serious r600g bug leading to a GPU hang. The chances this bug will get fixed are pretty low now. I deeply regret listening to others and not pushing this patch, leaving other users with a GPU-crashing driver. Yes, it should be fixed in the compiler and it's ugly, but users couldn't care less about that. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86720 Cc: 11.0 10.6 <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* gallivm: implement the correct version of LRPMarek Olšák2015-10-171-6/+13
| | | | | | | | | The previous version has precision issues. This can be a problem with tessellation. Sadly, I can't find the article where I read it anymore. I'm not sure if the unsafe-fp-math flag would be enough to revert this. v2: added the comment
* gallivm: set correct opcode info from unary/binary/ternary emitsMarek Olšák2015-10-171-3/+6
| | | | | | | | and clear the emit_data structure. The new radeonsi min/max opcode implementation requires this. (it looks good according to Roland S.)
* gallivm: Allow drivers and state trackers to initialize gallivm LLVM targets v2Tom Stellard2015-10-022-7/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Drivers and state trackers that use LLVM for generating code, must register the targets they use with LLVM's global TargetRegistry. The TargetRegistry is not thread-safe, so all targets must be added to the registry before it can be queried for target information. When drivers and state trackers initialize their own targets, they need a way to force gallivm to initialize its targets at the same time. Otherwise, there can be a race condition in some multi-threaded applications (e.g. glx-multihreaded-shader-compile in piglit), when one thread creates a context for a driver that uses LLVM (e.g. radeonsi) and another thread creates a gallivm context (glxContextCreate does this). The race happens when the driver thread initializes its LLVM targets and then starts using the registry before the gallivm thread has a chance to register its targets. This patch allows users to force gallivm to register its targets by calling the gallivm_init_llvm_targets() function. v2: - Use call_once and remove mutexes and static initializations. - Replace gallivm_init_llvm_{begin,end}() with gallivm_init_llvm_targets(). Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Mathias Fröhlich <[email protected]> Reviewed-by: Emil Velikov <[email protected]> CC: "10.6 11.0" <[email protected]>
* llvmpipe: convert double to long long instead of unsigned long longOded Gabbay2015-09-041-1/+1
| | | | | | | | | | | | | | | | | round(val*dscale) produces a double result, as val and dscale are double. However, LLVMConstInt receives unsigned long long, so there is an implicit conversion from double to unsigned long long. This is an undefined behavior. Therefore, we need to first explicitly convert the round result to long long, and then let the compiler handle conversion from that to unsigned long long. This bug manifests itself in POWER, where all IMM values of -1 are being converted to 0 implicitly, causing a wrong LLVM IR output. Signed-off-by: Oded Gabbay <[email protected]> CC: "10.6 11.0" <[email protected]> Reviewed-by: Tom Stellard <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: Fix GCC unused-variable warning.Vinson Lee2015-07-311-2/+1
| | | | | | | | | | lp_bld_tgsi_soa.c: In function 'lp_emit_immediate_soa': lp_bld_tgsi_soa.c:3065:18: warning: unused variable 'size' [-Wunused-variable] const uint size = imm->Immediate.NrTokens - 1; ^ Signed-off-by: Vinson Lee <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* gallivm: add LLVMAttribute parameter to lp_build_intrinsicMarek Olšák2015-07-315-10/+15
| | | | | | This will help remove some duplicated code from radeon. Reviewed-by: Dave Airlie <[email protected]>
* gallivm: Fix profile build.Jose Fonseca2015-07-231-1/+1
|
* gallivm: Add ifdefs so raw_debug_stream is only defined when usedTom Stellard2015-07-231-0/+2
| | | | | | | Its only use is to implement a custom version of LLVMDumpValue on some Windows and embedded platforms. Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: Don't use raw_debug_ostream for dissasemblingTom Stellard2015-07-231-14/+13
| | | | | | | | All LLVM API calls that require an ostream object have been removed from the disassemble() function, so we don't need to use this class to wrap _debug_printf() we can just call this function directly. Reviewed-by: Jose Fonseca <[email protected]>
* gallium: replace INLINE with inlineIlia Mirkin2015-07-2111-41/+41
| | | | | | | | | | | | | | | | Generated by running: git grep -l INLINE src/gallium/ | xargs sed -i 's/\bINLINE\b/inline/g' git grep -l INLINE src/mesa/state_tracker/ | xargs sed -i 's/\bINLINE\b/inline/g' git checkout src/gallium/state_trackers/clover/Doxyfile and manual edits to src/gallium/include/pipe/p_compiler.h src/gallium/README.portability to remove mentions of the inline define. Signed-off-by: Ilia Mirkin <[email protected]> Acked-by: Marek Olšák <[email protected]>
* gallivm: Initialize LLVM Modules's DataLayout to an empty string.Tom Stellard2015-07-201-5/+23
| | | | | | | | | | | | | | This fixes crashes in llvmpipe with LLVM 3.8 and also some piglit tests on radeonsi that use the draw module. This is just a temporary solution. The correct solution will require creating a TargetMachine during gallivm initialization and pulling the DataLayout from there. This will be a somewhat invasive change, and it will need to be validatated on multiple LLVM versions. https://llvm.org/bugs/show_bug.cgi?id=24172 Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: fix lp_build_compare_extRoland Scheidegger2015-07-062-1/+4
| | | | | | | | | | | | The expansion should always be to the same width as the input arguments no matter what, since these functions should work with any bit width of the arguments (the sext is a no-op on any sane simd architecture). Thus, fix the caller expecting differently. This fixes https://bugs.freedesktop.org/show_bug.cgi?id=91222 Tested-by: Vinson Lee <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: add fp64 support. (v2.1)Dave Airlie2015-07-018-31/+553
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds support for ARB_gpu_shader_fp64 and ARB_vertex_attrib_64bit to llvmpipe. Two things that don't mix well are SoA and doubles, see emit_fetch_double, and emit_store_double_chan in this. I've also had to split emit_data.chan, to add src_chan, which can be different for doubles. It handles indirect double fetches from temps, inputs, constants and immediates. It doesn't handle double stores to indirects, however it appears the mesa/st doesn't currently emit these, it always does UARL/MOV combos, which will work fine. tested with piglit, no regressions, all the fp64 tests seem to pass. v2: switch to using shuffles for fetch/store (Roland) assert on indirect double stores - mesa/st never emits these (it uses MOV) fix indirect temp/input/constant/immediates (Roland) typos/formatting fixes (Roland) v2.1: cleanup some long lines, emit_store_double_chan cleanups. Reviewed-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* softpipe,llvmpipe: fix PIPE_SHADER_CAP_MAX_INPUTS valueMarek Olšák2015-06-251-1/+1
| | | | | | | | | | PIPE_MAX_SHADER_INPUTS was recently bumped to 80 because of tessellation. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91099 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91101 Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* draw/gallivm: add invocation ID support for llvmpipe.Dave Airlie2015-06-232-0/+6
| | | | | | | This extends the draw code to add support for invocations. Reviewed-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* llvmpipe: Truncate the binned constants to max const buffer size.Jose Fonseca2015-06-191-1/+5
| | | | | | | | | Tested with Ilia Mirkin's gzdoom.trace and "arb_uniform_buffer_object-maxuniformblocksize fsexceed" piglit test without my earlier fix to fail linkage when UBO exceeds GL_MAX_UNIFORM_BLOCK_SIZE. Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: Only build lp_profile() body when PROFILE is definedTom Stellard2015-06-121-1/+1
| | | | | | | The only use of lp_profile() is wrapped in #if defined(PROFILE), so there is no reason to build it unless this macro is defined. Reviewed-by: Jose Fonseca <[email protected]>
* tgsi/ureg: don't emit in/out arrays if drivers don't support ranged declarationsMarek Olšák2015-06-051-0/+1
| | | | | | Softpipe, llvmpipe, r300g, and radeonsi pass tests. Other drivers need testing. Freedreno and nv30 are definitely broken. Other drivers seem to be alright.
* gallivm: silence unused var warnings for non-debug buildBrian Paul2015-06-011-0/+1
| | | | Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: Remove stub disassemblerSymbolLookupCB.Jose Fonseca2015-06-011-13/+1
| | | | | | | | | It's incompletete -- it wasn't filling ReferenceType so it was causing garbagge on the disassembly. Furthermore it seems impossible to get the jump information through this interface. The solution for function size problem is to effectively book-keep the machine code start and end address while JIT'ing.
* gallivm: make sampling more robust when the sampler setup is bogusRoland Scheidegger2015-05-291-6/+32
| | | | | | | | | Pure integer formats cannot be sampled with linear tex / mip filters. In GL such a setup would make the texture incomplete. We shouldn't rely on the state tracker though to filter that out, just return all zeros instead of dying in the lerp. Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: Use the LLVM's C disassembly interface.Jose Fonseca2015-05-291-223/+37
| | | | | | | | | | | It doesn't do everything we want. In particular it doesn't allow to detect jumps or return opcodes. Currently we detect the x86's RET opcode. Even though it's worse for LLVM 3.3, it's an improvement for LLVM 3.7, which was totally busted. Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: Disable frame pointer omission on LLVM 3.7.Jose Fonseca2015-05-291-0/+10
| | | | Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: Workaround LLVM PR23628.Jose Fonseca2015-05-281-0/+11
| | | | | | | | | | Temporarily undefine DEBUG macro while including LLVM C++ headers, leveraging the push/pop_macro pragmas, which are supported both by GCC and MSVC. https://bugs.freedesktop.org/show_bug.cgi?id=90621 Trivial.
* gallivm: Do not use NoFramePointerElim with LLVM 3.7.Vinson Lee2015-05-272-0/+4
| | | | | | | | | TargetOptions::NoFramePointerElim was removed in llvm-3.7.0svn r238244 "Remove NoFramePointerElim and NoFramePointerElimOverride from TargetOptions and remove ExecutionEngine's dependence on CodeGen. NFC." Signed-off-by: Vinson Lee <[email protected]> Reviewed-by: Tom Stellard <[email protected]>
* gallium: remove TGSI_SAT_MINUS_PLUS_ONEMarek Olšák2015-05-202-35/+2
| | | | | | | | It's a remnant of some old NV extension. Unused. I also have a patch that removes predicates if anyone is interested. Reviewed-by: Roland Scheidegger <[email protected]>
* llvmpipe: enable ARB_texture_viewRoland Scheidegger2015-05-131-1/+1
| | | | | | | | | | | | | | | All the functionality was pretty much there, just not tested. Trivially fix up the missing pieces (take target info from view not resource), and add some missing bits for cubes. Also add some minimal debug validation to detect uninitialized target values in the view... 49 new piglits, 47 pass, 2 fail (both related to fake multisampling, not texture_view itself). No other piglit changes. v2: move sampler view validation to sampler view creation, update docs. Reviewed-by: Brian Paul <[email protected]>
* gallivm: Fix build against LLVM 3.7 SVN r235265Nick Sarnie2015-04-202-2/+2
| | | | | | | | | LLVM removed JITEmitDebugInfo from TargetOptions since they weren't used v2: Be consistent with the LLVM version check (Aaron Watry) Signed-off-by: Nick Sarnie <[email protected]> Reviewed-and-Tested-by: Michel Dänzer <[email protected]>
* gallivm: Fix build since llvm-3.7.0svn r234495Nick Sarnie2015-04-101-4/+0
| | | | | | | | Revert 50e9fa2ed69cb5f76f66231976ea789c0091a64d as LLVM reverted their change. Signed-off-by: Nick Sarnie <[email protected]> Reviewed-by: Jan Vesely <[email protected]>
* gallivm: Fix build since llvm-3.7.0svn r234460.Vinson Lee2015-04-091-0/+4
| | | | | | Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89963 Signed-off-by: Vinson Lee <[email protected]> Reviewed-by: Tom Stellard <[email protected]>
* gallivm: don't use control flow when doing indirect constant buffer lookupsRoland Scheidegger2015-04-091-57/+36
| | | | | | | | | | | | | | | | | | | | | | | llvm goes crazy when doing that, using way more memory and time, though there's probably more to it - this points to a very much similar issue as fixed in 8a9f5ecdb116d0449d63f7b94efbfa8b205d826f. In any case I've seen a quite plain looking vertex shader with just ~50 simple tgsi instructions (but with a dozen or so such indirect constant buffer lookups) go from a terribly high ~440ms compile time (consuming 25MB of memory in the process) down to a still awful ~230ms and 13MB with this fix (with llvm 3.3), so there's still obvious improvements possible (but I have no clue why it's so slow...). The resulting shader is most likely also faster (certainly seemed so though I don't have any hard numbers as it may have been influenced by compile times) since generally fetching constants outside the buffer range is most likely an app error (that is we expect all indices to be valid). It is possible this fixes some mysterious vertex shader slowdowns we've seen ever since we are conforming to newer apis at least partially (the main draw loop also has similar looking conditionals which we probably could do without - if not for the fetch at least for the additional elts condition.) v2: use static vars for the fake bufs, minor code cleanups Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: (trivial) fix the logic deciding if function call should be used...Roland Scheidegger2015-04-011-3/+1
| | | | | Copy and paste bug with the img filter decision. Since there's only 2 different filters anyway just drop this bit.
* gallivm: do some hack heuristic to disable texture functionsRoland Scheidegger2015-04-011-0/+40
| | | | | | | | | | | | | | We've seen some cases where performance can hurt quite a bit. Technically, the more simple the function the more overhead there is for using a function for this (and the less benefits this provides). Hence don't do this if we expect the generated code to be simple. There's an even more important reason why this hurts performance, which is shaders reusing the same unit with some of the same inputs, as llvm cannot figure out the calculations are the same if they are performned in the function (even just reusing the same unit without any input being the same provides such optimization opportunities though not very much). This is something which would need to be handled by IPO passes however.
* gallivm: implement TG4 for ARB_texture_gatherRoland Scheidegger2015-03-312-40/+133
| | | | | | | | | | | | | | | | This is quite trivial, essentially just follow all the same code you'd use with linear min/mag (and no mip) filter, then just skip the filtering after looking up the texels in favor of direct assignment of the right channel to the result. (This is though not true for the multi-offset version if we'd want to support it - for this would probably need to do something along the lines of 4x nearest sampling due to the necessity of doing coord wrapping individually per texel.) Supports multi-channel formats. From the SM5 gather cap bit, should support non-constant offsets, plus shadow comparisons (the former untested), but not component selection (should be easy to implement but all this stuff is not really exposable anyway for now). Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: add gather support to sampler interfaceRoland Scheidegger2015-03-313-21/+34
| | | | | | Luckily thanks to the revamped interface this is a lot less work now... Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: simplify sampler interfaceRoland Scheidegger2015-03-314-220/+205
| | | | | | | | | | | | | | | | | | This has got a bit out of control with more and more parameters added. Worse, whenever something in there changes all callees have to be updated for that, even though they don't really do much with any parameter in there except pass it on to the actual sampling function. Hence simply put almost everything into a struct. Also instead of relying on some arguments being NULL, be explicit and set this in a key (which is just reused for function generation for simplicity). (The code still relies on them being NULL in the end for now.) Technically there is a minimal functional change here for shadow sampling: if shadow sampling is done is now determined explicitly by the texture function (either sample_c or the gl-style tex func inherit this from target) instead of the static texture state. These two should always match, however. Otherwise, it should generate all the same code. Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: Fix build against LLVM 3.7 SVN r233648Michel Dänzer2015-03-311-0/+5
| | | | Reviewed-by: Jose Fonseca <[email protected]>