path: root/src/gallium/auxiliary
Commit message / Author / Age / Files / Lines
* util: Do not use __builtin_clrsb with Intel C++ Compiler.Vinson Lee2014-05-301-1/+1
| | | | | | | | | | | This patch fixes this build error with icc 14.0.2. In file included from state_tracker/st_glsl_to_tgsi.cpp(63): ../../src/gallium/auxiliary/util/u_math.h(583): error: identifier "__builtin_clrsb" is undefined return 31 - __builtin_clrsb(i); ^ Signed-off-by: Vinson Lee <[email protected]>
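A guard of roughly the following shape avoids the error quoted above. This is only a sketch: the helper names are hypothetical, the exact preprocessor condition used in u_math.h is an assumption, and a 32-bit int is assumed. The fallback branch computes the same result without the builtin.

```c
#include <assert.h>
#include <limits.h>

/* GCC 4.7+ provides __builtin_clrsb; icc defines __GNUC__ but lacks it. */
#if defined(__GNUC__) && ((__GNUC__ * 100 + __GNUC_MINOR__) >= 407) && \
    !defined(__INTEL_COMPILER)
#define EXAMPLE_HAVE_CLRSB 1
#else
#define EXAMPLE_HAVE_CLRSB 0
#endif

#if !EXAMPLE_HAVE_CLRSB
/* Index of the highest set bit plus one; 0 when v is 0. */
static unsigned
example_last_bit(unsigned v)
{
   unsigned n = 0;
   while (v) {
      n++;
      v >>= 1;
   }
   return n;
}
#endif

/* Hypothetical stand-in for the signed "last bit" helper. */
static unsigned
example_last_bit_signed(int i)
{
   assert(sizeof(int) * CHAR_BIT == 32);
#if EXAMPLE_HAVE_CLRSB
   return 31 - __builtin_clrsb(i);
#else
   /* For negative values, the position of the highest 0 bit is what counts. */
   return i >= 0 ? example_last_bit((unsigned)i)
                 : example_last_bit(~(unsigned)i);
#endif
}
```

Both branches agree on every input, so the compiler check only decides which implementation is used, not the result.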
* gallivm: Disable workaround for PR12833 on LLVM 3.2+.José Fonseca2014-05-231-2/+2
| | | | Fixed upstream.
* gallivm: Support MCJIT on Windows.José Fonseca2014-05-231-0/+9
| | | | | | | It works fine, though it requires using ELF objects. With this change there is nothing preventing us from switching exclusively to MCJIT, everywhere. It's still off though.
* tgsi: add GS_INVOCATIONS to property names arrayIlia Mirkin2014-05-211-1/+2
| | | | | | | | | | In commit 4be146b1, I neglected to add the new property to the strings array. This leads to the string '(null)' to be printed instead when converting a GS shader to text. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.2" <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
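This kind of mismatch between an enum and its string table can be caught at build time. The sketch below is not the TGSI code: the enum, the table, and the lookup function are hypothetical names, and it assumes a C11 compiler for _Static_assert.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical property enum; the real TGSI list is longer. */
enum example_property {
   EXAMPLE_PROPERTY_GS_INPUT_PRIM,
   EXAMPLE_PROPERTY_GS_OUTPUT_PRIM,
   EXAMPLE_PROPERTY_GS_INVOCATIONS,
   EXAMPLE_PROPERTY_COUNT
};

/* Unsized on purpose, so sizeof reflects the number of initializers. */
static const char *const example_property_names[] = {
   "GS_INPUT_PRIMITIVE",
   "GS_OUTPUT_PRIMITIVE",
   "GS_INVOCATIONS"
};

/* The build fails if an enumerator is added without a matching string. */
_Static_assert(sizeof(example_property_names) /
               sizeof(example_property_names[0]) == EXAMPLE_PROPERTY_COUNT,
               "example_property_names is out of sync with the enum");

static const char *
example_property_name(enum example_property p)
{
   assert((unsigned)p < EXAMPLE_PROPERTY_COUNT);
   return example_property_names[p];
}
```

Had the table been declared with an explicit size instead, a missing entry would silently become a NULL pointer, which is what typically ends up printed as '(null)'.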
* llvmpipe: do IR counting for shader cache management after optimization.Roland Scheidegger2014-05-192-2/+20
| | | | | | | | | | | 2ea923cf571235dfe573c35c3f0d90f632bd86d8 had the side effect of IR counting now being done after IR optimization instead of before. Some quick analysis shows that there's roughly 1.5 times more IR instructions before optimization than after, hence the effective shader cache size got quite a bit smaller. Could counter this with an increase of the instruction limit but it probably makes more sense to count them after optimizations, so move that code. Reviewed-by: Brian Paul <[email protected]>
* gallivm: (trivial) fix compilation with llvm 3.1, 3.2Roland Scheidegger2014-05-171-0/+4
| | | | | | I actually checked the getModuleIdentifier() function exists with 3.1 but missed that the file moved... This fixes https://bugs.freedesktop.org/show_bug.cgi?id=78803
* gallivm: print out how long it takes to optimize shader IR.Roland Scheidegger2014-05-163-1/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Enabled with GALLIVM_DEBUG=perf (which up to now was only used to print warnings for unoptimized code). While some unexpectedly long shader compile times for some shaders were fixed with 8a9f5ecdb116d0449d63f7b94efbfa8b205d826f this should help recognize such problems in the future. For now though only available in debug builds (which are not always suitable for such analysis). And since this uses system time, it might not be all that accurate (even llvmpipe's own rasterization threads might be running at the same time, or just other tasks). (llvmpipe also has LP_DEBUG=counters but this only gives an average per shader and the total time for all shaders.) This prints information like this: optimizing module fs17_variant0 took 1 msec optimizing module setup_variant_0 took 0 msec optimizing module draw_llvm_vs_variant0 took 9 msec optimizing module draw_llvm_vs_variant0 took 12 msec optimizing module fs17_variant1 took 2 msec v2: rebase for recent gallivm compilation changes, and print time for whole modules instead of functions (otherwise it would be very spammy since it would include all trivial inline sse2 functions), using the shiny new module names, prying them off LLVM using new helper (not available through C bindings). Per function timings, while possibly giving more information (if there'd be a problem only in for instance the partial not the whole function), don't seem all that useful for now. Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: give more verbose names to modulesRoland Scheidegger2014-05-163-10/+17
| | | | | | | | | When we had just one module "gallivm" was an appropriate name. But now we have modules containing all functions for a particular variant, so give it a corresponding name (this is really just for helping debugging). Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: remove optimization workaround when not having sse 4.1Roland Scheidegger2014-05-161-8/+1
| | | | | | | | | | | This workaround doesn't list any llvm version, but it was introduced 2010-06-10 (e277d5c1f6b2c5a6d202561e67d2b6821a69ecc4). It is unlikely this bug is still present in llvm versions we support (3.1+). There's no specific test listed, but I ran lp_test_arit (which uses the mentioned functions) on llvm 3.1 and 3.3 with sse41 disabled and this pass enabled without issues. Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: remove workaround for reversing optimization pass order.Roland Scheidegger2014-05-161-13/+2
| | | | | | | | | | | | | | | | | 32bit code generation and llvm >= 2.7 used a different optimization pass order - this code was initially introduced (2010-07-23) by 815e79e72c1f4aa849c0ee6103621685b678bc9d, apparently due to buggy code being generated with then brand new llvm versions (which was llvm 2.7 plus pre 2.8 devel). It seems highly likely that whatever this bug was it has been fixed in newer llvm versions, though there's no easy way to test this - the mentioned piglit test was removed years ago, and even if you'd build it I'm sceptical the glsl compiler would still produce the required code to trigger it. I have no idea what a good order of passes is, but just remove the workaround and use the same order everywhere. Reviewed-by: Jose Fonseca <[email protected]>
* draw: better llvm names for shaders for debugging.Roland Scheidegger2014-05-151-6/+12
| | | | | | | | All shaders had the same name. We could probably use some identifier per shader too, but for now only use the variant number. Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: only fetch pointers to constant buffers onceRoland Scheidegger2014-05-142-37/+65
| | | | | | | | | | | | | | | | | In 1d35f77228ad540a551a8e09e062b764a6e31f5e support for multiple constant buffers was introduced. This meant we had another indirection, and we did resolve the indirection for each constant buffer access. This looks very reasonable since llvm can figure out if it's the same pointer, however it turns out that this can cause llvm compilation time to go through the roof and beyond (I've seen cases in excess of factor 100, e.g. from 50 ms to more than 10 seconds (!)), with all the additional time spent in IR optimization passes (and in the end all of it in DominatorTree::dominate()). I've been unable to narrow it down a bit more (only some shaders seem affected, seemingly without much correlation to overall shader complexity or constant usage) but it is easily avoidable by doing the buffer lookups themselves just once (at constant buffer declaration time). Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: fix output stream flushing in error case for disassembly.Roland Scheidegger2014-05-141-0/+5
| | | | | When there's an error, also need to flush the stream, otherwise an assertion is hit (meaning you don't actually see the error either).
* tgsi: support parsing texture offsets from text tgsi shadersIlia Mirkin2014-05-141-5/+48
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: Remove lp_func_delete_body.José Fonseca2014-05-143-15/+0
| | | | | | | Not necessary, now that we will free the whole module (hence all function bodies) immediately after compiling. Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: Remove gallivm_free_function.José Fonseca2014-05-142-23/+0
| | | | | | Unused. Deprecated by gallivm_free_ir(). Reviewed-by: Roland Scheidegger <[email protected]>
* draw: Delete unneeded LLVM stuff earlier.Frank Henigman2014-05-141-15/+4
| | | | | | | | | | Free up unneeded LLVM stuff immediately after generating vertex shader code. Saves about 500K per shader. v2: Don't bother calling gallivm_free_function (Jose) Signed-off-by: José Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: Separate freeing LLVM intermediate data from freeing final code.Frank Henigman2014-05-142-7/+22
| | | | | | | | | | | | Split free_gallivm_state() into two steps. First step is gallivm_free_ir() which cleans up the LLVM scaffolding used to generate code while preserving the code itself. Second step is gallivm_free_code() to free the memory occupied by the code. v2: s/gallivm_teardown/gallivm_free_ir/ (Jose) Signed-off-by: José Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: One code memory pool with deferred free.Frank Henigman2014-05-144-1/+283
| | | | | | | | | | | | | | | | Provide a JITMemoryManager derivative which puts all generated code into one memory pool instead of creating a new one each time code is generated. This saves significant memory per shader as the pool size is 512K and a small shader occupies just several K. This memory manager also defers freeing generated code until you tell it to do so, making it possible to destroy the LLVM engine while keeping the code, thus enabling future memory savings. v2: Fix compilation errors with LLVM 3.4 (Jose) Signed-off-by: José Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: Run passes per module, not per function.José Fonseca2014-05-141-28/+19
| | | | | | This is how it is meant to be done nowadays. Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: Use LLVM global context.José Fonseca2014-05-141-23/+17
| | | | | | | | | | | I saw that LLVM internally uses its global context for some things, even when we use our own. Given ours is also global, might as well use LLVM's. However, separate contexts can still be enabled with a simple source code modification, for when the need/benefit arises. Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: Stop using module providers.José Fonseca2014-05-142-27/+7
| | | | | | | Nowadays LLVMModuleProviderRef is just an alias for LLVMModuleRef, so its use just causes unnecessary confusion. Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm,draw,llvmpipe: Remove support for versions of LLVM prior to 3.1.José Fonseca2014-05-1414-520/+20
| | | | | | | Older versions haven't been tested and probably don't work anyway. But more importantly, code supporting them is hindering further work. Reviewed-by: Roland Scheidegger <[email protected]>
* pipe-loader: Don't destroy the winsys in the sw loaderTom Stellard2014-05-091-3/+0
| | | | | | | | | | | | | The screen takes ownership of the winsys, and is responsible for destroying it. Users of pipe-loader should make sure they destroy any screens they've created to avoid memory leaks. This fixes a crash in clover introduced by ce6c17c0833032e91a2d1b34f9eb80c738a854a2 where the pipe-loader was destroying the winsys while a screen was still using it. Cc: "10.2" <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* draw: do not use draw_get_option_use_llvm() inside draw execution pathsRoland Scheidegger2014-05-085-12/+12
| | | | | | | | | | | | | | 1c73e919a4b4dd79166d0633075990056f27fd28 made it possible to not allocate the tgsi machine if llvm was used. However, draw_get_option_use_llvm() is not reliable after draw context creation, since drivers can explicitly request a non-llvm draw context even if draw_get_option_use_llvm() would return true (and softpipe does just that) which leads to crashes. Thus use draw->llvm to determine if we're using llvm or not instead (and make draw->llvm available even if HAVE_LLVM is false so we don't have to put even more ifdefs). Cc: "10.2" <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* tgsi: add missing switch cases in tgsi_exec_get_shader_param()Brian Paul2014-05-071-2/+8
| | | | | | | | Add cases for PIPE_SHADER_CAP_MAX_SAMPLER_VIEWS and PIPE_SHADER_CAP_PREFERRED_IR. Remove default switch case so we learn of missing cases at compile time. Reviewed-by: José Fonseca <[email protected]>
* gallivm: add PIPE_SHADER_CAP_PREFERRED_IR switch case, remove defaultBrian Paul2014-05-071-2/+6
| | | | | | | | Return PIPE_SHADER_IR_TGSI for the PIPE_SHADER_CAP_PREFERRED_IR query. Remove default switch case so we learn of missing switch cases at compile time. Reviewed-by: José Fonseca <[email protected]>
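The effect of dropping the default label, as in the two commits above, can be sketched as follows. The enum, the function, and the returned values are hypothetical placeholders, not the real PIPE_SHADER_CAP_* handling. With GCC or Clang, -Wswitch (enabled by -Wall) reports any enumerator that the switch does not handle, but only when no default label is present.

```c
#include <assert.h>

/* Hypothetical cap enum; stands in for the PIPE_SHADER_CAP_* list. */
enum example_shader_cap {
   EXAMPLE_CAP_MAX_INSTRUCTIONS,
   EXAMPLE_CAP_MAX_SAMPLER_VIEWS,
   EXAMPLE_CAP_PREFERRED_IR
};

static int
example_get_shader_param(enum example_shader_cap cap)
{
   /* No default label: adding an enumerator above without adding a case
    * here produces a compile-time warning. */
   switch (cap) {
   case EXAMPLE_CAP_MAX_INSTRUCTIONS:
      return 4096; /* made-up limit */
   case EXAMPLE_CAP_MAX_SAMPLER_VIEWS:
      return 16;   /* made-up limit */
   case EXAMPLE_CAP_PREFERRED_IR:
      return 0;    /* placeholder id for the preferred IR */
   }
   /* Only reached if cap holds a value outside the enum. */
   assert(!"unhandled shader cap");
   return 0;
}
```

The trailing assert keeps the function well defined for out-of-range values without reintroducing a default label.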
* util: Don't attempt to redefine INFINITY/NAN on VS 2013.José Fonseca2014-05-021-0/+5
| | | | | | | These are now provided by VS. Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* draw: Prevent signed/unsigned comparisons.José Fonseca2014-05-021-1/+1
| | | | | Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* util/u_debug_flush: Use util_snprintf.José Fonseca2014-05-021-2/+3
| | | | | Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: fix 2 leaks in disassembly codeRoland Scheidegger2014-05-011-4/+5
| | | | | | | | | | | | | | | | | | | | | | | | don't leak the MCSubtargetInfo (not really big, was already fixed with llvm master) and TargetMachine (big). While this is only used for debugging the leak is large enough to get you into trouble in some cases. Tested with llvm 3.1 and master. Before (llvm 3.1), GALLIVM_DEBUG=asm glxgears: ==14152== LEAK SUMMARY: ==14152== definitely lost: 105,228 bytes in 20 blocks ==14152== indirectly lost: 347,252 bytes in 261 blocks ==14152== possibly lost: 866,625 bytes in 1,453 blocks ==14152== still reachable: 7,344,677 bytes in 6,494 blocks ==14152== suppressed: 0 bytes in 0 blocks After: ==13799== LEAK SUMMARY: ==13799== definitely lost: 3,108 bytes in 6 blocks ==13799== indirectly lost: 0 bytes in 0 blocks ==13799== possibly lost: 804,143 bytes in 1,429 blocks ==13799== still reachable: 7,314,267 bytes in 6,473 blocks ==13799== suppressed: 0 bytes in 0 blocks Reviewed-by: Brian Paul <[email protected]>
* translate_sse: Use the correct buffer index in this fast path.Andreas Hartmetz2014-04-291-1/+3
| | | | | | | | | | | | It is possible that there are multiple input buffers but only one is relevant for translation. Then there will be only a single translation group, which might need to source data from a buffer index != 0. Fixes wrong vertex shader inputs as observed while debugging with an application and driver combination that requires translation of a vertex attribute in a non-trivial set of attributes and input buffers. Reviewed-by: Ilia Mirkin <[email protected]>
* tgsi: add tgsi_exec support for new bit manipulation opcodesIlia Mirkin2014-04-281-0/+172
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* gallium/util: add helpers for bitfield manipulationIlia Mirkin2014-04-281-0/+31
| | | | | | | | Add bitwise reversing and signed MSB helpers for software implementation of the new TGSI opcodes. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
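A bitwise reverse of a 32-bit word is commonly written as a series of mask-and-shift swaps. The sketch below shows that general technique under hypothetical names; it is not claimed to be the helper added by this commit.

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the bit order of a 32-bit word by swapping progressively
 * larger groups: single bits, pairs, nibbles, bytes, then half-words. */
static uint32_t
example_bitreverse(uint32_t n)
{
   n = ((n >> 1) & 0x55555555u) | ((n & 0x55555555u) << 1);
   n = ((n >> 2) & 0x33333333u) | ((n & 0x33333333u) << 2);
   n = ((n >> 4) & 0x0f0f0f0fu) | ((n & 0x0f0f0f0fu) << 4);
   n = ((n >> 8) & 0x00ff00ffu) | ((n & 0x00ff00ffu) << 8);
   n = (n >> 16) | (n << 16);
   return n;
}

/* Usage: bit 0 moves to bit 31, and reversing twice is the identity. */
static void
example_bitreverse_selfcheck(void)
{
   assert(example_bitreverse(0x00000001u) == 0x80000000u);
   assert(example_bitreverse(example_bitreverse(0x12345678u)) == 0x12345678u);
}
```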
* gallium: add new opcodes for ARB_gs5 bit manipulation supportIlia Mirkin2014-04-281-0/+8
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* util: Fix cross-compiles between endiannessesRichard Sandiford2014-04-282-32/+46
| | | | | | | | | | The old python code used sys.is_big_endian to select between little-endian and big-endian formats, which meant that the build and host endiannesses needed to be the same. This patch instead generates both big- and little- endian layouts, using PIPE_ARCH_BIG_ENDIAN to select between them. Signed-off-by: Richard Sandiford <[email protected]> Signed-off-by: José Fonseca <[email protected]>
* util: Split out channel-parsing Python codeRichard Sandiford2014-04-281-46/+50
| | | | | | | | | | Splits out the code that parses the channel list, so that we can have different lists for little and big endian. There is no change to the generated u_format_table.c. Signed-off-by: Richard Sandiford <[email protected]> Signed-off-by: José Fonseca <[email protected]>
* util: Split out channel-printing Python codeRichard Sandiford2014-04-282-41/+69
| | | | | | | | | | | | Rather than iterate over format.channels and format.swizzles directly, use Python subfunctions that take the channel and swizzle lists as arguments. This allows the channel and swizzle lists to depend on endianness. There is no change to the generated u_format_table.c. Signed-off-by: Richard Sandiford <[email protected]> Signed-off-by: José Fonseca <[email protected]>
* util: Turn inv_swizzle into a global functionRichard Sandiford2014-04-282-11/+11
| | | | | | | | | | | With the big-endian changes, there can be two swizzle orders for each format. This patch turns Format.inv_swizzle() into a global function that takes the swizzle list as a parameter. There is no change to the generated u_format_table.c. Signed-off-by: Richard Sandiford <[email protected]> Signed-off-by: José Fonseca <[email protected]>
* util: Add more query methods to u_format_parse.FormatRichard Sandiford2014-04-283-36/+51
| | | | | | | | | | The main aim is to reduce the number of places that access channels[0], swizzles[0] and swizzles[1] directly. There is no change to the generated u_format_table.c. Signed-off-by: Richard Sandiford <[email protected]> Signed-off-by: José Fonseca <[email protected]>
* gallium: add GS_INVOCATIONS propertyIlia Mirkin2014-04-261-0/+9
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallium: add INVOCATIONID semanticIlia Mirkin2014-04-261-1/+2
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* mesa/st: add support for ARB_sample_shadingIlia Mirkin2014-04-265-0/+32
| | | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* gallium: add basic support for ARB_sample_shadingIlia Mirkin2014-04-261-1/+4
| | | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* pipe-loader: conditionally build and use pipe_loader_sw_probe_driEmil Velikov2014-04-252-0/+6
| | | | | | | | | | | | | The function relies on the sw/dri winsys which is built only when --enable-dri is set. Fixes build issues with the following config ./configure --disable-dri --with-gallium-drivers=svga --enable-xa Issue can be reproduced with any hw gallium driver + st that uses the pipe-loader. Cc: Brian Paul <[email protected]> Reported-by: Brian Paul <[email protected]> Signed-off-by: Emil Velikov <[email protected]>
* gallium/util: use ui[4] instead of ui in union util_colorRoland Scheidegger2014-04-252-20/+20
| | | | | | | | util_color often merely represents a collection of bytes, however it is inconvenient if those bytes can only be accessed as floats/doubles for int formats exceeding 32bits. (Note that rgba8 formats use one uint, not 4 bytes, hence the byte and short members were left as is.)
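A union along these lines illustrates the point. Only the ui[4] member is taken from the commit; the union name, the remaining members, and the packing helper are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Sketch of a color union in the spirit of util_color. */
union example_color {
   uint8_t  ub;     /* scalar byte member, left as is */
   uint16_t us;     /* scalar short member, left as is */
   uint32_t ui[4];  /* four uints: enough for 128-bit integer formats */
   float    f[4];
   double   d[4];
};

/* Pack a two-channel, 32-bit unsigned integer format (64 bits total)
 * without going through the float or double members. */
static void
example_pack_r32g32_uint(union example_color *c, uint32_t r, uint32_t g)
{
   assert(c);
   c->ui[0] = r;
   c->ui[1] = g;
}
```

With a single scalar uint member, the second channel would have had to be written through f[1] or d[], which is the inconvenience the commit message describes.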
* draw/llvm: reduce memory usageZack Rusin2014-04-245-20/+27
| | | | | | | | | | Let's make the draw_get_option_use_llvm function available unconditionally and use it to avoid useless allocations when LLVM paths are active. The TGSI machine is never used when we're using LLVM. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: José Fonseca <[email protected]>
* gallivm: Fix wrong operator in lp_exec_default.José Fonseca2014-04-241-1/+1
| | | | | | Courtesy of MSVC static code analyser. Reviewed-by: Roland Scheidegger <[email protected]>
* util/u_debug: Pass correct size to strncat.José Fonseca2014-04-231-4/+4
| | | | | | | Courtesy of Clang static analyzer. Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
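The commit does not say what the wrong size was. A common mistake, though, is to pass the total buffer size to strncat rather than the space still free. The sketch below, with a hypothetical helper name, shows the usual correct form.

```c
#include <assert.h>
#include <string.h>

/* Append src to the NUL-terminated string in dst without overflowing.
 * dst_size is the total size of the dst buffer in bytes. */
static void
example_append(char *dst, size_t dst_size, const char *src)
{
   size_t used = strlen(dst);
   assert(used < dst_size);
   /* The third argument is the maximum number of characters to append,
    * not counting the terminating NUL that strncat always writes. */
   strncat(dst, src, dst_size - used - 1);
}
```

For an 8-byte buffer holding "abc", appending "defghij" yields "abcdefg": four characters fit, plus the terminator.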
* util: Add __declspec(noreturn) to _debug_assert_fail().José Fonseca2014-04-171-0/+3
| | | | | | | | Mostly for consistency, as MSVC's static source code analysis doesn't seem to rely on assertions, but instead on a different kind of source annotation (http://msdn.microsoft.com/en-us/library/hh916383.aspx). Reviewed-by: Brian Paul <[email protected]>