summaryrefslogtreecommitdiffstats
path: root/src/gallium/auxiliary
Commit message (Collapse)AuthorAgeFilesLines
* os: s/Elements/ARRAY_SIZE/Brian Paul2016-04-271-1/+1
| | | | Reviewed-by: Jose Fonseca <[email protected]>
* hud: s/Elements/ARRAY_SIZE/Brian Paul2016-04-273-7/+7
| | | | Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: s/Elements/ARRAY_SIZE/Brian Paul2016-04-279-29/+29
| | | | Reviewed-by: Jose Fonseca <[email protected]>
* draw: s/Elements/ARRAY_SIZE/Brian Paul2016-04-277-24/+24
| | | | Reviewed-by: Jose Fonseca <[email protected]>
* gallium/util: add u_bit_consecutive for generating a consecutive range of bitsNicolai Hähnle2016-04-271-0/+12
| | | | | | | There are some undefined behavior subtleties, so having a function to match the u_bit_scan_consecutive_range makes sense. Reviewed-by: Marek Olšák <[email protected]>
* tgsi/exec: initialise SysSemanticToIndex array to -1Dave Airlie2016-04-271-0/+3
| | | | | | | | We want to use the SysSemanticToIndex to tell if we've seen the semantics at all. Acked-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* tgsi/exec: implement restartable machine.Dave Airlie2016-04-272-17/+35
| | | | | | | | | | | | | | This lets us restart the machine at a PC value, and exits the machine when we hit a barrier. Compute shaders will then execute all the threads up to the barrier, then restart the machines after the barrier once all are done. v2: comment the code a bit, change return types. Acked-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* tgsi/exec: make inputs/outputs optional for compute shaders.Dave Airlie2016-04-271-19/+24
| | | | | | | | compute shaders don't need input/outputs so don't bother allocating memory for these. Acked-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* tgsi/exec: implement load/store/atomic on MEMORY.Dave Airlie2016-04-272-3/+150
| | | | | | | | This implements basic load/store/atomic ops on MEMORY types for compute shaders. Acked-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* tgsi/exec: split out setting up masks to separate functionDave Airlie2016-04-271-9/+14
| | | | | | | | This is just a cleanup that will make later changes easier to make. Acked-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* tgsi: accept a starting PC value for exec machine.Dave Airlie2016-04-274-4/+4
| | | | | | | | This will be used later to restart barriered execution threads in compute, for now we just want to change the API. Acked-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* tgsi: move to using vector for system values.Dave Airlie2016-04-274-7/+7
| | | | | | | | | | For compute support some of the system values are .xyz types, so move to using a vector instead of a single channel. [airlied: squash swizzle fix from compute series]. Reviewed-by: Brian Paul <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* tgsi/exec: fix system value handling.Dave Airlie2016-04-271-3/+5
| | | | | | | | | | | a) SysSemanticToIndex needs to be indexed with the semantic name not the decl->Declaration.Semantic. b) doing this in run is too late, as the mappings are all setup prior to run in the execs. Reviewed-by: Brian Paul <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* gallium: Include intrin.h instead of defining ourselves.Jose Fonseca2016-04-261-2/+4
| | | | | | | | | More portable, particularly when building with Clang, which implements all MSVC intrisincs in its own intrin.h, but doesn't actually support `#pragma instrinsic`. Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* tgsi: pass a shader type to the machine create and clean up.Dave Airlie2016-04-264-14/+14
| | | | | | | | | | | | | There was definitely bugs here mixing up the PIPE_ and TGSI_ defines, hopefully they didn't cause any problems, since mostly it was special cases for GEOMETRY. This clarifies at shader machine create what type of shader this machine will execute. This is needed also for compute shaders where we don't want to allocate inputs/outputs. Reviewed-by: Brian Paul <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* gallium/tgsi: move tgsi_exec.h header out of draw_context.hDave Airlie2016-04-262-1/+1
| | | | | | | | | It gets annoying that changing the tgsi exec rebuilds the state tracker unnecessarily. Putting this include into draw_gs.h which uses it causes a lot less rebuilds. Reviewed-by: Brian Paul <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* gallivm: make sampling more robust against bogus coordinatesRoland Scheidegger2016-04-263-13/+43
| | | | | | | | | | | | | | | | | | | | | | Some cases (especially these using fract for coord wrapping) did not handle NaNs (or Infs) correctly - the following code assumed the fract result could not be outside [0,1], but if the input is a NaN (or +-Inf) the fract result was NaN - which then could produce out-of-bound offsets. (Note that the explicit NaN behavior changes for min/max on x86 sse don't result in actual changes in the generated jit code, but may on other architectures. Found by looking through all the wrap functions.) This fixes https://bugs.freedesktop.org/show_bug.cgi?id=94955 No piglit changes. (v2: fix min/max typo in coord_mirror, add comment) Cc: "11.1 11.2" <[email protected]> Tested-by: Bruce Cherniak <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* util/blitter: use ARRAY_SIZE macroBrian Paul2016-04-251-9/+3
| | | | | | And remove local definition of Elements() macro. Reviewed-by: Marek Olšák <[email protected]>
* gallium/util: initialize pipe_framebuffer_state to zerosBrian Paul2016-04-251-1/+1
| | | | | | | | | To silence a valgrind uninitialized memory warning. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94955 Cc: "11.1 11.2" <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* util/cache: add comments, fix formattingBrian Paul2016-04-251-5/+35
|
* gallium: use unreachable instead of assertsGrazvydas Ignotas2016-04-251-1/+1
| | | | | | | Avoids warnings in release builds. Signed-off-by: Grazvydas Ignotas <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* gallium: fix warnings in release buildGrazvydas Ignotas2016-04-252-2/+3
| | | | | | | | Mark variables MAYBE_UNUSED to avoid unused-but-set-variable warnings in release build. Signed-off-by: Grazvydas Ignotas <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* gallium: remove helpers converting to/from TGSI_PROCESSOR_*Marek Olšák2016-04-222-24/+1
| | | | Acked-by: Jose Fonseca <[email protected]>
* gallium: use PIPE_SHADER_* everywhere, remove TGSI_PROCESSOR_*Marek Olšák2016-04-2222-122/+122
| | | | Acked-by: Jose Fonseca <[email protected]>
* gallium: merge PIPE_SWIZZLE_* and UTIL_FORMAT_SWIZZLE_*Marek Olšák2016-04-2214-182/+170
| | | | | | | | Use PIPE_SWIZZLE_* everywhere. Use X/Y/Z/W/0/1 instead of RED, GREEN, BLUE, ALPHA, ZERO, ONE. The new enum is called pipe_swizzle. Acked-by: Jose Fonseca <[email protected]>
* gallivm: fix bogus argument order to lp_build_sample_mipmap functionRoland Scheidegger2016-04-211-2/+2
| | | | | | | | | | | | | | | Screwed up since 0753b135f6e83b171d8a1b08aea967374f3542bc. (Only an issue with different min/mag filters, and then only in some cases, which is probably why it went unnoticed for quite a while. The effect should have simply been nearest mip filter instead of linear, iff min was nearest, mag was linear, and all pixels hit the mignifying path.) Fixes a bunch of dEQP failures. Reviewed-by: Jose Fonseca <[email protected]> Cc: "11.1 11.2" <[email protected]>
* tgsi/lowering: improved lowering for LRPRussell King2016-04-191-35/+20
| | | | | | | | | | | Provide an improved lowering for LRP, which can be implemented in two MAD instructions with a bit of rearranging of the equation, rather than the literal implementation of two multiplies, an add and a subtract. Signed-off-by: Russell King <[email protected]> Reviewed-by: Rob Clark <[email protected]> Signed-off-by: Rob Clark <[email protected]>
* tgsi/lowering: improved lowering for XPDRussell King2016-04-191-22/+13
| | | | | | | | | Improve XPD lowering to consume less instructions by using the MAD instruction to perform the multiply and subtraction together. Signed-off-by: Russell King <[email protected]> Reviewed-by: Rob Clark <[email protected]> Signed-off-by: Rob Clark <[email protected]>
* tgsi/lowering: add support for lowering TRUNCRussell King2016-04-192-0/+85
| | | | | | | | | | | | | | Add support for lowering TRUNC using the following sequence: FRC tmpA, |src| SUB tmpA, |src|, tmpA CMP dst, -tmpA, tmpA Note that this is incompatible with FRC lowering. Signed-off-by: Russell King <[email protected]> Reviewed-by: Rob Clark <[email protected]> Signed-off-by: Rob Clark <[email protected]>
* tgsi/lowering: add support for lowering FLR and CEILRussell King2016-04-192-20/+149
| | | | | | | | | | | | | | | | Add support for lowering FLR and CEIL to FRC/SUB and FRC/ADD instructions for GPUs that support FRC but not FLR or CEIL. Since these uses FRC, it is invalid to ask for FLR or CEIL to be lowered along with FRC, so add an assert to catch this invalid configuration. We also need to deal with FLR instructions emitted by the lowering code. Fix these up with the FRC+SUB equivalent when FLR lowering is enabled. Signed-off-by: Russell King <[email protected]> Reviewed-by: Rob Clark <[email protected]> Reviewed-by: Christian Gmeiner <[email protected]> Signed-off-by: Rob Clark <[email protected]>
* gallium/util: Add u_bit_scan_consecutive_range64.Bas Nieuwenhuizen2016-04-191-0/+14
| | | | | | | | | For use by radeonsi. v2: Make sure that it works for all 64 bits set. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallivm: Avoid llvm::sys::getProcessTriple().Jose Fonseca2016-04-191-3/+3
| | | | | | | Just use LLVM_HOST_TRIPLE, which is available at least from LLVM 3.3 onwards, and is pretty much what llvm::sys::getProcessTriple() does anyway, Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: Remove lp_get_module_id.Jose Fonseca2016-04-194-12/+15
| | | | | | Just keep a copy of the module_name in gallivm. Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: Fix MCJIT with LLVM 3.3.Jose Fonseca2016-04-191-3/+3
| | | | | | | | | | | | One needs to call setJITMemoryManager for LLVM 3.3, instead of setMCJITMemoryManager. This regressed in commits 065256df/75ad4fe7 when trying to make the code to build with LLVM 3.6. Tested MCJIT with LLVM 3.3 to 3.6. Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: Make MCJIT a runtime option.Jose Fonseca2016-04-191-75/+72
| | | | | | | | | | | | | On the LLVM versions that support it, so we can easily switch between MCJIT/old-jit for testing. The new option is GALLIVM_MCJIT. Unfortunately setting GALLIVM_MCJIT=1 for LLVM 3.3 or 3.4 causes segfault, both on Linux and Windows. I'm almost certain this used to work, so there probably is a regression somewhere. Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: Use LLVMSetTarget.Jose Fonseca2016-04-191-3/+9
| | | | | | Instead of LLVM C++ interfaces. Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: Use LLVMPrintValueToString where available.Jose Fonseca2016-04-191-35/+10
| | | | | | | | | | | | | And llvm::raw_string_ostream where not (LLVM 3.3). Thereby eliminating yet another dependency on unstable LLVM interfaces. As a bonus this also gets LLVM IR on OutputDebugMessageA on MSVC (which was disabled, probably due to C++ issues.) Tested `lp_test_arit -v -v` on LLVM 3.3, 3.4 and 3.8. Reviewed-by: Roland Scheidegger <[email protected]>
* gallium/tests: Update UTIL_FORMAT_MAX_* defines.Jose Fonseca2016-04-191-3/+3
| | | | Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: convert size query to using a set of parameters.Dave Airlie2016-04-195-61/+43
| | | | | | | | | | This isn't currently that easy to expand, so fix it up before expanding it later to include dynamic samplers. [airlied: use some local variables (Roland)] Reviewed-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* gallium/util: fix undefined shift to the last bit in u_bit_scanMarek Olšák2016-04-181-1/+1
| | | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/util: fix u_bit_scan_consecutive_range for mask == 0xffffffffMarek Olšák2016-04-181-1/+7
| | | | | | | | The second ffs returns 0, yielding count == -1. v2: change 1 to 1u Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* gallivm: don't use vector selects with llvm 3.7Roland Scheidegger2016-04-181-3/+5
| | | | | | | | | | llvm 3.7 sometimes simply miscompiles vector selects. See https://bugs.freedesktop.org/show_bug.cgi?id=94972 This was fixed in llvm r249669 (https://llvm.org/bugs/show_bug.cgi?id=24532). Reviewed-by: Jose Fonseca <[email protected]>
* gallium/swr: allow swr use as a swrast dri driverTim Rowley2016-04-151-2/+13
| | | | | Reviewed-by: Emil Velikov <[email protected]> Tested-by: Ilia Mirkin <[email protected]>
* gallivm: Workaround LLVM PR 27332.Jose Fonseca2016-04-131-3/+14
| | | | | | | | | | | | | The credit for finding and isolating this bug goes to Vinson and Roland. The buggy LLVM versions were found by doing opt -instcombine llvm-pr27332.ll > /dev/null where llvm-pr27332.ll is the IR from https://llvm.org/bugs/show_bug.cgi?id=27332#c3 Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: use llvm.nearbyint instead of llvm.round.Roland Scheidegger2016-04-131-98/+1
| | | | | | | | | | | | | | | | We used to use sse roundps intrinsic directly, but switched to use the llvm intrinsics for rounding with e4f01da15d8c6ce3e8c77ff3ff3d2ce2574a3f7b. However, llvm semantics follows standard math lib round function which is specced to do roundNearestAwayFromZero but we really want roundNearestEven (moreoever, using round generates atrocious code since the cpu can't do it directly and it results in scalar calls to libm __roundf). So, use llvm.nearbyint instead, which does exactly the right thing, and even has the advantage of being available with llvm 3.3 too. (I've verified it actually generates a roundps instruction with llvm 3.3.) This fixes https://bugs.freedesktop.org/show_bug.cgi?id=94909 Reviewed-by: Jose Fonseca <[email protected]>
* tgsi: fix buffer overflowThomas Hindoe Paaboel Andersen2016-04-131-1/+1
| | | | | | | Increase r to four channels as rgba is written to it Reviewed-by: Edward O'Callaghan <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* gallium: Use STATIC_ASSERT whenever possible.Jose Fonseca2016-04-123-3/+3
| | | | | Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* winsys/amdgpu: add support for 64-bit buffer sizesMarek Olšák2016-04-121-0/+6
| | | | | | v2: fail in radeon_winsys_bo_create if size > 32 bits Reviewed-by: Nicolai Hähnle <[email protected]>
* pb_buffer: switch pb_buffer::size to 64 bitsMarek Olšák2016-04-123-8/+10
| | | | | | being able to allocate more than 4 GB may be useful Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium: pause queries for all meta opsMarek Olšák2016-04-126-0/+12
| | | | | Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>