summaryrefslogtreecommitdiffstats
path: root/src/gallium/auxiliary
Commit message (Collapse)AuthorAgeFilesLines
* hud: fix compilation warnings in hud_nic_graph_install()Samuel Pitoiset2017-01-301-2/+2
| | | | | | | v2: use PRId64 instead of PRIx64 Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallivm: remove explicit __STDC_.*_MACROS definesEmil Velikov2017-01-271-8/+0
| | | | | | | | | | Correctly handled by the build systems. Cc: Roland Scheidegger <[email protected]> Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: don't try to use fast rcp for fdivRoland Scheidegger2017-01-241-1/+3
| | | | | | | | The use of fast rcp instruction is disabled, and will always fall back to use a division instead (1 / x). Hence, if we get a division opcode, it doesn't make much sense trying to split that into rcp/mul. Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: (trivial) fix ddiv cpu implementationRoland Scheidegger2017-01-241-1/+0
| | | | | | | | | | | we can't use the cpu implementation of fdiv, as this one uses different lp_build_context, which causes assertion failure. Just use default fdiv action (there is no fast rcp for doubles which we could potentially use anyway). Cc: 17.0 <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* tgsi: implement ddiv opcodeRoland Scheidegger2017-01-241-0/+14
| | | | | | | | | softpipe (along with llvmpipe) claims to support arb_gpu_shader_fp64, so we really need to support that opcode. Cc: 17.0 <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* gallium: add TGSI_PROPERTY_MUL_ZERO_WINSIlia Mirkin2017-01-231-1/+2
| | | | | | | | | This will be useful for proper D3D9 emulation, where this behavior is expected by some shaders. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Axel Davy <[email protected]>
* gallium/hud: add missing break in hud_cpufreq_graph_install()Samuel Pitoiset2017-01-201-0/+1
| | | | | | | Fixes: e99b9395bef "gallium/hud: Add support for CPU frequency monitoring" Cc: [email protected] Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* gallivm: use #ifdef not #if for PIPE_ARCH_BIG_ENDIANDave Airlie2017-01-191-1/+1
| | | | | | | | This fixes the build on ppc/s390. Reviewed-by: Roland Scheidegger <[email protected]> Cc: "17.0" <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* android: gallium/auxiliary: fix building error in Android 7.0Mauro Rossi2017-01-181-1/+1
| | | | | | | | | | | | | | | | | | Conditional libLLVMCore static library dependency is added, for the case when MESA_ENABLE_LLVM is true Fixes the following building error with Android 7.0: In file included from external/mesa/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp:62: ... external/llvm/include/llvm/IR/Attributes.h:68:14: fatal error: 'llvm/IR/Attributes.inc' file not found #include "llvm/IR/Attributes.inc" ^ 1 error generated. Reviewed-by: Emil Velikov <[email protected]>
* gallium: correctly manage libsensors link flagsEmil Velikov2017-01-181-2/+0
| | | | | | | | | | We should be using LIBS rather than the LDFLAGS variable. Furthermore try to keep the linking to the final stage, rather than intermetent static library. Cc: Steven Toth <[email protected]> Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* gallivm: (trivial) fix copy/paste bug with big endian codeRoland Scheidegger2017-01-181-2/+4
| | | | | | 8bd67a35c50e68c21aed043de11e095c284d151a introduced using undefined variable on big endian archs due to copy/paste bug. (compile hack tested only)
* configure.ac: Revert recent HAVE_LLVM changes.Jose Fonseca2017-01-185-12/+12
| | | | | | | | | | | | | | | | This reverts changes 903eb09b5fb78d47d0f8a4bdf826a113ca2aff40..1a0aa468f354f0ee94dd383cd40ae915584624aa: Tobias Droste (5): configure.ac: Rename MESA_LLVM to FOUND_LLVM configure.ac: Only set LLVM_LIBS if LLVM is used configure.ac: Only define HAVE_LLVM if LLVM is used configure.ac: Set and use HAVE_GALLIUM_LLVM define configure.ac: Don't check LLVM version in gallium_require_llvm They break scons build, and I'm not convinced this is the right fix. In particular changing HAVE_LLVM in the C code is something I'd rather avoid no matter what. So it's better to discuss without the pressure of broken builds.
* configure.ac: Set and use HAVE_GALLIUM_LLVM defineTobias Droste2017-01-185-12/+12
| | | | | | | | | | | | Gallium code used HAVE_LLVM to check if it needs to compile code for LLVM in header and source files. With the new logic HAVE_LLVM is always set. Use extra define to figure out if LLVM is used. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99010 Signed-off-by: Tobias Droste <[email protected]>
* gallivm: Cleanup USE_MCJIT.Jose Fonseca2017-01-181-10/+25
| | | | | | | Split USE_MCJIT macro dual nature into a separate constant time define and a run-time variable. Reviewed-by: Emil Velikov <[email protected]>
* vl/dri3: use external texture as back buffers(v4)Nayan Deshmukh2017-01-172-17/+114
| | | | | | | | | | | | | | | | | | | | | dri3 allows us to send handle of a texture directly to X so this patch allows a state tracker to directly send its texture to X to be used as back buffer and avoids extra copying v2: use clip width/height to display a portion of the surface v3: remove redundant variables, fix wrapping, rename variables handle vaapi path v3.1: we need clip_width/height for every frame so we don't need to maintain it for each buffer instead use a global variable v4: In case of single gpu we can cache the buffers as applications use constant number of buffer and we can avoid calls to present extension for every frame Reviewed and Suggested-by: Leo Liu <[email protected]> Acked-by: Christian König <[email protected]> Tested-by: Andy Furniss <[email protected]> Signed-off-by: Nayan Deshmukh <[email protected]>
* gallium: add FBFETCH opcode to retrieve the current sample valueIlia Mirkin2017-01-161-1/+1
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* tgsi: add DDIV instructionNicolai Hähnle2017-01-162-0/+4
| | | | | | | | | Double-precision division, to allow more precision than a DRCP + DMUL sequence. Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallium/hud: avoid buffer overrunThomas Hindoe Paaboel Andersen2017-01-161-2/+4
| | | | | | | | | | | | Renaming data sources was added in e8bb97ce30051b999a4a69c9b27884daeb8d71e6 It was possible to use a new name longer than the name array in hud_graph of 128. This patch truncates the name to fit the array. CC: Marek Olšák <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* gallium/hud: disable queries during HUD draw callsMarek Olšák2017-01-163-1/+29
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/hud: increase the vertex buffer size for background quadsMarek Olšák2017-01-161-1/+1
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* imx: gallium driver for imx-drm scanout driverChristian Gmeiner2017-01-123-0/+32
| | | | | | | | | | Changes from V1 -> V2: - updated Copyright - added $(top_srcdir)/src/gallium/winsys to include path (suggested by Emil) - adapted driver to new renderonly API Signed-off-by: Christian Gmeiner <[email protected]> Acked-by: Emil Velikov <[email protected]>
* etnaviv: gallium driver for Vivante GPUsThe etnaviv authors2017-01-123-0/+30
| | | | | | | | | | | | | | | | | This driver supports a wide range of Vivante IP cores like GC880, GC1000, GC2000 and GC3000. Changes from V1 -> V2: - added missing files to actually integrate the driver into build system. - adapted driver to new renderonly API Signed-off-by: Christian Gmeiner <[email protected]> Signed-off-by: Lucas Stach <[email protected]> Signed-off-by: Philipp Zabel <[email protected]> Signed-off-by: Rob Herring <[email protected]> Signed-off-by: Russell King <[email protected]> Signed-off-by: Wladimir J. van der Laan <[email protected]> Acked-by: Emil Velikov <[email protected]>
* gallium: add renderonly libraryChristian Gmeiner2017-01-124-0/+298
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This a very lightweight library to add basic support for renderonly GPUs. A kms gallium driver must specify how a renderonly_scanout objects gets created. Also it must provide file handles to the used kms device and the used gpu device. This could look like: struct renderonly ro = { .create_for_resource = renderonly_create_gpu_import_for_resource, .kms_fd = fd, .gpu_fd = open("/dev/dri/renderD128", O_RDWR | O_CLOEXEC) }; The renderonly_scanout object exits for two reasons: - Do any special treatment for a scanout resource like importing the GPU resource into the scanout hw. - Make it easier for a gallium driver to detect if anything special needs to be done in flush_resource(..) like a resolve to linear. A GPU gallium driver which gets used as renderonly GPU needs to be aware of the renderonly library. This library will likely break android support and hopefully will get replaced with a better solution based on gbm2. Changes from V1 -> V2: - reworked the lifecycle of renderonly object (suggested by Nicolai Hähnle) - killed the midlayer (suggested by Thierry Reding) - made the API more explicit regarding gpu and kms fd's - added some docs Signed-off-by: Christian Gmeiner <[email protected]> Acked-by: Emil Velikov <[email protected]> Tested-by: Alexandre Courbot <[email protected]>
* gallium/tgsi: fix overflow in parse propertyLi Qiang2017-01-111-3/+6
| | | | | | | | | | | In parse_identifier, it doesn't stop copying '*pcur' untill encounter the NULL. As the 'ret' has a fixed-size buffer, if the '*pcur' has a long string, there will be a buffer overflow. This patch avoid this. Signed-off-by: Li Qiang <[email protected]> Signed-off-by: Marek Olšák <[email protected]> Reviewed-by: Marc-André Lureau <[email protected]>
* gallivm: generalize 4x4f->1x16ub special case conversionRoland Scheidegger2017-01-061-56/+118
| | | | | | | | | | | | | | | | | | | | | | | | | | | This special packing path can be easily extended to handle not just float->unorm8 but also float->snorm8 and uint32->uint8 and int32->int8 (i.e. all interesting cases for llvmpipe fs backend code). The packing parts all stay the same (only the last step packing will be signed->signed instead of signed->unsigned but luckily even sse2 can do both). While here also note some bugs with that (we keep the bugs identical to what we did before on x86, albeit other archs may differ). In particular float->unorm8 too large values will still get clamped to 0, not 255, and for float->snorm8 NaNs will end up as -1, not 0 (but we do the clamp against 1.0 there to prevent too large values ending up as -1.0 - this is inconsistent to unorm8 handling but is what we ended up before, I'm not sure we can get away without it). This is quite fishy in any case as we depend on arch-dependent behavior of the iround (my understanding is in fact with altivec the conversion would actually saturate although I've no idea about NaNs, so probably wouldn't need to do anything for snorm). (There are only minimal piglit tests for unorm clamping behavior AFAICT, in particular nothing seems to test values which are too large to be handled by the float->int conversion.) For uint32->uint8 we also do a min against MAX_INT, since the source for the packs is always signed (again, on x86 - should probably be able to express these arch-dependent bits better some day). Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: (trivial) fix typo bug with small AoS format unpackingRoland Scheidegger2017-01-061-1/+1
| | | | | | | Fix typo using wrong (uninitialized) build context introduced by 4634cb5921b985f04f2daf00cda2d28036143bd3. (This only affects very rare small packed formats which have a PIPE_SWIZZLE_0 channel, such as r4a4, which is never used by mesa/st. Nevertheless it broke lp_test_format.)
* gallivm: implement aos unpack (to unorm8) for small unorm formatsRoland Scheidegger2017-01-051-12/+152
| | | | | | | | | | | | | | | | | | | Using bit replication. This path now resembles something which might make sense. (The logic was mostly copied from llvmpipe fs backend.) I am not convinced though it is actually faster than SoA sampling (actually I'm quite certain it's always a loss with AVX). With SoA it's just shift/mask/cvt/mul for getting the colors, whereas there's still roughly 3 shifts, 3 or/and per channel for AoS (i.e. for SoA it's exactly the same as it would be for a rgba8 format, whereas the extra effort for AoS is significant). The filtering might still be faster (albeit with FMA the instruction count gets down quite a bit there on the SoA float filtering path on new cpus). And those small unorm formats often don't have an alpha channel (which makes things worse relatively for AoS path). (This also fixes a trivial bug in the llvmpipe fs code this was derived from, albeit it was only relevant for 4-bit channels.) Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: optimize lp_build_unpack_arith_rgba_aos slightlyRoland Scheidegger2017-01-051-19/+97
| | | | | | | | | | | | | | | | | This code uses a vector shift which has to be emulated on x86 unless there's AVX2. Luckily in some cases we can actually avoid the shift altogether, so do that. Also make sure we hit the fast lp_build_conv() path when applicable, albeit that's quite the hack... That said, this path is taken for AoS sampling for small unorm (smaller than rgba8) formats, and it is completely hopeless even with those changes, with or without AVX. (Probably should have some code similar to the one in the llvmpipe fs backend code, using bit replication to extend to rgba8888 - rounding is not quite 100% accurate but if it's good enough there it should be here as well.) Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: use 2 srcs for 32->16bit conversions in lp_bld_conv_autoRoland Scheidegger2017-01-051-2/+19
| | | | | | | | | | | | | | | | | If we only feed one source vector at a time, we cannot use pack intrinsics (as we only have a 64bit destination dst vector). lp_bld_conv_auto is specifically designed to alter the length and number of destination vectors, so this works just fine (if we use single source vectors at a time, afterwards we immediately reassemble the vectors). For AVX though this isn't really possible, since we expect 128bit output already for a single 256bit input. (One day we should handle AVX2 which again would need multiple inputs, however there's the problem that we get different ordered output there and we don't want to reorder, so would need to be able to tell build_conv to handle upper and lower halfs independently.) A similar strategy would probably work for 32->8bit too (if it doesn't hit the special case) but I'm going to try something different for that... Reviewed-by: Jose Fonseca <[email protected]>
* llvmpipe: (trivial) minimally simplify mask constructionRoland Scheidegger2017-01-051-0/+2
| | | | | | | | | | | | | | | | simd instruction sets usually have comparisons for equal, not unequal. So use a different comparison against the mask itself - which also means we don't need a all-zero as well as a all-one (for the pxor) reg. Also add code to avoid scalar expansion of i1 values which we definitely shouldn't do. There's problems with this though with llvm select interaction, so it's disabled (basically using llvm select instead of intrinsics may still produce atrocious code, even in cases where we figured it should not, albeit I think this could probably be fixed with some better selection of optimization passes, but I have zero idea there really). Reviewed-by: Jose Fonseca <[email protected]>
* gallium/hud: increase the vertex buffer size for textMarek Olšák2017-01-051-1/+1
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/hud: add an option to sort items below graphsMarek Olšák2017-01-052-5/+33
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/hud: add an option to reset the color counterMarek Olšák2017-01-052-3/+19
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/hud: allow more data sources per paneMarek Olšák2017-01-051-13/+15
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/hud: add an option to rename each data sourceMarek Olšák2017-01-051-2/+35
| | | | | | | | useful for radeonsi performance counters v2: allow specifying both : and = Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium: remove TGSI_OPCODE_SUBMarek Olšák2017-01-0517-94/+61
| | | | | | It's redundant with the source modifier. Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium: remove TGSI_OPCODE_ABSMarek Olšák2017-01-058-25/+4
| | | | | | It's redundant with the source modifier. Reviewed-by: Nicolai Hähnle <[email protected]>
* swr: fix windows build breakGeorge Kyriazis2017-01-051-0/+7
| | | | | | | | | | wrap lp_bld_type.h around extern "C". Windows decorates global variables, so when used from .cpp files, need to use an undecorated version. Also, removed related and unneeded code from swr_screen.cpp Reviewed-by: Ilia Mirkin <[email protected]>
* gallium/hud: add a path separator between dump directory and filenameEdmondo Tommasina2017-01-031-1/+2
| | | | | | | It's more user friendly and it avoids to write files in unexpected places. Signed-off-by: Marek Olšák <[email protected]>
* vl/zscan: fix "Fix trivial sign compare warnings"Christian König2017-01-031-1/+1
| | | | | | | | | | | The variable actually needs to be signed, otherwise converting it to a float doesn't work as expected. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=98914 Signed-off-by: Christian König <[email protected]> Reviewed-by: Nayan Deshmukh <[email protected]> Cc: "13.0" <[email protected]> Fixes: 1fb4179f927 ("vl: Fix trivial sign compare warnings")
* vl/compositor: implement error handlingNayan Deshmukh2017-01-032-3/+12
| | | | | | | pipe_buffer_map and pipe_buffer_create may return NULL Signed-off-by: Nayan Deshmukh <[email protected]> Reviewed-by: Christian König <[email protected]>
* gallium/hud: fix the windows build by disabling file dumpingMarek Olšák2017-01-021-0/+2
|
* gallium/hud: set filedescriptor for fps graphEdmondo Tommasina2017-01-011-0/+2
| | | | Signed-off-by: Marek Olšák <[email protected]>
* gallium/hud: set filedescriptor for cpu graphEdmondo Tommasina2017-01-011-0/+2
| | | | Signed-off-by: Marek Olšák <[email protected]>
* gallium/hud: move file initialization to a functionEdmondo Tommasina2017-01-013-11/+20
| | | | | | | The function will be used later to create the filedescriptor for other metrics. Signed-off-by: Marek Olšák <[email protected]>
* gallium/hud: dump hud_driver_query values to filesEdmondo Tommasina2017-01-013-0/+19
| | | | | | | | | | | | | | Dump values for every selected data source in GALLIUM_HUD. Every data source has its own file and the filename is equal to the data source identifier. Set GALLIUM_HUD_DUMP_DIR to dump values to files in this directory. No values are dumped if the environment variable is not set, the directory doesn't exist or the user doesn't have write access. Signed-off-by: Marek Olšák <[email protected]>
* ttn: set ->info->num_ubosRob Clark2016-12-271-1/+4
| | | | | | | | | For dealing w/ 32b vs 64b gpu addresses, I need to rework how we pass UBO buffer addresses to shader, and knowing up front the # of UBOs is useful. But I noticed ttn wasn't setting this. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* ttn: handle GLSL_SAMPLER_DIM_SUBPASS_MS caseJuan A. Suarez Romero2016-12-211-0/+1
| | | | | | Fixes a warning. Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* draw: use SoA fetch, not AoS oneRoland Scheidegger2016-12-211-48/+23
| | | | | | | | | | | | | | | | | | | Now that there's some SoA fetch which never falls back, we should always get results which are better or at least not worse (something like rgba32f will stay the same). For cases which get way better, think something like R16_UNORM with 8-wide vectors: this was 8 sign-extend fetches, 8 cvt, 8 muls, followed by a couple of shuffles to stitch things together (if it is smart enough, 6 unpacks) and then a (8-wide) transpose (not sure if llvm could even optimize the shuffles + transpose, since the 16bit values were actually sign-extended to 128bit before being cast to a float vec, so that would be another 8 unpacks). Now that is just 8 fetches (directly inserted into vector, albeit there's one 128bit insert needed), 1 cvt, 1 mul. v2: ditch the old AoS code instead of just disabling it. Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: generalize the compressed format soa fetch a bitRoland Scheidegger2016-12-211-37/+49
| | | | | | | | | This can now handle rgtc (unorm) too - this path no longer handles plain formats, but that's unnecessary they now all have their proper SoA unpack (this will still be dog-slow though due to the actual fetch being per-pixel util fallbacks). Reviewed-by: Jose Fonseca <[email protected]>