path: root/src/gallium/auxiliary
Commit message | Author | Age | Files | Lines
* gallivm: dump bitcode before optimization | Roland Scheidegger | 2018-04-24 | 1 | -13/+20
  If we dump the bitcode for offline debugging, we really want the pre-optimization bitcode; otherwise it is useless for identifying problems with IR optimization (and if you have a shader which takes an hour to do IR optimization, it's also nice that you don't have to wait that hour...).

  Also, print out the function passes for opt which correspond to what was used for jit compilation (and also the opt level for codegen). Using opt/llc this way should then pretty much mimic what was done for jit.

  (Specifying something like -time-passes -debug-pass=[Structure|Arguments] (for either opt or llc) also gives very useful information about which passes all the time was spent in, and which passes are really run along with their order: llvm will add passes due to dependencies on its own, and of course -O2 for llc comes with a ~100 pass list.)

  Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: (trivial) do division by 1000 with int64 | Roland Scheidegger | 2018-04-24 | 1 | -1/+1
  Conversion to int can otherwise overflow if compile times are over ~71min. (Yes this can happen...)

  Reviewed-by: Jose Fonseca <[email protected]>
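A minimal sketch of the overflow described above, assuming the elapsed time is measured in microseconds as a 64-bit value and reported in milliseconds as a 32-bit value (2^32 microseconds is about 71.6 minutes). The function names are hypothetical and are not taken from the gallivm source.

```c
#include <stdint.h>

/* Hypothetical helper: narrow to 32 bits first, then divide.
 * Anything above 2^32 microseconds (~71.6 min) wraps before the division. */
static inline uint32_t elapsed_ms_narrow_first(int64_t elapsed_us)
{
   return (uint32_t)elapsed_us / 1000;
}

/* Hypothetical helper: divide in 64 bits first, then narrow.
 * The millisecond count fits in 32 bits for any realistic compile time. */
static inline uint32_t elapsed_ms_divide_first(int64_t elapsed_us)
{
   return (uint32_t)(elapsed_us / 1000);
}
```

For a 72 minute compile, the first variant reports roughly 25 seconds while the second reports the full 4,320,000 ms.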
* gallivm: remove LICM pass | Roland Scheidegger | 2018-04-24 | 1 | -1/+9
  LICM is simply too expensive, even though it presumably can help quite a bit in some cases. It was definitely cheaper in llvm 3.3, though as far as I can tell with llvm 3.3 it failed to do anything in most cases. early-cse also actually seems to cause licm to be able to move things when it previously couldn't, which causes noticeable compile time increases.

  There are more loop passes in llvm, but I'm not sure which ones are helpful, and I couldn't find anything which would roughly do what the old licm in llvm 3.3 did, so ditch it.

  Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: add early cse pass | Roland Scheidegger | 2018-04-24 | 1 | -4/+5
  This pass is quite cheap, and can simplify our generated IR quite a bit. In particular, on a variety of shaders I've found the time saved by other passes due to the simplified IR more than makes up for the cost of this pass, and on top of that the end result is actually better.

  The only downside I've found is that this enables the LICM pass to move some things out of the main shader loop (in the case I've seen, instanced vertex fetch (which is constant within the jit shader) plus the derived instructions in the shader) which it couldn't do before for some reason. This would actually be desirable but can increase compile time considerably (licm seems to have considerable cost when it actually can move things out of loops, due to alias analysis). But blaming early cse for this seems inappropriate.

  (Note that the first two sroa / earlycse passes are similar to what a standard llvm opt -O1/-O2 pipeline would do, albeit that pipeline has some more passes even before these, but I don't think they'd do much for us.)

  It also in particular helps some crazy shader used for driver verification (don't ask...) a lot (about a factor of 6 faster in compile time), due to simplifying the IR before LICM is run.

  While here, also move licm behind simplifycfg. For some shaders there seem to be very significant compile time gains (we've seen a factor of 10000, albeit that was a really crazy shader you'd certainly never see in a real app), because LICM is quite expensive and there are cases where running simplifycfg (along with sroa and early-cse) before licm reduces IR complexity significantly. (I'm not entirely sure if it would make sense to also run it afterwards.)

  Reviewed-by: Jose Fonseca <[email protected]>
* trace: allow image resource to be null | Ilia Mirkin | 2018-04-21 | 1 | -1/+1
  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* gallium/util: Android backtrace support | Stefan Schake | 2018-04-20 | 2 | -1/+113
  We can't use any of the existing implementations in u_debug_stack. Android technically has libunwind, but it's been modified to the point where it no longer compiles with the Mesa usage. The library is also not meant to be referenced by vendor libraries.

  The officially sanctioned way of obtaining backtraces is through Android's own libbacktrace, a C++ library. Access it through a separate C++ source file on Android only.

  Signed-off-by: Stefan Schake <[email protected]>
  Acked-by: Eric Engestrom <[email protected]>
  Reviewed-by: Rob Herring <[email protected]>
  Reviewed-by: Tapani Pälli <[email protected]>
* gallium/util: Don't stub u_debug_stack on Android | Stefan Schake | 2018-04-20 | 1 | -1/+2
  The fallback path for no libunwind ends up being stubs for Android. Don't compile them in so we can provide our own implementation.

  Signed-off-by: Stefan Schake <[email protected]>
  Reviewed-by: Eric Engestrom <[email protected]>
  Reviewed-by: Tapani Pälli <[email protected]>
* radeonsi: generate image load/store/atomic ops using ac_build_image_opcode | Nicolai Hähnle | 2018-04-20 | 1 | -1/+1
  In preparation of dimension-aware LLVM image intrinsics.

  Acked-by: Marek Olšák <[email protected]>
* amd/common: pass address components individually to ac_build_image_intrinsic | Nicolai Hähnle | 2018-04-20 | 1 | -1/+1
  This is in preparation for the new image intrinsics.

  Acked-by: Marek Olšák <[email protected]>
* gallium/util: put (void) in a few function signatures | Brian Paul | 2018-04-13 | 1 | -2/+2
  To match the header file.

  Reviewed-by: Marek Olšák <[email protected]>
  Reviewed-by: Charmaine Lee <[email protected]>
* ddebug: add PIPE_OS_UNIX/LINUX checks to fix MSVC build | Brian Paul | 2018-04-13 | 2 | -2/+12
  Don't include Unix headers or use Unix functions when building with MSVC.

  Reviewed-by: Marek Olšák <[email protected]>
  Reviewed-by: Charmaine Lee <[email protected]>
* gallium: move ddebug, noop, rbug, trace to auxiliary to improve build times | Marek Olšák | 2018-04-13 | 37 | -28/+13026
  which also simplifies the build scripts.
* gallium/hud: add a simple HUD view that only draws text | Marek Olšák | 2018-04-13 | 2 | -15/+60
  Add this prefix to the env var: "simple,"
  For example: GALLIUM_HUD=simple,fps

  The X coordinates are the same, but the Y coordinates are different, because there is only text.

  '+' happens to behave the same as "\n".
  ',' happens to behave the same as "\n\n".
* vl: add VP9 profile2 support | Leo Liu | 2018-04-12 | 1 | -0/+1
  Signed-off-by: Leo Liu <[email protected]>
  Acked-by: Christian König <[email protected]>
* vl: add VP9 probability tables | Leo Liu | 2018-04-12 | 3 | -1/+588
  Signed-off-by: Leo Liu <[email protected]>
  Acked-by: Christian König <[email protected]>
* vl: add VP9 profile0 and format | Leo Liu | 2018-04-12 | 1 | -0/+3
  Signed-off-by: Leo Liu <[email protected]>
  Acked-by: Christian König <[email protected]>
* gallium/util: implement util_format_is_yuv | Lucas Stach | 2018-04-08 | 1 | -0/+12
  This adds a helper to check if a pipe format is in YUV color space. Drivers want to know about this, as YUV mostly needs special handling.

  Signed-off-by: Lucas Stach <[email protected]>
  Reviewed-by: Philipp Zabel <[email protected]>
  Reviewed-by: Christian Gmeiner <[email protected]>
* meson: fix warnings about comparing unlike types | Dylan Baker | 2018-04-06 | 1 | -1/+1
  In the old days (0.42.x), when mesa's meson system was written, the recommendation for handling conditional dependencies was to define them as empty lists. When meson would evaluate the dependencies of a target it would recursively flatten all of the arguments, and empty lists would be removed. There are some problems with this, among them that lists and dependencies have different methods (namely .found()), so the recommendation changed to use `dependency('', required : false)` for such cases. This has the advantage of providing a .found() method, so there is no need to do things like `dep_foo != [] and dep_foo.found()`; a dependency with an empty name should never exist, so it is never reported as found.

  I've tested this with 0.42 (the minimum we claim to support) and 0.45. On 0.45 this removes warnings about comparing unlike types, such as:

    meson.build:1337: WARNING: Trying to compare values of different types (DependencyHolder, list) using !=.

  v2: - Use dependency('', required : false) instead of declare_dependency(); the latter will always report that it is found, which is not what we want.

  Signed-off-by: Dylan Baker <[email protected]>
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* gallium/pp: fix MLAA shaders | Marek Olšák | 2018-04-04 | 1 | -4/+4
  Reviewed-by: Ilia Mirkin <[email protected]>
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99549
* gallium/pp: use user constant buffers | Marek Olšák | 2018-04-04 | 4 | -33/+25
  This fixes a radeonsi crash.

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105026
* gallium/pipebuffer: fix parenthesis location | Timothy Arceri | 2018-04-03 | 1 | -1/+1
  Without this the return value will never get set to -1.

  This was first added in 49866c8f3457 and copied in 2b396eeed983.

  Fixes: 2b396eeed983 "gallium/pb_cache: add a copy of cache bufmgr independent of pb_manager"
  Reviewed-by: Marek Olšák <[email protected]>
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102342
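The diff itself is not shown in this log, so the following is only a sketch of how a misplaced parenthesis can keep a return value from ever being -1, assuming the bug had the common "assignment inside a comparison" shape. All names are hypothetical stand-ins, not the pb_cache functions.

```c
/* Hypothetical stand-in for a check that returns 1 (match),
 * 0 (no match) or -1 (stop searching). Here it just echoes its input. */
static int compat_result(int v)
{
   return v;
}

static int search_misplaced(int v)
{
   int ret;
   /* '>' binds tighter than '=', so ret receives the comparison
    * result, which can only be 0 or 1. */
   if ((ret = compat_result(v) > 0))
      return 1;
   return ret;   /* always 0 here; a -1 from the check is lost */
}

static int search_fixed(int v)
{
   int ret;
   /* The assignment is parenthesized, so ret receives the real value. */
   if ((ret = compat_result(v)) > 0)
      return 1;
   return ret;   /* 0 or -1 */
}
```

With the misplaced form a caller can never see the -1 result, which matches the symptom the commit message describes.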
* gallivm: Fix include for LLVMAddPromoteMemoryToRegisterPass | Mike Lothian | 2018-04-02 | 1 | -0/+3
  Include llvm-c/Transforms/Utils.h with the newest LLVM 7

  Signed-off-by: Mike Lothian <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
  Tested-by: Dieter Nützel <[email protected]>
  Signed-off-by: Marek Olšák <[email protected]>
* gallium/u_tests: test FBFETCH and shader-based blending with MSAA | Marek Olšák | 2018-04-02 | 1 | -40/+128
  Tested-by: Dieter Nützel <[email protected]>
* util: Move util_is_power_of_two to bitscan.h and rename to util_is_power_of_two_or_zero | Ian Romanick | 2018-03-29 | 8 | -22/+14
  The new name makes the zero-input behavior more obvious. The next patch adds a new function with different zero-input behavior.

  Signed-off-by: Ian Romanick <[email protected]>
  Suggested-by: Matt Turner <[email protected]>
  Reviewed-by: Alejandro Piñeiro <[email protected]>
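To illustrate why the zero-input behavior matters, here is a small sketch of the usual bit trick and a variant that rejects zero. The function names are hypothetical and only mirror the naming idea from the commit; the follow-up patch mentioned above is not shown in this log.

```c
#include <stdbool.h>

/* The classic test: clearing the lowest set bit leaves nothing.
 * Zero has no set bits at all, so it also passes. */
static inline bool is_power_of_two_or_zero(unsigned v)
{
   return (v & (v - 1)) == 0;
}

/* Same test, but zero is explicitly rejected. */
static inline bool is_power_of_two_nonzero(unsigned v)
{
   return v != 0 && (v & (v - 1)) == 0;
}
```

Callers that use the result to pick an alignment or a shift amount usually want the second form, since zero is not a valid power of two.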
* gallium/u_vbuf: Protect against overflow with large instance divisors. | Eric Anholt | 2018-03-26 | 1 | -1/+10
  GTF-GLES3.gtf.GL3Tests.instanced_arrays.instanced_arrays_divisor uses -1 as a divisor, so we would overflow to count=0 and upload no data, triggering the assert below. We want to upload 1 element in this case, fixing the test on VC5.

  v2: Use some more obvious logic, and explain why we don't use the normal round_up().

  Reviewed-by: Brian Paul <[email protected]>
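A sketch of the overflow, assuming the count is computed as a rounded-up division of the instance count by the divisor in 32-bit unsigned arithmetic. The helper names are hypothetical; the second function is one overflow-free formulation, not necessarily the exact logic of the patch.

```c
/* Typical round-up division: the addition wraps when divisor is
 * close to UINT_MAX (for example a divisor of -1 stored as unsigned). */
static inline unsigned count_round_up_naive(unsigned instances, unsigned divisor)
{
   return (instances + divisor - 1) / divisor;
}

/* Divide first, then add one if there was a remainder.
 * No intermediate value exceeds the inputs, so nothing can wrap. */
static inline unsigned count_round_up_safe(unsigned instances, unsigned divisor)
{
   unsigned count = instances / divisor;
   if (instances % divisor != 0)
      count++;
   return count;
}
```

With 4 instances and a divisor of 0xffffffff, the naive form yields 0 elements while the safe form yields 1.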
* gallium: Do not add -Wframe-address option for gcc <= 4.4. | Vinson Lee | 2018-03-26 | 1 | -1/+1
  This patch fixes these build errors with GCC 4.4.

    Compiling src/gallium/auxiliary/util/u_debug_stack.c ...
    src/gallium/auxiliary/util/u_debug_stack.c: In function ‘debug_backtrace_capture’:
    src/gallium/auxiliary/util/u_debug_stack.c:268: error: #pragma GCC diagnostic not allowed inside functions
    src/gallium/auxiliary/util/u_debug_stack.c:269: error: #pragma GCC diagnostic not allowed inside functions
    src/gallium/auxiliary/util/u_debug_stack.c:271: error: #pragma GCC diagnostic not allowed inside functions

  Fixes: 370e356ebab4 ("gallium: silence __builtin_frame_address nonzero argument is unsafe warning")
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105529
  Signed-off-by: Vinson Lee <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
  Reviewed-by: Jose Fonseca <[email protected]>
* tgsi,softpipe: use enum tgsi_opcode | Brian Paul | 2018-03-23 | 1 | -2/+2
  Reviewed-by: Eric Anholt <[email protected]>
* st/mesa,tgsi: use enum tgsi_opcode | Brian Paul | 2018-03-23 | 4 | -29/+29
  Need to update the tgsi code and st_glsl_to_tgsi code at the same time to prevent a compile break, since C++ is much pickier about implicit enum/unsigned casting.

  Bump size of glsl_to_tgsi_instruction::op to 10 bits to be sure to avoid the MSVC signed enum overflow issue. No change in class size.

  Reviewed-by: Eric Anholt <[email protected]>
* tgsi/nir: use enum tgsi_opcode | Brian Paul | 2018-03-23 | 1 | -2/+2
  Reviewed-by: Eric Anholt <[email protected]>
* tgsi: use enum tgsi_opcode | Brian Paul | 2018-03-23 | 5 | -14/+14
  Reviewed-by: Eric Anholt <[email protected]>
* gallivm: use enum tgis_opcode | Brian Paul | 2018-03-23 | 2 | -8/+12
  Reviewed-by: Eric Anholt <[email protected]>
* tgsi: move tgsi_processor_to_shader_stage() to a header | Emil Velikov | 2018-03-16 | 2 | -15/+16
  This way we can utilise it with later patches.

  Signed-off-by: Emil Velikov <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* gallium: silence __builtin_frame_address nonzero argument is unsafe warning | Timothy Arceri | 2018-03-13 | 1 | -0/+3
  Calling __builtin_frame_address with a nonzero argument is unsafe but is sometimes done for debugging purposes. Since this code is part of some debug util code, I'm assuming that is the case here and using a GCC pragma to silence the warning.

  Reviewed-by: Jose Fonseca <[email protected]>
* u_vbuf/translate: pass max_index into the set_buffer. | Dave Airlie | 2018-03-12 | 1 | -1/+1
  This fixes a memory trashing crash (not the test) seen with dEQP-GLES3.stress.draw.unaligned_data.random.203 on virgl.

  Reviewed-by: Marek Olšák <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* gallium/util: add helper util_wait_for_idle | Marek Olšák | 2018-03-11 | 2 | -0/+15
  This is an old patch that I had.
* u_blit: (trivial) u_blit.h needs to include p_defines.h | Roland Scheidegger | 2018-03-10 | 1 | -0/+1
  (For the pipe_tex_filter enum)

  Reviewed-by: Mathias Fröhlich <[email protected]>
  Reviewed-by: Jose Fonseca <[email protected]>
* draw: fix alpha value for very short aa lines | Roland Scheidegger | 2018-03-10 | 2 | -2/+24
  The logic would not work correctly for line lengths smaller than 1.0; even a degenerate line with length 0 would still produce a fragment with an alpha anywhere between 0.0 and 0.5.

  Reviewed-by: Brian Paul <[email protected]>
* gallium: Add a util_blitter path for using a custom VS and FS. | Eric Anholt | 2018-03-09 | 2 | -0/+69
  Like the r600 paths to use other custom states, we pass in a couple of parameters to customize the innards of the blitter. It's up to the caller to wrap other state necessary for its shaders (for example, constant buffers for the uniforms the shader uses).

  Reviewed-by: Marek Olšák <[email protected]>
* tegra: Initial support | Thierry Reding | 2018-03-09 | 3 | -1/+32
  Tegra K1 and later use a GPU that can be driven by the Nouveau driver. But the GPU is a pure render node and has no display engine, hence the scanout needs to happen on the Tegra display hardware. The GPU and the display engine each have a separate DRM device node exposed by the kernel.

  To make the setup appear as a single device, this driver instantiates a Nouveau screen with each instance of a Tegra screen and forwards GPU requests to the Nouveau screen. For purposes of scanout it will import buffers created on the GPU into the display driver. Handles that userspace requests are those of the display driver so that they can be used to create framebuffers.

  This has been tested with some GBM test programs, as well as kmscube and weston. All of those run without modifications, but I'm sure there is a lot that can be improved.

  Some fixes contributed by Hector Martin <[email protected]>.

  Changes in v2:
  - duplicate file descriptor in winsys to avoid potential issues
  - require nouveau when building the tegra driver
  - check for nouveau driver name on render node
  - remove unneeded dependency on libdrm_tegra
  - remove zombie references to libudev
  - add missing headers to C_SOURCES variable
  - drop unneeded tegra/ prefix for includes
  - open device files with O_CLOEXEC
  - update copyrights

  Changes in v3:
  - properly unwrap resources in ->resource_copy_region()
  - support vertex buffers passed by user pointer
  - allocate custom stream and const uploader
  - silence error message on pre-Tegra124
  - support X without explicit PRIME

  Changes in v4:
  - ship Meson build files in distribution tarball
  - drop duplicate driver_tegra dependency

  Reviewed-by: Emil Velikov <[email protected]>
  Acked-by: Emil Velikov <[email protected]>
  Tested-by: Andre Heider <[email protected]>
  Reviewed-by: Dmitry Osipenko <[email protected]>
  Reviewed-by: Dylan Baker <[email protected]>
  Signed-off-by: Thierry Reding <[email protected]>
* gallium/st_dri: Honor the glx_disable_sgi_video_sync config option | Thomas Hellstrom | 2018-03-08 | 1 | -0/+1
  This option is disabled by default. Primarily intended for drivers on virtual hardware.

  Signed-off-by: Thomas Hellstrom <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Sinclair Yeh <[email protected]>
  Reviewed-by: Deepak Rawat <[email protected]>
* draw: fix line stippling with aa lines | Roland Scheidegger | 2018-03-07 | 1 | -4/+13
  In contrast to non-aa, where stippling is based on either dx or dy (depending on whether it's an x or y major line), stippling is based on actual distance with smooth lines, so adjust for this.

  (It looks like there are some minor artifacts with mesa demos line-sample and stippling: the line endpoints aren't quite right with aa + stippling, maybe due to the integer math in the stipple stage, but I can't quite pinpoint it.)

  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Jose Fonseca <[email protected]>
* draw: simplify (and correct) aaline fallback (v2) | Roland Scheidegger | 2018-03-07 | 1 | -409/+105
  The motivation actually was to get rid of the additional tex instruction, since that requires the draw fallback code to intercept all sampler / view calls (even if the fallback is never hit).

  Basically, the idea is to use coverage of the pixel to calculate the alpha value, and coverage is simply based on the distance to the center of the line (both in line direction, which is useful for wide lines, and perpendicular to the line). This is much closer to what hw supporting this natively actually does. It also fixes an issue with line width not quite being correct, as well as endpoints getting stretched too far (in line direction) with wide lines, which is apparent with mesa demo line-sample.

  (For llvmpipe, it would probably make sense to do something like this directly when drawing lines, since rendering two tris is twice as expensive as a line, but it would need some changes with state management.)

  Since we're no longer relying on mipmapping to get the alpha value, we also don't need to draw 3 rects (6 tris); one is sufficient.

  There are still issues (as before):
  - quite sure it's not correct without half_pixel_center, but can't test this with GL.
  - aaline + line stipple is incorrect (evident with line-sample demo). Looking at the spec the stipple pattern should actually be based on distance (not just dx or dy for x/y major lines as without aa).
  - outputs (other than pos + the one used for line aa) should be reinterpolated since we actually increase line length by half a pixel (but there are no tests which would care).

  v2: simplify the math (should be equivalent), don't need immediate
  v3: use float versions of atan2,cos,sin, minor cleanups

  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Jose Fonseca <[email protected]>
* tgsi/scan: use wrap-around shift behavior explicitly for file_mask | Roland Scheidegger | 2018-03-06 | 1 | -2/+5
  The comment said it will only represent the lowest 32 regs. This was not entirely true in practice, since at least on x86 you'll get masked shifts (unless the compiler could recognize it already and toss it out).

  It turns out this actually works out alright (presumably no one uses it for temp regs) when increasing max sampler views, so make that behavior explicit. It does feel a bit hacky, but in any case explicit behavior there is better than undefined behavior.

  Reviewed-by: Jose Fonseca <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
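A sketch of what making the wrap-around explicit can look like, assuming a 32-bit mask indexed by register number. In C, shifting a 32-bit value by 32 or more is undefined behavior, so the index is masked to the range 0..31 before the shift. The helper name is hypothetical.

```c
#include <stdint.h>

/* Return the mask bit for a register index. Indices of 32 and above
 * wrap around (index 32 maps to bit 0, 33 to bit 1, and so on), which
 * mirrors what a masked hardware shift would do, but is now defined. */
static inline uint32_t file_mask_bit(unsigned index)
{
   return UINT32_C(1) << (index & 31);
}
```

The trade-off is that two registers 32 apart share a bit, so the mask is only exact for the lowest 32 registers.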
* gallium/aux/hud: Avoid possible buffer overflow | Gert Wollny | 2018-03-05 | 1 | -2/+6
  Limit the length of acceptable cpu names for use in hud_get_num_cpufreq in order to avoid a buffer overflow later in add_object when this name is copied into cpufreq_info::name.

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105274
  Signed-off-by: Gert Wollny <[email protected]>
  Signed-off-by: Marek Olšák <[email protected]>
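The general pattern behind this kind of fix can be sketched as follows, assuming a fixed-size name field that a later copy writes into. The struct, the size constant and the function are hypothetical illustrations, not the HUD code.

```c
#include <stdbool.h>
#include <string.h>

#define DEMO_NAME_SIZE 16   /* hypothetical buffer size */

struct demo_info {
   char name[DEMO_NAME_SIZE];
};

/* Reject names that would not fit (including the terminating NUL)
 * instead of copying them and overrunning the buffer. */
static bool demo_set_name(struct demo_info *info, const char *name)
{
   size_t len = strlen(name);

   if (len >= sizeof(info->name))
      return false;

   memcpy(info->name, name, len + 1);   /* len + 1 copies the NUL too */
   return true;
}
```

Checking the length where the name is first accepted keeps every later copy safe without having to audit each one.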
* gallium/util: use sockets on PIPE_OS_UNIX in u_network | Jonathan Gray | 2018-03-01 | 2 | -10/+4
  Instead of listing all the UNIX PIPE_OS platforms just use PIPE_OS_UNIX. Makes BSD sockets available on PIPE_OS_BSD.

  Signed-off-by: Jonathan Gray <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
* cso: don't cycle through PIPE_MAX_SHADER_SAMPLER_VIEWS on context destroy | Roland Scheidegger | 2018-02-28 | 1 | -1/+3
  There's no point, we know the highest non-null one.

  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Jose Fonseca <[email protected]>
* draw: don't needlessly iterate through all sampler view slots | Roland Scheidegger | 2018-02-28 | 1 | -1/+1
  We already stored the highest (potentially) used number.

  Reviewed-by: Jose Fonseca <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
* gallium/tgsi: remove is_msaa_sampler array from tgsi_shader_info | Timothy Arceri | 2018-02-26 | 2 | -7/+0
  Seems to have not been used since 16be87c90429

  Reviewed-by: Marek Olšák <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* gallium: use PIPE_CAP_CONSTBUF0_FLAGS | Marek Olšák | 2018-02-17 | 2 | -1/+22
* meson: link dri3 xcb libs into vlwinsys instead of into each target | Dylan Baker | 2018-02-15 | 1 | -1/+6
  This makes the dependencies easier to manage, since each media target doesn't need to worry about linking to half a dozen libraries.

  Fixes: b1b65397d0c4978e3 ("meson: Build gallium auxiliary")
  Signed-off-by: Dylan Baker <[email protected]>
  Acked-by: Eric Engestrom <[email protected]>
  Reviewed-by: Emil Velikov <[email protected]>