path: root/src/gallium/auxiliary
Commit message | Author | Age | Files | Lines
* gallivm: add smallfloat to float conversion not relying on cpu denorm handling | Roland Scheidegger | 2014-02-20 | 1 | -20/+65
| | | | | | | | | | | | | | | | | | | | | | The previous code relied on cpu denorm support for converting small float formats (such as r11g11b10_float and r16_float) to floats; without it, denorms are flushed to zero. We worked around that in llvmpipe blend code by reenabling denorms, but this did nothing for texture sampling. Now it would be possible to reenable it there too, but I'm not really a fan of messing with fpu flags (and it seems we can't actually do it reliably with llvm in any case, looking at some bug reports). (Not to mention that if you actually have a lot of denorms in there, you can expect some order-of-magnitude slowdown with x86 cpus.) So instead use code which adjusts exponents etc. directly, hence not relying on cpu denorm support for the rescaling mul. (We still need the fpu flag handling as we can't do float-to-smallfloat without using cpu denorms, at least for now. I actually wanted to keep both the old and new code and use one or the other depending on where it's called from, but that didn't work out as the parameter would have to be passed through more layers than I'd like.) Reviewed-by: Zack Rusin <[email protected]> Reviewed-by: Si Chen <[email protected]>
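The idea of "adjusting exponents directly" can be illustrated for r16_float with a small Python sketch. This is not the gallivm code from the commit; the function name `half_bits_to_float_bits` is hypothetical, and the sketch only shows that a half-float denorm can be renormalized with integer operations, so no floating-point multiply (and hence no cpu denorm support) is involved.

```python
import struct

def half_bits_to_float_bits(h):
    """Convert IEEE binary16 bits to binary32 bits using integer ops only.

    Hypothetical helper for illustration; not taken from Mesa.
    """
    sign = (h & 0x8000) << 16
    exp = (h >> 10) & 0x1F
    man = h & 0x3FF
    if exp == 0x1F:
        # Inf or NaN: all-ones exponent, mantissa moved to the top bits.
        return sign | 0x7F800000 | (man << 13)
    if exp == 0:
        if man == 0:
            return sign  # signed zero
        # Denorm: value is man * 2^-24. Shift the leading 1 up to bit 10,
        # drop it (it becomes the implicit bit), and lower the exponent.
        shift = 11 - man.bit_length()
        man = (man << shift) & 0x3FF
        exp = 1 - shift
    # Rebias the exponent from 15 (half) to 127 (float): 127 - 15 = 112.
    return sign | ((exp + 112) << 23) | (man << 13)

if __name__ == "__main__":
    smallest_denorm = half_bits_to_float_bits(0x0001)
    print(hex(smallest_denorm))
    print(struct.unpack("<f", struct.pack("<I", smallest_denorm))[0])
```

Every half-float denorm is a normal number in single precision, which is why the result never needs denorm handling on the float side.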
* pipe-loader: split out "client" version | Rob Clark | 2014-02-16 | 1 | -4/+19
| | | | | | | | Build two versions of pipe-loader, with only the client version linking in x11 client side dependencies. This will allow the XA state tracker to use pipe-loader. Signed-off-by: Rob Clark <[email protected]>
* gallium/pipebuffer: change pb_cache_manager_create() size_factor to float | Brian Paul | 2014-02-14 | 2 | -6/+6
| | | | | | | Requested by Marek. Reviewed-by: Marek Olšák <[email protected]> Cc: "10.1" <[email protected]>
* gallium/util: Add flush/map debug utility code | Thomas Hellstrom | 2014-02-14 | 3 | -0/+530
| | | | | | Signed-off-by: Thomas Hellstrom <[email protected]> Reviewed-by: Brian Paul <[email protected]> Cc: "10.1" <[email protected]>
* gallium/pipebuffer: Add a cache buffer manager bypass mask | Thomas Hellstrom | 2014-02-14 | 2 | -4/+22
| | | | | | | | | | | | | In some situations, it may be desirable to bypass the cache at buffer creation but to insert the buffer in the cache at buffer destruction. One such situation is where we already have a kernel representation of a buffer that we want to use, but we also want to insert it in the cache when it's freed up. Signed-off-by: Thomas Hellstrom <[email protected]> Reviewed-by: José Fonseca <[email protected]> Reviewed-by: Brian Paul <[email protected]> Cc: "10.1" <[email protected]>
* pipebuffer, winsys: Add a size match parameter to the cached buffer manager | Thomas Hellstrom | 2014-02-14 | 2 | -3/+7
| | | | | | | | In some situations it's important to restrict the sizes of buffers that the cached buffer manager is allowed to return. Signed-off-by: Thomas Hellstrom <[email protected]> Cc: "10.1" <[email protected]>
* vl: add motion adaptive deinterlacer | Grigori Goronzy | 2014-02-14 | 3 | -1/+569
| | | | Reviewed-by: Christian König <[email protected]>
* pipe-loader: Add support for render nodes v2 | Tom Stellard | 2014-02-13 | 1 | -3/+77
| | | | | | | v2: - Add missing call to pipe_loader_drm_release() - Fix render node macros - Drop render-node configure option
* pipe-loader: Add auth_x parameter to pipe_loader_drm_probe_fd() | Tom Stellard | 2014-02-13 | 2 | -4/+10
| | | | | The caller can use this boolean parameter to tell the pipe-loader to authenticate with the X server when probing a file descriptor.
* gallium/vl: remove remaining softpipe video functions | Christian König | 2014-02-13 | 2 | -173/+1
| | | | | | | Unused and unmaintained for quite a while. Signed-off-by: Christian König <[email protected]> Reviewed-by: Maarten Lankhorst <[email protected]>
* auxiliary/pipe-loader: automake: avoid exporting all symbols | Emil Velikov | 2014-02-11 | 1 | -0/+1
| | | | | Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* pipe-loader: drop obsolete libudev.h include | Emil Velikov | 2014-02-11 | 1 | -1/+0
| | | | | | | All the udev code is in the loader, so there is no reason for us to include this header. Signed-off-by: Emil Velikov <[email protected]>
* vl: add H264 encoding interface | Christian König | 2014-02-11 | 1 | -1/+1
| | | | | Signed-off-by: Christian König <[email protected]> Signed-off-by: Leo Liu <[email protected]>
* gallium/tgsi: correct typo propagated from NV_vertex_program1_1 | Erik Faye-Lund | 2014-02-07 | 1 | -2/+2
| | | | | | | | | | | | | | | | In the specification text of NV_vertex_program1_1, the upper limit of the RCC instruction is written as 1.884467e+19 in scientific notation, but as 0x5F800000 in binary. But the binary version translates to 1.84467e+19 rather than 1.884467e+19 in scientific notation. Since the lower-limit equals 2^-64 and the binary version equals 2^+64, let's assume the value in scientific notation is a typo and implement this using the value from the binary version instead. Signed-off-by: Erik Faye-Lund <[email protected]> Reviewed-by: Brian Paul <[email protected]>
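The argument above can be checked by decoding the bit pattern. The Python sketch below reinterprets 0x5F800000 as an IEEE single-precision float and compares it with both decimal spellings. The helper name `float_from_bits` and the lower-limit pattern 0x1F800000 are assumptions added for illustration (the latter is simply the encoding of 2^-64, not a value quoted from the specification).

```python
import struct

def float_from_bits(bits):
    """Reinterpret a 32-bit integer as an IEEE 754 single-precision float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

# Upper limit as given in binary in the commit message.
RCC_UPPER = float_from_bits(0x5F800000)

# Assumption for illustration: the bit pattern of 2^-64, the lower limit
# named in the commit message (biased exponent 127 - 64 = 63, mantissa 0).
RCC_LOWER = float_from_bits(0x1F800000)

if __name__ == "__main__":
    print(RCC_UPPER == 2.0 ** 64)
    print(RCC_LOWER == 2.0 ** -64)
    # The binary value is far closer to 1.84467e+19 than to 1.884467e+19.
    print(abs(RCC_UPPER - 1.84467e19) < abs(RCC_UPPER - 1.884467e19))
```

The biased exponent of 0x5F800000 is 191, so the value is exactly 2^(191-127) = 2^64, which supports reading the decimal constant as a typo.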
* gallium/tgsi: use CLAMP instead of open-coded clamps | Erik Faye-Lund | 2014-02-07 | 1 | -22/+4
| | | | | Signed-off-by: Erik Faye-Lund <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* gallium: remove PIPE_USAGE_STATIC | Marek Olšák | 2014-02-06 | 9 | -15/+15
| | | | Reviewed-by: Brian Paul <[email protected]>
* vl/rbsp: add H.264 RBSP implementation | Christian König | 2014-02-06 | 1 | -0/+164
| | | | Signed-off-by: Christian König <[email protected]>
* vl/vlc: add function to limit the vlc size | Christian König | 2014-02-06 | 1 | -12/+41
| | | | Signed-off-by: Christian König <[email protected]>
* vl/vlc: add remove bits function | Christian König | 2014-02-06 | 1 | -0/+12
| | | | Signed-off-by: Christian König <[email protected]>
* tgsi/ureg: increase the number of immediates | Zack Rusin | 2014-02-05 | 1 | -1/+1
| | | | | | | | | | ureg_program is allocated on the heap so we can just bump the number of immediates that it can handle. It's needed for d3d10. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: make sure analysis works with large number of immediates | Zack Rusin | 2014-02-05 | 1 | -8/+9
| | | | | | | | | | | We need to handle a lot more immediates and in order to do that we also switch from allocating this structure on the stack to allocating it on the heap. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: handle huge number of immediates | Zack Rusin | 2014-02-05 | 4 | -44/+86
| | | | | | | | | | | | | | We only supported up to 256 immediates, which isn't enough. We had code which was allocating immediates as an allocated array, but it was always used along a statically backed array for performance reasons. This commit adds code to skip that performance optimization and always use just the dynamically allocated immediates if the number of them is too great. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: allow large numbers of temporaries | Zack Rusin | 2014-02-05 | 4 | -5/+20
| | | | | | | | | | | | | | The number of allowed temporaries increases almost with every iteration of an api. We used to support 128, then we started increasing and the newer api's support 4096+. So if we notice that the number of temporaries is larger than our statically allocated storage would allow we just treat them as indexable temporaries and allocate them as an array from the start. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: fix F2U opcode | Roland Scheidegger | 2014-02-05 | 1 | -20/+22
| | | | | | | | | | | | | | | | | | | | | Previously, we were really doing F2I. Also move it to the generic section. (Note that for llvmpipe the code generated is definitely bad, due to the lack of unsigned conversions with sse. I think, though, that what llvm does (using scalar conversions to 64bit signed, either with x87 fpu (32bit) or sse (64bit), including lots of domain changes) is quite suboptimal; it could do something like
    is_large = arg >= 2^31
    half_arg = 0.5 * arg
    small_c = fptoint(arg)
    large_c = fptoint(half_arg) << 1
    res = select(is_large, large_c, small_c)
which should take far fewer instructions, but that's something llvm should do itself.) This fixes piglit fs/vs-float-uint-conversion.shader_test (maybe more; needs GL 3.0 version override to run). Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Zack Rusin <[email protected]>
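The select-based conversion suggested in the message can be modelled in Python as a rough sketch. The names `fptosi32`, `select` and `f2u` are hypothetical; `fptosi32` models a truncating signed 32-bit conversion that returns 0x80000000 for out-of-range inputs (as SSE `cvttps2dq` does). Inputs are assumed to be finite, non-negative values representable as 32-bit floats below 2^32; NaN and infinities are not handled.

```python
INT32_MIN = -2 ** 31

def fptosi32(x):
    """Truncating float -> signed 32-bit conversion.

    Out-of-range inputs yield INT32_MIN, modelling the 'integer
    indefinite' result of SSE cvttps2dq. NaN/inf are not handled.
    """
    v = int(x)
    return v if -2 ** 31 <= v < 2 ** 31 else INT32_MIN

def select(cond, a, b):
    """Scalar stand-in for a per-lane vector select."""
    return a if cond else b

def f2u(arg):
    """Float -> unsigned 32-bit using only signed conversions."""
    is_large = arg >= 2.0 ** 31
    half_arg = 0.5 * arg
    # Both candidates are always computed, as they would be in SIMD code.
    small_c = fptosi32(arg) & 0xFFFFFFFF
    large_c = (fptosi32(half_arg) << 1) & 0xFFFFFFFF
    return select(is_large, large_c, small_c)

if __name__ == "__main__":
    for value in (0.0, 7.9, 2147483648.0, 3000000000.0, 4294967040.0):
        print(value, f2u(value))
```

Halving is exact for 32-bit floats at or above 2^31 because their spacing is at least 256, so shifting the converted half back left by one loses nothing.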
* draw: fix incorrect color of flat-shaded clipped lines | Brian Paul | 2014-02-03 | 1 | -1/+12
| | | | | | | | | | | | When we clipped a line we weren't copying the provoking vertex color to the second vertex. We also weren't checking for first vs. last provoking vertex. Fixes failures found with the new piglit line-flat-clip-color test. Cc: "10.0, 10.1" <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* gallium/auxiliary/indices: replace free() with FREE() | Brian Paul | 2014-02-03 | 1 | -1/+1
| | | | | | | | To match the CALLOC_STRUCT() call. Cc: "10.0, 10.1" <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: fix opcode and function nesting | Zack Rusin | 2014-02-03 | 2 | -157/+317
| | | | | | | | | | | | | | gallivm soa code supported only a single level of nesting for control flow opcodes (if, switch, loops...) but the d3d10 spec clearly states that those are nested within functions. To support nesting of conditionals inside functions we need to store the nesting data inside function contexts and keep a stack of those. Furthermore we make sure that if nesting for subroutines is deeper than 32 then we simply ignore all subsequent 'call' invocations. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
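The structure described above (per-function nesting data kept on a stack of function contexts, with calls ignored past a depth limit) might be sketched as follows. All names here (`FunctionContext`, `ExecState`, `MAX_CALL_DEPTH`) are hypothetical and not taken from gallivm, and whether the main function counts toward the limit of 32 is an assumption of this sketch.

```python
MAX_CALL_DEPTH = 32  # limit quoted in the commit message

class FunctionContext:
    """Control-flow nesting state owned by a single function invocation."""
    def __init__(self):
        self.cond_stack = []    # open if/else blocks
        self.loop_stack = []    # open loops
        self.switch_stack = []  # open switch statements

class ExecState:
    """Keeps one FunctionContext per active function, innermost last."""
    def __init__(self):
        self.function_stack = [FunctionContext()]  # context for main

    @property
    def current(self):
        return self.function_stack[-1]

    def call(self):
        """Enter a subroutine; returns False if the call is ignored."""
        # Assumption: main does not count toward the depth limit.
        if len(self.function_stack) - 1 >= MAX_CALL_DEPTH:
            return False
        self.function_stack.append(FunctionContext())
        return True

    def ret(self):
        """Leave a subroutine, restoring the caller's nesting state."""
        if len(self.function_stack) > 1:
            self.function_stack.pop()
```

Because each call pushes a fresh context, an `if` opened inside a subroutine cannot disturb the caller's conditional stack, which is the property a single shared nesting level could not provide.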
* gallivm: add a few const qualifiers | Brian Paul | 2014-02-02 | 2 | -4/+4
| | | | Trivial.
* translate: reindent translate_sse.c | Brian Paul | 2014-02-02 | 1 | -472/+474
| | | | Trivial.
* gallivm: Workaround http://llvm.org/PR18600 | José Fonseca | 2014-01-28 | 1 | -2/+4
| | | | | | | | | | | | | | | | | | We have code generation paths that carry out swizzles of AoS vectors via bitwise shifts, as these tend to generate more efficient code than straightforward byte shuffles. But when the input is a constant, the additional bitwise arithmetic operations somehow don't really get constant propagated properly, eventually causing an assertion failure in the InstCombine pass. Therefore avoid the bug by using the trivial shuffles for constant inputs. Although the sample LLVM IR can cause a crash with any LLVM version, this was only seen in practice with LLVM 3.2. Reviewed-by: Matthew McClure <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* util/u_vbuf: correct map offset calculation for crazy offsets | Ilia Mirkin | 2014-01-27 | 1 | -1/+1
| | | | | | | When the min_index is very large (or very negative), the multiplication can overflow 32 bits and result in an incorrect map pointer modification. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* translate: deal with size overflows by casting to ptrdiff_t | Ilia Mirkin | 2014-01-27 | 2 | -3/+7
| | | | | | | | | | | This was discovered as a result of the draw-elements-base-vertex-neg piglit test, which passes very negative offsets in, followed up by large indices. The nouveau code correctly adjusts the pointer, but the translate code needs to do the proper inverse correction. Similarly fix up the SSE code to do a 64-bit multiply to compute the proper offset. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Brian Paul <[email protected]>
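The kind of overflow addressed here can be reproduced with a small Python model of 32-bit wraparound. The function names and the stride/index values are made up for illustration; the real fix is in C, where the operands are widened (e.g. to ptrdiff_t) before multiplying.

```python
def wrap_int32(value):
    """Reduce an integer to the value a signed 32-bit register would hold."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value

def offset_32bit(stride, index):
    """Byte offset computed with a 32-bit multiply (wraps on overflow)."""
    return wrap_int32(stride * index)

def offset_wide(stride, index):
    """Byte offset computed at pointer width (modelled by Python's big ints)."""
    return stride * index

if __name__ == "__main__":
    stride, index = 16, 1 << 28  # hypothetical vertex stride and index
    print(offset_32bit(stride, index))  # wrapped result
    print(offset_wide(stride, index))   # intended result
```

With a 16-byte stride, an index of 2^28 already produces a product of 2^32, which a 32-bit multiply silently truncates.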
* gallium/rtasm: handle mmap failures appropriately | Emil Velikov | 2014-01-27 | 1 | -3/+7
| | | | | | | | | | | | | | mmap can fail for a variety of reasons (selinux and pax, to name a few), and with the current code this will result in a crash in the driver, if not worse. This has been the case since the inception of the gallium copy of rtasm. Cc: 9.1 9.2 10.0 <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73473 Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jakob Bornecrantz <[email protected]>
* draw: Save original driver functions earlier. | José Fonseca | 2014-01-23 | 2 | -14/+14
| | | | | | | | Otherwise they will be NULL when stage destroy is invoked prematurely (i.e., on out of memory). Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* os/os_thread: Revert pipe_barrier pre-processing logic. | José Fonseca | 2014-01-23 | 1 | -1/+1
| | | | | Whitelist platforms instead of blacklisting, as several pthread implementations are missing pthread_barrier_t, in particular MacOSX.
* gallium: Use C11 thread abstractions. | José Fonseca | 2014-01-23 | 1 | -230/+32
| | | | | | | Note that PIPE_ROUTINE now returns an int. Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* os: Remove pipe_static_condvar. | José Fonseca | 2014-01-23 | 1 | -12/+0
| | | | | | Never used. Reviewed-by: Brian Paul <[email protected]>
* gallium/util: util_format_srgb should not return FORMAT_NONE for sRGB formats | Marek Olšák | 2014-01-23 | 1 | -0/+3
| | | | | | | | | This fixes a serious regression introduced in 4e549ddb500cf677b6fa16d9ebdfa67cc23da097. Cc: 9.2 10.0 <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* gallium/u_upload_mgr: don't expose u_upload_flush | Marek Olšák | 2014-01-23 | 2 | -22/+4
| | | | | | | It's unused and shouldn't be used at all in my opinion. If some driver doesn't support the unsynchronized flag, u_upload_mgr should avoid the synchronization by other means, e.g. by using the DONTBLOCK flag.
* gallium/hud: just unmap the upload vertex buffer instead of recreating it | Marek Olšák | 2014-01-23 | 1 | -1/+1
|
* gallium/vl: use u_upload_mgr to upload vertices for vl_compositor | Marek Olšák | 2014-01-23 | 2 | -32/+20
| | | | | | | This is the recommended way for streaming vertices. Always use this if you need to upload vertices every frame. Reviewed-by: Christian König <[email protected]>
* draw: fix points with negative w coords for d3d style point clipping | Roland Scheidegger | 2014-01-21 | 1 | -2/+6
| | | | | | | | | | | | | Even with depth clipping disabled, vertices which have negative w coords must be discarded. And since we don't have a proper guardband implementation yet (relying on driver to handle all values except infs/nans in rasterization for such points) we need to kill them off manually (as they can end up with coordinates inside viewport otherwise). v2: use 0.0f instead of 0 (spotted by Brian). Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* draw: use some cast wrappers in draw_pt_fetch_shade_pipeline*.c | Brian Paul | 2014-01-20 | 2 | -19/+29
| | | | Trivial.
* draw: whitespace and formatting fixes in draw_pt_fetch_shade_pipeline*.c | Brian Paul | 2014-01-20 | 2 | -81/+105
| | | | Trivial.
* draw: fix incorrect vertex size computation in LLVM drawing code | Brian Paul | 2014-01-20 | 2 | -11/+30
| | | | | | | | | | | | | | We were calling draw_total_vs_outputs() too early. The call to draw_pt_emit_prepare() could result in the vertex size changing. So call draw_total_vs_outputs() after draw_pt_emit_prepare(). This fix would seem to be needed for the non-LLVM code as well, but it's not obvious. Instead, I added an assertion there to try to catch this problem if it were to occur there. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72926 Cc: 10.0 <[email protected]> Reviewed-by: José Fonseca <[email protected]>
* draw: clean up d3d style point clipping | Roland Scheidegger | 2014-01-20 | 5 | -25/+42
| | | | | | | | | | | | | | | | | | | | Instead of skipping x/y clipping completely if there's point_tri_clip points, use guard band clipping. This should be easier (previously we could not disable generating the x/y bits in the clip mask for the llvm path, hence requiring a custom clip path), and it also allows us to enable this for tris-as-points more easily too (this would require custom tri clip filtering too otherwise). Moreover, some unexpected things could have happened if there's a NaN or just a huge number in some tri-turned-point, as the driver's rasterizer would need to deal with it and that might well lead to undefined behavior in typical rasterizers (which need to convert these numbers to fixed point). Using a guardband should hence be more robust, while "usually" guaranteeing the same results. (Only "usually" because unlike hw guardbands the draw guardband is always just twice the vp size, hence a small vp but large points could still lead to different results.) Unfortunately, because the clipmask generated is completely unaffected by guard band clipping, we still need a custom clip stage for points (but not for tris, as the actual clipping there takes the guard band into account). Reviewed-by: Jose Fonseca <[email protected]>
* pipe-loader: Fix build | Armin K | 2014-01-19 | 1 | -0/+1
| | | | | | | pipe_loader_drm.c: In function 'pipe_loader_drm_probe_fd': pipe_loader_drm.c:120:4: error: implicit declaration of function 'loader_get_pci_id_for_fd' [-Werror=implicit-function-declaration] Reviewed-by: Emil Velikov <[email protected]>
* pipe-loader: add support for non-pci (platform) devices | Emil Velikov | 2014-01-18 | 2 | -0/+3
| | | | | | | | | Culled out of the "loader: refactor duplicated code into loader util lib" patch by Rob Clark. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Rob Clark <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
* pipe-loader: use loader util lib | Emil Velikov | 2014-01-18 | 2 | -81/+14
| | | | | | Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Rob Clark <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
* loader: introduce the loader util lib | Emil Velikov | 2014-01-18 | 1 | -0/+1
| | | | | | | | | | | | | | | | | | | | | | All the various window system integration layers duplicate roughly the same code for figuring out device and driver name, pci-id's, etc. Which is sad. So extract it out into a loader util lib. v2 (Emil) * Separate the introduction of libloader from the code de-duplication. * Strip out non-pci devices support. * Add scons + Android build system support. * Add VISIBILITY_CFLAGS to avoid exporting the loader funcs. v3 (Emil) * PIPE_OS_ANDROID is undefined at this scope, use ANDROID * Make sure we define _EGL_NO_DRM when building only swrast Signed-off-by: Rob Clark <[email protected]> Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Rob Clark <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]> Reviewed-by: Ian Romanick <[email protected]>