path: root/src/gallium/auxiliary
Commit message | Author | Age | Files | Lines
* gallivm: optimize repeat linear npot code in the aos int pathJeff Muizelaar2014-03-141-12/+62
| | | | | | Similar to the other cases, shift some weight/coord calculations to int space. This should be slightly faster (on x86 sse it should actually save one instruction, and generally int instructions are cheaper).
* gallivm: use correct rounding for nearest wrap mode (in the aos int path)Roland Scheidegger2014-03-141-29/+9
| | | | | | | | | | | | | The previous code used coords which were calculated as (int) (f_coord * tex_size * 256) >> 8. This is not only unnecessarily complex but can give the wrong texel due to rounding for negative coords (as an example, after denormalization coords from -1.0 to 0.0 should give -1, but this will give -1 for numbers from -1.0-1/256 to 0.0-1/256). Instead, just use ifloor, dropping the shift stuff. Unfortunately, this will most likely be slower - with arch rounding available it shouldn't be too bad (trades an int shift for a round but also saves an int mul (which is shared by all coords)) but otherwise it's a mess.
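The rounding problem described above can be reproduced with a small scalar sketch. The helper names below are made up for illustration and this is not the gallivm code itself; the right shift is written as a floor division so that the result does not depend on how a given compiler shifts negative integers.

```c
/* Floor division by 256; behaves like an arithmetic ">> 8" but is
 * well defined for negative values in portable C. */
static int asr8_sketch(int v)
{
    return (v >= 0) ? v / 256 : -((-v + 255) / 256);
}

/* Integer floor of a float, standing in for the "ifloor" named in the
 * commit message (hypothetical scalar version). */
static int ifloor_sketch(float x)
{
    int i = (int)x;               /* truncates toward zero */
    return ((float)i > x) ? i - 1 : i;
}

/* Old scheme: scale by 256, truncate to int, then drop the 8 fraction bits. */
static int nearest_texel_old(float f_coord, int tex_size)
{
    return asr8_sketch((int)(f_coord * (float)tex_size * 256.0f));
}

/* New scheme: floor of the denormalized coordinate. */
static int nearest_texel_new(float f_coord, int tex_size)
{
    return ifloor_sketch(f_coord * (float)tex_size);
}
```

With a 100-texel texture, a coordinate just below zero (for example -0.00001) lands on texel 0 in the old scheme and on texel -1 in the new one, while positive coordinates agree in both.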
* gallivm: use correct rounding for linear wrap mode (in the aos int path)Jeff Muizelaar2014-03-141-6/+8
| | | | | | | | | | | | | | | | | | | The previous method for converting coords to ints was slightly inaccurate (effectively losing 1bit from the 8bit lerp weight). This is probably especially noticeable when trying to draw a pixel-aligned texture. As an example, for a 100x100 texture after denormalization the texture coords in this case would turn up as 0.5, 1.5, 2.5, 3.5, 4.5, ... After the mul by 256, conversion to int and 128 subtraction, they end up as 0, 256, 512, 768, 1024, ... which gets us the correct coords/weights of 0/0, 1/0, 2/0, 3/0, 4/0, ... But even LSB errors (which are unavoidable) in the input coords may cause these coords/weights to be wrong, e.g. for a coord of 3.49999 we'd get a coord/weight of 2/255 instead. Fix this by using round-to-nearest int instead of FPToSi (trunc). Should be equally fast on x86 sse though other archs probably suffer a little.
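The worked numbers in the message above can be checked with a scalar sketch. All names are hypothetical, and rounding is approximated as floor(x + 0.5), which differs from hardware round-to-nearest-even only on exact ties.

```c
/* Texel index plus 8-bit lerp weight. */
struct coord_weight {
    int coord;
    int weight;
};

/* Integer floor of a float (hypothetical helper). */
static int ifloor_sketch(float x)
{
    int i = (int)x;
    return ((float)i > x) ? i - 1 : i;
}

/* Split a fixed-point value with 8 fraction bits into index and weight,
 * using floor division so negative values behave like an arithmetic shift. */
static struct coord_weight split_fixed(int fixed)
{
    struct coord_weight cw;
    cw.coord = (fixed >= 0) ? fixed / 256 : -((-fixed + 255) / 256);
    cw.weight = fixed - cw.coord * 256;
    return cw;
}

/* Old conversion: truncate (what FPToSI does), then subtract half a texel. */
static struct coord_weight linear_trunc(float coord)
{
    return split_fixed((int)(coord * 256.0f) - 128);
}

/* New conversion: round to nearest, then subtract half a texel. */
static struct coord_weight linear_round(float coord)
{
    return split_fixed(ifloor_sketch(coord * 256.0f + 0.5f) - 128);
}
```

For the coord 3.49999 from the message, `linear_trunc` yields 2/255 and `linear_round` yields 3/0.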
* automake: silence folder creationEmil Velikov2014-03-111-4/+4
| | | | | | | | | | | There is little gain in printing whenever a folder is created. v2: - Use $(AM_V_at) over @ to have control in verbose builds. Suggested by Erik Faye-Lund. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jon TURNEY <[email protected]>
* gallium: allow setting of the internal stream output offsetZack Rusin2014-03-0710-21/+29
| | | | | | | | | | | | | | | | D3D10 allows setting of the internal offset of a buffer, which is in general only incremented via actual stream output writes. By allowing setting of the internal offset draw_auto is capable of rendering from buffers which have not been actually streamed out to. Our interface didn't allow this. This change functionally shouldn't make any difference to OpenGL where instead of an append_bitmask you just get a real array where -1 means append (like in D3D) and 0 means do not append. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
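A minimal sketch of the offset convention described above, assuming one offset per stream output target. The macro and function names are invented for illustration and are not the gallium interface.

```c
/* Hypothetical sentinel: -1 converted to unsigned means "append". */
#define SO_OFFSET_APPEND ((unsigned)-1)

/* Pick the internal offset a target starts writing at.
 * internal_offset: where previous stream output writes ended.
 * requested:       value supplied by the caller for this target. */
static unsigned so_resolve_offset(unsigned internal_offset, unsigned requested)
{
    if (requested == SO_OFFSET_APPEND)
        return internal_offset;   /* keep appending after earlier writes */
    return requested;             /* explicit offset, 0 restarts the buffer */
}
```

Under this reading an OpenGL state tracker would only ever pass the sentinel or 0, while a D3D10 one could pass any offset.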
* gallium: rename R4A4 and A4R4 formats to match their swizzleMarek Olšák2014-03-072-4/+3
| | | | | | Like L4A4. Reviewed-by: Brian Paul <[email protected]>
* vl: Add rotation v3Kusanagi Kouichi2014-03-072-12/+107
| | | | | | | | | v2: rotate in gen_rect_verts instead v3: clear rotate in vl_compositor_clear_layers, update calc_drawn_area as well Signed-off-by: Kusanagi Kouichi <[email protected]> Signed-off-by: Christian König <[email protected]>
* gallium/util: Fix memory leakAaron Watry2014-03-061-0/+2
| | | | | | | | | | Fix a leaked vertex shader in u_blitter.c Signed-off-by: Aaron Watry <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Marek Olšák <[email protected]> CC: "10.1" <[email protected]>
* translate: fix buffer overflowsZack Rusin2014-03-044-6/+18
| | | | | | | | | | | | | | Because in draw we always inject position at slot 0 whenever fragment shader would take the maximum number of inputs (32) it meant that we had PIPE_MAX_ATTRIBS + 1 slots to translate, which meant that we were crashing with fragment shaders that took the maximum number of attributes as inputs. The actual max number of attributes we need to translate thus is PIPE_MAX_ATTRIBS + 1. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: José Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Matthew McClure <[email protected]>
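The off-by-one can be stated as a tiny sketch. The value 32 is taken from the message above; the macro names other than PIPE_MAX_ATTRIBS and the helper functions are hypothetical.

```c
/* Value assumed from the commit message ("maximum number of inputs (32)"). */
#define PIPE_MAX_ATTRIBS 32
/* Hypothetical name for the corrected capacity. */
#define TRANSLATE_MAX_SLOTS_SKETCH (PIPE_MAX_ATTRIBS + 1)

/* Slots to translate: the injected position at slot 0 plus every
 * fragment shader input. */
static unsigned translate_slots_needed(unsigned fs_inputs)
{
    return fs_inputs + 1u;
}

/* Nonzero if a table with 'capacity' entries can hold all slots. */
static int translate_slots_fit(unsigned fs_inputs, unsigned capacity)
{
    return translate_slots_needed(fs_inputs) <= capacity;
}
```

A shader with 32 inputs needs 33 slots, so a table sized PIPE_MAX_ATTRIBS is one entry too small.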
* draw/llvm: fix generation of the VS with GS presentZack Rusin2014-03-041-7/+7
| | | | | | | | | | | | | | | draw_current_shader_* functions return a final output when considering both the geometry shader and the vertex shader. But when generating code for the vertex shader we cannot use output slots from the geometry shader because, obviously, those can be completely different. This fixes a number of very non-obvious crashes. A side-effect of this bug was that sometimes the vertex shading code could save some random outputs as position/clip when the geometry shader was writing them and vertex shader had different outputs at those slots (sometimes writing garbage and sometimes something correct). Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: José Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Matthew McClure <[email protected]>
* util: don't define isfinite(), isnan() for MSVC >= 1800Hans2014-03-031-0/+4
| | | | | Signed-off-by: Brian Paul <[email protected]> Cc: "10.0" "10.1" <[email protected]>
* gallium/util: add missing u_math includeIlia Mirkin2014-02-281-0/+2
| | | | | | | This is needed for MIN2/MAX2 Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* util/u_format: don't crash in util_format_translate if we can't do translationRoland Scheidegger2014-02-272-6/+17
| | | | | | | | | | | Some formats can't be handled - in particular cannot handle ints/uints formats, which lack the pack_rgba_float/unpack_rgba_float functions. Instead of trying to call these (and crash) return an error (I'm not sure yet whether we should try to translate such formats here too; it might not make much sense). v2: suggested by Jose, use separate checks for pack/unpack of rgba_8unorm and rgba_float functions (right now if one exists the other should as well). Reviewed-by: Jose Fonseca <[email protected]>
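The guard described above amounts to checking the function pointers before calling through them. The sketch below uses a made-up, trimmed-down description struct; only the member names pack_rgba_float and unpack_rgba_float come from the message, and their signatures here are assumptions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced format description: either pointer may be NULL
 * for formats (such as pure integer ones) that lack float conversion. */
struct format_desc_sketch {
    void (*unpack_rgba_float)(float *dst, const void *src, unsigned pixels);
    void (*pack_rgba_float)(void *dst, const float *src, unsigned pixels);
};

/* Trivial helpers for an RGBA float format, used only to exercise the sketch. */
static void rgba32f_unpack(float *dst, const void *src, unsigned pixels)
{
    memcpy(dst, src, (size_t)pixels * 4 * sizeof(float));
}

static void rgba32f_pack(void *dst, const float *src, unsigned pixels)
{
    memcpy(dst, src, (size_t)pixels * 4 * sizeof(float));
}

/* Translate via a float RGBA temporary; report failure instead of
 * calling a missing function. */
static bool translate_sketch(const struct format_desc_sketch *dst_fmt,
                             const struct format_desc_sketch *src_fmt,
                             void *dst, const void *src,
                             float *tmp, unsigned pixels)
{
    if (!src_fmt->unpack_rgba_float || !dst_fmt->pack_rgba_float)
        return false;
    src_fmt->unpack_rgba_float(tmp, src, pixels);
    dst_fmt->pack_rgba_float(dst, tmp, pixels);
    return true;
}
```

Checking the unpack and pack pointers separately mirrors the v2 note: the source only needs to be unpackable and the destination only needs to be packable.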
* gallium/upload_mgr: remove useless variable "size"Marek Olšák2014-02-251-6/+4
| | | | Reviewed-by: Fredrik Höglund <[email protected]>
* gallium/upload_mgr: don't unmap buffers if persistent mappings are supportedMarek Olšák2014-02-251-14/+51
| | | | Reviewed-by: Fredrik Höglund <[email protected]>
* gallium: add texture gather support to gallium (v3)Dave Airlie2014-02-251-0/+1
| | | | | | | | | | | | | | | | | | | | | This adds support to gallium for a TG4 instruction, and two CAPs. The first CAP is required for GL_ARB_texture_gather. The second CAP is required to expose GL_ARB_gpu_shader5. However so far we haven't found any hardware that natively exposes the textureGatherOffsets feature from GL, so just lower it for now. If hardware appears for this we can add another CAP to allow TG4 to take 4 offsets. v2: add component selection src and a cap to say hw can do it. (st can use to help control GL_ARB_gpu_shader5/GLSL 4.00). Add docs. v3: rename to SM5, add docs. Reviewed-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* util: Add util_cpu_to_le* helpersTom Stellard2014-02-241-0/+3
| | | | | Reviewed-by: Michel Dänzer <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* util: Add util_bswap64() v3Tom Stellard2014-02-241-0/+16
| | | | | | | | | | | | v2: - Use __builtin_bswap64() - Remove unnecessary mask - Add util_le64_to_cpu() helper v3: - Remove unnecessary AC_SUBST Reviewed-by: Michel Dänzer <[email protected]>
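For reference, a 64-bit byte swap with a little-endian helper might look like the sketch below. The `_sketch` names are invented; the compiler check is a simplification of the configure-time detection mentioned in the log, and the byte-order test relies on the GCC/Clang predefined macros.

```c
#include <stdint.h>

/* Reverse the byte order of a 64-bit value. */
static inline uint64_t bswap64_sketch(uint64_t n)
{
#if defined(__GNUC__) || defined(__clang__)
    return __builtin_bswap64(n);
#else
    /* Portable fallback: swap bytes, then 16-bit pairs, then 32-bit halves. */
    n = ((n & 0x00ff00ff00ff00ffULL) << 8)  | ((n >> 8)  & 0x00ff00ff00ff00ffULL);
    n = ((n & 0x0000ffff0000ffffULL) << 16) | ((n >> 16) & 0x0000ffff0000ffffULL);
    return (n << 32) | (n >> 32);
#endif
}

/* Convert a little-endian 64-bit value to host order. */
static inline uint64_t le64_to_cpu_sketch(uint64_t n)
{
#if defined(__BYTE_ORDER__) && (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
    return bswap64_sketch(n);
#else
    return n;   /* assumed little-endian host: nothing to do */
#endif
}
```

Because a byte swap is its own inverse, the same helper can serve for both the cpu-to-le and le-to-cpu directions.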
* configure.ac: Use AX_GCC_BUILTIN to check availability of __builtin_bswap32 v2Tom Stellard2014-02-241-1/+2
| | | | | | | v2: - Remove unnecessary AC_SUBST Reviewed-by: Matt Turner <[email protected]>
* pipe-loader: wrap pipe_loader_sw_probe_xlib within HAVE_PIPE_LOADER_XLIBEmil Velikov2014-02-243-7/+6
| | | | | | | | | | | | | | | | | | | The above function implies using the xlib winsys, which has additional library dependencies that should not be forced. Make the software xlib pipe loader optional, thus avoiding all the dependency hell. A user that wishes to use the particular pipe-loader would need to set the following within configure.ac. enable_gallium_xlib_loader=yes v2: - Wrap sw/xlib/xlib_sw_winsys.h to handle compilation on systems lacking X11 headers. Spotted by Christian Prochaska. Tested-by: Tom Stellard <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75356 Signed-off-by: Emil Velikov <[email protected]>
* pipe-loader: introduce pipe_loader_sw_probe_null helper functionEmil Velikov2014-02-222-0/+31
| | | | | | | v2: Handle null_sw_create failure, add missing function return type Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jakob Bornecrantz <[email protected]> (v1)
* pipe-loader: introduce pipe_loader_sw_probe_dri helperEmil Velikov2014-02-222-0/+36
| | | | | | | | | | | Will be used in the following commits. v2: Link gallium tests against the library. v3: Handle dri_create_sw_winsys failure v4: Rebase on top of the targets/xa changes Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jakob Bornecrantz <[email protected]> (v2)
* pipe-loader: introduce pipe_loader_sw_probe_xlib helperEmil Velikov2014-02-223-3/+45
| | | | | | | | | Will be used in the upcoming patches. v2: handle xlib_create_sw_winsys failure, drop unneeded header Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jakob Bornecrantz <[email protected]> (v1)
* pipe-loader: use bool type for pipe_loader_drm_probe_fd()Emil Velikov2014-02-222-5/+5
| | | | | | | | v2: Rebase on top of the rendernode changes. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jakob Bornecrantz <[email protected]> (v1) Reviewed-by: Francisco Jerez <[email protected]> (v1)
* winsys/xlib: move xlib_create_sw_winsys within the winsysEmil Velikov2014-02-221-1/+1
| | | | | | v2: Rebase on top of vl_winsys_xsp.c removal Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jakob Bornecrantz <[email protected]> (v1)
* pipe-loader: handle memory allocation failureEmil Velikov2014-02-222-0/+4
| | | | | Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jakob Bornecrantz <[email protected]>
* pipe-loader: build pipe_loader_drm_x_auth whenever HAVE_PIPE_LOADER_XCB is definedEmil Velikov2014-02-221-1/+1
| | | | | | | | Currently HAVE_PIPE_LOADER_XCB is defined, rather than being set to 1/0. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jakob Bornecrantz <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* pipe-loader: destroy sw_winsys on sw_releaseEmil Velikov2014-02-221-0/+3
| | | | | | | | | | The sw pipe-loader implicitly handles winsys_create, thus it would make sense to implicitly destroy it upon releasing the loader. Currently we leak the sw_winsys when releasing the pipe-loader. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jakob Bornecrantz <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* vl/winsys_dri: cleanup vl_screen_create error pathEmil Velikov2014-02-221-13/+19
| | | | | | Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jakob Bornecrantz <[email protected]> Reviewed-by: Christian König <[email protected]>
* tgsi_ureg: add property_gs_invocationsJordan Justen2014-02-202-0/+11
| | | | | | | | Fixes a build break in state_tracker/st_program.c Signed-off-by: Jordan Justen <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75278 Reviewed-by: Dave Airlie <[email protected]>
* gallivm: add smallfloat to float conversion not relying on cpu denorm handlingRoland Scheidegger2014-02-201-20/+65
| | | | | | | | | | | | | | | | | | | | | The previous code relied on cpu denorm support for converting small float formats (such as r11g11b10_float and r16_float) to floats, otherwise denorms are flushed to zero. We worked around that in llvmpipe blend code by reenabling denorms, but this did nothing for texture sampling. Now it would be possible to reenable it there too but I'm not really a fan of messing with fpu flags (and it seems we can't actually do it reliably with llvm in any case looking at some bug reports). (Not to mention if you actually have a lot of denorms in there, you can expect some order-of-magnitude slowdown with x86 cpus.) So instead use code which adjusts exponents etc. directly hence not relying on cpu denorm support for the rescaling mul. (We still need the fpu flag handling as we can't do float-to-smallfloat without using cpu denorms at least for now - I actually wanted to keep both the old and new code and using one or the other depending on from where it's called but that didn't work out as the parameter would have to be passed through more layers than I'd like.) Reviewed-by: Zack Rusin <[email protected]> Reviewed-by: Si Chen <[email protected]>
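To illustrate what adjusting exponents directly can mean, here is a scalar sketch that expands a 16-bit float to 32-bit float bits using only integer operations, so denormal inputs never pass through a floating-point multiply. It is a textbook-style conversion with made-up names, not the vectorized gallivm implementation.

```c
#include <stdint.h>
#include <string.h>

/* Expand IEEE half-precision bits (1 sign, 5 exponent, 10 mantissa)
 * to single-precision bits without any floating-point arithmetic. */
static uint32_t half_to_float_bits(uint16_t h)
{
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    int exp = (h >> 10) & 0x1f;
    uint32_t man = h & 0x3ffu;

    if (exp == 0x1f)                       /* Inf or NaN */
        return sign | 0x7f800000u | (man << 13);

    if (exp == 0) {
        if (man == 0)                      /* signed zero */
            return sign;
        /* Denormal: shift the mantissa up until the implicit leading one
         * appears, lowering the exponent by one for every shift. */
        exp = 1;
        while (!(man & 0x400u)) {
            man <<= 1;
            exp--;
        }
        man &= 0x3ffu;                     /* drop the now-implicit one */
    }

    /* Rebias the exponent from 15 to 127 (difference 112). */
    return sign | ((uint32_t)(exp + 112) << 23) | (man << 13);
}

/* Reinterpret the bits as a float. */
static float half_to_float_sketch(uint16_t h)
{
    uint32_t bits = half_to_float_bits(h);
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

The smallest half denormal, 0x0001, becomes the normal float 2^-24 this way, regardless of whether the CPU is flushing denormals to zero.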
* pipe-loader: split out "client" versionRob Clark2014-02-161-4/+19
| | | | | | | | Build two versions of pipe-loader, with only the client version linking in x11 client side dependencies. This will allow the XA state tracker to use pipe-loader. Signed-off-by: Rob Clark <[email protected]>
* gallium/pipebuffer: change pb_cache_manager_create() size_factor to floatBrian Paul2014-02-142-6/+6
| | | | | | | Requested by Marek. Reviewed-by: Marek Olšák <[email protected]> Cc: "10.1" <[email protected]>
* gallium/util: Add flush/map debug utility codeThomas Hellstrom2014-02-143-0/+530
| | | | | | Signed-off-by: Thomas Hellstrom <[email protected]> Reviewed-by: Brian Paul <[email protected]> Cc: "10.1" <[email protected]>
* gallium/pipebuffer: Add a cache buffer manager bypass maskThomas Hellstrom2014-02-142-4/+22
| | | | | | | | | | | | | In some situations, it may be desirable to bypass the cache at buffer creation but to insert the buffer in the cache at buffer destruction. One such situation is where we already have a kernel representation of a buffer that we want to use, but we also want to insert it in the cache when it's freed up. Signed-off-by: Thomas Hellstrom <[email protected]> Reviewed-by: José Fonseca <[email protected]> Reviewed-by: Brian Paul <[email protected]> Cc: "10.1" <[email protected]>
* pipebuffer, winsys: Add a size match parameter to the cached buffer managerThomas Hellstrom2014-02-142-3/+7
| | | | | | In some situations it's important to restrict the sizes of buffers that the cached buffer manager is allowed to return. Signed-off-by: Thomas Hellstrom <[email protected]> Cc: "10.1" <[email protected]>
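One plausible reading of such a size restriction, combined with the float size_factor from the pb_cache_manager_create() entry above, is sketched below. The function name and the exact comparison are assumptions made for illustration, not the pipebuffer code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Decide whether a cached buffer may satisfy a request.
 * cached_size: size of the buffer sitting in the cache.
 * requested:   size the caller asked for.
 * size_factor: how much larger than the request a reused buffer may be
 *              (1.0f would demand an exact match). */
static bool cache_size_ok_sketch(size_t cached_size, size_t requested,
                                 float size_factor)
{
    if (cached_size < requested)
        return false;                       /* too small to be usable */
    return (float)cached_size <= (float)requested * size_factor;
}
```

A fractional factor such as 1.5f lets the cache hand out moderately oversized buffers without wasting more than half the requested size.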
* vl: add motion adaptive deinterlacerGrigori Goronzy2014-02-143-1/+569
| | | | Reviewed-by: Christian König <[email protected]>
* pipe-loader: Add support for render nodes v2Tom Stellard2014-02-131-3/+77
| | | | | | | v2: - Add missing call to pipe_loader_drm_release() - Fix render node macros - Drop render-node configure option
* pipe-loader: Add auth_x parameter to pipe_loader_drm_probe_fd()Tom Stellard2014-02-132-4/+10
| | | | | The caller can use this boolean parameter to tell the pipe-loader to authenticate with the X server when probing a file descriptor.
* gallium/vl: remove remaining softpipe video functionsChristian König2014-02-132-173/+1
| | | | | | | Unused and unmaintained for quite a while. Signed-off-by: Christian König <[email protected]> Reviewed-by: Maarten Lankhorst <[email protected]>
* auxiliary/pipe-loader: automake: avoid exporting all symbolsEmil Velikov2014-02-111-0/+1
| | | | | Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* pipe-loader: drop obsolete libudev.h includeEmil Velikov2014-02-111-1/+0
| | | | | | | All the udev code is in the loader, so there is no reason for us to include this header. Signed-off-by: Emil Velikov <[email protected]>
* vl: add H264 encoding interfaceChristian König2014-02-111-1/+1
| | | | | Signed-off-by: Christian König <[email protected]> Signed-off-by: Leo Liu <[email protected]>
* gallium/tgsi: correct typo propagated from NV_vertex_program1_1Erik Faye-Lund2014-02-071-2/+2
| | | | | | | | | | | | | | | | In the specification text of NV_vertex_program1_1, the upper limit of the RCC instruction is written as 1.884467e+19 in scientific notation, but as 0x5F800000 in binary. But the binary version translates to 1.84467e+19 rather than 1.884467e+19 in scientific notation. Since the lower-limit equals 2^-64 and the binary version equals 2^+64, let's assume the value in scientific notation is a typo and implement this using the value from the binary version instead. Signed-off-by: Erik Faye-Lund <[email protected]> Reviewed-by: Brian Paul <[email protected]>
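The arithmetic in the message above is easy to verify: 0x5F800000 has a biased exponent of 191 and a zero mantissa, so it encodes 2^(191-127) = 2^64, which is about 1.84467e+19. The sketch below checks that and shows a clamped reciprocal built on those limits. The function names are hypothetical, and the sign handling is an assumption about RCC rather than a quote from the specification.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret 32 bits as an IEEE single-precision float. */
static float float_from_bits(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Local clamp macro, standing in for the CLAMP helper named in the log. */
#define CLAMP_SKETCH(x, lo, hi) ((x) < (lo) ? (lo) : ((x) > (hi) ? (hi) : (x)))

/* Reciprocal whose magnitude is clamped to [2^-64, 2^64]. */
static float rcc_sketch(float x)
{
    const float hi = float_from_bits(0x5F800000u);   /* 2^64  */
    const float lo = float_from_bits(0x1F800000u);   /* 2^-64 */
    float r = 1.0f / x;

    if (r >= 0.0f)
        return CLAMP_SKETCH(r, lo, hi);
    return CLAMP_SKETCH(r, -hi, -lo);
}
```

Inputs whose reciprocal already lies inside the range, such as 2.0f, pass through unchanged.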
* gallium/tgsi: use CLAMP instead of open-coded clampsErik Faye-Lund2014-02-071-22/+4
| | | | | Signed-off-by: Erik Faye-Lund <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* gallium: remove PIPE_USAGE_STATICMarek Olšák2014-02-069-15/+15
| | | | Reviewed-by: Brian Paul <[email protected]>
* vl/rbsp: add H.264 RBSP implementationChristian König2014-02-061-0/+164
| | | | Signed-off-by: Christian König <[email protected]>
* vl/vlc: add function to limit the vlc sizeChristian König2014-02-061-12/+41
| | | | Signed-off-by: Christian König <[email protected]>
* vl/vlc: add remove bits functionChristian König2014-02-061-0/+12
| | | | Signed-off-by: Christian König <[email protected]>
* tgsi/ureg: increase the number of immediatesZack Rusin2014-02-051-1/+1
| | | | | | | | | | ureg_program is allocated on the heap so we can just bump the number of immediates that it can handle. It's needed for d3d10. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>