summaryrefslogtreecommitdiffstats
path: root/src/gallium/auxiliary
Commit message (Collapse)AuthorAgeFilesLines
* gallium/u_blitter: implement scaled blitting in the Z directionMarek Olšák2014-04-041-9/+31
| | | | So that pipe->blit can be used for 3D mipmap generation.
* gallium/u_blitter: don't adjust cubemap coordinates by a small numberMarek Olšák2014-04-041-1/+1
| | | | | It may cause issues with mipmap generation. I think it was used to make some piglit tests pass on r300g.
* cso: check for no sampler view changes in cso_set_sampler_views()Brian Paul2014-04-031-3/+8
| | | | | | | | | As we do for sampler states in single_sampler_done() and many other CSO functions. Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: José Fonseca <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* cso: fix sampler view count in cso_set_sampler_views()Brian Paul2014-04-021-3/+4
| | | | | | | | | | | | | | We want to call pipe->set_sampler_views() with count being the maximum of the old number of sampler views and the new number. This makes sure we null-out any old sampler views. We already do the same thing for sampler states in single_sampler_done(). Fixes some assertions seen in the VMware driver with XA tracker. Cc: "10.0" "10.1" <[email protected]> Reviewed-by: Thomas Hellstrom <[email protected]> Tested-by: Thomas Hellstrom <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* draw/llvm: improve debugging output a bitZack Rusin2014-03-262-2/+3
| | | | | | | | | | | it's useful to know what the llvmbuildstore arguments are going to be before executing it because it can crash and make sure to print out the inputs only if we're not generating a gs because it fetches inputs differently. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* draw/gs: reduce the size of the gs output bufferZack Rusin2014-03-261-7/+13
| | | | | | | | | | | | We used to overallocate the output buffer sometimes running out of memory with applications rendering large geometries. The actual maximum number of vertices out is simply the maximum number of primitives in (number of gs invocations) multiplied by the maximum number of output vertices per gs input primitive (i.e. gs invocation). Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: fix no-op n:n lp_build_resize()Roland Scheidegger2014-03-261-6/+6
| | | | | | | | | | This can get called in some circumstances if both src type and dst type have same width (seen with float32->unorm32). While this particular case was bogus anyway let's just fix that as it can work trivially (due to the way it was called it actually worked anyway apart from the assert). Reviewed-by: Zack Rusin <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* draw: Duplicate TGSI tokens in draw_pipe_pstipple module.José Fonseca2014-03-251-1/+2
| | | | | | | | | As done in draw_pipe_aaline and draw_pipe_aapoint modules. Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Zack Rusin <[email protected]> Cc: "10.0 10.1" <[email protected]>
* llvmpipe: add support for b5g6r5_srgbRoland Scheidegger2014-03-213-5/+25
| | | | | | | | | | | | | The conversion code for srgb was tuned for n x 4x8bit AoS -> 4 x nxfloat SoA (and vice versa), fix this to handle also 16bit 565-style srgb formats. Still not really all that generic, things like r10g10b10a2_srgb or r4g4b4a4_srgb wouldn't work (the latter trivial to fix, the former would not require more work to not crash but near certainly need some higher precision calculation) but not needed right now. The code is not fully optimized for this (could use more direct calculation instead of expanding to 8-bit range first) but should be good enough. Reviewed-by: Jose Fonseca <[email protected]>
* gallium: add b5g6r5 srgb formatRoland Scheidegger2014-03-213-4/+19
| | | | | | | | | | | | | | | GL generally doesn't seem to allow srgb formats with less (or more) than 8 bit for the rgb channels, though some hw could easily do it (typically for formats with up to 10 bits for the rgb channels, at least for formats with less than 8 bits support is likely widespread even). While it may be true there aren't really any benefits for such formats, we need for it for d3d, though luckily only for b5g6r5_srgb it seems. So add this format along with the util code for conversion - since that util code is heavily tuned for 8bit srgb this isn't really all that well optimized and rounding doesn't seem right but at least it should give some halfway meaningful results. Reviewed-by: Jose Fonseca <[email protected]>
* gallium/u_gen_mipmap: remove the software fallbackMarek Olšák2014-03-211-1160/+2
| | | | | | | | The last changes to it are from 2008 and 2009. It doesn't support most texture formats and some texture targets. Nobody can possibly be using this. Reviewed-by: Brian Paul <[email protected]>
* st/mesa: fix generating mipmaps for cube arraysMarek Olšák2014-03-211-28/+20
| | | | | Cc: [email protected] Reviewed-by: Brian Paul <[email protected]>
* gallivm: optimize repeat linear npot code in the aos int pathJeff Muizelaar2014-03-141-12/+62
| | | | | | Similar to the other cases, shift some weight/coord calculations to int space. This should be slightly faster (on x86 sse it should actually safe one instruction, and generally int instructions are cheaper).
* gallivm: use correct rounding for nearest wrap mode (in the aos int path)Roland Scheidegger2014-03-141-29/+9
| | | | | | | | | | | | | The previous code used coords which were calculated as (int) (f_coord * tex_size * 256) >> 8. This is not only unnecessarily complex but can give the wrong texel due to rounding for negative coords (as an example, after denormalization coords from -1.0 to 0.0 should give -1, but this will give -1 for numbers from -1.0-1/256 - 0.0-1/256. Instead, juse use ifloor, dropping the shift stuff. Unfortunately, this will most likely be slower - with arch rounding available it shouldn't be too bad (trades a int shift for a round but also saves an int mul (which is shared by all coords) but otherwise it's a mess.
* gallivm: use correct rounding for linear wrap mode (in the aos int path)Jeff Muizelaar2014-03-141-6/+8
| | | | | | | | | | | | | | | | | | | The previous method for converting coords to ints was sligthly inaccurate (effectively losing 1bit from the 8bit lerp weight). This is probably especially noticeable when trying to draw a pixel-aligned texture. As an example, for a 100x100 texture after dernormalization the texture coords in this case would turn up as 0.5, 1.5, 2.5, 3.5, 4.5, ... After the mul by 256, conversion to int and 128 subtraction, they end up as 0, 256, 512, 768, 1024, ... which gets us the correct coords/weights of 0/0, 1/0, 2/0, 3/0, 4/0, ... But even LSB errors (which are unavoidable) in the input coords may cause these coords/weights to be wrong, e.g. for a coord of 3.49999 we'd get a coord/weight of 2/255 instead. Fix this by using round-to-nearest int instead of FPToSi (trunc). Should be equally fast on x86 sse though other archs probably suffer a little.
* automake: silence folder creationEmil Velikov2014-03-111-4/+4
| | | | | | | | | | | There is little gain in printing whenever a folder is created. v2: - Use $(AM_V_at) over @ to have control in verbose builds. Suggested by Erik Faye-Lund. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jon TURNEY <[email protected]>
* gallium: allow setting of the internal stream output offsetZack Rusin2014-03-0710-21/+29
| | | | | | | | | | | | | | | | D3D10 allows setting of the internal offset of a buffer, which is in general only incremented via actual stream output writes. By allowing setting of the internal offset draw_auto is capable of rendering from buffers which have not been actually streamed out to. Our interface didn't allow. This change functionally shouldn't make any difference to OpenGL where instead of an append_bitmask you just get a real array where -1 means append (like in D3D) and 0 means do not append. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallium: rename R4A4 and A4R4 formats to match their swizzleMarek Olšák2014-03-072-4/+3
| | | | | | Like L4A4. Reviewed-by: Brian Paul <[email protected]>
* vl: Add rotation v3Kusanagi Kouichi2014-03-072-12/+107
| | | | | | | | | v2: rotate in gen_rect_verts instead v3: clear rotate in vl_compositor_clear_layers, update calc_drawn_area as well Signed-off-by: Kusanagi Kouichi <[email protected]> Signed-off-by: Christian König <[email protected]>
* gallium/util: Fix memory leakAaron Watry2014-03-061-0/+2
| | | | | | | | | | Fix a leaked vertex shader in u_blitter.c Signed-off-by: Aaron Watry <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Marek Olšák <[email protected]> CC: "10.1" <[email protected]>
* translate: fix buffer overflowsZack Rusin2014-03-044-6/+18
| | | | | | | | | | | | | | Because in draw we always inject position at slot 0 whenever fragment shader would take the maximum number of inputs (32) it meant that we had PIPE_MAX_ATTRIBS + 1 slots to translate, which meant that we were crashing with fragment shaders that took the maximum number of attributes as inputs. The actual max number of attributes we need to translate thus is PIPE_MAX_ATTRIBS + 1. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: José Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Matthew McClure <[email protected]>
* draw/llvm: fix generation of the VS with GS presentZack Rusin2014-03-041-7/+7
| | | | | | | | | | | | | | | | | draw_current_shader_* functions return a final output when considering both the geometry shader and the vertex shader. But when code generating vertex shader we can not be using output slots from the geometry shader because, obviously, those can be completely different. This fixes a number of very non-obvious crashes. A side-effect of this bug was that sometimes the vertex shading code could save some random outputs as position/clip when the geometry shader was writing them and vertex shader had different outputs at those slots (sometimes writing garbage and sometimes something correct). Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: José Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Matthew McClure <[email protected]>
* util: don't define isfinite(), isnan() for MSVC >= 1800Hans2014-03-031-0/+4
| | | | | Signed-off-by: Brian Paul <[email protected]> Cc: "10.0" "10.1" <[email protected]>
* gallium/util: add missing u_math includeIlia Mirkin2014-02-281-0/+2
| | | | | | | This is needed for MIN2/MAX2 Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* util/u_format: don't crash in util_format_translate if we can't do translationRoland Scheidegger2014-02-272-6/+17
| | | | | | | | | | | | Some formats can't be handled - in particular cannot handle ints/uints formats, which lack the pack_rgba_float/unpack_rgba_float functions. Instead of trying to call these (and crash) return an error (I'm not sure yet if we should try to translate such formats too here might not make much sense). v2: suggested by Jose, use separate checks for pack/unpack of rgba_8unorm and rgba_float functions (right now if one exists the other should as well). Reviewed-by: Jose Fonseca <[email protected]>
* gallium/upload_mgr: remove useless variable "size"Marek Olšák2014-02-251-6/+4
| | | | Reviewed-by: Fredrik Höglund <[email protected]>
* gallium/upload_mgr: don't unmap buffers if persistent mappings are supportedMarek Olšák2014-02-251-14/+51
| | | | Reviewed-by: Fredrik Höglund <[email protected]>
* gallium: add texture gather support to gallium (v3)Dave Airlie2014-02-251-0/+1
| | | | | | | | | | | | | | | | | | | | | This adds support to gallium for a TG4 instruction, and two CAPs. The first CAP is required for GL_ARB_texture_gather. The second CAP is required to expose GL_ARB_gpu_shader5. However so far we haven't found any hardware that natively exposes the textureGatherOffsets feature from GL, so just lower it for now. If hardware appears for this we can add another CAP to allow TG4 to take 4 offsets. v2: add component selection src and a cap to say hw can do it. (st can use to help control GL_ARB_gpu_shader5/GLSL 4.00). Add docs. v3: rename to SM5, add docs. Reviewed-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* util: Add util_cpu_to_le* helpersTom Stellard2014-02-241-0/+3
| | | | | Reviewed-by: Michel Dänzer <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* util: Add util_bswap64() v3Tom Stellard2014-02-241-0/+16
| | | | | | | | | | | | v2: - Use __builtin_bswap64() - Remove unnecessary mask - Add util_le64_to_cpu() helper v3: - Remove unnecessary AC_SUBST Reviewed-by: Michel Dänzer <[email protected]>
* configure.ac: Use AX_GCC_BUILTIN to check availability of __builtin_bswap32 v2Tom Stellard2014-02-241-1/+2
| | | | | | | v2: - Remove unnecessary AC_SUBST Reviewed-by: Matt Turner <[email protected]>
* pipe-loader: wrap pipe_loader_sw_probe_xlib within HAVE_PIPE_LOADER_XLIBEmil Velikov2014-02-243-7/+6
| | | | | | | | | | | | | | | | | | | The above function implies using the the xlib winsys, which has additional library dependencies that should not be forced. Make the software xlib pipe loader optional thus avoid all the dependency hell. A user that wishes to use the particular pipe-loader would need to set the following within configure.ac. enable_gallium_xlib_loader=yes v2: - Wrap sw/xlib/xlib_sw_winsys.h to handle compilation on systems lacking X11 headers. Spotted by Christian Prochaska. Tested-by: Tom Stellard <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75356 Signed-off-by: Emil Velikov <[email protected]>
* pipe-loader: introduce pipe_loader_sw_probe_null helper functionEmil Velikov2014-02-222-0/+31
| | | | | | | v2: Handle null_sw_create failure, add missing function return type Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jakob Bornecrantz <[email protected]> (v1)
* pipe-loader: introduce pipe_loader_sw_probe_dri helperEmil Velikov2014-02-222-0/+36
| | | | | | | | | | | Will be used in the following commits. v2: Link gallium tests against the library. v3: Handle dri_create_sw_winsys failure v4: Rebase on top of the targets/xa changes Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jakob Bornecrantz <[email protected]> (v2)
* pipe-loader: introduce pipe_loader_sw_probe_xlib helperEmil Velikov2014-02-223-3/+45
| | | | | | | | | Will be used in the upcoming patches. v2: handle xlib_create_sw_winsys failure, drop unneeded header Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jakob Bornecrantz <[email protected]> (v1)
* pipe-loader: use bool type for pipe_loader_drm_probe_fd()Emil Velikov2014-02-222-5/+5
| | | | | | | | v2: Rebase on top of the rendernode changes. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jakob Bornecrantz <[email protected]> (v1) Reviewed-by: Francisco Jerez <[email protected]> (v1)
* winsys/xlib: move xlib_create_sw_winsys within the winsysEmil Velikov2014-02-221-1/+1
| | | | | | v2: Rebase on top of vl_winsys_xsp.c removal Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jakob Bornecrantz <[email protected]> (v1)
* pipe-loader: handle memory allocation failureEmil Velikov2014-02-222-0/+4
| | | | | Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jakob Bornecrantz <[email protected]>
* pipe-loader: build pipe_loader_drm_x_auth whenever HAVE_PIPE_LOADER_XCB is ↵Emil Velikov2014-02-221-1/+1
| | | | | | | | | | defined Currently HAVE_PIPE_LOADER_XCB is defined, rather than being set to 1/0. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jakob Bornecrantz <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* pipe-loader: destroy sw_winsys on sw_releaseEmil Velikov2014-02-221-0/+3
| | | | | | | | | | | | The sw pipe-loader implicitly handles winsys_create, thus we it would make sense to implicitly destroy it upon releasing the loader. Currently we leak the sw_winsys when releasing the pipe-loader. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jakob Bornecrantz <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* vl/winsys_dri: cleanup vl_screen_create error pathEmil Velikov2014-02-221-13/+19
| | | | | | Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jakob Bornecrantz <[email protected]> Reviewed-by: Christian König <[email protected]>
* tgsi_ureg: add property_gs_invocationsJordan Justen2014-02-202-0/+11
| | | | | | | | Fixes a build break in state_tracker/st_program.c Signed-off-by: Jordan Justen <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75278 Reviewed-by: Dave Airlie <[email protected]>
* gallivm: add smallfloat to float conversion not relying on cpu denorm handlingRoland Scheidegger2014-02-201-20/+65
| | | | | | | | | | | | | | | | | | | | | | The previous code relied on cpu denorm support for converting small float formats (such r11g11b10_float and r16_float) to floats, otherwise denorms are flushed to zero. We worked around that in llvmpipe blend code by reenabling denorms, but this did nothing for texture sampling. Now it would be possible to reenable it there too but I'm not really a fan of messing with fpu flags (and it seems we can't actually do it reliably with llvm in any case looking at some bug reports). (Not to mention if you actually have a lot of denorms in there, you can expect some order-of-magnitude slowdown with x86 cpus.) So instead use code which adjusts exponents etc. directly hence not relying on cpu denorm support for the rescaling mul. (We still need the fpu flag handling as we can't do float-to-smallfloat without using cpu denorms at least for now - I actually wanted to keep both the old and new code and using one or the other depending on from where it's called but that didn't work out as the parameter would have to be passed through too many layers than I'd like.) Reviewed-by: Zack Rusin <[email protected]> Reviewed-by: Si Chen <[email protected]>
* pipe-loader: split out "client" versionRob Clark2014-02-161-4/+19
| | | | | | | | Build two versions of pipe-loader, with only the client version linking in x11 client side dependencies. This will allow the XA state tracker to use pipe-loader. Signed-off-by: Rob Clark <[email protected]>
* gallium/pipebuffer: change pb_cache_manager_create() size_factor to floatBrian Paul2014-02-142-6/+6
| | | | | | | Requested by Marek. Reviewed-by: Marek Olšák <[email protected]> Cc: "10.1" <[email protected]>
* gallium/util: Add flush/map debug utility codeThomas Hellstrom2014-02-143-0/+530
| | | | | | Signed-off-by: Thomas Hellstrom <[email protected]> Reviewed-by: Brian Paul <[email protected]> Cc: "10.1" <[email protected]>
* gallium/pipebuffer: Add a cache buffer manager bypass maskThomas Hellstrom2014-02-142-4/+22
| | | | | | | | | | | | | In some situations, it may be desirable to bypass the cache at buffer creation but to insert the buffer in the cache at buffer destruction. One such situation is where we already have a kernel representation of a buffer that we want to use, but we also want to insert it in the cache when it's freed up. Signed-off-by: Thomas Hellstrom <[email protected]> Reviewed-by: José Fonseca <[email protected]> Reviewed-by: Brian Paul <[email protected]> Cc: "10.1" <[email protected]>
* pipebuffer, winsys: Add a size match parameter to the cached buffer managerThomas Hellstrom2014-02-142-3/+7
| | | | | | | | In some situations it's important to restrict the sizes of buffers that the cached buffer manager is allowed to return Signed-off-by: Thomas Hellstrom <[email protected]> Cc: "10.1" <[email protected]>
* vl: add motion adaptive deinterlacerGrigori Goronzy2014-02-143-1/+569
| | | | Reviewed-by: Christian König <[email protected]>
* pipe-loader: Add support for render nodes v2Tom Stellard2014-02-131-3/+77
| | | | | | | v2: - Add missing call to pipe_loader_drm_release() - Fix render node macros - Drop render-node configure option