aboutsummaryrefslogtreecommitdiffstats
path: root/src/gallium/auxiliary
Commit message (Collapse)AuthorAgeFilesLines
* llvmpipe: fix arguments order given to vec_andcOded Gabbay2016-01-171-0/+6
| | | | | | | | | | | | | | | | This patch fixes a classic "confuse the enemy" bug. _mm_andnot_si128 (SSE) and vec_andc (VMX) do the same operation, but the arguments are opposite. _mm_andnot_si128 performs "r = (~a) & b" while vec_andc performs "r = a & (~b)" To make sure this error won't return in another place, I added a wrapper function, vec_andnot_si128, in u_pwr8.h, which makes the swap inside. Signed-off-by: Oded Gabbay <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* ttn: use writemask for store_varRob Clark2016-01-161-26/+2
| | | | | | | Only user is freedreno, and after array-rework it can cope. Avoids generating loads for a store. Signed-off-by: Rob Clark <[email protected]>
* ttn: add missing writemask on store_outputRob Clark2016-01-161-0/+1
| | | | | Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* gallivm: avoid crashing in mod by 0 with llvmpipeJeff Muizelaar2016-01-161-2/+16
| | | | | | | This adds code that is basically the same as the code in umod, udiv and idiv. However, unlike idiv we return -1. Reviewed-by: Roland Scheidegger <[email protected]>
* draw: fix key comparison with uninitialized valueRoland Scheidegger2016-01-132-6/+6
| | | | | | | | Discovered by accident, valgrind was complaining (could have possibly caused us to create redundant geometry shader variants). v2: convinced by Brian and Jose, just use memset for both gs and vs keys, just as easy and less error prone.
* vl: use preferred format for deinterlacingChristian König2016-01-121-1/+7
| | | | | Signed-off-by: Christian König <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* vl: improve motion adaptive deinterlacerChristian König2016-01-122-22/+49
| | | | | | | Handle other formats than YV12 as well. Signed-off-by: Christian König <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* vl/buffers: extract vl_video_buffer_adjust_size helperChristian König2016-01-122-8/+20
| | | | | | | Useful for the state trackers as well. Signed-off-by: Christian König <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* gallium/util: removed unused header-fileErik Faye-Lund2016-01-122-53/+0
| | | | | | | | | This hasn't been in use since c476305 ("gallium/util: pregenerate half float tables"), where the last bit of run-time init using this was killed. So let's just get rid of the pointless header. Signed-off-by: Erik Faye-Lund <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* gallium: add a RESQ opcode to query info about a resourceIlia Mirkin2016-01-081-1/+1
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallium: add PIPE_SHADER_CAP_MAX_SHADER_BUFFERSIlia Mirkin2016-01-082-0/+2
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* tgsi: add a is_store propertyIlia Mirkin2016-01-082-223/+224
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* tgsi: provide a way to encode memory qualifiers for SSBOIlia Mirkin2016-01-089-1/+165
| | | | | | | | | | Each load/store on most hardware can specify what caching to do. Since SSBO allows individual variables to also have separate caching modes, allow loads/stores to have the qualifiers instead of attempting to encode them in declarations. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* ureg: add buffer support to uregIlia Mirkin2016-01-085-0/+66
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* tgsi: add ureg support for image declsIlia Mirkin2016-01-088-42/+134
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* vl: allow fragment shader POSITION to be a system valueMarek Olšák2016-01-081-4/+8
| | | | | Reviewed-by: Edward O'Callaghan <[email protected] Reviewed-by: Brian Paul <[email protected]>
* util/pstipple: allow fragment shader POSITION to be a system valueMarek Olšák2016-01-082-7/+26
| | | | | Reviewed-by: Edward O'Callaghan <[email protected] Reviewed-by: Brian Paul <[email protected]>
* tgsi/scan: update for POSITION and FACE sytem valuesMarek Olšák2016-01-081-1/+4
| | | | | Reviewed-by: Edward O'Callaghan <[email protected] Reviewed-by: Brian Paul <[email protected]>
* tgsi/ureg: handle redundant declarations in ureg_DECL_system_valueMarek Olšák2016-01-081-1/+9
| | | | | Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* tgsi/ureg: remove index parameter from ureg_DECL_system_valueMarek Olšák2016-01-082-7/+6
| | | | | | | | It can be trivially derived from the number of already declared system values. This allows ureg users not to worry about which index to choose. Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* gallium/aux: Use TGSI chan name defines inplace of literalsEdward O'Callaghan2016-01-081-6/+7
| | | | | Signed-off-by: Edward O'Callaghan <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* llvmpipe: do 64bit plane calculations in the sse pathRoland Scheidegger2016-01-081-12/+80
| | | | | | | | | | | | The sse path was pretty much disabled for practical purposes because the largest allowed fb size was 128x128. So, adapt it for 64bit plane calculations. This is actually not that difficult, though a problem is that we can't do a signed 32x32->64bit mul, only unsigned, so need to fix that up. Overall, the code still looks reasonable, though it's not like changes there in setup really make much of a difference in the end... Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* draw: initialize prim header flags when clipping linesRoland Scheidegger2016-01-081-0/+2
| | | | | | | | | Otherwise, clipped lines would have undefined stippling reset bit if line stippling is enabled. (Untested, and I just assume copying over the bits from the original line is actually the right thing to do.) Reviewed-by: Jose Fonseca <[email protected]>
* draw: fix line stippling with unfilled primsRoland Scheidegger2016-01-081-18/+38
| | | | | | | | | | | | | The unfilled stage was not filling in the prim header, and the line stage then decided to reset the stipple counter or not based on the uninitialized data. This causes some failures in conform linestipple test (albeit quite randomly happening depending on environment). So fill in the prim header in the unfilled stage - I am not entirely sure if anybody really needs determinant after that stage, but there's at least later stages (wide line for instance) which copy over the determinant as well. Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* llvmpipe: use sse2 conv code for altivecOded Gabbay2016-01-071-2/+2
| | | | | | | | | | | | | | | In lp_build_conv() and lp_build_conv_auto(), there is a special case of conversion when sse2 is present. That code path is suitable without any changes to altivec, because all the functions that are called in that code path already support altivec. This patch increase the FPS in POWER arch across the board between 10%-25% I checked ipers, glxgears, glxspheres64, openarena, xonotic and glmark2. Signed-off-by: Oded Gabbay <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* tgsi/scan: set which color components are read by a fragment shaderMarek Olšák2016-01-072-8/+23
| | | | | | This will be used by radeonsi. Reviewed-by: Nicolai Hähnle <[email protected]>
* tgsi/scan: fix tgsi_shader_info::reads_zMarek Olšák2016-01-071-2/+3
| | | | | | This has no users in Mesa. Reviewed-by: Nicolai Hähnle <[email protected]>
* tgsi/scan: set if a fragment shader writes sample maskMarek Olšák2016-01-072-0/+3
| | | | | | This will be used by radeonsi. Reviewed-by: Nicolai Hähnle <[email protected]>
* draw: nuke the interp parameter from vertex_infoRoland Scheidegger2016-01-071-16/+1
| | | | | | | | | | | | | draw emit couldn't care less what the interpolation mode is... This somehow looked like it would matter, all drivers more or less dutifully filled that in correctly. But this is only used for emit, if draw needs to know about interpolation mode (for clipping for instance) it will get that information from the vs anyway. softpipe actually used to depend on that interpolation parameter, as it abused that structure quite a bit but no longer. Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* draw: rework handling of non-existing outputs in emit codeRoland Scheidegger2016-01-073-23/+46
| | | | | | | | | | | | | Previously the code would just redirect requests for attributes which don't exist to use output 0. Rework this to output all zeros instead which seems more useful - in particular some extensions like ARB_fragment_layer_viewport require 0 in the fs even if it wasn't output by previous stages. That way, drivers don't have to special case this depending if the vs/gs outputs some attribute or not. Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* gallium: Use unsigned for loop indexEdward O'Callaghan2016-01-061-3/+3
| | | | | | Found-by: Coccinelle Signed-off-by: Edward O'Callaghan <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* gallium: Remove unnecessary semicolonsEdward O'Callaghan2016-01-065-5/+6
| | | | | | | | | Fix silly issue with MSVC case fall-though support to need a extra 'break;' Found-by: Coccinelle Signed-off-by: Edward O'Callaghan <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* llvmpipe: add POWER8 portability file - u_pwr8.hOded Gabbay2016-01-061-0/+310
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This file provides a portability layer that will make it easier to convert SSE-based functions to VMX/VSX-based functions. All the functions implemented in this file are prefixed using "vec_". Therefore, when converting from SSE-based function, one needs to simply replace the "_mm_" prefix of the SSE function being called to "vec_". Having said that, not all functions could be converted as such, due to the differences between the architectures. So, when doing such conversion hurt the performance, I preferred to implement a more ad-hoc solution. For example, converting the _mm_shuffle_epi32 needed to be done using ad-hoc masks instead of a generic function. All the functions in this file support both little-endian and big-endian but currently the file is build only on POWER8 LE machine. All of the functions are implemented using the Altivec/VMX intrinsics, except one where I needed to use inline assembly (due to missing intrinsic). v2: - Use vec_vgbbd instead of __builtin_vec_vgbbd - Add an aligned load function - Don't use typeof() - Make file build only on POWER8 LE machine Signed-off-by: Oded Gabbay <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* draw: minor indentation fixBrian Paul2016-01-051-1/+1
|
* util: add debug_dump_ubyte_rgba_bmp()Brian Paul2016-01-052-0/+63
| | | | | | Like debug_dump_float_rgba_bmp() but takes ubyte values. Reviewed-by: Charmaine Lee <[email protected]>
* tgsi: update PK2H/UP2H channel behavior infoIlia Mirkin2016-01-031-8/+8
| | | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* u_upload_mgr: allow specifying PIPE_USAGE_* for the upload bufferMarek Olšák2016-01-027-12/+19
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* u_upload_mgr: remove alignment parameter from u_upload_createMarek Olšák2016-01-027-10/+5
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* u_upload_mgr: pass alignment to u_upload_buffer manuallyMarek Olšák2016-01-022-1/+3
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* u_upload_mgr: pass alignment to u_upload_data manuallyMarek Olšák2016-01-024-5/+8
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* u_upload_mgr: pass alignment to u_upload_alloc manuallyMarek Olšák2016-01-026-6/+9
| | | | | | | | | | The fixed alignment of u_upload_mgr will go away. This is the first step. The motivation is that one u_upload_mgr can have multiple users, each allocating from the same buffer, but requiring a different alignment. Reviewed-by: Nicolai Hähnle <[email protected]>
* u_upload_mgr: rework the application of alignmentMarek Olšák2016-01-021-10/+14
| | | | | | | | | | | | | | | The function only aligned the size, but not the offset. The offset was aligned only when the previous suballocation was aligned. That yielded the correct offset alignment if the alignment was constant for all suballocations. Instead, directly align the offset, but allow an unaligned size. There is no change in behavior, because the alignment is constant at the moment. This a prerequisite for allowing a variable alignment for suballocations. Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium: add baseinstance/drawid semanticsIlia Mirkin2015-12-301-0/+2
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* nir/builder: Add an init function that creates a simple shader for youJason Ekstrand2015-12-291-8/+4
| | | | | | | | | | | A hugely common case when using nir_builder is to have a shader with a single function called main. This adds a helper that gives you just that. This commit also makes us use it in the NIR control-flow unit tests as well as tgsi_to_nir and prog_to_nir. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Connor Abbott <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
* gallium/util: add DEBUG_GET_ONCE_OPTIONNicolai Hähnle2015-12-291-0/+13
| | | | | | This is analogous to the alreading existing macros for BOOL, NUM, and FLAGS. Reviewed-by: Marek Olšák <[email protected]>
* nir: Get rid of function overloadsJason Ekstrand2015-12-281-2/+1
| | | | | | | | | | | | | | | | | When Connor originally drafted NIR, he copied the same function+overload system that GLSL IR had with a few names changed. However, this double-indirection is not really needed and has only served to confuse people. Instead, let's just have functions which may not have unique names and may or may not have an implementation. If someone wants to do overload resolving, they can hav a hash table based function+overload system in the overload resolving pass. There's no good reason to keep it in core NIR. Reviewed-by: Connor Abbott <[email protected]> Acked-by: Kenneth Graunke <[email protected]> ir3 bits are Reviewed-by: Rob Clark <[email protected]>
* gallium/auxiliary: don't build NIR sources with MSVC2008 flagsConnor Abbott2015-12-232-7/+15
| | | | | | | | | | | | | NIR has never been built with MSVC2008, so we shouldn't add MSVC2008_COMPAT_CFLAGS to anything that uses it. This allows us to get rid of the pragma in tgsi_to_nir.c. Build tested with freedreno. v2: Use MSVC2013_COMPAT_CLFAGS instead. Reviewed-by: Jose Fonseca <[email protected]> Signed-off-by: Connor Abbott <[email protected]>
* nir: Add a writemask to store intrinsics.Kenneth Graunke2015-12-221-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Tessellation control shaders need to be careful when writing outputs. Because multiple threads can concurrently write the same output variables, we need to only write the exact components we were told. Traditionally, for sub-vector writes, we've read the whole vector, updated the temporary, and written the whole vector back. This breaks down with concurrent access. This patch prepares the way for a solution by adding a writemask field to store_var intrinsics, as well as the other store intrinsics. It then updates all produces to emit a writemask of "all channels enabled". It updates nir_lower_io to copy the writemask to output store intrinsics. Finally, it updates nir_lower_vars_to_ssa to handle partial writemasks by doing a read-modify-write cycle (which is safe, because local variables are specific to a single thread). This should have no functional change, since no one actually emits partial writemasks yet. v2: Make nir_validate momentarily assert that writemasks cover the complete value - we shouldn't have partial writemasks yet (requested by Jason Ekstrand). v3: Fix accidental SSBO change that arose from merge conflicts. v4: Don't try to handle writemasks in ir3_compiler_nir - my code for indirects was likely wrong, and TTN doesn't generate partial writemasks today anyway. Change them to asserts as requested by Rob Clark. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> [v3]
* nir: Delete bany, ball, fany, fall.Matt Turner2015-12-181-1/+3
| | | | | | | | | | | As in the previous patches, these can be implemented as any(v) -> any_nequal(v, false) all(v) -> all_equal(v, true) and their removal simplifies the code in the next patch. Reviewed-by: Ian Romanick <[email protected]>
* gallium/util: (trivial) include p_shader_tokens.h in u_simple_shaders.hRoland Scheidegger2015-12-181-0/+1
| | | | as it uses definition from it (enum tgsi_return_type).