path: root/src/gallium/auxiliary
Commit message | Author | Age | Files | Lines
* tgsi: set correct src type for UP2H | Marek Olšák | 2016-02-02 | 1 | -0/+1
    Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: add PK2H/UP2H support | Roland Scheidegger | 2016-02-02 | 2 | -7/+9
    Add support for these opcodes. The conversion functions were already
    there, albeit they need some new packing stuff. Just like the tgsi
    version, piglit won't like it for all the same reasons, so it's disabled.
    (UP2H passes the piglit arb_shader_language_packing tests. Since PK2H
    won't, due to those rounding differences, I don't know if that one works
    or not, as the piglit test is rather difficult to deal with.)

    Reviewed-by: Brian Paul <[email protected]>
* gallivm: add PK2H/UP2H support | Roland Scheidegger | 2016-02-02 | 5 | -2/+119
    Add support for these opcodes. The conversion functions were already
    there, albeit they need some new packing stuff. Just like the tgsi
    version, piglit won't like it for all the same reasons, so it's disabled.
    (UP2H passes the piglit arb_shader_language_packing tests. Since PK2H
    won't, due to those rounding differences, I don't know if that one works
    or not, as the piglit test is rather difficult to deal with.)
* tgsi: add PK2H/UP2H support | Roland Scheidegger | 2016-02-02 | 2 | -3/+48
    The util functions handle the half-float conversion. Note that piglit
    won't like it much due to:

    a) The util functions use magic float mul conversion, but when run inside
       softpipe/llvmpipe, denorms are flushed to zero. Therefore, when the
       conversion is from/to an f16 denorm, the result will be zero. This is
       a bug which should be fixed in these functions (they should not rely
       on denorms being available), but it will happen elsewhere just the
       same (e.g. conversion to f16 render targets).
    b) The util functions use trunc round mode rather than round-to-nearest.
       This is NOT a bug (as it is a d3d10 requirement). It results in
       non-representable finite values being rounded to MAX_F16 rather than
       INFINITY. My belief is the piglit tests are wrong here, but it's
       difficult to tell (generally the glsl rounding mode is undefined,
       however I'm not sure if the rounding mode might need to be consistent
       for different operations). Nevertheless, for gl it would be better to
       use round-to-nearest, but using different rounding for GL and d3d10 is
       an unsolved problem (as it affects things like conversion to f16
       render targets, clear colors, this shader opcode).

    Hence for now don't enable the cap bit (so the code is unused).

    (Code is from imirkin, comment from sroland)

    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
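The rounding difference described in point b) can be sketched in scalar C. This is a simplified illustration, not Mesa's util conversion code: the function name `sketch_f32_to_f16` and its `round_to_nearest` flag are hypothetical, half denormals are flushed to zero, and NaN/Inf inputs are deliberately not handled.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not a Mesa function): convert a finite float to
 * IEEE binary16 bits, either truncating or rounding to nearest-even. */
static uint16_t sketch_f32_to_f16(float f, int round_to_nearest)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);

    uint16_t sign = (uint16_t)((bits >> 16) & 0x8000u);
    int32_t exp = (int32_t)((bits >> 23) & 0xffu) - 127 + 15; /* rebias */
    uint32_t mant = bits & 0x7fffffu;

    if (exp <= 0)
        return sign; /* result would be a half denormal: flush to zero */

    if (exp >= 31)   /* magnitude is 65536 or larger: overflow */
        return (uint16_t)(sign | (round_to_nearest ? 0x7c00u : 0x7bffu));

    uint32_t half = ((uint32_t)exp << 10) | (mant >> 13);
    if (round_to_nearest) {
        uint32_t rem = mant & 0x1fffu; /* the 13 bits that get dropped */
        if (rem > 0x1000u || (rem == 0x1000u && (half & 1u)))
            half++; /* a carry may reach 0x7c00, i.e. infinity */
    }
    return (uint16_t)(sign | half);
}
```

With this sketch, 65520.0f lies exactly halfway between MAX_F16 (65504, bits 0x7bff) and the next power of two, so truncation yields 0x7bff while round-to-nearest-even carries into 0x7c00 (infinity).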
* mesa: fix typo in python scripts | Roland Scheidegger | 2016-02-02 | 1 | -1/+1
    Reviewed-by: Matt Turner <[email protected]>
* virgl: reuse screen when fd is already open | Rob Herring | 2016-02-02 | 1 | -6/+1
    It is necessary to share the screen between mesa and gralloc to properly
    ref count resources. This implements a hash lookup on the file
    description to re-use an already created screen. The implementation is
    similar to the ones in freedreno and radeon.

    Signed-off-by: Rob Herring <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* gallium: Add DragonFly support | François Tigeot | 2016-01-31 | 1 | -1/+1
    Cc: [email protected]
    Signed-off-by: Emil Velikov <[email protected]>
* tgsi: add MEMBAR opcode to handle memoryBarrier* GLSL intrinsics | Ilia Mirkin | 2016-01-29 | 1 | -1/+1
    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]> (v1)

    v1 -> v2: add defines for the various bits

    Reviewed-by: Roland Scheidegger <[email protected]>
* glsl: move to compiler/ | Emil Velikov | 2016-01-26 | 1 | -1/+1
    Signed-off-by: Emil Velikov <[email protected]>
    Acked-by: Matt Turner <[email protected]>
    Acked-by: Jose Fonseca <[email protected]>
* nir: move to compiler/ | Emil Velikov | 2016-01-26 | 3 | -6/+6
    Signed-off-by: Emil Velikov <[email protected]>
    Acked-by: Matt Turner <[email protected]>
    Acked-by: Jose Fonseca <[email protected]>
* nir: move shader_enums.[ch] to compiler | Emil Velikov | 2016-01-26 | 1 | -1/+1
    This way one can reuse it in glsl, nir or other infrastructure without
    pulling in nir as a dependency.

    Signed-off-by: Emil Velikov <[email protected]>
    Acked-by: Matt Turner <[email protected]>
    Acked-by: Jose Fonseca <[email protected]>
* llvmpipe,i915: add back NEW_RASTERIZER dependency when computing vertex info | Roland Scheidegger | 2016-01-21 | 1 | -0/+6
    I removed this mistakenly in 2dbc20e45689e09766552517a74e2270e49817b5.
    I actually thought it should not be necessary and a piglit run didn't
    show any differences, but this shouldn't have been in there.
    draw_prepare_shader_outputs() is in fact dependent on NEW_RASTERIZER.

    The new polygon-mode-facing test indeed shows why this is necessary:
    there are lots of invalid reads and writes with valgrind (and crashes
    without valgrind), because the pre-pipeline vertex size doesn't match the
    post-pipeline vertex size. (Note this won't help much with stages which
    don't have the prepare hook and which can grow the vertex size, in
    particular the wide point stage, but this isn't used by llvmpipe.)

    The test still won't pass, of course, but it is only usage of
    uninitialized values now, which is much less dangerous...

    (Albeit I'm pretty sure for i915 it really is not needed anymore, as it
    doesn't care about the extra outputs and doesn't call
    draw_prepare_shader_outputs().)

    Reviewed-by: Jose Fonseca <[email protected]>
* util/u_pstipple.c: copy immediates during transformation | Nicolai Hähnle | 2016-01-19 | 1 | -0/+1
    Apparently, nobody has combined stippling with a fragment shader
    containing immediates in almost five years...

    Fixes a bug in Kodi with radeonsi reported by Christian König.

    Cc: "11.0 11.1" <[email protected]>
    Tested-by: Christian König <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* gallium: bundle the compat header u_pwr8.h in the tarball | Emil Velikov | 2016-01-18 | 1 | -0/+1
    Signed-off-by: Emil Velikov <[email protected]>
* llvmpipe: use vpkswss when dst is signed | Oded Gabbay | 2016-01-18 | 1 | -16/+15
    This patch fixes a bug when building a pack instruction.

    For POWER (altivec), in case the destination is signed and the src width
    is 32, we need to use vpkswss. The original code used vpkuwus, which
    emits an unsigned result.

    This fixes the following piglit tests on ppc64le:
    - spec@arb_color_buffer_float@gl_rgba8-drawpixels
    - shaders@glsl-fs-fogscale

    I've also corrected some coding style issues in the function.

    v2: Returned else statements to vmware style

    Signed-off-by: Oded Gabbay <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
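Why the choice between a signed and an unsigned saturating pack matters can be sketched per element in scalar C. The two helpers below are hypothetical models of one lane of a signed-saturating and an unsigned-saturating 32-to-16-bit pack; they are not the AltiVec instructions themselves or any gallivm code.

```c
#include <stdint.h>

/* One lane of a signed saturating pack: int32 clamped into int16 range. */
static int16_t sketch_pack_signed_sat(int32_t v)
{
    if (v > INT16_MAX)
        return INT16_MAX;
    if (v < INT16_MIN)
        return INT16_MIN;
    return (int16_t)v;
}

/* One lane of an unsigned saturating pack: uint32 clamped into uint16. */
static uint16_t sketch_pack_unsigned_sat(uint32_t v)
{
    return v > UINT16_MAX ? UINT16_MAX : (uint16_t)v;
}
```

For an input of -40000, the signed variant clamps to -32768, whereas reinterpreting the same bits as unsigned saturates to 65535, so using the unsigned form for a signed destination gives a different bit pattern.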
* tgsi: initialize Atomic field in tgsi_default_declaration | Ilia Mirkin | 2016-01-17 | 1 | -0/+1
    Spotted by Coverity.

    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Edward O'Callaghan <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* llvmpipe: fix arguments order given to vec_andc | Oded Gabbay | 2016-01-17 | 1 | -0/+6
    This patch fixes a classic "confuse the enemy" bug.

    _mm_andnot_si128 (SSE) and vec_andc (VMX) do the same operation, but the
    arguments are in the opposite order: _mm_andnot_si128 performs
    "r = (~a) & b" while vec_andc performs "r = a & (~b)".

    To make sure this error won't return in another place, I added a wrapper
    function, vec_andnot_si128, in u_pwr8.h, which does the swap inside.

    Signed-off-by: Oded Gabbay <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
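The argument-order mismatch can be modelled on plain 32-bit integers. The functions below are hypothetical scalar stand-ins for the two intrinsics and for a swapping wrapper; they only mirror the formulas quoted in the commit message and are not the code in u_pwr8.h.

```c
#include <stdint.h>

/* Scalar model of SSE _mm_andnot_si128(a, b): r = (~a) & b */
static uint32_t sketch_sse_andnot(uint32_t a, uint32_t b)
{
    return ~a & b;
}

/* Scalar model of VMX vec_andc(a, b): r = a & (~b) */
static uint32_t sketch_vmx_andc(uint32_t a, uint32_t b)
{
    return a & ~b;
}

/* Wrapper that keeps the SSE argument order by swapping the operands. */
static uint32_t sketch_andnot_wrapper(uint32_t a, uint32_t b)
{
    return sketch_vmx_andc(b, a);
}
```

Calling `sketch_vmx_andc` directly with SSE-ordered arguments gives a different result than `sketch_sse_andnot`, while the wrapper agrees with the SSE model for every input.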
* ttn: use writemask for store_var | Rob Clark | 2016-01-16 | 1 | -26/+2
    The only user is freedreno, and after the array rework it can cope.
    Avoids generating loads for a store.

    Signed-off-by: Rob Clark <[email protected]>
* ttn: add missing writemask on store_output | Rob Clark | 2016-01-16 | 1 | -0/+1
    Signed-off-by: Rob Clark <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* gallivm: avoid crashing in mod by 0 with llvmpipe | Jeff Muizelaar | 2016-01-16 | 1 | -2/+16
    This adds code that is basically the same as the code in umod, udiv and
    idiv. However, unlike idiv we return -1.

    Reviewed-by: Roland Scheidegger <[email protected]>
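The behaviour described above, a modulo that yields -1 for a zero divisor instead of trapping, can be sketched as a scalar C function. `sketch_imod` is a hypothetical name, and the real change operates on LLVM IR vectors rather than on C integers; the extra `b == -1` guard is an addition needed only because `INT32_MIN % -1` overflows in C.

```c
#include <stdint.h>

/* Hypothetical scalar model: signed modulo that never traps. */
static int32_t sketch_imod(int32_t a, int32_t b)
{
    if (b == 0)
        return -1; /* result chosen for mod by zero, per the message above */
    if (b == -1)
        return 0;  /* avoids the INT32_MIN % -1 overflow in C */
    return a % b;
}
```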
* draw: fix key comparison with uninitialized value | Roland Scheidegger | 2016-01-13 | 2 | -6/+6
    Discovered by accident: valgrind was complaining (this could possibly
    have caused us to create redundant geometry shader variants).

    v2: convinced by Brian and Jose, just use memset for both gs and vs keys,
    which is just as easy and less error prone.
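The memset approach mentioned in v2 can be sketched with a small key struct. The struct and function names here are hypothetical and much simpler than draw's real variant keys. The idea is that zeroing the whole key first gives padding bytes a known value before a bytewise comparison; strictly speaking the C standard leaves padding unspecified after member stores, so this relies on common compiler behaviour.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical variant key; the compiler will likely insert padding
 * between the 1-byte and the 4-byte member. */
struct sketch_variant_key {
    uint8_t  nr_outputs;
    uint32_t flags;
};

/* Zero the whole key before filling it in, so that unused and padding
 * bytes do not hold stale stack contents. */
static void sketch_make_key(struct sketch_variant_key *key,
                            uint8_t nr_outputs, uint32_t flags)
{
    memset(key, 0, sizeof *key);
    key->nr_outputs = nr_outputs;
    key->flags = flags;
}

/* Bytewise comparison, as used for looking up cached variants. */
static int sketch_keys_equal(const struct sketch_variant_key *a,
                             const struct sketch_variant_key *b)
{
    return memcmp(a, b, sizeof *a) == 0;
}
```

Without the memset, two keys with identical fields could compare unequal, which is how redundant variants would get created.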
* vl: use preferred format for deinterlacing | Christian König | 2016-01-12 | 1 | -1/+7
    Signed-off-by: Christian König <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
* vl: improve motion adaptive deinterlacer | Christian König | 2016-01-12 | 2 | -22/+49
    Handle formats other than YV12 as well.

    Signed-off-by: Christian König <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
* vl/buffers: extract vl_video_buffer_adjust_size helper | Christian König | 2016-01-12 | 2 | -8/+20
    Useful for the state trackers as well.

    Signed-off-by: Christian König <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
* gallium/util: removed unused header-file | Erik Faye-Lund | 2016-01-12 | 2 | -53/+0
    This hasn't been in use since c476305 ("gallium/util: pregenerate half
    float tables"), where the last bit of run-time init using this was
    killed. So let's just get rid of the pointless header.

    Signed-off-by: Erik Faye-Lund <[email protected]>
    Reviewed-by: Timothy Arceri <[email protected]>
* gallium: add a RESQ opcode to query info about a resource | Ilia Mirkin | 2016-01-08 | 1 | -1/+1
    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* gallium: add PIPE_SHADER_CAP_MAX_SHADER_BUFFERS | Ilia Mirkin | 2016-01-08 | 2 | -0/+2
    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* tgsi: add a is_store property | Ilia Mirkin | 2016-01-08 | 2 | -223/+224
    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* tgsi: provide a way to encode memory qualifiers for SSBO | Ilia Mirkin | 2016-01-08 | 9 | -1/+165
    Each load/store on most hardware can specify what caching to do. Since
    SSBO allows individual variables to also have separate caching modes,
    allow loads/stores to have the qualifiers instead of attempting to encode
    them in declarations.

    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* ureg: add buffer support to ureg | Ilia Mirkin | 2016-01-08 | 5 | -0/+66
    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* tgsi: add ureg support for image decls | Ilia Mirkin | 2016-01-08 | 8 | -42/+134
    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* vl: allow fragment shader POSITION to be a system value | Marek Olšák | 2016-01-08 | 1 | -4/+8
    Reviewed-by: Edward O'Callaghan <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
* util/pstipple: allow fragment shader POSITION to be a system value | Marek Olšák | 2016-01-08 | 2 | -7/+26
    Reviewed-by: Edward O'Callaghan <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
* tgsi/scan: update for POSITION and FACE system values | Marek Olšák | 2016-01-08 | 1 | -1/+4
    Reviewed-by: Edward O'Callaghan <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
* tgsi/ureg: handle redundant declarations in ureg_DECL_system_value | Marek Olšák | 2016-01-08 | 1 | -1/+9
    Reviewed-by: Brian Paul <[email protected]>
    Reviewed-by: Edward O'Callaghan <[email protected]>
* tgsi/ureg: remove index parameter from ureg_DECL_system_value | Marek Olšák | 2016-01-08 | 2 | -7/+6
    It can be trivially derived from the number of already declared system
    values. This allows ureg users not to worry about which index to choose.

    Reviewed-by: Brian Paul <[email protected]>
    Reviewed-by: Edward O'Callaghan <[email protected]>
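The idea of deriving the index from the number of existing declarations, combined with the handling of redundant declarations from the later commit, can be sketched as follows. All names here (`sketch_program`, `sketch_decl_system_value`, `SKETCH_MAX_SYSVALS`) are hypothetical; the real ureg_DECL_system_value works on ureg structures and its exact signature is not shown in this log.

```c
#define SKETCH_MAX_SYSVALS 8

/* Hypothetical, simplified stand-in for a shader program being built. */
struct sketch_program {
    unsigned semantic_name[SKETCH_MAX_SYSVALS];
    unsigned nr_system_values;
};

/* Declare a system value and return its index. The caller passes no index:
 * a new declaration gets the next free slot, and a redundant declaration
 * returns the slot that already exists. Returns -1 when the table is full. */
static int sketch_decl_system_value(struct sketch_program *prog,
                                    unsigned semantic_name)
{
    for (unsigned i = 0; i < prog->nr_system_values; i++) {
        if (prog->semantic_name[i] == semantic_name)
            return (int)i;
    }
    if (prog->nr_system_values == SKETCH_MAX_SYSVALS)
        return -1;

    prog->semantic_name[prog->nr_system_values] = semantic_name;
    return (int)prog->nr_system_values++;
}
```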
* gallium/aux: Use TGSI chan name defines in place of literals | Edward O'Callaghan | 2016-01-08 | 1 | -6/+7
    Signed-off-by: Edward O'Callaghan <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
* llvmpipe: do 64bit plane calculations in the sse path | Roland Scheidegger | 2016-01-08 | 1 | -12/+80
    The sse path was pretty much disabled for practical purposes because the
    largest allowed fb size was 128x128. So, adapt it for 64bit plane
    calculations. This is actually not that difficult, though one problem is
    that we can't do a signed 32x32->64bit mul, only unsigned, so we need to
    fix that up.

    Overall, the code still looks reasonable, though it's not like changes
    there in setup really make much of a difference in the end...

    Reviewed-by: Jose Fonseca <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
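One standard way to obtain a signed 32x32->64-bit product from an unsigned multiply can be sketched in scalar C. This shows the arithmetic identity only, under the assumption that the fix-up is of this general kind; `sketch_mul_s32_via_u32` is a hypothetical name and this is not the SSE code from the commit.

```c
#include <stdint.h>

/* Signed 32x32->64 multiply built from an unsigned one.
 * With ua = a + 2^32 * [a < 0] and ub = b + 2^32 * [b < 0], we get
 *   a * b == ua * ub - 2^32 * ([a < 0] * ub + [b < 0] * ua)  (mod 2^64). */
static int64_t sketch_mul_s32_via_u32(int32_t a, int32_t b)
{
    uint32_t ua = (uint32_t)a;
    uint32_t ub = (uint32_t)b;
    uint64_t prod = (uint64_t)ua * ub; /* unsigned 32x32->64 multiply */
    uint64_t fix = 0;

    if (a < 0)
        fix += ub;
    if (b < 0)
        fix += ua;

    prod -= fix << 32; /* wraps modulo 2^64, which is what we want */
    return (int64_t)prod; /* two's complement reinterpretation */
}
```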
* draw: initialize prim header flags when clipping lines | Roland Scheidegger | 2016-01-08 | 1 | -0/+2
    Otherwise, clipped lines would have undefined stippling reset bit if line
    stippling is enabled. (Untested, and I just assume copying over the bits
    from the original line is actually the right thing to do.)

    Reviewed-by: Jose Fonseca <[email protected]>
* draw: fix line stippling with unfilled prims | Roland Scheidegger | 2016-01-08 | 1 | -18/+38
    The unfilled stage was not filling in the prim header, and the line stage
    then decided to reset the stipple counter or not based on the
    uninitialized data. This causes some failures in the conform linestipple
    test (albeit quite randomly, depending on the environment).

    So fill in the prim header in the unfilled stage - I am not entirely sure
    if anybody really needs the determinant after that stage, but there are
    at least later stages (wide line for instance) which copy over the
    determinant as well.

    Reviewed-by: Jose Fonseca <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
* llvmpipe: use sse2 conv code for altivec | Oded Gabbay | 2016-01-07 | 1 | -2/+2
    In lp_build_conv() and lp_build_conv_auto(), there is a special case of
    conversion when sse2 is present. That code path is suitable for altivec
    without any changes, because all the functions that are called in that
    code path already support altivec.

    This patch increases the FPS on the POWER arch across the board by
    10%-25%. I checked ipers, glxgears, glxspheres64, openarena, xonotic and
    glmark2.

    Signed-off-by: Oded Gabbay <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
* tgsi/scan: set which color components are read by a fragment shader | Marek Olšák | 2016-01-07 | 2 | -8/+23
    This will be used by radeonsi.

    Reviewed-by: Nicolai Hähnle <[email protected]>
* tgsi/scan: fix tgsi_shader_info::reads_z | Marek Olšák | 2016-01-07 | 1 | -2/+3
    This has no users in Mesa.

    Reviewed-by: Nicolai Hähnle <[email protected]>
* tgsi/scan: set if a fragment shader writes sample mask | Marek Olšák | 2016-01-07 | 2 | -0/+3
    This will be used by radeonsi.

    Reviewed-by: Nicolai Hähnle <[email protected]>
* draw: nuke the interp parameter from vertex_info | Roland Scheidegger | 2016-01-07 | 1 | -16/+1
    draw emit couldn't care less what the interpolation mode is... This
    somehow looked like it would matter, and all drivers more or less
    dutifully filled that in correctly. But this is only used for emit; if
    draw needs to know about the interpolation mode (for clipping for
    instance) it will get that information from the vs anyway. softpipe
    actually used to depend on that interpolation parameter, as it abused
    that structure quite a bit, but no longer does.

    Reviewed-by: Brian Paul <[email protected]>
    Reviewed-by: Edward O'Callaghan <[email protected]>
* draw: rework handling of non-existing outputs in emit code | Roland Scheidegger | 2016-01-07 | 3 | -23/+46
    Previously the code would just redirect requests for attributes which
    don't exist to use output 0. Rework this to output all zeros instead,
    which seems more useful - in particular some extensions like
    ARB_fragment_layer_viewport require 0 in the fs even if it wasn't output
    by previous stages. That way, drivers don't have to special case this
    depending on whether the vs/gs outputs some attribute or not.

    Reviewed-by: Brian Paul <[email protected]>
    Reviewed-by: Jose Fonseca <[email protected]>
    Reviewed-by: Edward O'Callaghan <[email protected]>
* gallium: Use unsigned for loop index | Edward O'Callaghan | 2016-01-06 | 1 | -3/+3
    Found-by: Coccinelle
    Signed-off-by: Edward O'Callaghan <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
* gallium: Remove unnecessary semicolons | Edward O'Callaghan | 2016-01-06 | 5 | -5/+6
    Fix silly issue with MSVC case fall-through support needing an extra
    'break;'.

    Found-by: Coccinelle
    Signed-off-by: Edward O'Callaghan <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
* llvmpipe: add POWER8 portability file - u_pwr8.h | Oded Gabbay | 2016-01-06 | 1 | -0/+310
    This file provides a portability layer that will make it easier to
    convert SSE-based functions to VMX/VSX-based functions.

    All the functions implemented in this file are prefixed with "vec_".
    Therefore, when converting from an SSE-based function, one simply needs
    to replace the "_mm_" prefix of the SSE function being called with
    "vec_".

    Having said that, not all functions could be converted as such, due to
    the differences between the architectures. So, where such a conversion
    hurt performance, I preferred to implement a more ad-hoc solution. For
    example, converting _mm_shuffle_epi32 needed to be done using ad-hoc
    masks instead of a generic function.

    All the functions in this file support both little-endian and big-endian
    but currently the file is built only on POWER8 LE machines.

    All of the functions are implemented using the Altivec/VMX intrinsics,
    except one where I needed to use inline assembly (due to a missing
    intrinsic).

    v2:
    - Use vec_vgbbd instead of __builtin_vec_vgbbd
    - Add an aligned load function
    - Don't use typeof()
    - Make file build only on POWER8 LE machine

    Signed-off-by: Oded Gabbay <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
* draw: minor indentation fix | Brian Paul | 2016-01-05 | 1 | -1/+1