summaryrefslogtreecommitdiffstats
path: root/src/gallium/auxiliary
Commit message (Collapse)AuthorAgeFilesLines
* gallivm: add helper lp_add_attr_dereferenceableMarek Olšák2016-07-132-0/+14
| | | | | | | | | Not sure if this is the right way to do it, but it seems to work. v2: make it a no-op on LLVM <= 3.5 Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* vl/compositor: set layer of y or uv to renderLeo Liu2016-07-122-0/+42
| | | | | | Signed-off-by: Leo Liu <[email protected]> Acked-by: Christian König <[email protected]> Tested-by: Julien Isorce <[email protected]>
* vl/compositor: add weave to yuv shaderLeo Liu2016-07-122-0/+43
| | | | | | | | This shader will make interlaced yuv to progressive yuv. Signed-off-by: Leo Liu <[email protected]> Acked-by: Christian König <[email protected]> Tested-by: Julien Isorce <[email protected]>
* vl/compositor: move weave shader out from rgb weavingLeo Liu2016-07-122-76/+83
| | | | | | | | We'll use weave shader in the later patch. Signed-off-by: Leo Liu <[email protected]> Acked-by: Christian König <[email protected]> Tested-by: Julien Isorce <[email protected]>
* gallivm: set LLVMNoUnwindAttribute on all intrinsicsMarek Olšák2016-07-111-2/+4
| | | | | | | | | RadeonSI stats: Mostly 0% difference, but Valley shows a small improvement: Application Files SGPRs VGPRs SpillSGPR SpillVGPR Code Size LDS Max Waves Waits unigine_valley 278 0.00 % -0.29 % 0.00 % 0.00 % 0.01 % 0.00 % 0.17 % 0.00 % Reviewed-by: Jose Fonseca <[email protected]>
* gallium/u_queue: assert that users must wait on fences before destroying themNicolai Hähnle2016-07-111-0/+1
| | | | | Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* gallium/u_queue: guard fence->signalled checks with fence->mutexNicolai Hähnle2016-07-111-3/+0
| | | | | | | | | | | | | | I have seen a hang during application shutdown that could be explained by the following race condition which this patch fixes: 1. Worker thread enters util_queue_fence_signal, sets fence->signalled = true. 2. Main thread calls util_queue_job_wait, which returns immediately. 3. Main thread deletes the job and fence structures, leaving garbage behind. 4. Worker thread calls pipe_condvar_broadcast, which gets stuck forever because it is accessing garbage. Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* vl: add half pixel to v_tex before adding offsetsNayan Deshmukh2016-07-081-0/+2
| | | | | | | | Since pixel center lies at 0.5, add half_pixel to vtex before adding offsets to it. Signed-off-by: Nayan Deshmukh <[email protected]> Reviewed-by: Christian König <[email protected]>
* gallium/util: make util_copy_framebuffer_state(src=NULL) workRob Clark2016-07-061-11/+26
| | | | | | | | Be more consistent with the other u_inlines util_copy_xyz_state() helpers and support NULL src. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* tgsi: Add WORK_DIM System ValueHans de Goede2016-07-021-0/+1
| | | | | | | | | | | Add a new WORK_DIM SV type, this is will return the grid dimensions (1-4) for compute (opencl) kernels. This is necessary to implement the opencl get_work_dim() function. Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* vl: add a bicubic interpolation filter(v5)Nayan Deshmukh2016-07-013-0/+528
| | | | | | | | | | | | | | | This is a shader based bicubic interpolater which uses cubic Hermite spline algorithm. v2: set dst_area and dst_clip during scaling (Christian) v3: clear the render target before rendering v4: intialize offsets while initializing shaders use a constant buffer to send dst_size to frag shader small changes to reduce calculation in shader v5: send half pixel offset instead of sending dst_size Signed-off-by: Nayan Deshmukh <[email protected]> Reviewed-by: Christian König <[email protected]>
* gallium/util: check for window cliprects in util_can_blit_via_copy_region()Brian Paul2016-06-301-0/+1
| | | | | | We can't blit with resource_copy_region() if there are window clip rects. Reviewed-by: Roland Scheidegger <[email protected]>
* gallium/util: add tight_format_check param to util_can_blit_via_copy_region()Brian Paul2016-06-302-11/+30
| | | | | | | | The VMware driver will use this for implementing GL_ARB_copy_image. Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Charmaine Lee <[email protected]>
* gallium/util: simplify a few things in util_can_blit_via_copy_region()Brian Paul2016-06-301-12/+8
| | | | | | | | | Since only the src box can have negative dims for flipping, just comparing the src/dst box sizes is enough to detect flips. Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Charmaine Lee <[email protected]>
* gallium/util: new util_try_blit_via_copy_region() functionBrian Paul2016-06-302-15/+32
| | | | | | | | | Pulled out of the util_try_blit_via_copy_region() function. Subsequent changes build on this. Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Charmaine Lee <[email protected]>
* pipe_loader_sw: Fix fd leak when instantiated via pipe_loader_sw_probe_kmsHans de Goede2016-06-281-0/+7
| | | | | | | | | | | | Make pipe_loader_sw_probe_kms take ownership of the passed in fd, like pipe_loader_drm_probe_fd does. The only caller is dri_kms_init_screen which passes in a dupped fd, just like dri2_init_screen passes in a dupped fd to pipe_loader_drm_probe_fd. Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* gallium/u_queue: allow the execute function to differ per jobMarek Olšák2016-06-242-10/+12
| | | | | | so that independent types of jobs can use the same queue. Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/u_queue: reduce the number of mutexes by 2Marek Olšák2016-06-242-20/+35
| | | | | | by converting semaphores to condvars and using the main mutex Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/u_queue: add an option to name threadsMarek Olšák2016-06-242-0/+11
| | | | | | | | for debugging v2: correct the snprintf use Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/u_queue: add an option to have multiple worker threadsMarek Olšák2016-06-242-15/+63
| | | | | | | | independent jobs don't have to be stuck on only one thread v2: use CALLOC & FREE Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/u_queue: rewrite util_queue_fence to allow multiple waitersMarek Olšák2016-06-242-16/+43
| | | | | | | | Checking "signalled" is first done without a mutex, then with a mutex. Also, checking without waiting doesn't lock the mutex. This is racy, but should be safe. Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/u_queue: use a ring instead of a stackMarek Olšák2016-06-242-18/+45
| | | | | | | | and allow specifying its size in util_queue_init. v2: use CALLOC & FREE Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/util: fix some 4-space indentation in blitter codeBrian Paul2016-06-231-21/+21
| | | | | Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Charmaine Lee <[email protected]>
* translate: fix start_instance parameter in sse versionIlia Mirkin2016-06-211-7/+7
| | | | | | | | | The generic version gets this right already, but this was using an incorrect formula in SSE. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "11.2 12.0" <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* gallium/u_blitter: implement mipmap generationMarek Olšák2016-06-212-114/+238
| | | | | | | | | for pipe_context::generate_mipmap first move some of the blit code from util_blitter_blit_generic to a separate function, then use it from util_blitter_generate_mipmap Reviewed-by: Nicolai Hähnle <[email protected]>
* gallivm: don't use integer min/max sse intrinsics with llvm >= 3.9Roland Scheidegger2016-06-201-2/+4
| | | | | | | | | | | | | | | | | | | Apparently, these are deprecated. There's some AutoUpgrade feature which is supposed to promote these to cmp/select, which apparently doesn't work with jit code. It is possible it's not actually even meant to work (see the bug filed against llvm which couldn't provide an answer neither) but in any case this is meant to be only temporary unless the intrinsics are really illegal. So, just use the fallback code (which should be cmp/select, we're actually doing cmp/sext/trunc/select, but in any case llvm 3.9 manages to optimize this back to pmin/pmax in the end). This addresses https://llvm.org/bugs/show_bug.cgi?id=28176 CC: <[email protected]> Reviewed-by: Jose Fonseca <[email protected]> Tested-by: Vinson Lee <[email protected]> Tested-by: Aaron Watry <[email protected]>
* vl: support luma keying for interlaced surfaces as wellChristian König2016-06-161-35/+41
| | | | | | We had the CSC code twice in there, factor it out into a separate function. Signed-off-by: Christian König <[email protected]>
* auxilary/os: allow appending to GALLIUM_LOG_FILEBrian Paul2016-06-151-2/+13
| | | | | | | | | | If the log file specified by the GALLIUM_LOG_FILE begins with '+', open the file in append mode. This is useful to log all gallium output for an entire piglit run, for example. v2: put GALLIUM_LOG_FILE support inside an #ifdef DEBUG block. Reviewed-by: Jose Fonseca <[email protected]>
* gallium/util: import the multithreaded job queue from amdgpu winsys (v2)Marek Olšák2016-06-153-0/+211
| | | | | | v2: rename the event to util_queue_fence Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/util: don't use blocksize for minify for assertionsRoland Scheidegger2016-06-141-20/+8
| | | | | | | | | | | | | | | | The previous assertions required for texture sizes smaller than block_size that src_box.x + src_box.width still be block size. (e.g. for a texture with width 3, and src_box.x = 0, src_box.width would have to be 4 to not assert.) This caused some assertions with some other state tracker. It looks though like callers aren't expected to round up widths to block sizes (for sizes larger than block size the assertion would still have verified it wouldn't have been rounded up) so we simply shouldn't use a minify which rounds up to block size. (No piglit change with llvmpipe.) Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* st/va: ensure linear memory for dmabufJulien Isorce2016-06-141-1/+1
| | | | | | | | | | | | | | | | In order to do zero-copy between two different devices the memory should not be tiled. Tested with GStreamer on a laptop that has 2 GPUs: 1- gstvaapidecode: HW decoding and dmabuf export with nouveau driver on Nvidia GPU. 2- glimagesink: EGLImage imports dmabuf on Intel GPU. TEST: DRI_PRIME=1 gst-launch vaapidecodebin ! glimagesink Signed-off-by: Julien Isorce <[email protected]> Reviewed-by: Christian König <[email protected]>
* mesa/gallium: Move u_bit_scan{,64} from gallium to util.Mathias Fröhlich2016-06-141-148/+1
| | | | | | | | | | | | | | | The functions are also useful for mesa. Introduce src/util/bitscan.{h,c}. Move ffs function implementations from src/mesa/main/imports.{h,c}. Move bit scan related functions from src/gallium/auxiliary/util/u_math.h. Merge platform handling with what is available from within mesa. v2: Try to fix MSVC compile. Reviewed-by: Brian Paul <[email protected]> Tested-by: Brian Paul <[email protected]> Signed-off-by: Mathias Fröhlich <[email protected]>
* util: update some assertions in util_resource_copy_region()Brian Paul2016-06-131-4/+8
| | | | | | | | To cope with copies of compressed images which are not multiples of the block size. Suggested by Jose. Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <sroland@[email protected]>
* vl: Fix trivial sign compare warningsJan Vesely2016-06-137-18/+15
| | | | | | | | | v2: add whitepace fixes Signed-off-by: Jan Vesely <[email protected]> Acked-by: Jose Fonseca <[email protected]> [Emil Velikov: squash a few more whitespace issues] Reviewed-by: Emil Velikov <[email protected]>
* Android: move libdrm settings to top-level Android.common.mkRob Herring2016-06-131-3/+0
| | | | | | | | | | | | | | Fix warnings like these due to HAVE_LIBDRM being inconsistently defined: external/libdrm/include/drm/drm.h:839:30: warning: redefinition of typedef 'drm_clip_rect_t' is a C11 feature [-Wtypedef-redefinition] typedef struct drm_clip_rect drm_clip_rect_t; HAVE_LIBDRM needs to be set project wide to fix this. This change also harmlessly links libdrm with everything, but simplifies the makefiles a bit. Signed-off-by: Rob Herring <[email protected]> Acked-by: Emil Velikov <[email protected]>
* gallivm: Fix trivial sign warningsJan Vesely2016-06-138-21/+22
| | | | | | | v2: include whitespace fixes Signed-off-by: Jan Vesely <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* util: update util_resource_copy_region() for GL_ARB_copy_imageBrian Paul2016-06-101-20/+95
| | | | | | | This primarily means added support for copying between compressed and uncompressed formats. Reviewed-by: Charmaine Lee <[email protected]>
* gallium: Fix region overlap conditions for rectangles with a shared edgeAnuj Phogat2016-06-101-4/+4
| | | | | | | | | | | | | | | | | | | | | >From OpenGL 4.0 spec, section 4.3.2 "Copying Pixels": "The pixels corresponding to these buffers are copied from the source rectangle bounded by the locations (srcX0, srcY 0) and (srcX1, srcY 1) to the destination rectangle bounded by the locations (dstX0, dstY 0) and (dstX1, dstY 1). The lower bounds of the rectangle are inclusive, while the upper bounds are exclusive." So, the rectangles sharing just an edge shouldn't overlap. ----------- | | ------- --- | | | | | | ------- --- Cc: "12.0" <[email protected]> Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* gallivm: more 64-bit integer prep work.Dave Airlie2016-06-111-8/+8
| | | | | | | This converts one other place to using the new helper. Reviewed-by: Nicolai Hähnle <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* gallivm: make non-float return code bitcast consistent.Dave Airlie2016-06-111-12/+6
| | | | | | | | This just uses the same form across the fetches. Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* gallium/gallivm: use 64-bit test instead of doubles.Dave Airlie2016-06-111-37/+36
| | | | | | | | | This just makes some generic code that currently emits double suitable for emitting 64-bit values. Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* gallium/tgsi: add 64-bitness type check function.Dave Airlie2016-06-111-0/+7
| | | | | | | | | Currently this just doubles, but we'll convert users to this so making adding 64-bit integers easier. Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* vl/dri3: support receiving new pixmap for front bufferLeo Liu2016-06-101-1/+6
| | | | | | | | | | | | With glx of gstreamer-vaapi, the temporary pixmap for front buffer gets renewed in each frame, so when we receive a new pixmap, should get a new front buffer for it. This also fixes Totem player playback corruption. Signed-off-by: Leo Liu <[email protected]> Reviewed-by: Michel Dänzer <[email protected]> Cc: "12.0" <[email protected]>
* vl/dri3: get Makefile properlyLeo Liu2016-06-103-5/+10
| | | | | | | | | | | | | | From original commit, the macro "if HAVE_DRI3" was in Makefile.sources, this file is shared with SCons, SCons is not able to parse this marco, the SCons build failed. Jose quickly gave two approaches and quick fix with his second approach, thanks Jose for the solutions and fixes. This patch is Jose's first approach, and it's more proper, because the dri3 c file should not be included to build when DRI3 is not enabled. Signed-off-by: Leo Liu <[email protected]> Acked-by: Emil Velikov <[email protected]> Cc: "12.0" <[email protected]>
* gallivm: Never emit llvm.fmuladd on LLVM 3.3.Jose Fonseca2016-06-102-1/+7
| | | | | | | | Besides the old JIT bug, it seems the X86 backend on LLVM 3.3 doesn't handle llvm.fmuladd and instead it fall backs to a C function. Which in turn causes a segfault on Windows. Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: Use llvm.fmuladd.*.Jose Fonseca2016-06-106-57/+92
| | | | Reviewed-by: Roland Scheidegger <[email protected]>
* util,gallivm: Explicitly enable/disable fma attribute.Jose Fonseca2016-06-104-0/+13
| | | | | | | | | | As suggested by Roland Scheidegger. Use the same logic as f16c, since fma requires VEX encoding. But disable FMA on LLVM 3.3 without MCJIT. Reviewed-by: Roland Scheidegger <[email protected]>
* vl: Apply luma key filter before CSC conversionNayan Deshmukh2016-06-092-13/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Apply the luma key filter to the YCbCr values during the CSC conversion in video buffer shader. The initial values of max and min luma are set to opposite values to disable the filter initially and will be set when enabling it. Add extra parmeters min and max luma for the luma key filter in vl_compositor_set_csc_matrix in va, xvmc. Setting them to opposite value 1.f and 0.f respectively won't effect the CSC conversion v2: -Squash 1,2 and 3 into one patch to avoid breaking build of other components. (Christian) -use ureg_swizzle. (Christian) -change name of the variables. (Christian) v3: -Squash all patches in one to avoid breaking of build. (Emil) -wrap functions properly. (Emil) -use 0.0f and 1.0f instead of 0.f and 1.f respectively. (Emil) v4: -Divide it in two patches one which introduces the functionality and assigs dummy values to the changed functions and second which implements the lumakey filter. (Christian) -use ureg_scalar instead ureg_swizzle. (Christian) Signed-off-by: Nayan Deshmukh <[email protected]> Reviewed-by: Christian König <[email protected]>
* tgsi/scan: add uses_derivatives (v2)Nicolai Hähnle2016-06-072-0/+31
| | | | | | | | | | | | v2: - TG4 does not calculate derivatives (Ilia) - also handle SAMPLE* instructions (Roland) Cc: 12.0 <[email protected]> Reviewed-by: Marek Olšák <[email protected]> (v1) Reviewed-by: Brian Paul <[email protected]> (v1) Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* gallium: add VOTE_* opcodes to implement GL_ARB_shader_group_voteIlia Mirkin2016-06-061-0/+3
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Dave Airlie <[email protected]>