path: root/src/gallium
Commit message (Author, Age, Files, Lines)
* swr: automake: add missing -I flag (Emil Velikov, 2016-06-13, 1 file, -0/+1)
  When building from a release tarball (where the generated/built files are in srcdir) in an out-of-tree (OOT) fashion, we need both builddir and srcdir in the include list. Otherwise the build errors out, because the file (the header gen_knobs.h in this case) is not in the location being searched.
  Cc: "12.0" <[email protected]>
  Cc: Tim Rowley <[email protected]>
  Signed-off-by: Emil Velikov <[email protected]>
* swr: Add missing headers for package inclusion (Chuck Atkins, 2016-06-13, 1 file, -1/+9)
  CC: "12.0" <[email protected]>
  Reviewed-by: Emil Velikov <[email protected]>
* gallivm: Fix trivial sign warnings (Jan Vesely, 2016-06-13, 8 files, -21/+22)
  v2: include whitespace fixes
  Signed-off-by: Jan Vesely <[email protected]>
  Reviewed-by: Jose Fonseca <[email protected]>
* st/va: use proper temp pipe_video_buffer template (Julien Isorce, 2016-06-13, 1 file, -4/+4)
  Use a temporary template instead of changing the format on the existing one, which makes error handling awkward and confuses Coverity.
  CoverityID: 1337953
  Signed-off-by: Julien Isorce <[email protected]>
  Reviewed-by: Christian König <[email protected]>
* st/va: it is valid to release the VABuffer of an exported resource (Julien Isorce, 2016-06-13, 1 file, -7/+1)
  pipe_resource_reference(&res, NULL) decrements the reference count, i.e. p_atomic_dec(res->count). But the VA surface still holds the initial reference, since it created the resource. So calling vaDestroyImage on a derived image calls vaDestroyBuffer, but the count won't reach 0.
  It is simply wrong for vlVaDestroyBuffer to rely on the export_refcount flag.
  Finally, the vaapi intel driver has the same logic.
  Signed-off-by: Julien Isorce <[email protected]>
  Reviewed-by: Christian König <[email protected]>
* nv50: reinstate dedicated constbuf push path (Ilia Mirkin, 2016-06-11, 5 files, -29/+50)
  This was disabled due to occasionally incorrect behavior when trying to upload data. It later became apparent that nvc0 also had a similar but slightly different issue, which was resolved in commit e50c01d5.
  This takes the same logic as nvc0 and applies it to nv50 (which has somewhat different interfaces). Unfortunately I did not note down precisely what was broken with UBOs when removing the support from nv50, but I've tested a bunch of local traces, and none of them appear to regress.
  This should hopefully improve performance when UBOs are used, but this was not directly verified.
  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* nv50: enable indirect addressing of fragment shader inputs (Ilia Mirkin, 2016-06-11, 2 files, -1/+2)
  Signed-off-by: Ilia Mirkin <[email protected]>
* llvmpipe: turn on pipe cap for GL_ARB_copy_image support (Brian Paul, 2016-06-10, 1 file, -1/+2)
  Reviewed-by: Charmaine Lee <[email protected]>
* llvmpipe: don't use 3-component formats, except 32-bit x 3 formats (Brian Paul, 2016-06-10, 1 file, -11/+12)
  This basically disallows all 8-bit x 3 and 16-bit x 3 formats for textures and render targets. Some 3-component formats were already disallowed before. This avoids problems with GL_ARB_copy_image.
  v2: the previous version of this patch disallowed all 3-component formats
  Reviewed-by: Charmaine Lee <[email protected]>
  Reviewed-by: Roland Scheidegger <[email protected]>
* softpipe: turn on pipe cap for GL_ARB_copy_image support (Brian Paul, 2016-06-10, 1 file, -1/+2)
  Reviewed-by: Charmaine Lee <[email protected]>
* softpipe: don't use 3-component formats (Brian Paul, 2016-06-10, 1 file, -0/+18)
  Mesa and gallium don't have a complete set of matching 3-component texture formats. For example, 8-bit sRGB unorm. To fully support the GL_ARB_copy_image extension we need to have support for all of these formats: RGB8_UNORM, RGB8_SNORM, RGB8_SRGB, RGB8_UINT, and RGB8_SINT using the same component order. Since we don't have that, disable the 3-component formats for now.
  v2: Simplify 3-component format check, per Marek. Also check that target != PIPE_BUFFER.
  Reviewed-by: Marek Olšák <[email protected]>
  Reviewed-by: Charmaine Lee <[email protected]>
* util: update util_resource_copy_region() for GL_ARB_copy_image (Brian Paul, 2016-06-10, 1 file, -20/+95)
  This primarily means adding support for copying between compressed and uncompressed formats.
  Reviewed-by: Charmaine Lee <[email protected]>
* gallium: Fix region overlap conditions for rectangles with a shared edge (Anuj Phogat, 2016-06-10, 1 file, -4/+4)
  From OpenGL 4.0 spec, section 4.3.2 "Copying Pixels":
  "The pixels corresponding to these buffers are copied from the source rectangle bounded by the locations (srcX0, srcY0) and (srcX1, srcY1) to the destination rectangle bounded by the locations (dstX0, dstY0) and (dstX1, dstY1). The lower bounds of the rectangle are inclusive, while the upper bounds are exclusive."
  So, the rectangles sharing just an edge shouldn't overlap.
  [Diagram: rectangles that share only an edge]
  Cc: "12.0" <[email protected]>
  Signed-off-by: Anuj Phogat <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
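The half-open rule quoted from the spec can be sketched in C as follows. This is a minimal illustration, not the code touched by the commit: `struct rect` and `rects_overlap` are hypothetical names, and the sketch assumes non-empty rectangles that are already normalized (x0 < x1, y0 < y1).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical rectangle type: lower bounds inclusive, upper bounds
 * exclusive, matching the quoted OpenGL spec text. */
struct rect {
   int x0, y0; /* inclusive */
   int x1, y1; /* exclusive */
};

/* Returns true if two non-empty half-open rectangles share a pixel. */
static bool
rects_overlap(const struct rect *a, const struct rect *b)
{
   /* Assumption of this sketch: normalized, non-empty rectangles. */
   assert(a->x0 < a->x1 && a->y0 < a->y1);
   assert(b->x0 < b->x1 && b->y0 < b->y1);

   /* Comparing with >= (not >) means a shared edge is not an overlap. */
   if (a->x0 >= b->x1 || b->x0 >= a->x1)
      return false;
   if (a->y0 >= b->y1 || b->y0 >= a->y1)
      return false;
   return true;
}
```

With strict comparisons (`>` instead of `>=`), two rectangles that merely touch would be reported as overlapping, which is the kind of condition the commit title describes.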
* gallivm: more 64-bit integer prep work. (Dave Airlie, 2016-06-11, 1 file, -8/+8)
  This converts one other place to using the new helper.
  Reviewed-by: Nicolai Hähnle <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* radeonsi: convert to 64-bitness checks instead of doubles. (Dave Airlie, 2016-06-11, 3 files, -31/+33)
  This converts to testing for 64-bit types and renames some things in anticipation of 64-bit integer support.
  Reviewed-by: Nicolai Hähnle <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* gallivm: make non-float return code bitcast consistent. (Dave Airlie, 2016-06-11, 1 file, -12/+6)
  This just uses the same form across the fetches.
  Reviewed-by: Roland Scheidegger <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* gallium/gallivm: use 64-bit test instead of doubles. (Dave Airlie, 2016-06-11, 1 file, -37/+36)
  This just makes some generic code that currently emits doubles suitable for emitting 64-bit values.
  Reviewed-by: Roland Scheidegger <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* gallium/tgsi: add 64-bitness type check function. (Dave Airlie, 2016-06-11, 1 file, -0/+7)
  Currently this only covers doubles, but we'll convert users to it, which makes adding 64-bit integers easier.
  Reviewed-by: Roland Scheidegger <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
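The idea of a single 64-bitness predicate might look roughly like the sketch below. The enum and function names (`sketch_type`, `sketch_type_is_64bit`, `sketch_type_channels`) are stand-ins invented for illustration and are not the identifiers added by the commit.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for a shader value type enum; the members are illustrative. */
enum sketch_type {
   SKETCH_TYPE_FLOAT,
   SKETCH_TYPE_UNSIGNED,
   SKETCH_TYPE_SIGNED,
   SKETCH_TYPE_DOUBLE,
};

/* One place that answers "is this type 64 bits wide?". */
static inline bool
sketch_type_is_64bit(enum sketch_type type)
{
   assert(type <= SKETCH_TYPE_DOUBLE);
   /* Only doubles today; 64-bit integer types would be added here. */
   return type == SKETCH_TYPE_DOUBLE;
}

/* Example caller: number of 32-bit channels one value occupies. */
static inline unsigned
sketch_type_channels(enum sketch_type type)
{
   return sketch_type_is_64bit(type) ? 2 : 1;
}
```

Callers that ask "is this 64-bit?" rather than "is this a double?" would not need to change when further 64-bit types are introduced.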
* vl/dri3: support receiving new pixmap for front buffer (Leo Liu, 2016-06-10, 1 file, -1/+6)
  With glx of gstreamer-vaapi, the temporary pixmap for the front buffer is renewed every frame, so when we receive a new pixmap we should get a new front buffer for it.
  This also fixes Totem player playback corruption.
  Signed-off-by: Leo Liu <[email protected]>
  Reviewed-by: Michel Dänzer <[email protected]>
  Cc: "12.0" <[email protected]>
* vl/dri3: get Makefile properly (Leo Liu, 2016-06-10, 3 files, -5/+10)
  In the original commit, the macro "if HAVE_DRI3" was in Makefile.sources. This file is shared with SCons, which is not able to parse this macro, so the SCons build failed.
  Jose quickly proposed two approaches and pushed a quick fix using his second approach. Thanks, Jose, for the solutions and fixes.
  This patch is Jose's first approach, which is more proper, because the dri3 C file should not be included in the build when DRI3 is not enabled.
  Signed-off-by: Leo Liu <[email protected]>
  Acked-by: Emil Velikov <[email protected]>
  Cc: "12.0" <[email protected]>
* gallivm: Never emit llvm.fmuladd on LLVM 3.3. (Jose Fonseca, 2016-06-10, 2 files, -1/+7)
  Besides the old JIT bug, it seems the X86 backend on LLVM 3.3 doesn't handle llvm.fmuladd and instead falls back to a C function, which in turn causes a segfault on Windows.
  Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: Use llvm.fmuladd.*. (Jose Fonseca, 2016-06-10, 7 files, -68/+98)
  Reviewed-by: Roland Scheidegger <[email protected]>
* util,gallivm: Explicitly enable/disable fma attribute. (Jose Fonseca, 2016-06-10, 4 files, -0/+13)
  As suggested by Roland Scheidegger.
  Use the same logic as f16c, since fma requires VEX encoding.
  But disable FMA on LLVM 3.3 without MCJIT.
  Reviewed-by: Roland Scheidegger <[email protected]>
* radeonsi: Reinitialize all descriptors in CE preamble. (Bas Nieuwenhuizen, 2016-06-10, 3 files, -3/+15)
  This fixes a problem with the CE preamble and restoring only stuff in the preamble when needed.
  To illustrate, suppose we have two graphics IBs 1 and 2, which are submitted in that order. Furthermore, suppose IB 1 does not use CE RAM, but IB 2 does, and we have a context switch at the start of IB 1, but not between IB 1 and IB 2.
  The old code put the CE RAM loads in the preamble of IB 2. As the preamble of IB 1 does not have the loads and the preamble of IB 2 does not get executed, the old values are not loaded into CE RAM.
  Fix this by always restoring the entire CE RAM.
  v2:
  - Just load all descriptor set buffers instead of loading and storing the entire CE RAM.
  - Leave the ce_ram_dirty tracking in place for the non-preamble case.
  v3:
  - Fixed parameter alignment.
  - Rebased to master (Nicolai's descriptor series).
  Signed-off-by: Bas Nieuwenhuizen <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
* swr: implement clipPlanes/clipVertex/clipDistance/cullDistance (Tim Rowley, 2016-06-09, 5 files, -2/+94)
  v2: only load the clip vertex once
  v3: fix clip enable logic, add cullDistance
  v4: remove duplicate fields in vs jit key, fix test of clip fixup needed
  v5: fix clipdistance linkage for slot!=0,4
  v6: support clip+cull; passes most piglit clip (failures understood)
  Reviewed-by: Bruce Cherniak <[email protected]>
* st/vdpau: implement luma keying (Nayan Deshmukh, 2016-06-09, 2 files, -12/+39)
  Signed-off-by: Nayan Deshmukh <[email protected]>
  Reviewed-by: Christian König <[email protected]>
* vl: Apply luma key filter before CSC conversion (Nayan Deshmukh, 2016-06-09, 7 files, -20/+42)
  Apply the luma key filter to the YCbCr values during the CSC conversion in the video buffer shader. The initial values of max and min luma are set to opposite values to disable the filter initially, and will be set when enabling it.
  Add extra parameters, min and max luma, for the luma key filter to vl_compositor_set_csc_matrix in va and xvmc. Setting them to the opposite values 1.f and 0.f respectively won't affect the CSC conversion.
  v2:
  - Squash 1, 2 and 3 into one patch to avoid breaking the build of other components. (Christian)
  - Use ureg_swizzle. (Christian)
  - Change the names of the variables. (Christian)
  v3:
  - Squash all patches into one to avoid breaking the build. (Emil)
  - Wrap functions properly. (Emil)
  - Use 0.0f and 1.0f instead of 0.f and 1.f respectively. (Emil)
  v4:
  - Divide it into two patches: one which introduces the functionality and assigns dummy values to the changed functions, and a second which implements the lumakey filter. (Christian)
  - Use ureg_scalar instead of ureg_swizzle. (Christian)
  Signed-off-by: Nayan Deshmukh <[email protected]>
  Reviewed-by: Christian König <[email protected]>
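Why "opposite" min and max values disable a range filter can be shown with a small C sketch. It assumes the filter keys out luma values inside a closed range [min, max]. The actual shader comparison in the commit may use different bounds, and `struct luma_key` and `luma_is_keyed` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical luma key range, with luma normalized to [0, 1]. */
struct luma_key {
   float min;
   float max;
};

/* Returns true if a pixel with this luma would be keyed out.
 * Assumption of this sketch: both bounds are inclusive. */
static bool
luma_is_keyed(const struct luma_key *key, float luma)
{
   assert(luma >= 0.0f && luma <= 1.0f);
   return luma >= key->min && luma <= key->max;
}
```

With min = 1.0f and max = 0.0f, no luma value can be both at least 1.0 and at most 0.0, so the range is empty and nothing is keyed out. That matches the message's statement that these values leave the CSC conversion unaffected.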
* virgl: fix checking fences (Marc-André Lureau, 2016-06-09, 2 files, -2/+2)
  When calling virgl_fence_wait() with timeout=0, virgl_{drm,vtest}_resource_is_busy() is called. However, it returns TRUE for a busy resource, whereas virgl_fence_wait() should return TRUE for a completed (non-busy) resource.
  This fixes running supertuxkart in a VM (I could not reproduce locally with vtest, though there is a similar fix).
  Signed-off-by: Marc-André Lureau <[email protected]>
  Cc: "11.1 11.2 12.0" <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
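The inverted-sense problem described above can be sketched as follows. The types and functions (`sketch_resource`, `sketch_resource_is_busy`, `sketch_fence_poll`) are hypothetical stand-ins, not the virgl winsys code; the sketch only shows that a "wait succeeded" result is the negation of an "is busy" query.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a resource whose busy state can be polled. */
struct sketch_resource {
   bool busy;
};

/* Returns true while work on the resource is still pending. */
static bool
sketch_resource_is_busy(const struct sketch_resource *res)
{
   return res->busy;
}

/* Poll-only fence wait (the timeout == 0 case): returns true when the
 * work has completed, so the busy query must be negated. */
static bool
sketch_fence_poll(const struct sketch_resource *res)
{
   assert(res);
   return !sketch_resource_is_busy(res);
}
```

Returning the busy result directly would report completion exactly when the work is still pending, which is the mismatch the commit message describes.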
* radeonsi: improve the computation and comment of scratch_waves (Marek Olšák, 2016-06-08, 1 file, -4/+18)
  2% isn't much. If you think the number should be decreased, please speak up.
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: print the number of spilled VGPRs (Marek Olšák, 2016-06-08, 1 file, -3/+6)
  Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: remove dead code creating LLVMTargetMachine (Marek Olšák, 2016-06-08, 3 files, -27/+1)
  This was for some old unsupported LLVM version. Only si_create_context creates the target machine now. r600g doesn't use this function.
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: don't enable scratch just for SGPR spills (Marek Olšák, 2016-06-08, 1 file, -2/+17)
  Diff from shader-db:
  Scratch: 3221504 -> 17408 (-99.46 %) bytes per wave
  v2: add "break;"
  Reviewed-by: Nicolai Hähnle <[email protected]>
* Revert "radeonsi: allow direct hw MSAA resolve for scanout surfaces" (Marek Olšák, 2016-06-08, 1 file, -0/+1)
  This reverts commit ffd54d1936fcd07424265b780e1d049222a01e94.
  No, it doesn't work. The test case is "glxgears -samples 2".
* radeonsi: re-enable PBO ReadPixels acceleration (Marek Olšák, 2016-06-08, 1 file, -3/+6)
  disabled by 4f1cccf570112f93265a4cace504eb763fa8f73e
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: allow MSAA resolving into a texture that has DCC enabled (Marek Olšák, 2016-06-08, 2 files, -4/+23)
  Since DCC is enabled almost everywhere now, it's important not to disable this fast path.
  Reviewed-by: Nicolai Hähnle <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* gallium/radeon: move DCC clearing into a separate function (Marek Olšák, 2016-06-08, 2 files, -5/+19)
  Reviewed-by: Nicolai Hähnle <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: allow direct hw MSAA resolve for scanout surfaces (Marek Olšák, 2016-06-08, 1 file, -1/+0)
  No idea why this was disabled, but it works fine.
  Reviewed-by: Nicolai Hähnle <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: don't allocate DCC for the temporary MSAA resolve surface (Marek Olšák, 2016-06-08, 3 files, -2/+5)
  Allocating it has no effect, but it adds overhead (useless DCC clear).
  Reviewed-by: Nicolai Hähnle <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: don't enable DCC in the sampler if first_level doesn't have it (Marek Olšák, 2016-06-08, 3 files, -7/+21)
  If first_level > 0 and DCC is disabled for that level, let's skip DCC reads entirely.
  Reviewed-by: Nicolai Hähnle <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* winsys/amdgpu: enable DCC for mipmapped textures (Marek Olšák, 2016-06-08, 3 files, -9/+31)
  Also add dcc_fast_clear_size for clearing only the necessary subset of DCC. For no AA, it's equal to the size of the whole DCC level.
  Reviewed-by: Nicolai Hähnle <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* gallium/radeon: don't disable DCC because of SDMA (Marek Olšák, 2016-06-08, 1 file, -20/+3)
  We want to keep DCC enabled to save bandwidth. It was a bad idea to disable it here.
  Reviewed-by: Nicolai Hähnle <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: don't flag renderbuffer feedback loop if DCC has just been disabled (Marek Olšák, 2016-06-08, 1 file, -2/+4)
  Reviewed-by: Nicolai Hähnle <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: add per-level dcc_enabled flags (Marek Olšák, 2016-06-08, 6 files, -11/+24)
  Reviewed-by: Nicolai Hähnle <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: compute DCC register parameters in si_emit_framebuffer_state (Marek Olšák, 2016-06-08, 4 files, -14/+12)
  This will get more complicated with mipmapped DCC or when DCC is enabled after allocation.
  Reviewed-by: Nicolai Hähnle <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* gallium/radeon: add an assertion checking the validity of PIPE_BIND_SCANOUT (Marek Olšák, 2016-06-08, 1 file, -3/+10)
  Reviewed-by: Nicolai Hähnle <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* gallium/radeon: don't allocate DCC for non-renderable texture formats (Marek Olšák, 2016-06-08, 3 files, -0/+7)
  R9G9B9E5 is the only uncompressed one hopefully.
  This fixes incorrect rendering not discovered (due to a lack of tests) until DCC mipmapping was enabled.
  Cc: 11.1 11.2 12.0 <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: enable WQM in PS prolog when needed (Nicolai Hähnle, 2016-06-07, 2 files, -0/+10)
  WQM is needed when the PS prolog computes a VGPR that is consumed by a shader with (implicit or explicit) derivatives.
  Depends on http://reviews.llvm.org/D20839 / LLVM r272063 for this to be effective (otherwise it's just a no-op).
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95130
  Cc: 12.0 <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* tgsi/scan: add uses_derivatives (v2) (Nicolai Hähnle, 2016-06-07, 2 files, -0/+31)
  v2:
  - TG4 does not calculate derivatives (Ilia)
  - also handle SAMPLE* instructions (Roland)
  Cc: 12.0 <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]> (v1)
  Reviewed-by: Brian Paul <[email protected]> (v1)
  Reviewed-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Roland Scheidegger <[email protected]>
* swr: fix provoking vertex (Tim Rowley, 2016-06-07, 7 files, -12/+77)
  Use rasterizer provoking vertex API.
  Fix rasterizer provoking vertex for tristrips and quad list/strips.
  v2: make provoking vertex tables static const
  Reviewed-by: Bruce Cherniak <[email protected]>
* gk104/ir: fix conditions for adding a texbar (Ilia Mirkin, 2016-06-07, 1 file, -4/+6)
  Sometimes a register source can actually be double- or even quad-wide. We must make sure that the inserted texbars take that width into account.
  Based on an earlier patch by Samuel Pitoiset.
  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
  Cc: "12.0 11.2" <[email protected]>