summaryrefslogtreecommitdiffstats
path: root/src/gallium
Commit message (Collapse)AuthorAgeFilesLines
* vl/dri3: support receiving new pixmap for front bufferLeo Liu2016-06-101-1/+6
| | | | | | | | | | | | With glx of gstreamer-vaapi, the temporary pixmap for front buffer gets renewed in each frame, so when we receive a new pixmap, should get a new front buffer for it. This also fixes Totem player playback corruption. Signed-off-by: Leo Liu <[email protected]> Reviewed-by: Michel Dänzer <[email protected]> Cc: "12.0" <[email protected]>
* vl/dri3: get Makefile properlyLeo Liu2016-06-103-5/+10
| | | | | | | | | | | | | | From original commit, the macro "if HAVE_DRI3" was in Makefile.sources, this file is shared with SCons, SCons is not able to parse this marco, the SCons build failed. Jose quickly gave two approaches and quick fix with his second approach, thanks Jose for the solutions and fixes. This patch is Jose's first approach, and it's more proper, because the dri3 c file should not be included to build when DRI3 is not enabled. Signed-off-by: Leo Liu <[email protected]> Acked-by: Emil Velikov <[email protected]> Cc: "12.0" <[email protected]>
* gallivm: Never emit llvm.fmuladd on LLVM 3.3.Jose Fonseca2016-06-102-1/+7
| | | | | | | | Besides the old JIT bug, it seems the X86 backend on LLVM 3.3 doesn't handle llvm.fmuladd and instead it fall backs to a C function. Which in turn causes a segfault on Windows. Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: Use llvm.fmuladd.*.Jose Fonseca2016-06-107-68/+98
| | | | Reviewed-by: Roland Scheidegger <[email protected]>
* util,gallivm: Explicitly enable/disable fma attribute.Jose Fonseca2016-06-104-0/+13
| | | | | | | | | | As suggested by Roland Scheidegger. Use the same logic as f16c, since fma requires VEX encoding. But disable FMA on LLVM 3.3 without MCJIT. Reviewed-by: Roland Scheidegger <[email protected]>
* radeonsi: Reinitialize all descriptors in CE preamble.Bas Nieuwenhuizen2016-06-103-3/+15
| | | | | | | | | | | | | | | | | | | | | | | | | This fixes a problem with the CE preamble and restoring only stuff in the preamble when needed. To illustrate suppose we have two graphics IB's 1 and 2, which are submitted in that order. Furthermore suppose IB 1 does not use CE ram, but IB 2 does, and we have a context switch at the start of IB 1, but not between IB 1 and IB 2. The old code put the CE RAM loads in the preamble of IB 2. As the preamble of IB 1 does not have the loads and the preamble of IB 2 does not get executed, the old values are not load into CE RAM. Fix this by always restoring the entire CE RAM. v2: - Just load all descriptor set buffers instead of load and store the entire CE RAM. - Leave the ce_ram_dirty tracking in place for the non-preamble case. v3: - Fixed parameter alignment. - Rebased to master (Nicolai's descriptor series). Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* swr: implement clipPlanes/clipVertex/clipDistance/cullDistanceTim Rowley2016-06-095-2/+94
| | | | | | | | | | | | | | v2: only load the clip vertex once v3: fix clip enable logic, add cullDistance v4: remove duplicate fields in vs jit key, fix test of clip fixup needed v5: fix clipdistance linkage for slot!=0,4 v6: support clip+cull; passes most piglit clip (failures understood) Reviewed-by: Bruce Cherniak <[email protected]>
* st/vdpau: implement luma keyingNayan Deshmukh2016-06-092-12/+39
| | | | | Signed-off-by: Nayan Deshmukh <[email protected]> Reviewed-by: Christian König <[email protected]>
* vl: Apply luma key filter before CSC conversionNayan Deshmukh2016-06-097-20/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Apply the luma key filter to the YCbCr values during the CSC conversion in video buffer shader. The initial values of max and min luma are set to opposite values to disable the filter initially and will be set when enabling it. Add extra parmeters min and max luma for the luma key filter in vl_compositor_set_csc_matrix in va, xvmc. Setting them to opposite value 1.f and 0.f respectively won't effect the CSC conversion v2: -Squash 1,2 and 3 into one patch to avoid breaking build of other components. (Christian) -use ureg_swizzle. (Christian) -change name of the variables. (Christian) v3: -Squash all patches in one to avoid breaking of build. (Emil) -wrap functions properly. (Emil) -use 0.0f and 1.0f instead of 0.f and 1.f respectively. (Emil) v4: -Divide it in two patches one which introduces the functionality and assigs dummy values to the changed functions and second which implements the lumakey filter. (Christian) -use ureg_scalar instead ureg_swizzle. (Christian) Signed-off-by: Nayan Deshmukh <[email protected]> Reviewed-by: Christian König <[email protected]>
* virgl: fix checking fencesMarc-André Lureau2016-06-092-2/+2
| | | | | | | | | | | | | | When calling virgl_fence_wait() with timeout=0, virgl_{drm,vtest}_resource_is_busy() is called. However, it returns TRUE for a busy resource, whereace virgl_fence_wait() should return TRUE for a completed (non-busy) resource. This fixes running supertuxkart in a VM (I could not reproduce locally with vtest though there is a similar fix) Signed-off-by: Marc-André Lureau <[email protected]> Cc: "11.1 11.2 12.0" <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radeonsi: improve the computation and comment of scratch_wavesMarek Olšák2016-06-081-4/+18
| | | | | | 2% isn't much. If you think the number should be decreased, please speak up. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: print the number of spilled VGPRsMarek Olšák2016-06-081-3/+6
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: remove dead code creating LLVMTargetMachineMarek Olšák2016-06-083-27/+1
| | | | | | | | This was for some old unsupported LLVM version. Only si_create_context creates the target machine now. r600g doesn't use this function. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: don't enable scratch just for SGPR spillsMarek Olšák2016-06-081-2/+17
| | | | | | | | | Diff from shader-db: Scratch: 3221504 -> 17408 (-99.46 %) bytes per wave v2: add "break;" Reviewed-by: Nicolai Hähnle <[email protected]>
* Revert "radeonsi: allow direct hw MSAA resolve for scanout surfaces"Marek Olšák2016-06-081-0/+1
| | | | | | This reverts commit ffd54d1936fcd07424265b780e1d049222a01e94. No, it doesn't work. The test case is "glxgears -samples 2".
* radeonsi: re-enable PBO ReadPixels accelerationMarek Olšák2016-06-081-3/+6
| | | | | | disabled by 4f1cccf570112f93265a4cace504eb763fa8f73e Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: allow MSAA resolving into a texture that has DCC enabledMarek Olšák2016-06-082-4/+23
| | | | | | | | Since DCC is enabled almost everywhere now, it's important not to disable this fast path. Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* gallium/radeon: move DCC clearing into a separate functionMarek Olšák2016-06-082-5/+19
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: allow direct hw MSAA resolve for scanout surfacesMarek Olšák2016-06-081-1/+0
| | | | | | | No idea why this was disabled, but it works fine. Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: don't allocate DCC for the temporary MSAA resolve surfaceMarek Olšák2016-06-083-2/+5
| | | | | | | Allocating it has no effect, but it adds overhead (useless DCC clear). Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: don't enable DCC in the sampler if first_level doesn't have itMarek Olšák2016-06-083-7/+21
| | | | | | | | If first_level > 0 and DCC is disabled for that level, let's skip DCC reads entirely. Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* winsys/amdgpu: enable DCC for mipmapped texturesMarek Olšák2016-06-083-9/+31
| | | | | | | | Also add dcc_fast_clear_size for clearing only the necessary subset of DCC. For no AA, it's equal to the size of the whole DCC level. Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* gallium/radeon: don't disable DCC because of SDMAMarek Olšák2016-06-081-20/+3
| | | | | | | | We want to keep DCC enabled to save bandwidth. It was a bad idea to disable it here. Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: don't flag renderbuffer feedback loop if DCC has just been disabledMarek Olšák2016-06-081-2/+4
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: add per-level dcc_enabled flagsMarek Olšák2016-06-086-11/+24
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: compute DCC register parameters in si_emit_framebuffer_stateMarek Olšák2016-06-084-14/+12
| | | | | | | | This will get more complicated with mipmapped DCC or when DCC is enabled after allocation. Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* gallium/radeon: add an assertion checking the validity of PIPE_BIND_SCANOUTMarek Olšák2016-06-081-3/+10
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* gallium/radeon: don't allocate DCC for non-renderable texture formatsMarek Olšák2016-06-083-0/+7
| | | | | | | | | | | R9G9B9E5 is the only uncompressed one hopefully. This fixes incorrect rendering not discovered (due to a lack of tests) until DCC mipmapping was enabled. Cc: 11.1 11.2 12.0 <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: enable WQM in PS prolog when neededNicolai Hähnle2016-06-072-0/+10
| | | | | | | | | | | | WQM is needed when the PS prolog computes a VGPR that is consumed by a shader with (implicit or explicit) derivatives. Depends on http://reviews.llvm.org/D20839 / LLVM r272063 for this to be effective (otherwise it's just a no-op). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95130 Cc: 12.0 <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* tgsi/scan: add uses_derivatives (v2)Nicolai Hähnle2016-06-072-0/+31
| | | | | | | | | | | | v2: - TG4 does not calculate derivatives (Ilia) - also handle SAMPLE* instructions (Roland) Cc: 12.0 <[email protected]> Reviewed-by: Marek Olšák <[email protected]> (v1) Reviewed-by: Brian Paul <[email protected]> (v1) Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* swr: fix provoking vertexTim Rowley2016-06-077-12/+77
| | | | | | | | | | Use rasterizer provoking vertex API. Fix rasterizer provoking vertex for tristrips and quad list/strips. v2: make provoking vertex tables static const Reviewed-by: Bruce Cherniak <[email protected]>
* gk104/ir: fix conditions for adding a texbarIlia Mirkin2016-06-071-4/+6
| | | | | | | | | | | | Sometimes a register source can actually be double- or even quad-wide. We must make sure that the inserted texbars take that width into account. Based on an earlier patch by Samuel Pitoiset. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Cc: "12.0 11.2" <[email protected]>
* radeonsi: keep track of dirty descriptor setsNicolai Hähnle2016-06-072-4/+36
| | | | | | | Reduces CPU load for draw calls that change none or few of the descriptors. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: move si_descriptors into a per-context arrayNicolai Hähnle2016-06-073-83/+166
| | | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: pass shader stage to si_disable_shader_imageNicolai Hähnle2016-06-071-4/+8
| | | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: access descriptor sets via local variablesNicolai Hähnle2016-06-071-31/+41
| | | | | | | This will simplify moving them to a per-context array. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: add si_set_rw_buffer to be used for internal descriptorsNicolai Hähnle2016-06-073-14/+15
| | | | | | | | So that callers outside of si_descriptors.c need to worry less about the details of descriptor handling. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: pass shader stage to si_set_shader_imageNicolai Hähnle2016-06-071-5/+5
| | | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: pass shader stage to si_set_sampler_viewNicolai Hähnle2016-06-071-4/+5
| | | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: move descriptor set begin_new_cs handling into a separate functionNicolai Hähnle2016-06-071-21/+15
| | | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: move enabled_mask out of si_descriptorsNicolai Hähnle2016-06-074-30/+34
| | | | | | | | This mask is irrelevant for the generic descriptor set handling, and having it outside simplifies subsequent changes slightly. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: add support for sharing textures with DCC between processesMarek Olšák2016-06-073-4/+51
| | | | | | v2: use a function for calculating WORD1 of bo metadata Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: don't discard DCC if an external user can write to itMarek Olšák2016-06-073-12/+31
| | | | | | | | | We don't import textures with DCC now, but soon we will. v2: if we can't disable DCC for image writes, at least decompress DCC at bind time Reviewed-by: Nicolai Hähnle <[email protected]>
* i915: fix typo CAP.Dave Airlie2016-06-071-1/+1
| | | | Signed-off-by: Dave Airlie <[email protected]>
* nvc0: add support for VOTE tgsi opcodesIlia Mirkin2016-06-065-24/+77
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* gallium: add PIPE_CAP_TGSI_VOTE for when the VOTE ops are allowedIlia Mirkin2016-06-0617-0/+17
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* gallium: add VOTE_* opcodes to implement GL_ARB_shader_group_voteIlia Mirkin2016-06-063-1/+26
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* nv50/ir: use round toward 0 when converting doubles to integersSamuel Pitoiset2016-06-061-1/+3
| | | | | | | | | | | | Like floats, we should use the round toward 0 mode instead of the nearest one (which is the default) for doubles to integers. This fixes all arb_gpu_shader_fp64 piglits which convert doubles to integers (16 tests). Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Cc: "11.2 12.0" <[email protected]>
* gallium/radeon: don't re-set BO metadata after CMASK deallocationMarek Olšák2016-06-061-1/+0
| | | | | | CMASK has no effect on metadata, because it's not sharable. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add a performance tweak for 4 SE partsMarek Olšák2016-06-061-0/+11
| | | | | | Ported from Vulkan. Reviewed-by: Nicolai Hähnle <[email protected]>