summaryrefslogtreecommitdiffstats
path: root/src/gallium
Commit message (Collapse)AuthorAgeFilesLines
* gallium: Fix uninitialized variable warning in compute test.Eric Anholt2018-11-271-1/+1
| | | | | | | The compiler doesn't know that ny != 0, so x might be uninitialized for the printf at the end. Reviewed-by: Elie Tournier <[email protected]>
* nv50/ir: remove dnz flag when converting MAD to ADD due to optimizationsIlia Mirkin2018-11-241-0/+3
| | | | | | | | | | dnz flag only applies for multiplications (e.g. to make 0 * Infinity becomes 0 instead of NaN). Once we optimize a MAD into an ADD, the dnz flag no longer makes sense, and upsets the GM107 emitter (since it looks at the ftz and dnz flags together). Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Karol Herbst <[email protected]>
* winsys/amdgpu: fix a device handle leak in amdgpu_winsys_createMarek Olšák2018-11-231-0/+6
| | | | | Cc: 18.2 18.3 <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* winsys/amdgpu: fix a buffer leak in amdgpu_bo_from_handleMarek Olšák2018-11-231-0/+6
| | | | | Cc: 18.2 18.3 <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* st/nine: Remove thread_submit warningAxel Davy2018-11-211-3/+0
| | | | | | | | thread_submit can be useful even without DRI_PRIME, as it can help avoid missed pageflips. Signed-off-by: Axel Davy <[email protected]> Tested-by: Andre Heider <[email protected]>
* st/nine: Allow 'triple buffering' with thread_submitAxel Davy2018-11-212-17/+50
| | | | | | | | The path allowing triple buffering behaviour wasn't implemented yet for thread_submit Signed-off-by: Axel Davy <[email protected]> Tested-by: Andre Heider <[email protected]>
* virgl: add assert and missing function parameterRobert Foss2018-11-211-1/+4
| | | | | | | | | | Verify the pipe_fd_type to be of PIPE_FD_TYPE_NATIVE_SYNC. Fixes: d1a1c21e7621b5177feb "virgl: native fence fd support" Suggested-by: Eric Engestrom <[email protected]> Signed-off-by: Robert Foss <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* r600: clean up the GS ring buffers when the context is destroyedGert Wollny2018-11-211-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes two memory leaks reported by ASAN: Direct leak of 248 byte(s) in 1 object(s) allocated from: in malloc (/usr/lib64/gcc/x86_64-pc-linux-gnu/7.3.0/libasan.so+0xdb880) in r600_alloc_buffer_struct ../../samba/mesa/src/gallium/drivers/r600/r600_buffer_common.c:578 in r600_buffer_create ../../samba/mesa/src/gallium/drivers/r600/r600_buffer_common.c:600 in r600_resource_create_common ../../samba/mesa/src/gallium/drivers/r600/r600_pipe_common.c:1265 in r600_resource_create ../../samba/mesa/src/gallium/drivers/r600/r600_pipe.c:725 in pipe_buffer_create ../../samba/mesa/src/gallium/auxiliary/util/u_inlines.h:291 in update_gs_block_state ../../samba/mesa/src/gallium/drivers/r600/r600_state_common.c:1482 Direct leak of 248 byte(s) in 1 object(s) allocated from: in malloc (/usr/lib64/gcc/x86_64-pc-linux-gnu/7.3.0/libasan.so+0xdb880) in r600_alloc_buffer_struct ../../samba/mesa/src/gallium/drivers/r600/r600_buffer_common.c:578 in r600_buffer_create ../../samba/mesa/src/gallium/drivers/r600/r600_buffer_common.c:600 in r600_resource_create_common ../../samba/mesa/src/gallium/drivers/r600/r600_pipe_common.c:1265 in r600_resource_create ../../samba/mesa/src/gallium/drivers/r600/r600_pipe.c:722 in pipe_buffer_create ../../samba/mesa/src/gallium/auxiliary/util/u_inlines.h:291 in update_gs_block_state ../../samba/mesa/src/gallium/drivers/r600/r600_state_common.c:1489 Signed-off-by: Gert Wollny <[email protected]> Fixes: 1371d65a7fbd695d3516861fe733685569d890d0 r600g: initial support for geometry shaders on evergreen (v2) Reviewed-by: Roland Scheidegger <[email protected]>
* radeonsi: go back to using bottom-of-pipe for beginning of TIME_ELAPSEDMarek Olšák2018-11-201-11/+4
| | | | | | | | Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102597 Cc: 18.3 <[email protected]> Tested-by: Dieter Nützel <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radeonsi: don't send data after write-confirm with BOTTOM_OF_PIPE_TSMarek Olšák2018-11-203-9/+5
| | | | | | | There are no writes. Tested-by: Dieter Nützel <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* st/mesa: pin driver threads to a fixed CCX when glthread is enabledMarek Olšák2018-11-202-56/+10
| | | | | | | | radeonsi has 3 driver threads (glthread, gallium, winsys), other drivers may have 2 (glthread, gallium), so it makes sense to pin them to a random CCX and keep that irrespective of the app thread. Reviewed-by: Dave Airlie <[email protected]>
* gallium/u_tests: fix MSVC build by using old-style zero initializersMarek Olšák2018-11-201-3/+3
|
* gallium/u_tests: add a compute shader test that clears an imageMarek Olšák2018-11-201-0/+77
|
* meson: Add tests to suitesDylan Baker2018-11-202-2/+4
| | | | | | | | | | | | | | | | Meson test has a concepts of suites, which allow tests to be grouped together. This allows for a subtest of tests to be run only (say only the tests for nir). A test can be added to more than one suite, but for the most part I've only added a test to a single suite, though I've added a compiler group that includes nir, glsl, and glcpp tests. To use this you'll need to invoke meson test directly, instead of ninja test (which always runs all targets). it can be invoked as: `meson test -C builddir --suite $suitename` (meson test has addition options that are pretty useful). Tested-By: Gert Wollny <[email protected]> Acked-by: Eric Engestrom <[email protected]>
* nir: Make nir_lower_clip_vs optionally work with variables.Kenneth Graunke2018-11-192-2/+3
| | | | | | | | | | | | The way nir_lower_clip_vs() works with store_output intrinsics makes a ton of assumptions about the driver_location field. In i965 and iris, I'd rather do this lowering early and work with variables. v3d may want to switch to that as well, and ir3 could too, but I'm not sure exactly what would need updating. For now, handle both methods. Reviewed-by: Eric Anholt <[email protected]>
* etnaviv: use dummy RT buffer when rendering without color bufferLucas Stach2018-11-193-2/+19
| | | | | | | | | | | | | | | At least GC2000 seems to push some dirt from the PE color cache into the last bound render target when drawing depth only. Newer cores seem to behave properly and don't do this, but I have found no way to fix it on GC2000. Flushes and stalls don't seem to make any difference. In order to stop the core from pushing the dirt into a precious real render target, plug in dummy buffer when rendering without a color buffer. Signed-off-by: Lucas Stach <[email protected]> Reviewed-by: Philipp Zabel <[email protected]>
* virgl: fix vtest regression since fencing changes.Dave Airlie2018-11-191-0/+1
| | | | | | | | The in_fence_fd needs to be initialised to -1. Fixes: d1a1c21e7 (virgl: native fence fd support) Reviewed-by: Robert Foss <[email protected]>
* radeonsi: fix an out-of-bounds read reported by ASANNicolai Hähnle2018-11-191-0/+4
| | | | | | | | We read 4 values out of sample_locs_8x, so make sure the array is big enough. Fixes: ac76aeef20 ("radeonsi: switch back to standard DX sample positions") Reviewed-by: Marek Olšák <[email protected]>
* r600: Only set context streamout strides info from the shader that has outputsGert Wollny2018-11-191-3/+9
| | | | | | | | | | | | | | | | | | | With 5d517a streamout info is only attached to the shader for which the transform feedback is actually recorded, but the driver set the context info with each state submitted, thereby always using the info data that was attached to the vertex shader. Pass the streamout stride info to the context only from the shader that actually has outputs. (Thanks to Marek Olšák for pointing me in the right direction) Fixes regresion with: dEQP-GLES31.functional.tessellation.invariance.* Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108734 Fixes: 5d517a599b1eabd1d5696bf31e26f16568d35770 st/mesa: Don't record garbage streamout information in the non-SSO case. Signed-off-by: Gert Wollny <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* virgl: Use file descriptor instead of un-allocated objectGert Wollny2018-11-191-1/+1
| | | | | | | | | | The structure qdws is not allocated at this point, nor is the file descriptor set to it's member. Use the fd directly instead. Fixes: d1a1c21e7621b5177febf191fcd3d3b8ef69dc96 virgl: native fence fd support Signed-off-by: Gert Wollny <[email protected]>
* virgl: Clean up fences commitRobert Foss2018-11-184-4/+1
| | | | | | | | Remove a dead variable, a int->bool conversion and some whitespace changes. Signed-off-by: Robert Foss <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* nv50/ir/ra: enforce max register requirement, and change spill orderIlia Mirkin2018-11-164-16/+26
| | | | | | | | | | | | | | | | | | On nv50, certain operations must happen on regs below 64, due to encoding requirements. First of all, we add infrastructure to enforce this. Secondly we change the spill order to first spill RIG nodes that are unconstrained, followed by ones that are. This makes the gamecube logo shadertoy compile properly. Curiously, if we adjust the spill order so that we first spill the constrained RIG nodes instead, the RA also succeeds. However it seems more logical to first spill the unconstrained ones. While we're at it, drop the nv50 max register to reserve r127 as the zero register of last resort (r63 is preferred). Signed-off-by: Ilia Mirkin <[email protected]> Acked-by: Karol Herbst <[email protected]>
* nv50/ir/ra: improve condition for short regs, unify with cond for 16-bitIlia Mirkin2018-11-161-7/+7
| | | | | | | | | | | | | | Instead of the size restriction existing in two places, and potentially being applied twice, we move this together. Ops with 16-bit register addresses can only take a short reg, and ops with immediates can only take a short reg. Of course we leave the immediate 0 in place since we know that it will be replaced by r63/r127 down the line, so don't treat zeroes as an immediate. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Karol Herbst <[email protected]>
* nv50/ir: delete MINMAX instruction that is no longer in the BBIlia Mirkin2018-11-161-1/+1
| | | | | | | | | We removed the op from the BB, but it was still listed in its sources' uses. This could trip up some logic down the line which analyzes all the uses of an l-value, e.g. spilling. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Karol Herbst <[email protected]>
* virgl: native fence fd supportRobert Foss2018-11-167-16/+166
| | | | | | | | | Following the support for fences on the virtio driver add support for native fence on virgl. This was somewhat based on the freedeno one. Signed-off-by: Gustavo Padovan <[email protected]> Signed-off-by: Robert Foss <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* vc4: Don't return a vc4 BO handle on a renderonly screen.Eric Anholt2018-11-151-2/+4
| | | | | | The handles exported need to be on the KMS device's fd, anything else is failure. Also, this code is assuming that the scanout resource has been created already, so assert it.
* vc4: Make sure we make ro scanout resources for create_with_modifiers.Eric Anholt2018-11-151-1/+9
| | | | | | | | | | The DRI3 create_with_modifiers paths don't set tmpl.bind to SCANOUT or SHARED, with the theory that given that you've got modifiers, that's all you need. However, we were looking at the tmpl.bind for setting up the KMS handle in the renderonly case, so we'd end up trying to use vc4's handle on the hx8357d fd. Fixes: 84ed8b67c56b ("vc4: Set shareable BOs as T tiled if possible")
* v3d: Fix double-swapping of R/B on V3D 4.1Eric Anholt2018-11-151-2/+3
| | | | Fixes: 4018eb04e8a5 ("v3d: Use the TLB R/B swapping instead of recompiles when available.")
* etnaviv: Make sure rs alignment checks matchGuido Günther2018-11-151-6/+13
| | | | | | | | | | | | | | | | | | etna_resource_alloc and etna_resource_from_handle currently use different checks. This leads to etna_resource_from_handle:492: target=2, format=PIPE_FORMAT_B8G8R8X8_UNORM, 1080x1920x1, array_size=1, last_level=0, nr_samples=0, usage=0, bind=8000a, flags=0 etna_resource_from_handle:541: BO stride 4320 is too small for RS engine width padding (4352, format PIPE_FORMAT_B8G8R8X8_UNORM) since etna_resource_from_handle wants to be aligned to a 16 byte boundary while the etna_resource_alloc does not. Adjust the two checks by using a common function. Broken by baff59ebf07a114f95ad66d1f54e4b1f409eebee Signed-off-by: Guido Günther <[email protected]> Signed-off-by: Lucas Stach <[email protected]>
* radeonsi: fix video APIs on Raven2Marek Olšák2018-11-142-4/+8
| | | | | | | | This was missed when I added the new enum. Cc: 18.3 <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Leo Liu <[email protected]>
* st/xa: Bump minorThomas Hellstrom2018-11-141-1/+1
| | | | | | | | Bump minor to signal support for new formats and higher precision solid pictures. Signed-off-by: Thomas Hellstrom <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* st/xa: Support Component Alpha with trivial blendingThomas Hellstrom2018-11-143-17/+35
| | | | | | | | | Support Component Alpha for those composite operations that do not require per-channel alpha blending. Signed-off-by: Thomas Hellstrom <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Sinclair Yeh <[email protected]>
* st/xa: Minor renderer cleanupsThomas Hellstrom2018-11-141-12/+12
| | | | | | | | | constify function arguments to clean up the code a bit. Reported-by: Brian Paul <[email protected]> Signed-off-by: Thomas Hellstrom <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Sinclair Yeh <[email protected]>
* st/xa: Fix transformations when we have both source and mask samplersThomas Hellstrom2018-11-141-68/+49
| | | | | | | | | In the case when we had both source and mask samplers, transformations were typically not applied correctly. Signed-off-by: Thomas Hellstrom <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Sinclair Yeh <[email protected]>
* st/xa: Support a couple of new formatsThomas Hellstrom2018-11-142-9/+29
| | | | | Signed-off-by: Thomas Hellstrom <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* st/xa: Support higher color precision for solid picturesThomas Hellstrom2018-11-142-26/+100
| | | | | | | | The only solid fill picture type we supported only had 8 bit color channels. Add a new solid picture type that supports float channels. Signed-off-by: Thomas Hellstrom <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* st/xa: Render update. Better support for solid picturesThomas Hellstrom2018-11-145-440/+225
| | | | | | | | | | | | Remove unused and obsolete code for gradients and component-alpha Support solid source- and mask pictures using a variable number of samplers in the composite pipeline rather than the fixed number we used before. Tested using rendercheck for XA. Signed-off-by: Thomas Hellstrom <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* virgl: Add command and flags to initiate debugging on the host (v2)Gert Wollny2018-11-135-0/+37
| | | | | | | | | | | | On the host VREND_DEBUG=guestallow must be set to let the guest override the debug flags. v2: Send flag string instead of flags, this avoids the need to keep the flags in sync. v3: Only request host logging if the host actually understands the command Signed-off-by: Gert Wollny <[email protected]> Reviewed-by: Erik Faye-Lund <[email protected]>
* freedreno/drm: fix unused 'entry' warningsRob Clark2018-11-121-2/+0
| | | | | | | Looks like importing libdrm_freedreno into mesa crossed paths with e27902a2613. Signed-off-by: Rob Clark <[email protected]>
* st/nine: clean up thead shutdown sequence a bitAndre Heider2018-11-091-4/+2
| | | | | | | Just break out of the loop instead, it does the same thing. Signed-off-by: Andre Heider <[email protected]> Reviewed-by: Axel Davy <[email protected]>
* st/nine: plug thread related leaksAndre Heider2018-11-092-0/+9
| | | | | Signed-off-by: Andre Heider <[email protected]> Reviewed-by: Axel Davy <[email protected]>
* st/nine: fix stack corruption due to ABI mismatchAndre Heider2018-11-091-1/+13
| | | | | | | | | | | | | | This fixes various crashes and hangs when using nine's 'thread_submit' feature. On 64bit, the thread function's data argument would just be NULL. On 32bit, the data argument would be garbage depending on the compiler flags (in my case -march>=core2). Fixes: f3fa7e3068512d ("st/nine: Use WINE thread for threadpool") Cc: [email protected] Signed-off-by: Andre Heider <[email protected]> Reviewed-by: Axel Davy <[email protected]>
* radeonsi: stop command submission with PIPE_CONTEXT_LOSE_CONTEXT_ON_RESET onlyMarek Olšák2018-11-0915-20/+27
| | | | Tested-by: Dieter Nützel <[email protected]>
* gallium: add PIPE_CONTEXT_LOSE_CONTEXT_ON_RESETMarek Olšák2018-11-091-0/+3
| | | | Tested-by: Dieter Nützel <[email protected]>
* radeonsi: don't set the CB clear color registers for 0/1 clear colors on Raven2Marek Olšák2018-11-094-3/+11
| | | | and add has_dcc_constant_encode.
* radeonsi: use better DCC clear codesMarek Olšák2018-11-091-5/+21
| | | | Tested-by: Dieter Nützel <[email protected]>
* gallivm: fix improper clamping of vertex index when fetching gs inputsRoland Scheidegger2018-11-091-10/+31
| | | | | | | | | | | | | | | | | | | | Because we only have one file_max for the (2d) gs input file, the value actually represents the max of attrib and vertex index (although I'm not entirely sure if we really want the max, since the max valid value of the vertex dimension can be easily deduced from the input primitive). Thus in cases where the number of inputs is higher than the number of vertices per prim, we did not properly clamp the vertex index, which would result in out-of-bound fetches, potentially causing segfaults (the segfaults seemed actually difficult to trigger, but valgrind certainly wasn't happy). This might have happened even if the shader did not actually try to fetch bogus vertices, if the fetching happened in non-active conditional clauses. To fix simply use the correct max vertex index value (derived from the input prim type) instead when clamping for this case. Reviewed-by: Jose Fonseca <[email protected]>
* gm107/ir: fix compile time warning in getTEXSMaskKarol Herbst2018-11-071-0/+1
| | | | | | | | | | | In function 'uint8_t nv50_ir::getTEXSMask(uint8_t)': warning: control reaches end of non-void function [-Wreturn-type] Reported-by: Moiman@freenode Fixes: f821e80213e38e93f96255b3deacb737a600ed40 "gm107/ir: use scalar tex instructions where possible" Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* winsys/amdgpu: Stop using amdgpu_bo_handle_type_kms_noimportMichel Dänzer2018-11-071-3/+3
| | | | | | | | | It only behaves any different from amdgpu_bo_handle_type_kms with libdrm 2.4.93, and it breaks if an older version is picked up. Bugzilla: https://bugs.freedesktop.org/108096 Reviewed-by: Marek Olšák <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* gm107/ir: use scalar tex instructions where possibleKarol Herbst2018-11-062-3/+317
| | | | | | | | | | | | | | | | | | | TEXS, TLD4 and TLD4S are variants of tex instructions which are more scalar, which gives RA more freedom and is less likely to insert silly MOVs to satisfy quad registers. shader-db changes: total instructions in shared programs : 7687265 -> 7614782 (-0.94%) total gprs used in shared programs : 803620 -> 798045 (-0.69%) total shared used in shared programs : 639636 -> 639636 (0.00%) total local used in shared programs : 24648 -> 24648 (0.00%) total bytes used in shared programs : 82103400 -> 81330696 (-0.94%) local shared gpr inst bytes helped 0 0 3648 10647 10647 hurt 0 0 464 205 205 Reviewed-by: Ilia Mirkin <[email protected]>