aboutsummaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers
Commit message (Collapse)AuthorAgeFilesLines
* freedreno: shader_t -> gl_shader_stageRob Clark2018-11-2722-143/+121
| | | | | | | | | Just massive search/replace for the most part. Step towards removing ir3 dependency on disasm.h which is shared by a2xx. One step closer to being able to move ir3 out of gallium. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: standalone compiler updatesRob Clark2018-11-271-6/+27
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno: move drm to common locationRob Clark2018-11-2720-3717/+12
| | | | | | | | So that we can re-use at least parts of it for vulkan driver, and so that we can move ir3 to a common location (which uses fd_bo to allocate storage for shaders) Signed-off-by: Rob Clark <[email protected]>
* freedreno/drm: remove dependency on gallium driverRob Clark2018-11-271-2/+11
| | | | | | | | Prep work to move drm to a common location. Slightly hacky, but the softpin debug flag is only temporary. Signed-off-by: Rob Clark <[email protected]>
* nv50/ir: remove dnz flag when converting MAD to ADD due to optimizationsIlia Mirkin2018-11-241-0/+3
| | | | | | | | | | dnz flag only applies for multiplications (e.g. to make 0 * Infinity becomes 0 instead of NaN). Once we optimize a MAD into an ADD, the dnz flag no longer makes sense, and upsets the GM107 emitter (since it looks at the ftz and dnz flags together). Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Karol Herbst <[email protected]>
* virgl: add assert and missing function parameterRobert Foss2018-11-211-1/+4
| | | | | | | | | | Verify the pipe_fd_type to be of PIPE_FD_TYPE_NATIVE_SYNC. Fixes: d1a1c21e7621b5177feb "virgl: native fence fd support" Suggested-by: Eric Engestrom <[email protected]> Signed-off-by: Robert Foss <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* r600: clean up the GS ring buffers when the context is destroyedGert Wollny2018-11-211-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes two memory leaks reported by ASAN: Direct leak of 248 byte(s) in 1 object(s) allocated from: in malloc (/usr/lib64/gcc/x86_64-pc-linux-gnu/7.3.0/libasan.so+0xdb880) in r600_alloc_buffer_struct ../../samba/mesa/src/gallium/drivers/r600/r600_buffer_common.c:578 in r600_buffer_create ../../samba/mesa/src/gallium/drivers/r600/r600_buffer_common.c:600 in r600_resource_create_common ../../samba/mesa/src/gallium/drivers/r600/r600_pipe_common.c:1265 in r600_resource_create ../../samba/mesa/src/gallium/drivers/r600/r600_pipe.c:725 in pipe_buffer_create ../../samba/mesa/src/gallium/auxiliary/util/u_inlines.h:291 in update_gs_block_state ../../samba/mesa/src/gallium/drivers/r600/r600_state_common.c:1482 Direct leak of 248 byte(s) in 1 object(s) allocated from: in malloc (/usr/lib64/gcc/x86_64-pc-linux-gnu/7.3.0/libasan.so+0xdb880) in r600_alloc_buffer_struct ../../samba/mesa/src/gallium/drivers/r600/r600_buffer_common.c:578 in r600_buffer_create ../../samba/mesa/src/gallium/drivers/r600/r600_buffer_common.c:600 in r600_resource_create_common ../../samba/mesa/src/gallium/drivers/r600/r600_pipe_common.c:1265 in r600_resource_create ../../samba/mesa/src/gallium/drivers/r600/r600_pipe.c:722 in pipe_buffer_create ../../samba/mesa/src/gallium/auxiliary/util/u_inlines.h:291 in update_gs_block_state ../../samba/mesa/src/gallium/drivers/r600/r600_state_common.c:1489 Signed-off-by: Gert Wollny <[email protected]> Fixes: 1371d65a7fbd695d3516861fe733685569d890d0 r600g: initial support for geometry shaders on evergreen (v2) Reviewed-by: Roland Scheidegger <[email protected]>
* radeonsi: go back to using bottom-of-pipe for beginning of TIME_ELAPSEDMarek Olšák2018-11-201-11/+4
| | | | | | | | Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102597 Cc: 18.3 <[email protected]> Tested-by: Dieter Nützel <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radeonsi: don't send data after write-confirm with BOTTOM_OF_PIPE_TSMarek Olšák2018-11-203-9/+5
| | | | | | | There are no writes. Tested-by: Dieter Nützel <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* meson: Add tests to suitesDylan Baker2018-11-202-2/+4
| | | | | | | | | | | | | | | | Meson test has a concepts of suites, which allow tests to be grouped together. This allows for a subtest of tests to be run only (say only the tests for nir). A test can be added to more than one suite, but for the most part I've only added a test to a single suite, though I've added a compiler group that includes nir, glsl, and glcpp tests. To use this you'll need to invoke meson test directly, instead of ninja test (which always runs all targets). it can be invoked as: `meson test -C builddir --suite $suitename` (meson test has addition options that are pretty useful). Tested-By: Gert Wollny <[email protected]> Acked-by: Eric Engestrom <[email protected]>
* nir: Make nir_lower_clip_vs optionally work with variables.Kenneth Graunke2018-11-192-2/+3
| | | | | | | | | | | | The way nir_lower_clip_vs() works with store_output intrinsics makes a ton of assumptions about the driver_location field. In i965 and iris, I'd rather do this lowering early and work with variables. v3d may want to switch to that as well, and ir3 could too, but I'm not sure exactly what would need updating. For now, handle both methods. Reviewed-by: Eric Anholt <[email protected]>
* etnaviv: use dummy RT buffer when rendering without color bufferLucas Stach2018-11-193-2/+19
| | | | | | | | | | | | | | | At least GC2000 seems to push some dirt from the PE color cache into the last bound render target when drawing depth only. Newer cores seem to behave properly and don't do this, but I have found no way to fix it on GC2000. Flushes and stalls don't seem to make any difference. In order to stop the core from pushing the dirt into a precious real render target, plug in dummy buffer when rendering without a color buffer. Signed-off-by: Lucas Stach <[email protected]> Reviewed-by: Philipp Zabel <[email protected]>
* radeonsi: fix an out-of-bounds read reported by ASANNicolai Hähnle2018-11-191-0/+4
| | | | | | | | We read 4 values out of sample_locs_8x, so make sure the array is big enough. Fixes: ac76aeef20 ("radeonsi: switch back to standard DX sample positions") Reviewed-by: Marek Olšák <[email protected]>
* r600: Only set context streamout strides info from the shader that has outputsGert Wollny2018-11-191-3/+9
| | | | | | | | | | | | | | | | | | | With 5d517a streamout info is only attached to the shader for which the transform feedback is actually recorded, but the driver set the context info with each state submitted, thereby always using the info data that was attached to the vertex shader. Pass the streamout stride info to the context only from the shader that actually has outputs. (Thanks to Marek Olšák for pointing me in the right direction) Fixes regresion with: dEQP-GLES31.functional.tessellation.invariance.* Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108734 Fixes: 5d517a599b1eabd1d5696bf31e26f16568d35770 st/mesa: Don't record garbage streamout information in the non-SSO case. Signed-off-by: Gert Wollny <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* virgl: Clean up fences commitRobert Foss2018-11-181-1/+1
| | | | | | | | Remove a dead variable, a int->bool conversion and some whitespace changes. Signed-off-by: Robert Foss <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* nv50/ir/ra: enforce max register requirement, and change spill orderIlia Mirkin2018-11-164-16/+26
| | | | | | | | | | | | | | | | | | On nv50, certain operations must happen on regs below 64, due to encoding requirements. First of all, we add infrastructure to enforce this. Secondly we change the spill order to first spill RIG nodes that are unconstrained, followed by ones that are. This makes the gamecube logo shadertoy compile properly. Curiously, if we adjust the spill order so that we first spill the constrained RIG nodes instead, the RA also succeeds. However it seems more logical to first spill the unconstrained ones. While we're at it, drop the nv50 max register to reserve r127 as the zero register of last resort (r63 is preferred). Signed-off-by: Ilia Mirkin <[email protected]> Acked-by: Karol Herbst <[email protected]>
* nv50/ir/ra: improve condition for short regs, unify with cond for 16-bitIlia Mirkin2018-11-161-7/+7
| | | | | | | | | | | | | | Instead of the size restriction existing in two places, and potentially being applied twice, we move this together. Ops with 16-bit register addresses can only take a short reg, and ops with immediates can only take a short reg. Of course we leave the immediate 0 in place since we know that it will be replaced by r63/r127 down the line, so don't treat zeroes as an immediate. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Karol Herbst <[email protected]>
* nv50/ir: delete MINMAX instruction that is no longer in the BBIlia Mirkin2018-11-161-1/+1
| | | | | | | | | We removed the op from the BB, but it was still listed in its sources' uses. This could trip up some logic down the line which analyzes all the uses of an l-value, e.g. spilling. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Karol Herbst <[email protected]>
* virgl: native fence fd supportRobert Foss2018-11-163-10/+62
| | | | | | | | | Following the support for fences on the virtio driver add support for native fence on virgl. This was somewhat based on the freedeno one. Signed-off-by: Gustavo Padovan <[email protected]> Signed-off-by: Robert Foss <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* vc4: Don't return a vc4 BO handle on a renderonly screen.Eric Anholt2018-11-151-2/+4
| | | | | | The handles exported need to be on the KMS device's fd, anything else is failure. Also, this code is assuming that the scanout resource has been created already, so assert it.
* vc4: Make sure we make ro scanout resources for create_with_modifiers.Eric Anholt2018-11-151-1/+9
| | | | | | | | | | The DRI3 create_with_modifiers paths don't set tmpl.bind to SCANOUT or SHARED, with the theory that given that you've got modifiers, that's all you need. However, we were looking at the tmpl.bind for setting up the KMS handle in the renderonly case, so we'd end up trying to use vc4's handle on the hx8357d fd. Fixes: 84ed8b67c56b ("vc4: Set shareable BOs as T tiled if possible")
* v3d: Fix double-swapping of R/B on V3D 4.1Eric Anholt2018-11-151-2/+3
| | | | Fixes: 4018eb04e8a5 ("v3d: Use the TLB R/B swapping instead of recompiles when available.")
* etnaviv: Make sure rs alignment checks matchGuido Günther2018-11-151-6/+13
| | | | | | | | | | | | | | | | | | etna_resource_alloc and etna_resource_from_handle currently use different checks. This leads to etna_resource_from_handle:492: target=2, format=PIPE_FORMAT_B8G8R8X8_UNORM, 1080x1920x1, array_size=1, last_level=0, nr_samples=0, usage=0, bind=8000a, flags=0 etna_resource_from_handle:541: BO stride 4320 is too small for RS engine width padding (4352, format PIPE_FORMAT_B8G8R8X8_UNORM) since etna_resource_from_handle wants to be aligned to a 16 byte boundary while the etna_resource_alloc does not. Adjust the two checks by using a common function. Broken by baff59ebf07a114f95ad66d1f54e4b1f409eebee Signed-off-by: Guido Günther <[email protected]> Signed-off-by: Lucas Stach <[email protected]>
* radeonsi: fix video APIs on Raven2Marek Olšák2018-11-142-4/+8
| | | | | | | | This was missed when I added the new enum. Cc: 18.3 <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Leo Liu <[email protected]>
* virgl: Add command and flags to initiate debugging on the host (v2)Gert Wollny2018-11-135-0/+37
| | | | | | | | | | | | On the host VREND_DEBUG=guestallow must be set to let the guest override the debug flags. v2: Send flag string instead of flags, this avoids the need to keep the flags in sync. v3: Only request host logging if the host actually understands the command Signed-off-by: Gert Wollny <[email protected]> Reviewed-by: Erik Faye-Lund <[email protected]>
* freedreno/drm: fix unused 'entry' warningsRob Clark2018-11-121-2/+0
| | | | | | | Looks like importing libdrm_freedreno into mesa crossed paths with e27902a2613. Signed-off-by: Rob Clark <[email protected]>
* radeonsi: stop command submission with PIPE_CONTEXT_LOSE_CONTEXT_ON_RESET onlyMarek Olšák2018-11-0912-17/+20
| | | | Tested-by: Dieter Nützel <[email protected]>
* radeonsi: don't set the CB clear color registers for 0/1 clear colors on Raven2Marek Olšák2018-11-094-3/+11
| | | | and add has_dcc_constant_encode.
* radeonsi: use better DCC clear codesMarek Olšák2018-11-091-5/+21
| | | | Tested-by: Dieter Nützel <[email protected]>
* gm107/ir: fix compile time warning in getTEXSMaskKarol Herbst2018-11-071-0/+1
| | | | | | | | | | | In function 'uint8_t nv50_ir::getTEXSMask(uint8_t)': warning: control reaches end of non-void function [-Wreturn-type] Reported-by: Moiman@freenode Fixes: f821e80213e38e93f96255b3deacb737a600ed40 "gm107/ir: use scalar tex instructions where possible" Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* gm107/ir: use scalar tex instructions where possibleKarol Herbst2018-11-062-3/+317
| | | | | | | | | | | | | | | | | | | TEXS, TLD4 and TLD4S are variants of tex instructions which are more scalar, which gives RA more freedom and is less likely to insert silly MOVs to satisfy quad registers. shader-db changes: total instructions in shared programs : 7687265 -> 7614782 (-0.94%) total gprs used in shared programs : 803620 -> 798045 (-0.69%) total shared used in shared programs : 639636 -> 639636 (0.00%) total local used in shared programs : 24648 -> 24648 (0.00%) total bytes used in shared programs : 82103400 -> 81330696 (-0.94%) local shared gpr inst bytes helped 0 0 3648 10647 10647 hurt 0 0 464 205 205 Reviewed-by: Ilia Mirkin <[email protected]>
* nv50/ir: add scalar field to TexInstructionsKarol Herbst2018-11-062-1/+6
| | | | Reviewed-by: Ilia Mirkin <[email protected]>
* nv50/ra: add condenseDef overloads for partial condensesKarol Herbst2018-11-061-8/+21
| | | | Reviewed-by: Ilia Mirkin <[email protected]>
* nv50/ir: print color masks of tex instructionsKarol Herbst2018-11-061-4/+33
| | | | | | | v2: print the mask for TXG as well make the mask to be printed more mask like Reviewed-by: Ilia Mirkin <[email protected]>
* r600: Add support for EXT_texture_sRGB_R8Gert Wollny2018-11-061-0/+1
| | | | | | | | | | | Enables on R600 and makes pass: dEQP-GLES31.functional.srgb_texture_decode.skip_decode.sr8.* dEQP-GLES31.functional.texture.filtering.cube_array.formats.sr8* v2: remove chunk for dri/radeon (Emil) Signed-off-by: Gert Wollny <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* freedreno/a6xx: Clear z32 and separate stencil with blitterKristian H. Kristensen2018-11-062-27/+50
| | | | Signed-off-by: Kristian H. Kristensen <[email protected]>
* freedreno/a6xx: fix VSC bug with larger # of tilesRob Clark2018-11-061-5/+2
| | | | | | | | | | At higher resolutions with the addition of MSAA, the number of tiles can increase to the point where we use more than one VSC pipe per tile. Which would cause us to calculate an out-of-bounds offset for VSC_SIZE_ADDRESS. So don't try to be clever, just always put it at a fixed offset assuming the max 32 VSC pipes in use. Signed-off-by: Rob Clark <[email protected]>
* freedreno: update generated headersRob Clark2018-11-067-29/+51
| | | | Signed-off-by: Rob Clark <[email protected]>
* r600/sb: Fix constant logical operand in assert.Vinson Lee2018-11-041-1/+1
| | | | | | Fixes: da977ad90747 ("r600/sb: start adding GDS support") Signed-off-by: Vinson Lee <[email protected]> Reviewed-By: Gert Wollny <[email protected]>
* vc4: Use the normal simulator ioctl path for CL submit as well.Eric Anholt2018-11-023-13/+5
| | | | The simulator no longer needs to look back into the gallium structs.
* vc4: Maintain a separate GEM mapping of BOs in the simulator.Eric Anholt2018-11-022-42/+58
| | | | This will let us avoid looking back into the gallium driver's vc4_bo.
* vc4: Take advantage of _mesa_hash_table_remove_key() in the simulator.Eric Anholt2018-11-021-4/+2
|
* v3d: Remove the special path for simulaton of the submit ioctl.Eric Anholt2018-11-025-19/+13
| | | | | Now that it doesn't need to find the struct v3d_bos, it can just take the normal v3d_ioctl() path.
* v3d: Maintain a mapping of the GEM buffer in the simulator.Eric Anholt2018-11-021-23/+48
| | | | | This way we don't need to reach back into the gallium driver code to get the mapping.
* Gallium: Add format PIPE_FORMAT_R8_SRGBGert Wollny2018-11-021-0/+1
| | | | | | | | | | | | | | | | | | | | | | | This format is needed to support EXT_texture_sRGB_R8. THe patch adds a new format enum, the format entries in Gallium and and svga, the mapping between sRGB and linear formats, and tests. v2: - add mapping to linear format for PIPE_FORMATR_R8_SRGB v3: - Add texture format to svga format table since otherwise building mesa will fail when this driver is enabled. It was not tested whether the extension actually works. v4: - svga: remove the SVGA specific format definitions and table entries and only add correct the location of PIPE_FORMAT_R8_SRGB in the format_conversion_table (Ilia Mirkin) - Split patch (1/2) to separate Gallium part and mesa/st part. (Roland Scheidegger) - Trim the commit message to only contain the relevant parts from the split. v5: - svga: correct location of PIPE_FORMAT_SRGB_R8 (Ilia Mirkin) Signed-off-by: Gert Wollny <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* vc4: Drop the winsys_stride relayout in the simluatorEric Anholt2018-11-015-95/+12
| | | | | | | Since 0c1dd9dee0da ("broadcom/vc4: Allow importing linear BOs with arbitrary offset/stride."), we have the vc4-side BO properly laid out (assuming it's linear) in the winsys BO so that we can skip this extra copy.
* v3d: Use the TLB R/B swapping instead of recompiles when available.Eric Anholt2018-11-014-3/+17
| | | | | | The recompile reduction is nice, but this also makes it so that a straight texture copy could get optimized some day to not unpack/repack the f16 values.
* v3d: Take advantage of _mesa_hash_table_remove_key() in the simulator.Eric Anholt2018-11-011-4/+2
|
* v3d: Respect user-passed strides for BO imports.Eric Anholt2018-11-015-96/+19
| | | | | | | If the caller has passed in a stride for (linear) BO import, we should use that stride when rendering to the BO (or, if we some day support texturing from linear-imported BOs, when doing the linear-to-UIF shadow copy). This lets us remove the extra stride-changing relayout in the simulator.
* v3d: Drop #if 0-ed out v3d_dump_to_file().Eric Anholt2018-11-011-91/+0
| | | | | | This came from vc4, where we had a file format for GPU hangs. I don't have one of those for V3D, and I probably won't ever have the simulator side produce dumps even if I do.