aboutsummaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers
Commit message (Collapse)AuthorAgeFilesLines
* freedreno: remove unnecessary null checksRob Clark2015-10-244-13/+13
| | | | | | | | According to piglit/xonotic/neverball/stc, blend/rasterize/zsa state will always be bound (never null). And the null checks were in- consistent anyways, so remove them. Signed-off-by: Rob Clark <[email protected]>
* radeonsi: Implement DCC fast clear.Bas Nieuwenhuizen2015-10-243-14/+100
| | | | | | | | | | | Uses the DCC buffer instead of the CMASK buffer. The ELIMINATE_FAST_CLEAR still works. Furthermore, with DCC compression we can directly clear to a limited set of colors such that we do not need a postprocessing step. v2 Marek: check dcc_buffer && dirty_level_mask in set_sampler_view Signed-off-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* softpipe: fix using non-zero layer in non-array view from array resourceRoland Scheidegger2015-10-242-9/+31
| | | | | | | | | | | | For vertex/geometry shader sampling, this is the same as for llvmpipe - just use the original resource target. For fragment shader sampling though (which does not use first-layer based mip offsets) adjust the sampling code to use first_layer in the non-array cases. While here also fix up some code which looked wrong wrt buffer texel fetch (no piglit change). Reviewed-by: Dave Airlie <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* llvmpipe: fix using non-zero layer in non-array view from array resourceRoland Scheidegger2015-10-242-8/+8
| | | | | | | | | | | Just need to use resource target not view target when calculating first-layer based mip offsets. (This is a gl specific problem since d3d10 does not distinguish between non-array and array resources neither at the resource nor view level, only at the shader level.) Fixes new piglit arb_texture_view sampling-2d-array-as-2d-layer test. Reviewed-by: Dave Airlie <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* radeonsi: add Stoney to si_init_gs_info()Alex Deucher2015-10-231-0/+1
| | | | | | | | This patch was originally written before stoney support was merged. Add stoney. Reviewed-by: Marek Olšák <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* radeonsi: Enable DCC.Bas Nieuwenhuizen2015-10-246-6/+50
| | | | | Signed-off-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* radeonsi: Add FLUSH_AND_INV_CB_DATA_TS for DCC.Bas Nieuwenhuizen2015-10-241-0/+11
| | | | | Signed-off-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* radeonsi: Disable operations that do not work with DCC.Bas Nieuwenhuizen2015-10-244-3/+11
| | | | | Signed-off-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* radeonsi: Allocate buffers for DCC.Bas Nieuwenhuizen2015-10-244-0/+50
| | | | | | | | | | | As the alignment requirements can be 32 KiB or more, also adding an aligned buffer creation function. DCC is disabled for textures that can be shared as sharing the DCC buffers has not been implemented yet. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* radeonsi: only apply the SNORM blit workaround to *8_SNORMMarek Olšák2015-10-241-1/+1
| | | | | | | Like the comment says. This fixes DCC, which doesn't like blitting RG16 as RGBA8. Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: add another requirement for PARTIAL_ES_WAVEMarek Olšák2015-10-244-2/+35
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: merge two ifs setting WD_SWITCH_ON_EOPMarek Olšák2015-10-241-5/+2
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: make PARTIAL_ES_WAVE globally dependent on SWITCH_ON_EOIMarek Olšák2015-10-241-5/+6
| | | | | | This catches the other cases that enable SWITCH_ON_EOI. Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: add one more SWITCH_ON_EOI requirement for Hawaii and VIMarek Olšák2015-10-241-1/+10
| | | | | | The VI condition depends on geometry shaders and MAX_PRIMGRP_IN_WAVE. Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: only apply the instancing bug workaround to BonaireMarek Olšák2015-10-241-5/+5
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: add SWITCH_ON_EOI requirement for 4 SE partsMarek Olšák2015-10-241-0/+4
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: remove unnecessary PARTIAL_VS_WAVE setting for streamoutMarek Olšák2015-10-241-4/+0
| | | | | | hardware does this automatically Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: allow unbinding vertex shadersMarek Olšák2015-10-241-2/+2
| | | | | | Draw calls without a vertex shader are skipped. Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: allow unbinding pixel shaders and remove the dummy shaderMarek Olšák2015-10-243-22/+1
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: add draw_vbo check for a NULL pixel shaderMarek Olšák2015-10-243-1/+8
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: add checks for a NULL pixel shaderMarek Olšák2015-10-242-32/+42
| | | | | | This will allow removing the dummy PS. Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: add support for Stoney asics (v3)Samuel Li2015-10-233-0/+8
| | | | | | | | | | | v2 (agd): rebase on mesa master, split pci ids to separate commit v3 (agd): use carrizo for llvm processor name for llvm 3.7 and older Reviewed-by: Marek Olšák <[email protected]> Signed-off-by: Samuel Li <[email protected]> Cc: [email protected]
* nvc0: respect edgeflag attribute widthIlia Mirkin2015-10-231-7/+33
| | | | | | | | | | | | The edgeflag comes in as ubyte with glEdgeFlagPointer but as float with plain immediate glEdgeFlag. Avoid reading bytes that weren't meant for the edgeflag in the pointer case. Fixes intermittent failures with gl-2.0-edgeflag piglit (and valgrind complaints about reading uninitialized memory). Signed-off-by: Ilia Mirkin <[email protected]> Cc: [email protected]
* vc4: Convert blending to being done in 4x8 unorm normally.Eric Anholt2015-10-235-51/+276
| | | | | | | | | | | | | We can't do this all the time, because you want blending to be done in linear space, and sRGB would lose too much precision being done in 4x8. The win on instructions is pretty huge when you can, though. total uniforms in shared programs: 32065 -> 32168 (0.32%) uniforms in affected programs: 327 -> 430 (31.50%) total instructions in shared programs: 92644 -> 89830 (-3.04%) instructions in affected programs: 15580 -> 12766 (-18.06%) Improves openarena performance at 1920x1080 from 10.7fps to 11.2fps.
* vc4: Add QIR/QPU support for the 8-bit vector instructions.Eric Anholt2015-10-234-0/+45
|
* vc4: Don't try to CSE non-SSA instructions.Eric Anholt2015-10-231-0/+1
| | | | | | | This can happen when we're doing destination packing -- we don't know what's in the rest of the register. Signed-off-by: Eric Anholt <[email protected]>
* vc4: Add dumping of VC4_PACKET_GL_INDEXED_PRIMITIVE.Eric Anholt2015-10-231-1/+22
|
* vc4: Add a workaround for HW-2116 (state counter wrap fails).Eric Anholt2015-10-233-6/+40
| | | | | | I haven't proven that this happens (I've got other GPU hangs in the way), but the closed driver also does this and it's documented as an errata.
* vc4: Fix missing \n in a perf_debug().Eric Anholt2015-10-231-1/+1
|
* vc4: Use Rob's NIR-based user clip lowering.Eric Anholt2015-10-234-69/+14
|
* vc4: Also dump the decimation mode for resolved stores.Eric Anholt2015-10-231-2/+4
|
* vc4: Use VC4_GET_FIELD and other defines in dumping VC4_RENDER_CONFIG.Eric Anholt2015-10-231-10/+10
|
* vc4: Add a sentinel after simulator buffers for buffer overflow detection.Eric Anholt2015-10-231-1/+11
| | | | | | | | | This is a little bit like the mprotect-based fencing I've experimented with, but it's simple and low overhead. The downside is that only catches writes, not reads. It didn't catch any bad writes on a current piglit run, but may be useful in the future.
* ilo: add support for scratch spacesChia-I Wu2015-10-2310-16/+133
| | | | | When a kernel reports a non-zero per-thread scratch space size, make sure the hardware state is correctly set up, and a scratch bo is allocated.
* ilo: fix scratch space setup in coreChia-I Wu2015-10-2311-133/+327
| | | | | | Move scratch_size out of ilo_state_shader_kernel_info and ilo_state_compute_interface_info. A scratch space is shared by all kernels/interfaces. Update builder to emit relocs for scratch bos.
* virgl/vtest: add vtest driverDave Airlie2015-10-231-1/+2
| | | | | | | | | | | | | | | | | | | | | virgl/vtest is a swrast driver that allows the virgl acceleration to be tested without having a virtual machine. The backend has a unix socket server that this connects to. This is run by setting LIBGL_ALWAYS_SOFTWARE=y GALLIUM_DRIVER=virpipe In this mode all renderering is sent over a socket to the remote renderer, and the results are readback and copies to the screen using drisw. This works well enough to develop new features and to help debug. Signed-off-by: Dave Airlie <[email protected]>
* virgl: add driver for virtio-gpu 3D (v2)Dave Airlie2015-10-2318-0/+4492
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | virgl is the 3D acceleration backend for the virtio-gpu shipping with qemu. The 3D acceleration is designed around gallium and TGSI as the virtualisation layer. The backend renderer translates the virgl interface into OpenGL currently. This is the initial import of the driver to mesa. The kernel driver portions are lined up for drm-next. Currently this driver supports up to GL3.3 and some misc extensions if the host driver exposes it. It is planned to iterate the virgl API to new GL levels as mesa host drivers gain features. v2: fix resource tracking across flushes to avoid ->bind hack in mapping. consolidate mapping and waiting code for transfers. use u_range for dirt tracking. handle larger shaders in protocol. include virtgpu_drm.h in mesa for now. add translation layer for gallium tgsi to virgl tgsi. Signed-off-by: Dave Airlie <[email protected]>
* svga: Condition preemptive flush on draw emissionSinclair Yeh2015-10-223-0/+15
| | | | | | | | | | | | | | On ultra high resolution modes, the preemptive flush flag can be set midway through command submission, a condition that cannot be recovered from a flush-retry, causing rendering artifacts. This patch prevents a preemtive_flush until a draw has been emitted. Signed-off-by: Sinclair Yeh <[email protected]> Reviewed-by: Thomas Hellstrom <[email protected]> Reviewed-by: Charmaine Lee <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* svga: try to avoid index generation for some primitive typesBrian Paul2015-10-221-0/+14
| | | | | | | | | | The svga device doesn't directly support quads, quad strips or polygons so we have to convert those types to indexed triangle lists. But we can sometimes avoid that if we're drawing flat/constant-colored prims and we don't have to worry about provoking vertex. Reviewed-by: Charmaine Lee <[email protected]> Reviewed-by: José Fonseca <[email protected]>
* svga: avoid provoking vertex conversion when possibleBrian Paul2015-10-221-1/+14
| | | | | | | | | | | | Provoking vertex comes into play when doing flat shading. But if we know that all fragments in a primitive are the same color, the provoking vertex doesn't matter. Check for that case and use whichever provoking vertex convention is supported by the device. This avoids generating an index buffer to do the PV conversion. Reviewed-by: Charmaine Lee <[email protected]> Reviewed-by: José Fonseca <[email protected]>
* svga: detect constant color writes in fragment shadersBrian Paul2015-10-225-2/+77
| | | | | | | | | | | | | | | | | | Examine the fragment shader to try to detect TGSI shaders which use "MOV OUT[0], CONST[i]" to write a constant value for the fragment color. In this case, all fragments will have the same color (unless blending is enabled). This is a common case for OpenGL code such as: glColor(), glBegin(), glVertex(), ..., glEnd() when lighting/fog/etc are disabled. In this case, the Mesa/gallium state tracker actually generates a simple "MOV OUT[0], CONST[i]" fragment shader. This will be used by the next commit to avoid provoking vertex conversion (creating/rewriting an index buffer) when drawing flat-shaded primitives. Reviewed-by: Charmaine Lee <[email protected]> Reviewed-by: José Fonseca <[email protected]>
* radeon/uvd: don't expose HEVC on old UVD hw (v3)Alex Deucher2015-10-221-32/+18
| | | | | | | | | | | | | | | | The section for UVD 2 and older was not updated when HEVC support was added. Reported by Kano on irc. v2: integrate the UVD2 and older checks into the main switch statement. v3: handle encode checking as well. Encode is already checked in the top case statement, so drop encode checks in the lower case statement. Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
* ilo: make sure there is HiZ before resolvingChia-I Wu2015-10-221-2/+4
| | | | We do not want to perform a depth resolve on an MCS enabled surface.
* ilo: fix max thread count for HS on Gen8Chia-I Wu2015-10-221-3/+5
| | | | It is in DW2 on Gen8.
* svga: fix clip plane regression after recent tgsi_scan changeBrian Paul2015-10-211-2/+2
| | | | | | | | | Before the change "tgsi/scan: use properties for clip/cull distance writemasks", the tgsi_shader_info::num_written_clipdistance field was a multiple of four, now it's an accurate count. In the svga driver, we need a minor change to the loop test. Reviewed-by: Charmaine Lee <[email protected]>
* svga: add switch case for PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINTBrian Paul2015-10-201-0/+2
| | | | | | | | A third instance of this was needed but missed in the previous commit. Return 32 as for the two other cases. Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Charmaine Lee <[email protected]>
* gallium: add PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINTMarek Olšák2015-10-2011-0/+32
| | | | | | | | | | | | | | This avoids a serious r600g bug leading to a GPU hang. The chances this bug will get fixed are pretty low now. I deeply regret listening to others and not pushing this patch, leaving other users with a GPU-crashing driver. Yes, it should be fixed in the compiler and it's ugly, but users couldn't care less about that. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86720 Cc: 11.0 10.6 <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* vc4: Switch our vertex attr lowering to being NIR-based.Eric Anholt2015-10-202-143/+200
| | | | | | | | | | This exposes more information to NIR's optimization, and should be particularly useful when we do range-based optimization. total uniforms in shared programs: 32066 -> 32065 (-0.00%) uniforms in affected programs: 21 -> 20 (-4.76%) total instructions in shared programs: 93104 -> 92630 (-0.51%) instructions in affected programs: 31901 -> 31427 (-1.49%)
* vc4: Add limited support for ibfe/ubfe.Eric Anholt2015-10-201-0/+42
| | | | | This is just enough to cover our unpack modes, which will be used by some new NIR-based lowering in the next commit.
* radeonsi: enable BC_OPTIMIZE if centroid isn't usedMarek Olšák2015-10-201-1/+5
| | | | | | This solution was recommended by a Catalyst developer. Reviewed-by: Michel Dänzer <[email protected]>