path: root/src/gallium/drivers
Commit message | Author | Age | Files | Lines
* radeonsi: rename cache flushing flags once moreMarek Olšák2015-11-137-35/+30
| KCACHE, TC L1 and TC L2 are renamed to:
| - SMEM L1
| - VMEM L1
| - GLOBAL L2
|
| You can now easily tell what they are used for. Shaders must deal with coherency issues between both L1s manually, e.g. by setting GLC=1 or by using s_dcache_*.
|
| BOTH_ICACHE_KCACHE was an unused definition.
|
| Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: set the DISABLE_WR_CONFIRM flag on CI-VI as wellMarek Olšák2015-11-131-2/+2
| | | | | | | | I missed this in commit c3e527f93d4281ad6e2ca165eaf6ff588e4faefa ("radeonsi: only enable write confirmation on the last CP DMA packet"). Reviewed-by: Michel Dänzer <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: initialize SX_PS_DOWNCONVERT to 0 on StoneyMarek Olšák2015-11-131-0/+3
| | | | | | | otherwise the SX or CB blocks can go bananas Reviewed-by: Nicolai Hähnle <[email protected]> Cc: [email protected]
* radeonsi: add glClearBufferSubData accelerationMarek Olšák2015-11-131-0/+60
| | | | | | 8-bit and 16-bit clears which are not aligned to dwords are done in software. Reviewed-by: Nicolai Hähnle <[email protected]>
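A minimal sketch of why dword alignment matters for these clears, assuming the accelerated path fills whole 32-bit words: an 8-bit or 16-bit clear value can be replicated into a dword pattern, but only a range whose offset and size are both multiples of 4 can be filled with it. The helper names `expand_clear_value` and `is_dword_aligned` are hypothetical and are not taken from the radeonsi source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: replicate a 1-, 2- or 4-byte clear value into a
 * 32-bit fill pattern. Other sizes are passed through unchanged. */
static uint32_t expand_clear_value(uint32_t value, unsigned value_size)
{
    switch (value_size) {
    case 1:
        value &= 0xffu;
        return value | (value << 8) | (value << 16) | (value << 24);
    case 2:
        value &= 0xffffu;
        return value | (value << 16);
    default:
        return value;
    }
}

/* Hypothetical check: true when the range can be filled dword by dword. */
static bool is_dword_aligned(uint64_t offset, uint64_t size)
{
    return (offset % 4 == 0) && (size % 4 == 0);
}
```

Under this assumption, a caller would use the expanded pattern when `is_dword_aligned()` holds and otherwise fall back to a CPU-side fill of the mapped buffer.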
* radeonsi: add SI_SAVE_FRAGMENT_STATE blitter flagMarek Olšák2015-11-131-19/+25
| | | | | | Buffer clears via transform feedback won't set this. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: fix a future crash in emit_cb_target_maskMarek Olšák2015-11-131-1/+1
| | | | | | | This can't crash currently, but it would crash if clear_buffer from u_blitter were used with a clean context. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: fix unaligned clear_buffer fallbackMarek Olšák2015-11-131-6/+8
| | | | | | | This is unreachable currently, but it will be used by unaligned 8-bit and 16-bit fills. Reviewed-by: Nicolai Hähnle <[email protected]>
* r600g: fix clear_buffer fallback with offset != 0Marek Olšák2015-11-131-0/+1
| | | | | | | Discovered by luck. This code path hasn't been exercised since transform feedback was implemented. Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: fix PIPE_QUERY_GPU_FINISHEDMarek Olšák2015-11-131-1/+1
| | | | | | | | | Broken by the addition of r600_multi_fence in 3b37155a68acc351cba86a1fa142bd0de2192d4c Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89014 Reviewed-by: Michel Dänzer <[email protected]>
* nvc0/ir: add support for TGSI_SEMANTIC_HELPER_INVOCATIONIlia Mirkin2015-11-126-0/+6
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* nv50,nvc0: add ARB_clear_texture supportIlia Mirkin2015-11-115-7/+101
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* gallium: add PIPE_CAP_CLEAR_TEXTURE and clear_texture prototypeIlia Mirkin2015-11-1114-0/+14
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* r600: initialised PGM_RESOURCES_2 for ES/GSDave Airlie2015-11-122-0/+6
| | | | | | | | | | | This fixes the corruption on rendering that we are seeing in certain geometry shaders. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=91780 Reviewed-by: Alex Deucher <[email protected]> Tested / Reviewed-by: Glenn Kennard <[email protected]> Cc: "10.6" "11.0" <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* r600g: Pass conservative depth parameters to hwGlenn Kennard2015-11-117-1/+53
| | | | | | | | Supported on R700 and up. Signed-off-by: Glenn Kennard <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* Revert "r600g: Pass conservative depth parameters to hw"Dave Airlie2015-11-116-46/+0
| | | | | | This reverts commit a1fc78911e9a6439db94d6ae91d5672c76e5fb1c. I pushed the wrong patch.
* r600g: Implement ARB_texture_viewGlenn Kennard2015-11-112-7/+18
| | | | | | Signed-off-by: Glenn Kennard <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* r600g: Pass conservative depth parameters to hwGlenn Kennard2015-11-116-0/+46
| | | | | | | Supported on R700 and up. Signed-off-by: Glenn Kennard <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* vc4: Avoid loading undefined (newly-allocated) FBO contents.Eric Anholt2015-11-091-0/+17
| | | | | | | Since X has undefined contents in new pixmaps, it will allocate new textures for an FBO and draw to them without an explicit clear. For VC4, it's much faster to emit a clear than the load of the actual undefined memory contents, so just do that instead.
* vc4: Return NULL when we can't make our shadow for a sampler view.Eric Anholt2015-11-091-0/+4
| | | | | | | I'm not sure what the caller does is appropriate (just have a NULL sampler at this slot), but it fixes the immediate crash. Cc: "11.0" <[email protected]>
* vc4: Return GL_OUT_OF_MEMORY when buffer allocation fails.Eric Anholt2015-11-092-19/+32
| | | | | | | I was afraid our callers weren't prepared for this, but it looks like at least for resource creation, mesa/st throws an error appropriately. Cc: "11.0" <[email protected]>
* vc4: Add CL dumping for GL_ARRAY_PRIMITIVE.Eric Anholt2015-11-091-1/+16
|
* vc4: Fix a compiler warning.Eric Anholt2015-11-091-1/+1
|
* nvc0: enable compute support on FermiSamuel Pitoiset2015-11-081-2/+2
| | | | | | | | | | | Although compute support is still incomplete because textures and surfaces need to be implemented, it allows launching very simple compute kernels, such as one that reads MP performance counters. This turns on PIPE_CAP_COMPUTE and PIPE_SHADER_COMPUTE. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nv50/ir: fix emission of s[] args in certain situationsIlia Mirkin2015-11-071-2/+2
| | | | | | | | | | There might only be a single arg (e.g. cvt), so use mode rather than looking at the source directly. Also we don't want to rely on the type of the value, which can be unreliable, but instead use the instruction's. This works out well since mkSplit doesn't adjust the type. Signed-off-by: Ilia Mirkin <[email protected]>
* nv50/ir: only take abs value when computing high resultIlia Mirkin2015-11-071-1/+1
| | | | | | | | Not reachable from TGSI since it only has UMUL, no IMUL. However it's surprising that setting argument types to s32 will cause sign to get lost. Signed-off-by: Ilia Mirkin <[email protected]>
* nouveau: avoid queueing too much work onto a single fenceIlia Mirkin2015-11-072-26/+43
| | | | | | | | | | Force the fence to get kicked off, which won't actually wait for its completion, but any additional work will be put onto a fresh list. This fixes crashes in teximage-colors --benchmark with too many active maps. Signed-off-by: Ilia Mirkin <[email protected]>
* llvmpipe: disable front updates for nowDave Airlie2015-11-081-1/+1
| | | | | | | | As pointed out by Emil, this sometimes hangs, apparently due to threading. We need to rethink how this stuff works for llvmpipe. Signed-off-by: Dave Airlie <[email protected]>
* radeonsi: add register definitions for StoneyMarek Olšák2015-11-071-0/+322
| | | | | | There are a few non-stoney changes too. Reviewed-by: Alex Deucher <[email protected]>
* radeonsi: add workarounds for CP DMA to stay on the fast pathMarek Olšák2015-11-071-5/+88
| | | | | | v2: set emit_scratch_reloc, add a NULL check Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: unify CP DMA preparation logicMarek Olšák2015-11-071-37/+34
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: unify CP DMA code determining various flagsMarek Olšák2015-11-071-28/+23
| | | | | | v2: don't call get_flush_flags twice per function Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: only enable write confirmation on the last CP DMA packetMarek Olšák2015-11-071-2/+4
| | | | | | This should improve performance for big copies that need to be split. Reviewed-by: Michel Dänzer <[email protected]>
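The idea of confirming writes only once per split copy can be sketched as follows. This is an illustration under assumptions, not the driver's code: the `copy_packet` struct, the `split_copy` function and the per-packet size limit passed as `max_packet` are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of one packet of a split copy. */
struct copy_packet {
    uint64_t offset;     /* byte offset of this packet within the copy */
    uint32_t size;       /* bytes copied by this packet */
    bool     wr_confirm; /* request write confirmation for this packet */
};

/* Split a copy of `total` bytes into packets of at most `max_packet`
 * bytes (max_packet must be > 0). Only the final packet requests write
 * confirmation. Returns the number of packets written to `out`, which
 * is capped at `max_out`; if the cap is hit, the copy is incomplete and
 * no packet is flagged. */
static unsigned split_copy(uint64_t total, uint32_t max_packet,
                           struct copy_packet *out, unsigned max_out)
{
    unsigned n = 0;
    uint64_t offset = 0;

    while (total > 0 && n < max_out) {
        uint32_t size = total < max_packet ? (uint32_t)total : max_packet;

        out[n].offset = offset;
        out[n].size = size;
        out[n].wr_confirm = (size == total); /* true only for the last one */

        offset += size;
        total -= size;
        n++;
    }
    return n;
}
```

For example, a 10-byte copy with a 4-byte limit yields three packets of 4, 4 and 2 bytes, and only the third has `wr_confirm` set.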
* nv50/ir: allow emission of immediates in imul/imad opsIlia Mirkin2015-11-071-2/+8
| | | | | | | Nothing actually uses this yet (due to complications), but the emission logic is right. Signed-off-by: Ilia Mirkin <[email protected]>
* nv50/ir: properly set the type of the constant folding resultIlia Mirkin2015-11-061-4/+4
| | | | | | | This removes the hack used for merge, which only covers a fraction of the cases. Signed-off-by: Ilia Mirkin <[email protected]>
* nv50/ir: add support for const-folding OP_CVT with F64 source/destIlia Mirkin2015-11-063-0/+45
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* nv50/ir: add fp64 opcode emission support for G200 (NVA0)Ilia Mirkin2015-11-061-10/+84
| | | | | | Need to emulate rcp/rsq before providing full fp64 support Signed-off-by: Ilia Mirkin <[email protected]>
* nv50/ir: Add support for 64bit immediates to checkSwapSrc01Hans de Goede2015-11-061-5/+6
| | | | | | | | Now that we support 64 bit immediates in insnCanLoad, we need to swap 64 bit immediate sources too for optimal effect. Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0/ir: Teach insnCanLoad about double immediatesHans de Goede2015-11-061-6/+19
| Teach insnCanLoad about double immediates. Together with "Add support for merge-s to the ConstantFolding pass", this turns the following (nvc0) code:
|
|   1: mov u32 $r2 0x00000000 (8)
|   2: mov u32 $r3 0x3fe00000 (8)
|   3: add f64 $r0d $r0d $r2d (8)
|
| Into:
|
|   1: add f64 $r0d $r0d 0.500000 (8)
|
| Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
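To see why the two 32-bit moves in the listing amount to the constant 0.5, the halves can be merged into one IEEE-754 double by hand. The sketch below assumes $r2 holds the low word and $r3 the high word of $r2d; `merge_u32_to_f64` is a hypothetical helper, not a function from the nv50/ir code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: combine a low and a high 32-bit word into the
 * double whose IEEE-754 bit pattern they form. */
static double merge_u32_to_f64(uint32_t lo, uint32_t hi)
{
    uint64_t bits = ((uint64_t)hi << 32) | lo;
    double d;

    /* memcpy avoids type-punning through incompatible pointers. */
    memcpy(&d, &bits, sizeof(d));
    return d;
}
```

With `lo = 0x00000000` and `hi = 0x3fe00000`, the sign bit is 0, the biased exponent is 0x3fe (1022, i.e. 2^-1) and the mantissa is zero, so the result is 0.5.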
* nv50/ir: Add support for merge-s to the ConstantFolding passHans de Goede2015-11-061-0/+15
| | | | | | | | | | | This allows later passes like LoadPropagation to properly deal with 64 bit immediates. If the new 64 bit load this introduces does not get optimized away then split64BitOpPostRA() will split this into 2 instructions again. Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nv50/ir: disallow 64-bit immediates on nv50 targetsIlia Mirkin2015-11-061-1/+1
| | | | | | No instructions are able to load short immediates like nvc0 can. Signed-off-by: Ilia Mirkin <[email protected]>
* nv50/ir: allow movs with TYPE_F64 destinations to be splitIlia Mirkin2015-11-061-0/+6
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* gm107/ir: Add support for double immediatesHans de Goede2015-11-061-1/+4
| | | | | | | | Add support for encoding double immediates (up to 20 bits of precision) into the generated gm107 machine-code. Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0/ir: Add support for double immediatesHans de Goede2015-11-061-0/+8
| | | | | | | | Add support for encoding double immediates (up to 20 bits of precision) into the generated nvc0 machine-code. Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
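A rough sketch of what "up to 20 bits of precision" could mean in practice. It assumes, and this is an assumption rather than something the commit message states, that the 20 bits are the most significant bits of the 64-bit IEEE-754 pattern (sign, 11 exponent bits and 8 mantissa bits), so a double is encodable only if its low 44 bits are zero. `f64_fits_top20` is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical check: true when the double's bit pattern has nothing
 * but zeros below its top 20 bits (assumed encoding, see lead-in). */
static bool f64_fits_top20(double d)
{
    uint64_t bits;

    memcpy(&bits, &d, sizeof(bits));
    return (bits & 0x00000fffffffffffULL) == 0;
}
```

Under that assumption, values such as 0.5 or 1.0 qualify, while 0.1, whose mantissa has no short binary form, would still need a full 64-bit load.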
* radeon/uvd: fix VC-1 simple/main profile decode v2Boyuan Zhang2015-11-062-2/+7
| | | | | | | | | | | We just needed to set the extra width/height fields to get this working. v2 (chk): rebased, CC stable added, commit message added, fixed coding style Signed-off-by: Boyuan Zhang <[email protected]> Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Cc: "10.6 11.0" <[email protected]>
* freedreno/a4xx: fix blend colorRob Clark2015-11-061-5/+9
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno: update generated headersRob Clark2015-11-066-43/+54
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno: add a305 supportGuillaume Charifi2015-11-061-0/+1
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: Use nir_foreach_variableBoyan Ding2015-11-061-3/+3
| | | | | Signed-off-by: Boyan Ding <[email protected]> Signed-off-by: Rob Clark <[email protected]>
* nvc0: reintroduce BGRA4 format supportIlia Mirkin2015-11-062-3/+1
| | | | | | | | | | | | | | Commit 342e68dc60 (nvc0: remove BGRA4 format support) removed the support to fix a WoW trace. However after further experimentation, I was able to get the blit to work by using a different "fake" format in the 2d engine. The reason why this worked on nv50 is that nv50 falls back to the 3d blit path in case either the src or the dst aren't "faithfully" supported, while nvc0 only does it for the dst format. RG8 is better supported by the nvc0 2d engine than R16. Signed-off-by: Ilia Mirkin <[email protected]>
* llvmpipe: disable texture cacheRoland Scheidegger2015-11-051-1/+1
| | | | There are some weird problems with 8-wide vectors.