summaryrefslogtreecommitdiffstats
path: root/src/gallium
Commit message (Collapse)AuthorAgeFilesLines
* radeonsi: rename r600_atom -> si_atomMarek Olšák2018-04-2710-76/+76
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: remove r600_pipe_common.hMarek Olšák2018-04-2714-347/+302
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi/gfx9: workaround for INTERP with indirect indexingMarek Olšák2018-04-271-6/+13
| | | | | | | and clean up the conditions. Reviewed-by: Nicolai Hähnle <[email protected]> Cc: 18.0 18.1 <[email protected]>
* radeonsi: rewrite DCC format compatibility checking codeMarek Olšák2018-04-271-56/+42
| | | | | | It might be better to use a slow compressed clear when clearing to 1. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: implement DCC fast clear swizzle constraints more accuratelyMarek Olšák2018-04-273-35/+65
| | | | | | | | | | Reduce swizzle constraints to the ALPHA_IS_ON_MSB constraint and the clear value of 1. This significantly changes the DCC fast clear code, and fixes fast clear for RGB formats without alpha. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: rename variables and document stuff around DCC fast clearMarek Olšák2018-04-271-41/+42
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: fully enable 2x DCC MSAA for array and non-array texturesMarek Olšák2018-04-274-14/+20
| | | | | | | The clear code is exactly the same as for 1 sample buffers - just clear the whole thing. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: enable fast color clear for level 0 of mipmapped textures on <= VIMarek Olšák2018-04-272-9/+24
| | | | | | | GFX9 is more complicated and needs a compute shader that we should just copy from amdvlk. Reviewed-by: Nicolai Hähnle <[email protected]>
* swr/rast: No need to export GetSimdValidIndicesGfxGeorge Kyriazis2018-04-271-4/+0
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Small editorial changesGeorge Kyriazis2018-04-273-19/+17
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Use new processor detection mechanismGeorge Kyriazis2018-04-272-1/+51
| | | | | | | | Use specific avx512 selection mechanism based on avx512er bit instead of getHostCPUName(). LLVM 6.0.0 has a bug that reports wrong string for KNL (fixed in 6.0.1). Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Output rasterizer dir to console since it's process specificGeorge Kyriazis2018-04-271-1/+4
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Add TranslateGfxAddress for shaderGeorge Kyriazis2018-04-273-3/+19
| | | | | | Also add GFX_MEM_CLIENT_SHADER Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: jit PRINT improvements.George Kyriazis2018-04-271-2/+13
| | | | | | | | Sign-extend integer types to 32bit when specifying "%d" and add new %u which zero-extends to 32bit. Improves printing of sub 32bit integer types (i1 specifically). Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Fix regressions.George Kyriazis2018-04-271-1/+1
| | | | | | Bump jit cache revision number to force recompile. Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Cleanup old cruft.George Kyriazis2018-04-271-17/+3
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Package events.proto with core outputGeorge Kyriazis2018-04-272-2/+32
| | | | | | | | However only if the file exists in DEBUG_OUTPUT_DIR. The expectation is that AR rasterizerLauncher will start placing it there when launching a workload (which is in a subsequent checkin) Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Fix init in EventHandlerWorkerStatsGeorge Kyriazis2018-04-271-1/+4
| | | | | | Make sure we initialize variables. Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Fix return type of VCVTPS2PH.George Kyriazis2018-04-271-1/+1
| | | | | | expecting <8xi16> return. Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: WIP Translation handlingGeorge Kyriazis2018-04-272-18/+26
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Use different handing for stream masksGeorge Kyriazis2018-04-275-6/+11
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Silence warningsGeorge Kyriazis2018-04-273-4/+2
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Add support for TexelMask evaluationGeorge Kyriazis2018-04-272-0/+44
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Internal core changeGeorge Kyriazis2018-04-271-0/+1
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Fix x86 lowering 64-bit float handlingGeorge Kyriazis2018-04-272-6/+56
| | | | | | | | | - 64-bit cvt-to-float needs to be explicitly handled - gathers need the right parameter types to work with doubles Fixes draw-vertices piglit tests Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Add some SIMD_T utility functorsGeorge Kyriazis2018-04-271-0/+66
| | | | | | VecEqual and VecHash Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Fix wrong type allocationGeorge Kyriazis2018-04-271-1/+1
| | | | | | ALLOCA pointer elements, not pointers. Reviewed-by: Bruce Cherniak <[email protected]>
* swr: touch generated files to update timestampGeorge Kyriazis2018-04-271-0/+11
| | | | | | previous change in generators necessitates this change Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Fix byte offset for non-indexed drawsGeorge Kyriazis2018-04-271-3/+3
| | | | | | for the case when USE_SIMD16_SHADERS == FALSE Reviewed-by: Bruce Cherniak <[email protected]>
* etnaviv: remove not needed includesChristian Gmeiner2018-04-271-3/+0
| | | | | Signed-off-by: Christian Gmeiner <[email protected]> Reviewed-by: Philipp Zabel <[email protected]>
* etnaviv: remove redundant includeChristian Gmeiner2018-04-271-2/+0
| | | | | Signed-off-by: Christian Gmeiner <[email protected]> Reviewed-by: Philipp Zabel <[email protected]>
* broadcom/vc5: Add support for centroid varyings.Eric Anholt2018-04-264-4/+53
| | | | | | | | | It would be nice to share the flags packet emit logic with flat shade flags, but I couldn't come up with a good way while still using our pack macros. We need to refactor this to shader record setup at compile time, anyway. Fixes ext_framebuffer_multisample-interpolation * centroid-*
* broadcom/vc5: Add an assert about GFXH-1559.Eric Anholt2018-04-261-0/+9
| | | | | Our TF outputs always start at 6 or 7 currently, so we don't hit the broken 8 case. Let's make sure that doesn't change somehow.
* broadcom/vc5: Implement GFXH-1742 workaround (emit 2 dummy stores on 4.x).Eric Anholt2018-04-261-8/+27
| | | | | | This should fix help with intermittent GPU hangs in tests switching formats while rendering small frames. Unfortunately, it didn't help with the tests I'm having troubles with.
* st/va: Fix typosDrew Davenport2018-04-261-24/+24
| | | | | | | | | s/attibute/attribute/ s/suface/surface/ v2: rebased(Leo) Reviewed-by: Leo Liu <[email protected]>
* st/va: Fix potential buffer overreadDrew Davenport2018-04-261-1/+1
| | | | | | | | VASurfaceAttribExternalBuffers.pitches is indexed by plane. Current implementation only supports single plane layout. Reviewed-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Leo Liu <[email protected]>
* radeon/vcn: fix mpeg4 msg buffer settingsBoyuan Zhang2018-04-261-9/+9
| | | | | | | | Previous bit-fields assignments are incorrect and will result certain mpeg4 decode failed due to wrong flag values. This patch fixes these assignments. Signed-off-by: Boyuan Zhang <[email protected]> Reviewed-by: Leo Liu <[email protected]>
* gallium/util: Fix incorrect refcounting of separate stencil.Eric Anholt2018-04-251-2/+1
| | | | | | | | | | | The driver may have a reference on the separate stencil buffer for some reason (like an unflushed job using it), so we can't directly free the resource and should instead just decrement the refcount that we own. Fixes double-free in KHR-GLES3.packed_depth_stencil.blit.depth32f_stencil8 on vc5. Fixes: e94eb5e6000e ("gallium/util: add u_transfer_helper") Reviewed-by: Rob Clark <[email protected]>
* broadcom/vc5: Fix reloads of separate stencil buffers.Eric Anholt2018-04-251-4/+16
| | | | Like for stores, we need to emit a separate load_general packet.
* broadcom/vc5: Fix cpp of MSAA surfaces on 4.x.Eric Anholt2018-04-252-3/+5
| | | | | | The internal-type-bpp path is for surfaces that get stored in the raw TLB format. For 4.x, we're storing MSAA as just 2x width/height at the original format.
* broadcom/vc5: Implement stencil blits using RGBA.Eric Anholt2018-04-252-2/+83
| | | | Fixes piglit fbo-depthstencil blit default_fb
* broadcom/vc5: Remove leftover vc4 MSAA lowering setup in the FS key.Eric Anholt2018-04-251-4/+1
|
* broadcom/vc5: Fix tile load/store of MSAA surfaces on 4.x.Eric Anholt2018-04-251-1/+11
| | | | | For single-sample we have to always program SAMPLE_0, but for multisample we want to store all the samples.
* draw: fix different sign logic when clippingRoland Scheidegger2018-04-251-9/+8
| | | | | | | | | | | | | | | | | | | The logic was flawed, since mul(x,y) will be <= 0 (exactly 0) when the sign is the same but both numbers are sufficiently small (if the product is smaller than 2^-128). This could apparently lead to emitting a sufficient amount of additional bogus vertices to overflow the allocated array for them, hitting an assertion (still safe with release builds since we just aborted clipping after the assertion in this case - I'm however unsure if this is now really no longer possible, so that code stays). Not sure if the additional vertices could cause other grief, I didn't see anything wrong even when hitting the assertion. Essentially, both +-0 are treated as positive (the vertex is considered to be inside the clip volume for this plane), so integrate the logic determining different sign into the branch there. Reviewed-by: Jose Fonseca <[email protected]>
* draw: simplify clip null tri logicRoland Scheidegger2018-04-251-11/+9
| | | | | | | | | Simplifies the logic when to emit null tris (albeit the reasons why we have to do this remain unclear). This is strictly just logic simplification, the behavior doesn't change at all. Reviewed-by: Jose Fonseca <[email protected]>
* nvc0/ir: all short immediates are sign-extended, adjust LIMM testIlia Mirkin2018-04-243-19/+24
| | | | | | | | | | | | | | | | | | | | | | Some analysis suggests that all short immediates are sign-extended. The insnCanLoad logic already accounted for this, but we could still pick the wrong form when emitting actual instructions that support both short and long immediates (with the long form usually having additional restrictions that insnCanLoad should be aware of). This also reverses a bunch of commits that had previously "worked around" this issue in various emitters: 9c63224540ef: gm107/ir: make use of ADD32I for all immediates 83a4f28dc27b: gm107/ir: make use of LOP32I for all immediates b84c97587b4a: gm107/ir: make use of IMUL32I for all immediates d30768025a22: gk110/ir: make use of IMUL32I for all immediates as well as the original import for UMUL in the nvc0 emitter. Reported-by: Karol Herbst <[email protected]> Signed-off-by: Ilia Mirkin <[email protected]> Tested-by: Karol Herbst <[email protected]>
* meson: raise required version to 0.44.1Dylan Baker2018-04-241-6/+0
| | | | | | | | | | | | We have already required 0.44 for building clover and swr, so it was already partially required. This just makes it required across the board instead of just for clover and swr. There is a bug in 0.44 which makes it impossible to build mesa in some configurations, so require 0.44.1 which fixes this. Signed-off-by: Dylan Baker <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* meson: fix graw-xlib after auxiliary consolidationDylan Baker2018-04-241-2/+1
| | | | | | | | | | This one's completely my fault, I didn't do good enough testing after rebasing and this got missed. Fixes: d28c24650110c130008be3d3fe584520ff00ceb1 ("meson: build graw tests") Signed-off-by: Dylan Baker <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* gm107/ir/lib: fix sched in div u32 builtinKarol Herbst2018-04-242-4/+4
| | | | | | | | | | Imad needs to set a read barrier. With significant big work groups I was getting wrong results for div u32. Turns out the issue was with the sched opcodes. Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* broadcom/vc5: Set up internal_format for imported resources.Eric Anholt2018-04-241-0/+2
| | | | | Without this, we'd assertion fail in u_transfer_helper when mapping an imported resource.