summaryrefslogtreecommitdiffstats
path: root/src/gallium
Commit message (Collapse)AuthorAgeFilesLines
* panfrost: Set the GEM handle for AFBC buffersTomeu Vizoso2019-03-271-0/+1
| | | | | Signed-off-by: Tomeu Vizoso <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Fix sscanf format optionsTomeu Vizoso2019-03-271-2/+2
| | | | | Signed-off-by: Tomeu Vizoso <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* virgl: Fake MSAA when max samples is 1Alexandros Frantzis2019-03-271-1/+4
| | | | | | | | | | | | | | When the host is running on softpipe/llvmpipe the maximum number of samples for multisampling is 1. GL 3.0 requires at least 4 samples, and softpipe/llvmpipe get around this by enabling PIPE_CAP_FAKE_SW_MSAA. This patch mimics softpipe/llvmpipe behavior in virgl by enabling the same PIPE_CAP_FAKE_SW_MSAA workaround when the max sample count reported by the host is 1. This change allows virgl on a softpipe/llvmpipe host to advertise support for GL 3.0 and beyond. Signed-off-by: Alexandros Frantzis <[email protected]> Reviewed-By: Gert Wollny <[email protected]>
* panfrost: Preliminary work for mipmapsAlyssa Rosenzweig2019-03-278-207/+163
| | | | | | | | | | | | | | This patch refactors a substantial amount of code in preparation for mipmaps. In particular, we know have a correct slice abstraction based on offsets; cpu/gpu are no longer arbitrary pointers. We additionally shuffle around other code to accompany these changes and cleanup how tiled textures are handled, while drawing some attention to the blit code. Mipmaps are still disabled at this point, as autogeneration is not yet implemented; enabling as-is would cause regressions. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: fpow is a two-part operationAlyssa Rosenzweig2019-03-264-4/+4
| | | | | | | | | In fact, the native "fpow" instruction only does half of it; more work is needed for the actual instruction. For now, just lower. Fixes: 1ea42894c ("panfrost/midgard: Implement fpow") Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Handle i2b constantAlyssa Rosenzweig2019-03-261-1/+1
| | | | | | | Fixes dEQP-GLES2.functional.shaders.conversions.scalar_to_scalar.int_to_bool_fragment Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Expand fge lowering to more typesAlyssa Rosenzweig2019-03-263-6/+12
| | | | Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Add ult/ule opsAlyssa Rosenzweig2019-03-263-0/+7
| | | | Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Stub out ES3 caps/callbacksAlyssa Rosenzweig2019-03-262-1/+54
| | | | | | | | | Although this is not functional (and the command stream side is not aiming for ES3 right now), this is enough to run dEQP-GLES3 shader tests with the version override directive; this is useful, as some ES3 shader feature can occur in ES2 class shaders due to lowering. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Cleanup midgard_nir_algebraic.pyAlyssa Rosenzweig2019-03-261-1/+1
| | | | Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Lower source modifiers for intsAlyssa Rosenzweig2019-03-264-2/+22
| | | | | | | | | | | | | On Midgard, float ops support standard source modifiers (abs/neg) and destination modifiers (sat/pos/round). Integer ops do not support these, however. To cope, we use native NIR source modifiers for floats, but lower them away to iabs/ineg for integers, implementing those ops simultaneously to avoid regressions. Fixes the integer tests in dEQP-GLES2.functional.shaders.operator.unary_operator.minus.* Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Implement b2i; improve b2f/f2bAlyssa Rosenzweig2019-03-261-18/+30
| | | | | | | Fixes dEQP-GLES2.functional.shaders.conversions.scalar_to_scalar.bool_to_int_fragment Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Lower i2b32Alyssa Rosenzweig2019-03-261-0/+1
| | | | | | | Fixes dEQP-GLES2.functional.shader.conversions.scalar_to_scalar.int_to_bool_vertex Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Lower f2b32 to fneAlyssa Rosenzweig2019-03-261-0/+7
| | | | | | | Fixes dEQP-GLES2.functional.shaders.swizzles.vector_swizzles.mediump_bvec2_x_vertex Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Lower bool_to_int32Alyssa Rosenzweig2019-03-261-20/+23
| | | | | | | Fixes dEQP-GLES2.functional.shaders.linkage.varying_type_vec2 (among many others). Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Map more bany/ball opcodesAlyssa Rosenzweig2019-03-261-0/+11
| | | | | | | | | | | Some of these are not yet fully functional due to related bugs, but this the correct op mapping. The native ball/bany opcodes act on vec4's unconditionally. That said, both ball and bany have the nice property that duplicating an argument does not affect their output, so the default "hanging swizzles" allow us to implement 2/3-component opcodes correctly, implicitly lowering. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Add more ball/bany, iabs opsAlyssa Rosenzweig2019-03-261-0/+29
| | | | Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Schedule ball/bany to vectorsAlyssa Rosenzweig2019-03-261-4/+4
| | | | | | Though they output scalars, they need a vector unit to make sense. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Add fcsel_i opcodeAlyssa Rosenzweig2019-03-262-0/+3
| | | | | | | | | Whereas a normal fcsel acts on a boolean input in r31.w, the fcsel_i variant acts on an integer input in r31.w, which can be preloaded with an instruction like imov (with the appropriate negate flag on the source). Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Implement scissor testAlyssa Rosenzweig2019-03-261-6/+16
| | | | | | | This preliminary implementation should handle some basic cases. Future work should scissor the FRAGMENT job as well for efficiency. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Fix viewportsAlyssa Rosenzweig2019-03-261-7/+16
| | | | | | | | | Our viewport code hardcoded a number of wrong assumptions, which sort of sometimes worked but was definitely wrong (and broke most of dEQP). This corrects the logic, accounting for flipped-Y framebuffers, which fixes... most of dEQP. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Fix b2f32 swizzle for vectorsAlyssa Rosenzweig2019-03-261-6/+8
| | | | | | Fixes issues in most of dEQP-GLES2.functional.shaders.* Signed-off-by: Alyssa Rosenzweig <[email protected]>
* softpipe: fix clears to only clear specified color buffers.Dave Airlie2019-03-271-1/+2
| | | | | | This fixes piglit clearbuffer-mixed-format Reviewed-by: Brian Paul <[email protected]>
* draw/vs: partly fix basevertex/vertex idDave Airlie2019-03-271-4/+3
| | | | | | | | | | | This gets the basevertex from the draw depending on whether it's an indexed or non-indexed draw. We still fail a transform feedback test for vertex id, as the vertex id actually an index id, and isn't getting translated properly to a vertex id, suggestions on how/where to fix that welcome. Reviewed-by: Brian Paul <[email protected]>
* freedreno/ir3: Track whether shader needs derivativesKristian H. Kristensen2019-03-252-3/+3
| | | | | | | | | | | | | In 1088b788 ("freedreno/ir3: find # of samplers from uniform vars") we started counting number of samplers based on the uniform vars instead of number of cat5 instructions. We used the number of samplers to determine whether to enable derivatives, but when we only use derivatives and no samplers, that now breaks. Track whether we need derivatives explicitly and use that to enable the state. Fixes: 1088b788 ("freedreno/ir3: find # of samplers from uniform vars") Signed-off-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* st/nine: enable csmt per default on irisAndre Heider2019-03-251-3/+5
| | | | | | | | iris is thread safe, enable csmt for a ~5% performace boost. Signed-off-by: Andre Heider <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Axel Davy <[email protected]>
* i965,iris,anv: Make alpha to coverage work with sample maskDanylo Piliaiev2019-03-251-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | From "Alpha Coverage" section of SKL PRM Volume 7: "If Pixel Shader outputs oMask, AlphaToCoverage is disabled in hardware, regardless of the state setting for this feature." From OpenGL spec 4.6, "15.2 Shader Execution": "The built-in integer array gl_SampleMask can be used to change the sample coverage for a fragment from within the shader." From OpenGL spec 4.6, "17.3.1 Alpha To Coverage": "If SAMPLE_ALPHA_TO_COVERAGE is enabled, a temporary coverage value is generated where each bit is determined by the alpha value at the corresponding sample location. The temporary coverage value is then ANDed with the fragment coverage value to generate a new fragment coverage value." Similar wording could be found in Vulkan spec 1.1.100 "25.6. Multisample Coverage" Thus we need to compute alpha to coverage dithering manually in shader and replace sample mask store with the bitwise-AND of sample mask and alpha to coverage dithering. The following formula is used to compute final sample mask: m = int(16.0 * clamp(src0_alpha, 0.0, 1.0)) dither_mask = 0x1111 * ((0xfea80 >> (m & ~3)) & 0xf) | 0x0808 * (m & 2) | 0x0100 * (m & 1) sample_mask = sample_mask & dither_mask Credits to Francisco Jerez <[email protected]> for creating it. It gives a number of ones proportional to the alpha for 2, 4, 8 or 16 least significant bits of the result. GEN6 hardware does not have issue with simultaneous usage of sample mask and alpha to coverage however due to the wrong sending order of oMask and src0_alpha it is still affected by it. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109743 Signed-off-by: Danylo Piliaiev <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* draw/gs: fix point size outputs from geometry shader.Dave Airlie2019-03-261-8/+1
| | | | | | | | | | If the geom shader emits a point size we failed to find it here, use the correct API to look it up. Fixes: tests/spec/glsl-1.50/execution/geometry/point-size-out.shader_test Reviewed-by: Brian Paul <[email protected]>
* draw: bail instead of assert on instance count (v2)Dave Airlie2019-03-261-1/+3
| | | | | | | | | | | | With indirect rendering it's fine to set the instance count parameter to 0, and expect the rendering to be ignored. Fixes assert in KHR-GLES31.core.compute_shader.pipeline-gen-draw-commands on softpipe v2: return earlier before changing fpstate Reviewed-by: Brian Paul <[email protected]>
* vl/dri3: remove the wait before getting back bufferLeo Liu2019-03-251-15/+3
| | | | | | | | | | The wait here is unnecessary since we got a pool of back buffers, and the wait for swap buffer will happen before the present pixmap, at the same time the previous back buffer will be put back to pool for reuse after the check for PresentIdleNotify event Signed-off-by: Leo Liu <[email protected]> Reviewed-by: Michel Dänzer <[email protected]>
* android: static link with libexpat with Android O+Kishore Kadiyala2019-03-251-1/+9
| | | | | | | | | | | | | In Android O, MESA needs to statically link libexpat so that it's in same VNDK namespace. v2: apply change also to anv driver (Tapani) v3: use += in anv change (Eric Engestrom) Change-Id: I82b0be5c817c21e734dfdf5bfb6a9aa1d414ab33 Signed-off-by: Kishore Kadiyala <[email protected]> Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* freedreno: add ESSL capRob Clark2019-03-221-0/+7
| | | | | | | | | Report 320 for a6xx, which isn't *quite* true (no geom/tess, in particular), but other caps keep the reported GL and GLSL versions correct (3.1 / 3.10 es). But reporting 320 will switch on EXT_gpu_shader5, which is the goal. Signed-off-by: Rob Clark <[email protected]>
* gallium: add PIPE_CAP_ESSL_FEATURE_LEVELRob Clark2019-03-223-0/+14
| | | | | | | | | | | | | Adds a new cap to allow drivers to expose higher shading language versions in GLES contexts, to avoid having to report an artificially low version for the benefit of GL contexts. The motivation is to expose EXT_gpu_shader5 even though a driver may not support all the features needed for the corresponding GL extension (ARB_gpu_shader5). Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* swr: Fix build with llvm-9.0.Vinson Lee2019-03-222-0/+24
| | | | | | | | | | | | | | | | | Fix build error after llvm-9.0svn r352827 ("[opaque pointer types] Add a FunctionCallee wrapper type, and use it."). In file included from ./rasterizer/jitter/builder.h:158:0, from swr_shader.cpp:35: ./rasterizer/jitter/gen_builder_meta.hpp: In member function ‘llvm::Value* SwrJit::Builder::VGATHERPD(llvm::Value*, llvm::Value*, llvm::Value*, llvm::Value*, llvm::Value*, const llvm: :Twine&)’: ./rasterizer/jitter/gen_builder_meta.hpp:51:117: error: no matching function for call to ‘cast(llvm::FunctionCallee)’ Function* pFunc = cast<Function>(JM()->mpCurrentModule->getOrInsertFunction("meta.intrinsic.VGATHERPD", pFuncTy)); ^ Suggested-by: Philip Meulengracht <[email protected]> Signed-off-by: Vinson Lee <[email protected]> Reviewed-by: Alok Hota <[email protected]>
* spirv,nir: lower frexp_exp/frexp_sig inside a new NIR passSamuel Pitoiset2019-03-221-0/+1
| | | | | | | | | | This lowering isn't needed for RADV because AMDGCN has two instructions. It will be disabled for RADV in an upcoming series. While we are at it, factorize a little bit. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* iris: Push heavy memchecker code to DEBUGChris Wilson2019-03-222-1/+13
| | | | | | | | | | | | | | | | Invoking VALGRIND_CHECK_MEM_IS_DEFINED pulls in enough code to convince gcc to not inline __gen_uint and results in a lot of packing code ending up out-of-line with lots of stack copying. To ameliorate this, only insert the check inside the packer if DEBUG is defined and instead perform the validation checking before submitting the batch to the kernel. This should give accurate results if --trace-origins=yes is used, and failing that we can recompile in full debug mode to check on insertion. Improve drawoverhead baseline by 25% with a default build with valgrind-dev installed (with effectively no loss of vg coverage). Reviewed-by: Kenneth Graunke <[email protected]>
* iris: Fix batch chaining map_next increment.Kenneth Graunke2019-03-221-1/+1
| | | | Caught by Chris Wilson; split out from his valgrind patch.
* freedreno/ir3: rename has_kill to no_earlyzRob Clark2019-03-226-6/+6
| | | | | | | There are other cases where we need to disable early-z, like image writes. So rename to something more generic. Signed-off-by: Rob Clark <[email protected]>
* iris: Skip resolves and flushes altogether if unnecessaryKenneth Graunke2019-03-213-9/+27
| | | | Improves drawoverhead baseline scores by 1.17x.
* iris: Skip framebuffer resolve tracking if framebuffer isn't dirtyKenneth Graunke2019-03-212-70/+84
| | | | Improves drawoverhead baseline score by 1.86x.
* iris: Skip input resolve handling if bindings haven't changedKenneth Graunke2019-03-213-10/+19
| | | | | This brings the drawoverhead 16 Tex w/ no state change score from 22% of baseline to 97% of baseline.
* iris: Fix util_vma_heap_init size for IRIS_MEMZONE_SHADERKenneth Graunke2019-03-211-1/+1
| | | | Fixes assertions when disabling bucket allocators.
* softpipe: fix integer texture swizzling for 1 vs 1.0fDave Airlie2019-03-221-4/+5
| | | | | | | | The swizzling was putting float one in not integer 1. This fixes a lot of arb_texture_view-rendering-formats cases. Reviewed-by: Brian Paul <[email protected]>
* softpipe: remove shadow_ref assert.Dave Airlie2019-03-221-1/+0
| | | | | | | | | | | | I don't think this really buys us anything and TG4 with cubemap arrays falls over because sampler == 2, but otherwise works fine. Fixes: ./bin/textureGather fs shadow r CubeArray repeat on softpipe with ARB_gpu_shader5 enabled. Reviewed-by: Brian Paul <[email protected]>
* softpipe: handle 32-bit bitfield insertsDave Airlie2019-03-221-3/+7
| | | | | | Fixes piglits if ARB_gpu_shader5 is enabled Reviewed-by: Brian Paul <[email protected]>
* softpipe: fix 32-bit bitfield extractDave Airlie2019-03-221-2/+12
| | | | | | | | These didn't deal with the width == 32 case that TGSI is defined with. Fixes piglit tests if ARB_gpu_shader5 is enabled. Reviewed-by: Brian Paul <[email protected]>
* v3d: Upload all of UBO[0] if any indirect load occurs.Eric Anholt2019-03-211-38/+19
| | | | | | | | | | | | | | | The idea was that we could skip uploading the constant-indexed uniform data and just upload the uniforms that are variably-indexed. However, since the VS bin and render shaders may have a different set of uniforms used, this meant that we had to upload the UBO for each of them. The first case is generally a fairly small impact (usually the uniform array is the most space, other than a couple of FSes in shader-db), while the second is a larger impact: 3DMMES2 was uploading 38k/frame of uniforms instead of 18k. Given that the optimization is of dubious value, has a big downside, and is quite a bit of code, just drop it. No change in shader-db. No change on 3DMMES2 (n=15).
* v3d: Move constant offsets to UBO addresses into the main uniform stream.Eric Anholt2019-03-212-2/+6
| | | | | | | | | | We'd end up with the constant offset in the uniform stream anyway, since they're bigger than small immediates. Avoids the extra uniforms and adds in the shader in favor of just adding once on the CPU. shader-db: total instructions in shared programs: 6496865 -> 6494851 (-0.03%) total uniforms in shared programs: 2119511 -> 2117243 (-0.11%)
* v3d: Rename v3d_tmu_config_data to v3d_unit_data.Eric Anholt2019-03-211-4/+4
| | | | | | I want to reuse this for encoding small constant UBO/SSBO offsets into the uniform stream to reduce the extra uniform loads and adds for the small constant offsets.
* nv50/ir/nir: support gather offsetsKarol Herbst2019-03-212-3/+16
| | | | | | v2: only emit offsets if those are !0 Signed-off-by: Karol Herbst <[email protected]>