path: root/src/gallium
Commit message | Author | Age | Files | Lines
* iris: Print the memzone name when allocating BOs with INTEL_DEBUG=buf | Kenneth Graunke | 2019-03-28 | 1 | -2/+17
| | | | | This gives me an idea of what kinds of buffers are being allocated on the fly which could help inform our cache decisions.
* iris/icl: Add WA_2204188704 to disable pixel shader panic dispatch | Anuj Phogat | 2019-03-28 | 1 | -0/+7
| | | | | Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* iris/icl: Set Enabled Texel Offset Precision Fix bit | Anuj Phogat | 2019-03-28 | 1 | -0/+7
| | | | | | | | h/w specification requires this bit to be always set. See Mesa commit 5eb173304bd. Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* freedreno/ir3: align const size to vec4 | Rob Clark | 2019-03-28 | 1 | -4/+5
| | | | | | | This is no longer true since PIPE_CAP_PACKED_UNIFORMS was enabled. Fixes: 3c8779af325 freedreno/ir3: Enable PIPE_CAP_PACKED_UNIFORMS Signed-off-by: Rob Clark <[email protected]>
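As a rough illustration of what aligning a const size to vec4 means once sizes are counted in dwords, the sketch below rounds a dword count up to a multiple of four. The helper names `align` and `const_size_vec4` are hypothetical and are not taken from the ir3 source.

```python
def align(value, alignment):
    """Round value up to the next multiple of alignment (a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)


def const_size_vec4(size_dwords):
    """Number of vec4 slots needed to hold size_dwords 32-bit constants.

    Hypothetical helper: with packed uniforms the size arrives in dwords,
    so it may not be a multiple of 4 and has to be rounded up first.
    """
    return align(size_dwords, 4) // 4
```

For example, five dwords of constants occupy two vec4 slots under this rounding, while four dwords still fit in one.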
* freedreno/a6xx: small cleanup | Rob Clark | 2019-03-28 | 1 | -2/+2
| | | | Signed-off-by: Rob Clark <[email protected]>
* iris: Fix blits with S8_UINT destination | Kenneth Graunke | 2019-03-28 | 1 | -4/+2
| | | | | | | | | | | | For depth and stencil blits, we always want the main mask to be Z, and the secondary pass mask to be S. If asked to blit Z+S to S, we should handle the blit in the second pass which properly gets the stencil resources. Before, we were trying to handle S as the main mask, and accidentally blitting a Z source to a S destination, which doesn't work out well. Fixes Piglit's "framebuffer-blit-levels {draw,read} stencil" tests.
* iris: Actually advertise some modifiers | Kenneth Graunke | 2019-03-27 | 1 | -0/+39
| | | | | | | | | | I neglected to fill out this driver function, causing us to advertise 0 modifiers. Now we advertise the various tilings and let the driver pick them. I've verified that X tiling works with Weston (by hacking the list to skip Y tiling). Y+CCS doesn't work yet because it's multiplane and the Gallium dri state tracker isn't really prepared for that. Leave it off for now.
* softpipe: add indirect store buffer/image unit | Dave Airlie | 2019-03-28 | 1 | -2/+34
| | | | | | | | The code to handle image unit indirect was missing Fixes piglit tests/spec/arb_arrays_of_arrays/execution/image_store/basic-imageStore-mixed-const-non-const-uniform-index.shader_test Reviewed-by: Roland Scheidegger <[email protected]>
* softpipe/draw: fix vertex id in soft paths. | Dave Airlie | 2019-03-28 | 5 | -11/+19
| | | | | | | | | | | | | | This fixes the vertex id fetch in the non-llvm drawing paths. The vertex id in elt mode comes from the elts, not just a linear value. Note we don't add basevertex in the elts case as it's already included in the elts by the looks of it (at least tests fail if I add it). Fixes piglit end-primitive tests and some others. Reviewed-by: Roland Scheidegger <[email protected]>
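A minimal sketch of the distinction described above, not the draw module's actual code: in an indexed (elt) draw the vertex id is read from the element list, while a non-indexed draw uses a running counter. The function name `vertex_ids` and the `start` parameter are hypothetical, and the `start + i` form of the linear path is an assumption.

```python
def vertex_ids(count, start=0, elts=None):
    """Return the vertex id seen by each of `count` vertices.

    elts: optional element (index) list; when given, the id comes from it.
    start: first vertex of a non-indexed draw (assumed linear numbering).
    """
    if elts is not None:
        # Indexed draw: the id is whatever the element list says. Nothing
        # is added on top, mirroring the note that basevertex already
        # appears to be included in the elts.
        return [elts[i] for i in range(count)]
    # Non-indexed draw: ids are a simple linear sequence.
    return [start + i for i in range(count)]
```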
* freedreno/ir3: Push UBOs to constant file | Kristian H. Kristensen | 2019-03-27 | 2 | -4/+27
| | | | | | | | We have a rather big constant file and it seems that the best way to use it is to upload all UBOs and lower UBO access to load_uniform. Signed-off-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/ir3: Enable PIPE_CAP_PACKED_UNIFORMS | Kristian H. Kristensen | 2019-03-27 | 1 | -0/+1
| | | | | | | | | | | | | | | | | | | This commit turns on the gallium cap and adds a pass to lower the load_ubo intrinsics for block 0 back to load_uniform intrinsics and adjust the backend where the cap switches units from vec4s to dwords. As we stop using ir3_glsl_type_size() for uniform layout, this also corrects an issue where we would allocate a vec4 slot for samplers in uniforms, fixing: dEQP-GLES3.functional.shaders.struct.uniform.sampler_array_fragment dEQP-GLES3.functional.shaders.struct.uniform.sampler_array_vertex dEQP-GLES3.functional.shaders.struct.uniform.sampler_nested_fragment dEQP-GLES2.functional.shaders.struct.uniform.sampler_nested_vertex dEQP-GLES2.functional.shaders.struct.uniform.sampler_nested_fragment Signed-off-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* radeon/vcn: add H.264 constrained baseline support | Leo Liu | 2019-03-27 | 1 | -0/+1
| | | | | | | | VCN supports this profile as well as UVD, so add it Signed-off-by: Leo Liu <[email protected]> Reviewed-by: Alex Deucher <[email protected]> CC: <[email protected]>
* panfrost: Wait for last job to finish in force_flush_fragment | Tomeu Vizoso | 2019-03-27 | 1 | -0/+8
| | | | | Signed-off-by: Tomeu Vizoso <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Pass the context BOs to the kernel so they aren't unmapped while in use | Tomeu Vizoso | 2019-03-27 | 1 | -3/+9
| | | | | | | Signed-off-by: Tomeu Vizoso <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Also tell the kernel about the checksum_slab | Tomeu Vizoso | 2019-03-27 | 1 | -4/+9
| | | | | Signed-off-by: Tomeu Vizoso <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Set the GEM handle for AFBC buffers | Tomeu Vizoso | 2019-03-27 | 1 | -0/+1
| | | | | Signed-off-by: Tomeu Vizoso <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Fix sscanf format options | Tomeu Vizoso | 2019-03-27 | 1 | -2/+2
| | | | | Signed-off-by: Tomeu Vizoso <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* virgl: Fake MSAA when max samples is 1 | Alexandros Frantzis | 2019-03-27 | 1 | -1/+4
| | | | | | | | | | | | | | When the host is running on softpipe/llvmpipe the maximum number of samples for multisampling is 1. GL 3.0 requires at least 4 samples, and softpipe/llvmpipe get around this by enabling PIPE_CAP_FAKE_SW_MSAA. This patch mimics softpipe/llvmpipe behavior in virgl by enabling the same PIPE_CAP_FAKE_SW_MSAA workaround when the max sample count reported by the host is 1. This change allows virgl on a softpipe/llvmpipe host to advertise support for GL 3.0 and beyond. Signed-off-by: Alexandros Frantzis <[email protected]> Reviewed-By: Gert Wollny <[email protected]>
* panfrost: Preliminary work for mipmaps | Alyssa Rosenzweig | 2019-03-27 | 8 | -207/+163
| | | | | | | | | | | | | | This patch refactors a substantial amount of code in preparation for mipmaps. In particular, we now have a correct slice abstraction based on offsets; cpu/gpu are no longer arbitrary pointers. We additionally shuffle around other code to accompany these changes and clean up how tiled textures are handled, while drawing some attention to the blit code. Mipmaps are still disabled at this point, as autogeneration is not yet implemented; enabling as-is would cause regressions. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: fpow is a two-part operation | Alyssa Rosenzweig | 2019-03-26 | 4 | -4/+4
| | | | | | | | | In fact, the native "fpow" instruction only does half of it; more work is needed for the actual instruction. For now, just lower. Fixes: 1ea42894c ("panfrost/midgard: Implement fpow") Signed-off-by: Alyssa Rosenzweig <[email protected]>
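The message does not say what the lowering expands to. One common way to lower a power operation, shown here purely as an assumption about what "just lower" could mean, is the identity pow(x, y) = exp2(y * log2(x)) for x > 0. The function name `lowered_fpow` is hypothetical.

```python
import math


def lowered_fpow(x, y):
    """Compute x**y from a log2 and an exp2, valid for x > 0.

    This is the textbook identity, not a description of the Midgard
    instruction or of the exact sequence the compiler emits.
    """
    return 2.0 ** (y * math.log2(x))
```

The two-step shape (a logarithm followed by an exponential) is what makes a power a multi-operation computation rather than a single primitive.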
* panfrost/midgard: Handle i2b constant | Alyssa Rosenzweig | 2019-03-26 | 1 | -1/+1
| | | | | | | Fixes dEQP-GLES2.functional.shaders.conversions.scalar_to_scalar.int_to_bool_fragment Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Expand fge lowering to more types | Alyssa Rosenzweig | 2019-03-26 | 3 | -6/+12
| | | | Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Add ult/ule ops | Alyssa Rosenzweig | 2019-03-26 | 3 | -0/+7
| | | | Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Stub out ES3 caps/callbacks | Alyssa Rosenzweig | 2019-03-26 | 2 | -1/+54
| | | | | | | | | Although this is not functional (and the command stream side is not aiming for ES3 right now), this is enough to run dEQP-GLES3 shader tests with the version override directive; this is useful, as some ES3 shader features can occur in ES2 class shaders due to lowering. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Cleanup midgard_nir_algebraic.py | Alyssa Rosenzweig | 2019-03-26 | 1 | -1/+1
| | | | Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Lower source modifiers for ints | Alyssa Rosenzweig | 2019-03-26 | 4 | -2/+22
| | | | | | | | | | | | | On Midgard, float ops support standard source modifiers (abs/neg) and destination modifiers (sat/pos/round). Integer ops do not support these, however. To cope, we use native NIR source modifiers for floats, but lower them away to iabs/ineg for integers, implementing those ops simultaneously to avoid regressions. Fixes the integer tests in dEQP-GLES2.functional.shaders.operator.unary_operator.minus.* Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Implement b2i; improve b2f/f2b | Alyssa Rosenzweig | 2019-03-26 | 1 | -18/+30
| | | | | | | Fixes dEQP-GLES2.functional.shaders.conversions.scalar_to_scalar.bool_to_int_fragment Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Lower i2b32 | Alyssa Rosenzweig | 2019-03-26 | 1 | -0/+1
| | | | | | | Fixes dEQP-GLES2.functional.shader.conversions.scalar_to_scalar.int_to_bool_vertex Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Lower f2b32 to fne | Alyssa Rosenzweig | 2019-03-26 | 1 | -0/+7
| | | | | | | Fixes dEQP-GLES2.functional.shaders.swizzles.vector_swizzles.mediump_bvec2_x_vertex Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Lower bool_to_int32 | Alyssa Rosenzweig | 2019-03-26 | 1 | -20/+23
| | | | | | | Fixes dEQP-GLES2.functional.shaders.linkage.varying_type_vec2 (among many others). Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Map more bany/ball opcodes | Alyssa Rosenzweig | 2019-03-26 | 1 | -0/+11
| | | | | | | | | | | Some of these are not yet fully functional due to related bugs, but this is the correct op mapping. The native ball/bany opcodes act on vec4's unconditionally. That said, both ball and bany have the nice property that duplicating an argument does not affect their output, so the default "hanging swizzles" allow us to implement 2/3-component opcodes correctly, implicitly lowering. Signed-off-by: Alyssa Rosenzweig <[email protected]>
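The duplication property can be checked with a small model. The sketch below treats ball/bany as plain "all" and "any" over four booleans, which is a simplification of the real comparison opcodes, and pads shorter vectors by repeating the last component as a stand-in for a hanging swizzle. All function names are hypothetical.

```python
import itertools


def pad_to_vec4(components):
    """Extend a 2- or 3-component vector to 4 by repeating the last one."""
    padded = list(components)
    while len(padded) < 4:
        padded.append(padded[-1])
    return padded


def ball4(v):
    """Model of a vec4-only 'all' reduction."""
    assert len(v) == 4
    return all(v)


def bany4(v):
    """Model of a vec4-only 'any' reduction."""
    assert len(v) == 4
    return any(v)


def padding_is_harmless(width):
    """Exhaustively check that padding never changes the all/any result."""
    for combo in itertools.product([False, True], repeat=width):
        padded = pad_to_vec4(combo)
        if ball4(padded) != all(combo) or bany4(padded) != any(combo):
            return False
    return True
```

Because a repeated component contributes nothing new to either reduction, the vec4 result matches the narrower one for every input.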
* panfrost/midgard: Add more ball/bany, iabs ops | Alyssa Rosenzweig | 2019-03-26 | 1 | -0/+29
| | | | Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Schedule ball/bany to vectors | Alyssa Rosenzweig | 2019-03-26 | 1 | -4/+4
| | | | | | Though they output scalars, they need a vector unit to make sense. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Add fcsel_i opcode | Alyssa Rosenzweig | 2019-03-26 | 2 | -0/+3
| | | | | | | | | Whereas a normal fcsel acts on a boolean input in r31.w, the fcsel_i variant acts on an integer input in r31.w, which can be preloaded with an instruction like imov (with the appropriate negate flag on the source). Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Implement scissor test | Alyssa Rosenzweig | 2019-03-26 | 1 | -6/+16
| | | | | | | This preliminary implementation should handle some basic cases. Future work should scissor the FRAGMENT job as well for efficiency. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Fix viewports | Alyssa Rosenzweig | 2019-03-26 | 1 | -7/+16
| | | | | | | | | Our viewport code hardcoded a number of wrong assumptions, which sort of sometimes worked but was definitely wrong (and broke most of dEQP). This corrects the logic, accounting for flipped-Y framebuffers, which fixes... most of dEQP. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Fix b2f32 swizzle for vectors | Alyssa Rosenzweig | 2019-03-26 | 1 | -6/+8
| | | | | | Fixes issues in most of dEQP-GLES2.functional.shaders.* Signed-off-by: Alyssa Rosenzweig <[email protected]>
* softpipe: fix clears to only clear specified color buffers. | Dave Airlie | 2019-03-27 | 1 | -1/+2
| | | | | | This fixes piglit clearbuffer-mixed-format Reviewed-by: Brian Paul <[email protected]>
* draw/vs: partly fix basevertex/vertex id | Dave Airlie | 2019-03-27 | 1 | -4/+3
| | | | | | | | | | | This gets the basevertex from the draw depending on whether it's an indexed or non-indexed draw. We still fail a transform feedback test for vertex id, as the vertex id is actually an index id and isn't getting translated properly to a vertex id; suggestions on how/where to fix that are welcome. Reviewed-by: Brian Paul <[email protected]>
* freedreno/ir3: Track whether shader needs derivatives | Kristian H. Kristensen | 2019-03-25 | 2 | -3/+3
| | | | | | | | | | | | | In 1088b788 ("freedreno/ir3: find # of samplers from uniform vars") we started counting number of samplers based on the uniform vars instead of number of cat5 instructions. We used the number of samplers to determine whether to enable derivatives, but when we only use derivatives and no samplers, that now breaks. Track whether we need derivatives explicitly and use that to enable the state. Fixes: 1088b788 ("freedreno/ir3: find # of samplers from uniform vars") Signed-off-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* st/nine: enable csmt per default on iris | Andre Heider | 2019-03-25 | 1 | -3/+5
| | | | | | | | iris is thread safe, enable csmt for a ~5% performance boost. Signed-off-by: Andre Heider <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Axel Davy <[email protected]>
* i965,iris,anv: Make alpha to coverage work with sample mask | Danylo Piliaiev | 2019-03-25 | 1 | -2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | From "Alpha Coverage" section of SKL PRM Volume 7: "If Pixel Shader outputs oMask, AlphaToCoverage is disabled in hardware, regardless of the state setting for this feature." From OpenGL spec 4.6, "15.2 Shader Execution": "The built-in integer array gl_SampleMask can be used to change the sample coverage for a fragment from within the shader." From OpenGL spec 4.6, "17.3.1 Alpha To Coverage": "If SAMPLE_ALPHA_TO_COVERAGE is enabled, a temporary coverage value is generated where each bit is determined by the alpha value at the corresponding sample location. The temporary coverage value is then ANDed with the fragment coverage value to generate a new fragment coverage value." Similar wording can be found in Vulkan spec 1.1.100 "25.6. Multisample Coverage". Thus we need to compute alpha to coverage dithering manually in the shader and replace the sample mask store with the bitwise-AND of sample mask and alpha to coverage dithering. The following formula is used to compute the final sample mask: m = int(16.0 * clamp(src0_alpha, 0.0, 1.0)); dither_mask = 0x1111 * ((0xfea80 >> (m & ~3)) & 0xf) | 0x0808 * (m & 2) | 0x0100 * (m & 1); sample_mask = sample_mask & dither_mask. Credits to Francisco Jerez <[email protected]> for creating it. It gives a number of ones proportional to the alpha for 2, 4, 8 or 16 least significant bits of the result. GEN6 hardware does not have an issue with simultaneous usage of sample mask and alpha to coverage; however, due to the wrong sending order of oMask and src0_alpha, it is still affected by it. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109743 Signed-off-by: Danylo Piliaiev <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
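The formula quoted in the message can be transcribed directly into a small script to see how the mask behaves. This is only a host-side sketch for experimentation, not the shader code the drivers emit; the function names `clamp`, `dither_mask` and `apply_alpha_to_coverage` are hypothetical wrappers around the three expressions from the message.

```python
def clamp(x, lo, hi):
    """Clamp x to the closed range [lo, hi]."""
    return max(lo, min(hi, x))


def dither_mask(src0_alpha):
    """16-bit dither mask from the commit message's formula."""
    m = int(16.0 * clamp(src0_alpha, 0.0, 1.0))
    return (0x1111 * ((0xfea80 >> (m & ~3)) & 0xf)
            | 0x0808 * (m & 2)
            | 0x0100 * (m & 1))


def apply_alpha_to_coverage(sample_mask, src0_alpha):
    """AND the shader-written sample mask with the dither mask."""
    return sample_mask & dither_mask(src0_alpha)


if __name__ == "__main__":
    # Print the mask for every sixteenth of alpha.
    for m in range(17):
        print(f"alpha={m / 16.0:.4f} mask={dither_mask(m / 16.0):#06x}")
```

Over the full 16 bits, the number of set bits equals m, so coverage grows by one sample for each sixteenth of alpha; an alpha of 0.5 gives 0xaaaa, which keeps every other sample.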
* draw/gs: fix point size outputs from geometry shader. | Dave Airlie | 2019-03-26 | 1 | -8/+1
| | | | | | | | | | If the geom shader emits a point size we failed to find it here, use the correct API to look it up. Fixes: tests/spec/glsl-1.50/execution/geometry/point-size-out.shader_test Reviewed-by: Brian Paul <[email protected]>
* draw: bail instead of assert on instance count (v2) | Dave Airlie | 2019-03-26 | 1 | -1/+3
| | | | | | | | | | | | With indirect rendering it's fine to set the instance count parameter to 0, and expect the rendering to be ignored. Fixes assert in KHR-GLES31.core.compute_shader.pipeline-gen-draw-commands on softpipe v2: return earlier before changing fpstate Reviewed-by: Brian Paul <[email protected]>
* vl/dri3: remove the wait before getting back buffer | Leo Liu | 2019-03-25 | 1 | -15/+3
| | | | | | | | | | The wait here is unnecessary since we have a pool of back buffers, and the wait for swap buffer will happen before the present pixmap; at the same time, the previous back buffer will be put back into the pool for reuse after the check for the PresentIdleNotify event. Signed-off-by: Leo Liu <[email protected]> Reviewed-by: Michel Dänzer <[email protected]>
* android: static link with libexpat with Android O+ | Kishore Kadiyala | 2019-03-25 | 1 | -1/+9
| | | | | | | | | | | | | In Android O, MESA needs to statically link libexpat so that it's in the same VNDK namespace. v2: apply change also to anv driver (Tapani) v3: use += in anv change (Eric Engestrom) Change-Id: I82b0be5c817c21e734dfdf5bfb6a9aa1d414ab33 Signed-off-by: Kishore Kadiyala <[email protected]> Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* freedreno: add ESSL cap | Rob Clark | 2019-03-22 | 1 | -0/+7
| | | | | | | | | Report 320 for a6xx, which isn't *quite* true (no geom/tess, in particular), but other caps keep the reported GL and GLSL versions correct (3.1 / 3.10 es). But reporting 320 will switch on EXT_gpu_shader5, which is the goal. Signed-off-by: Rob Clark <[email protected]>
* gallium: add PIPE_CAP_ESSL_FEATURE_LEVEL | Rob Clark | 2019-03-22 | 3 | -0/+14
| | | | | | | | | | | | | Adds a new cap to allow drivers to expose higher shading language versions in GLES contexts, to avoid having to report an artificially low version for the benefit of GL contexts. The motivation is to expose EXT_gpu_shader5 even though a driver may not support all the features needed for the corresponding GL extension (ARB_gpu_shader5). Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* swr: Fix build with llvm-9.0. | Vinson Lee | 2019-03-22 | 2 | -0/+24
| | | | | | | | | | | | | | | | | Fix build error after llvm-9.0svn r352827 ("[opaque pointer types] Add a FunctionCallee wrapper type, and use it."). In file included from ./rasterizer/jitter/builder.h:158:0, from swr_shader.cpp:35: ./rasterizer/jitter/gen_builder_meta.hpp: In member function ‘llvm::Value* SwrJit::Builder::VGATHERPD(llvm::Value*, llvm::Value*, llvm::Value*, llvm::Value*, llvm::Value*, const llvm: :Twine&)’: ./rasterizer/jitter/gen_builder_meta.hpp:51:117: error: no matching function for call to ‘cast(llvm::FunctionCallee)’ Function* pFunc = cast<Function>(JM()->mpCurrentModule->getOrInsertFunction("meta.intrinsic.VGATHERPD", pFuncTy)); ^ Suggested-by: Philip Meulengracht <[email protected]> Signed-off-by: Vinson Lee <[email protected]> Reviewed-by: Alok Hota <[email protected]>
* spirv,nir: lower frexp_exp/frexp_sig inside a new NIR pass | Samuel Pitoiset | 2019-03-22 | 1 | -0/+1
| | | | | | | | | | This lowering isn't needed for RADV because AMDGCN has two instructions. It will be disabled for RADV in an upcoming series. While we are at it, factorize a little bit. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>