path: root/src/gallium/drivers
Commit message | Author | Age | Files | Lines
* etnaviv: separate PE and RS formats, use RS only for tiling | Jonathan Marek | 2019-11-18 | 8 | -56/+54
| There are PE formats not supported by RS, so we can't have a single function to translate both. Use RS only for same formats until we have a translate_rs_format and test the possible different format blits.
|
| Signed-off-by: Jonathan Marek <[email protected]>
| Reviewed-by: Christian Gmeiner <[email protected]>
* etnaviv: blt: use only for tiling, and add missing formats | Jonathan Marek | 2019-11-18 | 1 | -22/+30
| * Removes the incorrect usage of translate_rs_format
| * Disables use of BLT engine for different src/dst format
|
| We only really need the BLT engine for tiling/detiling right now, but it would be nice to support as many blit cases as possible to avoid using PE for that. To deal with different formats we need to:
| * Have a translate_blt_format which has all supported formats
| * Fix the swizzle translation from gallium (current version was wrong)
| * Set the src/dst sRGB bits as needed
| * Find which type conversions the BLT engine can actually do
|
| Signed-off-by: Jonathan Marek <[email protected]>
| Reviewed-by: Christian Gmeiner <[email protected]>
* intel/compiler: Add a flag to avoid compacting push constants | Jason Ekstrand | 2019-11-18 | 1 | -0/+1
| In vec4, we can just not run the pass. In fs, things are a bit more deeply intertwined.
|
| Reviewed-by: Lionel Landwerlin <[email protected]>
* etnaviv: rs: upsampling is not supported | Christian Gmeiner | 2019-11-17 | 1 | -1/+32
| This change makes it possible to support different downsample cases like 4 -> 2 or 4 -> 1.
|
| Signed-off-by: Christian Gmeiner <[email protected]>
| Reviewed-by: Gert Wollny <[email protected]>
* lima: Parse VS and PLBU command stream while making a dump | Andreas Baierl | 2019-11-17 | 7 | -0/+461
| This makes the streams more readable and comparable with the blob's parser as it parses the VS and PLBU stream and shows the currently known values.
|
| Reviewed-by: Qiang Yu <[email protected]>
| Signed-off-by: Andreas Baierl <[email protected]>
* lima: Beautify stream dumps | Andreas Baierl | 2019-11-17 | 1 | -7/+11
| Change the dump so that the output looks more like the output of mali-syscall-tracker [1]. This is a preparation for a more detailed stream analysis.
|
| Reviewed-by: Qiang Yu <[email protected]>
| Signed-off-by: Andreas Baierl <[email protected]>
|
| [1]: https://gitlab.freedesktop.org/lima/mali-syscall-tracker
* android: radeonsi: fix build error due to wrong u_format.csv file path | Mauro Rossi | 2019-11-15 | 1 | -1/+1
| GEN10_FORMAT_TABLE_INPUTS requires correction of the u_format.csv file path in order to avoid the following build error:
|
| ninja: error: 'external/mesa/util/format/u_format.csv', needed by 'out/target/product/x86_64/gen/STATIC_LIBRARIES/libmesa_pipe_radeonsi_intermediates/radeonsi/gfx10_format_table.h', missing and no known rule to make it
|
| Fixes: 882ca6d ("util: Move gallium's PIPE_FORMAT utils to /util/format/")
| Signed-off-by: Mauro Rossi <[email protected]>
* radeonsi/nir: don't lower fma, instead, fuse fma | Marek Olšák | 2019-11-15 | 1 | -1/+1
| We want fma. This decreases compile times by 4% for Borderlands 2.
|
| 48505 shaders in 30515 tests
| Totals:
| SGPRS: 2206584 -> 2204784 (-0.08 %)
| VGPRS: 1647892 -> 1648964 (0.07 %)
| Spilled SGPRs: 6256 -> 6078 (-2.85 %)
| Spilled VGPRs: 72 -> 72 (0.00 %)
| Private memory VGPRs: 2176 -> 2176 (0.00 %)
| Scratch size: 2240 -> 2240 (0.00 %) dwords per thread
| Code Size: 49680804 -> 49837988 (0.32 %) bytes
| LDS: 74 -> 74 (0.00 %) blocks
| Max Waves: 371387 -> 371352 (-0.01 %)
|
| Reviewed-by: Timothy Arceri <[email protected]>
* radeonsi/nir: call nir_lower_flrp only once per shader | Marek Olšák | 2019-11-15 | 1 | -6/+7
| Reviewed-by: Timothy Arceri <[email protected]>
* radeonsi/nir: remove dead function temps | Marek Olšák | 2019-11-15 | 1 | -0/+1
| glxgears has dead temps after lowering color inputs to load intrinsics.
|
| Reviewed-by: Timothy Arceri <[email protected]>
* panfrost: Make sure the shader descriptor is in sync with the GL state | Tomeu Vizoso | 2019-11-15 | 1 | -19/+8
| State was leaking from previous frames as we weren't updating the descriptor in all cases.
|
| Signed-off-by: Tomeu Vizoso <[email protected]>
| Tested-by: Andre Heider <[email protected]>
* panfrost: Multiply offset_units by 2 | Tomeu Vizoso | 2019-11-15 | 1 | -1/+1
| Per the spec, the units passed to glPolygonOffset are to be multiplied by an implementation-defined constant. On Midgard, this constant seems to be 2.
|
| Signed-off-by: Tomeu Vizoso <[email protected]>
| Reviewed-by: Alyssa Rosenzweig <[email protected]>
* llvmpipe: Check thread creation errors | Nathan Kidd | 2019-11-15 | 1 | -0/+4
| In the case of glibc, pthread_t is internally a pointer. If lp_rast_destroy() passes a 0-value pthread_t to pthread_join(), the latter will SEGV dereferencing it.
|
| pthread_create() can fail if either the user's ulimit -u or the Linux kernel's /proc/sys/kernel/threads-max is reached.
|
| Choosing to continue, rather than fail, on the theory that it is better to run with the one main thread than not run at all.
|
| Keeping as many threads as we got, since lack of threads severely degrades llvmpipe performance.
|
| Signed-off-by: Nathan Kidd <[email protected]>
| Reviewed-by: Roland Scheidegger <[email protected]>
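The policy described in this commit message (continue with however many threads could be created, and join only those) can be sketched as follows. This is a minimal illustration, not the llvmpipe code: `worker_pool`, `pool_start`, `pool_destroy`, `worker_main` and `MAX_WORKERS` are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_WORKERS 16 /* hypothetical cap for this sketch */

struct worker_pool {
   pthread_t threads[MAX_WORKERS];
   unsigned num_threads; /* threads that were actually created */
};

/* Placeholder worker body; a real rasterizer thread would loop on work. */
static void *worker_main(void *arg)
{
   (void)arg;
   return NULL;
}

/* Try to start `wanted` threads and keep however many we got. */
static unsigned pool_start(struct worker_pool *pool, unsigned wanted)
{
   assert(pool != NULL);
   if (wanted > MAX_WORKERS)
      wanted = MAX_WORKERS;

   pool->num_threads = 0;
   for (unsigned i = 0; i < wanted; i++) {
      /* pthread_create() returns 0 on success, an error number on failure. */
      if (pthread_create(&pool->threads[i], NULL, worker_main, NULL) != 0)
         break;
      pool->num_threads++;
   }
   return pool->num_threads;
}

/* Join only the threads that exist; unset pthread_t slots are never touched. */
static void pool_destroy(struct worker_pool *pool)
{
   assert(pool != NULL);
   for (unsigned i = 0; i < pool->num_threads; i++)
      pthread_join(pool->threads[i], NULL);
   pool->num_threads = 0;
}
```

Because the count is updated only after a successful pthread_create(), a failure partway through leaves the pool usable with fewer threads, and pool_destroy() never hands an uninitialized pthread_t to pthread_join().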
* iris: Wrap iris_fix_edge_flags in NIR_PASS | Kenneth Graunke | 2019-11-14 | 1 | -1/+10
| So nir_validate happens properly. Unfortunately this means we have to play the metadata song and dance, so walk over all impls and say that we didn't hurt anything.
|
| Reviewed-by: Jason Ekstrand <[email protected]>
* iris: Properly move edgeflag_out from output list to global list | Kenneth Graunke | 2019-11-14 | 1 | -8/+16
| When demoting it from an output to a global, we need to actually move it to the correct list. While here, we also refactor so it's clear we aren't mutating the list while iterating.
|
| Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2106
| Fixes: f9fd04aca15 ("nir: Fix non-determinism in lower_global_vars_to_local")
| Reviewed-by: Jason Ekstrand <[email protected]>
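The "find first, move afterwards" shape mentioned in this message can be illustrated with a plain singly linked list. This is only a sketch of the general idea; it does not use NIR's list helpers, and `var`, `var_list`, `list_push`, `list_remove` and `demote_to_global` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct var {
   struct var *next;
   const char *name;
};

struct var_list {
   struct var *head;
};

/* Insert `v` at the front of `list`. */
static void list_push(struct var_list *list, struct var *v)
{
   assert(list != NULL && v != NULL);
   v->next = list->head;
   list->head = v;
}

/* Unlink `v` from `list`; returns 1 if it was found, 0 otherwise. */
static int list_remove(struct var_list *list, struct var *v)
{
   for (struct var **link = &list->head; *link; link = &(*link)->next) {
      if (*link == v) {
         *link = v->next;
         v->next = NULL;
         return 1; /* stop right away, no further iteration after the edit */
      }
   }
   return 0;
}

/* Move the variable called `name` from `outputs` to `globals`. */
static int demote_to_global(struct var_list *outputs,
                            struct var_list *globals, const char *name)
{
   struct var *found = NULL;

   /* Pass 1: read-only walk, the list is not modified here. */
   for (struct var *v = outputs->head; v; v = v->next) {
      if (strcmp(v->name, name) == 0) {
         found = v;
         break;
      }
   }
   if (!found)
      return 0;

   /* Pass 2: the lists are only changed once the walk is over. */
   list_remove(outputs, found);
   list_push(globals, found);
   return 1;
}
```

Keeping the search and the move in separate steps makes it obvious that no iterator is left pointing at a node that has just changed lists.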
* util: Move gallium's PIPE_FORMAT utils to /util/format/ | Eric Anholt | 2019-11-14 | 191 | -212/+212
| To make PIPE_FORMATs usable from non-gallium parts of Mesa, I want to move their helpers out of gallium. Since u_format used util_copy_rect(), I moved that in there, too.
|
| I've put it in a separate directory in util/ because it's a big chunk of related code, and it's not clear to me whether we might want it as a separate library from libmesa_util at some point.
|
| Closes: #1905
| Acked-by: Marek Olšák <[email protected]>
| Reviewed-by: Kristian H. Kristensen <[email protected]>
| Reviewed-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Remove util/ra support | Alyssa Rosenzweig | 2019-11-13 | 3 | -5/+2
| It's now unused, in favour of LCRA.
|
| Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Add blend shader selection bits for MRT | Alyssa Rosenzweig | 2019-11-13 | 1 | -24/+5
| This is less complicated than previously thought. Note we have no way of specifying the work register count for blend shaders; it must be strictly less than the work register count of the corresponding fragment shader (which is fine since we force the fragment shader to report a count of 16 with a blend shader as a major hack until we get register pressure down for blend shaders).
|
| TODO: pandecode the flags.
|
| Signed-off-by: Alyssa Rosenzweig <[email protected]>
* zink: move drawing to separate source | Erik Faye-Lund | 2019-11-13 | 4 | -296/+312
| This code is kinda stand-alone, and it makes it a bit easier to find the right source in the source-tree.
* zink: move blitting to separate source | Erik Faye-Lund | 2019-11-13 | 4 | -176/+188
| This code is kinda stand-alone, and it makes it a bit easier to find the right source in the source-tree.
* zink: move filter-helper to separate helper-header | Erik Faye-Lund | 2019-11-13 | 2 | -13/+41
| This will help code-reuse a bit in the next commit.
* zink: move format-checking to separate source | Erik Faye-Lund | 2019-11-13 | 4 | -155/+161
| This code is more or less stand-alone, and this keeps the formats array a bit more encapsulated.
* freedreno/ir3: remove first-vertex sysval | Rob Clark | 2019-11-12 | 2 | -6/+0
| This is a driver-param (loaded from uniform), not a sysval (populated by hw into a register). So there is no value in having a sysval slot.
|
| Signed-off-by: Rob Clark <[email protected]>
| Reviewed-by: Kristian H. Kristensen <[email protected]>
| Reviewed-by: Eric Anholt <[email protected]>
* iris: Use mocs from isl_dev. | Rafael Antognolli | 2019-11-12 | 7 | -71/+71
| Reviewed-by: Jordan Justen <[email protected]>
| Acked-by: Lionel Landwerlin <[email protected]>
* freedreno: fix eglDupNativeFenceFD error | Rob Clark | 2019-11-12 | 1 | -4/+10
| We can end up with scenarios where last_fence is associated with a batch that is flushed through some other path before needs_out_fence_fd gets set, resulting in a returned fence that has no backing fd.
|
| The simplest thing is to just skip the optimization that tries to avoid no-op batches when a fence-fd is requested. This should normally be just once a frame anyways.
|
| Signed-off-by: Rob Clark <[email protected]>
| Reviewed-by: Kristian H. Kristensen <[email protected]>
* zink: remove no-longer-needed hack | Erik Faye-Lund | 2019-11-12 | 1 | -10/+0
| It seems whatever was causing this is no longer an issue. So let's get rid of the hack here.
|
| Signed-off-by: Erik Faye-Lund <[email protected]>
* zink: implement buffer-to-buffer copies | Erik Faye-Lund | 2019-11-12 | 1 | -0/+12
|
* zink: always allow transfer to/from buffers | Erik Faye-Lund | 2019-11-12 | 1 | -4/+2
|
* freedreno: add Adreno 640 ID | Jonathan Marek | 2019-11-11 | 2 | -0/+10
| A640 seems to work without any other changes (glmark and vkcube).
|
| Signed-off-by: Jonathan Marek <[email protected]>
| Reviewed-by: Kristian H. Kristensen <[email protected]>
* st/mesa: remove unused TGSI-only debug printing functions | Marek Olšák | 2019-11-11 | 1 | -4/+0
| Reviewed-by: Timothy Arceri <[email protected]>
* panfrost: Select format-specific blending intrinsics | Alyssa Rosenzweig | 2019-11-11 | 3 | -9/+41
| If we have an accelerated path for a particular framebuffer format, let's use it to save a bunch of instructions in a blend shader.
|
| [Tomeu: Only use the faster intrinsic on >T760]
|
| Signed-off-by: Alyssa Rosenzweig <[email protected]>
| Signed-off-by: Tomeu Vizoso <[email protected]>
| Reviewed-by: Tomeu Vizoso <[email protected]>
* panfrost: Set depth and stencil for SFBD based on the format | Tomeu Vizoso | 2019-11-11 | 4 | -21/+36
| Signed-off-by: Tomeu Vizoso <[email protected]>
| Reviewed-by: Alyssa Rosenzweig <[email protected]>
* zink: correct depth-stencil format | Erik Faye-Lund | 2019-11-11 | 1 | -1/+1
| When using packed vulkan-formats on little-endian systems, we need to swap the components for the gallium formats. And since Zink isn't big-endian safe yet, little-endian is the only endianness we care about right now.
|
| This fixes a bunch of piglit tests, among others:
| - spec@arb_depth_texture@depth-level-clamp
| - spec@arb_depth_texture@depthstencil-render-miplevels * d=z24
| - spec@arb_depth_texture@fbo-depth-gl_depth_component24-blit
| - spec@arb_depth_texture@fbo-depth-gl_depth_component24-copypixels
| - spec@arb_depth_texture@fbo-depth-gl_depth_component24-drawpixels
| - spec@arb_depth_texture@fbo-depth-gl_depth_component24-readpixels
|
| Signed-off-by: Erik Faye-Lund <[email protected]>
| Fixes: 8d46e35d16e ("zink: introduce opengl over vulkan")
* zink/spirv: add support for nir_op_flrp | Erik Faye-Lund | 2019-11-11 | 1 | -0/+15
| This fixes the following piglit:
| spec@ati_fragment_shader@ati_fragment_shader-render-fog
|
| Signed-off-by: Erik Faye-Lund <[email protected]>
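For readers unfamiliar with the opcode: nir_op_flrp is NIR's floating-point linear interpolation, computing a * (1 - t) + b * t. The scalar C sketch below only illustrates that arithmetic; the function names are hypothetical and it says nothing about how the commit emits SPIR-V for it.

```c
#include <assert.h>

/* flrp(a, b, t): linear interpolation, yielding a at t = 0 and b at t = 1. */
static float flrp(float a, float b, float t)
{
   return a * (1.0f - t) + b * t;
}

/* A few values that are exact in binary floating point. */
static void flrp_examples(void)
{
   assert(flrp(0.0f, 10.0f, 0.5f) == 5.0f);
   assert(flrp(2.0f, 4.0f, 0.0f) == 2.0f);
   assert(flrp(2.0f, 4.0f, 1.0f) == 4.0f);
}
```

Fog blending is a typical user of this operation: the fog factor plays the role of t when mixing the fragment colour with the fog colour.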
* freedreno/ir3: also track # of nops for shader-db | Rob Clark | 2019-11-09 | 1 | -1/+3
| The instruction count is (mostly) a measure of what optimization passes can do, while # of nops is more an indication of how effectively the scheduler is balancing register pressure vs instruction count. So track these independently.
|
| (There could be opportunities to rematerialize values to reduce register pressure, swapping some nops with other alu instructions, so nothing is truly independent.. but it is still useful to break these stats out.)
|
| Signed-off-by: Rob Clark <[email protected]>
* freedreno/a4xx: fix SP_FS_MRT_REG.HALF_PRECISION | Rob Clark | 2019-11-09 | 1 | -1/+1
| Set flag based on actual output reg type.
|
| Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx: fix SP_FS_MRT_REG.HALF_PRECISION | Rob Clark | 2019-11-09 | 1 | -1/+1
| We should really be setting this based on the actual output register type.
|
| Signed-off-by: Rob Clark <[email protected]>
* radeonsi/nir: fix compute shader crash due to nir_binary == NULL | Marek Olšák | 2019-11-08 | 1 | -2/+12
| This partially reverts 8b30114dda8.
|
| Fixes: 8b30114dda8 "radeonsi/nir: call nir_serialize only once per shader"
* radeonsi/nir: call nir_serialize only once per shader | Marek Olšák | 2019-11-08 | 1 | -21/+21
| We were calling it twice. First serialize it, then use it to compute the cache key.
|
| Reviewed-by: Timothy Arceri <[email protected]>
* virgl: support emulating planar image sampling | David Stevens | 2019-11-08 | 1 | -1/+6
| Mesa emulates planar format sampling with per-plane samplers. Virgl now supports this by allowing the plane index to be passed when creating a sampler view from a planar image. With this change, mesa now passes that information to virgl.
|
| Signed-off-by: David Stevens <[email protected]>
| Reviewed-by: Lepton Wu <[email protected]>
* gallium/swr: Enable some ARB_gpu_shader5 extensions | Krzysztof Raszkowski | 2019-11-08 | 1 | -0/+1
| Enable / add to features.txt:
| - Enhanced textureGather.
| - Geometry shader instancing.
| - Geometry shader multiple streams.
|
| Reviewed-by: Jan Zielinski <[email protected]>
* gallium/swr: Fix GS invocation issues | Krzysztof Raszkowski | 2019-11-08 | 1 | -2/+7
| - Fixed setting of gl_InvocationID.
| - Fixed GS vertices output memory overflow.
|
| Reviewed-by: Jan Zielinski <[email protected]>
* panfrost: Try to evict unused BOs from the cache | Boris Brezillon | 2019-11-08 | 4 | -6/+61
| The panfrost BO cache can only grow since all newly allocated BOs are returned to the cache (unless they've been exported).
|
| With the MADVISE ioctl that's not a big issue because the kernel can come and reclaim this memory, but MADVISE will only be available on 5.4 kernels. This means an app can currently allocate a lot of memory without ever releasing it, leading to some situations where the OOM-killer kicks in and kills the app (or even worse, kills another process consuming more memory than the GL app) to get some of this memory back.
|
| Let's try to limit the amount of BOs we keep in the cache by evicting entries that have not been used for more than one second (if the app stopped allocating BOs of this size, it's unlikely to allocate similar BOs in the near future).
|
| This solution is based on the VC4/V3D implementation.
|
| Signed-off-by: Boris Brezillon <[email protected]>
| Reviewed-by: Alyssa Rosenzweig <[email protected]>
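The eviction rule described here (drop cache entries that have sat unused for more than one second) can be sketched with a list kept in least-recently-used order. This is an illustration under assumptions, not the panfrost or VC4/V3D code: `cached_bo`, `bo_cache`, `cache_put` and `cache_evict_stale` are hypothetical names, timestamps are plain seconds passed in by the caller, and `free()` stands in for releasing the buffer.

```c
#include <assert.h>
#include <stdlib.h>

struct cached_bo {
   struct cached_bo *next; /* towards more recently used entries */
   long last_used;         /* second at which the BO entered the cache */
   size_t size;
};

struct bo_cache {
   struct cached_bo *oldest; /* head of the LRU list */
   struct cached_bo *newest; /* tail of the LRU list */
};

/* Return a BO of `size` bytes to the cache at time `now` (in seconds). */
static int cache_put(struct bo_cache *cache, size_t size, long now)
{
   assert(cache != NULL);
   struct cached_bo *bo = malloc(sizeof(*bo));
   if (!bo)
      return 0;

   bo->next = NULL;
   bo->last_used = now;
   bo->size = size;
   if (cache->newest)
      cache->newest->next = bo;
   else
      cache->oldest = bo;
   cache->newest = bo;
   return 1;
}

/* Free entries unused for more than one second; returns how many were freed. */
static unsigned cache_evict_stale(struct bo_cache *cache, long now)
{
   assert(cache != NULL);
   unsigned evicted = 0;

   /* Entries are ordered by age, so stop at the first one that is fresh. */
   while (cache->oldest && now - cache->oldest->last_used > 1) {
      struct cached_bo *bo = cache->oldest;
      cache->oldest = bo->next;
      if (!cache->oldest)
         cache->newest = NULL;
      free(bo);
      evicted++;
   }
   return evicted;
}
```

Because the list is ordered by age, eviction only inspects the stale prefix and stops at the first fresh entry. A driver would typically take `now` from a monotonic clock such as clock_gettime(CLOCK_MONOTONIC, ...) and run the eviction whenever a BO is returned to the cache.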
* panfrost: Move BO cache related fields to a sub-struct | Boris Brezillon | 2019-11-08 | 3 | -18/+21
| We will soon introduce an LRU list to evict BOs that have been unused for more than 1 second. Let's first move all BO cache fields to a sub-struct to clarify which fields are used by the BO caching logic.
|
| Signed-off-by: Boris Brezillon <[email protected]>
| Reviewed-by: Alyssa Rosenzweig <[email protected]>
* freedreno/a6xx: Turn on tessellation shaders | Kristian H. Kristensen | 2019-11-07 | 1 | -1/+13
| Wow. Very triangle. So shader.
|
| Signed-off-by: Kristian H. Kristensen <[email protected]>
| Acked-by: Eric Anholt <[email protected]>
| Reviewed-by: Rob Clark <[email protected]>
* freedreno/a6xx: Only use merged regs and four quads for VS+FS | Kristian H. Kristensen | 2019-11-07 | 1 | -5/+15
| When other geometry stages are present, we choose two quads and no merged regs.
|
| Acked-by: Eric Anholt <[email protected]>
| Signed-off-by: Kristian H. Kristensen <[email protected]>
| Reviewed-by: Rob Clark <[email protected]>
* freedreno/blitter: Save tessellation state | Kristian H. Kristensen | 2019-11-07 | 1 | -0/+2
| We have tessellation state now.
|
| Signed-off-by: Kristian H. Kristensen <[email protected]>
| Acked-by: Eric Anholt <[email protected]>
| Reviewed-by: Rob Clark <[email protected]>
* freedreno/a6xx: Only set emit.hs/ds when we're drawing patches | Kristian H. Kristensen | 2019-11-07 | 1 | -2/+3
| At least the gallium blitter helper will call us to draw with tessellation shaders set but a non-patch primitive.
|
| Signed-off-by: Kristian H. Kristensen <[email protected]>
| Reviewed-by: Rob Clark <[email protected]>
* freedreno: Use bypass rendering for tessellation | Kristian H. Kristensen | 2019-11-07 | 1 | -0/+8
| It seems like tiling could work in the Adreno architecture, but we've only ever seen bypass rendering with tessellation. For now, let's do that too.
|
| Signed-off-by: Kristian H. Kristensen <[email protected]>
| Acked-by: Eric Anholt <[email protected]>
| Reviewed-by: Rob Clark <[email protected]>
* freedreno/a6xx: Program state for tessellation stages | Kristian H. Kristensen | 2019-11-07 | 3 | -34/+157
| Signed-off-by: Kristian H. Kristensen <[email protected]>
| Acked-by: Eric Anholt <[email protected]>
| Reviewed-by: Rob Clark <[email protected]>