path: root/src/gallium
Commit message | Author | Age | Files | Lines
* Call shmget() with permission 0600 instead of 0777 | Brian Paul | 2019-11-18 | 2 | -2/+4
| A security advisory (TALOS-2019-0857/CVE-2019-5068) found that creating shared memory regions with permission mode 0777 could allow any user to access that memory. Several Mesa drivers use shared-memory XImages to implement back buffers for improved performance.
| This patch changes the shmget() calls to use 0600 (user r/w).
| Tested with legacy Xlib driver and llvmpipe.
| Cc: [email protected]
| Reviewed-by: Kristian H. Kristensen <[email protected]>
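The permission change described above can be sketched as follows. This is a minimal illustration, not Mesa's code: the helper names alloc_private_shm() and shm_mode() are hypothetical, and only the standard System V calls shmget() and shmctl() are assumed.

```c
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Hypothetical helper (not from Mesa): create a private segment that only
 * the owning user can read and write. Returns the segment id, or -1. */
static int alloc_private_shm(size_t size)
{
    /* 0600 = owner read/write; 0777 would let every user attach. */
    return shmget(IPC_PRIVATE, size, IPC_CREAT | 0600);
}

/* Hypothetical helper: return the permission bits of a segment, or -1. */
static int shm_mode(int shmid)
{
    struct shmid_ds ds;

    if (shmctl(shmid, IPC_STAT, &ds) == -1)
        return -1;
    return ds.shm_perm.mode & 0777;
}
```

A caller would attach the segment with shmat() as before; only the mode bits passed to shmget() differ.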
* intel/compiler: Add a flag to avoid compacting push constants | Jason Ekstrand | 2019-11-18 | 1 | -0/+1
| In vec4, we can just not run the pass. In fs, things are a bit more deeply intertwined.
| Reviewed-by: Lionel Landwerlin <[email protected]>
* etnaviv: rs: upsampling is not supported | Christian Gmeiner | 2019-11-17 | 1 | -1/+32
| This change makes it possible to support different downsample cases like 4 -> 2 or 4 -> 1.
| Signed-off-by: Christian Gmeiner <[email protected]>
| Reviewed-by: Gert Wollny <[email protected]>
* lima: Parse VS and PLBU command stream while making a dump | Andreas Baierl | 2019-11-17 | 7 | -0/+461
| This makes the streams more readable and comparable with the blob's parser as it parses the VS and PLBU stream and shows the currently known values.
| Reviewed-by: Qiang Yu <[email protected]>
| Signed-off-by: Andreas Baierl <[email protected]>
* lima: Beautify stream dumps | Andreas Baierl | 2019-11-17 | 1 | -7/+11
| Change the dump so that the output looks more like the output of mali-syscall-tracker [1]. This is a preparation for a more detailed stream analysis.
| Reviewed-by: Qiang Yu <[email protected]>
| Signed-off-by: Andreas Baierl <[email protected]>
| [1]: https://gitlab.freedesktop.org/lima/mali-syscall-tracker
* clover/llvm: fix build after llvm 10 commit 1dfede3122ee | Aaron Watry | 2019-11-15 | 2 | -4/+20
| CodeGenFileType moved from ::llvm::TargetMachine in llvm/Target/TargetMachine.h to ::llvm:: in llvm/Support/CodeGen.h
| Signed-off-by: Aaron Watry <[email protected]>
| Reviewed-by: Jan Vesely <[email protected]>
| Reviewed-by: Francisco Jerez <[email protected]>
* android: radeonsi: fix build error due to wrong u_format.csv file path | Mauro Rossi | 2019-11-15 | 1 | -1/+1
| GEN10_FORMAT_TABLE_INPUTS requires correction of the u_format.csv file path in order to avoid the following build error:
| ninja: error: 'external/mesa/util/format/u_format.csv', needed by 'out/target/product/x86_64/gen/STATIC_LIBRARIES/libmesa_pipe_radeonsi_intermediates/radeonsi/gfx10_format_table.h', missing and no known rule to make it
| Fixes: 882ca6d ("util: Move gallium's PIPE_FORMAT utils to /util/format/")
| Signed-off-by: Mauro Rossi <[email protected]>
* radeonsi/nir: don't lower fma, instead, fuse fma | Marek Olšák | 2019-11-15 | 1 | -1/+1
| We want fma. This decreases compile times by 4% for Borderlands 2.
| 48505 shaders in 30515 tests
| Totals:
| SGPRS: 2206584 -> 2204784 (-0.08 %)
| VGPRS: 1647892 -> 1648964 (0.07 %)
| Spilled SGPRs: 6256 -> 6078 (-2.85 %)
| Spilled VGPRs: 72 -> 72 (0.00 %)
| Private memory VGPRs: 2176 -> 2176 (0.00 %)
| Scratch size: 2240 -> 2240 (0.00 %) dwords per thread
| Code Size: 49680804 -> 49837988 (0.32 %) bytes
| LDS: 74 -> 74 (0.00 %) blocks
| Max Waves: 371387 -> 371352 (-0.01 %)
| Reviewed-by: Timothy Arceri <[email protected]>
* radeonsi/nir: call nir_lower_flrp only once per shader | Marek Olšák | 2019-11-15 | 1 | -6/+7
| Reviewed-by: Timothy Arceri <[email protected]>
* radeonsi/nir: remove dead function temps | Marek Olšák | 2019-11-15 | 1 | -0/+1
| glxgears has dead temps after lowering color inputs to load intrinsics.
| Reviewed-by: Timothy Arceri <[email protected]>
* gallium/noop: call finalize_nir | Marek Olšák | 2019-11-15 | 1 | -0/+3
| For measuring st/mesa compile time.
| Reviewed-by: Timothy Arceri <[email protected]>
* panfrost: Make sure the shader descriptor is in sync with the GL state | Tomeu Vizoso | 2019-11-15 | 1 | -19/+8
| State was leaking from previous frames as we weren't updating the descriptor in all cases.
| Signed-off-by: Tomeu Vizoso <[email protected]>
| Tested-by: Andre Heider <[email protected]>
* panfrost: Multiply offset_units by 2 | Tomeu Vizoso | 2019-11-15 | 1 | -1/+1
| Per the spec, the units passed to glPolygonOffset are to be multiplied by an implementation-defined constant. On Midgard, this constant seems to be 2.
| Signed-off-by: Tomeu Vizoso <[email protected]>
| Reviewed-by: Alyssa Rosenzweig <[email protected]>
* llvmpipe: Check thread creation errors | Nathan Kidd | 2019-11-15 | 1 | -0/+4
| In the case of glibc, pthread_t is internally a pointer. If lp_rast_destroy() passes a 0-value pthread_t to pthread_join(), the latter will SEGV dereferencing it.
| pthread_create() can fail if either the user's ulimit -u or the Linux kernel's /proc/sys/kernel/threads-max is reached.
| Choosing to continue, rather than fail, on the theory that it is better to run with the one main thread than not run at all.
| Keeping as many threads as we got, since lack of threads severely degrades llvmpipe performance.
| Signed-off-by: Nathan Kidd <[email protected]>
| Reviewed-by: Roland Scheidegger <[email protected]>
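The policy described above (stop at the first pthread_create() failure, keep the threads already created, and only join those) can be sketched as below. This is an illustrative sketch, not llvmpipe's rasterizer code: struct worker_pool, MAX_WORKERS, pool_start() and pool_stop() are hypothetical names.

```c
#include <pthread.h>
#include <stddef.h>

#define MAX_WORKERS 8 /* hypothetical upper bound for this sketch */

struct worker_pool {
    pthread_t threads[MAX_WORKERS];
    unsigned num_threads; /* threads that were actually created */
};

/* Trivial thread body; a real worker would wait for work here. */
static void *worker_main(void *arg)
{
    (void)arg;
    return NULL;
}

/* Try to start `wanted` workers and return how many were created.
 * On the first failure we stop and keep the threads we already have. */
static unsigned pool_start(struct worker_pool *pool, unsigned wanted)
{
    if (wanted > MAX_WORKERS)
        wanted = MAX_WORKERS;

    pool->num_threads = 0;
    for (unsigned i = 0; i < wanted; i++) {
        if (pthread_create(&pool->threads[i], NULL, worker_main, NULL) != 0)
            break;
        pool->num_threads++;
    }
    return pool->num_threads;
}

/* Join only the threads that exist, so pthread_join() never sees an
 * uninitialized or zero pthread_t. */
static void pool_stop(struct worker_pool *pool)
{
    for (unsigned i = 0; i < pool->num_threads; i++)
        pthread_join(pool->threads[i], NULL);
    pool->num_threads = 0;
}
```

Recording the created count separately from the requested count is what makes teardown safe when creation fails partway.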
* llvmpipe: use ppc64le/ppc64 Large code model for JIT-compiled shaders | Ben Crocker | 2019-11-14 | 1 | -1/+15
| Large programs, e.g. gnome-shell and firefox, may tax the addressability of the Medium code model once a (potentially unbounded) number of dynamically generated JIT-compiled shader programs are linked in and relocated. Yet the default code model as of LLVM 8 is Medium or even Small.
| The cost of changing from Medium to Large is negligible:
| - an additional 8-byte pointer stored immediately before the shader entrypoint;
| - change an add-immediate (addis) instruction to a load (ld).
| Testing with WebGL Conformance (https://www.khronos.org/registry/webgl/sdk/tests/webgl-conformance-tests.html) yields clean runs with this change (and crashes without it).
| Testing with glxgears shows no detectable performance difference.
| Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1753327, 1753789, 1543572, 1747110, and 1582226
| Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/223
| Co-authored by: Nemanja Ivanovic <[email protected]>, Tom Stellard <[email protected]>
| CC: [email protected]
| Signed-off-by: Ben Crocker <[email protected]>
* iris: Wrap iris_fix_edge_flags in NIR_PASS | Kenneth Graunke | 2019-11-14 | 1 | -1/+10
| So nir_validate happens properly. Unfortunately this means we have to play the metadata song and dance, so walk over all impls and say that we didn't hurt anything.
| Reviewed-by: Jason Ekstrand <[email protected]>
* iris: Properly move edgeflag_out from output list to global list | Kenneth Graunke | 2019-11-14 | 1 | -8/+16
| When demoting it from an output to a global, we need to actually move it to the correct list.
| While here, we also refactor so it's clear we aren't mutating the list while iterating.
| Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2106
| Fixes: f9fd04aca15 ("nir: Fix non-determinism in lower_global_vars_to_local")
| Reviewed-by: Jason Ekstrand <[email protected]>
* util: Move gallium's PIPE_FORMAT utils to /util/format/ | Eric Anholt | 2019-11-14 | 306 | -11570/+305
| To make PIPE_FORMATs usable from non-gallium parts of Mesa, I want to move their helpers out of gallium. Since u_format used util_copy_rect(), I moved that in there, too.
| I've put it in a separate directory in util/ because it's a big chunk of related code, and it's not clear to me whether we might want it as a separate library from libmesa_util at some point.
| Closes: #1905
| Acked-by: Marek Olšák <[email protected]>
| Reviewed-by: Kristian H. Kristensen <[email protected]>
| Reviewed-by: Alyssa Rosenzweig <[email protected]>
* Revert "st/dri: assume external consumers of back buffers can write to the buffers" | Tapani Pälli | 2019-11-14 | 1 | -6/+6
| This reverts commit 1d1b4578211dcc69cfab8879d0cdafaba1eec948.
| This series caused unexpected flickering artifacts with the Iris driver on Chrome OS, and the EGL_EXT_image_flush_external spec has not been published yet.
| Acked-by: Eric Engestrom <[email protected]>
| Acked-by: Kristian H. Kristensen <[email protected]>
* Revert "st/dri: add support for EGL_EXT_image_flush_external" | Tapani Pälli | 2019-11-14 | 1 | -91/+40
| This reverts commit 1d122c104a7a3d9348ab347e1e843b7e2bf3b498.
| This series caused unexpected flickering artifacts with the Iris driver on Chrome OS, and the EGL_EXT_image_flush_external spec has not been published yet.
| Acked-by: Eric Engestrom <[email protected]>
| Acked-by: Kristian H. Kristensen <[email protected]>
* pan/midgard: Remove util/ra support | Alyssa Rosenzweig | 2019-11-13 | 3 | -5/+2
| It's now unused, in favour of LCRA.
| Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Add blend shader selection bits for MRT | Alyssa Rosenzweig | 2019-11-13 | 1 | -24/+5
| This is less complicated than previously thought. Note we have no way of specifying the work register count for blend shaders; it must be strictly less than the work register count of the corresponding fragment shader (which is fine since we force the fragment shader to report a count of 16 with a blend shader as a major hack until we get register pressure down for blend shaders).
| TODO: pandecode the flags.
| Signed-off-by: Alyssa Rosenzweig <[email protected]>
* zink: move drawing separate source | Erik Faye-Lund | 2019-11-13 | 4 | -296/+312
| This code is kinda stand-alone, and it makes it a bit easier to find the right source in the source-tree.
* zink: move blitting to separate source | Erik Faye-Lund | 2019-11-13 | 4 | -176/+188
| This code is kinda stand-alone, and it makes it a bit easier to find the right source in the source-tree.
* zink: move filter-helper to separate helper-header | Erik Faye-Lund | 2019-11-13 | 2 | -13/+41
| This will help code-reuse a bit in the next commit.
* zink: move format-checking to separate source | Erik Faye-Lund | 2019-11-13 | 4 | -155/+161
| This code is more or less stand-alone, and this keeps the formats array a bit more encapsulated.
* freedreno/ir3: remove first-vertex sysval | Rob Clark | 2019-11-12 | 2 | -6/+0
| This is a driver-param (loaded from uniform), not a sysval (populated by hw into a register). So there is no value in having a sysval slot.
| Signed-off-by: Rob Clark <[email protected]>
| Reviewed-by: Kristian H. Kristensen <[email protected]>
| Reviewed-by: Eric Anholt <[email protected]>
* iris: Use mocs from isl_dev. | Rafael Antognolli | 2019-11-12 | 7 | -71/+71
| Reviewed-by: Jordan Justen <[email protected]>
| Acked-by: Lionel Landwerlin <[email protected]>
* freedreno: fix eglDupNativeFenceFD error | Rob Clark | 2019-11-12 | 1 | -4/+10
| We can end up with scenarios where last_fence is associated with a batch that is flushed through some other path before needs_out_fence_fd gets set, resulting in returning a fence that has no backing fd.
| The simplest thing is to just skip the optimization to try and avoid no-op batches when a fence-fd is requested. This should normally be just once a frame anyways.
| Signed-off-by: Rob Clark <[email protected]>
| Reviewed-by: Kristian H. Kristensen <[email protected]>
* zink: remove no-longer-needed hack | Erik Faye-Lund | 2019-11-12 | 1 | -10/+0
| It seems whatever was causing this is no longer an issue. So let's get rid of the hack here.
| Signed-off-by: Erik Faye-Lund <[email protected]>
* zink: implement buffer-to-buffer copies | Erik Faye-Lund | 2019-11-12 | 1 | -0/+12
* zink: always allow transfer to/from buffers | Erik Faye-Lund | 2019-11-12 | 1 | -4/+2
* freedreno: add Adreno 640 ID | Jonathan Marek | 2019-11-11 | 2 | -0/+10
| A640 seems to work without any other changes (glmark and vkcube).
| Signed-off-by: Jonathan Marek <[email protected]>
| Reviewed-by: Kristian H. Kristensen <[email protected]>
* st/mesa: remove unused TGSI-only debug printing functions | Marek Olšák | 2019-11-11 | 1 | -4/+0
| Reviewed-by: Timothy Arceri <[email protected]>
* st/mesa: fix Sanctuary and Tropics by disabling ARB_gpu_shader5 for them | Marek Olšák | 2019-11-11 | 3 | -0/+4
| They use the "sample" keyword as a variable name.
| Cc: 19.2 19.3 <[email protected]>
| Reviewed-by: Timothy Arceri <[email protected]>
* panfrost: Select format-specific blending intrinsics | Alyssa Rosenzweig | 2019-11-11 | 3 | -9/+41
| If we have an accelerated path for a particular framebuffer format, let's use it to save a bunch of instructions in a blend shader.
| [Tomeu: Only use the faster intrinsic on >T760]
| Signed-off-by: Alyssa Rosenzweig <[email protected]>
| Signed-off-by: Tomeu Vizoso <[email protected]>
| Reviewed-by: Tomeu Vizoso <[email protected]>
* panfrost: Set depth and stencil for SFBD based on the format | Tomeu Vizoso | 2019-11-11 | 4 | -21/+36
| Signed-off-by: Tomeu Vizoso <[email protected]>
| Reviewed-by: Alyssa Rosenzweig <[email protected]>
* zink: correct depth-stencil format | Erik Faye-Lund | 2019-11-11 | 1 | -1/+1
| When using packed vulkan-formats on little-endian systems, we need to swap the components for the gallium formats. And since Zink isn't big-endian safe yet, little-endian is the only endianness we care about right now.
| This fixes a bunch of piglit tests, among others:
| - spec@arb_depth_texture@depth-level-clamp
| - spec@arb_depth_texture@depthstencil-render-miplevels * d=z24
| - spec@arb_depth_texture@fbo-depth-gl_depth_component24-blit
| - spec@arb_depth_texture@fbo-depth-gl_depth_component24-copypixels
| - spec@arb_depth_texture@fbo-depth-gl_depth_component24-drawpixels
| - spec@arb_depth_texture@fbo-depth-gl_depth_component24-readpixels
| Signed-off-by: Erik Faye-Lund <[email protected]>
| Fixes: 8d46e35d16e ("zink: introduce opengl over vulkan")
* zink/spirv: add support for nir_op_flrp | Erik Faye-Lund | 2019-11-11 | 1 | -0/+15
| This fixes the following piglit:
| spec@ati_fragment_shader@ati_fragment_shader-render-fog
| Signed-off-by: Erik Faye-Lund <[email protected]>
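For readers unfamiliar with the opcode named above: NIR's flrp is a floating-point linear interpolation, defined as src0 * (1 - src2) + src1 * src2. The scalar sketch below only shows that arithmetic; it says nothing about which SPIR-V instructions the commit emits, and the function name flrp() is chosen here purely for illustration.

```c
/* Scalar reference for linear interpolation as NIR defines flrp:
 * the result moves from a (t = 0) to b (t = 1). */
static float flrp(float a, float b, float t)
{
    return a * (1.0f - t) + b * t;
}
```

For example, flrp(2.0f, 6.0f, 0.5f) evaluates to 4.0f, the midpoint of the two inputs.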
* freedreno/ir3: also track # of nops for shader-db | Rob Clark | 2019-11-09 | 1 | -1/+3
| The instruction count is (mostly) a measure of what optimization passes can do, while # of nops is more an indication of how effectively the scheduler is balancing register pressure vs instruction count. So track these independently.
| (There could be opportunities to rematerialize values to reduce register pressure, swapping some nops with other alu instructions, so nothing is truly independent, but it is still useful to break these stats out.)
| Signed-off-by: Rob Clark <[email protected]>
* freedreno/a4xx: fix SP_FS_MRT_REG.HALF_PRECISION | Rob Clark | 2019-11-09 | 1 | -1/+1
| Set flag based on actual output reg type.
| Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx: fix SP_FS_MRT_REG.HALF_PRECISION | Rob Clark | 2019-11-09 | 1 | -1/+1
| We should really be setting this based on the actual output register type.
| Signed-off-by: Rob Clark <[email protected]>
* radeonsi/nir: fix compute shader crash due to nir_binary == NULL | Marek Olšák | 2019-11-08 | 1 | -2/+12
| This partially reverts 8b30114dda8.
| Fixes: 8b30114dda8 "radeonsi/nir: call nir_serialize only once per shader"
* radeonsi/nir: call nir_serialize only once per shader | Marek Olšák | 2019-11-08 | 1 | -21/+21
| We were calling it twice. First serialize it, then use it to compute the cache key.
| Reviewed-by: Timothy Arceri <[email protected]>
* u_format: Fix swizzle of A1R5G5B5. | Eric Anholt | 2019-11-08 | 1 | -1/+1
| Found once I started using the generated unpack code from the Mesa side.
| Fixes: 4bbaac3782ad ("gallium: Add some more channel orderings of packed formats.")
| Reviewed-by: Erik Faye-Lund <[email protected]>
* virgl: support emulating planar image sampling | David Stevens | 2019-11-08 | 1 | -1/+6
| Mesa emulates planar format sampling with per-plane samplers. Virgl now supports this by allowing the plane index to be passed when creating a sampler view from a planar image. With this change, mesa now passes that information to virgl.
| Signed-off-by: David Stevens <[email protected]>
| Reviewed-by: Lepton Wu <[email protected]>
* gallium/swr: Enable some ARB_gpu_shader5 extensions | Krzysztof Raszkowski | 2019-11-08 | 1 | -0/+1
| Enable / add to features.txt:
| - Enhanced textureGather.
| - Geometry shader instancing.
| - Geometry shader multiple streams.
| Reviewed-by: Jan Zielinski <[email protected]>
* gallium/swr: Fix GS invocation issues | Krzysztof Raszkowski | 2019-11-08 | 1 | -2/+7
| - Fixed setting of gl_InvocationID.
| - Fixed GS vertices output memory overflow.
| Reviewed-by: Jan Zielinski <[email protected]>
* panfrost: Try to evict unused BOs from the cache | Boris Brezillon | 2019-11-08 | 4 | -6/+61
| The panfrost BO cache can only grow since all newly allocated BOs are returned to the cache (unless they've been exported).
| With the MADVISE ioctl that's not a big issue because the kernel can come and reclaim this memory, but MADVISE will only be available on 5.4 kernels. This means an app can currently allocate a lot of memory without ever releasing it, leading to some situations where the OOM-killer kicks in and kills the app (or even worse, kills another process consuming more memory than the GL app) to get some of this memory back.
| Let's try to limit the amount of BOs we keep in the cache by evicting entries that have not been used for more than one second (if the app stopped allocating BOs of this size, it's unlikely to allocate similar BOs in the near future).
| This solution is based on the VC4/V3D implementation.
| Signed-off-by: Boris Brezillon <[email protected]>
| Reviewed-by: Alyssa Rosenzweig <[email protected]>
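The eviction rule described above (drop cache entries that have gone unused for more than one second) can be sketched with a list kept in least-recently-used order. This is a simplified illustration, not the panfrost or VC4/V3D code: struct cached_bo, struct bo_cache, cache_put() and cache_evict_stale() are hypothetical names, timestamps are whole seconds passed in by the caller, and free() stands in for releasing the real buffer object.

```c
#include <stdlib.h>
#include <time.h>

/* Hypothetical cache entry; a real BO would also carry a handle and size. */
struct cached_bo {
    struct cached_bo *next;
    time_t last_used; /* time the BO was returned to the cache */
};

/* Singly linked list ordered from least to most recently used. */
struct bo_cache {
    struct cached_bo *head;
    struct cached_bo *tail;
};

/* Return a BO to the cache, stamping it with the current time. */
static void cache_put(struct bo_cache *cache, struct cached_bo *bo, time_t now)
{
    bo->last_used = now;
    bo->next = NULL;
    if (cache->tail)
        cache->tail->next = bo;
    else
        cache->head = bo;
    cache->tail = bo;
}

/* Free every entry unused for more than one second. Because the list is
 * in LRU order, we can stop at the first entry that is still fresh.
 * Returns the number of entries evicted. */
static unsigned cache_evict_stale(struct bo_cache *cache, time_t now)
{
    unsigned evicted = 0;

    while (cache->head && now - cache->head->last_used > 1) {
        struct cached_bo *bo = cache->head;

        cache->head = bo->next;
        if (!cache->head)
            cache->tail = NULL;
        free(bo); /* stand-in for releasing the real buffer */
        evicted++;
    }
    return evicted;
}
```

Calling the eviction routine whenever a BO is put back keeps the cache bounded without needing a background thread, at the cost of only reclaiming memory while the application is still active.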
* panfrost: Move BO cache related fields to a sub-struct | Boris Brezillon | 2019-11-08 | 3 | -18/+21
| We will soon introduce an LRU list to evict BOs that have been unused for more than 1 second. Let's first move all BO cache fields to a sub-struct to clarify which fields are used by the BO caching logic.
| Signed-off-by: Boris Brezillon <[email protected]>
| Reviewed-by: Alyssa Rosenzweig <[email protected]>