| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A security advisory (TALOS-2019-0857/CVE-2019-5068) found that
creating shared memory regions with permission mode 0777 could allow
any user to access that memory. Several Mesa drivers use shared-
memory XImages to implement back buffers for improved performance.
This path changes the shmget() calls to use 0600 (user r/w).
Tested with legacy Xlib driver and llvmpipe.
Cc: [email protected]
Reviewed-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
| |
In vec4, we can just not run the pass. In fs, things are a bit more
deeply intertwined.
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
| |
This change makes it possible to support different downsample cases
like 4 -> 2 or 4 -> 1.
Signed-off-by: Christian Gmeiner <[email protected]>
Reviewed-by: Gert Wollny <[email protected]>
|
|
|
|
|
|
|
|
| |
This makes the streams more readable and comparable with the blob's parser
as it parses the VS and PLBU stream and shows the currently known values.
Reviewed-by: Qiang Yu <[email protected]>
Signed-off-by: Andreas Baierl <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Change the dump, that the output looks more like the output of
mali-syscall-tracker [1].
This is a preparation for a more detailed stream analysis.
Reviewed-by: Qiang Yu <[email protected]>
Signed-off-by: Andreas Baierl <[email protected]>
[1]: https://gitlab.freedesktop.org/lima/mali-syscall-tracker
|
|
|
|
|
|
|
|
|
| |
CodeGenFileType moved from ::llvm::TargetMachine in
llvm/Target/TargetMachine.h to ::llvm:: in llvm/Support/CodeGen.h
Signed-off-by: Aaron Watry <[email protected]>
Reviewed-by: Jan Vesely <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
GEN10_FORMAT_TABLE_INPUTS requires correction of u_format.csv file path
in order to avoid following build error:
ninja: error: 'external/mesa/util/format/u_format.csv',
needed by 'out/target/product/x86_64/gen/STATIC_LIBRARIES/libmesa_pipe_radeonsi_intermediates/radeonsi/gfx10_format_table.h',
missing and no known rule to make it
Fixes: 882ca6d ("util: Move gallium's PIPE_FORMAT utils to /util/format/")
Signed-off-by: Mauro Rossi <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We want fma. This decreases compile times by 4% for Borderlands 2.
48505 shaders in 30515 tests
Totals:
SGPRS: 2206584 -> 2204784 (-0.08 %)
VGPRS: 1647892 -> 1648964 (0.07 %)
Spilled SGPRs: 6256 -> 6078 (-2.85 %)
Spilled VGPRs: 72 -> 72 (0.00 %)
Private memory VGPRs: 2176 -> 2176 (0.00 %)
Scratch size: 2240 -> 2240 (0.00 %) dwords per thread
Code Size: 49680804 -> 49837988 (0.32 %) bytes
LDS: 74 -> 74 (0.00 %) blocks
Max Waves: 371387 -> 371352 (-0.01 %)
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
| |
glxgears has dead temps after lowering color inputs to load intrinsics.
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
| |
For measuring st/mesa compile time.
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
| |
State was leaking from previous frames as we weren't updating the
descriptor in all cases.
Signed-off-by: Tomeu Vizoso <[email protected]>
Tested-by: Andre Heider <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Per the spec, the units passed to glPolygonOffset are to be multiplied
by an implementation-defined constant.
On Midgard, this constant seems to be 2.
Signed-off-by: Tomeu Vizoso <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In the case of glibc, pthread_t is internally a pointer. If
lp_rast_destroy() passes a 0-value pthread_t to pthread_join(), the
latter will SEGV dereferencing it.
pthread_create() can fail if either the user's ulimit -u or Linux
kernel's /proc/sys/kernel/threads-max is reached.
Choosing to continue, rather than fail, on theory that it is better to
run with the one main thread, than not run at all.
Keeping as many threads as we got, since lack of threads severely
degrades llvmpipe performance.
Signed-off-by: Nathan Kidd <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Large programs, e.g. gnome-shell and firefox, may tax the
addressability of the Medium code model once a (potentially unbounded)
number of dynamically generated JIT-compiled shader programs are
linked in and relocated. Yet the default code model as of LLVM 8 is
Medium or even Small.
The cost of changing from Medium to Large is negligible:
- an additional 8-byte pointer stored immediately before the shader entrypoint;
- change an add-immediate (addis) instruction to a load (ld).
Testing with WebGL Conformance
(https://www.khronos.org/registry/webgl/sdk/tests/webgl-conformance-tests.html)
yields clean runs with this change (and crashes without it).
Testing with glxgears shows no detectable performance difference.
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1753327, 1753789, 1543572, 1747110, and 1582226
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/223
Co-authored by: Nemanja Ivanovic <[email protected]>, Tom Stellard <[email protected]>
CC: [email protected]
Signed-off-by: Ben Crocker <[email protected]>
|
|
|
|
|
|
|
|
| |
So nir_validate happens properly. Unfortunately this means we have
to play the metadata song and dance, so walk over all impls and say
that we didn't hurt anything.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
When demoting it from an output to a global, we need to actually move
it to the correct list. While here, we also refactor so it's clear
we aren't mutating the list while iterating.
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2106
Fixes: f9fd04aca15 ("nir: Fix non-determinism in lower_global_vars_to_local")
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
To make PIPE_FORMATs usable from non-gallium parts of Mesa, I want to
move their helpers out of gallium. Since u_format used
util_copy_rect(), I moved that in there, too.
I've put it in a separate directory in util/ because it's a big chunk
of related code, and it's not clear to me whether we might want it as
a separate library from libmesa_util at some point.
Closes: #1905
Acked-by: Marek Olšák <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
buffers"
This reverts commit 1d1b4578211dcc69cfab8879d0cdafaba1eec948.
This series caused unexpected flickering artifacts with Iris driver on
Chrome OS and EGL_EXT_image_flush_external spec has not been published
yet.
Acked-by: Eric Engestrom <[email protected]>
Acked-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 1d122c104a7a3d9348ab347e1e843b7e2bf3b498.
This series caused unexpected flickering artifacts with Iris driver on
Chrome OS and EGL_EXT_image_flush_external spec has not been published
yet.
Acked-by: Eric Engestrom <[email protected]>
Acked-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
| |
It's now unused, in favour of LCRA.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is less complicated than previously thought. Note we have no way of
specifying the work register count for blend shaders; it must be
strictly less than the work register count of the corresponding fragment
shader (which is fine since we force the fragment shader to report a
count of 16 with a blend shader as a major hack until we get register
pressure down for blend shaders).
TODO: pandecode the flags.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
| |
This code is kinda stand-alone, and it makes it a bit easier to find the
right source in the source-tree.
|
|
|
|
|
| |
This code is kinda stand-alone, and it makes it a bit easier to find the
right source in the source-tree
|
|
|
|
| |
This will help code-reuse a bit in the next commit.
|
|
|
|
|
| |
This code is more or less stand-alone, and this keeps the formats array
a bit more encapsulated.
|
|
|
|
|
|
|
|
|
| |
This is a driver-param (loaded from uniform), not a sysval (populated by
hw into a register). So it has no value to having a sysval slot.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jordan Justen <[email protected]>
Acked-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We can end up with scenarios where last_fence is associated with a batch
that is flushed through some other path before needs_out_fence_fd gets
set. Resulting in returning a fence that has no backing fd.
The simplest thing is to just skip the optimization to try and avoid
no-op batches when a fence-fd is requested. This should normally be
just once a frame anyways.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
| |
It seems whatever was causing this is no longer an issue. So let's get
rid of the hack here.
Signed-off-by: Erik Faye-Lund <[email protected]>
|
| |
|
| |
|
|
|
|
|
|
|
| |
A640 seems to work without any other changes (glmark and vkcube).
Signed-off-by: Jonathan Marek <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
| |
They use the "sample" keyword as a variable name.
Cc: 19.2 19.3 <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
If we have an accelerated path for a particular framebuffer format,
let's use it to save a bunch of instructions in a blend shader.
[Tomeu: Only use the faster intrinsic on >T760]
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Signed-off-by: Tomeu Vizoso <[email protected]>
Reviewed-by: Tomeu Vizoso <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Tomeu Vizoso <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When using packed vulkan-formats on little-endian systems, we need to
swap the components for the gallium formats. And since Zink isn't
big-endian safe yet, little-endian is the only endianess we care about
right now.
This fixes a bunch of piglit tests, amongs others:
- spec@arb_depth_texture@depth-level-clamp
- spec@arb_depth_texture@depthstencil-render-miplevels * d=z24
- spec@arb_depth_texture@fbo-depth-gl_depth_component24-blit
- spec@arb_depth_texture@fbo-depth-gl_depth_component24-copypixels
- spec@arb_depth_texture@fbo-depth-gl_depth_component24-drawpixels
- spec@arb_depth_texture@fbo-depth-gl_depth_component24-readpixels
Signed-off-by: Erik Faye-Lund <[email protected]>
Fixes: 8d46e35d16e ("zink: introduce opengl over vulkan")
|
|
|
|
|
|
|
|
| |
This fixes the following piglit:
spec@ati_fragment_shader@ati_fragment_shader-render-fog
Signed-off-by: Erik Faye-Lund <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The instruction count is (mostly) a measure of what optimization passes
can do, while # of nops is more an indication of how effectively the
scheduler is balancing register pressure vs instruction count. So track
these independently.
(There could be opportunities to rematerialize values to reduce register
pressure, swapping some nop's with other alu instructions, so nothing is
truely independent.. but it is still useful to break these stats out.)
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
Set flag based on actual output reg type.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
We should really be setting this based on the actual output register
type.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
This partially reverts 8b30114dda8.
Fixes: 8b30114dda8 "radeonsi/nir: call nir_serialize only once per shader"
|
|
|
|
|
|
|
|
| |
We were calling it twice.
First serialize it, then use it to compute the cache key.
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
| |
Found once I started using the generated unpack code from the Mesa side.
Fixes: 4bbaac3782ad ("gallium: Add some more channel orderings of packed formats.")
Reviewed-by: Erik Faye-Lund <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Mesa emulates planar format sampling with per-plane samplers. Virgl now
supports this by allowing the plane index to be passed when creating a
sampler view from a planar image. With this change, mesa now passes that
information to virgl.
Signed-off-by: David Stevens <[email protected]>
Reviewed-by: Lepton Wu <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Enable / add to features.txt:
- Enhanced textureGather.
- Geometry shader instancing.
- Geometry shader multiple streams.
Reviewed-by: Jan Zielinski <[email protected]>
|
|
|
|
|
|
|
| |
- Fixed proper setting gl_InvocationID.
- Fixed GS vertices output memory overflow.
Reviewed-by: Jan Zielinski <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The panfrost BO cache can only grow since all newly allocated BOs are
returned to the cache (unless they've been exported).
With the MADVISE ioctl that's not a big issue because the kernel can
come and reclaim this memory, but MADVISE will only be available on 5.4
kernels. This means an app can currently allocate a lot memory without
ever releasing it, leading to some situations where the OOM-killer kicks
in and kills the app (or even worse, kills another process consuming
more memory than the GL app) to get some of this memory back.
Let's try to limit the amount of BOs we keep in the cache by evicting
entries that have not been used for more than one second (if the app
stopped allocating BOs of this size, it's likely to not allocate
similar BOs in a near future).
This solution is based on the VC4/V3D implementation.
Signed-off-by: Boris Brezillon <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We will soon introduce an LRU list to evict BOs that have been unused
for more than 1 second. Let's first move all BO cache fields to a
sub-struct to clarify which fields are used by the BO caching logic.
Signed-off-by: Boris Brezillon <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
|