| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
Reviewed-by: Karol Herbst <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Now that nir_const_value is a scalar, there's no reason why we need
multiple paths here and it's just extra paths to keep working. While
we're here, we also add a vtn_fail_if check that component indices are
in-bounds.
Reviewed-by: Karol Herbst <[email protected]>
|
|
|
|
| |
Reviewed-by: Karol Herbst <[email protected]>
|
|
|
|
| |
Reviewed-by: Karol Herbst <[email protected]>
|
|
|
|
| |
Reviewed-by: Karol Herbst <[email protected]>
|
|
|
|
|
|
|
| |
It only accepts 32-bit integers so it should have a more descriptive
name. This patch should not be a functional change.
Reviewed-by: Karol Herbst <[email protected]>
|
|
|
|
|
|
|
|
|
| |
All of the callers for this function are looking at interpolation
qualifiers and want to make sure they're declared flat. Any 64-bit
integer inputs need to be flat. It's also makes the function make more
sense since "integer" is fairly generic.
Reviewed-by: Karol Herbst <[email protected]>
|
|
|
|
|
|
|
| |
All of the callers of this function really just want to know if the type
is an integer and don't care about bit size.
Reviewed-by: Karol Herbst <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Even if only variables access flags are changed, the existing NIR
infrastructure expects metadata to be explicitly preserved, so do
that. Don't care about avoiding preserve to be called twice since the
cost is negligible.
This scenario can be triggered by dead variables, and also by other
intrinsics that read the variables -- but not cause progress to be
made when processing the intrinsics.
Fixes: f2d0e48ddc7 "glsl/nir: Add optimization pass for access flags"
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Unwrap any array in the variable type so we can get the sampler dim.
This fixes piglit test
spec/arb_arrays_of_arrays/execution/image_store/basic-imageStore-const-uniform-index.shader_test.
Fixes: f2d0e48ddc7 "glsl/nir: Add optimization pass for access flags"
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Rather than checking __GLIBC__/__UCLIBC__ macros as a proxy for
execinfo.h presence, just check directly. This allows the build to work
on musl.
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The disk cache code tries to allocate a 256 Kbyte buffer on the stack.
Since musl only gives 80 Kbyte of stack space per thread, this causes a
trap.
See https://wiki.musl-libc.org/functional-differences-from-glibc.html#Thread-stack-size
(In musl-1.1.21 the default stack size has increased to 128K)
[mattst88]: Original author unknown, but I think this is small enough
that it is not copyrightable.
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
| |
%lu is for unsigned long, %zu is for size_t. Just cast the data.
|
|
|
|
|
|
| |
We don't execute any of the commands to record snapshots, so we can't
actually produce a real result. We do however need to avoid waiting
on a syncpt which will never be signalled. So, just return 0.
|
|
|
|
|
|
|
| |
Support a new virgl bind type for shared buffers.
Signed-off-by: David Riley <[email protected]>
Reviewed-By: Gert Wollny <[email protected]>
|
|
|
|
|
|
|
| |
Using the existing VK_EXT_debug_report extension.
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This brings the nir path in line with the TGSI path.
Totals from affected shaders:
SGPRS: 2984 -> 2984 (0.00 %)
VGPRS: 2792 -> 2652 (-5.01 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 247380 -> 248072 (0.28 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 121 -> 132 (9.09 %)
Wait states: 0 -> 0 (0.00 %)
Most of the change came from DiRT: Showdown, and came from sinking SSBO
loads.
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
| |
No changes with radeonsi shader-db.
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
| |
This is simple now, but we're going to be adding a few more conditions
to this later.
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
| |
Nothing uses its results yet, that will come with the following commits.
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Right now, this just deduces when we can arbitrarily reorder SSBO and
image loads, matching the existing logic in radeonsi's TGSI->LLVM pass.
This approach can't handle some things that nir_opt_copy_prop_vars can,
but it can handle images, and with GCM it lets us hoist reads outside of
loops. We can also pass this information to LLVM which lets it do its
own optimizations on it.
This is GLSL only as I haven't tested it on Vulkan yet, and it would
probably need a few changes to work there.
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The spec explicitly says that volatile writes can't be removed and
volatile reads do not guarantee that the same value will still be around
after the read, as if there were a barrier after each read/write. Just
ignore them.
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
We were completely ignoring these before, except for putting them on
variables. While we're here, don't set access qualifiers when converting
to bindless since glsl_to_nir will already have set a more accurate
qualifier that includes any qualifiers on struct members that are
dereferenced.
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
| |
In the next commit, we'll properly handle access qualifiers on struct
members by propagating them to load/store instructions, but these
instructions had no way to specify the qualifier.
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
inaccessiblememonly means that it doesn't modify memory accesible via
normal LLVM pointers. This lets LLVM's dead store elimination, memcpy
forwarding, etc. ignore functions with this attribute. We don't
represent descriptors as pointers, so this property is always true of
buffer and image stores. There are plans to represent descriptors via
pointers, but this just means that now nothing is inaccessiblememonly,
as LLVM will then understand loads/stores via its usual alias analysis.
Radeonsi was mistakenly only setting it if the driver could prove that
there were no reads, and then it was cargo-culted into ac_llvm_build
and ac_llvm_to_nir. Rip it out of everything.
statistics with nir enabled:
Totals from affected shaders:
SGPRS: 152 -> 152 (0.00 %)
VGPRS: 128 -> 132 (3.12 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 9324 -> 9244 (-0.86 %) bytes
LDS: 2 -> 2 (0.00 %) blocks
Max Waves: 17 -> 17 (0.00 %)
Wait states: 0 -> 0 (0.00 %)
The only difference was a manhattan31 shader.
Acked-by: Timothy Arceri <[email protected]>
Acked-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
close() is in <unistd.h>
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
| |
This fixes new CTS dEQP-VK.pipeline.depth_range_unrestricted.*.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This allows us to disable the FMASK decompress pass when
transitioning from CB writes to shader reads.
This will likely be improved and enabled by default in the future.
No CTS regressions on GFX8 but a few number of multisample CTS
failures on GFX9 (they look related to the small hint).
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
Found while working on DCC for MSAA.
Fixes: 6b976024a87 ("radv: add support for FMASK expand")
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We have some serious leaks, so plug some and also move to ralloc to
limit the lifetime of some objects to that of their parent.
Lots more such work to do.
For some reason, this fixes:
dEQP-GLES2.functional.lifetime.attach.deleted_output.texture_framebuffer
Signed-off-by: Tomeu Vizoso <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Do not offer a hardware drm backed egl device if no render node
is available. The current implementation will fail on this
egl device. On top it issues a warning that is actually missleading.
There are finally more error paths that can fail on the way to a
hardware backed egl device. Fixing all of them would kind of require
opening the drm device and see if there is a usable driver associated
with the device. The taken approach avoids a full probe and fixes at
least this kind of problem on kvm virtualization hosts I observe here.
Fixes: dbb4457d985 ("egl: add EGL_EXT_device_drm support")
Reviewed-by: Emil Velikov <[email protected]>
Signed-off-by: Mathias Fröhlich <[email protected]>
|
|
|
|
|
|
|
| |
Passes spec@amd_seamless_cubemap_per_texture@amd_seamless_cubemap_per_texture
Signed-off-by: Christian Gmeiner <[email protected]>
Reviewed-By: Guido Günther <[email protected]>
|
|
|
|
|
|
| |
Update to etna_viv commit a3bf0da.
Signed-off-by: Christian Gmeiner <[email protected]>
|
|
|
|
|
|
| |
Pointed out by coverity
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
This is pointless in that we won't ever hit those paths in real life,
but coverity complains.
Fixes: f014ae3c7cce ("nouveau: add support for nir")
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Reviewed-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
GL_MAP_INVALIDATE_BUFFER_BIT cannot be treated as
GL_MAP_INVALIDATE_RANGE_BIT naively. When we run into
ptr = glMapBufferRange(buf, 0, size,
GL_WRITE_BIT|GL_MAP_INVALIDATE_BUFFER_BIT);
memcpy(ptr, data1, size);
glUnmapBuffer(buf);
ptr = glMapBufferRange(buf, size, size,
GL_WRITE_BIT|GL_MAP_UNSYNCHRONIZED_BIT);
memcpy(ptr, data2, size);
glUnmapBuffer(buf);
we never want data1 to be copy_transfer'ed. Because that would mean
that data2 might overwrite valid data.
Signed-off-by: Chia-I Wu <[email protected]>
Reviewed-by: Alexandros Frantzis [email protected]
Fixes: a22c5df0794 ("virgl: Use buffer copy transfers to avoid waiting when mapping")
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
| |
Now that sRGB formats are supported for both rendering and sampling,
advertise support.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
| |
The performance impact is slightly mitigated by tiling the render
target, but it's undeniably still slow compared to AFBC. Unfortunately,
it doesn't look like AFBC and sRGB play nice...
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
| |
For fixed-function, we have hardware to handle sRGB so we just set a
flag. For blend shaders, it's rather more involved; this is currently
unimplemented. Assert it out for now; we don't need it quite yet.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
| |
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
| |
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
| |
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
| |
We already can sample from Mali's linear/tiled encoding (the one from
Utgard -- AFBC is mostly unrelated); let's be able to render to it as
well.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
A mode for rendering tiled/uncompressed was noticed, so we reshuffle the
MFBD render target definitions to explicitly include block type.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This combines the two cmdstream bits "is_3d" and "is_not_cubemap" into a
single 2-bit texture target selection, noticing it's the same as the
2-bit selection in Midgard and Bifrost texturing ops. Accordingly, we
share this definition and add the missing entry for 1D/buffer textures.
This requires a nontrivial (but functionally similar) refactor of all
parts of the driver to use the new definitions appropriately.
Theoretically, this should add support for buffer textures, but that's
obviously not tested and probably wouldn't work.
While doing so, we notice the sRGB enable bit, which we document and
decode as well here so we don't forget about it.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
| |
Requirements for a job should be figured out in pan_job.c
v2: [Alyssa] Fix early return
Reviewed-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
| |
Move the reset out of frame invalidation into job submission
Reviewed-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
| |
Start fleshing out panfrost_job
v2: [Alyssa: Remove unused variable, warning introduced]
Reviewed-by: Alyssa Rosenzweig <[email protected]>
|