| Commit message | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
inaccessiblememonly means that it doesn't modify memory accessible via
normal LLVM pointers. This lets LLVM's dead store elimination, memcpy
forwarding, etc. ignore functions with this attribute. We don't
represent descriptors as pointers, so this property is always true of
buffer and image stores. There are plans to represent descriptors via
pointers, but this just means that now nothing is inaccessiblememonly,
as LLVM will then understand loads/stores via its usual alias analysis.
Radeonsi was mistakenly only setting it if the driver could prove that
there were no reads, and then it was cargo-culted into ac_llvm_build
and ac_llvm_to_nir. Rip it out of everything.
statistics with nir enabled:
Totals from affected shaders:
SGPRS: 152 -> 152 (0.00 %)
VGPRS: 128 -> 132 (3.12 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 9324 -> 9244 (-0.86 %) bytes
LDS: 2 -> 2 (0.00 %) blocks
Max Waves: 17 -> 17 (0.00 %)
Wait states: 0 -> 0 (0.00 %)
The only difference was a manhattan31 shader.
Acked-by: Timothy Arceri <[email protected]>
Acked-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
close() is in <unistd.h>
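As a small illustration of the header requirement (not the code touched by
this commit), here is a sketch that opens and closes a file descriptor. The
helper name open_and_close() is hypothetical.

```c
#include <fcntl.h>   /* open(), O_RDONLY */
#include <unistd.h>  /* close() is declared here, not in <fcntl.h> */

/* Hypothetical helper: returns 0 if the path could be opened and closed,
 * -1 otherwise. */
static int open_and_close(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    return close(fd);
}
```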
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
| |
This fixes new CTS dEQP-VK.pipeline.depth_range_unrestricted.*.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This allows us to disable the FMASK decompress pass when
transitioning from CB writes to shader reads.
This will likely be improved and enabled by default in the future.
No CTS regressions on GFX8, but a small number of multisample CTS
failures on GFX9 (they look related to the small hint).
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
Found while working on DCC for MSAA.
Fixes: 6b976024a87 ("radv: add support for FMASK expand")
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We have some serious leaks, so plug some and also move to ralloc to
limit the lifetime of some objects to that of their parent.
Lots more such work to do.
For some reason, this fixes:
dEQP-GLES2.functional.lifetime.attach.deleted_output.texture_framebuffer
Signed-off-by: Tomeu Vizoso <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Do not offer a hardware drm backed egl device if no render node
is available. The current implementation will fail on this
egl device. On top of that, it issues a warning that is actually misleading.
Finally, there are more error paths that can fail on the way to a
hardware backed egl device. Fixing all of them would more or less require
opening the drm device and seeing whether there is a usable driver associated
with the device. The approach taken avoids a full probe and fixes at
least this kind of problem on the kvm virtualization hosts I observe here.
Fixes: dbb4457d985 ("egl: add EGL_EXT_device_drm support")
Reviewed-by: Emil Velikov <[email protected]>
Signed-off-by: Mathias Fröhlich <[email protected]>
|
|
|
|
|
|
|
| |
Passes spec@amd_seamless_cubemap_per_texture@amd_seamless_cubemap_per_texture
Signed-off-by: Christian Gmeiner <[email protected]>
Reviewed-by: Guido Günther <[email protected]>
|
|
|
|
|
|
| |
Update to etna_viv commit a3bf0da.
Signed-off-by: Christian Gmeiner <[email protected]>
|
|
|
|
|
|
| |
Pointed out by coverity
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
This is pointless in that we won't ever hit those paths in real life,
but coverity complains.
Fixes: f014ae3c7cce ("nouveau: add support for nir")
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Reviewed-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
GL_MAP_INVALIDATE_BUFFER_BIT cannot be treated as
GL_MAP_INVALIDATE_RANGE_BIT naively. When we run into
  ptr = glMapBufferRange(buf, 0, size,
                         GL_MAP_WRITE_BIT | GL_MAP_INVALIDATE_BUFFER_BIT);
  memcpy(ptr, data1, size);
  glUnmapBuffer(buf);
  ptr = glMapBufferRange(buf, size, size,
                         GL_MAP_WRITE_BIT | GL_MAP_UNSYNCHRONIZED_BIT);
  memcpy(ptr, data2, size);
  glUnmapBuffer(buf);
we never want data1 to be copy_transfer'ed, because that would mean
that data2 might overwrite valid data.
Signed-off-by: Chia-I Wu <[email protected]>
Reviewed-by: Alexandros Frantzis <[email protected]>
Fixes: a22c5df0794 ("virgl: Use buffer copy transfers to avoid waiting when mapping")
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
| |
Now that sRGB formats are supported for both rendering and sampling,
advertise support.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
| |
The performance impact is slightly mitigated by tiling the render
target, but it's undeniably still slow compared to AFBC. Unfortunately,
it doesn't look like AFBC and sRGB play nice...
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
| |
For fixed-function, we have hardware to handle sRGB so we just set a
flag. For blend shaders, it's rather more involved; this is currently
unimplemented. Assert it out for now; we don't need it quite yet.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
| |
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
| |
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
| |
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
| |
We already can sample from Mali's linear/tiled encoding (the one from
Utgard -- AFBC is mostly unrelated); let's be able to render to it as
well.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
A mode for rendering tiled/uncompressed was noticed, so we reshuffle the
MFBD render target definitions to explicitly include block type.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This combines the two cmdstream bits "is_3d" and "is_not_cubemap" into a
single 2-bit texture target selection, noticing it's the same as the
2-bit selection in Midgard and Bifrost texturing ops. Accordingly, we
share this definition and add the missing entry for 1D/buffer textures.
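To illustrate the folding, here is a sketch under stated assumptions: the
enum names, their numeric values, and the relative position of the two old
bits are hypothetical choices made for this example, not copied from the
hardware definitions.

```c
#include <stdbool.h>

/* Hypothetical 2-bit texture target selection. */
enum tex_target {
    TEX_TARGET_CUBE = 0x0,
    TEX_TARGET_1D   = 0x1, /* also buffer textures: the previously missing entry */
    TEX_TARGET_2D   = 0x2,
    TEX_TARGET_3D   = 0x3,
};

/* Assumed layout: "is_3d" lives in bit 0 and "is_not_cubemap" in bit 1.
 * Reading both bits as one field yields the enum above. */
static enum tex_target tex_target_from_bits(bool is_3d, bool is_not_cubemap)
{
    return (enum tex_target)(((unsigned)is_not_cubemap << 1) | (unsigned)is_3d);
}
```

Under this assumed layout, the combination is_3d = 1, is_not_cubemap = 0 is
the slot that two independent booleans could not name, and it is where the
1D/buffer entry lands.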
This requires a nontrivial (but functionally similar) refactor of all
parts of the driver to use the new definitions appropriately.
Theoretically, this should add support for buffer textures, but that's
obviously not tested and probably wouldn't work.
While doing so, we notice the sRGB enable bit, which we document and
decode as well here so we don't forget about it.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
| |
Requirements for a job should be figured out in pan_job.c
v2: [Alyssa] Fix early return
Reviewed-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
| |
Move the reset out of frame invalidation into job submission
Reviewed-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
| |
Start fleshing out panfrost_job
v2: [Alyssa: Remove unused variable, warning introduced]
Reviewed-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
| |
Add corresponding entries from p_format.h
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
| |
It's nice to keep these two files in sync, as they define
guest userspace <---> host userspace communication.
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Getting rid of a compiler warning:
In file included from ../src/intel/tools/aubinator_viewer.cpp:225:
../src/imgui/imgui_memory_editor.h: In member function ‘void MemoryEditor::DisplayPreviewData(size_t, const u8*, size_t, MemoryEditor::DataType, MemoryEditor::DataFormat, char*, size_t) const’:
../src/imgui/imgui_memory_editor.h:637:16: warning: enumeration value ‘DataType_COUNT’ not handled in switch [-Wswitch]
switch (data_type)
^
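For context, a sketch of one common way to satisfy -Wswitch when an enum
carries a trailing count sentinel. This is not necessarily the change made
in this commit; the enum is trimmed down and data_type_name() is a
hypothetical function.

```c
/* Trimmed-down stand-in for an enum that ends in a count sentinel. */
enum data_type {
    DataType_S8,
    DataType_U8,
    DataType_COUNT, /* number of real entries, not a type itself */
};

static const char *data_type_name(enum data_type type)
{
    switch (type) {
    case DataType_S8:
        return "Int8";
    case DataType_U8:
        return "Uint8";
    case DataType_COUNT:
        break; /* listed explicitly so every enumerator is handled */
    }
    return "N/A";
}
```

Listing the sentinel explicitly, rather than adding a default label, keeps
the warning useful if a new real entry is added later.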
Signed-off-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
| |
Enable nir_opt_vectorize.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This effectively does the opposite of nir_lower_alus_to_scalar, trying
to combine per-component ALU operations with the same sources but
different swizzles into one larger ALU operation. It uses a model
similar to CSE, where we do a depth-first approach and keep around a hash
set of instructions to be combined, but there are a few major
differences:
1. For now, we only support entirely per-component ALU operations.
2. Since it's not always guaranteed that we'll be able to combine
equivalent instructions, we keep a stack of equivalent instructions
around, trying to combine new instructions with instructions on the
stack.
The pass is far from comprehensive; it can't handle operations where
some of the sources are per-component and others aren't, and it can't
handle phi nodes. But it should handle the more common cases, and it
should be reasonably efficient.
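The core idea can be sketched with a toy model. Everything below is
hypothetical and much simpler than the real pass: instructions have a single
source, dependencies between instructions are ignored, and a quadratic scan
stands in for the hash set and the stack of equivalent instructions.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_COMPONENTS 4

/* Hypothetical stand-in for a per-component ALU instruction. */
struct alu_op {
    int opcode;                            /* id of the operation */
    int src;                               /* id of the single vector source */
    unsigned num_components;               /* 1 for a scalar op */
    unsigned char swizzle[MAX_COMPONENTS]; /* source channel read per output channel */
};

/* Append b's components to a. Succeeds only if both apply the same opcode
 * to the same source and the result still fits in a vec4. */
static bool try_combine(struct alu_op *a, const struct alu_op *b)
{
    if (a->opcode != b->opcode || a->src != b->src)
        return false;
    if (a->num_components + b->num_components > MAX_COMPONENTS)
        return false;
    for (unsigned i = 0; i < b->num_components; i++)
        a->swizzle[a->num_components + i] = b->swizzle[i];
    a->num_components += b->num_components;
    return true;
}

/* Fold each op into the earliest compatible op that was kept.
 * Compacts the array in place and returns the new instruction count. */
static size_t vectorize(struct alu_op *ops, size_t count)
{
    size_t out = 0;
    for (size_t i = 0; i < count; i++) {
        bool merged = false;
        for (size_t j = 0; j < out && !merged; j++)
            merged = try_combine(&ops[j], &ops[i]);
        if (!merged)
            ops[out++] = ops[i];
    }
    return out;
}
```

Two scalar ops with the same opcode and source but swizzles x and z would
collapse into one two-component op with swizzle xz, while an op with a
different opcode is left alone.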
[Alyssa: Rebase on latest master, updating with respect to typeless
moves]
Acked-by: Alyssa Rosenzweig <[email protected]>
Acked-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds support for nir_texop_txs instructions which are needed
to support the OpenGL textureSize() function. This is also needed to
support RECT texture sampling which is currently lowered to 2D sampling +
a TXS() instruction by the nir_lower_tex() helper.
Changes in v2:
* Split options for the 1st and 2nd tex lowering passes
Signed-off-by: Boris Brezillon <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We are about to add support for the TXS (texture size) op which is not
implemented using a midgard texture instruction. Let's rename emit_tex()
to emit_texop_native() and repurpose emit_tex() as a dispatcher.
Signed-off-by: Boris Brezillon <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
We're about to add more sysval types, and panfrost_emit_for_draw()
is big enough, so let's move the sysval upload logic into a separate
function.
We also add one sub-function per sysval type to keep
panfrost_upload_sysvals() small and readable.
Signed-off-by: Boris Brezillon <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
We are about to add support for nir_texop_txs which requires adding a
sysval/uniform containing the texture size. Let's change the
emit_sysval_read() prototype to take a nir_instr object instead of
a nir_intrinsic_instr one so we can re-use this function when emitting
a sysval for a txs instruction.
Signed-off-by: Boris Brezillon <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The V3D driver has an open-coded solution for this, and we need the
same thing for Panfrost, so let's add a generic way to lower TXS(LOD)
into max(TXS(0) >> LOD, 1).
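As a scalar illustration of that formula, for one dimension only: the
function name minify() and the assumption that lod is below 32 are choices
made for this sketch, which is not the NIR lowering code itself.

```c
#include <stdint.h>

/* Size of mip level `lod` along one dimension, given the level-0 size.
 * Computes max(base_size >> lod, 1). Assumes lod < 32, since shifting a
 * 32-bit value by 32 or more is undefined in C. */
static uint32_t minify(uint32_t base_size, uint32_t lod)
{
    uint32_t shifted = base_size >> lod;
    return shifted > 1 ? shifted : 1;
}
```

The clamp to 1 matters once the shift would reach zero: a 4-texel-wide
texture still reports a width of 1 at level 5.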
Changes in v2:
* Use == 0 instead of !
* Rework the minification logic as suggested by Jason
* Assign cursor pos at the beginning of the function
* Patch the LOD just after retrieving the old value
Signed-off-by: Boris Brezillon <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
get_texture_size() will create a txs instruction with ->sampler_dim set
to the original tex->sampler_dim. The condition to call lower_rect()
only checks the value of ->sampler_dim and whether lower_rect is
requested or not. This leads to an infinite loop when calling
nir_lower_tex() with the same options until it returns false.
In order to avoid that, let's move the tex->sampler_dim patching before
get_texture_size() is called. This way the txs instruction will have
->sampler_dim set to GLSL_SAMPLER_DIM_2D and nir_lower_tex() won't try
to lower it on the subsequent passes.
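The ordering issue can be shown with a toy model. The types and make_txs()
below are hypothetical stand-ins for nir_tex_instr and get_texture_size(),
so this is a sketch of the idea rather than the actual patch.

```c
#include <stdbool.h>

enum sampler_dim { SAMPLER_DIM_2D, SAMPLER_DIM_RECT };

struct tex_instr {
    enum sampler_dim sampler_dim;
};

/* Stand-in for get_texture_size(): the new txs instruction copies the
 * sampler dimension of the tex instruction it is derived from. */
static struct tex_instr make_txs(const struct tex_instr *tex)
{
    struct tex_instr txs = { .sampler_dim = tex->sampler_dim };
    return txs;
}

/* Returns true if the instruction was lowered. */
static bool lower_rect(struct tex_instr *tex, struct tex_instr *txs_out)
{
    if (tex->sampler_dim != SAMPLER_DIM_RECT)
        return false;

    /* Patch the dimension first, so the txs created below is already 2D
     * and is not picked up as RECT by a later run of the pass. */
    tex->sampler_dim = SAMPLER_DIM_2D;
    *txs_out = make_txs(tex);
    return true;
}
```

With the two statements swapped, the txs would be created as RECT, so every
rerun of the pass would find something new to lower and report progress
again.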
Changes in v2:
* Add Jason R-b
* Add a comment explaining why we patch ->sampler_dim at the beginning
of the lower_rect() func
Signed-off-by: Boris Brezillon <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The code considers that projector lowering was done even if it's not
really the case. Change the project_src() prototype to return a bool
encoding whether projector lowering happened or not and update the
progress var accordingly in nir_lower_tex_block().
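A minimal sketch of the pattern, with hypothetical stand-in types: the
has_projector flag and lower_tex_block() below are simplifications made for
this example, not the real nir_lower_tex code.

```c
#include <stdbool.h>

struct tex_instr {
    bool has_projector;
};

/* Returns true only when a projector source was actually lowered. */
static bool project_src(struct tex_instr *tex)
{
    if (!tex->has_projector)
        return false;
    tex->has_projector = false; /* stand-in for dividing the coordinates */
    return true;
}

/* Progress is reported only if at least one instruction really changed. */
static bool lower_tex_block(struct tex_instr *instrs, unsigned count,
                            bool lower_txp)
{
    bool progress = false;
    for (unsigned i = 0; i < count; i++) {
        if (lower_txp)
            progress |= project_src(&instrs[i]);
    }
    return progress;
}
```

Reporting progress only on real changes lets a caller that reruns the pass
until it returns false terminate.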
---
Changes in v2:
* Add Jason R-b
* Drop the part suggesting that nir_lower_rect() could be called in
a do-while(progress) loop.
Signed-off-by: Boris Brezillon <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
We hadn't updated the kernel header after the driver got into mainline.
Signed-off-by: Tomeu Vizoso <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
Alyssa fixed some failing tests last night.
Signed-off-by: Tomeu Vizoso <[email protected]>
Acked-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
Only skip levels without DCC when it's a DCC decompression.
Whoops.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
In other words, make use of radv_dcc_enabled() instead of
radv_image_has_dcc() all over the place.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Shader-db results:
total instructions in shared programs: 9117550 -> 9102719 (-0.16%)
instructions in affected programs: 1752873 -> 1738042 (-0.85%)
helped: 7076
HURT: 478
helped stats (abs) min: 1 max: 22 x̄: 2.19 x̃: 2
helped stats (rel) min: 0.07% max: 13.89% x̄: 1.70% x̃: 1.07%
HURT stats (abs) min: 1 max: 7 x̄: 1.41 x̃: 1
HURT stats (rel) min: 0.09% max: 10.17% x̄: 0.86% x̃: 0.54%
95% mean confidence interval for instructions value: -2.00 -1.92
95% mean confidence interval for instructions %-change: -1.58% -1.50%
Instructions are helped.
total max-temps in shared programs: 1327774 -> 1327728 (<.01%)
max-temps in affected programs: 1025 -> 979 (-4.49%)
helped: 47
HURT: 2
helped stats (abs) min: 1 max: 2 x̄: 1.02 x̃: 1
helped stats (rel) min: 2.63% max: 20.00% x̄: 7.67% x̃: 5.26%
HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel) min: 4.17% max: 4.17% x̄: 4.17% x̃: 4.17%
95% mean confidence interval for max-temps value: -1.06 -0.82
95% mean confidence interval for max-temps %-change: -8.89% -5.49%
Max-temps are helped.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
| |
v2: use _mesa_set_search() (Eric)
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
Without them, the state tracker falls back to an RGBA format, but it
doesn't always manage to override the swizzle for us. So we lose the
information that the API expects an X channel, where alpha is garbage
and reads back as 1. We have no equivalent ISL RGBX format for these,
so we just use RGBA directly and override the swizzle in all cases.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The VaryingNames array has NumVaryings entries. But BufferStride is
a small array of MAX_FEEDBACK_BUFFERS (4) entries. Programs with
more than 4 varyings would read out of bounds.
Also, BufferStride is set based on the shader itself, which means that
it's inherently already included in the hash, and doesn't need to be
included again. At the point when shader_cache_read_program_metadata
is called, the linker hasn't even set those fields yet. So, just drop
it entirely.
Fixes valgrind errors in KHR-GL45.transform_feedback.linking_errors_test.
Fixes: 6d830940f78 ("glsl/shader_cache: Allow shader cache usage with transform feedback")
Reviewed-by: Timothy Arceri <[email protected]>
|