| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Prior to this patch sizeof(linear_header) was 20 bytes in a
non-debug build on 32-bit platforms. We do some pointer arithmetic to
calculate the next available location with
ptr = (linear_size_chunk *)((char *)&latest[1] + latest->offset);
in linear_alloc_child(). The &latest[1] adds 20 bytes, so an allocation
would only be 4-byte aligned.
On 32-bit SPARC a 'sttw' instruction (which stores a consecutive pair of
4-byte registers to memory) requires an 8-byte aligned address. Such an
instruction is used to store to an 8-byte integer type, like intmax_t
which is used in glcpp's expression_value_t struct.
As a result of the 4-byte alignment returned by linear_alloc_child() we
would generate a SIGBUS (unaligned exception) on SPARC.
According to the GNU libc manual malloc() always returns memory that has
at least an alignment of 8-bytes [1]. I think our allocator should do
the same.
So, simple fix with two parts:
(1) Increase SUBALLOC_ALIGNMENT to 8 unconditionally.
(2) Mark linear_header with an aligned attribute, which will cause
its sizeof to be rounded up to that alignment. (We already do
this for ralloc_header)
With this done, all Mesa's unit tests now pass on SPARC.
[1] https://www.gnu.org/software/libc/manual/html_node/Aligned-Memory-Blocks.html
Fixes: 47e17586924f ("glcpp: use the linear allocator for most objects")
Bug: https://bugs.gentoo.org/636326
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
The debug code is all asserts, so protect it with the same thing that
controls assert.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For example the following type of thing is seen in TCS from
a number of Vulkan and DXVK games:
vec1 32 ssa_557 = deref_var &oPatch (shader_out float)
vec1 32 ssa_558 = intrinsic load_deref (ssa_557) ()
vec1 32 ssa_559 = deref_var &oPatch@42 (shader_out float)
vec1 32 ssa_560 = intrinsic load_deref (ssa_559) ()
vec1 32 ssa_561 = deref_var &oPatch@43 (shader_out float)
vec1 32 ssa_562 = intrinsic load_deref (ssa_561) ()
intrinsic store_deref (ssa_557, ssa_558) (1) /* wrmask=x */
intrinsic store_deref (ssa_559, ssa_560) (1) /* wrmask=x */
intrinsic store_deref (ssa_561, ssa_562) (1) /* wrmask=x */
No shader-db changes on i965 (SKL).
vkpipeline-db results RADV (VEGA):
Totals from affected shaders:
SGPRS: 7832 -> 7728 (-1.33 %)
VGPRS: 6476 -> 6740 (4.08 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 469572 -> 456596 (-2.76 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 989 -> 960 (-2.93 %)
Wait states: 0 -> 0 (0.00 %)
The Max Waves and VGPRS changes here are misleading. What is
happening is a bunch of TCS outputs are being optimised away as
they are now recognised as unused. This results in more varyings
being compacted via nir_compact_varyings() which can result in
more register pressure when they are not packed in an optimal way.
This is an existing problem independent of this patch. I've run
some benchmarks and haven't noticed any performance regressions
in affected games.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
shader-db results for SLK:
total instructions in shared programs: 13106498 -> 13091573 (-0.11%)
instructions in affected programs: 1186244 -> 1171319 (-1.26%)
helped: 6186
HURT: 0
total cycles in shared programs: 332062633 -> 331961653 (-0.03%)
cycles in affected programs: 8537165 -> 8436185 (-1.18%)
helped: 5371
HURT: 862
LOST: 6
GAINED: 14
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously the debug would be:
libEGL debug: No DRI config supports native format 0x20203852
libEGL debug: No DRI config supports native format 0x38385247
but
libEGL debug: No DRI config supports native format R8
libEGL debug: No DRI config supports native format GR88
is a lot easier to understand.
Reviewed-by: Eric Engestrom <[email protected]>
Reviewed-by: Daniel Stone <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This requires that the caller make a little (stack) allocation to store
the string.
v2: Use gbm_format_canonicalize (suggested by Daniel)
Reviewed-by: Eric Engestrom <[email protected]>
Reviewed-by: Daniel Stone <[email protected]>
|
|
|
|
|
|
|
| |
I want it for the format name debugging code.
Reviewed-by: Eric Engestrom <[email protected]>
Reviewed-by: Daniel Stone <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
There are two problems:
1) the extra underscore in MISSING_64BIT_ATOMICS
2) we should link with libatomic if the previous test decided we needed
it
Fixes: d1992255bb29054fa51763376d125183a9f602f3
("meson: Add build Intel "anv" vulkan driver")
Reviewed-and-Tested-by: Matt Turner <[email protected]>
|
|
|
|
|
|
| |
Fixes: dEQP-GLES3.functional.fbo.completeness.renderable.texture.color0.sr8_ext
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
| |
This implementation can have massive drawbacks.
Cc: 18.3 <[email protected]>
Reviewed-by: Edmondo Tommasina <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Christian Gmeiner <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
| |
regs is only set and used on x86; on other platforms (like ARM), this
code causes a trivial warning, solved by moving the regs declaration to
the architecture-dependent usage.
Reviewed-by: Matt Turner <[email protected]>
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
| |
meson does this for you with its warn levels, so we don't need to set
it ourselves.
Fixes: d1992255bb29054fa51763376d125183a9f602f3
("meson: Add build Intel "anv" vulkan driver")
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
| |
Looks like importing libdrm_freedreno into mesa crossed paths with
e27902a2613.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
|
| |
v2: Add a AYUV entry android in the android backend (Tapani)
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Byte ordering is :
0: V
1: U
2: Y
3: A
v2: Split refactoring of alpha channel (Lionel)
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]> (v1)
Acked-by: Eric Engestrom <[email protected]> (v2)
|
|
|
|
|
|
|
|
|
| |
We're about to introduce AYUV support which provides its own alpha
channel. So give alpha as a parameter and set it to 1 on exising
formats.
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
Only needed when the pipeline actually uses tessellation. I don't
think that changes anything, except improving readability.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
The variable state is free'd and afterwards state->error is used
as the return value, resulting in a use after free bug detected
by memory safety tools like address sanitizer.
Signed-off-by: Hanno Böck <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108636
Reviewed-by: Eric Engestrom <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
|
|
| |
Fixes: 1c9c42d16b4c ("nir: add varying component packing helpers")
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
| |
Fixes: 1c9c42d16b4c ("nir: add varying component packing helpers")
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
execution size.
This can occur during payload setup of SIMD-split send message
instructions, which can lead to the emission of header setup
instructions with a non-zero channel group and fixed SIMD width. Such
instructions could end up using undefined channel enable signals
except they don't care since they're always marked force_writemask_all.
Not known to affect correctness of any workload at this point, but it
would be trivial to back-port to stable if something comes up.
Reported-by: Sagar Ghuge <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Tested-by: Sagar Ghuge <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Shader-db results radeonsi (VEGA):
Totals from affected shaders:
SGPRS: 161464 -> 161368 (-0.06 %)
VGPRS: 86904 -> 86292 (-0.70 %)
Spilled SGPRs: 296 -> 314 (6.08 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 3618596 -> 3573852 (-1.24 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 26189 -> 26276 (0.33 %)
Wait states: 0 -> 0 (0.00 %)
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
This pass moves constant outputs to the consuming shader stage
where possible.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
Just break out of the loop instead, it does the same thing.
Signed-off-by: Andre Heider <[email protected]>
Reviewed-by: Axel Davy <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Andre Heider <[email protected]>
Reviewed-by: Axel Davy <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes various crashes and hangs when using nine's 'thread_submit'
feature.
On 64bit, the thread function's data argument would just be NULL.
On 32bit, the data argument would be garbage depending on the compiler
flags (in my case -march>=core2).
Fixes: f3fa7e3068512d ("st/nine: Use WINE thread for threadpool")
Cc: [email protected]
Signed-off-by: Andre Heider <[email protected]>
Reviewed-by: Axel Davy <[email protected]>
|
|
|
|
| |
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
| |
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
| |
and add has_dcc_constant_encode.
|
|
|
|
| |
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
|
|
| |
not needed anymore (probably since the tile_swizzle fix)
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
| |
This reverts commit 5213be9fab72548c799b30e320dd1b257534f096.
|
|
|
|
| |
This reverts commit cccd7a253f9ed14ea748a222f58b0e5c895eb939.
|
|
|
|
|
|
|
|
|
| |
It seems I missed some details when exposing NV_conditional_render
on GLES; this fixes up "make check".
Fixes: 5213be9fab7 ("mesa: expose NV_conditional_render on GLES")
Signed-off-by: Erik Faye-Lund <[email protected]>
Reviewed-and-Tested-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
| |
Helpful for debugging compiler backend problems: this allows us to
easily retrieve the LLVM IR from RenderDoc.
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
| |
The extension spec has been updated to include GLES 2 support, so let's
enable it there.
Signed-off-by: Erik Faye-Lund <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
nir_alu_type_get_type_size takes a type as parameter and we were
passing a bit-size instead, which did what we wanted by accident,
since a bit-size of zero matches nir_type_invalid, which has a
size of 0 too.
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
SIMD16 instructions need to have additional interferences to prevent
source / destination hazards when the source and destination registers
are off by one register.
While we already have code to handle this, it was only running for SIMD16
dispatches, however, we can have SIDM16 instructions in a SIMD8 dispatch.
An example of this are pull constant loads since commit b56fa830c6095,
but there are more cases.
This fixes a number of CTS test failures found in work-in-progress
tests that were hitting this situation for 16-wide pull constants
in a SIMD8 program.
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Because we only have one file_max for the (2d) gs input file, the value
actually represents the max of attrib and vertex index (although I'm
not entirely sure if we really want the max, since the max valid value
of the vertex dimension can be easily deduced from the input primitive).
Thus in cases where the number of inputs is higher than the number of
vertices per prim, we did not properly clamp the vertex index, which
would result in out-of-bound fetches, potentially causing segfaults
(the segfaults seemed actually difficult to trigger, but valgrind
certainly wasn't happy). This might have happened even if the shader
did not actually try to fetch bogus vertices, if the fetching happened
in non-active conditional clauses.
To fix simply use the correct max vertex index value (derived from
the input prim type) instead when clamping for this case.
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes Skqp's unitTest_EGLImageTest test.
For Intel platforms, we support external textures only for EGLImages
created with EGL_EXT_image_dma_buf_import. This restriction seems to
be Intel specific and not present for other platforms.
While running SKQP test - unitTest_EGLImageTest, GL_INVALID is sent
to the test because of this restriction.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105301
Signed-off-by: Aditya Swarup <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use #pragma warning(off) and #pragma warning(on) to disable or enable
all warnings. This is a big hammer. If we ever need a smaller hammer,
we can enhance this functionality.
There is one lame thing about this. Because we parse everything, create
an AST, then convert the AST to GLSL IR, we have to treat the #pragma
like a statment. This means that you can't do something like
' void
' #pragma warning(off)
' __foo
' #pragma warning(on)
' (float param0);
Fixing that would, as far as I can tell, require a huge amount of work.
I did try just handling the #pragma during parsing (like we do for
state for the whole shader.
v2: Fix the #pragma lines in the commit message that git-commit ate.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|