| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Set of opcodes doesn't have enough flexibility in certain cases. E.g.
Utgard PP has vector conditional select operation, but condition is always
scalar. Lowering all the vector selects to scalar increases instruction
number, so we need a way to filter only those ops that can't be handled
in hardware.
Reviewed-by: Qiang Yu <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Signed-off-by: Vasily Khoruzhick <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
For radeonsi, we will prefer the NIR pass as it'll generate better code
(some index calculation and a single load vs. a load, then index
calculation, then another load) and oftentimes NIR optimization can kick
in and make all the access indices constant.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
| |
When the code is executing an hits a barrier, it will suspend
the coroutine and return control to the coroutine dispatcher.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
| |
coroutines require a proper pass manager, so add the passes
to the correct places
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
| |
These wrap the coroutine intrinsics and also add some higher
level wrappers around coroutine begin, end and suspend procedures
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
| |
This allows the counter value to be forced to a certain value
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Each BSD has slightly different sysctl for retrieving per-CPU times.
FreeBSD returns long while NetBSD returns uint64_t. On OpenBSD return
type differs between summation and per-CPU times. DragonFly is
compatible with FreeBSD.
Signed-off-by: Jan Beich <[email protected]>
|
|
|
|
|
|
|
|
| |
Fixes: 4d0b2c7aaac3cf3de5af ("ttn: Update shader->info as we generate code.")
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
We'll use these in radeonsi.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The majority of these only apply the start argument to the input, but a
few of them also does for the output-array. util_primconvert, the only
user of this argument expects this pass a non-zero start-argument does
not expect this to be applied to the output; if it is, it will write
outside of allocated memory, leading to VRAM corruption.
The reason this doesn't seem to have been noticed before, is that no
driver currently use util_primconvert to convert a primitive-type to
itself, which is the cases where this was broken. But for Zink, this
will no longer be true, because we need to eliminate the use of 8-bit
index-buffers.
Signed-off-by: Erik Faye-Lund <[email protected]>
Fixes: 28f3f8d413f ("gallium/auxiliary/indices: add start param")
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
| |
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111511
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
LLVM 7.0 ditched the pmulu intrinsics.
This is only a trivial patch to use the fallback code instead.
It'll likely produce atrocious code since the pattern doesn't match what
llvm itself uses in its autoupgrade paths, hence the pattern won't be
recognized.
Should fix https://bugs.freedesktop.org/show_bug.cgi?id=111496
Reviewed-by: Jose Fonseca <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Reduces the size of the u_format_table.c file by 140k (out of 1.64M)
and makes me less confused about endianness in gallium.
Reviewed-by: Roland Scheidegger <[email protected]>
Acked-by: Adam Jackson <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The formats affected are:
- LA x (16_FLOAT, 32_FLOAT, 32_UINT, 32_SINT)
- R8G8B8 x (UNORM, SNORM, SRGB, USCALED, SSCALED, UINT, SINT)
- RG/RGB/RGBA x (64_FLOAT, 32_FLOAT, 16_FLOAT, 32_UNORM, 32_SNORM,
32_USCALED, 32_SSCALED, 32_FIXED, 32_UINT, 32_SINT)
- RGB/RGBA x (16_UNORM, 16_SNORM, 16_USCALED, 16_SSCALED,
16_UINT, 16_SINT)
- RGBx16 x (UNORM, SNORM, FLOAT, UINT, SINT)
- RGBx32 x (FLOAT, UINT, SINT)
- RA x (16_FLOAT, 32_FLOAT, 32_UINT, 32_SINT)
The updated st_formats.c unit test checks that the formats affected by
this change are all array formats in the equivalent Mesa format (if
any). Mesa's array format definition is clear: the value stored is an
array (increasing memory address) of values of the channel's type.
It's also the only thing that makes sense for the RGB types, or very
large types like RGBA64_FLOAT (A should not move to the low address
because the cpu is BE).
Acked-by: Roland Scheidegger <[email protected]>
Acked-by: Adam Jackson <[email protected]>
Tested-by: Matt Turner <[email protected]> (unit tests on BE)
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
Nothing used this var.
Reviewed-by: Roland Scheidegger <[email protected]>
Acked-by: Adam Jackson <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Nothing accessed the .value field, just the .chan. Unwrap all the
code from the union, for clarity (and 13k less generated code).
Reviewed-by: Roland Scheidegger <[email protected]>
Acked-by: Adam Jackson <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Shaves 30k off of the 1.6M .c file, and makes for less noise for me
trying to understand how gallium formats actually work.
Reviewed-by: Roland Scheidegger <[email protected]>
Acked-by: Adam Jackson <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
| |
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
| |
This adds the callbacks for the driver/gallium binding for
image operations.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
| |
This introduces the jit image type into the jit interface
for vertex/geom shaders
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
| |
This lets us create an image structure with the same basic
types as the texture one.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
| |
Just invert the exec_mask to get if this is a helper or not.
v2: get the bld mask (Roland)
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
| |
This allows using it with a const src later.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
| |
Not sure how I missed this before, but compswap was hitting an
assert here as it is it's own special case.
Fixes: b5ac381d8f ("gallivm: add buffer operations to the tgsi->llvm conversion.")
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
| |
No driver implements them yet, but this is a long way toward gallium
having matching format enums for Mesa formats.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
I decided not to update nblocks() with a depth arg as the callers
wouldn't be doing ASTC 3D.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
To add ASTC 3D compression formats, we need to be able to express the
block depth. While I'm touching every line, line up the columns of
the CSV again as they've drifted over time.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
Code that used it was removed in 4ebe6b2e72e ("tgsi: Drop the SSE2
constants setup that's been dead code since 2011.")
Acked-by: Eric Engestrom <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
| |
v2: Pass through to oscreen rather than faking it (review from Marek).
Fixes: 0346b700833 ("gallium/screen: Add pipe_screen::resource_get_param")
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
Fixes: 0346b700833 ("gallium/screen: Add pipe_screen::resource_get_param")
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
Fixes: 0346b700833 ("gallium/screen: Add pipe_screen::resource_get_param")
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
Fixes: 0346b700833 ("gallium/screen: Add pipe_screen::resource_get_param")
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Expose configs when allow_fp16_configs has been enabled and
DRI_LOADER_CAP_FP16 is set in the loader.
Also, make kms_swrast_dri respect format bpp, to allow for allocating
buffers wider than 32 bpp.
Make fp16 opt-in for gallium.
Signed-off-by: Kevin Strasser <[email protected]>
Reviewed-by: Adam Jackson <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The compute paths in vl are a bit AMD-specific. For example, they (on
nouveau), try to use a BGRX8 image format, which is not supported.
Fixing all this is probably possible, but since the compute paths aren't
in any way better, it's difficult to care.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111213
Fixes: 9364d66cb7 (gallium/auxiliary/vl: Add video compositor compute shader render)
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Acked-by: Eric Engestrom <[email protected]>
Reviewed-By: Gert Wollny <[email protected]>
|
|
|
|
|
|
|
| |
The SSE2 executor was removed in 4eb3225b38ce ("Remove tgsi_sse2.")
Acked-by: Eric Engestrom <[email protected]>
Reviewed-By: Gert Wollny <[email protected]>
|
|
|
|
|
|
|
|
| |
This was fixed in 912ed84f8338 ("tgsi: move to using vector for system
values.")
Acked-by: Eric Engestrom <[email protected]>
Reviewed-By: Gert Wollny <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The implementation introduced in "tgsi_to_nir: be careful about not
losing any TGSI properties silently (v2)" updates all the TGSI properties,
but it didn't take into account that the shader_info structure uses a union
to store the different attributes for each shader stage.
Now we only update the attributes if they affect current shader stage,
avoiding to overwrite members of the union that should be overwritten.
This has created hundreds of regressions in v3d.
For example the TGSI_PROPERTY_VS_BLIT_SGPRS_AMD was overwritting the
same position used by TGSI_PROPERY_CS_FIXED_BLOCK_DEPTH.
Fixes: e3003651978 ("tgsi_to_nir: be careful about not losing any TGSI properties silently (v2)")
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
When building Mesa against a recent LLVM 10 with C++11, the build fails
if the AMD common code is built as well due to "std::index_sequence"
being undeclared.
LLVM requires a minimum of C++14.
Signed-off-by: Kai Wasserbäch <[email protected]>
Acked-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This method returns size_t, but the multiplication multiplies two
integers, leading to overflow rather than type widening.
Noticed by compiling with MSVC, which emits a warning.
Signed-off-by: Erik Faye-Lund <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
On Windows, p_atomic_inc_return returns an unsigned long long rather
than the type the pointer refers to, so let's make sure we cast the
result to the right type. Otherwise, we'll trigger a warning about
the wrong format-string for the type.
Signed-off-by: Erik Faye-Lund <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
| |
There was two incompatible definitions of strcasecmp, which lead to a
compiler warning. Let's clean this up by only leaving one of them, and
using that one all the time.
Signed-off-by: Erik Faye-Lund <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|