| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
| |
Rather than using a dest_override, we upscale integers by using a half
field with a sign-extend bit. A variant of this trick should also work
for floats, but one step at a time!
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We have dedicated intrinsics to access the raw contents of the tile
buffer so we can use a dedicated NIR pass to lower appropriately for
blend shaders, rather than introducing a bizarre hardcoded blend
epilogue that only works for RGBA8_UNORM.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
| |
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Oh, dear. No turning back now.
We begin implementing non-32-bit types, using downsizing integer type
conversions as the initial instructions. We implement them naively as
type-converting moves; substantially more efficient operation is
possible by copypropping the type conversion modifier, but this
optimization is not implemented here.
Size converting modifiers on Midgard allow an instruction to write to a
destination 1/2 the size, or to read from a source 1/2 the size. If we
need an extreme conversion (32-bit to 8-bit, for instance), multiple
type converting ops are chained together, which here is handled via an
algebraic pass.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
This begins the process of removing blend shader specific MIR into a
more general NIR lowering pass for formats.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
| |
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Eventually, this will allow packing clear colours for all formats,
including floating-point framebuffers, pure integer buffers, and special
formats. Currently, a few of these formats are supported, and many more
are handled through a generic Gallium colour packing path (which is not
a perfect fit for the hardware, but works for many formats and is a sane
default for the moment.)
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We see the hardware doesn't actually support float framebuffers in the
native sense -- rather, it just allows higher bpp framebuffers and lets
a blend shader / additional clear_color fields sort out the formats.
This will be.. interesting.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
| |
Full MRT support is a while away, but in the mean time, we can remove
code that explicitly assumes nr_cbufs <= 0, to minimize the obstacles
we'll face later when we add the whole thing.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
| |
We had vendored duplicates from pre-Mesa days; clean that up.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
According to the spec [1], `__egl_Main` is the only symbol that needs to
be exported. We don't want applications directly linking against
libEGL_mesa.so (apps should always go through libEGL.so, regardless of
who is providing it), so we shouldn't export any other symbols either.
[1] https://github.com/NVIDIA/libglvnd/blob/master/include/glvnd/libeglabi.h
(this header is the closest there is to a spec)
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Part of the effort to replace shell scripts with portable python scripts.
I could've used a trivial `assert lines == sorted(lines)`, but this way
the caller is shown which entrypoint is out of order.
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by Dylan Baker <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by Dylan Baker <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
According to the Vulkan ICD spec [1], these two symbols must be exposed:
- vk_icdGetInstanceProcAddr
- vk_icdNegotiateLoaderICDInterfaceVersion
and this one is optional:
- vk_icdGetPhysicalDeviceProcAddr
[1] https://github.com/KhronosGroup/Vulkan-Loader/blob/master/loader/LoaderAndLayerInterface.md
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by Dylan Baker <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
| |
No longer used as of last commit :)
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by Dylan Baker <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by Dylan Baker <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
| |
Signed-off-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Note: the list in gbm-symbols.txt is the same as the one that was in
gbm-symbols-check, I just took the opportunity to sort it.
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by Dylan Baker <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by Dylan Baker <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I've re-written this in bash a couple times over the years, and then
I realised python is much more portable and already required by Mesa, so
we might as well make use of it.
I decided to still use the build system's NM instead of re-implementing
symbols extraction, to offload the complexity of keeping it compatible
with many systems (Linux, Unix, BSD, MacOS, etc.), especially when
cross-building.
This new script checks not only that nothing is exported when it
shouldn't be, but also that everything that should be exported is.
Sometimes, some symbols _can_ be exported but don't have to be, in which
case they can be prefixed with `(optional)`.
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by Dylan Baker <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
| |
required by OpenCL
v2: fix setting globalAccess
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Pierre Moreau <[email protected]>
|
|
|
|
|
|
|
| |
required by OpenCL
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Pierre Moreau <[email protected]>
|
|
|
|
|
|
|
| |
required for OpenCL
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Pierre Moreau <[email protected]>
|
|
|
|
|
|
|
| |
required by OpenCL
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Pierre Moreau <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Drivers only use lower_io for modes where pointers don't have a
meaningful value, and dereferences can always be traced back to a
variable. But there can be other modes, like global mode with
VK_EXT_buffer_device_address, where pointers cannot be traced back to a
variable, and lower_io would segfault on loads/stores of these since
nir_deref_instr_get_variable() would return NULL.
Just use the mode on the deref itself to filter out these modes before
we try to get the variable.
Fixes: 118a66df990 ("radv: Use NIR barycentric coordinates")
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently this is done rather late in radv, after lowering booleans, so
it isn't safe to run additional optimizations that may add e.g. 1-bit
booleans. We could move the lowering parts earlier, but since right now
we only lower FS inputs and by this point all indirects have been
lowered away, there's no reason we should need to optimize anything.
One shader from Devil May Cry 5 was getting optimized, but only because
the optimization loop was working on 32-bit booleans which revealed an
opportunity that was hidden with 1-bit booleans, and we generated a
1-bit boolean which is invalid.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111092
Fixes: 118a66df9907772bb9e5503b736c95d7bb62d52c
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix the following building error:
external/mesa/src/amd/addrlib/src/gfx10/gfx10addrlib.cpp:35:10:
fatal error: 'gfx10_gb_reg.h' file not found
^~~~~~~~~~~~~~~~
1 error generated.
Fixes: 78cdf9a ("amd/addrlib: add gfx10 support")
Signed-off-by: Mauro Rossi <[email protected]>
Acked-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The necessary Android makefile building rules are added
and the generation rules are simplified for readability
Fixes the following building errors:
external/mesa/src/amd/common/ac_llvm_build.c:1496:45:
error: use of undeclared identifier 'V_008F0C_IMG_FORMAT_8_UINT'
case V_008F0C_BUF_DATA_FORMAT_8: format = V_008F0C_IMG_FORMAT_8_UINT; break;
^
Fixes: 74a26af ("amd/common/gfx10: add register JSON")
Signed-off-by: Mauro Rossi <[email protected]>
Acked-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix Android building rules for gfx10_format_table.h generated header
(v2) Add LOCAL_C_INCLUDES += $(intermediates)/radeonsi to fix error:
external/mesa/src/gallium/drivers/radeonsi/si_state.c:46:10:
fatal error: 'gfx10_format_table.h' file not found
^~~~~~~~~~~~~~~~~~~~~~
1 error generated.
Fixes: 0ffa229 ("radeonsi/gfx10: generate gfx10_format_table.h")
Signed-off-by: Mauro Rossi <[email protected]>
Acked-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
The path could be imported automatically.
Signed-off-by: Chih-Wei Huang <[email protected]>
Reviewed-by: Mauro Rossi <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The gen_enum_to_str.py generates vk_enum_to_str.c and its header at once.
However, the makefiles incorrectly list both files parallel with the same
recipes. That means both two files may be generated simultaneously by two
processes. The generating files may be truncated by another process, as
shown below:
$ cd $OUT/obj/STATIC_LIBRARIES/libmesa_vulkan_util_intermediates/util
$ ls -l
-rw-rw-r-- 1 lh lh 193713 Jul 5 13:31 vk_enum_to_str.c
-rw-rw-r-- 1 lh lh 4609 Jul 5 13:31 vk_enum_to_str.d
-rw-rw-r-- 1 lh lh 0 Jul 5 16:21 vk_enum_to_str.h
Let one file depends on the other with empty recipe to avoid the issue.
Signed-off-by: Chih-Wei Huang <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
| |
It's unnecessary to manually add these include paths since they could
be imported automatically.
Signed-off-by: Chih-Wei Huang <[email protected]>
Acked-by: Eric Engestrom <[email protected]>
Acked-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Add libmesa_nir to a common LOCAL_STATIC_LIBRARIES defined by
ANV_STATIC_LIBRARIES so that its include path can be imported
automatically. Then ANV_INCLUDES is unnecessary and could be
eliminated.
Signed-off-by: Chih-Wei Huang <[email protected]>
Acked-by: Eric Engestrom <[email protected]>
Acked-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
The dummy library libmesa_anv_entrypoints is totally unnecessary.
The four VULKAN_GENERATED_FILES could be generated and built in
libmesa_vulkan_common directly. The libraries using the generated
headers should get it via the exported include path.
Signed-off-by: Chih-Wei Huang <[email protected]>
Acked-by: Eric Engestrom <[email protected]>
Acked-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Export the correct include path so that the libraries use it can
get it automatically.
Signed-off-by: Chih-Wei Huang <[email protected]>
Acked-by: Eric Engestrom <[email protected]>
Acked-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
The libmesa_git_sha1 is a dummy library. There is no reason to put
it into LOCAL_WHOLE_STATIC_LIBRARIES.
Move libmesa_vulkan_util to the vulkan.radv which really needs it.
Signed-off-by: Chih-Wei Huang <[email protected]>
Acked-by: Eric Engestrom <[email protected]>
Acked-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
The libmesa_anv_entrypoints and libmesa_genxml are dummy libraries.
There is no reason to put them into LOCAL_WHOLE_STATIC_LIBRARIES.
Move libmesa_vulkan_util to the vulkan HAL which really needs it.
Signed-off-by: Chih-Wei Huang <[email protected]>
Acked-by: Eric Engestrom <[email protected]>
Acked-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The vulkan module is the final HAL. No need to export its headers
since none will import it.
Signed-off-by: Chih-Wei Huang <[email protected]>
Acked-by: Eric Engestrom <[email protected]>
Acked-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The vulkan module is the final HAL. No need to export its headers
since none will import it.
Signed-off-by: Chih-Wei Huang <[email protected]>
Acked-by: Eric Engestrom <[email protected]>
Acked-by: Emil Velikov <[email protected]>
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit re-plumbs all of nir_loop_analyze to use nir_ssa_scalar for
all intermediate values so that we can properly handle swizzles. Even
though if conditions are required to be scalars, they may still consume
swizzles so you could have ((a.yzw < b.zzx).xz && c.xx).y == 0 as your
loop termination condition. The old code would just bail the moment it
saw its first non-zero swizzle but we can now properly chase the scalar
from the if condition to all the way to a, b, and c.
Shader-db results on Kaby Lake:
total loops in shared programs: 4388 -> 4364 (-0.55%)
loops in affected programs: 29 -> 5 (-82.76%)
helped: 29
HURT: 5
Shader-db results on Haswell:
total loops in shared programs: 4370 -> 4373 (0.07%)
loops in affected programs: 2 -> 5 (150.00%)
helped: 2
HURT: 5
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This commit reworks both get_induction_and_limit_vars() and
try_find_trip_count_vars_in_iand to return true on success and not
modify their output parameters on failure. This makes their callers
significantly simpler.
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
There are various cases in which we want to chase SSA values through ALU
ops ranging from hand-written optimizations to back-end translation
code. In all these cases, it can be very tricky to do properly because
of swizzles. This set of helpers lets you easily work with a single
component of an SSA def and chase through ALU ops safely.
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
None of the current code knows what to do with swizzles. Take the safe
option for now and bail if we see one. This does have a small shader-db
impact but it is at least safe.
Shader-db results on Kaby Lake:
total loops in shared programs: 4364 -> 4388 (0.55%)
loops in affected programs: 5 -> 29 (480.00%)
helped: 5
HURT: 29
Shader-db results on Haswell:
total loops in shared programs: 4373 -> 4370 (-0.07%)
loops in affected programs: 5 -> 2 (-60.00%)
helped: 5
HURT: 2
Fixes: 6772a17acc8ee "nir: Add a loop analysis pass"
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The current code assumes everything is 32-bit which is very likely true
but not guaranteed by any means. Instead, use nir_eval_const_opcode to
do the calculations in a bit-size-agnostic way. We also use the new
constant constructors to build the correct size constants.
Fixes: 6772a17acc8ee "nir: Add a loop analysis pass"
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
One issue was that the original version didn't check that swizzles
matched when comparing ALU instructions so it could end up matching
very different instructions. Using the nir_instrs_equal function from
nir_instr_set.c which we use for CSE should be much more reliable.
Another was that the loop assumes it will only run two iterations which
may not be true. If there's something which guarantees that this case
only happens for phis after ifs, it wasn't documented.
Fixes: 9e6b39e1d521 "nir: detect more induction variables"
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|