| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
Results from x11perf -copywinwin10 on Eric's SKL:
4.33338% ± 0.905054% (n=40)
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
Tested-by: Yogesh Marathe <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
While modern pthread mutexes are very fast, they still incur a call to an
external DSO and overhead of the generality and features of pthread mutexes.
Most mutexes in mesa only needs lock/unlock, and the idea here is that we can
inline the atomic operation and make the fast case just two intructions.
Mutexes are subtle and finicky to implement, so we carefully copy the
implementation from Ulrich Dreppers well-written and well-reviewed paper:
"Futexes Are Tricky"
http://www.akkadia.org/drepper/futex.pdf
We implement "mutex3", which gives us a mutex that has no syscalls on
uncontended lock or unlock. Further, the uncontended case boils down to a
cmpxchg and an untaken branch and the uncontended unlock is just a locked decr
and an untaken branch. We use __builtin_expect() to indicate that contention
is unlikely so that gcc will put the contention code out of the main code
flow.
A fast mutex only supports lock/unlock, can't be recursive or used with
condition variables. We keep the pthread mutex implementation around as
for the few places where we use condition variables or recursive locking.
For platforms or compilers where futex and atomics aren't available,
simple_mtx_t falls back to the pthread mutex.
The pthread mutex lock/unlock overhead shows up on benchmarks for CPU bound
applications. Most CPU bound cases are helped and some of our internal
bind_buffer_object heavy benchmarks gain up to 10%.
Signed-off-by: Kristian Høgsberg <[email protected]>
Signed-off-by: Timothy Arceri <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When I introduced gl_shader_program_data one of the intentions was to
fix a bug where a failed linking attempt freed data required by a
currently active program. However I seem to have failed to finish
hooking up the final steps required to have the data hang around.
Here we create a fresh instance of gl_shader_program_data every
time we link. gl_program has a reference to gl_shader_program_data
so it will be freed once the program is no longer active.
Cc: "17.2 17.3" <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
Reviewed-by: Neil Roberts <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102177
|
|
|
|
|
|
| |
Cc: "17.2 17.3" <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This turned out to be a dead end, it is much easier and less error
prone to just cache the IR used by the drivers backend e.g. TGSI or
NIR.
Cc: "17.2 17.3" <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This has a bit of a surprising effect:
For the render pipeline, the upload_sampler_state_table atom emits
3DSTATE_BINDING_TABLE_POINTERS_XS. It tries to avoid this for compute:
if (GEN_GEN >= 7 && stage_state->stage != MESA_SHADER_COMPUTE) {
/* Emit a 3DSTATE_SAMPLER_STATE_POINTERS_XS packet. */
genX(emit_sampler_state_pointers_xs)(brw, stage_state);
} ...
However, we were failing to initialize brw->cs.base.stage, so it was
left as 0 (MESA_SHADER_VERTEX), causing this condition to break. We
then emitted 3DSTATE_SAMPLER_STATE_POINTERS_VS in GPGPU mode, when
trying to upload CS samplers. Nothing good can come of this.
Found by inspection while debugging a GPU hang. Jordan believes this
helps the Deus Ex: Mankind Divided benchmark mode's stability when
running with shader cache.
Cc: [email protected]
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Timothy Arceri <tarceri at itsqueeze.com>
Cc: [email protected]
|
|
|
|
|
| |
Reviewed-by: Timothy Arceri <tarceri at itsqueeze.com>
Cc: [email protected]
|
|
|
|
|
|
|
|
|
| |
... as can happen with various types like mat4, or else we'll smash the
stack writing past the end of components_local[].
Fixes: 5a0d3e1129b7 ("nir: Print the components referenced for split or
packed shader in/outs.")
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
| |
Fixes compiling the optional standalone glsl compiler.
Reported-by: DrNick (on irc)
Signed-off-by: Dylan Baker <[email protected]>
Reviewed-and-Tested-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Fixes: 94d669b0d2f ("glsl: enforce fragment shader input restrictions in
GLSL ES 3.10")
Signed-off-by: Andreas Boll <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Unused since original import of VC5.
Fixes: ade416d0236 ("broadcom: Add VC5 NIR compiler.")
Signed-off-by: Andreas Boll <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes: 45bb8f29571 ("broadcom: Add V3D 3.3 gallium driver called "vc5",
for BCM7268.")
Cc: 17.3 <[email protected]>
Signed-off-by: Andreas Boll <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
| |
So far clover based its test for compiler support on the version of gcc,
while in reality support for c++11 is required. This patch replaces the
version check by the check unified for all modules that require c++11.
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
| |
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add a check that tests whether the c++ compiler supports c++11, either
by default, by adding the compiler flag -std=c++11, or by adding a
compiler flag that the user has specified via the environment variable
CXX11_CXXFLAGS.
The test only does a very shallow check of c++11 support, i.e. it tests
whether the define __cplusplus >= 201103L to confirm language support
by the compiler, and it checks whether the header <tuple> is available
to test the availability of the c++11 standard library.
A make file conditional HAVE_STD_CXX11 is provided that is used in this
patch to enable the test in st/mesa if C++11 support is available.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102665
Acked-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Currently we were overwriting the existing warning flags, instead of
adding new [as applicable].
Fixes c5d2e2d43f6 ("configure: Test for -Wno-initializer-overrides")
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently we were overwriting the existing warning flags, instead of
adding new [as applicable].
v2: Add missing space before -Werror (Eric)
Fixes e4b2b69e828 ("configure: Add and use AX_CHECK_COMPILE_FLAG")
Cc: Matt Turner <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Matt Turner <[email protected]> (v1)
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Because meson mirrors the auototools logic, it needs the same changes to
allow building glvnd based egl.
v2: - change if to elif (Eric)
Signed-off-by: Dylan Baker <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
Acked-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Targets such as omx and va can work w/o anything X related. Mandate the
xcb* dependencies only when the X11 platform is selected.
Reported-by: Lukas Rusak <[email protected]>
Fixes: 63e11ac2b5c ("configure: error out if building VA w/o supported
platform")
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
Tested-by: Lukas Rusak <[email protected]> (v1)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently we error out when building GLVND w/o GLX.
That was the original premice before we had EGL. As the commit says,
that error should be reworked to honour both - do so.
v2: Drop noop *);; (Eric)
Reported-by: Lukas Rusak <[email protected]>
Fixes: ce562f9e3fa ("EGL: Implement the libglvnd interface for EGL (v3)")
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
Tested-by: Lukas Rusak <[email protected]> (v1)
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Android implements the API and does the native damage handling itself.
At the same time it
a) does call the vendor's eglSwapBuffersWithDamageKHR
b) does not implement eglSetDamageRegionKHR
There's something strange happening here. For now simply note about the
'lack' of eglSwapBuffersWithDamageKHR support.
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The function is effectively a direct function call into
libwayland-server.so.
Thus GBM no longer depends on the wayland-drm static library, making the
build more straight forward. And the resulting binary is a bit smaller.
Note: we need to move struct wayland_drm_callbacks further up,
otherwise we'll get an error since the type is incomplete.
v2: Rebase, beef-up commit message, update meson, move struct
wayland_drm_callbacks.
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Daniel Stone <[email protected]> (v1)
Reviewed-by: Eric Engestrom <[email protected]> # meson bit only
Acked-by: Eric Engestrom <[email protected]> # for the rest
Reviewed-by: Dylan Baker <[email protected]> # meson
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 05fc62d89f5 sets the variable, yet it forgot the update the
existing reference to append (instead of assign).
Thus as-is the expat library was discarded from the link chain when
building with Android.
Fixes: 05fc62d89f5 ("automake: intel: move expat handling where it's
used")
Cc: Hongxu Jia <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Nearly all the distributions* that build Mesa OpenCL, enable the ICD.
Since building a non-ICD driver has the chance of conflicting with
existing OpenCL binary (libOpenCL.so).
Furthermore, some applications expect the library to provide
annotated/versioned symbols.
https://lists.freedesktop.org/archives/mesa-dev/2017-September/171093.html
*Fedora, Suse, Arch, Debian, Ubuntu, FreeBSD use the ICD
Gentoo manages the conflicting files via eselect.
Cc: Matt Turner <[email protected]>
Cc: Jan Vesely <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
Reviewed-By: Aaron Watry <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use $(sysconfdir) instead of hardcoding /etc.
While the OpenCL spec expects the file in /etc, people building their
stack can override that, esp. !Linux users.
Furthermore this removes a fundamental violation, which results in the
system file being overwritten even as one explicitly sets --prefix
and/or DESTDIR.
Cc: [email protected]
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
Reviewed-By: Aaron Watry <[email protected]>
|
|
|
|
|
|
|
| |
Otherwise it will be missing from the release tarball
Fixes: 7f33e94e43a ("amd/addrlib: update to latest version")
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
LLVM 6 changed the API on the fast-math-flags:
https://reviews.llvm.org/rL317488
NOTE: This also enables the new flag 'ApproxFunc' to allow for
approximations for library functions (sin, cos, ...). I'm not completly
convinced, that this is something mesa should do.
Signed-off-by: Tobias Droste <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-and-Tested-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch is mostly a patch done by Ilia Mirkin.
It fixes KHR-GL45.enhanced_layouts.varying_structure_locations.
v2: fix locations for TCS/TES/GS inputs and outputs (Ilia)
CC: Ilia Mirkin <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103098
Reviewed-by: Nicolai Hähnle <[email protected]>
Signed-off-by: Juan A. Suarez Romero <[email protected]>
|
|
|
|
|
|
|
|
| |
Use the NIR helper rather than the GLSL IR helper to get in/out
masks. This allows us to ignore varyings removed by NIR
optimisations.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
| |
We want to use nir_shader_gather_info() the GLSL IR version might
be including varyings that NIR later eliminates. To do this we
need to generate NIR before we we start using the in/out bitmasks.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
| |
Delaying adding built-in uniforms until after we convert to NIR
gives us a better chance to optimise them away. Also NIR allows
us to iterate over the uniforms directly so should be faster.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This uses C++11 initializer lists.
I just overwrote all Mesa files with internal addrlib and discarded
hunks that we should probably keep, but I might have missed something.
The code depending on ADDR_AM_BUILD is removed. We can add it back next
time if needed.
Acked-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
| |
Fixes GL_OUT_OF_MEMORY from streaming-texture-leak (and will hopefully
keep piglit from ooming on my no-swap platform, as well).
|
|
|
|
|
| |
We were doing f16 unpacks, which trashed "1" values. Fixes many piglit
texwrap GL_EXT_texture_integer cases.
|
|
|
|
|
|
| |
Gallium disables it by removing the streamout buffers, not by binding a
program that doesn't have TF outputs. Fixes piglit
"ext_transform_feedback2/counting with pause"
|
|
|
|
| |
Fixes piglit discard-drawarrays.
|
|
|
|
|
|
|
| |
The v3d_qpu_writes_r*() were only checking for fixed-function accumulator
writes, not normal ALU writes to those regs.
Fixes fs-discard-exit-2 on simulation (but not HW).
|
|
|
|
|
|
| |
We have to compute the queries in software, so we're counting the
primitives by hand. We still need to make sure to not increment the
PRIMITIVES_EMITTED if we overflowed, but leave that for later.
|
|
|
|
| |
Fixes all of piglit's OQ tests.
|
|
|
|
| |
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
|
|
|
|
| |
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
|
|
|
|
| |
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
|
|
|
|
| |
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Most of NIR doesn't allow doing array indexing on a vector (though it
does on a matrix). However, nir_lower_io handles it just fine and this
behavior is needed for shared variables in Vulkan. This commit makes
glsl_get_array_element do something sensible for vector types and makes
nir_validate happy with them.
Reviewed-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We were already validating that the parent type goes along with the
child type but we weren't actually validating that the parent type is
reasonable. This fixes that.
Acked-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The GL_ARB_shader_ballot spec says that gl_SubGroupSizeARB is declared
as a uniform. This means that it cannot change across an invocation
such as a draw call or a compute dispatch. For compute shaders, we're
ok because we only ever use one dispatch size. For fragment, however,
the hardware dynamically chooses between SIMD8 and SIMD16 which violates
the spec. Instead, let's just pick a subgroup size based on the shader
stage. The fixed size we choose for compute shaders is a bit higher
than strictly needed but there's no real harm in that. The advantage is
that, if they do anything interesting with the value, NIR will see it as
an immediate and can optimize better.
Acked-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Ballot intrinsics return a bitfield of subgroups. In GLSL and some
SPIR-V extensions, they return a uint64_t. In SPV_KHR_shader_ballot,
they return a uvec4. Also, some back-ends would rather pass around
32-bit values because it's easier than messing with 64-bit all the time.
To solve this mess, we make nir_lower_subgroups take a new parameter
called ballot_bit_size and it lowers whichever thing it gets in from the
source language (uint64_t or uvec4) to a scalar with the specified
number of bits. This replaces a chunk of the old lowering code.
Reviewed-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
|
|
|
|
|
|
| |
This lets you easily build integer immediates of arbitrary bit size.
Reviewed-by: Iago Toral Quiroga <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The SUBGROUP_*_MASK system values are uint64_t when coming in from GLSL
but uvec4 when coming in from SPIR-V. Lowering based on type allows us
to nicely handle both.
Reviewed-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
|