| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If we dump the bitcode for off-line debug purposes, we really want the
pre-optimized bitcode, otherwise it's useless in identifying problems
with IR optimization (if you have a shader which takes an hour to do
IR optimization, it's also nice you don't have to wait that hour...).
Also, print out the function passes for opt which correspond to what
was used for jit compilation (and also the opt level for codegen).
Using opt/llc this way should then pretty much mimic what was done
for jit. (When specifying something like -time-passes
-debug-pass=[Structure|Arguments] (for either opt or llc) that also
gives very useful information in which passes all the time was spent,
and which passes are really run along with the order - llvm will add
passes due to dependencies on its own, and of course -O2 for llc
comes with a ~100 pass list.)
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
| |
Conversion to int can otherwise overflow if compile times are over
~71min. (Yes this can happen...)
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
LICM is simply too expensive, even though it presumably can help quite
a bit in some cases.
It was definitely cheaper in llvm 3.3, though as far as I can tell with
llvm 3.3 it failed to do anything in most cases. early-cse also actually
seems to cause licm to be able to move things when it previously couldn't,
which causes noticeable compile time increases.
There's more loop passes in llvm, but I'm not sure which ones are helpful,
and I couldn't find anything which would roughly do what the old licm in
llvm 3.3 did, so ditch it.
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This pass is quite cheap, and can simplify the IR quite a bit for our
generated IR.
In particular on a variety of shaders I've found the time saved by
other passes due to the simplified IR more than makes up for the cost
of this pass, and on top of that the end result is actually better.
The only downside I've found is this enables the LICM pass to move some
things out of the main shader loop (in the case I've seen, instanced
vertex fetch (which is constant within the jit shader) plus the derived
instructions in the shader) which it couldn't do before for some reason.
This would actually be desirable but can increase compile time
considerably (licm seems to have considerable cost when it actually can
move things out of loops, due to alias analysis). But blaming early cse
for this seems inappropriate. (Note that the first two sroa / earlycse
passes are similar to what a standard llvm opt -O1/-O2 pipeline would
do, albeit this has some more passes even before but I don't think
they'd do much for us.)
It also in particular helps some crazy shader used for driver
verification (don't ask...) a lot (about factor of 6 faster in compile
time) (due to simplfiying the ir before LICM is run).
While here, also move licm behind simplifycfg. For some shaders there
seems to be very significant compile time gains (we've seen a factor
of 10000 albeit that was a really crazy shader you'd certainly never
see in a real app), beause LICM is quite expensive and there's cases
where running simplifycfg (along with sroa and early-cse) before licm
reduces IR complexity significantly. (I'm not entirely sure if it would
make sense to also run it afterwards.)
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We can't use any of the existing implementations in u_debug_stack.
Android technically has libunwind, but it's been modified to the point
where it no longer compiles with the Mesa usage. The library is also
not meant to be referenced by vendor libraries. The officially sanctioned
way of obtaining backtraces is through the Android own libbacktrace, a
C++ library. Access it through a separate C++ source file on Android only.
Signed-off-by: Stefan Schake <[email protected]>
Acked-by: Eric Engestrom <[email protected]>
Reviewed-by: Rob Herring <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The fallback path for no libunwind ends up being stubs for Android.
Don't compile them in so we can provide our own implementation.
Signed-off-by: Stefan Schake <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
|
|
| |
In preparation of dimension-aware LLVM image intrinsics.
Acked-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
This is in preparation for the new image intrinsics.
Acked-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
To match the header file.
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Charmaine Lee <[email protected]>
|
|
|
|
|
|
|
| |
Don't include Unix headers or use Unix functions when building with MSVC.
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Charmaine Lee <[email protected]>
|
|
|
|
| |
which also simplifies the build scripts.
|
|
|
|
|
|
|
|
|
|
|
| |
Add this prefix to the env var: "simple," For example:
GALLIUM_HUD=simple,fps
The X coordinates are the same, but the Y coordinates are different, because
there is only text.
'+' happens to behave the same as "\n".
',' happens to behave the same as "\n\n".
|
|
|
|
|
| |
Signed-off-by: Leo Liu <[email protected]>
Acked-by: Christian König <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Leo Liu <[email protected]>
Acked-by: Christian König <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Leo Liu <[email protected]>
Acked-by: Christian König <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This adds a helper to check if a pipe format is in YUV color space.
Drivers want to know about this, as YUV mostly needs special handling.
Signed-off-by: Lucas Stach <[email protected]>
Reviewed-by: Philipp Zabel <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In the old days (0.42.x), when mesa's meson system was written the
recommendation for handling conditional dependencies was to define them
as empty lists. When meson would evaluate the dependencies of a target
it would recursively flatten all of the arguments, and empty lists would
be removed. There are some problems with this, among them that lists and
dependencies have different methods (namely .found()), so the
recommendation changed to use `dependency('', required : false)` for
such cases. This has the advantage of providing a .found() method, so
there is no need to do things like `dep_foo != [] and dep_foo.found()`,
such a dependency should never exist.
I've tested this with 0.42 (the minimum we claim to support) and 0.45.
On 0.45 this removes warnings about comparing unlike types, such as:
meson.build:1337: WARNING: Trying to compare values of different types
(DependencyHolder, list) using !=.
v2: - Use dependency('', required : false) instead of
declare_dependency(), the later will always report that it is
found, which is not what we want.
Signed-off-by: Dylan Baker <[email protected]>
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Ilia Mirkin <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99549
|
|
|
|
|
|
| |
This fixes a radeonsi crash.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105026
|
|
|
|
|
|
|
|
|
|
| |
Without this the return value will never get set to -1. This
was first added in 49866c8f3457 and copied in 2b396eeed983.
Fixes: 2b396eeed983 "gallium/pb_cache: add a copy of cache bufmgr independent of pb_manager"
Reviewed-by: Marek Olšák <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102342
|
|
|
|
|
|
|
|
|
| |
Include llvm-c/Transforms/Utils.h with the newest LLVM 7
Signed-of-by: Mike Lothian <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
util_is_power_of_two_or_zero
The new name make the zero-input behavior more obvious. The next
patch adds a new function with different zero-input behavior.
Signed-off-by: Ian Romanick <[email protected]>
Suggested-by: Matt Turner <[email protected]>
Reviewed-by: Alejandro Piñeiro <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
GTF-GLES3.gtf.GL3Tests.instanced_arrays.instanced_arrays_divisor uses -1
as a divisor, so we would overflow to count=0 and upload no data,
triggering the assert below. We want to upload 1 element in this case,
fixing the test on VC5.
v2: Use some more obvious logic, and explain why we don't use the normal
round_up().
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes these build errors with GCC 4.4.
Compiling src/gallium/auxiliary/util/u_debug_stack.c ...
src/gallium/auxiliary/util/u_debug_stack.c: In function ‘debug_backtrace_capture’:
src/gallium/auxiliary/util/u_debug_stack.c:268: error: #pragma GCC diagnostic not allowed inside functions
src/gallium/auxiliary/util/u_debug_stack.c:269: error: #pragma GCC diagnostic not allowed inside functions
src/gallium/auxiliary/util/u_debug_stack.c:271: error: #pragma GCC diagnostic not allowed inside functions
Fixes: 370e356ebab4 ("gallium: silence __builtin_frame_address nonzero argument is unsafe warning")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105529
Signed-off-by: Vinson Lee <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Need to update the tgsi code and st_glsl_to_tgsi code at the same time
to prevent compile break since C++ is much pickier about implicit
enum/unsigned casting.
Bump size of glsl_to_tgsi_instruction::op to 10 bits to be sure to
avoid MSVC signed enum overflow issue. No change in class size.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
This way we can utilise it with later patches.
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Calling __builtin_frame_address with a nonzero argument is unsafe
but is sometimes done for debugging purposes. Since this code is
part of some debug util code I'm assuming that is the case here
and using GCC pragma to silence the warning.
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This fixes a memory trashing crash (not the test) seen with
dEQP-GLES3.stress.draw.unaligned_data.random.203
on virgl.
Reviewed-by: Marek Olšák <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
| |
This is an old patch that I had.
|
|
|
|
|
|
|
| |
(For the pipe_tex_filter enum)
Reviewed-by: Mathias Fröhlich <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
| |
The logic would not work correctly for line lengths smaller than 1.0,
even a degenerated line with length 0 would still produce a fragment
with anyhwere between alpha 0.0 and 0.5.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Like the r600 paths to use other custom states, we pass in a couple of
parameters to customize the innards of the blitter. It's up to the caller
to wrap other state necessary for its shaders (for example, constant
buffers for the uniforms the shader uses).
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Tegra K1 and later use a GPU that can be driven by the Nouveau driver.
But the GPU is a pure render node and has no display engine, hence the
scanout needs to happen on the Tegra display hardware. The GPU and the
display engine each have a separate DRM device node exposed by the
kernel.
To make the setup appear as a single device, this driver instantiates
a Nouveau screen with each instance of a Tegra screen and forwards GPU
requests to the Nouveau screen. For purposes of scanout it will import
buffers created on the GPU into the display driver. Handles that
userspace requests are those of the display driver so that they can be
used to create framebuffers.
This has been tested with some GBM test programs, as well as kmscube and
weston. All of those run without modifications, but I'm sure there is a
lot that can be improved.
Some fixes contributed by Hector Martin <[email protected]>.
Changes in v2:
- duplicate file descriptor in winsys to avoid potential issues
- require nouveau when building the tegra driver
- check for nouveau driver name on render node
- remove unneeded dependency on libdrm_tegra
- remove zombie references to libudev
- add missing headers to C_SOURCES variable
- drop unneeded tegra/ prefix for includes
- open device files with O_CLOEXEC
- update copyrights
Changes in v3:
- properly unwrap resources in ->resource_copy_region()
- support vertex buffers passed by user pointer
- allocate custom stream and const uploader
- silence error message on pre-Tegra124
- support X without explicit PRIME
Changes in v4:
- ship Meson build files in distribution tarball
- drop duplicate driver_tegra dependency
Reviewed-by: Emil Velikov <[email protected]>
Acked-by: Emil Velikov <[email protected]>
Tested-by: Andre Heider <[email protected]>
Reviewed-by: Dmitry Osipenko <[email protected]>
Reviewed-by: Dylan Baker <[email protected]>
Signed-off-by: Thierry Reding <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This option is disabled by default. Primarily intended for drivers on
virtual hardware.
Signed-off-by: Thomas Hellstrom <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Sinclair Yeh <[email protected]>
Reviewed-by: Deepak Rawat <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In contrast to non-aa, where stippling is based on either dx or dy
(depending on if it's a x or y major line), stippling is based on
actual distance with smooth lines, so adjust for this.
(It looks like there's some minor artifacts with mesa demos
line-sample and stippling, it looks like the line endpoints
aren't quite right with aa + stippling - maybe due to the
integer math in the stipple stage, but I can't quite pinpoint it.)
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The motivation actually was to get rid of the additional tex
instruction, since that requires the draw fallback code to intercept
all sampler / view calls (even if the fallback is never hit).
Basically, the idea is to use coverage of the pixel to calculate
the alpha value, and coverage is simply based on the distance
to the center of the line (in both line direction, which is useful
for wide lines, as well as perpendicular to the line).
This is much closer to what hw supporting this natively actually does.
It also fixes an issue with line width not quite being correct, as
well as endpoints getting stretched too far (in line direction) with
wide lines, which is apparent with mesa demo line-sample.
(For llvmpipe, it would probably make sense to do something like this
directly when drawing lines, since rendering two tris is twice as
expensive as a line, but it would need some changes with state
management.)
Since we're no longer relying on mipmapping to get the alpha value,
we also don't need to draw 3 rects (6 tris), one is sufficient.
There's still issues (as before):
- quite sure it's not correct without half_pixel_center, but can't test
this with GL.
- aaline + line stipple is incorrect (evident with line-sample demo).
Looking at the spec the stipple pattern should actually be based on
distance (not just dx or dy for x/y major lines as without aa).
- outputs (other than pos + the one used for line aa) should be
reinterpolated since we actually increase line length by half a pixel
(but there's no tests which would care).
v2: simplify the math (should be equivalent), don't need immediate
v3: use float versions of atan2,cos,sin, minor cleanups
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The comment said it will only represent the lowest 32 regs. This was
not entirely true in practice, since at least on x86 you'll get
masked shifts (unless the compiler could recognize it already and toss
it out). It turns out this actually works out alright (presumably
noone uses it for temp regs) when increasing max sampler views, so
make that behavior explicit.
Albeit it feels a bit hacky (but in any case, explicit behavior there
is better than undefined behavior).
Reviewed-by: Jose Fonseca <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Limit the length of acceptable cpu names for use in hud_get_num_cpufreq
in order to avoid a buffer overflow later in add_object when this name
is copied into cpufreq_info::name.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105274
Signed-off-by: Gert Wollny <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
Instead of listing all the UNIX PIPE_OS platforms just use
PIPE_OS_UNIX. Makes BSD sockets available on PIPE_OS_BSD.
Signed-off-by: Jonathan Gray <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
| |
There's no point, we know the highest non-null one.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
| |
We already stored the highest (potentially) used number.
Reviewed-by: Jose Fonseca <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
| |
Seems to have not been used since 16be87c90429
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This makes the dependencies easier to manage, since each media target
doesn't need to worry about linking to half a dozen libraries.
Fixes: b1b65397d0c4978e3 ("meson: Build gallium auxiliary")
Signed-off-by: Dylan Baker <[email protected]>
Acked-by: Eric Engestrom <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|