| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
Not sure if this is the right way to do it, but it seems to work.
v2: make it a no-op on LLVM <= 3.5
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
| |
RadeonSI stats: Mostly 0% difference, but Valley shows a small improvement:
Application Files SGPRs VGPRs SpillSGPR SpillVGPR Code Size LDS Max Waves Waits
unigine_valley 278 0.00 % -0.29 % 0.00 % 0.00 % 0.01 % 0.00 % 0.17 % 0.00 %
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Apparently, these are deprecated. There's some AutoUpgrade feature which
is supposed to promote these to cmp/select, which apparently doesn't work
with jit code. It is possible it's not actually even meant to work (see
the bug filed against llvm which couldn't provide an answer neither)
but in any case this is meant to be only temporary unless the intrinsics
are really illegal. So, just use the fallback code (which should be cmp/select,
we're actually doing cmp/sext/trunc/select, but in any case llvm 3.9 manages
to optimize this back to pmin/pmax in the end).
This addresses https://llvm.org/bugs/show_bug.cgi?id=28176
CC: <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
Tested-by: Vinson Lee <[email protected]>
Tested-by: Aaron Watry <[email protected]>
|
|
|
|
|
|
|
| |
v2: include whitespace fixes
Signed-off-by: Jan Vesely <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
| |
This converts one other place to using the new helper.
Reviewed-by: Nicolai Hähnle <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
| |
This just uses the same form across the fetches.
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This just makes some generic code that currently emits double
suitable for emitting 64-bit values.
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
| |
Besides the old JIT bug, it seems the X86 backend on LLVM 3.3 doesn't
handle llvm.fmuladd and instead it fall backs to a C function. Which in
turn causes a segfault on Windows.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
As suggested by Roland Scheidegger.
Use the same logic as f16c, since fma requires VEX encoding.
But disable FMA on LLVM 3.3 without MCJIT.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
| |
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Instead of doing a add and then mask out the upper bits, we can
simply do a add with a half wide type (this, of course, assumes
the hw can actually do it...), so we'll get the required zero
in the upper bits automatically.
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Use GALLIVM_DEBUG=dumpbc for dumping of modules as bitcode.
Instead of a fixed llvmpipe.bc name, use ir_<modulename>.bc so multiple
modules can be dumped (albeit it might still overwrite previous modules,
particularly the modules from draw tend to always have the same name).
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
| |
Those aren't really interesting, however outputting them is helpful when
trying to feed the IR to llvm llc (or opt) for debugging.
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
| |
At least with MCJIT the disassembler will crash otherwise when trying to
disassemble such functions.
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We don't target this yet, and some llvm versions incorrectly enable it based
on cpu string, causing crashes.
(Albeit this is a losing battle, it is pretty much guaranteed when the next
new feature comes along llvm will mistakenly enable it on some future cpu,
thus we would have to proactively disable all new features as llvm adds them.)
This should fix https://bugs.freedesktop.org/show_bug.cgi?id=94291 (untested)
Tested-by: Timo Aaltonen <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]
CC: <[email protected]>
|
|
|
|
|
|
|
|
| |
LLVM when configured with "intel jitevents" enabled can inform
VTune about dynamic code, so individual shaders are attributed
profiling data and the resulting assembly can be examined.
Acked-by: Roland Scheidegger <[email protected]>
|
|
|
|
| |
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some cases (especially these using fract for coord wrapping) did not handle
NaNs (or Infs) correctly - the following code assumed the fract result
could not be outside [0,1], but if the input is a NaN (or +-Inf) the fract
result was NaN - which then could produce out-of-bound offsets.
(Note that the explicit NaN behavior changes for min/max on x86 sse don't
result in actual changes in the generated jit code, but may on other
architectures. Found by looking through all the wrap functions.)
This fixes https://bugs.freedesktop.org/show_bug.cgi?id=94955
No piglit changes.
(v2: fix min/max typo in coord_mirror, add comment)
Cc: "11.1 11.2" <[email protected]>
Tested-by: Bruce Cherniak <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
| |
Avoids warnings in release builds.
Signed-off-by: Grazvydas Ignotas <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
| |
Acked-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
| |
Use PIPE_SWIZZLE_* everywhere.
Use X/Y/Z/W/0/1 instead of RED, GREEN, BLUE, ALPHA, ZERO, ONE.
The new enum is called pipe_swizzle.
Acked-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Screwed up since 0753b135f6e83b171d8a1b08aea967374f3542bc.
(Only an issue with different min/mag filters, and then only in some cases,
which is probably why it went unnoticed for quite a while.
The effect should have simply been nearest mip filter instead of linear, iff
min was nearest, mag was linear, and all pixels hit the mignifying path.)
Fixes a bunch of dEQP failures.
Reviewed-by: Jose Fonseca <[email protected]>
Cc: "11.1 11.2" <[email protected]>
|
|
|
|
|
|
|
| |
Just use LLVM_HOST_TRIPLE, which is available at least from LLVM 3.3
onwards, and is pretty much what llvm::sys::getProcessTriple() does anyway,
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
| |
Just keep a copy of the module_name in gallivm.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
One needs to call setJITMemoryManager for LLVM 3.3, instead of
setMCJITMemoryManager.
This regressed in commits 065256df/75ad4fe7 when trying to make the
code to build with LLVM 3.6.
Tested MCJIT with LLVM 3.3 to 3.6.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On the LLVM versions that support it, so we can easily switch between
MCJIT/old-jit for testing.
The new option is GALLIVM_MCJIT.
Unfortunately setting GALLIVM_MCJIT=1 for LLVM 3.3 or 3.4 causes
segfault, both on Linux and Windows. I'm almost certain this used to
work, so there probably is a regression somewhere.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
| |
Instead of LLVM C++ interfaces.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
And llvm::raw_string_ostream where not (LLVM 3.3).
Thereby eliminating yet another dependency on unstable LLVM interfaces.
As a bonus this also gets LLVM IR on OutputDebugMessageA on MSVC (which
was disabled, probably due to C++ issues.)
Tested `lp_test_arit -v -v` on LLVM 3.3, 3.4 and 3.8.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This isn't currently that easy to expand, so fix it up
before expanding it later to include dynamic samplers.
[airlied: use some local variables (Roland)]
Reviewed-by: Roland Scheidegger <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
llvm 3.7 sometimes simply miscompiles vector selects.
See https://bugs.freedesktop.org/show_bug.cgi?id=94972
This was fixed in llvm r249669
(https://llvm.org/bugs/show_bug.cgi?id=24532).
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The credit for finding and isolating this bug goes to Vinson and Roland.
The buggy LLVM versions were found by doing
opt -instcombine llvm-pr27332.ll > /dev/null
where llvm-pr27332.ll is the IR from
https://llvm.org/bugs/show_bug.cgi?id=27332#c3
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We used to use sse roundps intrinsic directly, but switched to use the llvm
intrinsics for rounding with e4f01da15d8c6ce3e8c77ff3ff3d2ce2574a3f7b.
However, llvm semantics follows standard math lib round function which is
specced to do roundNearestAwayFromZero but we really want roundNearestEven
(moreoever, using round generates atrocious code since the cpu can't do it
directly and it results in scalar calls to libm __roundf).
So, use llvm.nearbyint instead, which does exactly the right thing, and even
has the advantage of being available with llvm 3.3 too. (I've verified it
actually generates a roundps instruction with llvm 3.3.)
This fixes https://bugs.freedesktop.org/show_bug.cgi?id=94909
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
For adding .v4f32 like suffixes to intrinsics, taking special care for
scalar case, which was being often neglected.
This fixes invalid IR when doing mipmap filtering on SSE2 (the only
case where we'd use intrinsics with scalars.)
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
| |
Exactly the same code.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We could unconditionally use these instrinsics, but performance with SSE2
would suck, as LLVM falls back to calling libm.
lp_test_arit.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
| |
For simulating less capable machines.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
| |
LLVM often can't determine the mask elements are all ones/zeros, and
there doesn't seem to be a good way to hint that.
Thanks to Roland Scheidegger for spotting and analyzing the issue.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
| |
No longer needed.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Only provide a fallback for LLVM 3.3.
One less dependency on LLVM C++ interface.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
By using os_log_message directly, as _debug_vprintf truncates messages
to 4K.
Also cleanup the disassemble interface.
Spotted by Roland.
Trivial.
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is an old patch I had around.
Vector selects seem to work well from LLVM 3.3. Using them should
improve code quality, as it might make constant propagation pass more
effective.
Tested lp_test_*
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
| |
This instruction has the resource (buffer or image) as a destination to
represent the writemask for SSBO writes. However, this is obviously not
a "real" destination for the purpose of emitting LLVM IR.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
swr driver which is written in C++ needs access to some more
gallium utility functions than are currently exposed.
Reviewed-by: Roland Scheidegger <[email protected]>
Acked-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Because the if statement that checks whether we have a return
statement is valid only on x86, surround it with X86 or X86-64
arch defines
Signed-off-by: Oded Gabbay <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, disassemble() directly prints to stdout. This has broke the
profiling support for llvmpipe JIT code.
This patch redirects the output to an sstream object, which is then
either gets printed to stdout (for assembly debugging) or gets written
to a file in /tmp/ (for profiling support).
Signed-off-by: Oded Gabbay <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
| |
Just like the rest of the msaa "implementation" it's just fake for now...
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
| |
This functionality is not exposed via the LLVM C API.
Tested-by: Michel Dänzer <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
LLVM removed LLVMAddTargetData for the 3.9 release in r260919. For the two
places in mesa where this is called, only enable the lines when compiling
for less then 3.9.
For the radeon driver, I'm not sure how to check if any other LLVM calls need
to be adjusted. I think since the target data used is extracted from the
LLVMModule, it isn't necessary to pass it back to LLVM again.
The code does compile, and at least for radeonsi does run OpenGL games.
[ Michel Dänzer: Move #if closer to LLVMAddTargetData in lp_bld_init.c,
and add HAVE_LLVM < 0x0309 guards around now unused occurrences of TD
and data_layout ]
Signed-off-by: Matthew Dawson <[email protected]>
Reviewed-and-Tested-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|