| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
| |
The new location field can be either center, centroid, or sample, which
indicates the location that the shader should interpolate at.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
| |
The _msaa shaders weren't getting freed.
Cc: "10.2" <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This doesn't fix any known issue. In fact, radeon drivers ignore all
the discard flags for textures and implicitly do "discard range"
for any write transfer.
Cc: [email protected]
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Now that this cap is used to determine the availability of both, adjust
its name to reflect the new reality.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
This is for reporting whether or not double precision floating-point
operations are supported.
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
|
| |
It looks like, since ce1a1372280d737a1b85279995529206586ae480, they are now
included in more places, in particular even for things buildable with MSVC,
and hence those break the build.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
| |
This is required for fallbacks to work with ARB_draw_indirect.
|
|
|
|
|
|
| |
v2:
Added comments to util_draw_indirect, clarified and fixed map size.
Removed unlikely().
|
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Required for the conversion stage of all VL targets to
a single library per API (static/shared pipe-drivers).
No longer required as per last commit.
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The old logic would let all negative values go through unclamped, with
potentially disastrous results (probably trying to fetch viewport values
from random memory locations). GL has undefined rendering for vp indices
outside valid range but that's a bit too undefined...
(The logic is now the same as in llvmpipe.)
CC: "10.1 10.2" <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
Tested-by: Ilia Mirkin <[email protected]>
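The clamping described above can be sketched in plain C. This is only an illustration: `MAX_VIEWPORTS`, the function name, and the choice to map every out-of-range index to viewport 0 are assumptions, not taken from the commit.

```c
#include <assert.h>

/* Assumed limit for illustration; the real driver constant may differ. */
#define MAX_VIEWPORTS 16

/* Map any out-of-range index, including negative ones, to viewport 0
 * so that it can never be used to read past the viewport array. */
static unsigned clamp_viewport_index(int idx)
{
   return (idx >= 0 && idx < MAX_VIEWPORTS) ? (unsigned)idx : 0u;
}
```

Checking both bounds on the signed value is the point: a test that only compares against the upper limit lets negative indices through.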
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Create a single library (for the vdpau api) thus reducing
the overall size of mesa. Current commit converts
vdpau-nouveau, with upcoming commits handling the rest.
The library can be built with the relevant pipe-drivers
statically linked in, or loaded as shared modules.
Currently we default to static.
Add SPLIT_TARGETS to guard the other VL targets.
Note: symlink handling is rather ugly and will need an
update to work with BSD and other non-linux platforms.
v2: Split the conversion into per-target basis.
Cc: Maarten Lankhorst <[email protected]>
Cc: Ilia Mirkin <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
Tested-by: Thomas Helland <thomashelland90 at gmail.com>
|
|
|
|
|
| |
Signed-off-by: Aaron Watry <[email protected]>
Reviewed-by: Tom Stellard <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously llvm detected cpu features automatically when the execution engine
was created (based on host cpu). This is no longer the case, which meant llvm
was then not able to emit some of the intrinsics we used as we didn't specify
any sse attributes (this was not a problem only on avx-supporting systems,
since we always set that attribute manually even though at least some llvm
versions enable it anyway). So, instead of trying to figure out which MAttrs
to set, just set MCPU.
This fixes https://bugs.freedesktop.org/show_bug.cgi?id=77493.
Reviewed-by: Jose Fonseca <[email protected]>
Tested-by: Vinson Lee <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Add a couple of helpers to be used by the dri targets when
built with static pipe-drivers. Both functions provide
functionality required by the dri state-tracker.
With this patch ilo, nouveau and r300 gain support for
throttle dri configuration.
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Will be used by gallium targets that statically link the
pipe-drivers in the final library. Provides identical
functionality to device_descriptor.create_screen.
v2:
- Don't sw_screen_wrap the i915/svga screen.
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
| |
If memory serves me right, at least one debug wrapper does
not return the base screen on failure.
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Required for the dri state-tracker. Will be used to retrieve
driver specific configuration parameters:
- share_fd (dmabuf) capability
- throttle
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
| |
The string is malloc'd (strdup) in loader_get_driver_for_fd().
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The extension is always supported if GLSL 1.30 is supported.
Softpipe and llvmpipe support is also added (trivial).
Radeon and nouveau support is already done.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Such conversions (which are most likely rather pointless in practice) were
resulting in shifts with negative shift counts and shifts with counts the same
as the bit width. This was always undefined in llvm, the code generated was
rather horrendous but happened to work.
So make sure such shifts are filtered out and replaced with something that
works (the generated code is still just as horrendous as before).
This fixes lp_test_format, https://bugs.freedesktop.org/show_bug.cgi?id=73846.
v2: prettify by using build context shift helpers.
Reviewed-by: Jose Fonseca <[email protected]>
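As a rough scalar analogue of filtering out the undefined shifts, the sketch below guards a 32-bit shift whose count may be negative or as large as the bit width. The helper name and the choice to return 0 for oversized counts are assumptions made for illustration; the commit does not say what the replacement code produces.

```c
#include <assert.h>
#include <stdint.h>

/* Shift left for positive counts and right for negative counts.
 * Counts whose magnitude is >= 32 yield 0 instead of invoking
 * undefined behavior. */
static uint32_t shift_signed_count(uint32_t v, int count)
{
   if (count >= 32 || count <= -32)
      return 0;
   if (count >= 0)
      return v << count;
   return v >> -count;
}
```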
|
|
|
|
|
| |
We need this for radeonsi, and it might be useful for other drivers,
too.
|
|
|
|
|
|
|
|
|
| |
Use the has_streamout flag as we do elsewhere to check if we need
to call pipe->set_stream_output_targets(). The driver might implement
the set_stream_output_targets() function, but not for all hardware
configurations.
Reviewed-by: Jose Fonseca <[email protected]>
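A minimal sketch of the check, using pared-down hypothetical structs (`fake_pipe`, `fake_ctx`) and a simplified hook signature rather than the real gallium interfaces:

```c
#include <assert.h>
#include <stdbool.h>

struct fake_pipe {
   /* Simplified stand-in for the driver hook. */
   void (*set_stream_output_targets)(struct fake_pipe *pipe,
                                     unsigned num_targets);
   unsigned calls;   /* counts hook invocations, for demonstration only */
};

struct fake_ctx {
   struct fake_pipe *pipe;
   bool has_streamout;   /* capability reported for this hardware */
};

static void count_call(struct fake_pipe *pipe, unsigned num_targets)
{
   (void)num_targets;
   pipe->calls++;
}

/* Unbind stream output targets only when the capability flag is set;
 * a non-NULL hook alone does not mean the hardware supports it. */
static void unbind_streamout(struct fake_ctx *c)
{
   if (c->has_streamout)
      c->pipe->set_stream_output_targets(c->pipe, 0);
}
```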
|
|
|
|
|
|
|
|
|
| |
This fixes the limits for GL 3.2, and subsequently fixes
some segfaults in some varying packing tests and max varying tests
after the limits were bumped.
Reviewed-by: Roland Scheidegger <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This limits the number of emitted vertices to the shader's max output
vertices, and keeps us from writing into memory that isn't big enough
for them.
Reviewed-by: Zack Rusin <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
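The limit can be pictured with a small C sketch. The struct layout and function name are hypothetical; the real code works on the draw module's geometry shader state.

```c
#include <assert.h>
#include <stddef.h>

struct gs_output {
   float   *verts;          /* room for max_out_verts * vert_size floats */
   unsigned vert_size;      /* floats per vertex */
   unsigned max_out_verts;  /* limit declared by the shader */
   unsigned emitted;        /* vertices stored so far */
};

/* Returns 1 if the vertex was stored, 0 if it was dropped because the
 * shader's declared maximum had already been reached. */
static int gs_emit_vertex(struct gs_output *out, const float *vert)
{
   if (out->emitted >= out->max_out_verts)
      return 0;

   float *dst = out->verts + (size_t)out->emitted * out->vert_size;
   for (unsigned i = 0; i < out->vert_size; i++)
      dst[i] = vert[i];
   out->emitted++;
   return 1;
}
```

Dropping the extra vertices is one possible policy; what matters is that the bounds check happens before the write.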
|
|
|
|
|
| |
Signed-off-by: Leo Liu <[email protected]>
Reviewed-by: Christian König <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
disabled.
At least on MSVC we statically link against the CRT, so we must disable
the CRT message boxes if we want unattended testing.
The messages are convenient when running manually, so let them be if the
system error message boxes are not disabled.
|
|
|
|
|
|
| |
Marek v2: add a cap
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes this build error with icc 14.0.2.
In file included from state_tracker/st_glsl_to_tgsi.cpp(63):
../../src/gallium/auxiliary/util/u_math.h(583): error: identifier "__builtin_clrsb" is undefined
return 31 - __builtin_clrsb(i);
^
Signed-off-by: Vinson Lee <[email protected]>
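For compilers that lack `__builtin_clrsb`, a portable fallback could look like the sketch below. The helper names are hypothetical and this is not necessarily how the patch resolves the error; it only reproduces the builtin's result for 32-bit values.

```c
#include <assert.h>
#include <stdint.h>

/* Number of leading bits that are copies of the sign bit, not counting
 * the sign bit itself. */
static int portable_clrsb(int32_t i)
{
   /* Fold negative values onto non-negative ones, so that redundant
    * sign bits always show up as leading zeros. */
   uint32_t u = (i < 0) ? ~(uint32_t)i : (uint32_t)i;
   int n = 0;

   /* Count zero bits from bit 30 downwards. */
   for (int bit = 30; bit >= 0 && !(u & (UINT32_C(1) << bit)); bit--)
      n++;
   return n;
}

/* Same expression as in the error message, using the fallback. */
static int last_bit_signed(int32_t i)
{
   return 31 - portable_clrsb(i);
}
```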
|
|
|
|
| |
Fixed upstream.
|
|
|
|
|
|
|
| |
It works fine, though it requires using ELF objects.
With this change there is nothing preventing us from switching exclusively
to MCJIT, everywhere. It's still off though.
|
|
|
|
|
|
|
|
|
|
| |
In commit 4be146b1, I neglected to add the new property to the strings
array. As a result, the string '(null)' is printed instead when
converting a GS shader to text.
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: "10.2" <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
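One way to catch this class of mistake at compile time is to tie the size of the strings array to the enum. The property names below are made up for illustration and are not the actual TGSI list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical property list. */
enum prop {
   PROP_INPUT_PRIM,
   PROP_OUTPUT_PRIM,
   PROP_MAX_OUTPUT_VERTICES,
   PROP_INVOCATIONS,   /* the newly added entry */
   PROP_COUNT
};

/* Left unsized on purpose: a missing string shrinks the array instead
 * of leaving a NULL slot that a printf-style dump may show as '(null)'. */
static const char *const prop_names[] = {
   "INPUT_PRIM",
   "OUTPUT_PRIM",
   "MAX_OUTPUT_VERTICES",
   "INVOCATIONS",
};

/* Fails to compile if an enum entry is added without a matching string. */
static_assert(sizeof(prop_names) / sizeof(prop_names[0]) == PROP_COUNT,
              "prop_names is out of sync with enum prop");

static const char *prop_name(unsigned p)
{
   return p < PROP_COUNT ? prop_names[p] : "UNKNOWN";
}
```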
|
|
|
|
|
|
|
|
|
|
|
| |
2ea923cf571235dfe573c35c3f0d90f632bd86d8 had the side effect of IR counting
now being done before IR optimization instead of after. Some quick analysis
shows that there are roughly 1.5 times as many IR instructions before
optimization as after, hence the effective shader cache size got quite a bit smaller.
Could counter this with an increase of the instruction limit but it probably
makes more sense to count them after optimizations, so move that code.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
| |
I actually checked the getModuleIdentifier() function exists with 3.1 but
missed that the file moved...
This fixes https://bugs.freedesktop.org/show_bug.cgi?id=78803
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Enabled with GALLIVM_DEBUG=perf (which up to now was only used to print
warnings for unoptimized code).
While some unexpectedly long shader compile times for some shaders were fixed
with 8a9f5ecdb116d0449d63f7b94efbfa8b205d826f this should help recognize such
problems in the future. For now though only available in debug builds (which
are not always suitable for such analysis). And since this uses system time,
it might not be all that accurate (even llvmpipe's own rasterization threads
might be running at the same time, or just other tasks).
(llvmpipe also has LP_DEBUG=counters but this only gives an average per shader
and the total time for all shaders.)
This prints information like this:
optimizing module fs17_variant0 took 1 msec
optimizing module setup_variant_0 took 0 msec
optimizing module draw_llvm_vs_variant0 took 9 msec
optimizing module draw_llvm_vs_variant0 took 12 msec
optimizing module fs17_variant1 took 2 msec
v2: rebase for recent gallivm compilation changes, and print time for whole
modules instead of functions (otherwise it would be very spammy since it would
include all trivial inline sse2 functions), using the shiny new module names,
prying them off LLVM using new helper (not available through C bindings).
Per-function timings, while possibly giving more information (for instance if
a problem were only in the partial and not the whole function), don't seem
all that useful for now.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
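The measurement itself amounts to reading the system clock before and after the optimization step. The sketch below uses C11 `timespec_get` and hypothetical helper names; the real code uses Mesa's own time helpers and only runs when GALLIVM_DEBUG=perf is set.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Wall-clock time in microseconds; like any system time source, it is
 * affected by whatever else is running concurrently. */
static int64_t now_us(void)
{
   struct timespec ts;
   timespec_get(&ts, TIME_UTC);
   return (int64_t)ts.tv_sec * 1000000 + ts.tv_nsec / 1000;
}

static int64_t elapsed_msec(int64_t start_us, int64_t end_us)
{
   return (end_us - start_us) / 1000;
}

/* Placeholder workload standing in for the optimization passes. */
static void dummy_optimize(void *data)
{
   (*(int *)data)++;
}

/* Time one whole module and print a line in the format shown above. */
static int64_t time_module(const char *name, void (*optimize)(void *),
                           void *data)
{
   int64_t start = now_us();
   optimize(data);
   int64_t msec = elapsed_msec(start, now_us());
   printf("optimizing module %s took %lld msec\n", name, (long long)msec);
   return msec;
}
```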
|
|
|
|
|
|
|
|
|
| |
When we had just one module "gallivm" was an appropriate name. But now we have
modules containing all functions for a particular variant, so give it a
corresponding name (this is really just for helping debugging).
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This workaround doesn't list any llvm version, but it was introduced
2010-06-10 (e277d5c1f6b2c5a6d202561e67d2b6821a69ecc4). It is unlikely
this bug is still present in llvm versions we support (3.1+).
There's no specific test listed, but I ran lp_test_arit (which uses
the mentioned functions) on llvm 3.1 and 3.3 with sse41 disabled and
this pass enabled without issues.
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
32bit code generation and llvm >= 2.7 used a different optimization pass
order - this code was initially introduced (2010-07-23) by
815e79e72c1f4aa849c0ee6103621685b678bc9d, apparently due to buggy code being
generated with then brand new llvm versions (which was llvm 2.7 plus pre 2.8
devel).
It seems highly likely that whatever this bug was, it has been fixed in
newer llvm versions, though there's no easy way to test this - the mentioned
piglit test was removed years ago, and even if you built it I'm
sceptical the glsl compiler would still produce the required code to trigger
it.
I have no idea what a good order of passes is, but just remove the workaround
and use the same order everywhere.
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
| |
All shaders had the same name.
We could probably use some identifier per shader too, but for now only use
the variant number.
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In 1d35f77228ad540a551a8e09e062b764a6e31f5e support for multiple constant
buffers was introduced. This meant we had another indirection, and we did
resolve the indirection for each constant buffer access. This looks very
reasonable since llvm can figure out if it's the same pointer, however it
turns out that this can cause llvm compilation time to go through the roof
and beyond (I've seen cases in excess of factor 100, e.g. from 50 ms to more
than 10 seconds (!)), with all the additional time spent in IR optimization
passes (and in the end all of it in DominatorTree::dominate()).
I've been unable to narrow it down a bit more (only some shaders seem affected,
seemingly without much correlation to overall shader complexity or constant
usage) but it is easily avoidable by doing the buffer lookups themselves just
once (at constant buffer declaration time).
Reviewed-by: Jose Fonseca <[email protected]>
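In plain C terms, the change corresponds to resolving the extra pointer indirection once per declared buffer and reusing the result for every access. The real code emits LLVM IR rather than executing loads directly; the structs and names here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_CONST_BUFFERS 4   /* assumed count for illustration */

struct context {
   const float **buffers;   /* array of pointers to constant buffers */
};

struct shader_decls {
   /* One resolved pointer per buffer, filled in at declaration time. */
   const float *consts[NUM_CONST_BUFFERS];
};

/* Resolve the indirection once, when the constant buffer is declared. */
static void declare_const_buffer(struct shader_decls *d,
                                 const struct context *ctx, unsigned buf)
{
   assert(buf < NUM_CONST_BUFFERS);
   d->consts[buf] = ctx->buffers[buf];
}

/* Every access reuses the cached pointer instead of looking it up again. */
static float fetch_const(const struct shader_decls *d, unsigned buf,
                         unsigned idx)
{
   return d->consts[buf][idx];
}
```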
|
|
|
|
|
| |
When there's an error, the stream also needs to be flushed, otherwise an
assertion is hit (meaning you don't actually see the error either).
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
| |
Not necessary, now that we will free the whole module (hence all
function bodies) immediately after compiling.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
| |
Unused. Deprecated by gallivm_free_ir().
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Free up unneeded LLVM stuff immediately after generating vertex shader
code. Saves about 500K per shader.
v2: Don't bother calling gallivm_free_function (Jose)
Signed-off-by: José Fonseca <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Split free_gallivm_state() into two steps. First step is
gallivm_free_ir() which cleans up the LLVM scaffolding used to generate
code while preserving the code itself. Second step is
gallivm_free_code() to free the memory occupied by the code.
v2: s/gallivm_teardown/gallivm_free_ir/ (Jose)
Signed-off-by: José Fonseca <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
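The two-step teardown can be sketched with a simplified stand-in struct. The names `jit_state`, `jit_free_ir`, and `jit_free_code` are hypothetical and only mirror the split described above; plain heap blocks stand in for the LLVM objects.

```c
#include <assert.h>
#include <stdlib.h>

struct jit_state {
   void  *ir;         /* scaffolding needed only while generating code */
   void  *code;       /* generated code, needed for as long as it is used */
   size_t code_size;
};

static int jit_state_init(struct jit_state *s, size_t ir_size,
                          size_t code_size)
{
   s->ir = malloc(ir_size);
   s->code = malloc(code_size);
   s->code_size = code_size;
   if (!s->ir || !s->code) {
      free(s->ir);
      free(s->code);
      s->ir = s->code = NULL;
      s->code_size = 0;
      return 0;
   }
   return 1;
}

/* Step 1: drop the scaffolding but keep the code. Safe to call twice. */
static void jit_free_ir(struct jit_state *s)
{
   free(s->ir);
   s->ir = NULL;
}

/* Step 2: release the memory occupied by the code itself. */
static void jit_free_code(struct jit_state *s)
{
   free(s->code);
   s->code = NULL;
   s->code_size = 0;
}
```

A caller would run step 1 right after compilation and step 2 only when the shader variant is destroyed.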
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Provide a JITMemoryManager derivative which puts all generated code into
one memory pool instead of creating a new one each time code is generated.
This saves significant memory per shader as the pool size is 512K and
a small shader occupies just several K.
This memory manager also defers freeing generated code until you tell
it to do so, making it possible to destroy the LLVM engine while keeping
the code, thus enabling future memory savings.
v2: Fix compilation errors with LLVM 3.4 (Jose)
Signed-off-by: José Fonseca <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
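The pooling idea can be illustrated with a simple bump allocator. This is a sketch under stated assumptions: it uses ordinary `malloc` memory (not executable pages), the names are hypothetical, and it does not derive from LLVM's JITMemoryManager as the real implementation does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define POOL_SIZE (512 * 1024)   /* pool size mentioned in the text */

struct code_pool {
   unsigned char *base;
   size_t used;
};

static int pool_init(struct code_pool *p)
{
   p->base = malloc(POOL_SIZE);
   p->used = 0;
   return p->base != NULL;
}

/* Carve `size` bytes out of the shared pool; returns NULL when full. */
static void *pool_alloc(struct code_pool *p, size_t size)
{
   size_t aligned = (size + 15) & ~(size_t)15;   /* 16-byte granularity */
   if (aligned < size || aligned > POOL_SIZE - p->used)
      return NULL;

   void *ptr = p->base + p->used;
   p->used += aligned;
   return ptr;
}

/* Nothing is released until the owner explicitly asks for it. */
static void pool_destroy(struct code_pool *p)
{
   free(p->base);
   p->base = NULL;
   p->used = 0;
}
```

Because several small allocations share one 512K block, a shader of a few kilobytes no longer costs a whole pool of its own.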
|