| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
All shaders had the same name.
We could probably use some identifier per shader too, but for now only use
the variant number.
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In 1d35f77228ad540a551a8e09e062b764a6e31f5e support for multiple constant
buffers was introduced. This meant we had another indirection, and we did
resolve the indirection for each constant buffer access. This looks very
reasonable since llvm can figure out if it's the same pointer, however it
turns out that this can cause llvm compilation time to go through the roof
and beyond (I've seen cases in excess of factor 100, e.g. from 50 ms to more
than 10 seconds (!)), with all the additional time spent in IR optimization
passes (and in the end all of it in DominatorTree::dominate()).
I've been unable to narrow it down a bit more (only some shaders seem affected,
seemingly without much correlation to overall shader complexity or constant
usage) but it is easily avoidable by doing the buffer lookups themeselves just
once (at constant buffer declaration time).
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
| |
When there's an error, also need to flush the stream, otherwise an assertion
is hit (meaning you don't actually see the error neither).
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
| |
Not necessary, now that we will free the whole module (hence all
function bodies) immediately after compiling.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
| |
Unused. Deprecated by gallivm_free_ir().
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Free up unneeded LLVM stuff immediately after generating vertex shader
code. Saves about 500K per shader.
v2: Don't bother calling gallivm_free_function (Jose)
Signed-off-by: José Fonseca <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Split free_gallivm_state() into two steps. First step is
gallivm_free_ir() which cleans up the LLVM scaffolding used to generate
code while preserving the code itself. Second step is
gallivm_free_code() to free the memory occupied by the code.
v2: s/gallivm_teardown/gallivm_free_ir/ (Jose)
Signed-off-by: José Fonseca <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Provide a JITMemoryManager derivative which puts all generated code into
one memory pool instead of creating a new one each time code is generated.
This saves significant memory per shader as the pool size is 512K and
a small shader occupies just several K.
This memory manager also defers freeing generated code until you tell
it to do so, making it possible to destroy the LLVM engine while keeping
the code, thus enabling future memory savings.
v2: Fix compilation errors with LLVM 3.4 (Jose)
Signed-off-by: José Fonseca <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
| |
This is how it is meant to be done nowadays.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
I saw that LLVM internally uses its global context for some things, even
when we use our own. Given ours is also global, might as well use
LLVM's.
However, sepearate contexts can still be enabled with a simple source
code modification, for when the need/benefit arises.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
| |
Nowadays LLVMModuleProviderRef is just an alias for LLVMModuleRef, so
its use just causes unnecessary confusion.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
| |
Older versions haven't been tested probably don't work anyway. But more
importantly, code supporting it is hindering further work.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The screen takes ownership of the winsys, and is responsible for
destroying it. Users of pipe-loader should make sure they destory
and screens they've created to avoid memory leaks.
This fixes a crash in clover introduced by
ce6c17c0833032e91a2d1b34f9eb80c738a854a2 where the pipe-loader was
destroying the winsys while a screen was still using it.
Cc: "10.2" <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1c73e919a4b4dd79166d0633075990056f27fd28 made it possible to not allocate
the tgsi machine if llvm was used. However, draw_get_option_use_llvm() is
not reliable after draw context creation, since drivers can explicitly
request a non-llvm draw context even if draw_get_option_use_llvm() would
return true (and softpipe does just that) which leads to crashes.
Thus use draw->llvm to determine if we're using llvm or not instead (and
make draw->llvm available even if HAVE_LLVM is false so we don't have to put
even more ifdefs).
Cc: "10.2" <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
| |
Add cases for PIPE_SHADER_CAP_MAX_SAMPLER_VIEWS and
PIPE_SHADER_CAP_PREFERRED_IR. Remove default switch case so we
learn of missing cases at compile time.
Reviewed-by: José Fonseca <[email protected]>
|
|
|
|
|
|
|
|
| |
Return PIPE_SHADER_IR_TGSI for the PIPE_SHADER_CAP_PREFERRED_IR query.
Remove default switch case so we learn of missing switch cases at
compile time.
Reviewed-by: José Fonseca <[email protected]>
|
|
|
|
|
|
|
| |
There are now provided by VS.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
don't leak the MCSubtargetInfo (not really big, was already fixed with
llvm master) and TargetMachine (big). While this is only used for debugging
the leak is large enough to get you into trouble in some cases.
Tested with llvm 3.1 and master.
Before (llvm 3.1), GALLIVM_DEBUG=asm glxgears:
==14152== LEAK SUMMARY:
==14152== definitely lost: 105,228 bytes in 20 blocks
==14152== indirectly lost: 347,252 bytes in 261 blocks
==14152== possibly lost: 866,625 bytes in 1,453 blocks
==14152== still reachable: 7,344,677 bytes in 6,494 blocks
==14152== suppressed: 0 bytes in 0 blocks
After:
==13799== LEAK SUMMARY:
==13799== definitely lost: 3,108 bytes in 6 blocks
==13799== indirectly lost: 0 bytes in 0 blocks
==13799== possibly lost: 804,143 bytes in 1,429 blocks
==13799== still reachable: 7,314,267 bytes in 6,473 blocks
==13799== suppressed: 0 bytes in 0 blocks
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
It is possible that there are multiple input buffers but only one is
relevant for translation. Then there will be only a single translation
group, which might need to source data from a buffer index != 0.
Fixes wrong vertex shader inputs as observed while debugging with an
application and driver combination that requires translation of a
vertex attribute in a non-trivial set of attributes and input buffers.
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
| |
Add bitwise reversing and signed MSB helpers for software implementation
of the new TGSI opcodes.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The old python code used sys.is_big_endian to select between little-endian
and big-endian formats, which meant that the build and host endiannesses
needed to be the same. This patch instead generates both big- and little-
endian layouts, using PIPE_ARCH_BIG_ENDIAN to select between them.
Signed-off-by: Richard Sandiford <[email protected]>
Signed-off-by: José Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Splits out the code that parses the channel list, so that we
can have different lists for little and big endian.
There is no change to the generated u_format_table.c.
Signed-off-by: Richard Sandiford <[email protected]>
Signed-off-by: José Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Rather than iterate over format.channels and format.swizzles directly,
use Python subfunctions that take the channel and swizzle lists as
arguments. This allow the channel and swizzle lists to depend on
endianness.
There is no change to the generated u_format_table.c.
Signed-off-by: Richard Sandiford <[email protected]>
Signed-off-by: José Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
With the big-endian changes, there can be two swizzle orders for each format.
This patch turns Format.inv_swizzle() into a global function that takes the
swizzle list as a parameter.
There is no change to the generated u_format_table.c.
Signed-off-by: Richard Sandiford <[email protected]>
Signed-off-by: José Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The main aim is to reduce the number of places that access channels[0],
swizzles[0] and swizzles[1] directly.
There is no change to the generated u_format_table.c.
Signed-off-by: Richard Sandiford <[email protected]>
Signed-off-by: José Fonseca <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The function relies on the sw/dri winsys which is build only when --enable-dri
is set. Fixes build issues with the following config
./configure --disable-dri --with-gallium-drivers=svga --enable-xa
Issue can be reproduced with any hw gallium driver + st that uses the pipe-loader.
Cc: Brian Paul <[email protected]>
Reported-by: Brian Paul <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
| |
util_color often merely represents a collection of bytes, however it is
inconvenient if those bytes can only be accessed as floats/doubles for int
formats exceeding 32bits.
(Note that since rgba8 formats use one uint, not 4 bytes, hence the byte and
short member were left as is.)
|
|
|
|
|
|
|
|
|
|
| |
Lets make draw_get_option_use_llvm function available unconditionally
and use it to avoid useless allocations when LLVM paths are active.
TGSI machine is never used when we're using LLVM.
Signed-off-by: Zack Rusin <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: José Fonseca <[email protected]>
|
|
|
|
|
|
| |
Courtesy of MSVC static code analyser.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
| |
Courtesy of Clang static analyzer.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
| |
Mostly for consistency; as MSVC's static source code analysis doesn't
seem to rely on assertions, but instead on different kind of source
annotations( http://msdn.microsoft.com/en-us/library/hh916383.aspx ).
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
warning.
Now that _debug_assert_fail() has the noreturn attribute, it is better
that execution truly never returns. Not just for sake of silencing the
warning, but because the code at the return IP address may be invalid or
lead to inconsistent results.
This removes support for the GALLIUM_ABORT_ON_ASSERT debugging
environment variable, but between the usefulness of
GALLIUM_ABORT_ON_ASSERT and better static code analysis I think better
static code analysis wins.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
| |
Just adjust to the ever-changing API, pass in MCContext when creating the
MCDisassembler.
Reviewed-by: Tom Stellard <[email protected]>
|
|
|
|
|
|
|
| |
As recommended by
http://clang-analyzer.llvm.org/annotations.html#attr_noreturn
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
| |
It was computed, but never actually used.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This replaces u_gen_mipmap with an extremely simple implementation based
on pipe->blit. st/mesa is also cleaned up.
Pros:
- less code
- correct mipmap generation for NPOT 3D textures (u_blitter uses a better
formula)
- queries are not affected by mipmap generation if drivers disable them
v2: add "first_layer", "last_layer" parameters, drop "face"
v2.1: add format
v2.2: document the format parameter
|
|
|
|
|
|
|
|
|
| |
This opcode provide support for GL_ARB_texture_query_lod,
Signed-off-by: Dave Airlie <[email protected]>
[imirkin: rebase, docs update]
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
So that pipe->blit can be used for 3D mipmap generation.
|
|
|
|
|
| |
It may cause issues with mipmap generation.
I think it was used to make some piglit tests pass on r300g.
|
|
|
|
|
|
|
|
|
| |
As we do for sampler states in single_sampler_done() and many other
CSO functions.
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: José Fonseca <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We want to call pipe->set_sampler_views() with count being the
maximum of the old number of sampler views and the new number.
This makes sure we null-out any old sampler views.
We already do the same thing for sampler states in single_sampler_done().
Fixes some assertions seen in the VMware driver with XA tracker.
Cc: "10.0" "10.1" <[email protected]>
Reviewed-by: Thomas Hellstrom <[email protected]>
Tested-by: Thomas Hellstrom <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|