| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
| |
tgsi_ureg was recently enhanced to support local temporaries, and as result
temps are declared individually.
This change avoids many TEMP register declarations on common shaders.
(And fixes performance regression due to mismatches against performance
sensitive shaders.)
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
| |
No behaviour change intended.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
| |
As suggested by Andy Furniss: it looks like some old gcc versions
require it.
|
|
|
|
|
| |
If devs is NULL, then the kernel should be compiled for all devices
associated with the program.
|
|
|
|
|
|
|
|
|
| |
The templated copy constructor doesn't prevent the compiler from
emitting a default copy constructor, which leads to inconsistent
memory handling and was reported to cause segfaults when doing event
manipulation.
Reported-by: Tom Stellard <[email protected]>
|
| |
|
|
|
|
|
|
|
|
|
| |
The function internalizer pass marks non-kernel functions as internal,
which enables optimizations like function inlining and global dead-code
elimination.
v2:
- Pass vector arguments by const reference
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
| |
Thanks to Brian for spotting this.
|
|
|
|
| |
Somehow they are not recognized as constants.
|
|
|
|
|
|
|
| |
These were leaking.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Removed u_half.py used to generate the table for previous method.
Previous implementation of float to half conversion was faulty for
denormalised and NaNs and would require extra logic to fix,
thus making the speedup of using tables irrelevant.
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
| |
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
| |
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
| |
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
| |
This function checks whether a format description is in a simple array format.
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
| |
Doesn't really change the generated assembly, but produces more compact IR,
and of course, makes code more consistent.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
| |
Signed-off-by: José Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes 21 piglit tests:
spec/glsl-1.10/execution/variable-indexing/
fs-temp-array-mat4-index-col-row-wr
vs-temp-array-mat4-index-col-row-wr
vs-temp-array-mat4-index-row-wr
spec/glsl-1.20/execution/variable-indexing/
fs-temp-array-mat3-index-col-row-rd
fs-temp-array-mat3-index-row-rd
fs-temp-array-mat4-col-row-wr
fs-temp-array-mat4-index-col-row-rd
fs-temp-array-mat4-index-col-row-wr
fs-temp-array-mat4-index-row-rd
fs-temp-array-mat4-index-row-wr
vs-temp-array-mat3-index-col-row-rd
vs-temp-array-mat3-index-col-row-wr
vs-temp-array-mat3-index-row-rd
vs-temp-array-mat3-index-row-wr
vs-temp-array-mat4-col-row-wr
vs-temp-array-mat4-index-col-row-rd
vs-temp-array-mat4-index-col-row-wr
vs-temp-array-mat4-index-col-wr
vs-temp-array-mat4-index-row-rd
vs-temp-array-mat4-index-row-wr
vs-temp-array-mat4-index-wr
... and prevents a lot of GPU lockups
|
|
|
|
|
| |
Remove macro which changes control flow (it's evil).
Make all fail paths print (correct) error message.
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
| |
For example
GALLIVM_DEBUG=no_brilinear
which was being parsed as two options, "no" and "brilinear".
|
|
|
|
|
|
|
|
| |
Updated lp_build_printf to share common code.
Removed specific lp_build_print_vecX.
Reviewed-by: José Fonseca <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
| |
Since we don't have them in hw we emulate them in the shader. Although not
recommended by the spec it is legit.
As a side effect we also get GL 2.1. I think this is as far as we can take
the i915.
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
DUAL_EXPORT can be enabled on r6xx/r7xx when all CBs use 16-bit export
and there is no depth/stencil export.
Signed-off-by: Jerome Glisse <[email protected]>
|
|
|
|
|
|
|
|
| |
It seems DUAL_EXPORT on evergreen may be enabled when all CBs use 16-bit export
mode (EXPORT_4C_16BPC), also there should be at least one CB, and the PS
shouldn't export depth/stencil.
Signed-off-by: Vadim Girlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
In some cases TGSI shader has more color outputs than the number of CBs,
so it seems we need to limit the number of color exports. This requires
different shader variants depending on the nr_cbufs, but on the other hand
we are doing less exports, which are very costly.
v2: fix various piglit regressions
Signed-off-by: Vadim Girlin <[email protected]>
Signed-off-by: Jerome Glisse <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Shader variants are stored in the list, the key for lookup is based on the
states that require different hw shaders - currently it's rctx->two_side (all
gpus) and rctx->nr_cbufs (evergreen/cayman, when writes_all property is set).
v2:
- use simple list instead of keymap as suggested by Marek on irc
- call r600_adjust_gprs from r600_bind_vs_shader for r6xx/r7xx
(r600_shader_select isn't used for vertex shaders currently)
v3:
- fix call to r600_adjust_gprs - do it after updating current shader
Improves performance for some apps, e.g. FlightGear -
see https://bugs.freedesktop.org/show_bug.cgi?id=50360
Signed-off-by: Vadim Girlin <[email protected]>
|
|
|
|
|
|
| |
And fix incorrect error message for a bad shader type/number.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
As with the previous commit for softpipe.
v2: remove 'default' case to get compile-time warning
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: José Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
These all return zero. Add a debug_printf() to catch the default case so
we don't accidently mishandle something important in the future.
v2: remove 'default' case to get compile-time warning
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: José Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This is actually required for GL_ARB_framebuffer_object, but the state
tracker doesn't currently check it.
Direct3D 9 allows mixed format color buffers with some restrictions.
Setting this allows Unigine Heaven 2.5 and 3.0 to run. Tested both on
GL and D3D hosts.
Reviewed-by: Jose Fonseca <[email protected]>
Reviewed-by: Jakob Bornecrantz <[email protected]>
|
|
|
|
|
|
|
|
| |
The type is the destination type (i.e. float vector) and not the
source type. Fixes piglit fs-{in,de}crement-uint.
Signed-off-by: Olivier Galibert <[email protected]>
Signed-off-by: José Fonseca <[email protected]>
|
| |
|
| |
|
|
|
|
|
|
| |
This fixes piglit EXT_transform_feedback tests:
- intervening-read output
- intervening-read prims_written
|
| |
|
|
|
|
|
| |
We are going to have a separate resource for depth texturing and transfers
and this is just a transfer thing.
|
| |
|
|
|
|
|
|
|
|
| |
It was only no-oping the clear() function, not actual triangle
rasterization. Move the no_rast field from lp_context down into
lp_rasterizer so it's accessible where it's needed.
Reviewed-by: Jose Fonseca <[email protected]>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
$CLANG_RESOURCE_DIR is the directory that contains all resources
needed by clang to compile programs. When clover uses clang to
compile kernels it needs to specify a resource dir, so that clang
can find its internal headers (e.g. stddef.h).
clang defines $CLANG_RESOURCE_DIR as $CLANG_LIBDIR/clang/$CLANG_VERSION
This patch adds the --with-clang-libdir option in order to accommodate
clang intalls to non-standard locations, and it also adds a check
to the configure script to verify that $CLANG_RESOURCE_DIR/include
contains the necessary header files.
|