| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
Somehow they are not recognized as constants.
|
|
|
|
|
|
|
| |
These were leaking.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Removed u_half.py used to generate the table for previous method.
Previous implementation of float to half conversion was faulty for
denormalised and NaNs and would require extra logic to fix,
thus making the speedup of using tables irrelevant.
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
| |
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
| |
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
| |
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
| |
This function checks whether a format description is in a simple array format.
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
| |
Doesn't really change the generated assembly, but produces more compact IR,
and of course, makes code more consistent.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
| |
Signed-off-by: José Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes 21 piglit tests:
spec/glsl-1.10/execution/variable-indexing/
fs-temp-array-mat4-index-col-row-wr
vs-temp-array-mat4-index-col-row-wr
vs-temp-array-mat4-index-row-wr
spec/glsl-1.20/execution/variable-indexing/
fs-temp-array-mat3-index-col-row-rd
fs-temp-array-mat3-index-row-rd
fs-temp-array-mat4-col-row-wr
fs-temp-array-mat4-index-col-row-rd
fs-temp-array-mat4-index-col-row-wr
fs-temp-array-mat4-index-row-rd
fs-temp-array-mat4-index-row-wr
vs-temp-array-mat3-index-col-row-rd
vs-temp-array-mat3-index-col-row-wr
vs-temp-array-mat3-index-row-rd
vs-temp-array-mat3-index-row-wr
vs-temp-array-mat4-col-row-wr
vs-temp-array-mat4-index-col-row-rd
vs-temp-array-mat4-index-col-row-wr
vs-temp-array-mat4-index-col-wr
vs-temp-array-mat4-index-row-rd
vs-temp-array-mat4-index-row-wr
vs-temp-array-mat4-index-wr
... and prevents a lot of GPU lockups
|
|
|
|
|
| |
Remove macro which changes control flow (it's evil).
Make all fail paths print (correct) error message.
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
| |
For example
GALLIVM_DEBUG=no_brilinear
which was being parsed as two options, "no" and "brilinear".
|
|
|
|
|
|
|
|
| |
Updated lp_build_printf to share common code.
Removed specific lp_build_print_vecX.
Reviewed-by: José Fonseca <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
| |
Since we don't have them in hw we emulate them in the shader. Although not
recommended by the spec it is legit.
As a side effect we also get GL 2.1. I think this is as far as we can take
the i915.
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
DUAL_EXPORT can be enabled on r6xx/r7xx when all CBs use 16-bit export
and there is no depth/stencil export.
Signed-off-by: Jerome Glisse <[email protected]>
|
|
|
|
|
|
|
|
| |
It seems DUAL_EXPORT on evergreen may be enabled when all CBs use 16-bit export
mode (EXPORT_4C_16BPC), also there should be at least one CB, and the PS
shouldn't export depth/stencil.
Signed-off-by: Vadim Girlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
In some cases TGSI shader has more color outputs than the number of CBs,
so it seems we need to limit the number of color exports. This requires
different shader variants depending on the nr_cbufs, but on the other hand
we are doing less exports, which are very costly.
v2: fix various piglit regressions
Signed-off-by: Vadim Girlin <[email protected]>
Signed-off-by: Jerome Glisse <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Shader variants are stored in the list, the key for lookup is based on the
states that require different hw shaders - currently it's rctx->two_side (all
gpus) and rctx->nr_cbufs (evergreen/cayman, when writes_all property is set).
v2:
- use simple list instead of keymap as suggested by Marek on irc
- call r600_adjust_gprs from r600_bind_vs_shader for r6xx/r7xx
(r600_shader_select isn't used for vertex shaders currently)
v3:
- fix call to r600_adjust_gprs - do it after updating current shader
Improves performance for some apps, e.g. FlightGear -
see https://bugs.freedesktop.org/show_bug.cgi?id=50360
Signed-off-by: Vadim Girlin <[email protected]>
|
|
|
|
|
|
| |
And fix incorrect error message for a bad shader type/number.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
As with the previous commit for softpipe.
v2: remove 'default' case to get compile-time warning
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: José Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
These all return zero. Add a debug_printf() to catch the default case so
we don't accidently mishandle something important in the future.
v2: remove 'default' case to get compile-time warning
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: José Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This is actually required for GL_ARB_framebuffer_object, but the state
tracker doesn't currently check it.
Direct3D 9 allows mixed format color buffers with some restrictions.
Setting this allows Unigine Heaven 2.5 and 3.0 to run. Tested both on
GL and D3D hosts.
Reviewed-by: Jose Fonseca <[email protected]>
Reviewed-by: Jakob Bornecrantz <[email protected]>
|
|
|
|
|
|
|
|
| |
The type is the destination type (i.e. float vector) and not the
source type. Fixes piglit fs-{in,de}crement-uint.
Signed-off-by: Olivier Galibert <[email protected]>
Signed-off-by: José Fonseca <[email protected]>
|
| |
|
| |
|
|
|
|
|
|
| |
This fixes piglit EXT_transform_feedback tests:
- intervening-read output
- intervening-read prims_written
|
| |
|
|
|
|
|
| |
We are going to have a separate resource for depth texturing and transfers
and this is just a transfer thing.
|
| |
|
|
|
|
|
|
|
|
| |
It was only no-oping the clear() function, not actual triangle
rasterization. Move the no_rast field from lp_context down into
lp_rasterizer so it's accessible where it's needed.
Reviewed-by: Jose Fonseca <[email protected]>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
$CLANG_RESOURCE_DIR is the directory that contains all resources
needed by clang to compile programs. When clover uses clang to
compile kernels it needs to specify a resource dir, so that clang
can find its internal headers (e.g. stddef.h).
clang defines $CLANG_RESOURCE_DIR as $CLANG_LIBDIR/clang/$CLANG_VERSION
This patch adds the --with-clang-libdir option in order to accommodate
clang intalls to non-standard locations, and it also adds a check
to the configure script to verify that $CLANG_RESOURCE_DIR/include
contains the necessary header files.
|
|
|
|
|
| |
Signed-off-by: Olivier Galibert <[email protected]>
Signed-off-by: José Fonseca <[email protected]>
|
|
|
|
|
|
| |
Drop the compute specific evergreen_set_buffer_sync() function and
instead use the r600_surface_sync_command atom for emitting SURFACE_SYNC
packets.
|
| |
|
| |
|
|
|
|
|
| |
Thie BitExtract optimization folds a mask and shift operation together
into a single instruction (BFE_UINT).
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It's not optimal, but it's better than the register pressure scheduler
that was previously being used. The VLIW scheduler currently ignores
all the complicated instruction groups restrictions and just tries to
fill the instruction groups with as many instructions as possible.
Though, it does know enough not to put two trans only instructions in
the same group.
We are able to ignore the instruction group restrictions in the LLVM
backend, because the finalizer in r600_asm.c will fix any illegal
instruction groups the backend generates.
Enabling the VLIW scheduler improved the run time for a sha1 compute
shader by about 50%. I'm not sure what the impact will be for graphics
shaders. I tested Lightsmark with the VLIW scheduler enabled and the
framerate was about the same, but it might help apps that use really
big shaders.
|
| |
|
|
|
|
|
|
| |
Every place that uses ASM_FLAGS already uses DEFINES. Not including
it in DEFINES is just a way to screw up potential users, as I've done
several times while working on the build system.
|
|
|
|
|
|
|
|
|
|
| |
1) We need to insert a barrier between consecutive transform feedback calls.
2) VBO cache needs to be flushed when TFB output is used as VBO draw input.
Fixes Piglit test EXT_transform_feedback/immediate-reuse.
Thanks to Christoph Bumiller for pointing out bugs in previous versions
of this patch.
|
|
|
|
|
|
| |
Fixes alignment problems with flash player.
Signed-off-by: Christian König <[email protected]>
|
|
|
|
|
|
| |
That makes the output black in case of decoding errors.
Signed-off-by: Christian König <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Olivier Galibert <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|