| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes 21 piglit tests:
spec/glsl-1.10/execution/variable-indexing/
fs-temp-array-mat4-index-col-row-wr
vs-temp-array-mat4-index-col-row-wr
vs-temp-array-mat4-index-row-wr
spec/glsl-1.20/execution/variable-indexing/
fs-temp-array-mat3-index-col-row-rd
fs-temp-array-mat3-index-row-rd
fs-temp-array-mat4-col-row-wr
fs-temp-array-mat4-index-col-row-rd
fs-temp-array-mat4-index-col-row-wr
fs-temp-array-mat4-index-row-rd
fs-temp-array-mat4-index-row-wr
vs-temp-array-mat3-index-col-row-rd
vs-temp-array-mat3-index-col-row-wr
vs-temp-array-mat3-index-row-rd
vs-temp-array-mat3-index-row-wr
vs-temp-array-mat4-col-row-wr
vs-temp-array-mat4-index-col-row-rd
vs-temp-array-mat4-index-col-row-wr
vs-temp-array-mat4-index-col-wr
vs-temp-array-mat4-index-row-rd
vs-temp-array-mat4-index-row-wr
vs-temp-array-mat4-index-wr
... and prevents a lot of GPU lockups
|
|
|
|
|
| |
Remove macro which changes control flow (it's evil).
Make all fail paths print (correct) error message.
|
| |
|
| |
|
|
|
|
|
|
|
|
| |
Since we don't have them in hw we emulate them in the shader. Although not
recommended by the spec it is legit.
As a side effect we also get GL 2.1. I think this is as far as we can take
the i915.
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
DUAL_EXPORT can be enabled on r6xx/r7xx when all CBs use 16-bit export
and there is no depth/stencil export.
Signed-off-by: Jerome Glisse <[email protected]>
|
|
|
|
|
|
|
|
| |
It seems DUAL_EXPORT on evergreen may be enabled when all CBs use 16-bit export
mode (EXPORT_4C_16BPC), also there should be at least one CB, and the PS
shouldn't export depth/stencil.
Signed-off-by: Vadim Girlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
In some cases TGSI shader has more color outputs than the number of CBs,
so it seems we need to limit the number of color exports. This requires
different shader variants depending on the nr_cbufs, but on the other hand
we are doing less exports, which are very costly.
v2: fix various piglit regressions
Signed-off-by: Vadim Girlin <[email protected]>
Signed-off-by: Jerome Glisse <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Shader variants are stored in the list, the key for lookup is based on the
states that require different hw shaders - currently it's rctx->two_side (all
gpus) and rctx->nr_cbufs (evergreen/cayman, when writes_all property is set).
v2:
- use simple list instead of keymap as suggested by Marek on irc
- call r600_adjust_gprs from r600_bind_vs_shader for r6xx/r7xx
(r600_shader_select isn't used for vertex shaders currently)
v3:
- fix call to r600_adjust_gprs - do it after updating current shader
Improves performance for some apps, e.g. FlightGear -
see https://bugs.freedesktop.org/show_bug.cgi?id=50360
Signed-off-by: Vadim Girlin <[email protected]>
|
|
|
|
|
|
| |
And fix incorrect error message for a bad shader type/number.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
As with the previous commit for softpipe.
v2: remove 'default' case to get compile-time warning
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: José Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
These all return zero. Add a debug_printf() to catch the default case so
we don't accidently mishandle something important in the future.
v2: remove 'default' case to get compile-time warning
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: José Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This is actually required for GL_ARB_framebuffer_object, but the state
tracker doesn't currently check it.
Direct3D 9 allows mixed format color buffers with some restrictions.
Setting this allows Unigine Heaven 2.5 and 3.0 to run. Tested both on
GL and D3D hosts.
Reviewed-by: Jose Fonseca <[email protected]>
Reviewed-by: Jakob Bornecrantz <[email protected]>
|
| |
|
|
|
|
|
| |
We are going to have a separate resource for depth texturing and transfers
and this is just a transfer thing.
|
| |
|
|
|
|
|
|
|
|
| |
It was only no-oping the clear() function, not actual triangle
rasterization. Move the no_rast field from lp_context down into
lp_rasterizer so it's accessible where it's needed.
Reviewed-by: Jose Fonseca <[email protected]>
|
| |
|
|
|
|
|
|
| |
Drop the compute specific evergreen_set_buffer_sync() function and
instead use the r600_surface_sync_command atom for emitting SURFACE_SYNC
packets.
|
| |
|
| |
|
|
|
|
|
| |
Thie BitExtract optimization folds a mask and shift operation together
into a single instruction (BFE_UINT).
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It's not optimal, but it's better than the register pressure scheduler
that was previously being used. The VLIW scheduler currently ignores
all the complicated instruction groups restrictions and just tries to
fill the instruction groups with as many instructions as possible.
Though, it does know enough not to put two trans only instructions in
the same group.
We are able to ignore the instruction group restrictions in the LLVM
backend, because the finalizer in r600_asm.c will fix any illegal
instruction groups the backend generates.
Enabling the VLIW scheduler improved the run time for a sha1 compute
shader by about 50%. I'm not sure what the impact will be for graphics
shaders. I tested Lightsmark with the VLIW scheduler enabled and the
framerate was about the same, but it might help apps that use really
big shaders.
|
|
|
|
|
|
| |
Every place that uses ASM_FLAGS already uses DEFINES. Not including
it in DEFINES is just a way to screw up potential users, as I've done
several times while working on the build system.
|
|
|
|
|
|
|
|
|
|
| |
1) We need to insert a barrier between consecutive transform feedback calls.
2) VBO cache needs to be flushed when TFB output is used as VBO draw input.
Fixes Piglit test EXT_transform_feedback/immediate-reuse.
Thanks to Christoph Bumiller for pointing out bugs in previous versions
of this patch.
|
|
|
|
|
| |
Signed-off-by: Olivier Galibert <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The system array values concept doesn't really because it expects the
system values to be fixed per call, which is wrong for gl_VertexID and
iffy for gl_SampleID. So this patch does two things:
- kill the array, have emit_fetch_system_value directly pick the
values it needs (only gl_InstanceID for now, as the previous code)
- correctly handle the expected type in emit_fetch_system_value
Signed-off-by: Olivier Galibert <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
| |
texture fetches.
Signed-off-by: Olivier Galibert <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
z or stencil texture should not be created with the z/stencil
flags for surface creation as they are intended to be bound
as texture.
v2: remove broken code
Signed-off-by: Jerome Glisse <[email protected]>
|
|
|
|
| |
Signed-off-by: Tom Stellard <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Based on https://bugs.freedesktop.org/show_bug.cgi?id=50317#c4
Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=50316
https://bugs.freedesktop.org/show_bug.cgi?id=50317
Signed-off-by: Tom Stellard <[email protected]>
|
|
|
|
| |
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
| |
Solaris Studio C compiler does not support anonymous structs and
anonymous unions.
Signed-off-by: Vinson Lee <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
| |
|
| |
|
|
|
|
| |
We can use TargetLowering::getRegClassFor() instead.
|
| |
|
|
|
|
|
|
| |
Tested against mesa demos cylwrap and dx9 DCT address.exe which now passes 100%.
Signed-off-by: José Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This fixes a bug where a sampler view was using stale texture/resource
data when the texture was modified through a surface (render to texture).
Bumping the texture and layer ages triggers sampler view revalidation.
Fixes piglit fbo-blit failure.
Reviewed-by: José Fonseca <[email protected]>
|
|
|
|
|
|
| |
Now that it's in Linus's tree.
Has anyone had a chance to test streamout on Cayman recently?
|
|
|
|
| |
This allows using the optimizations more broadly.
|
|
|
|
|
|
|
|
|
| |
This requires the latest streamout kernel patches.
Streamout is disabled by default on r7xx, so this patch is safe for regular
users.
Reviewed-by: Alex Deucher <[email protected]>
|
|
|
|
|
|
| |
SET_CONTEXT_REG was not counted in.
Reviewed-by: Alex Deucher <[email protected]>
|
|
|
|
|
|
| |
It helps on R7xx.
Reviewed-by: Alex Deucher <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This allows to submit things to the compute only
rings on cayman+
v2: rebased on current master and actually make use
of the new flag in evergreen_compute.c
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
It manifests at exit as:
"WARNING: destroying GPU memory cache with some buffers still in use"
|
|
|
|
| |
If we don't, the GPU will just throw an ILLEGAL_OPERATION error.
|