| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
| |
This enables gallium support for EGL_ANDROID_native_fence_sync, for
drivers which support PIPE_CAP_NATIVE_FENCE_FD.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
ported from Vulkan
Cc: 13.0 <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
not needed
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
| |
Cc: 13.0 <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
| |
Cc: 13.0 <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
| |
All codepaths are handled except for clover.
Cc: 13.0 <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
| |
The next commit will need this.
Cc: 13.0 <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Drivers that support this benefit by saving one lowering pass in the
GLSL-to-TGSI conversion.
radeonsi already supports this because all outputs are stored in temporary
variables before the export (except for TCS outputs, which have always
been readable in TGSI anyway due to their special semantics).
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The compiler doesn't shrink s_load_dwordx8, so we always wasted 4 SGPRs.
Also, the extraction of the descriptor created some really ugly asm code
with lots of VALU bitwise ops and v_readfirstlane.
Totals from *affected* shaders:
SGPRS: 13880 -> 13253 (-4.52 %)
VGPRS: 15200 -> 15088 (-0.74 %)
Code Size: 499864 -> 459816 (-8.01 %) bytes
Max Waves: 1554 -> 1564 (0.64 %)
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
My LLVM commit disables it for dGPUs, but not APUs.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
v2: only do this if debug output of shader dumping is enabled
Reviewed-by: Nicolai Hähnle <[email protected]> (v1)
|
|
|
|
|
|
| |
PS and CS don't have any param exports, so it's a no-op.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
| |
This fixes dual source blending on Stoney. The fix was copied from Vulkan.
The problem was discovered during internal testing.
Cc: 13.0 <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
| |
copied from Vulkan
Cc: 13.0 <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
| |
better safe than sorry
Cc: 13.0 <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
better safe than sorry; set_framebuffer_state always makes this dirty
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
| |
Tested-by: Edmondo Tommasina <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A past commit added the ability to compile "optimized" shader variants
asynchronously (not stalling the app).
This commit builds upon that and adds what is basically a runtime shader
linker. If a VS output isn't used by the currently-bound PS, a new VS
compilation is started without that output. The new shader variant
is used when it's ready.
All apps using separate shader objects I've seen had unused VS outputs.
Eliminating unused/useless VS outputs also eliminates the corresponding
vertex attribute loads.
Tested-by: Edmondo Tommasina <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
| |
It's just tgsi_shader_info with DEFAULT_VAL varyings removed.
Tested-by: Edmondo Tommasina <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
| |
Tested-by: Edmondo Tommasina <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This is the first user of optimized monolithic shader variants.
Cull distances can't be disabled by states.
Tested-by: Edmondo Tommasina <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
| |
Tested-by: Edmondo Tommasina <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
| |
there is no VS epilog in this case
Tested-by: Edmondo Tommasina <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
| |
Tested-by: Edmondo Tommasina <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
| |
Tested-by: Edmondo Tommasina <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
| |
key->part.*: prolog and epilog flags only
key->as_{ls,es}: special flags
key->mono.*: flags for monolithic compilation only
Tested-by: Edmondo Tommasina <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Fixed piglits:
- arb_cull_distance/clip-cull-3
- arb_cull_distance/clip-cull-4
Tested-by: Edmondo Tommasina <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
| |
Tested-by: Edmondo Tommasina <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
| |
Tested-by: Edmondo Tommasina <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Company Of Heroes 2 needs only 24.
This saves 512 bytes of CE RAM per shader stage.
Tested-by: Edmondo Tommasina <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
| |
Tested-by: Edmondo Tommasina <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
| |
We don't want opt4Space here.
Tested-by: Edmondo Tommasina <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When the uploading of shader fails on si_shader_binary_upload(),
it returns -ENOMEM. We should handle si_shader_binary_upload() failure path
on si_create_compute_state().
CID 1394027
v2: Fixes from Edward O'Callaghan's review
a) Update explicitly return value check with "si_shader_binary_upload() < 0"
b) Update commit message.
Signed-off-by: Mun Gwan-gyeong <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
CID 1394028
Signed-off-by: Mun Gwan-gyeong <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For compute shaders, we free the selector after the shader has been
compiled, so we need to save this bit somewhere else. Also, make sure that
this type of bug cannot re-appear, by NULL-ing the selector pointer after
we're done with it.
This bug has been there since the feature was added, but was only exposed
in piglit arb_compute_variable_group_size-local-size by commit
9bfee7047b70cb0aa026ca9536465762f96cb2b1 (which is totally unrelated).
Cc: 13.0 <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since the state tracker now enables MSAA in the hardware for the case
nr_samples == 1 as well, we need to set sample locations correctly for
this case.
The Polaris override is still needed for the non-MSAA case (when
nr_samples == 0).
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
|
|
| |
This fixes GL45-CTS.sample_variables.mask.*.samples_1.*.
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
|
|
|
| |
I'm also sending out a piglit test, gl-2.0/vertexattribpointer-size-3,
which exposes this corner case.
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
|
|
| |
Fixes parts of GL45-CTS.gtf32.GL3Tests.packed_pixels.packed_pixels_pbo.
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Piglit regressions (radeonsi or LLVM bugs, they pass on softpipe):
- glsl-1.10/execution/variable-indexing/vs-output-array-vec3-index-wr
- glsl-1.10/execution/variable-indexing/vs-output-array-vec4-index-wr
- glsl-110/execution/variable-indexing/vs-temp-array-mat2-index-col-row-wr
- glsl-110/execution/variable-indexing/vs-temp-array-mat2-index-row-wr
Totals:
SGPRS: 1132185 -> 1168801 (3.23 %)
VGPRS: 907856 -> 906204 (-0.18 %)
Spilled SGPRs: 2011 -> 2425 (20.59 %)
Spilled VGPRs: 368 -> 96 (-73.91 %)
Scratch VGPRs: 1344 -> 1060 (-21.13 %) dwords per thread
Code Size: 35916164 -> 35705372 (-0.59 %) bytes
LDS: 767 -> 767 (0.00 %) blocks
Max Waves: 194010 -> 194921 (0.47 %)
Wait states: 0 -> 0 (0.00 %)
Before:
VGPR SPILLING APPS Shaders SpillVGPR ScratchVGPR
alien_isolation 2938 38 40
bioshock-infinite 1769 245 732
dirt-showdown 548 85 72
f1-2015 776 0 320
ue4_lightroom_inter.. 74 0 180
After:
VGPR SPILLING APPS Shaders SpillVGPR ScratchVGPR
alien_isolation 2938 38 40
bioshock-infinite 1769 0 480
dirt-showdown 548 58 40
f1-2015 776 0 320
ue4_lightroom_inter.. 74 0 180
Bioshock and DiRT benefit.
If I set IF_THRESHOLD=4, tesseract starts spilling VGPRs
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
the instruction knows the target
Reviewed-by: Nicolai Hähnle <[email protected]>
|