| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
| |
Don't reject YUV formats that the driver doesn't handle natively, since
mesa/st already knows how to lower this in shader.
Reported-by: Nicolas Dechesne <[email protected]>
Fixes: 83e9de2 ("st/mesa: EGLImageTarget* error handling")
Cc: 17.1 <[email protected]
Signed-off-by: Rob Clark <[email protected]>
Tested-by: Nicolas Dechesne <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
| |
State trackers can set this to tell the driver when u_threaded_context is
desirable.
Reviewed-by: Nicolai Hähnle <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The function was pretty slow. This brings a substantial decrease in draw
call overhead when min/max index bounds are not needed:
Before: DrawElements (1 VBO) w/ no state change: 5.75 million
After: DrawElements (1 VBO) w/ no state change: 7.03 million
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This is the best place to do it. Now drivers without u_vbuf don't have to
do it.
v2: use correct upload size and optimal alignment
Tested-by: Edmondo Tommasina <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
Tested-by: Edmondo Tommasina <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
Tested-by: Edmondo Tommasina <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
Tested-by: Edmondo Tommasina <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
Tested-by: Edmondo Tommasina <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
pipe_draw_info::indexed is replaced with index_size. index_size == 0 means
non-indexed.
Instead of pipe_index_buffer::offset, pipe_draw_info::start is used.
For indexed indirect draws, pipe_draw_info::start is added to the indirect
start. This is the only case when "start" affects indirect draws.
pipe_draw_info::index is a union. Use either index::resource or
index::user depending on the value of pipe_draw_info::has_user_indices.
v2: fixes for nine, svga
|
|
|
|
| |
For faster initialization of non-indirect draws.
|
| |
|
|
|
|
|
|
|
|
| |
Similar to how image resources are handled. That way we are sure
that inst->resource.file is PROGRAM_SAMPLER for "bound" samplers.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Elie Tournier <[email protected]>
|
|
|
|
|
|
| |
Tested-by: Edmondo Tommasina <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
Tested-by: Edmondo Tommasina <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
| |
v2: use the st_common_program() helper
Tested-by: Edmondo Tommasina <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
| |
Tested-by: Edmondo Tommasina <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
| |
Tested-by: Edmondo Tommasina <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
Tested-by: Edmondo Tommasina <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
Tested-by: Edmondo Tommasina <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
Tested-by: Edmondo Tommasina <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
| |
Tested-by: Edmondo Tommasina <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
| |
It contains only one member: the update function. Let's use the update
function directly.
Tested-by: Edmondo Tommasina <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
| |
When the arrays are initialized later on with -1, that's useless
to use rzalloc_array().
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
| |
After calling this we were then overriding all the functions with
st versions.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
For the explicit pack/unpack conversions.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The TGSI DCE pass doesn't eliminate dead assignments like
MOV TEMP[0], TEMP[1] in presence of loops because it assumes
that the visitor doesn't emit dead code. This assumption is
actually wrong and this situation happens.
However, it appears that the merge_registers() pass accidentally
takes care of this for some weird reasons. But since this pass has
been disabled for RadeonSI and Nouveau, the renumber_registers()
pass which is called *after*, can't do its job correctly.
This is because it assumes that no dead code is present. But if
there is still a dead assignment, it might re-use the TEMP
register id incorrectly and emits wrong code.
This patches fixes the issue by recording writes instead of reads,
and this has the advantage to be faster.
This should fix Unigine Heaven on RadeonSI and Nouveau.
shader-db results with RadeonSI:
47109 shaders in 29632 tests
Totals:
SGPRS: 1923308 -> 1923316 (0.00 %)
VGPRS: 1133843 -> 1133847 (0.00 %)
Spilled SGPRs: 2516 -> 2518 (0.08 %)
Spilled VGPRs: 65 -> 65 (0.00 %)
Private memory VGPRs: 1184 -> 1184 (0.00 %)
Scratch size: 1308 -> 1308 (0.00 %) dwords per thread
Code Size: 60095968 -> 60096256 (0.00 %) bytes
LDS: 1077 -> 1077 (0.00 %) blocks
Max Waves: 431889 -> 431889 (0.00 %)
Wait states: 0 -> 0 (0.00 %)
It's still interesting to disable the merge_registers() pass.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
| |
It doesn't make sense to prefix them with 'image' because
they are called "Memory Qualifiers" and they can be applied
to members of storage buffer blocks.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Andres Gomez <[email protected]>
|
|
|
|
|
|
|
| |
This is already set for the instruction at initialisation.
Reviewed-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
also remove the incorrect comment about primitive restart.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The VMware driver has a limited set of integer texture formats. We
often have to fall back to 4-component formats when 1- or 2-component
formats are missing.
This fixes about 8 integer texture Piglit tests with the VMware driver
on Linux. We've had this code in-house for a long time but I guess it
was never up-streamed to Mesa master.
This shouldn't regress any other drivers since we're either choosing
an earlier format in the list, or failing anyway.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Trivial.
|
|
|
|
|
|
|
|
|
|
| |
stfb->iface is always non-NULL for an st_framebuffer. These checks
were incorrect, relying on out-of-bounds memory access in the
surface-less case of EGL_KHR_surfaceless_context.
v2: remove redundant stread check (Marek)
Reviewed-by: Marek Olšák <marek@[email protected]> (v2)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The incomplete framebuffer is set for a surfaceless context. This leads to
the following error in piglit spec@egl_khr_surfaceless_context@viewport:
==26703==ERROR: AddressSanitizer: global-buffer-overflow on address 0x7f6886e43240 at pc 0x7f68854db0fd bp 0x7ffca404b3b0 sp 0x7ffca404b3a0
READ of size 8 at 0x7f6886e43240 thread T0
#0 0x7f68854db0fc in st_viewport ../../../mesa-src/src/mesa/state_tracker/st_cb_viewport.c:57
#1 0x556840176cdb in main tests/egl/spec/egl_khr_surfaceless_context/viewport.c:101
#2 0x7f688edcf3f0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x203f0)
#3 0x556840176e19 in _start (/home/nha/amd/piglit/bin/egl-surfaceless-context-viewport+0xe19)
0x7f6886e43240 is located 32 bytes to the left of global variable 'DummyRenderbuffer' defined in '../../../mesa-src/src/mesa/main/fbobject.c:69:31' (0x7f6886e43260) of size 112
0x7f6886e43240 is located 8 bytes to the right of global variable 'IncompleteFramebuffer' defined in '../../../mesa-src/src/mesa/main/fbobject.c:73:30' (0x7f6886e42de0) of size 1112
SUMMARY: AddressSanitizer: global-buffer-overflow ../../../mesa-src/src/mesa/state_tracker/st_cb_viewport.c:57 in st_viewport
Cc: [email protected]
Reviewed-by: Marek Olšák <marek@[email protected]>
|
| |
|
|
|
|
|
| |
It turns out that explicitly setting the writemask isn't actually
needed; emit_asm does the right thing based on looking at the types.
|
|
|
|
| |
They are now no longer used.
|
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
These operations are currently implemented as IR expressions. However,
they cannot be transformed and moved in the way that other IR
expressions can because they have non-trivial interactions with
control-flow.
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
Remove unneeded parens. Add const qualifiers. Move var decls closer
to where they're used.
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Neha Bhende<[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The main goal of this pass to merge temporary registers in order
to reduce the total number of registers and also to produce
optimal TGSI code.
In fact, compilers seem to be confused when temporary variables
are already merged, maybe because it's done too early in the
process.
Skipping the pass, reduce both the register pressure and the code
size, at least for Nouveau and RadeonSI because they have a real
backend compiler.
Found by luck while fixing an issue in the TGSI dead code elimination
pass which affects tex instructions with bindless samplers.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
This avoids repeated translations of the enum.
Reviewed-by: Ilia Mirkin <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|