| Commit message (Collapse) | Author | Age | Files | Lines |
| |
We were ignoring the incoming box parameters, and were providing totally
bogus stride/layer stride, and other bits, for when a non-full-surface
map was requested.
Signed-off-by: Ilia Mirkin <[email protected]>
Tested-by: Samuel Pitoiset <[email protected]>
Cc: <[email protected]>
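As a rough illustration of why the box matters for a partial map, the sketch below computes where a sub-box starts inside a linearly laid out surface. The struct and function names are hypothetical and the linear layout is an assumption; this is not the driver's actual code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of a linearly laid out surface. */
struct surf_layout {
   unsigned bytes_per_pixel;
   size_t stride;        /* bytes between consecutive rows */
   size_t layer_stride;  /* bytes between consecutive layers */
};

/* Hypothetical box: origin plus extent of the region being mapped. */
struct map_box {
   unsigned x, y, z;
   unsigned width, height, depth;
};

/* Byte offset of the first texel of the box within the surface. */
static size_t
box_offset(const struct surf_layout *s, const struct map_box *b)
{
   assert(s != NULL && b != NULL);
   return (size_t)b->z * s->layer_stride +
          (size_t)b->y * s->stride +
          (size_t)b->x * s->bytes_per_pixel;
}
```

In this simplified model the caller would still walk rows with `stride` and layers with `layer_stride` of the underlying surface, starting at the returned offset.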
| |
It will be removed from the firmware for Polaris.
Cc: 12.0 <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
| |
The non-MULTI variants will be removed in Polaris firmware.
Cc: 12.0 <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
| |
This should help flush out GPU VM faults.
Reviewed-by: Marek Olšák <[email protected]>
| |
This should help flush out GPU VM faults.
Reviewed-by: Marek Olšák <[email protected]>
| |
Reviewed-by: Marek Olšák <[email protected]>
| |
Reviewed-by: Marek Olšák <[email protected]>
| |
Reviewed-by: Marek Olšák <[email protected]>
| |
Reviewed-by: Marek Olšák <[email protected]>
| |
- make sure FP32 denormals will stay disabled in LLVM in the future
(the current default is disabled)
- tell LLVM that FP64 denormals are enabled
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
| |
We don't need the clamped version and we don't have to use any intrinsic.
Stats on Tonga:
15382 shaders in 9128 tests
Totals:
SGPRS: 1230560 -> 1230560 (0.00 %)
VGPRS: 469577 -> 462504 (-1.51 %)
Code Size: 22089908 -> 21730052 (-1.63 %) bytes
LDS: 598 -> 598 (0.00 %) blocks
Scratch: 283648 -> 281600 (-0.72 %) bytes per wave
Max Waves: 125664 -> 126969 (1.04 %)
Wait states: 0 -> 0 (0.00 %)
Totals from affected shaders:
SGPRS: 547280 -> 547280 (0.00 %)
VGPRS: 269132 -> 262059 (-2.63 %)
Code Size: 15709604 -> 15349748 (-2.29 %) bytes
LDS: 198 -> 198 (0.00 %) blocks
Scratch: 74752 -> 72704 (-2.74 %) bytes per wave
Max Waves: 47840 -> 49145 (2.73 %)
Wait states: 0 -> 0 (0.00 %)
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
| |
v2: Merge with PIPE_SHADER_CAP_DOUBLES
Add CHIP_HEMLOCK
v3: only set the instruction on EG and CM
Signed-off-by: Jan Vesely <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
| |
so that independent types of jobs can use the same queue.
Reviewed-by: Nicolai Hähnle <[email protected]>
| |
by converting semaphores to condvars and using the main mutex
Reviewed-by: Nicolai Hähnle <[email protected]>
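For readers unfamiliar with the pattern, here is a minimal sketch of a bounded job queue that uses two condition variables guarded by one mutex instead of semaphores. All names are hypothetical, and the ring-buffer storage is an assumption made to keep the example self-contained; it is not the actual queue implementation.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

struct job_queue {
   pthread_mutex_t lock;       /* the single mutex guarding all fields */
   pthread_cond_t has_space;   /* signalled when a slot frees up */
   pthread_cond_t has_jobs;    /* signalled when a job is queued */
   void **jobs;                /* ring buffer storage */
   unsigned max_jobs, num_queued, write_idx, read_idx;
};

/* Returns 1 on success, 0 if the storage could not be allocated. */
static int
job_queue_init(struct job_queue *q, unsigned max_jobs)
{
   assert(max_jobs > 0);
   q->jobs = calloc(max_jobs, sizeof(*q->jobs));
   if (!q->jobs)
      return 0;
   q->max_jobs = max_jobs;
   q->num_queued = q->write_idx = q->read_idx = 0;
   pthread_mutex_init(&q->lock, NULL);
   pthread_cond_init(&q->has_space, NULL);
   pthread_cond_init(&q->has_jobs, NULL);
   return 1;
}

/* Blocks while the queue is full, then appends the job. */
static void
job_queue_push(struct job_queue *q, void *job)
{
   pthread_mutex_lock(&q->lock);
   while (q->num_queued == q->max_jobs)
      pthread_cond_wait(&q->has_space, &q->lock);
   q->jobs[q->write_idx] = job;
   q->write_idx = (q->write_idx + 1) % q->max_jobs;
   q->num_queued++;
   pthread_cond_signal(&q->has_jobs);
   pthread_mutex_unlock(&q->lock);
}

/* Blocks while the queue is empty, then removes the oldest job. */
static void *
job_queue_pop(struct job_queue *q)
{
   void *job;

   pthread_mutex_lock(&q->lock);
   while (q->num_queued == 0)
      pthread_cond_wait(&q->has_jobs, &q->lock);
   job = q->jobs[q->read_idx];
   q->read_idx = (q->read_idx + 1) % q->max_jobs;
   q->num_queued--;
   pthread_cond_signal(&q->has_space);
   pthread_mutex_unlock(&q->lock);
   return job;
}
```

The `while` loops around `pthread_cond_wait` re-check the condition after every wakeup, which is what makes the condvar version tolerant of spurious wakeups.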
| |
for debugging
v2: correct the snprintf use
Reviewed-by: Nicolai Hähnle <[email protected]>
| |
independent jobs don't have to be stuck on only one thread
v2: use CALLOC & FREE
Reviewed-by: Nicolai Hähnle <[email protected]>
| |
Checking "signalled" is first done without a mutex, then with a mutex.
Also, checking without waiting doesn't lock the mutex. This is racy, but
should be safe.
Reviewed-by: Nicolai Hähnle <[email protected]>
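The check-then-lock pattern described above might look roughly like the sketch below. The names are hypothetical, and the sketch uses a C11 atomic for the unlocked read, which the commit message does not mention; treat it as one possible way to express the idea rather than the actual code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical fence: a flag plus the mutex/condvar used to wait on it. */
struct fence {
   pthread_mutex_t mutex;
   pthread_cond_t cond;
   atomic_bool signalled;
};

static void
fence_init(struct fence *f)
{
   assert(f != NULL);
   pthread_mutex_init(&f->mutex, NULL);
   pthread_cond_init(&f->cond, NULL);
   atomic_init(&f->signalled, false);
}

/* Producer side: set the flag under the mutex and wake all waiters. */
static void
fence_signal(struct fence *f)
{
   pthread_mutex_lock(&f->mutex);
   atomic_store(&f->signalled, true);
   pthread_cond_broadcast(&f->cond);
   pthread_mutex_unlock(&f->mutex);
}

/* Query without waiting: never takes the mutex. */
static bool
fence_is_signalled(struct fence *f)
{
   return atomic_load(&f->signalled);
}

/* Wait: first check without the mutex, then re-check with it held. */
static void
fence_wait(struct fence *f)
{
   if (atomic_load(&f->signalled))
      return;                       /* fast path, no locking */

   pthread_mutex_lock(&f->mutex);
   while (!atomic_load(&f->signalled))
      pthread_cond_wait(&f->cond, &f->mutex);
   pthread_mutex_unlock(&f->mutex);
}
```

The flag only ever goes from false to true in this sketch, which is why a stale unlocked read can at worst send a waiter down the slow path.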
| |
and allow specifying its size in util_queue_init.
v2: use CALLOC & FREE
Reviewed-by: Nicolai Hähnle <[email protected]>
| |
Clean up misrepetitions ('if if', 'the the' etc) found throughout the
comments. This has been done manually, after grepping
case-insensitively for duplicate if, is, the, then, do, for, an,
plus a few other typos corrected in passing.
v2:
* proper commit message and non-joke title;
* replace two 'as is' followed by 'is' to 'as-is'.
v3:
* 'a integer' => 'an integer' and similar (originally spotted by
Jason Ekstrand, I fixed a few other similar ones while at it)
Signed-off-by: Giuseppe Bilotta <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
| |
Reviewed-by: Charmaine Lee <[email protected]>
| |
Reviewed-by: Charmaine Lee <[email protected]>
| |
Put the HBS code into a separate function.
Reviewed-by: Charmaine Lee <[email protected]>
| |
Signed-off-by: Brian Paul <[email protected]>
Reviewed-by: Charmaine Lee <[email protected]>
| |
Never be dependent on "draw 0", instead have a bool that makes the draw
dependent on the previous draw or not dependent at all.
Reviewed-by: Bruce Cherniak <[email protected]>
| |
Move drawIDs from 64-bit to 32-bit to increase perf.
Reviewed-by: Bruce Cherniak <[email protected]>
| |
Add early-out if no components are enabled. Add asserts.
Reviewed-by: Bruce Cherniak <[email protected]>
| |
So we can skip the index gather in PA.
Reviewed-by: Bruce Cherniak <[email protected]>
| |
Only adds the attribute mapping to the jitter; no implementation yet.
Reviewed-by: Bruce Cherniak <[email protected]>
| |
Reviewed-by: Bruce Cherniak <[email protected]>
| |
Function static destructors were getting called by exit
handlers before context teardown.
Reviewed-by: Bruce Cherniak <[email protected]>
| |
Handle SGV stores separately from the stream fetch code.
Because of this change, an extra unused store may be jitted.
Reviewed-by: Bruce Cherniak <[email protected]>
| |
Reviewed-by: Bruce Cherniak <[email protected]>
| |
Reviewed-by: Bruce Cherniak <[email protected]>
| |
Was trying to store an extra uninitialized component.
Only affects component packing, which isn't enabled (yet).
Reviewed-by: Bruce Cherniak <[email protected]>
| |
Currently, most code paths between AVX2 and AVX512 are identical
(see changes to knobs.h).
Reviewed-by: Bruce Cherniak <[email protected]>
| |
Reviewed-by: Bruce Cherniak <[email protected]>
| |
Reviewed-by: Bruce Cherniak <[email protected]>
| |
LLVM redefines DEBUG; adding push/pop prevents an undefined reference
to debug_refcnt_state in llvm-3.7+.
v2: add undef DEBUG
Cc: "12.0" <[email protected]>
Reviewed-by: Bruce Cherniak <[email protected]>
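A minimal sketch of the push/undef/pop idea, using the `push_macro` and `pop_macro` pragmas supported by GCC, Clang and MSVC. The `#define DEBUG(x)` line is only a stand-in for a third-party header that defines its own `DEBUG`; the helper function is hypothetical.

```c
#include <assert.h>

/* Stands in for the project's own DEBUG definition. */
#define DEBUG 1

/* Save the project's DEBUG, then clear it before the foreign header. */
#pragma push_macro("DEBUG")
#undef DEBUG

/* A third-party header would be included here. This line mimics a
 * header that defines DEBUG with a different meaning. */
#define DEBUG(x) do { x; } while (0)

/* Restore the project's DEBUG, discarding the foreign definition. */
#pragma pop_macro("DEBUG")

/* Hypothetical helper showing which definition is visible afterwards. */
static int
project_debug_value(void)
{
   return DEBUG;
}
```

After the pop, code that follows sees the original object-like `DEBUG` again, so conditionals such as `#ifdef DEBUG` keep their project-level meaning.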
| |
To be consistent with the pipe_context function name.
Reviewed-by: Charmaine Lee <[email protected]>
| |
There's no reason for doing so.
Reviewed-by: Charmaine Lee <[email protected]>
| |
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Charmaine Lee <[email protected]>
| |
With commit fb9fe35, we started using transfer_inline_write
for the memcpy TexSubImage path, but that triggers a regression with
texture arrays in the svga driver.
With this patch, the direct map code updates the texture array
correctly.
Fixes VMware bug 1679293.
Tested with MTT piglit, glretrace, conform.
Reviewed-by: Brian Paul <[email protected]>
| |
Currently, with the SetVertexBuffers optimization, we avoid emitting
redundant DXSetVertexBuffers commands. However, these buffer surfaces
still need to be referenced; otherwise, on Linux, the subsequent
surface discard map will map to the existing mob instead of a new one,
causing rendering artifacts.
With this patch, we call resource_rebind() to reference the resources
even if we are avoiding the actual set command. This fixes the
rendering artifacts in the window title area when running Unity on
Ubuntu 14.04.
Tested with piglit, glretrace.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Sinclair Yeh <[email protected]>
| |
This patch fixes three issues with vertex buffer references:
(1) Instead of copying the vertex buffer resource handles to the hw state
in the context structure, use pipe_resource_reference to properly
reference the vertex buffer resources in the context.
(2) Make sure to unbind unused vertex buffer resources.
(3) Force a rebind of the vertex buffer resources at the first draw of each
command buffer to make sure the vertex buffer resources are paged in.
Reviewed-by: Brian Paul <[email protected]>
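To illustrate the difference between copying a handle and taking a reference, the sketch below shows a simplified, single-threaded reference-counted pointer assignment. The `res` type and the `res_*` functions are hypothetical stand-ins for this illustration; the real pipe_resource_reference helper is not reproduced here.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical resource with a plain (non-atomic) reference count. */
struct res {
   int refcount;
};

/* Create a resource owned by the caller (refcount starts at 1). */
static struct res *
res_create(void)
{
   struct res *r = calloc(1, sizeof(*r));
   if (r)
      r->refcount = 1;
   return r;
}

/* Make *dst point at src, adjusting both reference counts.
 * Passing src == NULL simply drops the reference held in *dst. */
static void
res_reference(struct res **dst, struct res *src)
{
   struct res *old;

   assert(dst != NULL);
   old = *dst;
   if (old == src)
      return;
   if (src)
      src->refcount++;
   if (old && --old->refcount == 0)
      free(old);
   *dst = src;
}
```

With a raw pointer copy, the slot in the context could outlive the resource; going through a helper of this shape keeps the resource alive for as long as the slot points at it, and unbinding becomes a call with `NULL`.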
| |
Instead of copying the index buffer resource handle to the hw state in
the context structure, use pipe_resource_reference to properly reference
the index buffer resource in the context.
Reviewed-by: Brian Paul <[email protected]>
| |
The start instance is applied as an offset into the buffer directly,
ignoring the divisor, not as an instance id offset that respects the
divisor.
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: "11.2 12.0" <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
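The arithmetic behind this distinction can be sketched as follows. The function names are hypothetical; the first function follows the behavior described in the message (start instance added after the divisor is applied), and the second shows the variant that folds the start instance into the instance id.

```c
#include <assert.h>

/* Element fetched for an instanced attribute when the start instance is
 * a direct offset into the buffer: the divisor only scales the instance
 * id, and the start instance is added afterwards. */
static unsigned
instanced_element(unsigned instance_id, unsigned divisor,
                  unsigned start_instance)
{
   assert(divisor != 0);
   return instance_id / divisor + start_instance;
}

/* The contrasting formula: the start instance is treated as part of
 * the instance id, so it gets divided by the divisor as well. */
static unsigned
instanced_element_id_offset(unsigned instance_id, unsigned divisor,
                            unsigned start_instance)
{
   assert(divisor != 0);
   return (instance_id + start_instance) / divisor;
}
```

For instance id 5, divisor 2 and start instance 3, the first formula selects element 5 while the second selects element 4, so the two only agree when the divisor is 1 or the start instance is 0.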
| |
The generic version gets this right already, but this was using an
incorrect formula in SSE.
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: "11.2 12.0" <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
| |
This reduces the time spent in glGenerateMipmap by half.
v2: don't decompress the levels to be overwritten
Reviewed-by: Nicolai Hähnle <[email protected]>
| |
for pipe_context::generate_mipmap
First move some of the blit code from util_blitter_blit_generic
to a separate function, then use it from util_blitter_generate_mipmap.
Reviewed-by: Nicolai Hähnle <[email protected]>