| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
| |
We already call nir_rematerialize_derefs_in_use_blocks_impl prior to
calling nir_lower_ssa_defs_to_regs_block so the assertion that all deref
uses in the block should hold. This fixes the following CTS test when
SPIR-V optimization recipe 1:
dEQP-VK.glsl.struct.local.loop_nested_struct_array_vertex
Fixes: 606eb56ab9449b "intel/nir: Only lower load/store derefs"
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the block in which the jump is inserted is the predecessor of a phi
then we need to remove phi sources otherwise the phi may end up with
things improperly connected. This fixes the following CTS test when
dEQP is run with SPIR-V optimization recipe 1:
dEQP-VK.glsl.functions.control_flow.return_in_nested_loop_vertex
Cc: [email protected]
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Let's just be explicit that VK_NV_shading_rate_image is not supported.
Suggested-by: Jason Ekstrand <[email protected]>
Fixes: 6ee17091708a41c4aa81a "vulkan: Update the XML and headers to 1.1.86"
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This fixes a bunch of Vulkan subgroup tests on little core platforms.
Fixes: 4150920b95 "intel/fs: Add a helper for emitting scan operations"
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
Tested-by: Mark Janes <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Timestamp can be zero for example when Flatpak is used. In this
case just disable the cache rather then segfaulting when
incompatible cache items are loaded.
V2: actually return false when mtime is 0.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Eric Engestrom <[email protected]>
Acked-by: Tapani Pälli <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This ports the fix from 3d41757788ac. Both LLVM 7 & 8 continue
to have this problem.
It fixes rendering issues in some menu and loading screens of
Civ VI which can be seen in the trace from bug 104602.
Note: This does not fix the black triangles on Vega for bug
104602.
Reviewed-by: Marek Olšák <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104602
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107276
|
|
|
|
| |
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
| |
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
Unnecessary. While we are at it, remove the check for pre-VI
because it's already checked earlier.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
If apps use the MUTABLE bit and the same formats as the image one
in the list, we can still enable TC-compat HTILE. I don't think
this happens often but given the fact that TC-compat HTILE allows
a nice boost in some situations, it's worth checking.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Like we disable DCC/CMASK for small color surfaces as well.
Serious Sam 2017 creates a 1x1 depth surface and I think
it should be faster to do slow clears on the graphics queue
instead of fast clears on compute, and eventually a depth
expand if the surface isn't TC-compatible HTILE.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
Other drivers set these two as well, just apply the same rule.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
I don't think this is required by Vulkan too.
Ported from RadeonSI (AMDVLK doesn't set it either).
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
As discussed in the review of the patch which added the comment:
Nothing happens when a thread is created, because pthread_atfork doesn't
affect creating threads. However, spawning a child process will likely
crash.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We already track if the DMA engine is busy/idle with a flag,
and we emit a packet that waits for all CP DMA operations
to be complete. This is done at end of command buffer because
the kernel doesn't wait for them, and also when emitting
barriers, so it should be safe.
This improves small copies for both aligned and unaligned sizes.
Aligned sizes:
BEFORE:
1 KB: 59.840000 ms
2 KB: 71.200000 ms
AFTER:
1 KB: 31.200000 ms
2 KB: 31.040000 ms
Unaligned sizes:
BEFORE:
2 KB: 68.3200 ms
3 KB: 79.3600 ms
5 KB: 76.6400 ms
9 KB: 90.8800 ms
17 KB: 116.0000 ms
AFTER:
2 KB: 31.0400 ms
3 KB: 32.0000 ms
5 KB: 30.8800 ms
9 KB: 30.5600 ms
17 KB: 29.6000 ms
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
According to my benchmark results, it appears that we should
reduce the threshold to 1024.
BEFORE:
1 KB: 68.656000 ms
2 KB: 118.368000 ms
AFTER:
1 KB: 31.760000 ms
2 KB: 29.840000 ms
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
It's unnecessary because we can just check if the timestamp
is to different to the default value when a pool is created
or resetted. Instead of waiting for the availability bit to
be 1, we have to emit a not equal WAIT_REG_MEM for checking
if the timestamp is ready.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
| |
Pulling this logic out means we can share the logic and avoid a couple
of temporary variables that helped make things clearer before. Note
that in either vismode case, we always program vismode 0.
Signed-off-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
|
| |
Now that we've copied the emit logic into each branch of the
if (info->index_size) statement, we can simplify the logic a bit
according to which case we're in.
Signed-off-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
| |
Signed-off-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
| |
This splits the two code paths into separate functions and moves the
"if (info->indirect)" test into draw_impl().
Signed-off-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
| |
Simplify the code a bit by inlining this helper.
Signed-off-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
| |
This way the markers clearly bracket the draw call and isn't
duplicated for both direct and indirect draw code.
Signed-off-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
| |
Only used in fd6_draw.c so put them there.
Signed-off-by: Kristian H. Kristensen <[email protected]>
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
According to the following definition,
int AtomicCompSwap(inout int mem, uint compare, uint data);
the preceding one in atomic_comp_swap of NIR is compare and data is
followed, while src0 for cmpxchg needs vec2(data, compare)
So for ssbo/image deref comp_swap, that should be reversed.
Fixes: dEQP-GLES31.functional.image_load_store.*.atomic.comp_swap*
|
|
|
|
|
|
|
|
| |
Possibly these bits mean something else now. Blob always seems to use
FOUR_QUADS, and changing to TWO_QUADS seems to cause different threads
to overlap registers.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
Do a better job of skipping mem2gmem/gmem2mem..
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Fix a few bits of confusion, as with previous gen's constlen is aligned
to 4, and value in bitfield is left-shifted by 2 (ie. divided by 4).
But this is done by the CONSTLEN() accessor/builder fxn, so don't do it
twice. Also HLSQ_FS_CNTL.CONSTLEN is not special.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
Fixes a crash in (of all things) dEQP-GLES2.info.vendor with
--deqp-surface-type=fbo..
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
batch_flush_reset_dependencies() expects to be called unlocked, and can
call fd_batch_reference() which can try to aquire the screen lock again.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
In c3d9f29b we allowed ctx->batch to be null, and started tracking the
current framebuffer state in fd_context. But the existing logic in
fd_blitter_pipe_begin() would, if !ctx->batch, set null fb state to be
restored after blit. Which broke the world of deqp (and probably other
things)
Fixes: c3d9f29b781 freedreno: allocate ctx's batch on demand
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is defined to always clear the entire surface(s) specified,
regardless of scissor state.. mesa/st will turn scissored clears
into a draw. So rip about a bunch of unnecessary machinery.
Also remove a comment that was obsolete since using u_blitter to
turn clear into draw (for the cases where there isn't a hw blitter
fast-path).
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
The logic to force a flush every draw was short-circuited with newer
kernels. Also it should apply to clears as well.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
The effective scissor changes based on rasterizer->scissor flag, so we
need to re-emit scissor state when rasterizer state changes.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In st_renderbuffer_alloc_storage, we avoid allocating storage for
zero-sized buffers, leading to this pointer being NULL. We already
take care to avoid dereferencing these pointers for color-buffers,
but not for depth/stencil-buffers.
So let's thread a bit more carefully here.
This avoids a crash while running Piglit's glx/glx-visuals-stencil
test, both on virgl and r600g.
Signed-off-by: Erik Faye-Lund <[email protected]>
Reviewed-by: Guillaume Charifi <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Since the Randr lease code was added, compiling against libxcb 1.12 no
longer works.
CC: [email protected]
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108024
Fixes: 7ab1fffcd2a504024b16e408de329f7a94553ecc
Tested-By: Maxime <[email protected]>
Fixes: 7ab1fffcd2a504024b16 "vulkan: Add EXT_acquire_xlib_display [v5]"
|
|
|
|
| |
Trivial.
|
|
|
|
|
|
|
|
| |
Use sample rate shading instead, should give better locality.
Makes Nier with 8x msaa on a Raven go 5 fps -> 7 fps in the menu.
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Nanley Chery <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Nanley Chery <[email protected]>
|
| |
|