| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
VK_EXT_conditional_rendering allows to discard draw commands
(not only normal draws).
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
VK_EXT_conditional_rendering allows to discard dispatch commands.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Addresses in the command streams should be in canonical form (i.e
bit[63:48] == bit[47]). If the [bo->gtt_offset, bo->gtt_offset +
target_offset] range contains the address 0x800000000000, the current
code will fail that criteria.
v2: Fix missing include (Lionel)
Fixes: 1c9053d0765dc6 ("i965: Prepare batchbuffer module for softpin support.")
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
We don't need to emit PS_PARTIAL_FLUSH for the pre fragment shader
stages (ie. geometry/tessellation). Emitting VS_PARTIAL_FLUSH
is enough for these stages. Note that PS_PARTIAL_FLUSH also
synchronizes all vertex stages.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
The assert is checking that we are not binding more descriptor sets
than the supported by the driver. When binding the descriptor set
number MAX_SETS-1, it was breaking the assert because
descriptorSetCount = 1.
Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
Cc: <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
| |
CC: <[email protected]>
Signed-off-by: Jan Vesely <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Abort all dependent events.
v2: Abort the current event as well.
CC: <[email protected]>
Signed-off-by: Jan Vesely <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
| |
One of these was seen in a Deus Ex shader.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
|
| |
|
|
|
|
|
| |
Signed-off-by: Konstantin Kharlamov <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
Just a nice hint for both peoples and compilers.
Signed-off-by: Konstantin Kharlamov <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
Just a nice hint for both peoples and compilers.
Signed-off-by: Konstantin Kharlamov <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
Just a nice hint for both peoples and compilers.
Signed-off-by: Konstantin Kharlamov <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Ported from radeonsi. Improves windowed glxgears ran as
vblank_mode=0 glxgears -info -geometry 0+0+512+512
from ≈2270 FPS to ≈2360 FPS. Tested with AMD TURKS.
v2: turned out glxgears ignores the option above, the correct way would
be "512x512+0+0". Now it can be seen 512x512 actually loses 30 FPS.
300×300 however wins around a hundred FPS, and to leave some room in
case results may differ for other cards I want not to nitpick in search
of an optimum but to simply leave 300×300 in the code.
v3: remove redundant braces, and try harder for the mail to stick to
the rest of the series.
Signed-off-by: Konstantin Kharlamov <[email protected]>
Reviewed-by: Gert Wollny <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
Annoyingly we still have to briefly drop the lock to unref resources..
but push the lock down into __fd_batch_destroy() so we can invalidate
the batch and reset resources before dropping the lock.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Re-allocate rather than re-use. Originally we had an unnecessarily
complex design to avoid re-allocating cmdstream buffers. But now that
support for "growable" cmdstream buffers has been in place for a couple
years, I guess we can care a bit less about the extra overhead on older
kernels.
But making the batches one-shot removes a class of potential race
conditions vs the flush_queue.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
Instead of the reading batch setting a dependency on the writing batch,
simply flush the writing batch immediately. This avoids situations
where we have to flush the context's current batch later.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
This was basically to avoid a zero-dword IB (indirect-branch), but
instead just don't emit the IB packet in that case.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
pipe_framebuffer_state can have samples=0 in various cases, which is
actually the same thing as samples=1. So use the _get_num_samples()
helper to populate the key, to avoid this looking like two distinct
fb states to the cache.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
It is possible for a batch to be freed under our feet when flushing, so
it is best to hold a reference to all of them up-front.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
Signed-off-by: Karol Herbst <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
OpenCL knows vector of size 8 and 16.
v2: rebased on master (nir_swizzle rework)
rework more declarations with nir_component_mask_t
adjust print_var_decl
Reviewed-by: Jason Ekstrand <[email protected]>
Signed-off-by: Karol Herbst <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
If we know that the given image doesn't have any metadata,
we don't need to flush.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
The Vulkan 1.1.80 spec says:
"viewMask has the same effect for the described subpass as
VkRenderPassMultiviewCreateInfo::pViewMasks has on each
corresponding subpass."
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
| |
This is required for OpenGL 4.4 and OpenGL ES 3.1 support.
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The current code is buggy: if there are only 12 dwords left in cbuf,
we emit a zero data length command which will be rejected by virglrenderer.
Fix it by calling flush in this case.
Cc: [email protected]
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
| |
This allows us to implement glMinSampleShading correctly, which up
until now just got ignored.
Signed-off-by: Erik Faye-Lund <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When handling loops in constant propagation, implement the "FINISHME"
comment like copy propagation: perform a first pass to find values
that can't be propagated, then perform a second pass with the ACP
containing still valid values.
Certain values are killed because the loop may run more than one
iteration, so we can't copy propagate them as they would be invalid in
the later iterations.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When handling 'if' in constant propagation, if a certain variable was
killed when processing the first branch of the 'if', then the second
would get any propagation from previous nodes. This is similar to the
change done for copy propagation code.
x = 1;
if (...) {
z = x; // This would turn into z = 1.
x = 22; // x gets killed.
} else {
w = x; // This would NOT turn into w = 1.
}
With the change, we let constant propagation happen independently in
the two branches and only then apply the killed values for the
subsequent code.
The new code use a single hash table for keeping the kills of both
branches (the branches only write to it), and it gets deleted after we
use -- instead of waiting for mem_ctx to collect it.
NIR deals well with constant propagation, so it already covered for
the missing ones that this patch fixes.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
| |
I keep having to ignore these shader-db changes since I don't trust them,
so just disable the reports entirely.
|
|
|
|
|
|
|
|
| |
total instructions in shared programs: 98578 -> 98119 (-0.47%)
instructions in affected programs: 27571 -> 27112 (-1.66%)
and it also eliminates most spills/fills on the CTS's randomized uniform
usage testcases.
|
|
|
|
| |
The docs had an update noting this restriction, so reflect it in the code.
|
|
|
|
|
| |
This doesn't affect us yet since we're not doing TMUWTs, but I think we
will for GLES 3.1.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
SNB doesn't have a definition of 3DSTATE_CONSTANT_BODY, thats
why we got segmentation fault when used INTEL_DEBUG=bat.
Fixed by adding of 3DSTATE_CONSTANT_BODY into 3DSTATE_CONSTANT
of VS, GS and PS structures.
v2: added definition of 3DSTATE_CONSTANT_BODY to the gen6.xml
Fixes: 169d8e011ae (intel: Fix 3DSTATE_CONSTANT buffer decoding.)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107190
Signed-off-by: Sergii Romantsov <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
| |
|
|
|
|
|
|
|
|
| |
Empty initializer braces aren't valid c (it's a gnu extension, and
it's valid in c++).
Hopefully fixes appveyor / msvc build...
Fixes a3150c1d06ae7766c3d3fe3b33432e55c3c7527e
|
|
|
|
|
|
|
|
|
| |
This makes the arguments match the (thing, container) pattern used in
other nir_foreach macros and also renames it to make that a bit more
clear.
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
| |
Reviewed-by: Rafael Antognolli <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
For one thing, the NIR opcodes for image load/store always take and
return a vec4 value regardless of the image type. We need to fix up
both the source and destination to handle it. For another thing, we
weren't actually setting up a destination in the OpAtomicLoad case.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Cc: [email protected]
|
|
|
|
| |
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
| |
This decreases sizeof(struct amdgpu_cs_buffer) from 24 to 16 bytes.
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
| |
For a later simplification.
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
| |
For a later simplification.
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
| |
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
| |
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
| |
Reviewed-by: Samuel Pitoiset <[email protected]>
|