| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
| |
Right now, we name these fields as "field name minus one" so that your C
code obviously states what the value should be. However, it's easy enough
to handle at the codegen level with another little XML attribute, meaning
less C code and easier-to-read values in CLIF dumping and gdb as well.
(The actual CLIF format for simulator and FPGA replay takes in
pre-minus-one values, so we need it there too).
|
|
|
|
|
| |
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
| |
There used to be one and it looks like it was removed by eb63640c1d.
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
| |
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106986
Signed-off-by: Danylo Piliaiev <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The Vulkan spec says:
"pipelineBindPoint is a VkPipelineBindPoint indicating whether
the descriptors will be used by graphics pipelines or compute
pipelines. There is a separate set of bind points for each of
graphics and compute, so binding one does not disturb the other."
CC: <[email protected]>
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
It's always false.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
| |
Reviewed-By: Gert Wollny <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Now that SSA values can be derefs and they have special rules, we have
to be a bit more careful about our LCSSA phis. In particular, we need
to clean up in case LCSSA ended up creating a phi node for a deref.
This fixes validation issues with some Vulkan CTS tests with the new
deref instructions.
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Ported from RadeonSI.
This appears to fix some random fails with:
dEQP-VK.query_pool.statistics_query.*
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
While XFB has been enabled for cache, we did not serialize enough
data for the whole API to work (such as glGetProgramiv).
Fixes: 6d830940f7 "Allow shader cache usage with transform feedback"
Signed-off-by: Tapani Pälli <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106907
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The driver already supports exporting the stencil value.
The following CTS test now pass:
dEQP-VK.pipeline.shader_stencil_export.op_replace
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
From the Vulkan spec:
"If this is a primary command buffer, then this value is ignored."
CC: <[email protected]>
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We can not use the VUE Dereference flags combination for EOT
message under ILK and SNB because the threads are not initialized
there with initial VUE handle unlike Pre-IL.
So to avoid GPU hangs on SNB and ILK we need
to avoid usage of the VUE Dereference flags combination.
(Was tested only on SNB but according to the specification
SNB Volume 2 Part 1: 1.6.5.3, 1.6.5.6
the ILK must behave itself in the similar way)
v2: Approach to fix this issue was changed.
Instead of different EOT flags in the program end
we will create VUE every time even if GS produces no output.
v3: Clean up the patch.
Signed-off-by: Andrii Simiklit <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105399
CC: <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
Tested-by: Mark Janes <[email protected]>
|
|
|
|
|
|
|
|
| |
The radeon winsys isn't linked against the ac code, I have vague
memories of this causing some problems before, for now fix the build
but just duplicating the code.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
| |
cmask_buffer and surface.cmask_size can replace its role.
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
| |
cmask_size is changed to uint32_t because it can't be greater than 4GB.
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
| |
The real offset is passed through resource_from_memobj.
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
| |
Some fields shouldn't be initialized, like framebuffers_bound and other stats.
It's hopefully complete now.
Cc: 18.1 <[email protected]>
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch both adds support for probing & filtering DRM nodes
and switches away from using the GRALLOC_MODULE_PERFORM_GET_DRM_FD
gralloc call.
Currently the filtering is based just on the driver name,
and the desired name is supplied using the "drm.gpu.vendor_name"
Android property.
Signed-off-by: Robert Foss <[email protected]>
Reviewed-by: Tomasz Figa <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Maintaining both flink names and prime fd support which are provided by
2 different gralloc implementations is problematic because we have a
dependency on a specific gralloc implementation header.
This mostly disables the dependency on the gralloc implementation and
headers. The dependency on GRALLOC_MODULE_PERFORM_GET_DRM_FD remains for
now, but the definition is added locally to remove the header
dependency.
drm_gralloc support can be enabled by setting
BOARD_USES_DRM_GRALLOC=true in BoardConfig.mk.
Signed-off-by: Rob Herring <[email protected]>
Signed-off-by: Robert Foss <[email protected]>
Reviewed-by: Tomasz Figa <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Robert Foss <[email protected]>
Reviewed-by: Tomasz Figa <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the driver ends up by performing a slow depthstencil clear,
the HTILE metadata won't be initialized correctly.
This fixes random VM faults on Polaris while running CTS
with Bas's runner. This doesn't seem to regress performance.
CC: <[email protected]>
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For instruction sequences that change the address register with every load
the current limit to bail out of the scheduler and reject the optimisation
was too tight, i.e. it was expected that at least one pending instruction
would be scheduled each time.
Give the scheduler more margin to sort out these load sequences by allowing
a number of rounds where no instruction is scheduled.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106163
Signed-off-by: Gert Wollny <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch is based on
https://lists.freedesktop.org/archives/mesa-dev/2018-February/185805.html
Dave Airlie:
"A bunch of CTS tests led me to write
tests/shaders/ssa/fs-while-loop-rotate-value.shader_test
which r600/sb always fell over on.
GCM seems to move some of the copies into other basic blocks,
if we don't allow this to happen then it doesn't seem to schedule
them badly.
Everything I've read on SSA/phi copies say they have to happen
in parallel, so keeping them in the same basic block seems like
a good way to keep some of that property."
This patch differs from the one proposed by Dave in that it only adds
the NF_DONT_MOVE flag to copy_move instructions that are created by split_phi*
and that are located in loops.
Fixes piglit: tests/shaders/ssa/fs-while-loop-rotate-value.shader_test
(no regressions in the shader set). It also fixes all failing tests from
dEQP-GLES3.functional.shaders.loops.*
Signed-off-by: Gert Wollny <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
| |
Fixes: cf0c7258ee0 freedreno/a5xx: MSAA
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This was mistakenly exposed, even though we want atomic counters to be
lowered to atomic ops on an SSBO like nearly every other GPU. Which
somehow recently started getting segfaults due to calling a null
pipe->set_hw_atomic_buffers().
Fixes a crash in stk, and probably other things.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This extension provides fences and frame count information to direct
display contexts. It uses new kernel ioctls to provide 64-bits of
vblank sequence and nanosecond resolution.
v2:
Rework fence integration into the driver so that waiting for
any of a mixture of fence types (wsi, driver or syncobjs)
causes the driver to poll, while a list of just syncobjs or
just driver fences will block. When we get syncobjs for wsi
fences, we'll adapt to use them.
v3: Adopt Jason Ekstrand's coding conventions
Declare variables at first use, eliminate extra whitespace between
types and names. Wrap lines to 80 columns.
Suggested-by: Jason Ekstrand <[email protected]>
v4: Adapt to WSI fence API change. It now returns VkResult and
no longer has an option for relative timeouts.
v5: wsi_register_display_event and wsi_register_device_event now
use the default allocator when NULL is provided, so remove the
computation of 'alloc' here.
Signed-off-by: Keith Packard <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This extension provides fences and frame count information to direct
display contexts. It uses new kernel ioctls to provide 64-bits of
vblank sequence and nanosecond resolution.
v2: Adopt Jason Ekstrand's coding conventions
Declare variables at first use, eliminate extra whitespace between
types and names. Wrap lines to 80 columns.
Add extension to list in alphabetical order
Suggested-by: Jason Ekstrand <[email protected]>
v3: Adapt to WSI fence API change. It now returns VkResult and
no longer has an option for relative timeouts.
v4: wsi_register_display_event and wsi_register_device_event now
use the default allocator when NULL is provided, so remove the
computation of 'alloc' here.
v5: use zalloc2 instead of alloc2 for the WSI fence.
Suggested-by: Jason Ekstrand <[email protected]>
Signed-off-by: Keith Packard <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This extension provides fences and frame count information to direct
display contexts. It uses new kernel ioctls to provide 64-bits of
vblank sequence and nanosecond resolution.
v2: Remove DRM_CRTC_SEQUENCE_FIRST_PIXEL_OUT flag. This has
been removed from the proposed kernel API.
Add NULL parameter to drmCrtcQueueSequence ioctl as we
don't care what sequence the event was actually queued to.
v3: Adapt to pthread clock switch to MONOTONIC
v4: Fix scope for wsi_display_mode andwsi_display_connector allocs
Suggested-by: Jason Ekstrand <[email protected]>
v5: Adopt Jason Ekstrand's coding conventions
Declare variables at first use, eliminate extra whitespace between
types and names. Wrap lines to 80 columns.
Use wsi_rel_to_abs_time helper function to convert relative
timeouts to absolute timeouts without causing overflow.
Suggested-by: Jason Ekstrand <[email protected]>
v6:
Change WSI fence wait function to return VkResult instead of
bool. This makes the meaning of the return value easier to
understand, and allows for the indication of failure.
Also change the WSI fence wait function to take only absolute
timeouts and not provide an option for a relative timeout. No
users wanted relative timeouts, and it's simpler if that option
isn't available.
Terminate the DPMS property loop once we've found the property.
Assert that the fence hasn't already been destroyed in
wsi_display_fence_destroy.
Rearrange the event handler function order in the file to place
routines in an easier to find order.
Suggested-by: Jason Ekstrand <[email protected]>
v7:
Adapt to API changes for surface_get_capabilities
v8:
Use wsi->alloc in register_display_event so that callers
don't have to dig out an allocator for us.
v9:
Fix a few minor formatting issues
Suggested-by: Jason Ekstrand <[email protected]>
v10:
Use wsi->alloc if none provided in wsi_display_fence_alloc.
Now that drivers are expected to pass the allocator argument
straight through from the application, we need to check those
for NULL everywhere.
Signed-off-by: Keith Packard <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Handle the case where the set of fences to wait for is not all of the
same type by either waiting for them sequentially (waitAll), or
polling them until the timer has expired (!waitAll). We hope the
latter case is not common.
While the current code makes sure that it always has fences of only
one type, that will not be true when we add WSI fences. Split out this
refactoring to make merging that clearer.
v2: Adopt Jason Ekstrand's coding conventions
Declare variables at first use, eliminate extra whitespace between
types and names. Wrap lines to 80 columns.
Suggested-by: Jason Ekstrand <[email protected]>
v2:
Cast INT64_MAX to uint64_t to make of its use as the maximum
possible timeout clearly unsigned to the reader.
Suggested-by: Jason Ekstrand <[email protected]>
Make anv_wait_for_fences with !waitAll check all fences at least
once, even if the requested timeout has already passed.
Signed-off-by: Keith Packard <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
| |
Acked-by: Bas Nieuwenhuizen <[email protected]>
Acked-by: Dave Airlie <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
System values are never arrays or structs so we can assume a direct var
deref. This simplifies things a bit and prevents us from accidentally
throwing away an array index.
Suggested-by: Caio Marcelo de Oliveira Filho <[email protected]>
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
Acked-by: Rob Clark <[email protected]>
Acked-by: Bas Nieuwenhuizen <[email protected]>
Acked-by: Dave Airlie <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
Acked-by: Rob Clark <[email protected]>
Acked-by: Bas Nieuwenhuizen <[email protected]>
Acked-by: Dave Airlie <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|