| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
Acked-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
| |
Acked-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
| |
We'll need this in the future.
Acked-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
| |
Acked-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
| |
Acked-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
| |
Acked-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
| |
Acked-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This mainly removes and simplifies code that is no longer needed.
There were some issues with the DB->CB stencil copy on gfx10, so let's
just use a fragment shader blit for all ZS mappings. It's more reliable.
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This is for drivers that can't map depth and stencil and need to blit
them to a color texture for CPU access.
This also useful for drivers using separate depth and stencil.
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
|
|
|
|
|
| |
- merge all 3 functions (Z, S, ZS)
- don't write the color output
- read the value from texel.x, then write it to position.z or stencil.y
(don't use the value from texel.y or texel.z)
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
| |
Tested-by: Dieter Nützel
|
|
|
|
|
|
|
|
|
|
| |
Added meson test for standalone compiler with --dump-builder option
on builtin texture* functions.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107767
Signed-off-by: Yevhenii Kolesnikov <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Dylan Baker <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
glsl/standalone with --dump-builder will exit when unsupported texture
functions are encountered.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107767
Signed-off-by: Sergii Romantsov <[email protected]>
Signed-off-by: Yevhenii Kolesnikov <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Dylan Baker <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
gl_SampleMaskIn is 1 when R_028BE0_PA_SC_AA_CONFIG is 0, so this commit rework the conditions
controlling this register.
Before it was set if the sctx->framebuffer had a sample count > 1.
Now we still require this condition, but we also need either:
- GL_MULTISAMPLE to be enabled
- to be executing an operation that doesn't depends on GL state using u_blitter.
This fixes the arb_sample_shading/sample_mask piglit tests on radeonsi.
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
If we're doing a Z -> Z MSAA blit (for example) we need to enable
msaa rasterization when drawing the quads so that we can properly
write the per-sample values.
This fixes a number of Piglit ext_framebuffer_multisample blit tests
such as ext_framebuffer_multisample/no-color 2 depth combined with
the VMware driver.
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If we are discarding the whole resource, we don't care about previous contents,
and the resource storage is now unused, either because we have created new
resource storage, or because we have waited for the existing resource storage
to become unused, or because the transfer is unsynchronized.
In the last two cases this commit marks the storage as uninitialized, but only
if the resource is not host writable (in which case we can't clear the valid
range, since that would result in missed readbacks in future transfers).
In the first case, when the whole resource discard involves a reallocation, the
reallocation and subsequent rebinding already update the valid buffer range
appropriately.
Signed-off-by: Alexandros Frantzis <[email protected]>
Reviewed-by: Chia-I Wu <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The rasterizer core supported ARB_viewport_array,
but the swr layer connecting core to Gallium state
tracker only allowed one viewport.
We add support for multiple viewports to swr layer.
Reviewed-by: Alok Hota <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Basically same as external for now.
Reviewed-by: Samuel Pitoiset <[email protected]>
Only case we might need to handle differently in the near future
is Raven's case of displayable DCC which is not renderable. But
we don't support that yet.
|
|
|
|
|
|
|
| |
blocks.
Fixes: d7e6541cc72 "radv: Only allocate supplied number of descriptors when variable."
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Getting a DMA-buf fd and converting that to a handle using our duplicate
of that file descriptor (getting at which requires passing a
radeon_winsys pointer to the buffer_get_handle hook) makes sure of this,
since duplicated file descriptors reference the same file description
and therefore the same GEM handle namespace.
This is necessary because libdrm_amdgpu may use a different DRM file
descriptor with a separate handle namespace internally, e.g. because it
always reuses any existing amdgpu_device_handle for the same device.
amdgpu_bo_export returns a handle which is valid for that internal
file descriptor.
Bugzilla: https://bugs.freedesktop.org/110903
Reviewed-by: Marek Olšák <[email protected]>
Tested-by: Pierre-Eric Pelloux-Prayer <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It extends pipe_screen / radeon_winsys and references amdgpu_winsys.
Multiple amdgpu_screen_winsys instances may reference the same
amdgpu_winsys instance, which corresponds to an amdgpu_device_handle.
The purpose of amdgpu_screen_winsys is to keep a duplicate of the DRM
file descriptor passed to amdgpu_winsys_create, which will be needed
in the next change.
v2:
* Add comment in amdgpu_winsys_unref explaining why it always returns
true (Marek Olšák)
Reviewed-by: Marek Olšák <[email protected]>
Tested-by: Pierre-Eric Pelloux-Prayer <[email protected]>
|
|
|
|
|
|
|
|
| |
Cleanup to prevent breakage with the next change, no functional change
intended in this one.
Reviewed-by: Marek Olšák <[email protected]>
Tested-by: Pierre-Eric Pelloux-Prayer <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Do not use the view format when filling the surface state.
Fixes dEQP-VK.image.texel_view_compatible.compute.extended.texture.*
Fixes: fb1350c76f1 ("intel: Add and use helpers for level0 extent")
Reviewed-by: Nanley Chery <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
this can fail unexpectedly due to bugs, so it's good to provide feedback
when this occurs
Reviewed-by: Sagar Ghuge <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
| |
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
|
|
|
|
| |
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
|
|
|
|
| |
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
From OpPtrAccessChain description in the SPIR-V spec (1.4 rev 1):
For objects in the Uniform, StorageBuffer, or PushConstant storage
classes, the element’s address or location is calculated using a
stride, which will be the Base-type’s Array Stride when the Base
type is decorated with ArrayStride. For all other objects, the
implementation will calculate the element’s address or location.
For non-CL shaders the driver should layout the Workgroup storage
class, so override any explicitly set ArrayStride in the shader. This
currently fixes only the lower_workgroup_access_to_offsets case, which
is used by anv.
Reviewed-by: Juan A. Suarez <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On gen11, the removed the PLN instruction so we have to emit a pile of
MAD to emulate it. We may as well do that in NIR so we can optimize and
later schedule it.
Shader-db results on Ice Lake:
total instructions in shared programs: 17145644 -> 16556440 (-3.44%)
instructions in affected programs: 11507454 -> 10918250 (-5.12%)
helped: 35763
HURT: 42085
helped stats (abs) min: 1 max: 140 x̄: 19.09 x̃: 18
helped stats (rel) min: 0.04% max: 37.93% x̄: 15.40% x̃: 14.49%
HURT stats (abs) min: 1 max: 248 x̄: 2.22 x̃: 2
HURT stats (rel) min: 0.05% max: 50.00% x̄: 5.00% x̃: 2.47%
95% mean confidence interval for instructions value: -7.67 -7.47
95% mean confidence interval for instructions %-change: -4.46% -4.29%
Instructions are helped.
total loops in shared programs: 4370 -> 4370 (0.00%)
loops in affected programs: 0 -> 0
helped: 0
HURT: 0
total cycles in shared programs: 360624645 -> 368220857 (2.11%)
cycles in affected programs: 269631244 -> 277227456 (2.82%)
helped: 15583
HURT: 65874
helped stats (abs) min: 1 max: 28561 x̄: 78.45 x̃: 32
helped stats (rel) min: <.01% max: 67.81% x̄: 5.38% x̃: 2.44%
HURT stats (abs) min: 1 max: 238638 x̄: 133.87 x̃: 20
HURT stats (rel) min: <.01% max: 306.25% x̄: 5.81% x̃: 3.97%
95% mean confidence interval for cycles value: 67.42 119.09
95% mean confidence interval for cycles %-change: 3.61% 3.73%
Cycles are HURT.
total spills in shared programs: 8943 -> 8981 (0.42%)
spills in affected programs: 1925 -> 1963 (1.97%)
helped: 44
HURT: 14
total fills in shared programs: 21815 -> 21925 (0.50%)
fills in affected programs: 3511 -> 3621 (3.13%)
helped: 41
HURT: 18
LOST: 70
GAINED: 14
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
| |
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
If they never get used, dead code should clean them up. Also, we rework
the at_offset and at_sample intrinsics so they return a proper vec2
instead of returning things in PLN layout. Fortunately, copy-prop is
pretty good at cleaning this up and it doesn't result in any actual
extra MOVs.
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of manually adding the BOs from the various SLAB pools plus
the one backing the color FB, we insert them in the BO set attached
to the job and let panfrost_drm_submit_job() pass all BOs from this set
to the SUBMIT ioctl.
This means we are now passing all referenced BOs and let the scheduler
wait on referenced BO fences if needed.
Signed-off-by: Boris Brezillon <[email protected]>
|
|
|
|
|
|
|
| |
There's no point duplicating the code, and it will help us simplify
the bo_handles[] filling logic in panfrost_drm_submit_job().
Signed-off-by: Boris Brezillon <[email protected]>
|
|
|
|
|
|
|
| |
To avoid the panfrost_memory <-> panfrost_bo dance done in
panfrost_resource_create_bo() and panfrost_bo_unreference().
Signed-off-by: Boris Brezillon <[email protected]>
|
|
|
|
|
|
|
| |
So we can re-use it for the panfrost_drm_create_bo() function we are
about to introduce.
Signed-off-by: Boris Brezillon <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Let's keep a clear split between ioctl wrappers and the rest of the
driver. All the import BO function need is a dmabuf FD and the screen
object, and the export one should only take care of generating a dmabuf
FD out of a BO object. Winsys handle manipulation should stay in the
resource.c file.
Signed-off-by: Boris Brezillon <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
That's what most (all?) implementation seem to do, and my understanding
is that a BO is just a bunch of memory that can be used for anything GPU
related, not only texture/FB resources.
Let's move those meta data in panfrost_resource so we can use
panfrost_bo for all kind of memory allocation and make BO allocation
more consistent.
Signed-off-by: Boris Brezillon <[email protected]>
|
|
|
|
|
|
|
| |
panfrost_drm_submit_job() and panfrost_fence_create() are not used
outside of pan_drm.c.
Signed-off-by: Boris Brezillon <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
bo->imported was never set to true which means this path was never taken.
Moreover, panfrost_drm_free_imported_bo() is doing missing the munmap()
call which seems wrong because the import BO function calls mmap().
Let's just kill this function along with the ->imported field.
Signed-off-by: Boris Brezillon <[email protected]>
|
|
|
|
|
|
|
| |
Commit 5f81669d880b ("panfrost: Remove the panfrost_driver abstraction")
left a few things behind, remove them now.
Signed-off-by: Boris Brezillon <[email protected]>
|
|
|
|
|
|
| |
Which improves readability and help us avoid a memory leak.
Signed-off-by: Boris Brezillon <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Otherwise we get random use-after-{free,unmap} errors.
Signed-off-by: Boris Brezillon <[email protected]>
---
Changes in v2:
- Move the panfrost_job_add_bo() call out of the loop
|
|
|
|
|
|
|
|
|
|
|
|
| |
It's currently only enabled if dcc_slice_size is equal to
dcc_slice_fast_clear_size because the driver assumes that
portions of multiple layers are contiguous but it's not
always true.
Still not supported on GFX9.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|