| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
These mirror the fragment shader structs, this is just a framework.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
| |
The compute shader will need it's own context like the frag shader
has, this just introduces the framework struct and allocates/frees
for it in the right places.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
In order to efficiently run a number of compute blocks, use
a threadpool that just allows for jobs with unique sequential
ids to be dispatched.
|
|
|
|
|
|
|
|
| |
In order to share the texture/image/sampler code with compute
shaders we need to reorg them to be at the front of context
same as draw does for vs/gs sharing.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
We were only handling the modifiers case and not counting the number of
planes in actual planar images.
Fixes Piglit's ext_image_dma_buf_import-export.
Fixes: fc12fd05f56 ("iris: Implement pipe_screen::resource_get_param")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111509
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Based on the vc4 implementation.
Fixes Android RenderEngine::flush() routine:
android.googlesource.com/platform/frameworks/native/+/refs/tags/android-o-mr1-iot-release-smart-clock-fcs/services/surfaceflinger/RenderEngine/RenderEngine.cpp#225
Signed-off-by: Roman Stratiienko <[email protected]>
Reviewed-by: Qiang Yu <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Try more aggressive approach with cloning uniform and coord loads.
Uniform load can be inserted into any instruction, so let's do that. ARM site
claim that penalty for cache miss is one clock, so we don't lose anything if
we merge it into instruction that uses the result. As side effect we can also
pipeline it and thus decrease reg pressure.
Do the same for varyings that hold texture coords, but for different reason:
looks like there's a special path for coords that increases precision if
varying that holds it is pipelined. If we don't pipeline it and load coords
from a register its precision is fp16 and thus only 10 bits which is not enough
to accurately sample textures of size 1024 or larger.
Since instruction can hold only one uniform load and one varying load,
node_to_instr now creates a move using helper introduced in previous commit if
slot is already taken. As side effect of this change we can also try to
pipeline texture loads and create a move if attempt fails.
Reviewed-by: Erico Nunes <[email protected]>
Signed-off-by: Vasily Khoruzhick <[email protected]>
|
|
|
|
|
|
|
|
|
| |
It can load value from varying directly as well. Also load_regs is the
only op that has a source, so add src_num field to load node and set it
accordingly.
Reviewed-by: Erico Nunes <[email protected]>
Signed-off-by: Vasily Khoruzhick <[email protected]>
|
|
|
|
|
|
|
| |
Introduce common helper for creating movs to avoid code duplication
Reviewed-by: Erico Nunes <[email protected]>
Signed-off-by: Vasily Khoruzhick <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This looks like clear copy-and-pasteos, and fixes:
dEQP-GLES2.functional.draw.random.40
(on A307 and A630, both tested in the new CI farm)
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
We can get all the information we need from NIR. It's slightly less
accurate, but radeonsi doesn't use the extra information. The old code
also overcounted atomic counters, which led to problems when everything
was used at once.
Fixes KHR-GL45.compute_shader.resources-max.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Otherwise it's impossible to know the maximum SSBO index for both
internal TGSI shaders from TTN (which don't have any notion of atomic
counters and no offset) as well as shaders from GLSL.
I fixed everything I could find while grepping for num_ssbos and
num_abos, which hopefully is everything (iris was the only user I could
find that uses it in a meaningful way).
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This routine was made obsolete over a series of reworks of memory
allocation; Tomeu's changes to shader memory allocation finally made
this unused as cppcheck noted.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Tomeu Vizoso <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Tomeu Vizoso <[email protected]>
|
|
|
|
|
|
|
| |
I was not aware this incurred undefined behaviour; thank you cppcheck.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Tomeu Vizoso <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Four bytes of src_surf will be compressed into a 32-bits data and
stored into dst_surf, and dst_surf is read as z-order, so its width
must be aligned to multiples of 8(4x2) before divided by 2.
Signed-off-by: Zhaowei Yuan <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111266
Reviewed-by: Alejandro Piñeiro <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
swr_shader.cpp: In function ‘void (* swr_compile_gs(swr_context*, swr_jit_gs_key&))(HANDLE, HANDLE, SWR_GS_CONTEXT*)’:
swr_shader.cpp:732:44: error: ‘make_unique’ was not declared in this scope
ctx->gs->map.insert(std::make_pair(key, make_unique<VariantGS>(builder.gallivm, func)));
^~~~~~~~~~~
Signed-off-by: Vinson Lee <[email protected]>
Reviewed-by: Jan Zielinski <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Lionel found actual documentation for this at long last. Apparently
it actually is a sampler cache limitation that was mostly fixed on
Icelake. Unfortunately, it seems there are still issues with ASTC
and non-ASTC sampler views. Still, we can lessen the flush condition
from "format mismatch" to "ASTC mismatch", which eliminates most of
the flushing here.
We also update the documentation to refer to the workaround name.
|
|
|
|
|
|
|
|
|
|
| |
Commit 6f7306c029a7 ("swr/rast: Refactor memory API between rasterizer
core and swr") unintentionally removed changes for llvm-9.0.
Fixes: 6f7306c029a7 ("swr/rast: Refactor memory API between rasterizer core and swr")
Fixes: 5dd9ad157005 ("swr/rasterizer: Better implementation of scatter")
Signed-off-by: Vinson Lee <[email protected]>
Reviewed-by: Jan Zielinski <[email protected]>
|
|
|
|
|
|
|
| |
We already had a perfectly cromulent pass for this, but one landed in
common NIR code so let's switch and lighten our tree.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Memory allocated through panfrost_allocate_transient() is likely to
come from the transient pool. Let's add the BO backing the allocated
memory region to the job batch so the kernel can retain this BO while
jobs are executed.
In practice that has never been a problem because the transient pool
is never shrinked, and even if it was, we still control the lifetime of
the job, so there's no reason for this BO to be freed before the GPU is
done executing the batch. But it still make sense to add the BO for
debugging purpose.
Signed-off-by: Boris Brezillon <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Both the BO cache and the transient pool are shared across
context's. Protect access to these with mutexes.
Signed-off-by: Rohan Garg <[email protected]>
Reviewed-by: Boris Brezillon <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
Signed-off-by: Boris Brezillon <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Jobs _must_ only be shared across the same context, having
the last_job tracked in a screen causes use-after-free issues
and memory corruptions.
Signed-off-by: Rohan Garg <[email protected]>
Reviewed-by: Boris Brezillon <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
Signed-off-by: Boris Brezillon <[email protected]>
|
|
|
|
|
|
|
| |
Tiling mode was missing from fd3_emit_gmem_restore_tex().
emit_gmem2mem_surf() used LINEAR exclusiveley.
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
* Fix 2D/2DArray/3D tiling parameters:
There is a bottom threshold for width and height.
* Renable tiling for Cubemap, after setting the right parameters.
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Equivalent of 0c1dd9dee "broadcom/vc4: Allow importing linear BOs with
arbitrary offset/stride." for v3d.
Allows YUV buffers with a single buffer and plane offsets to be
passed in.
Signed-off-by: Dave Stevenson <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
Input to GS is just a set of attributes, so remove explicit setup of
'position' which is meaningless for GS input processing.
Reviewed-by: Alok Hota <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Thong Thai <[email protected]>
Reviewed-by: Boyuan Zhang <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 5a2e65be89d97ed5d7263f0296ea69ae8517187b.
Even though CONTEXT_CONTROL is emitted by the kernel, CONTEXT_CONTROL
still needs to be emitted by the UMD, or else the driver will hang
Cc: 19.2 <[email protected]>
Signed-off-by: Thong Thai <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We enabled fast clears at level > 0, but didn't minify the dimensions
when comparing the box size, so we always thought it was a partial
clear and as a result never actually enabled any.
This eliminates some slow clears in Civilization VI, but they are mostly
during initialization and not the main rendering.
Thanks to Dan Walsh for noticing we had too many slow clears.
Fixes: 393f659ed83 ("iris: Enable fast clears on other miplevels and layers than 0.")
Reviewed-by: Rafael Antognolli <[email protected]>
|
|
|
|
|
|
|
|
|
| |
is_scanout is not used anywhere and can be inferred within
panfrost_drm_submit_vs_fs_job() if required.
Signed-off-by: Rohan Garg <[email protected]>
Reviewed-by: Boris Brezillon <[email protected]>
Signed-off-by: Boris Brezillon <[email protected]>
|
|
|
|
|
|
|
| |
Otherwise it doesn't exist and can't be parsed, so everything dies at
screen init time.
Fixes: 6dc4ddc5f81 ("iris: use driconf for 'bo_reuse' parameter")
|
|
|
|
|
|
|
|
| |
Some functionality has been added to deqp-volt to only print
regressions, so update our version of it and use the new options.
Signed-off-by: Tomeu Vizoso <[email protected]>
Acked-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
Added loading gl_Layer and gl_ViewportIndex variables
to Pixel Shader context.
Reviewed-by: Alok Hota <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When u_upload_mgr fills up a buffer, it unmaps and destroys it. Our
unmap function was automatically performing the equivalent of a
FlushMappedBufferRange call in this case. Because the buffer mapping
is persistent and coherent, we don't actually do any flushing when we
do the rest of the writes to the buffer - we were just doing one final
one at the end. But we would be using the uploaded contents on the
GPU the whole time.
This certainly shouldn't be necessary for streaming buffers, and if
such flushing and dirtying is necessary for coherent buffers, this is
wildly insufficient.
Drops a small number of constant packets and PIPE_CONTROL flushes from
most benchmarks that I've looked at. Doesn't seem to make much of an
impact on performance, however.
Thanks to Felix Degrood for noticing that we were emitting more
3DSTATE_CONSTANT_* packets than we needed to.
|
|
|
|
|
|
| |
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
We were clamping the LOD to force non-mipmap filtering, but that means
that the HW doesn't get to select between the min and mag filters.
Setting MIPFILTER_LINEAR_FAR appears to force non-mipmap filtering.
Fixes all failures in dEQP-GLES2.functional.texture.filtering.2d.*
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
Reset the damage area in the resource_from_handle() path (as done in
panfrost_resource_create()).
Signed-off-by: Boris Brezillon <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Looks like initial RE was wrong and some fields have different purpose.
I.e. there's no "disable_mipmap" field, it's actually part of another field
that selects mipmap filtering.
Also fix layout position.
Reviewed-by: Qiang Yu <[email protected]>
Signed-off-by: Vasily Khoruzhick <[email protected]>
|
|
|
|
| |
This is always false on Gen8+, no need for dead code and parameters.
|
|
|
|
|
| |
Cc: 19.2 19.1 <[email protected]>
Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111414
Fixes: b758eed9c37 ("radeonsi: make sure that blend state != NULL and remove all NULL checking")
Cc: 19.2 <[email protected]>
Tested-by: Edmondo Tommasina <[email protected]>
Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
|
|
|
|
| |
Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
|
|
|
|
| |
Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
We need two different values of the register, one for NGG and one for
legacy, in order to fix edge flags for the legacy pipeline.
Passing the ngg flag to emit_clip_regs would be too complicated,
so CONTEXT_REG_RMW is used for partial register updates.
Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
|