| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
| |
Don't lower to offsets, instead use nir_lower_explicit_io here and
use actual pointers for UBO's and SSBO's. This makes
KHR_variable_pointers trivial. This also fixes asserts with shared
variables, which are now supposed to be lowered with
nir_lower_explicit_io.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5684>
|
|
|
|
|
|
|
|
|
|
| |
Gallium drivers should never see nir_var_uniform because gallium lowers
regular uniforms to a UBO. No GL driver should ever see either
nir_var_mem_shared because that's lowered in GLSL IR.
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5418>
|
|
|
|
|
|
|
|
| |
Both vertex and fragment shaders need to have the lowering.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5751>
|
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5753>
|
|
|
|
|
|
|
| |
Document this new a6xx_state_src value seen in A640/A650 tess traces.
Signed-off-by: Jonathan Marek <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5760>
|
|
|
|
|
|
|
|
| |
Avoid inadvertantly becoming master if fdperf happens to be the first
thing to open the device.
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5762>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously we would match the start of the compatible string, in
a couple of cases, in order to match compatible strings like
"qcom,adreno-630.2". But these cases would always list a more
generic compatible (ie. "qcom,adreno") as a later choice. So if
we parse all the compatible strings, we can do a more precise
exact match.
This avoids us accidentially matching on "qcom,adreno-smmu" and
the hilarity that ensues.
Fixes: 5a13507164a ("freedreno/perfcntrs: add fdperf")
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5762>
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5762>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The previous version assumes tess level outputs will only be written once
in the shader, however its not possible to guarantee that.
It also assumes all invocations will write all the levels, which is also
not guaranteed.
This is required to fix the "tesselation" and "terraintessellation" demos
with turnip.
The comment about nir_lower_io_to_temporaries in lower_tess_ctrl_block is
removed because nir_lower_io_to_temporaries specifically skips TESS_CTRL
shaders so the comment doesn't make sense.
The split load for tess levels workaround is removed, the new version only
has scalar access unless if ever gets vectorized.
This sets NIR_COMPACT_ARRAYS cap to avoid the glsl tess vec lowering with
gallium. It seems this will also disable "LowerCombinedClipCullDistance",
which I'm not sure was needed or not.
Signed-off-by: Jonathan Marek <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5744>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Accidentally broke this when rebasing the offending commit.
My use case with non-zero explicit offset is UV plane of UBWC NV12, and
only the UBWC slice offset is used for the UBWC sampler, so I didn't catch
it immediately.
Fixes: d53dc6c37680eba8e8 ("freedreno/fdl6: rework layout code a bit (reduce linear align to 64 bytes)")
Signed-off-by: Jonathan Marek <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5761>
|
|
|
|
|
| |
Signed-off-by: Jonathan Marek <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4600>
|
|
|
|
|
| |
Signed-off-by: Jonathan Marek <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4600>
|
|
|
|
|
|
|
| |
Avoids having to duplicate logic to figure out the write mask on D24S8
Signed-off-by: Jonathan Marek <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4600>
|
|
|
|
|
|
|
|
|
|
|
| |
ir3 already calculates the stride in the tess param bo, so use that instead
of a incorrect calculation. The calculation of per_vertex_output_size /
per_patch_output_size is wrong because it counts dwords instead of bytes,
and what it counts for per_vertex_output_size is a per-patch size because
the glsl type is already an array of # vertex/patch elements.
Signed-off-by: Jonathan Marek <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5743>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Remove scratch_bo from cmdbuffer, use a device-global bo instead, which
also includes border color (and eventually shaders for 3D blit path)
* Use CP_SET_BIN_DATA5_OFFSET to allow setting VSC buffer addresses only
once at the start of the cmdstream
* Use scratch bo mechanism for a resizable VSC buffer
* Use feedback from "vsc_draw_overflow" and "vsc_prim_overflow" values to
increase the size of VSC buffer when beginning to record a new cmdbuffer
Signed-off-by: Jonathan Marek <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5570>
|
|
|
|
|
|
|
|
|
|
|
| |
Loop through pipes and then loop over the tiles in that pipe instead of
looping over all tiles then having to calculate the pipe # and slot #.
Mainly this avoids the hard to follow "config_get_tile" logic, but should
also be a gain due to better use of cache with the VSC data.
Signed-off-by: Jonathan Marek <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5570>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Compute the tiling config at framebuffer creation time. A framebuffer will b
be re-used multiple times, so this will avoid having to re-calculate the
tiling config every time a command buffer is recorded.
The tiling config already couldn't use the render area's x1/y1 because of
hw binning, this move makes it so the render area isn't used at all for the
tiling config.
Signed-off-by: Jonathan Marek <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5570>
|
|
|
|
|
|
|
| |
v2. Remove redundant memset and make the expression simpler.
Signed-off-by: Hyunjun Ko <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5703>
|
|
|
|
|
|
|
|
| |
Check the interp mode and use SYSTEM_VALUE_BARYCENTRIC_LINEAR_* instead
when it is INTERP_MODE_NOPERSPECTIVE.
Signed-off-by: Jonathan Marek <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5582>
|
|
|
|
|
| |
Signed-off-by: Jonathan Marek <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5582>
|
|
|
|
|
|
|
| |
This will be useful to support the missing barycentric sysvals.
Signed-off-by: Jonathan Marek <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5582>
|
|
|
|
|
|
|
|
|
|
|
| |
Note:
* a3xx change based on available register documentation
* a4xx guesses (RB_RENDER_CONTROL2 bits especially)
* a5xx change based on a6xx, these registers seem identical
Signed-off-by: Jonathan Marek <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5582>
|
|
|
|
|
|
|
| |
Passes at least dEQP-VK.dynamic_state.rs_state.depth_bias_clamp
Signed-off-by: Jonathan Marek <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5678>
|
|
|
|
|
|
|
| |
Passes dEQP-VK.rasterization.primitive_size.points.point_size_*
Signed-off-by: Jonathan Marek <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5678>
|
|
|
|
|
|
|
|
|
|
|
| |
This is not completely tested, but matches the max array pitch allowed by
A6XX_TEX_CONST_9_FLAG_BUFFER_ARRAY_PITCH.
Note this still doesn't allow all image sizes, but it allows 16384x16384
cpp=4 images to work.
Signed-off-by: Jonathan Marek <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5678>
|
|
|
|
|
|
|
|
|
|
|
| |
resinfo always writes 3 components, which was not being taken into account
Fixes these tests:
dEQP-VK.renderpass.suballocation.attachment_sparse_filling.input_attachment_3
dEQP-VK.renderpass.suballocation.attachment_sparse_filling.input_attachment_7
Signed-off-by: Jonathan Marek <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5674>
|
|
|
|
|
|
|
|
| |
tu_ImportFenceFdKHR is used by tu_AcquireImageANDROID, which may or
may not work, but let's at least keep things compiling until somebody
has time to tie up the loose ends on the Android side.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5670>
|
|
|
|
|
|
| |
The device lost support closely matches the anv code for the same.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2769>
|
|
|
|
|
|
|
|
|
|
|
| |
drmCommandWriteRead gives us a -errno, and we only checked for -1 (-EPERM,
incidentally). All the callers wanted 0 for errors, which they were
getting by the fact that req.value was 0-initialized in our stack
allocation (though this only works as long as the kernel doesn't return an
error after setting req.value to something), and -EPERM not really being
an answer we would expect from an ioctl at this stage in the driver.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2769>
|
|
|
|
| |
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2769>
|
|
|
|
|
|
|
|
|
| |
There is only one queue for now, so for non-shared semaphores, the
implementation is basically a no-op. For shared semaphores, this
always uses syncobjs. This depends on syncobj support in the msm
kernel driver.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2769>
|
|
|
|
|
|
|
|
|
| |
In cases where every variant is a shader-cache-hit, we never need the
post-finalize round of nir opt/lowering passes. So defer this until
the first shader-cache-miss to avoid doing pointless work.
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5372>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Adds a shader disk-cache for ir3 shader variants. Note that builds with
`-Dshader-cache=false` have no-op stubs with `disk_cache_create()` that
returns NULL.
Binning pass variants are serialized together with their draw-pass
counterparts, due to shared const-state.
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5372>
|
|
|
|
|
|
|
|
|
|
|
| |
For shader-cache, we are going to want to serialize them together.
Which is awkward if the two related variants are not compiled together.
This also decouples allocation and compile, which will simplify adding
shader-cache (which still needs to allocate, but can skip compile).
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5372>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently we always do sysmem if there is tess. And for GS, the binning
pass VS ends up identical to the draw pass VS, so no point in compiling
it twice. (For GS what we should do someday is generate a binning pass
GS, and possibly if we can do cross-stage linking opts, an optimized
binning pass VS, but the required outputs would somehow have to end up
in the shader variant key.)
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5372>
|
|
|
|
|
|
|
|
| |
Just to group together the parts that will get serialized when we have
shader disk-cache.
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5372>
|
|
|
|
|
|
|
|
|
| |
Use ir3_compiler_destroy() rather than open-coding ralloc_free(). This
will give us a place to add more compiler related cleanup code in the
following patches.
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5372>
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5372>
|
|
|
|
|
|
|
|
|
|
|
| |
The next step is to hook this into pscreen->finalize_nir() so it can
come before the state tracker's shader-caching.
Unfortunately we still need to do lower_io after mesa/st, so that is
split out into a post-finalize pass.
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5372>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Now that tu_cs_emit_regs is used for the scissor, it hits an assert when
the scissor is too large. Fixes this dEQP test:
dEQP-VK.draw.scissor.static_scissor_max_int32
Fixes: 9c0ae5704d654108fd36b ("turnip: fix empty scissor case")
Signed-off-by: Jonathan Marek <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5655>
|
|
|
|
|
|
|
|
|
|
| |
My attempt to be clever here backfired, it overwrites the pNext and stops
the loop (causing deqp to fail to query extension features after that).
Fixes: 62de79ac4492ac9e ("turnip: implement VK_KHR_shader_draw_parameters")
Signed-off-by: Jonathan Marek <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5654>
|
|
|
|
|
|
|
|
| |
Saves some minor overhead, cleans things up a bit, and removes one more
unknown. We now program the internal registers in the same way between
direct/indirect draws.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5644>
|
|
|
|
|
|
|
| |
Based on comparing the implementations of CP_DRAW_INDX_OFFSET and
CP_DRAW_INDIRECT, this is what this field is for.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5644>
|
|
|
|
|
|
|
|
|
| |
This was already done correctly for the indirect variants, and turnip
was setting the correct value, but it seems freedreno missed the change
in the non-indirect variant. Also, fix a misspelling of "indices" and
add a type to INDX_SIZE.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5644>
|
|
|
|
| |
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5607>
|
|
|
|
|
|
|
| |
This provides the policy for how to handle reducing constlen for some
stages.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5607>
|
|
|
|
|
|
|
|
| |
This provides the mechanism for compiling variants with a reduced
constlen. The next patch provides the policy for choosing which to
reduce.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5607>
|
|
|
|
|
|
|
| |
I wanted to access the ir3_compiler from a small helper inside
ir3_shader.h, which currently isn't possible.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5607>
|
|
|
|
|
|
|
| |
Prevents problems when calculating whether we overflow the shared limit.
Note that on a6xx, the macros handle the assert for us.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5607>
|
|
|
|
|
|
| |
Passes the new tests in dEQP-VK.rasterization.culling.*
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5650>
|