| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes all the ARB_texture_query_lod piglit tests, and needed to get
the Vulkan CTS textureQueryLOD passing with the ongoing Vulkan driver.
Note that LOD Query bit flag became only available on V42 of the hw,
but the v3d40_tex is using V41 as reference. In order to avoid setting
up the infrastructure to support both v41 and v42, we manually set the
bit if the device version is the correct one.
We also fix how the ARB_texture_query_lod (so EXT_texture_query_lod)
is exposed. Before this commit it was always exposed (wrongly as it
was not really supported). Now it is exposed for devinfo.ver >= 42.
v2: move _need_sampler helper to nir.h (Eric Anholt)
Reviewed-by: Eric Anholt <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4677>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Configuration Parameter packets 1 and 2 are pointed as optional, but
it is not clearly stated if you can skip only P1 when P2 is needed.
In the practice, it seems that the situation P0 - non-P1 - P2 can
causes problems, and at least on the simulator, it seems that sampler
info are attempted to be accessed. So let's just be conservative, and
only skip P1 configuration if we can skip P2 configuration too.
Reviewed-by: Eric Anholt <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4677>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
TMU configuration parameter 1 configures the sampler for the texture
operation. But there are some texture operations that doesn't need a
sampler. Skipping the configuration could provide a small perf
improvement on OpenGL. On the incoming Vulkan driver, would allow us
to avoid to set up an unneeded sampler.
Note that we still need to add the sampler configuration parameter if
the output is a 32bit, as it is on the sampler where we configure that
info.
Also, note that for images this is done comparing against a unpacked
p1 default. But in order to do that it is needed to go through the
code that fills up the unpacked p1. We can skip that too.
Reviewed-by: Eric Anholt <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4677>
|
|
|
|
|
|
|
|
| |
This saves a bunch of hassle in handling derefs in the backend, and would
be needed for reasonable handling of dynamic indexing of image arrays.
Reviewed-by: Kenneth Graunke <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3728>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes dEQP test:
dEQP-GLES31.functional.synchronization.inter_call.with_memory_barrier.image_atomic_multiple_interleaved_write_read
Fixes piglit test:
spec/glsl-es-3.10/execution/cs-image-atomic-if-else.shader_test
Fixes: 6281f26f064ada ("v3d: Add support for shader_image_load_store.")
Reviewed-by: Alejandro Piñeiro <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
That we set for any TMU write on spills and general tmu. It is then
used as part of v3d_emit_gl_shader_state later.
v2: add a new flag instead at v3d_compiler instead of dirty the flag
at v3dx if there is any spill (change suggested by Eric, added by
Alejandro)
v3: set this for anything that is not a load and do it also in
v3d40_vir_emit_image_load_store (Eric)
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This better matches all the other atomic intrinsics such as those for
SSBOs and shared variables where the sign is part of the intrinsic
opcode. Both generators (GLSL and SPIR-V) know the sign from the type
of the image variable or handle. In SPIR-V, signed min/max are separate
opcodes from unsigned.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This allows to remove a mov of 1/-1, as it is implicit with the
operation.
As with atomic inc/dec/add, usual shader-db set doesn't include any
GLES shader using it. So using as workaround vk-gl-cts shaders, we get
this:
total instructions in shared programs: 1217013 -> 1217006 (<.01%)
instructions in affected programs: 53 -> 46 (-13.21%)
helped: 2
HURT: 0
One of the helped shader went from 40 to 34 instructions.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
And moved to new auxiliar method v3d40_image_load_store_tmu_op,
equivalent to the nir_to_nir v3d_general_tmu_op, to clean-up a little.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
| |
An unused tex should be DCEed, but if it wasn't we'd run into trouble with
not doing a TMUWT.
|
|
|
|
|
|
|
| |
Fixes assertion failures in the CTS since Karol's cleanup when NIR started
noticing that we were reading an invalid component.
Fixes: 5450f1c9fb09 ("v3d: prefer using nir_src_comp_as_int over nir_src_as_const_value")
|
|
|
|
|
| |
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
| |
I'm not sure why I didn't do this before -- it's clearly much simpler to
add dumping of the extra thing than to have it as another implicit source.
|
|
|
|
| |
Noticed while looking at the gitlab-CI MR.
|
|
|
|
|
|
| |
This is only exposed on V3D 4.1+, because we didn't have the TMU write
operations for images on 3.3 (To do GLES 3.1 there, you have to lower it
to SSBO load/stores, which is a problem to solve later).
|
|
|
|
|
|
|
|
| |
Even without any clever optimization on the unpack operations, this gives
us a useful value for the channels read field, which we can use to avoid
ldtmu instructions to the no-op register.
instructions in affected programs: 890712 -> 881974 (-0.98%)
|
|
|
|
|
|
| |
Fixes
dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.rgba8.size_pot.clamp_to_edge_repeat
and others.
|
|
|
|
|
|
| |
This is what the GLSL ES 310 spec tells us to do, but apparently the
"gather mode" flag doesn't imply it in the HW. Fixes
dEQP-GLES31.functional.texture.gather.basic.2d.rgba8.filter_mode.min_nearest_mipmap_linear_mag_linear
|
|
|
|
|
|
| |
These have all been floating in my head, and while I've thought about
encoding them in issues on gitlab once they're enabled, they also make
sense to just have in the area of the code you'll need to work in.
|
|
|
|
|
|
| |
shader-db:
total instructions in shared programs: 91275 -> 90768 (-0.56%)
instructions in affected programs: 20702 -> 20195 (-2.45%)
|
|
|
|
| |
The docs had an update noting this restriction, so reflect it in the code.
|
|
|
|
|
|
| |
Fixes simulator assertion failures in
dEQP-GLES3.functional.shaders.texture_functions.texture.samplercubeshadow_bias_fragment
and similar complicated cases.
|
|
|