aboutsummaryrefslogtreecommitdiffstats
path: root/src/broadcom/compiler/v3d40_tex.c
Commit message (Collapse)AuthorAgeFilesLines
* v3d: support for textureQueryLODAlejandro Piñeiro2020-04-221-26/+17
| | | | | | | | | | | | | | | | | | | Fixes all the ARB_texture_query_lod piglit tests, and needed to get the Vulkan CTS textureQueryLOD passing with the ongoing Vulkan driver. Note that LOD Query bit flag became only available on V42 of the hw, but the v3d40_tex is using V41 as reference. In order to avoid setting up the infrastructure to support both v41 and v42, we manually set the bit if the device version is the correct one. We also fix how the ARB_texture_query_lod (so EXT_texture_query_lod) is exposed. Before this commit it was always exposed (wrongly as it was not really supported). Now it is exposed for devinfo.ver >= 42. v2: move _need_sampler helper to nir.h (Eric Anholt) Reviewed-by: Eric Anholt <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4677>
* v3d/tex: Configuration Parameter 1 can be only skipped if P2 can be skipped tooAlejandro Piñeiro2020-04-221-2/+9
| | | | | | | | | | | | | Configuration Parameter packets 1 and 2 are pointed as optional, but it is not clearly stated if you can skip only P1 when P2 is needed. In the practice, it seems that the situation P0 - non-P1 - P2 can causes problems, and at least on the simulator, it seems that sampler info are attempted to be accessed. So let's just be conservative, and only skip P1 configuration if we can skip P2 configuration too. Reviewed-by: Eric Anholt <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4677>
* v3d/tex: don't configure tmu config 1 if not neededAlejandro Piñeiro2020-04-221-27/+66
| | | | | | | | | | | | | | | | | | | TMU configuration parameter 1 configures the sampler for the texture operation. But there are some texture operations that doesn't need a sampler. Skipping the configuration could provide a small perf improvement on OpenGL. On the incoming Vulkan driver, would allow us to avoid to set up an unneeded sampler. Note that we still need to add the sampler configuration parameter if the output is a 32bit, as it is on the sampler where we configure that info. Also, note that for images this is done comparing against a unpacked p1 default. But in order to do that it is needed to go through the code that fills up the unpacked p1. We can skip that too. Reviewed-by: Eric Anholt <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4677>
* v3d: Ask the state tracker to lower image accesses off of derefs.Eric Anholt2020-02-241-34/+23
| | | | | | | | This saves a bunch of hassle in handling derefs in the backend, and would be needed for reasonable handling of dynamic indexing of image arrays. Reviewed-by: Kenneth Graunke <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3728>
* v3d: Fix predication with atomic image operationsJose Maria Casanova Crespo2019-11-201-0/+12
| | | | | | | | | | | | | Fixes dEQP test: dEQP-GLES31.functional.synchronization.inter_call.with_memory_barrier.image_atomic_multiple_interleaved_write_read Fixes piglit test: spec/glsl-es-3.10/execution/cs-image-atomic-if-else.shader_test Fixes: 6281f26f064ada ("v3d: Add support for shader_image_load_store.") Reviewed-by: Alejandro Piñeiro <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* v3d: add new flag dirty TMU cache at v3d_compilerIago Toral Quiroga2019-10-181-0/+3
| | | | | | | | | | | | | | That we set for any TMU write on spills and general tmu. It is then used as part of v3d_emit_gl_shader_state later. v2: add a new flag instead at v3d_compiler instead of dirty the flag at v3dx if there is any spill (change suggested by Eric, added by Alejandro) v3: set this for anything that is not a load and do it also in v3d40_vir_emit_image_load_store (Eric) Reviewed-by: Eric Anholt <[email protected]>
* v3d: Use the correct opcodes for signed image min/maxJason Ekstrand2019-08-211-0/+2
| | | | Reviewed-by: Eric Anholt <[email protected]>
* nir: Add explicit signs to image min/max intrinsicsJason Ekstrand2019-08-211-2/+4
| | | | | | | | | | | This better matches all the other atomic intrinsics such as those for SSBOs and shared variables where the sign is part of the intrinsic opcode. Both generators (GLSL and SPIR-V) know the sign from the type of the image variable or handle. In SPIR-V, signed min/max are separate opcodes from unsigned. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* v3d: use inc/dec tmu operation with image atomic sub/add of 1Alejandro Piñeiro2019-07-121-5/+11
| | | | | | | | | | | | | | | | | | This allows to remove a mov of 1/-1, as it is implicit with the operation. As with atomic inc/dec/add, usual shader-db set doesn't include any GLES shader using it. So using as workaround vk-gl-cts shaders, we get this: total instructions in shared programs: 1217013 -> 1217006 (<.01%) instructions in affected programs: 53 -> 46 (-13.21%) helped: 2 HURT: 0 One of the helped shader went from 40 to 34 instructions. Reviewed-by: Eric Anholt <[email protected]>
* v3d: refactor some code from v3d40_vir_emit_image_load_storeAlejandro Piñeiro2019-07-121-33/+29
| | | | | | | And moved to new auxiliar method v3d40_image_load_store_tmu_op, equivalent to the nir_to_nir v3d_general_tmu_op, to clean-up a little. Reviewed-by: Eric Anholt <[email protected]>
* v3d: Assert that we do request the normal texturing return data.Eric Anholt2019-04-261-0/+2
| | | | | An unused tex should be DCEed, but if it wasn't we'd run into trouble with not doing a TMUWT.
* v3d: Only look up the 3rd texture gather offset for non-arrays.Eric Anholt2019-04-161-1/+1
| | | | | | | Fixes assertion failures in the CTS since Karol's cleanup when NIR started noticing that we were reading an invalid component. Fixes: 5450f1c9fb09 ("v3d: prefer using nir_src_comp_as_int over nir_src_as_const_value")
* v3d: prefer using nir_src_comp_as_int over nir_src_as_const_valueKarol Herbst2019-04-071-6/+5
| | | | | Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* v3d: Switch implicit uniforms over to being any qinst->uniform != ~0.Eric Anholt2019-03-051-2/+1
| | | | | I'm not sure why I didn't do this before -- it's clearly much simpler to add dumping of the extra thing than to have it as another implicit source.
* v3d: Fix the autotools build.Eric Anholt2019-01-291-1/+1
| | | | Noticed while looking at the gitlab-CI MR.
* v3d: Add support for shader_image_load_store.Eric Anholt2019-01-141-3/+179
| | | | | | This is only exposed on V3D 4.1+, because we didn't have the TMU write operations for images on 3.3 (To do GLES 3.1 there, you have to lower it to SSBO load/stores, which is a problem to solve later).
* v3d: Use the core tex lowering.Eric Anholt2019-01-041-65/+3
| | | | | | | | Even without any clever optimization on the unpack operations, this gives us a useful value for the channels read field, which we can use to avoid ldtmu instructions to the no-op register. instructions in affected programs: 890712 -> 881974 (-0.98%)
* v3d: Add support for non-constant texture offsets.Eric Anholt2018-12-301-8/+24
| | | | | | Fixes dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.rgba8.size_pot.clamp_to_edge_repeat and others.
* v3d: Force sampling from base level for tg4.Eric Anholt2018-12-301-3/+3
| | | | | | This is what the GLSL ES 310 spec tells us to do, but apparently the "gather mode" flag doesn't imply it in the HW. Fixes dEQP-GLES31.functional.texture.gather.basic.2d.rgba8.filter_mode.min_nearest_mipmap_linear_mag_linear
* v3d: Drop in a bunch of notes about performance improvement opportunities.Eric Anholt2018-12-141-0/+14
| | | | | | These have all been floating in my head, and while I've thought about encoding them in issues on gitlab once they're enabled, they also make sense to just have in the area of the code you'll need to work in.
* v3d: Skip emitting texture config parameter 2 if it's just the defaults.Eric Anholt2018-07-231-1/+5
| | | | | | shader-db: total instructions in shared programs: 91275 -> 90768 (-0.56%) instructions in affected programs: 20702 -> 20195 (-2.45%)
* v3d: Add an assert that we don't provide an invalid texture return words.Eric Anholt2018-07-161-0/+8
| | | | The docs had an update noting this restriction, so reflect it in the code.
* v3d: Limit shader threading according to our maximum TMU fifo usage.Eric Anholt2018-06-151-10/+24
| | | | | | Fixes simulator assertion failures in dEQP-GLES3.functional.shaders.texture_functions.texture.samplercubeshadow_bias_fragment and similar complicated cases.
* broadcom/vc5: Add compiler support for V3D 4.x texturing.Eric Anholt2018-01-121-0/+237