path: root/src/freedreno
Commit message [Author, Age, Files, Lines]
* ir3/lower_io_offsets: Try propagate SSBO's SHR into a previous shift instruction [Eduardo Lima Mitev, 2019-03-13, 1 file, -4/+94]
| While we lack value range tracking, this patch tries to 'manually' propagate the division by 4 that calculates the SSBO element-offset into a possible previous shift operation (shift left or right), checking that it is safe to do so.
|
| This should help in cases such as accessing a field in an array of structs, where the offset is likely defined as a base plus a multiplication by a struct or array element size.
|
| See dEQP test 'dEQP-GLES31.functional.ssbo.atomic.xor.highp_uint' for an example of a shader that benefits from this.
|
| Reviewed-by: Rob Clark <[email protected]>
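The kind of safety check described in this message can be sketched as follows. This is a minimal illustration, not the code from the patch: `fold_div4_into_shift` and `apply_shift` are hypothetical helpers, shift amounts are encoded as signed values (positive for left, negative for right), and the left-shift case assumes that no significant bits are lost at the top of the 32-bit offset.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: apply a signed shift, where positive amounts
 * shift left and negative amounts shift right. */
static uint32_t
apply_shift(uint32_t x, int32_t s)
{
   return s >= 0 ? x << s : x >> -s;
}

/* Hypothetical helper: try to fold the SSBO divide-by-4 (a right shift
 * by 2) into an earlier constant shift `prev`. Returns false when the
 * fold is rejected; otherwise stores the combined amount in `merged`. */
static bool
fold_div4_into_shift(int32_t prev, int32_t *merged)
{
   if (prev < -31 || prev > 31)
      return false;

   int32_t sum = prev - 2;

   /* A left shift by less than 2 would turn into a right shift; be
    * conservative and leave such cases alone. */
   if (prev > 0 && sum < 0)
      return false;

   /* Keep the merged amount a valid 32-bit shift. */
   if (sum < -31)
      return false;

   *merged = sum;
   return true;
}
```

With these assumptions, a previous `x << 4` followed by the divide-by-4 collapses into `x << 2`, and a previous `x >> 3` collapses into `x >> 5`, while `x << 1` is left untouched.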
* ir3/compiler: Enable lower_io_offsets pass and handle new SSBO intrinsics [Eduardo Lima Mitev, 2019-03-13, 4 files, -60/+59]
| These intrinsics have the offset in dwords already computed in the last source, so the change here is basically to use that instead of emitting the ir3_SHR to divide the byte-offset by 4.
|
| The improvement in shader stats is significant, up to ~15% in instruction count in some cases. Tested only on a5xx.
|
| shader-db is unfortunately not very useful here because shaders that use SSBO require GLSL versions that are not supported by freedreno yet.
|
| For example, most Khronos CTS tests under 'dEQP-GLES31.functional.ssbo.*' are helped. A random case:
|
| dEQP-GLES31.functional.ssbo.layout.2_level_array.packed.row_major_mat3x2
|
| with current master:
|
| ; CL prog 14/1: 1252 instructions, 0 half, 48 full
| ; 8 const, 8 constlen
| ; 61 (ss), 43 (sy)
|
| with the SSBO dword-offset moved to NIR:
|
| ; CL prog 14/1: 1053 instructions, 0 half, 45 full
| ; 7 const, 7 constlen
| ; 34 (ss), 73 (sy)
|
| The SHR previously emitted for every single SSBO instruction disappears in most cases, and the dword-offset ends up embedded in the STGB instruction as an immediate in many cases as well.
|
| There are also a few of those tests that currently fail on register allocation and start to pass as a result of the reduced pressure. At least these, probably more:
|
| dEQP-GLES31.functional.ssbo.layout.random.unsized_arrays.24
| dEQP-GLES31.functional.ssbo.layout.random.arrays_of_arrays.6
| dEQP-GLES31.functional.ssbo.layout.random.arrays_of_arrays.17
| dEQP-GLES31.functional.ssbo.layout.random.nested_structs_arrays.14
| dEQP-GLES31.functional.ssbo.layout.random.nested_structs_arrays_instance_arrays.5
| dEQP-GLES31.functional.ssbo.layout.random.nested_structs_arrays_instance_arrays.7
|
| No regressions observed with relevant CTS and piglit tests.
|
| Reviewed-by: Rob Clark <[email protected]>
* ir3/nir: Add a new pass 'ir3_nir_lower_io_offsets' [Eduardo Lima Mitev, 2019-03-13, 4 files, -0/+217]
| This NIR->NIR pass implements offset computations that are currently done in the IR3 backend compiler, to give NIR a better chance of optimizing them.
|
| For now, it supports lowering the dword-offset computation for SSBO instructions. It takes an SSBO intrinsic and replaces it with the new ir3-specific version that adds an extra source. That source holds the SSA value resulting from a division by 4 (an SHR op) of the original byte-offset source already provided by NIR in one of the intrinsic sources.
|
| Note that on a6xx the original byte-offset is not needed, so we could potentially replace that source instead of adding a new one. But to keep things simple and consistent, we always add the new source and a6xx just ignores the original one.
|
| Reviewed-by: Rob Clark <[email protected]>
* turnip: preliminary support for Wayland WSI [Chia-I Wu, 2019-03-11, 6 files, -1/+358]
|
* turnip: preliminary support for tu_GetImageSubresourceLayout [Chia-I Wu, 2019-03-11, 1 file, -5/+11]
|
* turnip: Use Vulkan 1.1 names instead of KHR [Chad Versace, 2019-03-11, 8 files, -100/+100]
| That is, drop KHR from all tokens that were promoted to Vulkan 1.1. The consistency makes ctags more useful (it now jumps directly to the real definitions in vulkan_core.h instead of the typedefs), and it makes the code slightly less verbose.
* turnip: preliminary support for tu_CmdDraw [Chia-I Wu, 2019-03-11, 1 file, -0/+83]
|
* turnip: preliminary support for draw state binding [Chia-I Wu, 2019-03-11, 2 files, -4/+357]
| This adds support for tu_CmdBindPipeline, tu_CmdBindVertexBuffers, etc.
* turnip: add draw_cs to tu_cmd_buffer [Chia-I Wu, 2019-03-11, 2 files, -3/+21]
| It will hold draw commands.
* turnip: parse VkPipelineVertexInputStateCreateInfo [Chia-I Wu, 2019-03-11, 2 files, -0/+134]
|
* turnip: parse VkPipelineShaderStageCreateInfo [Chia-I Wu, 2019-03-11, 2 files, -0/+602]
|
* turnip: compile VkPipelineShaderStageCreateInfo [Chia-I Wu, 2019-03-11, 2 files, -0/+143]
| Compile all shaders and upload the binaries to a BO.
* turnip: preliminary support for shader modules [Chia-I Wu, 2019-03-11, 5 files, -5/+411]
| Save SPIR-V in tu_shader_module. Translation to NIR happens in tu_shader_create, and compilation to binary code happens in tu_shader_compile. Both will be called during pipeline creation.
* turnip: parse VkPipeline{Multisample,ColorBlend}StateCreateInfo [Chia-I Wu, 2019-03-11, 3 files, -0/+355]
|
* turnip: parse VkPipelineDepthStencilStateCreateInfo [Chia-I Wu, 2019-03-11, 2 files, -2/+204]
|
* turnip: parse VkPipelineRasterizationStateCreateInfo [Chia-I Wu, 2019-03-11, 2 files, -0/+128]
|
* turnip: parse VkPipelineViewportStateCreateInfo [Chia-I Wu, 2019-03-11, 2 files, -1/+130]
|
* turnip: parse VkPipelineInputAssemblyStateCreateInfo [Chia-I Wu, 2019-03-11, 2 files, -0/+49]
|
* turnip: parse VkPipelineDynamicStateCreateInfo [Chia-I Wu, 2019-03-11, 1 file, -0/+46]
|
* turnip: create a less dummy pipeline [Chia-I Wu, 2019-03-11, 2 files, -30/+109]
| Still dummy, but at least it is created from tu_pipeline_builder.
* turnip: simplify tu_cs sub-streams usage [Chia-I Wu, 2019-03-11, 2 files, -7/+12]
| Let tu_cs_begin_sub_stream imply tu_cs_reserve_space, and tu_cs_end_sub_stream imply tu_cs_sanity_check. Callers are no longer required to call them (but can still do so if they choose to).
* turnip: fix tu_cs sub-streams [Chia-I Wu, 2019-03-11, 1 file, -1/+5]
| Update cs->start in tu_cs_end_sub_stream. Otherwise, the entry would include commands from all prior sub-streams.
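The effect of this fix can be illustrated with a toy model. This is a sketch under stated assumptions, not turnip's code: `toy_cs`, `toy_entry`, `toy_emit`, and `toy_end_sub_stream` are hypothetical names, offsets are counted in dwords, and bounds checking is omitted for brevity.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy command stream; the fields only mirror the idea of cs->start and
 * a write cursor, not the real tu_cs layout. */
struct toy_cs {
   uint32_t buf[64];
   size_t start; /* first dword of the sub-stream being recorded */
   size_t cur;   /* next dword to write */
};

/* Toy entry describing one recorded sub-stream. */
struct toy_entry {
   size_t offset; /* in dwords */
   size_t size;   /* in dwords */
};

static void
toy_emit(struct toy_cs *cs, uint32_t value)
{
   cs->buf[cs->cur++] = value;
}

static struct toy_entry
toy_end_sub_stream(struct toy_cs *cs)
{
   struct toy_entry entry = { cs->start, cs->cur - cs->start };

   /* Without this update, the next entry would start at the old
    * position and also cover every earlier sub-stream. */
   cs->start = cs->cur;
   return entry;
}
```

Ending two sub-streams in a row then yields two disjoint entries rather than a second entry that overlaps the first.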
* turnip: tu_cs_emit_array [Chia-I Wu, 2019-03-11, 1 file, -0/+11]
| Array version of tu_cs_emit. Useful for updating multiple consecutive array-like registers, or loading a shader binary with SS6_DIRECT.
* turnip: add tu_cs_discard_entries [Chia-I Wu, 2019-03-11, 1 file, -0/+11]
| We will start a draw IB at the beginning of a subpass and consume it at the end of the subpass. With tu_cs_discard_entries, we can reuse the same tu_cs for all subpasses.
* turnip: more/better asserts for tu_cs [Chia-I Wu, 2019-03-11, 1 file, -2/+4]
| Asserting (cur < end) in tu_cs_emit catches far fewer programming errors than asserting (cur < reserved_end). We should never write more commands than we have reserved.
|
| Assert that the IB is non-empty and sane in tu_cs_emit_ib.
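To see why the tighter assertion catches more mistakes, consider a toy stream that tracks both limits. This is a minimal sketch with hypothetical names (`res_cs`, `res_cs_reserve`, `res_cs_emit`, `res_cs_has_reserved_space`), not the turnip implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stream with separate limits for the reserved region and for the
 * whole buffer. */
struct res_cs {
   uint32_t *cur;          /* next dword to write */
   uint32_t *reserved_end; /* end of the space the caller reserved */
   uint32_t *end;          /* end of the whole buffer */
};

static void
res_cs_reserve(struct res_cs *cs, size_t dwords)
{
   assert((size_t)(cs->end - cs->cur) >= dwords);
   cs->reserved_end = cs->cur + dwords;
}

static bool
res_cs_has_reserved_space(const struct res_cs *cs)
{
   return cs->cur < cs->reserved_end;
}

static void
res_cs_emit(struct res_cs *cs, uint32_t value)
{
   /* The tighter check: writing past the reservation is a bug even
    * when the buffer itself still has room. */
   assert(res_cs_has_reserved_space(cs));
   *cs->cur++ = value;
}
```

After reserving two dwords in an eight-dword buffer and emitting both, `cur < end` still holds, so a check against `end` would let a third write through, whereas the check against `reserved_end` flags it immediately.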
* turnip: use 32-bit offset in tu_cs_entry [Chia-I Wu, 2019-03-11, 2 files, -2/+6]
| We neither support nor expect BOs to be that big in tu_cs.
* turnip: mark IBs for dumping [Chia-I Wu, 2019-03-11, 2 files, -3/+4]
| Includes IBs in kernel cmdbuf dumps.
* turnip: use the platform defines in vk.xml instead of hard-coding them [Eric Engestrom, 2019-03-11, 1 file, -4/+7]
| Signed-off-by: Eric Engestrom <[email protected]>
* turnip: Add todo for copies. [Bas Nieuwenhuizen, 2019-03-11, 1 file, -0/+7]
|
* turnip: Add buffer->image DMA copies. [Bas Nieuwenhuizen, 2019-03-11, 1 file, -12/+195]
| Passes dEQP-VK.api.copy_and_blit.core.buffer_to_image.*
* turnip: Add image->buffer DMA copies. [Bas Nieuwenhuizen, 2019-03-11, 1 file, -12/+218]
| Passes dEQP-VK.api.copy_and_blit.core.image_to_buffer.*
* turnip: Implement buffer->buffer DMA copies. [Bas Nieuwenhuizen, 2019-03-11, 2 files, -9/+190]
| Passes dEQP-VK.api.copy_and_blit.core.buffer_to_buffer.*
* turnip: Add tu6_rb_fmt_to_ifmt. [Bas Nieuwenhuizen, 2019-03-11, 2 files, -0/+75]
|
* turnip: Make tu6_emit_event_write shared. [Bas Nieuwenhuizen, 2019-03-11, 2 files, -1/+7]
|
* turnip: Add buffer memory binding. [Bas Nieuwenhuizen, 2019-03-11, 2 files, -0/+14]
|
* turnip: respect color attachment formats [Chia-I Wu, 2019-03-11, 3 files, -22/+33]
| Make tu6_get_native_format available to tu_cmd_buffer and start using it.
* turnip: preliminary support for fences [Chia-I Wu, 2019-03-11, 4 files, -79/+412]
| This should be quite complete feature-wise. External fences are still missing. We probably also want to add a simpler path to tu_WaitForFences for when fenceCount == 1.
* turnip: fix VkClearValue packing [Chia-I Wu, 2019-03-11, 3 files, -46/+224]
| Add tu_pack_clear_value to correctly pack VkClearValue according to VkFormat. It ignores the component order defined by VkFormat, and always packs to WZYX order.
* turnip: add support for VK_KHR_external_memory_{fd,dma_buf} [Chia-I Wu, 2019-03-11, 2 files, -1/+64]
|
* turnip: advertise VK_KHR_external_memory [Chia-I Wu, 2019-03-11, 1 file, -0/+1]
| AFAICT, it is supported. We don't need to handle any of the new structs because our BOs can always be exported.
* turnip: advertise VK_KHR_external_memory_capabilities [Chia-I Wu, 2019-03-11, 1 file, -0/+1]
| AFAICT, it is supported.
* turnip: add functions to import/export prime fd [Chia-I Wu, 2019-03-11, 3 files, -14/+91]
| Add tu_bo_init_dmabuf, tu_bo_export_dmabuf, tu_gem_import_dmabuf, and tu_gem_export_dmabuf.
* turnip: Fix error behavior for VkPhysicalDeviceExternalImageFormatInfo [Chad Versace, 2019-03-11, 1 file, -26/+24]
| If the handle type is unsupported, then the spec requires us to return VK_ERROR_FORMAT_NOT_SUPPORTED.
|
| Reviewed-by: Chia-I Wu <[email protected]>
| Closes: https://gitlab.freedesktop.org/bnieuwenhuizen/mesa/merge_requests/17
* turnip: add a more complete format table [Chia-I Wu, 2019-03-11, 1 file, -47/+301]
| A format table is an array of tu_native_format. Table lookup is done through array indexing.
|
| This commit defines a single format table for core VkFormat. It is derived from the table in the gallium driver. Errors might have been introduced during the conversion.
|
| When an extension that defines new VkFormat values is supported, we need to add a new table for the extension.
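A lookup by array indexing could look roughly like the sketch below. Everything here is illustrative: the `toy_format` enum stands in for VkFormat, `toy_native_format` is not the real tu_native_format layout, and the hardware format codes are made up.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for a small, densely numbered range of core formats. */
enum toy_format {
   TOY_FORMAT_UNDEFINED,
   TOY_FORMAT_R8_UNORM,
   TOY_FORMAT_B8G8R8A8_UNORM,
   TOY_FORMAT_COUNT,
};

/* Hypothetical table entry; the field names are assumptions. */
struct toy_native_format {
   int hw_fmt;     /* made-up hardware format code */
   bool supported; /* false for holes in the table */
};

/* One table for the core range, indexed directly by the format value.
 * Entries that are not listed are zero-initialized, i.e. unsupported. */
static const struct toy_native_format toy_formats[TOY_FORMAT_COUNT] = {
   [TOY_FORMAT_R8_UNORM] = { 1, true },
   [TOY_FORMAT_B8G8R8A8_UNORM] = { 2, true },
};

static const struct toy_native_format *
toy_get_native_format(unsigned format)
{
   /* Values outside the table's range, or holes inside it, have no
    * native format. */
   if (format >= TOY_FORMAT_COUNT || !toy_formats[format].supported)
      return NULL;
   return &toy_formats[format];
}
```

The lookup is a bounds check plus one array access, which is what makes a dense table attractive for the contiguous core range.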
* turnip: preliminary support for loadOp and storeOp [Chia-I Wu, 2019-03-11, 2 files, -19/+645]
| - create tile_load_ib and tile_store_ib at the beginning of each subpass
| - execute the IBs at the end of each subpass
| - no DONT_CARE support
| - no subpass dependency analysis and subpass merging
| - no zs support
| - no true VkImageView support
| - assume VK_FORMAT_B8G8R8A8_UNORM
| - no tiling
| - no MSAA
|
| This also removes cur_cs from tu_cmd_buffer.
* turnip: add TU_CS_MODE_SUB_STREAM [Chia-I Wu, 2019-03-11, 4 files, -6/+81]
| When in TU_CS_MODE_SUB_STREAM, tu_cs_begin_sub_stream (or tu_cs_end_sub_stream) should be called instead of tu_cs_begin (or tu_cs_end). It gives the caller a TU_CS_MODE_EXTERNAL cs to emit commands to.
* turnip: add tu_cs_mode [Chia-I Wu, 2019-03-11, 3 files, -30/+98]
| Add tu_cs_mode and TU_CS_MODE_EXTERNAL. When in TU_CS_MODE_EXTERNAL, tu_cs wraps an external buffer and cannot grow.
|
| This also moves tu_cs* up in tu_private.h, such that other structs can embed tu_cs_entry.
* turnip: provide both emit_ib and emit_call [Chia-I Wu, 2019-03-11, 1 file, -7/+30]
| tu_cs_emit_ib emits a CP_INDIRECT_BUFFER for a BO. tu_cs_emit_call emits a CP_INDIRECT_BUFFER for each entry of a target cs.
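The relationship between the two functions can be sketched with a toy model in which emitting an indirect buffer just records a reference. The names `ib_ref`, `ib_target`, `ib_stream`, `ib_emit_ib`, and `ib_emit_call` are hypothetical, and the fixed-size arrays without bounds checks are a simplification.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy reference to a range of commands; stands in for what one
 * CP_INDIRECT_BUFFER packet would point at. */
struct ib_ref {
   uint32_t offset;
   uint32_t size;
};

/* Toy target stream: a list of recorded entries. */
struct ib_target {
   struct ib_ref entries[4];
   size_t entry_count;
};

/* Toy calling stream: a log of the indirect buffers it emitted. */
struct ib_stream {
   struct ib_ref ibs[8];
   size_t ib_count;
};

/* Emit a single indirect-buffer reference. */
static void
ib_emit_ib(struct ib_stream *s, const struct ib_ref *ref)
{
   s->ibs[s->ib_count++] = *ref;
}

/* "Call" a target stream: one indirect buffer per entry, in order. */
static void
ib_emit_call(struct ib_stream *s, const struct ib_target *target)
{
   for (size_t i = 0; i < target->entry_count; i++)
      ib_emit_ib(s, &target->entries[i]);
}
```

In this model a call is simply a loop over the target's entries, so a target with two entries produces two indirect-buffer references in the calling stream.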
* turnip: add tu_cs_sanity_check [Chia-I Wu, 2019-03-11, 4 files, -7/+7]
| It replaces tu_cs_reserve_space_assert and can be called at any time to sanity check tu_cs.
* turnip: never fail tu_cs_begin/tu_cs_end [Chia-I Wu, 2019-03-11, 3 files, -62/+49]
| Error checking tu_cs_begin/tu_cs_end is too tedious for the callers. Move tu_cs_add_bo and tu_cs_reserve_entry to tu_cs_reserve_space such that tu_cs_begin/tu_cs_end never fail.