mesa.git

	Commit message	Author	Age	Files	Lines
*	ir3/lower_io_offsets: Try propagate SSBO's SHR into a previous shift instruction	Eduardo Lima Mitev	2019-03-13	1	-4/+94
\|	While we lack value range tracking, this patch tries to 'manually' propagate the division by 4 that calculates the SSBO element-offset into a possible previous shift operation (shift left or right), checking that it is safe to do so. This should help in cases such as accessing a field in an array of structs, where the offset is likely defined as a base plus a multiplication by a struct or array element size. See dEQP test 'dEQP-GLES31.functional.ssbo.atomic.xor.highp_uint' for an example of a shader that benefits from this.
\|
\|	Reviewed-by: Rob Clark <[email protected]>
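The direction and range checks described above can be sketched as follows. This is a minimal illustration, not the code from the patch: the helper name `try_merge_shift` and the signed-shift convention (positive for left, negative for right) are assumptions made for this example, and it assumes the offset is small enough that no significant high bits are shifted out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not the Mesa implementation.
 * Shifts are signed: positive = shift left, negative = shift right.
 * Returns true and writes the merged amount when folding is allowed. */
static bool
try_merge_shift(int32_t current_shift, int32_t extra_shift, int32_t *merged)
{
    int32_t new_shift = current_shift + extra_shift;

    /* Bail out if merging would reverse the shift direction:
     * '(x << 1) >> 2' is not equivalent to 'x >> 1' in general. */
    if ((current_shift > 0 && new_shift < 0) ||
        (current_shift < 0 && new_shift > 0))
        return false;

    /* Bail out if the merged amount is not a valid 32-bit shift. */
    if (new_shift < -31 || new_shift > 31)
        return false;

    *merged = new_shift;
    return true;
}
```

With this convention, folding the SSBO division by 4 means calling the helper with `extra_shift = -2`; a previous `x << 4` becomes `x << 2`, while a previous `x << 1` is left alone.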
*	ir3/compiler: Enable lower_io_offsets pass and handle new SSBO intrinsics	Eduardo Lima Mitev	2019-03-13	4	-60/+59
\|	These intrinsics already carry the offset in dwords in their last source, so the change here basically uses that instead of emitting the ir3_SHR that divides the byte-offset by 4.
\|
\|	The improvement in shader stats is significant: up to ~15% in instruction count in some cases. Tested only on a5xx.
\|
\|	shader-db is unfortunately not very useful here because shaders that use SSBO require GLSL versions that are not supported by freedreno yet. For example, most Khronos CTS tests under 'dEQP-GLES31.functional.ssbo.*' are helped. A random case:
\|
\|	dEQP-GLES31.functional.ssbo.layout.2_level_array.packed.row_major_mat3x2
\|
\|	with current master:
\|	; CL prog 14/1: 1252 instructions, 0 half, 48 full
\|	; 8 const, 8 constlen
\|	; 61 (ss), 43 (sy)
\|
\|	with the SSBO dword-offset moved to NIR:
\|	; CL prog 14/1: 1053 instructions, 0 half, 45 full
\|	; 7 const, 7 constlen
\|	; 34 (ss), 73 (sy)
\|
\|	The SHR previously emitted for every single SSBO instruction disappears in most cases, and in many cases the dword-offset also ends up embedded in the STGB instruction as an immediate.
\|
\|	A few of those tests currently fail on register allocation and start to pass as a result of the reduced pressure. At least these, probably more:
\|
\|	dEQP-GLES31.functional.ssbo.layout.random.unsized_arrays.24
\|	dEQP-GLES31.functional.ssbo.layout.random.arrays_of_arrays.6
\|	dEQP-GLES31.functional.ssbo.layout.random.arrays_of_arrays.17
\|	dEQP-GLES31.functional.ssbo.layout.random.nested_structs_arrays.14
\|	dEQP-GLES31.functional.ssbo.layout.random.nested_structs_arrays_instance_arrays.5
\|	dEQP-GLES31.functional.ssbo.layout.random.nested_structs_arrays_instance_arrays.7
\|
\|	No regressions observed with relevant CTS and piglit tests.
\|
\|	Reviewed-by: Rob Clark <[email protected]>
*	ir3/nir: Add a new pass 'ir3_nir_lower_io_offsets'	Eduardo Lima Mitev	2019-03-13	4	-0/+217
\|	This NIR->NIR pass implements offset computations that are currently done in the IR3 backend compiler, to give NIR a better chance of optimizing them.
\|
\|	For now, it supports lowering the dword-offset computation for SSBO instructions. It takes an SSBO intrinsic and replaces it with the new ir3-specific version that adds an extra source. That source holds the SSA value resulting from inserting a division by 4 (an SHR op) of the original byte-offset source already provided by NIR in one of the intrinsic sources.
\|
\|	Note that on a6xx the original byte-offset is not needed, so we could potentially replace that source instead of adding a new one. But to keep things simple and consistent we always add the new source, and a6xx just ignores the original one.
\|
\|	Reviewed-by: Rob Clark <[email protected]>
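The lowering described above can be pictured with a toy model. This sketch does not use the NIR API: `struct toy_intrinsic`, `MAX_SRCS`, and `lower_ssbo_offset` are hypothetical names invented for illustration, and sources are modeled as plain integers rather than SSA values.

```c
#include <stdint.h>

#define MAX_SRCS 4 /* hypothetical limit for this sketch */

/* Toy stand-in for an intrinsic: just a list of source values. */
struct toy_intrinsic {
    uint32_t srcs[MAX_SRCS];
    unsigned num_srcs;
};

/* Return a copy of 'in' with one extra trailing source holding the
 * dword offset, i.e. the byte offset divided by 4 (a shift right by 2).
 * The original byte-offset source is kept untouched. */
static struct toy_intrinsic
lower_ssbo_offset(struct toy_intrinsic in, unsigned byte_offset_idx)
{
    struct toy_intrinsic out = in;

    /* Leave the intrinsic unchanged if there is no room for a new source. */
    if (in.num_srcs >= MAX_SRCS || byte_offset_idx >= in.num_srcs)
        return out;

    out.srcs[out.num_srcs++] = in.srcs[byte_offset_idx] >> 2;
    return out;
}
```

Keeping the byte-offset source and appending the dword offset mirrors the choice stated in the message: one layout for all generations, with a6xx free to ignore the source it does not need.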
*	turnip: preliminary support for Wayland WSI	Chia-I Wu	2019-03-11	6	-1/+358
\|
*	turnip: preliminary support for tu_GetImageSubresourceLayout	Chia-I Wu	2019-03-11	1	-5/+11
\|
*	turnip: Use Vulkan 1.1 names instead of KHR	Chad Versace	2019-03-11	8	-100/+100
\|	That is, drop KHR from all tokens that were promoted to Vulkan 1.1. The consistency makes ctags more useful (it now jumps directly to the real definitions in vulkan_core.h instead of the typedefs), and it makes the code slightly less verbose.
*	turnip: preliminary support for tu_CmdDraw	Chia-I Wu	2019-03-11	1	-0/+83
\|
*	turnip: preliminary support for draw state binding	Chia-I Wu	2019-03-11	2	-4/+357
\|	This adds support for tu_CmdBindPipeline, tu_CmdBindVertexBuffers, etc.
*	turnip: add draw_cs to tu_cmd_buffer	Chia-I Wu	2019-03-11	2	-3/+21
\|	It will hold draw commands.
*	turnip: parse VkPipelineVertexInputStateCreateInfo	Chia-I Wu	2019-03-11	2	-0/+134
\|
*	turnip: parse VkPipelineShaderStageCreateInfo	Chia-I Wu	2019-03-11	2	-0/+602
\|
*	turnip: compile VkPipelineShaderStageCreateInfo	Chia-I Wu	2019-03-11	2	-0/+143
\|	Compile all shaders and upload the binaries to a BO.
*	turnip: preliminary support for shader modules	Chia-I Wu	2019-03-11	5	-5/+411
\|	Save SPIR-V in tu_shader_module. Translation to NIR happens in tu_shader_create, and compilation to binary code happens in tu_shader_compile. Both will be called during pipeline creation.
*	turnip: parse VkPipeline{Multisample,ColorBlend}StateCreateInfo	Chia-I Wu	2019-03-11	3	-0/+355
\|
*	turnip: parse VkPipelineDepthStencilStateCreateInfo	Chia-I Wu	2019-03-11	2	-2/+204
\|
*	turnip: parse VkPipelineRasterizationStateCreateInfo	Chia-I Wu	2019-03-11	2	-0/+128
\|
*	turnip: parse VkPipelineViewportStateCreateInfo	Chia-I Wu	2019-03-11	2	-1/+130
\|
*	turnip: parse VkPipelineInputAssemblyStateCreateInfo	Chia-I Wu	2019-03-11	2	-0/+49
\|
*	turnip: parse VkPipelineDynamicStateCreateInfo	Chia-I Wu	2019-03-11	1	-0/+46
\|
*	turnip: create a less dummy pipeline	Chia-I Wu	2019-03-11	2	-30/+109
\|	Still dummy, but at least it is created from tu_pipeline_builder.
*	turnip: simplify tu_cs sub-streams usage	Chia-I Wu	2019-03-11	2	-7/+12
\|	Let tu_cs_begin_sub_stream imply tu_cs_reserve_space, and tu_cs_end_sub_stream imply tu_cs_sanity_check. Callers are no longer required to call them (but can still do so if they choose).
*	turnip: fix tu_cs sub-streams	Chia-I Wu	2019-03-11	1	-1/+5
\|	Update cs->start in tu_cs_end_sub_stream. Otherwise, the entry would include commands from all prior sub-streams.
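The effect of that fix can be shown with a toy model. This is only a sketch: `struct sub_cs`, `struct sub_entry`, and the helper names are hypothetical and far simpler than the real tu_cs; offsets and sizes are counted in dwords.

```c
#include <stdint.h>

/* Toy command stream: 'start' marks where the current sub-stream began,
 * 'cur' is the next dword to be written. */
struct sub_cs {
    uint32_t start;
    uint32_t cur;
};

/* Toy entry describing one finished sub-stream. */
struct sub_entry {
    uint32_t offset;
    uint32_t size;
};

/* Pretend to emit 'count' dwords. */
static void
sub_cs_emit(struct sub_cs *cs, uint32_t count)
{
    cs->cur += count;
}

/* Close the current sub-stream and return the range it covers. */
static struct sub_entry
sub_cs_end_sub_stream(struct sub_cs *cs)
{
    struct sub_entry entry = { cs->start, cs->cur - cs->start };

    /* Without this update, the next entry would start at the old
     * 'start' and include every prior sub-stream. */
    cs->start = cs->cur;
    return entry;
}
```

Emitting 3 dwords, ending, then emitting 5 dwords and ending again yields entries covering [0, 3) and [3, 8); dropping the `cs->start` update would make the second entry cover [0, 8).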
*	turnip: tu_cs_emit_array	Chia-I Wu	2019-03-11	1	-0/+11
\|	Array version of tu_cs_emit. Useful for updating multiple consecutive array-like registers, or loading a shader binary with SS6_DIRECT.
*	turnip: add tu_cs_discard_entries	Chia-I Wu	2019-03-11	1	-0/+11
\|	We will start a draw IB at the beginning of a subpass and consume it at the end of the subpass. With tu_cs_discard_entries, we can reuse the same tu_cs for all subpasses.
*	turnip: more/better asserts for tu_cs	Chia-I Wu	2019-03-11	1	-2/+4
\|	Asserting (cur < end) in tu_cs_emit catches far fewer programming errors than asserting (cur < reserved_end). We should never write more commands than we have reserved.
\|
\|	Assert IB is non-empty and sane in tu_cs_emit_ib.
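Why the tighter assert catches more can be seen in a small model. This is a sketch under assumptions: `struct rsv_cs` and its helpers are hypothetical, and the real tu_cs tracks pointers into BOs rather than indices into a fixed array.

```c
#include <assert.h>
#include <stdint.h>

struct rsv_cs {
    uint32_t buf[64];
    uint32_t cur;          /* next dword to write */
    uint32_t reserved_end; /* end of the space the caller reserved */
    uint32_t end;          /* end of the whole buffer */
};

/* Reserve room for 'count' dwords starting at 'cur'. */
static void
rsv_cs_reserve(struct rsv_cs *cs, uint32_t count)
{
    assert(cs->cur + count <= cs->end);
    cs->reserved_end = cs->cur + count;
}

/* Emit one dword; asserts against the reservation, not the buffer end. */
static void
rsv_cs_emit(struct rsv_cs *cs, uint32_t value)
{
    assert(cs->cur < cs->reserved_end);
    cs->buf[cs->cur++] = value;
}
```

After reserving 2 dwords and emitting 2, `cur` equals `reserved_end` but is still far below `end`. A third emit would pass a `(cur < end)` check silently, while the `(cur < reserved_end)` check fires immediately.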
*	turnip: use 32-bit offset in tu_cs_entry	Chia-I Wu	2019-03-11	2	-2/+6
\|	We neither support nor expect BOs to be that big in tu_cs.
*	turnip: mark IBs for dumping	Chia-I Wu	2019-03-11	2	-3/+4
\|	Includes IBs in kernel cmdbuf dumps.
*	turnip: use the platform defines in vk.xml instead of hard-coding them	Eric Engestrom	2019-03-11	1	-4/+7
\|	Signed-off-by: Eric Engestrom <[email protected]>
*	turnip: Add todo for copies.	Bas Nieuwenhuizen	2019-03-11	1	-0/+7
\|
*	turnip: Add buffer->image DMA copies.	Bas Nieuwenhuizen	2019-03-11	1	-12/+195
\|	Passes dEQP-VK.api.copy_and_blit.core.buffer_to_image.*
*	turnip: Add image->buffer DMA copies.	Bas Nieuwenhuizen	2019-03-11	1	-12/+218
\|	Passes dEQP-VK.api.copy_and_blit.core.image_to_buffer.*
*	turnip: Implement buffer->buffer DMA copies.	Bas Nieuwenhuizen	2019-03-11	2	-9/+190
\|	Passes dEQP-VK.api.copy_and_blit.core.buffer_to_buffer.*
*	turnip: Add tu6_rb_fmt_to_ifmt.	Bas Nieuwenhuizen	2019-03-11	2	-0/+75
\|
*	turnip: Make tu6_emit_event_write shared.	Bas Nieuwenhuizen	2019-03-11	2	-1/+7
\|
*	turnip: Add buffer memory binding.	Bas Nieuwenhuizen	2019-03-11	2	-0/+14
\|
*	turnip: respect color attachment formats	Chia-I Wu	2019-03-11	3	-22/+33
\|	Make tu6_get_native_format available to tu_cmd_buffer and start using it.
*	turnip: preliminary support for fences	Chia-I Wu	2019-03-11	4	-79/+412
\|	This should be quite complete feature-wise. External fences are still missing. We probably also want to add a simpler path to tu_WaitForFences for when fenceCount == 1.
*	turnip: fix VkClearValue packing	Chia-I Wu	2019-03-11	3	-46/+224
\|	Add tu_pack_clear_value to correctly pack VkClearValue according to VkFormat. It ignores the component order defined by VkFormat, and always packs to WZYX order.
*	turnip: add support for VK_KHR_external_memory_{fd,dma_buf}	Chia-I Wu	2019-03-11	2	-1/+64
\|
*	turnip: advertise VK_KHR_external_memory	Chia-I Wu	2019-03-11	1	-0/+1
\|	AFAICT, it is supported. We don't need to handle any of the new structs because our BOs can always be exported.
*	turnip: advertise VK_KHR_external_memory_capabilities	Chia-I Wu	2019-03-11	1	-0/+1
\|	AFAICT, it is supported.
*	turnip: add functions to import/export prime fd	Chia-I Wu	2019-03-11	3	-14/+91
\|	Add tu_bo_init_dmabuf, tu_bo_export_dmabuf, tu_gem_import_dmabuf, and tu_gem_export_dmabuf.
*	turnip: Fix error behavior for VkPhysicalDeviceExternalImageFormatInfo	Chad Versace	2019-03-11	1	-26/+24
\|	If the handle type is unsupported, then the spec requires us to return VK_ERROR_FORMAT_NOT_SUPPORTED.
\|
\|	Reviewed-by: Chia-I Wu <[email protected]>
\|	Closes: https://gitlab.freedesktop.org/bnieuwenhuizen/mesa/merge_requests/17
*	turnip: add a more complete format table	Chia-I Wu	2019-03-11	1	-47/+301
\|	A format table is an array of tu_native_format. Table lookup is done through array indexing.
\|
\|	This commit defines a single format table for core VkFormat. It is derived from the table in the gallium driver. There might be errors introduced in the process of the conversion.
\|
\|	When an extension that defines new VkFormat is supported, we need to add a new table for the extension.
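The table-by-array-indexing idea can be illustrated with a small, self-contained sketch. Everything here is made up for the example: the `toy_format` enum stands in for VkFormat, `struct toy_native_format` stands in for tu_native_format, and the hardware format numbers are placeholders, not real a6xx values.

```c
#include <stddef.h>

/* Stand-in for a few core formats; TOY_FORMAT_COUNT sizes the table. */
enum toy_format {
    TOY_FORMAT_UNDEFINED,
    TOY_FORMAT_R8_UNORM,
    TOY_FORMAT_B8G8R8A8_UNORM,
    TOY_FORMAT_COUNT
};

struct toy_native_format {
    int hw_fmt;  /* placeholder hardware format code */
    int present; /* nonzero if the format is supported */
};

/* One table for the "core" formats, indexed directly by the enum value.
 * Entries that are not listed are zero-initialized, i.e. unsupported. */
static const struct toy_native_format toy_core_table[TOY_FORMAT_COUNT] = {
    [TOY_FORMAT_R8_UNORM]       = { 1, 1 },
    [TOY_FORMAT_B8G8R8A8_UNORM] = { 2, 1 },
};

/* Look up a format; returns NULL when it is out of range or unsupported. */
static const struct toy_native_format *
toy_get_native_format(enum toy_format format)
{
    if ((unsigned)format >= TOY_FORMAT_COUNT ||
        !toy_core_table[format].present)
        return NULL;
    return &toy_core_table[format];
}
```

Formats added by an extension typically have values far outside the core range, which is one reason a separate table per extension, as the message suggests, keeps each array dense.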
*	turnip: preliminary support for loadOp and storeOp	Chia-I Wu	2019-03-11	2	-19/+645
\|	- create tile_load_ib and tile_store_ib at the beginning of each subpass
\|	- execute the IBs at the end of each subpass
\|	- no DONT_CARE support
\|	- no subpass dependency analysis and subpass merging
\|	- no zs support
\|	- no true VkImageView support
\|	- assume VK_FORMAT_B8G8R8A8_UNORM
\|	- no tiling
\|	- no MSAA
\|
\|	This also removes cur_cs from tu_cmd_buffer.
*	turnip: add TU_CS_MODE_SUB_STREAM	Chia-I Wu	2019-03-11	4	-6/+81
\|	When in TU_CS_MODE_SUB_STREAM, tu_cs_begin_sub_stream (or tu_cs_end_sub_stream) should be called instead of tu_cs_begin (or tu_cs_end). It gives the caller a TU_CS_MODE_EXTERNAL cs to emit commands to.
*	turnip: add tu_cs_mode	Chia-I Wu	2019-03-11	3	-30/+98
\|	Add tu_cs_mode and TU_CS_MODE_EXTERNAL. When in TU_CS_MODE_EXTERNAL, tu_cs wraps an external buffer and cannot grow.
\|
\|	This also moves tu_cs* up in tu_private.h, such that other structs can embed tu_cs_entry.
*	turnip: provide both emit_ib and emit_call	Chia-I Wu	2019-03-11	1	-7/+30
\|	tu_cs_emit_ib emits a CP_INDIRECT_BUFFER for a BO. tu_cs_emit_call emits a CP_INDIRECT_BUFFER for each entry of a target cs.
*	turnip: add tu_cs_sanity_check	Chia-I Wu	2019-03-11	4	-7/+7
\|	It replaces tu_cs_reserve_space_assert and can be called at any time to sanity check tu_cs.
*	turnip: never fail tu_cs_begin/tu_cs_end	Chia-I Wu	2019-03-11	3	-62/+49
\|	Error checking tu_cs_begin/tu_cs_end is too tedious for the callers. Move tu_cs_add_bo and tu_cs_reserve_entry to tu_cs_reserve_space such that tu_cs_begin/tu_cs_end never fail.