mesa.git

	Commit message	Author	Age	Files	Lines
*	freedreno/ir3: reads/writes to unrelated arrays are not dependent	Rob Clark	2019-03-28	1	-1/+30
\| \| \| \|	Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: sched fix	Rob Clark	2019-03-28	1	-1/+1
\| \| \| \| \| \| \|	Not sure why new-style frag inputs start triggering this. But we probably shouldn't consider src's from other blocks. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: Add workaround for VS samgq	Kristian H. Kristensen	2019-03-28	6	-4/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This instruction needs a workaround when used from vertex shaders.
	Fixes:
	dEQP-GLES3.functional.shaders.texture_functions.texturegradoffset.sampler2dshadow_vertex
	dEQP-GLES3.functional.shaders.texture_functions.texturegradoffset.sampler3d_fixed_vertex
	dEQP-GLES3.functional.shaders.texture_functions.texturegradoffset.sampler3d_float_vertex
	dEQP-GLES3.functional.shaders.texture_functions.textureprojgradoffset.sampler2dshadow_vertex
	dEQP-GLES3.functional.shaders.texture_functions.textureprojgradoffset.sampler3d_fixed_vertex
	dEQP-GLES3.functional.shaders.texture_functions.textureprojgradoffset.sampler3d_float_vertex
	dEQP-GLES3.functional.shaders.texture_functions.textureprojgrad.sampler2dshadow_vertex
	Signed-off-by: Kristian H. Kristensen <[email protected]>
	Reviewed-by: Rob Clark <[email protected]>
*	freedreno/ir3: Don't access beyond available regs	Kristian H. Kristensen	2019-03-28	1	-4/+7
\| \| \| \| \| \| \| \|	emit_cat5() needs to check if the last optional reg is there before it accesses it. Signed-off-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Rob Clark <[email protected]>
*	freedreno/ir3: Push UBOs to constant file	Kristian H. Kristensen	2019-03-27	3	-12/+118
\| \| \| \| \| \| \| \|	We have a rather big constant file and it seems that the best way to use it is to upload all UBOs and lower UBO access to load_uniform. Signed-off-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Rob Clark <[email protected]>
*	freedreno/ir3: Enable PIPE_CAP_PACKED_UNIFORMS	Kristian H. Kristensen	2019-03-27	7	-13/+119
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This commit turns on the gallium cap, adds a pass to lower the load_ubo intrinsics for block 0 back to load_uniform intrinsics, and adjusts the backend where the cap switches units from vec4s to dwords.
	As we stop using ir3_glsl_type_size() for uniform layout, this also corrects an issue where we would allocate a vec4 slot for samplers in uniforms, fixing:
	dEQP-GLES3.functional.shaders.struct.uniform.sampler_array_fragment
	dEQP-GLES3.functional.shaders.struct.uniform.sampler_array_vertex
	dEQP-GLES3.functional.shaders.struct.uniform.sampler_nested_fragment
	dEQP-GLES2.functional.shaders.struct.uniform.sampler_nested_vertex
	dEQP-GLES2.functional.shaders.struct.uniform.sampler_nested_fragment
	Signed-off-by: Kristian H. Kristensen <[email protected]>
	Reviewed-by: Rob Clark <[email protected]>
*	freedreno/ir3: Fix operand order for DSX/DSY	Kristian H. Kristensen	2019-03-25	1	-0/+15
\| \| \| \| \| \| \| \| \| \|	Most cat5 instructions are constructed using ir3_SAM, which uses regs[1] for the (sampler, tex) src. Not DSX/DSY though, so we look up src1 and src2 differently for those two. Fixes: 1dffb089 ("freedreno/ir3: fix sam.s2en encoding") Signed-off-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Rob Clark <[email protected]>
*	freedreno/ir3: Track whether shader needs derivatives	Kristian H. Kristensen	2019-03-25	4	-3/+9
\| \| \| \| \| \| \| \| \| \| \| \| \|	In 1088b788 ("freedreno/ir3: find # of samplers from uniform vars") we started counting number of samplers based on the uniform vars instead of number of cat5 instructions. We used the number of samplers to determine whether to enable derivatives, but when we only use derivatives and no samplers, that now breaks. Track whether we need derivatives explicitly and use that to enable the state. Fixes: 1088b788 ("freedreno/ir3: find # of samplers from uniform vars") Signed-off-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Rob Clark <[email protected]>
*	spirv,nir: lower frexp_exp/frexp_sig inside a new NIR pass	Samuel Pitoiset	2019-03-22	1	-0/+1
\| \| \| \| \| \| \| \| \| \|	This lowering isn't needed for RADV because AMDGCN has two instructions. It will be disabled for RADV in an upcoming series. While we are at it, factorize a little bit. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	freedreno/ir3: disable early-z for SSBO/image writes	Rob Clark	2019-03-22	1	-0/+12
\| \| \| \| \| \| \| \| \| \| \|	Fixes:
	dEQP-GLES31.functional.image_load_store.early_fragment_tests.no_early_fragment_tests_depth
	dEQP-GLES31.functional.image_load_store.early_fragment_tests.no_early_fragment_tests_stencil
	dEQP-GLES31.functional.image_load_store.early_fragment_tests.no_early_fragment_tests_depth_fbo
	dEQP-GLES31.functional.image_load_store.early_fragment_tests.no_early_fragment_tests_stencil_fbo
	Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: rename has_kill to no_earlyz	Rob Clark	2019-03-22	3	-4/+4
\| \| \| \| \| \| \|	There are other cases where we need to disable early-z, like image writes. So rename to something more generic. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: dynamic UBO indexing vs 64b pointers	Rob Clark	2019-03-21	1	-2/+2
\| \| \| \| \| \| \|	Fixes dEQP-GLES31.functional.shaders.opaque_type_indexing.ubo.uniform_fragment and similar things with multiple UBOs Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: fix bit_count	Rob Clark	2019-03-21	1	-2/+23
\| \| \| \| \| \| \| \| \|	Seems like it can only work 16b at a time. Fixes dEQP-GLES31.functional.shaders.builtin_functions.integer.bitcount.* TODO need to check if this limitation applies to a3xx as well. Signed-off-by: Rob Clark <[email protected]>
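	The 16-bits-at-a-time limitation described above can be illustrated in plain C. This is only a sketch of the idea (count each 16-bit half separately, then add the results), not the ir3 code from the commit; `popcount16` and `popcount32_split` are hypothetical names standing in for the hardware instruction and the lowered sequence.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a bit-count unit that only handles 16-bit inputs
 * (hypothetical helper, not an ir3 function). */
static unsigned popcount16(uint16_t v)
{
    unsigned n = 0;
    while (v) {
        v &= (uint16_t)(v - 1);   /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* 32-bit bit count built from two 16-bit counts. */
static unsigned popcount32_split(uint32_t v)
{
    unsigned lo = popcount16((uint16_t)(v & 0xffffu));
    unsigned hi = popcount16((uint16_t)(v >> 16));
    assert(lo <= 16 && hi <= 16);
    return lo + hi;
}
```

	For example, `popcount32_split(0x00010001u)` counts one bit in each half and adds the two counts.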
*	freedreno/ir3: additional lowering	Rob Clark	2019-03-21	1	-0/+6
\| \| \| \| \| \| \| \| \|	For some things that show up when we expose higher glsl TODO check blob traces to see if we have instructions for some of this? I guess we don't but worth a check.. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: optimize sam.s2en to sam	Rob Clark	2019-03-21	3	-6/+36
\| \| \| \| \| \| \|	Detect when sampler/texture idx are immediate and switch to non s2en encoding. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: enable indirect tex/samp (sam.s2en)	Rob Clark	2019-03-21	2	-22/+73
\| \| \| \| \| \| \| \| \| \|	For now it uses indirect for everything. The next step is for the ir3_cp pass to detect the case that tex and samp idx are immediate and convert the sam instruction back to the non .s2en variant. But doing that in a following patch so we can shake out the bugs with .s2en more easily. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: find # of samplers from uniform vars	Rob Clark	2019-03-21	3	-13/+13
\| \| \| \| \| \| \|	When we have indirect samplers, we cannot tell the max sampler referenced. Instead just refer to the number of sampler uniforms. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: fix regmask for merged regs	Rob Clark	2019-03-21	2	-3/+13
\| \| \| \| \| \| \| \|	On a6xx+ with half-regs conflicting with full-regs, the legalize pass needs to set appropriate sync bits, such as (sy), on writes to full regs that conflict with half regs, and vice versa. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: fix sam.s2en encoding	Rob Clark	2019-03-21	2	-9/+12
\| \| \| \|	Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: fix sam.s2en decoding	Rob Clark	2019-03-21	1	-3/+5
\| \| \| \|	Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3/ra: fix half-class conflicts	Rob Clark	2019-03-21	1	-7/+14
\| \| \| \| \| \| \| \| \| \| \| \|	On a6xx, half-regs conflict with full-regs. But we were only setting up conflicts for the first class (ie. scalar, but not hvec2/hvec3/hvec4), resulting in higher half-reg classes getting assigned to regs that overwrite full-regs. Noticed while trying to enable indirect-sampler (sam.s2en) which uses an hvec2 argument to pass the sampler/tex index. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3 better cat6 encoding detection	Rob Clark	2019-03-21	2	-8/+24
\| \| \| \| \| \| \|	These two bits seem to be a better way to detect which encoding we are looking at. Signed-off-by: Rob Clark <[email protected]>
*	anv,radv,turnip: Lower TG4 offsets with nir_lower_tex	Jason Ekstrand	2019-03-21	1	-0/+1
\| \| \| \| \| \|	v2: turn on for turnip as well (Karol Herbst) Reviewed-by: Karol Herbst <[email protected]>
*	freedreno/ir3/a6xx: fix ssbo comp_swap	Rob Clark	2019-03-20	1	-1/+1
\| \| \| \| \| \| \|	One line left out of the conversion to ir3 ssbo intrinsics on a6xx. Fixes: 2e4525883f0 ir3/compiler: Enable lower_io_offsets pass and handle new SSBO intrinsics Signed-off-by: Rob Clark <[email protected]>
*	turnip: Deconflict vk_format_table regeneration	Bas Nieuwenhuizen	2019-03-16	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \|	Avoids src/freedreno/vulkan/meson.build:42:0: ERROR: Tried to create target "vk_format_table.c", but a target of that name already exists. when building both radv and turnip. Fixes: 26380b3a9f8 "turnip: Add driver skeleton (v2)" Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
*	turnip: Fix GCC compiles.	Bas Nieuwenhuizen	2019-03-16	1	-6/+3
\| \| \| \| \| \| \| \| \| \|	Apparently GCC does not consider static const variables to be integer constants, and hence the array size and the static assert result in compile failures. Fixes: 4b9f967cd1a "turnip: add a more complete format table" Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
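	The compile failure described above comes from a C rule: a `static const` object is not an integer constant expression, so it cannot size a file-scope array or feed a static assertion. The sketch below shows one common workaround, an enumerator; the names `FORMAT_COUNT`, `format_table` and `format_lookup` are made up for illustration and are not taken from the turnip format table.

```c
#include <assert.h>

/* Rejected by GCC in C mode, because a const-qualified object is not
 * an integer constant expression:
 *
 *   static const int format_count = 4;
 *   static const int table[format_count];   // error at file scope
 */

/* An enumerator (or a #define) is an integer constant expression. */
enum { FORMAT_COUNT = 4 };   /* hypothetical name */

static const int format_table[FORMAT_COUNT] = { 10, 20, 30, 40 };

_Static_assert(sizeof(format_table) / sizeof(format_table[0]) == FORMAT_COUNT,
               "table must cover every format");

/* Bounds-checked lookup into the table. */
static int format_lookup(int idx)
{
    assert(idx >= 0 && idx < FORMAT_COUNT);
    return format_table[idx];
}
```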
*	freedreno/ir3/cp: fix ldib bug	Rob Clark	2019-03-15	1	-0/+6
\| \| \| \| \| \| \|	Something that we didn't hit earlier because of the extra shr.b Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
*	ir3/lower_io_offsets: Try propagate SSBO's SHR into a previous shift instruction	Eduardo Lima Mitev	2019-03-13	1	-4/+94
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	While we lack value range tracking, this patch tries to 'manually' propagate the division by 4 to calculate SSBO element-offset, into a possible previous shift operation (shift left or right); checking that it is safe to do so. This should help in cases such as accessing a field in an array of structs, where the offset is likely defined as base plus a multiplication by a struct or array element size. See dEQP test 'dEQP-GLES31.functional.ssbo.atomic.xor.highp_uint' for an example of a shader that benefits from this. Reviewed-by: Rob Clark <[email protected]>
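	The arithmetic behind this propagation can be sketched in C. The example assumes the simplest case only: the byte offset was produced by a left shift of at least 2, and that shift does not overflow 32 bits. The actual pass works on NIR instructions and performs its own safety checks; `dword_offset` and `dword_offset_folded` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Byte offset -> dword offset: divide by 4, i.e. shift right by 2. */
static uint32_t dword_offset(uint32_t byte_offset)
{
    return byte_offset >> 2;
}

/* Fold the ">> 2" into an earlier "index << shift".
 * Only valid under the stated assumptions: shift is in [2, 31] and
 * (index << shift) does not overflow 32 bits. */
static uint32_t dword_offset_folded(uint32_t index, unsigned shift)
{
    assert(shift >= 2 && shift <= 31);
    return index << (shift - 2);
}
```

	With a 16-byte element (shift of 4), `dword_offset(7u << 4)` and `dword_offset_folded(7u, 4)` both give 28, but the folded form needs one shift instead of two.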
*	ir3/compiler: Enable lower_io_offsets pass and handle new SSBO intrinsics	Eduardo Lima Mitev	2019-03-13	4	-60/+59
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	These intrinsics have the offset in dwords already computed in the last source, so the change here is basically using that instead of emitting the ir3_SHR to divide the byte-offset by 4.
	The improvement in shader stats is significant, of up to ~15% in instruction count in some cases. Tested only on a5xx.
	shader-db is unfortunately not very useful here because shaders that use SSBO require GLSL versions that are not supported by freedreno yet.
	For example, most Khronos CTS tests under 'dEQP-GLES31.functional.ssbo.*' are helped. A random case:
	dEQP-GLES31.functional.ssbo.layout.2_level_array.packed.row_major_mat3x2
	with current master:
	; CL prog 14/1: 1252 instructions, 0 half, 48 full
	; 8 const, 8 constlen
	; 61 (ss), 43 (sy)
	with the SSBO dword-offset moved to NIR:
	; CL prog 14/1: 1053 instructions, 0 half, 45 full
	; 7 const, 7 constlen
	; 34 (ss), 73 (sy)
	The SHR previously emitted for every single SSBO instruction disappears in most cases, and the dword-offset ends up embedded in the STGB instruction as immediate in many cases as well.
	There are also a few of those tests that are currently failing on register allocation, that start to pass as a result of reducing the pressure. At least these, probably more:
	dEQP-GLES31.functional.ssbo.layout.random.unsized_arrays.24
	dEQP-GLES31.functional.ssbo.layout.random.arrays_of_arrays.6
	dEQP-GLES31.functional.ssbo.layout.random.arrays_of_arrays.17
	dEQP-GLES31.functional.ssbo.layout.random.nested_structs_arrays.14
	dEQP-GLES31.functional.ssbo.layout.random.nested_structs_arrays_instance_arrays.5
	dEQP-GLES31.functional.ssbo.layout.random.nested_structs_arrays_instance_arrays.7
	No regressions observed with relevant CTS and piglit tests.
	Reviewed-by: Rob Clark <[email protected]>
*	ir3/nir: Add a new pass 'ir3_nir_lower_io_offsets'	Eduardo Lima Mitev	2019-03-13	4	-0/+217
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This NIR->NIR pass implements offset computations that are currently done on the IR3 backend compiler, to give NIR a better chance of optimizing them. For now, it supports lowering the dword-offset computation for SSBO instructions. It will take an SSBO intrinsic and replace it with the new ir3-specific version that adds an extra source. That source will hold the SSA value resulting from inserting a division by 4 (an SHR op) of the original byte-offset source already provided by NIR in one of the intrinsic sources. Note that on a6xx the original byte-offset is not needed, so we could potentially replace that source instead of adding a new one. But to keep things simple and consistent we always add the new source and a6xx will just ignore the original one. Reviewed-by: Rob Clark <[email protected]>
*	turnip: preliminary support for Wayland WSI	Chia-I Wu	2019-03-11	6	-1/+358
\|
*	turnip: preliminary support for tu_GetImageSubresourceLayout	Chia-I Wu	2019-03-11	1	-5/+11
\|
*	turnip: Use Vulkan 1.1 names instead of KHR	Chad Versace	2019-03-11	8	-100/+100
\| \| \| \| \| \| \|	That is, drop KHR from all tokens that were promoted to Vulkan 1.1. The consistency makes ctags more useful (it now jumps directly to the real definitions in vulkan_core.h instead of the typedefs); and it makes the code slightly less verbose.
*	turnip: preliminary support for tu_CmdDraw	Chia-I Wu	2019-03-11	1	-0/+83
\|
*	turnip: preliminary support for draw state binding	Chia-I Wu	2019-03-11	2	-4/+357
\| \| \| \| \|	This adds support for tu_CmdBindPipeline, tu_CmdBindVertexBuffers, etc.
*	turnip: add draw_cs to tu_cmd_buffer	Chia-I Wu	2019-03-11	2	-3/+21
\| \| \| \|	It will hold draw commands.
*	turnip: parse VkPipelineVertexInputStateCreateInfo	Chia-I Wu	2019-03-11	2	-0/+134
\|
*	turnip: parse VkPipelineShaderStageCreateInfo	Chia-I Wu	2019-03-11	2	-0/+602
\|
*	turnip: compile VkPipelineShaderStageCreateInfo	Chia-I Wu	2019-03-11	2	-0/+143
\| \| \| \|	Compile all shaders and upload the binaries to a BO.
*	turnip: preliminary support for shader modules	Chia-I Wu	2019-03-11	5	-5/+411
\| \| \| \| \| \|	Save SPIR-V in tu_shader_module. Translation to NIR happens in tu_shader_create, and compilation to binary code happens in tu_shader_compile. Both will be called during pipeline creation.
*	turnip: parse VkPipeline{Multisample,ColorBlend}StateCreateInfo	Chia-I Wu	2019-03-11	3	-0/+355
\|
*	turnip: parse VkPipelineDepthStencilStateCreateInfo	Chia-I Wu	2019-03-11	2	-2/+204
\|
*	turnip: parse VkPipelineRasterizationStateCreateInfo	Chia-I Wu	2019-03-11	2	-0/+128
\|
*	turnip: parse VkPipelineViewportStateCreateInfo	Chia-I Wu	2019-03-11	2	-1/+130
\|
*	turnip: parse VkPipelineInputAssemblyStateCreateInfo	Chia-I Wu	2019-03-11	2	-0/+49
\|
*	turnip: parse VkPipelineDynamicStateCreateInfo	Chia-I Wu	2019-03-11	1	-0/+46
\|
*	turnip: create a less dummy pipeline	Chia-I Wu	2019-03-11	2	-30/+109
\| \| \| \|	Still dummy, but at least it is created from tu_pipeline_builder.
*	turnip: simplify tu_cs sub-streams usage	Chia-I Wu	2019-03-11	2	-7/+12
\| \| \| \| \| \|	Let tu_cs_begin_sub_stream imply tu_cs_reserve_space, and tu_cs_end_sub_stream imply tu_cs_sanity_check. Callers are no longer required to call them (but can still do so if they choose).
*	turnip: fix tu_cs sub-streams	Chia-I Wu	2019-03-11	1	-1/+5
\| \| \| \| \|	Update cs->start in tu_cs_end_sub_stream. Otherwise, the entry would include commands from all prior sub-streams.
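	The bug described above can be pictured with a toy command stream. This is a minimal sketch with invented types (`toy_cs`, `toy_entry`) and invented field names, not the real tu_cs structures; it only shows why the start marker has to advance when a sub-stream ends.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy command stream; field names are illustrative, not turnip's. */
struct toy_cs {
    uint32_t buf[64];
    size_t start;   /* first dword of the current sub-stream */
    size_t cur;     /* next dword to write */
};

struct toy_entry {
    size_t offset;  /* first dword of the entry */
    size_t size;    /* number of dwords in the entry */
};

static void toy_cs_emit(struct toy_cs *cs, uint32_t value)
{
    assert(cs->cur < sizeof(cs->buf) / sizeof(cs->buf[0]));
    cs->buf[cs->cur++] = value;
}

/* Close the current sub-stream and return the range it covers. */
static struct toy_entry toy_cs_end_sub_stream(struct toy_cs *cs)
{
    struct toy_entry e = { cs->start, cs->cur - cs->start };
    /* Without this update, the next entry would also span the
     * dwords of every earlier sub-stream. */
    cs->start = cs->cur;
    return e;
}
```

	After emitting two dwords, ending the sub-stream, then emitting one more and ending again, the second entry covers only the third dword.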
*	turnip: tu_cs_emit_array	Chia-I Wu	2019-03-11	1	-0/+11
\| \| \| \| \| \|	Array version of tu_cs_emit. Useful for updating multiple consecutive array-like registers, or loading a shader binary with SS6_DIRECT.
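	As a rough picture of what an array variant of an emit helper looks like, the sketch below copies a run of dwords into a toy stream in one call. The `mini_cs` type and both functions are hypothetical stand-ins; the real tu_cs_emit_array signature is not shown in this log and may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MINI_CS_CAPACITY 64

/* Toy command stream (hypothetical, not turnip's tu_cs). */
struct mini_cs {
    uint32_t buf[MINI_CS_CAPACITY];
    size_t cur;   /* next dword to write */
};

/* Emit a single dword. */
static void mini_cs_emit(struct mini_cs *cs, uint32_t value)
{
    assert(cs->cur < MINI_CS_CAPACITY);
    cs->buf[cs->cur++] = value;
}

/* Emit `count` consecutive dwords, e.g. values for a run of
 * consecutive registers or the words of a shader binary. */
static void mini_cs_emit_array(struct mini_cs *cs,
                               const uint32_t *values, size_t count)
{
    if (count == 0)
        return;
    assert(count <= MINI_CS_CAPACITY - cs->cur);
    memcpy(&cs->buf[cs->cur], values, count * sizeof(values[0]));
    cs->cur += count;
}
```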