Commit message | Author | Age | Files | Lines
* ac/llvm: convert src operands to pointers if necessary | Samuel Pitoiset | 2019-11-28 | 1 | -0/+11
  To avoid generating invalid LLVM IR when both operands don't have the same type. This might happen when performing pointer comparisons with SPIRV 1.4.

  Fixes invalid LLVM IR for:
  dEQP-VK.spirv_assembly.instruction.spirv1p4.opptrequal.variable_pointers_ssbo_equal
  dEQP-VK.spirv_assembly.instruction.spirv1p4.opptrnotequal.variable_pointers_ssbo_not_equal

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* llvmpipe: add initial nir support | Dave Airlie | 2019-11-28 | 10 | -20/+125
  This adds the hooks between llvmpipe and the gallivm NIR code, for compute and fragment shaders.

  NIR support is hidden behind LP_DEBUG=nir for now, until all the integration issues are solved.

  Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: add swizzle support where one channel isn't defined. | Dave Airlie | 2019-11-28 | 2 | -12/+35
  NIR doesn't always define all output channels, so this relies on outputs being memset to 0.

  Reviewed-by: Roland Scheidegger <[email protected]>
* gallium: add nir lowering passes for the draw pipe stages. (v2) | Dave Airlie | 2019-11-28 | 7 | -27/+547
  This transforms the NIR shaders like the TGSI transforms worked.

  v2: fix some nir info requirements, use 32-bit bools

  Acked-by: Roland Scheidegger <[email protected]>
* draw: add nir info gathering and building support | Dave Airlie | 2019-11-28 | 5 | -26/+63
  Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: add nir->llvm translation (v2) | Dave Airlie | 2019-11-28 | 5 | -0/+3142
  This adds the initial implementation of the NIR->LLVM conversion for llvmpipe NIR support.

  v2: lower bool to int32 in nir not llvm

  Acked-by: Roland Scheidegger <[email protected]>
* gallivm: add selection for non-32 bit types | Dave Airlie | 2019-11-28 | 1 | -1/+8
  Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: add cttz wrapper | Dave Airlie | 2019-11-28 | 2 | -0/+17
  this will be used to write find_lsb support

  Reviewed-by: Roland Scheidegger <[email protected]>
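The cttz commit above says a count-trailing-zeros wrapper will back find_lsb. The relationship can be sketched in plain C as below. This is an illustration only, not the gallivm code: `find_lsb` and `find_lsb_selfcheck` are hypothetical names, `__builtin_ctz` assumes GCC or Clang, and the -1 result for zero follows GLSL's findLSB convention.

```c
#include <assert.h>
#include <stdint.h>

/* Index of the least significant set bit, or -1 when no bit is set
 * (the convention GLSL's findLSB uses). Hypothetical helper name. */
int find_lsb(uint32_t value)
{
    if (value == 0)
        return -1;                /* __builtin_ctz(0) is undefined, so guard it */
    return __builtin_ctz(value);  /* GCC/Clang builtin: count trailing zeros */
}

/* Usage sketch: the checks below hold for the helper above. */
void find_lsb_selfcheck(void)
{
    assert(find_lsb(0u) == -1);
    assert(find_lsb(1u) == 0);
    assert(find_lsb(8u) == 3);
    assert(find_lsb(0x80000000u) == 31);
}
```

The zero case is the only place where the trailing-zero count and find_lsb differ, so a wrapper of this kind has to handle it explicitly.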
* gallivm: add popcount intrinsic wrapper | Dave Airlie | 2019-11-28 | 2 | -1/+15
  Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: nir->tgsi info convertor (v2) | Dave Airlie | 2019-11-28 | 5 | -1/+816
  This is a port of the old radeonsi code to be used for llvmpipe NIR support. Once we remove TGSI support from llvmpipe (I can dream? :-), then we should be able to refine most of this down and remove it.

  v2: port to later radeonsi code for vertex inputs and sampler/io parsing.

  Acked-by: Roland Scheidegger <[email protected]>
* gallivm: split out the flow control ir to a common file. | Dave Airlie | 2019-11-28 | 6 | -479/+599
  We can share a bunch of flow control handling between NIR and TGSI.

  Reviewed-by: Roland Scheidegger <[email protected]>
* radeonsi: enable SPIR-V and GL 4.6 for NIR | Marek Olšák | 2019-11-27 | 1 | -6/+5
  Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* radeonsi/nir: support interface output types to fix SPIR-V xfb piglits | Marek Olšák | 2019-11-27 | 1 | -1/+1
  Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* radeonsi/nir: fix location_frac handling for TCS outputs | Marek Olšák | 2019-11-27 | 1 | -1/+1
  Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* radeonsi/nir: don't rely on data.patch for tess factors | Marek Olšák | 2019-11-27 | 1 | -1/+4
  GLCTS SPIR-V tests have this issue.

  Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* radeonsi/nir: validate is_patch because SPIR-V doesn't set it for tess factors | Marek Olšák | 2019-11-27 | 1 | -10/+21
  Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* radeonsi: simplify get_tcs_tes_buffer_address_from_generic_indices | Marek Olšák | 2019-11-27 | 1 | -27/+19
  Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* radeonsi: simplify the interface of get_dw_address_from_generic_indices | Marek Olšák | 2019-11-27 | 1 | -29/+19
  Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* radeonsi/nir: implement subgroup system values for SPIR-V | Marek Olšák | 2019-11-27 | 4 | -0/+11
  Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* ac/nir: don't rely on data.patch for tess factors | Marek Olšák | 2019-11-27 | 1 | -2/+6
  Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* drirc: Set vs_position_always_invariant for Shadow of Mordor on Intel | Kenneth Graunke | 2019-11-27 | 1 | -0/+10
  When drawing the main character in Shadow of Mordor, the game appears to draw Talion with one vertex shader, and the Wraith with another. If the compiler optimizes those in different ways which lead to slight imprecisions, then the resulting positions may not line up, leading to Z-fighting occurring as the game decides which of the two are in front.

  brw_nir_opt_peephole_ffma looks at usages of multiply adds across the entire shader, and may make different decisions between the two, leading to such imprecisions and Z-fighting.

  This started happening recently after a NIR change to eliminate unnecessary MOVs (7025dbe7), but that change simply exposed the existing problem.

  Improves performance on Skylake GT4e by 1.22945% +/- 0.398672% (n=3), likely due to the fixed rendering.

  Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1985
  Fixes: 7025dbe794b ("nir: Skip emitting no-op movs from the builder.")
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
* driconf, glsl: Add a vs_position_always_invariant option | Kenneth Graunke | 2019-11-27 | 8 | -0/+23
  Many applications use multi-pass rendering and require their vertex shader position to be computed the same way each time. Optimizations may consider, say, fusing a multiply-add based on global usage of an expression in a shader. But a second shader with the same expression may have different code, causing that optimization to make the other choice the second time around.

  The correct solution is for applications to mark their VS outputs 'invariant', indicating they need multiple shaders to compute that output in the same manner. However, most applications fail to do so.

  So, we add a new driconf option - vs_position_always_invariant - which forces the gl_Position output in vertex shaders to be marked invariant.

  Fixes: 7025dbe794b ("nir: Skip emitting no-op movs from the builder.")
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
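The two vs_position_always_invariant commits above hinge on a fused multiply-add and a separate multiply and add giving slightly different results for the same expression. A small C sketch of that effect follows. It is not driver code: the function names are hypothetical, and the fused path is emulated by doing the arithmetic in double and rounding to float once, which is an assumption made so the example needs no FMA hardware or libm.

```c
#include <assert.h>

/* Multiply, round the product to float, then add: two roundings. */
float mul_then_add(float a, float b, float c)
{
    volatile float product = a * b;  /* volatile forces the product to be rounded to float */
    return product + c;
}

/* Emulated fused multiply-add: the product of two floats is exact in
 * double, so only the final conversion back to float rounds. */
float fused_mul_add_emulated(float a, float b, float c)
{
    double exact = (double)a * (double)b + (double)c;
    return (float)exact;
}

/* Usage sketch: the same expression yields two different values. */
void fma_mismatch_selfcheck(void)
{
    float a = 1.0f + 1.0f / 8192.0f;  /* 1 + 2^-13 */
    float b = 1.0f - 1.0f / 8192.0f;  /* 1 - 2^-13 */

    /* a*b = 1 - 2^-26 rounds to 1.0f, so the unfused result is exactly 0. */
    assert(mul_then_add(a, b, -1.0f) == 0.0f);
    /* The fused path keeps the -2^-26 residue. */
    assert(fused_mul_add_emulated(a, b, -1.0f) < 0.0f);
}
```

If one shader's position math takes the first path and another's takes the second, the two positions can disagree in the last bits, which is the mismatch the commit messages describe as the cause of the Z-fighting.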
* turnip: Disable timestamp queries for now. | Eric Anholt | 2019-11-27 | 1 | -2/+2
  They're not implemented, and not critical to bring up immediately. Avoids failures in the CTS when nothing gets written to the query.

  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* freedreno/perfcntrs/fdperf: add missing a2xx case in select_counter | Jonathan Marek | 2019-11-27 | 1 | -0/+1
  Signed-off-by: Jonathan Marek <[email protected]>
  Reviewed-by: Rob Clark <[email protected]>
* freedreno/perfcntrs/fdperf: add missing a20x compatible | Jonathan Marek | 2019-11-27 | 1 | -0/+1
  Signed-off-by: Jonathan Marek <[email protected]>
  Reviewed-by: Rob Clark <[email protected]>
* freedreno/perfcntrs/fdperf: fix u64 print on 32-bit builds | Jonathan Marek | 2019-11-27 | 1 | -1/+2
  Signed-off-by: Jonathan Marek <[email protected]>
  Reviewed-by: Rob Clark <[email protected]>
* freedreno/perfcntrs: add a2xx MH counters | Jonathan Marek | 2019-11-27 | 1 | -4/+186
  Signed-off-by: Jonathan Marek <[email protected]>
  Reviewed-by: Rob Clark <[email protected]>
* freedreno/registers: add missing MH perfcounter enum for a2xx | Jonathan Marek | 2019-11-27 | 1 | -0/+185
  Signed-off-by: Jonathan Marek <[email protected]>
  Reviewed-by: Rob Clark <[email protected]>
* gitlab-ci: Put HTML summary in artifacts for failed piglit jobs | Michel Dänzer | 2019-11-27 | 2 | -0/+7
  This will make it easier to look at details of failed / skipped tests.

  Acked-by: Daniel Stone <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* gitlab-ci: Stop storing piglit test results as JUnit | Michel Dänzer | 2019-11-27 | 3 | -5522/+13614
  Since we're not reporting test results as JUnit anymore, we can use the default JSON format.

  This affects how test results are summarized, update the reference files accordingly.

  Reviewed-by: Eric Anholt <[email protected]>
* gitlab-ci: Stop reporting piglit test results via JUnit | Michel Dänzer | 2019-11-27 | 1 | -3/+0
  It was basically useless in this form, and processing the JUnit data in the GitLab backend was pretty expensive.

  Acked-by: Daniel Stone <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* v3d: fix indirect BO allocation for uniforms | Iago Toral Quiroga | 2019-11-27 | 1 | -3/+8
  We were always ensuring a minimum size of 4 bytes for uniforms for the case where we don't have any, to account for hardware pre-fetching of the uniform stream. However, pre-fetching could also lead to out-of-bounds reads when we have read the last uniform in the stream, so we probably want to have the extra 4 bytes to prevent the kernel from observing invalid memory accesses when the uniform stream sits right at the end of a page.

  This seems to fix MMU exceptions reported with a Linux 5.4 kernel.

  Credit goes to Phil Elwell for identifying the problem and narrowing it down to memory accesses in the uniform stream.

  Reported-by: Phil Elwell <[email protected]>
  Tested-by: Phil Elwell <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
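The sizing change described in the v3d commit above can be read as moving from "at least 4 bytes" to "always 4 bytes of slack after the last uniform". The sketch below illustrates that reading only. The function names and the 4-bytes-per-uniform assumption are hypothetical, and the actual v3d code is not shown in this log.

```c
#include <assert.h>
#include <stdint.h>

#define BYTES_PER_UNIFORM  4u  /* assumption: each uniform is one 32-bit word */
#define PREFETCH_PAD_BYTES 4u  /* slack for hardware pre-fetch past the last uniform */

/* Old policy as described: only guarantee a 4-byte minimum. */
uint32_t uniform_bo_size_old(uint32_t uniform_count)
{
    uint32_t size = uniform_count * BYTES_PER_UNIFORM;
    return size < PREFETCH_PAD_BYTES ? PREFETCH_PAD_BYTES : size;
}

/* New policy under this reading: always append the pre-fetch pad. */
uint32_t uniform_bo_size_new(uint32_t uniform_count)
{
    return uniform_count * BYTES_PER_UNIFORM + PREFETCH_PAD_BYTES;
}

/* Usage sketch: the two policies agree for zero uniforms and diverge otherwise. */
void uniform_bo_size_selfcheck(void)
{
    assert(uniform_bo_size_old(0) == 4);
    assert(uniform_bo_size_new(0) == 4);
    assert(uniform_bo_size_old(3) == 12);  /* no room for a pre-fetch past the end */
    assert(uniform_bo_size_new(3) == 16);  /* 4 bytes of slack after the last uniform */
}
```

With the old policy a stream of three uniforms can end exactly at a page boundary, so a 4-byte pre-fetch past the end touches unmapped memory. The padded size keeps that read inside the allocation.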
* radv: enable VK_KHR_shader_subgroup_extended_types on GFX10 | Samuel Pitoiset | 2019-11-27 | 1 | -1/+1
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: add 8-bit and 16-bit supports to ac_build_permlane16() | Samuel Pitoiset | 2019-11-27 | 1 | -8/+16
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/gfx10: fix implementation of exclusive scans | Samuel Pitoiset | 2019-11-27 | 1 | -24/+52
  This implementation is loosely based on ROCm.
  https://github.com/RadeonOpenCompute/ROCm-Device-Libs/blob/master/ockl/src/wfredscan.cl

  This fixes dEQP-VK.subgroups.arithmetic.*.subgroupexclusive* on GFX10.

  Fixes: 227c29a80de ("amd/common/gfx10: implement scan & reduce operations")
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: fix enabling sample shading with SampleID/SamplePosition | Samuel Pitoiset | 2019-11-27 | 1 | -7/+24
  When a fragment shader includes an input variable decorated with SampleId or SamplePosition, sample shading should be enabled because minSampleShadingFactor is expected to be 1.0.

  Cc: 19.2, 19.3 <[email protected]>
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
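The rule stated in the sample-shading commit above can be sketched as a small decision function. This is an illustration under assumptions, not the radv implementation: `struct fs_info`, its fields, and `effective_min_sample_shading` are hypothetical names, and returning 0.0 for the "no sample shading" case is a convention chosen for this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of what the fragment shader reads. */
struct fs_info {
    bool reads_sample_id;
    bool reads_sample_pos;
};

/* Returns the sample shading factor to use. A shader that reads SampleId
 * or SamplePosition forces 1.0 regardless of the pipeline state. */
float effective_min_sample_shading(const struct fs_info *fs,
                                   bool sample_shading_enable,
                                   float min_sample_shading)
{
    if (fs->reads_sample_id || fs->reads_sample_pos)
        return 1.0f;
    return sample_shading_enable ? min_sample_shading : 0.0f;
}

/* Usage sketch covering the three cases. */
void sample_shading_selfcheck(void)
{
    struct fs_info plain = { false, false };
    struct fs_info per_sample = { true, false };

    assert(effective_min_sample_shading(&plain, false, 0.5f) == 0.0f);
    assert(effective_min_sample_shading(&plain, true, 0.5f) == 0.5f);
    assert(effective_min_sample_shading(&per_sample, false, 0.5f) == 1.0f);
}
```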
* turnip: fix integer render targets | Jonathan Marek | 2019-11-26 | 1 | -1/+3
  Add missing required bits.

  Fixes at least:
  dEQP-VK.pipeline.render_to_image.dedicated_allocation.1d.small.r16g16_sint_d24_unorm_s8_uint
  dEQP-VK.pipeline.render_to_image.dedicated_allocation.2d.mipmap.r16g16_sint_d24_unorm_s8_uint
  dEQP-VK.renderpass.dedicated_allocation.attachment.4.401
  dEQP-VK.renderpass2.suballocation.formats.r16_uint.load.draw
  dEQP-VK.synchronization.op.single_queue.barrier.write_draw_read_copy_image_to_buffer.image_128x128_r16_uint

  Signed-off-by: Jonathan Marek <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* anv: Push constants are relative to dynamic state on IVB | Jason Ekstrand | 2019-11-26 | 1 | -0/+17
  Fixes: aecde2351 "anv: Pre-compute push ranges for graphics pipelines"
  Closes: #2136
  Reviewed-by: Lionel Landwerlin <[email protected]>
* meson: Add -Werror=gnu-empty-initializer to MSVC compat args | Dylan Baker | 2019-11-26 | 1 | -4/+4
  Only clang has this argument (at least as of clang 8 and gcc 9), which errors when using the gcc empty initializer syntax in C:

  ```C
  struct foo f = {};
  ```

  GCC has a warning for this, but only when using -Wpedantic, which is a lot of noise to lose useful warnings in.

  Reviewed-by: Kristian H. Kristensen <[email protected]>
  Reviewed-by: Eric Engestrom <[email protected]>
* gallium/auxiliary: Fix uses of gnu struct = {} extension | Dylan Baker | 2019-11-26 | 5 | -8/+8
  Most of these will never actually be compiled by windows, but in the interest of being able to make using struct foo = {}; an error and avoiding breaking windows removing a handful of safe uses seems like a good trade off.

  Reviewed-by: Kristian H. Kristensen <[email protected]>
  Acked-by: Eric Engestrom <[email protected]>
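The two commits above turn the GNU empty initializer into an error and remove its uses. A portable replacement is `{0}`, sketched below. `struct foo` and its members are hypothetical and only mirror the name used in the commit message. The actual replacements made in gallium/auxiliary are not shown in this log.

```c
#include <assert.h>

/* Hypothetical struct, named after the one in the commit message. */
struct foo {
    int id;
    float weight;
    char name[8];
};

struct foo make_zeroed_foo(void)
{
    /* struct foo f = {};    GNU extension in C (ISO C accepted it only later, in C23) */
    struct foo f = {0};   /* portable: first member is 0, the rest are zero-initialized */
    return f;
}

/* Usage sketch: every member comes back zeroed. */
void zeroed_foo_selfcheck(void)
{
    struct foo f = make_zeroed_foo();
    assert(f.id == 0);
    assert(f.weight == 0.0f);
    assert(f.name[0] == '\0' && f.name[7] == '\0');
}
```

Both spellings produce the same zeroed object on compilers that accept them. The difference is only that `{0}` is valid in every C standard, so it does not trip -Werror=gnu-empty-initializer.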
* st/mesa: add st_variant base class to simplify code for shader variants | Marek Olšák | 2019-11-26 | 8 | -307/+149
  Reviewed-by: Timothy Arceri <[email protected]>
* st/mesa: don't use ** in the st_nir_link_shaders signature | Marek Olšák | 2019-11-26 | 1 | -20/+20
  Reviewed-by: Timothy Arceri <[email protected]>
* st/mesa: simplify looping over linked shaders when linking NIR | Marek Olšák | 2019-11-26 | 1 | -48/+28
  Reviewed-by: Timothy Arceri <[email protected]>
* st/mesa: propagate gl_PatchVerticesIn from TCS to TES before linking for NIR | Marek Olšák | 2019-11-26 | 1 | -2/+2
  Reviewed-by: Timothy Arceri <[email protected]>
* st/mesa: don't call ProgramStringNotify in glsl_to_nir | Marek Olšák | 2019-11-26 | 2 | -13/+16
  Reviewed-by: Timothy Arceri <[email protected]>
* st/mesa: don't use redundant stp->state.ir.nir | Marek Olšák | 2019-11-26 | 3 | -25/+12
  Reviewed-by: Timothy Arceri <[email protected]>
* st/mesa: don't serialize all streamout state if there are no SO outputs | Marek Olšák | 2019-11-26 | 1 | -4/+15
  Reviewed-by: Timothy Arceri <[email protected]>
* iris: Disable VF cache partial address workaround on Gen11+ | Kenneth Graunke | 2019-11-26 | 2 | -0/+14
  The vertex cache uses the full 48-bit address on Gen11+. See the documentation for 3DSTATE_VERTEX_BUFFERS, which describes the workaround and lists it as pre-Icelake.

  Interestingly, the docs don't mention index buffers as needing a workaround at all. So either we've been overzealous, or the docs never got updated to record that. Which begs the question of whether the issue there was fixed, if there was one...

  Cuts 40% of the PIPE_CONTROLs from Civilization VI's benchmark; appears that it improves performance by about 1-2% on Icelake 8x8 (not frequency locked).
* freedreno: switch to layout helper | Rob Clark | 2019-11-26 | 27 | -199/+190
  The slices table and most of the other layout fields in the freedreno_resource move into fdl_layout.

  v2: Changes by anholt to not have duplicate fields, which was introducing a surprising behavior change in resource layout (using the level_linear helper before the setup of the shadowed fields)

  Reviewed-by: Eric Anholt <[email protected]>
  Acked-by: Rob Clark <[email protected]>
* freedreno/a6xx: Log the tiling mode in resource layout debug. | Eric Anholt | 2019-11-26 | 1 | -2/+2
  This was important for figuring out what went wrong with the layout refactor.

  Acked-by: Rob Clark <[email protected]>