summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* gallium/util: Cast to target type before shifting leftMichel Dänzer2019-10-241-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | Otherwise a smaller type may be promoted to int, which can hit undefined behaviour: ../src/gallium/auxiliary/util/u_half.h:126:29: runtime error: left shift of 32768 by 16 places cannot be represented in type 'int' #0 0x5646ff63d488 in util_half_to_float ../src/gallium/auxiliary/util/u_half.h:126 #1 0x5646ff63d749 in _mesa_half_to_float ../src/util/half_float.c:145 #2 0x5646ff54d557 in nir_const_value_negative_equal ../src/compiler/nir/nir_instr_set.c:372 #3 0x5646ff44d29a in const_value_negative_equal_test_nir_type_float16_trivially_true_Test::TestBody() ../src/compiler/nir/tests/negative_equal_tests.cpp:121 #4 0x5646ff505c05 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ../src/gtest/src/gtest.cc:2402 #5 0x5646ff4f1513 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ../src/gtest/src/gtest.cc:2438 #6 0x5646ff4979b5 in testing::Test::Run() ../src/gtest/src/gtest.cc:2474 #7 0x5646ff49a08e in testing::TestInfo::Run() ../src/gtest/src/gtest.cc:2656 #8 0x5646ff49c81c in testing::TestCase::Run() ../src/gtest/src/gtest.cc:2774 #9 0x5646ff4b81f6 in testing::internal::UnitTestImpl::RunAllTests() ../src/gtest/src/gtest.cc:4649 #10 0x5646ff50981e in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ../src/gtest/src/gtest.cc:2402 #11 0x5646ff4f4eeb in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ../src/gtest/src/gtest.cc:2438 #12 0x5646ff4af818 in testing::UnitTest::Run() ../src/gtest/src/gtest.cc:4257 #13 0x5646ff52e639 in RUN_ALL_TESTS() ../src/gtest/include/gtest/gtest.h:2233 #14 0x5646ff52e4f9 in main ../src/gtest/src/gtest_main.cc:37 #15 0x7f6bacb78bba in __libc_start_main ../csu/libc-start.c:308 #16 0x5646ff448019 in _start (/home/daenzer/src/mesa-git/mesa/build-amd64-sanitize/src/compiler/nir/negative_equal+0x17c019) Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Adam Jackson <[email protected]>
* intel/fs: Check for NULL key in fs_visitor constructorMichel Dänzer2019-10-241-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | Flagged by UBSan: ../src/intel/compiler/brw_fs_visitor.cpp:986:20: runtime error: member access within null pointer of type 'const struct brw_base_prog_key' #0 0x559fadb48556 in fs_visitor::init() ../src/intel/compiler/brw_fs_visitor.cpp:986 #1 0x559fadb46db3 in fs_visitor::fs_visitor(brw_compiler const*, void*, void*, brw_base_prog_key const*, brw_stage_prog_data*, nir_shader const*, unsigned int, int, brw_vue_map const*) ../src/intel/compiler/brw_fs_visitor.cpp:962 #2 0x559fad9c7cd8 in saturate_propagation_fs_visitor::saturate_propagation_fs_visitor(brw_compiler*, brw_wm_prog_data*, nir_shader*) (/home/daenzer/src/mesa-git/mesa/build-amd64-sanitize/src/intel/compiler/fs_saturate_propagation+0x61bcd8) #3 0x559fad9960a1 in saturate_propagation_test::SetUp() ../src/intel/compiler/test_fs_saturate_propagation.cpp:65 #4 0x559fadd7a32d in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ../src/gtest/src/gtest.cc:2402 #5 0x559fadd65c3b in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ../src/gtest/src/gtest.cc:2438 #6 0x559fadd0af75 in testing::Test::Run() ../src/gtest/src/gtest.cc:2470 #7 0x559fadd0d8a4 in testing::TestInfo::Run() ../src/gtest/src/gtest.cc:2656 #8 0x559fadd10032 in testing::TestCase::Run() ../src/gtest/src/gtest.cc:2774 #9 0x559fadd2ba0c in testing::internal::UnitTestImpl::RunAllTests() ../src/gtest/src/gtest.cc:4649 #10 0x559fadd7df46 in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ../src/gtest/src/gtest.cc:2402 #11 0x559fadd69613 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ../src/gtest/src/gtest.cc:2438 #12 0x559fadd2302e in testing::UnitTest::Run() ../src/gtest/src/gtest.cc:4257 #13 0x559fadda2d61 in RUN_ALL_TESTS() ../src/gtest/include/gtest/gtest.h:2233 #14 0x559fadda2c21 in main ../src/gtest/src/gtest_main.cc:37 #15 0x7fe8f6748bba in __libc_start_main ../csu/libc-start.c:308 #16 0x559fad9950f9 in _start (/home/daenzer/src/mesa-git/mesa/build-amd64-sanitize/src/intel/compiler/fs_saturate_propagation+0x5e90f9) Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Adam Jackson <[email protected]>
* intel/compiler: Cast to target type before shifting leftMichel Dänzer2019-10-241-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | Otherwise a smaller type may be promoted to int, which can hit undefined behaviour: ../src/intel/compiler/brw_packed_float.c:66:17: runtime error: left shift of 128 by 24 places cannot be represented in type 'int' #0 0x5604a03969aa in brw_vf_to_float ../src/intel/compiler/brw_packed_float.c:66 #1 0x5604a0391305 in vf_float_conversion_test_test_vf_to_float_Test::TestBody() ../src/intel/compiler/test_vf_float_conversions.cpp:70 #2 0x5604a041a323 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ../src/gtest/src/gtest.cc:2402 #3 0x5604a0405c31 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ../src/gtest/src/gtest.cc:2438 #4 0x5604a03ab03b in testing::Test::Run() ../src/gtest/src/gtest.cc:2474 #5 0x5604a03ad714 in testing::TestInfo::Run() ../src/gtest/src/gtest.cc:2656 #6 0x5604a03afea2 in testing::TestCase::Run() ../src/gtest/src/gtest.cc:2774 #7 0x5604a03cb87c in testing::internal::UnitTestImpl::RunAllTests() ../src/gtest/src/gtest.cc:4649 #8 0x5604a041df3c in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ../src/gtest/src/gtest.cc:2402 #9 0x5604a0409609 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ../src/gtest/src/gtest.cc:2438 #10 0x5604a03c2e9e in testing::UnitTest::Run() ../src/gtest/src/gtest.cc:4257 #11 0x5604a0442d57 in RUN_ALL_TESTS() ../src/gtest/include/gtest/gtest.h:2233 #12 0x5604a0442c17 in main ../src/gtest/src/gtest_main.cc:37 #13 0x7f9a1983dbba in __libc_start_main ../csu/libc-start.c:308 #14 0x5604a0390d89 in _start (/home/daenzer/src/mesa-git/mesa/build-amd64-sanitize/src/intel/compiler/vf_float_conversions+0x8dd89) Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Adam Jackson <[email protected]>
* intel/compiler: Don't left-shift by >= the number of bits of the typeMichel Dänzer2019-10-241-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | To avoid it, use the modulo of the number of bits in the value being shifted, which is presumably what ended up happening on x86. Flagged by UBSan: ../src/intel/compiler/brw_eu_validate.c:974:33: runtime error: shift exponent 64 is too large for 64-bit type 'long unsigned int' #0 0x561abb612ab3 in general_restrictions_on_region_parameters ../src/intel/compiler/brw_eu_validate.c:974 #1 0x561abb617574 in brw_validate_instructions ../src/intel/compiler/brw_eu_validate.c:1851 #2 0x561abb53bd31 in validate ../src/intel/compiler/test_eu_validate.cpp:106 #3 0x561abb555369 in validation_test_source_cannot_span_more_than_2_registers_Test::TestBody() ../src/intel/compiler/test_eu_validate.cpp:486 #4 0x561abb742651 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ../src/gtest/src/gtest.cc:2402 #5 0x561abb72e64d in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ../src/gtest/src/gtest.cc:2438 #6 0x561abb6d5451 in testing::Test::Run() ../src/gtest/src/gtest.cc:2474 #7 0x561abb6d7b2a in testing::TestInfo::Run() ../src/gtest/src/gtest.cc:2656 #8 0x561abb6da2b8 in testing::TestCase::Run() ../src/gtest/src/gtest.cc:2774 #9 0x561abb6f5c92 in testing::internal::UnitTestImpl::RunAllTests() ../src/gtest/src/gtest.cc:4649 #10 0x561abb74626a in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ../src/gtest/src/gtest.cc:2402 #11 0x561abb732025 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ../src/gtest/src/gtest.cc:2438 #12 0x561abb6ed2b4 in testing::UnitTest::Run() ../src/gtest/src/gtest.cc:4257 #13 0x561abb768b3b in RUN_ALL_TESTS() ../src/gtest/include/gtest/gtest.h:2233 #14 0x561abb7689fb in main ../src/gtest/src/gtest_main.cc:37 #15 0x7f525e5a9bba in __libc_start_main ../csu/libc-start.c:308 #16 0x561abb538ed9 in _start (/home/daenzer/src/mesa-git/mesa/build-amd64-sanitize/src/intel/compiler/eu_validate+0x1b8ed9) Reviewed-by: Adam Jackson <[email protected]>
* anv: fix error messageEric Engestrom2019-10-241-5/+2
| | | | | | | | | | | `strerror()` takes an `errno`, not the negative value returned by the `ioctl()`. Instead of fixing this as `"%s", strerror(errno)`, let's just use the `"%m"` shortcut for it. Fixes: 2b5f30b1d91b98ab27ba ("anv: implement VK_INTEL_performance_query") Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* meson: add -Werror=empty-body to disallow `if(x);`Eric Engestrom2019-10-241-0/+2
| | | | | | | | | | | This would have prevented a bug in MR 2058 [1]; with that MR fixed, nothing else uses empty-body blocks, so let's just forbid them altogether. [1] https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2058#note_237880 Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
* llvmpipe: avoid generating empty-body blocksEric Engestrom2019-10-241-1/+1
| | | | | | Suggested-by: Erik Faye-Lund <[email protected]> Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
* llvmpipe: avoid compiling no-op block on release buildsEric Engestrom2019-10-241-1/+2
| | | | | | Suggested-by: Adam Jackson <[email protected]> Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
* winsys/svga: Limit the maximum DMA hardware buffer sizeThomas Hellstrom2019-10-241-1/+4
| | | | | | | | | | | | | | | | | | | The kernel total GMR/DMA size is limited, but it's definitely possible for the kernel to allow a larger buffer allocation to succeed, but command submission using that buffer as a GMR would fail typically causing an application crash. So have the winsys limit the size of GMR/DMA buffers. The pipe driver will then resort to allocating smaller buffers and perform the DMA transfer in multiple bands, also allowing for the pre-flush mechanism to kick in. This avoids the related application crashes. Fixes: e7843273fae ("winsys/svga: Update to vmwgfx kernel module 2.1") Signed-off-by: Thomas Hellstrom <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Charmaine Lee <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* svga: Fix banded DMA upload unmapThomas Hellstrom2019-10-241-1/+1
| | | | | | | | | | | | | | | | Even with banded DMA uploads, st->hwbuf is always non-NULL, but when we've allocated a software buffer to hold the full upload, unmapping of the hardware buffer has already been done before svga_texture_transfer_unmap_dma(), and the code was performing an unmap of an already mapped buffer. Fix this by testing for software buffer not present. Fixes: a9c4a861d5d ("svga: refactor svga_texture_transfer_map/unmap functions") Signed-off-by: Thomas Hellstrom <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Charmaine Lee <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* gitlab-ci: Update kernel for LAVA jobs to 5.4-rc4Tomeu Vizoso2019-10-242-2/+2
| | | | | | | | | | | Update to 5.4-rc4 so we can test Panfrost on devices with Mali T720 and T820. A bug was found that prevented things working at all on RK3288 devices, so we carry a patch for now in my personal fork. Signed-off-by: Tomeu Vizoso <[email protected]> Acked-by: Daniel Stone <[email protected]>
* glsl: remove propagate_invariance() call from the linkerTimothy Arceri2019-10-241-2/+0
| | | | | | This was added in 586f4a42e78f and became redundant with 34ab9b0947cd Reviewed-by: Marek Olšák <[email protected]>
* nir: improve nir_variable packingTimothy Arceri2019-10-241-1/+3
| | | | | | | | | | | | | Before: /* size: 136, cachelines: 3, members: 10 */ After: /* size: 128, cachelines: 2, members: 10 */ Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* nir: fix nir_variable_data packingTimothy Arceri2019-10-241-8/+8
| | | | | | | | | | | | | Before: /* size: 60, cachelines: 1, members: 29 */ After: /* size: 56, cachelines: 1, members: 29 */ Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* radeonsi/nir: implement pipe_screen::finalize_nirMarek Olšák2019-10-235-19/+32
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* st/mesa: use pipe_screen::finalize_nirMarek Olšák2019-10-236-25/+89
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* tgsi_to_nir: use pipe_screen::finalize_nirMarek Olšák2019-10-231-4/+9
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* gallium: add pipe_screen::finalize_nirMarek Olšák2019-10-235-0/+43
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* st/mesa: update VS shader_info for NIR after lowering passesMarek Olšák2019-10-231-0/+4
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* st/mesa: assign driver locations for VS inputs for NIR before cachingMarek Olšák2019-10-235-9/+15
| | | | | | | fix up edge flags in the NIR pass, because st/mesa doesn't touch the inputs after caching Reviewed-by: Kenneth Graunke <[email protected]>
* st/mesa: don't lower_global_vars_to_local for VS if there are no dead inputsMarek Olšák2019-10-231-2/+7
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* st/mesa: move some NIR lowering before shader cachingMarek Olšák2019-10-233-14/+23
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* util/u_queue: skip util_queue_finish if num_threads is 0Marek Olšák2019-10-231-0/+7
| | | | | | | This fixes a deadlock in pthread_barrier_destroy. Cc: 19.1 19.2 <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* util/disk_cache: finish all queue jobs in destroy instead of killing themMarek Olšák2019-10-231-0/+1
| | | | | | | If there are queued shaders to be written to disk, wait for that. Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* iris: Rework edgeflag handlingKenneth Graunke2019-10-232-7/+28
| | | | | | | | | | We were relying on specific pass ordering in st to avoid setting inputs_read/outputs_written for edge flags. Instead, just assume that it happens and throw out the results we don't want. We should probably revisit this and try and add a vertex element property like I originally wanted so we can avoid having it be associated with the VS altogether.
* gallium/noop: implement get_disk_shader_cache and get_compiler_optionsMarek Olšák2019-10-231-0/+18
| | | | trivial
* aco: take LDS into account when calculating num_wavesRhys Perry2019-10-234-7/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | pipeline-db (Vega): SGPRS: 344 -> 344 (0.00 %) VGPRS: 424 -> 524 (23.58 %) Spilled SGPRs: 84 -> 80 (-4.76 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 52812 -> 52484 (-0.62 %) bytes LDS: 135 -> 135 (0.00 %) blocks Max Waves: 56 -> 53 (-5.36 %) v2: consider WGP, rework to be clearer and apply the "maximum 16 workgroups per CU" limit properly v2: use "SIMD" instead of "EU" v2: fix spiller by introducing "Program::max_waves" v2: rename "lds_size" to "lds_limit" v3: make max_waves actually independant of register usage v3: fix issue where max_waves was way too high v3: use DIV_ROUND_UP(a, b) instead of max(a / b, 1) v3: rename "workgroups_per_cu" to "workgroups_per_cu_wgp" v4: fix typo from "workgroups_per_cu" rename Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Daniel Schürmann <[email protected]> (v3)
* aco: increase accuracy of SGPR limitsRhys Perry2019-10-236-28/+95
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | SGPRs are allocated in groups of 16 on GFX8/GFX9. GFX10 allocates a fixed number of SGPRs and has 106 addressable SGPRs. pipeline-db (Vega): SGPRS: 5912 -> 6232 (5.41 %) VGPRS: 1772 -> 1780 (0.45 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 88228 -> 87904 (-0.37 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 559 -> 571 (2.15 %) piepline-db (Navi): SGPRS: 341256 -> 363384 (6.48 %) VGPRS: 171536 -> 170960 (-0.34 %) Spilled SGPRs: 832 -> 581 (-30.17 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 14207332 -> 14190872 (-0.12 %) bytes LDS: 33 -> 33 (0.00 %) blocks Max Waves: 18072 -> 18251 (0.99 %) v2: unconditionally count vcc as an extra sgpr on GFX10+ v3: pass SGPRs rounded to 8 Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Daniel Schürmann <[email protected]>
* radv: round vgprs/sgprs before calculating max_wavesRhys Perry2019-10-231-4/+8
| | | | | | | | | | | | | | | | | | | | Note that ACO doesn't correctly round SGPR counts on GFX8/GFX9. pipeline-db (ACO/Vega): SGPRS: 11000 -> 11000 (0.00 %) VGPRS: 3120 -> 3120 (0.00 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 164328 -> 164328 (0.00 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 1125 -> 1000 (-11.11 %) v2: consider wave32 Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Daniel Schürmann <[email protected]>
* docs: Add new Intel extensionLionel Landwerlin2019-10-231-0/+1
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Acked-by: Caio Marcelo de Oliveira Filho <[email protected]>
* Revert "vc4: do not report alpha-test as supported"Erik Faye-Lund2019-10-232-4/+14
| | | | | | This reverts commit a79b93269cf340ce4d23b5b34100039bcaafc841. Reviewed-by: Jose Maria Casanova <[email protected]>
* Revert "v3d: do not report alpha-test as supported"Erik Faye-Lund2019-10-233-3/+11
| | | | | | | This reverts commit 9d0523b569bb7208c6e74cafc0f3945415d94336. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Jose Maria Casanova <[email protected]>
* Revert "nir: drop support for using load_alpha_ref_float"Erik Faye-Lund2019-10-231-11/+14
| | | | | | | This reverts commit 5af272b47469398762e984e27f65fc4ecc293d28. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Jose Maria Casanova <[email protected]>
* Revert "nir: drop unused alpha_ref_float"Erik Faye-Lund2019-10-232-0/+2
| | | | | | | This reverts commit e8095f2af0736b5937674ca319f29cc9dabb17d4. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Jose Maria Casanova <[email protected]>
* radv: fix a performance regression with graphics depth/stencil clearsSamuel Pitoiset2019-10-232-25/+123
| | | | | | | | | | | | | | | | | | | | I recently changed the slow depth/stencil clear path to make sure depth values are explicitly exported by the fragment shader. This is actually only useful when VK_EXT_depth_range_unrestricted is enabled. While this path is correct, it introduced a performance regression with Heroes of the Storm, Shadow of Mordor (Vulkan beta) and probably more titles. This is because it prevents the hardware to do some optimizations like discarding fragments. This commit re-introduces the previous (a bit faster) slow depth/stencil clear path and it selects the unrestricted path only if VK_EXT_depth_range_unrestricted is enabled. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/863 Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: fix vkUpdateDescriptorSets with inline uniform blocksSamuel Pitoiset2019-10-231-0/+8
| | | | | | | | | | | | | | | descriptorCount is the number of bytes into the descriptor, so it shouldn't be used as an index. srcArrayElement/dstArrayElement specify the starting byte offset within the binding to copy from/to. This fixes new CTS tests: dEQP-VK.binding_model.descriptor_copy.*.inline_uniform_block_* dEQP-VK.binding_model.descriptor_copy.*.mix_3 dEQP-VK.binding_model.descriptor_copy.*.mix_array1 Fixes: 8d2654a4197 ("radv: Support VK_EXT_inline_uniform_block.") Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/gfx10: fix 3D imagesSamuel Pitoiset2019-10-233-17/+17
| | | | | | | | | | | GFX10 does act like GFX9 actually. This fixes dEQP-VK.glsl.texture_functions.query.texturesize.*sampler3d_*. Cc: 19.2 <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/gfx10: re-enable fast depth/stencil clears with separate aspectsSamuel Pitoiset2019-10-231-3/+2
| | | | | | | | | It used to cause weird issues on GFX10 in the past with vkmark and Wreckfest, and they can't be reproduced now. Shadow Of Mordor (Vulkan beta) hits that path and it works fine. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: do not emit rbplus if attachments are undefinedSamuel Pitoiset2019-10-231-0/+3
| | | | | | | | | | | Fixes some crashes with dEQP-VK.geometry.layered.*.secondary_cmd_buffer on Raven and other chips that allow rbplus. This just prevents a crash and rbplus probaby needs more work. Cc: 19.2 <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add an assertion in radv_gfx10_compute_bin_size()Samuel Pitoiset2019-10-231-0/+1
| | | | | | | To prevent out of bounds access. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: do not create meta pipelines with 16 samplesSamuel Pitoiset2019-10-232-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | The driver only supports up to 8 samples, so it's useless to create more pipelines than needed. This fixes a conditional jump reported by Valgrind on GFX10: ==194282== Conditional jump or move depends on uninitialised value(s) ==194282== at 0xDBF925A: radv_gfx10_compute_bin_size (radv_pipeline.c:3242) ==194282== by 0xDBF95A6: radv_pipeline_generate_binning_state (radv_pipeline.c:3334) ==194282== by 0xDBFC1A0: radv_pipeline_generate_pm4 (radv_pipeline.c:4440) ==194282== by 0xDBFD15E: radv_pipeline_init (radv_pipeline.c:4764) ==194282== by 0xDBFD23E: radv_graphics_pipeline_create (radv_pipeline.c:4788) ==194282== by 0xDBB95A3: create_pipeline (radv_meta_clear.c:114) ==194282== by 0xDBB9AC5: create_color_pipeline (radv_meta_clear.c:297) ==194282== by 0xDBBCF05: radv_device_init_meta_clear_state (radv_meta_clear.c:1277) ==194282== by 0xDB9ACD9: radv_device_init_meta (radv_meta.c:363) ==194282== by 0xDB7FE3A: radv_CreateDevice (radv_device.c:2080 This is caused by an out of bound access of 'fmask_array' (ie. index is 4 as for 16 samples). Cc: <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* anv: implement VK_INTEL_performance_queryLionel Landwerlin2019-10-239-18/+535
| | | | | | | | | | | | | | | | | | | | | v2: Introduce the appropriate pipe controls Properly deal with changes in metric sets (using execbuf parameter) Record marker at query end v3: Fill out PerfCntr1&2 v4: Introduce vkUninitializePerformanceApiINTEL v5: Use new execbuf extension mechanism v6: Fix comments in genX_query.c (Rafael) Use PIPE_CONTROL workarounds (Rafael) Refactor on the last kernel series update (Lionel) v7: Only I915_PERF_IOCTL_CONFIG when perf stream is already opened (Lionel) Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* intel/perf: add mdapi writes for register perf countersLionel Landwerlin2019-10-231-0/+36
| | | | | | | Those are not part of the OA reports. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* intel/genxml: add RPSTAT register for core frequencyLionel Landwerlin2019-10-238-0/+40
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* intel/genxml: add generic perf counters registersLionel Landwerlin2019-10-234-0/+72
| | | | | | | | | | We have 2 of those we can configure to source programmable events. Those are not part of the OA reports. Configuration happens in i915 through the metric set selected by the application. On the Mesa side we'll just sample those and do a diff. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* intel/perf: add support for querying kernel loaded configurationsLionel Landwerlin2019-10-232-27/+181
| | | | | | | We use this as a communication mechanism between MDAPI & Anv. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* drm-uapi: Update headers from drm-nextLionel Landwerlin2019-10-234-9/+141
| | | | | | | | | | | | | Pull new updates from drm-next as of the following commit: commit f1b4a9217efd61d0b84c6dc404596c8519ff6f59 Merge: 400e91347e1d f3a36d469621 Author: Dave Airlie <[email protected]> Date: Tue Oct 22 15:04:00 2019 +1000 Merge tag 'du-next-20191016' of git://linuxtv.org/pinchartl/media into drm-next Signed-off-by: Lionel Landwerlin <[email protected]>
* intel/perf: move registers to their own headerLionel Landwerlin2019-10-236-26/+58
| | | | | | | Will conflict with the genxml RPSTAT register. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* intel/perf: extract register configurationLionel Landwerlin2019-10-233-16/+24
| | | | | | | | We want to query the content of register configurations from the kernel. Let's pull this out of the query. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* intel/perf: expose some utility functionsLionel Landwerlin2019-10-232-31/+55
| | | | | | | | | The Vulkan performance query extension is a bit lower level than the GL one. Expose some of the functions to do the result accumulation directly in the Anv driver. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>