summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* nir/algebraic: Recognize (a < 0 || 0 < b) as min(a, -b) < 0Ian Romanick2019-08-051-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Similar to commit 97e6c1b9 and f5cf74d8ba8c. First apply 0 < b => -b < 0 to get (a < 0 || -b < 0), then apply some pre-existing rules to get min(a, -b) < 0. v2: Substantially update the comment explaining the use of is_used_once and the duplication of patterns. Suggested by Caio. Also, while flt and fge are not commutative, ior and iand are. Half of the original patterns were redundant, so delete them. As alternate justification for deleting them, fmin(a, -b) < 0 <=> 0 < fmax(-a, b). Proof left as an exercise for the reader. All Gen7+ platforms had similar results. (Ice Lake shown) total instructions in shared programs: 16333789 -> 16333713 (<.01%) instructions in affected programs: 11424 -> 11348 (-0.67%) helped: 32 HURT: 0 helped stats (abs) min: 1 max: 7 x̄: 2.38 x̃: 2 helped stats (rel) min: 0.20% max: 1.67% x̄: 0.76% x̃: 0.69% 95% mean confidence interval for instructions value: -3.03 -1.72 95% mean confidence interval for instructions %-change: -0.89% -0.62% Instructions are helped. total cycles in shared programs: 367598295 -> 367596791 (<.01%) cycles in affected programs: 141414 -> 139910 (-1.06%) helped: 23 HURT: 6 helped stats (abs) min: 3 max: 386 x̄: 72.52 x̃: 20 helped stats (rel) min: 0.15% max: 4.86% x̄: 1.01% x̃: 0.76% HURT stats (abs) min: 4 max: 88 x̄: 27.33 x̃: 12 HURT stats (rel) min: 0.22% max: 3.95% x̄: 1.08% x̃: 0.59% 95% mean confidence interval for cycles value: -93.51 -10.21 95% mean confidence interval for cycles %-change: -1.10% -0.05% Cycles are helped. total instructions in shared programs: 10830836 -> 10830779 (<.01%) instructions in affected programs: 6895 -> 6838 (-0.83%) helped: 12 HURT: 0 helped stats (abs) min: 1 max: 14 x̄: 4.75 x̃: 1 helped stats (rel) min: 0.14% max: 1.61% x̄: 0.65% x̃: 0.33% 95% mean confidence interval for instructions value: -8.46 -1.04 95% mean confidence interval for instructions %-change: -1.03% -0.27% Instructions are helped. total cycles in shared programs: 154028477 -> 154032740 (<.01%) cycles in affected programs: 178433 -> 182696 (2.39%) helped: 3 HURT: 9 helped stats (abs) min: 3 max: 20 x̄: 11.00 x̃: 10 helped stats (rel) min: 0.07% max: 0.20% x̄: 0.12% x̃: 0.09% HURT stats (abs) min: 27 max: 1415 x̄: 477.33 x̃: 262 HURT stats (rel) min: 0.22% max: 6.45% x̄: 2.49% x̃: 1.76% 95% mean confidence interval for cycles value: 28.68 681.82 95% mean confidence interval for cycles %-change: 0.37% 3.30% Cycles are HURT. Iron Lake total instructions in shared programs: 8137966 -> 8137992 (<.01%) instructions in affected programs: 3281 -> 3307 (0.79%) helped: 0 HURT: 6 HURT stats (abs) min: 3 max: 7 x̄: 4.33 x̃: 3 HURT stats (rel) min: 0.63% max: 1.01% x̄: 0.76% x̃: 0.64% 95% mean confidence interval for instructions value: 2.17 6.50 95% mean confidence interval for instructions %-change: 0.56% 0.96% Instructions are HURT. total cycles in shared programs: 188539386 -> 188540038 (<.01%) cycles in affected programs: 103826 -> 104478 (0.63%) helped: 0 HURT: 7 HURT stats (abs) min: 16 max: 218 x̄: 93.14 x̃: 80 HURT stats (rel) min: 0.14% max: 0.95% x̄: 0.53% x̃: 0.46% 95% mean confidence interval for cycles value: 10.26 176.02 95% mean confidence interval for cycles %-change: 0.24% 0.81% Cycles are HURT. GM45 total instructions in shared programs: 5008876 -> 5008889 (<.01%) instructions in affected programs: 1645 -> 1658 (0.79%) helped: 0 HURT: 3 HURT stats (abs) min: 3 max: 7 x̄: 4.33 x̃: 3 HURT stats (rel) min: 0.63% max: 1.00% x̄: 0.76% x̃: 0.63% total cycles in shared programs: 128968950 -> 128969426 (<.01%) cycles in affected programs: 64854 -> 65330 (0.73%) helped: 0 HURT: 4 HURT stats (abs) min: 18 max: 218 x̄: 119.00 x̃: 120 HURT stats (rel) min: 0.14% max: 0.95% x̄: 0.60% x̃: 0.66% 95% mean confidence interval for cycles value: -62.92 300.92 95% mean confidence interval for cycles %-change: -0.05% 1.26% Inconclusive result (value mean confidence interval includes 0). Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* nir/algebraic: Replace checks that a value is between (or not) [0, 1]Ian Romanick2019-08-051-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | v2: Add an extra line to one of the proofs. Suggested by Caio. All Gen7+ platforms had similar results. (Ice Lake shown) total instructions in shared programs: 16329772 -> 16329427 (<.01%) instructions in affected programs: 41980 -> 41635 (-0.82%) helped: 110 HURT: 0 helped stats (abs) min: 1 max: 20 x̄: 3.14 x̃: 2 helped stats (rel) min: 0.19% max: 5.56% x̄: 1.12% x̃: 0.94% 95% mean confidence interval for instructions value: -4.10 -2.17 95% mean confidence interval for instructions %-change: -1.28% -0.96% Instructions are helped. total cycles in shared programs: 367551273 -> 367549979 (<.01%) cycles in affected programs: 492462 -> 491168 (-0.26%) helped: 76 HURT: 25 helped stats (abs) min: 1 max: 400 x̄: 42.86 x̃: 12 helped stats (rel) min: 0.06% max: 10.72% x̄: 1.23% x̃: 0.75% HURT stats (abs) min: 2 max: 730 x̄: 78.52 x̃: 16 HURT stats (rel) min: 0.17% max: 6.89% x̄: 2.08% x̃: 1.23% 95% mean confidence interval for cycles value: -37.79 12.16 95% mean confidence interval for cycles %-change: -0.90% 0.07% Inconclusive result (value mean confidence interval includes 0). LOST: 0 GAINED: 2 Sandy Bridge total instructions in shared programs: 10831115 -> 10830836 (<.01%) instructions in affected programs: 37830 -> 37551 (-0.74%) helped: 70 HURT: 0 helped stats (abs) min: 1 max: 20 x̄: 3.99 x̃: 2 helped stats (rel) min: 0.33% max: 7.14% x̄: 1.21% x̃: 0.97% 95% mean confidence interval for instructions value: -5.47 -2.50 95% mean confidence interval for instructions %-change: -1.49% -0.92% Instructions are helped. total cycles in shared programs: 154029323 -> 154028477 (<.01%) cycles in affected programs: 247909 -> 247063 (-0.34%) helped: 52 HURT: 6 helped stats (abs) min: 2 max: 254 x̄: 25.81 x̃: 4 helped stats (rel) min: 0.07% max: 4.39% x̄: 0.81% x̃: 0.19% HURT stats (abs) min: 4 max: 403 x̄: 82.67 x̃: 8 HURT stats (rel) min: 0.18% max: 1.60% x̄: 0.71% x̃: 0.53% 95% mean confidence interval for cycles value: -34.83 5.65 95% mean confidence interval for cycles %-change: -0.98% -0.32% Inconclusive result (value mean confidence interval includes 0). Iron Lake total instructions in shared programs: 8138007 -> 8137966 (<.01%) instructions in affected programs: 4060 -> 4019 (-1.01%) helped: 31 HURT: 0 helped stats (abs) min: 1 max: 2 x̄: 1.32 x̃: 1 helped stats (rel) min: 0.68% max: 8.33% x̄: 1.45% x̃: 0.90% 95% mean confidence interval for instructions value: -1.50 -1.15 95% mean confidence interval for instructions %-change: -2.11% -0.79% Instructions are helped. total cycles in shared programs: 188539492 -> 188539386 (<.01%) cycles in affected programs: 26280 -> 26174 (-0.40%) helped: 25 HURT: 0 helped stats (abs) min: 2 max: 8 x̄: 4.24 x̃: 4 helped stats (rel) min: 0.08% max: 2.11% x̄: 0.54% x̃: 0.50% 95% mean confidence interval for cycles value: -5.08 -3.40 95% mean confidence interval for cycles %-change: -0.70% -0.37% Cycles are helped. GM45 total instructions in shared programs: 5008897 -> 5008876 (<.01%) instructions in affected programs: 2096 -> 2075 (-1.00%) helped: 16 HURT: 0 helped stats (abs) min: 1 max: 2 x̄: 1.31 x̃: 1 helped stats (rel) min: 0.68% max: 7.69% x̄: 1.41% x̃: 0.89% 95% mean confidence interval for instructions value: -1.57 -1.06 95% mean confidence interval for instructions %-change: -2.32% -0.49% Instructions are helped. total cycles in shared programs: 128969020 -> 128968950 (<.01%) cycles in affected programs: 18490 -> 18420 (-0.38%) helped: 15 HURT: 0 helped stats (abs) min: 2 max: 8 x̄: 4.67 x̃: 4 helped stats (rel) min: 0.08% max: 2.11% x̄: 0.51% x̃: 0.48% 95% mean confidence interval for cycles value: -6.03 -3.30 95% mean confidence interval for cycles %-change: -0.78% -0.24% Cycles are helped. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* tgsi_to_nir: fix nir_gather_ssa_types for TGSI->NIR shadersJonathan Marek2019-08-051-5/+13
| | | | | Signed-off-by: Jonathan Marek <[email protected]> Reviewed-By: Timur Kristóf <[email protected]>
* anv: Implement VK_EXT_line_rasterizationJason Ekstrand2019-08-068-12/+211
| | | | Reviewed-by: Lionel Landwerlin <[email protected]>
* genxml: Rename 3DSTATE_SF::Anti-Aliasing EnableJason Ekstrand2019-08-067-7/+7
| | | | | | | This makes it consistent with the new name when it's moved to 3DSTATE_RASTER. Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Use dirty bits for dynamic state trackingJason Ekstrand2019-08-063-23/+64
| | | | | | | | | | Previously, we assumed that the dirty bit was always 1 << VK_DYNAMIC_* and this assumption is about to be false. Extensions which define new VK_DYNAMIC_* enums won't be nice and tightly packed which this really requires. Instead, add functions to don the conversions and rework the bits a bit. Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Advertise the right line width range on gen9 and CHVJason Ekstrand2019-08-061-1/+5
| | | | Reviewed-by: Lionel Landwerlin <[email protected]>
* meson: Add panfrost to the --auto listAlyssa Rosenzweig2019-08-051-1/+1
| | | | | | | | | Look ma, we're a real driver now! I was waiting until Panfrost stabilises a bit for this, but now that 19.2 is almost here, let's make us official :) Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* lima/ppir: enable lower_vector_cmp to lower fall_equalErico Nunes2019-08-051-0/+1
| | | | | Signed-off-by: Erico Nunes <[email protected]> Reviewed-by: Vasily Khoruzhick <[email protected]>
* lima: re-run nir_opt_algebraic after int loweringErico Nunes2019-08-051-0/+15
| | | | | | | | | nir_lower_int_to_float is currently only meant to run once, and some ops must be lowered after being converted from int ops to be implementable, so re-run nir_opt_algebraic after lowering ints to floats. Signed-off-by: Erico Nunes <[email protected]> Reviewed-by: Vasily Khoruzhick <[email protected]>
* pan/midgard: Extend SSA concurrency checks to other argsAlyssa Rosenzweig2019-08-051-13/+12
| | | | | | No glmark changes, but this seems like a good idea. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Rewrite bidirectionally when eliminating movesAlyssa Rosenzweig2019-08-051-3/+2
| | | | | | | | | | | | | | Symptom: the sky is black in SuperTuxKart (flashbacks to SMB/NES emulation intensify). Essentially, what happened is a fixed (special) move to r0 was eliminated but scheduling did not factor this in, so can_run_concurrent_ssa returned true even when there was a logical data dependency that needed to be resolved. Fixes: 20771ede1c0 ("pan/midgard: Add post-RA move elimination") Signed-off-by: Alyssa Rosenzweig <[email protected]>
* intel/compiler: add ability to override shader's assemblyDanylo Piliaiev2019-08-054-12/+85
| | | | | | | | | | | | When dumping shader's assembly with INTEL_DEBUG=vs,tcs,... sha1 of the resulting assembly is also printed, having environment variable INTEL_SHADER_ASM_READ_PATH present driver will try to load a "%sha1%.bin" file from the path and substitute current assembly with the one from the file. Signed-off-by: Danylo Piliaiev <[email protected]> Reviewed-by: Sagar Ghuge <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* intel/tools: add binary output type to i965_asmDanylo Piliaiev2019-08-052-17/+47
| | | | | | | | Add '-t,--type' command line option to specify the output type which can be 'bin', 'c_literal' or 'hex'. Signed-off-by: Danylo Piliaiev <[email protected]> Reviewed-by: Sagar Ghuge <[email protected]>
* panfrost: Add app blacklistAlyssa Rosenzweig2019-08-051-2/+16
| | | | | | | | | | | | In preparation for an initial 19.2 release, add a blacklist for apps known to be buggy under Panfrost to protect users. Panfrost is NOT a conformant implementation at this time. Distros: please do not revert this patch. If blacklisted apps are run using Panfrost, dragons will bite you. Thanks :) Signed-off-by: Alyssa Rosenzweig <[email protected]> Acked-by: Tomeu Vizoso <[email protected]>
* iris: Fix bad external BO hash table and zombie list interactionsKenneth Graunke2019-08-051-12/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | A while ago, we started deferring GEM object closure and VMA release until buffers were idle. This had some unforeseen interactions with external buffers. We keep imported buffers in hash tables, so if we have repeated imports of the same GEM object, we map those to the same iris_bo structure. This is critical for several reasons. Unfortunately, we broke this assumption. When freeing a non-idle external buffer, we would drop it from the hash tables, then move it to the zombie list. If someone reimported the same GEM object, we would not find it in the hash tables, and go ahead and make a second iris_bo for that GEM object. But the old iris_bo would still be in the zombie list, and so we would eventually call GEM_CLOSE on it - closing a BO that should have still been live. To work around this, we defer removing a BO from the hash tables until it's actually fully closed. This has the strange effect that an external BO may be on the zombie list, and yet be resurrected before it can be properly cleaned up. In this case, we remove it from the list so it won't be freed. Fixes severe instability in Weston, which was hitting EINVALs and ENOENTs from execbuf2, due to batches referring to a GEM object that had been closed, or at least had its VMA torched. Fixes: 457a55716ea ("iris: Defer closing and freeing VMA until buffers are idle.")
* iris/bufmgr: Move iris_bo_reference into hash_find_bo, rename itKenneth Graunke2019-08-051-14/+16
| | | | | | Everybody importing an external buffer was looking it up in the hash table, then referencing it. We can just do that in the helper instead, which also gives us a convenient spot to stash extra code shortly.
* gallium: add stm DRM entry pointAhmad Fatoum2019-08-053-0/+3
| | | | | | | | | | The STM32MP157 features a Vivante GC400 GPU supported by etnaviv. Add a DRM entry point for the STM display controller, so mesa can be used with it. Signed-off-by: Ahmad Fatoum <[email protected]> Signed-off-by: Lucas Stach <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* gitlab-ci: don't remove a package we don't install anymoreEric Engestrom2019-08-051-2/+1
| | | | | | Fixes: 85dace1c0b7c1839d121 ("gitlab-ci: remove software-properties-common") Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* etnaviv: fix a null pointer dereferenceAndrii Simiklit2019-08-051-2/+2
| | | | | | | | This issue was found by cppcheck Signed-off-by: Andrii Simiklit <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Lucas Stach <[email protected]>
* ac/nir: Lower large indirect variables to scratchConnor Abbott2019-08-051-0/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | results from radeonsi NIR: Totals from affected shaders: SGPRS: 704 -> 464 (-34.09 %) VGPRS: 2056 -> 672 (-67.32 %) Spilled SGPRs: 24 -> 0 (-100.00 %) Spilled VGPRs: 28406 -> 0 (-100.00 %) Private memory VGPRs: 0 -> 3182 (0.00 %) Scratch size: 1064 -> 3228 (203.38 %) dwords per thread Code Size: 935260 -> 40180 (-95.70 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 28 -> 70 (150.00 %) Wait states: 0 -> 0 (0.00 %) results from radv: Totals from affected shaders: SGPRS: 80 -> 48 (-40.00 %) VGPRS: 204 -> 108 (-47.06 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 256 (0.00 %) dwords per thread Code Size: 15792 -> 9504 (-39.82 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 1 -> 2 (100.00 %) Wait states: 0 -> 0 (0.00 %) Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* drirc: Add discard workaround for Divinity: Original Sin EETimothy Arceri2019-08-051-0/+1
| | | | | | | | This adds an additional work around for the game to fix the blocky shadows as reported in bug 105282 Acked-by: Eric Engestrom <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105282
* lima/ppir: simplify load uni/temp op lowering and schedulingErico Nunes2019-08-042-34/+33
| | | | | | | | | | | | | | | | | | | The load uniform/temporary operations output only to a pipeline register, which must be consumed by another op in the same instruction later. The current implementation delays the decision of who will consume this result to until the scheduling step. If the consumer node is not able to use the pipeline register, a mov node may have to be created, during the scheduler step. As part of the ppir scheduler simplification, and now that the ppir scheduler supports pipeline register dependencies, this can be simplified by always creating a single mov node outputting to a normal register that can be used directly by all consumers. Signed-off-by: Erico Nunes <[email protected]> Reviewed-by: Vasily Khoruzhick <[email protected]> Reviewed-by: Qiang Yu <[email protected]>
* lima/ppir: simplify select op lowering and schedulingErico Nunes2019-08-045-11/+15
| | | | | | | | | | | | | | | | | | | The select operation relies on the select condition coming from the result of the the alu scalar mult slot, in the same instruction. The current implementation creates a mov node to be the predecessor of select, and then relies on an exception during scheduling to ensure that both ops are inserted in the same instruction. Now that the ppir scheduler supports pipeline register dependencies, this can be simplified by making the mov explicitly output to the fmul pipeline register, and the scheduler can place it without an exception. Since the select condition can only be placed in the scalar mult slot, differently than a regular mov, define a separate op for it. Signed-off-by: Erico Nunes <[email protected]> Reviewed-by: Vasily Khoruzhick <[email protected]> Reviewed-by: Qiang Yu <[email protected]>
* lima/ppir: support pipeline registers in schedulerErico Nunes2019-08-042-46/+65
| | | | | | | | | | | | | | | | | | | | | | | | | The ppir scheduler grew to be rather complicated and containing many exceptions as it also has to take care of inserting additional nodes when it is mandatory for nodes to be in the same instruction. As such, the lima lowering and scheduling process can be difficult to understand and maintain. The ppir lowering step created nodes hoping that the scheduler would notice the exception and do the right thing. This proposal adds a simple refactor to the scheduler so that it places nodes with pipeline registers in the same instruction. With the scheduler handling this in a general way, it is possible to create same-instruction dependencies by using pipeline registers during the lowering stage. This is simpler to maintain because now we can make these dependencies explicit in a single place (lowering), and we can drop exceptions from scheduling. Reducing the complexity of the scheduler is also useful as preparatory work to support control flow in ppir. Signed-off-by: Erico Nunes <[email protected]> Reviewed-by: Vasily Khoruzhick <[email protected]> Reviewed-by: Qiang Yu <[email protected]>
* docs: fix "empty array" meson syntaxEric Engestrom2019-08-041-1/+1
| | | | | | | | On recent versions of Meson (0.47+) these are synonymous, but we still support older versions than that, so let's use the correct syntax to avoid confusing users of old Meson versions. Signed-off-by: Eric Engestrom <[email protected]>
* egl: drop unnecessary function derefEric Engestrom2019-08-041-2/+2
| | | | | Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Adam Jackson <[email protected]>
* glx: drop unnecessary pointer deref for function callsEric Engestrom2019-08-044-46/+46
| | | | | Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Adam Jackson <[email protected]>
* introduce c11_compat.h to provide C11 things in C99Eric Engestrom2019-08-042-0/+28
| | | | | | | | Right now, all it does is provide the new standard `static_assert()` name. Fixes: fbf7c38da35afe7f1de0 ("egl/wayland: use bitset.h for `formats` bit set") Signed-off-by: Eric Engestrom <[email protected]> Tested-by: Bhushan Shah <[email protected]>
* travis: add MacOS Scons buildEric Engestrom2019-08-041-6/+31
| | | | | Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* symbols-check: fix `nm` invocation on MacOSEric Engestrom2019-08-041-4/+14
| | | | | | | | | | | | | | | | | | According to Mac OSX's man page [1], this is how we should get the list of exported symbols: nm -g -P foo.dylib -g to only show the exported symbols -P to show it in a "portable" format, ie. readable by a script Since this is supported by GNU nm as well, let's use that everywhere, although some care needs to be taken as there are some differences in the output. [1] https://www.unix.com/man-page/osx/1/nm/ Signed-off-by: Eric Engestrom <[email protected]> Tested-by: Vinson Lee <[email protected]>
* symbols-check: discard platform symbols earlyEric Engestrom2019-08-041-2/+2
| | | | | | | (as the comment there already claimed) Signed-off-by: Eric Engestrom <[email protected]> Tested-by: Vinson Lee <[email protected]>
* symbols-check: skip test if we can't get the symbols listEric Engestrom2019-08-041-1/+6
| | | | | Signed-off-by: Eric Engestrom <[email protected]> Tested-by: Vinson Lee <[email protected]>
* lima/ppir: move alu vec to scalar lowering into NIRVasily Khoruzhick2019-08-042-107/+14
| | | | | | | | | Utgard PP is vec4, but some operations are scalar, utilize NIR vec to scalar lowering pass and indicate operations that we want to lower. Reviewed-by: Qiang Yu <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]>
* iris: Fix handling of SIMD32 fragment shadersJason Ekstrand2019-08-031-44/+50
| | | | | | | | | | | | | | The brw_wm_prog_data_dispatch_grf_start_reg and _prog_offset helpers read the _NPixelDispatchEnable fields from 3DSTATE_PS to figure out which bits to pull out of the prog data and stuff where. Therefore, they need to be called with the final set of _NPixelDispatchEnable bits after we've done the workaround for SIMD32 and 16x MSAA. Otherwise, if you end up with a somewhat odd combination of enables, the GRF start reg and KSP data ends up in the wrong slots. In particular, running SIMD32-only is broken but several other combinations are as well. Fixes: 5445c176e27ba "iris: Disable SIMD32 when using a 16x MSAA..." Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: Rename GLX_USE_TLS to USE_ELF_TLS.Bas Nieuwenhuizen2019-08-0318-57/+57
| | | | | | | These days it is not GLX only and it does not work with all TLS implementations. Reviewed-by: Eric Engestrom <[email protected]>
* meson: Do not use GLX_USE_TLS on Android.Bas Nieuwenhuizen2019-08-031-1/+5
| | | | | | | | | | | | The asm code expects a specific kind of implementation, but Android uses something different (emutls). Turns out mesa has a fallback with pthread_getspecific, with an optimizaiton if only a single thread is used. emutls also uses getspecific, so lets just use the optimized mesa implementation. Fixes: 20294dceebc "mesa: Enable asm unconditionally, now that gen_matypes is gone." Reviewed-by: Eric Engestrom <[email protected]>
* etnaviv: s/boolean/boolChristian Gmeiner2019-08-032-2/+2
| | | | | Signed-off-by: Christian Gmeiner <[email protected]> Reviewed-by: Philipp Zabel <[email protected]>
* lima/ppir: Add gl_FrontFace handlingAndreas Baierl2019-08-036-13/+35
| | | | | Signed-off-by: Andreas Baierl <[email protected]> Reviewed-by: Qiang Yu <[email protected]>
* intel/nir: Add 1-bit opcodes to brw_cmod_for_nir_comparison_opJason Ekstrand2019-08-031-0/+10
| | | | Reviewed-by: Matt Turner <[email protected]>
* intel/nir: Add a common nir comparison -> cmod helperJason Ekstrand2019-08-034-82/+47
| | | | | | We already had one in the vec4 code, we just had move it. Reviewed-by: Matt Turner <[email protected]>
* util: fix pointer type on NetBSDEric Engestrom2019-08-031-1/+1
| | | | | | | | | | | NetBSD expects a `void *` argument [1] as the printf-style arguments to the formatting string, so we need to cast the `const` away. [1] https://netbsd.gw.com/cgi-bin/man-cgi?pthread_setname_np++NetBSD-current Suggested-by: Kamil Rytarowski <[email protected]> Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* meson: remove unused fieldEric Engestrom2019-08-031-9/+9
| | | | | | Signed-off-by: Eric Engestrom <[email protected]> Acked-by: Eric Anholt <[email protected]> Tested-by: Vinson Lee <[email protected]>
* meson: replace last uses of libxmlconfig with idep_xmlconfigEric Engestrom2019-08-038-16/+18
| | | | | | Signed-off-by: Eric Engestrom <[email protected]> Acked-by: Eric Anholt <[email protected]> Tested-by: Vinson Lee <[email protected]>
* meson: drop unused dep_{thread,dl}Eric Engestrom2019-08-0325-29/+25
| | | | | | | | Unused as of last commit. Signed-off-by: Eric Engestrom <[email protected]> Acked-by: Eric Anholt <[email protected]> Tested-by: Vinson Lee <[email protected]>
* meson: replace libmesa_util with idep_mesautilEric Engestrom2019-08-0351-125/+124
| | | | | | | | | | | This automates the include_directories and dependencies tracking so that all users of libmesa_util don't need to add them manually. Next commit will remove the ones that were only added for that reason. Signed-off-by: Eric Engestrom <[email protected]> Acked-by: Eric Anholt <[email protected]> Tested-by: Vinson Lee <[email protected]>
* pan/midgard: Print texture outmodAlyssa Rosenzweig2019-08-022-4/+8
| | | | | | I have no idea who thought this was a good idea. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Promote all 16 uniformsAlyssa Rosenzweig2019-08-023-9/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that register spilling is in place, this is reasonable. It turns out for some shaders, it's actually better to cap at 8 work registers and extra >8 uniform reigsters and tolerate the spilling, since the extra resulting threads make up for the spillage. So incidentally, the shader that spills here is in -bterrain, which jumps from 19fps to 21fps as a result of this change. total instructions in shared programs: 3513 -> 3448 (-1.85%) instructions in affected programs: 776 -> 711 (-8.38%) helped: 20 HURT: 0 helped stats (abs) min: 1 max: 8 x̄: 3.25 x̃: 2 helped stats (rel) min: 3.57% max: 16.00% x̄: 8.37% x̃: 7.19% 95% mean confidence interval for instructions value: -4.28 -2.22 95% mean confidence interval for instructions %-change: -10.02% -6.73% Instructions are helped. total bundles in shared programs: 2067 -> 2024 (-2.08%) bundles in affected programs: 515 -> 472 (-8.35%) helped: 19 HURT: 1 helped stats (abs) min: 1 max: 6 x̄: 2.37 x̃: 2 helped stats (rel) min: 2.13% max: 17.86% x̄: 10.19% x̃: 11.11% HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 3.23% max: 3.23% x̄: 3.23% x̃: 3.23% 95% mean confidence interval for bundles value: -3.01 -1.29 95% mean confidence interval for bundles %-change: -12.13% -6.91% Bundles are helped. total quadwords in shared programs: 3468 -> 3426 (-1.21%) quadwords in affected programs: 764 -> 722 (-5.50%) helped: 19 HURT: 1 helped stats (abs) min: 1 max: 5 x̄: 2.26 x̃: 2 helped stats (rel) min: 1.41% max: 12.50% x̄: 6.76% x̃: 7.14% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 1.08% max: 1.08% x̄: 1.08% x̃: 1.08% 95% mean confidence interval for quadwords value: -2.83 -1.37 95% mean confidence interval for quadwords %-change: -8.08% -4.65% Quadwords are helped. total registers in shared programs: 383 -> 360 (-6.01%) registers in affected programs: 112 -> 89 (-20.54%) helped: 19 HURT: 0 helped stats (abs) min: 1 max: 3 x̄: 1.21 x̃: 1 helped stats (rel) min: 12.50% max: 27.27% x̄: 20.63% x̃: 20.00% 95% mean confidence interval for registers value: -1.47 -0.95 95% mean confidence interval for registers %-change: -22.39% -18.87% Registers are helped. total threads in shared programs: 432 -> 451 (4.40%) threads in affected programs: 19 -> 38 (100.00%) helped: 11 HURT: 0 helped stats (abs) min: 1 max: 2 x̄: 1.73 x̃: 2 helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00% 95% mean confidence interval for threads value: 1.41 2.04 95% mean confidence interval for threads %-change: 100.00% 100.00% Threads are [helped]. total loops in shared programs: 4 -> 4 (0.00%) loops in affected programs: 0 -> 0 helped: 0 HURT: 0 total spills in shared programs: 0 -> 4 spills in affected programs: 0 -> 4 helped: 0 HURT: 2 total fills in shared programs: 0 -> 7 fills in affected programs: 0 -> 7 helped: 0 HURT: 2 Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Break mir_spill_register into its functionAlyssa Rosenzweig2019-08-021-117/+129
| | | | | | | No functional changes, just breaks out a megamonster function and fixes the indentation. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Switch sources to an array for trinary sourcesAlyssa Rosenzweig2019-08-0212-145/+133
| | | | | | | | | We need three independent sources to support indirect SSBO writes (as well as textures with both LOD/bias and offsets). Now is a good time to make sources just an array so we don't have to rewrite a ton of code if we ever needed a fourth source for some reason. Signed-off-by: Alyssa Rosenzweig <[email protected]>