summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* spirv: Rework barriersJason Ekstrand2018-03-071-18/+114
| | | | | | | | | | Our previous handling of barriers always used the big hammer and didn't correctly emit memory barriers when specified along with a control barrier. This commit completely reworks the way we emit barriers to make things both more precise and more correct. Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* spirv: Add a vtn_constant_value helperJason Ekstrand2018-03-071-0/+6
| | | | | Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* radeonsi: remove si_llvm_add_attributeMarek Olšák2018-03-073-25/+16
|
* radeonsi: fix passing address32_hi to LLVM for high valuesMarek Olšák2018-03-073-5/+8
| | | | The old function treats high values as negative, which LLVM interprets as 0.
* radeonsi: assume has_virtual_memory == trueMarek Olšák2018-03-072-34/+18
|
* radeonsi: add/update assertions for 32-bit address spaceMarek Olšák2018-03-072-3/+18
|
* radeonsi: prevent a negative buffer offset in si_upload_descriptorsMarek Olšák2018-03-071-4/+3
|
* radeonsi: properly extract a buffer address from a descriptorMarek Olšák2018-03-071-1/+7
|
* radeonsi: fix vertex buffer address computation with full 64-bit addressesMarek Olšák2018-03-071-3/+3
|
* radeonsi: mask out high VM address bits in registers where neededMarek Olšák2018-03-073-22/+24
|
* radv: Add entrypoints generation with the new vk.xmlBas Nieuwenhuizen2018-03-071-107/+164
| | | | | | A lot of it is based on intel again. Reviewed-by: Dave Airlie <[email protected]>
* glsl: Fix memory leak with known glsl_type instancesSimon Hausmann2018-03-072-87/+47
| | | | | | | | | | | | | | | | | | When looking up known glsl_type instances in the various hash tables, we end up leaking the key instances used for the lookup, as the glsl_type constructor allocates memory on the global mem_ctx. This patch changes glsl_type to manage its own memory, which fixes the leak and also allows getting rid of the global mem_ctx and its mutex. v2: remove lambda usage (Tapani) (+keep ASSERT_BITFIELD_SIZE, modify dummy ctor to initialize mem_ctx) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104884 Cc: [email protected] Signed-off-by: Simon Hausmann <[email protected]> Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* spirv: Add SpvCapabilityShaderViewportIndexLayerEXTCaio Marcelo de Oliveira Filho2018-03-073-0/+13
| | | | | | | | | | | | | | This capability allows gl_ViewportIndex and gl_Layer to also be used as outputs in Vertex and Tesselation shaders. v2: Make conditional to the capability, add gl_Layer, add tesselation shaders. (Iago) v3: Don't export to tesselation control shader. v4: Add Reviewd-by tag. Reviewed-by: Iago Toral Quiroga <[email protected]>
* android: anv: add libmesa_intel_dev static dependencyMauro Rossi2018-03-071-0/+1
| | | | | | | | | | | | Fixes the following building errors: external/mesa/src/intel/vulkan/anv_device.c:300: error: undefined reference to 'gen_get_pci_device_id_override' external/mesa/src/intel/vulkan/anv_device.c:312: error: undefined reference to 'gen_get_device_name' external/mesa/src/intel/vulkan/anv_device.c:313: error: undefined reference to 'gen_get_device_info' clang.real: error: linker command failed with exit code 1 (use -v to see invocation) Fixes: 272bef0601a "intel: Split gen_device_info out into libintel_dev" Reviewed-by: Tapani Pälli <[email protected]>
* Revert "nir: bump loop unroll limit to 96."Timothy Arceri2018-03-071-3/+1
| | | | | | | | | | | | | This reverts commit 2d36efdb7f18f061c519dbb93f6058bf161aad33. This raised limit turns out to harmful for more complex shaders, it causes excessive spilling in some Bioshock Infinite shaders. The fps for the ssao demo on radv remains unchanged when reverting this. Reviewed-by: Dave Airlie <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* ac/nir: don't put lod into args if it's zero.Dave Airlie2018-03-071-2/+1
| | | | | | | | | | | If it's zero but put it in args we still end up consuming a register for it. This fixes some spilling in the NIR paths in Dirt Rally that isn't seen with TGSI. Reviewed-by: Timothy Arceri <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* freedreno: bump required libdrm versionChristian Gmeiner2018-03-062-2/+2
| | | | | | | | | Fixes: 26a9321d0a "freedreno: add global_bindings state" Signed-off-by: Christian Gmeiner <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Reviewed-by: Rob Clark <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* nir: Simplify some comparisons like a+b < aIan Romanick2018-03-061-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | All Gen7+ platforms had similar results. (Skylake shown) total instructions in shared programs: 14514555 -> 14514547 (<.01%) instructions in affected programs: 1972 -> 1964 (-0.41%) helped: 8 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.39% max: 0.42% x̄: 0.41% x̃: 0.41% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -0.41% -0.40% Instructions are helped. total cycles in shared programs: 533141444 -> 533136780 (<.01%) cycles in affected programs: 164728 -> 160064 (-2.83%) helped: 181 HURT: 3 helped stats (abs) min: 2 max: 94 x̄: 26.17 x̃: 30 helped stats (rel) min: 0.12% max: 5.33% x̄: 3.42% x̃: 3.80% HURT stats (abs) min: 4 max: 54 x̄: 24.00 x̃: 14 HURT stats (rel) min: 0.20% max: 2.39% x̄: 1.09% x̃: 0.68% 95% mean confidence interval for cycles value: -27.12 -23.58 95% mean confidence interval for cycles %-change: -3.54% -3.16% Cycles are helped. Sandy Bridge total instructions in shared programs: 10533667 -> 10533539 (<.01%) instructions in affected programs: 10148 -> 10020 (-1.26%) helped: 124 HURT: 0 helped stats (abs) min: 1 max: 2 x̄: 1.03 x̃: 1 helped stats (rel) min: 0.39% max: 4.35% x̄: 2.20% x̃: 2.04% 95% mean confidence interval for instructions value: -1.06 -1.00 95% mean confidence interval for instructions %-change: -2.46% -1.95% Instructions are helped. total cycles in shared programs: 146136887 -> 146132122 (<.01%) cycles in affected programs: 206382 -> 201617 (-2.31%) helped: 171 HURT: 0 helped stats (abs) min: 2 max: 40 x̄: 27.87 x̃: 30 helped stats (rel) min: 0.08% max: 5.73% x̄: 2.98% x̃: 2.67% 95% mean confidence interval for cycles value: -29.19 -26.54 95% mean confidence interval for cycles %-change: -3.20% -2.76% Cycles are helped. Iron Lake total instructions in shared programs: 7886515 -> 7886507 (<.01%) instructions in affected programs: 3016 -> 3008 (-0.27%) helped: 8 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.25% max: 0.28% x̄: 0.27% x̃: 0.27% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -0.27% -0.26% Instructions are helped. total cycles in shared programs: 178100396 -> 178100388 (<.01%) cycles in affected programs: 156128 -> 156120 (<.01%) helped: 4 HURT: 4 helped stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4 helped stats (rel) min: 0.02% max: 0.04% x̄: 0.03% x̃: 0.03% HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: <.01% max: 0.01% x̄: <.01% x̃: <.01% 95% mean confidence interval for cycles value: -3.68 1.68 95% mean confidence interval for cycles %-change: -0.03% <.01% Inconclusive result (value mean confidence interval includes 0). GM45 total instructions in shared programs: 4857872 -> 4857868 (<.01%) instructions in affected programs: 1544 -> 1540 (-0.26%) helped: 4 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.25% max: 0.27% x̄: 0.26% x̃: 0.26% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -0.28% -0.24% Instructions are helped. total cycles in shared programs: 122167654 -> 122167662 (<.01%) cycles in affected programs: 96248 -> 96256 (<.01%) helped: 0 HURT: 4 HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: <.01% max: 0.01% x̄: <.01% x̃: <.01% 95% mean confidence interval for cycles value: 2.00 2.00 95% mean confidence interval for cycles %-change: <.01% 0.02% Cycles are HURT. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* nir: Use De Morgan's Law on logic compounded comparisonsIan Romanick2018-03-061-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The replacement of the comparison operators must happen during this step. If it does not, the next pass of nir_opt_algebraic will reapply De Morgan's Law in the "opposite direction" before performing dead code elimination. The resulting infinite loop will eventually get OOM killed. Haswell, Broadwell, and Skylake had similar results. (Broadwell shown) total instructions in shared programs: 14808185 -> 14808036 (<.01%) instructions in affected programs: 13758 -> 13609 (-1.08%) helped: 39 HURT: 0 helped stats (abs) min: 1 max: 10 x̄: 3.82 x̃: 3 helped stats (rel) min: 0.44% max: 1.55% x̄: 0.98% x̃: 1.01% 95% mean confidence interval for instructions value: -4.67 -2.97 95% mean confidence interval for instructions %-change: -1.09% -0.88% Instructions are helped. total cycles in shared programs: 559438333 -> 559435832 (<.01%) cycles in affected programs: 199160 -> 196659 (-1.26%) helped: 42 HURT: 3 helped stats (abs) min: 2 max: 184 x̄: 61.50 x̃: 51 helped stats (rel) min: 0.02% max: 6.94% x̄: 1.41% x̃: 1.40% HURT stats (abs) min: 2 max: 40 x̄: 27.33 x̃: 40 HURT stats (rel) min: 0.05% max: 0.74% x̄: 0.51% x̃: 0.74% 95% mean confidence interval for cycles value: -71.47 -39.69 95% mean confidence interval for cycles %-change: -1.64% -0.93% Cycles are helped. Sandy Bridge and Ivy Bridge had similar results. (Ivy Bridge shown) total instructions in shared programs: 11811776 -> 11811553 (<.01%) instructions in affected programs: 15201 -> 14978 (-1.47%) helped: 39 HURT: 0 helped stats (abs) min: 1 max: 20 x̄: 5.72 x̃: 6 helped stats (rel) min: 0.44% max: 2.53% x̄: 1.30% x̃: 1.26% 95% mean confidence interval for instructions value: -7.21 -4.23 95% mean confidence interval for instructions %-change: -1.48% -1.12% Instructions are helped. total cycles in shared programs: 257617270 -> 257614589 (<.01%) cycles in affected programs: 212107 -> 209426 (-1.26%) helped: 45 HURT: 0 helped stats (abs) min: 2 max: 180 x̄: 59.58 x̃: 54 helped stats (rel) min: 0.02% max: 6.02% x̄: 1.30% x̃: 1.32% 95% mean confidence interval for cycles value: -74.02 -45.14 95% mean confidence interval for cycles %-change: -1.59% -1.01% Cycles are helped. Iron Lake total instructions in shared programs: 7886648 -> 7886515 (<.01%) instructions in affected programs: 14106 -> 13973 (-0.94%) helped: 29 HURT: 0 helped stats (abs) min: 1 max: 10 x̄: 4.59 x̃: 4 helped stats (rel) min: 0.35% max: 1.83% x̄: 0.90% x̃: 0.81% 95% mean confidence interval for instructions value: -5.65 -3.52 95% mean confidence interval for instructions %-change: -1.03% -0.76% Instructions are helped. total cycles in shared programs: 178100812 -> 178100396 (<.01%) cycles in affected programs: 67970 -> 67554 (-0.61%) helped: 29 HURT: 0 helped stats (abs) min: 2 max: 40 x̄: 14.34 x̃: 12 helped stats (rel) min: 0.15% max: 1.69% x̄: 0.58% x̃: 0.54% 95% mean confidence interval for cycles value: -18.30 -10.39 95% mean confidence interval for cycles %-change: -0.71% -0.45% Cycles are helped. GM45 total instructions in shared programs: 4857939 -> 4857872 (<.01%) instructions in affected programs: 7426 -> 7359 (-0.90%) helped: 15 HURT: 0 helped stats (abs) min: 1 max: 10 x̄: 4.47 x̃: 4 helped stats (rel) min: 0.33% max: 1.80% x̄: 0.87% x̃: 0.77% 95% mean confidence interval for instructions value: -6.06 -2.87 95% mean confidence interval for instructions %-change: -1.06% -0.67% Instructions are helped. total cycles in shared programs: 122167930 -> 122167654 (<.01%) cycles in affected programs: 43118 -> 42842 (-0.64%) helped: 15 HURT: 0 helped stats (abs) min: 4 max: 40 x̄: 18.40 x̃: 16 helped stats (rel) min: 0.15% max: 1.69% x̄: 0.62% x̃: 0.54% 95% mean confidence interval for cycles value: -25.03 -11.77 95% mean confidence interval for cycles %-change: -0.82% -0.41% Cycles are helped. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* nir: Replace fmin(b2f(a), b) with a bcselIan Romanick2018-03-061-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | All of the affected shaders are HDR mappers from Serious Sam 3. All Gen7+ platforms had similar results. (Skylake shown) total instructions in shared programs: 14516285 -> 14516273 (<.01%) instructions in affected programs: 348 -> 336 (-3.45%) helped: 12 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 2.08% max: 6.67% x̄: 4.31% x̃: 4.17% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -5.55% -3.06% Instructions are helped. total cycles in shared programs: 533163876 -> 533163808 (<.01%) cycles in affected programs: 1144 -> 1076 (-5.94%) helped: 4 HURT: 0 helped stats (abs) min: 16 max: 18 x̄: 17.00 x̃: 17 helped stats (rel) min: 5.80% max: 6.08% x̄: 5.94% x̃: 5.94% 95% mean confidence interval for cycles value: -18.84 -15.16 95% mean confidence interval for cycles %-change: -6.20% -5.68% Cycles are helped. Sandy Bridge total instructions in shared programs: 10533321 -> 10533309 (<.01%) instructions in affected programs: 372 -> 360 (-3.23%) helped: 12 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 2.00% max: 5.88% x̄: 3.91% x̃: 3.85% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -4.96% -2.86% Instructions are helped. total cycles in shared programs: 146136632 -> 146136428 (<.01%) cycles in affected programs: 11668 -> 11464 (-1.75%) helped: 12 HURT: 0 helped stats (abs) min: 16 max: 18 x̄: 17.00 x̃: 17 helped stats (rel) min: 0.99% max: 3.44% x̄: 2.20% x̃: 2.29% 95% mean confidence interval for cycles value: -17.66 -16.34 95% mean confidence interval for cycles %-change: -2.82% -1.58% Cycles are helped. Iron Lake total instructions in shared programs: 7886301 -> 7886277 (<.01%) instructions in affected programs: 576 -> 552 (-4.17%) helped: 12 HURT: 0 helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 helped stats (rel) min: 2.94% max: 6.06% x̄: 4.51% x̃: 4.65% 95% mean confidence interval for instructions value: -2.00 -2.00 95% mean confidence interval for instructions %-change: -5.30% -3.72% Instructions are helped. total cycles in shared programs: 178113176 -> 178113176 (0.00%) cycles in affected programs: 2116 -> 2116 (0.00%) helped: 2 HURT: 4 helped stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4 helped stats (rel) min: 1.14% max: 1.14% x̄: 1.14% x̃: 1.14% HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 0.50% max: 0.65% x̄: 0.58% x̃: 0.58% 95% mean confidence interval for cycles value: -3.25 3.25 95% mean confidence interval for cycles %-change: -0.93% 0.94% Inconclusive result (value mean confidence interval includes 0). GM45 total instructions in shared programs: 4857756 -> 4857744 (<.01%) instructions in affected programs: 294 -> 282 (-4.08%) helped: 6 HURT: 0 helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 helped stats (rel) min: 2.94% max: 5.71% x̄: 4.40% x̃: 4.55% 95% mean confidence interval for instructions value: -2.00 -2.00 95% mean confidence interval for instructions %-change: -5.71% -3.09% Instructions are helped. total cycles in shared programs: 122178730 -> 122178722 (<.01%) cycles in affected programs: 700 -> 692 (-1.14%) helped: 2 HURT: 0 Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* nir: Pull b2f out of bcselIan Romanick2018-03-061-0/+1
| | | | | | | | | | | | | | | | All platforms had similar results. (Skylake shown) total instructions in shared programs: 14516592 -> 14516586 (<.01%) instructions in affected programs: 500 -> 494 (-1.20%) helped: 2 HURT: 0 total cycles in shared programs: 533167044 -> 533166998 (<.01%) cycles in affected programs: 6988 -> 6942 (-0.66%) helped: 2 HURT: 0 Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* nir: Replace an odd comparison involving fmin of -b2fIan Romanick2018-03-061-0/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I noticed the fge version while looking at a shader for an unrelated reason. The feq version prevents a regression in a later change that performs strength reduction of some compares. Broadwell and Skylake had similar results. (Skylake shown) total instructions in shared programs: 14514808 -> 14514796 (<.01%) instructions in affected programs: 750 -> 738 (-1.60%) helped: 4 HURT: 0 helped stats (abs) min: 1 max: 5 x̄: 3.00 x̃: 3 helped stats (rel) min: 0.83% max: 1.96% x̄: 1.40% x̃: 1.40% 95% mean confidence interval for instructions value: -6.67 0.67 95% mean confidence interval for instructions %-change: -2.43% -0.36% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 533144939 -> 533144853 (<.01%) cycles in affected programs: 8911 -> 8825 (-0.97%) helped: 4 HURT: 0 helped stats (abs) min: 16 max: 32 x̄: 21.50 x̃: 19 helped stats (rel) min: 0.60% max: 1.89% x̄: 1.28% x̃: 1.31% 95% mean confidence interval for cycles value: -32.94 -10.06 95% mean confidence interval for cycles %-change: -2.30% -0.26% Cycles are helped. Haswell total instructions in shared programs: 13093785 -> 13093775 (<.01%) instructions in affected programs: 924 -> 914 (-1.08%) helped: 4 HURT: 2 helped stats (abs) min: 1 max: 5 x̄: 3.00 x̃: 3 helped stats (rel) min: 0.82% max: 1.95% x̄: 1.39% x̃: 1.39% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 1.19% max: 1.19% x̄: 1.19% x̃: 1.19% 95% mean confidence interval for instructions value: -4.53 1.20 95% mean confidence interval for instructions %-change: -2.02% 0.97% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 409580553 -> 409580118 (<.01%) cycles in affected programs: 10909 -> 10474 (-3.99%) helped: 5 HURT: 1 helped stats (abs) min: 6 max: 222 x̄: 89.60 x̃: 18 helped stats (rel) min: 0.16% max: 24.72% x̄: 9.54% x̃: 1.78% HURT stats (abs) min: 13 max: 13 x̄: 13.00 x̃: 13 HURT stats (rel) min: 0.39% max: 0.39% x̄: 0.39% x̃: 0.39% 95% mean confidence interval for cycles value: -180.68 35.68 95% mean confidence interval for cycles %-change: -19.55% 3.79% Inconclusive result (value mean confidence interval includes 0). Ivy Bridge total instructions in shared programs: 11811851 -> 11811840 (<.01%) instructions in affected programs: 1032 -> 1021 (-1.07%) helped: 5 HURT: 1 helped stats (abs) min: 1 max: 5 x̄: 2.40 x̃: 1 helped stats (rel) min: 0.63% max: 1.95% x̄: 1.13% x̃: 0.97% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 1.19% max: 1.19% x̄: 1.19% x̃: 1.19% 95% mean confidence interval for instructions value: -4.17 0.51 95% mean confidence interval for instructions %-change: -1.86% 0.36% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 257618403 -> 257618168 (<.01%) cycles in affected programs: 10784 -> 10549 (-2.18%) helped: 4 HURT: 2 helped stats (abs) min: 4 max: 220 x̄: 64.50 x̃: 17 helped stats (rel) min: 0.50% max: 24.34% x̄: 7.07% x̃: 1.72% HURT stats (abs) min: 9 max: 14 x̄: 11.50 x̃: 11 HURT stats (rel) min: 0.24% max: 0.42% x̄: 0.33% x̃: 0.33% 95% mean confidence interval for cycles value: -133.11 54.78 95% mean confidence interval for cycles %-change: -14.79% 5.59% Inconclusive result (value mean confidence interval includes 0). GM45, Iron Lake, and Sandy Bridge had similar results. (Sandy Bridge shown) total instructions in shared programs: 10533871 -> 10533859 (<.01%) instructions in affected programs: 865 -> 853 (-1.39%) helped: 4 HURT: 0 helped stats (abs) min: 1 max: 5 x̄: 3.00 x̃: 3 helped stats (rel) min: 0.63% max: 1.83% x̄: 1.22% x̃: 1.21% 95% mean confidence interval for instructions value: -6.67 0.67 95% mean confidence interval for instructions %-change: -2.16% -0.29% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 146139904 -> 146139852 (<.01%) cycles in affected programs: 15213 -> 15161 (-0.34%) helped: 4 HURT: 0 helped stats (abs) min: 3 max: 18 x̄: 13.00 x̃: 15 helped stats (rel) min: 0.15% max: 0.84% x̄: 0.39% x̃: 0.29% 95% mean confidence interval for cycles value: -23.79 -2.21 95% mean confidence interval for cycles %-change: -0.88% 0.09% Inconclusive result (%-change mean confidence interval includes 0). Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* nir: Mark bcsel-to-fmin (or fmax) transformations as inexactIan Romanick2018-03-061-2/+2
| | | | | | | | | | | | | | These transformations are inexact because section 4.7.1 (Range and Precision) says: Operations and built-in functions that operate on a NaN are not required to return a NaN as the result. The fmin or fmax might not return NaN in cases where the original expression would be required to return NaN. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Recognize some more open-coded fmin / fmaxIan Romanick2018-03-061-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This transformation is inexact because section 4.7.1 (Range and Precision) says: Operations and built-in functions that operate on a NaN are not required to return a NaN as the result. The fmin or fmax might not return NaN in cases where the original expression would be required to return NaN. v2: Reorder operands and mark as inexact. The latter suggested by Jason. shader-db results: Haswell, Broadwell, and Skylake had similar results. (Skylake shown) total instructions in shared programs: 14514817 -> 14514808 (<.01%) instructions in affected programs: 229 -> 220 (-3.93%) helped: 3 HURT: 0 helped stats (abs) min: 1 max: 4 x̄: 3.00 x̃: 4 helped stats (rel) min: 2.86% max: 4.12% x̄: 3.70% x̃: 4.12% total cycles in shared programs: 533145211 -> 533144939 (<.01%) cycles in affected programs: 37268 -> 36996 (-0.73%) helped: 8 HURT: 0 helped stats (abs) min: 2 max: 134 x̄: 34.00 x̃: 2 helped stats (rel) min: 0.02% max: 14.22% x̄: 3.53% x̃: 0.05% Sandy Bridge and Ivy Bridge had similar results. (Ivy Bridge shown) total cycles in shared programs: 257618409 -> 257618403 (<.01%) cycles in affected programs: 12582 -> 12576 (-0.05%) helped: 3 HURT: 0 helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 helped stats (rel) min: 0.05% max: 0.05% x̄: 0.05% x̃: 0.05% No changes on Iron Lake or GM45. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* st/omx/tizonia/h264d: Add EGLImage supportGurkirpal Singh2018-03-067-6/+227
| | | | | | | | Example Gstreamer pipeline : MESA_ENABLE_OMX_EGLIMAGE=1 GST_GL_API=gles2 GST_GL_PLATFORM=egl gst-launch-1.0 filesrc location=movie.mp4 ! qtdemux ! h264parse ! omxh264dec ! glimagesink Acked-by: Leo Liu <[email protected]> Reviewed-by: Julien Isorce <[email protected]>
* st/omx/tizonia: Add H.264 encoderGurkirpal Singh2018-03-0618-403/+2106
| | | | | | | | | | v2: Refactor out screen functions to st/omx Example Gstreamer pipeline : gst-launch-1.0 filesrc location=movie.mp4 ! qtdemux ! h264parse ! avdec_h264 ! videoconvert ! omxh264enc ! h264parse ! avdec_h264 ! videoconvert ! ximagesink Acked-by: Leo Liu <[email protected]> Reviewed-by: Julien Isorce <[email protected]>
* st/omx/tizonia: Add H.264 decoderGurkirpal Singh2018-03-0625-1286/+2795
| | | | | | | | | | v2: Refactor out screen functions to st/omx Example Gstreamer pipeline : gst-launch-1.0 filesrc location=movie.mp4 ! qtdemux ! h264parse ! omxh264dec ! videoconvert ! ximagesink Acked-by: Leo Liu <[email protected]> Reviewed-by: Julien Isorce <[email protected]>
* st/omx/tizonia: Add entrypointGurkirpal Singh2018-03-064-2/+78
| | | | | | | Adds base files for adding components Acked-by: Leo Liu <[email protected]> Reviewed-by: Julien Isorce <[email protected]>
* st/omx/tizonia: Add --enable-omx-tizonia flag and build filesGurkirpal Singh2018-03-0610-7/+128
| | | | | | | | | | | | Allow only bellagio or tizonia to be used at the same time. Detect tizonia package config file Generate libomx_mesa.so and install it to libtizcore.pc::pluginsdir Only compile empty source (target.c) for now. GSoC Project link: https://summerofcode.withgoogle.com/projects/#4737166321123328 Acked-by: Leo Liu <[email protected]> Reviewed-by: Julien Isorce <[email protected]>
* st/omx/bellagio: Rename st and target directoriesGurkirpal Singh2018-03-0623-50/+98
| | | | | | | | | | | | | | | | | | | v2: Refactor out screen functions to st/omx Allows to keep all the code under st/omx (st/omx/tizonia and st/omx/bellagio). Reverts targets/omx_bellagio to omx as additions to existing files is enough to compile for both bellagio and tizonia. * autotools changes: --enable-omx -> --enable-omx-bellagio * meson changes: -Dgallium-omx=false -> -Dgallium-omx=disabled -Dgallium-omx=true -> -Dgallium-omx=bellagio Acked-by: Leo Liu <[email protected]> Reviewed-by: Julien Isorce <[email protected]>
* radv: report the scratch private memory size with shader statsSamuel Pitoiset2018-03-061-1/+3
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* ac/nir: count the scratch private memory sizeSamuel Pitoiset2018-03-062-2/+9
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* ac: add ac_count_scratch_private_memory()Samuel Pitoiset2018-03-063-28/+38
| | | | | | | Imported from RadeonSI. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* ac/nir: only enable used channels when exporting parametersSamuel Pitoiset2018-03-061-4/+20
| | | | | | | | | | This allows us to generate, for example, "exp param0 v0, off, off, off" if only the first channel is needed. Not sure if this improves performance but it's worth trying. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* ac: update enabled channels mask when optimizing PARAM exportsSamuel Pitoiset2018-03-061-2/+16
| | | | | | | | | When the mask is not 0xf we need to update the number of enabled channels, otherwise the hardware won't emit the components that are combined. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* ac/nir: pass the number of enabled channels to si_llvm_init_export_args()Samuel Pitoiset2018-03-061-8/+13
| | | | | | | | Currently, it's always 0xf but an upcoming patch will reduce the number of channels for parameters export. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* ac/shader: scan output usage mask for VS and TESSamuel Pitoiset2018-03-062-0/+22
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* intel: Add missing includes for building on AndroidClayton Craft2018-03-062-0/+2
| | | | | | | | | | | This adds a missing library to the i965/Android.mk file, and updates intel/Android.mk to include the new library. Without this, mesa does not build on Android. Fixes: 272bef0601a "intel: Split gen_device_info out into libintel_dev" Reviewed-by: Kenneth Graunke <[email protected]>
* vulkan: do not expose surface/swapchain extensions on AndroidTapani Pälli2018-03-062-3/+3
| | | | | | | | | On Android surface/swapchain extensions are implemented by the loader. Patch modifies both anv and radv extension scripts disabling currently exposed ones. See also earlier commit 9f763c1f9b. Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Don't expose VK_KHX_multiview on android.Tapani Pälli2018-03-061-1/+1
| | | | | | | | | | | | | | Just like commit 2ffe395 does for radv. Fixes following dEQP test on i965: dEQP-VK.api.info.android.no_unknown_extensions v2: make it !ANDROID since this extension is not about surfaces/swapchain Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* gallium: increase PIPE_MAX_SHADER_SAMPLER_VIEWS to 128Roland Scheidegger2018-03-061-1/+1
| | | | | | | Some state trackers require 128. (There are no plans to increase PIPE_MAX_SAMPLERS too, since with gl state tracker it's unlikely more than 32 will be needed, if you need more use bindless.)
* tgsi/scan: use wrap-around shift behavior explicitly for file_maskRoland Scheidegger2018-03-063-4/+12
| | | | | | | | | | | | | | The comment said it will only represent the lowest 32 regs. This was not entirely true in practice, since at least on x86 you'll get masked shifts (unless the compiler could recognize it already and toss it out). It turns out this actually works out alright (presumably noone uses it for temp regs) when increasing max sampler views, so make that behavior explicit. Albeit it feels a bit hacky (but in any case, explicit behavior there is better than undefined behavior). Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* clover: Allow overriding platform/device version numbersAaron Watry2018-03-052-5/+14
| | | | | | | | | | | | | | | Useful for testing API, builtin library, and device completeness of not-yet-supported versions. Signed-off-by: Aaron Watry <[email protected]> Reviewed-by: Pierre Moreau <[email protected]> Reviewed-by: Francisco Jerez <[email protected]> (v3) Reviewed-by: Emil Velikov <[email protected]> Cc: Jan Vesely <[email protected]> v4: Remove redundant std::string wrapper around debug_get_option calls v3: mark CL version overrides as static and const v2: Make version_string in platform const in case
* clover/llvm: Pass device down to compileAaron Watry2018-03-051-4/+3
| | | | | | | | | | | | We'll need to be able to detect device version to define the appropriate __OPENCL_VERSION__ header. v2: Rebase after removing the previous patch (Pierre) - Removed "clover: Add device_clc_version to llvm::create_compiler_instance" Signed-off-by: Aaron Watry <[email protected]> Reviewed-by: Pierre Moreau <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* clover: Pass device to llvm::create_compiler_instanceAaron Watry2018-03-051-4/+5
| | | | | | | | | | | | | | | | | We'll be using dev.device_clc_version to select the default language version soon along with the existing ir_target field. Signed-off-by: Aaron Watry <[email protected]> Reviewed-by: Pierre Moreau <[email protected]> Reviewed-by: Jan Vesely <[email protected]> Reviewed-by: Francisco Jerez <[email protected]> v4: Pass the device down instead of device_clc_version as a separate field v3: Revise to acknowledge that we now have the device in compile/link_program instead of the string values. v2: (Pierre) Move changes to create_compiler_instance invocation to correct patch to prevent temporary build breakage. (Jan) Use device_clc_version instead of device_version for compile/link
* clover/llvm: Use device in llvm compilation instead of copying fieldsAaron Watry2018-03-053-17/+15
| | | | | | | | | | | | | | | Copying the individual fields from the device when compiling/linking will lead to an unnecessarily large number of fields getting passed around. v3: Rebase on current master v2: Use device in function args before making additional changes in following patches Signed-off-by: Aaron Watry <[email protected]> Reviewed-by: Jan Vesely <[email protected]> Reviewed-by: Pierre Moreau <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* radeonsi/nir: fix handling of doubles for gs inputsTimothy Arceri2018-03-061-2/+6
| | | | | | | Fixes piglit test: tests/spec/arb_gpu_shader_fp64/execution/explicit-location-gs-fs-vs.shader_test Reviewed-by: Dave Airlie <[email protected]>
* ac: pass the unmodified number of components to load gs inputsTimothy Arceri2018-03-061-2/+2
| | | | | | | | | | | Currently both users of this would overflow an array when the input was a dual slot double as they expected the number of components to be a max of 4. Since we pass the type we can just let the functions handle doubles in a way they choose. Reviewed-by: Dave Airlie <[email protected]>
* radeonsi: move si_nir_load_input_gs() to si_shader.cTimothy Arceri2018-03-063-29/+20
| | | | | | | | All the tess shader and tgsi equivalents are here and it allows use to use llvm_type_is_64bit() in the following patch without exposing it externally. Reviewed-by: Dave Airlie <[email protected]>
* broadcom/vc4: Add support for HW perfmonBoris Brezillon2018-03-055-12/+249
| | | | | | | | | The V3D engine provides several perf counters. Implement ->get_driver_query_[group_]info() so that these counters are exposed through the GL_AMD_performance_monitor extension. Signed-off-by: Boris Brezillon <[email protected]> Signed-off-by: Eric Anholt <[email protected]>