path: root/src
Commit message | Author | Age | Files | Lines
* i965: perf: update render basic configs for big core gen9/gen10 | Lionel Landwerlin | 2019-04-01 | 8 | -23/+24
  This update allows an MI_LRI to trigger an OA report write in the global OA buffer. This isn't really useful for us; we just keep close to the internal public configs.
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Acked-by: Kenneth Graunke <[email protected]>
* i965: perf: add ring busyness metric for cfl gt2 | Lionel Landwerlin | 2019-04-01 | 1 | -1/+165
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Acked-by: Kenneth Graunke <[email protected]>
* i965: perf: enable Icelake metrics | Lionel Landwerlin | 2019-03-31 | 3 | -3/+11
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Acked-by: Kenneth Graunke <[email protected]>
* i965: perf: add Icelake metrics | Lionel Landwerlin | 2019-03-31 | 1 | -0/+11899
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Acked-by: Kenneth Graunke <[email protected]>
* i965: perf: sklgt2: drop programming of an unused NOA register | Lionel Landwerlin | 2019-03-31 | 1 | -11/+6
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Acked-by: Kenneth Graunke <[email protected]>
* i965: perf: hsw: drop register programming not needed on HSW | Lionel Landwerlin | 2019-03-31 | 1 | -2/+1
  This register is flagged as IVB only in the documentation.
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Acked-by: Kenneth Graunke <[email protected]>
* i965: perf: chv: fixup counters names | Lionel Landwerlin | 2019-03-31 | 1 | -25/+25
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Acked-by: Kenneth Graunke <[email protected]>
* i965: perf: add PMA stall metrics | Lionel Landwerlin | 2019-03-31 | 10 | -10/+1140
  These are new metrics for Gen8/9 to measure the effect of the PMA stall workaround fix.
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Acked-by: Kenneth Graunke <[email protected]>
* i965: perf: sklgt2: update memory write config | Lionel Landwerlin | 2019-03-31 | 1 | -7/+49
  This reworks the programming between older pre-production steppings & new ones.
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Acked-by: Kenneth Graunke <[email protected]>
* i965: perf: sklgt2: update compute metrics config | Lionel Landwerlin | 2019-03-31 | 1 | -8/+2
  This unifies some of the programming between pre-production steppings and production ones.
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Acked-by: Kenneth Graunke <[email protected]>
* i965: perf: sklgt2: update a priority for register programming | Lionel Landwerlin | 2019-03-31 | 1 | -2/+2
  This makes no difference in terms of programming; it's just a cleanup.
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Acked-by: Kenneth Graunke <[email protected]>
* panfrost: Implement FIXED formats | Alyssa Rosenzweig | 2019-03-31 | 1 | -0/+9
  Fixes crash in dEQP-GLES2.functional.draw.random.9
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Fix index calculation types and asserts | Alyssa Rosenzweig | 2019-03-31 | 1 | -5/+4
  Fixes crash in dEQP-GLES2.functional.draw.draw_elements.points.single_attribute.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Clean index state between indexed draws | Alyssa Rosenzweig | 2019-03-31 | 1 | -1/+3
  Fixes subsequent tests in dEQP-GLES2.functional.draw.draw_elements.indices.*
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/decode: Print negative_start | Alyssa Rosenzweig | 2019-03-31 | 1 | -0/+2
  This property slipped through.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Implement missing texture formats | Alyssa Rosenzweig | 2019-03-31 | 2 | -1/+17
  - Implements RGB565/RGBA5551 formats
  - Don't advertise support for flipped RGBA5551 and ETC
  Fixes remaining tests in dEQP-GLES2.functional.texture.format.*, which is now at 36/36.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Extend tiling for cubemaps | Alyssa Rosenzweig | 2019-03-31 | 1 | -14/+14
  transfer_unmap now tiles for any tiled resource, not just TEXTURE_2D, which should help more than just cubemaps!
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Implement command stream for linear cubemaps | Alyssa Rosenzweig | 2019-03-31 | 2 | -6/+8
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Emit cubemap coordinates | Alyssa Rosenzweig | 2019-03-31 | 2 | -5/+32
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Include all cubemap faces in bitmap list | Alyssa Rosenzweig | 2019-03-31 | 1 | -3/+9
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/decode: Decode all cubemap faces | Alyssa Rosenzweig | 2019-03-31 | 1 | -1/+7
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Preliminary work for cubemaps | Alyssa Rosenzweig | 2019-03-31 | 3 | -6/+10
  Again, not yet functional, but this sets up the memory management for cube maps.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Add L/S op for writing cubemap coordinates | Alyssa Rosenzweig | 2019-03-31 | 1 | -0/+9
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Disassemble `cube` texture op | Alyssa Rosenzweig | 2019-03-31 | 1 | -0/+1
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Fix vertex buffer corruption | Alyssa Rosenzweig | 2019-03-31 | 1 | -4/+6
  Fixes crash in dEQP-GLES2.functional.buffer.*
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* iris: fix set_sampler_view | Rob Clark | 2019-03-30 | 1 | -2/+3
  Update to match docs.
  Signed-off-by: Rob Clark <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* gallium/docs: clarify set_sampler_views (v2) | Rob Clark | 2019-03-30 | 2 | -1/+6
  Signed-off-by: Rob Clark <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* freedreno/ir3: convert to "new style" frag inputs | Rob Clark | 2019-03-30 | 2 | -2/+33
  Add support for load_barycentric_pixel, load_interpolated_input, and friends. For now, this retains support for old-style inputs, which can probably be dropped with some ttn work.
  Prep work for sample-shading support.
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: add pass to move varying loads | Rob Clark | 2019-03-30 | 5 | -0/+151
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: rework varying packing | Rob Clark | 2019-03-30 | 1 | -30/+98
  Originally we kept track of a table of inputs. But with new-style frag inputs this becomes awkward. Re-work it so that initially we assign un-packed varying locations, and then after the shader is compiled scan to find actual used inputs, and re-pack.
  Signed-off-by: Rob Clark <[email protected]>
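The two-phase scheme in this commit message (assign un-packed locations first, then scan the compiled shader for the inputs that are actually used and re-pack) might look roughly like the sketch below. It is not the ir3 code: `MAX_VARYINGS`, `repack_varyings`, and the `used`/`packed_loc` arrays are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_VARYINGS 32 /* hypothetical limit, for illustration only */

/*
 * used[i] says whether the compiled shader still reads un-packed
 * location i. On return, packed_loc[i] holds the dense location of
 * each used input, or -1 for inputs that were never read.
 * Returns the number of packed locations.
 */
int
repack_varyings(const bool *used, int *packed_loc)
{
   int next = 0;

   assert(used != NULL && packed_loc != NULL);

   for (int i = 0; i < MAX_VARYINGS; i++)
      packed_loc[i] = used[i] ? next++ : -1;

   return next;
}
```

Because the scan runs after compilation, inputs that were optimized away never receive a packed slot.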
* freedreno/ir3: re-indent comment | Rob Clark | 2019-03-30 | 1 | -4/+4
  Make it more clear that it applies to the following 'case' statements, rather than the previous one.
  Signed-off-by: Rob Clark <[email protected]>
* nir: add lower_all_io_to_elements | Rob Clark | 2019-03-30 | 2 | -0/+2
  I need this part of lower_all_io_to_temps but without the actual lowering to temps part.
  Signed-off-by: Rob Clark <[email protected]>
* nir: print var name for load_interpolated_input too | Rob Clark | 2019-03-30 | 1 | -0/+1
  Signed-off-by: Rob Clark <[email protected]>
  Acked-by: Karol Herbst <[email protected]>
* i965,iris/blorp: do not blit 0-sizes | Sergii Romantsov | 2019-03-30 | 2 | -2/+18
  There seems to be no sense in blitting 0-sized sources or destinations. Additionally, it may cause segfaults for i965.
  v2: Function call replaced with inline check
  v3: Added check to avoid division by zero (L. Landwerlin)
  v4: Added similar check for Iris (L. Landwerlin)
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110239
  Signed-off-by: Sergii Romantsov <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Lionel Landwerlin <[email protected]>
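A guard of the kind this commit describes could be sketched as follows. This is an illustration under assumptions, not the blorp code: `struct blit_rect` and `blit_compute_scale` are hypothetical names, and the real interfaces differ.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical rectangle type; blorp's real interfaces differ. */
struct blit_rect {
   int x0, y0, x1, y1;
};

/*
 * Returns false when the blit should be skipped because the source or
 * the destination covers no pixels. Otherwise stores the
 * source/destination scale factors, whose computation divides by the
 * destination size.
 */
bool
blit_compute_scale(const struct blit_rect *src, const struct blit_rect *dst,
                   float *scale_x, float *scale_y)
{
   assert(src && dst && scale_x && scale_y);

   const int src_w = src->x1 - src->x0, src_h = src->y1 - src->y0;
   const int dst_w = dst->x1 - dst->x0, dst_h = dst->y1 - dst->y0;

   /* Early out: there is nothing to copy, and a zero destination size
    * would make the scale computation divide by zero. */
   if (src_w == 0 || src_h == 0 || dst_w == 0 || dst_h == 0)
      return false;

   *scale_x = (float)src_w / (float)dst_w;
   *scale_y = (float)src_h / (float)dst_h;
   return true;
}
```

Checking inline at the top of the blit path keeps every later computation free to assume non-empty rectangles.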
* gallium: Fix autotools build with libxatracker.la. | Vinson Lee | 2019-03-29 | 1 | -4/+2
  CXXLD libxatracker.la
  /usr/bin/ld: ../../../../src/gallium/auxiliary/.libs/libgallium.a(tgsi_to_nir.o): in function `ttn_finalize_nir':
  src/gallium/auxiliary/nir/tgsi_to_nir.c:2111: undefined reference to `gl_nir_lower_samplers_as_deref'
  /usr/bin/ld: src/gallium/auxiliary/nir/tgsi_to_nir.c:2113: undefined reference to `gl_nir_lower_samplers'
  Fixes: 9a834447d652 ("tgsi_to_nir: Produce optimized NIR for a given pipe_screen.")
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109929
  Signed-off-by: Vinson Lee <[email protected]>
* gallium: fix autotools build of pipe_msm.la | Timur Kristóf | 2019-03-29 | 1 | -4/+2
  Signed-off-by: Vinson Lee <[email protected]>
  Fixes: 9a834447d652 ("tgsi_to_nir: Produce optimized NIR for a given pipe_screen.")
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109929
* nir: Lock around validation fail shader dumping | Jason Ekstrand | 2019-03-29 | 1 | -0/+10
  This prevents getting mixed-up results if a multi-threaded app has two validation errors in different threads.
  Reviewed-by: Timothy Arceri <[email protected]>
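The idea of serializing the whole dump behind one lock can be sketched like this. It is a minimal illustration, not the NIR validator: plain POSIX `pthread_mutex_t` is swapped in for whatever mutex wrapper the real code uses, and `dump_validation_failure` is a hypothetical helper.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

/* One process-wide lock so dumps from different threads never interleave. */
static pthread_mutex_t dump_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Hypothetical helper: writes the complete report for one failing
 * shader while holding the lock. Returns the number of error lines
 * written.
 */
int
dump_validation_failure(FILE *fp, const char *shader_text,
                        const char *const *errors, int num_errors)
{
   assert(fp && shader_text && (errors || num_errors == 0));

   pthread_mutex_lock(&dump_lock);

   fprintf(fp, "NIR validation failed\n%s\n", shader_text);
   for (int i = 0; i < num_errors; i++)
      fprintf(fp, "error: %s\n", errors[i]);
   fflush(fp);

   pthread_mutex_unlock(&dump_lock);
   return num_errors;
}
```

Holding the lock across the shader text and all of its error lines is what keeps two failing threads from producing one interleaved report.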
* util: no-op __builtin_types_compatible_p() for non-GCC compilers | Brian Paul | 2019-03-29 | 1 | -0/+4
  __builtin_types_compatible_p() is GCC-specific and breaks the MSVC build. This intrinsic has been in u_vector_foreach() for a long time, but that macro has only recently been used in code (nir/nir_opt_comparison_pre.c) that's built with MSVC.
  Fixes: 2cf59861a ("nir: Add partial redundancy elimination for compares")
  Reviewed-by: José Fonseca <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
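One way to turn the builtin into a no-op on compilers that lack it is a small wrapper macro, sketched below. The names `TYPES_COMPATIBLE` and `check_counter_type` are hypothetical; the actual commit may define the fallback differently.

```c
#include <assert.h>

/*
 * Hypothetical wrapper name. GCC and Clang define __GNUC__ and provide
 * the builtin (in C mode); other compilers, e.g. MSVC, get a constant 1,
 * which turns the type check into a no-op.
 */
#if defined(__GNUC__)
#define TYPES_COMPATIBLE(a, b) __builtin_types_compatible_p(a, b)
#else
#define TYPES_COMPATIBLE(a, b) (1)
#endif

void
check_counter_type(void)
{
   /* Passes everywhere: a real check on GCC/Clang, a no-op otherwise. */
   assert(TYPES_COMPATIBLE(unsigned int, unsigned));
}
```

The trade-off is that non-GCC builds lose the type check entirely, so mismatches are only caught on compilers that implement the builtin.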
* iris: Clean up compiler warnings about unused | Caio Marcelo de Oliveira Filho | 2019-03-29 | 2 | -11/+1
  Removed a few unused variables and iris_getparam_boolean(). Kept 'name' around since there's a commented-out debug that makes use of it.
  Reviewed-by: Kenneth Graunke <[email protected]>
* egl: hide entrypoints that shouldn't be exported when using glvnd | Eric Engestrom | 2019-03-29 | 1 | -0/+6
  From GLVND author:
  > From a functional standpoint, exporting additional symbols doesn't
  > really matter, since libglvnd will load the vendor libraries with
  > RTLD_LOCAL.
  Suggested-by: Kyle Brenneman <[email protected]>
  Signed-off-by: Eric Engestrom <[email protected]>
  Acked-by: Kyle Brenneman <[email protected]>
* nir/validate: validate that tex deref sources are actually derefs | Karol Herbst | 2019-03-29 | 1 | -0/+11
  Signed-off-by: Karol Herbst <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* nir/print: fix printing the image_array intrinsic index | Karol Herbst | 2019-03-29 | 1 | -2/+2
  Fixes: 0de003be0363 ("nir: Add handle/index-based image intrinsics")
  Signed-off-by: Karol Herbst <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* Revert "ac/nir: use new LLVM 8 intrinsics for SSBO atomic operations" | Timothy Arceri | 2019-03-29 | 1 | -42/+24
  This reverts commit 29132af2347ede46a6d02422295a5fadbe5fe788.
  It seems the new intrinsic causes a hang on radeonsi (VEGA) when running the piglit test:
  tests/spec/arb_shader_storage_buffer_object/execution/ssbo-atomicCompSwap-int.shader_test
* ac: fix return type for llvm.amdgcn.frexp.exp.i32.64 | Samuel Pitoiset | 2019-03-29 | 1 | -1/+1
  This fixes the following piglit with RadeonSI:
  tests/spec/arb_gpu_shader_fp64/execution/built-in-functions/fs-frexp-dvec4.shader_test
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
* virgl: Add a caps feature check version | Gert Wollny | 2019-03-29 | 3 | -1/+4
  When we add new feature checks on the host side that are used to conditionally enable a cap that was enabled unconditionally before, we might end up with a feature regression when a new mesa version is used with an old virglrenderer version that doesn't check for that cap.
  To work around this problem, add a version id to the caps that corresponds to the features that are actually checked on the host, and check that version too when enabling the cap.
  Fixes: 2ee197d6e84aa37638d423363aca183952816067 virgl: Enable mixed color FBO attachemnets only when the host supports it
  Signed-off-by: Gert Wollny <[email protected]>
  Reviewed-by: Pohsien Wang <[email protected]>
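The versioned check described here can be sketched as below. All names (`struct host_caps`, `feature_check_version`, `MIXED_COLOR_ATTACHMENTS_MIN_VERSION`, `guest_enable_mixed_color_attachments`) are hypothetical and simplified; the real virgl caps structure and field names may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified caps as reported by the host. */
struct host_caps {
   uint32_t feature_check_version; /* 0 on hosts that predate the field */
   bool mixed_color_attachments;   /* only meaningful if the host checks it */
};

/* Hypothetical: first feature-check version at which the host tests the cap. */
#define MIXED_COLOR_ATTACHMENTS_MIN_VERSION 1u

bool
guest_enable_mixed_color_attachments(const struct host_caps *caps)
{
   assert(caps != NULL);

   /* Old host: it never checked the feature, so the flag carries no
    * information. Keep the previous unconditional behaviour. */
   if (caps->feature_check_version < MIXED_COLOR_ATTACHMENTS_MIN_VERSION)
      return true;

   return caps->mixed_color_attachments;
}
```

With this shape, a new guest on an old host keeps the cap it always had, while a new host can still turn the cap off when the feature is missing.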
* radv: do not always initialize HTILE in compressed state | Samuel Pitoiset | 2019-03-29 | 1 | -2/+8
  Especially when performing a transition from UNDEFINED->GENERAL, the driver shouldn't initialize HTILE metadata in compressed state because it doesn't decompress when the src layout is GENERAL.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110259
  Fixes: 3a2e93147f7 ("radv: always initialize HTILE when the src layout is UNDEFINED")
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* iris: Print the memzone name when allocating BOs with INTEL_DEBUG=buf | Kenneth Graunke | 2019-03-28 | 1 | -2/+17
  This gives me an idea of what kinds of buffers are being allocated on the fly, which could help inform our cache decisions.
* nir: use {0} initializer instead of {} to fix MSVC build | Brian Paul | 2019-03-28 | 1 | -2/+2
  Trivial change.
  Fixes: c6ee46a75 ("nir: Add nir_alu_srcs_negative_equal")
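The portability difference between the two initializers can be shown with a short sketch. `struct src_info` and `make_default_src_info` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical struct, for illustration only. */
struct src_info {
   int swizzle[4];
   bool negate;
   bool abs;
};

struct src_info
make_default_src_info(void)
{
   /* `= {}` is a GNU extension in C (standardized only in C23) and has
    * historically been rejected by MSVC in C mode. `= {0}` is valid in
    * every C standard and zero-initializes all members. */
   struct src_info info = {0};

   assert(!info.negate && !info.abs);
   return info;
}
```

Both spellings produce the same zeroed object on compilers that accept them, so the change affects portability only.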
* intel/compiler: Use partial redundancy elimination for compares | Ian Romanick | 2019-03-28 | 1 | -0/+20
  Almost all of the hurt shaders are repeated instances of the same shader in synmark's compilation speed tests.

  shader-db results:

  All Gen6+ platforms had similar results. (Skylake shown)
  total instructions in shared programs: 15256840 -> 15256389 (<.01%)
  instructions in affected programs: 54137 -> 53686 (-0.83%)
  helped: 288
  HURT: 0
  helped stats (abs) min: 1 max: 15 x̄: 1.57 x̃: 1
  helped stats (rel) min: 0.06% max: 26.67% x̄: 1.99% x̃: 0.74%
  95% mean confidence interval for instructions value: -1.76 -1.38
  95% mean confidence interval for instructions %-change: -2.47% -1.50%
  Instructions are helped.

  total cycles in shared programs: 372286583 -> 372283851 (<.01%)
  cycles in affected programs: 833829 -> 831097 (-0.33%)
  helped: 265
  HURT: 16
  helped stats (abs) min: 2 max: 74 x̄: 11.81 x̃: 4
  helped stats (rel) min: 0.04% max: 9.07% x̄: 0.99% x̃: 0.35%
  HURT stats (abs) min: 2 max: 130 x̄: 24.88 x̃: 8
  HURT stats (rel) min: <.01% max: 12.31% x̄: 1.44% x̃: 0.27%
  95% mean confidence interval for cycles value: -12.30 -7.15
  95% mean confidence interval for cycles %-change: -1.06% -0.64%
  Cycles are helped.

  Iron Lake and GM45 had similar results. (GM45 shown)
  total instructions in shared programs: 5038653 -> 5038495 (<.01%)
  instructions in affected programs: 13939 -> 13781 (-1.13%)
  helped: 50
  HURT: 1
  helped stats (abs) min: 1 max: 15 x̄: 3.18 x̃: 4
  helped stats (rel) min: 0.33% max: 13.33% x̄: 2.24% x̃: 1.09%
  HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
  HURT stats (rel) min: 0.83% max: 0.83% x̄: 0.83% x̃: 0.83%
  95% mean confidence interval for instructions value: -3.73 -2.47
  95% mean confidence interval for instructions %-change: -3.16% -1.21%
  Instructions are helped.

  total cycles in shared programs: 128118922 -> 128118228 (<.01%)
  cycles in affected programs: 134906 -> 134212 (-0.51%)
  helped: 50
  HURT: 0
  helped stats (abs) min: 2 max: 60 x̄: 13.88 x̃: 18
  helped stats (rel) min: 0.06% max: 3.19% x̄: 0.74% x̃: 0.70%
  95% mean confidence interval for cycles value: -16.54 -11.22
  95% mean confidence interval for cycles %-change: -0.95% -0.53%
  Cycles are helped.

  Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Add partial redundancy elimination for compares | Ian Romanick | 2019-03-28 | 5 | -0/+414
  This pass attempts to detect code sequences like

      if (x < y) {
          z = y - x;
          ...
      }

  and replace them with sequences like

      t = x - y;
      if (t < 0) {
          z = -t;
          ...
      }

  On architectures where the subtract can generate the flags used by the if-statement, this saves an instruction. It's also possible that moving an instruction out of the if-statement will allow nir_opt_peephole_select to convert the whole thing to a bcsel.

  Currently only floating point compares and adds are supported. Adding support for integer will be a challenge due to integer overflow. There are a couple possible solutions, but they may not apply to all architectures.

  v2: Fix a typo in the commit message and a couple typos in comments. Fix possible NULL pointer deref from result of push_block(). Add missing (-A + B) case. Suggested by Caio.
  v3: Fix is_not_const_zero to work correctly with types other than nir_type_float32. Suggested by Ken.
  v4: Add some comments explaining how this works. Suggested by Ken.

  Reviewed-by: Kenneth Graunke <[email protected]>
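The rewrite described in this commit message can be illustrated at source level with two plain C functions. This is only a sketch of the before/after shapes for floats, not the NIR pass itself; `before_pre`, `after_pre`, and `check_same_result` are hypothetical names.

```c
#include <assert.h>

/* Shape of the code before the pass: compare, then subtract in the branch. */
float
before_pre(float x, float y)
{
   float z = 0.0f;

   if (x < y)
      z = y - x;

   return z;
}

/* Shape after the pass: one subtract feeds both the compare and the result. */
float
after_pre(float x, float y)
{
   float z = 0.0f;
   const float t = x - y;

   if (t < 0.0f)
      z = -t;

   return z;
}

/* Hypothetical self-check: both shapes agree on a given input pair. */
void
check_same_result(float x, float y)
{
   assert(before_pre(x, y) == after_pre(x, y));
}
```

The same rewrite is not safe for integers as written, because `x - y` can overflow; that is the difficulty the commit message mentions.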