path: root/src
Commit message (author, date, files changed, lines -removed/+added)
* broadcom: Move v3d_get_device_info to common (Andreas Bergmeier, 2019-07-17, 4 files, -52/+88)
  Once it lives in common, the implementation can also be used for Vulkan.
* nir/large_constants: Use dominance information to find more constants (Caio Marcelo de Oliveira Filho, 2019-07-17, 1 file, -6/+30)
  Relax the restriction that all the writes need to be in the first block: now accept variables that have all the writes in the same block, and all the reads are dominated by that block.

  This lets the pass identify large constants that are local to a helper function. The writes will be at the place where the function is inlined, possibly not in the first block (but still all in the same block).

  Results for vkpipeline-db in SKL:

  total instructions in shared programs: 3624891 -> 3623145 (-0.05%)
  instructions in affected programs: 79416 -> 77670 (-2.20%)
  helped: 16
  HURT: 0

  total cycles in shared programs: 1458149667 -> 1458147273 (<.01%)
  cycles in affected programs: 30154164 -> 30151770 (<.01%)
  helped: 14
  HURT: 2

  total loops in shared programs: 2437 -> 2437 (0.00%)
  loops in affected programs: 0 -> 0
  helped: 0
  HURT: 0

  total spills in shared programs: 8813 -> 8745 (-0.77%)
  spills in affected programs: 2894 -> 2826 (-2.35%)
  helped: 8
  HURT: 0

  total fills in shared programs: 23470 -> 23392 (-0.33%)
  fills in affected programs: 12248 -> 12170 (-0.64%)
  helped: 6
  HURT: 2

  LOST: 0
  GAINED: 0

  Results for shader-db in SKL with Iris:

  total instructions in shared programs: 15379442 -> 15379392 (<.01%)
  instructions in affected programs: 837 -> 787 (-5.97%)
  helped: 2
  HURT: 2
  helped stats (abs) min: 27 max: 27 x̄: 27.00 x̃: 27
  helped stats (rel) min: 10.47% max: 10.67% x̄: 10.57% x̃: 10.57%
  HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
  HURT stats (rel) min: 1.23% max: 1.23% x̄: 1.23% x̃: 1.23%
  95% mean confidence interval for instructions value: -39.14 14.14
  95% mean confidence interval for instructions %-change: -15.51% 6.17%
  Inconclusive result (value mean confidence interval includes 0).

  total loops in shared programs: 4880 -> 4880 (0.00%)
  loops in affected programs: 0 -> 0
  helped: 0
  HURT: 0

  total cycles in shared programs: 370677237 -> 370676567 (<.01%)
  cycles in affected programs: 17852 -> 17182 (-3.75%)
  helped: 2
  HURT: 1
  helped stats (abs) min: 338 max: 356 x̄: 347.00 x̃: 347
  helped stats (rel) min: 13.98% max: 14.64% x̄: 14.31% x̃: 14.31%
  HURT stats (abs) min: 24 max: 24 x̄: 24.00 x̃: 24
  HURT stats (rel) min: 0.18% max: 0.18% x̄: 0.18% x̃: 0.18%

  total spills in shared programs: 11772 -> 11772 (0.00%)
  spills in affected programs: 0 -> 0
  helped: 0
  HURT: 0

  total fills in shared programs: 24948 -> 24948 (0.00%)
  fills in affected programs: 0 -> 0
  helped: 0
  HURT: 0

  LOST: 0
  GAINED: 0

  Reviewed-by: Jason Ekstrand <[email protected]>
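The relaxed condition in the nir/large_constants commit can be illustrated with a small, self-contained C sketch. This is not the code from the pass: the flat `idom` array, the block indices, and the names `block_dominates` and `var_is_constant_candidate` are all hypothetical, and the sketch ignores the ordering of reads and writes inside a single block.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical CFG encoding: idom[b] is the immediate dominator of block b,
 * and the entry block is its own immediate dominator. */
static bool block_dominates(const int *idom, int a, int b)
{
    assert(idom != NULL);
    while (b != a) {
        if (idom[b] == b)
            return false;   /* reached the entry block without meeting a */
        b = idom[b];
    }
    return true;
}

/* A variable qualifies if every write is in one block and that block
 * dominates the block of every read. Within-block ordering is ignored. */
static bool var_is_constant_candidate(const int *idom,
                                      const int *write_blocks, size_t num_writes,
                                      const int *read_blocks, size_t num_reads)
{
    if (num_writes == 0)
        return false;

    int write_block = write_blocks[0];
    for (size_t i = 1; i < num_writes; i++) {
        if (write_blocks[i] != write_block)
            return false;   /* writes spread over several blocks */
    }

    for (size_t i = 0; i < num_reads; i++) {
        if (!block_dominates(idom, write_block, read_blocks[i]))
            return false;   /* a read may execute without the writes */
    }
    return true;
}
```

Under the old rule only a write block equal to the entry block would have been accepted; in this sketch any single write block works as long as it dominates every read.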
* intel/fs: Use a strided MOV instead of a conversion for load_* destinations (Jason Ekstrand, 2019-07-17, 1 file, -5/+3)
  In many cases, the compiler can just copy-prop the strided MOV whereas the conversion is a bit trickier. This cuts 5% of the instructions off of one particular Vulkan CTS test which does lots of load_ssbo.
  Reviewed-by: Matt Turner <[email protected]>
* nir/algebraic: Optimize comparisons and up-casts (Jason Ekstrand, 2019-07-17, 1 file, -0/+67)
  These seem like obvious enough optimizations in the world of multiple integer bit sizes. The only known thing which hits these at the moment is some Vulkan CTS tests for 16-bit SSBO values which like to up-cast and check for equality. However, it's something that's bound to come up as we start seeing more integers in shaders.

  The optimizations of comparisons of casted values with constants are something which we would ideally do with range analysis. However, lacking that, we can do it in opt_algebraic as long as one side is a constant.

  In dEQP-VK.ssbo.phys.layout.random.16bit.scalar.13, this commit, along with the previous commit, reduces the number of instructions emitted on Skylake from 55328 to 44546, a reduction of 20%.

  Acked-by: Matt Turner <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
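The arithmetic behind comparing an up-cast value with a constant can be checked in plain C. The sketch below is only an illustration of the identity for unsigned 16-bit values and equality; it is not the pattern list added to opt_algebraic, and the function names are made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Naive form: up-cast the 16-bit value, then compare at 32 bits. */
static bool eq_upcast(uint16_t a, uint32_t c)
{
    return (uint32_t)a == c;
}

/* Narrowed form: a constant that does not fit in 16 bits can never match;
 * otherwise the comparison can be done at 16 bits. */
static bool eq_narrow(uint16_t a, uint32_t c)
{
    if (c > UINT16_MAX)
        return false;
    return a == (uint16_t)c;
}

/* Exhaustively compare both forms over every 16-bit value of a. */
static bool forms_agree(uint32_t c)
{
    for (uint32_t a = 0; a <= UINT16_MAX; a++) {
        if (eq_upcast((uint16_t)a, c) != eq_narrow((uint16_t)a, c))
            return false;
    }
    return true;
}

/* Spot-check constants on both sides of the 16-bit boundary. */
static void check_comparison_rewrite(void)
{
    assert(forms_agree(0));
    assert(forms_agree(0x1234));
    assert(forms_agree(0xFFFF));
    assert(forms_agree(0x10000));
    assert(forms_agree(0xFFFFFFFFu));
}
```

The range check is what makes the rewrite safe: with the constant 0x10000, simply truncating it to 16 bits would give 0 and wrongly match a == 0. This is why the rewrite only works without range analysis when one side is a known constant.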
* nir/algebraic: Optimize comparing unpacked values (Jason Ekstrand, 2019-07-17, 1 file, -0/+8)
  We could, in theory, add the same optimization for 64-bit unpack operations, but that's likely to fight with 64-bit integer lowering on platforms which require it, so it will require more infrastructure before that will be a good idea.
  Reviewed-by: Matt Turner <[email protected]>
* nir/algebraic: Print out the list of transforms in the C file (Jason Ekstrand, 2019-07-17, 1 file, -0/+7)
  This helps greatly when debugging algebraic transform generators because you can now actually see the output and verify that your transforms are getting generated.
  Acked-by: Matt Turner <[email protected]>
* intel/fs: Properly stride NULL replacement regs in DCE (Jason Ekstrand, 2019-07-17, 1 file, -1/+2)
  This fixes some validation errors generated by certain D->W conversions but is likely not a full solution. Calculating an actual register stride is a far more complex problem in general and should probably be handled by the brw_fs_generator.
  Reviewed-by: Matt Turner <[email protected]>
* nir: Fix nir_lower_alu_to_scalar's instr filtering. (Eric Anholt, 2019-07-17, 1 file, -1/+1)
  It was checking if the dest or src[0] SSA values were vectors, rather than whether the ALU op was using the source as a vector, resulting in a nir_fdot4 making it through to vc4 and v3d:

    vec1 32 ssa_6 = fdot4 ssa_4.xxxx, ssa_5

  Fixes: c1cffa4249ca ("nir/alu_to_scalar: Use the new NIR lowering framework")
  v2: Use Jason's recommendation to look at input_sizes.
  Reviewed-by: Jason Ekstrand <[email protected]>
* panfrost: Merge varyings_mem into transient buffers (Alyssa Rosenzweig, 2019-07-17, 3 files, -15/+5)
  Theoretically we would like these split since varyings can have specially optimized flags (no map, coherent local). For now, since neither of these flags is particularly meaningful right now, merge them together instead of special casing varyings_mem.

  Saves upwards of 64MB of RAM per context.

  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* vulkan/wsi: update swapchain status on vkQueuePresent (Lionel Landwerlin, 2019-07-17, 1 file, -0/+21)
  With the following chain of events:

    vkQueuePresent()
    <- Surface resize
    vkQueuePresent()

  We should be able to report SUBOPTIMAL or OUT_OF_DATE on the second vkQueuePresent() call. Currently we only look at X11 events in the vkAcquireNextImage() path, so we're not able to report this.

  This change checks the queue of events and processes any available ones to update the swapchain status.

  v2: Be consistent about reporting the current error state of the swapchain (Jason)

  Signed-off-by: Lionel Landwerlin <[email protected]>
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111097
  Cc: <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* radv: add an option for disabling NGG on GFX10 (Samuel Pitoiset, 2019-07-17, 4 files, -1/+8)
  Will be useful for testing the legacy path.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* softpipe: pass stream-out targets to draw-module early (Erik Faye-Lund, 2019-07-17, 2 files, -15/+8)
  This is essentially a port of ed53e61bec9 from LLVMpipe to softpipe, as it makes things a bit simpler and more performant.
  Signed-off-by: Erik Faye-Lund <[email protected]>
  Reviewed-By: Gert Wollny <[email protected]>
* spirv_extensions: i965: initialize SPIR-V extensions (Alejandro Piñeiro, 2019-07-17, 2 files, -1/+12)
  v2: Rebase update after changes on previous patches.
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* spirv_extensions: add spirv_supported_extensions on gl_constants (Alejandro Piñeiro, 2019-07-17, 2 files, -1/+21)
  We can use it to get real values for ARB_spirv_extensions methods.
  Signed-off-by: Alejandro Piñeiro <[email protected]>
  Signed-off-by: Arcady Goldmints-Orlov <[email protected]>
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* spirv_extensions: define spirv_extensions_supported (Alejandro Piñeiro, 2019-07-17, 2 files, -0/+43)
  Add a struct to maintain which SPIR-V extensions are supported, and a utility method to initialize it based on nir_spirv_supported_capabilities.

  v2:
   * Fixing code style (Ian Romanick)
   * Adding a prefix (spirv) to fill_supported_spirv_extensions (Ian Romanick)
  v3: rebase update (nir_spirv_supported_extensions renamed)
  v4: include AMD_gcn_shader support
  v5: move spirv_fill_supported_spirv_extensions to src/mesa/main/spirv_extensions.c

  Signed-off-by: Alejandro Piñeiro <[email protected]>
  Signed-off-by: Arcady Goldmints-Orlov <[email protected]>
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* spirv_extensions: add list of extensions and to_string method (Alejandro Piñeiro, 2019-07-17, 2 files, -0/+37)
  Ideally this should be generated somehow. One option would be to gather all the extension dependencies listed on the core grammar, but there would be the possibility of not including some of the extensions.

  Note that spirv-tools is doing it just slightly better, as it has a hardcoded list of extensions manually taken from the registry, which they parse to get the enum and the to_string method (see generate_grammar_tables.py).

  v2:
   * Use a macro to improve readability. (Tapani Pälli)
   * Add unreachable on the switch, no default (Eric Engestrom)
   * No typedef enum (Ian Romanick)
   * Sort extensions names (Ian Romanick)
   * Don't add extensions unlikely to be supported by Mesa at any point (Ian Romanick)
  v3: rebase update
  v4: Include AMD_gcn_shader
  v5: move spirv_extensions_to_string to src/mesa/main/spirv_extensions.c

  Signed-off-by: Alejandro Piñeiro <[email protected]>
  Signed-off-by: Arcady Goldmints-Orlov <[email protected]>
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* spirv_extensions: add GL_ARB_spirv_extensions boilerplate (Alejandro Piñeiro, 2019-07-17, 12 files, -0/+137)
  v2:
   * Mention extension gap at gl_API.xml (Emil Velikov)
   * Bail with INVALID_ENUM if extension not available on getStringi (Emil Velikov)
   * Use EXTRA_EXT macro when defining the extension at get.c/get_hash_params.py (Emil Velikov)
   * Rename source files (spirvextensions.[ch] -> spirv_extensions.[ch]) (Ian)
  v3:
   * Fix GL_PROGRAM_BINARY_FORMATS glGet query, broken by error on a previous rebase
  v4:
   * Fix rebase conflicts on getstring.c after GL_SHADING_LANGUAGE_VERSION query was added
  v5:
   * Remove src/mapi/glapi/gen/Makefile.am as it no longer exists in master

  Signed-off-by: Alejandro Piñeiro <[email protected]>
  Signed-off-by: Arcady Goldmints-Orlov <[email protected]>
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* radv/gfx10: implement VK_EXT_post_depth_coverage (Samuel Pitoiset, 2019-07-17, 5 files, -0/+5)
  I did implement this extension a while ago but it didn't work on pre-GFX10 for some reason. Now all CTS tests pass.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/gfx10: disable the TC compat zrange workaround (Samuel Pitoiset, 2019-07-17, 4 files, -4/+13)
  Unnecessary.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/gfx10: fallback to the legacy path if tess and extreme geometry (Samuel Pitoiset, 2019-07-17, 2 files, -1/+13)
  This is unsupported and hangs.

  This fixes GPU hangs with dEQP-VK.tessellation.geometry_interaction.limits.output_required_*.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/gfx10: always build the GS copy shader but use it on-demand (Samuel Pitoiset, 2019-07-17, 3 files, -7/+24)
  It should be possible to build it on-demand too, but that requires more work. On GFX10, the GS copy shader is required when tess is enabled with extreme geometry.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* softpipe: Remove unused static function (Gert Wollny, 2019-07-17, 1 file, -9/+0)
  Thanks to Eric Engestrom for pointing out that there was something wrong with that function.

  Fixes: 724a73509e1bc1ce3abf9500e457bb2911b642db
         softpipe: Prepare handling explicit gradients
  Signed-off-by: Gert Wollny <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* spirv: Bail when we see CounterBuffer decoration (Caio Marcelo de Oliveira Filho, 2019-07-16, 1 file, -1/+1)
  This decoration can be ignored, so we can just skip the next steps. Otherwise we'd have to also handle it in apply_var_decoration.
  Reviewed-by: Jason Ekstrand <[email protected]>
* iris: Drop copy and pasted iris_timebase_scale (Kenneth Graunke, 2019-07-16, 3 files, -12/+3)
  Lionel moved brw_timebase_scale to gen_device_info_timebase_scale a few months ago, so we should just use that, and not our own copy in iris.
* nir/regs_to_ssa: Handle regs in phi sources properly (Jason Ekstrand, 2019-07-16, 1 file, -2/+32)
  Sources of phi instructions act as if they occur at the very end of the predecessor block, not the block in which the phi lives. In order to handle them correctly, we have to skip phi sources on the normal instruction walk and handle them as a separate walk over the successor phis.

  While registers in phi instructions are a bit of an oddity, they can happen when we temporarily go out-of-SSA for control-flow manipulations.

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111075
  Cc: [email protected]
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* spirv: Add a warning for ArrayStride on arrays of blocks (Jason Ekstrand, 2019-07-16, 1 file, -2/+9)
  It's disallowed according to the SPIR-V spec, or at least I think that's what the spec says. It's in a section explicitly about explicit layout of things in the StorageBuffer, Uniform, and PushConstant storage classes so it's not 100% clear that it applies with other storage classes. However, it seems like it should apply in general and violating it can trigger (fairly harmless) asserts in NIR.
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* anv: Increase state allocation size limit to 2MB (Caio Marcelo de Oliveira Filho, 2019-07-16, 1 file, -1/+1)
  When running on ICL, the dEQP-VK.ssbo.phys.layout.random.16bit.scalar.13 test needs more than 1MB for the shader, so bump it.
  Reviewed-by: Jason Ekstrand <[email protected]>
* meta: leaking of BO with DrawPixels (Yevhenii Kolesnikov, 2019-07-16, 1 file, -0/+2)
  ctx->Unpack.BufferObj wasn't unreferenced.

  Fixes: d492e7b0171 (meta: Fix invalid PBO access from DrawPixels when trying to just alloc.)
  CC: Eric Anholt <[email protected]>
  Signed-off-by: Yevhenii Kolesnikov <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* swrast: Move _mesa_format_pack_colormask() to the only caller. (Eric Anholt, 2019-07-16, 3 files, -78/+72)
  This avoids needing format_pack to have access to the GLenum return functions for mesa_format. It seems like an odd function and unlikely to be reused.
  Reviewed-by: Thomas Helland <[email protected]>
  Reviewed-by: Kristian H. Kristensen <[email protected]>
* mesa: Give _mesa_format_get_color_encoding a clearer name. (Eric Anholt, 2019-07-16, 14 files, -35/+22)
  It only returned one of two values.
  Reviewed-by: Thomas Helland <[email protected]>
  Reviewed-by: Kristian H. Kristensen <[email protected]>
* mesa: Drop redundant checks for sRGB before sRGB to linear conversion. (Eric Anholt, 2019-07-16, 2 files, -6/+4)
  _mesa_get_srgb_format_linear() just returns the original format if it wasn't sRGB.
  Reviewed-by: Thomas Helland <[email protected]>
  Reviewed-by: Kristian H. Kristensen <[email protected]>
* mesa: Fold _mesa_unpack_depth_stencil_row() into its only caller. (Eric Anholt, 2019-07-16, 3 files, -34/+14)
  This was the last bit of gl.h usage in format packing.
  Reviewed-by: Thomas Helland <[email protected]>
  Reviewed-by: Kristian H. Kristensen <[email protected]>
* mesa: Convert format_pack/unpack off of GL types. (Eric Anholt, 2019-07-16, 4 files, -353/+352)
  Reviewed-by: Rob Clark <[email protected]>
  Reviewed-by: Kristian H. Kristensen <[email protected]>
* mesa: Port format_pack/unpack off of _mesa_problem(). (Eric Anholt, 2019-07-16, 2 files, -47/+17)
  unreachable() should be plenty of debug for these.
  Reviewed-by: Thomas Helland <[email protected]>
  Reviewed-by: Kristian H. Kristensen <[email protected]>
* mesa: Mostly switch Mesa format info off of GL types other than GLenum. (Eric Anholt, 2019-07-16, 2 files, -142/+144)
  I'm considering moving most of this code to src/util/, and I want that code to not expose GL types in its interfaces.
  Reviewed-by: Rob Clark <[email protected]>
  Reviewed-by: Kristian H. Kristensen <[email protected]>
* mesa: Rename gl_pack typedefs to mesa_pack. (Eric Anholt, 2019-07-16, 5 files, -20/+20)
  These are packing mesa formats, not a GL format/type.
  Reviewed-by: Thomas Helland <[email protected]>
  Reviewed-by: Kristian H. Kristensen <[email protected]>
* mesa: Rename gl_format_info to mesa_format_info. (Eric Anholt, 2019-07-16, 2 files, -29/+29)
  It's about MESA_FORMATs, after all.
  Reviewed-by: Thomas Helland <[email protected]>
  Reviewed-by: Kristian H. Kristensen <[email protected]>
* state_tracker: Move the format test out to be an actual unit test. (Eric Anholt, 2019-07-16, 3 files, -51/+117)
  We want errors in the table to show up as unit test failures in MRs. Also keeps unit test code out of the built drivers.
  Reviewed-by: Thomas Helland <[email protected]>
  Reviewed-by: Kristian H. Kristensen <[email protected]>
* u_format: Remove pointless comments. (Eric Anholt, 2019-07-16, 1 file, -6/+0)
  Reviewed-by: Thomas Helland <[email protected]>
  Reviewed-by: Kristian H. Kristensen <[email protected]>
* src/util: Switch _mesa_half_to_float() to u_half.h's version. (Eric Anholt, 2019-07-16, 1 file, -43/+2)
  The two implementations differ across the entire input range only in that u_half.h preserves mantissa bits for NaNs. The u_half.h version shaves 15% off of the text size of half_float.o.
  Reviewed-by: Thomas Helland <[email protected]>
  Reviewed-by: Kristian H. Kristensen <[email protected]>
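What "preserves mantissa bits for NaNs" means can be shown with a generic half-to-float conversion. The sketch below is a textbook bit-manipulation version written for this note; it is not the code from u_half.h or half_float.c, and the names `half_to_float_sketch` and `float_bits` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return the raw IEEE-754 bit pattern of a float. */
static uint32_t float_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Convert a binary16 bit pattern to a float. */
static float half_to_float_sketch(uint16_t h)
{
    assert(sizeof(float) == sizeof(uint32_t));

    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exp = (h >> 10) & 0x1fu;
    uint32_t mant = h & 0x3ffu;
    uint32_t bits;

    if (exp == 0) {
        if (mant == 0) {
            bits = sign;                        /* signed zero */
        } else {
            /* Subnormal half: shift until the implicit bit appears. */
            uint32_t shift = 0;
            while (!(mant & 0x400u)) {
                mant <<= 1;
                shift++;
            }
            mant &= 0x3ffu;
            bits = sign | ((113u - shift) << 23) | (mant << 13);
        }
    } else if (exp == 0x1fu) {
        /* Inf (mant == 0) or NaN; the NaN payload is carried over
         * rather than being replaced by a canonical NaN. */
        bits = sign | 0x7f800000u | (mant << 13);
    } else {
        bits = sign | ((exp + 112u) << 23) | (mant << 13);  /* rebias 15 -> 127 */
    }

    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

For example, the half NaN 0x7e01 maps to the float bit pattern 0x7fc02000 here, whereas a conversion that returns one fixed NaN would discard the low payload bit.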
* u_half_test: Turn it into an actual unit test. (Eric Anholt, 2019-07-16, 1 file, -4/+5)
  You could break the test and meson test wouldn't complain, since we returned success either way.
  Reviewed-by: Thomas Helland <[email protected]>
  Reviewed-by: Kristian H. Kristensen <[email protected]>
* android: radv/gfx10: generate gfx10_format_table.h (Mauro Rossi, 2019-07-16, 2 files, -1/+17)
  This patch adds the missing build rules for Android, to avoid the following build errors:

  In file included from external/mesa/src/amd/vulkan/radv_debug.c:35:
  In file included from external/mesa/src/amd/vulkan/radv_debug.h:27:
  external/mesa/src/amd/vulkan/radv_private.h:95:10: fatal error: 'gfx10_format_table.h' file not found
           ^~~~~~~~~~~~~~~~~~~~~~
  1 error generated.

  In file included from external/mesa/src/amd/vulkan/radv_android.c:31:
  external/mesa/src/amd/vulkan/radv_private.h:95:10: fatal error: 'gfx10_format_table.h' file not found
           ^~~~~~~~~~~~~~~~~~~~~~
  1 error generated.

  Fixes: 3dc5ec5d16 ("radv/gfx10: generate gfx10_format_table.h")
  Signed-off-by: Mauro Rossi <[email protected]>
  Acked-by: Samuel Pitoiset <[email protected]>
* mesa/st: add sampler uniforms (Rob Clark, 2019-07-16, 1 file, -6/+45)
  Add sampler uniforms for the UV plane(s), so the driver can count the uniforms and get the correct sampler count.

  Fixes lowered YUV on a6xx, which actually wants to know the number of samplers.

  Signed-off-by: Rob Clark <[email protected]>
  Reviewed-by: Kristian H. Kristensen <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* egl/android: handle multi-fd native windows (Rob Clark, 2019-07-16, 1 file, -25/+55)
  We can hit the multi-fd EGL_NATIVE_BUFFER_ANDROID case when the native android buffer is YUV. So we need to handle that.

  Until now this went unnoticed because, even though we have two or three fd's for YUV native android buffers, they all reference the same backing buffer. But we really shouldn't rely on that.

  Signed-off-by: Rob Clark <[email protected]>
  Reviewed-by: Eric Engestrom <[email protected]>
  Reviewed-by: Kristian H. Kristensen <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* st,i965: Stop looping on 64-bit lowering (Jason Ekstrand, 2019-07-16, 3 files, -30/+13)
  Now that the 64-bit lowering passes do a complete lowering in one go, we don't need to loop anymore. We do, however, have to ensure that int64 lowering happens after double lowering because double lowering can produce int64 ops.
  Reviewed-by: Eric Anholt <[email protected]>
* nir/lower_doubles: Handle fdiv and fsub directly (Jason Ekstrand, 2019-07-16, 2 files, -2/+17)
  Reviewed-by: Eric Anholt <[email protected]>
* nir/lower_doubles: Use the new NIR lowering framework (Jason Ekstrand, 2019-07-16, 1 file, -72/+65)
  One advantage of this is that we no longer need to run in a loop because the new framework handles lowering instructions added by lowering.
  Reviewed-by: Eric Anholt <[email protected]>
* nir/lower_doubles: Use "alu" for the nir_alu_instr (Jason Ekstrand, 2019-07-16, 1 file, -15/+15)
  Reviewed-by: Eric Anholt <[email protected]>
* nir/lower_int64: Use the core NIR lowering framework (Jason Ekstrand, 2019-07-16, 1 file, -74/+49)
  One advantage of this is that we no longer need to run in a loop because the new framework handles lowering instructions added by lowering.
  Reviewed-by: Eric Anholt <[email protected]>
* nir/alu_to_scalar: Use the new NIR lowering framework (Jason Ekstrand, 2019-07-16, 1 file, -93/+54)
  Reviewed-by: Eric Anholt <[email protected]>