Commit message (Author, Age, Files, Lines)
* turnip: Add todo for d24_s8 copies (Bas Nieuwenhuizen, 2019-09-27, 1 file, -0/+1)
| | | | | Reviewed-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* turnip: Disallow NPoT formats. (Bas Nieuwenhuizen, 2019-09-27, 1 file, -10/+18)
| | | | | | | Copying is a mess for these formats for now. Reviewed-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* turnip: Always use UINT formats for copies. (Bas Nieuwenhuizen, 2019-09-27, 1 file, -3/+29)
| | | | | | | | | | | | | | Looks like r16_unorm might have precision issues. dEQP-VK.api.copy_and_blit.core.image_to_image.all_formats.color.r16_unorm.r16_unorm.general_general fails, but the dumped images in the xml are the same so I'd guess the low bits are the issue. r8_unorm and r16_uint work. Reviewed-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* turnip: Add image->image blitting. (Bas Nieuwenhuizen, 2019-09-27, 1 file, -14/+165)
| | | | | | | 3D blits & format reinterpretation are still TBD. Reviewed-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* aco: don't remove the loop exec mask in transition_to_Exact() (Rhys Perry, 2019-09-27, 1 file, -1/+5)
| | | | | | | No pipeline-db changes. Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Daniel Schürmann <[email protected]>
* aco: set loop_info::has_discard for demotes (Rhys Perry, 2019-09-27, 3 files, -5/+9)
| | | | | | | | We need the loop header phis for the outer exec masks. Needed for dEQP-VK.glsl.demote.dynamic_loop_texture Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Daniel Schürmann <[email protected]>
* iris: Only resolve for image levels/layers which are actually in use. (Kenneth Graunke, 2019-09-26, 2 files, -17/+12)
| | | | There's no need to resolve everything.
* lima/ppir: add NIR pass to split varying loads (Vasily Khoruzhick, 2019-09-26, 5 files, -0/+127)
|
| NIR may emit a single intrinsic to load several packed varyings, but
| that's suboptimal for Utgard PP for several reasons:
| - varyings that are used as sampler inputs can be passed using a pipeline
|   register with increased precision
| - we have a small number of regs, so using a vec4 reg for storing two vec2
|   varyings increases reg pressure.
|
| Add a NIR pass to split a single load into several loads and utilize it in
| lima.
|
| Reviewed-by: Qiang Yu <[email protected]>
| Signed-off-by: Vasily Khoruzhick <[email protected]>
* radv: Fix L2 cache rinse programming. (Timur Kristóf, 2019-09-26, 1 file, -5/+9)
| | | | | | | | According to radeonsi, GLM doesn't support WB alone, so we have to set INV too when WB is set. Signed-off-by: Timur Kristóf <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
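The rule stated in this commit message (a write-back request must also carry an invalidate) can be sketched as a small flag fix-up. This is only an illustration: the bit names and the helper below are hypothetical and are not the register fields or functions used by radv or radeonsi.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit names for illustration only; these are NOT the real
 * register fields used by radv or radeonsi. */
#define L2_META_WB  (1u << 0)   /* write back metadata cache */
#define L2_META_INV (1u << 1)   /* invalidate metadata cache */

/* If write-back is requested, also request invalidation, because
 * (per the commit message) write-back alone is not supported. */
static uint32_t
fixup_l2_meta_flags(uint32_t flags)
{
    if (flags & L2_META_WB)
        flags |= L2_META_INV;
    return flags;
}
```

Flags without the write-back bit pass through unchanged, so callers that only invalidate are unaffected.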
* turnip: emit texture and uniform state (Jonathan Marek, 2019-09-26, 2 files, -15/+339)
| | | | | | Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]>
* turnip: add some shader information in pipeline state (Jonathan Marek, 2019-09-26, 2 files, -0/+32)
| | | | | | | | This information is needed by texture/uniform descriptors. Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]>
* turnip: use nir_opt_copy_prop_vars (Jonathan Marek, 2019-09-26, 1 file, -0/+2)
| | | | | | | | | | | Avoids getting a "load_output" in a case like this: gl_Position = ubuf.MVP * ubuf.position[gl_VertexIndex]; frag_pos = gl_Position.xyz; Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]>
* turnip: lower samplers and uniform buffer indices (Jonathan Marek, 2019-09-26, 2 files, -0/+147)
| | | | | | | | | Lower these to something compatible with ir3, and save the descriptor set and binding information. Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]>
* turnip: basic descriptor sets (uniform buffer and samplers) (Jonathan Marek, 2019-09-26, 2 files, -102/+430)
| | | | | | | | Mostly copy-paste from radv, with a few modifications. Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]>
* turnip: enable linear filtering (Jonathan Marek, 2019-09-26, 1 file, -2/+2)
| | | | | | Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]>
* turnip: align layer_size (Jonathan Marek, 2019-09-26, 1 file, -1/+1)
| | | | | | Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]>
* turnip: use linear tiling for scanout image (Jonathan Marek, 2019-09-26, 1 file, -2/+9)
| | | | | | Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]>
* turnip: implement image view descriptor (Jonathan Marek, 2019-09-26, 2 files, -2/+90)
| | | | | | Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]>
* turnip: implement sampler state (Jonathan Marek, 2019-09-26, 2 files, -0/+73)
| | | | | | Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]>
* turnip: fix vertex_id (Jonathan Marek, 2019-09-26, 1 file, -1/+1)
| | | | | | | | ir3 uses non-zero based vertex id for a6xx Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]>
* turnip: emit shader immediates (Jonathan Marek, 2019-09-26, 1 file, -0/+37)
| | | | | | Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]>
* util/rb_tree: Stop relying on &iter->field != NULL (Jason Ekstrand, 2019-09-26, 1 file, -41/+28)
| | | | | | | | | | | | The old version of the iterators relies on a &iter->field != NULL check which works fine on older GCC but newer GCC versions and clang have optimizations that break if you do pointer math on a null pointer. The correct solution to this is to do the null comparisons before we do any sort of &iter->field or use rb_node_data to do the reverse operation. Acked-by: Michel Dänzer <[email protected]> Tested-by: Michel Dänzer <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
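The pattern described above (compare the node pointer against NULL first, and only then convert it back to its containing entry) can be sketched as follows. The types and helpers here are hypothetical and use a plain linked chain for brevity; they are not Mesa's rb_tree API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical intrusive node and container, for illustration only. */
struct node {
    struct node *next;
};

struct item {
    int value;
    struct node link;   /* embedded node, at a non-zero offset */
};

/* Convert a node pointer back to its containing item. The NULL test is
 * done on the node itself, before any pointer arithmetic, so no offset
 * is ever applied to a null pointer. */
static struct item *
node_to_item_or_null(struct node *n)
{
    if (n == NULL)
        return NULL;
    return (struct item *)((char *)n - offsetof(struct item, link));
}

/* Sum all values in a NULL-terminated chain of embedded nodes. */
static int
sum_items(struct node *head)
{
    int total = 0;
    for (struct item *it = node_to_item_or_null(head);
         it != NULL;
         it = node_to_item_or_null(it->link.next)) {
        total += it->value;
    }
    return total;
}
```

The loop condition tests the converted item pointer, which is NULL only when the node was NULL, so it does not depend on how the compiler treats arithmetic on null pointers.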
* util/rb_tree: Also test _safe iterators (Jason Ekstrand, 2019-09-26, 1 file, -0/+42)
| | | | | | Acked-by: Michel Dänzer <[email protected]> Tested-by: Michel Dänzer <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* freedreno/a3xx: Mostly fix min-vs-mag filtering decisions on non-mipmap tex. (Eric Anholt, 2019-09-26, 2 files, -73/+12)
| | | | | | | This is based on the fix I used for the same problem on V3D. In this case, it fixes all but the dEQP-GLES2.functional.texture.filtering.2d.*_npot cases of dEQP-GLES2.functional.texture.filtering.2d.*'s failures. Acked-by: Rob Clark <[email protected]>
* intel/compiler: avoid truncating int64_t to int (Maya Rashish, 2019-09-26, 1 file, -1/+1)
| | | | | Reviewed-by: Matt Turner <[email protected]> Signed-off-by: Maya Rashish <[email protected]>
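The commit message does not say where the truncation happened, so the following is only a generic sketch of the hazard named in the subject line: a 64-bit value that passes through a 32-bit variable keeps only its low 32 bits. Both helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: what is left of a 64-bit value after it has been
 * squeezed through a 32-bit variable. Unsigned types are used so the
 * conversion is well defined in C. */
static uint32_t
low_32_bits(int64_t v)
{
    return (uint32_t)(uint64_t)v;
}

/* Hypothetical helper: returns 1 if v can be stored in a 32-bit signed
 * integer without losing information, 0 otherwise. */
static int
fits_in_int32(int64_t v)
{
    return v >= INT32_MIN && v <= INT32_MAX;
}
```

For example, `((int64_t)1 << 32) + 5` does not fit in 32 bits, and only the `5` survives in the low word.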
* lima: support rectangle texture (Icenowy Zheng, 2019-09-26, 4 files, -3/+9)
| | | | | | | | | | | | | | As Vasily discovered, the bit 7 of the word 1 of the texture descriptor is set when reloading the framebuffer, to use framebuffer-based offset rather than normalized one. This bit also works for regular textures to enable accessing with non-normalized offset. Add support for rectangle texture by setting this bit for PIPE_TEXTURE_RECT. Suggested-by: Vasily Khoruzhick <[email protected]> Signed-off-by: Icenowy Zheng <[email protected]> Reviewed-by: Vasily Khoruzhick <[email protected]>
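As a rough sketch of the descriptor change described above: the only detail taken from the commit message is that bit 7 of word 1 selects non-normalized coordinates. The macro name, the enum and the helper are hypothetical and do not come from the lima driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical name for bit 7 of word 1 of the texture descriptor. */
#define TEX_DESC_W1_UNNORM_COORDS (1u << 7)

/* Hypothetical stand-in for the gallium PIPE_TEXTURE_* targets. */
enum tex_target { TEX_2D, TEX_RECT };

/* Set the non-normalized coordinate bit for rectangle textures and
 * clear it for everything else. desc must have at least two words. */
static void
set_coord_mode(uint32_t desc[], enum tex_target target)
{
    if (target == TEX_RECT)
        desc[1] |= TEX_DESC_W1_UNNORM_COORDS;
    else
        desc[1] &= ~TEX_DESC_W1_UNNORM_COORDS;
}
```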
* loader: Avoid use-after-free / use of uninitialized local variables (Michel Dänzer, 2019-09-26, 1 file, -9/+9)
|
| Per the valgrind output below, we were returning the pointer to freed
| memory if none of the later conditional pointer assignments were
| executed. This caused dEQP CI jobs to crash on certain runners,
| presumably due to a double-free down the line.
|
| Also, we were skipping to the out: label before the vendor_id & chip_id
| variables used by it were initialized, resulting in broken
| LIBGL_DEBUG=verbose output such as
|
|  libGL: pci id for fd 4: 51108f00:51108f00, driver radeonsi
|
| Fixes: 5a545e355b23 "loader: always map the "amdgpu" kernel driver name to radeonsi (v2)"
|
| ==403== Invalid read of size 1
| ==403==    at 0x4AFD576: surfaceless_probe_device (platform_surfaceless.c:316)
| ==403==    by 0x4AFD915: dri2_initialize_surfaceless (platform_surfaceless.c:391)
| ==403==    by 0x4AF5EEA: dri2_initialize (egl_dri2.c:984)
| ==403==    by 0x4AF5EEA: dri2_initialize (egl_dri2.c:958)
| ==403==    by 0x4AF1EEC: _eglMatchAndInitialize (egldriver.c:75)
| ==403==    by 0x4AF1F3B: _eglMatchDriver (egldriver.c:96)
| ==403==    by 0x4AE9367: eglInitialize (eglapi.c:617)
| ==403==    by 0x1D99C9: tcu::surfaceless::EglRenderContext::EglRenderContext(glu::RenderConfig const&, tcu::CommandLine const&) [clone .constprop.57] (in /deqp/modules/gles2/deqp-gles2)
| ==403==    by 0x1DABB0: tcu::surfaceless::ContextFactory::createContext(glu::RenderConfig const&, tcu::CommandLine const&, glu::RenderContext const*) const (in /deqp/modules/gles2/deqp-gles2)
| ==403==    by 0x53EBD1: glu::createRenderContext(tcu::Platform&, tcu::CommandLine const&, glu::RenderConfig const&, glu::RenderContext const*) (in /deqp/modules/gles2/deqp-gles2)
| ==403==    by 0x53EFE9: glu::createDefaultRenderContext(tcu::Platform&, tcu::CommandLine const&, glu::ApiType) (in /deqp/modules/gles2/deqp-gles2)
| ==403==    by 0x1DE07A: deqp::gles2::Context::Context(tcu::TestContext&) (in /deqp/modules/gles2/deqp-gles2)
| ==403==    by 0x1DB5EF: deqp::gles2::TestPackage::init() (in /deqp/modules/gles2/deqp-gles2)
| ==403==  Address 0x56bd340 is 0 bytes inside a block of size 4 free'd
| ==403==    at 0x48369AB: free (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
| ==403==    by 0x4B01767: loader_get_driver_for_fd (loader.c:464)
| ==403==    by 0x4AFD553: surfaceless_probe_device (platform_surfaceless.c:308)
| ==403==    by 0x4AFD915: dri2_initialize_surfaceless (platform_surfaceless.c:391)
| ==403==    by 0x4AF5EEA: dri2_initialize (egl_dri2.c:984)
| ==403==    by 0x4AF5EEA: dri2_initialize (egl_dri2.c:958)
| ==403==    by 0x4AF1EEC: _eglMatchAndInitialize (egldriver.c:75)
| ==403==    by 0x4AF1F3B: _eglMatchDriver (egldriver.c:96)
| ==403==    by 0x4AE9367: eglInitialize (eglapi.c:617)
| ==403==    by 0x1D99C9: tcu::surfaceless::EglRenderContext::EglRenderContext(glu::RenderConfig const&, tcu::CommandLine const&) [clone .constprop.57] (in /deqp/modules/gles2/deqp-gles2)
| ==403==    by 0x1DABB0: tcu::surfaceless::ContextFactory::createContext(glu::RenderConfig const&, tcu::CommandLine const&, glu::RenderContext const*) const (in /deqp/modules/gles2/deqp-gles2)
| ==403==    by 0x53EBD1: glu::createRenderContext(tcu::Platform&, tcu::CommandLine const&, glu::RenderConfig const&, glu::RenderContext const*) (in /deqp/modules/gles2/deqp-gles2)
| ==403==    by 0x53EFE9: glu::createDefaultRenderContext(tcu::Platform&, tcu::CommandLine const&, glu::ApiType) (in /deqp/modules/gles2/deqp-gles2)
| ==403==  Block was alloc'd at
| ==403==    at 0x483577F: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
| ==403==    by 0x4EE5E09: strndup (strndup.c:43)
| ==403==    by 0x4B010B1: loader_get_kernel_driver_name (loader.c:101)
| ==403==    by 0x4B016AF: loader_get_driver_for_fd (loader.c:462)
| ==403==    by 0x4AFD553: surfaceless_probe_device (platform_surfaceless.c:308)
| ==403==    by 0x4AFD915: dri2_initialize_surfaceless (platform_surfaceless.c:391)
| ==403==    by 0x4AF5EEA: dri2_initialize (egl_dri2.c:984)
| ==403==    by 0x4AF5EEA: dri2_initialize (egl_dri2.c:958)
| ==403==    by 0x4AF1EEC: _eglMatchAndInitialize (egldriver.c:75)
| ==403==    by 0x4AF1F3B: _eglMatchDriver (egldriver.c:96)
| ==403==    by 0x4AE9367: eglInitialize (eglapi.c:617)
| ==403==    by 0x1D99C9: tcu::surfaceless::EglRenderContext::EglRenderContext(glu::RenderConfig const&, tcu::CommandLine const&) [clone .constprop.57] (in /deqp/modules/gles2/deqp-gles2)
| ==403==    by 0x1DABB0: tcu::surfaceless::ContextFactory::createContext(glu::RenderConfig const&, tcu::CommandLine const&, glu::RenderContext const*) const (in /deqp/modules/gles2/deqp-gles2)
|
| Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
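The first bug described in this commit (returning a pointer that was already freed unless a later branch happened to reassign it) can be avoided by freeing the old buffer only at the point where it is replaced. The sketch below illustrates that pattern. It is not the loader's actual code: both function names are hypothetical, and only the "amdgpu" to "radeonsi" mapping is taken from the commit message.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: heap-allocated copy of s, or NULL on failure. */
static char *
copy_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *p = malloc(n);
    if (p)
        memcpy(p, s, n);
    return p;
}

/* Hypothetical stand-in for a driver lookup. Returns a heap string the
 * caller must free, or NULL on allocation failure. The kernel name is
 * freed only where it is replaced by a mapped name, so the function
 * never returns a pointer to freed memory. */
static char *
driver_for_kernel_name(const char *kernel_name)
{
    char *driver = copy_string(kernel_name);
    if (driver == NULL)
        return NULL;

    if (strcmp(driver, "amdgpu") == 0) {
        char *mapped = copy_string("radeonsi");
        free(driver);
        driver = mapped;
    }
    return driver;
}
```

Names without a mapping are returned as the untouched copy, so every path hands the caller exactly one live allocation (or NULL).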
* Revert "glx: Lift sending the MakeCurrent request to top-level code" (Adam Jackson, 2019-09-26, 2 files, -187/+167)
| | | | | | | | | Apparently this provokes crashes elsewhere in code unrelated to MakeCurrent. I hate GLX so very very much. This reverts commit 999c2aed8826f403b071f52b040ce25b56d35f9d. Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/1207
* Revert "glx: Implement GLX_EXT_no_config_context" (Adam Jackson, 2019-09-26, 12 files, -65/+26)
| | | | | | This reverts commit 0d635ccc912d7122f35f81eec27d8b2c0a2a7a28. Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/1207
* radv: Add debug option to dump meta shaders. (Timur Kristóf, 2019-09-26, 3 files, -2/+6)
| | | | | | | | This new option can help debug shader compiler problems when there are issues with the meta shaders. Signed-off-by: Timur Kristóf <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* amd/common: Introduce ac_get_fs_input_vgpr_cnt. (Timur Kristóf, 2019-09-26, 4 files, -73/+63)
| | | | | | | | | | | Add a function called ac_get_fs_input_vgpr_cnt which will return the number of input VGPRs used by an AMD shader. Previously, radv and radeonsi had the same code duplicated, but this commit also allows them to share this code. Signed-off-by: Timur Kristóf <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radv: Set shared VGPR count in radv_postprocess_config. (Timur Kristóf, 2019-09-26, 2 files, -2/+18)
| | | | | | | | This commit allows RADV to set the shared VGPR count according to the shader config. Signed-off-by: Timur Kristóf <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* amd/common: Add num_shared_vgprs to ac_shader_config for GFX10. (Timur Kristóf, 2019-09-26, 2 files, -0/+20)
| | | | | | | | | In GFX10 wave64 mode, shared VGPRs allow the two wave halves to share some data with each other. Signed-off-by: Timur Kristóf <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* amd/common: Extract some helper functions to ac_shader_util. (Timur Kristóf, 2019-09-26, 5 files, -117/+131)
| | | | | | | | | | This commit moves ac_get_tbuffer_format, ac_get_sampler_dim and ac_get_image_dim into ac_shader_util, thus enabling them to be used by compilers other than LLVM. Signed-off-by: Timur Kristóf <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* amd/common: Move ac_export_mrt_z to ac_llvm_build. (Timur Kristóf, 2019-09-26, 4 files, -75/+76)
| | | | | | | | | The aim of this commit is to keep ac_shader_util LLVM-free, since we would like to use it in ACO later. Signed-off-by: Timur Kristóf <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* aco: CSE readlane/readfirstlane/permute/reduce with the same exec mask (Rhys Perry, 2019-09-26, 2 files, -9/+37)
| | | | | | | | | | v2: rename pass_temp to pass_flags v2: also CSE reductions v3: add ds_swizzle_b32 support v3: check gds/offset0/offset1 fields Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Daniel Schürmann <[email protected]>
* aco: don't CSE v_readlane_b32/v_readfirstlane_b32 (Rhys Perry, 2019-09-26, 1 file, -0/+4)
| | | | | Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Daniel Schürmann <[email protected]>
* aco,radv: rename record_llvm_ir/llvm_ir_string to record_ir/ir_string (Rhys Perry, 2019-09-26, 6 files, -18/+18)
| | | | | | Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Daniel Schürmann <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/aco: return a correct name and description for the backend IR (Rhys Perry, 2019-09-26, 3 files, -2/+9)
| | | | | | Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Daniel Schürmann <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* aco: store printed backend IR in binary (Rhys Perry, 2019-09-26, 1 file, -4/+21)
| | | | | | Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Daniel Schürmann <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* aco,radv/aco: get dissassembly for release builds if requested (Rhys Perry, 2019-09-26, 2 files, -10/+2)
| | | | | | Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Daniel Schürmann <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/aco: actually disable ACO when unsupported (Rhys Perry, 2019-09-26, 1 file, -1/+0)
| | | | | | | | | We were setting this twice. The second time, we weren't later disabling it if unsupported. Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Daniel Schürmann <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* mesa/st: calculate texture size based on EGLImage miplevel (Tapani Pälli, 2019-09-26, 1 file, -2/+5)
| | | | | | | Fixes issues with 'egl-gl_oes_egl_image' Piglit test. Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* meson: fix logic for generating .pc files with old glvnd (Dylan Baker, 2019-09-25, 4 files, -21/+24)
|
| We want to generate .pc files for non-glvnd builds and for builds with old
| glvnd, but the current logic doesn't do that: it builds them
| unconditionally, and for GLES it builds the shared libraries, which is
| also not what we want.
|
| This does not generate .pc files for gles1 or gles2, which we weren't
| doing before either, making this not a regression but a return to the
| status quo.
|
| Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1838
| Fixes: 93df862b6affb6b8507e40601212a58012bfa873 ("meson: re-add incorrect pkg-config files with GLVND for backward compatibility")
| Reviewed-by: Matt Turner <[email protected]>
* nir/range-analysis: Use types to provide better ranges from bcsel and mov (Ian Romanick, 2019-09-25, 1 file, -25/+4)
|
| Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
|
| All Gen7+ platforms had similar results. (Ice Lake shown)
| total instructions in shared programs: 16328255 -> 16315391 (-0.08%)
| instructions in affected programs: 218318 -> 205454 (-5.89%)
| helped: 988
| HURT: 0
| helped stats (abs) min: 1 max: 72 x̄: 13.02 x̃: 10
| helped stats (rel) min: 0.33% max: 16.04% x̄: 6.27% x̃: 4.88%
| 95% mean confidence interval for instructions value: -13.69 -12.35
| 95% mean confidence interval for instructions %-change: -6.55% -5.99%
| Instructions are helped.
|
| total cycles in shared programs: 363683977 -> 363615417 (-0.02%)
| cycles in affected programs: 1475193 -> 1406633 (-4.65%)
| helped: 923
| HURT: 36
| helped stats (abs) min: 1 max: 624 x̄: 75.78 x̃: 48
| helped stats (rel) min: 0.08% max: 13.89% x̄: 5.20% x̃: 5.08%
| HURT stats (abs) min: 1 max: 179 x̄: 38.58 x̃: 4
| HURT stats (rel) min: 0.06% max: 16.56% x̄: 3.33% x̃: 0.29%
| 95% mean confidence interval for cycles value: -75.88 -67.10
| 95% mean confidence interval for cycles %-change: -5.10% -4.66%
| Cycles are helped.
|
| Sandy Bridge
| total instructions in shared programs: 10785779 -> 10785654 (<.01%)
| instructions in affected programs: 13855 -> 13730 (-0.90%)
| helped: 67
| HURT: 0
| helped stats (abs) min: 1 max: 15 x̄: 1.87 x̃: 1
| helped stats (rel) min: 0.20% max: 3.45% x̄: 0.97% x̃: 0.78%
| 95% mean confidence interval for instructions value: -2.47 -1.26
| 95% mean confidence interval for instructions %-change: -1.13% -0.81%
| Instructions are helped.
|
| total cycles in shared programs: 153704799 -> 153704481 (<.01%)
| cycles in affected programs: 101509 -> 101191 (-0.31%)
| helped: 38
| HURT: 13
| helped stats (abs) min: 1 max: 38 x̄: 12.53 x̃: 16
| helped stats (rel) min: 0.07% max: 2.69% x̄: 0.87% x̃: 0.53%
| HURT stats (abs) min: 1 max: 36 x̄: 12.15 x̃: 7
| HURT stats (rel) min: 0.06% max: 2.53% x̄: 0.73% x̃: 0.44%
| 95% mean confidence interval for cycles value: -10.24 -2.24
| 95% mean confidence interval for cycles %-change: -0.75% -0.17%
| Cycles are helped.
|
| LOST: 2
| GAINED: 0
|
| No shader-db change on Iron Lake or GM45.
* nir/range-analysis: Use types in the hash key (Ian Romanick, 2019-09-25, 1 file, -38/+98)
| | | | | | | This allows the result of mov and bcsel to be separately interpreted as float or int depending on the use. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* nir/range-analysis: Bail if the types don't match (Ian Romanick, 2019-09-25, 1 file, -0/+20)
|
| Some shaders are hurt by this change because now a
| load_const(0x00000000) is not recognized as eq_zero when loaded as a
| float. This behavior is restored in a later patch (nir/range-analysis:
| Use types to provide better ranges from bcsel and mov).
|
| v2: Add a comment about reinterpretation of int/uint/bool. Suggested by
| Caio. Rewrite the condition to check for types being float rather than
| checking for types not being all the things that aren't float.
|
| Fixes: 405de7ccb6c ("nir/range-analysis: Rudimentary value range analysis pass")
| Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
|
| All Gen7+ platforms had similar results. (Ice Lake shown)
| total instructions in shared programs: 16327543 -> 16328255 (<.01%)
| instructions in affected programs: 55928 -> 56640 (1.27%)
| helped: 0
| HURT: 208
| HURT stats (abs) min: 1 max: 16 x̄: 3.42 x̃: 3
| HURT stats (rel) min: 0.33% max: 6.74% x̄: 1.31% x̃: 1.12%
| 95% mean confidence interval for instructions value: 3.06 3.79
| 95% mean confidence interval for instructions %-change: 1.17% 1.46%
| Instructions are HURT.
|
| total cycles in shared programs: 363682759 -> 363683977 (<.01%)
| cycles in affected programs: 325758 -> 326976 (0.37%)
| helped: 44
| HURT: 133
| helped stats (abs) min: 1 max: 179 x̄: 33.61 x̃: 5
| helped stats (rel) min: 0.06% max: 14.21% x̄: 2.47% x̃: 0.29%
| HURT stats (abs) min: 1 max: 157 x̄: 20.28 x̃: 14
| HURT stats (rel) min: 0.07% max: 14.44% x̄: 1.42% x̃: 0.73%
| 95% mean confidence interval for cycles value: 0.38 13.39
| 95% mean confidence interval for cycles %-change: -0.06% 0.96%
| Inconclusive result (%-change mean confidence interval includes 0).
|
| Sandy Bridge
| total instructions in shared programs: 10787433 -> 10787443 (<.01%)
| instructions in affected programs: 1842 -> 1852 (0.54%)
| helped: 0
| HURT: 10
| HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
| HURT stats (rel) min: 0.33% max: 1.85% x̄: 0.73% x̃: 0.49%
| 95% mean confidence interval for instructions value: 1.00 1.00
| 95% mean confidence interval for instructions %-change: 0.36% 1.10%
| Instructions are HURT.
|
| total cycles in shared programs: 153724543 -> 153724563 (<.01%)
| cycles in affected programs: 8407 -> 8427 (0.24%)
| helped: 1
| HURT: 3
| helped stats (abs) min: 18 max: 18 x̄: 18.00 x̃: 18
| helped stats (rel) min: 0.98% max: 0.98% x̄: 0.98% x̃: 0.98%
| HURT stats (abs) min: 4 max: 18 x̄: 12.67 x̃: 16
| HURT stats (rel) min: 0.21% max: 0.75% x̄: 0.56% x̃: 0.72%
| 95% mean confidence interval for cycles value: -21.31 31.31
| 95% mean confidence interval for cycles %-change: -1.11% 1.46%
| Inconclusive result (value mean confidence interval includes 0).
|
| No shader-db changes on Iron Lake or GM45.
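The eq_zero remark in this commit message turns on the fact that a 32-bit constant only has a meaning once a type is chosen for it. The sketch below illustrates that idea and is not code from the NIR range-analysis pass: the enum and both helpers are hypothetical, and it assumes `float` is a 32-bit IEEE 754 type.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical type tag, for illustration only. */
enum value_type { TYPE_INT, TYPE_FLOAT };

/* Reinterpret 32 raw bits as a float (assumes 32-bit IEEE 754 float). */
static float
bits_as_float(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Returns 1 if the constant is zero under the given interpretation.
 * The same bits can give different answers for different types. */
static int
const_is_zero(uint32_t bits, enum value_type type)
{
    if (type == TYPE_FLOAT)
        return bits_as_float(bits) == 0.0f;
    return bits == 0;
}
```

The pattern 0x00000000 is zero under both interpretations, while 0x80000000 is zero only as a float (it encodes -0.0) and a large non-zero value as an integer, which is why an analysis has to know the type before it can classify a constant.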
* intel: Add new Comet Lake PCI-ids (Lionel Landwerlin, 2019-09-26, 1 file, -0/+3)
| | | | | | | | Commit bfc4c359b282 ("drm/i915/cml: Add Missing PCI IDs") in i915 added 3 new CML PCI ids. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* intel: use proper label for Comet Lake skus (Lionel Landwerlin, 2019-09-26, 1 file, -18/+18)
| | | | | | Fixes: 82f6a746e8 ("intel: Add support for Comet Lake") Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* freedreno/a6xx: Move instrlen and obj_start writes to fd6_emit_shader (Kristian H. Kristensen, 2019-09-25, 1 file, -32/+44)
| | | | | | Consolidate a few more generic shaders setup regs in fd6_emit_shader. Signed-off-by: Kristian H. Kristensen <[email protected]>