aboutsummaryrefslogtreecommitdiffstats
path: root/src
Commit message (Collapse)AuthorAgeFilesLines
...
* vulkan/overlay: keep allocating draw data until it can be reusedLionel Landwerlin2019-05-101-113/+135
| | | | | | | | | | | | | | | | | | | | | | | The original implementation assumed that we could allocate the same amount of command buffers as the number of images in the swapchain. But the application could potentially render much faster and rerender into images that have been submitted for presentation but not yet presented. This change keeps on allocating command buffers, vertex buffer, vertex indices as well as a semaphore and a fence for as long as we can't reuse a previously submitted one. This fixes rendering issues in the overlay at high frame rates. v2: Don't recreate semaphores constantly (Józef) v3: Drop useless surface & FreeCommandBuffers (Józef) Signed-off-by: Lionel Landwerlin <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110655 Cc: 19.1 <[email protected]> Reviewed-by: Józef Kucia <[email protected]>
* vulkan/overlay: fix truncating error on 32bit platformsLionel Landwerlin2019-05-101-40/+36
| | | | | | | | | | | | | | Non dispatchable handles can be uint64_t. When compiling the layer on a 32bit platform, this will lead to casting uint64_t into (void *) which is 32bit, leading to incorrect handles being mapped internally in the layer. v2: Use more HKEY() (Eric) Signed-off-by: Lionel Landwerlin <[email protected]> Reported-by: Józef Kucia <[email protected]> Fixes: 2d2927938f074f ("vulkan/overlay-layer: fix cast errors") Reviewed-by: Józef Kucia <[email protected]>
* i965: Fix memory leaks in brw_upload_cs_work_groups_surface().Kenneth Graunke2019-05-101-0/+5
| | | | | | | | | | | | | This was taking a reference to the 64kB upload buffer and never returning it, leaking a reference each time this atom triggered. This leaked lots of 64kB upload BOs, eventually running us out of of VMA space. This would usually happen when using mpv to watch a movie, after 20-40 minutes. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110134 Fixes: 63d7b33f516 i965/cs: Setup surface binding for gl_NumWorkGroups Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* st/va: set the visible image dimensions in vlVaDeriveImageJulien Isorce2019-05-101-2/+4
| | | | | | | | | | | | | | | This fixes video being rendered incorrectly. User wants height of 360 but internally pipe_video_buffer 's height is 368 in the test below. Test: GST_GL_PLATFORM=egl gst-launch-1.0 videotestsrc ! video/x-raw, width=868, height=360, format=NV12 ! vaapipostproc ! glimagesink Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110443 Signed-off-by: Julien Isorce <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Leo Liu <[email protected]>
* swrast: Rename blend_func->swrast_blend_funcAlyssa Rosenzweig2019-05-101-5/+5
| | | | | | | | | | | | This avoids a conflict with the new (driver-agnostic) blend_func enum in shader_enum.h, which broke the build of swrast (and i965 by extension). My apologies :( Signed-off-by: Alyssa Rosenzweig <[email protected]> Fixes: f41be53a ("compiler: Add enums for blend state") Cc: Caio Marcelo de Oliveira Filho <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* nir: Add blend_const_color_rgba sysvalAlyssa Rosenzweig2019-05-101-1/+4
| | | | | | | | | | | | | This represents a float vec4 constant color, as passed to glBlendColor. While the existing 4 shader sysvals are retained to minimize code churn, a single vectorized intrinsic is required for efficient blending on vector architectures. (This may also apply to archictectures like Bifrost where ALU is scalar but load/store is vector; it largely depends on how blending is implemented per-driver.) Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* gallium: Add helper to convert PIPE blending to shader_enum styleAlyssa Rosenzweig2019-05-101-0/+92
| | | | | | | | | Complementing the new API-agnostic shader_enum blending style, we add helpers to translate between the two forms. Ideally, we could just use PIPE blending directly, but that makes Vulkan support challenging. Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* compiler: Add enums for blend stateAlyssa Rosenzweig2019-05-101-0/+21
| | | | | | | | | | | We add enums corresponding to (GLES) blend state to shader_enums.h, complementing the existing advanced blending enums in the file. This allows us to represent blending state in a driver-agnostic, API-agnostic way to permit lowering. Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Acked-by: Eric Anholt <[email protected]>
* nir: allow specifying a set of opcodes in lower_alu_to_scalarJonathan Marek2019-05-1012-19/+24
| | | | | | | | | This can be used by both etnaviv and freedreno/a2xx as they are both vec4 architectures with some instructions being scalar-only. Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Christian Gmeiner <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* intel/fs/copy-prop: Don't walk all the ACPs for each instructionJason Ekstrand2019-05-101-11/+68
| | | | | | | | | | | | | | | | | | | | | In order to set up KILL sets, the dataflow code was walking the entire array of ACPs for every instruction. If you assume the number of ACPs increases roughly with the number of instructions, this is O(n^2). As it turns out, regions_overlap() is not nearly as cheap as one would like and shows up as a significant chunk on perf traces. This commit changes things around and instead first builds an array of exec_lists which it uses like a hash table (keyed off ACP source or destination) similar to what's done in the rest of the copy-prop code. By first walking the list of ACPs and populating the table and then walking instructions and only looking at ACPs which probably have the same VGRF number, we can reduce the complexity to O(n). This takes the execution time of the piglit vs-isnan-dvec test from about 56.4 seconds on an unoptimized debug build (what we run in CI) with NIR_VALIDATE=0 to about 38.7 seconds. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* intel/fs/copy-prop: Purge unused ACPsJason Ekstrand2019-05-101-0/+19
| | | | | | | | | | | | | | If the destination of an ACP entry exists only within this block, then there's no need to keep it for dataflow analysis. We can delete it from the out_acp table and avoid growing the bitsets any bigger than we absolutely have to. This reduces the maximum number of global ACP entries in the vs-isnan-dvec with software fp64 on Kaby Lake from 8630 to 3942 and takes the execution time of the piglit vs-isnan-dvec test from about 1:16.2 on an unoptimized debug build (what we run in CI) with NIR_VALIDATE=0 to about 56.4 seconds. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* intel/fs/copy-prop: Bump the hash table size to 64Jason Ekstrand2019-05-101-1/+1
| | | | | | | | | | While the number of ACPs is generally not huge compared to the number of blocks, 16 does seem a bit small. Bumping it to 64 takes the execution time of the piglit vs-isnan-dvec test from about 1:18.1 on an unoptimized debug build (what we run in CI) with NIR_VALIDATE=0 to about 1:16.2. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* winsys/amdgpu: add VCN JPEG to no user fence groupLeo Liu2019-05-101-1/+2
| | | | | | | | | | There is no user fence for JPEG, the bug triggering kernel WARN_ON(flags & AMDGPU_FENCE_FLAG_64BIT) Signed-off-by: Leo Liu <[email protected]> Acked-by: Christian König <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Cc: [email protected]
* lima: fix width 4096 resolution GP failQiang Yu2019-05-101-1/+1
| | | | | | | | When width=4096 and shift_w=0, block_w=0x100 which overflow the PLBU_CMD 8 bits for it. Reviewed-by: Vasily Khoruzhick <[email protected]> Signed-off-by: Qiang Yu <[email protected]>
* panfrost: Add CAPFs for conservative rasterizationTomeu Vizoso2019-05-101-0/+5
| | | | | | | | | | | | | | Just do what everybody else but Nouveau does and return 0.0f. This prevents the repeated logging of these messages on startup: Unexpected PIPE_CAPF 6 query Unexpected PIPE_CAPF 7 query Unexpected PIPE_CAPF 8 query Signed-off-by: Tomeu Vizoso <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Only take the fast paths on buffers aligned to block sizeTomeu Vizoso2019-05-101-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As the functions operate on 16-byte blocks. Fixes this Valgrind error: Invalid read of size 4 at 0x5857568: swizzle_bpp1_align16 (pan_swizzle.c:85) by 0x585780F: panfrost_texture_swizzle (pan_swizzle.c:171) by 0x584F587: panfrost_tile_texture (pan_resource.c:489) by 0x584F641: panfrost_transfer_unmap (pan_resource.c:525) by 0x587718D: u_transfer_helper_transfer_unmap (u_transfer_helper.c:516) by 0x5875D85: pipe_transfer_unmap (u_inlines.h:515) by 0x5875F13: u_default_texture_subdata (u_transfer.c:80) by 0x53FFDC3: st_TexSubImage (st_cb_texture.c:1480) by 0x54005BB: st_TexImage (st_cb_texture.c:1709) by 0x5391353: teximage (teximage.c:3105) by 0x5391353: teximage_err (teximage.c:3132) by 0x5391B9B: _mesa_TexImage2D (teximage.c:3170) by 0x5097A77: shared_dispatch_stub_183 (glapi_mapi_tmp.h:18833) Address 0x1e94f1e8 is 0 bytes after a block of size 16 alloc'd at 0x483F5C8: malloc (vg_replace_malloc.c:299) by 0x584F47D: panfrost_transfer_map (pan_resource.c:467) by 0x587694D: u_transfer_helper_transfer_map (u_transfer_helper.c:243) by 0x5875EA7: u_default_texture_subdata (u_transfer.c:59) by 0x53FFDC3: st_TexSubImage (st_cb_texture.c:1480) by 0x54005BB: st_TexImage (st_cb_texture.c:1709) by 0x5391353: teximage (teximage.c:3105) by 0x5391353: teximage_err (teximage.c:3132) by 0x5391B9B: _mesa_TexImage2D (teximage.c:3170) by 0x5097A77: shared_dispatch_stub_183 (glapi_mapi_tmp.h:18833) by 0x4DA8AB: glu::CallLogWrapper::glTexImage2D(unsigned int, int, int, int, int, int, unsigned int, unsigned int, void const*) (in /home/tomeu/deqp-build/modules/gles2/deqp-gles2) Signed-off-by: Tomeu Vizoso <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Cc: 19.1 <[email protected]>
* panfrost: Fix two uninitialized accesses in compilerTomeu Vizoso2019-05-101-1/+1
| | | | | | | | | | | | | | | | | Valgrind was complaining of those. NIR_PASS only sets progress to TRUE if there was progress. nir_const_load_to_arr() only sets as many constants as components has the instruction. This was causing some dEQP tests to flip-flop, such as: dEQP-GLES2.functional.fragment_ops.blend.equation_src_func_dst_func.add_src_color_constant_color Signed-off-by: Tomeu Vizoso <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]> Fixes: 14531d676b11 ("nir: make nir_const_value scalar")
* panfrost: ci: Skip running some testsTomeu Vizoso2019-05-101-0/+2
| | | | | | | | These tests add too much time to the total run time, and some of them even hang the DUTs, even if I haven't been able to reproduce it locally. Signed-off-by: Tomeu Vizoso <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: ci: Don't restart WestonTomeu Vizoso2019-05-101-8/+1
| | | | | | | | | | | | There doesn't seem to actually be any noticeably memory leaks on Weston when running dEQP. We do seem to leak quiet a bit in the client, so we still have to run the dEQP runner in batches. This removes the risk of Weston not restarting properly and introducing spurious failures. Signed-off-by: Tomeu Vizoso <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: ci: Update list of expected failuresTomeu Vizoso2019-05-101-79/+7
| | | | | | | | This matches the current state of things on both RK3288 and RK3399. Hopefully, from now on we'll only remove stuff from this list. Signed-off-by: Tomeu Vizoso <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: ci: Tweak dEQP to improve throughputTomeu Vizoso2019-05-101-2/+8
| | | | | Signed-off-by: Tomeu Vizoso <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: ci: Fix list of tests to runTomeu Vizoso2019-05-101-2/+2
| | | | | | | | Make sure we have only test case names in the list, excluding names of test groups. Signed-off-by: Tomeu Vizoso <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: ci: Check for incomplete runsTomeu Vizoso2019-05-101-0/+1
| | | | | | | | | To improve robustness, check that we got the expected number of results. Right now we hard-code the expected number of tests run, but with some effort we may be able to infer it. Signed-off-by: Tomeu Vizoso <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: ci: Add tests to flip-flop listTomeu Vizoso2019-05-101-5/+48
| | | | | | | These tests aren't giving reliable results. Mask them for now. Signed-off-by: Tomeu Vizoso <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: ci: Add support for running the tests on RK3288Tomeu Vizoso2019-05-106-75/+193
| | | | | | | | Build artifacts for armhf and schedule them on a Veyron Chromebook with RK3288. Signed-off-by: Tomeu Vizoso <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* lima: fix tile buffer reloadingVasily Khoruzhick2019-05-092-2/+4
| | | | | | | | | | Buffer needs to be reloaded every time unless explicit clear() was called. Fixes rendering issues with wayland compositors. Reviewed-by: Qiang Yu <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]>
* anv: Remove special allocation for anv_push_constantsCaio Marcelo de Oliveira Filho2019-05-093-79/+7
| | | | | | | | | | | | | The key reason for that mechanism is gone: all the extra optional data that could be in the anv_push_constants was moved elsewhere. At this point, just put anv_push_constants directly in anv_cmd_state (part of anv_cmd_buffer). v2: Remove a NULL check we don't need anymore in anv_cmd_buffer_push_constants(). (Lionel) Fix size we consider for valid push params. (Lionel) Reviewed-by: Lionel Landwerlin <[email protected]>
* iris: Expose PIPE_CAP_DEVICE_RESET_STATUS_QUERYKenneth Graunke2019-05-094-0/+72
| | | | | | This provides a way for the application to query whether any resets have happened, which lets us expose "robust" contexts. This also enables the KHR_robust_buffer_access_behavior tests.
* iris: Hook up device reset callbacksKenneth Graunke2019-05-094-1/+26
| | | | | | This mechanism lets the driver inform the state tracker about GPU resets, say for destroying a robust API context and reporting a "device lost" error to the application, making it take action to deal with this.
* iris: Try to recover from GPU hangs.Kenneth Graunke2019-05-093-0/+71
| | | | | | | | | The iris batch module now tries to detect that the kernel has banned our GEM context, creates a new non-banned context, and informs the iris context module that all assumptions about state are now invalid and it needs to reinitialize the relevant state. Based on Chris Wilson's work, but significantly rewritten by me.
* iris: Add helpers to clone a hardware context.Chris Wilson2019-05-092-0/+25
| | | | | (Chris Wilson wrote this code in a patch titled "i965: Be resilient in the face of GPU hangs"; Ken fixed a bug and copied it to iris.)
* iris: Mark render batches as non-recoverable.Kenneth Graunke2019-05-091-0/+22
| | | | | | | | | | | | | | | | | | Adapted from Chris Wilson's patch. The comment is largely his. Currently, when iris hangs the GPU, it will continue sending batches which incrementally update the state, assuming it's preserved across batches. However, the kernel's GPU reset support reinitializes the guilty context to the default GPU state (reasonably not wanting to trust the current state). This ends up resetting critical things like STATE_BASE_ADDRESS, causing memory accesses in all subsequent batches to be garbage, and almost certainly result in more hangs until we're banned or we kill the machine. We now ask the kernel to ban our render context immediately, so we notice we've gone off the rails as fast as possible. Eventually, we'll attempt to recover and continue. For now, we just avoid torching the GPU over and over.
* freedreno/ir3: fix rasterflat/glxgearsRob Clark2019-05-092-5/+5
| | | | | | | | Ofc legacy gl features that are broken don't trigger fails in deqp. I should remember to test glxgears more often. Fixes: 7ff6705b8d8 freedreno/ir3: convert to "new style" frag inputs Signed-off-by: Rob Clark <[email protected]>
* anv: Use corresponding type from the vector allocationLionel Landwerlin2019-05-092-10/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We didn't notice this issue much because the 2 struct share a similar layout, expect for the additional fields... We run into that issue in Anv : ==15236== Invalid write of size 8 ==15236== at 0x8CF3939C: anv_state_table_expand_range (anv_allocator.c:211) ==15236== by 0x8CF394D5: anv_state_table_grow (anv_allocator.c:264) ==15236== by 0x8CF3967E: anv_state_table_add (anv_allocator.c:312) ==15236== by 0x8CF3B13C: anv_state_pool_alloc_no_vg (anv_allocator.c:1167) ==15236== by 0x8CF3B2B0: anv_state_pool_alloc (anv_allocator.c:1190) ==15236== by 0x8CF60871: alloc_surface_state (anv_image.c:1122) ==15236== by 0x8CF61FF9: anv_CreateImageView (anv_image.c:1519) ==15236== by 0x8BCBD2ED: vkCreateImageView (trampoline.c:1358) ==15236== Address 0x8994ef10 is 0 bytes after a block of size 128 alloc'd ==15236== at 0x4C2FB0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==15236== by 0x8D2578E6: u_vector_init (u_vector.c:47) ==15236== by 0x8CF3929A: anv_state_table_init (anv_allocator.c:168) ==15236== by 0x8CF3A99A: anv_state_pool_init (anv_allocator.c:921) ==15236== by 0x8CF56517: anv_CreateDevice (anv_device.c:1909) ==15236== by 0x8BCB4FBA: terminator_CreateDevice (loader.c:6073) ==15236== by 0x8DD2CB3D: ??? (in /home/djdeath/.steam/ubuntu12_64/libVkLayer_steam_fossilize.so) ==15236== by 0x8DF4D241: vkCreateDevice (in /home/djdeath/.steam/ubuntu12_64/steamoverlayvulkanlayer.so) ==15236== by 0x8BCB35C6: loader_create_device_chain (loader.c:5449) ==15236== by 0x8BCBC230: vkCreateDevice (trampoline.c:838) v2: Rename mmap_cleanups to avoid confusion (Caio) v3: s/fail_mmap_cleanups/fail_cleanups/ (Caio) Signed-off-by: Lionel Landwerlin <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110648 Cc: <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* mesa: fix GL_PROGRAM_BINARY_RETRIEVABLE_HINT handlingPierre-Eric Pelloux-Prayer2019-05-092-4/+11
| | | | | | | | | | | | | | | When first implemented in fefd03e16c16 Mesa's behavior was aligned on behavior of Nvidia's driver. This caused a failing test in piglit but was ok since the specification is unclear on this subject. Nvidia's driver behavior has been modified because using version 410.104, the problematic test (program_binary_retrievable_hint) now passes. This commit defers BinaryRetrievableHint update until the next linking so the test passes on Mesa as well. Signed-off-by: Pierre-Eric Pelloux-Prayer <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* nir: Initialize lower_flrp_progress everywhereIan Romanick2019-05-096-6/+6
| | | | | | | | | | | | | | | | I don't know why I thought NIR_PASS always set the progress variable. Derp. Fixes: d41cdef2a59 ("nir: Use the flrp lowering pass instead of nir_opt_algebraic") Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Coverity CID: 1444996 Coverity CID: 1444995 Coverity CID: 1444994 Coverity CID: 1444993 Coverity CID: 1444991 Coverity CID: 1444989
* gallium: fix typo in commentEric Engestrom2019-05-091-1/+1
| | | | Signed-off-by: Eric Engestrom <[email protected]>
* i965_asm: avoid free()ing uninitialized pointersEric Engestrom2019-05-091-1/+1
| | | | | Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* i965_asm: fix memleakEric Engestrom2019-05-091-0/+1
| | | | | Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* radv: fix setting the number of rectangles when it's dyanmicSamuel Pitoiset2019-05-091-4/+6
| | | | | | | | | | We need to know the number of rectangles. This fixes new CTS dEQP-VK.draw.discard_rectangles.dynamic_*. Fixes: 5db0bf99944 ("radv: Implement VK_EXT_discard_rectangles.") Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* iris: Reorganise execbuf to have a single point of failureChris Wilson2019-05-081-27/+20
| | | | | | | | | | | Propagate the failure from GEM_EXECBUFFER2, cleanup then report failure if need be. We retain the current behaviour to abort() at the first sign of trouble -- for a non-robustness context, arguably this is the right thing to do as the client cannot recover, and the system state is lost. How to properly integrate with KHR_robustness and reset-strategy is left as a future exercise. Reviewed-by: Kenneth Graunke <[email protected]>
* kmsro: add _dri.so to two of the kmsro drivers.Dave Airlie2019-05-091-2/+2
| | | | | | Fixes: 8cfc17bdda3 (kmsro: Add the rest of the current set of tinydrm drivers.) Reviewed-by: Eric Engestrom <[email protected]>
* iris: Report the same video memory settings as i965.Kenneth Graunke2019-05-082-2/+34
| | | | This just copy and pastes Ian's code from i965.
* gallium/util: fix two MSVC compiler warningsBrian Paul2019-05-082-3/+3
| | | | | | | Remove stray const qualifier. s/unsigned/enum tgsi_semantic/ Reviewed-by: Roland Scheidegger <[email protected]>
* gallium/pp: s/uint/enum tgsi_semantic/ to fix MSVC warningBrian Paul2019-05-081-1/+1
| | | | Reviewed-by: Roland Scheidegger <[email protected]>
* noop: s/enum pipe_transfer_usage/unsigned/ to fix MSVC warningBrian Paul2019-05-081-1/+1
| | | | | | | The function pointer declaration in pipe_context uses unsigned for the bitmask. Reviewed-by: Roland Scheidegger <[email protected]>
* ddebug: fix a few MSVC compiler warningsBrian Paul2019-05-082-8/+9
| | | | | | | Don't return an expression in void functions. Replace an unsigned int with proper enum. Reviewed-by: Roland Scheidegger <[email protected]>
* glsl: s/GLboolean/bool/ to silence MSVC compiler warningBrian Paul2019-05-081-1/+1
| | | | | | It complains about mixing GLboolean and bool in the |= expression. Reviewed-by: Roland Scheidegger <[email protected]>
* nir/flrp: Reassociate add in flrp(±1, b, c) lowering pathIan Romanick2019-05-081-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With this reassociation, this lowering path is still beneficial. Ice Lake total instructions in shared programs: 17220191 -> 17207181 (-0.08%) instructions in affected programs: 999871 -> 986861 (-1.30%) helped: 3703 HURT: 17 helped stats (abs) min: 1 max: 686 x̄: 3.52 x̃: 3 helped stats (rel) min: 0.09% max: 51.97% x̄: 2.21% x̃: 1.35% HURT stats (abs) min: 1 max: 9 x̄: 1.47 x̃: 1 HURT stats (rel) min: 0.08% max: 4.55% x̄: 0.78% x̃: 0.55% 95% mean confidence interval for instructions value: -4.01 -2.99 95% mean confidence interval for instructions %-change: -2.29% -2.11% Instructions are helped. total cycles in shared programs: 360871298 -> 360755040 (-0.03%) cycles in affected programs: 9931334 -> 9815076 (-1.17%) helped: 2388 HURT: 1569 helped stats (abs) min: 1 max: 10228 x̄: 93.54 x̃: 18 helped stats (rel) min: <.01% max: 74.11% x̄: 3.36% x̃: 1.07% HURT stats (abs) min: 1 max: 1917 x̄: 68.27 x̃: 22 HURT stats (rel) min: <.01% max: 44.90% x̄: 3.44% x̃: 1.72% 95% mean confidence interval for cycles value: -39.48 -19.28 95% mean confidence interval for cycles %-change: -0.86% -0.46% Cycles are helped. total spills in shared programs: 12355 -> 12159 (-1.59%) spills in affected programs: 295 -> 99 (-66.44%) helped: 2 HURT: 1 total fills in shared programs: 25398 -> 25207 (-0.75%) fills in affected programs: 288 -> 97 (-66.32%) helped: 2 HURT: 1 LOST: 3 GAINED: 44 Iron Lake total instructions in shared programs: 8169225 -> 8159729 (-0.12%) instructions in affected programs: 1025712 -> 1016216 (-0.93%) helped: 3352 HURT: 0 helped stats (abs) min: 1 max: 6 x̄: 2.83 x̃: 3 helped stats (rel) min: 0.15% max: 12.00% x̄: 1.51% x̃: 1.05% 95% mean confidence interval for instructions value: -2.86 -2.80 95% mean confidence interval for instructions %-change: -1.56% -1.46% Instructions are helped. total cycles in shared programs: 188656796 -> 188612280 (-0.02%) cycles in affected programs: 18633584 -> 18589068 (-0.24%) helped: 3085 HURT: 14 helped stats (abs) min: 2 max: 72 x̄: 14.45 x̃: 12 helped stats (rel) min: 0.02% max: 5.73% x̄: 0.73% x̃: 0.31% HURT stats (abs) min: 2 max: 4 x̄: 3.71 x̃: 4 HURT stats (rel) min: <.01% max: <.01% x̄: <.01% x̃: <.01% 95% mean confidence interval for cycles value: -14.55 -14.18 95% mean confidence interval for cycles %-change: -0.76% -0.69% Cycles are helped. GM45 total instructions in shared programs: 5026905 -> 5021856 (-0.10%) instructions in affected programs: 584169 -> 579120 (-0.86%) helped: 1776 HURT: 0 helped stats (abs) min: 1 max: 6 x̄: 2.84 x̃: 3 helped stats (rel) min: 0.15% max: 11.11% x̄: 1.43% x̃: 0.98% 95% mean confidence interval for instructions value: -2.88 -2.80 95% mean confidence interval for instructions %-change: -1.50% -1.37% Instructions are helped. total cycles in shared programs: 129047376 -> 129018918 (-0.02%) cycles in affected programs: 12941924 -> 12913466 (-0.22%) helped: 1722 HURT: 14 helped stats (abs) min: 4 max: 72 x̄: 16.56 x̃: 18 helped stats (rel) min: 0.02% max: 5.73% x̄: 0.72% x̃: 0.30% HURT stats (abs) min: 2 max: 4 x̄: 3.71 x̃: 4 HURT stats (rel) min: <.01% max: <.01% x̄: <.01% x̃: <.01% 95% mean confidence interval for cycles value: -16.65 -16.13 95% mean confidence interval for cycles %-change: -0.76% -0.66% Cycles are helped. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* nir/flrp: Fix typo on the flrp(±1, b, c) pathIan Romanick2019-05-081-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After Samuel reported the bisect, I was able to find the bug by inspection. Good thing for well-named varibles. :) Unfortunately, this undoes almost all of the benefit of the original patch. Ice Lake total instructions in shared programs: 17183159 -> 17218166 (0.20%) instructions in affected programs: 1308722 -> 1343729 (2.67%) helped: 98 HURT: 4746 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.47% max: 2.70% x̄: 0.60% x̃: 0.57% HURT stats (abs) min: 1 max: 691 x̄: 7.40 x̃: 8 HURT stats (rel) min: 0.10% max: 700.00% x̄: 5.82% x̃: 2.83% 95% mean confidence interval for instructions value: 6.82 7.64 95% mean confidence interval for instructions %-change: 5.22% 6.15% Instructions are HURT. total cycles in shared programs: 360705959 -> 360853522 (0.04%) cycles in affected programs: 10754380 -> 10901943 (1.37%) helped: 1594 HURT: 3331 helped stats (abs) min: 1 max: 1896 x̄: 119.81 x̃: 60 helped stats (rel) min: <.01% max: 35.48% x̄: 5.06% x̃: 3.64% HURT stats (abs) min: 1 max: 10208 x̄: 101.63 x̃: 38 HURT stats (rel) min: 0.01% max: 878.95% x̄: 9.01% x̃: 2.78% 95% mean confidence interval for cycles value: 21.11 38.81 95% mean confidence interval for cycles %-change: 3.76% 5.15% Cycles are HURT. total spills in shared programs: 12158 -> 12355 (1.62%) spills in affected programs: 98 -> 295 (201.02%) helped: 1 HURT: 2 total fills in shared programs: 25204 -> 25398 (0.77%) fills in affected programs: 94 -> 288 (206.38%) helped: 0 HURT: 3 LOST: 15 GAINED: 8 Iron Lake total instructions in shared programs: 8121430 -> 8166733 (0.56%) instructions in affected programs: 1148353 -> 1193656 (3.95%) helped: 2 HURT: 4046 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 1.85% max: 1.92% x̄: 1.89% x̃: 1.89% HURT stats (abs) min: 1 max: 43 x̄: 11.20 x̃: 11 HURT stats (rel) min: 0.20% max: 716.67% x̄: 7.40% x̃: 3.87% 95% mean confidence interval for instructions value: 11.02 11.37 95% mean confidence interval for instructions %-change: 6.84% 7.94% Instructions are HURT. total cycles in shared programs: 188376326 -> 188601568 (0.12%) cycles in affected programs: 27416674 -> 27641916 (0.82%) helped: 68 HURT: 3947 helped stats (abs) min: 2 max: 222 x̄: 13.88 x̃: 6 helped stats (rel) min: <.01% max: 1.28% x̄: 0.15% x̃: 0.01% HURT stats (abs) min: 2 max: 670 x̄: 57.31 x̃: 64 HURT stats (rel) min: <.01% max: 1811.11% x̄: 4.11% x̃: 1.09% 95% mean confidence interval for cycles value: 55.01 57.20 95% mean confidence interval for cycles %-change: 2.88% 5.19% Cycles are HURT. LOST: 35 GAINED: 3 GM45 total instructions in shared programs: 4979794 -> 5003551 (0.48%) instructions in affected programs: 635174 -> 658931 (3.74%) helped: 1 HURT: 2142 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 1.85% max: 1.85% x̄: 1.85% x̃: 1.85% HURT stats (abs) min: 1 max: 43 x̄: 11.09 x̃: 11 HURT stats (rel) min: 0.20% max: 716.67% x̄: 7.00% x̃: 3.53% 95% mean confidence interval for instructions value: 10.85 11.33 95% mean confidence interval for instructions %-change: 6.25% 7.74% Instructions are HURT. total cycles in shared programs: 128519586 -> 128654990 (0.11%) cycles in affected programs: 17635304 -> 17770708 (0.77%) helped: 46 HURT: 2088 helped stats (abs) min: 4 max: 220 x̄: 18.13 x̃: 6 helped stats (rel) min: <.01% max: 1.28% x̄: 0.15% x̃: 0.01% HURT stats (abs) min: 2 max: 670 x̄: 65.25 x̃: 66 HURT stats (rel) min: <.01% max: 1464.29% x̄: 4.05% x̃: 0.99% 95% mean confidence interval for cycles value: 61.75 65.15 95% mean confidence interval for cycles %-change: 2.58% 5.34% Cycles are HURT. LOST: 38 GAINED: 38 Fixes: 5b908db604b ("nir/flrp: Lower flrp(±1, b, c) and flrp(a, ±1, c) differently") Reported-by: Samuel Pitoiset <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>