mesa.git - Unnamed repository; edit this file 'description' to name the repository.

	Commit message (Collapse)	Author	Age	Files	Lines
*	panfrost/midgard: Handle csel correctly	Alyssa Rosenzweig	2019-05-12	5	-152/+128
\| \| \| \| \| \| \| \| \| \| \| \|	We use an algebraic pass for the csel optimizations, and use proper vectorized csel ops (i/fcsel_v) for mixed, rather lowering. To avoid regressions along the way, we fix an issue with the copy propagation pass (it should not attempt to propagate constants). Similarly, we take care to break bundles when using csel to fix some scheduler corner cases. Signed-off-by: Alyssa Rosenzweig <[email protected]>
*	iris: Implement ARB_indirect_parameters	Illia Iorin	2019-05-11	5	-23/+156
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	iris_draw_vbo is divided into two functions to remove unnecessary operations from the loop. This implementation of ARB_indirect_parameters takes into account NV_conditional_render by saving MI_PREDICATE_RESULT at the start of a draw call and restoring it at the end also the result of NV_conditional_render is taken into account when computing predicates that limit draw calls for ARB_indirect_parameters in a similar way to 1952fd8d in ANV. v2: Optimize indirect draws (suggested by Kenneth Graunke) v3: (by Kenneth Graunke) - Fix an issue where indirect draws wouldn't set patch information before updating the compiled TCS. - Move some code back to iris_draw_vbo to avoid duplicating it. - Fix minor indentation issues. Signed-off-by: Illia Iorin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	iris: Split iris_update_draw_info into two functions.	Kenneth Graunke	2019-05-11	1	-0/+12
\| \| \| \| \| \| \| \|	Shader draw parameters need updating on each iteration of a multidraw loop, but the primitive based information only needs to be updated once. Also, patch information needs to be recorded before filling out the TCS program key, as it determines the number of HS instances.
*	iris: Use full ways for L3 cache setup on Icelake.	Kenneth Graunke	2019-05-10	1	-0/+1
\| \| \| \| \| \| \| \| \|	Anuj fixed this in i965 and anv, but the fix never landed in iris. Fixes tessellation corruption on Icelake. Thanks to Rafael for bisecting this and tracking it down. Fixes: d0996d5fab6 iris: Emit default L3 config for the render pipeline Reviewed-by: Rafael Antognolli <[email protected]>
*	st/va: set the visible image dimensions in vlVaDeriveImage	Julien Isorce	2019-05-10	1	-2/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This fixes video being rendered incorrectly. User wants height of 360 but internally pipe_video_buffer 's height is 368 in the test below. Test: GST_GL_PLATFORM=egl gst-launch-1.0 videotestsrc ! video/x-raw, width=868, height=360, format=NV12 ! vaapipostproc ! glimagesink Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110443 Signed-off-by: Julien Isorce <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Leo Liu <[email protected]>
*	gallium: Add helper to convert PIPE blending to shader_enum style	Alyssa Rosenzweig	2019-05-10	1	-0/+92
\| \| \| \| \| \| \| \| \|	Complementing the new API-agnostic shader_enum blending style, we add helpers to translate between the two forms. Ideally, we could just use PIPE blending directly, but that makes Vulkan support challenging. Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
*	nir: allow specifying a set of opcodes in lower_alu_to_scalar	Jonathan Marek	2019-05-10	5	-6/+6
\| \| \| \| \| \| \| \| \|	This can be used by both etnaviv and freedreno/a2xx as they are both vec4 architectures with some instructions being scalar-only. Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Christian Gmeiner <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
*	winsys/amdgpu: add VCN JPEG to no user fence group	Leo Liu	2019-05-10	1	-1/+2
\| \| \| \| \| \| \| \| \| \|	There is no user fence for JPEG, the bug triggering kernel WARN_ON(flags & AMDGPU_FENCE_FLAG_64BIT) Signed-off-by: Leo Liu <[email protected]> Acked-by: Christian König <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Cc: [email protected]
*	lima: fix width 4096 resolution GP fail	Qiang Yu	2019-05-10	1	-1/+1
\| \| \| \| \| \| \| \|	When width=4096 and shift_w=0, block_w=0x100 which overflow the PLBU_CMD 8 bits for it. Reviewed-by: Vasily Khoruzhick <[email protected]> Signed-off-by: Qiang Yu <[email protected]>
*	panfrost: Add CAPFs for conservative rasterization	Tomeu Vizoso	2019-05-10	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Just do what everybody else but Nouveau does and return 0.0f. This prevents the repeated logging of these messages on startup: Unexpected PIPE_CAPF 6 query Unexpected PIPE_CAPF 7 query Unexpected PIPE_CAPF 8 query Signed-off-by: Tomeu Vizoso <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
*	panfrost: Only take the fast paths on buffers aligned to block size	Tomeu Vizoso	2019-05-10	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As the functions operate on 16-byte blocks. Fixes this Valgrind error: Invalid read of size 4 at 0x5857568: swizzle_bpp1_align16 (pan_swizzle.c:85) by 0x585780F: panfrost_texture_swizzle (pan_swizzle.c:171) by 0x584F587: panfrost_tile_texture (pan_resource.c:489) by 0x584F641: panfrost_transfer_unmap (pan_resource.c:525) by 0x587718D: u_transfer_helper_transfer_unmap (u_transfer_helper.c:516) by 0x5875D85: pipe_transfer_unmap (u_inlines.h:515) by 0x5875F13: u_default_texture_subdata (u_transfer.c:80) by 0x53FFDC3: st_TexSubImage (st_cb_texture.c:1480) by 0x54005BB: st_TexImage (st_cb_texture.c:1709) by 0x5391353: teximage (teximage.c:3105) by 0x5391353: teximage_err (teximage.c:3132) by 0x5391B9B: _mesa_TexImage2D (teximage.c:3170) by 0x5097A77: shared_dispatch_stub_183 (glapi_mapi_tmp.h:18833) Address 0x1e94f1e8 is 0 bytes after a block of size 16 alloc'd at 0x483F5C8: malloc (vg_replace_malloc.c:299) by 0x584F47D: panfrost_transfer_map (pan_resource.c:467) by 0x587694D: u_transfer_helper_transfer_map (u_transfer_helper.c:243) by 0x5875EA7: u_default_texture_subdata (u_transfer.c:59) by 0x53FFDC3: st_TexSubImage (st_cb_texture.c:1480) by 0x54005BB: st_TexImage (st_cb_texture.c:1709) by 0x5391353: teximage (teximage.c:3105) by 0x5391353: teximage_err (teximage.c:3132) by 0x5391B9B: _mesa_TexImage2D (teximage.c:3170) by 0x5097A77: shared_dispatch_stub_183 (glapi_mapi_tmp.h:18833) by 0x4DA8AB: glu::CallLogWrapper::glTexImage2D(unsigned int, int, int, int, int, int, unsigned int, unsigned int, void const*) (in /home/tomeu/deqp-build/modules/gles2/deqp-gles2) Signed-off-by: Tomeu Vizoso <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Cc: 19.1 <[email protected]>
*	panfrost: Fix two uninitialized accesses in compiler	Tomeu Vizoso	2019-05-10	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Valgrind was complaining of those. NIR_PASS only sets progress to TRUE if there was progress. nir_const_load_to_arr() only sets as many constants as components has the instruction. This was causing some dEQP tests to flip-flop, such as: dEQP-GLES2.functional.fragment_ops.blend.equation_src_func_dst_func.add_src_color_constant_color Signed-off-by: Tomeu Vizoso <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]> Fixes: 14531d676b11 ("nir: make nir_const_value scalar")
*	panfrost: ci: Skip running some tests	Tomeu Vizoso	2019-05-10	1	-0/+2
\| \| \| \| \| \| \| \|	These tests add too much time to the total run time, and some of them even hang the DUTs, even if I haven't been able to reproduce it locally. Signed-off-by: Tomeu Vizoso <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
*	panfrost: ci: Don't restart Weston	Tomeu Vizoso	2019-05-10	1	-8/+1
\| \| \| \| \| \| \| \| \| \| \| \|	There doesn't seem to actually be any noticeably memory leaks on Weston when running dEQP. We do seem to leak quiet a bit in the client, so we still have to run the dEQP runner in batches. This removes the risk of Weston not restarting properly and introducing spurious failures. Signed-off-by: Tomeu Vizoso <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
*	panfrost: ci: Update list of expected failures	Tomeu Vizoso	2019-05-10	1	-79/+7
\| \| \| \| \| \| \| \|	This matches the current state of things on both RK3288 and RK3399. Hopefully, from now on we'll only remove stuff from this list. Signed-off-by: Tomeu Vizoso <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
*	panfrost: ci: Tweak dEQP to improve throughput	Tomeu Vizoso	2019-05-10	1	-2/+8
\| \| \| \| \|	Signed-off-by: Tomeu Vizoso <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
*	panfrost: ci: Fix list of tests to run	Tomeu Vizoso	2019-05-10	1	-2/+2
\| \| \| \| \| \| \| \|	Make sure we have only test case names in the list, excluding names of test groups. Signed-off-by: Tomeu Vizoso <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
*	panfrost: ci: Check for incomplete runs	Tomeu Vizoso	2019-05-10	1	-0/+1
\| \| \| \| \| \| \| \| \|	To improve robustness, check that we got the expected number of results. Right now we hard-code the expected number of tests run, but with some effort we may be able to infer it. Signed-off-by: Tomeu Vizoso <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
*	panfrost: ci: Add tests to flip-flop list	Tomeu Vizoso	2019-05-10	1	-5/+48
\| \| \| \| \| \| \|	These tests aren't giving reliable results. Mask them for now. Signed-off-by: Tomeu Vizoso <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
*	panfrost: ci: Add support for running the tests on RK3288	Tomeu Vizoso	2019-05-10	6	-75/+193
\| \| \| \| \| \| \| \|	Build artifacts for armhf and schedule them on a Veyron Chromebook with RK3288. Signed-off-by: Tomeu Vizoso <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
*	lima: fix tile buffer reloading	Vasily Khoruzhick	2019-05-09	2	-2/+4
\| \| \| \| \| \| \| \| \| \|	Buffer needs to be reloaded every time unless explicit clear() was called. Fixes rendering issues with wayland compositors. Reviewed-by: Qiang Yu <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]>
*	iris: Expose PIPE_CAP_DEVICE_RESET_STATUS_QUERY	Kenneth Graunke	2019-05-09	4	-0/+72
\| \| \| \| \| \|	This provides a way for the application to query whether any resets have happened, which lets us expose "robust" contexts. This also enables the KHR_robust_buffer_access_behavior tests.
*	iris: Hook up device reset callbacks	Kenneth Graunke	2019-05-09	4	-1/+26
\| \| \| \| \| \|	This mechanism lets the driver inform the state tracker about GPU resets, say for destroying a robust API context and reporting a "device lost" error to the application, making it take action to deal with this.
*	iris: Try to recover from GPU hangs.	Kenneth Graunke	2019-05-09	3	-0/+71
\| \| \| \| \| \| \| \| \|	The iris batch module now tries to detect that the kernel has banned our GEM context, creates a new non-banned context, and informs the iris context module that all assumptions about state are now invalid and it needs to reinitialize the relevant state. Based on Chris Wilson's work, but significantly rewritten by me.
*	iris: Add helpers to clone a hardware context.	Chris Wilson	2019-05-09	2	-0/+25
\| \| \| \| \|	(Chris Wilson wrote this code in a patch titled "i965: Be resilient in the face of GPU hangs"; Ken fixed a bug and copied it to iris.)
*	iris: Mark render batches as non-recoverable.	Kenneth Graunke	2019-05-09	1	-0/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Adapted from Chris Wilson's patch. The comment is largely his. Currently, when iris hangs the GPU, it will continue sending batches which incrementally update the state, assuming it's preserved across batches. However, the kernel's GPU reset support reinitializes the guilty context to the default GPU state (reasonably not wanting to trust the current state). This ends up resetting critical things like STATE_BASE_ADDRESS, causing memory accesses in all subsequent batches to be garbage, and almost certainly result in more hangs until we're banned or we kill the machine. We now ask the kernel to ban our render context immediately, so we notice we've gone off the rails as fast as possible. Eventually, we'll attempt to recover and continue. For now, we just avoid torching the GPU over and over.
*	nir: Initialize lower_flrp_progress everywhere	Ian Romanick	2019-05-09	3	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I don't know why I thought NIR_PASS always set the progress variable. Derp. Fixes: d41cdef2a59 ("nir: Use the flrp lowering pass instead of nir_opt_algebraic") Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Coverity CID: 1444996 Coverity CID: 1444995 Coverity CID: 1444994 Coverity CID: 1444993 Coverity CID: 1444991 Coverity CID: 1444989
*	gallium: fix typo in comment	Eric Engestrom	2019-05-09	1	-1/+1
\| \| \| \|	Signed-off-by: Eric Engestrom <[email protected]>
*	iris: Reorganise execbuf to have a single point of failure	Chris Wilson	2019-05-08	1	-27/+20
\| \| \| \| \| \| \| \| \| \| \|	Propagate the failure from GEM_EXECBUFFER2, cleanup then report failure if need be. We retain the current behaviour to abort() at the first sign of trouble -- for a non-robustness context, arguably this is the right thing to do as the client cannot recover, and the system state is lost. How to properly integrate with KHR_robustness and reset-strategy is left as a future exercise. Reviewed-by: Kenneth Graunke <[email protected]>
*	kmsro: add _dri.so to two of the kmsro drivers.	Dave Airlie	2019-05-09	1	-2/+2
\| \| \| \| \| \|	Fixes: 8cfc17bdda3 (kmsro: Add the rest of the current set of tinydrm drivers.) Reviewed-by: Eric Engestrom <[email protected]>
*	iris: Report the same video memory settings as i965.	Kenneth Graunke	2019-05-08	2	-2/+34
\| \| \| \|	This just copy and pastes Ian's code from i965.
*	gallium/util: fix two MSVC compiler warnings	Brian Paul	2019-05-08	2	-3/+3
\| \| \| \| \| \| \|	Remove stray const qualifier. s/unsigned/enum tgsi_semantic/ Reviewed-by: Roland Scheidegger <[email protected]>
*	gallium/pp: s/uint/enum tgsi_semantic/ to fix MSVC warning	Brian Paul	2019-05-08	1	-1/+1
\| \| \| \|	Reviewed-by: Roland Scheidegger <[email protected]>
*	noop: s/enum pipe_transfer_usage/unsigned/ to fix MSVC warning	Brian Paul	2019-05-08	1	-1/+1
\| \| \| \| \| \| \|	The function pointer declaration in pipe_context uses unsigned for the bitmask. Reviewed-by: Roland Scheidegger <[email protected]>
*	ddebug: fix a few MSVC compiler warnings	Brian Paul	2019-05-08	2	-8/+9
\| \| \| \| \| \| \|	Don't return an expression in void functions. Replace an unsigned int with proper enum. Reviewed-by: Roland Scheidegger <[email protected]>
*	radeonsi: add an AMD_TEX_ANISO environment variable	Timothy Arceri	2019-05-08	1	-0/+4
\| \| \| \| \| \| \|	This brings it inline with the recently added AMD_DEBUG. Reviewed-by: Marek Olšák <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109619
*	iris: Also handle res->offset for buffer sampler/image views	Kenneth Graunke	2019-05-07	1	-8/+9
\|
*	iris: support dmabuf imports with offsets	Mike Blumenkrantz	2019-05-07	4	-12/+12
\| \| \| \| \| \| \| \| \|	this adds support for imports where the image data begins at an offset from the start of the buffer, as used in h/x264 fixes kwg/mesa#47 Reviewed-by: Kenneth Graunke <[email protected]>
*	gallivm: fix broken 8-wide s3tc decoding	Roland Scheidegger	2019-05-07	1	-17/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Brian noticed there was an uninitialized var for the 8-wide case and 128 bit blocks, which made it always crash. Likewise, the 64bit block case had another crash bug due to type mismatch. Color decode (used for all s3tc formats) also had a bogus shuffle for this case, leading to decode artifacts. Fix these all up, which makes the code actually work 8-wide. Note that it's still not used - I've verified it works, and the generated assembly does look quite a bit simpler actually (20-30% less instructions for the s3tc decode part with avx2), however in practice it still seems to be sligthly slower for some unknown reason (tested with openarena) on my haswell box, so for now continue to split things into 4-wide vectors before decoding. Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
*	lima: enable sin and cos lowering for GP	Vasily Khoruzhick	2019-05-07	1	-0/+1
\| \| \| \| \| \| \| \|	GP doesn't support sin/cos natively, so we have to lower them. Reviewed-by: Qiang Yu <[email protected]> Tested-by: Qiang Yu <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]>
*	freedreno/ir3: move const_state to ir3_shader	Rob Clark	2019-05-07	1	-8/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For a6xx, we construct/emit a single VS const state used for both binning pass and draw pass. So far we were mostly getting lucky that there were not (obvious) mismatches between the const_state (like different lowered immediates) between the binning and draw pass VS ir3_shader_variant. And I guess this situation will come up more as GS and tess is added into the equation. Since really everything about the const state is not specific to the variant, move this. The main exception is lowered immediates, but these are the last to appear in the layout, and it doesn't hurt for each new shader variant to just append any immed's it lowers to the end of the immediate state. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: move immediates to const_state	Rob Clark	2019-05-07	1	-2/+2
\| \| \| \| \| \| \|	They are really part of the constant state, and it will moving things from ir3_shader_variant to ir3_shader if we combine them. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: consolidate const state	Rob Clark	2019-05-07	1	-15/+23
\| \| \| \| \| \| \| \|	Combine the offsets of differenet parts of the constant space with (what was formerly known as) ir3_driver_const_layout. Bunch of churn, but no functional change. Signed-off-by: Rob Clark <[email protected]>
*	intel/compiler: Use the flrp lowering pass for all stages on Gen4 and Gen5	Ian Romanick	2019-05-06	1	-6/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously lower_flrp32 was only set for vertex shaders. Fragment shaders performed a(1-c)+bc lowering during code generation. The shaders with loops hurt are SIMD8 and SIMD16 shaders for a text-identical fragment shader. v2: Rebase on 26391cceaa1 ("intel/compiler: Lower ffma on Gen4 and Gen5"). v3: Rebase on a004e95dd73 ("radeonsi/nir: create si_nir_opts() helper") Iron Lake total instructions in shared programs: 8211385 -> 8185974 (-0.31%) instructions in affected programs: 2503898 -> 2478487 (-1.01%) helped: 9936 HURT: 921 helped stats (abs) min: 1 max: 155 x̄: 2.86 x̃: 2 helped stats (rel) min: 0.10% max: 35.48% x̄: 1.67% x̃: 1.11% HURT stats (abs) min: 1 max: 12 x̄: 3.24 x̃: 2 HURT stats (rel) min: 0.21% max: 13.64% x̄: 1.86% x̃: 0.89% 95% mean confidence interval for instructions value: -2.43 -2.25 95% mean confidence interval for instructions %-change: -1.41% -1.33% Instructions are helped. total cycles in shared programs: 188523186 -> 188401198 (-0.06%) cycles in affected programs: 71541604 -> 71419616 (-0.17%) helped: 11649 HURT: 1871 helped stats (abs) min: 2 max: 930 x̄: 12.62 x̃: 6 helped stats (rel) min: <.01% max: 44.61% x̄: 0.68% x̃: 0.25% HURT stats (abs) min: 2 max: 138 x̄: 13.38 x̃: 8 HURT stats (rel) min: <.01% max: 10.99% x̄: 0.49% x̃: 0.17% 95% mean confidence interval for cycles value: -9.42 -8.63 95% mean confidence interval for cycles %-change: -0.54% -0.50% Cycles are helped. total loops in shared programs: 852 -> 856 (0.47%) loops in affected programs: 0 -> 4 helped: 0 HURT: 4 HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.00% max: 0.00% x̄: 0.00% x̃: 0.00% 95% mean confidence interval for loops value: 1.00 1.00 95% mean confidence interval for loops %-change: 0.00% 0.00% Loops are HURT. LOST: 3 GAINED: 12 GM45 total instructions in shared programs: 5046407 -> 5033694 (-0.25%) instructions in affected programs: 1303584 -> 1290871 (-0.98%) helped: 5010 HURT: 464 helped stats (abs) min: 1 max: 155 x̄: 2.85 x̃: 2 helped stats (rel) min: 0.10% max: 34.38% x̄: 1.63% x̃: 1.08% HURT stats (abs) min: 1 max: 75 x̄: 3.39 x̃: 2 HURT stats (rel) min: 0.20% max: 13.04% x̄: 1.84% x̃: 0.87% 95% mean confidence interval for instructions value: -2.45 -2.20 95% mean confidence interval for instructions %-change: -1.40% -1.28% Instructions are helped. total cycles in shared programs: 128889476 -> 128812366 (-0.06%) cycles in affected programs: 44845402 -> 44768292 (-0.17%) helped: 6079 HURT: 940 helped stats (abs) min: 2 max: 930 x̄: 15.16 x̃: 8 helped stats (rel) min: <.01% max: 41.03% x̄: 0.71% x̃: 0.25% HURT stats (abs) min: 2 max: 138 x̄: 16.01 x̃: 8 HURT stats (rel) min: <.01% max: 10.99% x̄: 0.50% x̃: 0.17% 95% mean confidence interval for cycles value: -11.63 -10.34 95% mean confidence interval for cycles %-change: -0.58% -0.52% Cycles are helped. total loops in shared programs: 633 -> 635 (0.32%) loops in affected programs: 0 -> 2 helped: 0 HURT: 2 total spills in shared programs: 60 -> 69 (15.00%) spills in affected programs: 54 -> 63 (16.67%) helped: 0 HURT: 1 total fills in shared programs: 92 -> 105 (14.13%) fills in affected programs: 80 -> 93 (16.25%) helped: 0 HURT: 1 LOST: 15 GAINED: 15 Reviewed-by: Jason Ekstrand <[email protected]> [v2] Reviewed-by: Matt Turner <[email protected]> [v2]
*	nir: Use the flrp lowering pass instead of nir_opt_algebraic	Ian Romanick	2019-05-06	3	-0/+71
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I tried to be very careful while updating all the various drivers, but I don't have any of that hardware for testing. :( i965 is the only platform that sets always_precise = true, and it is only set true for fragment shaders. Gen4 and Gen5 both set lower_flrp32 only for vertex shaders. For fragment shaders, nir_op_flrp is lowered during code generation as a(1-c)+bc. On all other platforms 64-bit nir_op_flrp and on Gen11 32-bit nir_op_flrp are lowered using the old nir_opt_algebraic method. No changes on any other Intel platforms. v2: Add panfrost changes. Iron Lake and GM45 had similar results. (Iron Lake shown) total cycles in shared programs: 188647754 -> 188647748 (<.01%) cycles in affected programs: 5096 -> 5090 (-0.12%) helped: 3 HURT: 0 helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 helped stats (rel) min: 0.12% max: 0.12% x̄: 0.12% x̃: 0.12% Reviewed-by: Matt Turner <[email protected]>
*	nir: nir_shader_compiler_options: drop native_integers	Christian Gmeiner	2019-05-07	6	-8/+0
\| \| \| \| \| \| \| \|	Driver which do not support native integers should use a lowering pass to go from integers to floats. Signed-off-by: Christian Gmeiner <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	panfrost: Refactor blend descriptors	Alyssa Rosenzweig	2019-05-07	3	-120/+122
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This commit does a fairly large cleanup of blend descriptors, although there should not be any functional changes. In particular, we split apart the Midgard and Bifrost blend descriptors, since they are radically different. From there, we can identify that the Midgard descriptor as previously written was really two render targets' descriptors stuck together. From this observation, we split the Midgard descriptor into what a single RT actually needs. This enables us to correctly dump blending configuration for MRT samples on Midgard. It also allows the Midgard and Bifrost blend code to peacefully coexist, with runtime selection rather than a #ifdef. So, as a bonus, this will help the future Bifrost effort, eliminating one major source of compile-time architectural divergence. Signed-off-by: Alyssa Rosenzweig <[email protected]>
*	lima/gpir: enable lowering for ftrunc	Vasily Khoruzhick	2019-05-07	1	-0/+1
\| \| \| \| \|	Reviewed-by: Qiang Yu <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]>
*	lima/gpir: implement nir_op_fmov	Vasily Khoruzhick	2019-05-07	1	-0/+1
\| \| \| \| \|	Reviewed-by: Qiang Yu <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]>
*	lima: use int_to_float lowering pass	Vasily Khoruzhick	2019-05-07	1	-2/+6
\| \| \| \| \| \| \| \|	Neither GP nor PP in Mali4x0 support integers, so utilize new pass and set native_integers to true for now until this flag is dropped. Reviewed-by: Qiang Yu <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]>