mesa.git - Unnamed repository; edit this file 'description' to name the repository.

	Commit message	Author	Age	Files	Lines
...
*	radv: Handle clip+cull distances more generally as compact arrays.	Bas Nieuwenhuizen	2019-02-25	4	-99/+83
\| \| \| \| \| \| \| \| \| \| \| \| \|	Needed for https://gitlab.freedesktop.org/mesa/mesa/merge_requests/248 . That MR keeps the clip and cull arrays split. So we have to handle: (1) compact arrays with location_frac != 0, and (2) VARYING_SLOT_CLIP_DIST1. Reviewed-by: Samuel Pitoiset <[email protected]> (cherry picked from commit 1ef2855692d53349588fa3a9b425c9ae229e5e14)
*	nir/xfb: Handle compact arrays in gather_xfb_info	Jason Ekstrand	2019-02-25	1	-11/+22
\| \| \| \| \| \| \| \|	This makes us properly handle gl_ClipDistance and gl_CullDistance. Fixes: 19064b8c "nir: Add a pass for gathering transform feedback info" Reviewed-by: Alejandro Piñeiro <[email protected]> (cherry picked from commit 1a93fc382b18ee6d1135952d23f0b6a8aa8cd31f)
*	nir/xfb: Work in terms of components rather than slots	Jason Ekstrand	2019-02-25	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We needed to better handle cases where a chunk of a variable starts at some non-zero location_frac and rolls over into the next slot but may not be more than 4 dwords. For example, if gl_CullDistance is an array of 3 things and has location_frac = 2, it will span across two vec4s but is not, itself, bigger than a vec4. If you ignore the clip/cull special case, it's not allowed to happen for anything else because the only things that can span more than one slot is dvec3 and dvec4 and they're both bigger than a vec4. The current code uses this attrib_slot thing where we count attribute slots and iterate over them. However, that doesn't work in the case above because gl_CullDistance will have an attrib_slot count of 1 even though it does span two slots. We could fix this by adjusting attrib_slot but we already have comp_mask and it's easier to just handle it that way. Reviewed-by: Alejandro Piñeiro <[email protected]> (cherry picked from commit 558c3145045f1c6da8bddb31ed77a418ab27f2f9)
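The gl_CullDistance case described above can be sketched with a per-component bitmask. This is only an illustration: `span_comp_mask` and `slots_touched` are hypothetical helpers, not functions from Mesa, and the mask layout (one bit per 32-bit component, counted from component 0 of the first slot) is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not Mesa code: one bit per 32-bit component,
 * counted from component 0 of the variable's first vec4 slot. */
static uint32_t span_comp_mask(unsigned location_frac, unsigned num_comps)
{
    assert(location_frac < 4);
    assert(num_comps >= 1 && location_frac + num_comps <= 32);
    uint32_t low = (num_comps == 32) ? UINT32_MAX
                                     : ((UINT32_C(1) << num_comps) - 1);
    return low << location_frac;
}

/* Number of vec4 slots the mask reaches into (each slot covers 4 bits). */
static unsigned slots_touched(uint32_t comp_mask)
{
    unsigned slots = 0;
    for (; comp_mask != 0; comp_mask >>= 4)
        slots++;
    return slots;
}
```

With `location_frac = 2` and 3 components the mask is `0x1c`, which reaches into two slots even though the variable itself is smaller than a vec4.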
*	nir: Rewrite lower_clip_cull_distance_arrays to do a lot less lowering	Jason Ekstrand	2019-02-25	1	-117/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Instead of going to all the work of combining them into one array, just make two arrays and use location_frac to colocate them within CLIP0. Then the back-end can sort things out and stack them on top of each other. Thanks to ef99f4c8, we also don't need to set compact anymore. Reviewed-by: Alejandro Piñeiro <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> (cherry picked from commit 4e69fba534e7377f3bc6c40c73e6bc5c23437d4e) Conflicts resolved by Dylan Conflicts: src/compiler/nir/nir_lower_clip_cull_distance_arrays.c
*	nir/xfb: Properly align 64-bit values	Jason Ekstrand	2019-02-25	1	-0/+4
\| \| \| \| \| \|	Fixes: 19064b8c "nir: Add a pass for gathering transform feedback info" Reviewed-by: Alejandro Piñeiro <[email protected]> (cherry picked from commit 8f0fe71cc5658728adc273daa03400aab7ec6d93)
*	compiler/types: Add a contains_64bit helper	Jason Ekstrand	2019-02-25	4	-0/+29
\| \| \| \| \|	Reviewed-by: Alejandro Piñeiro <[email protected]> (cherry picked from commit 30b548fc6258e9a72722f511e377cf4716fd443c)
*	radv: Allow interpolation on non-float types.	Bas Nieuwenhuizen	2019-02-25	1	-10/+9
\| \| \| \| \| \| \| \| \| \| \|	In particular structs containing floats and 16-bit floating point types. Fixes: 62024fa7750 "radv: enable VK_KHR_16bit_storage extension / 16bit storage features" Fixes: da295946361 "spirv: Only split blocks" Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109735 Reviewed-by: Samuel Pitoiset <[email protected]> (cherry picked from commit f3247841040a202faffe4709c07da9bd41693580)
*	radv: Fix float16 interpolation set up.	Bas Nieuwenhuizen	2019-02-25	6	-15/+92
\| \| \| \| \| \| \| \| \| \| \| \| \|	float16 types can have non-flat interpolation so set up the HW correctly for that. Fixes: 62024fa7750 "radv: enable VK_KHR_16bit_storage extension / 16bit storage features" Reviewed-by: Samuel Pitoiset <[email protected]> (cherry picked from commit a1fdd4a4a73604469b6204a56457b08f8ae4a948) Conflicts resolved by Dylan Conflicts: src/amd/vulkan/radv_nir_to_llvm.c
*	dri: meson: do not prefix user provided dri-drivers-path	Sergii Romantsov	2019-02-25	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The user can select the location where the dri drivers are installed with the dri-drivers-path meson option. By default the path will be $prefix/$libdir/dri. Currently we add $prefix to the user provided path, resulting in an incorrect or even missing path. v2: fixed dri_search_path by default, rebased to master v3: new commit-message (Emil Velikov), cc mesa-stable Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109698 CC: Rafael Antognolli <[email protected]> CC: Dylan Baker <[email protected]> Cc: 18.3 19.0 <[email protected]> Fixes: 306914db92e1 (meson: Add dridriverdir variable to dri.pc.) Signed-off-by: Sergii Romantsov <[email protected]> Reviewed-by: Emil Velikov <[email protected]> (cherry picked from commit f6556ec7d126b31da37c08d7cb657250505e01a0)
*	meson: ensure that xmlpool_options.h is generated for gallium targets that ↵	David Shao	2019-02-25	5	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \|	need it Fixes: 68076b87474e7959c161 "meson: build gallium vdpau state tracker" Fixes: 22a817af8a89eb3c762f "meson: build gallium xvmc state tracker" Fixes: 5a785d51a6d68ec676ce "meson: build gallium va state tracker" Fixes: 0ba909f0f111824223bc "meson: build gallium xa state tracker" Fixes: 1d36dc674d528b93bec3 "meson: build gallium omx state tracker" Reviewed-by: Eric Engestrom <[email protected]> (cherry picked from commit 6fa923a65daf1ee73c5cc763ade91abc82da7085)
*	swr/rast: bypass size limit for non-sampled textures	Alok Hota	2019-02-25	1	-1/+3
\| \| \| \| \| \| \| \| \| \|	This fixes a bug where SWR will fail to render in cases with large buffer allocations, e.g. very large meshes whose vertex buffers exceed 2GB CC: <[email protected]> Reviewed-by: Bruce Cherniak <[email protected]> (cherry picked from commit 6053499f2eafde606b13a9663016e9be8e4089eb)
*	tgsi: don't set tgsi_info::uses_bindless_images for constbufs and hw atomics	Marek Olšák	2019-02-25	1	-1/+3
\| \| \| \| \| \| \| \| \|	This might have decreased performance for radeonsi/tgsi, because most shaders claimed they used bindless. Cc: 18.3 19.0 <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> (cherry picked from commit b326a15edab34d09e7b328dd8726137960ae12a5)
*	anv: advertise 8 subpixel precision bits	Juan A. Suarez Romero	2019-02-25	2	-1/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On one hand, when emitting 3DSTATE_SF, VertexSubPixelPrecisionSelect selects between 8 bit subpixel precision (value 0) and 4 bit subpixel precision (value 1). As this value is not set, it takes the value 0, so 8 bits are used. On the other hand, in the Vulkan CTS tests, if the reference rasterizer (which uses 8 bit precision to compute the expected values for the tests) is changed to use 4 bits, as ANV was advertising so far, some of the tests fail. So it seems ANV is actually using 8 bits. v2: explicitly set 3DSTATE_SF::VertexSubPixelPrecisionSelect (Jason) v3: use _8Bit definition as value (Jason) v4: (by Jason) anv: Explicitly set 3DSTATE_CLIP::VertexSubPixelPrecisionSelect This field was added on gen8 even though there's an identically defined one in 3DSTATE_SF. CC: Jason Ekstrand <[email protected]> CC: Kenneth Graunke <[email protected]> CC: 18.3 19.0 <[email protected]> Signed-off-by: Juan A. Suarez Romero <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> (cherry picked from commit 4f917e6a61860b58a05d40584f7aa3d5e4e32b75)
*	genxml: add missing field values for 3DSTATE_SF	Juan A. Suarez Romero	2019-02-25	6	-6/+24
\| \| \| \| \| \| \| \| \| \|	Fill out "Vertex Sub Pixel Precision Select" possible values. CC: 18.3 19.0 <[email protected]> Signed-off-by: Juan A. Suarez Romero <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> (cherry picked from commit 3b423eeb2d326418147fdfbdc89a415e44a557d3)
*	intel: fix urb size for CFL GT1	Lionel Landwerlin	2019-02-25	1	-0/+1
\| \| \| \| \| \| \| \| \|	Same 192Kb amount as SKL/KBL GT1 applies. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> Fixes: de7ed0ba5522 ("i965/CFL: Add PCI Ids for Coffee Lake.") (cherry picked from commit 1d626fc02895daa9e7f7c74a829b9512f08869e8)
*	intel/fs: Implement extended strides greater than 4 for IR source regions.	Francisco Jerez	2019-02-25	1	-3/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Strides up to 32B can be implemented for the source regions of most instructions by leveraging either the vertical or the horizontal stride of the hardware Align1 region. The main motivation for this is that currently the lower_integer_multiplication() pass will happily double the stride of one of the 32-bit sources, which can blow up if the stride of the original source was already the maximum value allowed by the hardware. An alternative would be to use the regioning legalization pass in order to lower such strides into the composition of multiple legal strides, but that would be somewhat less efficient. This showed up as a regression from my commit cbea91eb57a501bebb1ca2 in Vulkan 1.1 CTS tests on CHV/BXT platforms, however it was really a pre-existing problem that had affected conformance on other platforms without native support for integer multiplication. CHV/BXT were getting around it because the code I removed in that commit had the "fortunate" side effect of emitting narrower regions that didn't hit the hardware stride limit after lowering. Beyond fixing the regression this fixes ~90 additional Vulkan 1.1 subgroup CTS tests on ICL (that's why this patch is marked for inclusion in mesa-stable even though the original regressing patch was not). According to Jason, a nearly equivalent change had been committed previously as e8c9e65185de3e821e1 and then (mistakenly?) reverted as a31d0382084c8aa8. Cc: [email protected] Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109328 Reported-by: Mark Janes <[email protected]> Tested-by: Anuj Phogat <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> (cherry picked from commit e03be78252afa8f1033b0824eff8d48df4fd6727)
*	intel/fs: Exclude control sources from execution type and region alignment ↵	Francisco Jerez	2019-02-25	3	-4/+68
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	calculations. Currently the execution type calculation will return a bogus value in cases like: mov_indirect(8) vgrf0:w, vgrf1:w, vgrf2:ud, 32u Which will be considered to have a 32-bit integer execution type even though the actual indirect move operation will be carried out with 16-bit precision. Similarly there's no need to apply the CHV/BXT double-precision region alignment restrictions to such control sources, since they aren't directly involved in the double-precision arithmetic operations emitted by these virtual instructions. Applying the CHV/BXT restrictions to control sources was expected to be harmless if mildly inefficient, but unfortunately it exposed problems at codegen level for virtual instructions (namely the SHUFFLE instruction used for the Vulkan 1.1 subgroup feature) that weren't prepared to accept control sources with an arbitrary strided region. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109328 Reported-by: Mark Janes <[email protected]> Fixes: efa4e4bc5fc "intel/fs: Introduce regioning lowering pass." Tested-by: Anuj Phogat <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> (cherry picked from commit c3c27762f787a93ee3f27189bef8d7cdcb3a6cab)
*	i965: re-emit index buffer state on a reset option change.	Andrii Simiklit	2019-02-25	3	-1/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It seems we forget to update the index buffer (ib) status, so the IndexedDrawCutIndexEnable or CutIndexEnable flag is left unchanged, which leads to glEnable/glDisable of GL_PRIMITIVE_RESTART being ignored in some cases. The index buffer (ib) status should be re-emitted after the reset option changes to avoid this unexpected behavior. Reviewed-by: Lionel Landwerlin <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109451 Cc: <[email protected]> Signed-off-by: Andrii Simiklit <[email protected]> (cherry picked from commit f4f4ec941e1427142656e588244f378e469e996e)
*	wayland/egl: Ensure EGL surface is resized on DRI update_buffers()	Carlos Garnacho	2019-02-20	1	-4/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fullscreening and unfullscreening a totem window while playing a video sometimes results in the video subsurface not changing size along with it. This is also reproducible with epiphany. If a surface gets resized while we have an active back buffer for it, the resized dimensions will be neither applied immediately in the resize callback nor correctly synchronized on update_buffers(), as the (now stale) surface size and the currently attached buffer size still match. There are actually 2 things to synchronize here: first, the surface query size might not be updated yet to the wl_egl_window's (i.e. resize_callback happened while there is a back buffer), and second, the wayland buffers need dropping if the new surface size differs from the currently attached buffer. These are done in separate steps now. https://bugzilla.redhat.com/show_bug.cgi?id=1650929 https://bugs.freedesktop.org/show_bug.cgi?id=109594 Fixes: a9fb331ea7d ("wayland/egl: update surface size on window resize") Signed-off-by: Carlos Garnacho <[email protected]> Reviewed-by: Juan A. Suarez <[email protected]> Reviewed-by: Daniel Stone <[email protected]> Tested-by: Bastien Nocera <[email protected]> Tested-by: Denys Kostin <[email protected]> (cherry picked from commit 30a01cd9232ed83a0259d184b82e050bae219ed3)
*	radv: Sync ETC2 whitelisted devices.	Bas Nieuwenhuizen	2019-02-20	3	-5/+11
\| \| \| \| \| \|	Fixes: 4bb6c49375e "radv: Allow ETC2 on RAVEN and VEGA10 instead of all GFX9." Reviewed-by: Dave Airlie <[email protected]> (cherry picked from commit 7631feaa0040616585cf69b52241d2b06b82b524)
*	drirc: Add sddm-greeter to adaptive_sync blacklist.	Mario Kleiner	2019-02-20	1	-0/+3
\| \| \| \| \| \| \| \| \| \|	This is the sddm login screen. Fixes: a9c36dbf9c56 ("drirc: Initial blacklist for adaptive sync") Signed-off-by: Mario Kleiner <[email protected]> Cc: 19.0 <[email protected]> Signed-off-by: Marek Olšák <[email protected]> (cherry picked from commit afb15d14ca19ea321280bb83215af4c55b9ce881)
*	driconf: add Civ6Sub executable for Civilization 6	Marek Olšák	2019-02-20	1	-0/+6
\| \| \| \| \| \| \| \|	I'm getting Civ6Sub instead of Civ6. Cc: 18.3 19.0 <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> (cherry picked from commit bff8da6c591e55e4b5f04aea1fef29e6230e9222)
*	radeonsi: always enable NIR for Civilization 6 to fix corruption	Marek Olšák	2019-02-20	1	-0/+3
\| \| \| \| \| \| \| \|	Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104602 Cc: 18.3 19.0 <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> (cherry picked from commit ae21bdf47cacafdf69b904cbf3e433cbe0cccb84)
*	radeonsi: add driconf option radeonsi_enable_nir	Marek Olšák	2019-02-20	3	-1/+8
\| \| \| \| \| \|	Cc: 18.3 19.0 <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> (cherry picked from commit ccbfe44e5ff88a19451701561f752c6046677122)
*	mesa: return NULL if we exceed MaxColorAttachments in get_fb_attachment	Tapani Pälli	2019-02-19	1	-2/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This fixes invalid access to Attachment array which would occur if caller would exceed MaxColorAttachments. In practice this should not ever happen because DiscardFramebufferEXT specifies only GL_COLOR_ATTACHMENT0 to be valid and InvalidateFramebuffer will error out before but this should make coverity happy. v2: const, remove _EXT (Ian) CID: 1442559 Fixes: 0c42b5f3cb9 "mesa: wire up InvalidateFramebuffer" Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Ian Romanick <[email protected]> (cherry picked from commit 9762a9f89380a8070654a80e73d927297c29da35)
*	radv: ensure export arguments are always float	Rhys Perry	2019-02-19	1	-5/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	So that the signature is correct and consistent, the inputs to an export intrinsic should always be 32-bit floats. This and the previous commit fix a large number of crashes in the dEQP-VK.spirv_assembly.instruction.graphics.16bit_storage.input_output_int_* tests. Fixes: b722b29f10d ('radv: add support for 16bit input/output') Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> (cherry picked from commit 0ca550e01ac55c67c2deef50f5cb750a0181352b)
*	radv: bitcast 16-bit outputs to integers	Rhys Perry	2019-02-19	1	-2/+2
\| \| \| \| \| \| \| \| \| \|	16-bit outputs are stored as 16-bit floats in the outputs array, so they have to be bitcast. Fixes: b722b29f10d ('radv: add support for 16bit input/output') Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> (cherry picked from commit 64065aa504c4872a15f7b0894b6037a6b2bcae65)
*	v3d: Fix the check for "is the last thrsw inside control flow"	Eric Anholt	2019-02-19	2	-8/+17
\| \| \| \| \| \| \| \| \| \|	The execute.file check used to be good enough, until I stopped setting up the execute mask for uniform ifs. No known tests fixed, noticed while doing a refactor. Fixes: 080506057310 ("v3d: Handle dynamically uniform IF statements with uniform control flow.") (cherry picked from commit 441294962cd65d44febdbe9ef0b0d99b5d27cec8)
*	v3d: Use the early_fragment_tests flag for the shader's disable-EZ field.	Eric Anholt	2019-02-19	4	-17/+23
\| \| \| \| \| \| \| \| \| \| \| \|	Apparently we need disable-EZ flagged, not just "does Z writes". Fixes dEQP-GLES31.functional.image_load_store.early_fragment_tests.no_early_fragment_tests_depth_fbo on 7278, even though it passed in simulation. Signed-off-by: Eric Anholt <[email protected]> Fixes: 051a41d3d56e ("v3d: Add support for the early_fragment_tests flag.") (cherry picked from commit cd5e0b272919a654079620adecd2abe24ff51233)
*	radv: fix writing the alpha channel of MRT0 when alpha coverage is enabled	Samuel Pitoiset	2019-02-19	1	-7/+8
\| \| \| \| \| \| \| \| \|	This version is better and safer. Cc: 18.3 19.0 <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> (cherry picked from commit 47616810ed7cfce21d239391131ad9a5ef558b52)
*	radv: write the alpha channel of MRT0 when alpha coverage is enabled	Samuel Pitoiset	2019-02-19	1	-0/+8
\| \| \| \| \| \| \| \|	Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109597 Cc: 18.3 19.0 <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> (cherry picked from commit 0d8f09629377da9cf48ab4315574d69fdef5369d)
*	nir: Don't reassociate add/mul chains containing only constants	Kenneth Graunke	2019-02-19	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The idea here is to reassociate a * (b * c) into (a * c) * b, when b is a non-constant value, but a and c are constants, allowing them to be combined. But nothing was enforcing that 'b' must be non-constant, which meant that running opt_algebraic in a loop would never terminate if the IR contained non-folded constant expressions like 256 * 0.5 * 2. Normally, we call constant folding in such a loop too, but IMO it's better for nir_opt_algebraic to be robust and not rely on that. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109581 Fixes: 32e266a9a58 i965: Compile fp64 funcs only if we do not have 64-bit hardware support Reviewed-by: Ian Romanick <[email protected]> (cherry picked from commit 535251487ba56c4fd98465c4682881c2b9734242)
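The non-termination described above can be reproduced with a toy model of the rewrite. This is a sketch only: the real rules live in nir_opt_algebraic, and `mul_chain`, `reassociate`, and `run_to_fixed_point` are hypothetical names that model a single expression `a * (b * c)` and ignore constant folding.

```c
#include <stdbool.h>

struct operand { double value; bool is_const; };

/* Models the expression a * (b * c). */
struct mul_chain { struct operand a, b, c; };

/* Rewrites a * (b * c) into b * (a * c) when a and c are constants.
 * With require_nonconst_b set, an all-constant chain is left alone. */
static bool reassociate(struct mul_chain *e, bool require_nonconst_b)
{
    if (!e->a.is_const || !e->c.is_const)
        return false;
    if (require_nonconst_b && e->b.is_const)
        return false;
    struct operand tmp = e->a;   /* swapping a and b keeps the product */
    e->a = e->b;
    e->b = tmp;
    return true;
}

/* Counts rewrites until no progress is reported or max_iters is hit. */
static unsigned run_to_fixed_point(struct mul_chain e, bool require_nonconst_b,
                                   unsigned max_iters)
{
    unsigned n = 0;
    while (n < max_iters && reassociate(&e, require_nonconst_b))
        n++;
    return n;
}
```

For the all-constant chain `256 * (0.5 * 2)` the unguarded rule reports progress on every iteration, while the guarded rule stops immediately. When `b` is not constant, either variant performs exactly one rewrite.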
*	intel/compiler/test: Add unit test for mismatched signedness comparison	Matt Turner	2019-02-15	1	-0/+32
\| \| \| \| \| \| \| \| \| \|	v2 (idr): Move adding the test to after adding the fix. Reordering the two commits prevents possible headaches for git-bisect with scripts that always do 'ninja check'. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109404 Reviewed-by: Ian Romanick <[email protected]> (cherry picked from commit ac21dd4aee450b2a4bc63adb05356b07abba2ff6)
*	intel/compiler: Avoid propagating inequality cmods if types are different	Matt Turner	2019-02-15	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	v2: Fix silly bug in logic. s/\|\|/&&/ All but one of the affected shaders is in an Unreal4 demo. The other is in Tomb Raider. All of the cases that Ian investigated appear to be sequences like the following if (int(uint(some_float)) < 0) /* other relations too */ ... At least in Tomb Raider, it's not obvious that this sequence came from the original shader. In some of the Unreal demos, the shader contains code like if (int(uint(textureLod(...))) > 0) ... which explicitly generates the offending sequence. All Gen6+ platforms had similar results (Skylake shown): total instructions in shared programs: 15437170 -> 15437187 (<.01%) instructions in affected programs: 4492 -> 4509 (0.38%) helped: 0 HURT: 17 HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.05% max: 0.73% x̄: 0.66% x̃: 0.73% 95% mean confidence interval for instructions value: 1.00 1.00 95% mean confidence interval for instructions %-change: 0.57% 0.75% Instructions are HURT. total cycles in shared programs: 383007996 -> 383007992 (<.01%) cycles in affected programs: 20542 -> 20538 (-0.02%) helped: 6 HURT: 7 helped stats (abs) min: 2 max: 6 x̄: 5.33 x̃: 6 helped stats (rel) min: 0.11% max: 0.36% x̄: 0.32% x̃: 0.36% HURT stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4 HURT stats (rel) min: 0.27% max: 0.27% x̄: 0.27% x̃: 0.27% 95% mean confidence interval for cycles value: -3.30 2.69 95% mean confidence interval for cycles %-change: -0.19% 0.19% Inconclusive result (value mean confidence interval includes 0). No changes on Iron Lake or GM45. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109404 Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Tested-by: [email protected] Tested-by: Danylo Piliaiev <[email protected]> (cherry picked from commit 2dff9a66b629834bffad47e7a9025e0f1de5ffc3)
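The `int(uint(some_float)) < 0` sequence shows why an inequality cannot be moved between instructions of different signedness. The sketch below is not the i965 pass; it only shows that the same 32 bits give different answers to a signed and an unsigned "less than", which is the property the fix relies on. The function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* uint(some_float): only defined here for values that fit in 32 bits. */
static uint32_t float_to_uint_bits(float f)
{
    assert(f >= 0.0f && f < 4294967296.0f);
    return (uint32_t)f;
}

/* Compare the raw bits as signed integers, i.e. int(uint(x)) < int(y). */
static bool less_than_signed(uint32_t a, uint32_t b)
{
    int32_t sa, sb;
    memcpy(&sa, &a, sizeof sa);
    memcpy(&sb, &b, sizeof sb);
    return sa < sb;
}

/* Compare the same bits as unsigned integers. */
static bool less_than_unsigned(uint32_t a, uint32_t b)
{
    return a < b;
}
```

For an input of `3e9f` the signed comparison against zero is true and the unsigned one is false, so evaluating the condition on the unsigned conversion would change the result.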
*	intel/fs: Bail in optimize_extract_to_float if we have modifiers	Jason Ekstrand	2019-02-15	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \| \|	This fixes a bug in RuneScape where we were optimizing x >> 16 to an extract and then negating and converting to float. The NIR to fs pass was dropping the negate on the floor, breaking a geometry shader and causing it to render nothing. Fixes: 1f862e923cb "i965/fs: Optimize float conversions of byte/word..." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109601 Tested-by: Lionel Landwerlin <[email protected]> Reviewed-by: Matt Turner <[email protected]> (cherry picked from commit 367b0ede4d9115aba772d6e46ec73642761f7ff6)
*	swr: set PIPE_CAP_MAX_VARYINGS correctly	Ilia Mirkin	2019-02-15	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \|	Unfortunately swr was missed in the original commit. The number of varyings should generally match up to what's reported as the shader caps for fragment inputs. Fixes: 6010d7b8e8be (gallium: add PIPE_CAP_MAX_VARYINGS) Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Alok Hota <[email protected]> Cc: 19.0 <[email protected]> (cherry picked from commit 8c859367df95b74e7596f7fefffbdbf08bb8f8c7)
*	anv: Put MOCS in the correct location	Kenneth Graunke	2019-02-15	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \|	My patch to switch from struct-based MOCS to numeric MOCS accidentally divided all MOCS entries by 2 in the Vulkan driver. MOCS on Gen9+ is just an array index into a table. But in the hardware packets, the index starts at bit 1. So we need to shift it. Fixes: 0b44644ca68 (genxml: Consistently use a numeric "MOCS" field) Reviewed-by: Jason Ekstrand <[email protected]> (cherry picked from commit 39aee57523a02552e7eae7df5da488e535aeb1eb)
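A minimal sketch of the shift described above, assuming a simplified packet field in which bit 0 is not part of the index and the table index starts at bit 1. The helper names are hypothetical and the real field layout in genxml may carry more detail.

```c
#include <stdint.h>

/* Place a MOCS table index into a field whose index starts at bit 1. */
static uint32_t mocs_field_from_index(uint32_t table_index)
{
    return table_index << 1;
}

/* What the hardware would read back as the table index from the field. */
static uint32_t mocs_index_from_field(uint32_t field)
{
    return field >> 1;
}
```

Writing the raw index 2 into the field without the shift reads back as index 1, which matches the "divided by 2" symptom in the commit message.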
*	spirv: Add missing break	Ian Romanick	2019-02-14	1	-0/+1
\| \| \| \| \| \| \| \|	Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Fixes: c6465fec0c5 ("spirv: add SpvCapabilityInt64Atomics") CID: 1442555 (cherry picked from commit 9a918050e0886d8c6d6adc0c687ffd30d8f70b40)
*	meson: Add dependency on genxml to anvil	Dylan Baker	2019-02-14	1	-2/+5
\| \| \| \| \| \| \| \| \| \| \| \| \|	Currently the Intel "anvil" driver races with the generation of genxml files, while i965 has an explicit dependency. This patch adds the same dependency to anvil. Fixes: d1992255bb29054fa51763376d125183a9f602f ("meson: Add build Intel "anv" vulkan driver") Acked-by: Jason Ekstrand <[email protected]> Acked-by: Lionel Landwerlin <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> (cherry picked from commit 279060cd32dd673c6a5bf302ceac852f51a6c17c)
*	radv: always export gl_SampleMask when the fragment shader uses it	Samuel Pitoiset	2019-02-14	1	-4/+4
\| \| \| \| \| \| \| \| \| \|	For some reason, this breaks tree rendering in Project Cars. Fixes: 85010585cde ("radv: only enable gl_SampleMask if MSAA is enabled too") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109401 Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> (cherry picked from commit 334da034d8d91ca5a0a1bff8deaefd8ca762c42e)
*	radv/winsys: fix BO list creation when RADV_DEBUG=allbos is set	Samuel Pitoiset	2019-02-14	1	-0/+1
\| \| \| \| \| \| \|	Fixes: 50fd253bd6e ("radv/winsys: Add priority handling during submit.") Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> (cherry picked from commit 5e18000d1b070ecf627138b7bff47ff8fef81576)
*	nir/opt_if: don't mark progress if nothing changes	Karol Herbst	2019-02-13	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If we have something like this: loop { ... if x { break; } else { continue; } } opt_if_loop_last_continue returns true, marking progress although nothing changes. Fixes: 5921a19d4b0c6 "nir: add if opt opt_if_loop_last_continue()" Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> (cherry picked from commit 7e08f22a72cfc379902feeca3673db6aa344f782)
*	radeonsi: Fix guardband computation for large render targets	Oscar Blumberg	2019-02-13	1	-2/+28
\| \| \| \| \| \| \| \| \| \|	Stop using 12.12 quantization for viewports that are not contained in the lower 4k corner of the render target as the hardware needs to keep both absolute and relative coordinates representable. Signed-off-by: Marek Olšák <[email protected]> Cc: 18.3 19.0 <[email protected]> (cherry picked from commit 3c540e0a748844258e77254fc4f864f3b875fe18)
*	anv/cmd_buffer: check for NULL framebuffer	Juan A. Suarez Romero	2019-02-12	1	-5/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This can happen when we record a VkCmdDraw in a secondary buffer that was created inheriting from the primary buffer, but with the framebuffer set to NULL in the VkCommandBufferInheritanceInfo. Vulkan 1.1.81 spec says that "the application must ensure (using scissor if necessary) that all rendering is contained in the render area [...] [which] must be contained within the framebuffer dimensions". While this should be done by the application, commit 465e5a86 added the clamp to the framebuffer size, in case the application does not do it. But this requires knowing the framebuffer dimensions. If we do not have a framebuffer at that moment, the best compromise we can make is to just apply the scissor as it is, and let the application ensure the rendering is contained in the render area. v2: do not clamp to framebuffer if there isn't a framebuffer v3 (Jason): - clamp earlier in the conditional - clamp to render area if command buffer is primary v4: clamp also x and y to render area (Jason) v5: rename used variables (Jason) Fixes: 465e5a86 ("anv: Clamp scissors to the framebuffer boundary") CC: Jason Ekstrand <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> (cherry picked from commit 1ad26f941792f07f226c054811be78b0c0ac9fce)
*	radv: fix using LOAD_CONTEXT_REG with old GFX ME firmwares on GFX8	Samuel Pitoiset	2019-02-12	3	-3/+10
\| \| \| \| \| \| \| \| \| \|	This fixes a critical issue. Cc: <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109575 Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> (cherry picked from commit 1b8983c25be19073c02fe9630e949be55f8280fa)
*	radv: fix compiler issues with GCC 9	Samuel Pitoiset	2019-02-12	1	-42/+48
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	"The C standard says that compound literals which occur inside of the body of a function have automatic storage duration associated with the enclosing block. Older GCC releases were putting such compound literals into the scope of the whole function, so their lifetime actually ended at the end of containing function. This has been fixed in GCC 9. Code that relied on this extended lifetime needs to be fixed, move the compound literals to whatever scope they need to accessible in." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109543 Cc: <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Gustaw Smolarczyk <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> (cherry picked from commit 129a9f4937b8f2adb4d37999677d748d816d611c)
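The lifetime rule quoted above can be illustrated as follows. This is not the radv code; `struct extent` and `area` are hypothetical, and the broken pattern is shown only in a comment so that the block itself stays well defined.

```c
struct extent { int width, height; };

/* Broken pattern (relied on the pre-GCC 9 behaviour):
 *
 *     const struct extent *e = NULL;
 *     if (use_small) {
 *         e = &(struct extent){ 4, 4 };   // literal dies at the closing brace
 *     }
 *     return e->width * e->height;        // dangling pointer
 *
 * Fixed pattern: create the compound literals in the scope that uses them. */
static int area(int use_small)
{
    /* Both literals belong to the function body block, so they stay
     * alive until the function returns. */
    const struct extent *small = &(struct extent){ 4, 4 };
    const struct extent *large = &(struct extent){ 16, 8 };
    const struct extent *e = use_small ? small : large;
    return e->width * e->height;
}
```

Hoisting the literals costs a few bytes of stack but keeps the pointer valid for every later use in the function.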
*	st/mesa: Limit GL_MAX_[NATIVE_]PROGRAM_PARAMETERS_ARB to 2048	Kenneth Graunke	2019-02-12	1	-1/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Piglit's vp-max-array test creates a vertex program containing a uniform array sized to the value of GL_MAX_NATIVE_PROGRAM_PARAMETERS_ARB. Mesa will then add additional state-var parameters for things like the MVP matrix. radeonsi currently exposes a value of 4096, derived from constant buffer upload size. This means the array will have 4096 elements, and the extra MVP state-vars would get a prog_src_register::Index of over 4096. Unfortunately, prog_src_register::Index is a signed 13-bit integer, so values beyond 4096 end up turning into negative numbers. Negative source indexes are only valid for relative addressing, so this ends up generating illegal IR. In prog_to_nir, this would cause an out of bounds array access. st_mesa_to_tgsi checks for a negative value, assumes it's bogus, and remaps it to parameter 0 in order to get something in-range. This isn't right - instead of reading the MVP matrix, it would read the first element of the vertex program's large array. But the test only checks that the program compiles, so we never noticed that it was broken. This patch limits the size of the program limits, with the understanding that we may need to generate additional state-vars internally. i965 has exposed 1024 for this limit for years, so I don't expect lowering it to 2048 will cause any practical problems for radeonsi or other drivers. Fixes vp-max-array with prog_to_nir.c. Cc: "19.0" <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Eric Anholt <[email protected]> (cherry picked from commit f45dd6d31b2ff46a082931386ccd0bf043cfad59)
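The wrap-around of a signed 13-bit index can be checked with a small sketch. `wrap_s13` is a hypothetical helper that emulates two's-complement truncation to 13 bits; it does not use the real prog_src_register bitfield, whose out-of-range conversion is implementation-defined in C.

```c
#include <stdint.h>

/* Emulate storing an index into a signed 13-bit field:
 * keep the low 13 bits and sign-extend from bit 12. */
static int wrap_s13(uint32_t index)
{
    int v = (int)(index & 0x1fffu);
    return (v & 0x1000) ? v - 0x2000 : v;
}
```

Index 4095 survives, but 4096 becomes -4096, so an array of 4096 elements followed by extra state-vars produces negative indexes. With the limit lowered to 2048, an index such as 2048 + 16 still fits.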
*	st/va/vp9: set max reference as default of VP9 reference number	Leo Liu	2019-02-12	1	-1/+6
\| \| \| \| \| \| \| \| \|	If there is no information about number of render targets Signed-off-by: Leo Liu <[email protected]> Reviewed-by: Boyuan Zhang <[email protected]> Cc: 19.0 <[email protected]> (cherry picked from commit a0a52a036708dbf5989778795fd67a79e3226289)
*	st/va: fix the incorrect max profiles report	Leo Liu	2019-02-12	2	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \|	Add "PIPE_VIDEO_PROFILE_MAX" to the enum, so that the reported number stays correct when more profiles are added in the future. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109107 Signed-off-by: Leo Liu <[email protected]> Reviewed-by: Boyuan Zhang <[email protected]> Cc: 19.0 <[email protected]> (cherry picked from commit 21cdb828a3f4d1e2f140fc7c81a4bc305b2f6b04)
*	winsys/amdgpu: don't drop manually added fence dependencies	Marek Olšák	2019-02-12	1	-2/+0
\| \| \| \| \| \| \| \|	wow, it's hard to believe that fence and syncobjs dependencies were ignored. Cc: 18.3 19.0 <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]> (cherry picked from commit ddfe209a0d61917e7b08100eeac82f4c20ca59e8)