mesa.git

	Commit message	Author	Age	Files	Lines
*	intel/compiler: Properly consider UBO loads that cross 32B boundaries.	Kenneth Graunke	2018-06-14	1	-2/+14
	The UBO push analysis pass incorrectly assumed that all values would fit within a 32B chunk, and only recorded a bit for the 32B chunk containing the starting offset.
	For example, if a UBO contained the following, tightly packed:
		vec4 a;  // [0, 16)
		float b; // [16, 20)
		vec4 c;  // [20, 36)
	then, c would start at offset 20 / 32 = 0 and end at 36 / 32 = 1, which means that we ought to record two 32B chunks in the bitfield. Similarly, dvec4s would suffer from the same problem.
	v2: Rewrite the accounting, my calculations were wrong.
	v3: Write a comment about partial values (requested by Jason).
	Reviewed-by: Rafael Antognolli <[email protected]> [v1]
	Reviewed-by: Jason Ekstrand <[email protected]> [v3]
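The chunk accounting described in this message can be sketched as follows. This is an illustration only, not the code from the patch: `ubo_chunk_mask`, `CHUNK_BYTES`, and the 64-bit width of the bitfield are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

#define CHUNK_BYTES 32u  /* chunk size named in the commit message */

/* Hypothetical helper (not Mesa's code): returns a bitfield with one bit set
 * for every 32B chunk touched by the byte range [offset, offset + bytes). */
uint64_t ubo_chunk_mask(uint32_t offset, uint32_t bytes)
{
    assert(bytes > 0);

    uint32_t first = offset / CHUNK_BYTES;               /* chunk holding the first byte */
    uint32_t last = (offset + bytes - 1) / CHUNK_BYTES;  /* chunk holding the last byte */
    assert(last < 64);  /* assumption: the bitfield is 64 bits wide */

    uint32_t count = last - first + 1;
    /* Avoid shifting a 64-bit value by 64, which is undefined in C. */
    uint64_t run = (count == 64) ? UINT64_MAX : (UINT64_C(1) << count) - 1;
    return run << first;
}
```

With the layout from the message, `ubo_chunk_mask(20, 16)` for `vec4 c` sets bits 0 and 1, while `a` and `b` each set only bit 0. Recording only the chunk of the starting offset would have missed bit 1.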
*	glsl: Don't copy propagate elements from SSBO or shared variables either	Ian Romanick	2018-06-14	1	-0/+8
	Since SSBOs can be written by a different GPU thread, copy propagating a read can cause the value to magically change. SSBO reads are also very expensive, so doing it twice will be slower.
	The same shader was helped by this patch and the previous. Haswell, Broadwell, and Skylake had similar results. (Skylake shown)
	total instructions in shared programs: 14399119 -> 14399113 (<.01%)
	instructions in affected programs: 683 -> 677 (-0.88%)
	helped: 1
	HURT: 0
	total cycles in shared programs: 532973113 -> 532971865 (<.01%)
	cycles in affected programs: 524666 -> 523418 (-0.24%)
	helped: 1
	HURT: 0
	Signed-off-by: Ian Romanick <[email protected]>
	Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
	Cc: [email protected]
	Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106774
*	glsl: Don't copy propagate from SSBO or shared variables either	Ian Romanick	2018-06-14	1	-0/+2
	Since SSBOs can be written by other GPU threads, copy propagating a read can cause the value to magically change. SSBO reads are also very expensive, so doing it twice will be slower.
	Haswell, Broadwell, and Skylake had similar results. (Skylake shown)
	total instructions in shared programs: 14399120 -> 14399119 (<.01%)
	instructions in affected programs: 684 -> 683 (-0.15%)
	helped: 1
	HURT: 0
	total cycles in shared programs: 532978931 -> 532973113 (<.01%)
	cycles in affected programs: 530484 -> 524666 (-1.10%)
	helped: 1
	HURT: 0
	Signed-off-by: Ian Romanick <[email protected]>
	Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
	Cc: [email protected]
	Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106774
*	meson: only build vl_winsys_dri.c when x11 platform is used	Lukas Rusak	2018-06-14	1	-1/+1
	This seems to have been missed in the move from autotools. This fixes the following build issue:
	../src/gallium/auxiliary/vl/vl_winsys_dri.c:34:10: fatal error: X11/Xlib-xcb.h: No such file or directory
	 #include <X11/Xlib-xcb.h>
	          ^~~~~~~~~~~~~~~~
	Fixes: b1b65397d0c4978e36a84c0a1c98a4bd6cb9588e ("meson: Build gallium auxiliary")
	Reviewed-by: Dylan Baker <[email protected]>
*	st/mesa: add missing switch cases in glsl_to_tgsi_visitor::visit()	Brian Paul	2018-06-14	1	-0/+2
	To silence a compiler warning about unhandled switch cases. Reviewed-by: Charmaine Lee <[email protected]>
*	radv: Fix output for sparse MRTs.	Bas Nieuwenhuizen	2018-06-14	1	-9/+10
	We need to init the cb_shader_format correctly with the changed col_format, so this moves the col_format adjustment to before the cb_shader_mask gets generated.
	Fixes: 06d3c650980 "radv: fix a GPU hang when MRTs are sparse"
	Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106903
	CC: 18.1 <[email protected]>
	Reviewed-by: Dave Airlie <[email protected]>
	Reviewed-by: Samuel Pitoiset <[email protected]>
*	radv: update the ZRANGE_PRECISION value for the TC-compat bug	Samuel Pitoiset	2018-06-14	1	-0/+108
	On GFX8+, there is a bug that affects TC-compatible depth surfaces when the ZRange is not reset after LateZ kills pixels.
	The workaround is to always set DB_Z_INFO.ZRANGE_PRECISION to match the last fast clear value. Because the value is set to 1 by default, we only need to update it when clearing Z to 0.0.
	We also need to set the depth clear regs and to update ZRANGE_PRECISION when initializing a TC-compat depth image to 0.
	Original patch from James Legg.
	This fixes random CTS fails with dEQP-VK.renderpass.suballocation.formats.d32_sfloat_s8_uint.input.*
	Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105396
	CC: <[email protected]>
	Signed-off-by: Samuel Pitoiset <[email protected]>
	Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	anv: reduce maxFragmentInputComponents	Samuel Iglesias Gonsálvez	2018-06-14	1	-1/+1
	If the application asks for the maximum number of fragment input components (128) and uses all of them plus some builtins that are passed in the VUE, then we exceed the maximum number of used VUE slots (32) and break an assert that checks this limit.
	Also, with separate shader objects, we add the CLIP_DIST0 and CLIP_DIST1 builtins in brw_compute_vue_map() because we don't know if gl_ClipDistance is going to be read or written by an adjacent stage.
	Fixes VK-GL-CTS CL#2569.
	Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
	Reviewed-by: Jason Ekstrand <[email protected]>
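The slot arithmetic behind this limit can be sketched as below. It assumes that one VUE slot holds four components (a vec4), which matches the 128-component and 32-slot figures quoted in the message; `vue_slots_needed` and both macros are hypothetical names for illustration, not driver code.

```c
#include <assert.h>

#define MAX_VUE_SLOTS 32       /* limit quoted in the commit message */
#define COMPONENTS_PER_SLOT 4  /* assumption: one slot holds a vec4 */

/* Hypothetical helper: slots used by the fragment inputs plus any builtins
 * that also travel in the VUE. */
int vue_slots_needed(int input_components, int builtin_slots)
{
    assert(input_components >= 0 && builtin_slots >= 0);

    /* Round up so a partially used vec4 still costs a whole slot. */
    int varying_slots =
        (input_components + COMPONENTS_PER_SLOT - 1) / COMPONENTS_PER_SLOT;
    return varying_slots + builtin_slots;
}
```

Under that assumption, 128 components alone already fill all 32 slots, so adding even the two clip-distance builtins, as in `vue_slots_needed(128, 2)`, goes past `MAX_VUE_SLOTS`. That is why the advertised maximum has to be lowered.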
*	radeonsi/gfx9: fix si_get_buffer_from_descriptors for 48-bit pointers	Marek Olšák	2018-06-13	1	-2/+2
	This fixes: GL45-CTS.pipeline_statistics_query_tests_ARB.functional_compute_shader_invocations Cc: 18.0 18.1 <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
*	radeonsi/gfx9: update & clean up a DPBB heuristic	Marek Olšák	2018-06-13	1	-9/+5
	Tested-by: Dieter Nützel <[email protected]>
*	radeonsi/gfx9: set POPS_DRAIN_PS_ON_OVERLAP due to a hw bug	Marek Olšák	2018-06-13	1	-2/+4
	This may not be needed yet, but let's set it now. Tested-by: Dieter Nützel <[email protected]>
*	radeonsi/gfx9: remove UINT_MAX array terminators in bin size tables	Marek Olšák	2018-06-13	1	-19/+1
	Tested-by: Dieter Nützel <[email protected]>
*	radeonsi/gfx9: update bin sizes	Marek Olšák	2018-06-13	1	-35/+38
	This is based on our docs (recently updated), not amdvlk. Tested-by: Dieter Nützel <[email protected]>
*	radeonsi/gfx9: update primitive binning code for EQAA	Marek Olšák	2018-06-13	1	-4/+9
	Tested-by: Dieter Nützel <[email protected]>
*	radeonsi: assume that rasterizer state is non-NULL in draw_vbo	Marek Olšák	2018-06-13	4	-75/+61
	Tested-by: Dieter Nützel <[email protected]>
*	radeonsi: micro-optimize prim checking and fix guardband with lines+adjacency	Marek Olšák	2018-06-13	4	-13/+23
	Tested-by: Dieter Nützel <[email protected]>
*	radeonsi: move the guardband registers into a separate state atom	Marek Olšák	2018-06-13	5	-19/+35
	They have a different frequency of updates and don't change when scissors change. I think this even fixes something in si_update_vs_viewport_state. Tested-by: Dieter Nützel <[email protected]>
*	radeonsi/gfx9: implement the scissor bug workaround without performance drop	Marek Olšák	2018-06-13	2	-29/+81
	This might improve performance on Vega10 and Raven. Tested-by: Dieter Nützel <[email protected]>
*	radeonsi: don't set VGT_LS_HS_CONFIG if it doesn't change	Marek Olšák	2018-06-13	3	-6/+12
	Tested-by: Dieter Nützel <[email protected]>
*	radeonsi: move VGT_GS_OUT_PRIM_TYPE into si_shader_gs	Marek Olšák	2018-06-13	4	-33/+26
	Same as amdvlk. Tested-by: Dieter Nützel <[email protected]>
*	radeonsi: record CLIPVERTEX output usage properly for compatibility profiles	Marek Olšák	2018-06-13	1	-1/+0
	This was missed when adding CLIPVERTEX support into GS & tess. Tested-by: Dieter Nützel <[email protected]>
*	radeonsi: fix FBFETCH with 2D MSAA arrays	Marek Olšák	2018-06-13	1	-1/+2
	Tested-by: Dieter Nützel <[email protected]>
*	ac: handle undefined EQAA samples in ac_apply_fmask_to_sample	Marek Olšák	2018-06-13	1	-2/+4
	RADV might wanna use this helper too. Tested-by: Dieter Nützel <[email protected]>
*	radeonsi: return real memory usage instead of per-process usage	Marek Olšák	2018-06-13	1	-2/+2
	Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	ac/gpu_info: report real total memory sizes	Marek Olšák	2018-06-13	1	-28/+54
	The change from MIN2 to MAX2 is intentional. Cc: 18.1 <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	docs: mark virgl GL 4.0 features as complete.	Dave Airlie	2018-06-14	1	-13/+13
	virgl should now expose GL4.1 where it can.
*	virgl: add ARB_tessellation_shader support. (v2)	Dave Airlie	2018-06-14	7	-8/+107
	This should add all the pieces to enable tess shaders on virgl.
	v2: fix up the transform to handle tess and strip out precise. Set a default for max patch varyings to work around an issue when tess gets enabled from v1 caps but v2 caps aren't in place. (Elie)
	Reviewed-by: Elie Tournier <[email protected]>
*	glsl: allow standalone semicolons outside main()	Dave Airlie	2018-06-14	1	-0/+1
	GLSL 4.60 officially added this, but games and older CTS suites already had shaders that did this, so we may as well enable it everywhere.
	Adding stable because it appears apps in the wild do this.
	Acked-by: Timothy Arceri <[email protected]>
	Reviewed-by: Matt Turner <[email protected]>
	Cc: <[email protected]>
*	radv: don't fast clear HTILE for 16-bit depth surfaces on GFX8	Samuel Pitoiset	2018-06-13	1	-0/+8
	This causes rendering issues in Shadow Warrior 2 with DXVK.
	Cc: [email protected]
	Fixes: ccc64f3133 ("radv: enable TC-compat HTILE for 16-bit depth surfaces on GFX8")
	Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106912
	Signed-off-by: Samuel Pitoiset <[email protected]>
	Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	configure.ac: Test for __atomic_add_fetch in atomic checks	Andrew Galante	2018-06-13	1	-2/+4
	Some platforms have 64-bit __atomic_load_n but not 64-bit __atomic_add_fetch, so test for both of them.
	Bug: https://bugs.gentoo.org/655616
	Reviewed-by: Matt Turner <[email protected]>
	Reviewed-by: Eric Engestrom <[email protected]>
	Reviewed-by: Dylan Baker <[email protected]>
*	meson: Test for __atomic_add_fetch in atomic checks	Andrew Galante	2018-06-13	1	-2/+5
	Some platforms have 64-bit __atomic_load_n but not 64-bit __atomic_add_fetch, so test for both of them.
	Bug: https://bugs.gentoo.org/655616
	Reviewed-by: Matt Turner <[email protected]>
	Reviewed-by: Eric Engestrom <[email protected]>
	Reviewed-by: Dylan Baker <[email protected]>
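A build-system probe for this typically compiles and links a tiny program that uses both builtins on a 64-bit value, so that a platform lacking either one fails the check (or needs -latomic). The sketch below is an assumption about what such a probe could look like, not the snippet from configure.ac or meson.build; `probe_atomic64` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical probe: references both GCC/Clang builtins on a uint64_t so the
 * linker must resolve 64-bit versions of each. Returns how much the value
 * grew, which is 1 when no other thread touches it. */
uint64_t probe_atomic64(uint64_t *v)
{
    assert(v);

    uint64_t before = __atomic_load_n(v, __ATOMIC_ACQUIRE);
    uint64_t after = __atomic_add_fetch(v, (uint64_t)1, __ATOMIC_ACQ_REL);
    return after - before;
}
```

Testing only `__atomic_load_n` would pass on a platform that implements the load inline but needs a library call for the read-modify-write, which is the gap these two commits close.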
*	meson: Fix -latomic check	Matt Turner	2018-06-13	1	-1/+7
	Commit 54ba73ef102f (configure.ac/meson.build: Fix -latomic test) fixed some checks for -latomic, and then commit 54bbe600ec26 (configure.ac: rework -latomic check) further extended the fixes in configure.ac but not in Meson. This commit extends those fixes to the Meson tests.
	Fixes: 54bbe600ec26 (configure.ac: rework -latomic check)
	Reviewed-by: Eric Engestrom <[email protected]>
	Reviewed-by: Dylan Baker <[email protected]>
*	meson: Remove various completed todos	Dylan Baker	2018-06-13	3	-12/+0
	v3: - Remove "won't do" todos, so only completed todos are now removed.
	Signed-off-by: Dylan Baker <[email protected]>
	Reviewed-by: Eric Engestrom <[email protected]>
	Reviewed-by: Matt Turner <[email protected]> (v2)
*	meson: Make use of optional modules	Dylan Baker	2018-06-13	1	-3/+12
	Meson 0.43 gained support for optional modules, which clover would like to use. Since we require 0.44.1 now, we can rely on them being available for clover.
	Compile tested only.
	Signed-off-by: Dylan Baker <[email protected]>
	Reviewed-by: Eric Engestrom <[email protected]>
	Reviewed-by: Matt Turner <[email protected]>
*	meson: Add support for ppc assembly/optimizations	Dylan Baker	2018-06-13	2	-4/+34
	v2:
	- Use -mpower8-vector in compiler test for altivec
	- rename altivec option to power8
	- reword power8 option description to be more clear, originally I had made it a boolean, but replaced it with an auto option.
	Signed-off-by: Dylan Baker <[email protected]>
	Reviewed-by: Eric Engestrom <[email protected]>
	Reviewed-by: Matt Turner <[email protected]>
*	meson: Add support for SPARC assembly	Dylan Baker	2018-06-13	3	-2/+14
	This was blindly copied from autotools and tested by a helpful gentoo user.
	Signed-off-by: Dylan Baker <[email protected]>
	Reviewed-by: Eric Engestrom <[email protected]>
	Reviewed-by: Matt Turner <[email protected]>
*	meson: Set include dirs for asm	Dylan Baker	2018-06-13	1	-2/+6
	v2:
	- split this from the next patch
	- Only include x86-64 and not x86 when building x86_64
	Signed-off-by: Dylan Baker <[email protected]>
	Reviewed-by: Eric Engestrom <[email protected]>
	Reviewed-by: Matt Turner <[email protected]>
*	meson: move cc and cpp definitions to top of main meson.build	Dylan Baker	2018-06-13	1	-2/+3
	This just makes using cc and cpp easier.
	v2: - Add this patch to fix altivec
	Signed-off-by: Dylan Baker <[email protected]>
	Reviewed-by: Eric Engestrom <[email protected]>
	Reviewed-by: Matt Turner <[email protected]>
*	Revert "intel/compiler: Properly consider UBO loads that cross 32B boundaries."	Jason Ekstrand	2018-06-13	1	-7/+1
	This reverts commit b8fa847c2ed9c7c743f31e57560a09fae3992f46. This broke about 30k Vulkan CTS tests.
*	intel/compiler: Properly consider UBO loads that cross 32B boundaries.	Kenneth Graunke	2018-06-13	1	-1/+7
	The UBO push analysis pass incorrectly assumed that all values would fit within a 32B chunk, and only recorded a bit for the 32B chunk containing the starting offset.
	For example, if a UBO contained the following, tightly packed:
		vec4 a;  // [0, 16)
		float b; // [16, 20)
		vec4 c;  // [20, 36)
	then, c would start at offset 20 / 32 = 0 and end at 36 / 32 = 1, which means that we ought to record two 32B chunks in the bitfield. Similarly, dvec4s would suffer from the same problem.
	Reviewed-by: Rafael Antognolli <[email protected]>
*	drivers/dri/i965: add missing #include	Ross Burton	2018-06-12	1	-0/+2
	brw_bufmgr.h uses time_t without including time.h, so the build fails under musl. Reviewed-by: Eric Engestrom <[email protected]>
*	anv/android: Use an address for each anv_image plane	Mauro Rossi	2018-06-12	1	-2/+2
	Fixes a build error after the change in the image->planes[] structure: {bo,bo_offset} has to be replaced by address.{bo,offset}, and the assert() for debug builds needs the same update.
	external/mesa/src/intel/vulkan/anv_android.c:188:21: error: no member named 'bo' in 'struct anv_image::(anonymous at external/mesa/src/intel/vulkan/anv_private.h:2647:4)'
	   image->planes[0].bo = bo;
	   ~~~~~~~~~~~~~~~~ ^
	1 error generated.
	Fixes: bf34ef16ac ("anv: Use an address for each anv_image plane")
	Signed-off-by: Mauro Rossi <[email protected]>
	Reviewed-by: Lionel Landwerlin <[email protected]>
	Reviewed-by: Jason Ekstrand <[email protected]>
*	anv/android: Set the BO flags in bo_cache_import (v2)	Mauro Rossi	2018-06-12	1	-1/+7
	Changes to avoid a build error:
	external/mesa/src/intel/vulkan/anv_android.c:131:72: error: too few arguments to function call, expected 5, have 4
	   result = anv_bo_cache_import(device, &device->bo_cache, dma_buf, &bo);
	            ~~~~~~~~~~~~~~~~~~~                                        ^
	1 error generated.
	(v2) Set the correct bo_flags based on support of 48-bit addresses and soft-pin
	Fixes: b0d50247a7 ("anv/allocator: Set the BO flags in bo_cache_alloc/import")
	Fixes: e7d0378bd9 ("anv: Soft-pin client-allocated memory")
	Signed-off-by: Mauro Rossi <[email protected]>
	Reviewed-by: Lionel Landwerlin <[email protected]>
	Reviewed-by: Jason Ekstrand <[email protected]>
*	anv: Disable __gen_validate_value if NDEBUG is set.	Kenneth Graunke	2018-06-11	1	-0/+2
	We were enabling undefined memory checking for genxml values based on Valgrind being installed at build time, even for release builds. This generates piles and piles of assembly whenever you touch genxml.
	With gcc 7.3.1 and -O3 and -march=native on a Kabylake with Valgrind installed at build time:
	   text    data    bss     dec     hex filename
	5978385  262884  13488 6254757  5f70a5 libvulkan_intel.so
	3799377  262884  13488 4075749  3e30e5 libvulkan_intel.so
	That's a 36% reduction in text size.
	Fixes: 047ed02723071d7eccbed3210b5be6ae73603a53 (vk/emit: Use valgrind to validate every packed field)
	Reviewed-by: Jason Ekstrand <[email protected]>
*	README: wording fix for previous commit	Eric Engestrom	2018-06-11	1	-2/+3
	Signed-off-by: Eric Engestrom <[email protected]>
*	README: add link to WhosWho for IRC nicks	Eric Engestrom	2018-06-11	1	-0/+2
	Signed-off-by: Eric Engestrom <[email protected]>
*	add project README	Eric Engestrom	2018-06-11	1	-0/+76
	Now that we're using GitLab, let's take advantage of the "landing page" README feature with some minimal information, mostly to point people to the right resources.
	Acked-by: Dylan Baker <[email protected]>
	Acked-by: Jason Ekstrand <[email protected]>
	Signed-off-by: Eric Engestrom <[email protected]>
*	i965: fix resource leak	Eric Engestrom	2018-06-11	1	-1/+3
	v2: intel_miptree_release() already takes care of the planes, no need to hand-code the loop (Lionel)
	Coverity ID: 1436909
	Fixes: 3352f2d746d3959b22ca4 "i965: Create multiple miptrees for planar YUV images"
	Reviewed-by: Lionel Landwerlin <[email protected]>
	Signed-off-by: Eric Engestrom <[email protected]>
*	freedreno/ir3: use pipe_image_view's cpp	Rob Clark	2018-06-11	1	-1/+6
	At least for PIPE_BUFFER, we could get the resource used as (for example) R32F imageBuffer. So using cpp=1 from the rsc is wrong. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: fix image dimensions offset	Rob Clark	2018-06-11	1	-1/+1
	Copy-pasta fail from how SSBO sizes are handled. Signed-off-by: Rob Clark <[email protected]>