mesa.git

	Commit message	Author	Age	Files	Lines
...
*	i965: add a debug option to disable oa config loading	Lionel Landwerlin	2017-11-28	3	-2/+4
	This provides a good way to verify we haven't broken using the perf driver on older kernels (which don't have the oa config loading mechanism). Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: perf: add support for userspace configurations	Lionel Landwerlin	2017-11-28	1	-8/+101
	This allows us to deploy new configurations without touching the kernel. v2: Detect loadable configs without creating one (Chris) Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: perf: update configs for loading from userspace	Lionel Landwerlin	2017-11-28	10	-0/+243
	When making configs loadable from userspace in the kernel, we left userspace more responsibility for programming some registers. In particular, one register we used to set directly in the driver has now been moved into the configs. Signed-off-by: Lionel Landwerlin <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
*	util: add mesa-sha1 test to meson	Eric Engestrom	2017-11-28	1	-0/+9
	Fixes: 513d7ffa23d42e96f831 "util: Add a SHA1 unit test program" Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	compiler: fix typo	Eric Engestrom	2017-11-28	1	-1/+1
	Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	compiler: use NDEBUG to guard asserts	Eric Engestrom	2017-11-28	3	-6/+6
	nir_validate.c's #endif already had the correct NDEBUG comment. Fixes: dcb1acdea00a8f2c29777 "nir/validate: Only build in debug mode" Fixes: 9ff71b649b4b3808a9e17 "i965/nir: Validate that NIR passes call nir_metadata_preserve()" Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Matt Turner <[email protected]>
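A minimal sketch of the pattern this commit refers to, assuming the usual reason for it: code that exists only to feed assert() should be guarded by the same macro that disables assert(), namely NDEBUG. The names count_set and first_set are hypothetical and are not taken from Mesa.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not taken from Mesa: counts non-NULL entries.
 * It exists only to feed an assert(), so it is compiled only when
 * asserts are enabled (NDEBUG undefined). */
#ifndef NDEBUG
static size_t count_set(const void *const *slots, size_t n)
{
   size_t set = 0;
   for (size_t i = 0; i < n; i++)
      set += slots[i] != NULL;
   return set;
}
#endif /* NDEBUG */

/* Returns the first non-NULL slot, or NULL if every slot is empty. */
static const void *first_set(const void *const *slots, size_t n)
{
   /* assert() expands to nothing under NDEBUG, so count_set() is never
    * referenced in release builds. */
   assert(count_set(slots, n) <= n);
   for (size_t i = 0; i < n; i++)
      if (slots[i])
         return slots[i];
   return NULL;
}
```

Guarding with a project-specific DEBUG macro instead could leave the helper compiled but unused, or used but missing, whenever the two macros disagree.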
*	broadcom: use NDEBUG to guard asserts	Eric Engestrom	2017-11-28	1	-5/+5
	Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
*	vc4: check preprocessor token existence using #ifdef instead of #if	Eric Engestrom	2017-11-28	1	-3/+3
	(other uses of USE_VC4_SIMULATOR are already correct) Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
*	docs/llvmpipe.html: Minor edits	Ben Crocker	2017-11-28	1	-7/+7
	Language and spelling fixups in three places. Cc: "17.2" "17.3" <[email protected]> Signed-off-by: Ben Crocker <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> [Eric: move two fixes from the other patch to this one.] Signed-off-by: Eric Engestrom <[email protected]>
*	st/dri: replace hard-coded array size with ARRAY_SIZE()	Eric Engestrom	2017-11-28	1	-1/+1
	Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
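A short sketch of why an ARRAY_SIZE()-style macro is preferable to a hard-coded length: the loop bound follows the array when entries are added or removed. The macro is defined locally here for illustration (Mesa provides its own), and the formats table and find_format are hypothetical, not the st/dri code.

```c
#include <assert.h>
#include <stddef.h>

/* Element count of a true array (not a pointer). Defined locally for
 * this sketch only. */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical table, not taken from st/dri. */
static const int formats[] = { 10, 20, 30, 40 };

/* C11 static_assert (via <assert.h>) pins the expected length. */
static_assert(ARRAY_SIZE(formats) == 4, "formats has four entries");

/* Returns the index of `wanted` in formats, or -1 if it is absent. */
static int find_format(int wanted)
{
   for (size_t i = 0; i < ARRAY_SIZE(formats); i++)
      if (formats[i] == wanted)
         return (int)i;
   return -1;
}
```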
*	radeonsi/gfx9: simplify condition for on-chip ESGS	Nicolai Hähnle	2017-11-28	1	-3/+1
	Reviewed-by: Marek Olšák <[email protected]>
*	radeonsi: clarify that si_shader_selector::esgs_itemsize is set for the ES part	Nicolai Hähnle	2017-11-28	1	-1/+3
	Reviewed-by: Marek Olšák <[email protected]>
*	radeonsi: use si_shader_context instead of lp_build_context in more places	Nicolai Hähnle	2017-11-28	1	-27/+23
	Reviewed-by: Marek Olšák <[email protected]>
*	radeonsi: cleanup si_initialize_color_surface	Nicolai Hähnle	2017-11-28	1	-12/+12
	Reviewed-by: Marek Olšák <[email protected]>
*	radeonsi: avoid attempting to create CMASK if the tiling mode doesn't have it	Nicolai Hähnle	2017-11-28	1	-0/+2
	Reviewed-by: Marek Olšák <[email protected]>
*	radeonsi: check that we don't leak fine.buf references	Nicolai Hähnle	2017-11-28	1	-0/+2
	Just as an added precaution. Reviewed-by: Marek Olšák <[email protected]>
*	ac/surface: fix indentation	Nicolai Hähnle	2017-11-28	1	-1/+1
	Reviewed-by: Marek Olšák <[email protected]>
*	amd/common: sid.h cleanups	Nicolai Hähnle	2017-11-28	3	-20/+38
	Fix a bunch of labels indicating when registers were added/removed and normalize the SI-class GRBM_GFX_INDEX. Reviewed-by: Marek Olšák <[email protected]>
*	st_glsl_to_tgsi: check for the tail sentinel in merge_two_dsts	Nicolai Hähnle	2017-11-28	1	-3/+3
	This fixes yet another case where DFRACEXP has only one destination. Found by address sanitizer. Fixes tests/spec/arb_gpu_shader_fp64/execution/built-in-functions/fs-frexp-dvec4-only-mantissa.shader_test Fixes: 3b666aa74795 ("st/glsl_to_tgsi: fix DFRACEXP with only one destination") Acked-by: Marek Olšák <[email protected]>
*	mesa/gles: adjust internal format in glTexSubImage2D error checks	Tapani Pälli	2017-11-28	1	-1/+55
	When floating point textures are created on OpenGL ES 2.0, the driver is free to choose the internal format used. Mesa makes this decision in adjust_for_oes_float_texture. Error checking for glTexImage2D properly checks that sized formats are not used. We use the same error checking path for glTexSubImage2D (since there is a lot of overlap); however, since those checks include internalFormat checks, we need to pass the original internalFormat passed by the client. The patch adds oes_float_internal_format, which reverses adjust_for_oes_float_texture to get that format. Fixes the following test failure: ES2-CTS.gtf.GL2ExtensionTests.texture_float.texture_float (when running the test with MESA_GLES_VERSION_OVERRIDE=2.0) Signed-off-by: Tapani Pälli <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103227 Cc: "17.3" <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	radv: Use the suffixed versions of VK_QUEUE_GLOBAL_PRIORITY_*	Jason Ekstrand	2017-11-27	1	-4/+4
	Acked-by: Dave Airlie <[email protected]>
*	vulkan: Update the XML and headers to 1.0.66	Jason Ekstrand	2017-11-27	2	-24/+116
	Acked-by: Dave Airlie <[email protected]>
*	intel/blorp: Drop blorp_resolve_ccs_attachment	Jason Ekstrand	2017-11-27	2	-61/+20
	The only reason why we needed that version was because the Vulkan driver needed to be able to create the surface states so it could handle indirect clear colors. Now that blorp handles them natively, there's no need for the extra entrypoint. Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
*	anv: Let blorp handle indirect clear colors for CCS resolves	Jason Ekstrand	2017-11-27	3	-67/+20
	Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
*	anv: Move get_fast_clear_state_address into anv_private.h	Jason Ekstrand	2017-11-27	2	-50/+33
	While we're at it, we break it into two nicely named functions. Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
*	intel/blorp: Take a range of layers in blorp_ccs_resolve	Jason Ekstrand	2017-11-27	3	-4/+8
	Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
*	intel/blorp: Add initial support for indirect clear colors	Jason Ekstrand	2017-11-27	6	-0/+109
	Reviewed-by: Lionel Landwerlin <[email protected]>
*	i965/blorp: Use a designated initializer for blorp_surf	Jason Ekstrand	2017-11-27	1	-8/+9
	This way uninitialized fields get automatically zeroed and it's safe to add more fields to blorp_surf. Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
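The zeroing behaviour mentioned here is a property of C designated initializers: any member not named in the initializer is zero-initialized. A minimal sketch follows; struct fake_surf and make_surf are hypothetical stand-ins, not the real blorp_surf or the i965 code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a surface description; not blorp_surf. */
struct fake_surf {
   const void *addr;
   uint32_t aux_usage;
   uint32_t clear_color[4];
};

static struct fake_surf make_surf(const void *addr)
{
   assert(addr != NULL);

   /* Members not named here (aux_usage, clear_color) are
    * zero-initialized, including any member added to the struct later. */
   struct fake_surf surf = { .addr = addr };
   return surf;
}
```

Assigning members one by one to an uninitialized local would instead leave any newly added member with an indeterminate value until every call site is updated.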
*	intel/blorp: Add fast-clear to the special case in MSAA resolves	Jason Ekstrand	2017-11-27	1	-2/+9
	This doesn't go as far as avoiding the txf_ms entirely when the surface is fast-cleared, but it does at least make us do it only once. This should improve performance of MSAA resolves in the presence of lots of clear color. Without the patch, enabling fast-clears in the multisampling Sascha demo drops the framerate by about 10%. With this patch, enabling fast-clears increases the demo's framerate by 25%. Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
*	intel/blorp/blit: Rename blorp_nir_txf_ms_mcs	Jason Ekstrand	2017-11-27	1	-4/+5
	That name is already taken by one of the helpers in blorp_nir_builder.h and, while we haven't moved the guts of blorp_blit.c there yet, we'd like to start using some things from that header. Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
*	Android: disable warnings causing errors	Rob Herring	2017-11-27	1	-0/+1
	AOSP master has changed the build default to -Werror, making all the warnings errors. Override that with -Wno-error. Signed-off-by: Rob Herring <[email protected]>
*	st/glsl_to_tgsi: make use of driver_cache_blob with the disk cache	Timothy Arceri	2017-11-28	4	-231/+110
	driver_cache_blob was introduced with the i965 disk cache. It allows us to simplify the cache a little and possibly offers some minor speed improvements, since we load the GLSL metadata and TGSI from disk in one pass. Using driver_cache_blob should also make it straightforward to implement binary support for ARB_get_program_binary in gallium. Reviewed-by: Marek Olšák <[email protected]>
*	glsl: Fix typo nagivation -> navigation	Gwan-gyeong Mun	2017-11-28	1	-1/+1
	Signed-off-by: Mun Gwan-gyeong <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
*	gl_table.py: add extern C guard for the generated glapitable.h	Emil Velikov	2017-11-27	1	-0/+8
	The header can be included from C++, hence contents should have appropriate notation. Cc: [email protected] Cc: Dylan Baker <[email protected]> Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
*	ac: pack legacy_surf_level better	Marek Olšák	2017-11-27	1	-3/+3
	r600_texture: 1488 -> 1248 bytes Reviewed-by: Nicolai Hähnle <[email protected]>
*	ac: change legacy_surf_level::slice_size to dword units	Marek Olšák	2017-11-27	12	-36/+38
	The next commit will reduce the size even more. v2: typecast to uint64_t manually v3: add more typecasts, add asserts Reviewed-by: Nicolai Hähnle <[email protected]>
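A sketch of what storing a size in dword units can look like, and why the uint64_t typecasts and asserts mentioned in the revision notes matter. This assumes a 32-bit field holding the size divided by 4; struct fake_level and both helpers are hypothetical and do not reproduce the real legacy_surf_level.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in; the real legacy_surf_level has other members. */
struct fake_level {
   uint32_t slice_size_dw;   /* slice size in 4-byte dwords */
};

static void set_slice_size(struct fake_level *lvl, uint64_t bytes)
{
   /* The size must be dword-aligned and fit in 32 bits once divided. */
   assert(bytes % 4 == 0);
   assert(bytes / 4 <= UINT32_MAX);
   lvl->slice_size_dw = (uint32_t)(bytes / 4);
}

static uint64_t get_slice_size(const struct fake_level *lvl)
{
   /* Widen before multiplying so sizes of 4 GiB and above do not wrap. */
   return (uint64_t)lvl->slice_size_dw * 4;
}
```

Without the cast in get_slice_size, the multiplication would be performed in 32 bits and an 8 GiB slice would read back as 0.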
*	ac: pack ac_surface better	Marek Olšák	2017-11-27	3	-11/+12
	r600_texture: 1736 -> 1488 bytes Reviewed-by: Nicolai Hähnle <[email protected]>
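One common way to pack a struct better is to reorder members so that small fields share padding. The sketch below illustrates that technique only; the log does not say whether these commits reordered members, narrowed types, or used bitfields, and the layouts shown are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layouts; the real ac_surface/legacy_surf_level differ. */
struct loose {
   uint8_t  mode;       /* typically followed by 7 bytes of padding */
   uint64_t offset;
   uint8_t  enabled;    /* typically followed by 7 bytes of padding */
};

struct packed {
   uint64_t offset;     /* widest member first */
   uint8_t  mode;
   uint8_t  enabled;    /* small members share one padded tail */
};

/* Exact sizes depend on the ABI, so only the ordering is asserted. */
static_assert(sizeof(struct packed) <= sizeof(struct loose),
              "reordering must not grow the struct");
```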
*	radeonsi: always initialize max_forced_staging_uploads	Marek Olšák	2017-11-27	1	-0/+2
	r600_resource is malloc'd. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103808 Fixes: 4b0dc098b256 ("gallium/u_threaded: don't map big VRAM buffers for the first upload directly") Reviewed-by: Nicolai Hähnle <[email protected]>
*	radeonsi: remove an old hack for evergreen	Marek Olšák	2017-11-27	1	-10/+0
	Reviewed-by: Nicolai Hähnle <[email protected]>
*	radeonsi: set COMPUTE_RESOURCE_LIMITS.FORCE_SIMD_DIST when profitable	Marek Olšák	2017-11-27	1	-1/+16
	Ported from Vulkan. Reviewed-by: Nicolai Hähnle <[email protected]>
*	ac/nir: don't write tcs outputs to LDS that aren't read back.	Dave Airlie	2017-11-27	1	-1/+16
	If the TCS doesn't read back the outputs, there is no need to store them to LDS in the first place (except for tess factors). This seems to give about 50fps (3290->3330) with the tessellation demo. I haven't tested if it impacts DoW3 at all. Reviewed-by: Marek Olšák <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
*	nir: fill outputs_read field and add patch outputs read (v2)	Dave Airlie	2017-11-27	2	-12/+30
	This is to be used for TCS optimisations on radv. v2: don't set written on reads (nha) Reviewed-by: Timothy Arceri <[email protected]>
*	r600/eg: dump event type in dumps	Dave Airlie	2017-11-27	1	-0/+1
	This just makes it easier to debug some things. Signed-off-by: Dave Airlie <[email protected]>
*	nouveau/compiler: Allow to omit line numbers when printing instructions	Tobias Klausmann	2017-11-26	5	-4/+13
	This comes in handy when checking "NV50_PROG_DEBUG=1" outputs with diff! V2: - Use environmental variable (Karol Herbst) V3: - Use the already populated nv50_ir_prog_info to forward information to the print pass (Pierre Moreau) V4: - get rid of default value in PrintPass constructor Signed-off-by: Tobias Klausmann <[email protected]> Reviewed-by: Pierre Moreau <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
*	radeonsi: try flushing unflushed fences in si_fence_finish even when timeout == 0	Nicolai Hähnle	2017-11-26	1	-3/+3
	Under certain conditions, waiting on a GL sync object should act like a flush, regardless of the timeout. Portal 2, CS:GO, and presumably other Source engine games rely on this behavior and hang during loading without this fix. Fixes: bc65dcab3bc4 ("radeonsi: avoid syncing the driver thread in si_fence_finish") Signed-off-by: Marek Olšák <[email protected]> Tested-by: Kai Wasserbäch <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103902 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103904
*	nv50/ir: move LateAlgebraicOpt to the very end	Ilia Mirkin	2017-11-26	1	-1/+1
	Memory loads can take offsets, but the SHLADD will often attempt to consume the offsets too. As there may be multiple memory loads with the same base but different offsets, those would end up in a SHLADD instead of the offset of the memory operation. This moves the pass after we've had a chance to attempt to propagate immediate adds into the indirect offset.
	total instructions in shared programs : 6580681 -> 6567716 (-0.20%)
	total gprs used in shared programs : 944261 -> 943375 (-0.09%)
	total shared used in shared programs : 0 -> 0 (0.00%)
	total local used in shared programs : 15328 -> 15328 (0.00%)
	total bytes used in shared programs : 60339896 -> 60221504 (-0.20%)
		local	shared	gpr	inst	bytes
	helped	0	0	555	2698	2698
	hurt	0	0	138	336	336
	Signed-off-by: Ilia Mirkin <[email protected]>
*	nv50/ir: when merging immediates/consts, load directly	Ilia Mirkin	2017-11-26	1	-1/+21
	When a MERGE operation gets its constraint moves added, it substantially extends live ranges to be reusing an immediate from earlier in the program (not to mention the silliness of loading an immediate into a register, and then moving into another register). We detect these scenarios and insert moves that take the immediate or constbuf load directly into the register. If it's the last use, then we can just move that operation to the closer location. With SM35 (255 regs) we get these results:
	total instructions in shared programs : 6583670 -> 6580681 (-0.05%)
	total gprs used in shared programs : 950818 -> 944261 (-0.69%)
	total shared used in shared programs : 0 -> 0 (0.00%)
	total local used in shared programs : 15328 -> 15328 (0.00%)
	total bytes used in shared programs : 60367456 -> 60339896 (-0.05%)
		local	shared	gpr	inst	bytes
	helped	0	0	4584	3186	3186
	hurt	0	0	55	968	968
	I suspect they will be better for SM20 and SM30. Signed-off-by: Ilia Mirkin <[email protected]>
*	nv50/ir: add optimization for modulo by a non-power-of-2 value	Ilia Mirkin	2017-11-26	1	-0/+15
	We can still use the optimized division methods which make use of multiplication with overflow. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tobias Klausmann <[email protected]>
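The general idea can be sketched in C for one fixed divisor: compute the quotient with a widening multiply by a precomputed reciprocal, then recover the remainder as x - q * d. This is an illustration of the well-known technique for unsigned x and d = 3 only; it is not the nv50 IR pass, and umod3 is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned x % 3 without a divide instruction. */
static uint32_t umod3(uint32_t x)
{
   /* 0xAAAAAAAB == (2^33 + 1) / 3. Taking the 64-bit product and
    * shifting right by 33 yields x / 3 for every 32-bit x. */
   uint32_t q = (uint32_t)(((uint64_t)x * UINT64_C(0xAAAAAAAB)) >> 33);

   /* q * 3 never exceeds x, so the subtraction cannot wrap. */
   uint32_t r = x - q * 3;
   assert(r < 3);
   return r;
}
```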
*	nv50/ir: optimize signed integer modulo by pow-of-2	Ilia Mirkin	2017-11-25	2	-10/+29
	It's common to use signed int modulo in GLSL. As it happens, the GLSL specs allow the result to be undefined, but that seems fairly surprising. It's not that much more effort to get it right, at least for positive modulo operators. Signed-off-by: Ilia Mirkin <[email protected]>
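For illustration, one way to compute a signed remainder by a power of two with bit operations is to mask the magnitude and then restore the sign, which matches C's truncating % operator. This is a sketch under that assumption; the commit does not show the instruction sequence the nv50 compiler emits, and mod_pow2 is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Computes x % (1 << shift) with C's truncating semantics
 * (the result takes the sign of x). Requires shift <= 30. */
static int32_t mod_pow2(int32_t x, unsigned shift)
{
   assert(shift <= 30);
   uint32_t mask = (UINT32_C(1) << shift) - 1;

   /* Magnitude of x, computed in unsigned arithmetic so that
    * INT32_MIN does not overflow. */
   uint32_t mag = x < 0 ? 0u - (uint32_t)x : (uint32_t)x;

   int32_t r = (int32_t)(mag & mask);
   return x < 0 ? -r : r;
}
```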
*	util: Just give up and define PIPE_ARCH_LITTLE_ENDIAN on MSVC	Matt Turner	2017-11-25	1	-2/+3
	MSVC doesn't support #warning?! Getting really tired of this.