mesa.git - Unnamed repository; edit this file 'description' to name the repository.

	Commit message	Author	Age	Files	Lines
*	i965: Unroll SIMD16 DDY_FINE on Sandybridge.	Kenneth Graunke	2016-04-25	1	-1/+5
\| \| \| \| \| \| \| \| \| \| \| \|	This fixes 10 dEQP-GLES3 subtests: dEQP-GLES3.functional.shaders.derivate.dfdy.texture.float_nicest.*. Matt noticed that our Piglit tests for this use even numbered registers, while the failing dEQP tests use odd numbered registers. We believe that it works for even numbered registers, but not otherwise. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	mesa/gles: Allow format GL_RED to be used with MESA_FORMAT_R_UNORM	Jordan Justen	2016-04-25	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If the bound framebuffer has a format of MESA_FORMAT_R_UNORM, then IMPLEMENTATION_COLOR_READ_FORMAT will return GL_RED. This change applies to OpenGLES contexts where additional restrictions are placed on the formats that are allowed to be supported. Fixes OpenGLES 3.1 CTS tests: * ES31-CTS.texture_border_clamp.sampling_texture.Texture2DDC16 * ES31-CTS.texture_border_clamp.sampling_texture.Texture2DDC16Linear * ES31-CTS.texture_border_clamp.sampling_texture.Texture2DDC32F * ES31-CTS.texture_border_clamp.sampling_texture.Texture2DDC32FLinear Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Mark URB reads as volatile.	Kenneth Graunke	2016-04-25	1	-0/+3
\| \| \| \| \| \| \| \| \| \|	They can be affected by URB writes. In the upcoming scalar TCS backend, this prevents read-modify-write cycles from being broken by CSE removing reads. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eduardo Lima Mitev <[email protected]>
*	i965: Make a few tessellation related functions non-static.	Kenneth Graunke	2016-04-25	3	-47/+51
\| \| \| \| \| \| \| \|	Also, move them to brw_shader.cpp so they're in a location for code used by both the vec4 and fs worlds. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eduardo Lima Mitev <[email protected]>
*	i965/tex_image: Flush certain subnormal ASTC channel values	Nanley Chery	2016-04-23	1	-0/+87
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When uploading a linear, void-extent, ASTC LDR block on Skylake, we are required to flush to zero the UNORM16 channel values that would be denormalized. This is specifically required for the values: 1, 2, and 3. Fixes the 14 failing tests in: dEQP-GLES3.functional.texture.compressed.astc.void_extent_ldr.* v2: Split out flushing function (Kristian Høgsberg) v3: Map with READ instead of INVALIDATE (Kenneth Graunke) Signed-off-by: Nanley Chery <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
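The flush described above can be sketched in a few lines of C. This is only an illustration of the rule stated in the commit message (UNORM16 channel values 1, 2, and 3 become zero); the function names are hypothetical and are not taken from the driver.

```c
#include <stdint.h>

/* Hypothetical helper: flush a UNORM16 channel value to zero when it is
 * one of the values the commit message singles out (1, 2, and 3).
 * Every other value passes through unchanged. */
static inline uint16_t
sketch_flush_astc_channel(uint16_t value)
{
    return (value >= 1 && value <= 3) ? 0 : value;
}

/* Apply the flush to the four channels of one void-extent block color. */
static inline void
sketch_flush_astc_color(uint16_t rgba[4])
{
    for (int i = 0; i < 4; i++)
        rgba[i] = sketch_flush_astc_channel(rgba[i]);
}
```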
*	i965/blorp: Enable for buffer resolves	Topi Pohjolainen	2016-04-23	1	-1/+1
\| \| \| \| \| \| \|	Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94181 Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965/blorp: Enable for normal color clears	Topi Pohjolainen	2016-04-23	1	-0/+9
\| \| \| \| \|	Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965/blorp: Fix clear code for ignoring colormask for XRGB formats on Gen9+	Topi Pohjolainen	2016-04-23	1	-7/+26
\| \| \| \| \| \| \| \| \| \| \| \| \|	This is the equivalent of 73b01e2711ff45a1f313d5372d6c8fa4fe55d4d2 for blorp. v2 (Ken): No need to call _mesa_format_has_color_component() now that the number of components is obtained from _mesa_base_format_component_count(). Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	mesa/formats: Take luminance into account in component count	Topi Pohjolainen	2016-04-23	1	-0/+1
\| \| \| \| \| \|	Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
*	i965/blorp: Do not trigger re-emission of base state address	Topi Pohjolainen	2016-04-23	2	-2/+0
\| \| \| \| \| \| \| \|	If blorp needs to configure it, the result is just as if the render or compute pipeline had configured it. Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965/blorp: Reconfigure base state address only if needed	Topi Pohjolainen	2016-04-23	3	-3/+7
\| \| \| \| \|	Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965/blorp: Use BRW_NEW_BLORP instead of trashing all state bits	Topi Pohjolainen	2016-04-23	2	-5/+2
\| \| \| \| \|	Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Make all atoms to track BRW_NEW_BLORP by default	Kenneth Graunke	2016-04-23	62	-46/+179
\| \| \| \|	Reviewed-by: Topi Pohjolainen <[email protected]>
*	i965: Introduce state flag for blorp	Topi Pohjolainen	2016-04-23	2	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In the past, BLORP has clobbered all BRW_NEW_* state flags, to trigger re-emission of the entire 3D pipeline on the next draw. However, there are some packets BLORP simply leaves alone, so there's no need to re-emit them. Trying to reduce the set of dirty bits flagged after BLORP runs is tricky. Instead, we introduce a BRW_NEW_BLORP flag. This should be set on any atom which emits a packet that BLORP also emits. When BLORP runs, it will flag BRW_NEW_BLORP, causing those packets to get re-emitted. This also makes it easy to avoid re-emitting specific atoms - we can simply drop the BRW_NEW_BLORP flag on those. To start, we assume that all packets need to be re-emitted. This is the safest approach and closest to the existing code's behavior. Many of these are obviously not required, and can be dropped in subsequent patches. Signed-off-by: Topi Pohjolainen <[email protected]> Signed-off-by: Kenneth Graunke <[email protected]>
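The dirty-flag scheme described above can be pictured with a minimal sketch. All names and bit values here are made up for illustration (the real driver has many more BRW_NEW_* flags and a different atom structure); the point is only that an atom re-emits its packet when its mask intersects the currently dirty bits, so dropping the blorp bit from an atom's mask stops blorp from forcing its re-emission.

```c
#include <stdint.h>

/* Illustrative dirty bits; the values are invented for this sketch. */
#define SKETCH_NEW_BLORP    (1ull << 0)
#define SKETCH_NEW_VIEWPORT (1ull << 1)

/* Hypothetical stand-in for a state atom. */
struct sketch_atom {
    uint64_t dirty_mask;  /* flags that make this atom re-emit its packet */
    int emit_count;       /* how many times the packet was emitted */
};

/* Running blorp flags only the blorp bit instead of every state bit. */
static void
sketch_blorp_exec(uint64_t *state_dirty)
{
    *state_dirty |= SKETCH_NEW_BLORP;
}

/* On the next draw, re-emit only the atoms whose mask intersects the
 * dirty bits, then clear the dirty bits. */
static void
sketch_upload_state(struct sketch_atom *atoms, int count,
                    uint64_t *state_dirty)
{
    for (int i = 0; i < count; i++) {
        if (atoms[i].dirty_mask & *state_dirty)
            atoms[i].emit_count++;
    }
    *state_dirty = 0;
}
```

In this sketch, an atom whose packet blorp never touches simply omits `SKETCH_NEW_BLORP` from its mask.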
*	i965/blorp/gen6: Use normal base state address setup	Topi Pohjolainen	2016-04-23	3	-54/+5
\| \| \| \| \| \| \| \| \| \| \| \|	This is identical to the blorp version, which differs only when the fragment shader isn't used. In that case blorp would reset batch buffer address to zero. This is not really needed, and having blorp use a base state address setup that is compatible with the normal upload allows one to skip resetting it. Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Remove pointers to non-existing atoms	Topi Pohjolainen	2016-04-23	1	-8/+0
\| \| \| \| \|	Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Disable KHR_texture_compression_astc_hdr on Gen9	Nanley Chery	2016-04-22	2	-4/+3
\| \| \| \| \| \| \| \| \| \| \| \|	Although Gen9 samples from most HDR ASTC surfaces correctly, there are currently no software workarounds to fix the incorrect sampling that occurs in others with certain color endpoint modes. With this change, we are no longer failing the 14 tests from: dEQP-GLES3.functional.texture.compressed.astc.endpoint_value_hdr_cem_15.* Signed-off-by: Nanley Chery <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
*	i965: Disable channel expressions for scalar GS, TCS, TES.	Kenneth Graunke	2016-04-22	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On Broadwell, I get the following shader-db statistics: Tessellation Control Shaders: total instructions in shared programs: 57327 -> 57012 (-0.55%) instructions in affected programs: 27334 -> 27019 (-1.15%) helped: 45 HURT: 0 total cycles in shared programs: 265692 -> 255188 (-3.95%) cycles in affected programs: 263122 -> 252618 (-3.99%) helped: 184 HURT: 26 Tessellation Evaluation Shaders: total instructions in shared programs: 23236 -> 23157 (-0.34%) instructions in affected programs: 2791 -> 2712 (-2.83%) helped: 27 HURT: 0 total cycles in shared programs: 151858 -> 149704 (-1.42%) cycles in affected programs: 151858 -> 149704 (-1.42%) helped: 101 HURT: 114 Geometry Shaders: Orbital Explorer goes from 6442 -> 6356 instructions. Two Shadow of Mordor shaders increase by a single instruction. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	i965/blorp: Add support for 2x msaa	Topi Pohjolainen	2016-04-22	2	-10/+9
\| \| \| \| \|	Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965/blorp: Add support for encoding/decoding interleaved 2x msaa	Topi Pohjolainen	2016-04-22	1	-8/+36
\| \| \| \| \|	Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: don't lower mod() in glsl ir	Samuel Iglesias Gonsálvez	2016-04-22	1	-1/+0
\| \| \| \| \| \| \| \| \|	NIR will lower it in nir_opt_algebraic. No change in shader-db. Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965/surface_state: Use libisl functions for image format lowering	Jason Ekstrand	2016-04-21	3	-120/+12
\| \| \| \| \| \| \|	This lets us delete some redundant code and keep all of the image_load_store format lowering logic in one place: libisl. Reviewed-by: Chad Versace <[email protected]>
*	i965/fs_surface_builder: Use isl instead of mesa for format info	Jason Ekstrand	2016-04-21	1	-66/+52
\| \| \| \|	Reviewed-by: Chad Versace <[email protected]>
*	i965/fs_surface_builder: Add a helper for converting GL to ISL formats	Jason Ekstrand	2016-04-21	1	-0/+55
\| \| \| \|	Reviewed-by: Chad Versace <[email protected]>
*	i965/fs_surface_builder: Explicitly handle FORMAT_NONE in num_image_coordinates	Jason Ekstrand	2016-04-21	1	-0/+1
\| \| \| \| \| \| \| \| \|	Previously, we were relying on has_matching_typed_format returning true for MESA_FORMAT_NONE which, in turn, relied on _mesa_get_format_bytes returning 1 for MESA_FORMAT_NONE. When we switch to ISL, this behaviour will no longer be something we can rely on. Reviewed-by: Chad Versace <[email protected]>
*	i965/fs_surface_builder: Take a GL format enum instead of mesa_format	Jason Ekstrand	2016-04-21	3	-9/+10
\| \| \| \|	Reviewed-by: Chad Versace <[email protected]>
*	i965: Add a dependency on libisl	Jason Ekstrand	2016-04-21	1	-1/+6
\| \| \| \| \| \| \|	To avoid build issues, ensure that you're running `make' at the top level and/or you've executed `make clean' beforehand. Reviewed-by: Chad Versace <[email protected]>
*	st/mesa: check return value of begin/end_query	Nicolai Hähnle	2016-04-21	1	-22/+33
\| \| \| \| \| \| \| \| \|	They can only indicate out of memory conditions, since the other error conditions are caught earlier. v2: fix error message in EndQuery Reviewed-by: Samuel Pitoiset <[email protected]>
*	i965: Always use Y-tiled buffers on SKL+	Ben Widawsky	2016-04-21	4	-8/+30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Starting with Skylake, the display engine is capable of scanning out from Y-tiled buffers. As such, we can and should use Y-tiling for better efficiency. This also has the added benefit of being able to fast clear the winsys buffer. Note that the buffer allocation done for mipmaps will already never allocate an X-tiled buffer for GEN9. This has an almost universal positive impact on benchmarks, some improving by as much as 20%. Signed-off-by: Ben Widawsky <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	gallium: use PIPE_SHADER_* everywhere, remove TGSI_PROCESSOR_*	Marek Olšák	2016-04-22	5	-55/+55
\| \| \| \|	Acked-by: Jose Fonseca <[email protected]>
*	gallium: merge PIPE_SWIZZLE_* and UTIL_FORMAT_SWIZZLE_*	Marek Olšák	2016-04-22	3	-50/+50
\| \| \| \| \| \| \| \|	Use PIPE_SWIZZLE_* everywhere. Use X/Y/Z/W/0/1 instead of RED, GREEN, BLUE, ALPHA, ZERO, ONE. The new enum is called pipe_swizzle. Acked-by: Jose Fonseca <[email protected]>
*	i965: Fix clear code for ignoring colormask for XRGB formats on Gen9+.	Kenneth Graunke	2016-04-21	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In commit cda886a4851ab767fba40e8474d6fa8190347e4f, Neil made us stop advertising RGBX formats on Gen9+, as the hardware apparently no longer has working fast clear support for those formats. Instead, we just fall back to RGBA formats, and use SCS to override alpha to 1.0. This is fine, but had one unintended side effect: it made us fall back to slow clears when the color mask disables alpha. Normally, we ignore the color mask for non-existent channels. This includes alpha for XRGB formats as writing garbage to the X channel is harmless. But, now that we use RGBA, we think there's a real alpha channel, and can't do the optimization. To hack around this, check if _BaseFormat is GL_RGB and ignore alpha. Improves WebGL Aquarium performance on Skylake GT3e by about 50% by letting it use repclears instead of slow clears. Cc: [email protected] Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ben Widawsky <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
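The mask check described above can be sketched as follows. This is a simplified illustration, not the driver's code: the function name is hypothetical, and a boolean parameter stands in for the "_BaseFormat is GL_RGB" test mentioned in the commit message.

```c
#include <stdbool.h>

/* Returns true when the color mask leaves every channel that matters
 * enabled, so the clear can be treated as unmasked.
 *
 * mask[0..3] are the R, G, B, A write enables. base_format_is_rgb is an
 * assumed stand-in for checking that the base format is GL_RGB, in which
 * case alpha is not a real channel and its mask bit is ignored. */
static bool
sketch_color_mask_is_full(const bool mask[4], bool base_format_is_rgb)
{
    for (int i = 0; i < 4; i++) {
        if (i == 3 && base_format_is_rgb)
            continue;  /* writing garbage to the X channel is harmless */
        if (!mask[i])
            return false;
    }
    return true;
}
```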
*	i965/blorp: Improve precision of blitting coordinates when clipping	Iago Toral Quiroga	2016-04-21	1	-61/+163
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We do this in two steps: first we clip the dst rect and adjust the src rect accordingly. Then we do it the other way around. In both passes the adjustment part involves multiplying by a scale factor that can lead to a small precision loss. This is breaking a few dEQP tests. Specifically, the problem happens when we need to clip the same coordinate twice. For example, if srcX0 and dstX0 need both to be clipped we want to avoid the situation where we clip srcX0 first, then adjust dstX0 accordingly but then we realize that the resulting dstX0 still needs to be clipped, so we clip dstX0 and adjust srcX0 again. Each of these two passes can lead to precision loss. What we want to do here is detect the rect that leads to the largest clip (accounting for the scale factor involved), clip that rect and adjust the other one. With this we ensure that the adjusted coordinate does not need to be clipped again and we can skip a second pass, improving precision. Fixes the following 4 dEQP tests: dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_src_x_nearest dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_src_x_linear dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_dst_x_nearest dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_dst_x_linear Reviewed-by: Kenneth Graunke <[email protected]> Tested-by: Mark Janes <[email protected]>
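A one-dimensional sketch of the "clip the larger side, adjust the other" idea might look like the following. It handles only the low edge of an unmirrored blit and assumes a positive scale factor expressed as source units per destination unit; the function name and parameters are hypothetical and the real patch covers more cases.

```c
/* Clip the low edge of a 1-D blit so that *src0 >= src_min and
 * *dst0 >= dst_min, using a single pass.
 *
 * scale is the number of source units per destination unit (assumed > 0). */
static void
sketch_clip_low_edge(float *src0, float *dst0,
                     float src_min, float dst_min, float scale)
{
    /* How far each coordinate lies outside its bound, in its own units. */
    float clip_src = *src0 < src_min ? src_min - *src0 : 0.0f;
    float clip_dst = *dst0 < dst_min ? dst_min - *dst0 : 0.0f;

    if (clip_src == 0.0f && clip_dst == 0.0f)
        return;

    /* Compare both clips in source units and clip the larger one.
     * The clipped side lands exactly on its bound; the other side is
     * adjusted once and cannot need a second clip. */
    if (clip_src >= clip_dst * scale) {
        *src0 = src_min;
        *dst0 += clip_src / scale;
    } else {
        *dst0 = dst_min;
        *src0 += clip_dst * scale;
    }
}
```

Because the larger clip is applied first, the adjusted coordinate moves at least as far as its own bound requires, which is what removes the second pass.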
*	i965/fs: Readd opt_drop_redundant_mov_to_flags().	Matt Turner	2016-04-21	2	-0/+41
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit b449366587b5f3f64c6fb45fe22c39e4bc8a4309. I removed the pass thinking that it was now not useful, but that was not true. I believe I ran shader-db on HSW and saw no results, but HSW does not use the unlit centroid workaround code and as a result does not emit redundant MOV_DISPATCH_TO_FLAGS instructions. On IVB, the shader-db results are: total instructions in shared programs: 6650806 -> 6646303 (-0.07%) instructions in affected programs: 106893 -> 102390 (-4.21%) helped: 793 total cycles in shared programs: 56195538 -> 56103720 (-0.16%) cycles in affected programs: 873048 -> 781230 (-10.52%) helped: 553 HURT: 209 On SNB, the shader-db results are: total instructions in shared programs: 7173074 -> 7168541 (-0.06%) instructions in affected programs: 119757 -> 115224 (-3.79%) helped: 799 total cycles in shared programs: 98128032 -> 98072938 (-0.06%) cycles in affected programs: 1437104 -> 1382010 (-3.83%) helped: 454 HURT: 237 Reviewed-by: Iago Toral Quiroga <[email protected]>
*	i965/blorp: Do not emit pma stall on gen9+	Topi Pohjolainen	2016-04-21	1	-1/+3
\| \| \| \| \| \| \|	This was left out from the original gen8 upload introduction. Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: automake: remove gratuitous "+" during variable assignment	Emil Velikov	2016-04-21	1	-2/+2
\| \| \| \| \| \| \|	There is no initial assignment, thus appending to it does not work. Fixes: b27c85c4c08 "i965: add build rule for brw_nir_trig_workarounds.c" Signed-off-by: Emil Velikov <[email protected]>
*	dri/common: add MESA_FORMAT_R8G8B8{A8, X8}_UNORM formats as supported configs	Rob Herring	2016-04-21	1	-0/+10
\| \| \| \| \| \| \| \| \|	Add MESA_FORMAT_R8G8B8A8_UNORM and MESA_FORMAT_R8G8B8X8_UNORM formats as these are the preferred formats for Android. Signed-off-by: Rob Herring <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
*	i965: add build rule for brw_nir_trig_workarounds.c on Android	Rob Herring	2016-04-21	4	-2/+49
\| \| \| \| \| \| \| \| \| \| \|	Commit bfd17c76c126 ("i965: Port INTEL_PRECISE_TRIG=1 to NIR.") added a generated file brw_nir_trig_workarounds.c which broke the Android build. Add the necessary makefiles to the Android build. Cc: Kenneth Graunke <[email protected]> Signed-off-by: Rob Herring <[email protected]> Tested-by: Chih-Wei Huang <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
*	i965/tiled_memcpy: don't unconditionally use __builtin_bswap32	Jonathan Gray	2016-04-21	1	-1/+14
\| \| \| \| \| \| \| \| \| \|	Use the defines Mesa configure sets to indicate presence of the bswap32 builtins. This lets i965 work on OpenBSD again after the changes that were made in 0a5d8d9af42fd77fce1492d55f958da97816961a. Signed-off-by: Jonathan Gray <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Reviewed-by: Matt Turner <[email protected]>
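The pattern described above, using a configure-provided define to choose between the builtin and a portable fallback, can be sketched like this. The macro name `HAVE___BUILTIN_BSWAP32` is an assumption about what the build system defines, and the function name is hypothetical.

```c
#include <stdint.h>

/* Byte-swap a 32-bit value. Uses the compiler builtin only when the
 * build system has said it exists (macro name assumed for this sketch);
 * otherwise falls back to plain shifts and masks. */
static inline uint32_t
sketch_bswap32(uint32_t n)
{
#if defined(HAVE___BUILTIN_BSWAP32)
    return __builtin_bswap32(n);
#else
    return (n >> 24) |
           ((n >> 8) & 0x0000ff00u) |
           ((n << 8) & 0x00ff0000u) |
           (n << 24);
#endif
}
```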
*	i965/blorp: Reduce the urb size requirement for vertex buffer	Topi Pohjolainen	2016-04-21	1	-5/+4
\| \| \| \| \|	Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965/blorp: Reduce the size of vertex buffer	Topi Pohjolainen	2016-04-21	1	-12/+19
\| \| \| \| \| \| \| \| \| \| \|	Previously the vertex buffer consisted of eight floats per vertex, of which six were constants. These can just as easily be provided by the vertex fetcher, as it is capable of filling vertex elements with constant one and zero. This reduces the size of the vertex buffer from 3 * 8 * 4 = 96 to 3 * 2 * 4 = 24 bytes. Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
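The size arithmetic above can be checked with a small sketch. The names are hypothetical, and the 96 and 24 byte figures assume a 4-byte `float`, as the commit message does.

```c
#include <stddef.h>

/* Number of vertices in the buffer, per the commit message. */
enum { SKETCH_NUM_VERTICES = 3 };

/* Size in bytes of a vertex buffer holding floats_per_vertex floats for
 * each of the vertices. */
static size_t
sketch_vertex_buffer_size(size_t floats_per_vertex)
{
    return SKETCH_NUM_VERTICES * floats_per_vertex * sizeof(float);
}
```

With eight floats per vertex this gives the old size, and with two floats per vertex the new one.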
*	i965/blorp: Do not trigger urb re-configuration unnecessarily	Topi Pohjolainen	2016-04-21	2	-1/+5
\| \| \| \| \|	Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965/blorp: Skip re-emitting urb config whenever possible	Topi Pohjolainen	2016-04-21	1	-0/+15
\| \| \| \| \| \| \| \| \| \| \|	Otherwise clearing with blorp will regress performance in some synthetic test cases. v2: Used vsize >= 2 instead of vsize > 0, and updated the comment. Review by Ken in one of the earlier patches revealed this. Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965/blorp: Prepare to switch from compute pipeline	Topi Pohjolainen	2016-04-21	1	-0/+2
\| \| \| \| \|	Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965/blorp: Skip uploading state/options not needed for clears	Topi Pohjolainen	2016-04-21	3	-17/+37
\| \| \| \| \| \| \| \| \|	If there is no source, the program does a simple clear or a resolve. In that case there is no need to program sampling state or enable pixel kill in the fragment shader. Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965/blorp: Re-introduce clear programs	Topi Pohjolainen	2016-04-21	5	-4/+473
\| \| \| \| \| \| \|	This partially reverts 2f28a0dc23165123cf1e8b5942acad37878edd8a Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965/meta: Move check for srgb into is_color_fast_clear_compatible()	Topi Pohjolainen	2016-04-21	2	-17/+19
\| \| \| \| \|	Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965/meta: Expose check for fast clear compatibility	Topi Pohjolainen	2016-04-21	2	-20/+25
\| \| \| \| \| \| \|	Also add the additional render format check to the same utility. Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965/meta: Expose fast clear value setup	Topi Pohjolainen	2016-04-21	2	-5/+10
\| \| \| \| \|	Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965/meta: Expose non-fast clear rectangle calculation	Topi Pohjolainen	2016-04-21	2	-10/+21
\| \| \| \| \|	Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>