mesa.git - Unnamed repository; edit this file 'description' to name the repository.

	Commit message (Collapse)	Author	Age	Files	Lines
*	mesa: add GL_EXT_copy_image support	Ilia Mirkin	2016-03-30	1	-0/+1
\| \| \| \| \| \| \| \|	The extension is identical to GL_OES_copy_image. But dEQP has tests that want the EXT variant. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
*	mesa: add GL_OES_copy_image support	Ilia Mirkin	2016-03-30	6	-1/+128
\| \| \| \| \|	Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
*	mesa: remove duplicate MAX_GEOMETRY_SHADER_INVOCATIONS entry	Ilia Mirkin	2016-03-30	1	-3/+0
\| \| \| \| \|	Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
*	st/mesa: add ES sample-shading support	Ilia Mirkin	2016-03-30	1	-0/+6
\| \| \| \| \| \| \| \| \|	We require the full ARB_gpu_shader5 for now, but in the future some other CAP could get exposed to indicate that only the multisample-related behavior of ARB_gpu_shader5 is available. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
*	mesa: add GL_OES_shader_multisample_interpolation support	Ilia Mirkin	2016-03-30	3	-3/+14
\| \| \| \| \| \|	Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Dave Airlie <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
*	mesa: add GL_OES_sample_shading support	Ilia Mirkin	2016-03-30	4	-3/+8
\| \| \| \| \| \|	Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Dave Airlie <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
*	mesa: add OES_sample_variables to extension table, add enable bit	Ilia Mirkin	2016-03-30	2	-0/+2
\| \| \| \| \| \|	Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Dave Airlie <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
*	i965: Don't add barrier deps for FB write messages.	Matt Turner	2016-03-30	1	-1/+2
\| \| \| \| \| \| \|	Ken did this earlier, and this is just me reimplementing his patch a little differently. Reviewed-by: Francisco Jerez <[email protected]>
*	i965: Add and use is_scheduling_barrier() function.	Matt Turner	2016-03-30	1	-4/+17
\|
*	i965: Remove NOP insertion kludge in scheduler.	Matt Turner	2016-03-30	1	-20/+5
\| \| \| \| \| \| \| \| \| \| \|	Instead of removing every instruction in add_insts_from_block(), just move the instruction to its scheduled location. This is a step towards doing both bottom-up and top-down scheduling without conflicts. Note that this patch changes cycle counts for programs because it begins including control flow instructions in the estimates. Reviewed-by: Francisco Jerez <[email protected]>
*	i965: Assert that an instruction is not inserted around itself.	Matt Turner	2016-03-30	1	-0/+4
\| \| \| \|	Reviewed-by: Francisco Jerez <[email protected]>
*	i965: Relax restriction on scheduling last instruction.	Matt Turner	2016-03-30	1	-20/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I think when this code was written, basic blocks were always ended by a control flow instruction or an end-of-thread message. That's no longer the case, and removing this restriction actually helps things: instructions in affected programs: 7267 -> 7244 (-0.32%) helped: 4 total cycles in shared programs: 66559580 -> 66431900 (-0.19%) cycles in affected programs: 28310152 -> 28182472 (-0.45%) helped: 9577 HURT: 879 GAINED: 2 The addition of the is_control_flow() checks is not a functional change, since the add_insts_from_block() does not put them in the list of instructions to schedule. I plan to change this in a later patch. Reviewed-by: Francisco Jerez <[email protected]>
*	i965/vec4/tcs: Set conditional mod on TCS_OPCODE_SRC0_010_IS_ZERO.	Matt Turner	2016-03-30	2	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Missing this causes an assertion failure in the scheduler with the next patch. Additionally, this gives cmod propagation enough information to optimize code better. total instructions in shared programs: 7112991 -> 7112852 (-0.00%) instructions in affected programs: 25704 -> 25565 (-0.54%) helped: 139 total cycles in shared programs: 64812898 -> 64810674 (-0.00%) cycles in affected programs: 127224 -> 125000 (-1.75%) helped: 139 Acked-by: Francisco Jerez <[email protected]>
*	Revert "i965: Don't add barrier deps for FB write messages."	Matt Turner	2016-03-30	1	-4/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit d0e1d6b7e27bf5f05436e47080d326d7daa63af2. The change in the vec4 code is a mistake -- there's never an FS_OPCODE_FB_WRITE in vec4 code. The change in the fs code had the (harmless) effect of not recognizing an FB_WRITE as a scheduling barrier even if it was marked EOT -- harmless because the scheduler marked the last instruction of a block as a barrier, something I'm changing in the following patches. This will be reimplemented later in the series.
*	i965: Simplify full scheduling-barrier conditions.	Matt Turner	2016-03-30	1	-27/+8
\| \| \| \| \| \| \|	All of these were simply code for "architecture register file" (and in the case of destinations, "not the null register"). Reviewed-by: Francisco Jerez <[email protected]>
*	i965: Remove incorrect cycle estimates.	Matt Turner	2016-03-30	1	-10/+0
\| \| \| \| \| \| \| \|	These printed the cycle count the last basic block (sched.time is set per basic block!). We have accurate, full program, data printed elsewhere. Reviewed-by: Francisco Jerez <[email protected]>
*	st/mesa: fix fallout from xfb changes.	Dave Airlie	2016-03-31	1	-2/+2
\| \| \| \| \| \| \|	Failed to update state tracker with new buffer interface. Reviewed-by: Timothy Arceri <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
*	mesa: add query support for GL_TRANSFORM_FEEDBACK_BUFFER interface	Timothy Arceri	2016-03-31	3	-2/+51
\| \| \| \|	Reviewed-by: Dave Airlie <[email protected]>
*	glsl: add transform feedback buffers to resource list	Timothy Arceri	2016-03-31	3	-3/+3
\| \| \| \|	Reviewed-by: Dave Airlie <[email protected]>
*	mesa: add support to query GL_TRANSFORM_FEEDBACK_BUFFER_INDEX	Timothy Arceri	2016-03-31	2	-0/+7
\| \| \| \|	Reviewed-by: Dave Airlie <[email protected]>
*	mesa: add support to query GL_OFFSET for GL_TRANSFORM_FEEDBACK_VARYING	Timothy Arceri	2016-03-31	2	-3/+12
\| \| \| \|	Reviewed-by: Dave Airlie <[email protected]>
*	mesa: rename tranform feeback varying macro XFB to XFV	Timothy Arceri	2016-03-31	1	-6/+6
\| \| \| \| \| \|	A latter patch will use XFB for buffers. Reviewed-by: Dave Airlie <[email protected]>
*	glsl: validate global out xfb_stride qualifiers and set stride on empty buffers	Timothy Arceri	2016-03-31	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Here we use the built-in validation in ast_layout_expression::process_qualifier_constant() to check for mismatching global out strides on buffers in a single shader. From the ARB_enhanced_layouts spec: "While xfb_stride can be declared multiple times for the same buffer, it is a compile-time or link-time error to have different values specified for the stride for the same buffer." For intrastage validation a new helper link_xfb_stride_layout_qualifiers() is created. We also take this opportunity to make sure stride is at least a multiple of 4, we will validate doubles at a later stage. From the ARB_enhanced_layouts spec: "If the buffer is capturing any double-typed outputs, the stride must be a multiple of 8, otherwise it must be a multiple of 4, or a compile-time or link-time error results." Finally we update store_tfeedback_info() to apply the strides to LinkedTransformFeedback and update the buffers bitmask to mark any global buffers with a stride as active. For example a shader with: layout (xfb_buffer = 0, xfb_offset = 0) out vec4 gs_fs; layout (xfb_buffer = 1, xfb_stride = 64) out; Is expected to have a buffer bound to both 0 and 1. From the ARB_enhanced_layouts spec: "A binding point requires a bound buffer object if and only if its associated stride in the program object used for transform feedback primitive capture is non-zero." Reviewed-by: Dave Airlie <[email protected]>
*	mesa: split transform feedback buffer into its own struct	Timothy Arceri	2016-03-31	6	-20/+28
\| \| \| \| \| \| \|	This will be used in a following patch to implement interface query support for TRANSFORM_FEEDBACK_BUFFER. Reviewed-by: Dave Airlie <[email protected]>
*	glsl: use bitmask of active xfb buffer indices	Timothy Arceri	2016-03-31	4	-22/+24
\| \| \| \| \| \| \| \| \| \| \| \| \|	This allows us to print the correct binding point when not all buffers declared in the shader are bound. For example if we use a single buffer: layout(xfb_buffer=2, offset=0) out vec4 v; We now print '2' when the buffer is not bound rather than '0'. Reviewed-by: Dave Airlie <[email protected]>
*	i965: Don't inline intel_batchbuffer_require_space().	Matt Turner	2016-03-30	2	-26/+28
\| \| \| \| \| \| \| \| \| \| \| \|	It's called by the inline intel_batchbuffer_begin() function which itself is used in BEGIN_BATCH. So in sequence of code emitting multiple packets, we have inlined this ~200 byte function multiple times. Making it an out-of-line function presumably improved icache usage. Improves performance of Gl32Batch7 by 3.39898% +/- 0.358674% (n=155) on Ivybridge. Reviewed-by: Abdiel Janulgue <[email protected]>
*	mesa: allow mutable buffer textures to back GL ES images	Ilia Mirkin	2016-03-29	1	-1/+6
\| \| \| \| \| \| \| \| \|	Since there is no way to create immutable texture buffers in GL ES, mutable buffer textures are allowed to back images. See issue 7 of the GL_OES_texture_buffer specification. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
*	mesa: make _mesa_prepare_mipmap_level() static	Brian Paul	2016-03-29	2	-15/+8
\| \| \| \| \| \| \| \| \|	No longer called from any other file. Reviewed-by: José Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Tested-by: Ian Romanick <[email protected]>
*	meta: use _mesa_prepare_mipmap_levels()	Brian Paul	2016-03-29	1	-24/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The prepare_mipmap_level() wrapper for _mesa_prepare_mipmap_level() is not needed. It only served to undo the GL_TEXTURE_1D_ARRAY height/depth change was was made before the call to prepare_mipmap_level() Said another way, regardless of how the meta code manipulates the height/ depth dims for GL_TEXTURE_1D_ARRAY, the gl_texture_image dimensions are correctly set up by _mesa_prepare_mipmap_levels(). Tested by plugging _mesa_meta_GenerateMipmap() into the swrast driver and testing with piglit. v2 (idr): Early out of the mipmap generation loop with dstImage is NULL. This can occur for immutable textures that have a limited range of levels or in the presense of memory allocation failures. Fixes arb_texture_view-mipgen on Intel platforms. Reviewed-by: José Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Tested-by: Ian Romanick <[email protected]> Reviewed-by: Brian Paul <[email protected]>
*	xlib: add support for GLX_ARB_create_context	Brian Paul	2016-03-29	3	-0/+77
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This adds the glXCreateContextAttribsARB() function for the xlib/swrast driver. This allows more piglit tests to run with this driver. For example, without this patch we get: $ bin/fbo-generatemipmap-1d -auto piglit: error: waffle_config_choose failed due to WAFFLE_ERROR_UNSUPPORTED_ ON_PLATFORM: GLX_ARB_create_context is required in order to request an OpenGL version not equal to the default value 1.0 piglit: error: Failed to create waffle_config for OpenGL 2.0 Compatibility Context piglit: info: Failed to create any GL context PIGLIT: {"result": "skip" } Reviewed-by: Jose Fonseca <[email protected]> Acked-by: Roland Scheidegger <[email protected]>
*	st/mesa: simplify st_generate_mipmap()	Brian Paul	2016-03-29	1	-78/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The whole st_generate_mipmap() function was overly complicated. Now we just call the new _mesa_prepare_mipmap_levels() function to prepare the texture mipmap memory, then call the generate function which fills in the texture images. This fixes a failed assertion in llvmpipe/softpipe which is hit with the new piglit generatemipmap-base-change test. Also fixes some device errors (format mismatches) with the VMware svga driver. v2: fix a comment typo, per Sinclair Reviewed-by: Sinclair Yeh <[email protected]> Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
*	mesa: new _mesa_prepare_mipmap_levels() function for mipmap generation	Brian Paul	2016-03-29	2	-31/+62
\| \| \| \| \| \| \| \| \| \| \| \|	Simplifies the loops in generate_mipmap_uncompressed() and generate_mipmap_compressed(). Will be used in the state tracker too. Could probably be used in the meta code. If so, some additional clean-ups can be done after that. v2: use unsigned types instead of GLuint, per Ian Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
*	i965: Don't use CUBE wrap modes for integer formats on IVB/BYT.	Kenneth Graunke	2016-03-29	1	-1/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There is no linear filtering for integer formats, so we should always be using CLAMP_TO_EDGE mode. Fixes 46 dEQP cases on Ivybridge (which were likely broken by commit 0faf26e6a0a34c3544644852802484f2404cc83e). This workaround doesn't appear to be necessary on any other hardware; I haven't found any documentation mentioning errata in this area. v2: Only apply on Ivybridge/Baytrail to avoid regressing GLES3.1 tests. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> [v1]
*	Revert "i965: Set address rounding bits for GL_NEAREST filtering as well."	Kenneth Graunke	2016-03-29	1	-6/+3
\| \| \| \| \| \| \|	This reverts commit 60d6a8989ab44cf47accee6bc692ba6fb98f6a9f. It's pretty sketchy, and apparently regressed a bunch of dEQP tests on Sandybridge.
*	st/mesa: implement new DMA-buf based VDPAU interop v2	Christian König	2016-03-29	1	-49/+132
\| \| \| \| \| \| \| \| \| \|	Avoid using internal structures from another API. v2: rebase and moved includes so they don't cause problem when VDPAU isn't installed. Signed-off-by: Christian König <[email protected]> Reviewed-by: Marek Olšák <[email protected]> (v1) Reviewed-by: Leo Liu <[email protected]>
*	st/mesa: enable OES_texture_buffer when all components available	Ilia Mirkin	2016-03-29	1	-0/+6
\| \| \| \| \| \| \| \|	OES_texture_buffer combines bits from a number of desktop extensions. When they're all available, turn it on. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
*	mesa: add OES_texture_buffer and EXT_texture_buffer support	Ilia Mirkin	2016-03-28	7	-44/+55
\| \| \| \| \| \| \| \|	Allow ES 3.1 contexts to access the texture buffer functionality. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
*	mesa: add OES_texture_buffer and EXT_texture_buffer extension to table	Ilia Mirkin	2016-03-28	2	-0/+3
\| \| \| \| \| \| \| \| \| \| \|	We need to add a new bit since the GL ES exts require functionality from a combination of texture buffer extensions as well as images (for imageBuffer) support. Additionally, not all GPUs support all the texture buffer functionality (e.g. rgb32 isn't supported by nv50). Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
*	mesa: properly return GetTexLevelParameter queries for buffer textures	Ilia Mirkin	2016-03-28	1	-2/+52
\| \| \| \| \| \| \| \| \| \| \| \| \|	This fixes all failures with dEQP tests in this area. While ARB_texture_buffer_object explicitly says that GetTexLevelParameter & co should not be supported, GL 3.1 reverses this decision and allows all of these queries there. Conversely, there is no text that forbids the buffer-specific queries from being used with non-buffer images. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Brian Paul <[email protected]>
*	mesa: remove initialized field from uniform storage	Timothy Arceri	2016-03-29	3	-43/+1
\| \| \| \| \| \| \| \|	The only place this was used was in a gallium debug function that had to be manually enabled. Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Brian Paul <[email protected]>
*	i965: Set address rounding bits for GL_NEAREST filtering as well.	Kenneth Graunke	2016-03-28	1	-3/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Yuanhan Liu decided these were useful for linear filtering in commit 76669381 (circa 2011). Prior to that, we never set them; it seems he tried to preserve that behavior for nearest filtering. It turns out they're useful for nearest filtering, too: setting these fixes the following dEQP-GLES3 tests: functional.fbo.blit.rect.nearest_consistency_mag functional.fbo.blit.rect.nearest_consistency_mag_reverse_src_x functional.fbo.blit.rect.nearest_consistency_mag_reverse_src_y functional.fbo.blit.rect.nearest_consistency_mag_reverse_dst_x functional.fbo.blit.rect.nearest_consistency_mag_reverse_dst_y functional.fbo.blit.rect.nearest_consistency_mag_reverse_src_dst_x functional.fbo.blit.rect.nearest_consistency_mag_reverse_src_dst_y functional.fbo.blit.rect.nearest_consistency_min functional.fbo.blit.rect.nearest_consistency_min_reverse_src_x functional.fbo.blit.rect.nearest_consistency_min_reverse_src_y functional.fbo.blit.rect.nearest_consistency_min_reverse_dst_x functional.fbo.blit.rect.nearest_consistency_min_reverse_dst_y functional.fbo.blit.rect.nearest_consistency_min_reverse_src_dst_x functional.fbo.blit.rect.nearest_consistency_min_reverse_src_dst_y functional.fbo.blit.rect.nearest_consistency_out_of_bounds_mag functional.fbo.blit.rect.nearest_consistency_out_of_bounds_mag_reverse_src_x functional.fbo.blit.rect.nearest_consistency_out_of_bounds_mag_reverse_src_y functional.fbo.blit.rect.nearest_consistency_out_of_bounds_mag_reverse_dst_x functional.fbo.blit.rect.nearest_consistency_out_of_bounds_mag_reverse_dst_y functional.fbo.blit.rect.nearest_consistency_out_of_bounds_mag_reverse_src_dst_x functional.fbo.blit.rect.nearest_consistency_out_of_bounds_mag_reverse_src_dst_y functional.fbo.blit.rect.nearest_consistency_out_of_bounds_min functional.fbo.blit.rect.nearest_consistency_out_of_bounds_min_reverse_src_x functional.fbo.blit.rect.nearest_consistency_out_of_bounds_min_reverse_src_y functional.fbo.blit.rect.nearest_consistency_out_of_bounds_min_reverse_dst_x functional.fbo.blit.rect.nearest_consistency_out_of_bounds_min_reverse_dst_y functional.fbo.blit.rect.nearest_consistency_out_of_bounds_min_reverse_src_dst_x functional.fbo.blit.rect.nearest_consistency_out_of_bounds_min_reverse_src_dst_y Apparently, BLORP has always set these bits unconditionally. However, setting them unconditionally appears to regress tests using texture projection, 3D samplers, integer formats, and vertex shaders, all in combination, such as: functional.shaders.texture_functions.textureprojlod.isampler3d_vertex Setting them on Gen4-5 appears to regress Piglit's tests/spec/arb_sampler_objects/framebufferblit. Honestly, it looks like the real problem here is a lack of precision. I'm just hacking around problems here (as embarassing as it is). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
*	i965: Always use BRW_TEXCOORDMODE_CUBE when seamless filtering.	Kenneth Graunke	2016-03-28	1	-3/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When using seamless cube map mode and NEAREST filtering, we explicitly overrode the wrap modes to CLAMP_TO_EDGE. This was to implement the following spec text: "If NEAREST filtering is done within a miplevel, always apply apply wrap mode CLAMP_TO_EDGE." However, textureGather() ignores the sampler's filtering mode, and instead returns the four pixels that would be blended by LINEAR filtering. This implies that we should do proper seamless filtering, and include pixels from adjacent cube faces. It turns out that we can simply delete the NEAREST -> CLAMP_TO_EDGE overrides. Normal cube map sampling works by first selecting the face, and then nearest filtering fetches the closest texel. If the nearest texel was on a different face, then that face would have been chosen. So it should always be within the face anyway, which effectively performs CLAMP_TO_EDGE. Fixes 86 dEQP-GLES31.texture.gather.basic.cube.* tests. Signed-off-by: Kenneth Graunke <[email protected]> Suggested-by: Ian Romanick <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	i965: Fix brw_render_cache_set_check_flush's PIPE_CONTROLs.	Kenneth Graunke	2016-03-28	2	-3/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Our driver uses the brw_render_cache mechanism to track buffers we've rendered to and are about to sample from. Previously, we did a single PIPE_CONTROL with the following bits set: - Render Target Flush - Depth Cache Flush - Texture Cache Invalidate - VF Cache Invalidate - Instruction Cache Invalidate - CS Stall This combined both "top of pipe" invalidations and "bottom of pipe" flushes, which isn't how the hardware is intended to be programmed. The "top of pipe" invalidations may happen right away, without any guarantees that rendering using those caches has completed. That rendering may continue altering the caches. The "bottom of pipe" flushes do wait for the rendering to complete. The CS stall also prevents further work from happening until data is flushed out. What we wanted to do was wait for rendering complete, flush the new data out of the render and depth caches, wait, then invalidate any stale data in read-only caches. We can accomplish this by doing the "bottom of pipe" flushes with a CS stall, then the "top of pipe" flushes as a second PIPE_CONTROL. The flushes will wait until the rendering is complete, and the CS stall will prevent the second PIPE_CONTROL with the invalidations from executing until the first is done. Fixes dEQP-GLES3.functional.texture.specification.teximage2d_pbo subtests on Braswell and Skylake. These tests hit the meta PBO texture upload path, which binds the PBO as a texture and samples from it, while rendering to the destination texture. The tests then sample from the texture. For now, we leave Gen4-5 alone. It probably needs work too, but apparently it hasn't even been setting the (G45+) TC invalidation bit at all... v2: Add Sandybridge post-sync non-zero workaround, for safety. Cc: [email protected] Suggested-by: Francisco Jerez <[email protected]> Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
*	i965: Whack UAV bit when FS discards and there are no color writes.	Kenneth Graunke	2016-03-28	1	-2/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	dEQP-GLES31.functional.fbo.no_attachments.* draws a quad with no framebuffer attachments, using a shader that discards based on gl_FragCoord. It uses occlusion queries to inspect whether pixels are rendered or not. Unfortunately, the hardware is not dispatching any pixel shaders, so discards never happen, and the full quad of pixels increments PS_DEPTH_COUNT, making the occlusion query results bogus. To understand why, we have to delve into the WM_INT internal signalling mechanism's formulas. The "WM_INT::Pixel Shader Kill Pixel" signal is defined as: 3DSTATE_WM::ForceKillPixel == ON \|\| (3DSTATE_WM::ForceKillPixel != Off && !WM_INT::WM_HZ_OP && 3DSTATE_WM::EDSC_Mode != PREPS && (WM_INT::Depth Write Enable \|\| WM_INT::Stencil Write Enable) && ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (3DSTATE_PS_EXTRA::PixelShaderKillsPixels \|\| 3DSTATE_PS_EXTRA:: oMask Present to RenderTarget \|\| 3DSTATE_PS_BLEND::AlphaToCoverageEnable \|\| 3DSTATE_PS_BLEND::AlphaTestEnable \|\| 3DSTATE_WM_CHROMAKEY::ChromaKeyKillEnable)) Because there is no depth or stencil buffer, writes to those buffers are disabled. So the highlighted condition is false, making the whole "Kill Pixel" condition false. This then feeds into the following "WM_INT::ThreadDispatchEnable" condition: 3DSTATE_WM::ForceThreadDispatch != OFF && !WM_INT::WM_HZ_OP && 3DSTATE_PS_EXTRA::PixelShaderValid && (3DSTATE_PS_EXTRA::PixelShaderHasUAV \|\| WM_INT::Pixel Shader Kill Pixel \|\| WM_INT::RTIndependentRasterizationEnable \|\| (!3DSTATE_PS_EXTRA::PixelShaderDoesNotWriteRT && 3DSTATE_PS_BLEND::HasWriteableRT) \|\| (WM_INT::Pixel Shader Computed Depth Mode != PSCDEPTH_OFF && (WM_INT::Depth Test Enable \|\| WM_INT::Depth Write Enable)) \|\| (3DSTATE_PS_EXTRA::Computed Stencil && WM_INT::Stencil Test Enable) \|\| (3DSTATE_WM::EDSC_Mode == 1 && (WM_INT::Depth Test Enable \|\| WM_INT::Depth Write Enable \|\| WM_INT::Stencil Test Enable))) Given that there's no depth/stencil testing, no writeable render target, and the hardware thinks kill pixel doesn't happen, all of these conditions are false. We have to whack some bit to make PS invocations happen. There are many options. Curro suggested using the UAV bit. There's some precedence in doing that - we set it for fragment shaders that do SSBO/image/atomic writes when no color buffer writes are enabled. We can simply include discard here too. Fixes 64 dEQP-GLES31.functional.fbo.no_attachments.* tests. v2: Add a comment suggested and written by Jason Ekstrand. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Francisco Jerez <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	mesa/st: Fix NULL access if no fragment shader is bound	Bas Nieuwenhuizen	2016-03-28	1	-2/+2
\| \| \| \| \|	Signed-off-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
*	st/mesa: only minify height if target != 1D array in st_finalize_texture	Marek Olšák	2016-03-28	1	-1/+6
\| \| \| \| \| \| \| \| \| \| \|	The st_texture_object documentation says: "the number of 1D array layers will be in height0" We can't minify that. Spotted by luck. No app is known to hit this issue. Reviewed-by: Ilia Mirkin <[email protected]>
*	mesa: optimize out the realloc from glCopyTexImagexD()	Miklós Máté	2016-03-27	1	-0/+36
\| \| \| \| \| \| \| \| \| \| \|	v2: comment about the purpose of the code v3: also compare texFormat, add a perf debug message, formatting fixes Reviewed-by: Ian Romanick <[email protected]> Signed-off-by: Miklós Máté <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
*	st/mesa: fix handling the fallback texture	Miklós Máté	2016-03-27	1	-3/+4
\| \| \| \| \| \| \| \| \| \|	This fixes crash when post-processing is enabled in SW:KotOR. v2: fix const-ness v3: move assignment into the if() block Signed-off-by: Miklós Máté <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
*	st/mesa: enable GL_ATI_fragment_shader	Miklós Máté	2016-03-27	1	-0/+1
\| \| \| \| \|	Signed-off-by: Miklós Máté <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
*	st/mesa: implement GL_ATI_fragment_shader	Miklós Máté	2016-03-27	10	-4/+1064
\| \| \| \| \| \| \| \| \| \| \| \| \|	v2: fix arithmetic for special opcodes, fix fog state, cleanup v3: simplify handling of special opcodes, fix rebinding with different textargets or fog equation, lots of formatting fixes v4: adapt to the compile early, fix later architecture, formatting fixes Signed-off-by: Miklós Máté <[email protected]> Signed-off-by: Marek Olšák <[email protected]>