mesa.git

	Commit message	Author	Age	Files	Lines
...
*	mesa: Add a helper function for determining the restart index.	Kenneth Graunke	2013-05-29	2	-0/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The derived state approach currently used (_RestartIndex) doesn't work: in the GL_PRIMITIVE_RESTART_FIXED_INDEX case, the restart index depends on the index buffer's data type, and that isn't known until draw time. The existing code also fails to obey the GL 4.3 rules which say that FIXED_INDEX takes precedence over normal primitive restart. This helper function correctly determines the restart index, and will replace the derived state. NOTE: This is a candidate for the 9.1 branch. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
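The selection rule described in this message can be sketched as below. This is a minimal illustration, not the helper added by the commit: the struct `restart_state` and the function `restart_index_for_type` are hypothetical names, and only the GL enum values and the all-ones fixed index per index type come from the OpenGL specification.

```c
#include <stdbool.h>
#include <stdint.h>

/* GL enum values for index types, as defined by the OpenGL headers. */
#define GL_UNSIGNED_BYTE  0x1401
#define GL_UNSIGNED_SHORT 0x1403
#define GL_UNSIGNED_INT   0x1405

/* Hypothetical stand-in for the relevant part of the GL array state. */
struct restart_state {
   bool primitive_restart;             /* GL_PRIMITIVE_RESTART enabled */
   bool primitive_restart_fixed_index; /* GL_PRIMITIVE_RESTART_FIXED_INDEX enabled */
   uint32_t restart_index;             /* value set by glPrimitiveRestartIndex() */
};

/* Return the restart index for an indexed draw whose index type is only
 * known at draw time. FIXED_INDEX takes precedence over the user-specified
 * index, and its value is all ones for the width of the index type. */
static uint32_t
restart_index_for_type(const struct restart_state *s, unsigned index_type)
{
   if (s->primitive_restart_fixed_index) {
      switch (index_type) {
      case GL_UNSIGNED_BYTE:  return 0xffu;
      case GL_UNSIGNED_SHORT: return 0xffffu;
      case GL_UNSIGNED_INT:   return 0xffffffffu;
      }
   }
   return s->restart_index;
}
```

Because the index type is an argument rather than cached state, the result cannot go stale between state validation and the draw call, which is the weakness of the derived-state approach named above.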
*	vbo: Ignore PRIMITIVE_RESTART_FIXED_INDEX for glDrawArrays().	Kenneth Graunke	2013-05-29	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The derived _PrimitiveRestart enable flag combines the PrimitiveRestart and PrimitiveRestartFixedIndex enable flags. However, DrawArrays is not supposed to do FixedIndex restart: From the OpenGL 4.3 Core specification, section 10.3.5 (page 302): "If PRIMITIVE_RESTART_FIXED_INDEX is enabled, primitive restart is not performed for array elements transferred by any drawing command not taking a type parameter, including all of the Draw commands other than DrawElements." The OpenGL ES 3.0 specification agrees by omission: "When DrawElements, DrawElementsInstanced, or DrawRangeElements transfers a set of generic attribute array elements to the GL..." Notably, DrawArrays is not included in the list of draw calls that take PRIMITIVE_RESTART_FIXED_INDEX into consideration. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
*	i965/vs: Fix implied_mrf_writes() for integer division pre-gen6.	Eric Anholt	2013-05-29	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \|	Previously it would assertion fail in debug builds (though the correct value was returned in a non-debug build). Marking it as a candidate for stable even though it has no current consumers in the stable branches, in case one shows up in a later backport. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64727 NOTE: This is a candidate for stable branches. Reviewed-by: Kenneth Graunke <[email protected]>
*	i965/fs: Fix test for smearing enabled on an instruction.	Eric Anholt	2013-05-29	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We were expanding the live range too far, breaking register_coalesce_2() and compute_to_mrf() on 16-wide shaders. Turning it back on improves GLB2.7 performance by 0.239355% +/- 0.0850649% (n=398). shader-db stats are: total instructions in shared programs: 1627211 -> 1609262 (-1.10%) instructions in affected programs: 450351 -> 432402 (-3.99%) While 33 new 16-wide shaders are gained, 70 are lost. Despite that, tropics (the app that lost the most 16-wide) shows a .41% +/- .16% (n=7/8, first-run outlier removed) performance improvement on my HSW. Reviewed-by: Kenneth Graunke <[email protected]>
*	i965/fs: Fix segfault in instruction scheduling with LINTERP using last GRF.	Eric Anholt	2013-05-29	1	-2/+8
\| \| \| \| \| \| \| \| \|	The scheduler didn't know about uniform-type accesses, and if a uniform access was last in a 16-wide, we'd walk off the end of the array. This never happened, because we'd never coalesce out all the GRFs, due to a bug to be fixed in the next commit. Reviewed-by: Kenneth Graunke <[email protected]>
*	mesa: Fix test for optimistic coloring being necessary.	Eric Anholt	2013-05-29	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	i965 and radeon use ra_set_node_reg() to force payload registers to specific registers while still exposing those registers to the allocator. We were treating those register nodes as unsuccessfully allocated in the ra_simplify() step, leading to walking the registers again to do optimistic coloring even if there was nothing left to do. Acked-by: Kenneth Graunke <[email protected]>
*	intel: Enable blit glCopyTexSubImage/glBlitFramebuffer with sRGB.	Eric Anholt	2013-05-28	1	-5/+11
\| \| \| \| \| \| \| \| \| \| \| \|	Since the introduction of default-to-SARGB8 window system framebuffers, non-blorp hardware lost blit acceleration for these two paths between the window system and ARGB8888 textures. Since we shouldn't be doing any conversion anyway, just compatibility-check the linear variants of the formats. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61954 Reviewed-by: Kenneth Graunke <[email protected]> Tested-by: Tobias Jakobi <[email protected]>
*	intel: Remove dead intel_drawbuf_region().	Eric Anholt	2013-05-28	2	-16/+0
\| \| \| \| \| \| \| \|	Since the glBitmap() MRT change, it's unused. There was basically no way to responsibly use this function since MRT was introduced. Reviewed-and-tested-by: Ian Romanick <[email protected]> Acked-by: Paul Berry <[email protected]>
*	intel: Fix format handling of blit glBitmap()	Eric Anholt	2013-05-28	1	-3/+12
\| \| \| \| \| \| \| \| \| \| \|	Any 32-bit format got ARGB8888 handling (including, say, GL_RG1616), and anything else got 16-bit (including, say, GL_R8), which could potentially hang the GPU by writing out of bounds. NOTE: This is a candidate for the stable branches. Reviewed-and-tested-by: Ian Romanick <[email protected]> Acked-by: Paul Berry <[email protected]>
*	intel: Fix MRT handling of glBitmap().	Eric Anholt	2013-05-28	1	-9/+14
\| \| \| \| \| \| \| \| \|	We'd only hit color buffer 0 even if multiple draw buffers were bound. NOTE: This is a candidate for the stable branches. Reviewed-and-tested-by: Ian Romanick <[email protected]> Acked-by: Paul Berry <[email protected]>
*	intel: Rebuild PBO blit glTexImage() on top of miptrees.	Eric Anholt	2013-05-28	1	-30/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This will ensure that we have resolves if we ever extend this to glTexSubImage(), and fixes missing image start offset handling. The texture buffer alloc ended up getting moved up, because we want to look at the format of the image's actual mt to see if we'll end up blitting the right thing, in the case of packed depth/stencil uploads. This is the last caller of intelEmitCopyBlit() on a miptree-wrapped BO. Reviewed-and-tested-by: Ian Romanick <[email protected]> Acked-by: Paul Berry <[email protected]>
*	intel: Rebuild PBO blit glReadPixels() on top of miptrees.	Eric Anholt	2013-05-28	1	-25/+23
\| \| \| \| \| \| \| \| \|	The previous code was missing depth resolves; this had only been masked because Y-tiled buffers could not be blitted. The pair of flip args in the new blit function means that we can just drop the pack->Invert fallback. Reviewed-and-tested-by: Ian Romanick <[email protected]> Acked-by: Paul Berry <[email protected]>
*	intel: Rework intel_miptree_create_for_region() to wrap a BO.	Eric Anholt	2013-05-28	3	-24/+67
\| \| \| \| \| \| \| \| \| \| \| \| \|	I needed to do this for the PBO blit cases to use intel_miptree_blit(). But this also actually partially fixes a bug in EGLImage handling: We can't share regions across contexts, because regions have a refcount that isn't protected by a mutex, and different contexts can be simultaneously accessed from multiple threads. Now we just need to get regions out of __DRIImage. There was also a missing use of image->offset in the EGLImage renderbuffer storage code. Reviewed-and-tested-by: Ian Romanick <[email protected]> Acked-by: Paul Berry <[email protected]>
*	intel: Make a temporary miptree for the blit path of miptree mapping.	Eric Anholt	2013-05-28	2	-74/+29
\| \| \| \| \| \| \| \| \| \| \|	In a bit of debug code, we no longer have the inter-slice x/y to print. But I think the level/slice is more useful in this case for looking at what's getting mapped, especially given that INTEL_DEBUG=blit will tell you the other value. Reviewed-and-tested-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Acked-by: Paul Berry <[email protected]>
*	intel: Make a temporary miptree when doing blit uploads for glTexSubImage().	Eric Anholt	2013-05-28	1	-44/+28
\| \| \| \| \| \| \| \| \| \| \|	While this is a bit more CPU work, it also is less code to handle this path, and fixes problems with 32k-pitch textures and missing resolves. v2: Add error checking in new code. Reviewed-and-tested-by: Ian Romanick <[email protected]> (v1) Reviewed-by: Kenneth Graunke <[email protected]> (v1) Acked-by: Paul Berry <[email protected]>
*	intel: Extend the force_y_tiling flag to allow forcing no tiling.	Eric Anholt	2013-05-28	5	-13/+26
\| \| \| \| \| \| \| \| \| \| \| \|	For a blit-uploaded temporary, it's faster on current hardware to memcpy the data into a linear CPU mapping than to go through the GTT. v2: Turn the not-fully-supported mask into 3 supported enum values. Reviewed-and-tested-by: Ian Romanick <[email protected]> (v1) Reviewed-by: Kenneth Graunke <[email protected]> (v1) Reviewed-by: Paul Berry <[email protected]> (v2) Reviewed-by: Chad Versace <[email protected]> (v2)
*	intel: Add an assert for glCopyTexSubImage() being called on MSAA buffers.	Eric Anholt	2013-05-28	1	-0/+6
\| \| \| \| \| \| \| \| \|	This is just in case someone else trips over this due to our weird reuse of this code in glBlitFramebuffer(). Reviewed-and-tested-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Acked-by: Paul Berry <[email protected]>
*	i965: Allow glCopyTexSubImage() on depth textures.	Eric Anholt	2013-05-28	1	-5/+0
\| \| \| \| \| \| \| \|	If the hw is pre-gen5 and can't blit depth, it'll cleanly error out. Reviewed-and-tested-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Acked-by: Paul Berry <[email protected]>
*	i965: Prefer blorp glBlitFramebuffer() to the glCopyTexSubImage-based blit.	Eric Anholt	2013-05-28	1	-8/+9
\| \| \| \| \| \| \| \| \| \| \| \|	I think we've measured no performance difference from this in the past, except that the blorp code can do things like multisample resolves. Prevents piglit regression in the next commit when a testcase started trying to do a multisampled resolve through the old glCopyTexSubImage() path. Reviewed-and-tested-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Acked-by: Paul Berry <[email protected]>
*	i965: Consistently do depth resolves before blitting.	Eric Anholt	2013-05-28	2	-6/+6
\| \| \| \| \| \| \| \| \| \| \| \| \|	We were protected for a long time by the fact that depth was Y tiled and you couldn't blit Y. Now that we can blit Y, we were failing to resolve depth in glCopyPixels(). Note in the comment about swrast, that the swrast map path does resolves appropriately already. Reviewed-and-tested-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Acked-by: Paul Berry <[email protected]>
*	intel: Make a wrapper for intelEmitCopyBlit using miptrees.	Eric Anholt	2013-05-28	5	-111/+127
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	I had previously asserted that it was hard to write a useful, simpler blit function, but I think this might be it. This has the side effect of extending the 32k pitch check to a few more places that were missing it. v2: Update comment for being moved inside intel_miptree_blit(). Reviewed-and-tested-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Acked-by: Paul Berry <[email protected]>
*	intel: Rename intel_renderbuffer_tile_offsets.	Eric Anholt	2013-05-28	3	-6/+6
\| \| \| \| \| \| \| \|	This makes it more consistent with intel_miptree_get_tile_offsets(). Reviewed-and-tested-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Acked-by: Paul Berry <[email protected]>
*	intel: Reduce intel_renderbuffer_tile_offsets to a thin wrapper.	Eric Anholt	2013-05-28	2	-28/+7
\| \| \| \| \| \|	Reviewed-and-tested-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Acked-by: Paul Berry <[email protected]>
*	intel: Make intel_miptree_get_tile_offsets return a page offset.	Eric Anholt	2013-05-28	4	-10/+26
\| \| \| \| \| \| \| \| \|	Right now, the callers in i965 don't expect a nonzero page offset to actually occur (since that's being handled elsewhere), but it seems like a trap to leave it this way. Reviewed-and-tested-by: Ian Romanick <[email protected]> Acked-by: Paul Berry <[email protected]>
*	mesa: fix GLSL program objects with more than 16 samplers combined	Marek Olšák	2013-05-28	5	-34/+58
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The problem is the sampler units are allocated from the same pool for all shader stages, so if a vertex shader uses 12 samplers (0..11), the fragment shader samplers start at index 12, leaving only 4 sampler units for the fragment shader. The main cause is probably the fact that samplers (texture unit -> sampler unit mapping, etc.) are tracked globally for an entire program object. This commit adapts the GLSL linker and core Mesa such that the sampler units are assigned to sampler uniforms for each shader stage separately (if a sampler uniform is used in all shader stages, it may occupy a different sampler unit in each, and vice versa, an i-th sampler unit may refer to a different sampler uniform in each shader stage), and the sampler-specific variables are moved from gl_shader_program to gl_shader. This doesn't require any driver changes, and it fixes piglit/max-samplers for gallium and classic swrast. It also works with any number of shader stages. v2: - converted tabs to spaces - added an assertion to _mesa_get_sampler_uniform_value Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
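The per-stage assignment described in this message can be sketched roughly as follows. This is an illustration only, not the linker code from the commit: `sampler_uniform`, `assign_sampler_units`, and the three-stage enum are hypothetical names chosen for the example.

```c
#include <stdbool.h>

/* Illustrative stage list; the real set of stages is an assumption here. */
enum { STAGE_VERTEX, STAGE_GEOMETRY, STAGE_FRAGMENT, NUM_STAGES };

/* Hypothetical record for one sampler uniform of a linked program. */
struct sampler_uniform {
   bool used[NUM_STAGES]; /* is the uniform referenced by this stage? */
   int unit[NUM_STAGES];  /* sampler unit within the stage, or -1 */
};

/* Give every stage its own unit counter starting at 0, instead of drawing
 * units for all stages from one shared pool. */
static void
assign_sampler_units(struct sampler_uniform *uniforms, int count,
                     int units_used[NUM_STAGES])
{
   for (int s = 0; s < NUM_STAGES; s++)
      units_used[s] = 0;

   for (int i = 0; i < count; i++) {
      for (int s = 0; s < NUM_STAGES; s++) {
         if (uniforms[i].used[s])
            uniforms[i].unit[s] = units_used[s]++;
         else
            uniforms[i].unit[s] = -1;
      }
   }
}
```

With 12 vertex-only samplers followed by 16 fragment-only samplers, the fragment samplers land on units 0..15 rather than starting at 12, which is the case the message describes.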
*	swrast: increase array size of TextureSample	Marek Olšák	2013-05-28	2	-4/+4
\| \| \| \| \| \| \| \|	to match the size of ctx->Texture.Unit, and it will also fix piglit/max-samplers with the following commit. Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
*	mesa: declare UniformBufferBindings as an array with a static size	Marek Olšák	2013-05-28	4	-14/+9
\| \| \| \| \| \| \| \|	Some Gallium drivers were crashing, because the array was not large enough. v2: clamp the per-shader maximum in st/mesa, then sum them all up NOTE: This is a candidate for the stable branches.
*	xlib: add null ctx check in glXDestroyContext()	Brian Paul	2013-05-24	1	-10/+12
\| \| \| \| \| \| \|	Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64934 NOTE: This is a candidate for the stable branches. Reviewed-by: Jose Fonseca <[email protected]>
*	st/mesa: add switch cases for new IR enums to silence warnings	Brian Paul	2013-05-24	1	-0/+2
\|
*	i965: Go back to using the kernel SOL reset feature.	Kenneth Graunke	2013-05-23	3	-8/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It turns out the MI_LOAD_REGISTER_IMM approach doesn't work on Haswell, and regressed essentially all the transform feedback Piglit tests. This morally reverts eaa6fbe6d54dc99efac4ab8e800edef65ce8220d. However, the code is still simpler than it was. On BeginTransformFeedback, we simply flush the batch and set the SOL reset flag so that the next batch will start with zeroed offsets. There's still no software counting. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64887 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Paul Berry <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
*	i965: Enable guardband clipping on Gen4/5.	Chris Forbes	2013-05-24	1	-3/+45
\| \| \| \| \| \| \| \| \| \|	Enables guardband clipping when the viewport covers the entire render target. No piglit regressions on Ironlake. Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
*	ARB_fp: accept duplicate precision options	Chris Forbes	2013-05-24	1	-9/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Relaxes the validation of OPTION ARB_precision_hint_{nicest,fastest}; to allow duplicate options. The spec says that both /nicest/ and /fastest/ cannot be specified together, but could be interpreted either way for respecification of the same option. Other drivers (NVIDIA etc) accept this, and at least one Unity3D game expects it to succeed (Kerbal Space Program). V2: Add spec quote. Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
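The relaxed rule can be pictured with a small sketch: repeating the same hint is accepted, while mixing nicest and fastest is still rejected. The enum and the function `set_precision_hint` are hypothetical and do not reflect the actual ARB_fragment_program parser code.

```c
#include <stdbool.h>

/* Hypothetical representation of the precision hint seen so far. */
enum precision_hint { HINT_NONE, HINT_NICEST, HINT_FASTEST };

/* Record a precision OPTION. Returns false only when the requested hint
 * conflicts with a different hint that was already specified. */
static bool
set_precision_hint(enum precision_hint *current, enum precision_hint requested)
{
   if (*current != HINT_NONE && *current != requested)
      return false; /* nicest and fastest cannot be combined */

   *current = requested; /* first use, or a harmless duplicate */
   return true;
}
```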
*	intel: Count fragments in our blitter-based glBitmap() path.	Eric Anholt	2013-05-22	1	-8/+12
\| \| \| \| \| \|	Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59440 Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Shut up more compiler warnings from vector insert/extract changes.	Eric Anholt	2013-05-22	1	-0/+8
\| \| \| \| \|	Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Skip etc-to-rgb transcode on BayTrail.	Eric Anholt	2013-05-20	1	-31/+33
\| \| \| \| \| \|	The hardware does it, so no need for this workaround. Reviewed-and-tested-by: Kenneth Graunke <[email protected]>
*	mesa: Remove extension checking from ChooseTexFormat.	Eric Anholt	2013-05-21	1	-651/+533
\| \| \| \| \|	This should already be handled by _mesa_base_tex_format() calls in TexImage*.
*	mesa: Add ChooseTexFormat support for the new XBGR formats.	Eric Anholt	2013-05-21	1	-0/+10
\|
*	i965: Split BeginTransformFeedback hook into Gen6 and Gen7+ variants.	Kenneth Graunke	2013-05-21	4	-29/+42
\| \| \| \| \| \| \| \| \| \| \|	Most of the work in BeginTransformFeedback is only necessary on Gen6. We may as well just skip it on Gen7+. v2: Add an intel->gen == 6 assert. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Paul Berry <[email protected]>
*	i965: Kill software primitive counting entirely.	Kenneth Graunke	2013-05-21	6	-108/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Now that we have hardware contexts, we don't need to continually reprogram the GS_SVBI_INDEX registers. They're automatically saved and restored with the context, so they can just increment over time. We only need to reset them when starting transform feedback. There's also no reason to delay until the next drawing operation; we can just emit the packet immediately. However, this means we must drop the initialization in brw_invariant_state, as BeginTransformFeedback may occur before the first drawing in a context. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Paul Berry <[email protected]>
*	i965: Remove software geometry query code.	Kenneth Graunke	2013-05-21	4	-71/+0
\| \| \| \| \| \| \| \| \| \|	EXT_transform_feedback isn't yet supported on Gen4-5, so none of this query code is actually used. This also means we can remove some of the surrounding support code. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Paul Berry <[email protected]>
*	i965: Delete unused brw->sol.offset_0_batch_start field.	Kenneth Graunke	2013-05-21	3	-8/+0
\| \| \| \| \| \| \| \|	This was only used for the non-hardware context code. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Paul Berry <[email protected]>
*	i965: Stop using the kernel SOL reset feature.	Kenneth Graunke	2013-05-21	3	-10/+8
\| \| \| \| \| \| \| \|	We can just do it ourselves with MI_LOAD_REGISTER_IMM. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Paul Berry <[email protected]>
*	i965: Remove dead code for Gen7 SOL without hardware contexts.	Kenneth Graunke	2013-05-21	1	-15/+0
\| \| \| \| \| \| \| \| \|	Failing to get a hardware context now means failing to load the driver, so this code will never get hit. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Paul Berry <[email protected]>
*	i965: Add a macro for accessing the SO_WRITE_OFFSET[0-3] registers.	Kenneth Graunke	2013-05-21	1	-0/+2
\| \| \| \| \| \| \| \|	Using a function-like macro makes it easy to loop over all four streams. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Paul Berry <[email protected]>
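A function-like register macro of the kind described might look like the sketch below. The base address and the 4-byte stride are assumptions made for illustration and should be checked against the hardware documentation; `HYPOTHETICAL_SO_WRITE_OFFSET_BASE` and `collect_so_write_offset_regs` are invented names.

```c
#include <stdint.h>

/* Assumed base address and layout: four consecutive 32-bit registers.
 * These values are placeholders, not taken from the commit. */
#define HYPOTHETICAL_SO_WRITE_OFFSET_BASE 0x5280u
#define SO_WRITE_OFFSET(n) (HYPOTHETICAL_SO_WRITE_OFFSET_BASE + (n) * 4u)

/* Looping over all four streams needs no per-stream register names. */
static void
collect_so_write_offset_regs(uint32_t regs[4])
{
   for (unsigned i = 0; i < 4; i++)
      regs[i] = SO_WRITE_OFFSET(i);
}
```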
*	xlib: check for null ctx pointer in glXIsDirect()	Brian Paul	2013-05-21	1	-1/+1
\| \| \| \| \| \| \|	Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64745 Note: This is a candidate for the stable branches. Reviewed-by: Jose Fonseca <[email protected]>
*	i965: Fix build failure	Anuj Phogat	2013-05-20	1	-0/+1
\| \| \| \| \|	meta.h should be included in brw_state_upload.c to get access to function _mesa_meta_in_progress().
*	i965: Implement transform feedback query support in hardware on Gen6+.	Kenneth Graunke	2013-05-20	1	-35/+60
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Now that we have hardware contexts and can use MI_STORE_REGISTER_MEM, we can use the GPU's pipeline statistics counters rather than going out of our way to count primitives in software. Aside from being simpler, this also paves the way for Geometry Shaders, which can output an arbitrary number of primitives on the GPU. It will also allow us to use hardware primitive restart when these queries are in use. The GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN query is easy: it corresponds to the SO_NUM_PRIMS_WRITTEN/SO_NUM_PRIMS_WRITTEN0_IVB counters. The GL_PRIMITIVES_GENERATED query is trickier. Gen provides several statistics registers which /almost/ match the semantics required: - IA_PRIMITIVES_COUNT The number of primitives fetched by the VF or IA (input assembler). This undercounts when GS is enabled, as it can output many primitives. - GS_PRIMITIVES_COUNT The number of primitives output by the GS. Unfortunately, this doesn't increment unless the GS unit is actually enabled, and it usually isn't. - SO_PRIM_STORAGE_NEEDED*_IVB The amount of space needed to write primitives output by transform feedback. These naturally only work when transform feedback is on. We'd also have to add the counters for all four streams. - CL_INVOCATION_COUNT The number of primitives processed by the clipper. This doesn't work if the GS or SOL throw away primitives for rasterizer discard. However, it does increment even if the clipper is in REJECT_ALL mode. Dynamically switching between counters would be painfully complicated, especially since GS, rasterizer discard, and transform feedback can all be switched on and off repeatedly during a single query. The most usable counter is CL_INVOCATION_COUNT. The previous two patches reworked rasterizer discard support so that all primitives hit the clipper, making this work. 
v2: Occlusion query bug fixes removed and squashed in earlier patches. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Paul Berry <[email protected]>
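One common way to turn a free-running hardware counter into a query result is to snapshot it at the start and end of the query and subtract. The sketch below assumes that scheme; the commit message does not spell out the exact bookkeeping, and `prim_query` and `prim_query_result` are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical pair of counter snapshots, e.g. values of a statistics
 * register stored to memory when the query begins and ends. */
struct prim_query {
   uint64_t begin;
   uint64_t end;
};

/* The counter only ever increments, so the query result is the difference.
 * Unsigned arithmetic keeps this correct if the counter wraps once. */
static uint64_t
prim_query_result(const struct prim_query *q)
{
   return q->end - q->begin;
}
```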
*	i965: Handle rasterizer discard in the clipper rather than GS on Gen6.	Kenneth Graunke	2013-05-20	4	-40/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This has more of a negative impact than the previous patch, as on Gen6 passing primitives through to the clipper means we actually have to make the GS thread write them to the URB. I don't see another good solution though, and rasterizer discard is not the most common of cases, so hopefully it won't be too terrible. v2: Add a perf_debug; resolve rebase conflicts on the brw dirty flags; remove the rasterizer_discard field from brw_gs_prog_key. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> [v1] Reviewed-by: Paul Berry <[email protected]>
*	i965: Handle rasterizer discard in the clipper rather than SOL on Gen7.	Kenneth Graunke	2013-05-20	2	-7/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In order to implement the GL_PRIMITIVES_GENERATED query in a sane fashion on our hardware, we can't discard primitives until the clipper. The patch after next explains the rationale. By setting the clipper to REJECT_ALL mode, all primitives get thrown away, so rendering is still appropriately disabled. This may negatively impact performance in the rasterizer discard case, but it's unclear how much and this hasn't been observed to be a bottleneck in any application we've looked at. The clipper is the very next stage in the pipeline, so I don't think it will be terrible. v2: Add a perf_debug; resolve rebase conflicts on the brw dirty flags. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Paul Berry <[email protected]>
*	i965: Disable clipper statistics when meta operations are in progress.	Kenneth Graunke	2013-05-20	2	-4/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We don't currently use the clipper statistics, but we'll soon use CL_INVOCATION_COUNT to implement the GL_PRIMITIVES_GENERATED query. The number of primitives generated is not supposed to be altered during operations such as glGenerateMipmap. Prevents spec/EXT_transform_feedback/generatemipmap prims_generated from breaking when we start using pipeline statistics registers to implement the GL_PRIMITIVES_GENERATED query in a few commits. v2: Use the BRW_NEW_META_IN_PROGRESS flag for correct state handling. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> [v1] Reviewed-by: Paul Berry <[email protected]>