mesa.git - Unnamed repository; edit this file 'description' to name the repository.

	Commit message (Collapse)	Author	Age	Files	Lines
*	i965: Fix typos in license	Ian Romanick	2015-09-10	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	grep -lr 'sub license' \| while read f; do \ sed --in-place -e 's/sub license/sublicense/' $f ;\ done grep -lr 'NON-INFRINGEMENT' \| while read f; do \ sed --in-place -e 's/NON-INFRINGEMENT/NONINFRINGEMENT/' $f ;\ done As noted by Matt, both of these changes match the MIT license text found at http://opensource.org/licenses/MIT. Signed-off-by: Ian Romanick <[email protected]> Acked-by: Matt Turner <[email protected]>
*	i965: Remove horizontal bars from file header comments	Ian Romanick	2015-09-10	1	-4/+2
\| \| \| \| \| \| \|	Why was that ever a thing? Signed-off-by: Ian Romanick <[email protected]> Acked-by: Matt Turner <[email protected]>
*	i965: Resolve GCC sign-compare warning.	Rhys Kidd	2015-08-18	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	mesa/src/mesa/drivers/dri/i965/brw_draw.c: In function 'brw_draw_destroy': mesa/src/mesa/drivers/dri/i965/brw_draw.c:630:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (i = 0; i < brw->vb.nr_buffers; i++) { ^ mesa/src/mesa/drivers/dri/i965/brw_draw.c:636:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (i = 0; i < brw->vb.nr_enabled; i++) { ^ Signed-off-by: Rhys Kidd <[email protected]> Reviewed-by: Thomas Helland <[email protected]>
*	i965: Resolve GCC sign-compare warning.	Rhys Kidd	2015-08-18	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	mesa/src/mesa/drivers/dri/i965/brw_draw.c: In function 'brw_postdraw_set_buffers_need_resolve': mesa/src/mesa/drivers/dri/i965/brw_draw.c:390:22: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (int i = 0; i < fb->_NumColorDrawBuffers; i++) { ^ Signed-off-by: Rhys Kidd <[email protected]> Reviewed-by: Thomas Helland <[email protected]>
*	vbo: pass the stream from DrawTransformFeedbackStream to drivers	Marek Olšák	2015-08-06	1	-1/+2
\| \| \| \|	Reviewed-by: Dave Airlie <[email protected]>
*	i965: Trivial formatting changes in brw_draw.c	Ian Romanick	2015-08-03	1	-51/+51
\| \| \| \| \| \|	Signed-off-by: Ian Romanick <[email protected]> Acked-by: Jason Ekstrand <[email protected]> Reviewed-by: Juha-Pekka Heikkila <[email protected]>
*	mesa: Rename _mesa_lookup_enum_by_nr() to _mesa_enum_to_string().	Kenneth Graunke	2015-07-20	1	-4/+4
\| \| \| \| \| \| \|	Generated by sed; no manual changes. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Brian Paul <[email protected]>
*	i965: Move BEGIN_BATCH() into same control flow as ADVANCE_BATCH().	Matt Turner	2015-07-15	1	-2/+2
\| \| \| \| \| \| \| \| \| \|	BEGIN_BATCH() and ADVANCE_BATCH() will contain "do {" and "} while (0)" respectively to allow declaring local variables used by intervening OUT_BATCH macros. As such, BEGIN_BATCH() and ADVANCE_BATCH() need to be in the same control flow. Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Chris Wilson <[email protected]>
*	i965: Rename intel_emit* to reflect their new location in brw_pipe_control	Chris Wilson	2015-06-24	1	-2/+2
\| \| \| \| \|	Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Assert that the GL primitive isn't out of range.	Matt Turner	2015-06-23	1	-1/+3
\| \| \| \| \| \| \| \|	Coverity sees the if (mode >= BRW_PRIM_OFFSET (128)) test and assumes that the else-branch might execute for mode to up 127, which out be out of bounds. Reviewed-by: Jordan Justen <[email protected]>
*	i965: Use predicate enable bit for conditional rendering w/o stalling	Neil Roberts	2015-05-12	1	-8/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously whenever a primitive is drawn the driver would call _mesa_check_conditional_render which blocks waiting for the result of the query to determine whether to render. On Gen7+ there is a bit in the 3DPRIMITIVE command which can be used to disable the primitive based on the value of a state bit. This state bit can be set based on whether two registers have different values using the MI_PREDICATE command. We can load these two registers with the pixel count values stored in the query begin and end to implement conditional rendering without stalling. Unfortunately these two source registers were not in the whitelist of available registers in the kernel driver until v3.19. This patch uses the command parser version from intel_screen to detect whether to attempt to set the predicate data registers. The predicate enable bit is currently only used for drawing 3D primitives. For blits, clears, bitmaps, copypixels and drawpixels it still causes a stall. For most of these it would probably just work to call the new brw_check_conditional_render function instead of _mesa_check_conditional_render because they already work in terms of rendering primitives. However it's a bit trickier for blits because it can use the BLT ring or the blorp codepath. I think these operations are less useful for conditional rendering than rendering primitives so it might be best to leave it for a later patch. v2: Use the command parser version to detect whether we can write to the predicate data registers instead of trying to execute a register load command. v3: Simple rebase v4: Changes suggested by Kenneth Graunke: Split the load_64bit_register function out to a separate patch so it can be a shared public function. Avoid calling _mesa_check_conditional_render if we've already determined that there's no query object. Some styling fixes. Reviewed-by: Kenneth Graunke <[email protected]>
*	i965/state: Don't use brw->state.dirty.brw	Jordan Justen	2015-03-31	1	-11/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Now, we only use ctx->NewDriverState. I used this bash & sed command in the i965 directory: for file in .[ch] .[ch]pp; do sed -i -e 's/state\.dirty\.brw/ctx.NewDriverState/g' $file done Followed by manual changes to brw_state_upload.c. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965/state: Rename brw_clear_dirty_bits to brw_render_state_finished	Jordan Justen	2015-03-31	1	-1/+1
\| \| \| \| \| \|	Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965/state: Rename brw_upload_state to brw_upload_render_state	Jordan Justen	2015-03-31	1	-3/+4
\| \| \| \| \| \|	Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Do Sandybridge workaround flushes before each primitive.	Kenneth Graunke	2015-02-17	1	-3/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Sandybridge requires the post-sync non-zero workaround in a ton of places, and if you ever miss one, the GPU usually hangs. Currently, we try to track exactly when a workaround flush is necessary (via the brw->batch.need_workaround_flush flag). This is tricky to get right, and we've botched it several times in the past. This patch unconditionally performs the post-sync non-zero flush at the start of each primitive's state upload (including BLORP). We drop the needs_workaround_flush flag, and drop all the other callers, as the flush has already been performed. We have no data to indicate that simply flushing all the time will hurt performance, and it has the potential to help stability. v2: Add post-sync workaround to initial GPU state upload to be extra cautious (suggested by Chad Versace). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chad Versace <[email protected]>
*	i965: Fix start/base_vertex_location for >1 prims but !BRW_NEW_VERTICES.	Kenneth Graunke	2014-12-31	1	-5/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a partial revert of c89306983c07e5a88c0d636267e5ccf263cb4213. It split the {start,base}_vertex_location handling into several steps: 1. Set brw->draw.start_vertex_location = prim[i].start and brw->draw.base_vertex_location = prim[i].basevertex. (This happened once per _mesa_prim, in the main drawing loop.) 2. Add brw->vb.start_vertex_bias and brw->ib.start_vertex_offset appropriately. (This happened in brw_prepare_shader_draw_parameters, which was called just after brw_prepare_vertices, as part of state upload, and only happened when BRW_NEW_VERTICES was flagged.) 3. Use those values when emitting 3DPRIMITIVE (once per _mesa_prim). If we drew multiple _mesa_prims, but didn't flag BRW_NEW_VERTICES on the second (or later) primitives, we would do step #1, but not #2. The first _mesa_prim would get correct values, but subsequent ones would only get the first half of the summation. The reason I originally did this was because I needed the value of gl_BaseVertexARB to exist in a buffer object prior to uploading 3DSTATE_VERTEX_BUFFERS. I believed I wanted to upload the value of 3DPRIMITIVE's "Base Vertex Location" field, which was computed as: (prims[i].indexed ? prims[i].start : prims[i].basevertex) + brw->vb.start_vertex_bias. The latter value wasn't available until after brw_prepare_vertices, and the former weren't available in the state upload code at all. Hence the awkward split. However, I believe that including brw->vb.start_vertex_bias was a mistake. It's an extra bias we apply when uploading vertex data into VBOs, to move [min_index, max_index] to [0, max_index - min_index]. >From the GL_ARB_shader_draw_parameters specification: "<gl_BaseVertexARB> holds the integer value passed to the <baseVertex> parameter to the command that resulted in the current shader invocation. In the case where the command has no <baseVertex> parameter, the value of <gl_BaseVertexARB> is zero." I conclude that gl_BaseVertexARB should only include the baseVertex parameter from glDrawElements, not any internal biases we add for optimization purposes. With that in mind, gl_BaseVertexARB only needs prim[i].start or prim[i].basevertex. We can simply store that, and go back to computing start_vertex_location and base_vertex_location in brw_emit_prim(), like we used to. This is much simpler, and should actually fix two bugs. Fixes missing geometry in Unvanquished. Cc: "10.4 10.3" <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85529 Signed-off-by: Kenneth Graunke <[email protected]> Acked-by: Ian Romanick <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
*	i965: Use WARN_ONCE for the single-primitive-exceeded-aperture message.	Kenneth Graunke	2014-12-31	1	-9/+4
\| \| \| \| \| \| \|	This makes it show up via ARB_debug_output and is also less code. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	i965: Compute VS attribute WA bits earlier and check if they changed.	Kenneth Graunke	2014-12-04	1	-0/+42
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	BRW_NEW_VERTICES is flagged every time we draw a primitive. Having the brw_vs_prog atom depend on BRW_NEW_VERTICES meant that we had to compute the VS program key and do a program cache lookup for every single primitive. This is painfully expensive. The workaround bit computation is almost entirely based on the vertex attribute arrays (brw->vb.inputs[i]), which are set by brw_merge_inputs. The only thing it uses the VS program for is to see which VS inputs are actually read. brw_merge_inputs() happens once per primitive, and can safely look at the currently bound vertex program, as it doesn't change in the middle of a draw. This patch moves the workaround bit computation to brw_merge_inputs(), right after assigning brw->vb.inputs[i], and stores the previous WA bit values in the context. If they've actually changed from the last draw (which is uncommon), we signal that we need a new vertex program, causing brw_vs_prog to compute a new key. Improves performance in Gl32Batch7 by 13.6123% +/- 0.739652% (n=166) on Haswell GT3e. I'm told Baytrail shows similar gains. v2: Introduce a new BRW_NEW_VS_ATTRIB_WORKAROUNDS dirty bit, rather than reusing BRW_NEW_VERTEX_PROGRAM (suggested by Chris Forbes). This prevents unnecessary re-emission of surface/sampler related atoms (and an SOL atom on Sandybridge). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
*	i965: Just return void from brw_try_draw_prims	Ian Romanick	2014-12-02	1	-5/+2
\| \| \| \| \| \| \| \| \| \|	Note from Ken: "We used to use the return value to indicate whether software fallbacks were necessary, but we haven't in years." Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Fix reference counting in new basevertex upload code.	Kenneth Graunke	2014-09-12	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In the non-indirect draw case, we call intel_upload_data to upload gl_BaseVertex. It makes brw->draw.draw_params_bo point to the upload buffer, and increments the upload BO reference count. So, we need to unreference it when making brw->draw.draw_params_bo point at something else, or else we'll retain a reference to stale upload buffers and hold on to them forever. This also means that the indirect case should increment the reference count on the indirect draw buffer when making brw->draw.draw_params_bo point at it. That way, both paths increment the reference count, so we can safely unreference it every time. Signed-off-by: Kenneth Graunke <[email protected]> Cc: "10.3" <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Tested-by: Ian Romanick <[email protected]>
*	i965: Make gl_BaseVertex available in a buffer object.	Kenneth Graunke	2014-09-10	1	-0/+14
\| \| \| \| \| \| \| \| \| \| \|	This will be used for GL_ARB_shader_draw_parameters, as well as fixing gl_VertexID, which is supposed to include gl_BaseVertex's value. For indirect draws, we simply point at the indirect buffer; for normal draws, we upload the value via the upload buffer. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
*	i965: Calculate start/base_vertex_location after preparing vertices.	Kenneth Graunke	2014-09-10	1	-9/+8
\| \| \| \| \|	Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
*	Revert 5 i965 patches: 8e27a4d2, 373143ed, c5bdf9be, 6f56e142, 88e3d404	Jordan Justen	2014-09-04	1	-10/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Reverts * "i965: Modify state upload to allow 2 different sets of state atoms." 8e27a4d2b3e4e74e9a77446bce49607433d86be3 * "i965: Modify dirty bit handling to support 2 pipelines." 373143ed9187c4d4ce1e3c486b5dd0880d18ec8b * "i965: Create a macro for checking a dirty bit." c5bdf9be1eca190417998d548fd140c1eca37a54 Conflicts: src/mesa/drivers/dri/i965/brw_context.h * "i965: Create a macro for setting all dirty bits." 6f56e1424d923fd80c84090fbf4506c9eaaffea1 Conflicts: src/mesa/drivers/dri/i965/brw_blorp.cpp src/mesa/drivers/dri/i965/brw_state_cache.c src/mesa/drivers/dri/i965/brw_state_upload.c * "i965: Create a macro for setting a dirty bit." 88e3d404dad009d8cff5124cf8acee7daeaceb64 Signed-off-by: Jordan Justen <[email protected]>
*	i965: Modify dirty bit handling to support 2 pipelines.	Paul Berry	2014-09-01	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The hardware state for compute shaders is almost entirely orthogonal to the hardware state for 3D rendering. To avoid sending unnecessary state to the hardware, we'll need to have a separate set of state atoms for the compute pipeline and the 3D pipeline. That means we need to maintain two separate sets of dirty bits to determine which state atoms need to be run. But the dirty bits are not completely independent; for example, if BRW_NEW_SURFACES is flagged while doing 3D rendering, then not only do we need to re-run 3D state atoms that depend on BRW_NEW_SURFACES, but we also need to re-run compute state atoms that depend on BRW_NEW_SURFACES. But we'll also need to re-run those state atoms the next time the compute pipeline is run. To accomplish this, we record two sets of dirty bits, one for each pipeline. When bits are dirtied (via SET_DIRTY_BIT() or SET_DIRTY_ALL()) we set them to the dirty state in both pipelines. When brw_state_upload() is run, we clear the dirty bits just for the pipeline that was run. Note that since the number of pipelines is known at compile time to be 2, the compiler should unroll the loops in SET_DIRTY_BIT() and SET_DIRTY_ALL(). Reviewed-by: Jordan Justen <[email protected]>
*	i965: Create a macro for setting a dirty bit.	Paul Berry	2014-09-01	1	-6/+6
\| \| \| \| \| \| \|	This will make it easier to extend dirty bit handling to support compute shaders. Reviewed-by: Jordan Justen <[email protected]>
*	i965: Move pre-draw resolve buffers to dd::UpdateState	Kristian Høgsberg	2014-08-15	1	-40/+0
\| \| \| \| \| \| \| \| \| \| \|	No functional change except for glBegin/glEnd style rendering, where we now do the resolves at glBegin time instead of FLUSH_VERTICES time. This is also the reason for this change, so that when we later switch fast clear resolve to use meta, we won't be doing meta operations in the middle of a begin/end sequence. Signed-off-by: Kristian Høgsberg <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
*	i965: Add a mechanism for sending native primitives into the driver	Kristian Høgsberg	2014-08-15	1	-3/+12
\| \| \| \| \| \| \| \| \| \| \|	The brw_draw_prims() function is the draw entry point into the driver, and takes struct _mesa_prim for input. We want to be able to feed native primitives into the driver, and to that end we introduce BRW_PRIM_OFFSET, which lets use describe geometry using the native GEN primitive types. Signed-off-by: Kristian Høgsberg <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Emit a performance warning on conditional rendering.	Kenneth Graunke	2014-08-08	1	-0/+5
\| \| \| \| \| \| \| \| \|	We have a CPU-side implementation of conditional rendering; it really should be done on the GPU. It's not necessarily that hard, but nobody has gotten to fixing it yet. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
*	i965: Drop sizeof(struct brw_sampler_state) from estimated prim size.	Kenneth Graunke	2014-08-02	1	-3/+3
\| \| \| \| \| \| \| \| \| \|	This is the last user of the structure. v2: Use a local variable with a sensible name so people know what 16 is. (Suggested by Topi Pohjolainen). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
*	i965: Avoid redundant call to brw_merge_inputs() in brw_try_draw_prims()	Iago Toral Quiroga	2014-05-13	1	-7/+6
\| \| \| \| \| \| \| \|	We always call brw_merge_inputs() right before looping over the primitives but this can be called inside the loop for each primitive too. In the case we do it for the first primitive the call is redundant and can be skipped. Reviewed-by: Eric Anholt <[email protected]>
*	i965: Delete the intel_regions.c code.	Eric Anholt	2014-05-01	1	-1/+0
\| \| \| \| \|	Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Chad Versace <[email protected]>
*	i965: Drop use of intel_region from miptrees.	Eric Anholt	2014-05-01	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \|	Note: region->width/height used to reflect the total_width/height padding of separate stencil, though mt->total_width didn't. region->width/height was being used in EGL images, where the padded value would have been the wrong one, so I converted them to use rb->Width/Height. v2: Drop debug printf that slipped in (caught by Ken) Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Chad Versace <[email protected]>
*	mesa: Replace use of _ReallyEnabled as a boolean with use of _Current.	Eric Anholt	2014-04-30	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	I'm probably not the only person that has tried to kill _ReallyEnabled. This does the mechanical part of the work, and cleans _ReallyEnabled from i965. I think that using _Current makes texture management clearer: You can't have multiple targets in use in the same texture image unit at the same time, because there's just that one pointer. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Use ctx->Texture._MaxEnabledTexImageUnit for upper bound	Chris Forbes	2014-04-21	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \|	Avoid looping over 32/48/96 (!!) tex image units every draw, most of which we don't care about. Improves performance on everyone's favorite not-a-benchmark by 2.9% on Haswell. Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
*	i965: Stop setting up a 1:1 "attrib" member in our vertex inputs.	Eric Anholt	2014-04-11	1	-1/+0
\| \| \| \| \| \| \| \| \|	It's just the array index, so we can just go look at the array and see which element we are. No significant performance difference (n=140) Reviewed-by: Ian Romanick <[email protected]>
*	tnl: Merge _tnl_vbo_draw_prims() into _tnl_draw_prims().	Iago Toral Quiroga	2014-04-08	1	-12/+12
\| \| \| \| \| \| \| \| \|	This should help prevent situations where we render without proper index bounds. For example: https://bugs.freedesktop.org/show_bug.cgi?id=59455 Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
*	i965: Make sure we always compute valid index bounds before drawing.	Iago Toral Quiroga	2014-03-28	1	-1/+2
\| \| \| \| \| \| \| \| \|	When doing software rendering (i.e. rendering to the selection buffer) we need to make sure that we have valid index bounds before calling _tnl_draw_prims(), otherwise we can crash. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59455 Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Drop broken front_buffer_reading/drawing optimization.	Eric Anholt	2014-03-11	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \|	The flag wasn't getting updated correctly when the ctx->DrawBuffer or ctx->ReadBuffer changed. It usually ended up working out because most apps only have one window system framebuffer, or if they have more than one and they have any front read/drawing, they will have called glReadBuffer()/glDrawBuffer() on it when they get started on the new buffer. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
*	i965: Fix render-to-texture in non-FinishRenderTexture cases.	Eric Anholt	2014-03-06	1	-6/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We've had several problems now with FinishRenderTexture not getting called enough, and we're ready to just give up on it ever doing what we need. In particular, an upcoming Steam title had rendering bugs that could be fixed by always_flush_cache=true. Instead of hoping Mesa core can figure out when we need to flush our caches, just track what BOs we've rendered to in a set, and when we render from a BO in that set, emit a flush and clear the set. There's some overhead to keeping this set, but most of that is just hashing the pointer -- it turns out our set never even gets very large, because cache flushes are so common (even on cairo-gl). No statistically significant performance difference in cairo-gl (n=100), despite spending ~.5% CPU in these set operations. v1: (Original patch by Eric Anholt.) v2: (Changes by Ken Graunke.) - Rebase forward from May 7th 2013 -> March 4th 2014. - Drop the FinishRenderTexture hook entirely; after rebasing the patch, the hook was just an empty function. - Move the brw_render_cache_set_clear() call from intel_batchbuffer_emit_flush() to brw_emit_pipe_control_flush(). In theory, this could catch more cases where we've flushed. - Consider stencil as a possible texturing source. v3: (changes by anholt): - Move set_clear() back to emit_mi_flush() -- it means we can drop more forced flushes from the code. In the previous location, it wouldn't have been called when we wanted pre-gen6. - Move the set clear from batch init to reset -- it should be empty at the start of every batch, since the kernel handled any inter-batch flush for us. v4: Drop the debug code in set.c that I accidentally committed. Signed-off-by: Eric Anholt <[email protected]> Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Tested-by: Dylan Baker <[email protected]> [v2]
*	i965: Validate (and resolve) all the bound textures.	Chris Forbes	2014-03-02	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	BRW_MAX_TEX_UNIT is the static limit on the number of textures we support per-stage, not in total. Core's `Unit` array is sized by MAX_COMBINED_TEXTURE_IMAGE_UNITS, which is significantly larger, and across the various shader stages, up to ctx->Const.MaxCombinedTextureImageUnits elements of it may be actually used. Fixes invisible bad behavior in piglit's max-samplers test (although this escalated to an assertion failure on HSW with texture_view, since non-immutable textures only have _Format set by validation.) Signed-off-by: Chris Forbes <[email protected]> Cc: "9.2 10.0 10.1" <[email protected]> Cc: Kenneth Graunke <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Move singlesample_mt to the renderbuffer.	Eric Anholt	2014-02-18	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \|	Since only window system renderbuffers can have a singlesample_mt, this lets us drop a bunch of sanity checking to make sure that we're just a renderbuffer-like thing. v2: Fix a badly-written comment (thanks Kenneth!), drop the now trivial helper function for set_needs_downsample. Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Use the new brw_load_register_mem helper for draw indirect.	Kenneth Graunke	2014-02-07	1	-31/+22
\| \| \| \| \| \| \| \| \| \| \|	This makes it work on Broadwell, too. v2: Drop bogus double write to 3DPRIM_BASE_VERTEX register (caught by Chris Forbes). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
*	s/Tungsten Graphics/VMware/	José Fonseca	2014-01-17	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Tungsten Graphics Inc. was acquired by VMware Inc. in 2008. Leaving the old copyright name is creating unnecessary confusion, hence this change. This was the sed script I used: $ cat tg2vmw.sed # Run as: # # git reset --hard HEAD && find include scons src -type f -not -name 'sed*' -print0 \| xargs -0 sed -i -f tg2vmw.sed # # Rename copyrights s/Tungsten Gra$ph\\|hp$ics,\? [iI]nc\.\?$, Cedar Park$\?$, Austin$\?$, \(Texas\\|TX$\)\?\.\?/VMware, Inc./g /Copyright/s/Tungsten Graphics$,\? [iI]nc\.$\?$, Cedar Park$\?$, Austin$\?$, \(Texas\\|TX$\)\?\.\?/VMware, Inc./ s/TUNGSTEN GRAPHICS/VMWARE/g # Rename emails s/[email protected]/[email protected]/ s/[email protected]/[email protected]/g s/jrfonseca-at-tungstengraphics-dot-com/jfonseca-at-vmware-dot-com/ s/jrfonseca\[email protected]/[email protected]/g s/keithw\[email protected]/[email protected]/g s/[email protected]/[email protected]/g s/thomas-at-tungstengraphics-dot-com/thellstom-at-vmware-dot-com/ s/[email protected]/[email protected]/ # Remove dead links s@Tungsten Graphics (http://www.tungstengraphics.com)@Tungsten Graphics@g # C string src/gallium/state_trackers/vega/api_misc.c s/"Tungsten Graphics, Inc"/"VMware, Inc"/ Reviewed-by: Brian Paul <[email protected]>
*	i965: Ensure that all necessary state is re-emitted if we run out of aperture.	Paul Berry	2014-01-13	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Prior to this patch, if we ran out of aperture space during brw_try_draw_prims(), we would rewind the batch buffer pointer (potentially throwing some state that may have been emitted by brw_upload_state()), flush the batch, and then try again. However, we wouldn't reset the dirty bits to the state they had before the call to brw_upload_state(). As a result, when we tried again, there was a danger that we wouldn't re-emit all the necessary state. (Note: prior to the introduction of hardware contexts, this wasn't a problem because flushing the batch forced all state to be re-emitted). This patch fixes the problem by leaving the dirty bits set at the end of brw_upload_state(); we only clear them after we have determined that we don't need to rewind the batch buffer. Cc: 10.0 9.2 <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Drop trailing whitespace from the rest of the driver.	Kenneth Graunke	2013-12-05	1	-9/+9
\| \| \| \| \| \| \|	Performed via: $ for file in ; do sed -i 's/ //g'; done Signed-off-by: Kenneth Graunke <[email protected]>
*	i965: pass indirect buffer to primitive restart check	Chris Forbes	2013-11-25	1	-3/+4
\| \| \| \| \| \|	Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
*	i965: implement indirect drawing for Gen7	Chris Forbes	2013-11-25	1	-2/+55
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Just prior to emitting the 3DPRIMITIVE command, we load each of the indirect registers. The values loaded are either from offsets into the current indirect BO, or constant zero if the parameter is not used for this draw. Enabling use of the indirect registers is done by turning on a bit in the first dword of the 3DPRIMITIVE command itself. V3: - Deduplicate the common part of both indexed and nonindexed indirect setup. - Just refer to the indirect bo out of the context directly. V4: - Fix bo reference to specify the range we care about. Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
*	mesa: add indirect drawing buffer parameter to draw functions	Christoph Bumiller	2013-11-25	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \|	Split from patch implementing ARB_draw_indirect. v2: Const-qualify the struct gl_buffer_object *indirect argument. v3: Fix up some more draw calls for new argument. v4: Fix up rebase conflicts in i965. v5: Undo const-qualification Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
*	i965: Convert brw->batch.is_blit to a BLT_RING/RENDER_RING enum.	Kenneth Graunke	2013-11-21	1	-1/+1
\| \| \| \| \| \| \| \|	Passing BLT_RING or RENDER_RING to batchbuffer functions is a lot more obvious than passing true or false. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
*	i965: Mark brw_draw_prims tfb_vertcount parameter as unused.	Kenneth Graunke	2013-10-31	1	-1/+3
\| \| \| \| \| \| \| \| \|	Renaming it makes it obvious that it isn't used, and the assertion verifies that the VBO module never passes us such an object. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Eric Anholt <[email protected]>