aboutsummaryrefslogtreecommitdiffstats
path: root/src/mesa/drivers/dri/i965/brw_draw.c
Commit message (Collapse)AuthorAgeFilesLines
* i965: Reemit vertex state between indirect multi drawsKristian Høgsberg Kristensen2015-12-291-2/+22
| | | | | | | | | | | | If we're doing an indirect draw, prims[i].basevertex is always 0 and the real base vertex value is in the indirect parameter buffer. We try to avoid flagging BRW_NEW_VERTICES if prims[i].basevertex doesn't change, which then breaks down for indirect draws. Thus, if a program uses base vertex or base instance, and the draw call is indirect, always flag BRW_NEW_VERTICES. A new piglit test, spec/ARB_shader_draw_parameters/drawid-indirect-vertexid tests this. Reviewed-by: Anuj Phogat <[email protected]>
* i965: Add support for gl_DrawIDARB and enable extensionKristian Høgsberg Kristensen2015-12-291-0/+12
| | | | | | | | | | We have to break open a new vec4 for gl_DrawIDARB. We've used up all space in the vec4 we use for SGVS and gl_DrawIDARB has to come from its own separate vertex buffer anyway. This is because we point the vb for base vertex and base instance into the draw parameter BO for indirect draw calls, but the draw id is generated by mesa in a different buffer. Reviewed-by: Anuj Phogat <[email protected]>
* i965: Add support for gl_BaseVertexARB and gl_BaseInstanceARBKristian Høgsberg Kristensen2015-12-291-2/+2
| | | | | | | We already have gl_BaseVertexARB in the .x component of the SGVS vec4 and plug gl_BaseInstanceARB into the last free component (.y). Reviewed-by: Anuj Phogat <[email protected]>
* i965: Add state bits for tess stagesChris Forbes2015-12-071-2/+5
| | | | | | Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Add backend structures for tess stagesChris Forbes2015-12-071-0/+4
| | | | | | Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Drop #include of main/glheader.h.Matt Turner2015-11-241-1/+0
| | | | | | It's never used. Reviewed-by: Ian Romanick <[email protected]>
* i965: Map GL_PATCHES to 3DPRIM_PATCHLIST_n.Kenneth Graunke2015-11-111-1/+8
| | | | | | | | | | | | | | | | Inspired by a patch by Fabian Bieler. Fabian defined a _3DPRIM_PATCHLIST_0 macro (which isn't actually a valid topology type); I instead chose to make a macro that takes an argument. He also took the number of patch vertices from _mesa_prim (which was set to ctx->TessCtrlProgram.patch_vertices) - I chose to use it directly to avoid the need for the VBO patch. v2: Change macro to 0x20 + (n - 1) instead of 0x1F + n to better match the documentation (suggested by Ian). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* mesa/i965: Refactor brw_is_front_buffer_{drawing,reading} to common codeIan Romanick2015-10-061-1/+2
| | | | | | | | | | There are multiple similar implementations of these functions, and a later patch was going to add another. v2: Move removing intel_framebuffer to a different patch. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* i965: Use C99 initializers for primitive arraysIan Romanick2015-10-061-24/+24
| | | | | | | | | Using C99 initializers for the primitive arrays makes things more readable. Signed-off-by: Ian Romanick <[email protected]> Suggested-by: Matt Turner <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Fix typos in licenseIan Romanick2015-09-101-2/+2
| | | | | | | | | | | | | | | | grep -lr 'sub license' | while read f; do \ sed --in-place -e 's/sub license/sublicense/' $f ;\ done grep -lr 'NON-INFRINGEMENT' | while read f; do \ sed --in-place -e 's/NON-INFRINGEMENT/NONINFRINGEMENT/' $f ;\ done As noted by Matt, both of these changes match the MIT license text found at http://opensource.org/licenses/MIT. Signed-off-by: Ian Romanick <[email protected]> Acked-by: Matt Turner <[email protected]>
* i965: Remove horizontal bars from file header commentsIan Romanick2015-09-101-4/+2
| | | | | | | Why was that ever a thing? Signed-off-by: Ian Romanick <[email protected]> Acked-by: Matt Turner <[email protected]>
* i965: Resolve GCC sign-compare warning.Rhys Kidd2015-08-181-1/+1
| | | | | | | | | | | | | mesa/src/mesa/drivers/dri/i965/brw_draw.c: In function 'brw_draw_destroy': mesa/src/mesa/drivers/dri/i965/brw_draw.c:630:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (i = 0; i < brw->vb.nr_buffers; i++) { ^ mesa/src/mesa/drivers/dri/i965/brw_draw.c:636:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (i = 0; i < brw->vb.nr_enabled; i++) { ^ Signed-off-by: Rhys Kidd <[email protected]> Reviewed-by: Thomas Helland <[email protected]>
* i965: Resolve GCC sign-compare warning.Rhys Kidd2015-08-181-1/+1
| | | | | | | | | | mesa/src/mesa/drivers/dri/i965/brw_draw.c: In function 'brw_postdraw_set_buffers_need_resolve': mesa/src/mesa/drivers/dri/i965/brw_draw.c:390:22: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (int i = 0; i < fb->_NumColorDrawBuffers; i++) { ^ Signed-off-by: Rhys Kidd <[email protected]> Reviewed-by: Thomas Helland <[email protected]>
* vbo: pass the stream from DrawTransformFeedbackStream to driversMarek Olšák2015-08-061-1/+2
| | | | Reviewed-by: Dave Airlie <[email protected]>
* i965: Trivial formatting changes in brw_draw.cIan Romanick2015-08-031-51/+51
| | | | | | Signed-off-by: Ian Romanick <[email protected]> Acked-by: Jason Ekstrand <[email protected]> Reviewed-by: Juha-Pekka Heikkila <[email protected]>
* mesa: Rename _mesa_lookup_enum_by_nr() to _mesa_enum_to_string().Kenneth Graunke2015-07-201-4/+4
| | | | | | | Generated by sed; no manual changes. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* i965: Move BEGIN_BATCH() into same control flow as ADVANCE_BATCH().Matt Turner2015-07-151-2/+2
| | | | | | | | | | BEGIN_BATCH() and ADVANCE_BATCH() will contain "do {" and "} while (0)" respectively to allow declaring local variables used by intervening OUT_BATCH macros. As such, BEGIN_BATCH() and ADVANCE_BATCH() need to be in the same control flow. Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Chris Wilson <[email protected]>
* i965: Rename intel_emit* to reflect their new location in brw_pipe_controlChris Wilson2015-06-241-2/+2
| | | | | Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Assert that the GL primitive isn't out of range.Matt Turner2015-06-231-1/+3
| | | | | | | | Coverity sees the if (mode >= BRW_PRIM_OFFSET (128)) test and assumes that the else-branch might execute for mode to up 127, which out be out of bounds. Reviewed-by: Jordan Justen <[email protected]>
* i965: Use predicate enable bit for conditional rendering w/o stallingNeil Roberts2015-05-121-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously whenever a primitive is drawn the driver would call _mesa_check_conditional_render which blocks waiting for the result of the query to determine whether to render. On Gen7+ there is a bit in the 3DPRIMITIVE command which can be used to disable the primitive based on the value of a state bit. This state bit can be set based on whether two registers have different values using the MI_PREDICATE command. We can load these two registers with the pixel count values stored in the query begin and end to implement conditional rendering without stalling. Unfortunately these two source registers were not in the whitelist of available registers in the kernel driver until v3.19. This patch uses the command parser version from intel_screen to detect whether to attempt to set the predicate data registers. The predicate enable bit is currently only used for drawing 3D primitives. For blits, clears, bitmaps, copypixels and drawpixels it still causes a stall. For most of these it would probably just work to call the new brw_check_conditional_render function instead of _mesa_check_conditional_render because they already work in terms of rendering primitives. However it's a bit trickier for blits because it can use the BLT ring or the blorp codepath. I think these operations are less useful for conditional rendering than rendering primitives so it might be best to leave it for a later patch. v2: Use the command parser version to detect whether we can write to the predicate data registers instead of trying to execute a register load command. v3: Simple rebase v4: Changes suggested by Kenneth Graunke: Split the load_64bit_register function out to a separate patch so it can be a shared public function. Avoid calling _mesa_check_conditional_render if we've already determined that there's no query object. Some styling fixes. Reviewed-by: Kenneth Graunke <[email protected]>
* i965/state: Don't use brw->state.dirty.brwJordan Justen2015-03-311-11/+11
| | | | | | | | | | | | | | | Now, we only use ctx->NewDriverState. I used this bash & sed command in the i965 directory: for file in *.[ch] *.[ch]pp; do sed -i -e 's/state\.dirty\.brw/ctx.NewDriverState/g' $file done Followed by manual changes to brw_state_upload.c. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/state: Rename brw_clear_dirty_bits to brw_render_state_finishedJordan Justen2015-03-311-1/+1
| | | | | | Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/state: Rename brw_upload_state to brw_upload_render_stateJordan Justen2015-03-311-3/+4
| | | | | | Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Do Sandybridge workaround flushes before each primitive.Kenneth Graunke2015-02-171-3/+0
| | | | | | | | | | | | | | | | | | | | | | | Sandybridge requires the post-sync non-zero workaround in a ton of places, and if you ever miss one, the GPU usually hangs. Currently, we try to track exactly when a workaround flush is necessary (via the brw->batch.need_workaround_flush flag). This is tricky to get right, and we've botched it several times in the past. This patch unconditionally performs the post-sync non-zero flush at the start of each primitive's state upload (including BLORP). We drop the needs_workaround_flush flag, and drop all the other callers, as the flush has already been performed. We have no data to indicate that simply flushing all the time will hurt performance, and it has the potential to help stability. v2: Add post-sync workaround to initial GPU state upload to be extra cautious (suggested by Chad Versace). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965: Fix start/base_vertex_location for >1 prims but !BRW_NEW_VERTICES.Kenneth Graunke2014-12-311-5/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a partial revert of c89306983c07e5a88c0d636267e5ccf263cb4213. It split the {start,base}_vertex_location handling into several steps: 1. Set brw->draw.start_vertex_location = prim[i].start and brw->draw.base_vertex_location = prim[i].basevertex. (This happened once per _mesa_prim, in the main drawing loop.) 2. Add brw->vb.start_vertex_bias and brw->ib.start_vertex_offset appropriately. (This happened in brw_prepare_shader_draw_parameters, which was called just after brw_prepare_vertices, as part of state upload, and only happened when BRW_NEW_VERTICES was flagged.) 3. Use those values when emitting 3DPRIMITIVE (once per _mesa_prim). If we drew multiple _mesa_prims, but didn't flag BRW_NEW_VERTICES on the second (or later) primitives, we would do step #1, but not #2. The first _mesa_prim would get correct values, but subsequent ones would only get the first half of the summation. The reason I originally did this was because I needed the value of gl_BaseVertexARB to exist in a buffer object prior to uploading 3DSTATE_VERTEX_BUFFERS. I believed I wanted to upload the value of 3DPRIMITIVE's "Base Vertex Location" field, which was computed as: (prims[i].indexed ? prims[i].start : prims[i].basevertex) + brw->vb.start_vertex_bias. The latter value wasn't available until after brw_prepare_vertices, and the former weren't available in the state upload code at all. Hence the awkward split. However, I believe that including brw->vb.start_vertex_bias was a mistake. It's an extra bias we apply when uploading vertex data into VBOs, to move [min_index, max_index] to [0, max_index - min_index]. >From the GL_ARB_shader_draw_parameters specification: "<gl_BaseVertexARB> holds the integer value passed to the <baseVertex> parameter to the command that resulted in the current shader invocation. In the case where the command has no <baseVertex> parameter, the value of <gl_BaseVertexARB> is zero." I conclude that gl_BaseVertexARB should only include the baseVertex parameter from glDraw*Elements*, not any internal biases we add for optimization purposes. With that in mind, gl_BaseVertexARB only needs prim[i].start or prim[i].basevertex. We can simply store that, and go back to computing start_vertex_location and base_vertex_location in brw_emit_prim(), like we used to. This is much simpler, and should actually fix two bugs. Fixes missing geometry in Unvanquished. Cc: "10.4 10.3" <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85529 Signed-off-by: Kenneth Graunke <[email protected]> Acked-by: Ian Romanick <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965: Use WARN_ONCE for the single-primitive-exceeded-aperture message.Kenneth Graunke2014-12-311-9/+4
| | | | | | | This makes it show up via ARB_debug_output and is also less code. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Compute VS attribute WA bits earlier and check if they changed.Kenneth Graunke2014-12-041-0/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BRW_NEW_VERTICES is flagged every time we draw a primitive. Having the brw_vs_prog atom depend on BRW_NEW_VERTICES meant that we had to compute the VS program key and do a program cache lookup for every single primitive. This is painfully expensive. The workaround bit computation is almost entirely based on the vertex attribute arrays (brw->vb.inputs[i]), which are set by brw_merge_inputs. The only thing it uses the VS program for is to see which VS inputs are actually read. brw_merge_inputs() happens once per primitive, and can safely look at the currently bound vertex program, as it doesn't change in the middle of a draw. This patch moves the workaround bit computation to brw_merge_inputs(), right after assigning brw->vb.inputs[i], and stores the previous WA bit values in the context. If they've actually changed from the last draw (which is uncommon), we signal that we need a new vertex program, causing brw_vs_prog to compute a new key. Improves performance in Gl32Batch7 by 13.6123% +/- 0.739652% (n=166) on Haswell GT3e. I'm told Baytrail shows similar gains. v2: Introduce a new BRW_NEW_VS_ATTRIB_WORKAROUNDS dirty bit, rather than reusing BRW_NEW_VERTEX_PROGRAM (suggested by Chris Forbes). This prevents unnecessary re-emission of surface/sampler related atoms (and an SOL atom on Sandybridge). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965: Just return void from brw_try_draw_primsIan Romanick2014-12-021-5/+2
| | | | | | | | | | Note from Ken: "We used to use the return value to indicate whether software fallbacks were necessary, but we haven't in years." Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Fix reference counting in new basevertex upload code.Kenneth Graunke2014-09-121-0/+3
| | | | | | | | | | | | | | | | | | | | In the non-indirect draw case, we call intel_upload_data to upload gl_BaseVertex. It makes brw->draw.draw_params_bo point to the upload buffer, and increments the upload BO reference count. So, we need to unreference it when making brw->draw.draw_params_bo point at something else, or else we'll retain a reference to stale upload buffers and hold on to them forever. This also means that the indirect case should increment the reference count on the indirect draw buffer when making brw->draw.draw_params_bo point at it. That way, both paths increment the reference count, so we can safely unreference it every time. Signed-off-by: Kenneth Graunke <[email protected]> Cc: "10.3" <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Tested-by: Ian Romanick <[email protected]>
* i965: Make gl_BaseVertex available in a buffer object.Kenneth Graunke2014-09-101-0/+14
| | | | | | | | | | | This will be used for GL_ARB_shader_draw_parameters, as well as fixing gl_VertexID, which is supposed to include gl_BaseVertex's value. For indirect draws, we simply point at the indirect buffer; for normal draws, we upload the value via the upload buffer. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* i965: Calculate start/base_vertex_location after preparing vertices.Kenneth Graunke2014-09-101-9/+8
| | | | | Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* Revert 5 i965 patches: 8e27a4d2, 373143ed, c5bdf9be, 6f56e142, 88e3d404Jordan Justen2014-09-041-10/+10
| | | | | | | | | | | | | | | | | | | | | | Reverts * "i965: Modify state upload to allow 2 different sets of state atoms." 8e27a4d2b3e4e74e9a77446bce49607433d86be3 * "i965: Modify dirty bit handling to support 2 pipelines." 373143ed9187c4d4ce1e3c486b5dd0880d18ec8b * "i965: Create a macro for checking a dirty bit." c5bdf9be1eca190417998d548fd140c1eca37a54 Conflicts: src/mesa/drivers/dri/i965/brw_context.h * "i965: Create a macro for setting all dirty bits." 6f56e1424d923fd80c84090fbf4506c9eaaffea1 Conflicts: src/mesa/drivers/dri/i965/brw_blorp.cpp src/mesa/drivers/dri/i965/brw_state_cache.c src/mesa/drivers/dri/i965/brw_state_upload.c * "i965: Create a macro for setting a dirty bit." 88e3d404dad009d8cff5124cf8acee7daeaceb64 Signed-off-by: Jordan Justen <[email protected]>
* i965: Modify dirty bit handling to support 2 pipelines.Paul Berry2014-09-011-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The hardware state for compute shaders is almost entirely orthogonal to the hardware state for 3D rendering. To avoid sending unnecessary state to the hardware, we'll need to have a separate set of state atoms for the compute pipeline and the 3D pipeline. That means we need to maintain two separate sets of dirty bits to determine which state atoms need to be run. But the dirty bits are not completely independent; for example, if BRW_NEW_SURFACES is flagged while doing 3D rendering, then not only do we need to re-run 3D state atoms that depend on BRW_NEW_SURFACES, but we also need to re-run compute state atoms that depend on BRW_NEW_SURFACES. But we'll also need to re-run those state atoms the next time the compute pipeline is run. To accomplish this, we record two sets of dirty bits, one for each pipeline. When bits are dirtied (via SET_DIRTY_BIT() or SET_DIRTY_ALL()) we set them to the dirty state in both pipelines. When brw_state_upload() is run, we clear the dirty bits just for the pipeline that was run. Note that since the number of pipelines is known at compile time to be 2, the compiler should unroll the loops in SET_DIRTY_BIT() and SET_DIRTY_ALL(). Reviewed-by: Jordan Justen <[email protected]>
* i965: Create a macro for setting a dirty bit.Paul Berry2014-09-011-6/+6
| | | | | | | This will make it easier to extend dirty bit handling to support compute shaders. Reviewed-by: Jordan Justen <[email protected]>
* i965: Move pre-draw resolve buffers to dd::UpdateStateKristian Høgsberg2014-08-151-40/+0
| | | | | | | | | | | No functional change except for glBegin/glEnd style rendering, where we now do the resolves at glBegin time instead of FLUSH_VERTICES time. This is also the reason for this change, so that when we later switch fast clear resolve to use meta, we won't be doing meta operations in the middle of a begin/end sequence. Signed-off-by: Kristian Høgsberg <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
* i965: Add a mechanism for sending native primitives into the driverKristian Høgsberg2014-08-151-3/+12
| | | | | | | | | | | The brw_draw_prims() function is the draw entry point into the driver, and takes struct _mesa_prim for input. We want to be able to feed native primitives into the driver, and to that end we introduce BRW_PRIM_OFFSET, which lets use describe geometry using the native GEN primitive types. Signed-off-by: Kristian Høgsberg <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Emit a performance warning on conditional rendering.Kenneth Graunke2014-08-081-0/+5
| | | | | | | | | We have a CPU-side implementation of conditional rendering; it really should be done on the GPU. It's not necessarily that hard, but nobody has gotten to fixing it yet. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
* i965: Drop sizeof(struct brw_sampler_state) from estimated prim size.Kenneth Graunke2014-08-021-3/+3
| | | | | | | | | | This is the last user of the structure. v2: Use a local variable with a sensible name so people know what 16 is. (Suggested by Topi Pohjolainen). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Avoid redundant call to brw_merge_inputs() in brw_try_draw_prims()Iago Toral Quiroga2014-05-131-7/+6
| | | | | | | | We always call brw_merge_inputs() right before looping over the primitives but this can be called inside the loop for each primitive too. In the case we do it for the first primitive the call is redundant and can be skipped. Reviewed-by: Eric Anholt <[email protected]>
* i965: Delete the intel_regions.c code.Eric Anholt2014-05-011-1/+0
| | | | | Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965: Drop use of intel_region from miptrees.Eric Anholt2014-05-011-4/+4
| | | | | | | | | | | | Note: region->width/height used to reflect the total_width/height padding of separate stencil, though mt->total_width didn't. region->width/height was being used in EGL images, where the padded value would have been the wrong one, so I converted them to use rb->Width/Height. v2: Drop debug printf that slipped in (caught by Ken) Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* mesa: Replace use of _ReallyEnabled as a boolean with use of _Current.Eric Anholt2014-04-301-1/+1
| | | | | | | | | | | | | I'm probably not the only person that has tried to kill _ReallyEnabled. This does the mechanical part of the work, and cleans _ReallyEnabled from i965. I think that using _Current makes texture management clearer: You can't have multiple targets in use in the same texture image unit at the same time, because there's just that one pointer. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Use ctx->Texture._MaxEnabledTexImageUnit for upper boundChris Forbes2014-04-211-1/+2
| | | | | | | | | | | Avoid looping over 32/48/96 (!!) tex image units every draw, most of which we don't care about. Improves performance on everyone's favorite not-a-benchmark by 2.9% on Haswell. Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* i965: Stop setting up a 1:1 "attrib" member in our vertex inputs.Eric Anholt2014-04-111-1/+0
| | | | | | | | | It's just the array index, so we can just go look at the array and see which element we are. No significant performance difference (n=140) Reviewed-by: Ian Romanick <[email protected]>
* tnl: Merge _tnl_vbo_draw_prims() into _tnl_draw_prims().Iago Toral Quiroga2014-04-081-12/+12
| | | | | | | | | This should help prevent situations where we render without proper index bounds. For example: https://bugs.freedesktop.org/show_bug.cgi?id=59455 Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Make sure we always compute valid index bounds before drawing.Iago Toral Quiroga2014-03-281-1/+2
| | | | | | | | | When doing software rendering (i.e. rendering to the selection buffer) we need to make sure that we have valid index bounds before calling _tnl_draw_prims(), otherwise we can crash. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59455 Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Drop broken front_buffer_reading/drawing optimization.Eric Anholt2014-03-111-1/+2
| | | | | | | | | | | | The flag wasn't getting updated correctly when the ctx->DrawBuffer or ctx->ReadBuffer changed. It usually ended up working out because most apps only have one window system framebuffer, or if they have more than one and they have any front read/drawing, they will have called glReadBuffer()/glDrawBuffer() on it when they get started on the new buffer. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* i965: Fix render-to-texture in non-FinishRenderTexture cases.Eric Anholt2014-03-061-6/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We've had several problems now with FinishRenderTexture not getting called enough, and we're ready to just give up on it ever doing what we need. In particular, an upcoming Steam title had rendering bugs that could be fixed by always_flush_cache=true. Instead of hoping Mesa core can figure out when we need to flush our caches, just track what BOs we've rendered to in a set, and when we render from a BO in that set, emit a flush and clear the set. There's some overhead to keeping this set, but most of that is just hashing the pointer -- it turns out our set never even gets very large, because cache flushes are so common (even on cairo-gl). No statistically significant performance difference in cairo-gl (n=100), despite spending ~.5% CPU in these set operations. v1: (Original patch by Eric Anholt.) v2: (Changes by Ken Graunke.) - Rebase forward from May 7th 2013 -> March 4th 2014. - Drop the FinishRenderTexture hook entirely; after rebasing the patch, the hook was just an empty function. - Move the brw_render_cache_set_clear() call from intel_batchbuffer_emit_flush() to brw_emit_pipe_control_flush(). In theory, this could catch more cases where we've flushed. - Consider stencil as a possible texturing source. v3: (changes by anholt): - Move set_clear() back to emit_mi_flush() -- it means we can drop more forced flushes from the code. In the previous location, it wouldn't have been called when we wanted pre-gen6. - Move the set clear from batch init to reset -- it should be empty at the start of every batch, since the kernel handled any inter-batch flush for us. v4: Drop the debug code in set.c that I accidentally committed. Signed-off-by: Eric Anholt <[email protected]> Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Tested-by: Dylan Baker <[email protected]> [v2]
* i965: Validate (and resolve) all the bound textures.Chris Forbes2014-03-021-1/+1
| | | | | | | | | | | | | | | | | | | BRW_MAX_TEX_UNIT is the static limit on the number of textures we support per-stage, not in total. Core's `Unit` array is sized by MAX_COMBINED_TEXTURE_IMAGE_UNITS, which is significantly larger, and across the various shader stages, up to ctx->Const.MaxCombinedTextureImageUnits elements of it may be actually used. Fixes invisible bad behavior in piglit's max-samplers test (although this escalated to an assertion failure on HSW with texture_view, since non-immutable textures only have _Format set by validation.) Signed-off-by: Chris Forbes <[email protected]> Cc: "9.2 10.0 10.1" <[email protected]> Cc: Kenneth Graunke <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Move singlesample_mt to the renderbuffer.Eric Anholt2014-02-181-2/+2
| | | | | | | | | | | Since only window system renderbuffers can have a singlesample_mt, this lets us drop a bunch of sanity checking to make sure that we're just a renderbuffer-like thing. v2: Fix a badly-written comment (thanks Kenneth!), drop the now trivial helper function for set_needs_downsample. Reviewed-by: Kenneth Graunke <[email protected]>