aboutsummaryrefslogtreecommitdiffstats
path: root/src/mesa/drivers/dri/i965/brw_vtbl.c
Commit message (Collapse)AuthorAgeFilesLines
* i965: Remove the rest of brw_update_draw_buffer().Eric Anholt2013-06-251-27/+5
| | | | | | | | | | | The last piece of code with an effect was flagging _NEW_BUFFERS. Only, that is already flagged from everything that calls this function: Mesa GL state updates flag it before even calling down into the driver, and the calls from the DRI2 window system framebuffer update path end up flagging it as part of the ResizeBuffers() hook. Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Stop updating FBO state on drawbuffers change.Eric Anholt2013-06-251-8/+0
| | | | | | | | The computed fields are updated appropriately as part of the normal draw call path due to _NEW_BUFFERS being set. Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Stop recomputing drawbuffer bounds on drawbuffer change.Eric Anholt2013-06-251-2/+0
| | | | | | | | | For winsys FBOs, the bounds are appropriately updated immediately upon _mesa_resize_framebuffer(). For user FBOs, they're updated as part of the normal draw path state update due to _NEW_BUFFERS having been flagged. Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Remove _NEW_DEPTH state flagging on drawbuffers change.Eric Anholt2013-06-251-2/+0
| | | | | | | | Of the places noting a _NEW_DEPTH dependency, all were already checking for _NEW_BUFFERS if appropriate. Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* intel: Stop doing special _NEW_STENCIL state flagging on drawbuffers.Eric Anholt2013-06-251-6/+1
| | | | | | | | 2/3 packets depending on Stencil._Enabled already checked for _NEW_BUFFERS, so just add _NEW_BUFFERS to the remaining one. Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Stop flagging viewport/scissor change on drawbuffers change.Eric Anholt2013-06-251-3/+0
| | | | | | | | | The viewport (ctx->Viewport._WindowMap) doesn't change with drawable size changes, and we update scissor (ctx->DrawBuffer->_Xmin and friends) on _NEW_BUFFERS in things like brw_sf_state.c. Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Stop flagging _NEW_POLYGON on drawbuffers change.Eric Anholt2013-06-251-5/+0
| | | | | | | | Things like brw_sf.c that need to know about orientation are already recomputing on _NEW_BUFFERS. Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Delete unused brw->sol.offset_0_batch_start field.Kenneth Graunke2013-05-211-6/+0
| | | | | | | | This was only used for the the non-hardware context code. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Paul Berry <[email protected]>
* i965: Reduce code duplication in handling of depth, stencil, and HiZ.Paul Berry2013-04-021-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | This patch consolidates duplicate code in the brw_depthbuffer and gen7_depthbuffer state atoms. Previously, these state atoms contained 5 chunks of code for emitting the _3DSTATE_DEPTH_BUFFER packet (3 for Gen4-6 and 2 for Gen7). Also a lot of logic for determining the appropriate buffer setup was duplicated between the Gen4-6 and Gen7 functions. This refactor splits the code into three separate functions: brw_emit_depthbuffer(), which determines the appropriate buffer setup in a mostly generation-independent way, brw_emit_depth_stencil_hiz(), which emits the appropriate state packets for Gen4-6, and gen7_emit_depth_stencil_hiz(), which emits the appropriate state packets for Gen7. Tested using Piglit on Gen5-7 (no regressions). v2: Re-word some comments. Fix an assertion that incorrectly prohibited packed depth/stencil formats on Gen6 (these are allowed provided that HiZ is disabled). Reviewed-by: Chad Versace <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Start using HIZ for Z16 textures.Eric Anholt2012-12-261-0/+1
| | | | | | | | | | I had left this out for a long time because it regressed some depthstencil-render-miplevels cases when it was enabled. Now that the bugs causing those are fixed, there's nothing stopping us. Improves glbenchmark 2.1 offscreen performance by 7.3% +/- 2.8% (n=10). Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Make sure that the shader_time report at context destroy happens.Eric Anholt2012-12-141-0/+3
| | | | | | Otherwise, you end up with some report from within a second of context destroy, which is now what you really want for testing the impact of changes
* i965: Add a debug flag for counting cycles spent in each compiled shader.Eric Anholt2012-12-051-0/+14
| | | | | | | | | | | | | | | | | | | | | | This can be used for two purposes: Using hand-coded shaders to determine per-instruction timings, or figuring out which shader to optimize in a whole application. Note that this doesn't cover the instructions that set up the message to the URB/FB write -- we'd need to convert the MRF usage in these instructions to GRFs so that our offsets/times don't overwrite our shader outputs. Reviewed-by: Kenneth Graunke <[email protected]> (v1) v2: Check the timestamp reset flag in the VS, which is apparently getting set fairly regularly in the range we watch, resulting in negative numbers getting added to our 32-bit counter, and thus large values added to our uint64_t. v3: Rebase on reladdr changes, removing a new safety check that proved impossible to satisfy. Add a comment to the AOP defs from Ken's review, and put them in a slightly more sensible spot. v4: Check timestamp reset in the FS as well.
* i965: Fix slow leak of brw->wm.compile_data->storeEric Anholt2012-11-081-2/+0
| | | | | | | | We were successfully freeing our compile data at context destroy, but until then we were allocating a new store every compile without freeing it. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=56019 Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Don't bother trying to extend the current vertex buffers.Kenneth Graunke2012-10-311-1/+0
| | | | | | | | | | | | | | | | | | | | This essentially reverts the following: commit c625aa19cb53ed27f91bfd16fea6ea727e9a5bbd Author: Chris Wilson <[email protected]> Date: Fri Feb 18 10:37:43 2011 +0000 intel: extend current vertex buffers While working on optimizing an upcoming Steam title, I broke this code. Eric expressed his doubts about this optimization, and noted that the original commit offered no performance data. I ran before and after benchmarks on Xonotic and Citybench, and found that this code made no difference. So, remove it to reduce complexity and make future work simpler. Reviewed-by: Eric Anholt <[email protected]>
* i965: brwInitVtbl needs to know the chipset generationIan Romanick2012-09-281-0/+1
| | | | | | Fixes major regressions since de958de. Signed-off-by: Ian Romanick <[email protected]>
* intel: Reserve enough space to finish occlusion queries on Gen6.Kenneth Graunke2012-08-121-0/+4
| | | | | | | | | | | | | | | | | | | After realizing that brw_finish_batch emitted some final PIPE_CONTROLs to record occlusion queries, Chris noted that we probably hadn't reserved enough space to actually emit them. Reserving a full 60 bytes seems a bit harsh, since we only need that much if occlusion queries are actually active. Plus, 28 bytes would be sufficient for Gen7, and 24 for Gen4-5. We could optimize this in the future, but it doesn't seem too critical. NOTE: This is a candidate for stable release branches. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=53311 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Wilson <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Add hardware context support.Kenneth Graunke2012-07-101-4/+11
| | | | | | | | | | | With fixes and updates from Ben Widawsky and comments from Paul Berry. v2: Use drm_intel_gem_context_destroy to destroy hardware context; remove useless initialization of hw_ctx, both suggested by Eric. Signed-off-by: Kenneth Graunke <[email protected]> Signed-off-by: Ben Widawsky <[email protected]> Acked-by: Paul Berry <[email protected]>
* i965: Drop a layer of indirection in doing HiZ resolves.Eric Anholt2012-05-231-12/+0
| | | | | Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965: Completely annotate the batch bo when aub dumping.Paul Berry2012-05-221-0/+1
| | | | | | | | | | | | | | | | | | | Previously, when the environment variable INTEL_DEBUG=aub was set, mesa would simply instruct DRM to start dumping data to an .aub file, but we would not provide DRM with any information about the format of the data in various buffers. As a result, a lot of the data in the generate .aub file would be unannotated, making further data analysis difficult. This patch causes the entire contents of each batch buffer to be annotated using the data in brw->state_batch_list (which was previously used only to annotate the output of INTEL_DEBUG=bat). This includes data that was allocated by brw_state_batch, such as binding tables, surface and sampler states, depth/stencil state, and so on. The new annotation mechanism requires DRM version 2.4.34. Reviewed-by: Eric Anholt <[email protected]>
* i965/hiz: Convert gen{6,7}_hiz.h to gen{6,7}_blorp.hPaul Berry2012-05-101-2/+2
| | | | | | | This patch renames the gen6_hiz.h and gen7_hiz.h files to correspond to the renames of the corresponding .cpp files (see previous commit). Reviewed-by: Chad Versace <[email protected]>
* i965: Avoid blocking on the GPU for setting the HiZ op vertex data.Eric Anholt2012-02-281-1/+0
| | | | | | | | | | | | We need to allocate new space every time to avoid blocking on the last HiZ op completing. There are two easy ways to do this: brw_state_batch() and intel_upload_data(). brw_state_batch() is simpler and avoids another buffer allocation. Improves Unigine Tropics performance 0.376416% +/- 0.148722% (n=7). Reviewed-by: Chad Versace <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Rewrite the HiZ opChad Versace2012-02-071-2/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The HiZ op was implemented as a meta-op. This patch reimplements it by emitting a special HiZ batch. This fixes several known bugs, and likely a lot of undiscovered ones too. ==== Why the HiZ meta-op needed to die ==== The HiZ op was implemented as a meta-op, which caused lots of trouble. All other meta-ops occur as a result of some GL call (for example, glClear and glGenerateMipmap), but the HiZ meta-op was special. It was called in places that Mesa (in particular, the vbo and swrast modules) did not expect---and were not prepared for---state changes to occur (for example: glDraw; glCallList; within glBegin/End blocks; and within swrast_prepare_render as a result of intel_miptree_map). In an attempt to work around these unexpected state changes, I added two hooks in i965: - A hook for glDraw, located in brw_predraw_resolve_buffers (which is called in the glDraw path). This hook detected if a predraw resolve meta-op had occurred, and would hackishly repropagate some GL state if necessary. This ensured that the meta-op state changes would not intefere with the vbo module's subsequent execution of glDraw. - A hook for glBegin, implemented by brwPrepareExecBegin. This hook resolved all buffers before entering a glBegin/End block, thus preventing an infinitely recurring call to vbo_exec_FlushVertices. The vbo module calls vbo_exec_FlushVertices to flush its vertex queue in response to GL state changes. Unfortunately, these hooks were not sufficient. The meta-op state changes still interacted badly with glPopAttrib (as discovered in bug 44927) and with swrast rendering (as discovered by debugging gen6's swrast fallback for glBitmap). I expect there are more undiscovered bugs. Rather than play whack-a-mole in a minefield, the sane approach is to replace the HiZ meta-op with something safer. ==== How it was killed ==== This patch consists of several logical components: 1. Rewrite the HiZ op by replacing function gen6_resolve_slice with gen6_hiz_exec and gen7_hiz_exec. The new functions do not call a meta-op, but instead manually construct and emit a batch to "draw" the HiZ op's rectangle primitive. The new functions alter no GL state. 2. Add fields to brw_context::hiz for the new HiZ op. 3. Emit a workaround flush when toggling 3DSTATE_VS.VsFunctionEnable. 4. Kill all dead HiZ code: - the function gen6_resolve_slice - the dirty flag BRW_NEW_HIZ - the dead fields in brw_context::hiz - the state packet manipulation triggered by the now removed brw_context::hiz::op - the meta-op workaround in brw_predraw_resolve_buffers (discussed above) - the meta-op workaround brwPrepareExecBegin (discussed above) Note: This is a candidate for the 8.0 branch. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Acked-by: Paul Berry <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=43327 Reported-by: [email protected] Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=44927 Reported-by: [email protected] Signed-off-by: Chad Versace <[email protected]>
* i965/gen7: Fix up the transform feedback buffer pointers on later batches.Eric Anholt2012-01-061-0/+6
| | | | | | Fixes piglit EXT_transform_feedback/intervening-read Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Add separate stencil/HiZ setup for MESA_FORMAT_Z32_FLOAT_X24S8.Eric Anholt2011-12-191-0/+2
| | | | | | | | | This is a little more unusual than the separate MESA_FORMAT_S8_Z24 support, because in addition to storing the real stencil data in a MESA_FORMAT_S8 miptree, we also make the Z miptree be MESA_FORMAT_Z32_FLOAT instead of the requested format. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Drop separate stencil assertions in update_draw_buffer().Eric Anholt2011-12-141-16/+0
| | | | | | | The comment said they deserved to be in emit_depthbuffer, and at this point they were all there already. Reviewed-by: Kenneth Graunke <[email protected]>
* intel: Only prefer separate stencil when we can do HiZ.Eric Anholt2011-12-071-2/+10
| | | | | | | | | | | | This required is_hiz_depth_format to start returning true on S8_Z24 as well, since that's the format we have here. The two previous callers are only calling it on non-depthstencil formats. This avoids us needing to have HiZ working on a new Z format immediately upon exposing the format (particularly painful for Z32_FLOAT_X24S8, which means all the fake packed depth/stencil paths). Reviewed-by: Chad Versace <[email protected]>
* intel: Change signature of HiZ resolve functionsChad Versace2011-11-221-2/+2
| | | | | | | | | | | | | Now that intel_renderbuffer::region has been replaced with a miptree, the HiZ functions region parameter must be replaced with a miptree parameter. Change the return type from bool to void. Rename the 'depth' parameter to 'layer', because it will correspond to irb->mt_layer. Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Chad Versace <[email protected]>
* intel: Remove unused HiZ functionsChad Versace2011-11-221-9/+0
| | | | | | | | | | | | | | Remove the following functions: i830_hiz_resolve_noop i915_hiz_resolve_noop brw_hiz_resolve_noop My original strategy for how intel->vtbl.resolve_*buffer was used has substantially changed. The above functions are no longer called in the current strategy. Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Chad Versace <[email protected]>
* intel: Replace intel_renderbuffer::region with a miptree [v3]Chad Versace2011-11-211-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Essentially, this patch just globally substitutes `irb->region` with `irb->mt->region` and then does some minor cleanups to avoid segfaults and other problems. This is in preparation for 1. Fixing scatter/gather for mipmapped separate stencil textures. 2. Supporting HiZ for mipmapped depth textures. As a nice benefit, this lays down some preliminary groundwork for easily texturing from any renderbuffer, even those of the window system. A future commit will replace intel_mipmap_tree::hiz_region with a miptree. v2: - Return early in intel_process_dri2_buffer_*() if region allocation fails. - Fix double semicolon. - Fix miptree reference leaks in the following functions: intel_process_dri2_buffer_with_separate_stencil() intel_image_target_renderbuffer_storage() v3: - [anholt] Fix check for hiz allocation failure. Replace ``if (!irb->mt)` with ``if(!irb->mt->hiz_region)``. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Signed-off-by: Chad Versace <[email protected]>
* intel: Fix swrast_render_start() for depthstencil buffers with separate stencilChad Versace2011-11-211-21/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 1. Don't map the depthstencil buffer twice Place a guard in intel_renderbuffer_map() to prevent a renderbuffer from being mapped twice. This happened if a single buffer was attached to the framebuffer's depth and stencil attachment points. (Interestingly, because intel_map_renderbuffer_gtt() is idempotent, the double mapping did not cause bugs for depthstencil buffers *without* separate stencil). 2. Stop overriding gl_framebuffer::_DepthBuffer,_StencilBuffer Normally, if a depthstencil buffer is attached to the framebuffer's depth attachment point, then _mesa_update_framebuffer() installs a wrapper depth renderbuffer at gl_framebuffer::_DepthBuffer. Ditto for the stencil attachment point and gl_framebuffer::_StencilBuffer A depthstencil intel_renderbuffer with separate stencil contains hidden depth and stencil renderbuffers, which are the *real* renderbuffers. In order to force swrast to work, we were installing, in brw_update_draw_buffer(), the hidden renderbuffers at gl_framebuffer::_DepthBuffer and _StencilBuffer, thus overriding the behavior of _mesa_update_framebuffer(). However, now that intel_renderbuffer_map() is implemented with MapRenderbuffer(), overriding _mesa_update_framebuffer's introduces bugs. This patch removes the override code. Fixes several Piglit tests on gen7. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Signed-off-by: Chad Versace <[email protected]>
* i965: Add new vtable entries for surface state updating functions.Kenneth Graunke2011-11-101-0/+6
| | | | | | | | | | | | | | | Gen7+ SURFACE_STATE is different from Gen4-6, so we need separate per-generation functions for creating and updating it. However, the usage is the same, and callers just want to utilize the appropriate functions with minimal pain. So, put them in the vtable. Since these take a brw_context pointer and are only used on Gen4, just add a forward declaration. This is the simplest (if not cleanest) solution. It would be nicer to have a i965-specific vtable, but that's a refactor for another day. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Remove the validated BO list, now that it's unused.Eric Anholt2011-10-291-1/+0
| | | | | Reviewed-by: Kenneth Graunke <[email protected]> Acked-by: Paul Berry <[email protected]>
* intel: Add HiZ operations to intel_context::vtbl for all driversChad Versace2011-10-181-0/+15
| | | | | | | | | | | | | Add the following to the vtbl: hiz_resolve_depthbuffer hiz_resolve_hizbuffer For all drivers for which HiZ is not enabled, the methods are set to be no-ops. If HiZ is enabled, the methods are currently to set to empty stubs. Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Chad Versace <[email protected]>
* i965: Fix Android build by removing relative includesChad Versace2011-08-301-1/+1
| | | | | | | | | | Replace each occurence of #include "../glsl/*.h" with #include "glsl/*.h" Reviewed-by: Ian Romanick <[email protected]> Signed-off-by: Chad Versace <[email protected]>
* i965: Fix many of the trivial WebGL demos that broke due to IB optimization.Eric Anholt2011-07-251-0/+1
| | | | | | | | | | | The index buffer state emit only occurred if there was an IB in place and we were in either a new batch or a new IB state. But because we only flagged new IB state if IB state changed from the last IB state we calculated, we could simply never emit IB state after batchbuffer wraps if the first draw didn't use the IB and we didn't actually change the IB. Fixes piglit glx-multi-context-ib-1.
* i965: Remove i915 paths from brw_update_draw_buffers().Eric Anholt2011-07-181-37/+11
| | | | Reviewed-by: Chad Versace <[email protected]>
* i965: Remove unused region calculations in brw_update_draw_buffer().Eric Anholt2011-07-181-60/+2
| | | | Reviewed-by: Chad Versace <[email protected]>
* i965: Remove empty brw_set_draw_region.Eric Anholt2011-07-181-14/+0
| | | | Reviewed-by: Chad Versace <[email protected]>
* i965: Remove FALLBACK() from brw_update_draw_region().Eric Anholt2011-07-181-16/+0
| | | | | | The 965 driver doesn't use these for deciding on fallbacks. Reviewed-by: Chad Versace <[email protected]>
* intel: Move intel_draw_buffers() code into each driver.Eric Anholt2011-07-181-0/+201
| | | | | | | | The illusion of shared code here wasn't fooling anybody. It was tempting to keep i830 and i915 still shared, but I think I actually want to make them diverge shortly. Reviewed-by: Chad Versace <[email protected]>
* i965: Track the brw_state_batch() data while under INTEL_DEBUG=batch.Eric Anholt2011-07-111-0/+2
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* i965/gen6: Limit the workaround flush to once per primitive.Eric Anholt2011-06-201-0/+5
| | | | | We're about to call this function in a bunch of state emits, so let's not spam the hardware with flushes too hard.
* i965: Use state streaming on programs, and state base address on gen5+.Eric Anholt2011-06-181-6/+6
| | | | | | | | | | There will be a little bit of thrashing of the program cache BO as the cache warms up, but once the application is in steady state, this reduces relocations on gen5 and later. On my T420 laptop, cairogl firefox-talos-gfx performance improves 2.6% +/- 1.3% (n=6). No statistically significant performance difference on nexuiz (n=5).
* i965: Only flag the new-batch related state as dirty at new batch time.Eric Anholt2011-06-181-5/+1
| | | | | | This was debug code from the initial import of the driver. No statistically significant performance difference on cairo-gl or nexuiz (n=6).
* intel: Add is_hiz_depth_format() to intel_contex.vtblChad Versace2011-05-251-0/+11
| | | | | | | | | Given a format, is_hiz_depth_format() indicates if HiZ can be enabled on a depthbuffer of that format. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Signed-off-by: Chad Versace <[email protected]>
* i965: Stop caching the combined depth/stencil region in brw_context.c.Eric Anholt2011-05-181-9/+0
| | | | | | This was going to get in the way of separate depth/stencil (which wants to know about both, and whether they are the same rb), and also wasn't a sufficient flag for the fix in the following commit.
* i965: Get a ralloc context into brw_compile.Kenneth Graunke2011-05-171-7/+3
| | | | | | | | | | | | This would be so much easier if we were using C++; we could simply use constructors and destructors. Instead, we have to update all the callers. While we're at it, ralloc various brw_wm_compile fields rather than explicitly calloc/free'ing them. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965/gen4: Move the GS state to state streaming.Eric Anholt2011-04-291-1/+0
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* i965/gen4: Move clip state to state streamingEric Anholt2011-04-291-1/+0
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* i965/gen4: Move VS state to state streaming.Eric Anholt2011-04-291-2/+0
| | | | Reviewed-by: Kenneth Graunke <[email protected]>