path: root/src/mesa/drivers/dri/i965/brw_draw.c
Commit message (Author, Age, Files, Lines)
* intel: extend current vertex buffers (Chris Wilson, 2011-02-21, 1 file, -0/+5)
| If the next vertex arrays are a (discontiguous) continuation of the current arrays, such that the new vertices are simply offset from the start of the current vertex buffer definitions, we can reuse those definitions and avoid the overhead of relocations and invalidations. Signed-off-by: Chris Wilson
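The continuation test described in this message can be sketched roughly as follows. This is a minimal illustration, not the driver's code: `struct vb_def`, its fields, and `vb_can_extend()` are hypothetical names, and the sketch assumes "continuation" means the same buffer and stride with a start offset that is a whole number of vertices past the current definition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of a vertex buffer already programmed into the hardware. */
struct vb_def {
    const void *bo;   /* backing buffer object (opaque here) */
    uint32_t offset;  /* byte offset of vertex 0 within bo */
    uint32_t stride;  /* bytes between consecutive vertices */
};

/*
 * Return true if a new array (bo, offset, stride) can reuse the current
 * definition, storing in *delta how many vertices past the current start
 * the new array begins. The caller would then bias its start vertex by
 * *delta instead of emitting a new buffer definition and relocation.
 */
static bool
vb_can_extend(const struct vb_def *cur, const void *bo,
              uint32_t offset, uint32_t stride, uint32_t *delta)
{
    assert(cur != NULL && delta != NULL);

    if (cur->bo != bo || cur->stride != stride || stride == 0)
        return false;               /* different buffer or layout */
    if (offset < cur->offset)
        return false;               /* starts before the current definition */
    if ((offset - cur->offset) % stride != 0)
        return false;               /* not a whole number of vertices away */

    *delta = (offset - cur->offset) / stride;
    return true;
}
```

When the check fails, the caller simply falls back to emitting a fresh vertex buffer definition.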
* i965: emit one vb packet per vbo (Chris Wilson, 2011-02-21, 1 file, -6/+21)
| Track reuse of the vertex buffer objects and so minimise the number of vertex buffers used by the hardware (and their relocations). Signed-off-by: Chris Wilson
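One way to picture the reuse tracking is a small deduplication pass over the enabled vertex elements. The sketch below is an assumption-laden illustration: `assign_vertex_buffers()` and `MAX_VBS` are hypothetical, and it compares only buffer identity, whereas a real driver would likely compare more state before sharing a buffer slot.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_VBS 16  /* illustrative limit, not the hardware's */

/*
 * Given the buffer object backing each enabled vertex element, record in
 * vb_index[i] which unique buffer slot element i uses, and return the
 * number of unique buffers, i.e. how many vertex buffer packets (and
 * relocations) would be needed.
 */
static size_t
assign_vertex_buffers(const void *const *elem_bo, size_t nr_elems,
                      const void **unique_bo, size_t *vb_index)
{
    size_t nr_unique = 0;

    for (size_t i = 0; i < nr_elems; i++) {
        size_t j;

        /* Look for a slot that already refers to this buffer. */
        for (j = 0; j < nr_unique; j++) {
            if (unique_bo[j] == elem_bo[i])
                break;
        }
        if (j == nr_unique) {
            assert(nr_unique < MAX_VBS);
            unique_bo[nr_unique++] = elem_bo[i];
        }
        vb_index[i] = j;
    }
    return nr_unique;
}
```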
* intel: use pwrite for batch (Chris Wilson, 2011-02-21, 1 file, -12/+7)
| It's faster. Not only is the memcpy more efficiently performed in the kernel (making up for the system call overhead), but by not using mmap we remove the greater overhead of tracking the vma of every batch. And it means we can read back from the batch buffer without incurring the cost of an uncached read through the GTT. Signed-off-by: Chris Wilson
* i965: Combine vb upload buffer with the general upload buffer (Chris Wilson, 2011-02-21, 1 file, -5/+0)
| Reuse the new common upload buffer for uploading temporary indices and rebuilt vertex arrays. Signed-off-by: Chris Wilson
* i965: Add support for using the BLT ring on gen6. (Eric Anholt, 2010-12-13, 1 file, -2/+3)
|
* intel: Annotate debug printout checks with unlikely(). (Eric Anholt, 2010-11-03, 1 file, -6/+4)
| This provides the optimizer with hints about code hotness, which we're quite certain about for debug printouts (or, rather, while we developers often hit the checks for debug printouts, we don't care about performance while doing so).
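For readers unfamiliar with the idiom, a common way to define `unlikely()` is on top of the GCC/Clang builtin `__builtin_expect`. The sketch below shows that pattern; the `debug_flags` variable, the `DEBUG_PRIMS` bit, and `draw()` are hypothetical stand-ins, not the driver's own names.

```c
#include <assert.h>
#include <stdio.h>

/* Branch hint: tell the compiler the condition is expected to be false. */
#if defined(__GNUC__)
#define unlikely(x) __builtin_expect(!!(x), 0)
#else
#define unlikely(x) (x)
#endif

#define DEBUG_PRIMS 0x1u          /* hypothetical debug flag bit */
static unsigned debug_flags = 0;  /* hypothetical; normally set at startup */

/* Returns the number of vertices "drawn"; logs only when debugging. */
static int
draw(int count)
{
    assert(count >= 0);

    /* The debug path is cold, so hint it out of the hot code layout. */
    if (unlikely(debug_flags & DEBUG_PRIMS))
        fprintf(stderr, "draw: %d vertices\n", count);

    return count;
}
```

The hint changes only code layout and prediction, never the result of the condition.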
* Revert "i965: fallback lineloop on sandybridge for now" (Zhenyu Wang, 2010-10-14, 1 file, -7/+0)
| This reverts commit 73dab75b4165f7d2214a68d4ba8e3cb7aab9b4ac.
* Drop GLcontext typedef and use struct gl_context instead (Kristian Høgsberg, 2010-10-13, 1 file, -5/+5)
|
* i965: Don't rebase the index buffer to min 0 if any arrays are in VBOs. (Eric Anholt, 2010-10-12, 1 file, -1/+1)
| There was a check to only do the rebase if we didn't have everything in VBOs, but nexuiz apparently hands us a mix of VBOs and arrays, resulting in blocking on the GPU to do a rebase. Improves nexuiz 800x600, high-settings performance on my Ironlake by 41% (+/- 1.3%), from 14.0fps to 19.7fps.
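The change in the rebase condition can be illustrated as a predicate over the bound arrays. This is only a sketch of the rule stated in the subject line: `struct array_src` and `should_rebase()` are hypothetical, and a non-NULL `bo` stands in for "this array lives in a VBO".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical view of one vertex array: non-NULL bo means it is in a VBO. */
struct array_src {
    const void *bo;
};

/*
 * Rebase indices to start at 0 only when it can pay off: min_index is
 * nonzero and no array is in a VBO. The old rule ("not everything is in
 * a VBO") also rebased mixed VBO/client-array draws, which is the case
 * the message describes as blocking on the GPU.
 */
static bool
should_rebase(const struct array_src *arrays, size_t nr, unsigned min_index)
{
    assert(arrays != NULL || nr == 0);

    if (min_index == 0)
        return false;
    for (size_t i = 0; i < nr; i++) {
        if (arrays[i].bo != NULL)
            return false;   /* any VBO-backed array disables the rebase */
    }
    return true;
}
```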
* i965: fallback lineloop on sandybridge for now (Zhenyu Wang, 2010-09-29, 1 file, -0/+7)
| Until we fix the GS hang issue.
* i965: Move no_batch_wrap assertion out across the area we're trying to verify. (Eric Anholt, 2010-06-11, 1 file, -5/+3)
| It's more likely that we wrap badly in state setup than in the little primitive packet.
* intel: Change dri_bo_* to drm_intel_bo* to consistently use new API. (Eric Anholt, 2010-06-08, 1 file, -4/+4)
| The slightly less mechanical change of converting the emit_reloc calls will follow.
* i965: Reduce a single GL_QUADS to GL_TRIANGLE_FAN. (Eric Anholt, 2010-05-13, 1 file, -11/+20)
| This is similar to the GL_QUAD_STRIP -> TRIANGLE_STRIP optimization -- the GS usage to split the quads into tris is a huge bottleneck, so a quick check improves glean blendFunc time massively (width * height of the window of single-pixel GL_QUADS, many many times). This may also end up helping with cairo performance, which sometimes ends up drawing a single quad.
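The "quick check" can be sketched as a primitive-mode rewrite. The constants below mirror the standard OpenGL enum values but use a local `PRIM_` prefix so the block needs no GL headers; `reduce_prim()` and its `flat_shaded` parameter are hypothetical. The flat-shading guard is an assumption added for correctness of the sketch: a quad and a two-triangle fan take their flat colour from different vertices, so the rewrite is only safe with smooth shading.

```c
#include <assert.h>
#include <stdbool.h>

/* Local copies of the OpenGL primitive enum values. */
#define PRIM_TRIANGLE_STRIP 0x0005u
#define PRIM_TRIANGLE_FAN   0x0006u
#define PRIM_QUADS          0x0007u
#define PRIM_QUAD_STRIP     0x0008u

/*
 * A lone quad v0 v1 v2 v3 covers the same area as the fan
 * (v0 v1 v2), (v0 v2 v3), so it can skip the quad-splitting path.
 * Quad strips already share their vertex order with triangle strips.
 */
static unsigned
reduce_prim(unsigned mode, unsigned count, bool flat_shaded)
{
    assert(mode <= PRIM_QUAD_STRIP + 1u);  /* sanity check for this sketch */

    if (flat_shaded)
        return mode;
    if (mode == PRIM_QUADS && count == 4)
        return PRIM_TRIANGLE_FAN;
    if (mode == PRIM_QUAD_STRIP)
        return PRIM_TRIANGLE_STRIP;
    return mode;
}
```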
* Replace the _mesa_*printf() wrappers with the plain libc versions (Kristian Høgsberg, 2010-02-19, 1 file, -2/+2)
|
* intel: Implement the DRI2 invalidate function properly (Kristian Høgsberg, 2010-02-17, 1 file, -0/+2)
| This uses a stamp mechanism to mark the DRI drawable as invalid. Instead of immediately updating the buffers we just bump the drawable stamp and call out to DRI2GetBuffers "later". "Later" used to be at LOCK_HARDWARE time, and this patch brings back callouts at the points where we used to call LOCK_HARDWARE. A new function, intel_prepare_render(), is called where we used to call LOCK_HARDWARE, and if the buffers are invalid, we call out to DRI2GetBuffers there. This lets us invalidate buffers only when notified instead of on every glViewport() call. If the loader calls the DRI invalidate entrypoint, we disable viewport triggered buffer invalidation. Additionally, we can clean up the old viewport mechanism a bit, since we can just invalidate the buffers and not worry about reentrancy and whatnot.
* Merge branch 'mesa_7_7_branch' (Brian Paul, 2010-01-25, 1 file, -2/+0)
|\ Conflicts: src/mesa/drivers/dri/intel/intel_screen.c, src/mesa/drivers/dri/intel/intel_swapbuffers.c, src/mesa/drivers/dri/r300/r300_emit.c, src/mesa/drivers/dri/r300/r300_ioctl.c, src/mesa/drivers/dri/r300/r300_tex.c, src/mesa/drivers/dri/r300/r300_texstate.c
| * i965: Remove unnecessary headers. (Vinson Lee, 2010-01-22, 1 file, -2/+0)
| |
* | intel: Drop more cliprect bookkeeping (Kristian Høgsberg, 2010-01-04, 1 file, -4/+0)
| |
* | intel: Drop batchbuffer cliprect_mode tracking (Kristian Høgsberg, 2010-01-04, 1 file, -3/+2)
| |
* | intel: Drop LOCK/UNLOCK_HARDWARE() (Kristian Høgsberg, 2010-01-04, 1 file, -4/+0)
| |
* | intel: Consistently use no_batch_wrap in intel_context struct. (Eric Anholt, 2009-11-19, 1 file, -2/+2)
|/
* intel: Use PIPE_CONTROL on gen4 hardware for doing pipeline flushing. (Eric Anholt, 2009-11-06, 1 file, -6/+2)
| This should do all the things that MI_FLUSH did, but it can be pipelined so that further rendering isn't blocked on the flush completion unless necessary.
* Merge branch 'mesa_7_6_branch' (Brian Paul, 2009-09-09, 1 file, -0/+1)
|\
| * Merge branch 'mesa_7_5_branch' into mesa_7_6_branch (Brian Paul, 2009-09-09, 1 file, -0/+1)
| |\ Conflicts: Makefile, configs/default, progs/glsl/Makefile, src/gallium/auxiliary/util/u_simple_shaders.c, src/gallium/state_trackers/glx/xlib/xm_api.c, src/mesa/drivers/dri/i965/brw_draw_upload.c, src/mesa/drivers/dri/i965/brw_vs_emit.c, src/mesa/drivers/dri/intel/intel_context.h, src/mesa/drivers/dri/intel/intel_pixel.c, src/mesa/drivers/dri/intel/intel_pixel_read.c, src/mesa/main/texenvprogram.c, src/mesa/main/version.h
| | * i965: fix incorrect test for vertex position attribute (Brian Paul, 2009-09-08, 1 file, -0/+1)
| | |
* | | intel: Add support for ARB_draw_elements_base_vertex. (Eric Anholt, 2009-09-08, 1 file, -1/+1)
| | | On the 965, we just drop the value into the primitive packet. On non-945, we rely on the sw tnl code handling it.
* | | i965: #include clean-ups (Brian Paul, 2009-09-08, 1 file, -7/+4)
|/ /
* | i965: Avoid re-uploading the index buffer when we don't need to. (Eric Anholt, 2009-08-12, 1 file, -0/+2)
| | No performance difference proven at 95% confidence with my GLSL demo (n=10).
* | vbo: Avoid extra validation of DrawElements. (Eric Anholt, 2009-08-12, 1 file, -38/+15)
| | This saves mapping the index buffer to get bounds on the indices that drivers just drop on the floor in the VBO case (cache win), saves a bonus walk of the indices in the CheckArrayBounds case, and other miscellaneous validation. On intel it's a particularly large win (50-100% in my app) because even though we let the indices stay in both CPU and GPU caches, we still end up waiting for the GPU to be done with the buffer before reading from it. Drivers that want the min/max_index fields must now check index_bounds_valid and use vbo_get_minmax_index before using them.
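The contract in the last sentence (check `index_bounds_valid`, compute the bounds only if needed) can be sketched as below. The struct and `ensure_index_bounds()` are hypothetical and cover only 16-bit indices; the real `vbo_get_minmax_index` is named in the message, but its signature is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical draw description carrying lazily computed index bounds. */
struct draw_indices {
    const uint16_t *indices;
    size_t count;
    bool index_bounds_valid;
    unsigned min_index, max_index;
};

/*
 * Walk the indices only when the caller did not already supply bounds.
 * Drivers that never look at min/max simply never call this, and so
 * never have to read (or wait for) the index buffer.
 */
static void
ensure_index_bounds(struct draw_indices *d)
{
    assert(d != NULL && d->indices != NULL && d->count > 0);

    if (d->index_bounds_valid)
        return;

    unsigned lo = d->indices[0];
    unsigned hi = d->indices[0];
    for (size_t i = 1; i < d->count; i++) {
        if (d->indices[i] < lo)
            lo = d->indices[i];
        if (d->indices[i] > hi)
            hi = d->indices[i];
    }
    d->min_index = lo;
    d->max_index = hi;
    d->index_bounds_valid = true;
}
```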
* | i965: Remove BRW_NEW_INPUT_VARYING (Eric Anholt, 2009-07-07, 1 file, -7/+1)
| | This state flag has been unused since the ffvertex_prog move to core.
* | intel: Move note_unlock() implementation to the one place it's needed. (Eric Anholt, 2009-06-29, 1 file, -0/+2)
|/
* intel: Add always_flush_batch driconf option for making small batchbuffers. (Eric Anholt, 2009-03-05, 1 file, -0/+2)
| This can improve debugging with INTEL_DEBUG=batch,sync by giving smaller batchbuffers.
* intel: Add always_flush_cache driconf option for debugging cache flush failure. (Eric Anholt, 2009-03-05, 1 file, -0/+18)
| I keep wanting to hack this knob in as a one-time thing, so it seemed useful to have all the time.
* i965: add software fallback for conformant 3D textures and GL_CLAMP (Robert Ellison, 2009-03-04, 1 file, -5/+24)
| The i965 hardware cannot do GL_CLAMP behavior on textures; an earlier commit forced a software fallback if strict conformance was required (i.e. the INTEL_STRICT_CONFORMANCE environment variable was set) and 2D textures were used, but it was somewhat flawed - it could trigger the software fallback even if 2D textures weren't enabled, as long as one texture unit was enabled. This fixes that, and adds software fallback for GL_CLAMP behavior with 1D and 3D textures. It also adds support for a particular setting of the INTEL_STRICT_CONFORMANCE environment variable, which forces software fallbacks to be taken *all* the time. This is helpful with debugging. The value is: export INTEL_STRICT_CONFORMANCE=2
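A rough model of the environment-variable behaviour described here is sketched below. The enum, `parse_strict_conformance()`, and `needs_sw_fallback()` are hypothetical names; treating every set value other than `2` as plain strict mode is an assumption, since the message only says the variable "was set" and singles out the value 2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

enum conformance_mode {
    CONFORM_OFF = 0,             /* variable unset */
    CONFORM_STRICT = 1,          /* fall back only where hardware is non-conformant */
    CONFORM_ALWAYS_FALLBACK = 2  /* INTEL_STRICT_CONFORMANCE=2: always fall back */
};

/* Interpret the raw variable value (NULL when unset). */
static enum conformance_mode
parse_strict_conformance(const char *value)
{
    if (value == NULL)
        return CONFORM_OFF;
    if (strtol(value, NULL, 10) == 2)
        return CONFORM_ALWAYS_FALLBACK;
    return CONFORM_STRICT;  /* assumption: any other set value means strict */
}

/* Read the mode from the process environment. */
static enum conformance_mode
strict_conformance_from_env(void)
{
    return parse_strict_conformance(getenv("INTEL_STRICT_CONFORMANCE"));
}

/* Decide whether a draw using (or not using) GL_CLAMP must go to software. */
static bool
needs_sw_fallback(enum conformance_mode mode, bool enabled_texture_uses_gl_clamp)
{
    assert(mode >= CONFORM_OFF && mode <= CONFORM_ALWAYS_FALLBACK);

    if (mode == CONFORM_ALWAYS_FALLBACK)
        return true;
    return mode == CONFORM_STRICT && enabled_texture_uses_gl_clamp;
}
```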
* i965: texture fixes: bordered textures, fallback rendering (Robert Ellison, 2009-02-27, 1 file, -3/+31)
| i965 doesn't natively support GL_CLAMP; it treats it like GL_CLAMP_TO_EDGE, which fails conformance tests. This fix adds a clause to the check_fallbacks() test to check whether GL_CLAMP is in use on any enabled 2D texture. If so, and if strict conformance is required (via INTEL_STRICT_CONFORMANCE), a software fallback is mandated. In addition, validate textures *before* checking for fallbacks, rather than after; otherwise, the texture state is never validated and can't be trusted. (In particular, if texturing is enabled and the sampler would access any level beyond level 0 of a texture, the sampler will segfault, because texture validation sets the firstLevel and lastLevel fields of a texture object so that the valid levels will be mapped and accessed correctly. If texture validation doesn't occur, only level 0 is accessed correctly, and that only because firstLevel and lastLevel happen to be set to 0.)
* i965: fix line stipple fallback for GL_LINE_STRIP primitives (Robert Ellison, 2009-02-23, 1 file, -1/+1)
| When doing line stipple, the stipple count resets on each line segment, unless the primitive is a GL_LINE_LOOP or a GL_LINE_STRIP. The existing code correctly identifies the need for a software fallback to handle conformant line stipple on GL_LINE_LOOP primitives, but neglects to make the same assessment on GL_LINE_STRIP primitives. This fixes it so they match.
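The corrected condition can be illustrated as a predicate on the primitive mode. The `PRIM_` constants mirror the standard OpenGL enum values under local names, and `stipple_needs_fallback()` is a hypothetical helper; the real check sits among other fallback conditions not shown in this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Local copies of the OpenGL line primitive enum values. */
#define PRIM_LINES      0x0001u
#define PRIM_LINE_LOOP  0x0002u
#define PRIM_LINE_STRIP 0x0003u

/*
 * With stippling on, independent lines restart the stipple count per
 * segment, but loops and strips must carry it across segments, which is
 * the case the message says requires a software fallback.
 */
static bool
stipple_needs_fallback(bool stipple_enabled, unsigned mode)
{
    assert(mode >= PRIM_LINES && mode <= PRIM_LINE_STRIP);

    return stipple_enabled &&
           (mode == PRIM_LINE_LOOP || mode == PRIM_LINE_STRIP);
}
```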
* i965: Remove brw->attribs now that we can just always look in the GLcontext. (Eric Anholt, 2009-02-02, 1 file, -9/+12)
|
* i965: Delete old metaops code now that there are no remaining consumers. (Eric Anholt, 2009-02-02, 1 file, -5/+2)
|
* i965: Update state before checking for fallbacks in brw_try_draw_prims. (Eric Anholt, 2008-12-15, 1 file, -2/+2)
| This got flipped around in 7855b2aef6bd9e9c2d73260b5cd166159b2525c6. Bug #18907. Thanks to idr for pointing me at a nicer testcase than blender.
* i965: Reduce fast-pathiness of brw_try_draw_prims, bringing in important checks. (Eric Anholt, 2008-11-28, 1 file, -51/+52)
| Later primitives, even if they caused a full state validate, wouldn't check that there was enough space in the batchbuffer, occasionally triggering the sanity check. We also skipped the aperture space check, even if it would mean bringing in new programs and associated state.
* i965: Upload state on primitive switch, don't just prepare it. (Eric Anholt, 2008-11-12, 1 file, -0/+1)
| This was a regression in 59b2c2adbbece27ccf54e58b598ea29cb3a5aa85 that broke blender, among other apps.
* i965: Fix check_aperture calls to cover everything needed for the prim at once. (Eric Anholt, 2008-10-28, 1 file, -2/+30)
| Previously, since my check_aperture API change, we would check each piece of state against the batchbuffer individually, but not all the state against the batchbuffer at once. In addition to not being terribly useful in assuring success, it probably also increased CPU load by calling check_aperture many times per primitive.
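The difference between per-buffer and all-at-once checking is easy to see in a toy model. The function below is a hypothetical sketch, not the bufmgr API: it sums the sizes of every buffer a primitive needs, plus the batch itself, and compares the total against an assumed aperture size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return true if the batch plus every buffer it references fit in the
 * aperture together. Checking each buffer alone against the batch can
 * pass even when the combined set does not fit.
 */
static bool
fits_in_aperture(const uint64_t *bo_sizes, size_t nr_bos,
                 uint64_t batch_size, uint64_t aperture_size)
{
    assert(bo_sizes != NULL || nr_bos == 0);

    uint64_t total = batch_size;
    if (total > aperture_size)
        return false;

    for (size_t i = 0; i < nr_bos; i++) {
        /* Compare against the remaining room to avoid overflow. */
        if (bo_sizes[i] > aperture_size - total)
            return false;
        total += bo_sizes[i];
    }
    return true;
}
```

With three 40-unit buffers, a 10-unit batch, and a 100-unit aperture, each buffer passes on its own (50 units) while the combined set (130 units) does not, which is the failure mode the message describes.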
* intel: Don't keep intel->pClipRects, and instead just calculate it when needed. (Eric Anholt, 2008-10-28, 1 file, -13/+10)
| This avoids issues with dereferencing stale cliprects around intel_draw_buffer time. Additionally, take advantage of cliprects staying constant for FBOs and DRI2, and emit cliprects in the batchbuffer instead of having to flush batch each time they change.
* i965: Add ARB_occlusion_query support. (Eric Anholt, 2008-10-07, 1 file, -1/+1)
|
* intel: Fix a number of memory leaks on context destroy. (Eric Anholt, 2008-09-26, 1 file, -0/+10)
|
* i965: Cope with batch getting flushed in the middle of batchbuffer emits. (Eric Anholt, 2008-09-23, 1 file, -3/+6)
| This isn't required for GEM (at least, yet), but the check_aperture code for non-GEM results in batch getting flushed during emit. brw_state_upload restarts state emits, but a bunch of the state emit functions were assuming that they would be called exactly once, after prepare and before new_batch. Bug #17179.
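The restart behaviour, and why emit functions must tolerate being called more than once, can be modelled with a toy batch. Everything here is hypothetical (`struct batch`, `batch_require_space()`, `upload_state()`); it assumes a flush simply empties the batch and that all state for one draw fits in an empty batch.

```c
#include <assert.h>
#include <stddef.h>

/* Toy batchbuffer: a flush empties it and bumps a counter. */
struct batch {
    unsigned used, size, flushes;
};

/* Make room for n units, flushing the batch if it would overflow. */
static void
batch_require_space(struct batch *b, unsigned n)
{
    if (b->used + n > b->size) {
        b->used = 0;
        b->flushes++;
    }
}

/*
 * Emit every state atom into the batch. If a flush happens part-way,
 * the atoms already emitted are gone, so start over from the first one.
 * Returns the number of passes taken, which is why each atom's emit
 * must be safe to repeat.
 */
static unsigned
upload_state(struct batch *b, const unsigned *atom_sizes, size_t nr)
{
    unsigned total = 0;
    for (size_t i = 0; i < nr; i++)
        total += atom_sizes[i];
    assert(total <= b->size);  /* guarantees the retry loop terminates */

    unsigned passes = 0;
    for (;;) {
        unsigned start_flushes = b->flushes;
        size_t i;

        passes++;
        for (i = 0; i < nr; i++) {
            batch_require_space(b, atom_sizes[i]);
            if (b->flushes != start_flushes)
                break;              /* batch was flushed under us: restart */
            b->used += atom_sizes[i];
        }
        if (i == nr)
            return passes;
    }
}
```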
* mesa: added "main/" prefix to includes, remove some -I paths from Makefile.template (Brian Paul, 2008-09-18, 1 file, -5/+5)
* intel: track move of bo_exec from drivers to bufmgr. (Eric Anholt, 2008-09-10, 1 file, -1/+0)
|
* Revert "Revert "Merge branch 'drm-gem'"" (Dave Airlie, 2008-08-24, 1 file, -56/+16)
| This reverts commit 7c81124d7c4a4d1da9f48cbf7e82ab1a3a970a7a.
* Revert "Merge branch 'drm-gem'" (Dave Airlie, 2008-08-24, 1 file, -16/+56)
| This reverts commit 53675e5c05c0598b7ea206d5c27dbcae786a2c03. Conflicts: src/mesa/drivers/dri/i965/brw_wm_surface_state.c