path: root/src/mesa/drivers/dri/i965/brw_draw.c
Commit message  Author  Age  Files  Lines
* intel: Convert from GLboolean to 'bool' from stdbool.h.  Kenneth Graunke  2011-10-18  1  -10/+10
|
| I initially produced the patch using this bash command:
|
| for file in {intel,i915,i965}/*.{c,cpp,h}; do
|    [ ! -h $file ] && sed -i 's/GLboolean/bool/g' $file &&
|    sed -i 's/GL_TRUE/true/g' $file &&
|    sed -i 's/GL_FALSE/false/g' $file
| done
|
| Then I manually added #include <stdbool.h> to fix compilation errors, and
| converted a few functions back to GLboolean that were used in core Mesa's
| function pointer table to avoid "incompatible pointer" warnings.
|
| Finally, I cleaned up some whitespace issues introduced by the change.
|
| Signed-off-by: Kenneth Graunke <[email protected]>
| Acked-by: Chad Versace <[email protected]>
| Acked-by: Paul Berry <[email protected]>
* i965: Change type of brw_context.primitive from GLenum to hardware primitive  Chad Versace  2011-10-10  1  -23/+20
|
| For example, GL_TRIANGLES is converted to _3DPRIM_TRILIST.
|
| The conversion is necessary because HiZ and MSAA resolve operations emit a
| 3DPRIM_RECTLIST, which cannot be conveyed by GLenum.
|
| As a consequence, brw_gs_prog_key.primitive is also converted.
|
| v2
| ----
| - [anholt] Split brw_set_prim into brw/gen6 variants in previous commit,
|   since not much code is really shared between the two.
| - [anholt] Replace switch statements with table lookups, since this is
|   a hot path.
|
| Reviewed-by: Eric Anholt <[email protected]>
| Signed-off-by: Chad Versace <[email protected]>
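
The table lookup mentioned in the v2 notes above can be sketched roughly as follows. This is a minimal illustration, not the driver's code: the HW_PRIM_* names and their numeric values are hypothetical stand-ins for the _3DPRIM_* hardware codes, and translate_prim() is an invented helper. Only the GL primitive mode values are taken from the OpenGL headers.

```c
#include <stddef.h>

/* GL primitive modes, with the values OpenGL assigns to them. */
enum {
   GL_POINTS         = 0x0000,
   GL_LINES          = 0x0001,
   GL_LINE_LOOP      = 0x0002,
   GL_LINE_STRIP     = 0x0003,
   GL_TRIANGLES      = 0x0004,
   GL_TRIANGLE_STRIP = 0x0005,
   GL_TRIANGLE_FAN   = 0x0006,
   GL_QUADS          = 0x0007,
   GL_QUAD_STRIP     = 0x0008,
   GL_POLYGON        = 0x0009
};

/* Hypothetical hardware primitive codes. The numeric values are
 * illustrative only and are not the real _3DPRIM_* encodings. */
enum hw_prim {
   HW_PRIM_INVALID = 0,
   HW_PRIM_POINTLIST,
   HW_PRIM_LINELIST,
   HW_PRIM_LINELOOP,
   HW_PRIM_LINESTRIP,
   HW_PRIM_TRILIST,
   HW_PRIM_TRISTRIP,
   HW_PRIM_TRIFAN,
   HW_PRIM_QUADLIST,
   HW_PRIM_QUADSTRIP,
   HW_PRIM_POLYGON
};

/* One array load per draw call instead of a switch statement. */
static const unsigned prim_to_hw_prim[GL_POLYGON + 1] = {
   [GL_POINTS]         = HW_PRIM_POINTLIST,
   [GL_LINES]          = HW_PRIM_LINELIST,
   [GL_LINE_LOOP]      = HW_PRIM_LINELOOP,
   [GL_LINE_STRIP]     = HW_PRIM_LINESTRIP,
   [GL_TRIANGLES]      = HW_PRIM_TRILIST,
   [GL_TRIANGLE_STRIP] = HW_PRIM_TRISTRIP,
   [GL_TRIANGLE_FAN]   = HW_PRIM_TRIFAN,
   [GL_QUADS]          = HW_PRIM_QUADLIST,
   [GL_QUAD_STRIP]     = HW_PRIM_QUADSTRIP,
   [GL_POLYGON]        = HW_PRIM_POLYGON
};

/* Returns HW_PRIM_INVALID for modes outside the table. */
static unsigned translate_prim(unsigned gl_prim)
{
   if (gl_prim > GL_POLYGON)
      return HW_PRIM_INVALID;
   return prim_to_hw_prim[gl_prim];
}
```

A hardware-only primitive such as a rectangle list has no GL mode, which is why storing the translated value rather than the GLenum lets internal operations share the same field.
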
* i965: Split brw_set_prim into brw/gen6 variants  Chad Versace  2011-10-10  1  -1/+18
|
| The "slight optimization to avoid the GS program" in brw_set_prim() is not
| used by Gen 6, since Gen 6 doesn't use a GS program. Also, Gen 6 doesn't
| use reduced primitives.
|
| Also, document that intel_context.reduced_primitive is only used for
| Gen < 6.
|
| Reviewed-by: Eric Anholt <[email protected]>
| Signed-off-by: Chad Versace <[email protected]>
* i965: Don't bother telling tnl about state updates unless we fall back.  Eric Anholt  2011-06-24  1  -0/+1
|
| This was sucking up 1% of the CPU on 3DMMES.
|
| Reviewed-by: Ian Romanick <[email protected]>
* i965/gen6: Limit the workaround flush to once per primitive.  Eric Anholt  2011-06-20  1  -0/+2
|
| We're about to call this function in a bunch of state emits, so let's not
| spam the hardware with flushes too hard.
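
One way to get "once per primitive" behavior is a flag that is armed when a primitive starts and cleared by the first flush. The sketch below assumes that approach; the struct, field, and function names are hypothetical and the counter merely stands in for emitting a real flush command.

```c
#include <stdbool.h>

/* Hypothetical context: only the fields this sketch needs. */
struct sketch_context {
   bool need_workaround_flush; /* armed at the start of each primitive */
   int flushes_emitted;        /* stands in for real flush commands */
};

/* Called once when a new primitive is about to be drawn. */
static void begin_primitive(struct sketch_context *ctx)
{
   ctx->need_workaround_flush = true;
}

/* Safe to call from many state emitters; only the first call after
 * begin_primitive() actually emits a flush. */
static void emit_workaround_flush(struct sketch_context *ctx)
{
   if (!ctx->need_workaround_flush)
      return;

   ctx->need_workaround_flush = false;
   ctx->flushes_emitted++;
}
```
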
* i965: Drop remaining strict conformance fallback for GL_POINT_SMOOTH.  Eric Anholt  2011-06-03  1  -30/+0
|
| We actually could do this in hardware in the fragment shader using
| gl_PointCoord and the point's size.
* i965: Drop strict conformance fallback for GL_LINE_STIPPLE.  Eric Anholt  2011-06-03  1  -18/+0
|
| We implement line stipples, just not *quite* correctly. We have a piglit
| testcase to use when we want to fix it, if we do. Until then, don't lie
| to our test suites.
* i965: Drop strict conformance fallback for GL_LINE_SMOOTH.  Eric Anholt  2011-06-03  1  -9/+0
|
| We do have hardware antialiased lines. If we care, we should actually fix
| them to be conformant (or as close as possible) instead of using this
| knob to fool testcases using swrast.
|
| For some interesting reading on the state of GL_*_SMOOTH across several
| drivers, see:
| http://homepage.mac.com/arekkusu/bugs/invariance/HWAA.html
* i965: Drop strict conformance fallback for GL_POLYGON_SMOOTH.  Eric Anholt  2011-06-03  1  -6/+0
|
| From my reading of the GL 2.1 spec, no antialiasing is strictly conformant
| for polygon smoothing. Yes, it's absurd, but then, hardware doesn't
| support this so maybe it's not so absurd.
* i965: Drop INTEL_CONFORMANCE=2 fallback code.  Eric Anholt  2011-06-03  1  -3/+0
|
| This was just a duplicate of the no_rast=true driconf option, which is
| relatively standard across drivers.
* i965: Add support for correct GL_CLAMP behavior by clamping coordinates.  Eric Anholt  2011-05-18  1  -36/+0
|
| This removes the stupid strict-conformance fallback code I broke when
| adding ARB_sampler_objects.
|
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=36572
| Reviewed-by: Kenneth Graunke <[email protected]> (v1)
* i965: Emit 3DPRIMITIVE Ivybridge-style.  Kenneth Graunke  2011-05-17  1  -1/+59
|
| Signed-off-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Eric Anholt <[email protected]>
* i965: Move sampler state to state streaming.  Eric Anholt  2011-04-29  1  -0/+3
|
| Overall, across this series since the last set of numbers, gen6 3DMMES
| taiji performance has dropped 0.8% +/- 0.3% (n=15), probably due to the
| increased reissuing of state from some of the state objects that
| otherwise never changed, and increased occurrence of the per-batch
| overhead as we've increased how much we put in the batch BO without
| increasing the batch BO's size.
|
| Reviewed-by: Kenneth Graunke <[email protected]>
* i965/gen6: Stream the VS push constants.  Eric Anholt  2011-04-29  1  -0/+1
|
| Improves 3DMMES taiji demo performance by 10.1% +/- 0.9% (n=15).
|
| Reviewed-by: Kenneth Graunke <[email protected]>
* i965/gen6: Stream the WM push constants.  Eric Anholt  2011-04-29  1  -5/+7
|
| Improves 3DMMES taiji demo performance by 5.1% +/- 1.9% (n=15), by
| reducing CPU time spent thrashing around those tiny little constant BOs.
|
| Reviewed-by: Kenneth Graunke <[email protected]>
* intel: Add support for ARB_sampler_objects.  Eric Anholt  2011-04-23  1  -6/+10
|
| This extension support consists of replacing "gl_texture_obj->Sampler."
| with "_mesa_get_samplerobj(ctx, unit)->". One instance of referencing the
| texture's base sampler remains in the initial miptree allocation, where
| I'm not sure we have a clear association with any texture unit.
|
| Tested with piglit ARB_sampler_objects/sampler-objects.
|
| Reviewed-by: Brian Paul <[email protected]>
* i965: Add support for NV_conditional_render.  Eric Anholt  2011-04-23  1  -0/+4
|
| Since we lack hardware support for it, this is a simple matter of
| checking _mesa_check_conditional_render at the entrypoints, and
| suppressing it for the metaops where it doesn't apply.
|
| Reviewed-by: Brian Paul <[email protected]>
* i965: Convert 3DPRIMITIVE command from struct-style to OUT_BATCH style.  Kenneth Graunke  2011-04-18  1  -22/+31
|
| Most of the newer portions of the code use OUT_BATCH style. I prefer this
| style because it offers a clear distinction between a) hardware
| messages/structures with a mandatory format, and b) data structures for
| our own internal use that we can format however we want.
|
| Signed-off-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Eric Anholt <[email protected]>
* mesa: move sampler state into new gl_sampler_object type  Brian Paul  2011-04-10  1  -6/+6
|
| gl_texture_object contains an instance of this type for the regular
| texture object sampling state. glGenSamplers() generates new instances of
| gl_sampler_object which can override that state with glBindSampler().
* intel: extend current vertex buffers  Chris Wilson  2011-02-21  1  -0/+5
|
| If the next vertex arrays are a (discontiguous) continuation of the
| current arrays, such that the new vertices are simply offset from the
| start of the current vertex buffer definitions, we can reuse those
| definitions and avoid the overhead of relocations and invalidations.
|
| Signed-off-by: Chris Wilson <[email protected]>
* i965: emit one vb packet per vbo  Chris Wilson  2011-02-21  1  -6/+21
|
| Track reuse of the vertex buffer objects and so minimise the number of
| vertex buffers used by the hardware (and their relocations).
|
| Signed-off-by: Chris Wilson <[email protected]>
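
The idea of tracking buffer reuse can be illustrated by deduplicating the buffer objects referenced by a set of vertex arrays. This is a simplified sketch under stated assumptions: the struct and function names are hypothetical, arrays are matched on the buffer pointer alone, and any further compatibility conditions a real driver might check (such as stride) are ignored.

```c
#include <stddef.h>

#define SKETCH_MAX_ARRAYS 16

/* Hypothetical vertex array: which buffer object backs it, and which
 * hardware vertex buffer slot it ends up using. */
struct sketch_array {
   const void *bo;
   int buffer_index;
};

/* Assigns each array a buffer slot, sharing a slot between arrays that
 * live in the same buffer object. Returns the number of distinct slots,
 * i.e. how many vertex buffer entries would need to be emitted. */
static int assign_vertex_buffers(struct sketch_array *arrays, int count)
{
   const void *buffers[SKETCH_MAX_ARRAYS];
   int nr_buffers = 0;
   int i, j;

   if (count > SKETCH_MAX_ARRAYS)
      count = SKETCH_MAX_ARRAYS;

   for (i = 0; i < count; i++) {
      /* Look for a slot that already refers to this buffer object. */
      for (j = 0; j < nr_buffers; j++) {
         if (buffers[j] == arrays[i].bo)
            break;
      }
      if (j == nr_buffers)
         buffers[nr_buffers++] = arrays[i].bo;

      arrays[i].buffer_index = j;
   }

   return nr_buffers;
}
```

With interleaved attributes stored in a single VBO, every array maps to slot 0 and only one buffer entry (and one relocation) is needed.
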
* intel: use pwrite for batch  Chris Wilson  2011-02-21  1  -12/+7
|
| It's faster. Not only is the memcpy more efficiently performed in the
| kernel (making up for the system call overhead), but by not using mmap we
| remove the greater overhead of tracking the vma of every batch. And it
| means we can read back from the batch buffer without incurring the cost
| of an uncached read through the GTT.
|
| Signed-off-by: Chris Wilson <[email protected]>
* i965: Combine vb upload buffer with the general upload buffer  Chris Wilson  2011-02-21  1  -5/+0
|
| Reuse the new common upload buffer for uploading temporary indices and
| rebuilt vertex arrays.
|
| Signed-off-by: Chris Wilson <[email protected]>
* i965: Add support for using the BLT ring on gen6.  Eric Anholt  2010-12-13  1  -2/+3
|
* intel: Annotate debug printout checks with unlikely().  Eric Anholt  2010-11-03  1  -6/+4
|
| This provides the optimizer with hints about code hotness, which we're
| quite certain about for debug printouts (or, rather, while we developers
| often hit the checks for debug printouts, we don't care about performance
| while doing so).
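
A common way to define such an annotation is on top of the GCC/Clang builtin __builtin_expect. The sketch below shows that pattern; the debug_flags variable, the DEBUG_PRIMS bit, and the sketch_draw() function are hypothetical and exist only to show a guarded debug printout.

```c
#include <stdio.h>

/* Branch hint: tell the compiler the condition is almost always false.
 * Falls back to a plain expression on compilers without the builtin. */
#if defined(__GNUC__)
#define unlikely(x) __builtin_expect(!!(x), 0)
#else
#define unlikely(x) (x)
#endif

#define DEBUG_PRIMS 0x1        /* hypothetical debug flag bit */

static unsigned debug_flags;   /* hypothetical global debug mask */

/* The debug printout is kept off the hot path by the hint. */
static int sketch_draw(int vertex_count)
{
   if (unlikely(debug_flags & DEBUG_PRIMS))
      fprintf(stderr, "draw: %d vertices\n", vertex_count);

   return vertex_count;
}
```

The hint changes only code layout and branch prediction assumptions, not behavior: the printout still happens whenever the flag is set.
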
* Revert "i965: fallback lineloop on sandybridge for now"  Zhenyu Wang  2010-10-14  1  -7/+0
|
| This reverts commit 73dab75b4165f7d2214a68d4ba8e3cb7aab9b4ac.
* Drop GLcontext typedef and use struct gl_context instead  Kristian Høgsberg  2010-10-13  1  -5/+5
|
* i965: Don't rebase the index buffer to min 0 if any arrays are in VBOs.  Eric Anholt  2010-10-12  1  -1/+1
|
| There was a check to only do the rebase if we didn't have everything in
| VBOs, but nexuiz apparently hands us a mix of VBOs and arrays, resulting
| in blocking on the GPU to do a rebase.
|
| Improves nexuiz 800x600, high-settings performance on my Ironlake 41%
| (+/- 1.3%), from 14.0fps to 19.7fps.
* i965: fallback lineloop on sandybridge for now  Zhenyu Wang  2010-09-29  1  -0/+7
|
| Until we fix the GS hang issue.
* i965: Move no_batch_wrap assertion out across the area we're trying to verify.  Eric Anholt  2010-06-11  1  -5/+3
|
| It's more likely that we wrap badly in state setup than in the little
| primitive packet.
* intel: Change dri_bo_* to drm_intel_bo* to consistently use new API.  Eric Anholt  2010-06-08  1  -4/+4
|
| The slightly less mechanical change of converting the emit_reloc calls
| will follow.
* i965: Reduce a single GL_QUADS to GL_TRIANGLE_FAN.  Eric Anholt  2010-05-13  1  -11/+20
|
| This is similar to the GL_QUAD_STRIP -> TRIANGLE_STRIP optimization --
| the GS usage to split the quads into tris is a huge bottleneck, so a
| quick check improves glean blendFunc time massively (width * height of
| the window of single-pixel GL_QUADS, many many times). This may also end
| up helping with cairo performance, which sometimes ends up drawing a
| single quad.
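
The "quick check" described above amounts to spotting a draw of exactly one quad and substituting a primitive that needs no splitting. The sketch below is an illustration with an invented helper name; the flat-shading guard is an assumption added here because the provoking vertex of a quad and of a fan triangle can differ, and the commit message does not say how the driver handles that case.

```c
#include <stdbool.h>

/* GL primitive modes, with the values OpenGL assigns to them. */
enum {
   GL_TRIANGLE_FAN = 0x0006,
   GL_QUADS        = 0x0007
};

/* Hypothetical helper: returns the mode to actually draw with.
 * A single quad (4 vertices) covers the same area as a 4-vertex
 * triangle fan, so it can skip the quad-splitting path. */
static unsigned reduce_single_quad(unsigned mode, unsigned count,
                                   bool flat_shading)
{
   /* Assumption: stay conservative under flat shading, where the
    * per-primitive color depends on the provoking vertex. */
   if (mode == GL_QUADS && count == 4 && !flat_shading)
      return GL_TRIANGLE_FAN;

   return mode;
}
```
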
* Replace the _mesa_*printf() wrappers with the plain libc versions  Kristian Høgsberg  2010-02-19  1  -2/+2
|
* intel: Implement the DRI2 invalidate function properly  Kristian Høgsberg  2010-02-17  1  -0/+2
|
| This uses a stamp mechanism to mark the DRI drawable as invalid. Instead
| of immediately updating the buffers we just bump the drawable stamp and
| call out to DRI2GetBuffers "later". "Later" used to be at LOCK_HARDWARE
| time, and this patch brings back callouts at the points where we used to
| call LOCK_HARDWARE. A new function, intel_prepare_render(), is called
| where we used to call LOCK_HARDWARE, and if the buffers are invalid, we
| call out to DRI2GetBuffers there.
|
| This lets us invalidate buffers only when notified instead of on every
| glViewport() call. If the loader calls the DRI invalidate entrypoint, we
| disable viewport triggered buffer invalidation.
|
| Additionally, we can clean up the old viewport mechanism a bit, since we
| can just invalidate the buffers and not worry about reentrancy and
| whatnot.
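
The stamp mechanism described above can be sketched as a pair of counters: invalidation only bumps one, and the expensive refresh happens lazily when rendering is about to start. All names below are hypothetical, and the get_buffers_calls counter merely stands in for the real DRI2GetBuffers round trip.

```c
#include <stdbool.h>

/* Hypothetical drawable state for this sketch. */
struct sketch_drawable {
   unsigned stamp;        /* bumped on every invalidate notification */
   unsigned last_stamp;   /* stamp value when buffers were last fetched */
   int get_buffers_calls; /* stands in for DRI2GetBuffers round trips */
};

/* Invalidate entrypoint: cheap, does no buffer work itself. */
static void sketch_invalidate(struct sketch_drawable *d)
{
   d->stamp++;
}

/* Called before rendering. Refreshes the buffers only if something
 * invalidated them since the last refresh. Returns true on refresh. */
static bool sketch_prepare_render(struct sketch_drawable *d)
{
   if (d->last_stamp == d->stamp)
      return false;

   d->last_stamp = d->stamp;
   d->get_buffers_calls++;
   return true;
}
```

Several invalidations between two draws collapse into a single refresh, which is the point of deferring the call.
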
* Merge branch 'mesa_7_7_branch'  Brian Paul  2010-01-25  1  -2/+0
|\
| |
| | Conflicts:
| |     src/mesa/drivers/dri/intel/intel_screen.c
| |     src/mesa/drivers/dri/intel/intel_swapbuffers.c
| |     src/mesa/drivers/dri/r300/r300_emit.c
| |     src/mesa/drivers/dri/r300/r300_ioctl.c
| |     src/mesa/drivers/dri/r300/r300_tex.c
| |     src/mesa/drivers/dri/r300/r300_texstate.c
| * i965: Remove unnecessary headers.  Vinson Lee  2010-01-22  1  -2/+0
| |
* | intel: Drop more cliprect bookkeeping  Kristian Høgsberg  2010-01-04  1  -4/+0
| |
* | intel: Drop batchbuffer cliprect_mode tracking  Kristian Høgsberg  2010-01-04  1  -3/+2
| |
* | intel: Drop LOCK/UNLOCK_HARDWARE()  Kristian Høgsberg  2010-01-04  1  -4/+0
| |
* | intel: Consistently use no_batch_wrap in intel_context struct.  Eric Anholt  2009-11-19  1  -2/+2
|/
* intel: Use PIPE_CONTROL on gen4 hardware for doing pipeline flushing.  Eric Anholt  2009-11-06  1  -6/+2
|
| This should do all the things that MI_FLUSH did, but it can be pipelined
| so that further rendering isn't blocked on the flush completion unless
| necessary.
* Merge branch 'mesa_7_6_branch'  Brian Paul  2009-09-09  1  -0/+1
|\
| * Merge branch 'mesa_7_5_branch' into mesa_7_6_branch  Brian Paul  2009-09-09  1  -0/+1
| |\
| | |
| | | Conflicts:
| | |     Makefile
| | |     configs/default
| | |     progs/glsl/Makefile
| | |     src/gallium/auxiliary/util/u_simple_shaders.c
| | |     src/gallium/state_trackers/glx/xlib/xm_api.c
| | |     src/mesa/drivers/dri/i965/brw_draw_upload.c
| | |     src/mesa/drivers/dri/i965/brw_vs_emit.c
| | |     src/mesa/drivers/dri/intel/intel_context.h
| | |     src/mesa/drivers/dri/intel/intel_pixel.c
| | |     src/mesa/drivers/dri/intel/intel_pixel_read.c
| | |     src/mesa/main/texenvprogram.c
| | |     src/mesa/main/version.h
| | * i965: fix incorrect test for vertex position attribute  Brian Paul  2009-09-08  1  -0/+1
| | |
* | | intel: Add support for ARB_draw_elements_base_vertex.  Eric Anholt  2009-09-08  1  -1/+1
| | |
| | | On the 965, we just drop the value into the primitive packet. On
| | | non-945, we rely on the sw tnl code handling it.
* | | i965: #include clean-ups  Brian Paul  2009-09-08  1  -7/+4
|/ /
* | i965: Avoid re-uploading the index buffer when we don't need to.  Eric Anholt  2009-08-12  1  -0/+2
| |
| | No performance difference proven at 95% confidence with my GLSL demo
| | (n=10).
* | vbo: Avoid extra validation of DrawElements.  Eric Anholt  2009-08-12  1  -38/+15
| |
| | This saves mapping the index buffer to get bounds on the indices that
| | drivers just drop on the floor in the VBO case (cache win), saves a
| | bonus walk of the indices in the CheckArrayBounds case, and other
| | miscellaneous validation. On intel it's a particularly large win
| | (50-100% in my app) because even though we let the indices stay in both
| | CPU and GPU caches, we still end up waiting for the GPU to be done with
| | the buffer before reading from it.
| |
| | Drivers that want the min/max_index fields must now check
| | index_bounds_valid and use vbo_get_minmax_index before using them.
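
The contract in the last paragraph above (check index_bounds_valid, compute the bounds only when they are missing) can be sketched as follows. This is not Mesa's API: sketch_get_minmax_index() is a hypothetical stand-in for vbo_get_minmax_index with an invented signature, and the index type is fixed to unsigned short for brevity.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the min/max helper: walks the indices
 * once. For an empty index list both bounds are reported as 0. */
static void sketch_get_minmax_index(const unsigned short *indices,
                                    size_t count,
                                    unsigned *min_index,
                                    unsigned *max_index)
{
   unsigned lo = count ? indices[0] : 0;
   unsigned hi = lo;
   size_t i;

   for (i = 1; i < count; i++) {
      if (indices[i] < lo)
         lo = indices[i];
      if (indices[i] > hi)
         hi = indices[i];
   }

   *min_index = lo;
   *max_index = hi;
}

/* Driver-side pattern: trust the caller's bounds when they are marked
 * valid, and pay for the walk only when they are not. */
static void sketch_resolve_bounds(bool index_bounds_valid,
                                  const unsigned short *indices,
                                  size_t count,
                                  unsigned *min_index,
                                  unsigned *max_index)
{
   if (index_bounds_valid)
      return; /* *min_index and *max_index already hold usable values */

   sketch_get_minmax_index(indices, count, min_index, max_index);
}
```

A driver that never needs the bounds (for example when everything is in VBOs) simply skips the call and avoids touching the index data at all.
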
* | i965: Remove BRW_NEW_INPUT_VARYING  Eric Anholt  2009-07-07  1  -7/+1
| |
| | This state flag has been unused since the ffvertex_prog move to core.
* | intel: Move note_unlock() implementation to the one place it's needed.  Eric Anholt  2009-06-29  1  -0/+2
|/