path: root/src/mesa/drivers/dri
Commit message (Author, Age, Files, Lines)
* mesa: simplify Driver.TexSubImage() parameters (Brian Paul, 2011-12-30, 2 files, -43/+33)
| | | | | | | | There's no need to pass the target, level and texObj parameters since they can be easily obtained from the texImage pointer. Reviewed-by: Chad Versace <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* intel: Fix memory leak in intel_miptree_create() (Chad Versace, 2011-12-29, 1 file, -2/+2)
| | | | | | | | On failure, intel_miptree_create() needs to *release* the miptree, not just free it, so that the stencil_mt gets released too. Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Chad Versace <[email protected]>
* i965/fs: Allow constant propagation into IF with embedded compare. (Eric Anholt, 2011-12-29, 1 file, -0/+1)
| | | | | | This saves a couple of instructions on most programs with control flow. More interestingly, 6 shaders from unigine sanctuary now fit into 16-wide without register spilling.
* intel: Drop the batchbuffer flush on glRenderbufferStorage(). (Eric Anholt, 2011-12-29, 1 file, -2/+0)
| | | | | There's nothing batchbuffer-related here. State updates by the caller will trigger re-emitting of any new hardware state.
* intel: Drop the batchbuffer flush on glFramebufferRenderbuffer(). (Eric Anholt, 2011-12-29, 1 file, -2/+0)
| | | | | There should be nothing special about this call compared to other callers of intel_draw_buffer().
* intel: Make the batchbuffer flush debug more useful. (Eric Anholt, 2011-12-29, 2 files, -3/+5)
| | | | | | | We were printing out the line triggering the flush, but a variety of different causes just printed the line number for intel_flush()'s call of intel_batchbuffer_flush(). Plumb the line numbers from the caller of intel_flush() on through.
* intel: Fix performance regression in Lightsmark since HiZ changes. (Eric Anholt, 2011-12-29, 1 file, -0/+3)
| | | | | | | | | | | Since the refactor in d7b33309fe160212f2eb73f471f3aedcb5d0b5c1, depth in the miptree changed from 1 to 6, so we always decided it didn't match, and we would relayout to something that would still not "match". Improves performance 23.8% (+/- 1.1%, n=4) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=43329
* intel: Don't consider miptrees for other texture targets to match. (Eric Anholt, 2011-12-29, 2 files, -1/+3)
| | | | | We would have done a relayout at validate time, but it's senseless to store into a miptree if it's going to force relayout.
* mesa: Re-add main/bitset.h to fix classic nouveau build failure. (José Fonseca, 2011-12-28, 1 file, -0/+2)
| | | | | | | | bitset.h is still used by classic nouveau -- see `git grep '\<BITSET_'` -- and the state stored is too big to fit in 64-bit integers (it requires approximately 87 bits), so there is no obvious alternative here. This effectively reverts commit 196800d79829a420073f762fac90090a7b416d2d.
* mesa: Remove now unused main/bitset.h. (Mathias Fröhlich, 2011-12-28, 1 file, -2/+0)
| | | | Signed-off-by: Mathias Froehlich <[email protected]>
* radeon: Convert to use GLbitfield64 directly. (Mathias Fröhlich, 2011-12-28, 3 files, -38/+37)
| | | | | Signed-off-by: Mathias Froehlich <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* nouveau: Convert to use GLbitfield64 directly. (Mathias Fröhlich, 2011-12-28, 2 files, -2/+2)
| | | | | Signed-off-by: Mathias Froehlich <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* i915: Convert to use GLbitfield64 directly. (Mathias Fröhlich, 2011-12-28, 2 files, -14/+12)
| | | | | Signed-off-by: Mathias Froehlich <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* i965/vs: Properly clear cur_value when propagating direct copies. (Kenneth Graunke, 2011-12-27, 1 file, -16/+12)
| | | | | | | | | | | | | | | | | | | | | | | | Consider the following code: MOV A.x, B.x; MOV B.x, C.x. After the first instruction, cur_value[A][0] == B, indicating that A.x's current value came from register B. When processing the second instruction, we update cur_value[B][0] to C. However, for direct copies, we fail to reset cur_value[A][0] to NULL. This is necessary because the value of A is no longer the value of B. Fixes Counter-Strike: Source in Wine (where the menu rendered completely black in DX9 mode), completely white textures in Civilization V, and the new Piglit test glsl-vs-copy-propagation-1.shader_test. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=42032 Tested-by: Matt Turner <[email protected]> Tested-by: Christopher James Halse Rogers <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Kenneth Graunke <[email protected]>
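The invalidation rule described in this commit message can be sketched in isolation. This is a simplified illustration, not the driver's code: it assumes registers are small integers, uses -1 where the message says NULL, and the names `NUM_REGS`, `reset_records` and `record_mov` are hypothetical.

```c
#include <assert.h>

#define NUM_REGS  8   /* hypothetical number of virtual registers */
#define NUM_CHANS 4   /* x, y, z, w */

/* cur_value[r][c] is the register that channel c of register r was copied
 * from, or -1 if no copy is known (the commit message uses NULL here). */
static int cur_value[NUM_REGS][NUM_CHANS];

static void reset_records(void)
{
   for (int r = 0; r < NUM_REGS; r++)
      for (int c = 0; c < NUM_CHANS; c++)
         cur_value[r][c] = -1;
}

/* Record the effect of "MOV dst.chan, src.chan". */
static void record_mov(int dst, int src, int chan)
{
   assert(dst >= 0 && dst < NUM_REGS);
   assert(src >= 0 && src < NUM_REGS);
   assert(chan >= 0 && chan < NUM_CHANS);

   /* dst.chan is being overwritten, so every record claiming that some
    * register's channel is still a copy of dst is now stale. */
   for (int r = 0; r < NUM_REGS; r++) {
      if (cur_value[r][chan] == dst)
         cur_value[r][chan] = -1;
   }

   cur_value[dst][chan] = src;
}
```

With A = 0, B = 1 and C = 2, recording `MOV A.x, B.x` and then `MOV B.x, C.x` leaves cur_value[B][0] pointing at C and cur_value[A][0] cleared, which is the behaviour the fix restores.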
* i965/vs: Fix incorrect subscript when resetting copy propagation records. (Kenneth Graunke, 2011-12-27, 1 file, -1/+1)
| | | | | | | | | | | | | | | | In this code, 'i' loops over the number of virtual GRFs, while 'j' loops over the number of vector components (0 <= j <= 3). It can't possibly be correct to see if bit 'i' is set in the destination writemask, as it will have values much larger than 3. Clearly this is supposed to be 'j'. Found by inspection. Tested-by: Matt Turner <[email protected]> Tested-by: Christopher James Halse Rogers <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Kenneth Graunke <[email protected]>
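A minimal sketch of the loop shape this fix concerns, assuming a 4-bit writemask with bit 0 for x through bit 3 for w. The array layout and the name `clear_written_channels` are hypothetical and only illustrate why the bit test must use the channel index.

```c
#include <assert.h>

#define NUM_VGRFS 16   /* hypothetical number of virtual GRFs */

/* Clear the copy records of every register for the channels that the
 * destination writemask touches. Returns how many records were cleared. */
static int clear_written_channels(int records[NUM_VGRFS][4], unsigned writemask)
{
   int cleared = 0;

   assert((writemask & ~0xfu) == 0);   /* only x, y, z, w bits are valid */

   for (int i = 0; i < NUM_VGRFS; i++) {
      for (int j = 0; j < 4; j++) {
         /* Test bit j (the channel). Testing bit i (the register index)
          * would check bits far beyond the 4-bit writemask. */
         if (writemask & (1u << j)) {
            records[i][j] = -1;
            cleared++;
         }
      }
   }
   return cleared;
}
```

With a writemask of 0x5 (x and z), the x and z records of all registers are cleared and the y and w records are left alone.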
* i965: Create mock implementation of GL_OES_EGL_image_external (Chad Versace, 2011-12-27, 4 files, -0/+6)
| | | | | | | | | | | | | | | | | | | | | In Android IceCreamSandwich, SurfaceFlinger requires GL_OES_EGL_image_external for basic compositing tasks. Without the extension, SurfaceFlinger fails to start. Despite the incompleteness of the extension's implementation introduced by this patch, it is good enough to enable SurfaceFlinger and to unblock the people who need to begin testing Mesa on IceCreamSandwich. To enable the extension, set the environment variable MESA_EXTENSION_OVERRIDE="+GL_OES_EGL_image_external". Ideally, Android should set this in init.rc. WARNING: This implementation of GL_OES_EGL_image_external is not complete. Some of it is even incorrect. When we begin to really implement GL_OES_EGL_image_external, much of the patch will need reverting. Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Chad Versace <[email protected]>
* i965: increase the brw eu instruction store size dynamically (Yuanhan Liu, 2011-12-26, 3 files, -3/+18)
| | | | | | | | | | | | | | Here is the final patch to enable dynamic eu instruction store size: increase the brw eu instruction store size dynamically instead of just allocating it statically with a constant limit. This fixes the mismatch where GL_MAX_PROGRAM_INSTRUCTIONS_ARB was 16384 while the driver would limit it to 10000. v2: comments from Ken, do not hardcode the eu limit to (1024 * 1024) Signed-off-by: Yuanhan Liu <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: call next_insn() before referencing an instruction by index (Yuanhan Liu, 2011-12-26, 1 file, -14/+26)
| | | | | | | | | | | | | | A single next_insn may change the base address of instruction store memory (p->store), so call it first before referencing the instruction store pointer from an index. This is the final preparatory work to enable the dynamic store size. v2: comments from Ken, define emit_endif as bool type Signed-off-by: Yuanhan Liu <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: get the jmp distance by instruction index (Yuanhan Liu, 2011-12-26, 4 files, -12/+10)
| | | | | | | | | | | | | If dynamic instruction store size is enabled, the eu instruction store base address (p->store) may change between the brw_JMPI() call and the brw_land_fwd_jump() call. Thus, the safe way to reference the jmp instruction is by index instead of by the instruction address. v2: comments from Eric, don't change the prototype of brw_JMPI Signed-off-by: Yuanhan Liu <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
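The hazard this commit addresses can be sketched with a small growable instruction store. This is an illustration under stated assumptions, not the driver's implementation: `struct insn`, `struct insn_store`, the opcodes, the doubling growth policy and the two-argument `next_insn` are all hypothetical stand-ins, and the jump distance is simplified to a count of instructions skipped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum { OP_MOV = 1, OP_JMPI = 2 };   /* hypothetical opcodes */

struct insn {
   int opcode;
   int jump_count;   /* instructions skipped by a forward jump */
};

struct insn_store {
   struct insn *base;   /* may move whenever the store grows */
   size_t nr_insn;
   size_t size;
};

/* Append an instruction and return its index. Growing the store with
 * realloc() can move it, which invalidates any saved struct insn pointer. */
static size_t next_insn(struct insn_store *p, int opcode)
{
   if (p->nr_insn == p->size) {
      size_t new_size = p->size ? p->size * 2 : 4;
      struct insn *grown = realloc(p->base, new_size * sizeof *grown);
      if (grown == NULL)
         abort();
      p->base = grown;
      p->size = new_size;
   }
   p->base[p->nr_insn].opcode = opcode;
   p->base[p->nr_insn].jump_count = 0;
   return p->nr_insn++;
}

/* Patch a forward jump so that it lands on the next instruction emitted.
 * The jump is looked up by index at patch time, so it does not matter
 * whether the store moved after the jump was emitted. */
static void land_fwd_jump(struct insn_store *p, size_t jmp_idx)
{
   assert(jmp_idx < p->nr_insn);
   assert(p->base[jmp_idx].opcode == OP_JMPI);
   p->base[jmp_idx].jump_count = (int)(p->nr_insn - jmp_idx - 1);
}
```

Emitting a jump, then enough instructions to force the store to grow, then calling `land_fwd_jump()` with the saved index still patches the right instruction; a pointer saved at emit time could be dangling by then.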
* i965: let the if_stack just store the instruction index (Yuanhan Liu, 2011-12-26, 3 files, -10/+19)
| | | | | | | | | | | | | If dynamic instruction store size is enabled, the eu instruction store base address (p->store) may change between the brw_IF/ELSE() call and the brw_ENDIF() call. Thus let if_stack just store the instruction index. This is more flexible and safer than storing the instruction memory address. Signed-off-by: Yuanhan Liu <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965 gen6: Fix incorrect order of dwords in gen6_update_sol_indices() (Paul Berry, 2011-12-24, 1 file, -1/+1)
| | | | | | | | | | | | When updating SOL indices, we were accidentally putting the starting index in dword 1 and the SVBI number to increment in dword 2--these should be reversed. Usually both of these values are zero, so we didn't see any problem. However, if a transform feedback operation spans multiple batch buffers, the starting index will be nonzero. Fixes piglit test "EXT_transform_feedback/intervening-read output". Reviewed-by: Kenneth Graunke <[email protected]>
* i965 gen6: Fix transform feedback of triangle strips. (Paul Berry, 2011-12-24, 2 files, -18/+72)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | When rendering triangle strips, vertices come down the pipeline in the order specified, even though this causes alternate triangles to have reversed winding order. For example, if the vertices are ABCDE, then the GS is invoked on triangles ABC, BCD, and CDE, even though this means that triangle BCD is in the reverse of the normal winding order. The hardware automatically flags the triangles with reversed winding order as _3DPRIM_TRISTRIP_REVERSE, so that face culling and two-sided coloring can be adjusted to account for the reversed order. In order to ensure that winding order is correct when streaming vertices out to a transform feedback buffer, we need to alter the ordering of BCD to BDC when the first provoking vertex convention is in use, and to CBD when the last provoking vertex convention is in use. To do this, we precompute an array of indices indicating where each vertex will be placed in the transform feedback buffer; normally this is SVBI[0] + (0, 1, 2), indicating that vertex order should be preserved. When the primitive type is _3DPRIM_TRISTRIP_REVERSE, we change this order to either SVBI[0] + (0, 2, 1) or SVBI[0] + (1, 0, 2), depending on the provoking vertex convention. Fixes piglit tests "EXT_transform_feedback/tessellation triangle_strip" on Gen6. Reviewed-by: Kenneth Graunke <[email protected]>
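The index array described in this commit message can be sketched as a small function. The pairing of (0, 2, 1) with the first provoking vertex convention and (1, 0, 2) with the last is inferred from the BCD to BDC and BCD to CBD examples in the message; the function name and parameters are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Compute where each of a triangle's three vertices is written in the
 * transform feedback buffer. slots[i] is the destination of vertex i. */
static void tf_vertex_slots(unsigned svbi0, bool tristrip_reverse,
                            bool first_provoking, unsigned slots[3])
{
   static const unsigned preserve[3]  = { 0, 1, 2 };
   static const unsigned rev_first[3] = { 0, 2, 1 };   /* BCD -> BDC */
   static const unsigned rev_last[3]  = { 1, 0, 2 };   /* BCD -> CBD */
   const unsigned *order = preserve;

   assert(slots != NULL);

   if (tristrip_reverse)
      order = first_provoking ? rev_first : rev_last;

   for (int i = 0; i < 3; i++)
      slots[i] = svbi0 + order[i];
}
```

Either reversed ordering swaps exactly two vertices, which flips the winding back to normal while keeping the provoking vertex (B for first, D for last) in its original position.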
* mesa: remove gl_renderbuffer::PutRowRGB() (Brian Paul, 2011-12-24, 5 files, -92/+0)
| | | | | | No longer used anywhere. Reviewed-by: Eric Anholt <[email protected]>
* mesa: remove gl_renderbuffer::PutMonoRow() and PutMonoValues() (Brian Paul, 2011-12-24, 5 files, -307/+1)
| | | | | | | The former was only used for clearing buffers. The latter wasn't used anywhere! Remove them and all implementations of those functions. Reviewed-by: Eric Anholt <[email protected]>
* i965/gen7: Fix feedback for flat-shaded tristrips versus provoking vertex. (Eric Anholt, 2011-12-23, 1 file, -0/+5)
| | | | | | | Fixes piglit tessellation triangle_strip flat_last. Reviewed-by: Kenneth Graunke <[email protected]> (v1) Reviewed-by: Paul Berry <[email protected]>
* i965/gen7: Add support for transform feedback. (Eric Anholt, 2011-12-23, 1 file, -7/+201)
| | | | | | | | | | | | Fixes almost all of the transform feedback piglit tests. Remaining are a few tests related to tessellation for quads/trifans/tristrips/polygons with flat shading. v2: Incorporate Paul's feedback (squash with previous, state flag note, static assert, update FINISHME) Reviewed-by: Kenneth Graunke <[email protected]> (v1) Reviewed-by: Paul Berry <[email protected]>
* i965/gen7: Move SOL stage disable to gen7_sol_state.c (Eric Anholt, 2011-12-23, 4 files, -7/+58)
| | | | | | | We'll be growing more code in here as we actually enable the unit. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Paul Berry <[email protected]>
* i965/gen7: Add register definitions for GL_EXT_transform_feedback. (Eric Anholt, 2011-12-23, 2 files, -2/+86)
| | | | | | | v2: Make the buffer enable bitfield take an index argument. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Paul Berry <[email protected]>
* i965/gen7: Make primitives_written counting work. (Eric Anholt, 2011-12-23, 1 file, -6/+27)
| | | | | | | | | | | The code was relying on gs.prog_data's copy of the number-of-verts-per-prim, which segfaulted on gen7 since it doesn't make a GS program. We can easily calculate that value right here. v2: Fix svbi_0_starting_index regression. Reviewed-by: Paul Berry <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/gen7: Enable EXT_transform_feedback extension under 3.0 override. (Eric Anholt, 2011-12-23, 1 file, -1/+1)
| | | | | Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Paul Berry <[email protected]>
* i965 Gen6+: Invalidate VF address-based cache on flush (Paul Berry, 2011-12-23, 1 file, -0/+1)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Although there is not much documentation of this fact, there are in fact two separate VF caches: - an "index-based" cache (described in the Sandy Bridge PRM, vol 2 part 1, section 2.1.2 "Vertex Cache"). This cache stores URB handles of vertex shader outputs; its purpose is to avoid redundant invocations of the vertex shader when drawing in random access mode (e.g. glDrawElements()), and the same vertex index is specified multiple times. It is automatically invalidated between 3D_PRIMITIVE commands and between instances within a single 3D_PRIMITIVE command. - an "address-based" cache (mentioned briefly in vol 2 part 1, section 1.7.4 "PIPE_CONTROL Command"). This cache stores the data read from vertex buffers; its purpose is to avoid redundant memory accesses when doing instanced drawing or when multiple 3D_PRIMITIVE commands access the same vertex data. It needs to be manually invalidated whenever new data is written to a buffer that is used for vertex data. Previous to this patch, it was not necessary for Mesa to explicitly invalidate the address-based cache, because there were no reasonable use cases in which the GPU would write to a vertex data buffer during a batch, and inter-batch flushing was taken care of by the kernel. However, with transform feedback, there is now a reasonable use case: vertex data is written to a buffer using transform feedback, and then that data is immediately re-used as vertex input in the next drawing operation. To make this use case work, we need to flush the address-based VF cache between transform feedback and the next draw operation. Since we are already calling intel_batchbuffer_emit_mi_flush() when transform feedback completes, and intel_batchbuffer_emit_mi_flush() is intended to invalidate all caches, it seems reasonable to add VF cache invalidation to this function. 
As with commit 63cf7fad13fc9cfdd2ae7b031426f79107000300 (i965: Flush pipeline on EndTransformFeedback), this is not an ideal solution. It would be preferable to only invalidate the VF cache if the next draw call was about to consume data generated by a previous draw call in the same batch. However, since we don't have the necessary dependency tracking infrastructure to figure that out right now, we have to overzealously invalidate the cache. Fixes Piglit test "EXT_transform_feedback/immediate-reuse". Reviewed-by: Kenneth Graunke <[email protected]>
* i965 gen6: Resend binding table pointer after updating SOL bindings. (Paul Berry, 2011-12-23, 1 file, -0/+2)
| | | | | | | | | After creating new binding table entries for transform feedback, we need to set the dirty flag BRW_NEW_SURFACES, so that a new binding table pointer will be sent to the hardware. Otherwise the new binding table entries will not take effect. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Rename BRW_NEW_WM_SURFACES to BRW_NEW_SURFACES. (Paul Berry, 2011-12-23, 3 files, -9/+9)
| | | | | | | | | The surface states tracked by BRW_NEW_WM_SURFACES are no longer used for just WM. They are also used for vertex texturing and transform feedback. To avoid confusion, this patch renames BRW_NEW_WM_SURFACES to BRW_NEW_SURFACES. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Don't use BRW_DEPTHFORMAT_D24_UNORM_X8_UINT on Gen4. (Kenneth Graunke, 2011-12-23, 1 file, -1/+4)
| | | | | | | | | | | X8 depth formats weren't supported until Ironlake (Gen 5). Fixes GPU hangs introduced in d84a180417d1eabd680554970f1eaaa93abcd41e. One example test case was "fbo-missing-attachment-blit from". Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chad Versace <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965 gen6: Implement transform feedback pause/resume functionality. (Paul Berry, 2011-12-23, 3 files, -3/+6)
| | | | | | | | | Although i965 gen6 does not yet support ARB_transform_feedback2 or NV_transform_feedback2, it needs to support pause/resume functionality so that meta-ops will work correctly. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* dri2: Add createContextAttribs entry point for DRISW version 3 (Ian Romanick, 2011-12-23, 1 file, -2/+6)
| | | | Signed-off-by: Ian Romanick <[email protected]>
* dri2: Add createContextAttribs entry point for DRI2 version 3 (Ian Romanick, 2011-12-23, 1 file, -2/+6)
| | | | Signed-off-by: Ian Romanick <[email protected]>
* i965: Don't make consumers of brw_CONT/brw_WHILE track if depth in loop. (Eric Anholt, 2011-12-21, 6 files, -58/+25)
| | | | | | | The codegen backends all had this same tracking, so just do it at the EU level. Reviewed-by: Yuanhan Liu <[email protected]>
* i965: Don't make consumers of brw_WHILE do pre-gen6 BREAK/CONT patching. (Eric Anholt, 2011-12-21, 4 files, -86/+45)
| | | | | | | The EU code itself can just do this work, since all the consumers were duplicating it. Reviewed-by: Yuanhan Liu <[email protected]>
* i965: Don't make consumers of brw_DO()/brw_WHILE() track loop start. (Eric Anholt, 2011-12-21, 9 files, -28/+58)
| | | | | | | This is a similar cleanup to what we did for brw_IF(), brw_ELSE(), brw_ENDIF() handling. Reviewed-by: Yuanhan Liu <[email protected]>
* i965: Drop unused do_insn argument from gen6_CONT(). (Eric Anholt, 2011-12-21, 5 files, -7/+5)
| | | | | | The branch distances get patched up later at the WHILE instruction. Reviewed-by: Yuanhan Liu <[email protected]>
* mesa: Add _NEW_RASTERIZER_DISCARD as synonym for _NEW_TRANSFORM. (Paul Berry, 2011-12-21, 1 file, -2/+3)
| | | | | | | | | | This makes it easier to keep track of which dirty bits correspond to which pieces of context, since it makes _NEW_RASTERIZER_DISCARD correspond with ctx->RasterDiscard. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* mesa: Move RasterDiscard to toplevel of gl_context. (Paul Berry, 2011-12-21, 1 file, -1/+1)
| | | | | | | | | | | | | | | | | | | | Previously we were storing the RasterDiscard flag (for GL_RASTERIZER_DISCARD) in gl_context::TransformFeedback. This was confusing, because we use the _NEW_TRANSFORM flag (not _NEW_TRANSFORM_FEEDBACK) to track state updates to it, and because rasterizer discard has effects even when transform feedback is not in use. This patch makes RasterDiscard a toplevel element in gl_context rather than a subfield of gl_context::TransformFeedback. Note: We can't put RasterDiscard inside gl_context::Transform, since all items inside gl_context::Transform need to be pieces of state that are saved and restored using PushAttrib and PopAttrib. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* i965 gen6: Turn on transform feedback extension unconditionally. (Paul Berry, 2011-12-20, 1 file, -1/+1)
| | | | | | | | | | | | Previously, we only enabled transform feedback when MESA_GL_VERSION_OVERRIDE was 3.0 or greater, since transform feedback support was not completely finished, so it didn't make sense to advertise support for it unless absolutely necessary. Now that transform feedback is fully implemented on gen6, we can enable this extension unconditionally. Reviewed-by: Kenneth Graunke <[email protected]>
* i965 gen6: Implement transform feedback queries. (Paul Berry, 2011-12-20, 3 files, -0/+54)
| | | | | | | | | | | | | | | | | | | | This patch adds software-based PRIMITIVES_GENERATED and TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN queries that work by keeping track of the number of primitives that are sent down the pipeline, and adjusting as necessary to account for the way each primitive type is tessellated. In the long run we'll want to replace this with a hardware-based implementation, because the software approach won't work with geometry shaders or primitive restart. However, at the moment, we don't have the necessary kernel support to implement a hardware-based query (we would need the kernel to save GPU registers when context switching, so that drawing performed by another process doesn't get counted). Fixes Piglit tests EXT_transform_feedback/query-primitives_generated-* and EXT_transform_feedback/query-primitives-written-*. Reviewed-by: Kenneth Graunke <[email protected]>
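A rough sketch of the software counting approach described here: given a draw call's primitive mode and vertex count, compute how many points, lines or triangles it sends down the pipeline. The enum and function name are hypothetical, and only the modes whose counts follow directly from the OpenGL primitive definitions are covered.

```c
#include <assert.h>

/* Hypothetical stand-ins for the GL primitive modes. */
enum prim_mode {
   PRIM_POINTS,
   PRIM_LINES,
   PRIM_LINE_STRIP,
   PRIM_TRIANGLES,
   PRIM_TRIANGLE_STRIP,
   PRIM_TRIANGLE_FAN
};

/* Number of primitives produced by drawing `count` vertices in `mode`.
 * Incomplete trailing primitives are dropped. */
static unsigned count_primitives(enum prim_mode mode, unsigned count)
{
   switch (mode) {
   case PRIM_POINTS:
      return count;
   case PRIM_LINES:
      return count / 2;
   case PRIM_LINE_STRIP:
      return count >= 2 ? count - 1 : 0;
   case PRIM_TRIANGLES:
      return count / 3;
   case PRIM_TRIANGLE_STRIP:
   case PRIM_TRIANGLE_FAN:
      return count >= 3 ? count - 2 : 0;
   }
   assert(!"unknown primitive mode");
   return 0;
}
```

A driver following this approach would add the result to its running query counters on every draw. As the commit message notes, this breaks down once geometry shaders or primitive restart change the number of primitives behind the driver's back.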
* i965: Convert if/else to switch statements in brw_queryobj.c (Paul Berry, 2011-12-20, 1 file, -6/+30)
| | | | | | | | | | | | | Previously, i965 only supported two query types: GL_TIME_ELAPSED_EXT and GL_SAMPLES_PASSED_ARB, and it distinguished between the two using if/else statements that compared query->Base.Target to GL_TIME_ELAPSED_EXT. This patch changes the if/else statements to switch statements so that we can add more query types without having to have a chain of else-ifs. Reviewed-by: Kenneth Graunke <[email protected]>
* i965 gen6: Ensure correct transform feedback indices on new batch. (Paul Berry, 2011-12-20, 5 files, -8/+72)
| | | | | | | | | | | | | | | | We don't currently have kernel support for saving GPU registers on a context switch, so if multiple processes are performing transform feedback at the same time, their SVBI registers will interfere with each other. To avoid this situation, we keep a software shadow of the state of the SVBI 0 register (which is the only register we use), and re-upload it on every new batch. The function that updates the shadow state of SVBI 0 is called brw_update_primitive_count, since it will also be used to update the counters for the PRIMITIVES_GENERATED and TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN queries. Reviewed-by: Kenneth Graunke <[email protected]>
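The shadow-register idea can be modelled in a few lines. This is a toy sketch, not the driver's code: `hw_svbi0` stands in for the hardware register, the struct and function names are hypothetical, and it assumes the register advances by one per vertex written.

```c
#include <assert.h>

/* Stand-in for the hardware SVBI 0 register, which other processes may
 * overwrite between our batches. */
static unsigned hw_svbi0;

/* Software shadow of what SVBI 0 should contain for this context. */
struct sol_shadow {
   unsigned svbi_0_starting_index;
};

/* At the start of every batch, re-upload the shadow so the hardware
 * register holds our value regardless of what ran in between. */
static void on_new_batch(const struct sol_shadow *sol)
{
   hw_svbi0 = sol->svbi_0_starting_index;
}

/* After a draw, advance the shadow by the number of vertices written,
 * mirroring the increment the hardware applies to the register. */
static void on_draw(struct sol_shadow *sol, unsigned prims,
                    unsigned verts_per_prim)
{
   assert(verts_per_prim >= 1 && verts_per_prim <= 3);
   unsigned verts = prims * verts_per_prim;
   hw_svbi0 += verts;
   sol->svbi_0_starting_index += verts;
}
```

If another process clobbers the register between batches, the next `on_new_batch()` call restores the expected index, so transform feedback output continues at the right offset.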
* i965 gen6: Implement rasterizer discard. (Paul Berry, 2011-12-20, 3 files, -0/+37)
| | | | | | | | | | | | | | | | | | | This patch enables rasterizer discard functionality (a part of transform feedback) in Gen6, by generating an alternate GS program when rasterizer discard is active. Instead of forwarding vertices down the pipeline, the alternate GS program uses a URB Write message to deallocate the URB entry that was allocated by FF sync and terminate the thread. Note: parts of the Sandy Bridge PRM seem to imply that we could do this more efficiently, by clearing the GEN6_GS_RENDERING_ENABLE bit, and not allocating a URB entry at all. However, it's not clear how we are supposed to terminate the thread if we do that. Volume 2 part 1, section 4.5.4, says "GS threads must terminate by sending a URB_WRITE message with the EOT and Complete bits set.", and my experiments so far confirm that. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Implement bounds checking for transform feedback output. (Kenneth Graunke, 2011-12-20, 4 files, -0/+52)
| | | | | Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Paul Berry <[email protected]>
* i965: Flush pipeline on EndTransformFeedback. (Paul Berry, 2011-12-20, 3 files, -0/+22)
| | | | | | | | | | | | | | | | | | | | | | | A common use case for transform feedback is to perform one draw operation that writes transform feedback output to a buffer, followed by a second draw operation that consumes that buffer as vertex input. Since vertex input is consumed at an earlier pipeline stage than writing transform feedback output, we need to flush the pipeline to ensure that the transform feedback output is completely written before the data is consumed. In an ideal world, we would do some dependency tracking, so that we would only flush the pipeline if the next draw call was about to consume data generated by a previous draw call in the same batch. However, since we don't have that sort of dependency tracking infrastructure right now, we just unconditionally flush the buffer every time glEndTransformFeedback() is called. This will cause a performance hit compared to the ideal case (since we will sometimes flush the pipeline unnecessarily), but fortunately the performance hit will be confined to circumstances where transform feedback is in use. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>