path: root/src/mesa/drivers/dri
Commit message | Author | Age | Files | Lines
* i965: Apply a workaround for the Ironlake "vertex flashing". | Eric Anholt | 2011-03-04 | 1 | -1/+8
  This is an awful hack and will hurt performance on Ironlake, but we're at a loss as to what's going wrong otherwise. This is the only common variable we've found that avoids the problem on 4 applications (CelShading, gnome-shell, Pill Popper, and my GLSL demo), while other variables we've tried appear to only be confounding. Neither the specifications nor the hardware team have been able to provide any enlightenment, despite much searching.
  https://bugs.freedesktop.org/show_bug.cgi?id=29172
  Tested by: Chris Lord <[email protected]> (Pill Popper)
  Tested by: Ryan Lortie <[email protected]> (gnome-shell)
* i965: Fix extending VB packets | Chris Wilson | 2011-03-04 | 1 | -2/+2
  Computation of the delta of this array from the last had a silly little bug and ignored any initial delta==0, causing grief in Nexuiz and friends.
  Signed-off-by: Chris Wilson <[email protected]>
* i965: Handle URB_FENCE erratum for Broadwater | Chris Wilson | 2011-03-04 | 1 | -0/+8
  There is a silicon bug which causes unpredictable behaviour if the URB_FENCE command should cross a cache-line boundary. Pad before the command to avoid such occurrences.
  As this command only applies to gen4/5, do the fixup unconditionally as the specs do not actually state for which chip it was fixed (and the cost is negligible)...
  Signed-off-by: Chris Wilson <[email protected]>
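The padding idea described above can be sketched as follows. This is only an illustration, not the driver's code: the 64-byte cache line, the cache-line-aligned batch start, the no-op filler value, and the helper names `pad_before_command` and `emit_padded` are all assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines and 4-byte dwords, so 16 dwords per line.
 * The batch is assumed to start on a cache-line boundary. */
#define CACHELINE_DWORDS 16u
/* Assumption: a dword value the hardware treats as a no-op. */
#define NOOP_DWORD 0u

/* Number of no-op dwords to emit before a command of cmd_len dwords placed
 * at dword offset `used`, so the command does not straddle a cache line. */
static unsigned pad_before_command(unsigned used, unsigned cmd_len)
{
    unsigned pos = used % CACHELINE_DWORDS;

    assert(cmd_len <= CACHELINE_DWORDS);
    if (pos + cmd_len > CACHELINE_DWORDS)
        return CACHELINE_DWORDS - pos;  /* skip to the next cache line */
    return 0;
}

/* Hypothetical emitter: writes padding, then the command, and returns the
 * new number of dwords used in the batch. */
static unsigned emit_padded(uint32_t *batch, unsigned used,
                            const uint32_t *cmd, unsigned cmd_len)
{
    unsigned pad = pad_before_command(used, cmd_len);

    while (pad--)
        batch[used++] = NOOP_DWORD;
    for (unsigned i = 0; i < cmd_len; i++)
        batch[used++] = cmd[i];
    return used;
}
```

Doing the check unconditionally costs at most a handful of filler dwords per command, which fits the message's remark that the cost is negligible.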
* i965: Align index to type size and flush if the type changes | Chris Wilson | 2011-03-04 | 5 | -13/+22
  Signed-off-by: Chris Wilson <[email protected]>
* intel: Add couple of missing gen6 commands to decode | Chris Wilson | 2011-03-04 | 1 | -0/+2
  Signed-off-by: Chris Wilson <[email protected]>
* i965: Prevent using a zero sized (or of unknown type) vertex array | Chris Wilson | 2011-03-04 | 1 | -5/+6
  Signed-off-by: Chris Wilson <[email protected]>
* i965: SNB GT1 has only 32k urb and max 128 urb entries. | Zou Nan hai | 2011-03-03 | 2 | -4/+19
  Signed-off-by: Zou Nan hai <[email protected]>
* i965: Maxinum the usage of urb space on SNB. | Zou Nan hai | 2011-03-02 | 1 | -10/+6
  SNB has 64k of URB space, but we only use part of it. The more URB space we allocate, the more concurrent VS threads we can run, so push the URB space usage to the limit.
  Signed-off-by: Zou Nan hai <[email protected]>
* intel: Support glCopyTexImage() from ARGB8888 to XRGB8888. | Kenneth Graunke | 2011-03-01 | 1 | -2/+11
  Nexuiz was hitting a software fallback.
* i965: Use negative relocation deltas to minimse vertex uploads | Chris Wilson | 2011-03-01 | 4 | -8/+27
  With relaxed relocation checking in the kernel, we can specify a negative delta (i.e. pointing outside of the target bo) in order to fake a range in a large buffer. We only then need to upload the elements used and adjust the buffer offset such that they correspond with the indices used in the DrawArrays.
  (Depends on libdrm 0209428b3918c4336018da9293cdcbf7f8fedfb6)
  Signed-off-by: Chris Wilson <[email protected]>
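The arithmetic behind a negative delta can be sketched like this. It is a minimal illustration under stated assumptions, not the driver's implementation: `upload_plan`, `plan_trimmed_upload`, and `element_address` are hypothetical names, and the sketch only models byte offsets, not real relocations.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of a trimmed upload. */
typedef struct {
    int64_t  delta;  /* where "index 0" would live; may be negative */
    uint64_t bytes;  /* bytes actually uploaded */
} upload_plan;

/* Upload only elements [min_index, max_index] at upload_offset, and compute
 * the base offset that makes the original indices address the right data. */
static upload_plan plan_trimmed_upload(uint64_t upload_offset, uint32_t stride,
                                       uint32_t min_index, uint32_t max_index)
{
    upload_plan p;

    assert(min_index <= max_index);
    p.bytes = (uint64_t)(max_index - min_index + 1) * stride;
    p.delta = (int64_t)upload_offset - (int64_t)min_index * stride;
    return p;
}

/* Byte offset that a given vertex index resolves to under the plan. */
static int64_t element_address(const upload_plan *p, uint32_t stride,
                               uint32_t index)
{
    return p->delta + (int64_t)index * stride;
}
```

For example, uploading indices 100 to 109 with a 16-byte stride at offset 256 yields a base of 256 - 1600 = -1344, yet index 100 still resolves to offset 256 inside the uploaded range.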
* i965: Undo 'continuation of vb packets' | Chris Wilson | 2011-03-01 | 1 | -1/+1
  This breaks Nexuiz for an unknown reason; disable it until a true fix can be found.
* i965: Fix uploading of shortened vertex packets | Chris Wilson | 2011-03-01 | 1 | -12/+13
  ... handle all cases and not just the interleaved upload.
  Signed-off-by: Chris Wilson <[email protected]>
* i965: Upload all vertices used | Chris Wilson | 2011-03-01 | 2 | -31/+38
  ... and take advantage of start_vertex_bias to trim to [min_index, max_index] where possible (i.e. when we need to upload all arrays).
  Fixes half_float_vertex(misc.fillmode.wireframe)
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=34595
  Signed-off-by: Chris Wilson <[email protected]>
* Revert "i965/fs: Correctly set up gl_FragCoord.w on Sandybridge." | Kenneth Graunke | 2011-03-01 | 1 | -1/+1
  This reverts commit 4a3b28113c3d23ba21bb8b8f5ebab7c567083a6d, as it caused a regression on Ironlake (bug #34646).
* i965: bump VS thread number to 60 on SNB | Zou Nan hai | 2011-03-01 | 2 | -2/+11
  Signed-off-by: Zou Nan hai <[email protected]>
* mesa: move PBO-related functions into a new file | Brian Paul | 2011-02-28 | 5 | -0/+5
* intel: Use the current context rather than last bound context for a drawable. | Eric Anholt | 2011-02-26 | 1 | -1/+2
  If another thread bound a context to the drawable then unbound it, the driContextPriv would end up NULL.
  With the previous two fixes, this fixes glx-multithread-makecurrent-2, despite the issue not being about the multithreaded makecurrent.
* i965/fs: Initial plumbing to support TXD. | Kenneth Graunke | 2011-02-25 | 2 | -0/+14
  This adds the opcode and the code to convert ir_txd to OPCODE_TXD; it doesn't actually add support yet.
* i965/fs: Complete TXL support on gen5+. | Kenneth Graunke | 2011-02-25 | 1 | -0/+7
  Initial plumbing existed to turn the ir_txl into OPCODE_TXL, but it was never handled.
* i965/fs: Complete TXL support on gen4. | Kenneth Graunke | 2011-02-25 | 1 | -0/+10
  Initial plumbing existed to turn the ir_txl into OPCODE_TXL, but it was never handled.
* i965/fs: Use a properly named constant in TXB handling. | Kenneth Graunke | 2011-02-25 | 1 | -1/+1
  The old value, BRW_SAMPLER_MESSAGE_SIMD8_SAMPLE, makes it sound like we're doing a non-bias texture lookup. It has the same value as the new constant BRW_SAMPLER_MESSAGE_SIMD8_SAMPLE_BIAS_COMPARE, so there should be no functional changes.
* i965: Add #defines for gen4 SIMD8 TXB/TXL with shadow comparison. | Kenneth Graunke | 2011-02-25 | 1 | -0/+2
  From volume 4, page 161 of the public i965 documentation.
* i965: Increase Sandybridge point size clamp in the clip state. | Kenneth Graunke | 2011-02-24 | 1 | -1/+1
  255.875 matches the hardware documentation. Presumably this was a typo.
  NOTE: This is a candidate for the 7.10 branch, along with commit 2bfc23fb86964e4153f57f2a56248760f6066033.
  Reviewed-by: Eric Anholt <[email protected]>
* intel: Try using glCopyTexSubImage2D in _mesa_meta_BlitFramebuffer | Neil Roberts | 2011-02-24 | 3 | -22/+108
  In the case where glBlitFramebuffer is being used to copy to a texture without scaling, it is faster if we can use the hardware to do a blit rather than having to do a texture render. In most of the drivers glCopyTexSubImage2D will use a blit, so this patch makes it check for when glBlitFramebuffer is doing a simple copy and then divert to glCopyTexSubImage2D.
  This was originally proposed as an extension to the common meta-ops. However, it was rejected as using the BLT is only advantageous for Intel hardware.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=33934
  Signed-off-by: Chris Wilson <[email protected]>
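A check for the "simple copy" case might look roughly like the sketch below. The function name `blit_is_unscaled_copy` is hypothetical, and the conditions shown (same extent, no mirroring) are an assumption about what "without scaling" has to mean; a real driver would also need to consider formats, buffer masks, and the destination type.

```c
#include <assert.h>
#include <stdbool.h>

/* Returns true when a blit from the source rectangle to the destination
 * rectangle neither scales nor mirrors the image, so it could in principle
 * be served by a plain copy. Rectangles use glBlitFramebuffer-style corner
 * coordinates (X0, Y0) to (X1, Y1). */
static bool blit_is_unscaled_copy(int srcX0, int srcY0, int srcX1, int srcY1,
                                  int dstX0, int dstY0, int dstX1, int dstY1)
{
    /* A source rectangle with swapped corners means a mirrored blit. */
    if (srcX1 < srcX0 || srcY1 < srcY0)
        return false;

    /* Equal signed extents rule out both scaling and a mirrored
     * destination rectangle. */
    return (srcX1 - srcX0) == (dstX1 - dstX0) &&
           (srcY1 - srcY0) == (dstY1 - dstY0);
}
```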
* i965: Remember to pack the constant blend color as floats into the batch | Chris Wilson | 2011-02-24 | 1 | -4/+4
  Fixes regression from aac120977d1ead319141d48d65c9bba626ec03b8.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=34597
  Signed-off-by: Chris Wilson <[email protected]>
* intel: Reset the buffer offset after releasing reference to packed upload | Chris Wilson | 2011-02-24 | 2 | -58/+77
  Fixes oglc/vbo(basic.bufferdata)
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=34603
  Signed-off-by: Chris Wilson <[email protected]>
* i965: Unmap the correct pointer after discontiguous upload | Chris Wilson | 2011-02-24 | 1 | -2/+3
  Fixes piglit/fbo-depth-sample-compare:
  ==14722== Invalid free() / delete / delete[]
  ==14722==    at 0x4C240FD: free (vg_replace_malloc.c:366)
  ==14722==    by 0x84FBBFD: intel_upload_unmap (intel_buffer_objects.c:695)
  ==14722==    by 0x85205BC: brw_prepare_vertices (brw_draw_upload.c:457)
  ==14722==    by 0x852F975: brw_validate_state (brw_state_upload.c:394)
  ==14722==    by 0x851FA24: brw_draw_prims (brw_draw.c:365)
  ==14722==    by 0x85F2221: vbo_exec_vtx_flush (vbo_exec_draw.c:389)
  ==14722==    by 0x85EF443: vbo_exec_FlushVertices_internal (vbo_exec_api.c:543)
  ==14722==    by 0x85EF49B: vbo_exec_FlushVertices (vbo_exec_api.c:973)
  ==14722==    by 0x86D6A16: _mesa_set_enable (enable.c:351)
  ==14722==    by 0x42CAD1: render_to_fbo (in /home/ickle/git/piglit/bin/fbo-depth-sample-compare)
  ==14722==    by 0x42CEE3: piglit_display (in /home/ickle/git/piglit/bin/fbo-depth-sample-compare)
  ==14722==    by 0x42F508: display (in /home/ickle/git/piglit/bin/fbo-depth-sample-compare)
  ==14722== Address 0xc606310 is 0 bytes after a block of size 18,720 alloc'd
  ==14722==    at 0x4C244E8: malloc (vg_replace_malloc.c:236)
  ==14722==    by 0x85202AB: copy_array_to_vbo_array (brw_draw_upload.c:256)
  ==14722==    by 0x85205BC: brw_prepare_vertices (brw_draw_upload.c:457)
  ==14722==    by 0x852F975: brw_validate_state (brw_state_upload.c:394)
  ==14722==    by 0x851FA24: brw_draw_prims (brw_draw.c:365)
  ==14722==    by 0x85F2221: vbo_exec_vtx_flush (vbo_exec_draw.c:389)
  ==14722==    by 0x85EF443: vbo_exec_FlushVertices_internal (vbo_exec_api.c:543)
  ==14722==    by 0x85EF49B: vbo_exec_FlushVertices (vbo_exec_api.c:973)
  ==14722==    by 0x86D6A16: _mesa_set_enable (enable.c:351)
  ==14722==    by 0x42CAD1: render_to_fbo (in /home/ickle/git/piglit/bin/fbo-depth-sample-compare)
  ==14722==    by 0x42CEE3: piglit_display (in /home/ickle/git/piglit/bin/fbo-depth-sample-compare)
  ==14722==    by 0x42F508: display (in /home/ickle/git/piglit/bin/fbo-depth-sample-compare)
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=34604
  Signed-off-by: Chris Wilson <[email protected]>
* intel: Protect against waiting on a NULL render target bo | Chris Wilson | 2011-02-24 | 1 | -1/+1
  If we fall back to software rendering due to the render target being absent (GPU hang or other error in creating the named target), then we do not need to nor should we wait upon the results.
  Reported-by: Magnus Kessler <[email protected]>
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=34656
  Signed-off-by: Chris Wilson <[email protected]>
* intel: gen3 is particular sensitive to batch size | Chris Wilson | 2011-02-23 | 1 | -1/+1
  ... and prefers a small batch whereas gen4+ prefer a large batch to carry more state. Tuning using openarena/padman indicates that a batch size of just 4096 is best for those cases.
  Bugzilla: https://bugs.freedesktop.org/process_bug.cgi
  Signed-off-by: Chris Wilson <[email protected]>
* i915: And remember assign the new value to the state reg... | Chris Wilson | 2011-02-23 | 1 | -0/+1
  Fixes regression from 298ebb78de8a6b6edf0aa0fe8d784d00bbc2930e.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=34589
  Signed-off-by: Chris Wilson <[email protected]>
* i965: Increase Sandybridge point size clamp. | Kenneth Graunke | 2011-02-22 | 1 | -1/+1
  255.875 matches the hardware documentation. Presumably this was a typo.
  Found by inspection. Not known to fix any issues.
  Reviewed-by: Eric Anholt <[email protected]>
* i965/fs: Correctly set up gl_FragCoord.w on Sandybridge. | Kenneth Graunke | 2011-02-22 | 1 | -1/+1
  pixel_w is the final result; wpos_w is used on gen4 to compute it.
  NOTE: This is a candidate for the 7.10 branch.
  Reviewed-by: Eric Anholt <[email protected]>
* i965/fs: Refactor control flow stack handling. | Kenneth Graunke | 2011-02-22 | 1 | -7/+27
  We can't safely use fixed size arrays since Gen6+ supports unlimited nesting of control flow.
  NOTE: This is a candidate for the 7.10 branch.
  Reviewed-by: Eric Anholt <[email protected]>
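One common way to drop a fixed-size array is a stack that grows on demand, sketched below. This is a generic illustration rather than the commit's code: `cf_stack` and its helpers are hypothetical names, and the stack stores plain integers where a compiler would store instruction pointers or indices.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical growable stack for tracking nested control flow. */
typedef struct {
    int   *data;   /* heap storage, NULL until the first push */
    size_t depth;  /* number of entries currently on the stack */
    size_t cap;    /* allocated capacity in entries */
} cf_stack;

/* Push a value, doubling the storage when full. Returns 0 on success and
 * -1 if memory could not be allocated (the stack is left unchanged). */
static int cf_stack_push(cf_stack *s, int value)
{
    if (s->depth == s->cap) {
        size_t new_cap = s->cap ? s->cap * 2 : 16;
        int *grown = realloc(s->data, new_cap * sizeof *grown);

        if (!grown)
            return -1;
        s->data = grown;
        s->cap = new_cap;
    }
    s->data[s->depth++] = value;
    return 0;
}

/* Pop the most recently pushed value; the stack must not be empty. */
static int cf_stack_pop(cf_stack *s)
{
    assert(s->depth > 0);
    return s->data[--s->depth];
}

/* Release the storage and reset the stack to its empty state. */
static void cf_stack_free(cf_stack *s)
{
    free(s->data);
    s->data = NULL;
    s->depth = 0;
    s->cap = 0;
}
```

A zero-initialised `cf_stack` is a valid empty stack, so nesting depth is limited only by available memory rather than by a compile-time constant.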
* i965/fs: Avoid register coalescing away gen6 MATH workarounds. | Kenneth Graunke | 2011-02-22 | 1 | -0/+10
  The code that generates MATH instructions attempts to work around the hardware ignoring source modifiers (abs and negate) by emitting moves into temporaries. Unfortunately, this pass coalesced those registers, restoring the original problem. Avoid doing that.
  Fixes several OpenGL ES2 conformance failures on Sandybridge.
  NOTE: This is a candidate for the 7.10 branch.
  Reviewed-by: Eric Anholt <[email protected]>
* i965/fs: Apply source modifier workarounds to POW as well. | Kenneth Graunke | 2011-02-22 | 1 | -3/+7
  Single-operand math already had these workarounds, but POW (the only two-operand function) did not. It needs them too; otherwise we can hit assertion failures in brw_eu_emit.c when code is actually generated.
  NOTE: This is a candidate for the 7.10 branch.
  Reviewed-by: Eric Anholt <[email protected]>
* i965: Fix shaders that write to gl_PointSize on Sandybridge. | Kenneth Graunke | 2011-02-22 | 1 | -0/+2
  gl_PointSize (VERT_RESULT_PSIZ) doesn't take up a message register, as it's part of the header. Without this fix, writing to gl_PointSize would cause the SF to read and use the wrong attributes, leading to all kinds of random-looking failures.
  Reviewed-by: Eric Anholt <[email protected]>
* i965: Trim the interleaved upload to the minimum number of vertices | Chris Wilson | 2011-02-22 | 1 | -1/+5
  ... should have no impact on a properly formatted draw operation.
  Signed-off-by: Chris Wilson <[email protected]>
* i965: Reinstate max-index paranoia | Chris Wilson | 2011-02-22 | 1 | -1/+1
  Don't trust the applications not to reference beyond the end of the vertex buffers.
  Signed-off-by: Chris Wilson <[email protected]>
* i965: Zero the offset into the vbo when uploading non-interleaved | Chris Wilson | 2011-02-22 | 1 | -0/+1
  Fixes regression from 559435d9152acc7162e4e60aae6591c7c6c8274b.
  Signed-off-by: Chris Wilson <[email protected]>
* i965: Fix VB packet reuse when offset for the new buffer isn't stride aligned. | Eric Anholt | 2011-02-21 | 1 | -1/+1
  Fixes regression in scissor-stencil-clear and 5 other tests.
* radeon: add default switch case to silence unhandled enum warning | Brian Paul | 2011-02-21 | 1 | -0/+2
* intel: Fix insufficient integer width for upload buffer offset | Chris Wilson | 2011-02-21 | 1 | -2/+2
  I was being overly miserly and gave the offset of the buffer into the bo insufficient bits, distracted by the adjacency of the buffer[4096].
  Ref: https://bugs.freedesktop.org/show_bug.cgi?id=34541
  Signed-off-by: Chris Wilson <[email protected]>
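The class of bug described here, an offset stored in a field that is too narrow, can be demonstrated with a small sketch. The 16-bit width below is purely an assumption for illustration; the message does not state the actual field widths, and `store_in_u16` and `fits_in_bits` are hypothetical helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Simulates storing an offset in a 16-bit field and reading it back.
 * Values of 65536 or more silently wrap around. */
static uint32_t store_in_u16(uint32_t offset)
{
    return (uint16_t)offset;
}

/* Returns non-zero when `value` can be represented in `bits` bits. */
static int fits_in_bits(uint64_t value, unsigned bits)
{
    assert(bits > 0 && bits < 64);
    return value < ((uint64_t)1 << bits);
}
```

A field sized for a 4096-byte staging array is wide enough for offsets into that array, but an offset into a much larger buffer object needs a field sized for the buffer object instead.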
* i965: Remove spurious duplicate ADVANCE_BATCH | Chris Wilson | 2011-02-21 | 1 | -1/+0
  ... a leftover from a bad merge.
  Signed-off-by: Chris Wilson <[email protected]>
* i915: Emit a single relocation per vbo | Chris Wilson | 2011-02-21 | 5 | -17/+45
  Reducing the number of relocations has lots of nice knock-on effects, not least including reducing batch buffer size, auxiliary array sizes (vmalloced and copied into the kernel), processing of uncached relocations etc.
  Signed-off-by: Chris Wilson <[email protected]>
* i915: Suppress emission of redundant stencil updates | Chris Wilson | 2011-02-21 | 1 | -45/+55
  Signed-off-by: Chris Wilson <[email protected]>
* i915: Separate BLEND from general context state. | Chris Wilson | 2011-02-21 | 3 | -22/+40
  Signed-off-by: Chris Wilson <[email protected]>
* i915: Only flag context changes if the actual state is changed | Chris Wilson | 2011-02-21 | 1 | -49/+105
  Signed-off-by: Chris Wilson <[email protected]>
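The general pattern named in this commit title, marking state dirty only when a value really changes, can be sketched as below. The structure and names (`hw_context`, `set_state_reg`, `NUM_STATE_REGS`) are hypothetical and do not reflect the i915 driver's actual data layout.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_STATE_REGS 8u  /* hypothetical number of tracked registers */

/* Hypothetical shadow copy of hardware state plus a per-register dirty mask. */
struct hw_context {
    uint32_t regs[NUM_STATE_REGS];
    uint32_t dirty;  /* bit i set means regs[i] must be re-emitted */
};

/* Update one register, flagging it dirty only if the value differs from the
 * shadow copy. Returns 1 if the state changed and 0 if it was redundant. */
static int set_state_reg(struct hw_context *ctx, unsigned idx, uint32_t value)
{
    assert(idx < NUM_STATE_REGS);
    if (ctx->regs[idx] == value)
        return 0;  /* redundant update: nothing to emit */
    ctx->regs[idx] = value;
    ctx->dirty |= 1u << idx;
    return 1;
}
```

Skipping redundant updates this way keeps unchanged state out of the batch, which is the same goal as the neighbouring commits that suppress repeated stencil, sampler, and constants emission.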
* i915: suppress repeated sampler state emission | Chris Wilson | 2011-02-21 | 2 | -0/+11
  Signed-off-by: Chris Wilson <[email protected]>
* i915: Eliminate redundant CONSTANTS updates | Chris Wilson | 2011-02-21 | 1 | -25/+26
  Signed-off-by: Chris Wilson <[email protected]>
* i965: Use compiler builtins when available | Chris Wilson | 2011-02-21 | 2 | -11/+8
  Signed-off-by: Chris Wilson <[email protected]>
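Using a builtin when the compiler offers one, with a portable fallback otherwise, usually follows the shape sketched below. Which builtins the commit actually adopted is not stated, so the choice of `__builtin_popcount` and `__builtin_ffs` (both real GCC builtins) is an assumption, and the wrapper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Number of set bits in v. */
static unsigned count_set_bits(uint32_t v)
{
#if defined(__GNUC__)
    return (unsigned)__builtin_popcount(v);
#else
    /* Portable fallback: clear the lowest set bit until none remain. */
    unsigned n = 0;
    while (v) {
        v &= v - 1;
        n++;
    }
    return n;
#endif
}

/* 1-based position of the lowest set bit in v, or 0 if v is zero. */
static int find_first_set(uint32_t v)
{
#if defined(__GNUC__)
    return __builtin_ffs((int)v);
#else
    /* Portable fallback: shift right until the lowest bit is set. */
    int pos = 1;
    if (!v)
        return 0;
    while (!(v & 1u)) {
        v >>= 1;
        pos++;
    }
    return pos;
#endif
}
```

Keeping the fallback behind the same function signature means callers do not need to know which path was compiled in.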