path: root/src/mesa/drivers
Commit message (Author, Age, Files, Lines)
* i965: handle gl_PointCoord for Gen4 and Gen5 platforms (Yuanhan Liu, 2012-03-07, 5 files, -5/+45)

  This patch adds support for the gl_PointCoord GL builtin variable on gen4 and gen5 (ILK) platforms.

  Unlike gen6+, these platforms have no hardware support for gl_PointCoord, which means the hardware will not calculate the interpolation coefficients for you. Instead, you have to handle it yourself in the SF shader stage.

  Unfortunately, gl_PointCoord is an FS builtin variable rather than a VS one, so it is not included in the c.vue_map generated in the VS stage, and the current code is not aware of this attribute. To handle it correctly, we need to add it to c.vue_map manually so that the SF shader generates the interpolation coefficients the FS shader needs. The SF stage has its own copy of vue_map, so I think it's safe to do this manually.

  Since handling gl_PointCoord on gen4 and gen5 platforms is a little special, I added a lot of comments and hope I didn't overdo it ;)

  v2: add a /* _NEW_BUFFERS */ comment to note the state flag dependency and also add the _NEW_BUFFERS dirty mask (Eric).

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45975
  Piglit: glsl-fs-pointcoord and fbo-gl_pointcoord

  NOTE: This is a candidate for stable release branches.

  Signed-off-by: Yuanhan Liu <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* i915: move the FALLBACK_DRAW_OFFSET check outside the drawing rect check (Yuanhan Liu, 2012-03-07, 1 file, -4/+3)

  We have to fall back when the 'Clipped Drawing Rectangle X/Y Max' exceeds the hardware's limit, regardless of whether the drawing rectangle offset changed.

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=46665

  NOTE: This is a candidate for stable release branches.

  Signed-off-by: Yuanhan Liu <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* dri/nouveau: don't use nested functions (nobled, 2012-03-06, 2 files, -64/+78)

  It's a GNU extension that isn't supported by clang right now:
  http://gcc.gnu.org/onlinedocs/gcc-4.6.3/gcc/Nested-Functions.html
  http://clang.llvm.org/docs/UsersManual.html#c_unimpl_gcc

  With this, clang now compiles the nouveau classic driver.

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=44061

  (Types changed from e.g. 'unsigned char' to 'GLubyte' so that the types can be concatenated to form a unique function name without any whitespace interfering.)

  [ Francisco Jerez: give meaningful names to the dispatch functions. ]
* i965: fixup W-tile offset computation to take swizzling into account (Daniel Vetter, 2012-03-05, 7 files, -22/+49)

  There's even a comment in the code containing the right swizzling computations!

  Previously this has not been noticed because we need to manually enable swizzling on snb/ivb (kernel 3.4 will do that) and we don't use the separate stencil on ilk (where the bios enables swizzling).

  This fixes piglit ./bin/fbo-stencil readpixels GL_DEPTH32F_STENCIL8 -auto on recent drm-intel-next kernels.

  Also remove the comment about ivb, it's stale now.

  Swizzling detection is done by allocating a temporary x-tiled buffer object. Unfortunately kernels before v3.2 lie on snb/ivb because they claim that swizzling is enabled, but it isn't. The kernel commit that fixes this for backport to pre-v3.2 is

    commit acc83eb5a1e0ae7dbbf89ca2a1a943ade224bb84
    Author: Daniel Vetter <[email protected]>
    Date: Mon Sep 12 20:49:16 2011 +0200

        drm/i915: fix swizzling on gen6+

  But if the kernel doesn't lie, this now works on both swizzling and non-swizzling machines.

  NOTE: This is a candidate for the 8.0 branch.

  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
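
To make the idea of bit-6 swizzling concrete, here is a minimal sketch. It assumes a mode in which address bit 9 is XORed into bit 6; the commit message does not say which mode applies to W-tiles on which hardware, and `apply_bit6_swizzle` is a hypothetical name, not the driver's function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: adjust a byte offset for bit-6 swizzling.
 * Assumption: the active swizzle mode XORs address bit 9 into bit 6.
 * Other modes (for example ones that also involve bit 10) are not shown.
 */
static uint32_t apply_bit6_swizzle(uint32_t offset, bool swizzled)
{
    if (!swizzled)
        return offset;

    /* Shift bit 9 down to bit 6, isolate it, and XOR it in. */
    return offset ^ ((offset >> 3) & 0x40u);
}
```

The `swizzled` flag stands in for the result of the detection step described above, so the same offset code can run on both kinds of machines.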
* meta: Fix compilation without FEATURE_EXT_transform_feedback (Benjamin Franzke, 2012-03-05, 1 file, -0/+6)

  That is when building with --disable-opengl.

  Fix for commit cb045880b113b0042d8dfb7e4cdf76e6cc76c1d1.

  CC: Paul Berry <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Chad Versace <[email protected]>
* meta: Fix compilation without FEATURE_feedback (Benjamin Franzke, 2012-03-05, 1 file, -0/+6)

  That is when building with --disable-opengl.

  Fix for commit c5f4024a793f1209b1693aed9a46be9374ba4741.

  CC: Chad Versace <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Chad Versace <[email protected]>
* i915: fix wrong rendering of gl_PointSize on Pineview (Yuanhan Liu, 2012-03-05, 1 file, -0/+4)

  The current code ignores the point size specified by the gl_PointSize builtin variable in the vertex shader on Pineview. This patch fixes that.

  This patch fixes the following issues on Pineview:
  webglc: https://cvs.khronos.org/svn/repos/registry/trunk/public/webgl/sdk/tests/conformance/rendering/point-size.html
  piglit: glsl-vs-point-size

  NOTE: This is a candidate for stable release branches.

  v2: pick Eric's nice tip for fixing this issue in hardware rendering.
  v3: the last arg of EMIT_ATTR specifies the size in _bytes_. (Eric)

  Signed-off-by: Yuanhan Liu <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* i915: Fix i830 polygon stipple from PBOs. (Kurt Roeckx, 2012-03-02, 1 file, -1/+7)

  This is a direct port of the i915 patch in a856da63247a4b403f6350914f732e14d1530ed1. Fixes glean's pbo test.

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=41372
  Reviewed-by: Eric Anholt <[email protected]>

  NOTE: This is a candidate for release branches.
* i915: Compute maximum number of verts using the actual batchbuffer size. (Kurt Roeckx, 2012-03-02, 1 file, -3/+3)

  We were looking at the size of batch.map for how big the batchbuffer was, but on 865 we just use a single-page batchbuffer due to hardware limits.

  v2: Removed check for sizeof map < bo->size, since that's always false. [change by anholt]

  NOTE: This is a candidate for release branches.

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=41495
* i830: Compute initial number of vertices from remaining batch space (Chris Wilson, 2012-03-02, 1 file, -5/+11)

  In order to prevent an overflow of the batch buffer when emitting triangles, we need to limit the initial primitive to fit within the current batch. To do so, we need to measure the remaining space and from that compute the maximum number of vertices that fit into that space.

  Reported-by: Kurt Roeckx <[email protected]>
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=41495
  Signed-off-by: Chris Wilson <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>

  NOTE: This is a candidate for release branches.
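
The calculation described in this commit can be sketched as follows. This is only an illustration under stated assumptions: `max_verts_for_batch`, its parameters, and the idea of a separate `reserved` amount are hypothetical and are not taken from the driver.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: how many whole vertices fit into the space
 * still free in a batch buffer.
 *
 *   batch_size  - total size of the batch buffer in bytes
 *   used        - bytes already emitted
 *   reserved    - bytes that must stay free (assumed, e.g. for closing commands)
 *   vertex_size - size of one vertex in bytes
 */
static size_t max_verts_for_batch(size_t batch_size, size_t used,
                                  size_t reserved, size_t vertex_size)
{
    assert(vertex_size > 0);

    /* No room left at all: avoid unsigned underflow in the subtraction. */
    if (used + reserved >= batch_size)
        return 0;

    return (batch_size - used - reserved) / vertex_size;
}
```

With a single-page (4096-byte) batch, 1024 bytes used, 64 bytes reserved, and 32-byte vertices, this yields 94 vertices for the initial primitive.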
* dri/i915: Fix off-by-one in i830 clip region size. (Alban Browaeys, 2012-03-02, 1 file, -2/+2)

  The hardware, like i915, uses inclusive bounds on min and max for the drawing rectangle, but we were providing an exclusive value. The number of bits used by the hardware only covers this value going up to the maximum size, so when we programmed 2048 as the maximum inclusive X, it saw a maximum X of 0 and clipped all rendering.

  This caused rendering failures in gnome-shell. Fixes piglit fbo-maxsize.

  v2: dropped changes to the blitter, which does use an exclusive x2, y2. [change by anholt]

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45558
  Reviewed-by: Eric Anholt <[email protected]>

  NOTE: This is a candidate for release branches.
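
The wraparound described above can be reproduced with a small sketch. The 11-bit field width is an assumption chosen so that 2048 wraps to 0, matching the behavior the message describes; the real register layout is not given here, and both helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field width: 11 bits hold 0..2047, so 2048 truncates to 0. */
#define CLIP_FIELD_BITS 11u
#define CLIP_FIELD_MASK ((1u << CLIP_FIELD_BITS) - 1u)

/* Hypothetical: the value the hardware sees after truncation to the field. */
static uint32_t pack_clip_coord(uint32_t value)
{
    return value & CLIP_FIELD_MASK;
}

/* Hypothetical: convert an exclusive upper bound (such as a width)
 * into the inclusive maximum coordinate the hardware expects. */
static uint32_t inclusive_max(uint32_t exclusive_max)
{
    assert(exclusive_max > 0);
    return exclusive_max - 1u;
}
```

Passing the exclusive bound 2048 straight to `pack_clip_coord` gives 0, while passing `inclusive_max(2048)` gives 2047, which is the last valid column of a 2048-wide target.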
* intel: Don't enable GL_ARB_draw_instanced pre-gen4. (Eric Anholt, 2012-03-02, 1 file, -1/+1)

  swtnl doesn't handle this extension.
* i915: Fix piglit fbo-nodepth-test on i830. (Eric Anholt, 2012-03-02, 2 files, -3/+8)

  This is a direct port of fc4fba52cf7e9616c70dd76b4d6bdba6582e157b from i915, and fixes GPU hangs when running piglit.

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=41372
  Reviewed-by: Eric Anholt <[email protected]>

  NOTE: This is a candidate for release branches.
* i965: Disable PrimitiveID upload. (Kenneth Graunke, 2012-02-29, 1 file, -1/+1)

  We currently don't support gl_PrimitiveID, and I believe asking the hardware to generate it results in vertex cache invalidations. This could result in slowdowns for applications that use gl_InstanceID, which would be counter-productive. Just turn it off for now.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* i965: Enable the GL_ARB_draw_instanced extension. (Kenneth Graunke, 2012-02-29, 2 files, -2/+3)

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* i965: Fix swizzles for system values such as gl_InstanceID. (Kenneth Graunke, 2012-02-29, 1 file, -0/+4)

  visit(ir_variable *) sets dst_reg::writemask to the appropriate channel for system values. Unfortunately, visit(ir_dereference_variable *) then calls swizzle_for_size, which for a float, sets the swizzle to .x.

  This works for gl_VertexID, since we store it in the .x component (see brw_draw_upload.c:732 - VID), but fails for gl_InstanceID (IID) since we store it in the .y channel.

  To fix this, avoid calling swizzle_for_size on ir_var_system_values.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
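
The mismatch can be illustrated with a simplified swizzle encoding. This is a sketch only: the 2-bits-per-channel packing and the helper names below are assumptions made for the example and are not the driver's actual definitions.

```c
#include <assert.h>

/* Channel indices used by this example. */
enum { CH_X = 0, CH_Y = 1, CH_Z = 2, CH_W = 3 };

/* Assumed encoding: four 2-bit channel selects packed into one value. */
#define MAKE_SWIZZLE(a, b, c, d) \
    ((a) | ((b) << 2) | ((c) << 4) | ((d) << 6))

/* Size-based choice for a scalar: always read .x (what goes wrong for .y). */
static unsigned swizzle_for_scalar(void)
{
    return MAKE_SWIZZLE(CH_X, CH_X, CH_X, CH_X);
}

/* Channel-based choice: read the channel where the value is actually stored. */
static unsigned swizzle_for_channel(unsigned ch)
{
    assert(ch <= CH_W);
    return MAKE_SWIZZLE(ch, ch, ch, ch);
}

/* Extract the channel selected for output component idx (0..3). */
static unsigned swizzle_get(unsigned swz, unsigned idx)
{
    return (swz >> (idx * 2)) & 3u;
}
```

A value stored in .x reads correctly with either helper, but a value stored in .y is only read correctly by `swizzle_for_channel(CH_Y)`, which is why a size-based swizzle is the wrong choice for system values.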
* i965: Fix Gen6+ dynamic state upper bound on older kernels. (Kenneth Graunke, 2012-02-29, 1 file, -2/+1)

  Kernels prior to 271d81b84171d84723357ae6d172ec16b0d8139c (March 2011) don't support relocations outside of the target buffer object.

  Rather than guarding this with a I915_PARAM_HAS_RELAXED_DELTA check, just smash the bound to 0xfffff001 like we do on Ironlake. This effectively gives us no upper bound check, just like we did prior to commit 271d81b84171d84723357ae6d172ec16b0d8139c.

  Daniel Vetter would also like to mention that this relies on the guard page at the end of the GTT.

  NOTE: This is a candidate for release branches.

  Fixes a regression since 271d81b84171d84723357ae6d172ec16b0d8139c.

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=46766
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Daniel Vetter <[email protected]>
* mesa: Push the shine table into the tnl module. (Mathias Fröhlich, 2012-02-29, 2 files, -4/+4)

  All users of the shine table outside of the tnl module are gone. Move the implementation into the tnl module and prefix the public functions with _tnl.

  Reviewed-by: Alex Deucher <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
  Signed-off-by: Mathias Froehlich <[email protected]>
* i965: Avoid blocking on the GPU for setting the HiZ op vertex data. (Eric Anholt, 2012-02-28, 4 files, -60/+9)

  We need to allocate new space every time to avoid blocking on the last HiZ op completing. There are two easy ways to do this: brw_state_batch() and intel_upload_data(). brw_state_batch() is simpler and avoids another buffer allocation.

  Improves Unigine Tropics performance 0.376416% +/- 0.148722% (n=7).

  Reviewed-by: Chad Versace <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* xlib: silence unused var warning (Brian Paul, 2012-02-27, 1 file, -0/+1)
* mesa/gdi: include swrast.h to fix compilation (Brian Paul, 2012-02-24, 1 file, -0/+1)
* xlib: remove STENCIL_BITS (Brian Paul, 2012-02-24, 1 file, -3/+3)
* mesa: remove last of MAX_WIDTH, MAX_HEIGHT (Brian Paul, 2012-02-24, 1 file, -1/+0)

  Define new MAX_VIEWPORT_WIDTH/HEIGHT and MAX_RENDERBUFFER_SIZE values instead.
* intel: remove MAX_WIDTH usage in intelInitContext() (Brian Paul, 2012-02-24, 1 file, -4/+2)
* osmesa: use SWRAST_MAX_WIDTH/HEIGHT (Brian Paul, 2012-02-24, 1 file, -4/+4)
* dri/swrast: use SWRAST_MAX_WIDTH/HEIGHT (Brian Paul, 2012-02-24, 1 file, -1/+1)
* xlib: use SWRAST_MAX_WIDTH/HEIGHT (Brian Paul, 2012-02-24, 1 file, -3/+3)
* i915: Initialize swrast_texture_image structure fields. (Paul Berry, 2012-02-22, 1 file, -0/+2)

  Commit 980f6f1 (mesa: move gl_texture_image::Width/Height/DepthScale fields to swrast) moved the initialization of the Width, Height, and DepthScale fields to _swrast_alloc_texture_image_buffer(). However, i915 doesn't call this function because it performs its own buffer allocation. As a result, the Width, Height, and DepthScale fields weren't getting initialized properly, and some operations requiring swrast would fail.

  This patch ensures that Width, Height, and DepthScale are properly initialized by separating the code that sets them into a new function, _swrast_init_texture_image(), which is called by intel_alloc_texture_image_buffer() as well as _swrast_alloc_texture_image_buffer(). It also moves the initialization of _IsPowerOfTwo into this function.

  Fixes piglit test fbo/fbo-cubemap on i915.

  Partially fixes https://bugs.freedesktop.org/show_bug.cgi?id=41216

  This is a candidate for the 8.0 branch.

  Reviewed-and-tested-by: Ian Romanick <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* intel: bump DRI_IMAGE extension version to 3 (Jesse Barnes, 2012-02-22, 1 file, -1/+1)

  To indicate support for the format query.

  Reviewed-by: Kristian Høgsberg <[email protected]>
  Signed-off-by: Jesse Barnes <[email protected]>
* gbm: track buffer format through DRI drivers (Jesse Barnes, 2012-02-22, 4 files, -0/+8)

  GBM needs the buffer format in order to communicate with DRM and clients for things like scanout. So track the DRI format requested in the various back ends and use it to return the DRI format back to GBM when requested. GBM will then map this into the GBM surface type (which is in turn based on the DRM fb format list).

  Signed-off-by: Jesse Barnes <[email protected]>
* i965/gen6: Fix near-NULL deref in setting up GS binding table for non-XFB. (Eric Anholt, 2012-02-21, 1 file, -5/+8)

  Reviewed-by: Kenneth Graunke <[email protected]>
  Tested-by: Kenneth Graunke <[email protected]>
* i965: Correct the size of the state batch space allocated for binding tables. (Eric Anholt, 2012-02-21, 2 files, -2/+2)

  In the gen6 GS case, we were under-counting and so other state would get smashed. In the VS case, we were over-counting, so everything was fine.

  Reviewed-by: Kenneth Graunke <[email protected]>
  Tested-by: Kenneth Graunke <[email protected]>
* i965: Fix a bad comment in gen6 sol setup. (Eric Anholt, 2012-02-21, 1 file, -3/+1)

  This was copy and paste from the VS where I had similar code. We're only looking at things derived from BRW_NEW_VERTEX_PROGRAM in this block.

  Reviewed-by: Kenneth Graunke <[email protected]>
  Tested-by: Kenneth Graunke <[email protected]>
* i965/gen6: Fix the size of the GS surface binding table. (Eric Anholt, 2012-02-21, 1 file, -1/+1)

  I obviously didn't test on gen6 before pushing.

  Reviewed-by: Kenneth Graunke <[email protected]>
  Tested-by: Kenneth Graunke <[email protected]>
* i965: Only set Last Render Target Select on the last FB write. (Kenneth Graunke, 2012-02-21, 1 file, -1/+1)

  Fixes GPU hangs in OilRush, Trine, and Amnesia: The Dark Descent, which all use MRT (multiple render targets).

  NOTE: This is a candidate for release branches.

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=38720
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=40059
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45216
  Reviewed-by: Eric Anholt <[email protected]>
  Signed-off-by: Kenneth Graunke <[email protected]>
* intel: Silence valgrind warning for getparam ioctl argument. (Eric Anholt, 2012-02-21, 1 file, -0/+1)

  It was concerned that the 4 pad bytes on LP64 were uninitialized.
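
A common way to avoid this kind of warning is to zero the whole argument structure before filling in its members. The sketch below assumes that approach; the commit message does not say how the warning was silenced, and `struct getparam_arg` is a hypothetical stand-in with the same shape as the ioctl argument (an `int` followed by a pointer), not the kernel's definition.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical stand-in for the ioctl argument. On LP64 the int is
 * 4 bytes and the pointer needs 8-byte alignment, so the compiler
 * inserts 4 padding bytes between the two members.
 */
struct getparam_arg {
    int  param;
    int *value;
};

/* Zero the whole struct first so the padding starts out initialized,
 * then set the members that carry real data. */
static void init_getparam_arg(struct getparam_arg *gp, int param, int *value)
{
    assert(gp != NULL);
    memset(gp, 0, sizeof(*gp));
    gp->param = param;
    gp->value = value;
}
```

Tools such as valgrind track initialization per byte, so a struct that is only initialized member by member can still be reported when the whole object is handed to an ioctl.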
* i965: Rename the original binding table to mention that it's the WM now. (Eric Anholt, 2012-02-21, 7 files, -32/+30)

  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Split the gen6 GS binding table to a separate table. (Eric Anholt, 2012-02-21, 5 files, -10/+75)

  Improves VS state change microbenchmark performance by 7.08729% +/- 1.22289% (n=10) on gen7, because we don't upload the 64 dwords of unused binding table any more.

  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Split the VS binding table to a separate table. (Eric Anholt, 2012-02-21, 9 files, -17/+94)

  This is a step toward making the samplers/binding tables reflect sampler uniform mappings instead of embedding those in the programs. No significant performance difference on the microbenchmark (n=10).

  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/gen6+: Avoid recomputing whether we use noperspective. (Eric Anholt, 2012-02-21, 3 files, -36/+10)

  Improves VS state change microbenchmark performance 2.38246% +/- 1.15046% (n=20).

  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/gen7: Skip checking if we need a GS program for now. (Eric Anholt, 2012-02-21, 1 file, -1/+0)

  We always say no. Improves VS state change microbenchmark performance 7.68747% +/- 1.40826% (n=10).

  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Compute required barycentric interp modes once at FS compile time. (Eric Anholt, 2012-02-21, 4 files, -20/+17)

  Improves VS state change microbenchmark performance 1.78817% +/- 0.556878% (n=25).

  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Move VUE map computation to once at VS compile time. (Eric Anholt, 2012-02-21, 12 files, -49/+42)

  With this and the previous patch, 640x480 nexuiz is running 0.169118% +/- 0.0863696% faster (n=121). On a VS state change microbenchmark, performance is increased 8.28645% +/- 0.460478% (n=52).

  v2: Fix CACHE_NEW_VS comment.

  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Make the userclip flag for the VUE map come from VS prog data. (Eric Anholt, 2012-02-21, 10 files, -39/+29)

  This reduces recomputation of state based on non-clipping-related transform changes, and is a step toward removing VUE map recomputation.

  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Make the dummy fragment shader work in SIMD16 mode. (Kenneth Graunke, 2012-02-18, 1 file, -5/+7)

  If you're resorting to the dummy shader, you've probably already turned off SIMD16 mode. But if you didn't, it would die in a fire. We could either fail to compile in SIMD16 mode...or just fix it.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* i965: Fix GPU hangs in the dummy fragment shader. (Kenneth Graunke, 2012-02-18, 1 file, -0/+2)

  The dummy FB write failed to specify EOT and a message length, causing the GPU to hang. Now we can enjoy "everyone's favorite color" again.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* intel: Fix rendering from textures after RenderTexture(). (Eric Anholt, 2012-02-17, 5 files, -61/+47)

  There's a serious trap for drivers: RenderTexture() does not indicate that the texture is currently bound to the draw buffer, despite FinishRenderTexture() signaling that the texture is just now being unbound from the draw buffer.

  We were acting as if RenderTexture() *was* the start of rendering and that we could make texturing incoherent with the current contents of the renderbuffer. This caused intel oglconform sRGB Mipmap.1D_textures to fail, because we got a call to TexImage() and thus RenderTexture() on a texture bound to a framebuffer that wasn't the draw buffer, so we skipped validating the new image into the texture object used for rendering.

  We can't (easily) make RenderTexture() indicate the start of drawing, because both our driver and gallium are using it as the moment to set up the renderbuffer wrapper used for things like MapRenderbuffer(). Instead, postpone the setup of the workaround render target miptree until update_renderbuffer time, so that we no longer need to skip validation of miptrees used as render targets. As a bonus, this should make GL_NV_texture_barrier possible.

  (This also fixes a regression in the gen4 small-mipmap rendering since 3b38b33c1648b07e75dc4d8340758171e109c598, which switched set_draw_offset from image->mt to irb->mt but didn't move the irb->mt replacement up before set_draw_offset).

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=44961

  NOTE: This is a candidate for the 8.0 branch.
* intel: Improve the fallback debug for framebuffer status checks. (Eric Anholt, 2012-02-17, 1 file, -2/+17)
* i965: Emit Ivybridge VS workaround flushes. (Kenneth Graunke, 2012-02-15, 4 files, -2/+29)

  I recently discovered this text in the BSpec. It seems wise to comply, though I haven't observed it to fix anything yet.

  Fixes a regression in glean/fbo since 28cfa1fa213fe.

  NOTE: This is a candidate for stable release branches.

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45221
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* i965/fs: Take # of components into account in try_rewrite_rhs_to_dst. (Kenneth Graunke, 2012-02-15, 1 file, -0/+6)

  Commit dc7f449d1ac53a66e6efb56ccf2a5953418a26ca introduced a new method for avoiding MOVs: try to rewrite the destination of the instruction that produced the RHS so it writes into the LHS.

  Unfortunately, this is not safe for swizzled texturing operations, as they return a set of four contiguous registers. Consider the following:

    (assign (x) (var_ref vec_ctor_x)
            (swiz x (tex vec4 (var_ref m_sampY) (var_ref m_cordY) 0 1 ())))

  In this case, the source and destination registers are equal, since reg_offset is 0 for both. Yet, this is only a partial move: the texture operation generates four registers, and the LHS only covers one.

  Fixes color distortion in XBMC when using GLSL shaders.

  NOTE: This is a candidate for the 8.0 branch (with the previous commit).

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=44333
  Reviewed-by: Eric Anholt <[email protected]>
  Signed-off-by: Kenneth Graunke <[email protected]>