path: root/src/mesa
Commit message | Author | Age | Files | Lines
* glsl: add support for shader stencil export | Dave Airlie | 2010-10-13 | 2 | -0/+2
  This adds proper support for the GL_ARB_shader_stencil_export extension to the GLSL compiler. Thanks to Ian for pointing out where I need to add things.
* st/mesa: use shader stencil export to accelerate shader drawpixels. | Dave Airlie | 2010-10-13 | 4 | -57/+158
  If the pipe driver has shader stencil export, we can accelerate DrawPixels using it. It tries to pick an S8 texture first and falls back to X24S8 and then S8X24 if that isn't supported.
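The fallback order described in this commit could be sketched roughly as below. This is only an illustration: the enum, the `format_query_fn` callback, and every function name here are hypothetical and are not the state tracker's actual API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical format identifiers; the real driver uses its own enums. */
enum stencil_format {
    FORMAT_NONE = 0,
    FORMAT_S8,
    FORMAT_X24S8,
    FORMAT_S8X24
};

/* Hypothetical query: returns true if the driver can sample from fmt. */
typedef bool (*format_query_fn)(enum stencil_format fmt);

/* Try S8 first (least memory), then the 32-bit variants in order. */
enum stencil_format choose_stencil_format(format_query_fn is_supported)
{
    static const enum stencil_format preference[] = {
        FORMAT_S8, FORMAT_X24S8, FORMAT_S8X24
    };
    size_t i;

    assert(is_supported != NULL);
    for (i = 0; i < sizeof(preference) / sizeof(preference[0]); i++) {
        if (is_supported(preference[i]))
            return preference[i];
    }
    /* Nothing usable: the caller would take the unaccelerated path. */
    return FORMAT_NONE;
}

/* Example query for a driver that only exposes the 32-bit formats. */
bool only_packed_formats(enum stencil_format fmt)
{
    return fmt == FORMAT_X24S8 || fmt == FORMAT_S8X24;
}
```

With the example query above, `choose_stencil_format(only_packed_formats)` skips S8 and settles on the first 32-bit candidate.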
* st/mesa: add option to choose a texture format that we won't render to. | Dave Airlie | 2010-10-13 | 3 | -8/+22
  We need a texture to put the drawpixels data into. An S8 texture uses less memory/bandwidth than the 32-bit X24S8, but we might not be able to render directly to an S8, so this lets us specify that we won't be rendering to this texture.
* mesa: improve texstore for 8/24 formats and add texstore for S8. | Dave Airlie | 2010-10-13 | 1 | -119/+144
  This improves mesa texstore for 8/24 so it can create S24X8/X24S8 variants by keeping the depth bits static. It also adds a texstore for S8 so we can write out an S8 texture to use in the sampler for accelerated DrawPixels, to save memory bandwidth.
  The logic seems sound here, I've worked it out a few times on paper, though it would be good to have some review.
  Signed-off-by: Dave Airlie <[email protected]>
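"Keeping the depth bits static" while storing stencil into a packed 8/24 texel can be pictured with the sketch below. The bit layout (stencil in the top byte, depth in the low 24 bits) and the function name are assumptions made for illustration; the actual texstore code handles several layouts and is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout for this sketch: stencil in bits 31..24, depth in 23..0. */
#define DEPTH_MASK    0x00ffffffu
#define STENCIL_SHIFT 24

/* Overwrite only the stencil byte of each packed texel; depth is untouched. */
void store_stencil_keep_depth(uint32_t *dst, const uint8_t *stencil,
                              size_t count)
{
    size_t i;

    assert(dst != NULL && stencil != NULL);
    for (i = 0; i < count; i++) {
        dst[i] = (dst[i] & DEPTH_MASK) |
                 ((uint32_t)stencil[i] << STENCIL_SHIFT);
    }
}
```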
* mesa: add support for FRAG_RESULT_STENCIL. | Dave Airlie | 2010-10-13 | 1 | -2/+3
  This is needed to add support for stencil shader export.
  Signed-off-by: Dave Airlie <[email protected]>
* i965: Don't rebase the index buffer to min 0 if any arrays are in VBOs. | Eric Anholt | 2010-10-12 | 4 | -11/+15
  There was a check to only do the rebase if we didn't have everything in VBOs, but nexuiz apparently hands us a mix of VBOs and arrays, resulting in blocking on the GPU to do a rebase.
  Improves nexuiz 800x600, high-settings performance on my Ironlake by 41% (+/- 1.3%), from 14.0fps to 19.7fps.
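For readers unfamiliar with the term, "rebasing to min 0" means shifting every index so the smallest one becomes zero. The sketch below illustrates that operation together with the stricter condition this commit describes. Both function names and the single-counter interface are hypothetical simplifications, not the driver's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Per the commit: rebase only when no vertex array lives in a VBO. */
bool should_rebase(unsigned arrays_in_vbo)
{
    return arrays_in_vbo == 0;
}

/* Shift all indices so the minimum becomes 0; returns the old minimum. */
uint32_t rebase_indices(uint32_t *indices, size_t count)
{
    uint32_t min_index;
    size_t i;

    if (count == 0)
        return 0;
    assert(indices != NULL);

    min_index = indices[0];
    for (i = 1; i < count; i++) {
        if (indices[i] < min_index)
            min_index = indices[i];
    }
    for (i = 0; i < count; i++)
        indices[i] -= min_index;

    return min_index;
}
```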
* intel: Allow CopyTexSubImage to InternalFormat 3/4 textures, like RGB/RGBA. | Eric Anholt | 2010-10-12 | 1 | -0/+2
  The format selection of the CopyTexSubImage is pretty bogus still, but this at least avoids software fallbacks in nexuiz, bringing performance from 7.5fps to 12.8fps on my machine.
* i965: Fix missing "break;" in i2b/f2b, and missing AND of CMP result. | Eric Anholt | 2010-10-12 | 1 | -2/+3
  Fixes glsl-fs-i2b.
* glsl: Fix incorrect assertion | Ian Romanick | 2010-10-12 | 1 | -1/+1
  This assertion was added in commit f1c1ee11, but it did not notice that the array is accessed with 'size-1' instead of 'size'. As a result, the assertion was off by one. This caused failures in at least glsl-orangebook-ch06-bump.
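The class of bug described here can be illustrated with a small sketch. The table, its length, and the function name are hypothetical; the actual assertion in the GLSL compiler is not shown in this log.

```c
#include <assert.h>

#define TABLE_LEN 4

/* Hypothetical lookup table standing in for the array in the commit. */
static const int table[TABLE_LEN] = { 10, 20, 30, 40 };

int lookup_by_size(unsigned size)
{
    /*
     * The access below uses size - 1, so valid sizes are 1..TABLE_LEN.
     * Asserting "size < TABLE_LEN" would wrongly reject size == TABLE_LEN,
     * which is the off-by-one pattern the commit message describes.
     */
    assert(size >= 1 && size <= TABLE_LEN);
    return table[size - 1];
}
```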
* mesa: Validate assembly shaders when GLSL shaders are used | Ian Romanick | 2010-10-12 | 1 | -12/+40
  If a GLSL shader is used that does not provide all stages and assembly shaders are provided for the missing stages, validate the assembly shaders.
  Fixes bugzilla #30787 and piglit tests glsl-invalid-asm0[12].
  NOTE: this is a candidate for the 7.9 branch.
* ir_to_mesa: assorted clean-ups, const qualifiers, new comments | Brian Paul | 2010-10-12 | 1 | -14/+45
* nouveau: Get larger push buffers. | Francisco Jerez | 2010-10-12 | 2 | -2/+2
  Useful to amortize the command submission/reloc overhead (e.g. etracer goes from 72 to 109 FPS on nv4b).
* dri/nouveau: Initialize tile_flags when allocating a render target. | Francisco Jerez | 2010-10-12 | 2 | -6/+14
* i965: Always use the new FS backend on gen6. | Eric Anholt | 2010-10-11 | 1 | -2/+7
  It's now much more correct for gen6 than the old backend, with just 2 regressions I've found (one of which is common with pre-gen6 and will be fixed by an array splitting IR pass).
  This does leave the old Mesa IR backend getting used still when we don't have GLSL IR, but the plan is to get GLSL IR input to the driver for the ARB programs and fixed function by the next release.
* i965: Fix gen6 pixel_[xy] setup to avoid mixing int and float src operands. | Eric Anholt | 2010-10-11 | 1 | -6/+15
  Pre-gen6, you could mix int and float just fine. Now, you get goofy results.
  Fixes:
    glsl-arb-fragment-coord-conventions
    glsl-fs-fragcoord
    glsl-fs-if-greater
    glsl-fs-if-greater-equal
    glsl-fs-if-less
    glsl-fs-if-less-equal
* i965: Don't compute-to-MRF in gen6 VS math. | Eric Anholt | 2010-10-11 | 1 | -7/+15
  There was code to do this for pre-gen6 already; this just enables it for gen6 as well.
* i965: Expand uniform args to gen6 math to full registers to get hstride == 1. | Eric Anholt | 2010-10-11 | 1 | -0/+25
  This is a hw requirement in math args. This also is inefficient, as we're calculating the same result 8 times, but then we've been doing that on pre-gen6 as well. If we're doing math on uniforms, though, we'd probably be better served by having some sort of mechanism for precalculating those results into another uniform value to use.
  Fixes 7 piglit math tests.
* i965: Don't compute-to-MRF in gen6 math instructions. | Eric Anholt | 2010-10-11 | 1 | -0/+16
* i965: Add a couple of checks for gen6 math instruction limits. | Eric Anholt | 2010-10-11 | 1 | -0/+26
* i965: Don't consider gen6 math instructions to write to MRFs. | Eric Anholt | 2010-10-11 | 1 | -17/+38
  This was leftover from the pre-gen6 cleanups. One test regresses where compute-to-MRF now occurs.
* intel_extensions: Add ability to set GLSL version via environment | Chad Versace | 2010-10-11 | 1 | -1/+18
  Add ability to set the GLSL version used by the GLcontext by setting the environment variable INTEL_GLSL_VERSION. For example,
      env INTEL_GLSL_VERSION=130 prog args
  If the environment variable is missing, the GLSL version defaults to 120.
  Reviewed-by: Ian Romanick <[email protected]>
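A minimal sketch of an environment override with a default of 120, as described in this commit, might look like the following. Only the variable name `INTEL_GLSL_VERSION` and the default come from the commit message; the function names and the handling of malformed values (falling back to the default) are assumptions for illustration.

```c
#include <assert.h>
#include <stdlib.h>

#define DEFAULT_GLSL_VERSION 120u

/*
 * Parse the value of the override variable. 'value' is NULL when the
 * variable is unset. Falling back to the default on malformed input is an
 * assumption of this sketch, not something the commit message states.
 */
unsigned parse_glsl_version(const char *value)
{
    char *end;
    long v;

    if (value == NULL)
        return DEFAULT_GLSL_VERSION;

    v = strtol(value, &end, 10);
    if (end == value || *end != '\0' || v <= 0)
        return DEFAULT_GLSL_VERSION;

    return (unsigned)v;
}

/* Hypothetical wrapper that reads the real environment. */
unsigned glsl_version_from_env(void)
{
    return parse_glsl_version(getenv("INTEL_GLSL_VERSION"));
}
```

Splitting the parsing from the `getenv` call keeps the default-handling logic easy to check without touching the process environment.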
* r200: revalidate after radeon_update_renderbuffers | Daniel Vetter | 2010-10-11 | 3 | -3/+10
  By calling radeon_draw_buffers (which sets the necessary flags in radeon->NewGLState) and revalidating if NewGLState is non-zero in r200TclPrimitive. This fixes an assert in libdrm (the color/depth buffer was changed but not yet validated) and stops the kernel cs checker from complaining about them (when they're too small).
  Thanks to Mario Kleiner for the hint to call radeon_draw_buffer (instead of my half-broken hack).
  v2: Also fix the swtcl r200 path.
  Cc: Mario Kleiner <[email protected]>
  Signed-off-by: Daniel Vetter <[email protected]>
* i965: Compute to MRF in the new FS backend. | Eric Anholt | 2010-10-11 | 2 | -0/+124
  This didn't produce a statistically significant performance difference in my demo (n=4) or nexuiz (n=3), but it still seems like a good idea and is recommended by the HW team.
* i965: Give the FB write and texture opcodes the info on base MRF, like math. | Eric Anholt | 2010-10-11 | 2 | -38/+48
* i965: Give the math opcodes information on base mrf/mrf len. | Eric Anholt | 2010-10-11 | 2 | -12/+57
  This is progress towards enabling a compute-to-MRF pass.
* i965: Move FS backend structures to a header. | Eric Anholt | 2010-10-11 | 5 | -363/+407
  It's time to start splitting some of this up.
* i965: Reduce register interference checks for changed FS_OPCODE_DISCARD. | Eric Anholt | 2010-10-11 | 1 | -5/+2
  While I don't know of any performance changes from this (one extra reg available out of 128), it makes the generated asm a lot cleaner looking.
* i965: Split FS_OPCODE_DISCARD into two steps. | Eric Anholt | 2010-10-11 | 1 | -9/+23
  Having the single opcode write then read the reg meant that single instruction opcodes had to consider their source regs to interfere with their dest regs.
* dri/nv10: Fake fast Z clears for pre-nv17 cards. | Francisco Jerez | 2010-10-10 | 4 | -20/+127
* dri/nouveau: Minor cleanup. | Francisco Jerez | 2010-10-10 | 4 | -23/+22
* i965: Initialize member variables. | Vinson Lee | 2010-10-08 | 1 | -0/+2
  Fixes these GCC warnings.
    brw_wm_fp.c: In function 'search_or_add_const4f':
    brw_wm_fp.c:92: warning: 'reg.Index2' is used uninitialized in this function
    brw_wm_fp.c:84: note: 'reg.Index2' was declared here
    brw_wm_fp.c:92: warning: 'reg.RelAddr2' is used uninitialized in this function
    brw_wm_fp.c:84: note: 'reg.RelAddr2' was declared here
* i965: Silence unused variable warning on non-debug builds. | Vinson Lee | 2010-10-08 | 1 | -0/+1
  Fixes this GCC warning.
    brw_vs.c: In function 'do_vs_prog':
    brw_vs.c:46: warning: unused variable 'ctx'
* i965: Silence unused variable warning on non-debug builds. | Vinson Lee | 2010-10-08 | 1 | -0/+1
  Fixes this GCC warning.
    brw_eu_emit.c: In function 'brw_math2':
    brw_eu_emit.c:1189: warning: unused variable 'intel'
* i915: Silence unused variable warning in non-debug builds. | Vinson Lee | 2010-10-08 | 1 | -0/+1
  Fixes this GCC warning.
    i915_vtbl.c: In function 'i915_assert_not_dirty':
    i915_vtbl.c:670: warning: unused variable 'dirty'
* i915: Silence unused variable warning in non-debug builds. | Vinson Lee | 2010-10-08 | 1 | -0/+1
  Fixes this GCC warning.
    i830_vtbl.c: In function 'i830_assert_not_dirty':
    i830_vtbl.c:704: warning: unused variable 'i830'
* intel: Enable GL_ARB_explicit_attrib_location | Ian Romanick | 2010-10-08 | 1 | -0/+1
* main: Enable GL_ARB_explicit_attrib_location for swrast | Ian Romanick | 2010-10-08 | 1 | -1/+2
* i965: Add register coalescing to the new FS backend. | Eric Anholt | 2010-10-08 | 1 | -0/+80
  Improves performance of my GLSL demo by 14.3% (+/- 4%, n=4) by eliminating the moves used in ir_assignment and ir_swizzle handling. Still 16.5% to go to catch up to the Mesa IR backend, presumably because instructions are almost perfectly mis-scheduled now.
* i965: Enable attribute swizzling (repositioning) in the gen6 SF. | Eric Anholt | 2010-10-08 | 1 | -1/+2
  We were trying to remap a fully-filled array down to only handing the WM the components it uses. This is called attribute swizzling, and if you don't enable it you just get 1:1 mappings of inputs to outputs.
  This almost fixes glsl-routing, except for the highest gl_TexCoord[] indices.
* i965: Fix new FS gen6 interpolation for sparsely-populated arrays. | Eric Anholt | 2010-10-08 | 1 | -1/+1
  We'd overwrite the same element twice.
* i965: Fix gen6 WM push constants updates. | Eric Anholt | 2010-10-08 | 1 | -1/+2
  We would compute a new buffer, but never point the hardware at the new buffer. This partially fixes glsl-routing, as now it gets the updated uniform for which attribute to draw.
* i965: Handle swizzles in the addition of YUV texture constants. | Eric Anholt | 2010-10-08 | 1 | -2/+5
  If someone happened to land a set in a different swizzle order, we would have failed an assertion.
* i965: Drop the check for YUV constants in the param list. | Eric Anholt | 2010-10-08 | 1 | -13/+0
  _mesa_add_unnamed_constant() already does that.
* i965: Drop the check for duplicate _mesa_add_state_reference. | Eric Anholt | 2010-10-08 | 1 | -6/+0
  _mesa_add_state_reference does that check for us anyway.
* mesa: Simplify a bit of _mesa_add_state_reference using memcmp. | Eric Anholt | 2010-10-08 | 1 | -12/+3
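The general idea of a find-or-add lookup that detects duplicates with a single `memcmp` per entry, rather than an element-by-element loop, can be sketched as follows. The list structure, the token length, and the function name are hypothetical stand-ins; this is not the body of `_mesa_add_state_reference`.

```c
#include <assert.h>
#include <string.h>

#define STATE_LEN 5   /* hypothetical number of tokens per reference */
#define MAX_REFS  16  /* hypothetical capacity */

struct state_ref_list {
    int tokens[MAX_REFS][STATE_LEN];
    unsigned count;
};

/*
 * Return the index of an existing entry with identical tokens, or append a
 * new entry and return its index. Returns -1 if the list is full.
 */
int add_state_reference(struct state_ref_list *list,
                        const int tokens[STATE_LEN])
{
    unsigned i;

    assert(list != NULL && tokens != NULL);

    /* One memcmp per entry replaces a per-token comparison loop. */
    for (i = 0; i < list->count; i++) {
        if (memcmp(list->tokens[i], tokens, sizeof(list->tokens[i])) == 0)
            return (int)i;
    }

    if (list->count == MAX_REFS)
        return -1;

    memcpy(list->tokens[list->count], tokens, sizeof(list->tokens[0]));
    return (int)list->count++;
}
```

Because the lookup already rejects duplicates, callers do not need their own duplicate check, which is the point of the preceding i965 commit.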
* i965: Normalize cubemap coordinates like is done in the Mesa IR path. | Eric Anholt | 2010-10-07 | 4 | -0/+114
  Fixes glsl-fs-texturecube-2-*
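One common way to normalize a cubemap coordinate is to divide it by its largest absolute component, so the major axis becomes +/-1. The sketch below assumes that is the normalization meant here; the commit message does not spell it out, and the function name is hypothetical.

```c
#include <assert.h>

/* Absolute value without pulling in libm. */
static float abs_f(float x)
{
    return x < 0.0f ? -x : x;
}

/* Scale v so that its largest-magnitude component becomes +1 or -1. */
void normalize_cube_coord(float v[3])
{
    float max_abs = abs_f(v[0]);
    int i;

    for (i = 1; i < 3; i++) {
        if (abs_f(v[i]) > max_abs)
            max_abs = abs_f(v[i]);
    }

    /* Leave an all-zero vector alone to avoid dividing by zero. */
    if (max_abs > 0.0f) {
        for (i = 0; i < 3; i++)
            v[i] /= max_abs;
    }
}
```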
* i965: Disable emitting if () statements on gen6 until we really fix them. | Eric Anholt | 2010-10-07 | 2 | -0/+7
* gles2: Add GL_EXT_texture_format_BGRA8888 support | Kristian Høgsberg | 2010-10-07 | 4 | -1/+16
* i965: Fix gen6 pointsize handling to match pre-gen6. | Eric Anholt | 2010-10-06 | 1 | -1/+2
  Fixes point-line-no-cull.
  Bug #30532
* i965: Don't assume that WPOS is always provided on gen6 in the new FS. | Eric Anholt | 2010-10-06 | 1 | -2/+1
  We sensibly only provide it if the FS asks for it. We could actually skip WPOS unless the FS needed WPOS.zw, but that's something for later.
  Fixes: glsl-texture2d and probably many others.