aboutsummaryrefslogtreecommitdiffstats
path: root/src/mesa/drivers/dri/i965/brw_wm.c
Commit message (Collapse)AuthorAgeFilesLines
* i965: Move program compile to emit() time.Eric Anholt2011-10-291-2/+3
| | | | | | | Only 4 other prepare() functions are left, which don't rely on this. Reviewed-by: Kenneth Graunke <[email protected]> Acked-by: Paul Berry <[email protected]>
* i965/gen6+: Add support for noperspective interpolation.Paul Berry2011-10-271-4/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This required the following changes: - WM setup now makes the appropriate set of barycentric coordinates (perspective vs. noperspective) available to the fragment shader, based on whether the shader requires perspective interpolation, noperspective interpolation, both, or neither. - The fragment shader backend now uses the appropriate set of barycentric coordiantes when interpolating, based on the interpolation mode returned by ir_variable::determine_interpolation_mode(). - SF setup now uses gl_fragment_program::InterpQualifier to determine which attributes are to be flat shaded (as opposed to the old logic, which only flat shaded colors). - CLIP setup now ensures that the clipper outputs non-perspective barycentric coordinates when they are needed by the fragment shader. Fixes the remaining piglit tests of interpolation qualifiers that were failing: - interpolation-flat-*-smooth-none - interpolation-flat-other-flat-none - interpolation-noperspective-* - interpolation-smooth-gl_*Color-flat-* Reviewed-by: Eric Anholt <[email protected]>
* i965/gen6+: Parameterize barycentric interpolation modes.Paul Berry2011-10-271-10/+32
| | | | | | | | | | | | | | | | | This patch modifies the fragment shader back-end so that instead of using a single delta_x/delta_y register pair to store barycentric coordinates, it uses an array of such register pairs, one for each possible intepolation mode. When setting up the WM, we intstruct it to only provide the barycentric coordinates that are actually needed by the fragment shader--that is computed by brw_compute_barycentric_interp_modes(). Currently this function returns just BRW_WM_PERSPECTIVE_PIXEL_BARYCENTRIC, because this is the only interpolation mode we support. However, that will change in a later patch. Reviewed-by: Eric Anholt <[email protected]>
* i965: Rename (vs|wm)_max_threads to max_(vs|wm)_threads for consistency.Kenneth Graunke2011-10-251-1/+1
| | | | | | | | | The inconsistency between vs_max_threads and max_vs_entries was rather annoying. I could never seem to remember which one was reversed, which made it harder to find quickly. "Max __ Threads" seems more natural. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* intel: Convert from GLboolean to 'bool' from stdbool.h.Kenneth Graunke2011-10-181-2/+2
| | | | | | | | | | | | | | | | | I initially produced the patch using this bash command: for file in {intel,i915,i965}/*.{c,cpp,h}; do [ ! -h $file ] && sed -i 's/GLboolean/bool/g' $file && sed -i 's/GL_TRUE/true/g' $file && sed -i 's/GL_FALSE/false/g' $file; done Then I manually added #include <stdbool.h> to fix compilation errors, and converted a few functions back to GLboolean that were used in core Mesa's function pointer table to avoid "incompatible pointer" warnings. Finally, I cleaned up some whitespace issues introduced by the change. Signed-off-by: Kenneth Graunke <[email protected]> Acked-by: Chad Versace <[email protected]> Acked-by: Paul Berry <[email protected]>
* mesa: Use gl_shader_program::_LinkedShaders instead of FragmentProgramIan Romanick2011-10-071-1/+1
| | | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: silence unused var warnings in non-debug buildsBrian Paul2011-10-071-0/+1
| | | | Reviewed-by: Chad Versace <[email protected]>
* i965: Fix Android build by removing relative includesChad Versace2011-08-301-1/+1
| | | | | | | | | | Replace each occurence of #include "../glsl/*.h" with #include "glsl/*.h" Reviewed-by: Ian Romanick <[email protected]> Signed-off-by: Chad Versace <[email protected]>
* i965: Fix typo in 2b224d66a01f3ce867fb05558b25749705bbfe7aEric Anholt2011-08-231-1/+1
| | | | | | | | | Unfortunately, since a previous efficiency improvement, we no longer have any open-source testcases producing register spilling, so this code was untested in the fragment shader path. That should change when we get proper temporary array support in the fragment shader. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=40194
* i965: Set up allocation of a VS scratch space if required.Eric Anholt2011-08-161-22/+3
|
* i965/fs: Don't allocate the old backend's compile structs for our compile.Eric Anholt2011-08-051-4/+7
| | | | This saves some 35MB when the program only uses GLSL shaders.
* i965/fs: Add support for TXD with shadow comparisons.Kenneth Graunke2011-06-181-0/+2
| | | | | | | | | | | | | | Our hardware doesn't have a sample_d_c message, so we have to do a regular sample_d and emit instructions to manually perform the comparison. This requires a state dependent recompile whenever the sampler's compare mode or function change. This adds the per-sampler comparison functions to brw_wm_prog_key, but only sets them when the sampler's compare mode is GL_COMPARE_R_TO_TEXTURE (i.e. only for shadow sampling). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Use state streaming on programs, and state base address on gen5+.Eric Anholt2011-06-181-13/+8
| | | | | | | | | | There will be a little bit of thrashing of the program cache BO as the cache warms up, but once the application is in steady state, this reduces relocations on gen5 and later. On my T420 laptop, cairogl firefox-talos-gfx performance improves 2.6% +/- 1.3% (n=6). No statistically significant performance difference on nexuiz (n=5).
* i965/fs: Do a FS compile up front at link time to produce link errors.Eric Anholt2011-05-271-7/+19
| | | | | | At glLinkShaders time, a fail() call in FS compile in 8-wide (the one that's required to succeed, though we may relax that at some point for pre-Ironlake performance) will now report out as a link error.
* i965/fs: Move the computation of register block count from unit to compile.Eric Anholt2011-05-271-1/+1
| | | | | | | No net code size change, but unit update is down 0.8% code size pre-gen6. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Remove dead shadowtex_mask entry in the WM key.Eric Anholt2011-05-261-3/+0
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Remove linear_color for GL_PERSPECTIVE_CORRECTION_HINT.Eric Anholt2011-05-261-4/+0
| | | | | | | | | | From the GL 2.1 spec: "Required perspective-correct interpolation for all fragment attributes except depth in sections 3.4.1 and 3.5.1, effectively making GL PERSPECTIVE CORRECT HINT a no-op." Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Add support for correct GL_CLAMP behavior by clamping coordinates.Eric Anholt2011-05-181-0/+10
| | | | | | | | This removes the stupid strict-conformance fallback code I broke when adding ARB_sampler_objects. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=36572 Reviewed-by: Kenneth Graunke <[email protected]> (v1)
* i965: Get a ralloc context into brw_compile.Kenneth Graunke2011-05-171-7/+8
| | | | | | | | | | | | This would be so much easier if we were using C++; we could simply use constructors and destructors. Instead, we have to update all the callers. While we're at it, ralloc various brw_wm_compile fields rather than explicitly calloc/free'ing them. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Remove dead entrypoints to state cache, rename the one that's left.Eric Anholt2011-04-291-8/+5
| | | | | | | | | | As we expanded the usage of the state cache, it grew extra functionality. However, with the recent state streaming rework, we're back to the state cache being used only for shader kernels, which is the piece of GPU state that's actually expensive to compute again from scratch, since it involves compiling. Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Add initial support for 16-wide dispatch on gen6.Eric Anholt2011-04-261-9/+4
| | | | | | | | | At this point it doesn't do uniforms, which have to be laid out the same between 8 and 16. Other than that, it supports everything but flow control, which was the thing that forced us to choose 8-wide for general GLSL support. Reviewed-by: Kenneth Graunke <[email protected]>
* intel: Add support for ARB_sampler_objects.Eric Anholt2011-04-231-4/+6
| | | | | | | | | | | | This extension support consists of replacing "gl_texture_obj->Sampler." with "_mesa_get_samplerobj(ctx, unit)->". One instance of referencing the texture's base sampler remains in the initial miptree allocation, where I'm not sure we have a clear association with any texture unit. Tested with piglit ARB_sampler_objects/sampler-objects. Reviewed-by: Brian Paul <[email protected]>
* intel: Add support for ARB_color_buffer_float.Eric Anholt2011-04-201-0/+4
| | | | Reviewed-by: Brian Paul <[email protected]>
* i965/fs: Add gen6 register spilling support.Eric Anholt2011-04-171-0/+15
| | | | | | | | | | | Most of this is code movement to get the scratch space allocated in a shared location. Other than that, the only real changes are that the old oword block messages now operate on oword-aligned areas (with new messages for unaligned access, which we don't do), and that the caching control is in the SFID part of the descriptor instead of message control. Fixes glsl-fs-convolution-1.
* mesa: move sampler state into new gl_sampler_object typeBrian Paul2011-04-101-4/+4
| | | | | | gl_texture_object contains an instance of this type for the regular texture object sampling state. glGenSamplers() generates new instances of gl_sampler_object which can override that state with glBindSampler().
* i965: Fix alpha testing when there is no color buffer in the FBO.Eric Anholt2011-03-151-0/+1
| | | | | We were alpha testing against an unwritten value, resulting in garbage. (part of) Bug #35073.
* i965: Fix tex_swizzle when depth mode is GL_REDChad Versace2011-03-141-1/+2
| | | | | | | Change swizzle from (x000) to (x001). Signed-off-by: Chad Versace <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Remove dead assignmentChad Versace2011-03-141-2/+0
| | | | | | | | The assignment on line 368, `tex_swizzles[i] = SWIZZLE_NOOP`, is rendered dead by the reassignment on line 392. Signed-off-by: Chad Versace <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Drop the dead tracking of color_regions[].Eric Anholt2011-02-041-1/+2
| | | | We pull the draw regions right out of the renderbuffers these days.
* intel: Set the swizzling for depth textures using the GL_RED depth mode.Eric Anholt2010-12-091-0/+4
| | | | Fixes depth-tex-modes-rg.
* i965: Nuke brw_wm_glsl.c.Eric Anholt2010-12-061-14/+7
| | | | | | | | | | It was only used for gen6 fragment programs (not GLSL shaders) at this point, and it was clearly unsuited to the task -- missing opcodes, corrupted texturing, and assertion failures hit various applications of all sorts. It was easier to patch up the non-glsl for remaining gen6 changes than to make brw_wm_glsl.c complete. Bug #30530
* i965: Move payload reg setup to compile, not lookup time.Eric Anholt2010-12-061-53/+61
| | | | | | | | Payload reg setup on gen6 depends more on the dispatch width as well as the uses_depth, computes_depth, and other flags. That's something we want to decide at compile time, not at cache lookup. As a bonus, the fragment shader program cache lookup should be cheaper now that there's less to compute for the hash key.
* i965: Fix gl_FragCoord inversion when drawing to an FBO.Eric Anholt2010-11-141-0/+1
| | | | | | This showed up as cairo-gl gradients being inverted on everyone but Intel, where I'd apparently tweaked the transformation to work around the bug. Fixes piglit fbo-fragcoord.
* intel: Annotate debug printout checks with unlikely().Eric Anholt2010-11-031-1/+1
| | | | | | | This provides the optimizer with hints about code hotness, which we're quite certain about for debug printouts (or, rather, while we developers often hit the checks for debug printouts, we don't care about performance while doing so).
* i965: Correct scratch space allocation.Eric Anholt2010-10-211-8/+13
| | | | | | | | | | One, it was allocating increments of 1kb, but per thread scratch space is a power of two. Two, the new FS wasn't getting total_scratch set at all, so everyone thought they had 1kb and writes beyond 1kb would go stomping on a neighbor thread. With this plus the previous register spilling for the new FS, glsl-fs-convolution-1 passes.
* i965: Tell the shader compiler when we expect depth writes for gen6.Eric Anholt2010-10-191-0/+6
| | | | | | | | This fixes hangs in some Z-writes-in-shaders tests, though other pieces don't come out correctly. Bug #30392: hang in fbo-fblit-d24s8. (still fails with bad color drawn to some targets)
* Drop GLcontext typedef and use struct gl_context insteadKristian Høgsberg2010-10-131-1/+1
|
* i965: Fix glean/texSwizzle regression in previous commit.Eric Anholt2010-10-031-18/+18
| | | | Easy enough patch, who needs a full test run. Oh, that's right. Me.
* i965: Set up swizzling of shadow compare results for GL_DEPTH_TEXTURE_MODE.Eric Anholt2010-10-021-1/+32
| | | | | | | | | | The brw_wm_surface_state.c handling of GL_DEPTH_TEXTURE_MODE doesn't apply to shadow compares, which always return an intensity value. The texture swizzles can do the job for us. Fixes: glsl1-shadow2D(): 1 glsl1-shadow2D(): 3
* i965: Add support for attribute interpolation on Sandybridge.Eric Anholt2010-09-281-6/+45
| | | | | Things are simpler these days thanks to barycentric interpolation parameters being handed in in the payload.
* i965: Track the windowizer's dispatch for kill pixel, promoted, and OQEric Anholt2010-09-211-1/+3
| | | | | | | | Looks like the problem was we weren't passing the depth to the render target as expected, so the chip would wedge. Fixes GPU hang in occlusion-query-discard. Bug #30097
* i965: DP2 produces a scalar result like DP3, DP4, etc.Eric Anholt2010-09-011-0/+1
| | | | Fixes glsl-fs-dot-vec2-2.
* i965: Start building direct GLSL2 IR to 965 assembly codegen.Eric Anholt2010-08-261-11/+13
| | | | | | | | | | | Our channel-expressions and vector-splitting changes now happen into a private copy of the IR that we maintain for ourselves. Uniform assignment still happens by the core, so we continue using Mesa IR generation not just for swrast fallbacks but also for uniform values (since there's no storage for their contents other than shader_program->FragmentProgram->Parameters->ParameterValues). And most importantly, at the moment no actual codegen is hooked up other than emitting our favorite color to the framebuffer.
* intel: Remove include of texmem.h, since we haven't used it in ages.Eric Anholt2010-08-131-1/+1
|
* intel: Change dri_bo_* to drm_intel_bo* to consistently use new API.Eric Anholt2010-06-081-2/+2
| | | | | The slightly less mechanical change of converting the emit_reloc calls will follow.
* Replace _mesa_malloc, _mesa_calloc and _mesa_free with plain libc versionsKristian Høgsberg2010-02-191-4/+4
|
* i965: Fix fp fragment.position handling and enable HW part of ARB_fcc.Eric Anholt2010-01-261-5/+1
| | | | | | | | | As with swrast, this fixes the default pixel center behavior which was broken, and implements the previous behavior for integer. Fixes piglit fp-arb-fragment-coord-conventions-none. The extension won't be exposed until we get the GLSL part implemented. The DRI1 origin_x/y parts are dropped since they're no longer relevant.
* Merge branch 'mesa_7_7_branch'Brian Paul2010-01-251-1/+0
|\ | | | | | | | | | | | | | | | | | | | | Conflicts: src/mesa/drivers/dri/intel/intel_screen.c src/mesa/drivers/dri/intel/intel_swapbuffers.c src/mesa/drivers/dri/r300/r300_emit.c src/mesa/drivers/dri/r300/r300_ioctl.c src/mesa/drivers/dri/r300/r300_tex.c src/mesa/drivers/dri/r300/r300_texstate.c
| * i965: Remove unnecessary headers.Vinson Lee2010-01-221-1/+0
| |
* | i965: Allow for variable-sized auxdata in the state cache.Eric Anholt2010-01-191-6/+7
|/ | | | | | Everything has been constant-sized until now, but constant buffer handling changes will make us want some additional variable sized array.