path: root/src/mesa/drivers
Columns: commit message, author, age, files, lines
* i965/fs: Split the GLSL IR -> FS LIR visitor to brw_fs_visitor.cpp.Eric Anholt2011-05-274-1679/+1736
| We now have:
| - brw_fs.cpp handles calling out to everything and optimization.
| - brw_fs_visitor.cpp handles translating to our LIR.
| - brw_fs_emit.cpp handles emitting from our LIR to native code.
|
| Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Split the BRW native code emit to brw_fs_emit.cppEric Anholt2011-05-273-839/+876
| | | | | | | This is all separate from the visitor and the optimization passes which feed into it. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Move a couple of GLSL IR -> BRW helper functions to brw_shader.cpp.Eric Anholt2011-05-273-49/+76
| | | | | | These will be used by the VS backend as well. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Move non-FS-specific shader support to brw_shader.cpp.Eric Anholt2011-05-273-100/+129
| | | | | | | These only existed in brw_fs.cpp because it was the only .cpp file in the area when I wrote them. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Avoid generating MOVs for assignments of expressions.Eric Anholt2011-05-272-12/+75
| | | | | | No statistically significant difference measured in 3dbenchmark egypt/pro. It does reduce fragment shader instructions across shader-db by 0.3%.
* i965/fs: Move the computation of register block count from unit to compile.Eric Anholt2011-05-274-7/+18
| | | | | | | No net code size change, but unit update is down 0.8% code size pre-gen6. Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Track fixed GRF regs separate from allocated GRF file in scheduling.Eric Anholt2011-05-272-1/+22
| There's an assumption here that fixed GRFs will never intersect with the allocated GRFs. That's true today, though it might change some day if we decide to register-allocate the regs containing push constants once they're dead.
|
| This fixes a regression in 0f7325b89038937bd428f7c89ed9859189a0ab0b in Lightsmark from the texture instructions now containing g0 references instead of having that be implied. Performance is improved 15.2% +/- 3.6% (n=3).
|
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=34968
* i965/fs: Add a helper function for add_dep(before, after, before->latency).Eric Anholt2011-05-271-31/+19
| | | | | | This lets us avoid a bunch of before==NULL checks in the callers. Reviewed-by: Kenneth Graunke <[email protected]>
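A minimal sketch of the kind of helper this commit describes. The `schedule_node` type and its fields are hypothetical stand-ins and not the driver's actual scheduler structures; only the idea of a two-argument `add_dep` that forwards `before->latency` and absorbs the NULL check comes from the commit message.

```cpp
#include <utility>
#include <vector>

// Hypothetical scheduling node; the names here are illustrative only.
struct schedule_node {
    int latency = 0;
    // Each entry is (dependent node, cycles it must wait).
    std::vector<std::pair<schedule_node *, int>> children;
};

// Three-argument form: record that `after` depends on `before`
// with an explicit latency.
void add_dep(schedule_node *before, schedule_node *after, int latency)
{
    if (!before || !after)
        return;
    before->children.emplace_back(after, latency);
}

// Two-argument helper: forwards before->latency and tolerates a null
// `before`, so callers no longer need their own NULL checks.
void add_dep(schedule_node *before, schedule_node *after)
{
    if (!before)
        return;
    add_dep(before, after, before->latency);
}
```

With this shape, a caller holding a possibly-null "last writer" pointer can call `add_dep(last_writer, node)` unconditionally.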
* i965: Pack the lookup and line_aa bits into the first dword of the key.Eric Anholt2011-05-261-2/+2
| | | | | | | They were occupying whole 32-bit words, despite being only 10 or so bits. Reduces code size slightly (80/3300 bytes). Reviewed-by: Kenneth Graunke <[email protected]>
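The packing idea can be illustrated with C++ bitfields. This is only a sketch: the struct names, the `flags` field, and the exact bit widths are assumptions made for illustration, and bitfield layout is implementation-defined (the sizes noted in the comments are what common ABIs produce).

```cpp
#include <cstdint>

// Hypothetical "before" layout: small values each occupy a whole dword.
struct key_unpacked {
    uint32_t flags;
    uint32_t lookup;   // only ~10 bits are meaningful
    uint32_t line_aa;  // only a couple of bits are meaningful
};

// Hypothetical "after" layout: the same values share the first dword.
// Widths are illustrative and sum to 32 bits.
struct key_packed {
    uint32_t flags   : 20;
    uint32_t lookup  : 10;
    uint32_t line_aa : 2;
};
// On common ABIs sizeof(key_unpacked) is 12 and sizeof(key_packed) is 4.
```

A smaller key means less data to hash and compare when looking up a compiled program, which is consistent with the small code-size reduction the commit reports.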
* i965: Remove dead shadowtex_mask entry in the WM key.Eric Anholt2011-05-262-4/+0
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Remove linear_color for GL_PERSPECTIVE_CORRECTION_HINT.Eric Anholt2011-05-267-30/+10
| | | | | | | | | | From the GL 2.1 spec: "Required perspective-correct interpolation for all fragment attributes except depth in sections 3.4.1 and 3.5.1, effectively making GL PERSPECTIVE CORRECT HINT a no-op." Reviewed-by: Kenneth Graunke <[email protected]>
* intel: Drop doubly irrelevant code in intelReadBuffers.Eric Anholt2011-05-261-12/+0
| | | | | | | | First, FBO read/draw == NULL validation happens in mesa core not intelReadBuffers -> intel_draw_buffers. Second, that condition is no longer tested for in our driver since ARB_ES2_compatibility was added. Reviewed-by: Brian Paul <[email protected]>
* i965: Warnings cleanup.Eric Anholt2011-05-252-4/+0
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Fix assertion failures in unused brw_reg setup by deleting it.Eric Anholt2011-05-251-1/+0
| | | | | | | I was using undefined values to create an unused value. Go me. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=37366 Reviewed-by: Kenneth Graunke <[email protected]>
* intel: Change FBO validation criteria to accommodate hiz and separate stencilChad Versace2011-05-251-15/+27
| | | | | Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Chad Versace <[email protected]>
* intel: Fix intel_draw_buffer() to accommodate hiz and separate stencilChad Versace2011-05-251-5/+11
| | | | | | | | The logic of intel_draw_buffers() expected that stencil buffers were always combined depth/stencil. Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Chad Versace <[email protected]>
* intel: Add hiz_region to intel_mipmap_treeChad Versace2011-05-253-0/+36
| | | | | | | | | | When a texture is attached to multiple FBO's, a separate renderbuffer wrapper is created for each attachment. This necessitates storing the hiz region for these renderbuffers in the texture itself instead of the renderbuffer wrapper. Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Chad Versace <[email protected]>
* intel: Refactor the wrapping of textures with renderbuffersChad Versace2011-05-251-7/+8
| | | | | | | | | | | | | Before this commit, the renderbuffer's region was updated in intel_renderbuffer_texture(). This commit moves the update into intel_update_wrapper(), which is a more logical location for updates. This is in preparation for the next commit, which allocates and updates the texture's hiz region in intel_update_wrapper(). Having the two region updates located in the same function makes good form. Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Chad Versace <[email protected]>
* intel: Add hiz_region to intel_renderbufferChad Versace2011-05-252-0/+46
| A hiz surface must be supplied to the hardware when rendering to a depth buffer with hiz. There are three potential places to store that surface:
| 1. Allocate a larger intel_region for the depthbuffer, and let the region's tail be the hiz surface.
| 2. Allocate a separate intel_region for hiz, and store it as brw_context state.
| 3. Allocate a separate intel_region for hiz, and store it in intel_renderbuffer.
|
| We choose method 3. Method 1 has not been chosen due to future complications it might cause when requesting a DRI drawable's depth buffer attachment from X. Method 2 has not been chosen because storing the hiz region apart from the depth region makes lazy hiz/depth resolves difficult to implement.
|
| Reviewed-by: Eric Anholt <[email protected]>
| Signed-off-by: Chad Versace <[email protected]>
* intel: Add is_hiz_depth_format() to intel_context.vtblChad Versace2011-05-253-0/+24
| | | | | | | | | Given a format, is_hiz_depth_format() indicates if HiZ can be enabled on a depthbuffer of that format. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Signed-off-by: Chad Versace <[email protected]>
* intel: Allocate region for separate stencil bufferChad Versace2011-05-251-3/+30
| | | | | | | | | ... in intel_alloc_renderbuffer_storage(). The stencil buffer has quirky pitch requirements, so its region allocation is a special case. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Signed-off-by: Chad Versace <[email protected]>
* intel: Change supported texture formats for separate stencilChad Versace2011-05-252-1/+7
| | | | | | | | | | | When hardware supports separate stencil, enable support for separate depth/stencil texture formats in the table intel_context.ctx.TextureFormatsSupported. If the hardware must use separate stencil, then disable support for combined depth/stencil formats. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Signed-off-by: Chad Versace <[email protected]>
* intel: Add flags to intel_context for hiz and separate stencilChad Versace2011-05-252-0/+58
| Add the following flags:
| - intel_context.has_separate_stencil
| - intel_context.must_use_separate_stencil
| - intel_context.has_hiz
|
| The flags are currently set to false, and will be enabled for a given chipset once the feature is completely implemented. Since it may be some time before these features are completed, their values can be overridden with environment variables INTEL_HIZ and INTEL_SEPARATE_STENCIL. Valid values for these environment variables are "0" and "1".
|
| Reviewed-by: Eric Anholt <[email protected]>
| Reviewed-by: Kenneth Graunke <[email protected]>
| Signed-off-by: Chad Versace <[email protected]>
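One way such an environment override could look is sketched below. The function name `env_flag` is hypothetical, and falling back to the default on any value other than "0" or "1" is an assumption; the commit message only states the variable names and the two valid values.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical helper: returns the override from the environment when the
// variable is exactly "0" or "1", otherwise the compiled-in default.
bool env_flag(const char *name, bool default_value)
{
    const char *value = std::getenv(name);
    if (value == nullptr)
        return default_value;
    if (std::strcmp(value, "1") == 0)
        return true;
    if (std::strcmp(value, "0") == 0)
        return false;
    // Assumption: anything else is ignored rather than treated as an error.
    return default_value;
}
```

A context setup routine could then compute something like `has_hiz = env_flag("INTEL_HIZ", false)`, keeping the default off until the feature is complete.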
* i965/gen7: Fix miptree layout for cube surfaces.Kenneth Graunke2011-05-221-1/+1
| Volume 1a section 8.20.4.7.3 gives new equations which multiply by 12 instead of 11.
|
| Fixes 8 piglit tests:
| - fbo-cubemap
| - texCube
| - glsl-fs-texturecube
| - glsl-fs-texturecube-2
| - glsl-fs-texturecube-2-bias
| - glsl-fs-texturecube-bias
| - arb_seamless_cubemap
| - cubemap
|
| Signed-off-by: Kenneth Graunke <[email protected]>
* i965: Remove comments about pre-965 hardware.Kenneth Graunke2011-05-221-3/+0
| | | | | | They're irrelevant for this driver. Signed-off-by: Kenneth Graunke <[email protected]>
* i965: Fix sampling on Ivybridge after headerless change.Kenneth Graunke2011-05-201-2/+13
| | | | | | Fixes a regression since 90e922267a89fa9bef254bb257405531ceff7356. Signed-off-by: Kenneth Graunke <[email protected]>
* i965: Remove "TXD" from justification of sampler message headers.Kenneth Graunke2011-05-201-1/+1
| | | | | | | The coordinate offsets set in the m1 header are for textureOffset; they have nothing to do with textureGrad (TXD). Signed-off-by: Kenneth Graunke <[email protected]>
* i965/gen7: Add support for rendering to depthbuffer mipmap levels > 0.Kenneth Graunke2011-05-202-31/+18
| The same as 3e43adef95ee24dd218279d2de56939b90edcb4c but for Gen7.
|
| This doesn't quite fix GL_ARB_depth_texture/fbo-clear-formats; there's still a 1 pixel wide black line on the right edge of the smaller squares. The results were entirely wrong before, and are at least close now.
|
| Signed-off-by: Kenneth Graunke <[email protected]>
* r300: move declaration before codeBrian Paul2011-05-191-1/+1
|
* i965: Add support for rendering to depthbuffer mipmap levels > 0.Eric Anholt2011-05-184-32/+58
| Fixes:
| - GL_ARB_depth_texture/fbo-clear-formats
| - GL_EXT_packed_depth_stencil/fbo-clear-formats
* i965: Stop caching the combined depth/stencil region in brw_context.c.Eric Anholt2011-05-187-55/+53
| | | | | | This was going to get in the way of separate depth/stencil (which wants to know about both, and whether they are the same rb), and also wasn't a sufficient flag for the fix in the following commit.
* i965/gen6: Add support for point min/max size from ARB_point_parameters.Eric Anholt2011-05-181-2/+7
| | | | Fixes glean pointAtten.
* i965/fs: Don't emit a header on gen5+ sample messages unless required.Eric Anholt2011-05-181-7/+19
| | | | | | Improves glbenchmark egypt performance 0.6% +/- 0.4% (n=6). Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Fix GPU hang on texture2d-bias on pre-Ironlake.Eric Anholt2011-05-181-4/+7
| | | | | | | In the 16-wide rework, I missed that we were setting some things to be SIMD16 mode (corresponding to their setup in emit_texture_gen4()). Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Add support for correct GL_CLAMP behavior by clamping coordinates.Eric Anholt2011-05-189-69/+90
| | | | | | | | This removes the stupid strict-conformance fallback code I broke when adding ARB_sampler_objects. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=36572 Reviewed-by: Kenneth Graunke <[email protected]> (v1)
* i965/fs: Drop the viewport index/rtai clearing in gen6 fb writes.Eric Anholt2011-05-181-6/+0
| | | | | | | These fields are documented to be in the payload, and though the FB write docs say they *aren't* in the payload, for all other fields the payload and header is structured so that no overwriting is required except for non-default options.
* i965/fs: Add support for "if" statements in 16-wide mode on gen6+.Eric Anholt2011-05-182-3/+7
| | | | | | | | | | | It turns out there's nothing in the hardware preventing this. It appears that it ought to work on pre-gen6 as well, but just produces GPU hangs. Improves glbenchmark Egypt framerate 4.4% +/- 0.3% (n=3), and Pro by 2.6% +/- 0.6% (n=3). Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Fix discard and alpha test in 16-wide.Eric Anholt2011-05-181-6/+8
| As of gen6, alt-mode (which we use) MOVs of floats are not raw -- they'll modify infs/nans. This broke discard and alpha test in 16-wide, where apparently the upper 8 bits of the pixel enables being set were causing the whole value to get trashed upon being moved. Treating the values as UD instead of float makes sure they get preserved.
|
| While I'm here, replace the two 8-wide moves of the halves of the header with a single compressed move.
|
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=36648
| Reviewed-by: Kenneth Graunke <[email protected]>
* i965/gen6: Fix blending state when no color buffer is bound.Eric Anholt2011-05-181-2/+12
| | | | | | | This is part of fixing fbo-alphatest-nocolor -- a regression in 35e8fe5c99b285f348cb8a1bba2931f120f7c0a1 after the initial regression, that had us using a garbage BLEND_STATE[0] (in particular, the alpha test enable) if no color buffer was bound.
* i965/fs: Cut an instruction and a temporary from gen6 discard statements.Eric Anholt2011-05-182-40/+30
| | | | | | | | I thought I was thwarted initially when I couldn't do conditional mod on a MOV, and couldn't use two immediate constants in one instruction. But g0 != g0 is also a way to produce a failing comparison. Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Fix compiler warnings about dead code from 963431829055f63ec94dEric Anholt2011-05-181-19/+0
|
* i965: Rename IS_GT1 and IS_GT2 to IS_SNB_GT1 and IS_SNB_GT2.Kenneth Graunke2011-05-182-4/+4
| | | | | | This should help distinguish Sandybridge GT1/GT2 from Ivybridge GT1/GT2. Signed-off-by: Kenneth Graunke <[email protected]>
* i965: Instead of fallback on missing region, just bind a null renderbuffer.Eric Anholt2011-05-172-12/+4
| The change for GPU hanging in 13bab58f04c1ec6d0d52760eab490a0997d9abe2 fell back even when rb == NULL, which is wrong for GLES2 and caused segfaulting in GLES2 conformance. For the GPU hang case (where the broken 2D driver failed to allocate a BO for the window system renderbuffer), it also would assertion fail/segfault immediately after the fallback setup when the renderbuffer map failed.
|
| Fixes GLES2 conformance packed_depth_stencil.
|
| Signed-off-by: Eric Anholt <[email protected]>
| Signed-off-by: Kenneth Graunke <[email protected]>
* i965: Updated fixed-point sizes in Ivybridge SAMPLER_STATE.Kenneth Graunke2011-05-171-3/+3
| | | | | | | | | Texture LOD Bias is now S4.8 instead of S4.6; Min LOD, and Max LOD are now U4.8 instead of U4.6. Fixes piglit test tex-miplevel-selection. Signed-off-by: Kenneth Graunke <[email protected]>
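The difference between the 6-bit and 8-bit fractional formats can be made concrete with a small conversion sketch. The function names are hypothetical, and rounding to nearest with clamping is an assumption; the commit message only gives the format names (S4.6/S4.8, U4.6/U4.8), read here as 4 integer bits plus 6 or 8 fractional bits, with an extra sign bit for the S forms.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Unsigned Um.n: m integer bits, n fractional bits.
uint32_t to_unsigned_fixed(float value, int int_bits, int frac_bits)
{
    const float scale = float(1u << frac_bits);
    const float max_raw = float((1u << (int_bits + frac_bits)) - 1);
    const float raw = std::round(value * scale);
    return (uint32_t) std::min(std::max(raw, 0.0f), max_raw);
}

// Signed Sm.n: sign bit, m integer bits, n fractional bits
// (two's complement range).
int32_t to_signed_fixed(float value, int int_bits, int frac_bits)
{
    const float scale = float(1 << frac_bits);
    const float max_raw = float((1 << (int_bits + frac_bits)) - 1);
    const float min_raw = -float(1 << (int_bits + frac_bits));
    const float raw = std::round(value * scale);
    return (int32_t) std::min(std::max(raw, min_raw), max_raw);
}
```

Under these assumptions an LOD of 1.5 encodes as 96 in U4.6 but 384 in U4.8, so writing the old encoding into the new field would select a much smaller LOD than intended.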
* i965: Ivybridge uses the Gen4 SAMPLER_BORDER_COLOR_STATE.Kenneth Graunke2011-05-171-1/+4
| | | | | | Volume 5c 1.13.7 lists it as [PreDevILK] and [DevIVB+]. Signed-off-by: Kenneth Graunke <[email protected]>
* intel: Recognize new Ivybridge PCI IDs.Kenneth Graunke2011-05-172-2/+22
| | | | | Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Disable register spilling on Ivybridge for now.Kenneth Graunke2011-05-171-0/+2
| | | | | | | | The data port messages for this are rather different. For now, fail to compile rather than hanging the GPU. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Fix RNDZ and RNDE on Sandybridge and Ivybridge.Kenneth Graunke2011-05-171-3/+8
| On gen4/5, the RNDZ and RNDE instructions return floor(x), but set special "round increment bits" in the flag register; a predicated ADD (+1) fixes the result.
|
| The documentation still lists '.r' as existing, and says that the predicated add is necessary, but it apparently lies. According to the simulator, BRW_CONDITIONAL_R (7) is not a valid conditional modifier and the RNDZ and RNDE instructions simply produce the correct value.
|
| Signed-off-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Eric Anholt <[email protected]>
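The "floor plus predicated +1" pattern described for gen4/5 can be modelled in software as below. This is only an arithmetic illustration with hypothetical function names: the `increment` boolean stands in for the hardware's round increment bit, whose exact definition is not given in the commit message.

```cpp
#include <cmath>

// Round toward zero, built from floor(x) plus a conditional +1.
float rndz_via_floor(float x)
{
    const float f = std::floor(x);
    // floor() overshoots downward only for negative non-integers.
    const bool increment = (x < 0.0f) && (f != x);
    return increment ? f + 1.0f : f;
}

// Round to nearest even, built from floor(x) plus a conditional +1.
float rnde_via_floor(float x)
{
    const float f = std::floor(x);
    const float frac = x - f;
    const bool odd = std::fmod(f, 2.0f) != 0.0f;
    // Round up past the midpoint, or at the midpoint when floor is odd.
    const bool increment = frac > 0.5f || (frac == 0.5f && odd);
    return increment ? f + 1.0f : f;
}
```

On hardware where the instruction already returns the final value, applying the extra +1 step would double-correct, which is why the commit drops it for Sandybridge and Ivybridge.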
* i965: Fix data port reads on Ivybridge.Kenneth Graunke2011-05-171-2/+12
| | | | | | | These also need to use gen7_dp. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Avoid register coalescing away MATH workarounds on Ivybridge.Kenneth Graunke2011-05-171-1/+1
| | | | | | | | The MATH instruction cannot handle source modifiers, even on Gen7. So, apply this workaround for Sandybridge on Ivybridge as well. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>