path: root/src/mesa/drivers
Commit message | Author | Age | Files | Lines
...
* i965: take the target into account for Gen7 MSAA modes | Chris Forbes | 2013-03-02 | 1 | -3/+19
  Gen7 has an erratum affecting the ld_mcs message, making it unsafe to use when the surface doesn't have an associated MCS.
  From the Ivy Bridge PRM, Vol4 Part1 p77 ("MCS Enable"):
    "If this field is disabled and the sampling engine <ld_mcs> message is issued on this surface, the MCS surface may be accessed. Software must ensure that the surface is defined to avoid GTT errors."
  To allow the shader to treat all surfaces uniformly, force UMS if the surface is to be used as a multisample texture, even if CMS would have been possible.
  V3:
  - Quoted erratum text
  Signed-off-by: Chris Forbes <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
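  The layout decision described in this commit can be sketched roughly as follows. This is only an illustration under stated assumptions: the enum, the function name, and the parameters are hypothetical and are not the driver's actual identifiers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the driver's MSAA layout modes. */
enum msaa_layout { LAYOUT_NONE, LAYOUT_UMS, LAYOUT_CMS };

/*
 * Choose a layout for a Gen7 color surface (illustrative only).
 *   num_samples     - sample count of the surface
 *   cms_possible    - whether a compressed layout could otherwise be used
 *   used_as_texture - whether the surface will be sampled as a
 *                     multisample texture
 */
static enum msaa_layout
choose_gen7_layout(unsigned num_samples, bool cms_possible,
                   bool used_as_texture)
{
   assert(num_samples >= 1);

   if (num_samples == 1)
      return LAYOUT_NONE;

   /* ld_mcs is unsafe on a surface without an MCS, so textures are
    * forced to UMS and shaders need no per-surface special case. */
   if (used_as_texture)
      return LAYOUT_UMS;

   return cms_possible ? LAYOUT_CMS : LAYOUT_UMS;
}
```

  The point of the sketch is the ordering of the checks: the texture test comes before the CMS test, so CMS is only chosen for surfaces that will never be sampled.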
* i965: Support multisampling in surface_state for textures | Chris Forbes | 2013-03-02 | 2 | -5/+6
  The surface_state setup for renderbuffers already worked; only the texturing side needed work. BLORP does something similar, but does its own surface_state setup.
  On Gen6, we just need to set the correct sample count.
  On Gen7:
  - set the correct sample count
  - set the correct layout mode
  - set GEN7_SURFACE_ARYSPC_LOD0 if it's set in the miptree.
  V2:
  - Clarify commit message
  - Rebased onto Paul's physical/logical dims cleanup
  - Added Gen7 support
  Signed-off-by: Chris Forbes <[email protected]>
  Reviewed-by: Paul Berry <[email protected]>
* i965: add support for multisample textures | Chris Forbes | 2013-03-02 | 6 | -7/+55
  V2:
  - Fix for state moving from texobj to image
  - Rebased onto Paul's logical/physical cleanup
  - Fixed missing quantization of sample count
  - Fold in IMS renderbuffer wrapper fixes from later in the series
  - Use correct physical slice offset for UMS/CMS surfaces on Gen7
  Signed-off-by: Chris Forbes <[email protected]>
  Reviewed-by: Paul Berry <[email protected]>
* i965: expose sample positions | Chris Forbes | 2013-03-02 | 3 | -43/+82
  Moves the definition of the sample positions out of gen6_emit_3dstate_multisample, and unpacks them in gen6_get_sample_position.
  V2: Be consistent about `sample position` rather than `location`.
  Signed-off-by: Chris Forbes <[email protected]>
  Reviewed-by: Paul Berry <[email protected]>
  Acked-by: Ian Romanick <[email protected]>
* i965: add support for sample mask on Gen6+ | Chris Forbes | 2013-03-02 | 4 | -9/+16
  Signed-off-by: Chris Forbes <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Paul Berry <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
* mesa: implement GetMultisamplefv | Chris Forbes | 2013-03-02 | 1 | -0/+3
  Actual sample locations are deferred to a driverfunc, since only the driver really knows where they will be.
  V2:
  - pass the draw buffer to the driverfunc; don't fall back to pixel center if driverfunc is missing.
  - rename GetSampleLocation to GetSamplePosition
  - invert y sample position for winsys FBOs, at Paul's suggestion
  Signed-off-by: Chris Forbes <[email protected]>
  Reviewed-by: Paul Berry <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
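  The y inversion mentioned in the V2 notes can be illustrated with a small sketch. It assumes sample positions are expressed as (x, y) in the range [0, 1] within a pixel; the function name and the flag are hypothetical, not Mesa's actual code.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Flip the y coordinate of a sample position when the framebuffer is a
 * window-system FBO, whose origin is at the opposite vertical edge.
 * pos[0] is x and pos[1] is y, both assumed to lie in [0, 1].
 */
static void
flip_sample_position_y(float pos[2], bool is_winsys_fbo)
{
   assert(pos[0] >= 0.0f && pos[0] <= 1.0f);
   assert(pos[1] >= 0.0f && pos[1] <= 1.0f);

   if (is_winsys_fbo)
      pos[1] = 1.0f - pos[1];
}
```

  User-created FBOs are left untouched in this sketch; only the window-system case mirrors the position.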
* i965: expose new max sample counts | Chris Forbes | 2013-03-02 | 1 | -2/+10
  V2: For now, only expose a depth sample count of 1, since there are possible unresolved interactions with HiZ.
  Signed-off-by: Chris Forbes <[email protected]>
  Reviewed-by: Paul Berry <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
* intel: Use the new "ctx" local variable I just added some more. | Eric Anholt | 2013-03-01 | 1 | -2/+2
  Reviewed-and-tested-by: Ian Romanick <[email protected]>
* i965: Make sRGB-capable framebuffers by default. | Eric Anholt | 2013-03-01 | 2 | -3/+63
  The GLX extension lets you expose visuals that explicitly guarantee you that the GL_FRAMEBUFFER_SRGB_CAPABLE flag will be set, but we can set the flag even while the visual doesn't provide the guarantee.
  This appears to be consistent with other implementations, as we've seen several apps now that don't require an sRGB visual and assume sRGB will work without checking the GL_FRAMEBUFFER_SRGB_CAPABLE flag.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55783
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60633
  Reviewed-and-tested-by: Ian Romanick <[email protected]>
* intel: Fix software copying of miptree faces for weird formats. | Eric Anholt | 2013-03-01 | 3 | -61/+77
  Now that we have W-tiled S8, we can't just region_map and poke at bits -- there has to be some swizzling. Rely on intel_miptree_map to get that job done. This should also get the highest performance path we know of for the mapping (interesting if I get around to finishing movntdqa some day).
  v2: Fix stale name of the bit in a comment.
  Reviewed-by: Chad Versace <[email protected]>
* intel: Add a flag for miptree mapping to disable transcoding. | Eric Anholt | 2013-03-01 | 2 | -4/+17
  I want to reuse intel_miptree_map() to replace some region mapping that's broken for separate stencil, but doing so would result in new demands on ETC transcode that we actually don't want to happen.
  Reviewed-by: Chad Versace <[email protected]>
* i965: Add WARN_ONCE for depthstencil workarounds we shouldn't be hitting. | Eric Anholt | 2013-03-01 | 2 | -0/+6
  Reviewed-by: Chad Versace <[email protected]>
* intel: Enable __DRI_API_OPENGL_CORE api with dri2 contexts | Jordan Justen | 2013-02-28 | 1 | -0/+2
  Without this set, dri_util.c:dri2CreateContextAttribs will reject requests to create a context with __DRI_API_OPENGL_CORE.
  This prevents a 3.2 core profile context from being created even when MESA_GL_VERSION_OVERRIDE=3.2 is used.
  Signed-off-by: Jordan Justen <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel: update max versions based on MESA_GL_VERSION_OVERRIDE | Jordan Justen | 2013-02-28 | 1 | -0/+10
  If the override version is >= 3.1, then update max_gl_core_version. Otherwise, update max_gl_compat_version.
  Signed-off-by: Jordan Justen <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
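  The rule stated in this commit message could look something like the sketch below. It assumes versions are encoded as major * 10 + minor (so 3.1 is 31) and that a value of 0 means no override; the struct and function names are hypothetical, and only the two field names come from the commit message.

```c
#include <assert.h>

/* Hypothetical container for the two limits named in the commit message. */
struct screen_versions {
   int max_gl_core_version;
   int max_gl_compat_version;
};

/*
 * Apply a version override (assumed encoding: major * 10 + minor).
 * Overrides of 3.1 and above raise the core limit; anything lower is
 * treated as a compatibility-profile version.
 */
static void
apply_version_override(struct screen_versions *s, int override_version)
{
   assert(s);

   if (override_version <= 0)
      return; /* no override requested */

   if (override_version >= 31)
      s->max_gl_core_version = override_version;
   else
      s->max_gl_compat_version = override_version;
}
```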
* i965/fs: Put immediate operand as src2 | Matt Turner | 2013-02-28 | 1 | -1/+1
  Immediate operands can only be src2 in 2-source instructions.
  Fixes piglit failures since 0a1d145e (oops!).
  Spotted-by: Eric Anholt <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* intel: Remove intel_mipmap_tree::wraps_etc | Chad Versace | 2013-02-28 | 2 | -21/+3
  The field was equivalent to (etc_format != MESA_FORMAT_NONE), and therefore duplicate information.
  This patch removes the field and replaces all references to it with `etc_format != MESA_FORMAT_NONE`.
  No Piglit ETC test regresses on Intel Sandybridge.
  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Anuj Phogat <[email protected]>
  Signed-off-by: Chad Versace <[email protected]>
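  The idea of deriving a flag from existing state rather than storing it separately can be shown with a small sketch. The enum and struct here are hypothetical stand-ins for the real Mesa types, and the commit itself inlines the comparison rather than adding a helper like this one.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for MESA_FORMAT_NONE and an ETC format. */
enum fake_format { FAKE_FORMAT_NONE = 0, FAKE_FORMAT_ETC1_RGB8 };

/* Only etc_format is stored; there is no separate wraps_etc field that
 * could drift out of sync with it. */
struct fake_miptree {
   enum fake_format etc_format;
};

/* Derive the old boolean from the format on demand. */
static bool
miptree_wraps_etc(const struct fake_miptree *mt)
{
   assert(mt);
   return mt->etc_format != FAKE_FORMAT_NONE;
}
```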
* i965/vs: Assert that ir_triop_lrp was lowered. | Matt Turner | 2013-02-28 | 1 | -0/+4
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fp: Use the LRP instruction for OPCODE_LRP. | Matt Turner | 2013-02-28 | 1 | -8/+4
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Use the LRP instruction for ir_triop_lrp when possible. | Kenneth Graunke | 2013-02-28 | 7 | -5/+75
  v2 [mattst88]:
  - Add BRW_OPCODE_LRP to list of CSE-able expressions.
  - Fix op_var[] array size.
  - Rename arguments to emit_lrp to (x, y, a) to clear confusion.
  - Add LRP function to brw_fs.cpp/.h.
  - Corrected comment about LRP instruction arguments in emit_lrp.
  v3 [mattst88]:
  - Duplicate MAD code for LRP instead of using a function pointer.
  - Check for != GRF instead of == IMM in emit_lrp.
  - Lower LRP on gen < 6.
  Reviewed-by: Matt Turner <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
  Signed-off-by: Kenneth Graunke <[email protected]>
* i965: Add support for emitting the LRP instruction. | Kenneth Graunke | 2013-02-28 | 4 | -0/+4
  Like MAD, this is another three-source instruction.
  Reviewed-by: Matt Turner <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
  Signed-off-by: Kenneth Graunke <[email protected]>
* glsl: Convert mix() to use a new ir_triop_lrp opcode. | Kenneth Graunke | 2013-02-28 | 1 | -1/+2
  Many GPUs have an instruction to do linear interpolation which is more efficient than simply performing the algebra necessary (two multiplies, an add, and a subtract).
  Pattern matching or peepholing this is more desirable, but can be tricky. By using an opcode, we can at least make shaders which use the mix() built-in get the more efficient behavior.
  Currently, all consumers lower ir_triop_lrp. Subsequent patches will actually generate different code.
  v2 [mattst88]:
  - Add LRP_TO_ARITH flag to ir_to_mesa.cpp. Will be removed in a subsequent patch and ir_triop_lrp translated directly.
  v3 [mattst88]:
  - Move changes from the next patch to opt_algebraic.cpp to accept 3-src operations.
  Reviewed-by: Matt Turner <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
  Signed-off-by: Kenneth Graunke <[email protected]>
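  For reference, the "algebra necessary" that a lowering pass falls back to is the standard definition of mix(x, y, a). A minimal scalar sketch in C follows; the function name is hypothetical, and a backend with a native LRP instruction would emit that single instruction instead.

```c
#include <assert.h>

/*
 * Expanded form of mix(x, y, a): x * (1 - a) + y * a.
 * This costs one subtract, two multiplies, and one add, which is the
 * arithmetic a dedicated interpolation instruction replaces.
 */
static float
lrp_arith(float x, float y, float a)
{
   assert(a == a); /* reject a NaN weight in this sketch */

   float one_minus_a = 1.0f - a;      /* subtract */
   return x * one_minus_a + y * a;    /* two multiplies, one add */
}
```

  With a = 0 the result is x, with a = 1 it is y, and intermediate weights blend linearly between the two.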
* i965/vs/gen7: Allow MATH instructions to have MRF as a destination | Matt Turner | 2013-02-28 | 1 | -1/+1
  total instructions in shared programs: 346873 -> 346847 (-0.01%)
  instructions in affected programs: 364 -> 338 (-7.14%)
  (All affected shaders are from Lightsmark)
  Reviewed-by: Eric Anholt <[email protected]>
* i965/fs/gen7: Allow MATH instructions to have MRF as a destination | Matt Turner | 2013-02-28 | 1 | -1/+1
  total instructions in shared programs: 1376297 -> 1375626 (-0.05%)
  instructions in affected programs: 35977 -> 35306 (-1.87%)
  Reviewed-by: Eric Anholt <[email protected]>
* i965/gen7: Relax restrictions on fake MRFs | Matt Turner | 2013-02-28 | 1 | -2/+4
  Gen6 has write-only MRF registers, and for ease of implementation we partition off 16 general-purpose registers to act as MRFs on Gen7.
  Knowing that our Gen7 MRFs are actually GRFs, we can do things we can't do with real MRFs:
  - read from them;
  - return values directly to them from a send instruction; and
  - compute directly to them with math instructions.
  Reviewed-by: Eric Anholt <[email protected]>
* i965/fs: Remove duplicate scan_inst->mlen check | Matt Turner | 2013-02-28 | 1 | -5/+0
  It is already checked 20 lines below.
  Reviewed-by: Eric Anholt <[email protected]>
* i965: Fix the W value of deprecated pointcoords on pre-gen6. | Eric Anholt | 2013-02-25 | 1 | -1/+18
  When you didn't have a texcoord array bound (or a non-1 current w attrib), we were telling the fragment shader that it could just use "1" instead of doing expensive pre-gen6 math to invert it. If you drew the point with a non-1 W value, then you'd get the right size (since all the vertex computations worked), but we'd mis-interpolate the coordinate across the face.
  Fixes the mesa pointsprite demo on GM45.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=30232
  Reviewed-and-tested-by: Ian Romanick <[email protected]>
  Note: This is a candidate for the stable branches.
* i965: Enable OpenGL ES 3.0 on Sandy Bridge | Ian Romanick | 2013-02-22 | 1 | -1/+1
  Regardless of what we put in the screen structure, all of the extensions that compute_version_es2 checks are present and 3.0 will be exposed anyway.
  NOTE: This is a candidate for the 9.1 branch.
  Signed-off-by: Ian Romanick <[email protected]>
* meta: Allocate texture before initializing texture coordinates | Anuj Phogat | 2013-02-22 | 1 | -9/+8
  tex->Sright and tex->Ttop are initialized during texture allocation. This fixes depth buffer blitting failures in khronos conformance tests when run on desktop GL 3.0.
  Fixes https://bugs.freedesktop.org/show_bug.cgi?id=59495
  Note: This is a candidate for stable branches.
  Signed-off-by: Anuj Phogat <[email protected]>
  Reviewed-by: Chad Versace <[email protected]>
* i965/fs: Fix broken math on values loaded from uniform buffers on gen6. | Eric Anholt | 2013-02-22 | 1 | -0/+1
  In a debug build this led to assertion failures, but on a non-debug build the hardware would just reference the whole vec8 instead of the same channel 8 times.
  Fixes the new piglit glsl-1.40/uniform-buffer/fs-exp2.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=57121
  Note: This is a candidate for the stable branches
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Avoid segfault in gen6_upload_state | Carl Worth | 2013-02-21 | 1 | -1/+1
  This fixes a bug introduced in commit 258453716f001eab1288d99765213 and triggered whenever "rb" is NULL.
  Fixes at least one cause of bug #59445:
    [SNB/IVB/HSW Bisected]Oglc draw-buffers2(advanced.blending.none) segfault
    https://bugs.freedesktop.org/show_bug.cgi?id=59445
  (Segfaults are still possible in that test case, but they have been present since before commit 258453716f, which is what's being fixed here.)
  Reviewed-by: Eric Anholt <[email protected]>
* i965: Consign COORD_REPLACE VS hacks to Pre-Gen6. | Paul Berry | 2013-02-20 | 3 | -11/+34
  Pre-Gen6, the SF thread requires exact matching between VS output slots (aka VUE slots) and FS input slots, even when the corresponding VS output slot is unused due to being overwritten by point coordinate replacement (glTexEnvi(GL_POINT_SPRITE, GL_COORD_REPLACE, GL_TRUE)). As a result, we have a special hack in the VS to ensure that when any texture coordinate is subject to point coordinate replacement, it is always allocated space in the VUE, even if it isn't written to by the VS.
  This hack isn't needed from Gen6 onwards, since SF (Gen7: SBE) swizzling has the ability to insert the point coordinate into gl_TexCoord[] without needing a corresponding unused VUE slot.
  Note that no modification of SF setup code is required for this patch--get_attr_override() already does the right thing. However, we make a slight comment change to clarify why this works.
  In addition to eliminating unnecessary VS recompiles and saving precious URB space on Gen6+, this will save us the trouble of having to adjust this hack when we implement geometry shaders.
  Reviewed-by: Kenneth Graunke <[email protected]>
* gles2: a stub implementation for GL_EXT_discard_framebuffer | Tapani Pälli | 2013-02-20 | 1 | -0/+1
  This patch implements a stub for GL_EXT_discard_framebuffer with required checks listed by the extension specification. This extension is required by GLBenchmark 2.5 when compiled with OpenGL ES 2.0 as the rendering backend.
  Signed-off-by: Tapani Pälli <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-and-tested-by: Chad Versace <[email protected]>
* i965/fs: Enable CSE on uniform pull constant loads. | Eric Anholt | 2013-02-19 | 1 | -0/+3
  Improves on a major performance regression for the Dolphin Wii emulator from its move to using UBOs. Performance in the UBO codepath (as replayed through apitrace) is up 21.1% +/- 2.3% (n=26/29).
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Only do CSE when the dst types match. | Eric Anholt | 2013-02-19 | 1 | -1/+2
  We could potentially do some CSE even when the dst types aren't the same on gen6 where there is no implicit dst type conversion iirc, or in the case of uniform pull constant loads where the dst type doesn't impact what's stored. But it's not worth worrying about.
  Reviewed-by: Kenneth Graunke <[email protected]>
  NOTE: This is a candidate for the 9.1 branch.
* i965/fs: Delay setup of uniform loads until after pre-regalloc scheduling. | Eric Anholt | 2013-02-19 | 3 | -27/+66
  This should fix the register allocation explosion on the GLES 3.0 test on gen6. It also gives us an instruction that will fit our CSE handling.
  Reviewed-by: Kenneth Graunke <[email protected]>
  NOTE: This is a candidate for the 9.1 branch.
* i965/fs: Fix copy propagation with smearing. | Eric Anholt | 2013-02-19 | 1 | -1/+2
  We were correctly relaying the smear from MOV's src, but if the MOV didn't do a smear, we don't want to smash the smear value from the instruction being propagated into.
  Prevents a regression in the upcoming UBO change.
  Reviewed-by: Kenneth Graunke <[email protected]>
  NOTE: This is a candidate for the 9.1 branch.
* i965/fs: Add a bit more instruction dumping useful for upcoming work. | Eric Anholt | 2013-02-19 | 1 | -1/+30
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Remove unused userclip flags. | Paul Berry | 2013-02-19 | 3 | -5/+0
  brw_vs_prog_data::userclip hasn't been used since commit f0cecd4 (i965: Move VUE map computation to once at VS compile time).
  brw_gs_prog_key::userclip_active hasn't been used since commit 9f3d321 (i965: Make the userclip flag for the VUE map come from VS prog data).
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Fix leak in blorp CopyTexSubImage2D | Christopher James Halse Rogers | 2013-02-16 | 1 | -2/+2
  _mesa_delete_renderbuffer does not call the driver-specific renderbuffer delete function, so the blorp code was leaking the Intel-specific bits, including some GEM objects.
  Call the renderbuffer's ->Delete() method instead, which does the right thing.
  Fixes Unity rapidly sending the machine into the arms of the OOM-killer.
  Note: This is a candidate for the 9.1 branch.
  Reviewed-by: Eric Anholt <[email protected]>
* i965/fs: Do a general SEND dependency workaround for the original 965. | Eric Anholt | 2013-02-15 | 3 | -42/+229
  We'd been ad-hoc inserting instructions in some SEND messages with no knowledge of when it was required (so extra instructions), but not all SENDs (so not often enough). This should do much better than that, though it's still flow-control-ignorant.
  v2: Use BRW_MAX_MRF instead of magic numbers.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=58960
  Reviewed-by: Kenneth Graunke <[email protected]>
  NOTE: Candidate for the stable branches.
* i965/gen7: Set up all samplers even if samplers are sparsely used. | Eric Anholt | 2013-02-14 | 1 | -1/+1
  In GLSL, sampler indices are allocated contiguously from 0. But in the case of ARB_fragment_program (and possibly fixed function), an app that uses texture 0 and 2 will use sampler indices 0 and 2, so we were only allocating space for samplers 0 and 1 and setting up sampler 0. We would read garbage for sampler 2, resulting in flickering textures and an angry simulator.
  Fixes bad rendering in 0 A.D. and ETQW.
  This was fixed for pre-gen7 by 28f4be9eb91b12a2c6b1db6660cca71a98c486ec
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=25201
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=58680
  Reviewed-by: Kenneth Graunke <[email protected]>
  NOTE: This is a candidate for stable branches.
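  The difference between "how many samplers are used" and "how many sampler slots must exist" can be sketched with a bitmask of used sampler indices. This is an illustration only; the function names are hypothetical, and the bitmask representation is an assumption rather than a description of the driver's data structures.

```c
#include <assert.h>
#include <stdint.h>

/* Number of bits set in the mask, i.e. how many samplers are in use. */
static unsigned
count_used_samplers(uint32_t used_mask)
{
   unsigned n = 0;
   for (; used_mask; used_mask >>= 1)
      n += used_mask & 1u;
   return n;
}

/* Highest used index plus one, i.e. how many slots must be set up so
 * that every used index refers to valid state. */
static unsigned
sampler_slots_needed(uint32_t used_mask)
{
   unsigned slots = 0;
   for (; used_mask; used_mask >>= 1)
      slots++;
   assert(slots <= 32);
   return slots;
}
```

  For an app that uses samplers 0 and 2, the mask is 0x5: two samplers are in use, but three slots are needed. Sizing by the first number leaves index 2 pointing at unset state, which matches the symptom the commit describes.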
* intel: Allow blit readpixels even when the pack alignment is set. | Eric Anholt | 2013-02-13 | 1 | -9/+4
  The default alignment is 4, so this fast path was rarely hit. Rather than introduce logic to handle alignment, just use the Mesa core function.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=46632
  Cc: [email protected]
  Reviewed-by: Kenneth Graunke <[email protected]>
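  For context, pack alignment affects the byte stride between rows of the destination image: each row starts at a multiple of the alignment. A minimal sketch of that calculation follows, assuming tightly packed pixels and no other pack state (row length, skip values); the function name is hypothetical and this is not the Mesa core function the commit refers to.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Byte stride of one row when each row must start on a multiple of
 * `alignment` bytes. OpenGL allows alignments of 1, 2, 4, or 8, with 4
 * as the default.
 */
static size_t
aligned_row_stride(size_t width, size_t bytes_per_pixel, size_t alignment)
{
   assert(alignment == 1 || alignment == 2 ||
          alignment == 4 || alignment == 8);

   size_t row_bytes = width * bytes_per_pixel;

   /* Round up to the next multiple of the (power-of-two) alignment. */
   return (row_bytes + alignment - 1) & ~(alignment - 1);
}
```

  A 3-pixel-wide RGB row is 9 bytes, so with the default alignment of 4 the stride becomes 12, while with an alignment of 1 it stays at 9.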
* i965: Remove writemask support from brw_SAMPLE(). | Eric Anholt | 2013-02-13 | 5 | -109/+18
  The code was rather broken for non-XYZW on 8-wide, but all of our callers were using XYZW anyway. For my experiments with using writemask on texturing, I've been using manual header setup in the compiler backends, since we want to actually know what registers are written for optimization and register allocation.
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Use a helper function for checking for flow control instructions. | Eric Anholt | 2013-02-13 | 3 | -23/+22
  In 2 of our checks, we were missing BREAK and CONTINUE.
  NOTE: Candidate for the stable branches.
  Reviewed-by: Kenneth Graunke <[email protected]>
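  The benefit of a single helper is that every call site sees the same opcode list, so an opcode such as BREAK or CONTINUE cannot be forgotten in one place. A minimal sketch, using a hypothetical opcode enum rather than the driver's real BRW_OPCODE_* values:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical opcode set for illustration. */
enum fake_opcode {
   FAKE_OP_MOV,
   FAKE_OP_ADD,
   FAKE_OP_IF,
   FAKE_OP_ELSE,
   FAKE_OP_ENDIF,
   FAKE_OP_DO,
   FAKE_OP_WHILE,
   FAKE_OP_BREAK,
   FAKE_OP_CONTINUE,
   FAKE_OP_COUNT
};

/* One shared predicate instead of several hand-written opcode lists. */
static bool
is_control_flow(enum fake_opcode op)
{
   assert(op < FAKE_OP_COUNT);

   switch (op) {
   case FAKE_OP_IF:
   case FAKE_OP_ELSE:
   case FAKE_OP_ENDIF:
   case FAKE_OP_DO:
   case FAKE_OP_WHILE:
   case FAKE_OP_BREAK:
   case FAKE_OP_CONTINUE:
      return true;
   default:
      return false;
   }
}
```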
* i965: Re-enable the -RHW workaround for original gen4 chips. | Eric Anholt | 2013-02-13 | 1 | -12/+8
  Fixes broken clipping in supertuxkart and presumably many other applications.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=51471
  NOTE: Candidate for the stable branches.
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/gen4: Work around missing sRGB RGB DXT1 support. | Eric Anholt | 2013-02-13 | 3 | -4/+20
  The hardware just doesn't support it. I suspect this was a regression from the move to fixed MESA_FORMATs for compressed textures and that previously we were storing uncompressed for this or something.
  Fixes GPU hangs in piglit "texwrap GL_EXT_texture_sRGB-s3tc bordercolor swizzled" on my GM965.
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Use derived state for Haswell's 3DSTATE_VF packet. | Kenneth Graunke | 2013-02-12 | 1 | -2/+2
  Otherwise, we fail to correctly handle GL_PRIMITIVE_RESTART_FIXED_INDEX.
  Fixes gles3conform's primitive_restart_mode test.
  NOTE: This is a candidate for the 9.1 branch.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* radeon: Remove dead STANDALONE_MMIO defines | Matt Turner | 2013-02-11 | 2 | -3/+0
  These were, at some point in the past, used to request that Xorg's compiler.h export a static inline xf86ReadMmio32 instead of a function pointer. compiler.h only has this option for DEC Alpha.
  But Xorg's compiler.h isn't being included by either of these two files and the radeon driver still works on Alpha, so the definitions are dead and not needed.
  Reviewed-by: Michel Dänzer <[email protected]>
* i965: Add missing dirty bits to INTEL_DEBUG=state arrays. | Kenneth Graunke | 2013-02-11 | 1 | -0/+7
  These are more recent additions, and no one remembered to update the INTEL_DEBUG=state code.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* i965: Reorganize brw_bits to match the order in brw_context.h. | Kenneth Graunke | 2013-02-11 | 1 | -5/+5
  This reorders the "brw_bits" array in brw_state_upload.c to match the order of the #defines in brw_context.h. Otherwise, it's really hard to see if any are missing.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>