path: root/src/mesa/drivers/dri
Commit message (Author, Age, Files, Lines)
* i965: Don't fill buffer with zeroes. (Kenneth Graunke, 2013-03-06, 1 file, -6/+0)
| | | | | | | | This was only necessary because our bounds checking was off by one, and thus we read an extra pair of values. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Fix off-by-one in query object result gathering. (Kenneth Graunke, 2013-03-06, 1 file, -2/+2)
| | | | | | | | | | | If we've written N pairs of values to the buffer, then last_index = N, but the values are 0 .. N-1. Thus, we need to use <, not <=. This worked anyway because we fill the buffer with zeroes, so we just added an extra (0 - 0) to our results. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
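The bound described above can be sketched as follows. This is a minimal illustration, not the driver's code: gather_pair_results() and the flat start/end buffer layout are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: sums (end - start) over the pairs written so far.
 * results[] is assumed to hold start/end values back to back. */
static uint64_t
gather_pair_results(const uint64_t *results, int last_index)
{
   uint64_t total = 0;

   assert(last_index >= 0);

   /* last_index pairs were written, at indices 0 .. last_index - 1,
    * so the loop bound must be <, not <=. */
   for (int i = 0; i < last_index; i++)
      total += results[i * 2 + 1] - results[i * 2];

   return total;
}
```

With <= the loop would also read the pair at index last_index, which was never written.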
* intel: Improve the matching (more formats!) for TexImage from PBOs. (Eric Anholt, 2013-03-05, 1 file, -25/+3)
| | | | | | | Mesa core is the place for encoding what format/type matches a mesa format, so rely on that. Reviewed-by: Chad Versace <[email protected]>
* intel: Improve the test for readpixels blit path format checking. (Eric Anholt, 2013-03-05, 3 files, -37/+6)
| | | | | | | | We were allowing things like copying RG1616 to a user's ARGB8888 format, while we were denying anything that wasn't ARGB8888 or RGB565. Reviewed-by: Chad Versace <[email protected]>
* intel: Fold intel_region_copy() into its one caller. (Eric Anholt, 2013-03-05, 3 files, -53/+16)
| | | | | | | | | This is similar code to intel_miptree_copy_slice, but the knobs are all set differently. v2: fix whitespace Reviewed-by: Chad Versace <[email protected]>
* intel: Transition intel_region_map() to being a miptree operation. (Eric Anholt, 2013-03-05, 5 files, -91/+55)
| | | | | | | | | | I'm trying to move us away from the region structure, and all the callers are currently dereferencing a miptree to get the region. In this change, the map_refcount is dropped. However, the bo->virtual is itself map refcounted, so that's already dealt with. Reviewed-by: Chad Versace <[email protected]>
* intel: Remove num_mapped_regions tracking. (Eric Anholt, 2013-03-05, 2 files, -14/+0)
| | | | | | | | The point of tracking the value was removed in February 2012 (65b096aeddd9b45ca038f44cc9adfff86c8c48b2), and this should have been removed at the same time. Reviewed-by: Chad Versace <[email protected]>
* intel: Remove the struct intel_region reuse hash table. (Eric Anholt, 2013-03-05, 4 files, -39/+2)
| | | | | | | | | | | | I don't see any reason for it -- it was introduced with the DRI2 invalidate work by krh in 2010 with no explanation. I suspect it was something about wanting the same drm_intel_bo struct underneath multiple openings of the BO within one process, but that's covered by libdrm at this point. As far as the struct region goes, it is not threadsafe, so multiple contexts sharing a region could have mixed up the map_count and assertion failed or worse. Reviewed-by: Chad Versace <[email protected]>
* intel: Add missing perf debug for a stall on mapping a BO. (Eric Anholt, 2013-03-05, 1 file, -0/+2)
| | | | | | | | I was testing the ARB_debug_output code and wrote an obvious sample that should have hit this, and got confused that my ARB_debug_output was broken. Reviewed-by: Jordan Justen <[email protected]>
* i965: Make perf_debug() output to GL_ARB_debug_output in a debug context. (Eric Anholt, 2013-03-05, 16 files, -48/+83)
| | | | | | | | I tried to ensure that performance in the non-debug case doesn't change (we still just check one condition up front), and I think the impact is small enough in the debug context case to warrant including all of it. Reviewed-by: Jordan Justen <[email protected]>
* intel: Finish renaming fallback_debug() to perf_debug(). (Eric Anholt, 2013-03-05, 5 files, -18/+13)
| | | | | | | They're about to change to handle GL_ARB_debug_output, so just make one function. Reviewed-by: Jordan Justen <[email protected]>
* intel: Hook up the WARN_ONCE macro to GL_ARB_debug_output. (Eric Anholt, 2013-03-05, 4 files, -0/+8)
| | | | | | | | | | This doesn't provide detailed error type information, but it's important to get these relatively severe but rare error messages out to the developer through whatever mechanism they are using. v2: Rebase on new WARN_ONCE additions. Reviewed-by: Jordan Justen <[email protected]> (v1)
* dri/nouveau: NV17_3D class is not available for NV1a chipset (Marcin Slusarz, 2013-03-05, 1 file, -1/+1)
| | | | | | | | Should fix https://bugs.freedesktop.org/show_bug.cgi?id=60510 Note: this is a candidate for the stable branches Acked-by: Francisco Jerez <[email protected]>
* i965: Fix Crystal Well PCI IDs. (Kenneth Graunke, 2013-03-03, 1 file, -9/+9)
| | | | | | | | | The second digit was off by one, which meant we accidentally treated GTn as GT(n-1). This also meant no support for GT1 at all. NOTE: This is a candidate for stable branches. Signed-off-by: Kenneth Graunke <[email protected]>
* i965: Pull query BO reallocation out into a helper function. (Kenneth Graunke, 2013-03-01, 1 file, -23/+33)
| | | | | | | | | We'll want to reuse this for non-occlusion queries in the future. Plus, it's a single logical task, so having it as a helper function clarifies the code somewhat. Signed-off-by: Kenneth Graunke <[email protected]>
* i965: Replace the global brw->query.bo variable with query->bo. (Kenneth Graunke, 2013-03-01, 2 files, -16/+7)
| | | | | | | | | | Again, eliminating a global variable in favor of a per-query object variable will help in a future where we have more queries in hardware. Personally, I find this clearer: there's just the query object's BO, rather than two variables that usually shadow each other. Signed-off-by: Kenneth Graunke <[email protected]>
* i965: Turn if (query->bo) into an assertion. (Kenneth Graunke, 2013-03-01, 1 file, -5/+5)
| | | | | | | The code a few lines above calls brw_emit_query_begin() if !query->bo, and that creates query->bo. So it should always be non-NULL. Signed-off-by: Kenneth Graunke <[email protected]>
* i965: Unify query object BO reallocation code. (Kenneth Graunke, 2013-03-01, 1 file, -11/+10)
| | | | | | | | | | | | | If we haven't allocated a BO yet, we need to do that. Or, if there isn't enough room to write another pair of values, we need to gather up the existing results and start a new one. This is simple enough. However, the old code was awkwardly split into two blocks, with a write_depth_count() placed in the middle. The new depth count isn't relevant to gathering the old BO's data, so that can go after the reallocation is done. With the two blocks adjacent, we can merge them. Signed-off-by: Kenneth Graunke <[email protected]>
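The two conditions that trigger a new BO can be sketched as one predicate. This is only an illustration: query_needs_new_bo(), have_bo and max_pairs are hypothetical names, and the real code works on the query object and its BO rather than on plain integers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical predicate, not taken from the driver. max_pairs is the
 * assumed number of value pairs that fit in one buffer. */
static bool
query_needs_new_bo(bool have_bo, int last_index, int max_pairs)
{
   assert(last_index >= 0 && last_index <= max_pairs);

   /* Either no BO has been allocated yet, or there is no room left
    * to write another pair of values. */
   return !have_bo || last_index == max_pairs;
}
```

Writing the new depth count would then happen only after this check and any reallocation are done.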
* i965: Use query->last_index instead of the global brw->query.index. (Kenneth Graunke, 2013-03-01, 2 files, -7/+6)
| | | | | | | | | | | | Since we already have an index in the brw_query_object, there's no need to also keep a global variable that shadows it. Plus, if we ever add support for more types of queries that still need the per-batch before/after treatment we do for occlusion queries, we won't be able to use a single global variable. In contrast, per-query object variables will work fine. Signed-off-by: Kenneth Graunke <[email protected]>
* i965: Remove brw_query_object::first_index field as it's always 0. (Kenneth Graunke, 2013-03-01, 2 files, -6/+3)
| | | | | | | | | | | brw->query.index is initialized to 0 just a few lines before it's copied to first_index. Presumably the idea here was to reuse the query BO for subsequent queries of the same type, but since that doesn't happen, there's no need to have the extra code complexity. Signed-off-by: Kenneth Graunke <[email protected]>
* i965: Add a pile of comments to brw_queryobj.c. (Kenneth Graunke, 2013-03-01, 1 file, -16/+143)
| | | | | | | | | | | | | | | | | | | | | | | | | | This code was really difficult to follow, for a number of reasons: - Queries were handled in four different ways (TIMESTAMP writes a single value, TIME_ELAPSED writes a single pair of values, occlusion queries write pairs of values for the start and end of each batch, and other queries are done entirely in software). It turns out that there are very good reasons each query is handled the way it is, but insufficient comments explaining the rationale. - It wasn't immediately obvious which functions were driver hooks and which were helper functions. For example, brw_query_begin() is a driver hook that implements glBeginQuery() for all query types, but the similarly named brw_emit_query_begin() is a helper function that's only relevant for occlusion queries. Extra explanatory comments should save me and others from constantly having to ask how this code works and why various query types are handled differently. v2: Incorporate Eric's feedback: change "as soon as possible" to "the results will be present when mapped." Signed-off-by: Kenneth Graunke <[email protected]>
* i965: Write TIMESTAMP query values into the first buffer element. (Kenneth Graunke, 2013-03-01, 1 file, -4/+3)
| | | | | | | | | | For timestamp queries, we just write a single value to a BO. The natural place to write that is element 0, so we should do that. Previously, we wrote it into element 1 (the second slot) leaving element 0 filled with garbage. Signed-off-by: Kenneth Graunke <[email protected]>
* i965: Implement the new QueryCounter() hook. (Kenneth Graunke, 2013-03-01, 1 file, -6/+21)
| | | | | | This moves the GL_TIMESTAMP handling out of EndQuery. Signed-off-by: Kenneth Graunke <[email protected]>
* i965: enable ARB_texture_multisample on Gen6+ (Chris Forbes, 2013-03-02, 1 file, -0/+1)
| | | | | | | | V2: Works on Ivy Bridge now too, so this can be 6+. Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Paul Berry <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965/fs: add support for ir_txf_ms on Gen6+ (Chris Forbes, 2013-03-02, 3 files, -13/+54)
| | | | | | | | | | | | | | | | | | | | | | | | | | | On Gen6, lower this to `ld` with lod=0 and an extra sample_index parameter. On Gen7, use `ld2dms`. We don't support CMS yet for multisample textures, so we just hardcode MCS=0. This is ignored for IMS and UMS surfaces. Note: If we do end up emitting specialized shaders based on the MSAA layout, we can emit a slightly shorter message here in the UMS case. Note: According to the PRM, `ld2dms` takes one more parameter, lod. However, it's always zero, and including it would make the message too long for SIMD16, so we just omit it. V2: Reworked completely, added support for Gen7. V3: - Introduce sample_index parameter rather than reusing lod - Removed spurious whitespace change - Clarify commit message V4: - Fix comment style - Emit SHADER_OPCODE_TXF_MS on Gen6. This was benignly wrong since it lowers to `ld` anyway on this gen, but still wrong. Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965/vs: add support for ir_txf_ms on Gen6+ (Chris Forbes, 2013-03-02, 1 file, -4/+21)
| | | | | | | | | | | | | | | | | | On Gen6, lower this to `ld` with lod=0 and an extra sample_index parameter. On Gen7, use `ld2dms`. This takes an additional MCS parameter to support compressed multisample surfaces, but we're not enabling them for multisample textures for now, so it's always ignored and can be safely omitted. V2: Reworked completely, added support for Gen7. V3: - Use new sample_index, sample_index_type rather than reusing lod - Clarify commit message. V4: - Fix comment style Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: add a new virtual opcode: SHADER_OPCODE_TXF_MS (Chris Forbes, 2013-03-02, 5 files, -0/+18)
| | | | | | | | | | | | | This is very similar to the TXF opcode, but lowers to `ld2dms` rather than `ld` on Gen7. V4: - add SHADER_OPCODE_TXF_MS to is_tex() functions, so regalloc thinks it actually writes the correct number of registers. Otherwise in nontrivial shaders some of the registers tend to get clobbered, producing bad results. Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Paul Berry <[email protected]>
* i965: take the target into account for Gen7 MSAA modes (Chris Forbes, 2013-03-02, 1 file, -3/+19)
| | | | | | | | | | | | | | | | | | | | | Gen7 has an erratum affecting the ld_mcs message, making it unsafe to use when the surface doesn't have an associated MCS. From the Ivy Bridge PRM, Vol4 Part1 p77 ("MCS Enable"): "If this field is disabled and the sampling engine <ld_mcs> message is issued on this surface, the MCS surface may be accessed. Software must ensure that the surface is defined to avoid GTT errors." To allow the shader to treat all surfaces uniformly, force UMS if the surface is to be used as a multisample texture, even if CMS would have been possible. V3: - Quoted erratum text Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Support multisampling in surface_state for textures (Chris Forbes, 2013-03-02, 2 files, -5/+6)
| | | | | | | | | | | | | | | | | | | The surface_state setup for renderbuffers already worked; only the texturing side needed work. BLORP does something similar, but does its own surface_state setup. On Gen6, we just need to set the correct sample count. On Gen7: - set the correct sample count - set the correct layout mode - set GEN7_SURFACE_ARYSPC_LOD0 if it's set in the miptree. V2: - Clarify commit message - Rebased onto Paul's physical/logical dims cleanup - Added Gen7 support Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Paul Berry <[email protected]>
* i965: add support for multisample textures (Chris Forbes, 2013-03-02, 6 files, -7/+55)
| | | | | | | | | | | V2: - Fix for state moving from texobj to image - Rebased onto Paul's logical/physical cleanup - Fixed missing quantization of sample count - Fold in IMS renderbuffer wrapper fixes from later in the series - Use correct physical slice offset for UMS/CMS surfaces on Gen7 Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Paul Berry <[email protected]>
* i965: expose sample positions (Chris Forbes, 2013-03-02, 3 files, -43/+82)
| | | | | | | | | | | | Moves the definition of the sample positions out of gen6_emit_3dstate_multisample, and unpacks them in gen6_get_sample_position. V2: Be consistent about `sample position` rather than `location`. Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Paul Berry <[email protected]> Acked-by: Ian Romanick <[email protected]>
* i965: add support for sample mask on Gen6+ (Chris Forbes, 2013-03-02, 4 files, -9/+16)
| | | | | | | Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Paul Berry <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* i965: expose new max sample counts (Chris Forbes, 2013-03-02, 1 file, -2/+10)
| | | | | | | | | | V2: For now, only expose a depth sample count of 1, since there are possible unresolved interactions with HiZ. Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Paul Berry <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* intel: Use the new "ctx" local variable I just added some more. (Eric Anholt, 2013-03-01, 1 file, -2/+2)
| | | | Reviewed-and-tested-by: Ian Romanick <[email protected]>
* i965: Make sRGB-capable framebuffers by default. (Eric Anholt, 2013-03-01, 2 files, -3/+63)
| | | | | | | | | | | | | The GLX extension lets you expose visuals that explicitly guarantee you that the GL_FRAMEBUFFER_SRGB_CAPABLE flag will be set, but we can set the flag even while the visual doesn't provide the guarantee. This appears to be consistent with other implementations, as we've seen several apps now that don't require an srgb visual and assume sRGB will work without checking the GL_FRAMEBUFFER_SRGB_CAPABLE flag. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55783 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60633 Reviewed-and-tested-by: Ian Romanick <[email protected]>
* intel: Fix software copying of miptree faces for weird formats. (Eric Anholt, 2013-03-01, 3 files, -61/+77)
| | | | | | | | | | | Now that we have W-tiled S8, we can't just region_map and poke at bits -- there has to be some swizzling. Rely on intel_miptree_map to get that job done. This should also get the highest performance path we know of for the mapping (interesting if I get around to finishing movntdqa some day). v2: Fix stale name of the bit in a comment. Reviewed-by: Chad Versace <[email protected]>
* intel: Add a flag for miptree mapping to disable transcoding. (Eric Anholt, 2013-03-01, 2 files, -4/+17)
| | | | | | | | I want to reuse intel_miptree_map() to replace some region mapping that's broken for separate stencil, but doing so would result in new demands on ETC transcode that we actually don't want to happen. Reviewed-by: Chad Versace <[email protected]>
* i965: Add WARN_ONCE for depthstencil workarounds we shouldn't be hitting. (Eric Anholt, 2013-03-01, 2 files, -0/+6)
| | | | Reviewed-by: Chad Versace <[email protected]>
* intel: Enable __DRI_API_OPENGL_CORE api with dri2 contexts (Jordan Justen, 2013-02-28, 1 file, -0/+2)
| | | | | | | | | | | Without this set, dri_util.c:dri2CreateContextAttribs will reject requests to create a context with __DRI_API_OPENGL_CORE. This prevents a 3.2 core profile context from being created even when MESA_GL_VERSION_OVERRIDE=3.2 is used. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* intel: update max versions based on MESA_GL_VERSION_OVERRIDE (Jordan Justen, 2013-02-28, 1 file, -0/+10)
| | | | | | | If the override version is >= 3.1, then update the max_gl_core_version. Otherwise, update max_gl_compat_version. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
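A rough sketch of the selection rule above. Several things here are assumptions for illustration: struct version_limits and apply_version_override() are hypothetical, versions are encoded as major * 10 + minor (so 3.1 is 31), 0 means no override is set, and "update" is read as "raise to at least the override".

```c
#include <assert.h>

/* Hypothetical stand-in for the screen's version limits; only the two
 * field names come from the commit message. */
struct version_limits {
   int max_gl_core_version;
   int max_gl_compat_version;
};

/* override is assumed to be major * 10 + minor, with 0 meaning unset. */
static void
apply_version_override(struct version_limits *limits, int override)
{
   assert(limits);

   if (override <= 0)
      return;

   if (override >= 31) {
      /* 3.1 and later: raise the core profile limit. */
      if (override > limits->max_gl_core_version)
         limits->max_gl_core_version = override;
   } else {
      /* Anything older: raise the compatibility limit. */
      if (override > limits->max_gl_compat_version)
         limits->max_gl_compat_version = override;
   }
}
```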
* i965/fs: Put immediate operand as src2 (Matt Turner, 2013-02-28, 1 file, -1/+1)
| | | | | | | | Immediate operands can only be src2 in 2-source instructions. Fixes piglit failures since 0a1d145e (oops!). Spotted-by: Eric Anholt <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* intel: Remove intel_mipmap_tree::wraps_etc (Chad Versace, 2013-02-28, 2 files, -21/+3)
| | | | | | | | | | | | The field was equivalent to (etc_format != MESA_FORMAT_NONE), and therefore duplicate information. This patch removes the field and replaces all references to it with `etc_format != MESA_FORMAT_NONE`. No Piglit ETC test regresses on Intel Sandybridge. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> Signed-off-by: Chad Versace <[email protected]>
* i965/vs: Assert that ir_triop_lrp was lowered. (Matt Turner, 2013-02-28, 1 file, -0/+4)
| | | | | Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fp: Use the LRP instruction for OPCODE_LRP. (Matt Turner, 2013-02-28, 1 file, -8/+4)
| | | | | Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Use the LRP instruction for ir_triop_lrp when possible. (Kenneth Graunke, 2013-02-28, 7 files, -5/+75)
| | | | | | | | | | | | | | | | | v2 [mattst88]: - Add BRW_OPCODE_LRP to list of CSE-able expressions. - Fix op_var[] array size. - Rename arguments to emit_lrp to (x, y, a) to clear confusion. - Add LRP function to brw_fs.cpp/.h. - Corrected comment about LRP instruction arguments in emit_lrp. v3 [mattst88]: - Duplicate MAD code for LRP instead of using a function pointer. - Check for != GRF instead of == IMM in emit_lrp. - Lower LRP on gen < 6. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Kenneth Graunke <[email protected]>
* i965: Add support for emitting the LRP instruction. (Kenneth Graunke, 2013-02-28, 4 files, -0/+4)
| | | | | | | | Like MAD, this is another three-source instruction. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Kenneth Graunke <[email protected]>
* glsl: Convert mix() to use a new ir_triop_lrp opcode. (Kenneth Graunke, 2013-02-28, 1 file, -1/+2)
| | | | | | | | | | | | | | | | | | | | | | | | Many GPUs have an instruction to do linear interpolation which is more efficient than simply performing the algebra necessary (two multiplies, an add, and a subtract). Pattern matching or peepholing this is more desirable, but can be tricky. By using an opcode, we can at least make shaders which use the mix() built-in get the more efficient behavior. Currently, all consumers lower ir_triop_lrp. Subsequent patches will actually generate different code. v2 [mattst88]: - Add LRP_TO_ARITH flag to ir_to_mesa.cpp. Will be removed in a subsequent patch and ir_triop_lrp translated directly. v3 [mattst88]: - Move changes from the next patch to opt_algebraic.cpp to accept 3-src operations. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Kenneth Graunke <[email protected]>
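For reference, the algebra that a lowered ir_triop_lrp expands to (two multiplies, an add, and a subtract) looks like this in scalar C. lrp_arith() and lrp_self_check() are illustrative names, not functions from the tree, and the (x, y, a) argument order follows GLSL's mix().

```c
#include <assert.h>

/* Arithmetic form of mix(x, y, a): one subtract, two multiplies, one add. */
static float
lrp_arith(float x, float y, float a)
{
   return x * (1.0f - a) + y * a;
}

/* Illustrative check of the endpoints and the midpoint. */
static void
lrp_self_check(void)
{
   assert(lrp_arith(2.0f, 6.0f, 0.0f) == 2.0f);
   assert(lrp_arith(2.0f, 6.0f, 1.0f) == 6.0f);
   assert(lrp_arith(2.0f, 6.0f, 0.5f) == 4.0f);
}
```

A hardware LRP instruction computes the same result in a single three-source operation, which is the saving the commit is after.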
* i965/vs/gen7: Allow MATH instructions to have MRF as a destination (Matt Turner, 2013-02-28, 1 file, -1/+1)
| | | | | | | | | total instructions in shared programs: 346873 -> 346847 (-0.01%) instructions in affected programs: 364 -> 338 (-7.14%) (All affected shaders are from Lightsmark) Reviewed-by: Eric Anholt <[email protected]>
* i965/fs/gen7: Allow MATH instructions to have MRF as a destination (Matt Turner, 2013-02-28, 1 file, -1/+1)
| | | | | | | total instructions in shared programs: 1376297 -> 1375626 (-0.05%) instructions in affected programs: 35977 -> 35306 (-1.87%) Reviewed-by: Eric Anholt <[email protected]>
* i965/gen7: Relax restrictions on fake MRFs (Matt Turner, 2013-02-28, 1 file, -2/+4)
| | | | | | | | | | | Gen6 has write-only MRF registers, and for ease of implementation we partition off 16 general-purpose registers to act as MRFs on Gen7. Knowing that our Gen7 MRFs are actually GRFs, we can do things we can't do with real MRFs: - read from them; - return values directly to them from a send instruction; and - compute directly to them with math instructions. Reviewed-by: Eric Anholt <[email protected]>
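The partitioning can be pictured as a fixed offset into the GRF file. The sketch below assumes a 128-entry register file with the top 16 registers reserved; FAKE_MRF_BASE, FAKE_MRF_COUNT and fake_mrf_to_grf() are names invented for this example, and only the count of 16 comes from the commit message.

```c
#include <assert.h>

/* Assumed layout for illustration: the top 16 of 128 GRFs act as MRFs. */
#define FAKE_MRF_COUNT 16
#define FAKE_MRF_BASE  (128 - FAKE_MRF_COUNT)

/* Map a fake MRF number to the GRF that backs it. */
static int
fake_mrf_to_grf(int mrf)
{
   assert(mrf >= 0 && mrf < FAKE_MRF_COUNT);
   return FAKE_MRF_BASE + mrf;
}
```

Because the result is an ordinary GRF number, it can appear as a source operand or as the destination of a send or math instruction, which is what the relaxed restrictions allow.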