Commit message | Author | Age | Files | Lines
* swrast: Build the driver into the shared mesa_dri_drivers.so. | Eric Anholt | 2013-10-24 | 6 | -42/+48
    v2: drop dridir now that it's unused.
    v3: Fix linking after rebase when building just swrast from classic but
        a drm-using gallium driver.
    v4: Consistently put spaces around += in the updated Makefile.am block.
    v5: Set a global driverAPI variable so loaders don't have to update to
        createNewScreen2() (though they may want to for thread safety).

    Reviewed-by: Matt Turner <[email protected]> (v3)
    Reviewed-by: Emil Velikov <[email protected]>
* radeon: Build the driver into the shared mesa_dri_drivers.so. | Eric Anholt | 2013-10-24 | 12 | -44/+141
    This required some reordering of headers to ensure that the symbol name
    redefines happened before any prototypes.

    v2: drop dridir now that it's unused.
    v3: Consistently put spaces around += in the updated Makefile.am blocks.
    v4: Set a global driverAPI variable so loaders don't have to update to
        createNewScreen2() (though they may want to for thread safety).

    Reviewed-by: Matt Turner <[email protected]> (v2)
    Reviewed-by: Emil Velikov <[email protected]>
* i915: Build the driver into the shared mesa_dri_drivers.so. | Eric Anholt | 2013-10-24 | 7 | -21/+126
    i915 has symbols for formerly-shared code that conflict with i965, so we
    define them away using gen-symbol-redefs.py.

    Options considered:

    - This option. Downsides: The symbols in profiling and debugging don't
      match the source. The symbol list may change in the future and we
      won't notice without manually running the tool again.

    - Use objcopy --localize-hidden to automatically demote our symbols to
      locals. This didn't work on i965 due to c++ weak symbols (which can't
      be localized), but could work on i915. We could do it on i915 only,
      but it does produce libtool warnings at link time due to libtool not
      knowing if the resulting .o file is safe to link (stupid libtool).
      Plus you end up with different symbols of the same name, which is
      confusing for debugging too. On the other hand, no future symbol
      conflicts long term.

    - Write our own libelf tool that handles c++ weak symbols like we want
      and apply it to all drivers. All the downsides of above, but applies
      uniformly across drivers.

    - Edit the files to just rename all the i915 or i965 symbols that
      conflict. There are on the order of 100 that have a prefix we used to
      share, so it would take a bit of typing. Fewest downsides, but still
      can have conflicts long term.

    Ultimately, this is the least invasive change at the moment, and we can
    see if the "more symbol conflicts appear later" thing is a real concern
    or not.

    Note that the ability to compile a version of i915 without INTEL_DEBUG
    env support is dropped. It's too useful.

    v2: drop dridir now that it's unused.
    v3: Consistently put spaces around += in the updated Makefile.am block.
    v4: Set a global driverAPI variable so loaders don't have to update to
        createNewScreen2() (though they may want to for thread safety).

    Reviewed-by: Matt Turner <[email protected]> (v2)
    Reviewed-by: Emil Velikov <[email protected]>
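The "define them away" approach chosen in the i915 commit can be illustrated with a small sketch. The function name `intel_init_context` and the `i915_` prefix are hypothetical and chosen only for illustration; this is not the output of gen-symbol-redefs.py. It only shows why the redefinitions have to come before any prototype that mentions the name.

```c
#include <assert.h>

/* Hypothetical redefinition of the kind a generated header might contain.
 * It has to be seen before any prototype that uses the name. */
#define intel_init_context i915_intel_init_context

/* After preprocessing, this declares and defines
 * i915_intel_init_context(), so it cannot collide with another
 * driver's intel_init_context() in the same shared object. */
int intel_init_context(int flags);

int intel_init_context(int flags)
{
    return flags | 0x1;
}

/* Code inside this driver keeps using the original spelling. */
int driver_entry(void)
{
    return intel_init_context(0x10);
}
```

The source keeps its original spelling, while the linker and any debugger or profiler only see the prefixed name, which is the downside the commit message lists for this option.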
* dri: Add a tool for generating #defines to namespace driver global symbols. | Eric Anholt | 2013-10-24 | 1 | -0/+68
    Acked-by: Matt Turner <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
* nouveau: Build the driver into the shared mesa_dri_drivers.so. | Eric Anholt | 2013-10-24 | 5 | -21/+24
    v2: drop dridir now that it's unused.
    v3: Consistently put spaces around += in the updated Makefile.am block.
    v4: Set a global driverAPI variable so loaders don't have to update to
        createNewScreen2() (though they may want to for thread safety).
    v5: Fix missed public symbol in nouveau. (caught by Emil)

    Reviewed-by: Matt Turner <[email protected]> (v2)
    Reviewed-by: Emil Velikov <[email protected]>
* i965: Build the driver into a shared mesa_dri_drivers.so. | Eric Anholt | 2013-10-24 | 9 | -33/+155
    Previously, we've split things such that mesa core is in libdricore,
    exposing the whole Mesa core interface in the global namespace, and the
    i965_dri.so code all links against that. Along with polluting
    application namespace terribly, it requires extra PLT indirections and
    prevents LTO.

    Instead, we can build all of the driver contents into the same .so with
    just a few symbols exposed to be referenced from the actual driver .so
    file, allowing LTO and reducing our exposed symbol count massively.

    FPS improvement on GLB2.7 with INTEL_NO_HW=1:
    2.61061% +/- 1.16957% (n=50)
    (without LTO, just the PLT reductions from this commit)

    Note that the X Server requires commit
    7ecfab47eb221dbb996ea6c033348b8eceaeb893 to successfully load this
    driver!

    v2: Set a global driverAPI variable so loaders don't have to update to
        createNewScreen2() (though they may want to for thread safety).
    v3: Drop AM_CPPFLAGS addition (Emil pointed out I'd missed some cflags
        that would be necessary, though only if we actually relied on them).
    v4: Fix install with DESTDIR set.

    Reviewed-by: Matt Turner <[email protected]> (v1)
    Reviewed-by: Emil Velikov <[email protected]> (v2)
* dri: Implement a DRI vtable extension to replace the global driDriverAPI. | Eric Anholt | 2013-10-24 | 2 | -0/+30
    As we move to megadrivers, we are unable to build multiple drivers with
    the same public global symbol per driver (Think an X Server with an
    intel and a nouveau driver, and the X Server implementing indirect for
    both -- we have to actually talk to the right driver). By slipping the
    driDriverAPI vtable into the driver's extension list, we can replace the
    usage of the global symbol with usage of the loader-dlsym()ed driver
    information.

    v2: Pull in the hunk to avoid crashing on null driver_extensions.
        Thanks, Emil!

    Reviewed-by: Matt Turner <[email protected]> (v1)
    Reviewed-by: Chad Versace <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
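A minimal sketch of the idea in the vtable-extension commit, under stated assumptions: every struct, field and name below (`example_extension`, `EXAMPLE_driver_vtable`, `init_screen`) is a hypothetical stand-in, not the real DRI interface. It shows a per-driver vtable being found by name in a NULL-terminated extension list, with the NULL-list guard that the v2 note mentions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; not the real DRI structs. */
struct example_extension {
    const char *name;
    int version;
};

struct example_vtable_extension {
    struct example_extension base;  /* first member, so the cast below is valid */
    int (*init_screen)(void);
};

#define EXAMPLE_VTABLE_NAME "EXAMPLE_driver_vtable"

/* Walk a NULL-terminated extension list; tolerate a NULL list. */
const struct example_extension *
find_extension(const struct example_extension *const *exts, const char *name)
{
    if (exts == NULL)
        return NULL;
    for (size_t i = 0; exts[i] != NULL; i++) {
        if (strcmp(exts[i]->name, name) == 0)
            return exts[i];
    }
    return NULL;
}

/* Use the vtable of the driver that was actually loaded, not a global. */
int init_screen_via_extensions(const struct example_extension *const *exts)
{
    const struct example_extension *ext =
        find_extension(exts, EXAMPLE_VTABLE_NAME);
    if (ext == NULL)
        return -1;  /* a real loader would fall back to the old global here */
    return ((const struct example_vtable_extension *)ext)->init_screen();
}

/* Two "drivers" in one binary, each with its own extension list. */
static int intel_init_screen(void) { return 1; }
static int nouveau_init_screen(void) { return 2; }

static const struct example_vtable_extension intel_vtable = {
    { EXAMPLE_VTABLE_NAME, 1 }, intel_init_screen
};
static const struct example_vtable_extension nouveau_vtable = {
    { EXAMPLE_VTABLE_NAME, 1 }, nouveau_init_screen
};

static const struct example_extension *const intel_extensions[] = {
    &intel_vtable.base, NULL
};
static const struct example_extension *const nouveau_extensions[] = {
    &nouveau_vtable.base, NULL
};
```

Because each list carries its own vtable, two drivers can coexist in one process without sharing a public global symbol.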
* dri: Pass in the dlsym()ed driver extension to screen creation. | Eric Anholt | 2013-10-24 | 8 | -35/+119
    This will allow a megadrivers build to reference the actual driver being
    loaded from the shared dri_util screen creation code.

    v2: Fix indentation, fallback case in EGL (review by Emil).

    Reviewed-by: Matt Turner <[email protected]> (v1)
    Reviewed-by: Chad Versace <[email protected]> (v1)
    Reviewed-by: Emil Velikov <[email protected]>
* gbm: Add support for the new __driDriverGetExtensions interface. | Eric Anholt | 2013-10-24 | 1 | -2/+15
    v2: Fix uninitialized variable use in the old-ABI case.

    Reviewed-by: Chad Versace <[email protected]> (v1)
    Reviewed-by: Emil Velikov <[email protected]>
* egl: Add an optional function call for getting the DRI driver interface. | Eric Anholt | 2013-10-24 | 1 | -2/+18
    v2: Fix asprintf error checking.

    Reviewed-by: Matt Turner <[email protected]> (v1)
    Reviewed-by: Chad Versace <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
* glx: Add an optional function call for getting the DRI driver interface. | Eric Anholt | 2013-10-24 | 6 | -8/+35
    The previous interface relied on a static struct, which meant that the
    driver didn't get a chance to edit the struct before the struct got
    used. For megadrivers, I want a struct specific to the driver being
    loaded.

    v2: Fix the prototype in the docs (caught by Marek). Since the driver
        name was in the function, we didn't need to also pass it in.
    v3: Fix asprintf error checking (caught by Matt's gcc).

    Reviewed-by: Matt Turner <[email protected]> (v1)
    Reviewed-by: Chad Versace <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
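The v2 note says the driver name is part of the function name, so a loader has to build the symbol name before looking it up. The sketch below assumes the name is the `__driDriverGetExtensions` prefix followed by an underscore and the driver name; that exact format, and the helper name `build_get_extensions_name`, are assumptions for illustration. It uses `snprintf` into a caller-provided buffer rather than `asprintf`, but the point is the same as the error-checking fixes in these commits: test the return value before using the string.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build the per-driver entry point name into buf.
 * Returns 0 on success, -1 if formatting failed or the name was truncated.
 * The "<prefix>_<driver>" format is an assumption made for this sketch. */
int build_get_extensions_name(char *buf, size_t len, const char *driver_name)
{
    int n = snprintf(buf, len, "__driDriverGetExtensions_%s", driver_name);

    if (n < 0 || (size_t)n >= len)
        return -1;
    return 0;
}
```

A loader would pass the resulting string to `dlsym()` and treat a missing symbol as the signal to fall back to the older interface. With `asprintf`, failure is reported by a return value of -1 and the output pointer must not be used in that case.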
* dri: Move driver config options to dri driver extensions. | Eric Anholt | 2013-10-24 | 7 | -18/+40
    This way they aren't all sitting in the global namespace (with the same
    name per driver).

    Reviewed-by: Matt Turner <[email protected]>
    Reviewed-by: Chad Versace <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
* dri: Allow config options to be passed to the loader through extensions. | Eric Anholt | 2013-10-24 | 2 | -9/+28
    Turns out we already have this nice mechanism for providing optional
    things from the driver to the loader, and I was going to have to rename
    the public global symbol to avoid conflicts when doing megadrivers.

    While the former __driConfigOptions is technically loader interface,
    this is the only loader that made use of that symbol. Continue paying
    attention to it if we can't find the new option, to retain compatibility
    with old drivers.

    Reviewed-by: Matt Turner <[email protected]>
    Reviewed-by: Chad Versace <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
* glx: Move the driver extension-loading to a helper function. | Eric Anholt | 2013-10-24 | 3 | -4/+18
    I'm planning on doing driver extension parsing from 3 places, and making
    the extension loading step a bit longer.

    Reviewed-by: Matt Turner <[email protected]>
    Reviewed-by: Chad Versace <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
* clover: Query maximum kernel block size from the device instead of the kernel object. | Francisco Jerez | 2013-10-24 | 4 | -10/+18
    Based on a similar fix from Aaron Watry. It seems unlikely that we will
    ever need a kernel-specific setting for this, and the Gallium API
    doesn't support it. Remove kernel::max_block_size() altogether.
* glsl: silence unused 'var' variable warning | Brian Paul | 2013-10-24 | 1 | -2/+2
    Reviewed-by: Paul Berry <[email protected]>
* svga: remove user-space vertex/index buffer code | Brian Paul | 2013-10-24 | 6 | -259/+13
    The gallium vbuf module, which we've been using for some time now, takes
    care of uploading user-space vertex/index data into real buffers. The
    upload code in the svga driver was unused.

    Reviewed-by: José Fonseca <[email protected]>
* i965: Print more debuginfo in intel_texsubimage_memcpy() | Chad Versace | 2013-10-24 | 1 | -2/+8
    Print info about packing, format, type, and tiling. This will help debug
    future issues with this fastpath.

    Reviewed-by: Frank Henigman <[email protected]>
    Signed-off-by: Chad Versace <[email protected]>
* i965: Fix glTexImage when packing alignment != cpp | Chad Versace | 2013-10-24 | 1 | -2/+11
    Fixes texture corruption of Weston clients on cairo-glesv2 backend.
    Commit 49ed599 introduced the bug.

    Corruption occurred when glTexSubImage called
    intel_texsubimage_tiled_memcpy() with:

        x,y=10,9
        w,h=7,7
        format=GL_ALPHA(0x1906)
        type=GL_UNSIGNED_BYTE(0x1401)
        gl_format=MESA_FORMAT_A8(0x18)
        packing.alignment=4

    The function miscalculated the source image's stride as w*cpp=7 without
    taking into account the packing alignment. The actual stride was 8.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70435
    Reported-by: U. Artie Eoff <[email protected]>
    Tested-by: Kristian Høgsberg <[email protected]>
    Reviewed-by: Frank Henigman <[email protected]>
    Signed-off-by: Chad Versace <[email protected]>
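The stride arithmetic in the commit above can be checked with a short sketch. The helper name `aligned_row_stride` is hypothetical and this is not the code from the fix; it only reproduces the numbers quoted in the message (w=7, cpp=1, alignment=4 gives 8, not 7), assuming the alignment is a power of two as GL unpack alignments are.

```c
#include <assert.h>

/* Bytes from the start of one source row to the start of the next:
 * width * cpp rounded up to a multiple of the packing alignment.
 * alignment is assumed to be a power of two (1, 2, 4 or 8). */
int aligned_row_stride(int width, int cpp, int alignment)
{
    int unaligned = width * cpp;

    assert(alignment > 0 && (alignment & (alignment - 1)) == 0);
    return (unaligned + alignment - 1) & ~(alignment - 1);
}
```

With the values from the bug report, `aligned_row_stride(7, 1, 4)` is 8, while the naive `w * cpp` gives 7, so each successive row would be read one byte too early.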
* freedreno: fix compile error | Rob Clark | 2013-10-23 | 1 | -1/+1
    Small typo introduced in a3ed98f.

    Signed-off-by: Rob Clark <[email protected]>
* i965/fs: Only unroll high-accuracy dFdy() from SIMD16 to SIMD8 on gen4 and IVB. | Paul Berry | 2013-10-23 | 1 | -10/+27
    In commit 800610f (i965/fs: Improve accuracy of dFdy() to match dFdx())
    I unrolled the high-accuracy dFdy() computation from a single SIMD16
    instruction to two SIMD8 instructions because of text I found in the
    i965 (gen4) PRM saying that instruction compression could not be used in
    align16 mode. I couldn't find similar text in later hardware docs, and I
    observed problems trying to use instruction compression on align16 mode
    on Ivy Bridge, so I assumed that the restriction still applied and the
    associated documentation had simply been lost.

    After consultation with the hardware engineers, it turns out this is not
    the case. In point of fact, the restriction was dropped in gen5,
    re-introduced in Ivy Bridge, and dropped again in Haswell. The reason I
    didn't notice this is that in the Ivy Bridge documentation, the
    restriction was in a different section, and described using different
    language.

    Now that we know that the restriction only applies to Gen4 and Ivy
    Bridge, we can limit the unrolling to those platforms.

    Tested on gen5, gen6, and gen7 (both Ivy Bridge and Haswell).

    Reviewed-by: Matt Turner <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* glsl/gs: Prevent illegal input/output primitive types. | Paul Berry | 2013-10-23 | 1 | -3/+32
    From the GLSL 1.50 spec, section 4.3.8.1 (Input Layout Qualifiers):

        The layout qualifier identifiers for geometry shader inputs are

          layout-qualifier-id
            points
            lines
            lines_adjacency
            triangles
            triangles_adjacency

    And from section 4.3.8.2 (Output Layout Qualifiers)

        The layout qualifier identifiers for geometry shader outputs are

          layout-qualifier-id
            points
            line_strip
            triangle_strip
            max_vertices = integer-constant

    We were erroneously allowing line_strip and triangle_strip to be used as
    input qualifiers, and we were allowing lines, lines_adjacency,
    triangles, and triangles_adjacency to be used as output qualifiers.

    Fixes piglit tests "glsl-1.50-gs-{input,output}-layout-qualifiers *".

    Reviewed-by: Ian Romanick <[email protected]>
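The rule quoted from the GLSL 1.50 spec in the commit above can be sketched as two small predicates. The enum and function names are hypothetical, and this is not the parser code from the fix; it only encodes which primitive-type qualifiers the spec text lists for geometry shader inputs and which for outputs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical enum covering the primitive-type qualifiers named above. */
enum prim_qualifier {
    PRIM_POINTS,
    PRIM_LINES,
    PRIM_LINES_ADJACENCY,
    PRIM_TRIANGLES,
    PRIM_TRIANGLES_ADJACENCY,
    PRIM_LINE_STRIP,
    PRIM_TRIANGLE_STRIP
};

/* Section 4.3.8.1: qualifiers allowed on geometry shader inputs. */
bool valid_gs_input_qualifier(enum prim_qualifier q)
{
    switch (q) {
    case PRIM_POINTS:
    case PRIM_LINES:
    case PRIM_LINES_ADJACENCY:
    case PRIM_TRIANGLES:
    case PRIM_TRIANGLES_ADJACENCY:
        return true;
    default:
        return false;
    }
}

/* Section 4.3.8.2: primitive types allowed on geometry shader outputs. */
bool valid_gs_output_qualifier(enum prim_qualifier q)
{
    switch (q) {
    case PRIM_POINTS:
    case PRIM_LINE_STRIP:
    case PRIM_TRIANGLE_STRIP:
        return true;
    default:
        return false;
    }
}
```

`points` is the only primitive type valid in both directions, which may be why mixing up the two lists went unnoticed until the piglit tests exercised each qualifier separately.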
* i965: Add perf debug hint when the app makes us do index buffer scanning. | Eric Anholt | 2013-10-23 | 1 | -1/+4
    Reviewed-by: Jordan Justen <[email protected]>
* i965: Try to avoid stalls on the GPU when doing glBufferSubData(). | Eric Anholt | 2013-10-23 | 9 | -36/+150
    On DOTA2, framerate on dota2-de1.dem in windowed mode on my laptop
    improves by 7.69854% +/- 0.909163% (n=3). In a microbenchmark hitting
    this code path (wall time of piglit vbo-subdata-many), runtime decreases
    from 0.8 to 0.05 seconds.

    v2: Use out of range start/end instead of separate bool for the active
        flag (suggestion by Jordan), fix double-upload in the stalling path.

    Reviewed-by: Jordan Justen <[email protected]>
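The v2 note about using an out-of-range start/end instead of a separate "active" bool can be sketched as follows. This is an illustration under assumptions, not the driver code: the struct and function names are hypothetical, and the premise that a write which does not overlap the GPU-referenced range can skip the stall is inferred from the commit subject.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Byte range [start, end) of a buffer that the GPU may still be reading.
 * An "empty" range is encoded as start > end, so no extra flag is needed. */
struct gpu_used_range {
    uint32_t start;
    uint32_t end;
};

/* Nothing in flight: deliberately out-of-range values. */
void range_reset(struct gpu_used_range *r)
{
    r->start = UINT32_MAX;
    r->end = 0;
}

/* Grow the range to cover [offset, offset + size). */
void range_mark(struct gpu_used_range *r, uint32_t offset, uint32_t size)
{
    if (offset < r->start)
        r->start = offset;
    if (offset + size > r->end)
        r->end = offset + size;
}

/* True if a write to [offset, offset + size) would touch in-flight data.
 * With the reset encoding this is false for every possible write. */
bool range_overlaps(const struct gpu_used_range *r,
                    uint32_t offset, uint32_t size)
{
    return offset < r->end && offset + size > r->start;
}
```

The design point is that the empty case needs no special branch: with `end == 0`, the overlap test fails for any offset, so the same comparison serves both states.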
* i965: Be sure to reset brw->vb.buffers[] when trying to redo vertex setup. | Eric Anholt | 2013-10-23 | 1 | -0/+2
    The brw_prepare_vertices that sets up buffers[] depends on these
    parameters, so don't let brw_prepare_vertices() skip it.

    Reviewed-by: Jordan Justen <[email protected]>
* i965: Add support for GL_ARB_texture_buffer_range. | Eric Anholt | 2013-10-23 | 5 | -9/+34
    Supporting this extension turns out to simplify our code a bit over not
    supporting this extension, once the glBufferSubData() synchronization
    code lands.

    v2: Use 16 byte alignment like we do for uniform buffers, due to
        unaligned access penalties.

    Reviewed-by: Jordan Justen <[email protected]> (v1)
* i965: Add a note about the late-allocation in intel_bufferobj_buffer(). | Eric Anholt | 2013-10-23 | 1 | -0/+4
    This was mostly for the i915 system-memory VBO code, which we don't have
    any more, but since that existed we've ended up producing dependencies
    on it being there.

    Reviewed-by: Jordan Justen <[email protected]>
* i965: Drop intel_bufferobj_source(). | Eric Anholt | 2013-10-23 | 4 | -30/+8
    Since src_offset was always 0, it wasn't doing anything for us beyond
    intel_bufferobj_buffer().

    Reviewed-by: Jordan Justen <[email protected]>
* i965: Fix texture buffer rendering after a whole buffer replacement. | Eric Anholt | 2013-10-23 | 1 | -0/+2
    If glBufferData(), glBufferSubData(0, obj->Size), or similar happens, we
    get a new drm_intel_bo for the buffer object, and thus need to re-upload
    texture buffer state so we point at the new data.

    Fixes the new piglit GL_ARB_texture_buffer_object/data-sync

    Cc: "9.2" <[email protected]>
    Reviewed-by: Jordan Justen <[email protected]>
* clover: fix build after a3ed98f7aa85636579a5696bf036ec13e5c9104a | David Heidelberger | 2013-10-23 | 1 | -3/+4
* nv50: clamp PIPE_SHADER_CAP_MAX_TEXTURE_SAMPLERS to PIPE_MAX_SAMPLERS | Brian Paul | 2013-10-23 | 1 | -1/+1
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70212
    Tested-by: Aaron Watry <[email protected]>
* radeonsi: remove unused si_set_cs_sampler_view() | Brian Paul | 2013-10-23 | 1 | -4/+0
    Fixes build breakage.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70804
    Tested-by: Vinson Lee <[email protected]>
* gallium: new, unified pipe_context::set_sampler_views() function | Brian Paul | 2013-10-23 | 44 | -492/+277
    The new function replaces four old functions:
    set_fragment/vertex/geometry/compute_sampler_views().

    Note: at this time, it's expected that the 'start' parameter will always
    be zero.

    Reviewed-by: Roland Scheidegger <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
    Tested-by: Emil Velikov <[email protected]>
* svga: remove unneeded include of u_double_list.h | Brian Paul | 2013-10-23 | 1 | -2/+0
* i965: Expose write_reg() as brw_store_register_mem64(). | Kenneth Graunke | 2013-10-23 | 2 | -9/+11
    Writing a 64-bit register value to memory is sufficiently complicated
    that it makes sense to reuse this function rather than duplicating it.

    Exposing it outside of gen6_queryobj.c means it needs a more descriptive
    function name. It could probably be moved to brw_util.c or somewhere
    else, but this works too.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* i965: Move flushing out of write_reg and into the callers. | Kenneth Graunke | 2013-10-23 | 1 | -4/+8
    The current callers just want to write a single register, so combining
    the register read with a pipeline flush made sense. However, in the
    future we'll want to do multiple register reads back to back, and we'll
    only want to flush once.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* glsl: Simplify the interface to link_invalidate_variable_locations | Ian Romanick | 2013-10-22 | 3 | -44/+31
    The unit tests added in the previous commits prove some things about the
    state of some internal data structures. The most important of these is
    that all built-in input and output variables have explicit_location set.
    This means that link_invalidate_variable_locations doesn't need to know
    the range of non-generic shader inputs or outputs. It can simply reset
    location state depending on whether explicit_location is set.

    There are two additional assumptions that were already implicit in the
    code that comments now document.

    - ir_variable::is_unmatched_generic_inout is only used by the linker
      when connecting outputs from one shader stage to inputs of another
      shader stage.

    - Any varying that has explicit_location set must be a built-in. This
      will be true until GL_ARB_separate_shader_objects is supported.

    As a result, the input_base and output_base parameters to
    link_invalidate_variable_locations are no longer necessary, and the code
    for resetting locations and setting is_unmatched_generic_inout can be
    simplified.

    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Paul Berry <[email protected]>
* glsl/tests: Unit test vertex shader in / out with link_invalidate_variable_locations | Ian Romanick | 2013-10-22 | 2 | -0/+209
    Validates:

    - ir_variable::explicit_location should not be modified.

    - If ir_variable::explicit_location is not set, ir_variable::location,
      ir_variable::location_frac, and
      ir_variable::is_unmatched_generic_inout must be reset to 0.

    - If ir_variable::explicit_location is set, ir_variable::location should
      not be modified. ir_variable::location_frac, and
      ir_variable::is_unmatched_generic_inout must be reset to 0.

    Previous unit tests have shown that all non-generic inputs / outputs
    have explicit_location set.

    v2: Split the link_invalidate_variable_locations interface change out to
        a separate patch. Remove the vertex_in_builtin_without_explicit and
        vertex_out_builtin_without_explicit tests. There was a lot of good
        discussion about this on the mailing list to which I refer the
        interested reader. Both changes suggested by Paul.

    http://lists.freedesktop.org/archives/mesa-dev/2013-October/046652.html

    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Paul Berry <[email protected]>
* glsl: Modify interface to link_invalidate_variable_locations | Ian Romanick | 2013-10-22 | 2 | -7/+7
    This will make it easier to unit test this function in successive
    patches.

    Also, correct the prototype in linker.h. It was... wrong.

    v2: Split the interface change from adding the unit tests. Suggested by
        Paul.

    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Paul Berry <[email protected]>
* glsl/tests: Verify geometry shader built-ins generated by _mesa_glsl_initialize_variables | Ian Romanick | 2013-10-22 | 1 | -0/+98
    Checks that the variables generated meet certain criteria.

    - Geometry shader inputs have an explicit location.

    - Geometry shader outputs have an explicit location.

    - Fragment shader-only varying locations are not used.

    - Geometry shader uniforms and system values don't have an explicit
      location.

    - Geometry shader constants don't have an explicit location and are
      read-only.

    - No other kinds of geometry variables exist.

    It does not verify that any specific variables exist.

    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Paul Berry <[email protected]>
* glsl/tests: Verify fragment shader built-ins generated by _mesa_glsl_initialize_variables | Ian Romanick | 2013-10-22 | 1 | -0/+71
    Checks that the variables generated meet certain criteria.

    - Fragment shader inputs have an explicit location.

    - Fragment shader outputs have an explicit location.

    - Vertex / geometry shader-only varying locations are not used.

    - Fragment shader uniforms and system values don't have an explicit
      location.

    - Fragment shader constants don't have an explicit location and are
      read-only.

    - No other kinds of fragment variables exist.

    It does not verify that any specific variables exist.

    v2: Use _mesa_varying_slot_in_fs in
        fragment_builtin.inputs_have_explicit_location. Suggested by Paul.

    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Paul Berry <[email protected]>
* glsl/tests: Verify vertex shader built-ins generated by _mesa_glsl_initialize_variables | Ian Romanick | 2013-10-22 | 2 | -0/+225
    Checks that the variables generated meet certain criteria.

    - Vertex shader inputs have an explicit location.

    - Vertex shader outputs have an explicit location.

    - Fragment shader-only varying locations are not used.

    - Vertex shader uniforms and system values don't have an explicit
      location.

    - Vertex shader constants don't have an explicit location and are
      read-only.

    - No other kinds of vertex variables exist.

    It does not verify that any specific variables exist.

    v2: Fix memory management mistakes in
        common_builtin::string_starts_with_prefix. Clean up error message
        reporting in common_builtin::no_invalid_variable_modes. Both
        suggested by Paul.

    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Paul Berry <[email protected]>
* glsl: When constructing a variable with an interface type, set interface_type | Ian Romanick | 2013-10-22 | 6 | -4/+115
    Ever since the addition of interface blocks with instance names, we have
    had an implicit invariant:

        var->type->is_interface() == (var->type == var->interface_type)

    The odd use of == here is intentional because
    !var->type->is_interface() implies var->type != var->interface_type.

    Further, if var->type->is_array() is true, we have a related implicit
    invariant:

        var->type->fields.array->is_interface()
            == (var->type->fields.array == var->interface_type)

    However, the ir_variable constructor doesn't maintain either invariant.
    That seems kind of silly... and I tripped over it while writing some
    other code. This patch makes the constructor do the right thing, and it
    introduces some tests to verify that behavior.

    v2: Add general-ir-test to .gitignore. Update the description of the
        ir_variable invariant for arrays in the commit message. Both
        suggested by Paul.

    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Paul Berry <[email protected]>
* mesa/tests: Add simple, dumb test for _mesa_program_state_string | Ian Romanick | 2013-10-22 | 2 | -1/+48
    After some discussions about the correct way to update
    _mesa_program_state_string, I decided to make a unit test for the
    function. It turns out that the function didn't work quite the way I
    thought. The unit test proves that the code was already correct.

    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
    Cc: Anuj Phogat <[email protected]>
* wayland: Don't leak wl_drm global when unbinding display | Ander Conselvan de Oliveira | 2013-10-22 | 1 | -2/+5
* mesa: fixes for MSVC 2013 | Scott Graham | 2013-10-22 | 2 | -1/+4
    Cc: "9.2" <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
* st/mesa: minor whitespace, comment changes in st_draw.c | Brian Paul | 2013-10-22 | 1 | -8/+11
* st/dri: minor formatting clean-ups in dri_context.c | Brian Paul | 2013-10-22 | 1 | -4/+6
* mesa: fix a couple issues with U_FIXED, I_FIXED macros | Brian Paul | 2013-10-22 | 1 | -3/+3
    Silence a bunch of MSVC type conversion warnings.

    Changed return type of S_FIXED to int32_t (signed). The result is the
    same. It just seems more intuitive that a signed conversion function
    should return a signed value.

    Reviewed-by: Jose Fonseca <[email protected]>
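For readers unfamiliar with float-to-fixed-point helpers of this kind, here is a minimal sketch. The function names are hypothetical and this is not Mesa's code; clamping negative inputs to zero in the unsigned case is an assumption made for the sketch. It shows a signed conversion returning `int32_t`, in the spirit of the change described above.

```c
#include <assert.h>
#include <stdint.h>

/* Convert to unsigned fixed point with frac_bits fractional bits.
 * Negative inputs are clamped to 0 (an assumption of this sketch). */
uint32_t to_unsigned_fixed(float value, uint32_t frac_bits)
{
    value *= (float)(1u << frac_bits);
    return value < 0.0f ? 0u : (uint32_t)value;
}

/* Convert to signed fixed point; the explicit cast and int32_t return
 * type avoid an implicit float-to-integer conversion warning. */
int32_t to_signed_fixed(float value, uint32_t frac_bits)
{
    return (int32_t)(value * (float)(1u << frac_bits));
}
```

With 8 fractional bits, 1.5 maps to 384 in both helpers and -1.5 maps to -384 in the signed one.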
* mesa: remove GL_MESA_program_debug bits from gl.h | Brian Paul | 2013-10-22 | 1 | -21/+0
    The code for this was removed from Mesa some time ago.

    Reviewed-by: Ian Romanick <[email protected]>