Commit message | Author | Age | Files | Lines
* targets/dri-ilo: make the driver installable | Chia-I Wu | 2014-03-16 | 1 | -4/+3
  install-gallium-links.mk fails to create the compat link for ilo_dri.so because it looks for dri_LTLIBRARIES instead of noinst_LTLIBRARIES. Fix this by switching to dri_LTLIBRARIES (and make the driver installable).
  Since pci_id_driver_map.h and the DDX both tell libGL.so to look for "i965", ilo_dri.so will never be loaded even if it is enabled and installed. The change should not create any more confusion.
  Signed-off-by: Chia-I Wu <[email protected]>
  Acked-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Emil Velikov <[email protected]>
* mesa: mark GL_RGB9_E5 as not color-renderable | Marek Olšák | 2014-03-15 | 1 | -4/+0
  The GL 4.4 spec says it's not color-renderable and not accepted by RenderBufferStorage.
  The EXT_texture_shared_exponent spec says it's not color-renderable but it's accepted by RenderBufferStorageEXT. This seems to be a bug in the extension spec.
  Let's do what GL 4.4 says.
  Cc: [email protected]
  Reviewed-by: Ian Romanick <[email protected]>
* radeonsi/compute: Fix memory leak | Aaron Watry | 2014-03-15 | 1 | -0/+6
  Free shader buffer object for all kernels when deleting compute state.
  Signed-off-by: Aaron Watry <[email protected]>
* st/mesa: remove _NEW_POLYGON dependency from vertex shader | Marek Olšák | 2014-03-15 | 3 | -11/+12
  We can just check the polygon mode when updating the edge flag state.
  Also, we can just flag ST_NEW_VERTEX_PROGRAM directly, which makes ST_NEW_EDGEFLAGS_DATA useless.
* st/mesa: implement zero-stride edge flag by culling primitives | Marek Olšák | 2014-03-15 | 3 | -1/+17
  This was unimplemented.
* st/mesa: fix per-vertex edge flags and GLSL support (v2) | Marek Olšák | 2014-03-15 | 2 | -7/+6
  This fixes piglit/gl-2.0-edgeflag.
  v2: use StrideB to recognize per-vertex edge flags
  Cc: [email protected]
* i965/fs: Invalidate live intervals when demoting uniforms to pull params. | Kenneth Graunke | 2014-03-14 | 1 | -0/+2
  Normally, nothing uses live intervals at this point, so this isn't necessary. However, dump_instructions() calculates them and uses them to show register pressure. So, calling dump_instructions() in this area of the code would segfault due to the arrays being the wrong size.
  This is not a candidate for stable branches because it only serves to fix internal debugging code that you manually have to invoke by altering the source code or using gdb.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* i965/fs: Print "+reladdr" on variably-indexed uniform arrays. | Kenneth Graunke | 2014-03-14 | 1 | -2/+5
  Previously, dump_instruction() would print output such as:
      { 2} 3: mov vgrf1:F, u0:F
      { 3} 4: mov vgrf7:F, u0:F
      { 4} 5: mov vgrf8:F, u0:F
  which looked like either a scalar access or perhaps a constant-indexed access of element 0, when it was really a variable index.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* i965: Fix register types in dump_instructions(), again. | Kenneth Graunke | 2014-03-14 | 4 | -4/+3
  In commit e57d77280efcbfd6579a88f071426653287ef833, I fixed this for destinations in the Vec4 backend, and sources in the scalar backend. But not both types in both backends.
  To prevent this mess from continuing, make the reg_encoding table static, so only the disassembler can use it.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* i965/fs: Fix register comparisons in saturate propagation. | Kenneth Graunke | 2014-03-14 | 1 | -0/+1
  opt_saturate_propagation_local compares scan_inst->dst.reg/reg_offset with inst->src[0].reg/reg_offset, and ensures that scan_inst->dst.file is GRF. But nothing ensured that inst->src[0].file was GRF.
  In the following program, this resulted in u1:F matching vgrf1:UW, and a saturate being incorrectly propagated from instruction 8 to instruction 1.
      { 1} 0: add vgrf0:UW, hw_reg1+8:UW, hw_reg0:V
      { 1} 1: add vgrf1:UW, hw_reg1+10:UW, hw_reg0:V
      { 1} 2: linterp vgrf6:F, hw_reg2:F, hw_reg3:F, hw_reg0:F
      { 2} 3: linterp vgrf27:F, hw_reg2:F, hw_reg3:F, hw_reg0+16:F
      { 4} 4: mov vgrf10+0.0:F, vgrf6:F
      { 3} 5: mov vgrf10+1.0:F, vgrf27:F
      { 6} 6: tex vgrf8+0.0:F, vgrf10+0.0:F
      { 5} 7: mov vgrf32:F, u1:F
      { 5} 8: mov.sat vgrf12:F, u1:F
  From shader-db:
      total instructions in shared programs: 1841932 -> 1841957 (0.00%)
      instructions in affected programs: 5823 -> 5848 (0.43%)
  I inspected two of the 25 hurt shaders, and concluded that they were both hitting this bug, and not legitimately optimized.
  This fixes bugs in Left 4 Dead 2 and Team Fortress 2, possibly among others. The optimization pass didn't exist in 10.0, so this is only a candidate for 10.1.
  Cc: "10.1" <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
  Signed-off-by: Kenneth Graunke <[email protected]>
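The register comparison described in this commit can be sketched roughly as follows. This is a minimal illustration, not the i965 code: the `reg` struct, the `reg_file` enum, and both function names are hypothetical stand-ins for the real backend types.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the i965 register description. */
enum reg_file { FILE_GRF, FILE_UNIFORM, FILE_HW_REG };

struct reg {
    enum reg_file file; /* register file the operand lives in */
    int reg;            /* register number within that file */
    int reg_offset;     /* offset within the register */
};

/* Pre-fix comparison: only the destination's file is checked, so
 * vgrf1 and u1 compare equal because their numbers match. */
static bool matches_before_fix(const struct reg *dst, const struct reg *src)
{
    return dst->file == FILE_GRF &&
           dst->reg == src->reg &&
           dst->reg_offset == src->reg_offset;
}

/* Post-fix comparison: the source must live in the GRF file as well. */
static bool matches_after_fix(const struct reg *dst, const struct reg *src)
{
    return src->file == FILE_GRF && matches_before_fix(dst, src);
}
```

With a destination of vgrf1 and a source of u1, the first function reports a match and the second does not, which is the case the listing above runs into.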
* glsl: Improve debug output and variable names for opt_dead_code_local. | Eric Anholt | 2014-03-14 | 1 | -13/+13
  I know this code has confused others, and it confused me 3 years later, too.
  Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Add support for GL_ARB_buffer_storage. | Eric Anholt | 2014-03-14 | 3 | -3/+9
  It turns out we can allow COHERENT storage/mappings all the time, regardless of LLC vs non-LLC. It just means never using temporary mappings to avoid GPU stalls, and on non-LLC we have to use the GTT instead of CPU mappings. If we were to use CPU maps on non-LLC (which might be useful if apps end up using buffer_storage on PBO reads, to avoid WC read slowness), those would be PERSISTENT but not COHERENT, but doing that would require us to drive the clflushes from userspace somehow.
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Always use CPU mappings for BOs on LLC platforms. | Eric Anholt | 2014-03-14 | 1 | -1/+1
  It looks like there's no big difference for write-only workloads, but using a CPU map means that if they happen to read without having set the MAP_READ_BIT, they get 100x the performance for those reads.
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Drop the system-memory temporary allocations for flush explicit. | Eric Anholt | 2014-03-14 | 2 | -52/+58
  While in expected usage patterns nobody will ever hit this path, doubling our bandwidth used seems like a waste, and it cost us extra code too.
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Switch mapping modes for non-explicit-flush blit-temporary maps. | Eric Anholt | 2014-03-14 | 1 | -3/+3
  On LLC, it should always be better to use a cached mapping than the GTT. On non-LLC, it seems pretty silly to try to optimize read performance for the INVALIDATE_RANGE_BIT case.
  This will make the buffer_storage logic easier.
  Reviewed-by: Kenneth Graunke <[email protected]>
* gallivm: optimize repeat linear npot code in the aos int path | Jeff Muizelaar | 2014-03-14 | 1 | -12/+62
  Similar to the other cases, shift some weight/coord calculations to int space. This should be slightly faster (on x86 sse it should actually save one instruction, and generally int instructions are cheaper).
* gallivm: use correct rounding for nearest wrap mode (in the aos int path) | Roland Scheidegger | 2014-03-14 | 1 | -29/+9
  The previous code used coords which were calculated as (int) (f_coord * tex_size * 256) >> 8. This is not only unnecessarily complex but can give the wrong texel due to rounding for negative coords (as an example, after denormalization coords from -1.0 to 0.0 should give -1, but this will give -1 for numbers from -1.0-1/256 to 0.0-1/256).
  Instead, just use ifloor, dropping the shift stuff. Unfortunately, this will most likely be slower: with arch rounding available it shouldn't be too bad (it trades an int shift for a round but also saves an int mul, which is shared by all coords), but otherwise it's a mess.
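The rounding difference for negative coords can be reproduced with a small sketch. This is plain scalar C under stated assumptions, not the gallivm code generator: `ifloor_sketch`, `floor_div256`, `nearest_old`, and `nearest_new` are hypothetical names, and the `>> 8` is modeled as a floor division so the result does not depend on how the compiler shifts negative ints.

```c
/* Floor of a float as an int, without relying on libm. */
static int ifloor_sketch(float x)
{
    int i = (int)x; /* truncates toward zero */
    return (x < (float)i) ? i - 1 : i;
}

/* Floor division by 256; stands in for ">> 8" on a signed value. */
static int floor_div256(int v)
{
    int q = v / 256;
    return (v % 256 < 0) ? q - 1 : q;
}

/* Old scheme: scale by 256, truncate to int, then shift down by 8. */
static int nearest_old(float f_coord, int tex_size)
{
    int fixed = (int)(f_coord * (float)tex_size * 256.0f);
    return floor_div256(fixed);
}

/* New scheme: take the floor of the denormalized coord directly. */
static int nearest_new(float f_coord, int tex_size)
{
    return ifloor_sketch(f_coord * (float)tex_size);
}
```

For a 100-texel texture and a coord just below zero, the truncation in the old scheme lands on texel 0, while the floor-based scheme gives -1 as the commit message says it should.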
* gallivm: use correct rounding for linear wrap mode (in the aos int path) | Jeff Muizelaar | 2014-03-14 | 1 | -6/+8
  The previous method for converting coords to ints was slightly inaccurate (effectively losing 1 bit from the 8-bit lerp weight). This is probably especially noticeable when trying to draw a pixel-aligned texture.
  As an example, for a 100x100 texture after denormalization the texture coords in this case would turn up as 0.5, 1.5, 2.5, 3.5, 4.5, ... After the mul by 256, conversion to int and 128 subtraction, they end up as 0, 256, 512, 768, 1024, ... which gets us the correct coords/weights of 0/0, 1/0, 2/0, 3/0, 4/0, ...
  But even LSB errors (which are unavoidable) in the input coords may cause these coords/weights to be wrong, e.g. for a coord of 3.49999 we'd get a coord/weight of 2/255 instead.
  Fix this by using round-to-nearest int instead of FPToSi (trunc). Should be equally fast on x86 sse though other archs probably suffer a little.
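The 3.49999 example can be worked through in a short sketch. It is scalar C written for illustration only: `texel_pos`, `split_fixed`, `linear_trunc`, and `linear_round` are hypothetical names, and the add-0.5-then-truncate rounding is a simplification of whatever round-to-nearest instruction the real code emits.

```c
struct texel_pos {
    int coord;  /* integer texel index */
    int weight; /* 8-bit lerp weight, 0..255 */
};

/* Split an 8.8-style fixed-point value into texel index and weight. */
static struct texel_pos split_fixed(int fixed)
{
    struct texel_pos p;
    p.coord = fixed / 256;
    if (fixed % 256 < 0)
        p.coord--; /* floor rather than truncate for negative values */
    p.weight = fixed - p.coord * 256;
    return p;
}

/* Old path: truncating float-to-int conversion. */
static struct texel_pos linear_trunc(float coord)
{
    return split_fixed((int)(coord * 256.0f) - 128);
}

/* New path: round to nearest before subtracting the half-texel offset. */
static struct texel_pos linear_round(float coord)
{
    float scaled = coord * 256.0f;
    int nearest = (int)(scaled + (scaled >= 0.0f ? 0.5f : -0.5f));
    return split_fixed(nearest - 128);
}
```

Feeding 3.49999 through `linear_trunc` yields coord 2 with weight 255, while `linear_round` yields coord 3 with weight 0, matching the exact 3.5 input.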
* glapi: restore _glthread_GetID() function | Brian Paul | 2014-03-14 | 2 | -0/+15
  This partially reverts patch 02cb04c68f. This fixes an unresolved symbol error when using older builds of libGL.
  Tested-by: Chia-I Wu <[email protected]>
* radeonsi: flush the dma ring in si_flush_from_st | Niels Ole Salscheider | 2014-03-14 | 1 | -0/+7
  Signed-off-by: Niels Ole Salscheider <[email protected]>
  Signed-off-by: Marek Olšák <[email protected]>
* radeon: Move DMA ring creation to common code | Niels Ole Salscheider | 2014-03-14 | 4 | -31/+32
  Signed-off-by: Niels Ole Salscheider <[email protected]>
  Signed-off-by: Marek Olšák <[email protected]>
* mesa: return v.value_int64 when the requested type is TYPE_INT64 | Emil Velikov | 2014-03-14 | 1 | -3/+3
  Fixes "Operands don't affect result" defect reported by Coverity.
  Cc: "9.2 10.0 10.1" <[email protected]>
  Signed-off-by: Emil Velikov <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
* nvc0: minor cleanups in stream output handling | Emil Velikov | 2014-03-14 | 1 | -4/+5
  Constify the offsets parameter to silence gcc warning 'assignment from incompatible pointer type' due to function prototype mismatch.
  Use a boolean changed as a shorthand for target != current_target.
  Signed-off-by: Emil Velikov <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
* nouveau: honor fread return value in the nouveau_compiler | Emil Velikov | 2014-03-14 | 1 | -2/+2
  There is little point in continuing if fread returns zero, as it indicates that either the file is empty or cannot be read from. Bail out if fread returns zero after closing the file.
  Cc: Ilia Mirkin <[email protected]>
  Signed-off-by: Emil Velikov <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
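The pattern described here (read, close the file, then bail out on a zero-length read) might look roughly like the sketch below. The function name `read_source` and the error message are assumptions made for illustration; only `fopen`, `fread`, `fclose`, and `fprintf` are real library calls.

```c
#include <stdio.h>

/* Read up to cap bytes from path into buf.
 * Returns the number of bytes read, or 0 if the file cannot be opened,
 * is empty, or cannot be read from. */
static size_t read_source(const char *path, char *buf, size_t cap)
{
    FILE *f = fopen(path, "r");
    size_t len;

    if (!f)
        return 0;

    len = fread(buf, 1, cap, f);
    fclose(f); /* close first so the early exit below leaks nothing */

    if (len == 0)
        fprintf(stderr, "Error reading %s: empty or unreadable\n", path);

    return len;
}
```

A caller would treat a return value of 0 as fatal and stop instead of handing an empty buffer to the rest of the tool.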
* nouveau: typecast the prime_fd handle when calling nouveau_bo_set_prime | Emil Velikov | 2014-03-14 | 1 | -1/+1
  Core drm defines that the handle is of type int, while all drivers treat it as uint internally. Typecast the value to silence gcc warning messages and be consistent amongst all drivers.
  Signed-off-by: Emil Velikov <[email protected]>
* nv50: add missing brackets when handling the samplers array | Emil Velikov | 2014-03-14 | 1 | -1/+2
  Commit 3805a864b1d (nv50: assert before trying to out-of-bounds access samplers) introduced a series of asserts as a precaution against a previous illegal memory access.
  However, it failed to enclose the loop body within nv50_sampler_state_delete in braces, effectively failing to clear the sampler state, in addition to exacerbating the illegal memory access issue.
  Fixes gcc warning "array subscript is above array bounds" and "Nesting level does not match indentation" and "Out-of-bounds read" defects reported by Coverity.
  Cc: "10.1" <[email protected]>
  Signed-off-by: Emil Velikov <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
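The class of bug fixed here is easy to reproduce in isolation. The sketch below is not the nv50 source: `sampler_ctx`, `NUM_SAMPLERS`, and `sampler_state_delete` are hypothetical, and the broken variant is shown only in a comment because running it would read past the array.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_SAMPLERS 4 /* hypothetical array size */

struct sampler_ctx {
    void *samplers[NUM_SAMPLERS];
    unsigned num_samplers;
};

/* Broken shape (do not use): without braces only the assert is the loop
 * body, so the "if" runs once, after the loop, with i == num_samplers:
 *
 *     for (i = 0; i < c->num_samplers; ++i)
 *         assert(i < NUM_SAMPLERS);
 *         if (c->samplers[i] == hwcso)
 *             c->samplers[i] = NULL;
 */

/* Fixed shape: braces keep both the assert and the clear in the loop. */
static void sampler_state_delete(struct sampler_ctx *c, void *hwcso)
{
    unsigned i;

    for (i = 0; i < c->num_samplers; ++i) {
        assert(i < NUM_SAMPLERS);
        if (c->samplers[i] == hwcso)
            c->samplers[i] = NULL;
    }
}
```

The misleading indentation in the broken shape is what triggers the "Nesting level does not match indentation" report mentioned above.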
* i965: Fix build warning of unused variable | Anuj Phogat | 2014-03-14 | 1 | -2/+0
  Signed-off-by: Anuj Phogat <[email protected]>
  Tested-by: Kenneth Graunke <[email protected]>
* dri3: Add GLX_EXT_buffer_age support | Adel Gadllah | 2014-03-13 | 7 | -3/+55
  v2: Indent according to Mesa style, reuse sbc instead of making a new swap_count field, and actually get a usable back before returning the age of the back (fixing updated piglit tests). Changes by anholt.
  Signed-off-by: Adel Gadllah <[email protected]>
  Reviewed-by: Robert Bragg <[email protected]> (v1)
  Reviewed-by: Adel Gadllah <[email protected]> (v2)
  Reviewed-by: Eric Anholt <[email protected]>
* dri3: Prefer the last chosen back when finding a new one. | Eric Anholt | 2014-03-13 | 1 | -10/+7
  With the buffer_age code, I need to be able to potentially call this more than once per frame, and it would be bad if a new special event showing up meant I chose a different back mid-frame. Now, once we've chosen a back for the frame, another find_back will choose it again since we know that it won't have ->busy set until swap.
  Note that this makes find_back return a buffer id instead of a backbuffer index. That's kind of a silly distinction anyway, since it's an identity mapping between the two (it's the front buffer that is at an offset).
  Reviewed-By: Adel Gadllah <[email protected]>
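One way to get the "same back within a frame" behaviour is to start the search at the last chosen buffer. The sketch below only illustrates that idea; `drawable_sketch`, `NUM_BACK`, and this `find_back` are hypothetical, and the event waiting that the real code has to do when every buffer is busy is not modeled.

```c
#include <stdbool.h>

#define NUM_BACK 3 /* hypothetical number of back buffers */

struct back_sketch {
    bool busy; /* set at swap, cleared once the server is done with it */
};

struct drawable_sketch {
    struct back_sketch backs[NUM_BACK];
    int cur_back; /* id of the back chosen most recently */
};

/* Search starting at the last chosen back. That buffer stays non-busy
 * until swap, so repeated calls within one frame return the same id. */
static int find_back(struct drawable_sketch *d)
{
    for (int b = 0; b < NUM_BACK; b++) {
        int id = (d->cur_back + b) % NUM_BACK;
        if (!d->backs[id].busy) {
            d->cur_back = id;
            return id;
        }
    }
    return -1; /* all busy; waiting for events is omitted in this sketch */
}
```

If another buffer becomes idle between two calls in the same frame, the second call still returns the buffer picked by the first.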
* Add the EGL_MESA_configless_context extension | Neil Roberts | 2014-03-12 | 11 | -37/+218
  This extension provides a way for an application to render to multiple surfaces with different buffer formats without having to use multiple contexts. An EGLContext can be created without an EGLConfig by passing EGL_NO_CONFIG_MESA. In that case there are no restrictions on the surfaces that can be used with the context apart from that they must be using the same EGLDisplay.
  _mesa_initialize_context can now take a NULL gl_config which will mark the context as ‘configless’. It will memset the visual to zero in that case. Previously the i965 and i915 drivers were explicitly creating a zeroed visual whenever 0 is passed for the EGLConfig. Mesa needs to be aware that the context is configless because it affects the initial value to use for glDrawBuffer. The first time the context is bound it will set the initial value for configless contexts depending on whether the framebuffer used is double-buffered.
  Reviewed-by: Kristian Høgsberg <[email protected]>
* eglCreateContext: Remove the check for whether config == 0 | Neil Roberts | 2014-03-12 | 1 | -5/+2
  In eglCreateContext there is a check for whether the config parameter is zero and in this case it will avoid reporting an error if the EGL_KHR_surfaceless_context extension is supported. However there is nothing in that extension which says you can create a context without a config and Mesa breaks if you try this so it is probably better to leave it reporting an error.
  The original check was added in b90a3e7d8b1bc based on the API-specific extensions EGL_KHR_surfaceless_opengl/gles1/gles2. This was later changed to refer to EGL_KHR_surfaceless_context in b50703aea5. Perhaps the original extensions specified a configless context but the new one does not.
  Reviewed-by: Kristian Høgsberg <[email protected]>
* Fix the initial value of glDrawBuffers for GLES | Neil Roberts | 2014-03-12 | 1 | -1/+3
  Under GLES 3 it is not valid to pass GL_FRONT to glDrawBuffers. Instead, GL_BACK has a magic interpretation which means it will render to the front buffer on single-buffered contexts and the back buffer on double-buffered. We were incorrectly setting the initial value to GL_FRONT for single-buffered contexts. This probably doesn't really matter at the moment except that presumably it would be exposed in the API via glGetIntegerv.
  When we switch to configless contexts this is more important because in that case we always want to rely on the magic interpretation of GL_BACK in order to automatically switch between the front and back buffer when a new surface with a different number of buffers is bound.
  We also do this for GLES 1 and 2 because the internal value doesn't matter in that case and it is convenient to use the same code to have the magic interpretation of GL_BACK.
  Reviewed-by: Kristian Høgsberg <[email protected]>
* Use the magic behaviour of GL_BACK in GLES 1 and 2 as well as 3 | Neil Roberts | 2014-03-12 | 1 | -1/+6
  In GLES 3 it is not possible to select rendering to the front buffer and instead selecting GL_BACK has the magic interpretation that it is either the front buffer on single-buffered configs or the back buffer on double-buffered. GLES 1 and 2 have no way of selecting the draw buffer at all. In that case we were initialising the draw buffer to either GL_FRONT or GL_BACK depending on the context's config and then leaving it at that.
  When we switch to having configless contexts we ideally want Mesa to automatically switch between the front and back buffer whenever a double- or single-buffered surface is bound. To make this happen we can just allow the magic behaviour from GLES 3 in GLES 1 and 2 as well. It shouldn't matter what the internal value of the draw buffer is in GLES 1 and 2 because there is no way to query it from the external API.
  Reviewed-by: Kristian Høgsberg <[email protected]>
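The "magic" interpretation of GL_BACK described in these two commits boils down to a lookup on the bound surface, which could be sketched as below. The enum and function are hypothetical and deliberately tiny; they are not Mesa's internal buffer indices or its draw-buffer code.

```c
#include <stdbool.h>

/* Hypothetical names for the two colour buffers of a window surface. */
enum color_buffer_sketch { SKETCH_BUFFER_FRONT, SKETCH_BUFFER_BACK };

/* In GLES the stored draw buffer is always GL_BACK; which real buffer
 * that means is decided from the surface bound at the time. */
static enum color_buffer_sketch resolve_gles_back(bool double_buffered)
{
    return double_buffered ? SKETCH_BUFFER_BACK : SKETCH_BUFFER_FRONT;
}
```

Because the decision is made per bound surface rather than per context config, a configless context can move between single- and double-buffered surfaces without its draw buffer state changing.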
* glsl: Fix typo | Ian Romanick | 2014-03-12 | 1 | -2/+2
  Remove extra "any" and re-word-wrap the comment.
  Signed-off-by: Ian Romanick <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* glsl: Rewrite unrolled link_invalidate_variable_locations calls as a loop | Ian Romanick | 2014-03-12 | 1 | -11/+4
  Signed-off-by: Ian Romanick <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* docs: Import 10.0.4 release notes, add news item. | Carl Worth | 2014-03-12 | 3 | -0/+198
* mesa: Release gl_debug_state when destroying context. | Mike Stroyan | 2014-03-12 | 1 | -1/+4
  Commit 6e8d04a caused a leak by allocating ctx->Debug but never freeing it. Release the memory in _mesa_free_errors_data when destroying a context. Use FREE to match CALLOC_STRUCT from _mesa_get_debug_state.
  Reviewed-by: Brian Paul <[email protected]>
* r600g: compute memory pool size is given in dw | Niels Ole Salscheider | 2014-03-11 | 1 | -2/+2
  Multiply the dw value by 4 in order to map the complete buffer.
  Reviewed-by: Tom Stellard <[email protected]>
  Signed-off-by: Niels Ole Salscheider <[email protected]>
* meta: Always restore the framebuffers and current renderbuffer. | Eric Anholt | 2014-03-11 | 3 | -21/+17
  The few paths that were playing with framebuffers and renderbuffer were saving and restoring them.
  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
* i965: Drop intel_check_front_buffer_rendering(). | Eric Anholt | 2014-03-11 | 6 | -27/+0
  This was being applied in a subset of the places that intel_prepare_render() was called, to set the same flag that intel_prepare_render() was setting.
  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
* i965: Drop broken front_buffer_reading/drawing optimization. | Eric Anholt | 2014-03-11 | 5 | -42/+44
  The flag wasn't getting updated correctly when the ctx->DrawBuffer or ctx->ReadBuffer changed. It usually ended up working out because most apps only have one window system framebuffer, or if they have more than one and they have any front read/drawing, they will have called glReadBuffer()/glDrawBuffer() on it when they get started on the new buffer.
  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
* intel: When checking for updating front buffer reading, use the right fb. | Eric Anholt | 2014-03-11 | 2 | -2/+2
  It's the ctx->ReadBuffer that gets read from, not the ctx->DrawBuffer. So, if you happened to have a ctx->ReadBuffer that was the winsys buffer, and it had previously been intel_prepare_render()ed but not invalidated since then, and you called glReadBuffer() to switch to front buffer instead of back buffer reading on the winsys fbo while your drawbuffer was a user FBO, you'd never get the front buffer's miptree fetched, and segfault.
  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
* r600g,radeonsi: attempt to fix racy multi-context apps calling BufferData | Marek Olšák | 2014-03-11 | 3 | -14/+18
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75061
  v2: minimize the window where cs_buf != new_buf
* r600g,radeonsi: fix broken buffer download | Marek Olšák | 2014-03-11 | 1 | -1/+1
  Reviewed-by: Michel Dänzer <[email protected]>
* r600g,radeonsi: use a fallback in dma_copy instead of failing | Marek Olšák | 2014-03-11 | 6 | -97/+99
  v2:
  - allow byte-aligned DMA buffer copies on Evergreen
  - fix piglit/texsubimage regression
  - use the fallback for 3D copies (depth > 1) as well
* radeonsi: small cleanup in get_param | Marek Olšák | 2014-03-11 | 1 | -4/+2
  Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: set correct alignment for texture buffers and constant buffers | Marek Olšák | 2014-03-11 | 1 | -3/+2
  I think these are all equivalent to vertex buffer fetches which should be dword-aligned. Scalar loads are also dword-aligned.
  Reviewed-by: Michel Dänzer <[email protected]>
* r600g, radeonsi: fix primitives-generated query with disabled streamout | Marek Olšák | 2014-03-11 | 11 | -49/+87
  Buffers are disabled by VGT_STRMOUT_BUFFER_CONFIG, but the query only works if VGT_STRMOUT_CONFIG.STREAMOUT_0_EN is enabled.
  This moves VGT_STRMOUT_CONFIG to its own state. The register is set to 1 if either streamout or the primitives-generated query is enabled.
  However, the primitives-emitted query is also incremented, so it's disabled by setting VGT_STRMOUT_BUFFER_SIZE to 0 when there is no buffer bound.
  This fixes piglit:
      ARB_transform_feedback2/counting with pause
      EXT_transform_feedback/primgen-query transform-feedback-disabled
  Reviewed-by: Michel Dänzer <[email protected]>
* r600g,radeonsi: don't add streamout.num_dw_for_end twice | Marek Olšák | 2014-03-11 | 1 | -2/+4
  It's already added in need_cs_space. Also don't calculate anything if there are no buffers.
  Reviewed-by: Michel Dänzer <[email protected]>
* r600g,radeonsi: fix MAX_TEXTURE_3D_LEVELS and MAX_TEXTURE_ARRAY_LAYERS limits | Marek Olšák | 2014-03-11 | 2 | -6/+11
  CB_COLORi_VIEW.SLICE_MAX can be at most 2047.
  This fixes the maxlayers piglit test.
  Cc: [email protected]
  Reviewed-by: Michel Dänzer <[email protected]>