summaryrefslogtreecommitdiffstats
path: root/src
Commit message (Collapse)AuthorAgeFilesLines
* mesa: Delete the ctx->Array._RestartIndex derived state.Kenneth Graunke2013-05-294-12/+3
| | | | | | | | | | | | It's incorrect and isn't used any longer. v2: Actually flush vertices/flag _NEW_TRANSFORM on RestartIndex change. NOTE: This is a candidate for the 9.1 branch. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* mesa: Ignore fixed-index primitive restart in ArrayElement().Kenneth Graunke2013-05-291-1/+1
| | | | | | | | | | | | GL_PRIMITIVE_RESTART_FIXED_INDEX is only supposed to apply to glDrawElements*. This code is for legacy drawing paths and display lists, so it shouldn't apply. NOTE: This is a candidate for the 9.1 branch. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* st/mesa: Go back to using ctx->Array.RestartIndex, not _RestartIndex.Kenneth Graunke2013-05-291-1/+1
| | | | | | | | | | | | | The derived _RestartIndex field is an attempt to support both GL_PRIMITIVE_RESTART and GL_PRIMITIVE_RESTART_FIXED_INDEX (part of ES 3.0). Gallium drivers don't appear to support ES 3.0 yet, so they don't need to use it. Plus, it's broken and going to go away soon. NOTE: This is a candidate for the 9.1 branch. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* i965: Fix can_cut_index_handle_restart_index() for byte/short types.Kenneth Graunke2013-05-291-3/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | Pre-Haswell hardware doesn't support an arbitrary restart index, and instead compares the index buffer value against 0xFF for byte-size buffers, 0xFFFF for short-size buffers, or 0xFFFFFFFF for unsigned integer buffers. OpenGL allows the restart index to be an arbitrary unsigned integer. When comparing against byte/short types, the index buffer value should be promoted to a full 32-bit integer before doing the comparison. The restart index is /not/ supposed to be masked to byte/short size. This means that with certain restart indexes, the comparison should always fail. For example, a restart index of 0xF000FFFF should never match any byte/short index buffer values due to the extra high bits. We must not enable hardware primitive restart in such a case. For now, fall back to software primitive restart as it's the simplest fix. In the future, we could detect restart indexes that will never match and skip both hardware and software primitive restart. NOTE: This is a candidate for stable branches. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* i965: Use the correct restart index for fixed index mode on Haswell.Kenneth Graunke2013-05-291-2/+4
| | | | | | | | | | | | | | | | | | | | | | The code that updates the ctx->Array._RestartIndex derived state mashed it to 0xFFFFFFFF when GL_PRIMITIVE_RESTART_FIXED_INDEX was enabled regardless of the index buffer type. It's supposed to be 0xFF for byte, 0xFFFF for short, or 0xFFFFFFFF for integer types. The new _mesa_primitive_restart_index() helper gets this right. The hardware appears to compare against the full 32-bit value some of the time, causing primitive restart not to occur when it should. The fact that it works some of the time is rather frightening. Fixes sporadic failures in the ES 3 instanced_arrays_primitive_restart conformance test when run in combination with other tests. NOTE: This is a candidate for the 9.1 branch. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* vbo: Use the new primitive restart index helper function.Kenneth Graunke2013-05-292-2/+3
| | | | | | | | | | | This gets the correct restart index for unsigned byte/short types when using GL_PRIMITIVE_RESTART_FIXED_INDEX. NOTE: This is a candidate for the 9.1 branch. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* mesa: Add a helper function for determining the restart index.Kenneth Graunke2013-05-292-0/+26
| | | | | | | | | | | | | | | | | | The derived state approach currently used (_RestartIndex) doesn't work: in the GL_PRIMITIVE_RESTART_FIXED_INDEX case, the restart index depends on the index buffer's data type, and that isn't known until draw time. The existing code also fails to obey the GL 4.3 rules which say that FIXED_INDEX takes precedence over normal primitive restart. This helper function correctly determines the restart index, and will replace the derived state. NOTE: This is a candidate for the 9.1 branch. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* vbo: Ignore PRIMITIVE_RESTART_FIXED_INDEX for glDrawArrays().Kenneth Graunke2013-05-291-5/+5
| | | | | | | | | | | | | | | | | | | | | | | The derived _PrimitiveRestart enable flag combines the PrimitiveRestart and PrimitiveRestartFixedIndex enable flags. However, DrawArrays is not supposed to do FixedIndex restart: From the OpenGL 4.3 Core specification, section 10.3.5 (page 302): "If PRIMITIVE_RESTART_FIXED_INDEX is enabled, primitive restart is not performed for array elements transferred by any drawing command not taking a type parameter, including all of the *Draw* commands other than *DrawElements*." The OpenGL ES 3.0 specification agrees by omission: "When DrawElements, DrawElementsInstanced, or DrawRangeElements transfers a set of generic attribute array elements to the GL..." Notably, DrawArrays is not included in the list of draw calls that take PRIMITIVE_RESTART_FIXED_INDEX into consideration. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* i965/vs: Fix implied_mrf_writes() for integer division pre-gen6.Eric Anholt2013-05-291-0/+2
| | | | | | | | | | | Previously it would assertion fail in debug builds (though the correct value was returned in a non-debug build). Marking it as a candidate for stable even though it has no current consumers in the stable branches, in case one shows up in a later backport. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64727 NOTE: This is a candidate for stable branches. Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Fix test for smearing enabled on an instruction.Eric Anholt2013-05-291-1/+1
| | | | | | | | | | | | | | | | We were expanding the live range too far, breaking register_coalesce_2() and compute_to_mrf() on 16-wide shaders. Turning it back on improves GLB2.7 performance by 0.239355% +/- 0.0850649% (n=398). shader-db stats are: total instructions in shared programs: 1627211 -> 1609262 (-1.10%) instructions in affected programs: 450351 -> 432402 (-3.99%) While 33 new 16-wide shaders are gained, 70 are lost. Despite that, tropics (the app that lost the most 16-wide) shows a .41% +/- .16% (n=7/8, first-run outlier removed) performance improvement on my HSW. Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Fix segfault in instruction scheduling with LINTERP using last GRF.Eric Anholt2013-05-291-2/+8
| | | | | | | | | The scheduler didn't know about uniform-type accesses, and if a uniform access was last in a 16-wide, we'd walk off the end of the array. This never happened, because we'd never coalesce out all the GRFs, due to a bug to be fixed in the next commit. Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: Fix test for optimistic coloring being necessary.Eric Anholt2013-05-291-1/+1
| | | | | | | | | | i965 and radeon use ra_set_node_reg() to force payload registers to specific registers while exposing those registers to the allocator still. We were treating those register nodes as unsuccessfully allocated in the ra_simplify() step, leading to walking the registers again to do optimistic coloring even if there was nothing left ot do. Acked-by: Kenneth Graunke <[email protected]>
* gallium: fix build on uclibc systemAnthony G. Basile2013-05-291-4/+2
| | | | | | | | | | | | execinfo.h and debug_symbol_name_glibc() are pure GNU-isms and do not build on uclibc systems. A previous patch addressed this issue, but there was an error. This patch corrects that error. See https://bugs.freedesktop.org/show_bug.cgi?id=51782 https://bugs.gentoo.org/show_bug.cgi?id=469768 Signed-off-by: Anthony G. Basile <[email protected]> Signed-off-by: Brian Paul <[email protected]>
* intel: Enable blit glCopyTexSubImage/glBlitFramebuffer with sRGB.Eric Anholt2013-05-281-5/+11
| | | | | | | | | | | | Since the introduction of default-to-SARGB8 window system framebuffers, non-blorp hardware lost blit acceleration for these two paths between the window system and ARGB8888 textures. Since we shouldn't be doing any conversion anyway, just compatibility-check the linear variants of the formats. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61954 Reviewed-by: Kenneth Graunke <[email protected]> Tested-by: Tobias Jakobi <[email protected]>
* llvmpipe: get rid of tiled/linear layout remainsRoland Scheidegger2013-05-296-226/+47
| | | | | | | Eliminate the rest of the no longer needed layout logic. (It is possible some code could be simplified a bit further still.) Reviewed-by: Jose Fonseca <[email protected]>
* intel: Remove dead intel_drawbuf_region().Eric Anholt2013-05-282-16/+0
| | | | | | | | Since the glBitmap() MRT change, it's unused. There was basically no way to responsibly use this function since MRT was introduced. Reviewed-and-tested-by: Ian Romanick <[email protected]> Acked-by: Paul Berry <[email protected]>
* intel: Fix format handling of blit glBitmap()Eric Anholt2013-05-281-3/+12
| | | | | | | | | | | Any 32-bit format got ARGB8888 handling (including, say, GL_RG1616), and anything else got 16-bit (including, say, GL_R8), which could potentially hang the GPU by writing out of bounds. NOTE: This is a candidate for the stable branches. Reviewed-and-tested-by: Ian Romanick <[email protected]> Acked-by: Paul Berry <[email protected]>
* intel: Fix MRT handling of glBitmap().Eric Anholt2013-05-281-9/+14
| | | | | | | | | We'd only hit color buffer 0 even if multiple draw buffers were bound. NOTE: This is a candidate for the stable branches. Reviewed-and-tested-by: Ian Romanick <[email protected]> Acked-by: Paul Berry <[email protected]>
* intel: Rebuild PBO blit glTexImage() on top of miptrees.Eric Anholt2013-05-281-30/+32
| | | | | | | | | | | | | | This will ensure that we have resolves if we ever extend this to glTexSubImage(), and fixes missing image start offset handling. The texture buffer alloc ended up getting moved up, because we want to look at the format of the image's actual mt to see if we'll end up blitting the right thing, in the case of packed depth/stencil uploads. This is the last caller of intelEmitCopyBlit() on a miptree-wrapped BO. Reviewed-and-tested-by: Ian Romanick <[email protected]> Acked-by: Paul Berry <[email protected]>
* intel: Rebuild PBO blit glReadPixels() on top of miptrees.Eric Anholt2013-05-281-25/+23
| | | | | | | | | The previous code was missing depth resolves, that had only been prevented due to no blitting of Y tiling. The pair of flip args in the new blit function means that we can just drop the pack->Invert fallback. Reviewed-and-tested-by: Ian Romanick <[email protected]> Acked-by: Paul Berry <[email protected]>
* intel: Rework intel_miptree_create_for_region() to wrap a BO.Eric Anholt2013-05-283-24/+67
| | | | | | | | | | | | | I needed to do this for the PBO blit cases to use intel_miptree_blit(). But this also actually partially fixes a bug in EGLImage handling: We can't share regions across contexts, because regions have a refcount that isn't protected by a mutex, and different contexts can be simulataneously accessed from multiple threads. Now we just need to get regions out of __DRIImage. There was also a missing use of image->offset in the EGLImage renderbuffer storage code. Reviewed-and-tested-by: Ian Romanick <[email protected]> Acked-by: Paul Berry <[email protected]>
* intel: Make a temporary miptree for the blit path of miptree mapping.Eric Anholt2013-05-282-74/+29
| | | | | | | | | | | In a bit of debug code, we no longer have the inter-slice x/y to print. But I think the level/slice is more useful in this case for looking at what's getting mapped, especially given that INTEL_DEBUG=blit will tell you the other value. Reviewed-and-tested-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Acked-by: Paul Berry <[email protected]>
* intel: Make a temporary miptree when doing blit uploads for glTexSubImage().Eric Anholt2013-05-281-44/+28
| | | | | | | | | | | While this is a bit more CPU work, it also is less code to handle this path, and fixes problems with 32k-pitch textures and missing resolves. v2: Add error checking in new code. Reviewed-and-tested-by: Ian Romanick <[email protected]> (v1) Reviewed-by: Kenneth Graunke <[email protected]> (v1) Acked-by: Paul Berry <[email protected]>
* intel: Extend the force_y_tiling flag to allow forcing no tiling.Eric Anholt2013-05-285-13/+26
| | | | | | | | | | | | For a blit-uploaded temporary, it's faster on current hardware to memcpy the data into a linear CPU mapping than to go through the GTT. v2: Turn the not-fully-supported mask into 3 supported enum values. Reviewed-and-tested-by: Ian Romanick <[email protected]> (v1) Reviewed-by: Kenneth Graunke <[email protected]> (v1) Reviewed-by: Paul Berry <[email protected]> (v2) Reviewed-by: Chad Versace <[email protected]> (v2)
* intel: Add an assert for glCopyTexSubImage() being called on MSAA buffers.Eric Anholt2013-05-281-0/+6
| | | | | | | | | This is just in case someone else trips over this due to our weird reuse of this code in glBlitFramebuffer(). Reviewed-and-tested-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Acked-by: Paul Berry <[email protected]>
* i965: Allow glCopyTexSubImage() on depth textures.Eric Anholt2013-05-281-5/+0
| | | | | | | | If the hw is pre-gen5 and can't blit depth, it'll cleanly error out. Reviewed-and-tested-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Acked-by: Paul Berry <[email protected]>
* i965: Prefer blorp glBlitFramebuffer() to the glCopyTexSubImage-based blit.Eric Anholt2013-05-281-8/+9
| | | | | | | | | | | | I think we've measured no performance difference from this in the past, except that the blorp code can do things like multisample resolves. Prevents piglit regression in the next commit when a testcase started trying to do a multisampled resolve through the old glCopyTexSubImage() path. Reviewed-and-tested-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Acked-by: Paul Berry <[email protected]>
* i965: Consistently do depth resolves before blitting.Eric Anholt2013-05-282-6/+6
| | | | | | | | | | | | | We were protected for a long time by the fact that depth was Y tiled and you couldn't blit Y. Now that we can blit Y, we were failing to resolve depth in glCopyPixels(). Note in the comment about swrast, that the swrast map path does resolves appropriately already. Reviewed-and-tested-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Acked-by: Paul Berry <[email protected]>
* intel: Make a wrapper for intelEmitCopyBlit using miptrees.Eric Anholt2013-05-285-111/+127
| | | | | | | | | | | | | | I had previously asserted that it was hard to write a useful, simpler blit function, but I think this might be it. This has the side effect of extending the 32k pitch check to a few more places that were missing it. v2: Update comment for being moved inside intel_miptree_blit(). Reviewed-and-tested-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Acked-by: Paul Berry <[email protected]>
* intel: Rename intel_renderbuffer_tile_offsets.Eric Anholt2013-05-283-6/+6
| | | | | | | | This makes it more consistent with intel_miptree_get_tile_offsets(). Reviewed-and-tested-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Acked-by: Paul Berry <[email protected]>
* intel: Reduce intel_renderbuffer_tile_offsets to a thin wrapper.Eric Anholt2013-05-282-28/+7
| | | | | | Reviewed-and-tested-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Acked-by: Paul Berry <[email protected]>
* intel: Make intel_miptree_get_tile_offsets return a page offset.Eric Anholt2013-05-284-10/+26
| | | | | | | | | Right now, the callers in i965 don't expect a nonzero page offset to actually occur (since that's being handled elsewhere), but it seems like a trap to leave it this way. Reviewed-and-tested-by: Ian Romanick <[email protected]> Acked-by: Paul Berry <[email protected]>
* glsl: Fix MSVC build.José Fonseca2013-05-281-3/+3
| | | | | | | | | | | | It appears that `sizeof(Class::member)` is either non-standard or merely unsupported in MSVC. So use `sizeof(instance->member)` instead, which is guaranteed to work everywhere. Also promote the assert to a static assert. Trivial.
* mesa: fix GLSL program objects with more than 16 samplers combinedMarek Olšák2013-05-289-113/+167
| | | | | | | | | | | | | | | | | | | | | | | | | | The problem is the sampler units are allocated from the same pool for all shader stages, so if a vertex shader uses 12 samplers (0..11), the fragment shader samplers start at index 12, leaving only 4 sampler units for the fragment shader. The main cause is probably the fact that samplers (texture unit -> sampler unit mapping, etc.) are tracked globally for an entire program object. This commit adapts the GLSL linker and core Mesa such that the sampler units are assigned to sampler uniforms for each shader stage separately (if a sampler uniform is used in all shader stages, it may occupy a different sampler unit in each, and vice versa, an i-th sampler unit may refer to a different sampler uniform in each shader stage), and the sampler-specific variables are moved from gl_shader_program to gl_shader. This doesn't require any driver changes, and it fixes piglit/max-samplers for gallium and classic swrast. It also works with any number of shader stages. v2: - converted tabs to spaces - added an assertion to _mesa_get_sampler_uniform_value Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* swrast: increase array size of TextureSampleMarek Olšák2013-05-282-4/+4
| | | | | | | | to match the size of ctx->Texture.Unit, and it will also fix piglit/max-samplers with the following commit. Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* mesa: declare UniformBufferBindings as an array with a static sizeMarek Olšák2013-05-284-14/+9
| | | | | | | | Some Gallium drivers were crashing, because the array was not large enough. v2: clamp the per-shader maximum in st/mesa, then sum them all up NOTE: This is a candidate for the stable branches.
* radeonsi: Enable GLSL 1.30Michel Dänzer2013-05-281-1/+1
|
* radeonsi: Handle TGSI TXQ opcodeMichel Dänzer2013-05-282-3/+33
|
* radeonsi: Add support for TGSI TXF opcodeMichel Dänzer2013-05-282-14/+51
|
* radeonsi: Use tgsi_util_get_texture_coord_dim()Michel Dänzer2013-05-281-25/+7
|
* radeonsi: Handle TGSI_SEMANTIC_CLIPDISTMichel Dänzer2013-05-281-4/+17
|
* radeonsi: Make border colour state handling safe for integer texturesMichel Dänzer2013-05-282-20/+27
|
* radeonsi: Fix hardware state for dual source blendingMichel Dänzer2013-05-284-6/+17
| | | | | Set up CB_SHADER_MASK register according to pixel shader exports, and enable some minimal state for colour buffer 1 in case dual source blending is used.
* r600g/sb: handle more cases for folding in gvn passVadim Girlin2013-05-282-28/+118
| | | | Signed-off-by: Vadim Girlin <[email protected]>
* st/vdpau: destroy handle table only when it's emptyChristian König2013-05-271-1/+1
| | | | Signed-off-by: Christian König <[email protected]>
* st/vdpau: remove vlCreateHTAB from surface functionsChristian König2013-05-271-9/+0
| | | | Signed-off-by: Christian König <[email protected]>
* st/vdpau: invalidate the handles on destructionChristian König2013-05-273-0/+4
| | | | | | Fixes a problem with xbmc when switching channels. Signed-off-by: Christian König <[email protected]>
* r600g/sb: improve folding for SETccVadim Girlin2013-05-271-8/+98
| | | | Signed-off-by: Vadim Girlin <[email protected]>
* r600g/sb: optimize CNDcc instructionsVadim Girlin2013-05-273-1/+113
| | | | Signed-off-by: Vadim Girlin <[email protected]>
* r600g/sb: improve optimization of conditional instructionsVadim Girlin2013-05-276-21/+96
| | | | Signed-off-by: Vadim Girlin <[email protected]>