Commit message (Author, Age, Files, Lines)
* swrast: Always use MapTextureImage for mapping textures for swrast. (Eric Anholt, 2013-04-30, 11 files, -300/+64)
| | | | | | | | | | | | | | | Now that everything goes through ImageSlices[], we can rely on the driver's existing texture mapping function. A big block of code goes away on Radeon that looks like it was to deal with the validate that happened at SpanRenderStart, which no longer occurs since we don't need validation for the MapTextureImage hook. v2: Rewrite comment about ImageSlices, fix duplicated swImages, touch up unmap loop. Reviewed-by: Kenneth Graunke <[email protected]> (v1) Reviewed-by: Brian Paul <[email protected]>
* nouveau: Replace swrast_texture_image->Map usage with ->Buffer. (Eric Anholt, 2013-04-30, 1 file, -3/+1)
| | | | | | | | This code is trying to deal with providing a map in the case that AllocTexImageBuffer was called, which is hooked up to the swrast variant. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* nouveau: Just use MapTextureImage instead of duplicating the logic. (Eric Anholt, 2013-04-30, 1 file, -81/+18)
| | | | | | | | MapTextureImage has the exact same logic, except it can also handle swrast-allocated buffers. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* swrast: Make a teximage's stored RowStride be in terms of bytes per row. (Eric Anholt, 2013-04-30, 6 files, -9/+30)
| | | | | | | | | | | | | For hardware drivers with pitch alignment requirements, a non-power-of-two-sized texture format won't end up being an integer number of pixels per row. Also, avoids having to change our units between MapTextureImage's rowStride and swrast's RowStride. This doesn't fully convert the compressed texel fetch path, but does make sure we don't drop any bits (not that we'd expect to). Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Brian Paul <[email protected]>
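To illustrate why a pixel-based stride can lose information, here is a minimal sketch in C. The helper names and the 64-byte pitch alignment are assumptions chosen for the example, not values taken from any driver. With a 3-byte texel format, the aligned row stride is not a whole number of texels, so only a byte count describes it exactly.

```c
#include <assert.h>
#include <stdint.h>

/* Round value up to the next multiple of align (align must be a power of two). */
static uint32_t align_up(uint32_t value, uint32_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (value + align - 1) & ~(align - 1);
}

/* Row stride in bytes for `width` texels of `bytes_per_texel` bytes each,
 * padded to a hypothetical hardware pitch alignment. */
static uint32_t row_stride_bytes(uint32_t width, uint32_t bytes_per_texel,
                                 uint32_t pitch_align)
{
    return align_up(width * bytes_per_texel, pitch_align);
}
```

For a 10-texel row of a 3-byte format with 64-byte alignment, the stride is 64 bytes, which is 21 texels plus one leftover byte; a stride stored in texels would have to round that away.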
* swrast: Replace use of teximage Map in 1D/2D paths with ImageSlices[0]. (Eric Anholt, 2013-04-30, 3 files, -8/+8)
| | | | | | | This gets us ready for the Map field to die. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* swrast: Replace ImageOffsets with an ImageSlices pointer. (Eric Anholt, 2013-04-30, 13 files, -185/+123)
| | | | | | | | | | | | | This is a step toward allowing drivers to use their normal mapping paths, instead of requiring that all slice mappings come from an aligned offset from the first slice's map. This incidentally fixes missing slice handling in FXT1 swrast. v2: Use slice height helper function. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* swrast: Reuse _swrast_free_texture_image_buffer from drivers. (Eric Anholt, 2013-04-30, 2 files, -15/+2)
| | | | | Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* swrast: Move ImageOffsets allocation to shared code. (Eric Anholt, 2013-04-30, 4 files, -44/+17)
| | | | | Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* swrast: Clean up and explain the mapping process. (Eric Anholt, 2013-04-30, 2 files, -10/+17)
| | | | | | | v2: Move slice height calculation to a helper function (recommended by Brian). Reviewed-by: Kenneth Graunke <[email protected]> (v1) Reviewed-by: Brian Paul <[email protected]>
* swrast: Factor out texture slice counting. (Eric Anholt, 2013-04-30, 1 file, -4/+12)
| | | | | | | This function is going to get used a lot more in upcoming patches. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* radeon: Remove some dead teximage mapping code. (Eric Anholt, 2013-04-30, 2 files, -52/+0)
| | | | | Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* radeon: Add missing swrast field initialization. (Eric Anholt, 2013-04-30, 1 file, -0/+3)
| | | | | | | | This is the equivalent of intel's 80513ec8b4c812b9c6249cc5824337a5f04ab34c. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* r600g/llvm: Fix opencl build (Vincent Lejeune, 2013-04-30, 1 file, -1/+1)
* Gallium: Use mmap on Haiku for executable memory vs malloc (Alexander von Gluck IV, 2013-04-29, 1 file, -1/+1)
| | | | * Haiku now has DEP enabled by default.
* Mapi: Use mmap on Haiku for executable memory vs malloc (Alexander von Gluck IV, 2013-04-29, 1 file, -1/+1)
| | | | * Haiku now has DEP enabled by default.
* Mesa: Use mmap on Haiku for executable memory vs malloc (Alexander von Gluck IV, 2013-04-29, 1 file, -1/+1)
| | | | * Haiku now has DEP enabled by default.
* r600g/llvm: get use_kill from compiler shader (Vincent Lejeune, 2013-04-30, 4 files, -2/+9)
* i965/fs: Print out the estimated cycle count in INTEL_DEBUG=wm (Eric Anholt, 2013-04-29, 1 file, -0/+5)
| | | | | | | This could be used by shader-db for hopefully more accurate regression testing. Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Allow LRPs with uniform registers. (Eric Anholt, 2013-04-29, 3 files, -1/+11)
| | | | | | | | | Improves GLB2.7 performance on my HSW by 0.671455% +/- 0.225037% (n=62). v2: Make is_valid_3src() a method of the fs_reg. (recommended by Ken) Reviewed-by: Matt Turner <[email protected]> (v1) Reviewed-by: Kenneth Graunke <[email protected]> (v1)
* intel: Be more conservative in disabling tiling to save memory. (Eric Anholt, 2013-04-29, 1 file, -3/+5)
| | | | | | | | Improves GLB2.7 trex performance 1.01985% +/- 0.721366% on my IVB (n=10) and by 3.38771% +/- 0.584241% (n=15) on my HSW, due to a 32x32 ARGB8888 cubemap going from untiled to tiled. Reviewed-by: Daniel Vetter <[email protected]>
* i965: Disable Z16 on contexts that don't require it. (Eric Anholt, 2013-04-29, 1 file, -1/+14)
| | | | | | | | | | | | | | It appears that Z16 on Intel hardware is in fact slower than Z24, so people are getting surprisingly hurt when trying to use Z16 as a performance-versus-precision tradeoff, or when they're targeting GLES2 and that's all you get. GL 3.0+ have Z16 on the list of required exact format sizes, but GLES doesn't, so choose the better-performing layout in that case. Improves GLB 2.7 trex performance at 1920x1080 by 10.7% +/- 1.1% (n=3) on my IVB system. Reviewed-by: Kenneth Graunke <[email protected]>
* intel: Report FBO incompleteness causes through GL_ARB_debug_output. (Eric Anholt, 2013-04-29, 1 file, -22/+34)
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* intel: Fold the one last function intel_tex_format.c into the caller. (Eric Anholt, 2013-04-29, 7 files, -27/+10)
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: Fix error checking for GS UBO getters. (Eric Anholt, 2013-04-29, 1 file, -2/+7)
| | | | | These are supposed to be present if both things are available, but we were enabling them if either one was.
* mesa: Add a clarifying comment about EXTRA_ error checking. (Eric Anholt, 2013-04-29, 1 file, -1/+7)
* mesa: Add an extra clarifying set of braces to getter checking. (Eric Anholt, 2013-04-29, 1 file, -1/+2)
| | | | | | For this multi-page single statement, my thought at the end was that the next block was mis-indented, rather than that the dropped indentation actually indicated the end of the loop.
* mesa: Fix error checking for getters consisting of only API versions. (Eric Anholt, 2013-04-29, 1 file, -32/+24)
| | | | | | | | | In almost all of our cases, getters that are turned on for only some API variants will have an extension listed as one of the things that can enable it, and thus api_check gets set. For extra_gl30_es3 (used for NUM_EXTENSIONS, MAJOR_VERSION, MINOR_VERSION) on a GL 2.1 context, though, we would check twice, not find either one, but never actually throw the error.
* mesa: Clarify the names of error checking variables for glGet. (Eric Anholt, 2013-04-29, 1 file, -22/+21)
| | | | | There's no reason to actually count these things, so the integer ++ behavior was just confusing.
* i915: Add support for GL_EXT_texture_sRGB and GL_EXT_texture_sRGB_decode. (Eric Anholt, 2013-04-29, 3 files, -2/+15)
| | | | This brings the driver up to GL 2.1.
* i915: Always enable GL 2.0 support. (Eric Anholt, 2013-04-29, 2 files, -30/+5)
| | | | There's no point in shipping a non-GL2 driver today.
* i915: Correctly set the OQ counter bits. (Eric Anholt, 2013-04-29, 2 files, -0/+2)
| | | | | | | | | | | | While we may provide the extension, we need to tell applications that they can't actually use it: An implementation can either set QUERY_COUNTER_BITS_ARB to the value 0, or to some number greater than or equal to n. If an implementation returns 0 for QUERY_COUNTER_BITS_ARB, then the occlusion queries will always return that zero samples passed the occlusion test, and so an application should not use occlusion queries on that implementation.
* i965: Move is_math/is_tex/is_control_flow() to backend_instruction. (Kenneth Graunke, 2013-04-29, 6 files, -76/+49)
| | | | | | | | | | | | | | | These are entirely based on the opcode, which is available in backend_instruction. It makes sense to only implement them in one place. This changes the VS implementation of is_tex() slightly, which now accepts FS_OPCODE_TXB and SHADER_OPCODE_LOD. However, since those aren't generated in the VS anyway, it should be fine. This also makes is_control_flow() available in the VS. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* draw/so: fix overflow calculation (Zack Rusin, 2013-04-27, 1 file, -8/+18)
| | | | | | | | Only report overflow for missing targets if they're actually being used. If the targets are missing but are not being used by any slot in the stream output declaration, we should correctly just ignore them. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: José Fonseca <[email protected]>
* llvmpipe: Fix queries when screen->num_threads == 0. (José Fonseca, 2013-04-29, 1 file, -2/+3)
| | | | | | | | | | That is, when llvmpipe is run in single-threaded mode. Trivial. Tested with LP_NUM_THREADS=0 glean --run results --overwrite --quick --tests occluQry
* Revert "st/mesa: add a simple path to BufferData if it only discards buffer contents" (José Fonseca, 2013-04-29, 1 file, -14/+0)
| | | | | | This reverts commit 5649f886f76023532538b8792605a3578cec1ed1. It causes segfaults when size is zero.
* r600g: force full cache for hyperz (Jerome Glisse, 2013-04-29, 2 files, -0/+2)
| | | | | | | | | | | | | | | | | It seems that in some cases allowing half cache usage confuses the GPU and triggers a lockup. Force full cache use. Should fix: https://bugs.freedesktop.org/show_bug.cgi?id=59592 https://bugs.freedesktop.org/show_bug.cgi?id=60848 https://bugs.freedesktop.org/show_bug.cgi?id=60969 https://bugs.freedesktop.org/show_bug.cgi?id=61747 https://bugs.freedesktop.org/show_bug.cgi?id=62466 https://bugs.freedesktop.org/show_bug.cgi?id=62669 https://bugs.freedesktop.org/show_bug.cgi?id=62721 https://bugs.freedesktop.org/show_bug.cgi?id=63124 Signed-off-by: Jerome Glisse <[email protected]>
* freedreno: fix rebase screw-up (Rob Clark, 2013-04-29, 1 file, -1/+1)
| | | | | | Add back 2nd arg to emit_vertexbufs() which got lost in rebase. Signed-off-by: Rob Clark <[email protected]>
* i965/fs: Don't try to use bogus interpolation modes pre-Gen6. (Chris Forbes, 2013-04-30, 1 file, -9/+17)
| | | | | | | | | | | | | | | | | | | | | Interpolation modes other than perspective-barycentric-pixel-center (and their associated coefficients in the WM payload) only exist in Gen6 and later. Unfortunately, if a varying was declared as `centroid`, we would blindly read the nonexistent values, and so produce all manner of bad behavior -- texture swimming, snow, etc. Fixes rendering in Counter-Strike Source and Team Fortress 2 on Ironlake. NOTE: This is a candidate for the 9.1 branch. Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Paul Berry <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Tested-by: Jordan Justen <[email protected]>
* i965/vs: Fix order of source arguments to LRP. (Matt Turner, 2013-04-28, 1 file, -1/+4)
| | | | | | | The order of arguments matches DirectX, and is backwards from GLSL's mix() built-in. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63983
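The operand reversal can be sketched in plain C. This assumes the DirectX-style convention lrp(s0, s1, s2) = s0*s1 + (1 - s0)*s2 and the GLSL definition mix(x, y, a) = x*(1 - a) + y*a; the function names are illustrative and are not the driver's emit helpers.

```c
/* GLSL mix(x, y, a): blend from x to y by weight a. */
static float glsl_mix(float x, float y, float a)
{
    return x * (1.0f - a) + y * a;
}

/* LRP with DirectX-style operand order (assumed):
 * the interpolation weight comes first. */
static float lrp(float s0, float s1, float s2)
{
    return s0 * s1 + (1.0f - s0) * s2;
}

/* Lowering mix() to LRP requires reversing the operand order. */
static float mix_via_lrp(float x, float y, float a)
{
    return lrp(a, y, x);
}
```

Passing the operands through unchanged, as in lrp(x, y, a), computes a different blend, which is the kind of mistake this commit title describes.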
* llvmpipe: stop crashing when one of the so targets is null (Zack Rusin, 2013-04-27, 1 file, -2/+5)
| | | | | | | Fixes a crash when one of the so targets is null. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: José Fonseca <[email protected]>
* draw/so: indicate overflow when buffer is missing (Zack Rusin, 2013-04-27, 1 file, -0/+4)
| | | | | | | | | | We were crashing if one of the buffers wasn't set, we should just treat it as an overflow. It's useful when using so statistics because it allows one to figure out how much data would be generated by so without actually writing any of it. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: José Fonseca <[email protected]>
* gallivm: fix indirect addressing of temps in soa mode (Zack Rusin, 2013-04-27, 1 file, -0/+11)
| | | | | | | | We weren't adding the SoA offsets when constructing the indices for the gather functions. That meant that we were always returning the data in the first vertex/primitive/pixel in the SoA structure and not correctly fetching from all structures. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: José Fonseca <[email protected]>
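As a rough illustration of the missing offset, the sketch below uses a hypothetical flat layout in which register r of lane l lives at temps[r * lanes + l]. The layout and function names are assumptions for the example; gallivm's real code builds LLVM IR rather than looping on the CPU.

```c
#include <stddef.h>

/* Index of register `reg_index` for a given lane in the assumed SoA layout.
 * Dropping the `+ lane` term makes every lane read lane 0's data. */
static size_t soa_gather_index(size_t reg_index, size_t lane, size_t lanes)
{
    return reg_index * lanes + lane;
}

/* Gather one value per lane, where each lane may address a different register. */
static void soa_gather(const float *temps, const size_t *reg_per_lane,
                       size_t lanes, float *out)
{
    for (size_t lane = 0; lane < lanes; lane++)
        out[lane] = temps[soa_gather_index(reg_per_lane[lane], lane, lanes)];
}
```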
* tgsi/ureg: Add a function to return the number of outputs (Zack Rusin, 2013-04-26, 2 files, -0/+15)
| | | | | | | | We already hold the variable, just weren't providing access to it. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: José Fonseca <[email protected]>
* draw/so: Fix overflow calculations (Zack Rusin, 2013-04-26, 1 file, -3/+8)
| | | | | | | | | | We weren't taking the buffer offset, destination offset or the stride into consideration so we were frequently writing into an overflown buffer. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: José Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
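A bounds check that accounts for all three quantities might look like the following sketch. The function name, parameter names, and the use of byte units throughout are assumptions made for illustration; the draw module's own units and structure may differ.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true if writing `write_size` bytes for vertex `vertex_index`
 * would run past the end of the stream output buffer.
 * All parameters are in bytes (an assumption of this sketch). */
static bool so_write_overflows(size_t buffer_size, size_t buffer_offset,
                               size_t stride, size_t dst_offset,
                               size_t vertex_index, size_t write_size)
{
    size_t start = buffer_offset + vertex_index * stride + dst_offset;
    return start + write_size > buffer_size;
}
```

Ignoring any one of buffer_offset, dst_offset, or stride makes `start` too small, so a write that really lands past the end of the buffer would be judged safe.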
* draw/llvm: fix viewport transformations (Zack Rusin, 2013-04-26, 1 file, -4/+5)
| | | | | | | | | | | This was a very serious bug. We were always doing the viewport transformations on the first output of the vertex shader. That means that every application that was storing position in anything but OUT[0] was outputting untransformed vertices and had broken output for whatever it was storing at OUT[0]. Correctly take into consideration where the vertex position is actually stored. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: José Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* gallium: increase the number of available stream output decls (Zack Rusin, 2013-04-26, 1 file, -1/+2)
| | | | | | | | | | There can be more stream output decls than shader outputs because individual components from them can be split and distributed among different so buffers. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: José Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* llvmpipe: implement so_overflow query (Zack Rusin, 2013-04-26, 3 files, -0/+15)
| | | | | | Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: José Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* mesa: fix the compressed TexSubImage size checking code (Brian Paul, 2013-04-26, 1 file, -3/+9)
| | | | | | | | | | | Before, we'd incorrectly generate an error if we tried to replace a non-4x4 block near the edge of an NPOT compressed texture. For example, if the dest image was 15 texels wide and xoffset=12 and width=3 we'd incorrectly generate GL_INVALID_OPERATION. Verified with new tests added to piglit s3tc-errors test. Note: This is a candidate for the stable branches. Reviewed-by: Roland Scheidegger <[email protected]>
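The rule behind the example can be sketched for one dimension as follows. This is an illustrative check, not Mesa's actual code: the function name is hypothetical, and it assumes the usual block-compression rule that the offset must be block-aligned and the size must be a whole number of blocks unless the region reaches the image edge.

```c
#include <stdbool.h>

/* Validate one dimension of a compressed sub-image update.
 * A partial block is allowed only when the region ends exactly
 * at the edge of the destination image. */
static bool compressed_subimage_extent_ok(unsigned offset, unsigned size,
                                          unsigned image_size,
                                          unsigned block_size)
{
    if (offset % block_size != 0)
        return false;
    if (size % block_size != 0 && offset + size != image_size)
        return false;
    return true;
}
```

With 4-texel blocks, a 15-texel-wide image, xoffset=12, and width=3, the region ends at the image edge, so the update is accepted.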
* llvmpipe: replace LP_MAX_THREADS with screen->num_threads in query code (Brian Paul, 2013-04-26, 1 file, -2/+4)
| | | | Reviewed-by: Roland Scheidegger <[email protected]>
* llvmpipe: bump LP_MAX_THREADS to 16 (Brian Paul, 2013-04-26, 1 file, -1/+1)
| | | | | | | On the mesa-users list, Burlen Loring reported a speed-up with 16 cores and his test/app. Reviewed-by: Roland Scheidegger <[email protected]>