summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* texgetimage: consolidate 1D array handling code.Dave Airlie2015-12-031-15/+11
| | | | | | | | | | | | | | This should fix the getteximage-depth test that currently asserts. I was hitting problem with virgl as well in this area. This moves the 1D array handling code to a single place. Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Ben Skeggs <[email protected]> Cc: "10.6 11.0 11.1" <[email protected]> Signed-off-by: Dave Airlie <[email protected]> (cherry picked from commit 237bcdbab529237a120e225c63f567934a955523)
* nv50/ir: fix (un)spilling of 3-wide resultsIlia Mirkin2015-12-031-4/+42
| | | | | | | | | | There is no 96-bit load/store operations, so we have to split it up into a 32-bit parts, with a split/merge around it. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90348 Signed-off-by: Ilia Mirkin <[email protected]> Cc: "11.0 11.1" <[email protected]> (cherry picked from commit 4deb118d06e96731f3481daa72c201d7258bfbbb)
* nv50,nvc0: properly handle buffer storage invalidation on dsa bufferIlia Mirkin2015-12-032-15/+17
| | | | | | | | | | In case that the buffer has no bind at all, assume it can be a regular buffer. This can happen on buffers created through the ARB_dsa interfaces. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "11.0 11.1" <[email protected]> (cherry picked from commit ad5f6b03e793b9390e3b9f3eca68bd43f9d809eb)
* nouveau: use the buffer usage to determine placement when no bindingIlia Mirkin2015-12-031-2/+6
| | | | | | | | | | | | With ARB_direct_state_access, buffers can be created without any binding hints at all. We still need to allocate these buffers to VRAM or GART, as we don't have logic down the line to place them into GPU-mappable space. Ideally we'd be able to shift these things around based on usage. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92438 Signed-off-by: Ilia Mirkin <[email protected]> Cc: "11.0 11.1" <[email protected]> (cherry picked from commit 079f713754a9e5d7802b655d54320bd37f24fbfa)
* glsl: Fix off-by-one error in array size check assertionIan Romanick2015-12-031-2/+1
| | | | | | | | | | | Apparently, this has been a bug since 2010 (c30f6e5d). Also use ARRAY_SIZE instead of open coding it. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: [email protected] (cherry picked from commit 2f554761536bbfd0d8ec22e807c18bd6df0f22b8)
* nir: fix typo in idiv lowering, causing large-udiv-udiv failuresIlia Mirkin2015-12-031-1/+1
| | | | | | | | | | | | | | In nv50, and in the python script that Rob circulated, we do: bld.mkCmp(OP_SET, CC_GE, TYPE_U32, (s = bld.getSSA()), TYPE_U32, m, b); Do the same in the nir div lowering pass. This fixes the large-udiv-udiv piglit tests on freedreno. Signed-off-by: Ilia Mirkin <[email protected]> Cc: [email protected] Signed-off-by: Rob Clark <[email protected]> (cherry picked from commit b40e144a665142957a7ae027238e61fd01a27ebc)
* llvmpipe: disable VSX in ppc due to LLVM PPC bugOded Gabbay2015-12-031-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | This patch disables the use of VSX instructions, as they cause some piglit tests to fail For more details, see: https://llvm.org/bugs/show_bug.cgi?id=25503#c7 With this patch, ppc64le reaches parity with x86-64 as far as piglit test suite is concerned. v2: - Added check that we have at least LLVM 3.4 - Added the LLVM bug URL as a comment in the code v3: - Only disable VSX if Altivec is supported, because if Altivec support is missing, then VSX support doesn't exist anyway. - Change original patch description. Signed-off-by: Oded Gabbay <[email protected]> Cc: "11.0" <[email protected]> Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> (cherry picked from commit 4581f8428e0e1d2f6787d0765823c7883bd2cfcd)
* nvc0/ir: actually emit AFETCH on keplerIlia Mirkin2015-12-031-0/+3
| | | | | | | | | Looks like this was forgotten in the commit which added the AFETCH logic. Signed-off-by: Ilia Mirkin <[email protected]> Cc: [email protected] (cherry picked from commit 8e68113c1a78c48f26e820f4beb2dda9e4b99f32)
* meta/generate_mipmap: Don't leak the framebuffer objectIan Romanick2015-12-031-0/+5
| | | | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> Cc: "10.6 11.0" <[email protected]> (cherry picked from commit 72e232374eda780a5dcd374b55d203d0e2a6d02b)
* get-pick-list.sh: Require explicit "11.0" for nominating stable patchesEmil Velikov2015-12-031-1/+1
| | | | | | | A nomination unadorned with a specific version is now interpreted as being aimed at the 11,0 branch, which was recently opened. Signed-off-by: Emil Velikov <[email protected]>
* meta/TexSubImage: Don't pollute the buffer object namespaceIan Romanick2015-11-241-18/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | tl;dr: For many types of GL object, we can *NEVER* use the Gen function. In OpenGL ES (all versions!) and OpenGL compatibility profile, applications don't have to call Gen functions. The GL spec is very clear about how you can mix-and-match generated names and non-generated names: you can use any name you want for a particular object type until you call the Gen function for that object type. Here's the problem scenario: - Application calls a meta function that generates a name. The first Gen will probably return 1. - Application decides to use the same name for an object of the same type without calling Gen. Many demo programs use names 1, 2, 3, etc. without calling Gen. - Application calls the meta function again, and the meta function replaces the data. The application's data is lost, and the app fails. Have fun debugging that. Signed-off-by: Ian Romanick <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92363 Reviewed-by: Anuj Phogat <[email protected]> (cherry picked from commit 58aa56d40bfc6ba54ad9172bf219d18eeb615a80)
* meta: Don't pollute the buffer object namespace in _mesa_meta_DrawTexIan Romanick2015-11-244-38/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | tl;dr: For many types of GL object, we can *NEVER* use the Gen function. In OpenGL ES (all versions!) and OpenGL compatibility profile, applications don't have to call Gen functions. The GL spec is very clear about how you can mix-and-match generated names and non-generated names: you can use any name you want for a particular object type until you call the Gen function for that object type. Here's the problem scenario: - Application calls a meta function that generates a name. The first Gen will probably return 1. - Application decides to use the same name for an object of the same type without calling Gen. Many demo programs use names 1, 2, 3, etc. without calling Gen. - Application calls the meta function again, and the meta function replaces the data. The application's data is lost, and the app fails. Have fun debugging that. Signed-off-by: Ian Romanick <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92363 Reviewed-by: Anuj Phogat <[email protected]> (cherry picked from commit 76cfe2bc4436186fd585be96c4f402c1b1c79bdf)
* meta: Use internal functions for buffer object and VAO access in ↵Ian Romanick2015-11-241-12/+20
| | | | | | | | _mesa_meta_DrawTex Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> (cherry picked from commit a222d4cbc3de49857aebbdf727e53c273abcc6c1)
* meta: Track VBO using gl_buffer_object instead of GL API object handle in ↵Ian Romanick2015-11-242-6/+19
| | | | | | | | _mesa_meta_DrawTex Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> (cherry picked from commit b8a7369fb7e5773892d01fcb1140fe6569ee27eb)
* meta: Partially convert _mesa_meta_DrawTex to DSAIan Romanick2015-11-241-6/+6
| | | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> (cherry picked from commit d5225ee5d92f00958c54b425fe829c811149e889)
* meta: Don't pollute the buffer object namespace in ↵Ian Romanick2015-11-241-14/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | _mesa_meta_setup_vertex_objects tl;dr: For many types of GL object, we can *NEVER* use the Gen function. In OpenGL ES (all versions!) and OpenGL compatibility profile, applications don't have to call Gen functions. The GL spec is very clear about how you can mix-and-match generated names and non-generated names: you can use any name you want for a particular object type until you call the Gen function for that object type. Here's the problem scenario: - Application calls a meta function that generates a name. The first Gen will probably return 1. - Application decides to use the same name for an object of the same type without calling Gen. Many demo programs use names 1, 2, 3, etc. without calling Gen. - Application calls the meta function again, and the meta function replaces the data. The application's data is lost, and the app fails. Have fun debugging that. Signed-off-by: Ian Romanick <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92363 Reviewed-by: Anuj Phogat <[email protected]> (cherry picked from commit 37d11b13ce1db2ad867ff5433cb31bcd1d93c7bf)
* meta: Use internal functions for buffer object and VAO accessIan Romanick2015-11-241-33/+43
| | | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> (cherry picked from commit b1b73a42c8f245d5bf6bbed04071b2c6519b2fb4)
* meta: Use DSA functions for VBOs in _mesa_meta_setup_vertex_objectsIan Romanick2015-11-241-19/+24
| | | | | | | | | | | The fixed-function attribute paths don't get the DSA treatment because there are no DSA entry-points for fixed-function attributes. These could have been added, but this is a temporary patch intended to make later patches easier to review. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> (cherry picked from commit 52921f8e089abbc6871060fc50f90b62aaf1e11b)
* meta: Track VBO using gl_buffer_object instead of GL API object handleIan Romanick2015-11-245-48/+72
| | | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> (cherry picked from commit 1035e00a815f0babddac0c6c43d01fd34f7e8a94)
* meta: Don't leave the VBO bound after _mesa_meta_setup_vertex_objectsIan Romanick2015-11-245-21/+38
| | | | | | | | | | Meta currently does this, but future changes will make this impossible. Explicitly do it as a step in the patch series now to catch any possible kinks. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> (cherry picked from commit 3b5a7d450dea9bfadf1d72338f4418c87340805b)
* i965: Use _mesa_NamedBufferSubData for users of _mesa_meta_setup_vertex_objectsIan Romanick2015-11-241-3/+3
| | | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> (cherry picked from commit ed0bd6573b6ce83471e73d009dbab5220f9e80c0)
* meta: Use _mesa_NamedBufferData and _mesa_NamedBufferSubData for users of ↵Ian Romanick2015-11-243-10/+8
| | | | | | | | _mesa_meta_setup_vertex_objects Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> (cherry picked from commit 7f2f3000716d994d94c53f4a0c8a211fb00a46a4)
* meta: Use DSA functions for PBO in create_texture_for_pboIan Romanick2015-11-241-19/+11
| | | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> (cherry picked from commit 89a61afdd7346d6e36caccc4d6f2a2607dc4a1f6)
* i965: Don't pollute the buffer object namespace in brw_meta_fast_clearIan Romanick2015-11-241-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | tl;dr: For many types of GL object, we can *NEVER* use the Gen function. In OpenGL ES (all versions!) and OpenGL compatibility profile, applications don't have to call Gen functions. The GL spec is very clear about how you can mix-and-match generated names and non-generated names: you can use any name you want for a particular object type until you call the Gen function for that object type. Here's the problem scenario: - Application calls a meta function that generates a name. The first Gen will probably return 1. - Application decides to use the same name for an object of the same type without calling Gen. Many demo programs use names 1, 2, 3, etc. without calling Gen. - Application calls the meta function again, and the meta function replaces the data. The application's data is lost, and the app fails. Have fun debugging that. Signed-off-by: Ian Romanick <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92363 Reviewed-by: Abdiel Janulgue <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> (cherry picked from commit 4e6b9c11fc545cc570ea0042af93e61bfb525d34)
* i965: Use internal functions for buffer object accessIan Romanick2015-11-241-6/+18
| | | | | | | | | | | | | | | | Instead of going through the GL API implementation functions, use the lower-level functions. This means that we have to keep track of a pointer to the gl_buffer_object and the gl_vertex_array_object. This has two advantages. First, it avoids a bunch of CPU overhead in looking up objects and validing API parameters. Second, and much more importantly, it will allow us to stop calling _mesa_GenBuffers / _mesa_CreateBuffers and pollute the buffer namespace (next patch). Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Abdiel Janulgue <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> (cherry picked from commit e62799bd4e7b7525995e465a4bdcf6df0b0a69a0)
* i965: Use DSA functions for VBOs in brw_meta_fast_clearIan Romanick2015-11-241-6/+7
| | | | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Abdiel Janulgue <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> (cherry picked from commit 1c5423d3a074d50138e5ad7945024f9cf4d063ec)
* i965: Pass brw_context instead of gl_context to brw_draw_rectlistIan Romanick2015-11-241-4/+5
| | | | | | | | | | | Future patches will use the brw_context instead. Keeping this non-functional change separate should make the function changes easier to review. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Abdiel Janulgue <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> (cherry picked from commit dcadd855f14b0d3dcce04a16afdfed2d7159d4a8)
* mesa: Refactor enable_vertex_array_attrib to make ↵Ian Romanick2015-11-242-9/+22
| | | | | | | | | | | | | | | _mesa_enable_vertex_array_attrib Pulls the parts of enable_vertex_array_attrib that aren't just parameter validation out into a function that can be called from other parts of Mesa (e.g., meta). _mesa_enable_vertex_array_attrib can also be used to enable fixed-function arrays. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> (cherry picked from commit 4a644f1caadc6b3e26b5f0ac60ac855152e38e59)
* mesa: Refactor update_array_format to make _mesa_update_array_format_publicIan Romanick2015-11-242-19/+57
| | | | | | | | | | Pulls the parts of update_array_format that aren't just parameter validation out into a function that can be called from other parts of Mesa (e.g., meta). Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> (cherry picked from commit a336fcd36a4743c1ea6f8549cb31b48277359717)
* mesa: Make bind_vertex_buffer avilable outside varray.cIan Romanick2015-11-242-14/+22
| | | | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Abdiel Janulgue <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> (cherry picked from commit 8fae494df2813bfa38f1aabd6c9b3c1ba3a5e4ef)
* meta: Compute correct buffer size with SkipRows/SkipPixelsChris Wilson2015-11-241-15/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the user is specifying a subregion of a buffer using SKIP_ROWS and SKIP_PIXELS, we must compute the buffer size carefully as the end of the last row may be much shorter than stride*image_height*depth. The current code tries to memcpy from beyond the end of the user data, for example causing: ==28136== Invalid read of size 8 ==28136== at 0x4C2D94E: memcpy@@GLIBC_2.14 (vg_replace_strmem.c:915) ==28136== by 0xB4ADFE3: brw_bo_write (brw_batch.c:1856) ==28136== by 0xB5B3531: brw_buffer_data (intel_buffer_objects.c:208) ==28136== by 0xB0F6275: _mesa_buffer_data (bufferobj.c:1600) ==28136== by 0xB0F6346: _mesa_BufferData (bufferobj.c:1631) ==28136== by 0xB37A1EE: create_texture_for_pbo (meta_tex_subimage.c:103) ==28136== by 0xB37A467: _mesa_meta_pbo_TexSubImage (meta_tex_subimage.c:176) ==28136== by 0xB5C8D61: intelTexSubImage (intel_tex_subimage.c:195) ==28136== by 0xB254AB4: _mesa_texture_sub_image (teximage.c:3654) ==28136== by 0xB254C9F: texsubimage (teximage.c:3712) ==28136== by 0xB2550E9: _mesa_TexSubImage2D (teximage.c:3853) ==28136== by 0x401CA0: UploadTexSubImage2D (teximage.c:171) ==28136== Address 0xd8bfbe0 is 0 bytes after a block of size 1,024 alloc'd ==28136== at 0x4C28C20: malloc (vg_replace_malloc.c:296) ==28136== by 0x402014: PerfDraw (teximage.c:270) ==28136== by 0x402648: Draw (glmain.c:182) ==28136== by 0x8385E63: ??? (in /usr/lib/x86_64-linux-gnu/libglut.so.3.9.0) ==28136== by 0x83896C8: fgEnumWindows (in /usr/lib/x86_64-linux-gnu/libglut.so.3.9.0) ==28136== by 0x838641C: glutMainLoopEvent (in /usr/lib/x86_64-linux-gnu/libglut.so.3.9.0) ==28136== by 0x8386C1C: glutMainLoop (in /usr/lib/x86_64-linux-gnu/libglut.so.3.9.0) ==28136== by 0x4019C1: main (glmain.c:262) ==28136== ==28136== Invalid read of size 8 ==28136== at 0x4C2D940: memcpy@@GLIBC_2.14 (vg_replace_strmem.c:915) ==28136== by 0xB4ADFE3: brw_bo_write (brw_batch.c:1856) ==28136== by 0xB5B3531: brw_buffer_data (intel_buffer_objects.c:208) ==28136== by 0xB0F6275: _mesa_buffer_data (bufferobj.c:1600) ==28136== by 0xB0F6346: _mesa_BufferData (bufferobj.c:1631) ==28136== by 0xB37A1EE: create_texture_for_pbo (meta_tex_subimage.c:103) ==28136== by 0xB37A467: _mesa_meta_pbo_TexSubImage (meta_tex_subimage.c:176) ==28136== by 0xB5C8D61: intelTexSubImage (intel_tex_subimage.c:195) ==28136== by 0xB254AB4: _mesa_texture_sub_image (teximage.c:3654) ==28136== by 0xB254C9F: texsubimage (teximage.c:3712) ==28136== by 0xB2550E9: _mesa_TexSubImage2D (teximage.c:3853) ==28136== by 0x401CA0: UploadTexSubImage2D (teximage.c:171) ==28136== Address 0xd8bfbe8 is 8 bytes after a block of size 1,024 alloc'd ==28136== at 0x4C28C20: malloc (vg_replace_malloc.c:296) ==28136== by 0x402014: PerfDraw (teximage.c:270) ==28136== by 0x402648: Draw (glmain.c:182) ==28136== by 0x8385E63: ??? (in /usr/lib/x86_64-linux-gnu/libglut.so.3.9.0) ==28136== by 0x83896C8: fgEnumWindows (in /usr/lib/x86_64-linux-gnu/libglut.so.3.9.0) ==28136== by 0x838641C: glutMainLoopEvent (in /usr/lib/x86_64-linux-gnu/libglut.so.3.9.0) ==28136== by 0x8386C1C: glutMainLoop (in /usr/lib/x86_64-linux-gnu/libglut.so.3.9.0) ==28136== by 0x4019C1: main (glmain.c:262) ==28136== Fixes regression from commit 7f396189f073d626c5f7a2c232dac92b65f5a23f Author: Jason Ekstrand <[email protected]> Date: Mon Jan 5 18:17:04 2015 -0800 meta: Add a BlitFramebuffers-based implementation of TexSubImage v2: However, the teximage we create does need to be width x full_height x 1 Signed-off-by: Chris Wilson <[email protected]> Cc: Jason Ekstrand <[email protected]> Cc: Neil Roberts <[email protected]> Reviewed-by Neil Roberts <[email protected]> (cherry picked from commit f30cf3258e495a583e011e07d5b4a19031c5518f)
* docs: add sha256 checksums for 11.0.6Emil Velikov2015-11-211-1/+2
| | | | Signed-off-by: Emil Velikov <[email protected]>
* docs: add release notes for 11.0.6mesa-11.0.6Emil Velikov2015-11-211-0/+144
| | | | Signed-off-by: Emil Velikov <[email protected]>
* Update version to 11.0.6Emil Velikov2015-11-211-1/+1
| | | | Signed-off-by: Emil Velikov <[email protected]>
* automake: use static llvm for make distcheckEmil Velikov2015-11-211-0/+1
| | | | | | | | | With llvm 3.7 semi-dropping the autoconf build, we rely on their cmake build. With the latter of which annoyingly using another (busted?) SONAME. Signed-off-by: Emil Velikov <[email protected]> (cherry picked from commit c45b4257c26b93043508e55c6a1aeb3a8b14eee9)
* llvmpipe: use simple coeffs calc for 128bit vectorsOded Gabbay2015-11-181-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are currently two methods in llvmpipe code to calculate coeffs to be used as inputs for the fragment shader. The two methods use slightly different ways to do the floating point calculations and thus produce slightly different results. The decision which method to use is determined by the size of the vector that is used by the platform. For vectors with size of more than 128bit, a single-step method is used, in which coeffs_init_simple() + attribs_update_simple() are called. For vectors with size of 128bit or less, a two-step method is used, in which coeffs_init() + attribs_update() are called. This causes some piglit tests (clip-distance-bulk-copy, interface-vs-unnamed-to-fs-unnamed) to fail when using platforms with 128bit vectors (such as ppc64le or x86-64 without AVX). This patch makes platforms with 128bit vectors use the single-step method (aka "simple" method) instead of the two-step method. This would make the resulting coeffs identical between more platforms, make sure the piglit tests passes, and make debugging and maintainability a bit easier as the generated LLVM IR will be the same for more platforms. The performance impact is negligible for x86-64 without AVX, and basically non-existent for ppc64le, as it can be seen from the following benchmarking results: - glxspheres, on ppc64le: - original code: 4.892745317 frames/sec 5.460303857 Mpixels/sec - with the patch: 4.932083873 frames/sec 5.504205571 Mpixels/sec - Additional 0.8% performance boost - glxspheres, on x86-64 without AVX: - original code: 20.16418809 frames/sec 22.50323395 Mpixels/sec - with the patch: 20.31328989 frames/sec 22.66963152 Mpixels/sec - Additional 0.74% performance boost - glmark2, on ppc64le: - original code: score of 58 - with my change: score of 57 - glmark2, on x86-64 without AVX: - original code: score of 175 - with the patch: score of 167 - Impact of of -4.5% on performance - OpenArena, on ppc64le: - original code: 3398 frames 1719.0 seconds 2.0 fps 255.0/505.9/2773.0/0.0 ms - with the patch: 3398 frames 1690.4 seconds 2.0 fps 241.0/497.5/2563.0/0.2 ms - 29 seconds faster with the patch, which is about 2% - OpenArena, on x86-64 without AVX: - original code: 3398 frames 239.6 seconds 14.2 fps 38.0/70.5/719.0/14.6 ms - with the patch: 3398 frames 244.4 seconds 13.9 fps 38.0/71.9/697.0/14.3 ms - 0.3 fps slower with the patch (about 2%) Additional details can be found at: http://lists.freedesktop.org/archives/mesa-dev/2015-October/098635.html Signed-off-by: Oded Gabbay <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> (cherry picked from commit 39b4dfe6ab1003863778a25c091c080e098833ec)
* vc4: Add support for nir_op_uge, using the carry bit on QPU_A_SUB.Eric Anholt2015-11-185-0/+26
| | | | | | | | | | | | | | It looks like nir_lower_idiv is going to use it soon, so add support. With Ilia's change, this fixes one case in fs-op-div-large-uint-uint (with GL 3.0 forced on). Cc: "11.0" <[email protected]> (cherry picked from commit a4bf28178f064082d3b818d2cd48abf9075cc459) [Emil Velikov: Resolve trivial conflicts] Signed-off-by: Emil Velikov <[email protected]> Conflicts: src/gallium/drivers/vc4/vc4_qpu_emit.c
* r200: fix bgrx8/xrgb8 blitsRoland Scheidegger2015-11-181-0/+4
| | | | | | | | | | | | | | | | | | Since 779cabfc7d022de8b7b9bc7fdac0caffa8646c51 the same txformat table entries are used for "normal" texturing as well as for blits. However, I forgot to put in an entry for the bgrx8 (le) and xrgb8 (be) formats - the normal texturing path can't hit them because the radeon tex format chooser will never chose them, but we get that format from the dri buffers (at least I assume we got it from there). This is untested but essentially addressing the same bug as for radeon. (I don't think that the second entry per le/be table is actually necessary, but shouldn't hurt...) Tested-by: Ian Romanick <[email protected]> Acked-by: Alex Deucher <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Cc: "11.0" <[email protected]> (cherry picked from commit a2611ffe4b5f1852c59301f086b988233a1c62f3)
* radeon: fix bgrx8/xrgb8 blitsRoland Scheidegger2015-11-181-0/+2
| | | | | | | | | | | | | | | | | Since d21320f6258b2e1780a15c1ca718963d8a15ca18 the same txformat table entries are used for "normal" texturing as well as for blits. However, I forgot to put in an entry for the bgrx8 (le) and xrgb8 (be) formats - the normal texturing path can't hit them because the radeon tex format chooser will never chose them, but we get that format from the dri buffers (at least I assume we got it from there). This caused lots of piglit regressions (and probably lots of trouble outside piglit too). This fixes bug https://bugs.freedesktop.org/show_bug.cgi?id=92900. Tested-by: Ian Romanick <[email protected]> Acked-by: Alex Deucher <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Cc: "11.0" <[email protected]> (cherry picked from commit 983614dbede7b94cba1bad9f3e8627fc5e14bb91)
* meta/generate_mipmap: Only modify the draw framebuffer binding in ↵Ian Romanick2015-11-181-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | fallback_required Previously GL_FRAMEBUFFER was used. However, if GL_EXT_framebuffer_blit is supported (note: it is supported by every Mesa driver), this is *sometimes* an alias for GL_DRAW_FRAMEBUFFER (getters) and *sometimes* an alias for *both* GL_DRAW_FRAMEBUFFER and GL_READ_FRAMEBUFFER (setters). As a result, the code saved one binding but modified both. If the bindings were different, the GL_READ_FRAMEBUFFER would be incorrect on exit. Fixes the piglit fbo-generatemipmap-versus-READ_FRAMEBUFFER test. Ideally this function would use DSA functions and not modify the binding at all. However, that would be a much more intrusive change because _mesa_meta_bind_fbo_image would also need to be modified. _mesa_meta_bind_fbo_image has a lot of callers. Much of this code is about to get a major rework due to bug #92363, so I don't think it matters too much. In fact, I discovered this bug while working on the other bug. Le bon temps! Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> Cc: "10.6 11.0" <[email protected]> (cherry picked from commit c40a88b6c5a698e5297957e28cccf2ce23820caa)
* radeonsi: enable optimal raster config setting for fiji (v2)Alex Deucher2015-11-181-3/+9
| | | | | | | | | | | | Requires proper kernel tiling configuration so check the tiling config registers. v2: send the right version of the patch Reviewed-by: Marek Olšák <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected] (cherry picked from commit 00f554abba8c0f3b65af94365c15109c3b858486)
* nouveau: don't expose HEVC decoding supportIlia Mirkin2015-11-181-0/+1
| | | | | | Signed-off-by: Ilia Mirkin <[email protected]> Cc: [email protected] (cherry picked from commit f94e1d97381ec787c2abbbcd5265252596217e33)
* glsl: Allow implicit int -> uint conversions for the % operator.Kenneth Graunke2015-11-181-9/+28
| | | | | | | | | | | | | | | | | | GLSL 4.00 and GL_ARB_gpu_shader5 introduced a new int -> uint implicit conversion rule and updated the rules for modulus to use them. (In earlier languages, none of the implicit conversion rules did anything relevant, so there was no point in applying them.) This allows expressions such as: int foo; uint bar; uint mod = foo % bar; Cc: [email protected] Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]> (cherry picked from commit 511de1a80cedc0add386dad79cce56dd68d2f611)
* meta/generate_mipmap: Don't leak the sampler objectIan Romanick2015-11-181-0/+2
| | | | | | | Signed-off-by: Ian Romanick <[email protected]> Cc: "10.6 11.0" <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> (cherry picked from commit 758f12fd98dea9a9682becf2d496bd38ef3959e5)
* radeonsi: initialize SX_PS_DOWNCONVERT to 0 on StoneyMarek Olšák2015-11-181-0/+3
| | | | | | | | | | | | | otherwise the SX or CB blocks can go bananas Reviewed-by: Nicolai Hähnle <[email protected]> Cc: [email protected] (cherry picked from commit 40912dd91e96376517fb41bb4dc228b45fd1a01c) [Emil Velikov: resolve trivial conflicts] Signed-off-by: Emil Velikov <[email protected]> Conflicts: src/gallium/drivers/radeonsi/si_state.c
* nir/vars_to_ssa: Rework copy set handling in lower_copies_to_load_storeJason Ekstrand2015-11-181-1/+4
| | | | | | | | | | | | | | | | | | | | | | Previously, we walked through a given deref_node's copies and, after lowering the copy away, removed it from both the source and destination copy sets. This commit changes this to only remove it from the other node's copy set (not the one we're lowering). At the end of the loop, we just throw away the copy set for the node we're lowering since that node no longer has any copies. This has two advantages: 1) It's more efficient because we're doing potentially half as many set search operations. 2) It now properly handles copies from a node to itself. Perviously, it would delete the copy from the set when processing the destinatioon and then assert-fail when we couldn't find it for the source. Cc: "11.0" <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92588 Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Connor Abbott <[email protected]> (cherry picked from commit 226ba889a0f820b9f4b1132e379620d2688c96e7)
* i965/skl/gt4: Fix URB programming restriction.Ben Widawsky2015-11-181-0/+9
| | | | | | | | | | | | | | | The comment in the code details the restriction. Thanks to Ken for having a very helpful conversation with me, and spotting the blurb in the link I sent him :P. There are still stability problems for me on GT4, but this definitely helps with some of the failures. v2: Comment fixes Cc: [email protected] Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> (cherry picked from commit 55314c5be4cbf933ab7fbd20f6aa49207e04c946)
* r600: initialised PGM_RESOURCES_2 for ES/GSDave Airlie2015-11-182-0/+6
| | | | | | | | | | | | | This fixes the corruption on rendering that we are seeing in certain geometry shaders. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=91780 Reviewed-by: Alex Deucher <[email protected]> Tested / Reviewed-by: Glenn Kennard <[email protected]> Cc: "10.6" "11.0" <[email protected]> Signed-off-by: Dave Airlie <[email protected]> (cherry picked from commit df8af7d75155845d12d5a14a3a5ca644f07cb3b1)
* mesa/copyimage: allow width/height to not be multiples of blockIlia Mirkin2015-11-181-3/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For compressed textures, the image size is not necessarily a multiple of the block size (e.g. the last mip levels). Section 18.3.2 (Copying Between Images) of the OpenGL 4.5 Core Profile spec says: An INVALID_VALUE error is generated if the dimensions of either subregion exceeds the boundaries of the corresponding image object, or if the image format is compressed and the dimensions of the subregion fail to meet the alignment constraints of the format. and Section 8.7 (Compressed Texture Images) says: An INVALID_OPERATION error is generated if any of the following conditions occurs: * width is not a multiple of four, and width + xoffset is not equal to the value of TEXTURE_WIDTH. * height is not a multiple of four, and height + yoffset is not equal to the value of TEXTURE_HEIGHT. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92860 Signed-off-by: Ilia Mirkin <[email protected]> Acked-by: Alex Deucher <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Cc: [email protected] (cherry picked from commit 912babba7bf1abd3caa49f6372d581ae1afe7e84) [Emil Velikov: resolve trivial conflicts] Signed-off-by: Emil Velikov <[email protected]> Conflicts: src/mesa/main/copyimage.c
* vc4: Return NULL when we can't make our shadow for a sampler view.Eric Anholt2015-11-181-0/+4
| | | | | | | | I'm not sure what the caller does is appropriate (just have a NULL sampler at this slot), but it fixes the immediate crash. Cc: "11.0" <[email protected]> (cherry picked from commit 5980389bbf98b8186ba6a06392d92b82fa9efad3)