mesa.git - Unnamed repository; edit this file 'description' to name the repository.

	Commit message (Collapse)	Author	Age	Files	Lines
*	llvmpipe: scratch some special handling of vp_index/layer	Roland Scheidegger	2016-01-07	4	-38/+7
\| \| \| \| \| \| \| \| \|	It was actually slightly buggy (missing initialization / setup not dependent on new vs albeit I didn't see issues), but the case of non-existing attributes is now handled by draw emit code so don't need that anymore. Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
*	gallium/drivers: Remove unnecessary semicolons	Edward O'Callaghan	2016-01-06	2	-2/+2
\| \| \| \| \| \|	Found-by: Coccinelle Signed-off-by: Edward O'Callaghan <[email protected]> Reviewed-by: Brian Paul <[email protected]>
*	llvmpipe: Optimize lp_rast_triangle_32_3_16 for POWER8	Oded Gabbay	2016-01-06	1	-1/+141
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch converts the SSE-optimized lp_rast_triangle_32_3_16() to VMX/VSX. I measured the results on POWER8 machine with 32 cores at 3.4GHz and 16GB of RAM. FPS/Score Name Before After Delta ------------------------------------------------ openarena 16.35 16.7 2.14% xonotic 4.707 4.97 5.57% glmark2 didn't show a significant (more than 1%) difference. v2: Make sure code is build only on POWER8 LE machine Signed-off-by: Oded Gabbay <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
*	llvmpipe: Optimize BUILD_MASK(_LINEAR) for POWER8	Oded Gabbay	2016-01-06	1	-40/+110
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch converts the SSE-optimized build_mask_32() and build_mask_linear_32() to VMX/VSX. I measured the results on POWER8 machine with 32 cores at 3.4GHz and 16GB of RAM. FPS/Score Name Before After Delta ------------------------------------------------ glmark2 (score) 139.8 142.7 2.07% openarena and xonotic didn't show a significant (more than 1%) difference. v2: Make sure code is build only on POWER8 LE machine Signed-off-by: Oded Gabbay <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
*	llvmpipe: Optimize do_triangle_ccw for POWER8	Oded Gabbay	2016-01-06	1	-0/+100
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch converts the SSE optimization done in do_triangle_ccw to VMX/VSX. I measured the results on POWER8 machine with 32 cores at 3.4GHz and 16GB of RAM. FPS/Score Name Before After Delta ------------------------------------------------ glmark2 (score) 136.6 139.8 2.34% openarena 16.14 16.35 1.30% xonotic 4.655 4.707 1.11% v2: - Convert loads to use aligned loads - Make sure code is build only on POWER8 LE machine Signed-off-by: Oded Gabbay <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
*	gallium: add PIPE_CAP_TGSI_PACK_HALF_FLOAT to indicate UP2H/PK2H support	Ilia Mirkin	2016-01-03	1	-0/+1
\| \| \| \| \| \|	Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
*	gallium: add PIPE_CAP_DRAW_PARAMETERS	Ilia Mirkin	2015-12-30	1	-0/+1
\| \| \| \| \| \| \| \|	This allows the state tracker to know that the various draw parameters are available in vertex shaders. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
*	llvmpipe: fix layer/vp input into fs when not written by prior stages	Roland Scheidegger	2015-12-12	8	-53/+96
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	ARB_fragment_layer_viewport requires that if a fs reads layer or viewport index but it wasn't output by gs (or vs with other extensions), then it reads 0. This never worked for llvmpipe, and is surprisingly non-trivial to fix. The problem is the mechanism to handle non-existing outputs in draw is rather crude, it will simply redirect them to whatever is at output 0, thus later stages will just get garbage. So, rather than trying to fix this up (which looks non-trivial), fix this up in llvmpipe setup by detecting this case there and output a fixed zero directly. While here, also optimize the hw vertex layout a bit - previously if the gs outputted layer (or vp) and the fs read those inputs, we'd add them twice to the vertex layout, which is unnecessary. And do some minor cleanup, slots don't require that many bits, there was some bogus (but harmless) float/int mixup for psize slot too, make the slots all unsigned (we always put pos at pos zero thus everything else has to be positive if it exists), and make sure they are properly initialized (layer and vp index slot were not which looked fishy as they might not have got set back to zero when changing from a gs which outputs them to one which does not). This fixes the failures in piglit's arb_fragment_layer_viewport group (3 each for layer and vp). Reviewed-by: Jose Fonseca <[email protected]>
*	gallium/drivers: Sanitize NULL checks into canonical form	Edward O'Callaghan	2015-12-06	7	-7/+7
\| \| \| \| \| \| \| \| \| \|	Use NULL tests of the form `if (ptr)' or `if (!ptr)'. They do not depend on the definition of the symbol NULL. Further, they provide the opportunity for the accidental assignment, are clear and succinct. Signed-off-by: Edward O'Callaghan <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
*	gallium/drivers: Trivial code-style cleanup	Edward O'Callaghan	2015-12-06	5	-7/+7
\| \| \| \| \|	Signed-off-by: Edward O'Callaghan <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
*	llvmpipe: Make use of ARRAY_SIZE macro	Edward O'Callaghan	2015-12-06	2	-4/+4
\| \| \| \| \|	Signed-off-by: Edward O'Callaghan <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
*	llvmpipe: use provoking vertex for layer/viewport	Roland Scheidegger	2015-12-04	2	-17/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	d3d10 actually requires using provoking (first) vertex. GL is happy with any vertex (as long as we say it's undefined in the corresponding queries). Up to now we actually used vertex 0 for viewport index, and vertex 1 for layer (for tris), which really didn't make sense (probably a typo). Also,$ since we reorder vertices of clockwise triangle, that actually meant we used a different vertex depending if the traingle was cw or ccw (still ok by gl). However, it should be consistent with what draw (clip) does, and using provoking vertex seems like the sensible choice (draw clip will be fixed next as it is totally broken there). While here, also use the correct viewport always even when not needed in setup (we pass it down to jit fragment shader it might be needed there for getting correct near/far depth values). No piglit changes. Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
*	softpipe/llvmpipe: don't advertize support for ASTC	Roland Scheidegger	2015-11-24	1	-1/+2
\| \| \| \| \| \| \| \| \|	33339775565154040e0c4ea2e196217dccc08cdf added support for ASTC textures to gallium. They don't have any helpers hooked up for software decoding, however, so cannot support them in drivers relying on util code for decoding. Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
*	llvmpipe: don't test for unsupported formats in lp_test_format	Roland Scheidegger	2015-11-24	1	-0/+12
\| \| \| \| \| \| \| \| \| \| \| \| \|	Removing the fake format helpers (1c7d0a6aa4f5cb38af7e281e1e5437cd1a20f781) caused this to fail. These formats were never supported, but previously they would have asserted in the generated jit functions (which, due to lack of test cases for these formats, were never called) whereas we now assert when trying to build the jit function. So, skip them completely. This fixes https://bugs.freedesktop.org/show_bug.cgi?id=93092 Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
*	gallium: add PIPE_CAP_CLEAR_TEXTURE and clear_texture prototype	Ilia Mirkin	2015-11-11	1	-0/+1
\| \| \| \| \|	Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
*	llvmpipe: disable front updates for now	Dave Airlie	2015-11-08	1	-1/+1
\| \| \| \| \| \| \| \|	As pointed out by Emil, this sometimes hangs, appears to be due to threading need to rethink how this stuff works for llvmpipe. Signed-off-by: Dave Airlie <[email protected]>
*	llvmpipe: disable texture cache	Roland Scheidegger	2015-11-05	1	-1/+1
\| \| \| \|	There are some weird problems with 8-wide vectors.
*	llvmpipe: add cache for compressed textures	Roland Scheidegger	2015-11-04	7	-10/+109
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	compressed textures are very slow because decoding is rather complex (and because there's no jit code code to decode them too for non-technical reasons). Thus, add some texture cache which holds a couple of decoded blocks. Right now this handles only s3tc format albeit it could be extended to work with other formats rather trivially as long as the result of decode fits into 32bit per texel (ideally, rgtc actually would decode to more than 8 bits per channel, but even then making it work for it shouldn't be too difficult). This can improve performance noticeably but don't expect wonders (uncompressed is unsurprisingly still faster). It's also possible it might be slower in some cases (using nearest filtering for example or if there's otherwise not many cache hits, the cache is only direct mapped which isn't great). Also, actual decode of a block relies on util code, thus even though always full blocks are decoded it is done texel by texel - this could obviously benefit greatly from simd-optimized code decoding full blocks at once... Note the cache is per (raster) thread, and currently only used for fragment shaders. Reviewed-by: Jose Fonseca <[email protected]>
*	llvmpipe: use simple coeffs calc for 128bit vectors	Oded Gabbay	2015-11-04	1	-1/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There are currently two methods in llvmpipe code to calculate coeffs to be used as inputs for the fragment shader. The two methods use slightly different ways to do the floating point calculations and thus produce slightly different results. The decision which method to use is determined by the size of the vector that is used by the platform. For vectors with size of more than 128bit, a single-step method is used, in which coeffs_init_simple() + attribs_update_simple() are called. For vectors with size of 128bit or less, a two-step method is used, in which coeffs_init() + attribs_update() are called. This causes some piglit tests (clip-distance-bulk-copy, interface-vs-unnamed-to-fs-unnamed) to fail when using platforms with 128bit vectors (such as ppc64le or x86-64 without AVX). This patch makes platforms with 128bit vectors use the single-step method (aka "simple" method) instead of the two-step method. This would make the resulting coeffs identical between more platforms, make sure the piglit tests passes, and make debugging and maintainability a bit easier as the generated LLVM IR will be the same for more platforms. The performance impact is negligible for x86-64 without AVX, and basically non-existent for ppc64le, as it can be seen from the following benchmarking results: - glxspheres, on ppc64le: - original code: 4.892745317 frames/sec 5.460303857 Mpixels/sec - with the patch: 4.932083873 frames/sec 5.504205571 Mpixels/sec - Additional 0.8% performance boost - glxspheres, on x86-64 without AVX: - original code: 20.16418809 frames/sec 22.50323395 Mpixels/sec - with the patch: 20.31328989 frames/sec 22.66963152 Mpixels/sec - Additional 0.74% performance boost - glmark2, on ppc64le: - original code: score of 58 - with my change: score of 57 - glmark2, on x86-64 without AVX: - original code: score of 175 - with the patch: score of 167 - Impact of of -4.5% on performance - OpenArena, on ppc64le: - original code: 3398 frames 1719.0 seconds 2.0 fps 255.0/505.9/2773.0/0.0 ms - with the patch: 3398 frames 1690.4 seconds 2.0 fps 241.0/497.5/2563.0/0.2 ms - 29 seconds faster with the patch, which is about 2% - OpenArena, on x86-64 without AVX: - original code: 3398 frames 239.6 seconds 14.2 fps 38.0/70.5/719.0/14.6 ms - with the patch: 3398 frames 244.4 seconds 13.9 fps 38.0/71.9/697.0/14.3 ms - 0.3 fps slower with the patch (about 2%) Additional details can be found at: http://lists.freedesktop.org/archives/mesa-dev/2015-October/098635.html Signed-off-by: Oded Gabbay <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
*	gallium/swrast: fix front buffer blitting. (v2)	Dave Airlie	2015-10-31	1	-6/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	So I've known this was broken before, cogl has a workaround for it from what I know, but with the gallium based swrast drivers BlitFramebuffer from back to front or vice-versa was pretty broken. The legacy swrast driver tracks when a front buffer is used and does the get/put images when it is mapped/unmapped, so this patch attempts to add the same functionality to the gallium drivers. It creates a new context interface to denote when a front buffer is being created, and passes a private pointer to it, this pointer is then used to decide on map/unmap if the contents should be updated from the real frontbuffer using get/put image. This is primarily to make gtk's gl code work, the only thing I've tested so far is the glarea test from https://github.com/ebassi/glarea-example.git v2: bump extension version, check extension version before calling get image. (Ian) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91930 Cc: <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
*	gallium: add PIPE_CAP_COPY_BETWEEN_COMPRESSED_AND_PLAIN_FORMATS	Marek Olšák	2015-10-28	1	-0/+1
\| \| \| \| \| \|	For ARB_copy_image. Reviewed-by: Brian Paul <[email protected]>
*	llvmpipe: fix using non-zero layer in non-array view from array resource	Roland Scheidegger	2015-10-24	2	-8/+8
\| \| \| \| \| \| \| \| \| \| \|	Just need to use resource target not view target when calculating first-layer based mip offsets. (This is a gl specific problem since d3d10 does not distinguish between non-array and array resources neither at the resource nor view level, only at the shader level.) Fixes new piglit arb_texture_view sampling-2d-array-as-2d-layer test. Reviewed-by: Dave Airlie <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
*	gallium: add PIPE_CAP_SHAREABLE_SHADERS	Marek Olšák	2015-10-20	1	-0/+1
\| \| \| \| \| \|	I'll let drivers figure out how to do it. Reviewed-by: Ilia Mirkin <[email protected]>
*	gallium: add per-sample interpolation control into rasterizer statOAe	Marek Olšák	2015-10-03	1	-0/+1
\| \| \| \| \| \| \| \|	Required by ARB_sample_shading for drivers that don't want a shader variant in st/mesa. Reviewed-by: Ilia Mirkin <[email protected]> Acked-by: Roland Scheidegger <[email protected]>
*	gallium: add PIPE_CAP_TGSI_TXQS to let st know if TXQS is supported	Ilia Mirkin	2015-09-13	1	-0/+1
\| \| \| \| \|	Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Glenn Kennard <[email protected]>
*	gallium: add flags parameter to pipe_screen::context_create	Marek Olšák	2015-08-26	2	-2/+4
\| \| \| \| \| \| \| \|	This allows creating compute-only and debug contexts. Reviewed-by: Brian Paul <[email protected]> Acked-by: Christian König <[email protected]> Acked-by: Alex Deucher <[email protected]>
*	gallium: add an interface for EXT_depth_bounds_test	Marek Olšák	2015-08-14	1	-0/+1
\| \| \| \| \|	Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Brian Paul <[email protected]>
*	gallium: add support for GLES texture float extensions (v3)	Marek Olšák	2015-08-14	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \|	Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74329 v2: add a CAP for half floats drivers should not expose the CAPs if they don't support the formats v3: update relnotes Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
*	gallium: replace INLINE with inline	Ilia Mirkin	2015-07-21	17	-58/+58
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Generated by running: git grep -l INLINE src/gallium/ \| xargs sed -i 's/\bINLINE\b/inline/g' git grep -l INLINE src/mesa/state_tracker/ \| xargs sed -i 's/\bINLINE\b/inline/g' git checkout src/gallium/state_trackers/clover/Doxyfile and manual edits to src/gallium/include/pipe/p_compiler.h src/gallium/README.portability to remove mentions of the inline define. Signed-off-by: Ilia Mirkin <[email protected]> Acked-by: Marek Olšák <[email protected]>
*	gallium: add PIPE_CAP_MAX_SHADER_PATCH_VARYINGS	Marek Olšák	2015-07-16	1	-0/+1
\| \| \| \|	Reviewed-by: Ilia Mirkin <[email protected]>
*	gallium: remove redundant pipe_context::fence_signalled	Marek Olšák	2015-07-05	1	-13/+0
\| \| \| \| \| \|	fence_finish(timeout=0) does the same thing Reviewed-by: Brian Paul <[email protected]>
*	gallium: handle fence_finish timeout in various drivers	Marek Olšák	2015-07-05	1	-0/+3
\| \| \| \| \| \|	I copied what fence_signalled does. Reviewed-by: Brian Paul <[email protected]>
*	llvmpipe: Truncate the binned constants to max const buffer size.	Jose Fonseca	2015-06-19	1	-1/+4
\| \| \| \| \| \| \| \| \|	Tested with Ilia Mirkin's gzdoom.trace and "arb_uniform_buffer_object-maxuniformblocksize fsexceed" piglit test without my earlier fix to fail linkage when UBO exceeds GL_MAX_UNIFORM_BLOCK_SIZE. Reviewed-by: Roland Scheidegger <[email protected]>
*	llvmpipe: simplify lp_resource_copy()	Brian Paul	2015-06-10	1	-59/+2
\| \| \| \| \| \| \| \| \| \|	Just implement it in terms of util_resource_copy_region(). Both the original code and util_resource_copy_region() boil down to mapping, calling util_copy_box() and unmapping. No piglit regressions. This will also help to implement GL_ARB_copy_image. Reviewed-by: Jose Fonseca <[email protected]>
*	llvmpipe: Implement stencil export	Roland Scheidegger	2015-06-04	4	-15/+21
\| \| \| \| \| \| \| \| \| \| \| \| \|	Pretty trivial, fixes the issue that we're expected to be able to blit stencil surfaces (as the blit just relies on util blitter code which needs stencil export to do it). 2 piglits skip->pass, 11 fail->pass v2: prettify, keep different stencil ref value handling out of depth/stencil test itself. Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
*	llvmpipe: (trivial) add parantheses in (!x == y) expression	Roland Scheidegger	2015-05-25	1	-1/+1
\| \| \| \| \| \| \|	Apparently some compilers think we probably wanted to do !(x == y) instead and issue a warning, so just shut it up... No functional change, obviously. Cc: <[email protected]>
*	gallium/drivers: Add extern "C" wrappers to public entry	Alexander von Gluck IV	2015-05-15	1	-0/+8
\| \| \| \|	Reviewed-by: Brian Paul <[email protected]>
*	llvmpipe: enable ARB_texture_view	Roland Scheidegger	2015-05-13	3	-9/+44
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	All the functionality was pretty much there, just not tested. Trivially fix up the missing pieces (take target info from view not resource), and add some missing bits for cubes. Also add some minimal debug validation to detect uninitialized target values in the view... 49 new piglits, 47 pass, 2 fail (both related to fake multisampling, not texture_view itself). No other piglit changes. v2: move sampler view validation to sampler view creation, update docs. Reviewed-by: Brian Paul <[email protected]>
*	gallium: add PIPE_CAP_DEVICE_RESET_STATUS_QUERY	Marek Olšák	2015-05-12	1	-0/+1
\| \| \| \|	Reviewed-by: Kenneth Graunke <[email protected]>
*	gallium: make pipe_context::begin_query return a boolean	Samuel Pitoiset	2015-05-06	1	-1/+2
\| \| \| \| \| \| \| \| \|	GL_AMD_performance_monitor must return an error when a monitoring session cannot be started. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Martin Peres <[email protected]>
*	Fix a few typos	Zoë Blade	2015-04-27	2	-2/+2
\| \| \| \|	Reviewed-by: Francisco Jerez <[email protected]>
*	gallivm: don't use control flow when doing indirect constant buffer lookups	Roland Scheidegger	2015-04-09	1	-2/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	llvm goes crazy when doing that, using way more memory and time, though there's probably more to it - this points to a very much similar issue as fixed in 8a9f5ecdb116d0449d63f7b94efbfa8b205d826f. In any case I've seen a quite plain looking vertex shader with just ~50 simple tgsi instructions (but with a dozen or so such indirect constant buffer lookups) go from a terribly high ~440ms compile time (consuming 25MB of memory in the process) down to a still awful ~230ms and 13MB with this fix (with llvm 3.3), so there's still obvious improvements possible (but I have no clue why it's so slow...). The resulting shader is most likely also faster (certainly seemed so though I don't have any hard numbers as it may have been influenced by compile times) since generally fetching constants outside the buffer range is most likely an app error (that is we expect all indices to be valid). It is possible this fixes some mysterious vertex shader slowdowns we've seen ever since we are conforming to newer apis at least partially (the main draw loop also has similar looking conditionals which we probably could do without - if not for the fetch at least for the additional elts condition.) v2: use static vars for the fake bufs, minor code cleanups Reviewed-by: Jose Fonseca <[email protected]>
*	llvmpipe: enable ARB_texture_gather	Roland Scheidegger	2015-03-31	1	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \|	Just announce support for 4 components. While here also increase the max/min texel offsets (the limit is completely artificial, was chosen because that's what other hardware did, however there's other drivers using larger limits). Over a thousand little piglits skip->pass. v2: update docs/GL3.txt Reviewed-by: Jose Fonseca <[email protected]>
*	gallivm: simplify sampler interface	Roland Scheidegger	2015-03-31	1	-26/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This has got a bit out of control with more and more parameters added. Worse, whenever something in there changes all callees have to be updated for that, even though they don't really do much with any parameter in there except pass it on to the actual sampling function. Hence simply put almost everything into a struct. Also instead of relying on some arguments being NULL, be explicit and set this in a key (which is just reused for function generation for simplicity). (The code still relies on them being NULL in the end for now.) Technically there is a minimal functional change here for shadow sampling: if shadow sampling is done is now determined explicitly by the texture function (either sample_c or the gl-style tex func inherit this from target) instead of the static texture state. These two should always match, however. Otherwise, it should generate all the same code. Reviewed-by: Jose Fonseca <[email protected]>
*	llvmpipe: simplify address calculation for 4x4 blocks	Roland Scheidegger	2015-03-28	4	-76/+35
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	These functions looked quite complicated, even though what they actually did was trivial (ever since we dropped swizzled rendering). Also drop lookup of format block per bytes done for each block, and do it once per scene instead. This improves everybody's favorite "benchmark" by 3% or so, though lp_rast_shade_quads_all() which calls this shows up still quite high for a function which does little more than call the jit function. (This would most likely be much better handled by the jit function itself, the strides are passed through anyway already, though for being able to handle layers it would definitely add some complexity.) Reviewed-by: Jose Fonseca <[email protected]>
*	gallivm: pass jit_context pointer through to sampling	Roland Scheidegger	2015-03-27	3	-18/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The callbacks used for getting the dynamic texture/sampler state were using the jit_context from the generated jit function. This works just fine, however that way it's impossible to generate separate functions for texture sampling, as will be done in the next commit. Hence, pass this pointer through all interfaces so it can be passed to a separate function (technically, it would probably be possible to extract this pointer from the current function instead, but this feels hacky and would probably require some more hacks if we'd use real functions instead of inlining all shader functions at some point). There should be no difference in the generated code for now. Reviewed-by: Jose Fonseca <[email protected]>
*	gallium: implement get_device_vendor() for existing drivers	Giuseppe Bilotta	2015-03-23	1	-0/+1
\| \| \| \| \| \| \| \| \|	The only hackish ones are llvmpipe and softpipe, which currently return the same string as for get_vendor(), while ideally they should return the CPU vendor. Signed-off-by: Giuseppe Bilotta <[email protected]> Reviewed-by: Tom Stellard <[email protected]>
*	llvmpipe: use global llvm context for PIPE_SUBSYSTEM_EMBEDDED	Roland Scheidegger	2015-03-21	1	-0/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There's 2 reasons why we'd want to use the global context: 1) There still seems to be one memory "leak" left when using multiple llvm contexts (it is not a true leak as the memory disappears into some still addressable pool but nevertheless the memory consumption grows). See http://cgit.freedesktop.org/~jrfonseca/llvm-jitstress/ 2) These contexts get kinda big - even when disposing modules etc. after compiling a shader the LLVMContext can easily be over 100kB. So when there's lots of llvm contexts arounds it adds up. The downside is that at least right now this is absolutely not thread safe, so this only works safely in environments where multiple pipe contexts are not used concurrently. Reviewed-by: Jose Fonseca <[email protected]>
*	scons: Use -Werror MSVC compatibility flags per-directory.	Jose Fonseca	2015-03-04	1	-0/+2
\| \| \| \| \| \|	Matching what we already do with autotools builds. Reviewed-by: Brian Paul <[email protected]>
*	configure: Leverage gcc warn options to enable safe use of C99 features ↵	Jose Fonseca	2015-03-03	1	-2/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	where possible. The main objective of this change is to enable Linux developers to use more of C99 throughout Mesa, with confidence that the portions that need to be built with MSVC -- and only those portions --, stay portable. This is achieved by using the appropriate -Werror= options only on the places they need to be used. Unfortunately we still need MSVC 2008 on a few portions of the code (namely llvmpipe and its dependencies). I hope to eventually eliminate this so that we can use C99 everywhere, but there are technical/logistic challenges (specifically, newer Windows SDKs no longer bundle MSVC, instead require a full installation of Visual Studio, and that has hindered adoption of newer MSVC versions on our build processes.) Thankfully we have more directy control over our OpenGL driver, which is why we're now able to migrate to MSVC 2013 for most of the tree. Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>