mesa.git

	Commit message	Author	Age	Files	Lines
*	llvmpipe: add support for ssbo to the fragment shader jit.	Dave Airlie	2019-07-07	1	-1/+5
\| \| \| \| \| \|	This just adds the ssbo ptrs to the jit fragment shader api. Reviewed-by: Roland Scheidegger <[email protected]>
*	gallivm: add ssbo pointers to the soa build api.	Dave Airlie	2019-07-07	1	-1/+1
\| \| \| \| \| \|	Need to pass ssbo + ssbo size pointers just like constants. Reviewed-by: Roland Scheidegger <[email protected]>
*	llvmpipe: make remove_shader_variant static.	Dave Airlie	2019-06-21	1	-1/+1
\| \| \| \| \| \| \|	this isn't used outside this file. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
*	gallium: split depth_clip into depth_clip_near & depth_clip_far	Marek Olšák	2018-09-06	1	-1/+1
\| \| \| \|	for AMD_depth_clamp_separate.
*	llvmpipe: improve rasterization discard logic	Roland Scheidegger	2018-05-23	1	-22/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This unifies the explicit rasterization discard as well as the implicit rasterization disabled logic (which we need for another state tracker), which really should do the exact same thing. We'll now toss out the prims early on in setup with (implicit or explicit) discard, rather than do setup and binning with them, which was entirely pointless. (We should eventually get rid of implicit discard, which should also enable us to discard stuff already in draw, hence draw would be able to skip the pointless clip and fallback stages in this case.) We still need separate logic for only null ps - this is not the same as rasterization discard. But simplify the logic there and don't count primitives simply when there's an empty fs, regardless of depth/stencil tests, which seems perfectly acceptable by d3d10. While here, also fix statistics for primitives if face culling is enabled. No piglit changes. Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
*	llvmpipe: fix check for a no-op shader	Brian Paul	2018-05-18	1	-2/+4
\| \| \| \| \| \| \| \| \| \|	The tgsi_info.num_tokens fix broke llvmpipe's detection of no-op shaders. Fix the code to check for num_instructions <= 1 instead. Fixes: 8fde9429c36b75 ("tgsi: fix incorrect tgsi_shader_info::num_tokens computation") Tested-by: Roland Scheidegger <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
*	util: Move util_is_power_of_two to bitscan.h and rename to util_is_power_of_two_or_zero	Ian Romanick	2018-03-29	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	The new name makes the zero-input behavior more obvious. The next patch adds a new function with different zero-input behavior. Signed-off-by: Ian Romanick <[email protected]> Suggested-by: Matt Turner <[email protected]> Reviewed-by: Alejandro Piñeiro <[email protected]>
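The zero-input distinction that the rename makes explicit can be sketched as follows. This is a Python illustration, not Mesa's C code; both function names are chosen here for illustration, and the name of the second, zero-rejecting variant is an assumption since the message does not give it.

```python
def is_power_of_two_or_zero(v):
    """True when at most one bit of the non-negative integer v is set.

    Zero has no bits set, so it passes as well; hence "or zero".
    """
    return (v & (v - 1)) == 0


def is_power_of_two_nonzero(v):
    """True only for exact powers of two; zero is rejected (hypothetical name)."""
    return v != 0 and (v & (v - 1)) == 0
```

Both helpers agree on every input except 0, which is the only case the new name is meant to flag.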
*	tgsi/scan: use wrap-around shift behavior explicitly for file_mask	Roland Scheidegger	2018-03-06	1	-1/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The comment said it will only represent the lowest 32 regs. This was not entirely true in practice, since at least on x86 you'll get masked shifts (unless the compiler could recognize it already and toss it out). It turns out this actually works out alright (presumably no one uses it for temp regs) when increasing max sampler views, so make that behavior explicit. Albeit it feels a bit hacky (but in any case, explicit behavior there is better than undefined behavior). Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Brian Paul <[email protected]>
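A minimal sketch of the explicit wrap-around behavior, written in Python rather than the C used in tgsi/scan. The helper names are hypothetical; the only thing taken from the message is that the shift count wraps so that register indices of 32 and above alias onto the 32 available bits.

```python
def file_mask_bit(index):
    """Bit for register `index` in a 32-bit file mask.

    The shift count is reduced modulo 32 explicitly, so index 32 maps to
    bit 0, index 33 to bit 1, and so on, instead of relying on whatever
    the hardware does with an oversized shift count.
    """
    return 1 << (index & 31)


def mark_register_used(mask, index):
    """Return `mask` with the (wrapped) bit for `index` set, kept to 32 bits."""
    return (mask | file_mask_bit(index)) & 0xFFFFFFFF
```

With wrapping, a set bit means "at least one register with this index modulo 32 is used", so the mask stays conservative for files with more than 32 registers.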
*	gallivm/llvmpipe: add const qualifiers on sampler variables	Brian Paul	2018-02-01	1	-1/+1
\| \| \| \| \| \| \|	Once a lp_build_sampler_soa or lp_build_sampler_aos object is created, it should never be modified. Found by inspection. Reviewed-by: Roland Scheidegger <[email protected]>
*	util: move os_time.[ch] to src/util	Nicolai Hähnle	2017-11-09	1	-1/+1
\| \| \| \|	Reviewed-by: Marek Olšák <[email protected]>
*	llvmpipe: handle shader sample mask output	Roland Scheidegger	2017-10-18	1	-2/+24
\| \| \| \| \| \| \| \| \|	This probably isn't all that useful for GL, but there are apis where sample_mask is a valid output even without msaa. Just discard the pixel if the sample_mask doesn't include the bit for sample 0. Reviewed-by: Brian Paul <[email protected]>
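The discard rule described here can be sketched per pixel as below. This is a Python illustration with hypothetical names; the real code operates on LLVM vector masks inside the generated fragment shader.

```python
def apply_sample_mask(exec_mask, sample_masks):
    """Return the per-pixel execution mask after applying sample_mask outputs.

    exec_mask    -- list of bools, True where the pixel is still alive
    sample_masks -- list of ints, the shader's sample_mask output per pixel
    """
    if len(exec_mask) != len(sample_masks):
        raise ValueError("one sample_mask value is needed per pixel")
    # Without msaa only sample 0 exists, so only bit 0 of the output matters.
    return [alive and (mask & 1) != 0
            for alive, mask in zip(exec_mask, sample_masks)]
```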
*	llvmpipe: silence 'variable may be used uninitialized' warnings	Brian Paul	2017-10-03	1	-1/+1
\| \| \| \|	Reviewed-by: Charmaine Lee <[email protected]>
*	llvmpipe, draw: improve shader cache debugging	Roland Scheidegger	2017-09-09	1	-9/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With GALLIVM_DEBUG=perf set, output the relevant stats for shader cache usage whenever we have to evict shader variants. Also add some output when shaders are deleted (but not with the perf setting to keep this one less noisy). While here, also don't delete that many shaders when we have to evict. For fs, there's potentially some cost if we have to evict due to the required flush, however certainly shader recompiles have a high cost too so I don't think evicting one quarter of the cache size makes sense (and, if we're evicting based on IR count, we probably typically evict only very few or just one shader too). For vs, I'm not sure it even makes sense to evict more than one shader at a time, but keep the logic the same for now. Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Brian Paul <[email protected]>
*	gallium: rename util_dump_* to util_str_* for enum-to-string conversion	Nicolai Hähnle	2017-08-02	1	-21/+21
\| \| \| \| \| \| \|	This is mostly mechanical search-and-replace, plus touching up the macros in u_dump_defines.c manually a bit. Reviewed-by: Marek Olšák <[email protected]>
*	gallium: s/uint/enum pipe_shader_type/ for set_constant_buffer()	Brian Paul	2017-03-08	1	-1/+1
\| \| \| \|	Reviewed-by: Edward O'Callaghan <[email protected]>
*	llvmpipe: do transpose/untwiddle after conversion for 8bit formats	Roland Scheidegger	2017-01-06	1	-7/+143
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Generally we should do transpose after conversion, if the format has less than 32 bits per channel (if it has 32 bits, conversion is going to be a no-op anyway...). This is obviously because there's less vectors to deal with. Though the advantage for 16 bit formats isn't that big, and in fact with AVX there isn't really any (as the 32bit unpacks can be done with 256bit, but the smaller ones cannot, although that would change again with proper AVX2 support). Only makes sense for 2d and not 1d cases. And to keep things easy, only handle 1, 2 and 4 channels (rgbx is just fine). For rgba unorm8 format the backend conversion sums up to these instruction totals (not counting the movs for SSE2 due to 2-op syntax - generally every 2 unpacks need an additional mov).
	                     SSE2               AVX
	transpose:           32 unpack          16 unpack
	untwiddle:           0                  8 (128bit low/high permutes)
	convert:             16 mul + 16 cvt    8 mul + 8 cvt
	32->8bit:            12 pack            8 (128bit extract) + 12 pack
	When doing transpose/untwiddle afterwards we get:
	convert:             16 mul + 16 cvt    8 mul + 8 cvt
	32->8bit:            12 pack            8 (128bit extract) + 12 pack
	transpose/untwiddle: 12 unpack          12 unpack
	So for SSE2, this drops 20 unpacks (total instruction count 76->56) whereas for AVX it replaces the 16 256bit unpacks with 8 128bit ones and drops the 8 lo/hi permutes (in total 60->48). (Albeit to be fair, the permutes could be dropped even when doing the transpose first, they are extremely pointless but we'd need to be able to tell lp_build_conv to reorder the vectors, for AVX2 we're going to need to be able to tell lp_build_conv about ordering in any case.) (With different ordering going into conversion, it would be possible to do 4 unpacks + 4 pshufbs instead of 12 unpacks, but that might not be better, and not all cpus can do it. Proper AVX2 support should eliminate the 8 128bit extracts, reduce these 12 packs to 6 and the 12 unpacks to 2 pshufb + 2 permq ideally (+ 2 final 128bit extracts).) Reviewed-by: Jose Fonseca <[email protected]>
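The totals quoted in the message (76->56 for SSE2, 60->48 for AVX) can be re-derived by summing the per-stage counts it lists. The tally below is only a bookkeeping sketch in Python; the dictionary layout and helper names are made up here, while the numbers are copied from the message.

```python
# Per-stage instruction counts with transpose/untwiddle done first.
BEFORE = {
    "SSE2": {"transpose": 32, "untwiddle": 0, "convert": 16 + 16, "32->8bit": 12},
    "AVX": {"transpose": 16, "untwiddle": 8, "convert": 8 + 8, "32->8bit": 8 + 12},
}

# Per-stage instruction counts with transpose/untwiddle done after conversion.
AFTER = {
    "SSE2": {"convert": 16 + 16, "32->8bit": 12, "transpose/untwiddle": 12},
    "AVX": {"convert": 8 + 8, "32->8bit": 8 + 12, "transpose/untwiddle": 12},
}


def total(counts):
    """Sum the instruction counts of all stages for one instruction set."""
    return sum(counts.values())


def savings(isa):
    """Instructions saved for `isa` by moving the transpose after conversion."""
    return total(BEFORE[isa]) - total(AFTER[isa])
```

Summing the stages this way gives a saving of 20 instructions for SSE2 and 12 for AVX.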
*	llvmpipe: use alpha from already converted color if possible	Roland Scheidegger	2017-01-06	1	-11/+32
\| \| \| \| \| \| \| \| \| \| \| \| \|	For rgbx formats, there is no point in doing alpha conversion again (and with different transpose even, so llvm can't eliminate it). Albeit it looks like there's some minimal changes needed in the blend code (found by code inspection, no test seemed to complain) if we do this - the blend factors are already sanitized if we have no destination alpha, however for src_alpha_saturate it looks like it still might make a difference (note that we forced has_alpha to true before for some formats and nothing complained, but this seems safer). Reviewed-by: Jose Fonseca <[email protected]>
*	llvmpipe: use scalar load instead of vectors for small vectors in fs backend	Roland Scheidegger	2017-01-06	1	-6/+50
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	llvm has _huge_ problems trying to load things like <4 x i8> vectors and stitching such loads together to form 128bit vectors. My understanding of the problem is that the type legalizer tries to extend that to really a <4 x i32> vector and not a <16 x i8> vector with the 4 elements first then followed by padding, so the shuffles for then combining things together are more or less impossible - you can in fact see the pmovzxd llvm generates. Pre-4.0 llvm just gives up on it completely and does a 30+ pextrb/pinsrb sequence instead. It looks like current llvm has fixed this behavior (my guess would be due to better shuffle combination and load/shuffle folds), but we can avoid this by just loading as <1 x i32> values, combine that and only cast at the end. (I suspect it might also work if we'd pad the loaded vectors immediately before shuffling them together, instead of directly stitching 2 such vectors together pairwise before combining the pair. But this _might_ lose the ability to load the values directly into their right place in the vector with pinsrd.). But using 32bit values is probably easier for llvm as it will never give it funny ideas how the vector should look like. (This is possibly only a problem for 1x8bit formats, since 2x8bit will end up fetching 64bit hence only two vectors are stitched together, not 4, but we use the same strategy anyway.) Reviewed-by: Jose Fonseca <[email protected]>
*	gallivm: implement aos unpack (to unorm8) for small unorm formats	Roland Scheidegger	2017-01-05	1	-5/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Using bit replication. This path now resembles something which might make sense. (The logic was mostly copied from llvmpipe fs backend.) I am not convinced though it is actually faster than SoA sampling (actually I'm quite certain it's always a loss with AVX). With SoA it's just shift/mask/cvt/mul for getting the colors, whereas there's still roughly 3 shifts, 3 or/and per channel for AoS (i.e. for SoA it's exactly the same as it would be for a rgba8 format, whereas the extra effort for AoS is significant). The filtering might still be faster (albeit with FMA the instruction count gets down quite a bit there on the SoA float filtering path on new cpus). And those small unorm formats often don't have an alpha channel (which makes things worse relatively for AoS path). (This also fixes a trivial bug in the llvmpipe fs code this was derived from, albeit it was only relevant for 4-bit channels.) Reviewed-by: Jose Fonseca <[email protected]>
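Bit replication, the technique named at the start of this message, widens a small unorm channel to 8 bits by repeating its bit pattern into the vacated low bits. The Python sketch below shows the scalar idea only; the function name is hypothetical and the gallivm code works on vectors of packed pixels instead.

```python
def unorm_to_unorm8(value, bits):
    """Expand a `bits`-wide unorm channel (1 <= bits <= 8) to 8 bits.

    The value is shifted to the top of the byte and its own high bits are
    copied into the low bits, so the maximum input maps to 255 and zero
    maps to 0 without any multiply or divide.
    """
    if not 1 <= bits <= 8:
        raise ValueError("bits must be between 1 and 8")
    if not 0 <= value < (1 << bits):
        raise ValueError("value does not fit in the given bit width")
    result = value << (8 - bits)
    filled = bits  # number of valid high bits in `result` so far
    while filled < 8:
        result |= result >> filled
        filled *= 2
    return result & 0xFF
```

For a 5-bit channel this reduces to `(v << 3) | (v >> 2)`, which is one shift-and-or step; narrower channels need more steps, in line with the message's remark about several shifts and or/and operations per channel.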
*	llvmpipe: (trivial) minimally simplify mask construction	Roland Scheidegger	2017-01-05	1	-7/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	simd instruction sets usually have comparisons for equal, not unequal. So use a different comparison against the mask itself - which also means we don't need an all-zero as well as an all-one (for the pxor) reg. Also add code to avoid scalar expansion of i1 values which we definitely shouldn't do. There are problems with this though with llvm select interaction, so it's disabled (basically using llvm select instead of intrinsics may still produce atrocious code, even in cases where we figured it should not, albeit I think this could probably be fixed with some better selection of optimization passes, but I have zero idea there really). Reviewed-by: Jose Fonseca <[email protected]>
*	llvmpipe: Fix build after removal of deprecated attribute API v2	Aaron Watry	2016-11-09	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	Applies on top of v3 of Tom's gallivm change. v2: - Tom Stellard: Use enums instead of strings. Reviewed-by: Nicolai Hähnle <[email protected]> Signed-off-by: Aaron Watry <[email protected]> CC: Tom Stellard <[email protected]> CC: Jan Vesely <[email protected]>
*	llvmpipe: fix issues with depth clamp	Roland Scheidegger	2016-08-20	1	-45/+70
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We only did depth clamp when the value was written from the fs. This is very wrong both for d3d10 and GL, and only passed the corresponding piglit test due to pure luck (it no longer does with the enhanced test). Also, interpolation clamped values to 1.0 always, which can legitimately happen if depth clip is disabled, so fix that as well (untested). There is one unresolved issue left, d3d10 always does depth clamping, whereas GL does not (but does [0,1] clamp instead for fs depth outputs) - this information isn't in any gallium state object, leave it as-is for now (though it looks like llvmpipe misses the [0,1] clamp as well). This (with the previous patch) fixes piglit depth-clamp-range test. Reviewed-by: Jose Fonseca <[email protected]>
*	gallium: make constant_buffer const	Rob Clark	2016-06-20	1	-1/+1
\| \| \| \| \|	Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
*	llvmpipe: hack-fix bugs due to bogus bind flags	Roland Scheidegger	2016-06-14	1	-2/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The gallium contract would be that bind flags must indicate all possible bindings a resource might get used, but fact is the mesa state tracker does not set bind flags correctly, and this is more or less unfixable due to GL. This caused a bug with piglit arb_uniform_buffer_object-rendering-dsa since 6e6fd911da8a1d9cd62fe0a8a4cc0fb7bdccfe02 - the commit is correct, but it caused us to miss updates to fs UBOs completely, since the corresponding buffer didn't have the appropriate bind flag set (thus we wouldn't check if it is indeed currently bound). See the discussion about this starting here: https://lists.freedesktop.org/archives/mesa-dev/2016-June/119829.html So, update the bind flags when we detect such usage. Note we update this value for now only in places which matter for us - that is creating sampler/surface view, or binding constant buffer. There's plenty more places (setting streamout buffers, vertex/index buffers, ...) where things can be set with the wrong bind flags, but the bind flags there never matter. While here also make sure we only set dirty constant bit when it's a fs constant buffer - totally doesn't matter if it's vs/gs. Reviewed-by: Jose Fonseca <[email protected]>
*	llvmpipe: s/Elements/ARRAY_SIZE/	Brian Paul	2016-04-27	1	-5/+5
\| \| \| \|	Reviewed-by: Jose Fonseca <[email protected]>
*	gallium: merge PIPE_SWIZZLE_* and UTIL_FORMAT_SWIZZLE_*	Marek Olšák	2016-04-22	1	-1/+1
\| \| \| \| \| \| \| \|	Use PIPE_SWIZZLE_* everywhere. Use X/Y/Z/W/0/1 instead of RED, GREEN, BLUE, ALPHA, ZERO, ONE. The new enum is called pipe_swizzle. Acked-by: Jose Fonseca <[email protected]>
*	llvmpipe: (trivial) initialize src1_alpha var to NULL	Roland Scheidegger	2016-04-15	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	The blend code would do a conditional assignment based on it, causing valgrind to complain. Since that variable was actually unused in this case, this doesn't fix anything but the warning. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94955 Cc: "11.1 11.2" <[email protected]> Reviewed-by: Brian Paul <[email protected]>
*	gallivm: Use standard LLVMSetAlignment from LLVM 3.4 onwards.	Jose Fonseca	2016-04-03	1	-2/+2
\| \| \| \| \| \| \| \| \|	Only provide a fallback for LLVM 3.3. One less dependency on LLVM C++ interface. Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
*	llvmpipe: use ints not unsigned for slots	Roland Scheidegger	2016-01-07	1	-17/+18
\| \| \| \| \| \| \| \| \| \| \| \| \|	They can't actually be 0 (as position is there) but should avoid confusion. This was supposed to have been done by af7ba989fb5a39925a0a1261ed281fe7f48a16cf but I accidentally pushed an older version of the patch in the end... Also prettify slightly. And make some notes about the confusing and useless fs input "map". Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
*	gallium/drivers: Trivial code-style cleanup	Edward O'Callaghan	2015-12-06	1	-1/+1
\| \| \| \| \|	Signed-off-by: Edward O'Callaghan <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
*	llvmpipe: add cache for compressed textures	Roland Scheidegger	2015-11-04	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	compressed textures are very slow because decoding is rather complex (and because there's no jit code to decode them too for non-technical reasons). Thus, add some texture cache which holds a couple of decoded blocks. Right now this handles only s3tc format albeit it could be extended to work with other formats rather trivially as long as the result of decode fits into 32bit per texel (ideally, rgtc actually would decode to more than 8 bits per channel, but even then making it work for it shouldn't be too difficult). This can improve performance noticeably but don't expect wonders (uncompressed is unsurprisingly still faster). It's also possible it might be slower in some cases (using nearest filtering for example or if there's otherwise not many cache hits, the cache is only direct mapped which isn't great). Also, actual decode of a block relies on util code, thus even though always full blocks are decoded it is done texel by texel - this could obviously benefit greatly from simd-optimized code decoding full blocks at once... Note the cache is per (raster) thread, and currently only used for fragment shaders. Reviewed-by: Jose Fonseca <[email protected]>
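A direct-mapped cache of decoded blocks, as mentioned in the message, can be sketched like this. The class, its fields, and the modulo slot selection are assumptions made for illustration in Python; the message does not say how llvmpipe picks a slot or how many entries it keeps.

```python
class DirectMappedBlockCache:
    """Cache of decoded texture blocks with exactly one slot per index.

    decode_block is a caller-supplied function mapping a block address to
    its decoded texels; it stands in for the (slow) s3tc decode.
    """

    def __init__(self, num_entries, decode_block):
        self.num_entries = num_entries
        self.decode_block = decode_block
        self.tags = [None] * num_entries   # block address held in each slot
        self.data = [None] * num_entries   # decoded texels held in each slot
        self.hits = 0
        self.misses = 0

    def lookup(self, block_addr):
        """Return decoded texels for block_addr, decoding on a miss."""
        slot = block_addr % self.num_entries  # assumed slot selection
        if self.tags[slot] == block_addr:
            self.hits += 1
        else:
            # Direct mapped: a conflicting block simply overwrites the slot.
            self.misses += 1
            self.tags[slot] = block_addr
            self.data[slot] = self.decode_block(block_addr)
        return self.data[slot]
```

Two blocks whose addresses map to the same slot evict each other on every alternate access, which is the weakness the message alludes to with "only direct mapped".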
*	gallium: replace INLINE with inline	Ilia Mirkin	2015-07-21	1	-7/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Generated by running: git grep -l INLINE src/gallium/ \| xargs sed -i 's/\bINLINE\b/inline/g' git grep -l INLINE src/mesa/state_tracker/ \| xargs sed -i 's/\bINLINE\b/inline/g' git checkout src/gallium/state_trackers/clover/Doxyfile and manual edits to src/gallium/include/pipe/p_compiler.h src/gallium/README.portability to remove mentions of the inline define. Signed-off-by: Ilia Mirkin <[email protected]> Acked-by: Marek Olšák <[email protected]>
*	llvmpipe: Implement stencil export	Roland Scheidegger	2015-06-04	1	-5/+20
\| \| \| \| \| \| \| \| \| \| \| \| \|	Pretty trivial, fixes the issue that we're expected to be able to blit stencil surfaces (as the blit just relies on util blitter code which needs stencil export to do it). 2 piglits skip->pass, 11 fail->pass v2: prettify, keep different stencil ref value handling out of depth/stencil test itself. Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
*	gallivm: pass jit_context pointer through to sampling	Roland Scheidegger	2015-03-27	1	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The callbacks used for getting the dynamic texture/sampler state were using the jit_context from the generated jit function. This works just fine, however that way it's impossible to generate separate functions for texture sampling, as will be done in the next commit. Hence, pass this pointer through all interfaces so it can be passed to a separate function (technically, it would probably be possible to extract this pointer from the current function instead, but this feels hacky and would probably require some more hacks if we'd use real functions instead of inlining all shader functions at some point). There should be no difference in the generated code for now. Reviewed-by: Jose Fonseca <[email protected]>
*	gallium: Replace u_simple_list.h with util/simple_list.h	Eric Anholt	2015-01-28	1	-1/+1
\| \| \| \| \| \| \|	The code was exactly the same, except util/ has c++ guards and a struct simple_node declaration. Reviewed-by: Marek Olšák <[email protected]>
*	draw,gallivm,llvmpipe: Avoid implicit casts of 32-bit shifts to 64-bits.	José Fonseca	2014-11-26	1	-4/+4
\| \| \| \| \| \| \| \| \|	Addresses MSVC warnings "result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?)", which can often be symptom of bugs, but in these cases were all benign. Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
*	gallivm,llvmpipe,clover: Bump required LLVM version to 3.3.	José Fonseca	2014-10-23	1	-8/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	We'll need to update gallivm for the interface changes in LLVM 3.6, and the fewer the number of older LLVM versions we support the less hairy that will be. As consequence HAVE_AVX define can disappear. (Note HAVE_AVX meant whether LLVM version supports AVX or not. Runtime support for AVX is always checked and enforced independently.) Verified llvmpipe builds and runs with LLVM 3.3, 3.4, and 3.5. Reviewed-by: Roland Scheidegger <[email protected]>
*	tgsi: change tgsi_shader_info::properties to a one-dimensional array	Marek Olšák	2014-10-04	1	-2/+2
\| \| \| \| \| \|	Reviewed-by: Roland Scheidegger <[email protected]> v2: fix svga too
*	tgsi: remove some not so useful variables from tgsi_shader_info	Marek Olšák	2014-10-04	1	-1/+3
\|
*	tgsi: simplify shader properties in tgsi_shader_info	Marek Olšák	2014-10-04	1	-8/+2
\| \| \| \|	Use an array of properties indexed by TGSI_PROPERTY_* definitions.
*	llvmpipe: Use two LLVMContexts per OpenGL context instead of a global one.	Mathias Fröhlich	2014-09-30	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	This is one step to make llvmpipe thread safe as mandated by the OpenGL standard. Using the global LLVMContext is obviously a problem for that kind of use pattern. The patch introduces two LLVMContext instances that are private to an OpenGL context and used for all compiles. One is put into struct draw_llvm and the other one into struct llvmpipe_context. Reviewed-by: Jose Fonseca <[email protected]> Signed-off-by: Mathias Froehlich <[email protected]>
*	llvmpipe: do IR counting for shader cache management after optimization.	Roland Scheidegger	2014-05-19	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \|	2ea923cf571235dfe573c35c3f0d90f632bd86d8 had the side effect of IR counting now being done after IR optimization instead of before. Some quick analysis shows that there's roughly 1.5 times more IR instructions before optimization than after, hence the effective shader cache size got quite a bit smaller. Could counter this with an increase of the instruction limit but it probably makes more sense to count them after optimizations, so move that code. Reviewed-by: Brian Paul <[email protected]>
*	gallivm: give more verbose names to modules	Roland Scheidegger	2014-05-16	1	-4/+8
\| \| \| \| \| \| \| \| \|	When we had just one module "gallivm" was an appropriate name. But now we have modules containing all functions for a particular variant, so give it a corresponding name (this is really just for helping debugging). Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
*	llvmpipe: kill off llvmpipe_variant_count	Roland Scheidegger	2014-05-15	1	-2/+0
\| \| \| \| \| \| \|	Unused except it was increased for both fs and setup shader variants created. Probably some leftover from ages ago. Reviewed-by: Jose Fonseca <[email protected]>
*	llvmpipe: Delete unneeded LLVM stuff earlier.	José Fonseca	2014-05-14	1	-11/+2
\| \| \| \| \| \|	Same as Frank's change to draw module but for llvmpipe module. Reviewed-by: Roland Scheidegger <[email protected]>
*	llvmpipe: add support for b5g6r5_srgb	Roland Scheidegger	2014-03-21	1	-4/+35
\| \| \| \| \| \| \| \| \| \| \| \| \|	The conversion code for srgb was tuned for n x 4x8bit AoS -> 4 x nxfloat SoA (and vice versa), fix this to handle also 16bit 565-style srgb formats. Still not really all that generic, things like r10g10b10a2_srgb or r4g4b4a4_srgb wouldn't work (the latter trivial to fix, the former would not require more work to not crash but near certainly need some higher precision calculation) but not needed right now. The code is not fully optimized for this (could use more direct calculation instead of expanding to 8-bit range first) but should be good enough. Reviewed-by: Jose Fonseca <[email protected]>
*	llvmpipe: fix denorm handling for r11g11b10_float format when blending	Roland Scheidegger	2014-01-31	1	-2/+15
\| \| \| \| \| \| \| \|	The code re-enabling denorms for small float formats did not recognize this format due to format handling hacks (mainly, the lp_type doesn't have the floating bit set). Reviewed-by: Jose Fonseca <[email protected]>
*	s/Tungsten Graphics/VMware/	José Fonseca	2014-01-17	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Tungsten Graphics Inc. was acquired by VMware Inc. in 2008. Leaving the old copyright name is creating unnecessary confusion, hence this change. This was the sed script I used:
	$ cat tg2vmw.sed
	# Run as:
	#
	# git reset --hard HEAD && find include scons src -type f -not -name 'sed*' -print0 | xargs -0 sed -i -f tg2vmw.sed
	#
	# Rename copyrights
	s/Tungsten Gra\(ph\|hp\)ics,\? [iI]nc\.\?\(, Cedar Park\)\?\(, Austin\)\?\(, \(Texas\|TX\)\)\?\.\?/VMware, Inc./g
	/Copyright/s/Tungsten Graphics\(,\? [iI]nc\.\)\?\(, Cedar Park\)\?\(, Austin\)\?\(, \(Texas\|TX\)\)\?\.\?/VMware, Inc./
	s/TUNGSTEN GRAPHICS/VMWARE/g
	# Rename emails
	s/[email protected]/[email protected]/
	s/[email protected]/[email protected]/g
	s/jrfonseca-at-tungstengraphics-dot-com/jfonseca-at-vmware-dot-com/
	s/jrfonseca\[email protected]/[email protected]/g
	s/keithw\[email protected]/[email protected]/g
	s/[email protected]/[email protected]/g
	s/thomas-at-tungstengraphics-dot-com/thellstom-at-vmware-dot-com/
	s/[email protected]/[email protected]/
	# Remove dead links
	s@Tungsten Graphics (http://www.tungstengraphics.com)@Tungsten Graphics@g
	# C string src/gallium/state_trackers/vega/api_misc.c
	s/"Tungsten Graphics, Inc"/"VMware, Inc"/
	Reviewed-by: Brian Paul <[email protected]>
*	llvmpipe: handle NULL color buffer pointers	Brian Paul	2014-01-17	1	-68/+84
\| \| \| \| \| \| \| \| \|	Fixes regression from 9baa45f78b8ca7d66280e36009b6a685055d7cd6 v2: incorporate a few small changes suggested by Roland. Reviewed-by: José Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
*	llvmpipe: do constant buffer bounds checking in shaders	Zack Rusin	2014-01-16	1	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It's possible to bind a smaller buffer as a constant buffer, than what the shader actually uses/requires. This could cause nasty crashes. This patch adds the architecture to pass the maximum allowable constant buffer index to the jit to let it make sure that the constant buffer indices are always within bounds. The behavior follows the d3d10 spec, which says the overflow should always return all zeros, and overflow is only defined as access beyond the size of the currently bound buffer. Accesses beyond the declared shader constant register size are not considered an overflow and expected to return garbage but consistent garbage (we follow the behavior which some wlk tests expect which is to return the actual values from the bound buffer). Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
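The overflow rule described in this message can be sketched as a scalar load helper. This is a Python illustration with a hypothetical function name; in llvmpipe the check is emitted into the jitted shader and works on vectors of indices.

```python
def load_constant(bound_buffer, index):
    """Fetch one constant, following the d3d10-style rule from the message.

    Only the size of the currently bound buffer matters: an index inside it
    returns the stored value even if the shader declared fewer constants,
    while an index beyond it returns zero instead of reading out of bounds.
    """
    if 0 <= index < len(bound_buffer):
        return bound_buffer[index]
    return 0.0
```

For example, with a bound buffer of four values and a shader that declares only two constants, index 3 still returns the fourth stored value, and index 4 returns 0.0.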