mesa.git

	Commit message	Author	Age	Files	Lines
*	gallium: s/unsigned/enum pipe_prim_type/	Brian Paul	2017-10-27	1	-1/+2
\| \| \| \| \| \|	In the vbuf_render::set_primitive() functions. Reviewed-by: Roland Scheidegger <[email protected]>
*	draw: don't cull tris with zero area	Roland Scheidegger	2017-10-27	2	-3/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Culling tris with zero area seems like a great idea, but apparently with fill mode line (and point) we're supposed to draw them, at least some tests for some other state tracker complained otherwise. Such tris also always seem to be back facing (not sure if this can be inferred from anything, since in a mathematical sense it cannot really be determined), so make sure to account for this when filling in the face information. (For solid tris, this is of course unnecessary, drivers will throw the tris away later in any case.) Reviewed-by: Brian Paul <[email protected]>
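The rule described in this message can be sketched in C as follows. This is only an illustration under stated assumptions, not the draw module's code: the names (`sketch_face`, `sketch_signed_area2`, `sketch_classify`, `sketch_is_culled`) are hypothetical, and the winding convention (positive area means counter-clockwise with y pointing up) is assumed for the example.

```c
#include <stdbool.h>

/* Hypothetical names throughout; this is not the draw module's code. */
enum sketch_face {
   SKETCH_FACE_FRONT = 1,
   SKETCH_FACE_BACK = 2
};

/* Twice the signed area of triangle (v0, v1, v2). Positive when the
 * vertices wind counter-clockwise with y pointing up (assumed convention). */
static float
sketch_signed_area2(const float v0[2], const float v1[2], const float v2[2])
{
   return (v1[0] - v0[0]) * (v2[1] - v0[1]) -
          (v2[0] - v0[0]) * (v1[1] - v0[1]);
}

/* Zero-area triangles are not discarded up front; they are treated as
 * back-facing, as the commit message describes. */
static enum sketch_face
sketch_classify(float area2, bool front_ccw)
{
   if (area2 == 0.0f)
      return SKETCH_FACE_BACK;

   bool ccw = area2 > 0.0f;
   return ccw == front_ccw ? SKETCH_FACE_FRONT : SKETCH_FACE_BACK;
}

/* cull_mask is a bitwise OR of the sketch_face values to discard. */
static bool
sketch_is_culled(float area2, bool front_ccw, unsigned cull_mask)
{
   return (sketch_classify(area2, front_ccw) & cull_mask) != 0;
}
```

With this arrangement a degenerate triangle survives when only front faces are culled, so a line or point fill mode can still draw it, and it is dropped only when back faces are culled.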
*	llvmpipe, draw: improve shader cache debugging	Roland Scheidegger	2017-09-09	2	-22/+43
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With GALLIVM_DEBUG=perf set, output the relevant stats for shader cache usage whenever we have to evict shader variants. Also add some output when shaders are deleted (but not with the perf setting to keep this one less noisy). While here, also don't delete that many shaders when we have to evict. For fs, there's potentially some cost if we have to evict due to the required flush, however certainly shader recompiles have a high cost too so I don't think evicting one quarter of the cache size makes sense (and, if we're evicting based on IR count, we probably typically evict only very few or just one shader too). For vs, I'm not sure it even makes sense to evict more than one shader at a time, but keep the logic the same for now. Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Brian Paul <[email protected]>
*	llvmpipe, draw: increase shader cache limits	Roland Scheidegger	2017-09-07	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We're not particularly concerned with memory usage, if the tradeoff is shader recompiles. And it's common for apps to have a lot of shaders nowadays (and, since our shaders include a LOT of context state of course we may create quite a bit more shaders even). So quadruple the amount of shaders draw will cache (from 128 to 512). For llvmpipe (fs shaders) quadruple the number of instructions, keep the number of variants the same for now (only with very simple, non-texturing shaders the variant limit could really be reached), and simplify the definition, it's probably easier to just have one different definition per branch... Reviewed-by: Jose Fonseca <[email protected]>
*	draw: whitespace, formatting fixes in draw_vs_exec.c	Brian Paul	2017-07-12	1	-47/+43
\| \| \| \|	Trivial.
*	draw: s/unsigned/enum tgsi_semantic/	Brian Paul	2017-07-12	2	-3/+3
\| \| \| \|	Reviewed-by: Charmaine Lee <[email protected]>
*	draw: handle more TGSI_SEMANTIC_COLOR indices	Roland Scheidegger	2017-07-08	3	-10/+27
\| \| \| \| \| \| \| \| \| \| \| \| \|	It could only handle indices 0/1, otherwise what happened was bad (accessing the array out of bounds, no crash but kind of random). This is enough for the gl state tracker (primary/secondary color) but not enough for some other state trackers (d3d9 has no limits on the number of color interpolants). The complexity with color semantics is all due to the front/back mapping (2 outputs in the vs map to one input in the fs) so this isn't extended to indices > 1 - d3d9 has no use for back colors, therefore this isn't needed and still only 2 back colors can be handled correctly. Reviewed-by: Brian Paul <[email protected]>
*	draw: check for line_width != 1.0f in validate_pipeline()	Brian Paul	2017-06-15	1	-3/+4
\| \| \| \| \| \| \| \| \|	We shouldn't use the wide line stage if the line width is 1. This check isn't strictly needed because all drivers are (now) specifying a line wide threshold of at least 1.0 pixels, but let's play it safe. Reviewed-by: Charmaine Lee <[email protected]>
*	draw: whitespace and formatting fixes	Brian Paul	2017-06-15	2	-60/+58
\| \| \| \|	Trivial.
*	tree-wide: remove trailing backslash	Eric Engestrom	2017-06-07	3	-3/+3
\| \| \| \| \| \| \| \| \|	Simple search for a backslash followed by two newlines. If one of the newlines were to be removed, this would cause issues, so let's just remove these trailing backslashes. Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
*	gallium: remove pipe_index_buffer and set_index_buffer	Marek Olšák	2017-05-10	2	-6/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	pipe_draw_info::indexed is replaced with index_size. index_size == 0 means non-indexed. Instead of pipe_index_buffer::offset, pipe_draw_info::start is used. For indexed indirect draws, pipe_draw_info::start is added to the indirect start. This is the only case when "start" affects indirect draws. pipe_draw_info::index is a union. Use either index::resource or index::user depending on the value of pipe_draw_info::has_user_indices. v2: fixes for nine, svga
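A rough sketch of how a consumer might read a draw description shaped like the one in this message. `struct sketch_draw_info` is a simplified, hypothetical stand-in that only mirrors the fields named above (`index_size`, `has_user_indices`, `start`, and the `index` union); it is not the real `pipe_draw_info`. Two further assumptions: `start` is counted in indices rather than bytes, and only user-memory indices are read here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in mirroring only the fields named in the message. */
struct sketch_draw_info {
   uint8_t index_size;       /* 0 = non-indexed, else bytes per index */
   bool has_user_indices;    /* selects which union member is valid */
   uint32_t start;           /* first vertex, or first index if indexed */
   union {
      const void *resource;  /* stand-in for a buffer resource */
      const void *user;      /* indices in user memory */
   } index;
};

/* Store the vertex number for the i-th vertex of the draw in *out.
 * Returns false when the sketch cannot resolve it. */
static bool
sketch_vertex_at(const struct sketch_draw_info *info, uint32_t i,
                 uint32_t *out)
{
   uint32_t pos = info->start + i;

   if (info->index_size == 0) {      /* non-indexed draw */
      *out = pos;
      return true;
   }
   if (!info->has_user_indices)      /* a resource would need mapping */
      return false;

   const uint8_t *src =
      (const uint8_t *)info->index.user + (size_t)pos * info->index_size;

   switch (info->index_size) {
   case 1:
      *out = *src;
      return true;
   case 2: {
      uint16_t v;
      memcpy(&v, src, sizeof v);
      *out = v;
      return true;
   }
   case 4: {
      uint32_t v;
      memcpy(&v, src, sizeof v);
      *out = v;
      return true;
   }
   default:
      return false;                  /* unsupported index size */
   }
}
```

Folding the old `indexed` flag into `index_size` means one field answers both "is this draw indexed?" and "how wide is each index?".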
*	gallium: decrease the size of pipe_vertex_buffer - 24 -> 16 bytes	Marek Olšák	2017-05-10	3	-8/+13
\|
*	draw: whitespace fixes in draw_pipe_vbuf.c	Brian Paul	2017-04-26	1	-104/+89
\| \| \| \|	Remove trailing whitespace, fix formatting, etc. Trivial.
*	draw: remove unused wideline_stage()	Samuel Pitoiset	2017-04-13	1	-11/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fixes the following Clang warning. draw/draw_pipe_wide_line.c:48:38: warning: unused function 'wideline_stage' [-Wunused-function] static inline struct wideline_stage wideline_stage( struct draw_stage stage ) ^ 1 warning generated. v2: - remove commented code (Roland Scheidegger) v3: - remove half_line_width in the struct Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
*	draw: remove unused overflow()	Samuel Pitoiset	2017-04-13	1	-8/+0
\| \| \| \| \| \| \| \| \| \| \| \| \|	Fixes the following Clang warning. draw/draw_pipe_vbuf.c:102:1: warning: unused function 'overflow' [-Wunused-function] overflow( void map, void ptr, unsigned bytes, unsigned bufsz ) ^ 1 warning generated. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Brian Paul <[email protected]>
*	draw: (trivial) remove a unnecessary lp_build_alloca()	Roland Scheidegger	2017-03-16	1	-2/+0
\| \| \| \|	pointed out by clang (stored value never read)
*	draw: s/unsigned/enum pipe_shader_type/	Brian Paul	2017-03-08	4	-14/+15
\| \| \| \| \| \|	and some s/uint/enum pipe_shader_type/ Reviewed-by: Edward O'Callaghan <[email protected]>
*	gallium: s/unsigned/enum pipe_shader_type/ for pipe_screen::get_shader_param()	Brian Paul	2017-03-08	2	-4/+6
\| \| \| \|	Reviewed-by: Edward O'Callaghan <[email protected]>
*	configure.ac: Revert recent HAVE_LLVM changes.	Jose Fonseca	2017-01-18	5	-12/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts changes 903eb09b5fb78d47d0f8a4bdf826a113ca2aff40..1a0aa468f354f0ee94dd383cd40ae915584624aa: Tobias Droste (5): configure.ac: Rename MESA_LLVM to FOUND_LLVM configure.ac: Only set LLVM_LIBS if LLVM is used configure.ac: Only define HAVE_LLVM if LLVM is used configure.ac: Set and use HAVE_GALLIUM_LLVM define configure.ac: Don't check LLVM version in gallium_require_llvm They break scons build, and I'm not convinced this is the right fix. In particular changing HAVE_LLVM in the C code is something I'd rather avoid no matter what. So it's better to discuss without the pressure of broken builds.
*	configure.ac: Set and use HAVE_GALLIUM_LLVM define	Tobias Droste	2017-01-18	5	-12/+12
\| \| \| \| \| \| \| \| \| \| \| \|	Gallium code used HAVE_LLVM to check if it needs to compile code for LLVM in header and source files. With the new logic HAVE_LLVM is always set. Use extra define to figure out if LLVM is used. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99010 Signed-off-by: Tobias Droste <[email protected]>
*	gallium: remove TGSI_OPCODE_SUB	Marek Olšák	2017-01-05	2	-11/+11
\| \| \| \| \| \|	It's redundant with the source modifier. Reviewed-by: Nicolai Hähnle <[email protected]>
*	draw: use SoA fetch, not AoS one	Roland Scheidegger	2016-12-21	1	-48/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Now that there's some SoA fetch which never falls back, we should always get results which are better or at least not worse (something like rgba32f will stay the same). For cases which get way better, think something like R16_UNORM with 8-wide vectors: this was 8 sign-extend fetches, 8 cvt, 8 muls, followed by a couple of shuffles to stitch things together (if it is smart enough, 6 unpacks) and then a (8-wide) transpose (not sure if llvm could even optimize the shuffles + transpose, since the 16bit values were actually sign-extended to 128bit before being cast to a float vec, so that would be another 8 unpacks). Now that is just 8 fetches (directly inserted into vector, albeit there's one 128bit insert needed), 1 cvt, 1 mul. v2: ditch the old AoS code instead of just disabling it. Reviewed-by: Jose Fonseca <[email protected]>
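The difference between the two fetch strategies can be pictured with a scalar C sketch of the SoA shape: gather one raw value per vertex first, then convert the whole lane array in a single pass. This is an illustration only; the real code emits LLVM IR, and the function name, the fixed width of 8, and the byte-stride addressing are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_VEC_WIDTH 8   /* assumed vector width for the example */

/* Fetch one R16_UNORM attribute for SKETCH_VEC_WIDTH vertices.
 * buf/stride describe a hypothetical vertex buffer; idx holds the
 * vertex index for each lane. */
static void
sketch_fetch_r16_unorm(const uint8_t *buf, uint32_t stride,
                       const uint32_t idx[SKETCH_VEC_WIDTH],
                       float out[SKETCH_VEC_WIDTH])
{
   uint32_t raw[SKETCH_VEC_WIDTH];

   /* Step 1: one narrow load per lane, placed directly in its lane. */
   for (int i = 0; i < SKETCH_VEC_WIDTH; i++) {
      uint16_t v;
      memcpy(&v, buf + (size_t)idx[i] * stride, sizeof v);
      raw[i] = v;
   }

   /* Step 2: one convert and one multiply over the whole lane array,
    * rather than a convert and multiply per fetched element. */
   const float scale = 1.0f / 65535.0f;
   for (int i = 0; i < SKETCH_VEC_WIDTH; i++)
      out[i] = (float)raw[i] * scale;
}
```

The second loop is the part that maps to a single vector convert and a single vector multiply when the lanes are already laid out together, which is where the savings described in the message come from.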
*	gallivm: optimize gather a bit, by using supplied destination type	Roland Scheidegger	2016-12-21	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	By using a dst_type in the gather interface, gather has some more knowledge about how values should be fetched. E.g. if this is a 3x32bit fetch and dst_type is 4x32bit vector gather will no longer do a ZExt with a 96bit scalar value to 128bit, but just fetch the 96bit as 3x32bit vector (this is still going to be 2 loads of course, but the loads can be done directly to simd vector that way). Also, we can now try to use the right int/float type. This should make no difference really since there's typically no domain transition penalties for such simd loads, however it actually makes a difference since llvm will use different shuffle lowering afterwards so the caller can use this to trick llvm into using sane shuffle afterwards (and yes llvm is really stupid there - nothing against using the shuffle instruction from the correct domain, but not at the cost of doing 3 times more shuffles, the case which actually matters is refusal to use shufps for integer values). Also attempt to avoid things which look great on paper but llvm doesn't really handle (e.g. fetching 3-element 8 bit and 16 bit vectors which is simply disastrous - I suspect type legalizer is to blame trying to extend these vectors to 128bit types somehow, so these are fetched with scalars like before, which is suboptimal due to the ZExt). Remove the ability for truncation (no point, this is gather, not conversion) as it is complex enough already. While here also implement not just the float, but also the 64bit avx2 gathers (disabled though since based on the theoretical numbers the benefit just isn't there at all until Skylake at least). Reviewed-by: Jose Fonseca <[email protected]>
*	draw: drop some overflow computations	Roland Scheidegger	2016-11-21	1	-65/+46
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It turns out that no one actually cares if the address computations overflow, be it the stride mul or the offset adds. Wrap around seems to be explicitly permitted even by some other API (which is a _very_ surprising result, as these overflow computations were added just for that and made some tests pass at that time - I suspect some later fixes fixed the actual root cause...). So the requirements in that other api were actually sane there all along after all... Still need to make sure the computed buffer size needed is valid, of course. This ditches the shiny new widening mul from these codepaths, ah well... And now that I really understand this, change the fishy min limiting indices to what it really should have done. Which is simply to prevent fetching more values than valid for the last loop iteration. (This makes the code path in the loop minimally more complex for the non-indexed case as we have to skip the optimization combining two adds. I think it should be safe to skip this actually there, but I don't care much about this especially since skipping that optimization actually makes the code easier to read elsewhere.) Reviewed-by: Jose Fonseca <[email protected]>
*	draw: simplify fetch some more	Roland Scheidegger	2016-11-21	1	-63/+55
\| \| \| \| \| \| \| \| \| \| \|	Don't keep the ofbit. This is just a minor simplification, just adjust the buffer size so that there will always be an overflow if buffers aren't valid to fetch from. Also, get rid of control flow from the instanced path too. Not worried about performance, but it's simpler and keeps the code more similar to ordinary fetch. Reviewed-by: Jose Fonseca <[email protected]>
*	draw: unify linear and elts draw jit functions	Roland Scheidegger	2016-11-21	3	-89/+70
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The code for elts and linear paths was nearly 100% identical by now - with the elts path simply having some additional gather for the elements in the main loop (with some additional small differences before the main loop). Hence nuke the separate functions and decide this at jit shader execution time (simply based on the presence of the elts pointer). Some analysis shows that the generated vs jit functions seem to be just very minimally more complex than the former elts functions, and almost none of the additional complexity is in the main loop (basically just the branch logic for the branch fetching the actual indices). Compared to linear, the codesize of the function is of course a bit larger, however the actual executed code in the main loop appears to be near 100% identical (the additional code looking up indices is skipped as expected). So, I would not expect a (meaningful) performance difference with the generated code, neither with elts nor linear, this does, however, roughly halve the compilation time (the compiled shaders should also use only half the memory of course). Reviewed-by: Jose Fonseca <[email protected]>
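The unification described here comes down to a single entry point that picks its path from whether an element pointer was supplied. A minimal C sketch of that idea follows; the function name and signature are hypothetical (the real functions are generated as LLVM IR), and it assumes `elts` already points at the first index to consume.

```c
#include <stddef.h>
#include <stdint.h>

/* One function for both paths: when elts is non-NULL the vertex numbers
 * come from the index array, otherwise they run linearly from start. */
static void
sketch_vertex_ids(const uint32_t *elts, uint32_t start, uint32_t count,
                  uint32_t *out)
{
   for (uint32_t i = 0; i < count; i++)
      out[i] = elts ? elts[i] : start + i;
}
```

Only one variant has to be compiled per shader instead of two, which matches the compile-time and memory saving the message reports; the extra branch is the cost of the index lookup being optional.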
*	draw: use same argument order for jit draw linear / elts functions	Roland Scheidegger	2016-11-21	3	-34/+30
\| \| \| \| \| \|	This is a bit simpler. Mostly to make it easier to unify the paths later... Reviewed-by: Jose Fonseca <[email protected]>
*	draw: drop unnecessary index overflow handling from vsplit code	Roland Scheidegger	2016-11-21	2	-56/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This was kind of strange, since it replaced indices which were only overflowing due to bias with MAX_UINT. This would cause an overflow later in the shader, except if stride was 0, however the vertex id would be essentially random then (-1 + eltBias). No test cared about it, though. So, drop this and just use ordinary int arithmetic wraparound as usual. This is much simpler to understand and the results are "more correct" or at least more consistent (vertex id as well as actual fetch results just correspond to wrapped around arithmetic). There's only one catch, it is now possible to hit the cache initialization value also with ushort and ubyte elts path (this wouldn't be an issue if we'd simply handle the eltBias itself later in the shader). Hence, we need to make sure the cache logic doesn't think this element has already been emitted when it has not (I believe some seriously bad things could happen otherwise). So, borrow the logic which handled this from the uint case, but not before fixing it up... Reviewed-by: Jose Fonseca <[email protected]>
*	draw: simplify vsplit elts code a bit	Roland Scheidegger	2016-11-21	3	-40/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	vsplit_get_base_idx explicitly returned idx 0 and set the ofbit in case of overflow. We'd then check the ofbit and use idx 0 instead of looking it up. This was necessary because DRAW_GET_IDX used to return DRAW_MAX_FETCH_IDX and not 0 in case of overflows. However, this is all unnecessary, we can just let DRAW_GET_IDX return 0 in case of overflow. In fact before bbd1e60198548a12be3405fc32dd39a87e8968ab the code already did that, not sure why this particular bit was changed (might have been one half of an attempt to get these indices to actual draw shader execution - in fact I think this would make things less awkward, it would require moving the eltBias handling to the shader as well). Note there's other callers of DRAW_GET_IDX - those code paths however explicitly do not handle index buffer overflows, therefore the overflow value doesn't matter for them. Also do some trivial simplification - for (unsigned) a + b, checking res < a is sufficient for overflow detection, we don't need to check for res < b too (similar for signed). And an index buffer overflow check looked bogus - eltMax is the number of elements in the index buffer, not the maximum element which can be fetched. (Drop the start check against the idx buffer though, this is already covered by end check and end < start). Reviewed-by: Jose Fonseca <[email protected]>
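Two points from this message lend themselves to a short C sketch: the single-comparison overflow test for unsigned addition, and an index lookup that yields 0 when the position is out of range. The helper names are hypothetical and are not the `DRAW_GET_IDX` macro itself; the bounds rule shown (position must be below the element count) is an assumption based on the message's remark that eltMax is a count, not a maximum position.

```c
#include <stdbool.h>
#include <stdint.h>

/* Unsigned addition wraps modulo 2^32, so the sum wrapped exactly when
 * it is smaller than one operand. Comparing against a is sufficient;
 * comparing against b as well adds nothing. */
static bool
sketch_add_overflows(uint32_t a, uint32_t b, uint32_t *res)
{
   *res = a + b;
   return *res < a;
}

/* Look up an index, returning 0 for any position past the end. elt_max
 * is the number of elements, so the last valid position is elt_max - 1. */
static uint32_t
sketch_get_idx(const uint32_t *elts, uint32_t elt_max, uint32_t pos)
{
   return pos < elt_max ? elts[pos] : 0;
}
```

Returning 0 directly from the lookup removes the need for a separate overflow flag that the caller has to test before substituting index 0 itself.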
*	draw: finally optimize bool clip mask generation	Roland Scheidegger	2016-11-18	3	-23/+26
\| \| \| \| \| \| \| \| \| \| \|	lp_build_any_true_range is just what we need, though it will only produce optimal code with sse41 (ptest + set) - but even without it on 64bit x86 the code is still better (1 unpack, 2 movq + or + set), on 32bit x86 it's going to be roughly the same as before. While here also make it a "real" 8bit boolean - cuts one instruction but more importantly similar to ordinary booleans. Reviewed-by: Jose Fonseca <[email protected]>
*	draw: use vectorized calculations for fetch (v2)	Roland Scheidegger	2016-11-18	1	-131/+265
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Instead of doing all the math with scalars, use vectors. This means the overflow math needs to be done manually, albeit that's only really problematic for the stride/index mul, the rest has been pretty much moved outside the shader loop (albeit the mul could actually be optimized away too), where things are still scalar. To eliminate control flow in the main shader loop fetch, provide fake buffers (so index 0 is always valid to fetch). Still uses aos fetch though in the end - mostly because some more code would be needed to handle unaligned fetches in that path, and because for most formats it won't make a difference anyway (we generate some truly horrendous code for things like R16G16_something for instance). Instanced fetch however stays roughly the same as before, except that no longer the same element is fetched multiple times (I've seen a reduction of ~3 times in main shader loop size due to llvm not recognizing it's all the same fetch, since it would have been possible some of the fetches getting replaced with zeros in case vector size exceeds remaining fetch count - the values of such fetches don't matter at all though). Also, for elts gathering, use vectorized code as well. The generated shaders are smaller and faster to compile (not entirely sure about execution speed, but generally unless there's just single vertices to handle I would expect it to be faster - there's more opportunities for future improvements by using soa fetch). v3: skip the fake index buffer, not needed due to the jit code never seeing the real index buffer in the first place. Fix a bug with mask expansion (needs SExt, not ZExt). Also, be really really careful to keep the behavior the same, even in cases where it looks wrong, and add comments why the code is doing the seemingly wrong stuff... Fortunately it's not actually more complex in the end... Also change function order slightly just to make the diff more readable. No piglit change. Passes some internal testing with another api too... Reviewed-by: Jose Fonseca <[email protected]>
*	gallivm: Fix build after removal of deprecated attribute API v3	Tom Stellard	2016-11-09	1	-4/+2
\| \| \| \| \| \| \| \| \| \| \| \|	v2: Fix adding parameter attributes with LLVM < 4.0. v3: Fix typo. Fix parameter index. Add a gallivm enum for function attributes. Reviewed-by: Nicolai Hähnle <[email protected]>
*	Revert "draw: use vectorized calculations for fetch"	Roland Scheidegger	2016-11-09	2	-282/+159
\| \| \| \| \| \| \| \|	Trivial. There are some regressions internally, related to overflow behavior. I'll have to look at it at another time, some interactions with vsplit/vcache are actually mind-blowing. This reverts commit 3fa10ffb496cc4e6d1003891cf0381bb5bec2a74.
*	draw: use vectorized calculations for fetch	Roland Scheidegger	2016-11-08	2	-159/+282
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Instead of doing all the math with scalars, use vectors. This means the overflow math needs to be done manually, albeit that's only really problematic for the stride/index mul, the rest has been pretty much moved outside the shader loop (albeit the mul could actually be optimized away too), where things are still scalar. Because llvm is complete fail with the zero-extend widening mul, roll our own even... To eliminate control flow in the main shader loop fetch, provide fake buffers (so index 0 is always valid to fetch). Still uses aos fetch though in the end - mostly because some more code would be needed to handle unaligned fetches in that path, and because for most formats it won't make a difference anyway (we generate some truly horrendous code for things like R16G16_something for instance). Instanced fetch however stays roughly the same as before, except that no longer the same element is fetched multiple times (I've seen a reduction of ~3 times in main shader loop size due to apparently llvm not being able to deduce it's really all the same with a couple instanced elements). Also, for elts gathering, use vectorized code as well - provide a fake elt buffer if there's no valid one bound. The generated shaders are smaller and faster to compile (not entirely sure about execution speed, but generally unless there's just single vertices to handle I would expect it to be faster - there's more opportunities for future improvements by using soa fetch). No piglit change. Reviewed-by: Jose Fonseca <[email protected]>
*	draw: fix undefined input handling some more...	Roland Scheidegger	2016-11-04	1	-50/+54
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previous fixes were incomplete - some code still iterated through the number of elements provided by velem layout instead of the number stored in the key (which is the same as the number defined by the vs). And also actually accessed the elements from the layout directly instead of those in the key. This mismatch could still cause crashes. (Besides, it is a very good idea to only use data stored in the key anyway.) v2: move null format check, remove now unnecessary function parameter, some minor prettify Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
*	draw: improve vertex fetch (v2)	Roland Scheidegger	2016-10-19	1	-86/+104
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The per-element fetch has quite some calculations which are constant, these can be moved outside both the per-element as well as the main shader loop (llvm can figure out it's constant mostly on its own, however this can have a significant compile time cost). Similarly, it looks easier swapping the fetch loops (outer loop per attrib, inner loop filling up the per vertex elements - this way the aos->soa conversion also can be done per attrib and not just at the end though again this doesn't really make much of a difference in the generated code). (This would also make it possible to vectorize the calculations leading to the fetches.) There's also some minimal change simplifying the overflow math slightly. All in all, the generated code seems to look slightly simpler (depending on the actual vs), but more importantly I've seen a significant reduction in compile times for some vs (albeit with old (3.3) llvm version, and the time reduction is only really for the optimizations run on the IR). v2: adapt to other draw change. No changes with piglit. Reviewed-by: Jose Fonseca <[email protected]>
*	draw: improved handling of undefined inputs	Roland Scheidegger	2016-10-19	1	-21/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previous attempts to zero initialize all inputs were not really optimal (though no performance impact was measurable). In fact this is not really necessary, since we know the max number of inputs used. Instead, just generate fetch for up to max inputs used by the shader, directly replacing inputs for which there was no vertex element by zero. This also cleans up key generation, which previously would have stored some garbage for these elements. And also drop the assertion which indicates such bogus usage by a debug_printf (the whole point of initializing the undefined inputs was to make this case safe to handle). Reviewed-by: Jose Fonseca <[email protected]>
*	draw: initialize shader inputs	Roland Scheidegger	2016-10-12	1	-0/+7
\| \| \| \| \| \| \| \| \|	This should make the code more robust if a shader tries to use inputs which aren't defined by the vertex element layout (which usually shouldn't happen). No piglit change. Reviewed-by: Brian Paul <[email protected]>
*	gallium: Use enum pipe_shader_type in set_sampler_views()	Kai Wasserbäch	2016-08-29	4	-9/+11
\| \| \| \| \|	Signed-off-by: Kai Wasserbäch <[email protected]> Reviewed-by: Brian Paul <[email protected]>
*	gallium: Use enum pipe_shader_type in bind_sampler_states() (v2)	Kai Wasserbäch	2016-08-29	2	-6/+10
\| \| \| \| \| \| \| \| \| \| \|	v1 → v2: - Fixed indentation (noted by Brian Paul) - Removed second assert from nouveau's switch statements (suggested by Brian Paul) Signed-off-by: Kai Wasserbäch <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Brian Paul <[email protected]>
*	draw: Avoid aliasing violations.	Matt Turner	2016-08-01	2	-3/+6
\| \| \| \|	Reviewed-by: Marek Olšák <[email protected]>
*	gallivm: Use llvm.fmuladd.*.	Jose Fonseca	2016-06-10	1	-10/+5
\| \| \| \|	Reviewed-by: Roland Scheidegger <[email protected]>
*	draw: stop using CULLDIST semantic.	Dave Airlie	2016-05-23	10	-48/+31
\| \| \| \| \| \| \| \| \| \| \|	The way the HW works doesn't really fit with having two semantics for this. The GLSL compiler emits 2 vec4s and two properties, this makes draw use those instead of CULLDIST semantics. Reviewed-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
*	draw: s/Elements/ARRAY_SIZE/	Brian Paul	2016-04-27	7	-24/+24
\| \| \| \|	Reviewed-by: Jose Fonseca <[email protected]>
*	tgsi: accept a starting PC value for exec machine.	Dave Airlie	2016-04-27	2	-2/+2
\| \| \| \| \| \| \| \|	This will be used later to restart barriered execution threads in compute, for now we just want to change the API. Acked-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
*	tgsi: move to using vector for system values.	Dave Airlie	2016-04-27	2	-5/+5
\| \| \| \| \| \| \| \| \| \|	For compute support some of the system values are .xyz types, so move to using a vector instead of a single channel. [airlied: squash swizzle fix from compute series]. Reviewed-by: Brian Paul <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
*	tgsi: pass a shader type to the machine create and clean up.	Dave Airlie	2016-04-26	2	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \|	There were definitely bugs here mixing up the PIPE_ and TGSI_ defines, hopefully they didn't cause any problems, since mostly it was special cases for GEOMETRY. This clarifies at shader machine create what type of shader this machine will execute. This is needed also for compute shaders where we don't want to allocate inputs/outputs. Reviewed-by: Brian Paul <[email protected]>
*	gallium/tgsi: move tgsi_exec.h header out of draw_context.h	Dave Airlie	2016-04-26	2	-1/+1
\| \| \| \| \| \| \| \| \|	It gets annoying that changing the tgsi exec rebuilds the state tracker unnecessarily. Putting this include into draw_gs.h which uses it causes a lot less rebuilds. Reviewed-by: Brian Paul <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
*	gallivm: convert size query to using a set of parameters.	Dave Airlie	2016-04-19	1	-18/+4
\| \| \| \| \| \| \| \| \| \|	This isn't currently that easy to expand, so fix it up before expanding it later to include dynamic samplers. [airlied: use some local variables (Roland)] Reviewed-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
*	draw: add support for passing buffers to vs/gs shaders.	Dave Airlie	2016-04-12	5	-3/+32
\| \| \| \| \| \| \| \|	Like the image code, but for shader buffers this time. Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>