| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
This reduces code duplication.
Suggested-by: Jose Fonseca <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This reduces code duplication. It also adds support for drivers where the
fragment position is a system value.
Suggested-by: Jose Fonseca <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I removed this mistakenly in 2dbc20e45689e09766552517a74e2270e49817b5. I
actually thought it should not be necessary and a piglit run didn't show
any differences, but this shouldn't have been in there.
draw_prepare_shader_outputs() is in fact dependent on NEW_RASTERIZER.
The new polygon-mode-facing test indeed shows why this is necessary, there's
lots of invalid reads and writes with valgrind (also crashes without
valgrind), because the pre-pipeline vertex size doesn't match the
post-pipeline vertex size (note this won't help much with stages which don't
have the prepare hook which can grow the vertex size, in particular the wide
point stage, but this isn't used by llvmpipe). The test still won't pass, of
course, but it is only usage of uninitialized values now, which is much
less dangerous...
(Albeit I'm pretty sure for i915 it really is not needed anymore as it
doesn't care about the extra outputs and doesn't call
draw_prepare_shader_outputs().)
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
| |
Discovered by accident, valgrind was complaining (could have possibly caused
us to create redundant geometry shader variants).
v2: convinced by Brian and Jose, just use memset for both gs and vs keys,
just as easy and less error prone.
|
|
|
|
|
|
|
|
|
| |
Otherwise, clipped lines would have undefined stippling reset bit if line
stippling is enabled.
(Untested, and I just assume copying over the bits from the original line
is actually the right thing to do.)
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The unfilled stage was not filling in the prim header, and the line stage
then decided to reset the stipple counter or not based on the uninitialized
data. This causes some failures in conform linestipple test (albeit quite
randomly happening depending on environment).
So fill in the prim header in the unfilled stage - I am not entirely sure
if anybody really needs determinant after that stage, but there's at least
later stages (wide line for instance) which copy over the determinant as well.
Reviewed-by: Jose Fonseca <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
draw emit couldn't care less what the interpolation mode is...
This somehow looked like it would matter, all drivers more or less
dutifully filled that in correctly. But this is only used for emit,
if draw needs to know about interpolation mode (for clipping for instance)
it will get that information from the vs anyway.
softpipe actually used to depend on that interpolation parameter, as it
abused that structure quite a bit but no longer.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously the code would just redirect requests for attributes which
don't exist to use output 0. Rework this to output all zeros instead which
seems more useful - in particular some extensions like
ARB_fragment_layer_viewport require 0 in the fs even if it wasn't output by
previous stages. That way, drivers don't have to special case this depending
if the vs/gs outputs some attribute or not.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Fix silly issue with MSVC case fall-though support to need
a extra 'break;'
Found-by: Coccinelle
Signed-off-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
NaNs mean it should be clipped, otherwise the NaNs might get passed to the
next stages (if clipping didn't happen for another reason already), which
might cause all kind of problems.
The llvm path got this right already (possibly by luck), but this isn't used
when there's a gs active.
Found by code inspection, verified with some hacked piglit test and some more
hacked debug output.
(Note the clipper can still itself incorrectly generate NaN and INF position
values in its output prims (at least after w divide / viewport transform) even
if the inputs weren't NaNs, if the position data of the vertices is
"sufficiently bad".)
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Those stages only really work for OGL-style texturing (so number of samplers
and views mostly the same, certainly for the max values).
These get often set up all at once, thus there might be max number of both
even if all of them are just NULL. We must not set the max number of samplers
and views to the same value since that will lead to terrible things if a driver
supports more views than samplers (and the state tracker set up all the views).
(This will not make these stages magically work if a shader uses dx10-style
texturing, they might still replace an actually used sview in that case.)
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We just ignored them altogether. While this feature is rather old-fashioned
supporting it is actually rather trivial.
This fixes the associated piglit tests (2 gl-1.0-edgeflag, 2 gl-2.0-edgeflag
and all (7) of point-vertex-id).
v2: comment fixes, and make the use of the edgeflag in clipmask consistent
with when it's actually there (should be impossible to hit a case where the
difference would actually matter but still...)
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This just adds confusion, these parameters are used when fetching vertices
by translate, but certainly not when emitting hw vertices for drivers, they
make no sense there (setting them has no consequences otherwise since there
won't be any elements with instance_divisor set). So just set them to 0 (the
draw_pipe_vbuf code for emitting vertices when the draw pipeline is run
already does exactly that).
Also while here do some whitespace cleanup.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
vertex header had both clip_pos and clip_vertex.
We only really need one (clip_pos) because the draw llvm shader would
overwrite the position output from the vs with the viewport transformed.
However, we don't really need the second one, which was only really used
for gl_ClipVertex - if the shader didn't have that the values were just
duplicated to both clip_pos and clip_vertex. So, just use this from the vs
output instead when we actually need it.
Also change clip debug to output both the data from clip_pos and the
clipVertex output (if available).
Makes some things more complex, some things less complex, but seems more
easy to understand what clipping actually does (and what values it uses
to do its magic).
Reviewed-by: Brian Paul <[email protected]
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
| |
Seems obvious now this should use the data from position and not clip_vertex
(albeit might not really make a difference).
Reviewed-by: Brian Paul <[email protected]
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
| |
clip -> clip_vertex and pre_clip_pos -> clip_pos.
Looks more obvious to me what these values actually represent (so use
something resembling the vs output names).
Reviewed-by: Brian Paul <[email protected]
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This is just for code cleanup, conceptually the have_clipdist really
isn't per-vertex state, so don't put it there (just dependent on the
shader). Even though there wasn't really any overhead associated with
this, we shouldn't store random shader information in the vertex header.
Reviewed-by: Brian Paul <[email protected]
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
I'm pretty sure this should use position (i.e. pre_clip_pos) and not
the output from clipVertex. Albeit piglit doesn't care. It is what we
use in the clip test, and it is what every other driver does (as they
don't even have clipVertex output and lower the additional planes to
clip distances).
Reviewed-by: Brian Paul <[email protected]
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
| |
Otherwise, if struct vertex_info is changed, you're in for some surprises...
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Discovered this when working on other clip code, apparently didn't work
correctly - the combination of linear interpolated values and using
gl_ClipVertex produced wrong values (failing all such combinations
in piglits glsl-1.30 interpolation tests, named
interpolation-noperspective-XXX-vertex).
Use the pre-clip-pos values when determining the interpolation factor to
fix this.
Noone really understands this code well, but everybody agrees this looks
sane... This fixes all those failing tests (10 in total) both with
the llvm and non-llvm draw paths, with no piglit regressions.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Edward O'Callaghan <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Use NULL tests of the form `if (ptr)' or `if (!ptr)'.
They do not depend on the definition of the symbol NULL.
Further, they provide the opportunity for the accidental
assignment, are clear and succinct.
Signed-off-by: Edward O'Callaghan <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Edward O'Callaghan <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This was just plain broken. It used always the value from v0 (for vp_index)
but would pass the value from the provoking vertex to later stages - but only
if there was a corresponding fs input, otherwise the layer/vp index would get
lost completely (as it would try to interpolate the (unsigned) values as
floats).
So, make it obey provoking vertex rules (drivers relying on draw will need to
do the same). And make sure that the default interpolation mode (when no
corresponding fs input is found) for them is constant.
Also, change the code a bit so constant inputs aren't interpolated then
copied over later.
Fixes the new piglit test gl-layer-render-clipped.
v2: more consistent whitespaces fixes for function defs, and more tab killing
(overall still not quite right however).
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
compressed textures are very slow because decoding is rather complex
(and because there's no jit code code to decode them too for non-technical
reasons).
Thus, add some texture cache which holds a couple of decoded blocks.
Right now this handles only s3tc format albeit it could be extended to work
with other formats rather trivially as long as the result of decode fits into
32bit per texel (ideally, rgtc actually would decode to more than 8 bits
per channel, but even then making it work for it shouldn't be too difficult).
This can improve performance noticeably but don't expect wonders (uncompressed
is unsurprisingly still faster). It's also possible it might be slower in
some cases (using nearest filtering for example or if there's otherwise not
many cache hits, the cache is only direct mapped which isn't great).
Also, actual decode of a block relies on util code, thus even though always
full blocks are decoded it is done texel by texel - this could obviously
benefit greatly from simd-optimized code decoding full blocks at once...
Note the cache is per (raster) thread, and currently only used for fragment
shaders.
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When the draw module splits long line loops, the sections are emitted
as line strips. But the primitive type wasn't set correctly so each
section was being drawn as a loop, introducing extra line segments.
To fix this, we pass a new DRAW_LINE_LOOP_AS_STRIP flag to the run()
function. The linear/elt_run() functions have to check for this flag
and set their primitive type accordingly.
No piglit regressions. Fixes piglit's lineloop with -count 4097 or
higher.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81174
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
| |
Reviewed-by: Charmaine Lee <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Generated by running:
git grep -l INLINE src/gallium/ | xargs sed -i 's/\bINLINE\b/inline/g'
git grep -l INLINE src/mesa/state_tracker/ | xargs sed -i 's/\bINLINE\b/inline/g'
git checkout src/gallium/state_trackers/clover/Doxyfile
and manual edits to
src/gallium/include/pipe/p_compiler.h
src/gallium/README.portability
to remove mentions of the inline define.
Signed-off-by: Ilia Mirkin <[email protected]>
Acked-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
This extends the draw code to add support for invocations.
Reviewed-by: Roland Scheidegger <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
| |
This is just for softpipe, llvmpipe won't work without
some changes.
Reviewed-by: Roland Scheidegger <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
To allow for shaders which use SVIEW decls for TEX* instructions, we
need to preserve the constraint that the shader either has no SVIEW's or
it has one matching SVIEW for each SAMP.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This probably got broken when the samplers were converted to be indexed
by shader type.
Seen when looking at bug 89819 though I'm not sure if that really was what
the bug was about...
Cc: "10.5 10.6" <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
| |
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Was off-by-one. llvm says inserting an element with an index higher than the
number of elements yields undefined results. Previously such inserts were
ignored but as of llvm revision 235854 the vector gets replaced with undef,
causing failures.
This fixes piglit gl-3.2-layered-rendering-gl-layer, as mentioned in
https://llvm.org/bugs/show_bug.cgi?id=23424.
Reviewed-by: Brian Paul <[email protected]>
Cc: [email protected]
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We were resetting the prim id count for each run of the prim assembler,
hence this only worked when the draw calls were very small (the exact limit
depending on the vertex size), since larger draw calls get split up.
So, do the same as we do already if there's a gs, reset it to zero explicitly
for every new instance (this possibly could use the same variable but that
isn't doable without some heavy refactoring and I'm not sure it makes sense).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90130.
Reviewed-by: Jose Fonseca <[email protected]>
CC: <[email protected]>
|
|
|
|
|
|
|
|
| |
Neither the shader nor the key change when doing elts or linear variant, so
this was just annoying (probably mildly useful at some point when we printed
the IR per function too).
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
llvm goes crazy when doing that, using way more memory and time, though there's
probably more to it - this points to a very much similar issue as fixed in
8a9f5ecdb116d0449d63f7b94efbfa8b205d826f. In any case I've seen a quite
plain looking vertex shader with just ~50 simple tgsi instructions (but with a
dozen or so such indirect constant buffer lookups) go from a terribly high
~440ms compile time (consuming 25MB of memory in the process) down to a still
awful ~230ms and 13MB with this fix (with llvm 3.3), so there's still obvious
improvements possible (but I have no clue why it's so slow...).
The resulting shader is most likely also faster (certainly seemed so though
I don't have any hard numbers as it may have been influenced by compile times)
since generally fetching constants outside the buffer range is most likely an
app error (that is we expect all indices to be valid).
It is possible this fixes some mysterious vertex shader slowdowns we've seen
ever since we are conforming to newer apis at least partially (the main draw
loop also has similar looking conditionals which we probably could do without -
if not for the fetch at least for the additional elts condition.)
v2: use static vars for the fake bufs, minor code cleanups
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This has got a bit out of control with more and more parameters added.
Worse, whenever something in there changes all callees have to be updated
for that, even though they don't really do much with any parameter in there
except pass it on to the actual sampling function.
Hence simply put almost everything into a struct. Also instead of relying
on some arguments being NULL, be explicit and set this in a key (which is
just reused for function generation for simplicity). (The code still relies
on them being NULL in the end for now.)
Technically there is a minimal functional change here for shadow sampling:
if shadow sampling is done is now determined explicitly by the texture
function (either sample_c or the gl-style tex func inherit this from target)
instead of the static texture state. These two should always match, however.
Otherwise, it should generate all the same code.
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The callbacks used for getting the dynamic texture/sampler state were using
the jit_context from the generated jit function. This works just fine, however
that way it's impossible to generate separate functions for texture sampling,
as will be done in the next commit. Hence, pass this pointer through all
interfaces so it can be passed to a separate function (technically, it would
probably be possible to extract this pointer from the current function instead,
but this feels hacky and would probably require some more hacks if we'd use
real functions instead of inlining all shader functions at some point).
There should be no difference in the generated code for now.
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
| |
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89372
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
The code was exactly the same, except util/ has c++ guards and a struct
simple_node declaration.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
v2: move initialization of llvm_gs to declaration.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This reverts db3dfcfe90a3d27e6020e0d3642f8ab0330e57be.
The commit was correct but we've got some precision problems later in
llvmpipe (or possibly in draw clip) due to the vertices coming in in
different order, causing some internal test failures. So revert for now.
(Will only affect drivers which actually support constant-interpolated
attributes and not just flatshading.)
|
|
|
|
|
|
| |
This fixes 4 vertexid related piglit tests with llvmpipe due to switching
behavior of vertexid to the one gl expects.
(Won't fix non-llvm draw path since we don't get the basevertex currently.)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Because all topologies are reduced to basic primitives (i.e. no strips, fans)
and the vertices involved are all copied, there's no need for any elaborate
decisions where to insert the prim id. The logic employed was correct for
first provoking vertex, but didn't account at all for the last provoking
vertex case. And since we now will get the right constant value even if the
primitive type is later changed (for unfilled etc.) this is no longer
required to pass certain tests (which were checking for prim_id == some
const interpolated value so passing because both were wrong in the end).
This is a bit overkill (3x4 values assigned in total even though it's really
one scalar per prim...) but the code is now much easier and I don't need to
add more cases for last provoking vertex.
This fixes piglit primitive-id-no-gs-strip test.
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Previously the first provoking vertex convention would only be used if
flatshading were enabled. No matter how I look at it that cannot be possibly
correct. Maybe the code getting used was somewhat simpler that way at a time
where there weren't constant interpolated attributes, only flatshading...
(Note that all other places including the decomposition macros already do
the same.)
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This stage only worked for traditional old-school flatshading, it did ignore
constant interpolated values and only handled colors, the code probably
predates using of constant interpolated values in gallium. So fix this - the
clip stage apparently did this a long time ago already.
Unfortunately this also means the stage needs to be invoked when flatshading
isn't enabled but some other prim changing stages are - for instance with
fill mode line each of the 3 lines in a tri should get the same attribute
value from the leading vertex in the original tri if interpolation is constant,
which did not happen before
Due to that, the stage is now run in more cases, even unnecessary ones. Could
in theory skip it completely if there aren't any constant interpolated
attributes (and rast->flatshade isn't set), but not sure it's worth bothering,
as it looks kinda complicated getting this information in advance.
No piglit change (doesn't really cover this directly).
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Just like we do for tris (det shouldn't matter at this point, however
can have flags for things like line stipple reset).
No piglit change, it would fail line stippling tests if the flatshade
stage were run, which will happen with the next commit.
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
| |
Required by Nine. Tested with util_run_tests.
It's added to softpipe, llvmpipe, and r300g/swtcl.
Tested-by: David Heidelberg <[email protected]>
|