| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
| |
git grep -l comparitor | xargs sed -i 's/comparitor/comparator/g'
Just happened to notice this in a patch that was sent and included one
of the tokens in question.
Signed-off-by: Ilia Mirkin <[email protected]>
Acked-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
| |
Looks like immed branch offset size increased again.. making what we
think is a small negative number look to hw like a huge positive number.
And things go badly when shader tries to jump to hyperspace.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
On a3xx/a4xx, the SP_VS_VPC_DST_REG.OUTLOCn is offset by 8, so we used
to add this offset into fs->inputs[n].inloc. But a5xx drops this extra
offset-by-8. So instead make inloc zero based and add the offset when
we emit OUTLOCn values (for the gen's that need the offset).
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
Helps simplify things on a5xx, where pos/psize get added to the vs-out
map. And anyways, simplifies a3xx and a4xx.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
Fixes fallout from acc23b04 ("ralloc: remove memset from ralloc_size").
We were still depending on zero'd allocations in a couple of places.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
No change in behavior. ralloc_size is equivalent to rzalloc_size.
That will change though.
Calls not switched to rzalloc_size:
- ralloc_vasprintf
- glsl_type::name allocation (it's filled with snprintf)
- C++ classes where valgrind didn't show uninitialized values
I switched most of non-glsl stuff to rzalloc without checking whether
it's really needed.
Reviewed-by: Edward O'Callaghan <[email protected]>
Tested-by: Edmondo Tommasina <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
When restoring something from shader cache we won't have and don't
want to create a nir_shader this change detaches the two.
There are other advantages such as being able to reuse the
shader info populated by GLSL IR.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
One of NIR's invariants is that control flow lists always start and end
with blocks. There's no good reason why we should return a cf_node from
these functions since we know that it's always a block. Making it a block
lets us remove a bunch of code.
Signed-off-by: Jason Ekstrand <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Jason Ekstrand <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The hw clipper only handles up to 6 UCPs. If there are more than 6 UCPs,
or a clip vertex, or clip distances are in use, then we must use the
fallback discard-based clipping from the frag shader.
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: [email protected]
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Jason suggested adding an assert(function->impl) here. All callers
of this function actually want ->impl, so I decided just to change
the API.
We also change the nir_lower_io_to_temporaries API here. All but one
caller passed nir_shader_get_entrypoint(), and with the previous commit,
it now uses a nir_function_impl internally. Folding this change in
avoids the need to change it and change it back.
v2: Fix one call I missed in ir3_compiler (caught by Eric).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
| |
This reduces the diff between GLSL-to-NIR and TGSI-to-NIR, and gives NIR
more optimization to work on.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
This lets TTN-using drivers handle FRAG_RESULT_DEPTH the same between all
their source paths.
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For various tex fetch instructions, coord's get fixed up in different
ways. But modifying the array returned from get_src() has side-effects
if the same SSA src is used again.. the later instruction will see the
previous fixups.
Fix this, and const'ify things to prevent this sort of mistake in the
future.
Noticed by Varad when adding support for txf_ms.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
To silence missing initializers warning
Signed-off-by: Francesco Ansanelli <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
This is also used in gmem code, which executes from the "bottom half"
(ie. from the flush_queue worker thread), so it cannot be in fd_context.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Likewise, rename the enum type to glsl_interp_mode.
Beyond the GLSL front-end, talking about "interpolation modes" seems
more natural than "interpolation qualifiers" - in the IR, we're removed
from how exactly the source language specifies how to interpolate an
input. Also, SPIR-V calls these "decorations" rather than "qualifiers".
Generated by:
$ find . -regextype egrep -regex '.*\.(c|cpp|h)' -type f -exec sed -i \
-e 's/INTERP_QUALIFIER_/INTERP_MODE_/g' \
-e 's/glsl_interp_qualifier/glsl_interp_mode/g' {} \;
Signed-off-by: Kenneth Graunke <[email protected]>
Acked-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
For .vert/.frag, now multiple can be specified on the cmdline for
purposes of linking, and the last one specified is the one that is
fed into the ir3 backend (and dumped along the way if --verbose is
specified)
Without this, varyings in frag shaders would appear as undefined.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
We can push the unwrap of pipe_resource down.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Clean up misrepetitions ('if if', 'the the' etc) found throughout the
comments. This has been done manually, after grepping
case-insensitively for duplicate if, is, the, then, do, for, an,
plus a few other typos corrected in fly-by
v2:
* proper commit message and non-joke title;
* replace two 'as is' followed by 'is' to 'as-is'.
v3:
* 'a integer' => 'an integer' and similar (originally spotted by
Jason Ekstrand, I fixed a few other similar ones while at it)
Signed-off-by: Giuseppe Bilotta <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
| |
Give algebraic-opt pass a chance to catch udiv by const power-of-two,
before running lower-idiv pass.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
CID 1362453
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
Should also fix coverity warning: CID 1362454
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
Right now libglsl.la depends on libnir.la so putting it in libnir.la
adds a dependency on libglsl.la that goes the wrong direction.
Reviewed-by: Emil Velikov <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
|
|
|
|
| |
Use glsl/libstandalone.la to add support for taking glsl src files (in
addition to .tgsi) as input. Then glsl->nir and feed the result into
the ir3 backend as normal.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The variable-indexing tests always had a few random fails, which I
usually couldn't reproduce when running tests manually. Somehow
recently this got a lot worse. I ported a couple of the shaders to
GLES to see what blob does, and it also seems to be avoiding to cp
indirect srcs. So I guess indirect w/ instructions other than cat1
(mov) are not totally reliable. Let's just switch that off until
this is better understood.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
Don't hard-code the gpu-id anymore.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
Not sure how we didn't hit this already, but since we want fdiv
converted into mul + rcp, we should set this.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
In the glsl->tgsi path, this already gets translated to VAR8, which
matches up with rasterizer->sprite_coord_enable.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
When we got NIR directly from state tracker (vs using tgsi_to_nir) we
need to realize this and skip some TGSI specific hacks.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
For now under debug flag, since only suitable for debugging/testing.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
The i965 driver has its own pass for fusing mul+add combinations that's
much smarter than what nir_opt_algebraic can do so we don't want to get the
nir_opt_algebraic one just because we didn't set lower_ffma.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We index into these based on var->data.driver_location, which might have
gaps (ie. two inputs, one w/ drvloc 0 and other 2). This shows up in
(for example) 'bin/copyteximage 1D', but was only noticed recently due
to additional asserts.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Lower lrp when operating with double operands because float version of
lrp is also lowered.
Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
Since this is potentially modifying the block structure of the shader,
it needs the _safe() version of the iterator.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are a total of four possible currently, rather than 2. So we need
to be prepared for the input array to grow by 16 components. We could
get away with less if we could pack sysval inputs.. and the way this is
handled currently isn't really the nicest thing. But it's a tactical
fix for an issue hit in:
GL31-CTS.gtf30.GL3Tests.transform_feedback.transform_feedback_vertex_id
Signed-off-by: Rob Clark <[email protected]>
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This was always a bit overly complicated, and had some issues (like
ctx->prog.dirty not getting reset at the end of the batch). It also
required some special hacks to avoid resetting dirty state on binning
pass. So just move it all into ctx->dirty (leaving some free bits
for future shader stages), and make FD_DIRTY_PROG just be the union
of all FD_SHADER_DIRTY_*.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
Now that the opc's encode the instruction category (making them unique)
we no longer need to check the category in addition to the opc.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
The instruction encoding allows for more registers, but at least on
a3xx/a4xx they don't actually exist.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Helps reduce register pressure and instruction counts for immediates
that would otherwise require a mov into gpr.
total instructions in shared programs: 4455332 -> 4369297 (-1.93%)
total dwords in shared programs: 8807872 -> 8614432 (-2.20%)
total full registers used in shared programs: 263062 -> 250846 (-4.64%)
total half registers used in shader programs: 9845 -> 9845 (0.00%)
total const registers used in shared programs: 1029735 -> 1466993 (42.46%)
half full const instr dwords
helped 0 10415 0 17861 5912
hurt 0 1157 21458 947 33
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
Needed in next commit.. just split out to reduce noise.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Brian Paul <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|