| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
s/agressive/aggressive/g
Trivial.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Shader-db results on Broadwell:
total instructions in shared programs: 7152330 -> 7137006 (-0.21%)
instructions in affected programs: 1330548 -> 1315224 (-1.15%)
helped: 5797
HURT: 76
GAINED: 0
LOST: 8
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
| |
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Previously, this case was being handled in match_expression prior to
calling match_value. However, there is really no good reason for this
given that match_value has all of the information it needs. Also, they
weren't being handled properly in the commutative case and putting it in
match_value gives us that for free.
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit switches us from the current setup of using hash sets for
use/def sets to using linked lists. Doing so should save us quite a bit of
memory because we aren't carrying around 3 hash sets per register and 2 per
SSA value. It should also save us CPU time because adding/removing things
from use/def sets is 4 pointer manipulations instead of a hash lookup.
Running shader-db 50 times with USE_NIR=0, NIR, and NIR + use/def lists:
GLSL IR Only: 586.4 +/- 1.653833
NIR with hash sets: 675.4 +/- 2.502108
NIR + use/def lists: 641.2 +/- 1.557043
I also ran a memory usage experiment with Ken's patch to delete GLSL IR and
keep NIR. This patch cuts an aditional 42.9 MiB of ralloc'd memory over
and above what we gained by deleting the GLSL IR on the same dota trace.
On the code complexity side of things, some things are now much easier and
others are a bit harder. One of the operations we perform constantly in
optimization passes is to replace one source with another. Due to the fact
that an instruction can use the same SSA value multiple times, we had to
iterate through the sources of the instruction and determine if the use we
were replacing was the only one before removing it from the set of uses.
With this patch, uses are per-source not per-instruction so we can just
remove it safely. On the other hand, trying to iterate over all of the
instructions that use a given value is more difficult. Fortunately, the
two places we do that are the ffma peephole where it doesn't matter and GCM
where we already gracefully handle duplicates visits to an instruction.
Another aspect here is that using linked lists in this way can be tricky to
get right. With sets, things were quite forgiving and the worst that
happened if you didn't properly remove a use was that it would get caught
in the validator. With linked lists, it can lead to linked list corruption
which can be harder to track. However, we do just as much validation of
the linked lists as we did of the sets so the validator should still catch
these problems. While working on this series, the vast majority of the
bugs I had to fix were caught by assertions. I don't think the lists are
going to be that much worse than the sets.
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
| |
We were rolling our own rewrite_src variant in copy-propagation. Let's
stop doing that and use the ones in core NIR.
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
| |
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
| |
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The out-of-SSA pass was one of the first passes written when getting SSA
up-and-going (for obvious reasons). As such, it came before a lot of the
nifty SSA-based helpers were introduced. This commit modernizes it so that
we're no longer doing nearly as much manual banging on use/def sets.
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
| |
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
|
| |
Nothing produces it, and nothing can consume it.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Acked-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
| |
All paths that produce GLSL IR for NIR lower ir_unop_log. All paths
that consume NIR will explode if they geta nir_op_flog.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Acked-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
| |
Nothing produces it, and nothing can consume it.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Acked-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
| |
All paths that produce GLSL IR for NIR lower ir_unop_exp. All paths
that consume NIR will explode if they geta nir_op_fexp.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Acked-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The spec is vague all over the place about this, but this seems
to be the intent, we can probably make this optional later if
someone makes hw that cares and writes a driver.
Basically we need to double count some of the d types but
only for totalling not for slot number assignment.
Reviewed-by: Ilia Mirkin <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
| |
instead of doing the attempts at dual slot handling here,
let the backend do it.
Reviewed-by: Ilia Mirkin <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Just more boilerplate stuff.
v2:
bad fallthrough on versioning,
this is my ugly but self contained solution (Ian)
Reviewed-by: Ilia Mirkin <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
| |
instructions in affected programs: 380 -> 376 (-1.05%)
helped: 2
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Glenn Kennard <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
... and (a >= c) || (b >= c) as max(a, b) >= c.
Similar to commit 97e6c1b9.
total instructions in shared programs: 6182276 -> 6182180 (-0.00%)
instructions in affected programs: 6400 -> 6304 (-1.50%)
helped: 68
HURT: 4
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Glenn Kennard <[email protected]>
|
|
|
|
|
|
|
| |
No changes, but does prevent some regressions in the next commit.
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Glenn Kennard <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Helps the same set of programs as the previous commit.
instructions in affected programs: 4490 -> 4346 (-3.21%)
helped: 8
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Glenn Kennard <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Four shaders in Unreal 4's Sun Temple are helped, and gain SIMD16
because we avoid an integer multiplication.
instructions in affected programs: 2353 -> 2245 (-4.59%)
helped: 4
GAINED: 4
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Glenn Kennard <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Refactoring done on active attribute queries did not take in to
account special built-in inputs for the vertex stage. This commit
sets them referenced by vertex stage so that they get enumerated
properly.
Fixes Piglit test 'get-active-attrib-returns-all-inputs' failure.
Signed-off-by: Tapani Pälli <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90243
Acked-by: Jose Fonseca <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
Reviewed-By: Martin Peres <[email protected]>
|
|
|
|
|
|
|
|
| |
Silences gcc warning:
builtin_functions.cpp:204:23: warning: suggest parentheses around '&&'
within '||' [-Wparentheses]
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
| |
Otherwise `make distcheck' will fail.
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 7e414b58640aee6e243d337e72cea290c354f632 broke the gl_FragData array
into separate gl_FragData[i] variables, so drivers can eliminate useless
writes to gl_FragData improving their performance.
The problem occurs when GLSL IR code is linked in the following case:
* The FS output variable base data type does not match gl_FragData one (float
vector)
* The FS output variable is replaced by gl_out_FragDataX because of commit
7e414b58640aee6 with X from 0 to GL_MAX_DRAW_BUFFERS.
Then the FS output variable base data type is lost in the resulting GLSL IR,
making that the driver does a wrong assignment to gl_out_FragData components
because of unmatching data types.
This patch reverts the fragdata array lowering when the output var base data type
doesn't match gl_out_FragData, i.e., when output variable base data type is
not a float or a float vector.
This patch fixes 250 dEQP tests (tested in an Intel Haswell machine)
dEQP-GLES3.functional.fragment_out.random.* (22 failed tests)
dEQP-GLES3.functional.fragment_out.array.uint.* (120 failed tests)
dEQP-GLES3.functional.fragment_out.array.int.* (108 failed tests)
Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
|
|
|
|
| |
v2: Add missing lexer support. Noticed by Tapani.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]> [v1]
|
|
|
|
|
| |
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
|
|
|
|
| |
Currently no 3.10 ES features (beyond 3.00 ES) are enabled. That will
come later.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
|
|
|
| |
v2: Change GL version from 400 to 420. Noticed by Tapani and Ilia.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I opted to comment out "last_field" because it was not obvious what the
meaning of the dangling bool would be. For the other parameters, the
meaning was more intuitive without the name.
link_uniform_blocks.cpp:70:65: warning: unused parameter 'name' [-Wunused-parameter]
virtual void enter_record(const glsl_type *type, const char *name,
^
link_uniform_blocks.cpp:77:65: warning: unused parameter 'name' [-Wunused-parameter]
virtual void leave_record(const glsl_type *type, const char *name,
^
link_uniform_blocks.cpp:93:62: warning: unused parameter 'record_type' [-Wunused-parameter]
bool row_major, const glsl_type *record_type,
^
link_uniform_blocks.cpp:94:34: warning: unused parameter 'last_field' [-Wunused-parameter]
bool last_field)
^
link_uniforms.cpp:547:65: warning: unused parameter 'name' [-Wunused-parameter]
virtual void enter_record(const glsl_type *type, const char *name,
^
link_uniforms.cpp:556:65: warning: unused parameter 'name' [-Wunused-parameter]
virtual void leave_record(const glsl_type *type, const char *name,
^
link_uniforms.cpp:567:34: warning: unused parameter 'last_field' [-Wunused-parameter]
bool last_field)
^
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
Reviewed-by: Dylan Baker <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
|
|
|
|
| |
And rename _mesa_glsl_parse_state::early_fragment_tests to
fs_early_fragment_tests for consistency with other FS-specific flags in the
same struct.
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
| |
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
| |
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
| |
Image memory qualifiers (coherent, volatile, restrict, readonly and writeonly)
follow slightly different rules from storage qualifiers, e.g. the uniqueness
rule doesn't apply. Make them a separate non-terminal.
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
| |
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
| |
Broke in commit f00c5f85b82efe9535b18dbf97c4591fb28aeae6 when
adding support for multidimensional arrays
Reviewed-by: Ilia Mirkin <imirkin at alum.mit.edu>
|
|
|
|
| |
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Timothy Arceri <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Timothy Arceri <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
| |
|
|
|
|
|
| |
Reviewed-by: Juha-Pekka Heikkila <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The nir_lower_source_mods pass does a weak form of copy propagation to
clean up all of the mov-with-negate's that get generated. However, we
weren't properly checking that the sources were SSA and so we could end up
moving a register read which is not, in general, valid.
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
| |
The old code wasn't correctly handling the case where the new value of the
source contains an indirect.
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
| |
Reviewed-by: Connor Abbott <[email protected]>
|