DEQP likes to do math on uniforms, and the "fmaxabs dst, uni, uni" to get
the absolute value would get lowered. The lowering doesn't bother to try
to restrict the lifetime of the lowered uniforms, so we'd end up with
register allocation failing due to this on 5 of the tests (more tests
still fail in RA, and it looks like we'll need to reduce lowered uniform
lifetimes to fix them).
No changes on shader-db, though fewer extra MOVs are generated on even
glxgears (MOVs pair well enough that it ends up being the same instruction
count).
|
|
|
|
|
| |
This does actually happen in the wild (particularly fabs of a uniform), so
we'd like to support it.
|
| |
|
|
|
|
|
|
|
|
| |
It looks like nir_lower_idiv is going to use it soon, so add support.
With Ilia's change, this fixes one case in fs-op-div-large-uint-uint (with
GL 3.0 forced on).
Cc: "11.0" <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
Since X has undefined contents in new pixmaps, it will allocate new
textures for an FBO and draw to them without an explicit clear. For
VC4, it's much faster to emit a clear than the load of the actual
undefined memory contents, so just do that instead.
|
|
|
|
|
|
|
| |
I'm not sure that what the caller does is appropriate (just have a NULL
sampler at this slot), but it fixes the immediate crash.
Cc: "11.0" <[email protected]>
|
|
|
|
|
|
|
| |
I was afraid our callers weren't prepared for this, but it looks like
at least for resource creation, mesa/st throws an error appropriately.
Cc: "11.0" <[email protected]>
|
| |
|
| |
|
|
|
|
|
|
|
|
|
| |
This greatly increases the pressure you can put on the driver before
create fails. Ultimately we need to let the kernel take control of
our cached BOs and just take them from us (and other clients)
directly, but this is a very easy patch for the moment.
Cc: "11.0" <[email protected]>
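The retry idea described above can be sketched as follows. This is a toy model, not the driver's code: it assumes the change amounts to releasing idle cached BOs and retrying once when an allocation fails, and every name here (toy_screen, toy_kernel_alloc, toy_cache_flush, toy_bo_alloc) is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model: one memory budget shared by live BOs and idle cached BOs. */
struct toy_screen {
    size_t budget;  /* total bytes the "kernel" will hand out */
    size_t in_use;  /* bytes held by live BOs */
    size_t cached;  /* bytes held by idle BOs kept around for reuse */
};

/* Stand-in for the kernel allocation: fails once the budget is exhausted. */
static bool toy_kernel_alloc(struct toy_screen *s, size_t size)
{
    if (s->in_use + s->cached + size > s->budget)
        return false;
    s->in_use += size;
    return true;
}

/* Give every cached BO back to the "kernel". */
static void toy_cache_flush(struct toy_screen *s)
{
    s->cached = 0;
}

/* Allocate; on failure, flush the cache once and retry. */
static bool toy_bo_alloc(struct toy_screen *s, size_t size)
{
    assert(s != NULL);

    if (toy_kernel_alloc(s, size))
        return true;
    if (s->cached == 0)
        return false;  /* nothing to release, so a retry cannot help */

    toy_cache_flush(s);
    return toy_kernel_alloc(s, size);
}
```

The retry only helps with memory held by this client's own cache, which is why the message still wants the kernel to be able to reclaim cached BOs from all clients.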
|
|
|
|
|
| |
It's surprising to see "0kb" printed for debug on short shaders, while
4kb alignment won't be surprising.
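A small sketch of the arithmetic behind that remark, assuming the debug output divides the byte size by 1024 and that the size is aligned to 4096 bytes before being printed. The helper names toy_align and toy_size_kb are hypothetical, not taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Round size up to a multiple of align (align must be a power of two). */
static uint32_t toy_align(uint32_t size, uint32_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (size + align - 1) & ~(align - 1);
}

/* Size in whole kb, truncating the way an integer debug print would. */
static uint32_t toy_size_kb(uint32_t size)
{
    return size / 1024;
}
```

With these helpers, a 200-byte shader reports 0kb when printed unaligned, and 4kb once toy_align(200, 4096) is applied first.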
|
|
|
|
| |
60MB of cached BOs are a lot less scary than 600MB.
|
|
|
|
|
| |
Improves low-settings openarena performance by 31.9975% +/- 0.659931%
(n=7).
|
|
|
|
|
|
| |
For ARB_copy_image.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
| |
total instructions in shared programs: 89251 -> 87862 (-1.56%)
instructions in affected programs: 52971 -> 51582 (-2.62%)
|
|
|
|
| |
Another step in reducing the special-casing of instructions.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This paves the way for copy propagating our unpacks. We end up with a
small change on shader-db:
total instructions in shared programs: 89390 -> 89251 (-0.16%)
instructions in affected programs: 19041 -> 18902 (-0.73%)
which appears to be because we no longer convert MOVs for an FMAX dst,
r4.unpack, r4.unpack (instead of the previous MOV dst, r4.unpack), and
this ends up with a slightly better schedule.
|
|
|
|
|
|
|
|
|
| |
At one point I thought packs and unpacks were in the same field of the
instruction. They aren't. These instructions therefore never cause a
pack.
total instructions in shared programs: 89472 -> 89390 (-0.09%)
instructions in affected programs: 15261 -> 15179 (-0.54%)
|
|
|
|
|
| |
I'm going to introduce some more types of MOV, which also want the elision
of raw MOVs.
|
|
|
|
| |
We can do 16a/16b from float as well. No difference on shader-db.
|
| |
|
|
|
|
| |
No problems being fixed, but needed for the new unpack changes.
|
|
|
|
| |
Not used yet, but will be.
|
|
|
|
|
| |
They're only f16-to-f32 on a float operation, otherwise they're
i16-to-i32.
|
|
|
|
|
| |
No known bugs, just something I noticed while updating optimization code
for other changes.
|
|
|
|
|
|
|
|
|
|
| |
One instruction instead of four, and it turns out you do this a lot for
the Over operator.
total uniforms in shared programs: 32168 -> 32087 (-0.25%)
uniforms in affected programs: 318 -> 237 (-25.47%)
total instructions in shared programs: 89830 -> 89472 (-0.40%)
instructions in affected programs: 6434 -> 6076 (-5.56%)
|
|
|
|
|
| |
I don't know what the previous test was trying to do, but it dates back to
the first add of vc4_qpu_emit.c. No change to shader-db.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We can't do this all the time, because you want blending to be done in
linear space, and sRGB would lose too much precision being done in 4x8.
The win on instructions is pretty huge when you can, though.
total uniforms in shared programs: 32065 -> 32168 (0.32%)
uniforms in affected programs: 327 -> 430 (31.50%)
total instructions in shared programs: 92644 -> 89830 (-3.04%)
instructions in affected programs: 15580 -> 12766 (-18.06%)
Improves openarena performance at 1920x1080 from 10.7fps to 11.2fps.
|
| |
|
|
|
|
|
|
|
| |
This can happen when we're doing destination packing -- we don't know
what's in the rest of the register.
Signed-off-by: Eric Anholt <[email protected]>
|
| |
|
|
|
|
|
|
| |
I haven't proven that this happens (I've got other GPU hangs in the
way), but the closed driver also does this and it's documented as an
erratum.
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
| |
This is a little bit like the mprotect-based fencing I've experimented
with, but it's simple and low overhead. The downside is that it only
catches writes, not reads.
It didn't catch any bad writes on a current piglit run, but may be useful
in the future.
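One common way to get this trade-off is a guard word placed just past the end of a buffer and checked later. The sketch below assumes that kind of scheme, which may differ from what this commit actually does. The names toy_buf, toy_buf_init, toy_buf_intact and TOY_CANARY are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define TOY_CANARY 0xdeadbeefu

struct toy_buf {
    uint8_t *data;  /* size usable bytes, followed by the guard word */
    size_t size;
};

/* Allocate size bytes plus a guard word written just past the end. */
static bool toy_buf_init(struct toy_buf *b, size_t size)
{
    uint32_t canary = TOY_CANARY;

    assert(b != NULL);
    b->data = malloc(size + sizeof(canary));
    if (!b->data)
        return false;
    b->size = size;
    memcpy(b->data + size, &canary, sizeof(canary));
    return true;
}

/* True while the guard word is untouched. An out-of-bounds write that
 * lands on it is detected here; an out-of-bounds read never is. */
static bool toy_buf_intact(const struct toy_buf *b)
{
    uint32_t canary;

    memcpy(&canary, b->data + b->size, sizeof(canary));
    return canary == TOY_CANARY;
}
```

The check costs one compare per buffer, which is what keeps the overhead low compared with page-protection fencing. It also only reports the overflow after the fact, at whatever point the check runs.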
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This avoids a serious r600g bug leading to a GPU hang.
The chances this bug will get fixed are pretty low now.
I deeply regret listening to others and not pushing this patch, leaving
other users with a GPU-crashing driver. Yes, it should be fixed
in the compiler and it's ugly, but users couldn't care less about that.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86720
Cc: 11.0 10.6 <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This exposes more information to NIR's optimization, and should be
particularly useful when we do range-based optimization.
total uniforms in shared programs: 32066 -> 32065 (-0.00%)
uniforms in affected programs: 21 -> 20 (-4.76%)
total instructions in shared programs: 93104 -> 92630 (-0.51%)
instructions in affected programs: 31901 -> 31427 (-1.49%)
|
|
|
|
|
| |
This is just enough to cover our unpack modes, which will be used by some
new NIR-based lowering in the next commit.
|
|
|
|
|
|
| |
I'll let drivers figure out how to do it.
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Boyan Ding <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
Similar to 9ffc1049ca (freedreno/ir3: use nir two-sided-color lowering).
No piglit regression.
Signed-off-by: Boyan Ding <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
| |
|
|
|
|
| |
We validate per draw call, and need to free the shader per draw call, too.
|
|
|
|
|
|
|
|
| |
Required by ARB_sample_shading for drivers that don't want a shader variant
in st/mesa.
Reviewed-by: Ilia Mirkin <[email protected]>
Acked-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instructions that differ in the PM field can actually be paired up if
the one without PM doesn't do packing/unpacking and non-NOP
packing/unpacking operations from the PM instruction aren't added to the
other one without PM.
total instructions in shared programs: 48209 -> 47460 (-1.55%)
instructions in affected programs: 11688 -> 10939 (-6.41%)
Signed-off-by: Boyan Ding <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
| |
(originally part of previous patch, split out to separate patch by Rob)
v2: squash in some fixes from Eric
v3: Another fix from Eric for point coords.
Signed-off-by: Rob Clark <[email protected]>
|