| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
Since X has undefined contents in new pixmaps, it will allocate new
textures for an FBO and draw to them without an explicit clear. For
VC4, it's much faster to emit a clear than the load of the actual
undefined memory contents, so just do that instead.
|
|
|
|
|
|
|
| |
I'm not sure what the caller does is appropriate (just have a NULL sampler
at this slot), but it fixes the immediate crash.
Cc: "11.0" <[email protected]>
|
|
|
|
|
|
|
| |
I was afraid our callers weren't prepared for this, but it looks like
at least for resource creation, mesa/st throws an error appropriately.
Cc: "11.0" <[email protected]>
|
| |
|
| |
|
|
|
|
|
|
|
|
|
| |
This greatly increases the pressure you can put on the driver before
create fails. Ultimately we need to let the kernel take control of
our cached BOs and just take them from us (and other clients)
directly, but this is a very easy patch for the moment.
Cc: "11.0" <[email protected]>
|
|
|
|
|
| |
It's surprising to see "0kb" printed for debug on short shaders, while
4kb alignment won't be suprising.
|
|
|
|
| |
60MB of cached BOs are a lot less scary than 600MB.
|
|
|
|
|
| |
Improves low-settings openarena performance by 31.9975% +/- 0.659931%
(n=7).
|
|
|
|
|
|
| |
For ARB_copy_image.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
| |
total instructions in shared programs: 89251 -> 87862 (-1.56%)
instructions in affected programs: 52971 -> 51582 (-2.62%)
|
|
|
|
| |
Another step in reducing the special-casing of instructions.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This paves the way for copy propagating our unpacks. We end up with a
small change on shader-db:
total instructions in shared programs: 89390 -> 89251 (-0.16%)
instructions in affected programs: 19041 -> 18902 (-0.73%)
which appears to be because we no longer convert MOVs for an FMAX dst,
r4.unpack, r4.unpack (instead of the previous MOV dst, r4.unpack), and
this ends up with a slightly better schedule.
|
|
|
|
|
|
|
|
|
| |
At one point I thought packs and unpacks were in the same field of the
instruction. They aren't. These instructions therefore never cause a
pack.
total instructions in shared programs: 89472 -> 89390 (-0.09%)
instructions in affected programs: 15261 -> 15179 (-0.54%)
|
|
|
|
|
| |
I'm going to introduce some more types of MOV, which also want the elision
of raw MOVs.
|
|
|
|
| |
We can do 16a/16b from float as well. No difference on shader-db.
|
| |
|
|
|
|
| |
No problems being fixed, but needed for the new unpack changes.
|
|
|
|
| |
Not used yet, but will be.
|
|
|
|
|
| |
They're only f16-to-f32 on a float operation, otherwise they're
i16-to-i32.
|
|
|
|
|
| |
No known bugs, just something I noticed while updating optimization code
for other changes.
|
|
|
|
|
|
|
|
|
|
| |
One instruction instead of four, and it turns out you do this a lot for
the Over operator.
total uniforms in shared programs: 32168 -> 32087 (-0.25%)
uniforms in affected programs: 318 -> 237 (-25.47%)
total instructions in shared programs: 89830 -> 89472 (-0.40%)
instructions in affected programs: 6434 -> 6076 (-5.56%)
|
|
|
|
|
| |
I don't know what previous test was trying to do, but it dates back to the
first add of vc4_qpu_emit.c. No change to shader-db.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We can't do this all the time, because you want blending to be done in
linear space, and sRGB would lose too much precision being done in 4x8.
The win on instructions is pretty huge when you can, though.
total uniforms in shared programs: 32065 -> 32168 (0.32%)
uniforms in affected programs: 327 -> 430 (31.50%)
total instructions in shared programs: 92644 -> 89830 (-3.04%)
instructions in affected programs: 15580 -> 12766 (-18.06%)
Improves openarena performance at 1920x1080 from 10.7fps to 11.2fps.
|
| |
|
|
|
|
|
|
|
| |
This can happen when we're doing destination packing -- we don't know
what's in the rest of the register.
Signed-off-by: Eric Anholt <[email protected]>
|
| |
|
|
|
|
|
|
| |
I haven't proven that this happens (I've got other GPU hangs in the
way), but the closed driver also does this and it's documented as an
errata.
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
| |
This is a little bit like the mprotect-based fencing I've experimented
with, but it's simple and low overhead. The downside is that only catches
writes, not reads.
It didn't catch any bad writes on a current piglit run, but may be useful
in the future.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This avoids a serious r600g bug leading to a GPU hang.
The chances this bug will get fixed are pretty low now.
I deeply regret listening to others and not pushing this patch, leaving
other users with a GPU-crashing driver. Yes, it should be fixed
in the compiler and it's ugly, but users couldn't care less about that.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86720
Cc: 11.0 10.6 <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This exposes more information to NIR's optimization, and should be
particularly useful when we do range-based optimization.
total uniforms in shared programs: 32066 -> 32065 (-0.00%)
uniforms in affected programs: 21 -> 20 (-4.76%)
total instructions in shared programs: 93104 -> 92630 (-0.51%)
instructions in affected programs: 31901 -> 31427 (-1.49%)
|
|
|
|
|
| |
This is just enough to cover our unpack modes, which will be used by some
new NIR-based lowering in the next commit.
|
|
|
|
|
|
| |
I'll let drivers figure out how to do it.
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Boyan Ding <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
Similar to 9ffc1049ca (freedreno/ir3: use nir two-sided-color lowering).
No piglit regression.
Signed-off-by: Boyan Ding <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
| |
|
|
|
|
| |
We validate per draw call, and need to free the shader per draw call, too.
|
|
|
|
|
|
|
|
| |
Required by ARB_sample_shading for drivers that don't want a shader variant
in st/mesa.
Reviewed-by: Ilia Mirkin <[email protected]>
Acked-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instructions with difference in PM field can actually be paired up if
the one without PM doesn't do packing/unpacking and non-NOP
packing/unpacking operations from PM instruction aren't added to the
other without PM.
total instructions in shared programs: 48209 -> 47460 (-1.55%)
instructions in affected programs: 11688 -> 10939 (-6.41%)
Signed-off-by: Boyan Ding <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
| |
(originally part of previous patch, split out to separate patch by Rob)
v2: squash in some fixes from Eric
v3: Another fix from Eric for point coords.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This avoids exceeding the size of the .index bitfield since it got
truncated, and should make our NIR look more like the NIR that the rest of
the NIR developers are working on.
v2: split out vc4 updates, first patch uses varying_slot_to_tgsi_semantic()
helper, and second patch does the actual conversion.
v3: add frag_result_to_tgsi_semantic() helper and don't try to map
frag_results to semantic name/index as if they were varying_slot's
v4: use VERT_ATTRIB_ for VS inputs
v5: Fix vc4 build.
Signed-off-by: Rob Clark <[email protected]>
|
| |
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Glenn Kennard <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This avoids generation of undefined packing in qir and qpu instructions,
fixing a lot of rendering errors.
Fixes 8b36d107fdd (vc4: Pack the unorm-packing bits into a src MUL
instruction when possible.)
Cc: [email protected]
Signed-off-by: Boyan Ding <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|