| Commit message | Author | Age | Files | Lines |
|
|
|
|
|
| |
Originally there was CL code for handling various relocations back when I
had relocs for the TSDA/TA buffers. Now that the kernel handles those
entirely on its own, I can inline that code into the one place using it.
|
|
|
|
|
|
|
|
|
|
|
| |
Mesa's DEBUG and assert's NDEBUG are not tied to each other, so we need
to explicitly compile this code out.
Fixes: 3df78928786134874eafa "vc4: Drop reloc_count tracking for debug
asserts on non-debug builds."
Cc: Eric Anholt
Signed-off-by: Eric Engestrom
Reviewed-by: Eric Anholt
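
The DEBUG/NDEBUG mismatch described above can be illustrated with a minimal sketch. The struct and function names here (`dbg_cl`, `dbg_cl_start_reloc`, `dbg_cl_reloc`) are hypothetical and are not the driver's actual code. The point is only that an assert() touching a DEBUG-only field has to sit under the same DEBUG guard as the field, because assert() stays active unless NDEBUG is defined.

```c
#include <assert.h>

/* Hypothetical command-list struct. The bookkeeping field exists only
 * when DEBUG is defined, independently of whether NDEBUG is set. */
struct dbg_cl {
    unsigned size;
#ifdef DEBUG
    unsigned reloc_count;
#endif
};

/* Note how many relocations the next packet is expected to emit. */
static inline void
dbg_cl_start_reloc(struct dbg_cl *cl, unsigned n)
{
#ifdef DEBUG
    /* Guarded by DEBUG, not NDEBUG: the field is absent otherwise. */
    assert(cl->reloc_count == 0);
    cl->reloc_count = n;
#else
    (void)cl;
    (void)n;
#endif
}

/* Emit one 4-byte relocation slot (only the size is modelled here). */
static inline void
dbg_cl_reloc(struct dbg_cl *cl)
{
#ifdef DEBUG
    assert(cl->reloc_count > 0);
    cl->reloc_count--;
#endif
    cl->size += 4;
}
```

Without the `#ifdef DEBUG` around the asserts, a build that defines neither DEBUG nor NDEBUG would fail to compile, since assert() would still reference the missing field.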
|
|
|
|
|
|
| |
For now this is a no-op on the output, but it makes it clear that we've
had weird things going on with things like
V3D21_CLIPPER_Z_SCALE_AND_OFFSET.
|
|
|
|
|
|
|
|
|
|
| |
This gets our vc4_emit.c size back down a bit:
before:
1020 0 0 1020 3fc src/gallium/drivers/vc4/.libs/vc4_emit.o
after:
968 0 0 968 3c8 src/gallium/drivers/vc4/.libs/vc4_emit.o
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Take the CL pointer in, which will be useful for enabling relocs.
However, our code expands a bit more:
before:
4449 0 0 4449 1161 src/gallium/drivers/vc4/.libs/vc4_draw.o
988 0 0 988 3dc src/gallium/drivers/vc4/.libs/vc4_emit.o
after:
4481 0 0 4481 1181 src/gallium/drivers/vc4/.libs/vc4_draw.o
1020 0 0 1020 3fc src/gallium/drivers/vc4/.libs/vc4_emit.o
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This slightly inflates the size of the generated code, in exchange for
getting us some convenient tools.
before:
4389 0 0 4389 1125 src/gallium/drivers/vc4/.libs/vc4_draw.o
808 0 0 808 328 src/gallium/drivers/vc4/.libs/vc4_emit.o
after:
4449 0 0 4449 1161 src/gallium/drivers/vc4/.libs/vc4_draw.o
988 0 0 988 3dc src/gallium/drivers/vc4/.libs/vc4_emit.o
|
|
|
|
|
| |
This is a preparation step for having multiple jobs being queued up at the
same time.
|
|
|
|
| |
Cuts another 88 bytes of compiled code.
|
|
|
|
|
| |
Drops 680 bytes of code, from avoiding a bunch of extra updates to the
next pointer in the struct.
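
As a rough illustration of where such a saving can come from, the sketch below keeps the write position in a local cursor while a packet is built and stores it back into the struct once at the end. All names (`cur_cl`, `cur_cl_start`, `cur_cl_end`, `cur_emit_packet`) are hypothetical; this is an assumption about the general pattern, not a copy of the driver's helpers.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical command list: base of the buffer and next write position. */
struct cur_cl {
    uint8_t *base;
    uint8_t *next;
};

/* Fetch the write position once into a local the compiler can keep
 * in a register. */
static inline uint8_t *
cur_cl_start(struct cur_cl *cl)
{
    return cl->next;
}

/* Store the final write position back, once per packet. */
static inline void
cur_cl_end(struct cur_cl *cl, uint8_t *cursor)
{
    cl->next = cursor;
}

static inline void
cur_cl_u8(uint8_t **cursor, uint8_t v)
{
    **cursor = v;
    *cursor += 1;
}

static inline void
cur_cl_u32(uint8_t **cursor, uint32_t v)
{
    /* memcpy avoids assuming the cursor is 4-byte aligned. */
    memcpy(*cursor, &v, sizeof(v));
    *cursor += sizeof(v);
}

/* One struct load and one struct store, regardless of field count. */
static void
cur_emit_packet(struct cur_cl *cl, uint8_t opcode, uint32_t arg)
{
    uint8_t *cursor = cur_cl_start(cl);

    cur_cl_u8(&cursor, opcode);
    cur_cl_u32(&cursor, arg);
    cur_cl_end(cl, cursor);
}
```

The caller is assumed to have reserved enough space in the buffer before calling `cur_emit_packet()`.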
|
|
|
|
|
|
| |
I needed to rewrite this a bit for safety checking in the next commit.
Despite being a static inline of the same thing that was being done, we
lose 36 bytes of code for some reason.
|
|
|
|
|
|
| |
Now that RCL generation is in the kernel, we don't have any other
callers. Oddly, the compiler generates another 8 bytes of code for
this, but the simplification is worth it.
|
|
|
|
|
|
|
| |
Now that we don't resize the CL as we build (it's set up at the top by
vc4_start_draw()), we can store the pointers instead of offsets from
the base. Saves a bit of math in emitting relocs (about 60 bytes of
code).
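
A minimal sketch of the pointer-based layout, under the assumption stated in the message that the buffer is never reallocated while packets are being written. The names (`ptr_cl`, `ptr_cl_init`, `ptr_cl_u8`, `ptr_cl_offset`) are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical command list that stores a direct write pointer.
 * This is only safe while `base` cannot move, i.e. the buffer is
 * sized up front and never resized during emission. */
struct ptr_cl {
    uint8_t *base;  /* start of the buffer */
    uint8_t *next;  /* next byte to write */
    size_t size;    /* capacity in bytes */
};

static inline void
ptr_cl_init(struct ptr_cl *cl, uint8_t *storage, size_t size)
{
    cl->base = storage;
    cl->next = storage;
    cl->size = size;
}

/* Writing needs no base + offset addition. */
static inline void
ptr_cl_u8(struct ptr_cl *cl, uint8_t v)
{
    *cl->next++ = v;
}

/* The offset is still recoverable when something needs it. */
static inline size_t
ptr_cl_offset(const struct ptr_cl *cl)
{
    return (size_t)(cl->next - cl->base);
}
```

With an offset-based design, every write has to compute `base + offset` first; here that addition is only paid when the offset itself is requested.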
|
|
|
|
| |
I want to notice discrepancies when I diff -u between Mesa and the kernel.
|
|
|
|
|
|
|
| |
As of 229bf4475ff0a5dbeb9bc95250f7a40a983c2e28 we started getting SIGBUS
from unaligned accesses on the hardware, for reasons I haven't figured
out. However, we should be avoiding unaligned accesses anyway, and our CL
setup certainly would have produced them.
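
One portable way to avoid unaligned accesses when packing 32-bit values into a byte stream is to go through memcpy(), as sketched below. The helper names are hypothetical, and this is not necessarily the approach the commit itself took.

```c
#include <stdint.h>
#include <string.h>

/* Store a 32-bit value at an address that may not be 4-byte aligned.
 * memcpy() makes no alignment assumption, so the compiler emits
 * whatever access sequence is legal for the target. */
static inline void
example_put_unaligned_32(void *dst, uint32_t v)
{
    memcpy(dst, &v, sizeof(v));
}

/* Load a 32-bit value from a possibly unaligned address. */
static inline uint32_t
example_get_unaligned_32(const void *src)
{
    uint32_t v;

    memcpy(&v, src, sizeof(v));
    return v;
}
```

By contrast, casting the byte pointer to `uint32_t *` and dereferencing it is undefined behaviour in C when the address is misaligned, and on some hardware it traps.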
|
|
|
|
|
| |
They should all be set to real values by the time they're read, and
ideally if you used valgrind you'd see uninitialized value uses.
|
|
|
|
| |
It doesn't matter, since it just got truncated to 16 inside, anyway.
|
|
|
|
|
|
|
|
| |
The optimizer obviously doesn't have the ability to rewrite these to skip
the size checks per call, so we have to do it manually.
Improves a norast benchmark on simulation by 0.779706% +/- 0.405838%
(n=6087).
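
The manual rewrite described above can be sketched as follows: reserve space once for a whole packet, then write its fields with helpers that perform no check. Everything here (`grow_cl`, `grow_cl_ensure_space`, `grow_cl_u8_unchecked`, `grow_cl_emit_three`) is hypothetical and only illustrates the pattern.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical growable command list addressed by offset. */
struct grow_cl {
    uint8_t *base;
    size_t used;
    size_t capacity;
};

/* Make sure `bytes` more bytes fit. Returns 0 on success, -1 on OOM. */
static int
grow_cl_ensure_space(struct grow_cl *cl, size_t bytes)
{
    if (cl->used + bytes <= cl->capacity)
        return 0;

    size_t new_cap = cl->capacity ? cl->capacity : 64;
    while (new_cap < cl->used + bytes)
        new_cap *= 2;

    uint8_t *p = realloc(cl->base, new_cap);
    if (!p)
        return -1;

    cl->base = p;
    cl->capacity = new_cap;
    return 0;
}

/* No size check: the caller must have reserved space already. */
static inline void
grow_cl_u8_unchecked(struct grow_cl *cl, uint8_t v)
{
    cl->base[cl->used++] = v;
}

/* One check for the whole packet instead of one per field. */
static int
grow_cl_emit_three(struct grow_cl *cl, uint8_t a, uint8_t b, uint8_t c)
{
    if (grow_cl_ensure_space(cl, 3) != 0)
        return -1;

    grow_cl_u8_unchecked(cl, a);
    grow_cl_u8_unchecked(cl, b);
    grow_cl_u8_unchecked(cl, c);
    return 0;
}
```

The trade-off is that every caller has to state the packet size correctly up front; an undersized reservation turns into a buffer overflow rather than a failed check.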
|
|
|
|
|
| |
Improves norast performance of a microbenchmark by 11.1865% +/- 2.37673%
(n=20).
|
|
|
|
| |
This caught the previous commit's bug in the kernel validator.
|
| |
|
|
|
|
|
| |
It's not a real VC4 hardware packet, but I've put in a comment to explain
it.
|
|
|
|
|
|
|
|
| |
This ensures that when I'm using the simulator, I get a closer match to
what behavior on real hardware will be. It lets me rapidly iterate on the
kernel validation code (which otherwise has a several-minute turnaround
time), and helps catch buffer overflow bugs in the userspace driver
faster.
|
|
This mostly just takes every draw call and turns it into a sequence of
commands that clear the FBO and draw a single shaded triangle to it,
regardless of the actual input vertices or shaders. I copied the initial
driver skeleton mostly from freedreno, and I've preserved Rob Clark's
copyright for those. I also based my initial hardcoded shaders and
command lists on Scott Mansell (phire)'s "hackdriver" project, though the
bit patterns of the shaders emitted end up being different.
v2: Rebase on gallium megadrivers changes.
v3: Rebase on PIPE_SHADER_CAP_MAX_CONSTS change.
v4: Rely on simpenrose actually being installed when building for
simulation.
v5: Add more header duplicate-include guards.
v6: Apply Emil's review (protection against vc4 sim and ilo at the same
time, and dropping the dricommon drm bits) and fix a copyright header
(thanks, Roland).
|