Commit messages
Cuts another 88 bytes of compiled code.
Drops 680 bytes of code, from avoiding a bunch of extra updates to the
next pointer in the struct.
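A minimal sketch of the general idea, with illustrative names rather than the
actual vc4 structures: instead of bumping the struct's next pointer inside
every emit helper, the helpers take and return a local cursor, and the pointer
is written back once per packet.

    #include <stdint.h>
    #include <string.h>

    struct cl {
        uint8_t *base;
        uint8_t *next;
    };

    /* Emit through a local cursor; the caller stores cl->next back once. */
    static inline uint8_t *
    put_u8(uint8_t *cursor, uint8_t v)
    {
        *cursor = v;
        return cursor + 1;
    }

    static inline uint8_t *
    put_u32(uint8_t *cursor, uint32_t v)
    {
        memcpy(cursor, &v, sizeof(v));
        return cursor + sizeof(v);
    }

    static void
    emit_example_packet(struct cl *cl)
    {
        uint8_t *cursor = cl->next;       /* load the pointer once */
        cursor = put_u8(cursor, 0x42);
        cursor = put_u32(cursor, 0xdeadbeef);
        cl->next = cursor;                /* one store instead of one per put */
    }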
I needed to rewrite this a bit for safety checking in the next commit.
Even though it's a static inline of the same thing that was being done,
we lose 36 bytes of code for some reason.
Now that RCL generation is in the kernel, we don't have any other
callers. Oddly, the compiler generates another 8 bytes of code for
this, but the simplification is worth it.
Now that we don't resize the CL as we build (it's set up at the top by
vc4_start_draw()), we can store the pointers instead of offsets from
the base. Saves a bit of math in emitting relocs (about 60 bytes of
code).
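A rough illustration of the trade-off, using hypothetical types rather than
the driver's real reloc code: if the buffer can be reallocated while the CL is
built, recorded positions have to be offsets that get added back to the base
at emit time; once the buffer is sized up front and never moves, the position
can be stored as a pointer and used directly.

    #include <stdint.h>
    #include <string.h>

    /* Offset form: required if the CL buffer may be realloc'd mid-build,
     * since a stored pointer could be invalidated by the move. */
    struct reloc_by_offset {
        uint32_t cl_offset;           /* where to patch, relative to base */
    };

    /* Pointer form: valid when the CL is set up once and never resized. */
    struct reloc_by_pointer {
        uint8_t *patch_addr;          /* where to patch, directly */
    };

    static void
    emit_reloc_offset(uint8_t *cl_base, const struct reloc_by_offset *r,
                      uint32_t bo_address)
    {
        memcpy(cl_base + r->cl_offset, &bo_address, sizeof(bo_address));
    }

    static void
    emit_reloc_pointer(const struct reloc_by_pointer *r, uint32_t bo_address)
    {
        /* no base + offset math at emit time */
        memcpy(r->patch_addr, &bo_address, sizeof(bo_address));
    }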
I want to notice discrepancies when I diff -u between Mesa and the kernel.
As of 229bf4475ff0a5dbeb9bc95250f7a40a983c2e28 we started getting SIGBUS
from unaligned accesses on the hardware, for reasons I haven't figured
out. However, we should be avoiding unaligned accesses anyway, and our CL
setup certainly would have produced them.
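For context, a generic sketch (not the specific vc4 fix) of how such faults
usually arise and get avoided: storing through a cast pointer at an arbitrary
byte position is an unaligned access, which some hardware reports as SIGBUS,
while copying through memcpy, or padding the stream so multi-byte fields land
on their natural alignment, stays safe.

    #include <stdint.h>
    #include <string.h>

    /* May SIGBUS on hardware that disallows unaligned 32-bit stores
     * whenever 'p' is not 4-byte aligned. */
    static void
    store_u32_unaligned(uint8_t *p, uint32_t v)
    {
        *(uint32_t *)p = v;
    }

    /* Well-defined for any alignment; compilers lower it to a plain
     * store on targets that allow unaligned access. */
    static void
    store_u32_safe(uint8_t *p, uint32_t v)
    {
        memcpy(p, &v, sizeof(v));
    }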
They should all be set to real values by the time they're read, and
ideally if you used valgrind you'd see uninitialized value uses.
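A tiny illustration of the reasoning, with a hypothetical struct rather than
the driver's actual state: zeroing the whole struct would silently hide a
missing assignment, while leaving the fields unwritten lets valgrind's
memcheck flag the first use.

    #include <stdint.h>

    struct draw_params {
        uint32_t width;
        uint32_t height;              /* must be written before use */
    };

    int
    main(void)
    {
        struct draw_params p;
        p.width = 640;
        /* p.height is never written: under valgrind, the branch below is
         * reported as depending on an uninitialised value.  A memset to 0
         * would instead turn the bug into a silent zero area. */
        return (p.width * p.height > 0) ? 0 : 1;
    }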
It doesn't matter, since it just got truncated to 16 inside, anyway.
The optimizer obviously doesn't have the ability to rewrite these to skip
the size checks per call, so we have to do it manually.

Improves a norast benchmark on simulation by 0.779706% +/- 0.405838%
(n=6087).
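A sketch of the manual transformation being described, with illustrative
helper names rather than the driver's real API: instead of a capacity check
inside every emit call, reserve the packet's worst-case size once and then
use unchecked emits.

    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    struct cl {
        uint8_t *next;
        uint8_t *end;
    };

    /* Checked path: every call re-tests the remaining space. */
    static void
    cl_put_u32_checked(struct cl *cl, uint32_t v)
    {
        assert(cl->next + sizeof(v) <= cl->end);   /* stand-in for grow/flush */
        memcpy(cl->next, &v, sizeof(v));
        cl->next += sizeof(v);
    }

    /* Hoisted path: one up-front reservation, then raw stores. */
    static void
    cl_ensure_space(struct cl *cl, size_t bytes)
    {
        assert(cl->next + bytes <= cl->end);       /* stand-in for grow/flush */
    }

    static void
    cl_put_u32_unchecked(struct cl *cl, uint32_t v)
    {
        memcpy(cl->next, &v, sizeof(v));
        cl->next += sizeof(v);
    }

    static void
    emit_three_words(struct cl *cl, uint32_t a, uint32_t b, uint32_t c)
    {
        cl_ensure_space(cl, 3 * sizeof(uint32_t)); /* one check instead of three */
        cl_put_u32_unchecked(cl, a);
        cl_put_u32_unchecked(cl, b);
        cl_put_u32_unchecked(cl, c);
    }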
Improves norast performance of a microbenchmark by 11.1865% +/- 2.37673%
(n=20).
This caught the previous commit's bug in the kernel validator.
It's not a real VC4 hardware packet, but I've put in a comment to explain
it.
This ensures that when I'm using the simulator, I get a closer match to
what the behavior on real hardware will be. It lets me rapidly iterate on
the kernel validation code (which otherwise has a several-minute
turnaround time), and helps catch buffer overflow bugs in the userspace
driver faster.
|
|
This mostly just takes every draw call and turns it into a sequence of
commands that clear the FBO and draw a single shaded triangle to it,
regardless of the actual input vertices or shaders. I copied the initial
driver skeleton mostly from freedreno, and I've preserved Rob Clark's
copyright for those files. I also based my initial hardcoded shaders and
command lists on Scott Mansell (phire)'s "hackdriver" project, though the
bit patterns of the shaders emitted end up being different.

v2: Rebase on gallium megadrivers changes.
v3: Rebase on PIPE_SHADER_CAP_MAX_CONSTS change.
v4: Rely on simpenrose actually being installed when building for
    simulation.
v5: Add more header duplicate-include guards.
v6: Apply Emil's review (protection against vc4 sim and ilo at the same
    time, and dropping the dricommon drm bits) and fix a copyright header
    (thanks, Roland).