| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
Fixes assertion failures in 14 piglit tests (half of which now pass).
|
|
|
|
|
|
|
|
| |
The other driver does this manually before calling into each tile, but we
can just let it get binned into the tiles (saving repeated kernel
validation on the packet).
Fixes simulator assertion failures on polygon-mode and non-auto texwrap.
|
| |
|
|
|
|
|
| |
You have to load at least 1, according to the simulator. Fixes 4 piglit
tests and even more ES2 conformance tests.
|
|
|
|
|
| |
Improves simulated norast performance on a little benchmark by 38.0965%
+/- 3.27534% (n=11).
|
|
|
|
|
| |
I was trying to skip state updates when !dirty, and suspiciously
everything was always dirty.
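
A minimal sketch of the dirty-flag pattern this refers to. The message does not say what the actual bug was; one common cause of "always dirty" is never clearing the flags after a draw, and the code below assumes that case. All names here (`draw_context`, `DIRTY_VIEWPORT`, `emit_state`, and so on) are hypothetical and are not the driver's real identifiers.

```c
#include <stdint.h>

/* Hypothetical dirty bits, one per piece of state. */
enum {
    DIRTY_BLEND    = 1u << 0,
    DIRTY_VIEWPORT = 1u << 1,
};

/* Hypothetical context: tracks which state changed since the last draw. */
struct draw_context {
    uint32_t dirty;
    unsigned viewport_emits; /* counts how often viewport state was emitted */
    unsigned blend_emits;    /* counts how often blend state was emitted */
};

static void set_viewport(struct draw_context *ctx)
{
    ctx->dirty |= DIRTY_VIEWPORT;
}

static void set_blend(struct draw_context *ctx)
{
    ctx->dirty |= DIRTY_BLEND;
}

/* Emit only the state whose dirty bit is set. */
static void emit_state(struct draw_context *ctx)
{
    if (ctx->dirty & DIRTY_VIEWPORT)
        ctx->viewport_emits++;
    if (ctx->dirty & DIRTY_BLEND)
        ctx->blend_emits++;
}

static void draw(struct draw_context *ctx)
{
    emit_state(ctx);
    /* Without this reset every later draw would see everything as dirty. */
    ctx->dirty = 0;
}
```

With the reset in place, a second draw with no intervening state change emits nothing.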
|
|
|
|
|
|
|
|
|
|
|
| |
Merging VS and CS into the same struct wasn't winning us anything except
for not allocating a separate BO (but if we want to pack programs into
BOs, we should pack not just those 2 programs together). What it was
getting us was a bunch of code duplication about hash table lookups and
propagating vc4_compile contents into a vc4_compiled_shader.
I was about to make the situation worse with indirect uniform buffer
access.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
max_index was coming from either the user telling us as part of
glDrawRangeElements, or from an incidental calculation as part of some
sort of primitive conversion fallback. Sometimes, it was just set to the
default "I don't know" ~0 value.
If it wasn't set to the actual max index, then the kernel would reject the
draw call for allowing out-of-bounds VBO reads. So, compute the max index
from the sizes of the VBOs, which isn't too expensive (unlike mapping and
reading the index buffer) and is reliable.
Fixes piglit vao-element-array-buffer.
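
The idea of bounding the index by the VBO sizes can be sketched roughly as follows. This is an illustration under stated assumptions, not the driver's code: `struct vbo_binding`, its fields, and `compute_max_index` are hypothetical names, and the real driver may treat zero strides or undersized buffers differently.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one vertex attribute's backing buffer. */
struct vbo_binding {
    uint32_t buffer_size; /* size of the VBO in bytes */
    uint32_t offset;      /* byte offset of element 0 within the VBO */
    uint32_t stride;      /* bytes between elements; 0 means every index reads element 0 */
    uint32_t attr_size;   /* bytes read per element */
};

/*
 * Compute the largest index that keeps every attribute read inside its VBO.
 * Returns false if some buffer cannot hold even a single element.
 * If no buffer constrains the index, *max_index stays at UINT32_MAX.
 */
static bool compute_max_index(const struct vbo_binding *vbs, size_t count,
                              uint32_t *max_index)
{
    uint32_t result = UINT32_MAX; /* the "I don't know" value */

    for (size_t i = 0; i < count; i++) {
        const struct vbo_binding *vb = &vbs[i];
        /* 64-bit sum so that offset + attr_size cannot wrap around. */
        uint64_t first_end = (uint64_t)vb->offset + vb->attr_size;

        if (first_end > vb->buffer_size)
            return false; /* element 0 already reads out of bounds */

        if (vb->stride == 0)
            continue; /* the index does not move the read address */

        uint32_t vb_max = (uint32_t)((vb->buffer_size - first_end) / vb->stride);
        if (vb_max < result)
            result = vb_max;
    }

    *max_index = result;
    return true;
}
```

This only looks at buffer sizes and attribute layout, so it never has to map or read the index buffer.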
|
|
|
|
|
| |
The flush only happens after both are written, so we can do them in either
order. This will let me compute max_index during the shader record setup.
|
|
|
|
| |
This is the support for both the global and per-vertex modes.
|
|
|
|
|
|
|
| |
While depth test state is passed through the fragment shader as sideband
data, the stencil test state has to be set by the fragment shader itself.
Many tests are still failing, but this gets most of hiz/ passing.
|
|
|
|
|
|
| |
The rest of stencil handling isn't done yet, but it documents an extra
cl_u8(0) and helps make it obvious why we don't need to format clear_depth
the same way the depth/stencil buffer is formatted.
|
|
|
|
|
| |
After implementing depth stores, it looks like this is the way things
actually are, according to hiz-depth-read-fbo-d24-s0's probes.
|
|
|
|
|
| |
It was useful on i965, but it's even more useful for debugging tiled
renderers.
|
|
|
|
|
| |
There are a few tools I want to have always available, and fprintf() and
abort() are among them.
|
|
|
|
|
|
|
|
|
|
| |
We were triggering simulator assertion failures for not consuming these,
and presumably we want to actually make use of them some day (for things
like point/line antialiasing).
Note that this has the qreg index as 0, which is the same index as the
first GL varyings read. This doesn't matter currently, since that number
isn't used for anything except dumping.
|
|
|
|
|
|
|
| |
This prevents some simulator assertion failures, but it does mean (since
I've dropped the "* 16" padding) that on real hardware you need a kernel
that does overflow memory management (currently, "drm/vc4: Add support for
binner overflow memory allocation." in my kernel tree).
|
|
|
|
| |
These #defines are 0, but they should help the math above make more sense.
|
|
|
|
|
|
| |
It's not relevant to our command streams any more.
v2: Fix indentation and a typo in the comment.
|
|
|
|
|
|
|
|
|
| |
This doesn't load/store the Z contents across submits yet. It also
disables early Z, since it's going to require tracking of Z functions
across multiple state updates to track the early Z direction and whether
it can be used.
v2: Move the key setup to before the search for the key.
|
|
|
|
| |
Now we actually get multiple draw calls per submit.
|
|
|
|
|
|
| |
This is a step toward queueing more than one draw per frame.
Fixes piglit attribute0 test, since we get a working clear color now.
|
|
|
|
|
| |
We only want to set up render target config and clear colors once per
frame.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This required building a shader parser that would walk the program to find
where the texturing-related uniforms are in the uniforms stream.
Note that as of this commit, a new kernel is required for rendering on
actual VC4 hardware (currently that commit is named "drm/vc4: Introduce
shader validation and better command stream validation.", but is likely to
be squashed as part of an eventual merge of the kernel driver).
|
|
|
|
|
|
|
|
| |
This ensures that when I'm using the simulator, I get a closer match to
what behavior on real hardware will be. It lets me rapidly iterate on the
kernel validation code (which otherwise has a several-minute turnaround
time), and helps catch buffer overflow bugs in the userspace driver
faster.
|
|
|
|
|
|
|
|
| |
Only rgba8888 works, and only a single texture unit, and it's only under
simulation because I haven't built the kernel interface yet.
v2: Rebase on helpers.
v3: Fold in the don't-break-the-arm-build fix.
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some tests start working (useprogram-flushverts, for example) due to
getting the right vertices now. Some that used to pass start failing with
memory overflow during binning, which is weird (glsl-fs-texture2drect).
And a couple stop rendering correctly (glsl-fs-bug25902).
v2: Move the attribute format setup in the key from after search time to
before the search.
v3: Fix reading of attributes other than position (I forgot to respect
attr and stored everything in inputs 0-3, i.e. position).
|
|
|
|
|
| |
This avoids a simulator assertion failure with glamor. I need to actually
support resize, though.
|
| |
|
|
|
|
|
|
| |
It doesn't do all the interpolation yet, but more tests can run now.
v2: Rebase on helpers.
|
| |
|
|
|
|
|
|
|
| |
We will want to occasionally disable this again when we do clear support.
v2: Squash with the previous commit (I accidentally committed at two
stages of writing the change)
|
|
|
|
|
| |
This was a problem for the simulator since we don't free memory back to
it, and it would soon just run out.
|
|
|
|
|
| |
This is hardcoded to read it as RGBA32F so far, but starts to get more
tests working.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This introduces an IR (QIR, for QPU IR) to do optimization on. It's a
scalar, SSA IR in general. It looks like optimization is pretty easy this
way, though I haven't figured out if it's going to be good for our weird
register allocation or not (or if I want to reduce to basically QPU
instructions first), and I've got some problems with it having some
multi-QPU-instruction opcodes (SEQ and CMP, for example) which I probably
want to break down.
Of course, this commit mostly doesn't work, since many other things are
still hardwired, like the VBO data.
v2: Rewrite to use a bunch of helpers (qir_OPCODE) for emitting QIR
instructions into temporary values, and make qir_inst4 take the 4 args
separately instead of an array (all later callers wanted individual
args).
|
|
|
|
|
|
|
|
| |
Note: This is the cutoff point where I switched from developing primarily
on the Pi to developing on the simulator. As a result, from this point on
the code is untested on the Pi (the kernel code I have currently wasn't
rendering anything at this commit, though the simulator renders
successfully, suggesting kernel bugs).
|
|
This mostly just takes every draw call and turns it into a sequence of
commands that clear the FBO and draw a single shaded triangle to it,
regardless of the actual input vertices or shaders. I copied the initial
driver skeleton mostly from freedreno, and I've preserved Rob Clark's
copyright for those. I also based my initial hardcoded shaders and
command lists on Scott Mansell (phire)'s "hackdriver" project, though the
bit patterns of the shaders emitted end up being different.
v2: Rebase on gallium megadrivers changes.
v3: Rebase on PIPE_SHADER_CAP_MAX_CONSTS change.
v4: Rely on simpenrose actually being installed when building for
simulation.
v5: Add more header duplicate-include guards.
v6: Apply Emil's review (protection against vc4 sim and ilo at the same
time, and dropping the dricommon drm bits) and fix a copyright header
(thanks, Roland)
|