| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
| |
This allows creating compute-only and debug contexts.
Reviewed-by: Brian Paul <[email protected]>
Acked-by: Christian König <[email protected]>
Acked-by: Alex Deucher <[email protected]>
|
|
|
|
|
| |
Drops 680 bytes of code, from avoiding a bunch of extra updates to the
next pointer in the struct.
|
|
|
|
|
|
| |
I needed to rewrite this a bit for safety checking in the next commit.
Despite being a static inline of the same thing that was being done, we
lose 36 bytes of code for some reason.
|
|
|
|
|
|
|
|
|
| |
Some, but not all, state trackers will explicitly unref (and set to
NULL) the previous *fence before calling pipe->flush(). So the driver
should use fence_ref(), which will unref the old fence if it is not NULL.
Signed-off-by: Rob Clark <[email protected]>
Acked-by: Eric Anholt <[email protected]>
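A minimal sketch of the pattern described in this message, assuming a plain integer refcount. The struct and every sketch_* name are hypothetical and are not the driver's or gallium's actual code; the point is only that a ref helper which drops the old pointer makes flush() safe whether or not the caller cleared *fence first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical refcounted fence, for illustration only. */
struct sketch_fence {
    int refcount;
};

/* Counts frees so the behaviour can be observed. */
static int sketch_fences_freed = 0;

static struct sketch_fence *sketch_fence_create(void)
{
    struct sketch_fence *f = calloc(1, sizeof(*f));
    assert(f != NULL);
    f->refcount = 1;
    return f;
}

/* Point *dst at src: take a reference on src (if any) and drop the
 * reference that *dst previously held (if any). */
static void sketch_fence_ref(struct sketch_fence **dst,
                             struct sketch_fence *src)
{
    struct sketch_fence *old = *dst;

    if (src)
        src->refcount++;
    *dst = src;
    if (old && --old->refcount == 0) {
        free(old);
        sketch_fences_freed++;
    }
}

/* Flush that hands a new fence to the caller. Because it goes through
 * sketch_fence_ref(), a stale fence left in *fence_out is unreferenced
 * rather than leaked. */
static void sketch_flush(struct sketch_fence **fence_out)
{
    struct sketch_fence *new_fence = sketch_fence_create();

    if (fence_out)
        sketch_fence_ref(fence_out, new_fence);
    /* Drop the creation reference; the caller's pointer keeps it alive. */
    sketch_fence_ref(&new_fence, NULL);
}
```

Calling sketch_flush() twice with the same pointer, without clearing it in between, frees the first fence instead of overwriting and leaking it.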
|
|
|
|
|
|
|
| |
This avoids a security issue where userspace could have written the tile
state/tile alloc behind the GPU's back, and will apparently be necessary
for fixing stability bugs (tile state buffers are missing some top bits
for the tile alloc's address).
|
|
|
|
|
| |
There weren't that many variations of RCL generation, and this lets us
skip all the in-kernel validation for what we generated.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The idea I had when I wrote the original shadow code was that you'd see a
set_index_buffer to the IB, then a bunch of draws out of it. What's
actually happening in openarena is that set_index_buffer occurs at every
draw, so we end up making a new shadow BO every time, and converting more
of the BO than is actually used in the draw.
While I could maybe come up with a better caching scheme, for now just
do the simple thing that doesn't result in a new shadow IB allocation
per draw.
Improves performance of isosurf in drawelements mode by 58.7967% +/-
3.86152% (n=8).
|
|
|
|
|
|
| |
I want to be able to have multiple jobs being set up at the same time (for
example, a render job to do a little fixup blit in the course of doing a
render to the main FBO).
|
| |
|
|
|
|
|
|
| |
We're over-allocating our BCL in vc4_draw.c, so this never mattered.
However, new RCL-only blit support might end up here without having set up
any BCL contents.
|
|
|
|
| |
This wouldn't have mattered except in the worst case scenario RCL setup.
|
|
|
|
|
|
| |
New BO create and mmap ioctls are added. The submit ABI gains a flags
argument, and the pointers are fixed at 64-bit. Shaders are now fixed at
the start of their BOs.
|
|
|
|
|
| |
Execution will end at the cl->next, because that's what ct0ea/ct1ea get
programmed to.
|
|
|
|
|
|
| |
It turns out the simulator was not treating this bit the same as the RPi,
and I'd forgotten to remove it when turning on early Z. The result was
that you'd get big chunks of your rendering missing.
|
|
|
|
|
|
|
| |
Improves framerate of 5 seconds of es2gears by 1.57473% +/- 0.669409%
(n=67).
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
| |
Can't reset the CL before looking at how much we had put in it.
|
|
|
|
|
| |
This gives a 2.7x improvement in x11perf -rect100, since we only end up
load/storing the x11perf window, not the whole screen.
|
|
|
|
|
|
| |
This will be more important in the next commit, when there's more state to
reset to nonzero values, and I want an early exit from the submit
function.
|
|
|
|
|
|
|
|
| |
The optimizer obviously doesn't have the ability to rewrite these to skip
the size checks per call, so we have to do it manually.
Improves a norast benchmark on simulation by 0.779706% +/- 0.405838%
(n=6087).
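A rough sketch of what hoisting the size check can look like, assuming a growable byte buffer for the command list. The struct layout and all sketch_* names are hypothetical and not taken from the driver; the idea shown is reserving space once per packet and then writing with unchecked helpers.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical command list: a growable byte buffer. */
struct sketch_cl {
    uint8_t *base;
    size_t used;
    size_t size;
};

/* Grow the buffer so that at least `bytes` more bytes fit. */
static void sketch_cl_ensure_space(struct sketch_cl *cl, size_t bytes)
{
    size_t new_size;
    uint8_t *p;

    if (cl->used + bytes <= cl->size)
        return;
    new_size = cl->size ? cl->size : 64;
    while (new_size < cl->used + bytes)
        new_size *= 2;
    p = realloc(cl->base, new_size);
    assert(p != NULL);
    cl->base = p;
    cl->size = new_size;
}

/* Checked write: pays for a size check on every call. */
static void sketch_cl_u8_checked(struct sketch_cl *cl, uint8_t v)
{
    sketch_cl_ensure_space(cl, 1);
    cl->base[cl->used++] = v;
}

/* Unchecked writes: the caller must already have reserved the space. */
static void sketch_cl_u8(struct sketch_cl *cl, uint8_t v)
{
    assert(cl->used + 1 <= cl->size);
    cl->base[cl->used++] = v;
}

static void sketch_cl_u32(struct sketch_cl *cl, uint32_t v)
{
    assert(cl->used + sizeof(v) <= cl->size);
    memcpy(cl->base + cl->used, &v, sizeof(v));
    cl->used += sizeof(v);
}

/* One size check for the whole packet, then unchecked writes. */
static void sketch_emit_packet(struct sketch_cl *cl, uint8_t opcode,
                               uint32_t arg)
{
    sketch_cl_ensure_space(cl, 1 + sizeof(arg));
    sketch_cl_u8(cl, opcode);
    sketch_cl_u32(cl, arg);
}
```

The assert() calls in the unchecked helpers are debug-only guards; in a build with NDEBUG they compile away, leaving a single check per packet.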
|
|
|
|
|
| |
Improves norast performance of a microbenchmark by 11.1865% +/- 2.37673%
(n=20).
|
| |
|
| |
|
| |
|
|
|
|
|
| |
This is nice when you're tracking down which command list is hanging the
GPU.
|
|
|
|
|
| |
Our submits now return immediately and you have to manually wait for
things to complete if you want to (like a normal driver).
|
|
|
|
|
|
|
| |
Previously, the kernel would dispatch thread 0, wait, then dispatch thread
1. By insisting that the thread contents use semaphores in the right
place, the kernel can sleep for longer by dispatching both threads at
once.
|
|
|
|
| |
I'm going to want to make some other decisions here before flushing.
|
|
|
|
|
|
| |
GLES2 doesn't have GL_TEXTURE_BASE_LEVEL, so the hardware doesn't. Fixes
piglit levelclamp, tex-miplevel-selection, and texture-storage/2D mipmap
rendering.
|
|
|
|
| |
This caught the previous commit's bug in the kernel validator.
|
|
|
|
|
|
| |
We don't need to emit all of our current state at the end of each bin
list. We're going to be smashing it all at the start of the next tile's
bin list, anyway.
|
| |
|
|
|
|
|
|
| |
It's not documented that I can see, but the other driver does it (check
vg_hw_4.c), and one of the HW guys confirmed that you really do need to do
it.
|
|
|
|
|
| |
Otherwise, we'd replace the stencil in our packed depth/stencil with 0s.
Fixes about 50 piglit tests.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The code was kind of mixed up about which buffers were getting stored in the case
that a resolve bit was unset (which are set based on the GL state at draw
time) and the buffer wasn't actually bound. In particular, depth-only
rendering would store the color buffer contents, which happen to be
pointing at the depth buffer.
Thanks to clearing out the resolve bits for things we really can't
resolve, now I can drop the safety checks for buffer presence around the
actual stores.
Fixes 42 piglit tests.
|
|
|
|
|
|
| |
We have to expose them for GL 2.0, but we just always return a value of 0.
We should be advertising 0 query bits instead of 64, but gallium doesn't
have plumbing for that yet. At least this stops the segfaults.
|
|
|
|
|
|
|
|
| |
These are pretty catastrophic, "should never happen" failure paths (though
4 tests in piglit hit them currently, due to a single bug). An abort()
that you can gdb on easily is probably more useful than a clean exit,
particularly since a bug in piglit framework right now is causing early
exit(1)s to simply not be recorded in the results at all.
|
|
|
|
|
| |
I wanted to hang the ra_regs off it so I didn't have to free, but it
turned out it wasn't ralloced yet.
|
| |
|
| |
|
|
|
|
|
|
| |
The rest of stencil handling isn't done yet, but it documents an extra
cl_u8(0) and helps make it obvious why we don't need to format clear_depth
the same way the depth/stencil buffer is formatted.
|
|
|
|
|
|
| |
For now it still requires the color buffer to be present -- we're relying
on the store of color buffer contents to end the frame, and we have to do
something with color buffers in the rendering config packet.
|
| |
|
|
|
|
|
|
|
| |
Now that tiling is in place, we can expose the other formats. Depth is
still broken (need to make changes in the shader), but if you don't expose
it things crash all over. SNORM is dropped, but we could re-add it later
with some shader fixes to handle converting between [0,1] and [-1,1].
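For reference, the [0,1] to [-1,1] conversion mentioned above can be sketched as a simple affine map. This is only the arithmetic, written as host-side C with hypothetical function names; it is an assumption that the eventual shader fix would take this form, and it ignores any format-specific rounding.

```c
#include <assert.h>

/* Map a value in [0,1] to [-1,1]: s = 2u - 1. */
static float sketch_unorm_to_snorm(float u)
{
    assert(u >= 0.0f && u <= 1.0f);
    return 2.0f * u - 1.0f;
}

/* Map a value in [-1,1] back to [0,1]: u = (s + 1) / 2. */
static float sketch_snorm_to_unorm(float s)
{
    assert(s >= -1.0f && s <= 1.0f);
    return (s + 1.0f) * 0.5f;
}
```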
|
|
|
|
|
|
| |
This still treats everything as RGBA8888 for the most part, same as
before. This is a prerequisite for handling other texture formats, since
only RGBA8888 has a raster-layout mode.
|
|
|
|
|
| |
There are a few tools I want to have always available, and fprintf() and
abort() are among them.
|
|
|
|
|
|
|
|
| |
The hw_mask is the set of primitives you actually support, so this attempt
to provide the set of formats that's unsupported was wrong in two ways (it
was intended to be '~' not '!'). However, we only call this code when
prim isn't one of the actually supported hw_mask bits, so missing out on
the memcpy didn't matter anyway.
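The '~' versus '!' difference can be seen in a small sketch. The mask values and function names below are made up for illustration and are not the actual primitive enums or the real code being fixed.

```c
#include <assert.h>
#include <stdint.h>

/* Logical NOT: yields 0 for any non-zero mask, 1 for a zero mask.
 * It never produces the set of unsupported bits. */
static uint32_t sketch_unsupported_logical(uint32_t hw_mask)
{
    return !hw_mask;
}

/* Bitwise NOT, limited to the bits that exist: yields exactly the
 * bits that are absent from hw_mask. */
static uint32_t sketch_unsupported_bitwise(uint32_t hw_mask,
                                           uint32_t all_prims)
{
    assert((hw_mask & ~all_prims) == 0);
    return ~hw_mask & all_prims;
}
```

With four hypothetical primitive bits (0xf) and a supported mask of 0x5, the bitwise form gives 0xa while the logical form gives 0.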
|
|
|
|
|
|
|
| |
At some point I'm going to want to move the information necessary for the
host buffer upload/download into the BO so that it's independent of the
current vc4->framebuffer, but for now this fixes pointless derefs on
non-simulator in vc4_context.c since the dump_fbo() removal.
|