path: root/src/gallium/drivers/vc4/vc4_cl.h
Commit message (Author, Date, Files, Lines -/+)
* broadcom/vc4: Simplify the relocation handling for index buffers. (Eric Anholt, 2017-12-01, 1 file, -15/+0)
  Originally there was CL code for handling various relocations back when I had relocs for the TSDA/TA buffers. Now that the kernel handles those entirely on its own, I can inline that code into the one place using it.
* vc4: fix release build (Eric Engestrom, 2017-10-27, 1 file, -6/+6)
  Mesa's DEBUG and assert's NDEBUG are not tied to each other, so we need to explicitly compile this code out.
  Fixes: 3df78928786134874eafa "vc4: Drop reloc_count tracking for debug asserts on non-debug builds."
  Cc: Eric Anholt <[email protected]>
  Signed-off-by: Eric Engestrom <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
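The mismatch described in this commit can be illustrated with a minimal sketch. It assumes a debug-only counter field guarded by `DEBUG`; the names `cl_dbg`, `cl_start_reloc` and `cl_reloc` are hypothetical and are not taken from the driver. If the checks were guarded only by `assert()` (that is, by `NDEBUG`), a build with neither macro defined would reference a field that does not exist, so the checking code is placed under the same `DEBUG` guard as the field.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical command-list state; not the driver's real struct. */
struct cl_dbg {
    uint8_t *next;
#ifdef DEBUG
    uint32_t reloc_count;   /* exists only when DEBUG is defined */
#endif
};

/* Record how many relocations the caller promises to emit. */
static inline void cl_start_reloc(struct cl_dbg *cl, uint32_t n)
{
#ifdef DEBUG
    /* Same guard as the field, independent of NDEBUG. */
    assert(cl->reloc_count == 0);
    cl->reloc_count = n;
#else
    (void)cl;
    (void)n;
#endif
}

/* Consume one promised relocation. */
static inline void cl_reloc(struct cl_dbg *cl)
{
#ifdef DEBUG
    assert(cl->reloc_count > 0);
    cl->reloc_count--;
#else
    (void)cl;
#endif
}
```

With this layout the code compiles whether or not `DEBUG` and `NDEBUG` are defined, in any combination.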
* vc4: Start using XML unpack functions in CL dump. (Eric Anholt, 2017-06-30, 1 file, -1/+0)
  For now this is a no-op on the output, but it makes it clear that we've had weird things going on with things like V3D21_CLIPPER_Z_SCALE_AND_OFFSET.
* vc4: Move rasterizer state packing to CSO creation time. (Eric Anholt, 2017-06-30, 1 file, -0/+5)
  This gets our vc4_emit.c size back down a bit:
  before:
    1020 0 0 1020 3fc src/gallium/drivers/vc4/.libs/vc4_emit.o
  after:
    968 0 0 968 3c8 src/gallium/drivers/vc4/.libs/vc4_emit.o
* vc4: Convert the driver to emitting the shader record using pack macros. (Eric Anholt, 2017-06-30, 1 file, -10/+37)
* vc4: Simplify pack header usage (Eric Anholt, 2017-06-30, 1 file, -6/+9)
  Take the CL pointer in, which will be useful for enabling relocs. However, our code expands a bit more:
  before:
    4449 0 0 4449 1161 src/gallium/drivers/vc4/.libs/vc4_draw.o
    988 0 0 988 3dc src/gallium/drivers/vc4/.libs/vc4_emit.o
  after:
    4481 0 0 4481 1181 src/gallium/drivers/vc4/.libs/vc4_draw.o
    1020 0 0 1020 3fc src/gallium/drivers/vc4/.libs/vc4_emit.o
* vc4: Start using the pack header. (Eric Anholt, 2017-06-30, 1 file, -0/+63)
  This slightly inflates the size of the generated code, in exchange for getting us some convenient tools.
  before:
    4389 0 0 4389 1125 src/gallium/drivers/vc4/.libs/vc4_draw.o
    808 0 0 808 328 src/gallium/drivers/vc4/.libs/vc4_emit.o
  after:
    4449 0 0 4449 1161 src/gallium/drivers/vc4/.libs/vc4_draw.o
    988 0 0 988 3dc src/gallium/drivers/vc4/.libs/vc4_emit.o
* vc4: Move the render job state into a separate structure. (Eric Anholt, 2016-09-14, 1 file, -6/+7)
  This is a preparation step for having multiple jobs being queued up at the same time.
* vc4: Drop reloc_count tracking for debug asserts on non-debug builds. (Eric Anholt, 2015-07-14, 1 file, -0/+10)
  Cuts another 88 bytes of compiled code.
* vc4: Rework cl handling to be friendlier to the compiler. (Eric Anholt, 2015-07-14, 1 file, -47/+66)
  Drops 680 bytes of code, from avoiding a bunch of extra updates to the next pointer in the struct.
* vc4: Make a helper function for getting the current offset in the CL. (Eric Anholt, 2015-07-14, 1 file, -5/+10)
  I needed to rewrite this a bit for safety checking in the next commit. Despite being a static inline of the same thing that was being done, we lose 36 bytes of code for some reason.
* vc4: Drop separate cl*_reloc_hindex(). (Eric Anholt, 2015-07-14, 1 file, -18/+6)
  Now that RCL generation is in the kernel, we don't have any other callers. Oddly, the compiler generates another 8 bytes of code for this, but the simplification is worth it.
* vc4: Store reloc pointers as pointers, not offsets. (Eric Anholt, 2015-07-14, 1 file, -5/+5)
  Now that we don't resize the CL as we build (it's set up at the top by vc4_start_draw()), we can store the pointers instead of offsets from the base. Saves a bit of math in emitting relocs (about 60 bytes of code).
* vc4: Move vc4_packet.h to the kernel/ directory, since it's also shared. (Eric Anholt, 2015-06-16, 1 file, -1/+1)
  I want to notice discrepancies when I diff -u between Mesa and the kernel.
* vc4: Handle unaligned accesses in CL emits. (Eric Anholt, 2014-12-25, 1 file, -1/+52)
  As of 229bf4475ff0a5dbeb9bc95250f7a40a983c2e28 we started getting SIGBUS from unaligned accesses on the hardware, for reasons I haven't figured out. However, we should be avoiding unaligned accesses anyway, and our CL setup certainly would have produced them.
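One common way to avoid unaligned stores when packing a byte stream is to copy each value through `memcpy`, as in the sketch below. This is an illustration under stated assumptions, not the code from the commit: `cl_out`, `put_unaligned_32`, `emit_u8` and `emit_u32` are hypothetical names.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical write cursor into a command list; illustrative only. */
struct cl_out {
    uint8_t *next;
};

/* Store a 32-bit value at an address that may not be 4-byte aligned.
 * memcpy leaves it to the compiler to pick instructions that are safe
 * for the target instead of issuing a plain aligned word store. */
static inline void put_unaligned_32(void *ptr, uint32_t val)
{
    memcpy(ptr, &val, sizeof(val));
}

/* Emit one byte and advance the cursor. */
static inline void emit_u8(struct cl_out *cl, uint8_t v)
{
    *cl->next = v;
    cl->next += 1;
}

/* Emit a 32-bit value; after a 1-byte opcode the cursor is unaligned. */
static inline void emit_u32(struct cl_out *cl, uint32_t v)
{
    put_unaligned_32(cl->next, v);
    cl->next += 4;
}
```

A one-byte opcode followed by a 32-bit field is the typical case: the field lands at an odd address, and a direct `*(uint32_t *)ptr = v` store there is what can fault on hardware that traps unaligned accesses.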
* vc4: Don't bother zero-initializing the shader reloc indices. (Eric Anholt, 2014-12-25, 1 file, -2/+2)
  They should all be set to real values by the time they're read, and ideally if you used valgrind you'd see uninitialized value uses.
* vc4: Fix the argument type for cl_u16(). (Eric Anholt, 2014-12-25, 1 file, -1/+1)
  It doesn't matter, since it just got truncated to 16 inside, anyway.
* vc4: Optimize CL emits by doing size checks up front. (Eric Anholt, 2014-12-24, 1 file, -10/+7)
  The optimizer obviously doesn't have the ability to rewrite these to skip the size checks per call, so we have to do it manually. Improves a norast benchmark on simulation by 0.779706% +/- 0.405838% (n=6087).
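The idea of hoisting the size check can be sketched as follows: check once that the whole packet fits, then write each field without any per-field check. This is a minimal illustration with hypothetical names (`cl_buf`, `cl_has_space`, `cl_put_u8`, `cl_put_u16`, `emit_packet`) and a fixed-size buffer; how the driver actually reserves or grows space is not shown in this log.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical command-list buffer; field names are illustrative. */
struct cl_buf {
    uint8_t *base;
    uint8_t *next;
    size_t size;
};

/* Single up-front check: is there room for `bytes` more bytes? */
static inline int cl_has_space(const struct cl_buf *cl, size_t bytes)
{
    return (size_t)(cl->next - cl->base) + bytes <= cl->size;
}

/* Unchecked writers; callers must have reserved space already. */
static inline void cl_put_u8(struct cl_buf *cl, uint8_t v)
{
    *cl->next++ = v;
}

static inline void cl_put_u16(struct cl_buf *cl, uint16_t v)
{
    memcpy(cl->next, &v, sizeof(v));
    cl->next += sizeof(v);
}

/* Emit a 1 + 2 + 2 = 5 byte packet with one check instead of three.
 * Returns 1 on success, 0 if the packet does not fit. */
static inline int emit_packet(struct cl_buf *cl, uint8_t opcode,
                              uint16_t a, uint16_t b)
{
    if (!cl_has_space(cl, 5))
        return 0;
    cl_put_u8(cl, opcode);
    cl_put_u16(cl, a);
    cl_put_u16(cl, b);
    return 1;
}
```

The per-field writers stay branch-free, which is the part the compiler could not achieve on its own when every writer carried its own check.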
* vc4: Avoid repeated hindex lookups in the loop over tiles. (Eric Anholt, 2014-12-24, 1 file, -3/+9)
  Improves norast performance of a microbenchmark by 11.1865% +/- 2.37673% (n=20).
* vc4: Make some assertions about how many flushes/EOFs the simulator sees. (Eric Anholt, 2014-10-17, 1 file, -1/+1)
  This caught the previous commit's bug in the kernel validator.
* vc4: Actually implement VC4_DEBUG=cl. (Eric Anholt, 2014-09-18, 1 file, -0/+1)
* vc4: Rename GEM_HANDLES to be in a namespace. (Eric Anholt, 2014-08-11, 1 file, -1/+1)
  It's not a real VC4 hardware packet, but I've put in a comment to explain it.
* vc4: Switch simulator to using kernel validator (Eric Anholt, 2014-08-11, 1 file, -12/+10)
  This ensures that when I'm using the simulator, I get a closer match to what behavior on real hardware will be. It lets me rapidly iterate on the kernel validation code (which otherwise has a several-minute turnaround time), and helps catch buffer overflow bugs in the userspace driver faster.
* vc4: Initial skeleton driver import. (Eric Anholt, 2014-08-08, 1 file, -0/+132)
  This mostly just takes every draw call and turns it into a sequence of commands that clear the FBO and draw a single shaded triangle to it, regardless of the actual input vertices or shaders. I copied the initial driver skeleton mostly from freedreno, and I've preserved Rob Clark's copyright for those. I also based my initial hardcoded shaders and command lists on Scott Mansell (phire)'s "hackdriver" project, though the bit patterns of the shaders emitted end up being different.
  v2: Rebase on gallium megadrivers changes.
  v3: Rebase on PIPE_SHADER_CAP_MAX_CONSTS change.
  v4: Rely on simpenrose actually being installed when building for simulation.
  v5: Add more header duplicate-include guards.
  v6: Apply Emil's review (protection against vc4 sim and ilo at the same time, and dropping the dricommon drm bits) and fix a copyright header (thanks, Roland)