aboutsummaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers/vc4/vc4_cl.h
Commit message (Collapse)AuthorAgeFilesLines
* vc4: Drop reloc_count tracking for debug asserts on non-debug builds.Eric Anholt2015-07-141-0/+10
| | | | Cuts another 88 bytes of compiled code.
* vc4: Rework cl handling to be friendlier to the compiler.Eric Anholt2015-07-141-47/+66
| | | | | Drops 680 bytes of code, from avoiding a bunch of extra updates to the next pointer in the struct.
* vc4: Make a helper function for getting the current offset in the CL.Eric Anholt2015-07-141-5/+10
| | | | | | I needed to rewrite this a bit for safety checking in the next commit. Despite being a static inline of the same thing that was being done, we lose 36 bytes of code for some reason.
* vc4: Drop separate cl*_reloc_hindex().Eric Anholt2015-07-141-18/+6
| | | | | | Now that RCL generation is in the kernel, we don't have any other callers. Oddly, the compiler generates another 8 bytes of code for this, but the simplification is worth it.
* vc4: Store reloc pointers as pointers, not offsets.Eric Anholt2015-07-141-5/+5
| | | | | | | Now that we don't resize the CL as we build (it's set up at the top by vc4_start_draw()), we can store the pointers instead of offsets from the base. Saves a bit of math in emitting relocs (about 60 bytes of code).
* vc4: Move vc4_packet.h to the kernel/ directory, since it's also shared.Eric Anholt2015-06-161-1/+1
| | | | I want to notice discrepancies when I diff -u between Mesa and the kernel.
* vc4: Handle unaligned accesses in CL emits.Eric Anholt2014-12-251-1/+52
| | | | | | | As of 229bf4475ff0a5dbeb9bc95250f7a40a983c2e28 we started getting SIBGUS from unaligned accesses on the hardware, for reasons I haven't figured out. However, we should be avoiding unaligned accesses anyway, and our CL setup certainly would have produced them.
* vc4: Don't bother zero-initializing the shader reloc indices.Eric Anholt2014-12-251-2/+2
| | | | | They should all be set to real values by the time they're read, and ideally if you used valgrind you'd see uninitialized value uses.
* vc4: Fix the argument type for cl_u16().Eric Anholt2014-12-251-1/+1
| | | | It doesn't matter, since it just got truncated to 16 inside, anyway.
* vc4: Optimize CL emits by doing size checks up front.Eric Anholt2014-12-241-10/+7
| | | | | | | | The optimizer obviously doesn't have the ability to rewrite these to skip the size checks per call, so we have to do it manually. Improves a norast benchmark on simulation by 0.779706% +/- 0.405838% (n=6087).
* vc4: Avoid repeated hindex lookups in the loop over tiles.Eric Anholt2014-12-241-3/+9
| | | | | Improves norast performance of a microbenchmark by 11.1865% +/- 2.37673% (n=20).
* vc4: Make some assertions about how many flushes/EOFs the simulator sees.Eric Anholt2014-10-171-1/+1
| | | | This caught the previous commit's bug in the kernel validator.
* vc4: Actually implement VC4_DEBUG=cl.Eric Anholt2014-09-181-0/+1
|
* vc4: Rename GEM_HANDLES to be in a namespace.Eric Anholt2014-08-111-1/+1
| | | | | It's not a real VC4 hardware packet, but I've put in a comment to explain it.
* vc4: Switch simulator to using kernel validatorEric Anholt2014-08-111-12/+10
| | | | | | | | This ensures that when I'm using the simulator, I get a closer match to what behavior on real hardware will be. It lets me rapidly iterate on the kernel validation code (which otherwise has a several-minute turnaround time), and helps catch buffer overflow bugs in the userspace driver faster.
* vc4: Initial skeleton driver import.Eric Anholt2014-08-081-0/+132
This mostly just takes every draw call and turns it into a sequence of commands that clear the FBO and draw a single shaded triangle to it, regardless of the actual input vertices or shaders. I copied the initial driver skeleton mostly from freedreno, and I've preserved Rob Clark's copyright for those. I also based my initial hardcoded shaders and command lists on Scott Mansell (phire)'s "hackdriver" project, though the bit patterns of the shaders emitted end up being different. v2: Rebase on gallium megadrivers changes. v3: Rebase on PIPE_SHADER_CAP_MAX_CONSTS change. v4: Rely on simpenrose actually being installed when building for simulation. v5: Add more header duplicate-include guards. v6: Apply Emil's review (protection against vc4 sim and ilo at the same time, and dropping the dricommon drm bits) and fix a copyright header (thanks, Roland)