aboutsummaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers/vc4/vc4_cl.c
Commit message (Collapse)AuthorAgeFilesLines
* broadcom/vc4: Use a single-entry cached last_hindex value.Eric Anholt2017-12-011-2/+12
| | | | | | | | | Since almost all BOs will be in one CL at a time, this cache will almost always hit except for the first usage of the BO in each CL. This didn't show up as statistically significant on the minetest trace (n=340), but if I lop off the throttled lobe of the bimodal distribution, it very clearly does (0.74731% +/- 0.162093%, n=269).
* vc4: Convert the driver to emitting the shader record using pack macros.Eric Anholt2017-06-301-2/+3
|
* vc4: Flush the job early if we're referencing too many BOs.Eric Anholt2017-01-051-0/+2
| | | | | | | | | | | If we get up toward 256MB (or whatever the CMA area size is), VC4_GEM_CREATE will start throwing errors. Even if we don't trigger that, when we flush the kernel's BO allocation for the CLs or bin memory may end up throwing an error, at which point our job won't get rendered at all. Just flush early (half of maximum CMA size) so that hopefully we never get to that point.
* ralloc: use rzalloc where it's necessaryMarek Olšák2016-10-311-1/+1
| | | | | | | | | | | | | | | | | No change in behavior. ralloc_size is equivalent to rzalloc_size. That will change though. Calls not switched to rzalloc_size: - ralloc_vasprintf - glsl_type::name allocation (it's filled with snprintf) - C++ classes where valgrind didn't show uninitialized values I switched most of non-glsl stuff to rzalloc without checking whether it's really needed. Reviewed-by: Edward O'Callaghan <[email protected]> Tested-by: Edmondo Tommasina <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* vc4: Move the render job state into a separate structure.Eric Anholt2016-09-141-9/+9
| | | | | This is a preparation step for having multiple jobs being queued up at the same time.
* vc4: Rework cl handling to be friendlier to the compiler.Eric Anholt2015-07-141-2/+9
| | | | | Drops 680 bytes of code, from avoiding a bunch of extra updates to the next pointer in the struct.
* vc4: Make a helper function for getting the current offset in the CL.Eric Anholt2015-07-141-5/+4
| | | | | | I needed to rewrite this a bit for safety checking in the next commit. Despite being a static inline of the same thing that was being done, we lose 36 bytes of code for some reason.
* vc4: Optimize CL emits by doing size checks up front.Eric Anholt2014-12-241-4/+8
| | | | | | | | The optimizer obviously doesn't have the ability to rewrite these to skip the size checks per call, so we have to do it manually. Improves a norast benchmark on simulation by 0.779706% +/- 0.405838% (n=6087).
* vc4: Fix leaks of the CL contents.Eric Anholt2014-12-141-1/+4
|
* vc4: Keep a reference to BOs queued for rendering.Eric Anholt2014-08-111-4/+1
| | | | | | Otherwise, once we're not flushing at the end of every draw, we'll free things like gallium resources, and free the backing GEM object, before we've flushed the rendering using it to the kernel.
* vc4: Switch simulator to using kernel validatorEric Anholt2014-08-111-9/+5
| | | | | | | | This ensures that when I'm using the simulator, I get a closer match to what behavior on real hardware will be. It lets me rapidly iterate on the kernel validation code (which otherwise has a several-minute turnaround time), and helps catch buffer overflow bugs in the userspace driver faster.
* vc4: Initial skeleton driver import.Eric Anholt2014-08-081-0/+74
This mostly just takes every draw call and turns it into a sequence of commands that clear the FBO and draw a single shaded triangle to it, regardless of the actual input vertices or shaders. I copied the initial driver skeleton mostly from freedreno, and I've preserved Rob Clark's copyright for those. I also based my initial hardcoded shaders and command lists on Scott Mansell (phire)'s "hackdriver" project, though the bit patterns of the shaders emitted end up being different. v2: Rebase on gallium megadrivers changes. v3: Rebase on PIPE_SHADER_CAP_MAX_CONSTS change. v4: Rely on simpenrose actually being installed when building for simulation. v5: Add more header duplicate-include guards. v6: Apply Emil's review (protection against vc4 sim and ilo at the same time, and dropping the dricommon drm bits) and fix a copyright header (thanks, Roland)