path: root/src/gallium/drivers/vc4/vc4_draw.c
Commit message | Author | Age | Files | Lines
* vc4: Translate 4-byte index buffers to 2 bytes. | Eric Anholt | 2014-10-19 | 1 | -5/+9
    Fixes assertion failures in 14 piglit tests (half of which now pass).
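The translation named in this commit title can be illustrated with a minimal sketch. It assumes the goal is to narrow 32-bit indices into a 16-bit array and that indices which do not fit are treated as an error; `translate_indices_u32_to_u16` is a hypothetical name, and the driver's actual code path may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not the driver's actual function): copy 32-bit
 * indices into a caller-provided 16-bit array.
 * Returns 0 on success, or -1 if any index does not fit in 16 bits. */
static int
translate_indices_u32_to_u16(const uint32_t *src, uint16_t *dst, size_t count)
{
        assert(src != NULL && dst != NULL);

        for (size_t i = 0; i < count; i++) {
                if (src[i] > UINT16_MAX)
                        return -1;      /* would be truncated; refuse */
                dst[i] = (uint16_t)src[i];
        }
        return 0;
}
```

Rejecting out-of-range indices instead of silently truncating them is a choice made for this sketch only.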
* vc4: Set the primitive list format at the start of rendering. | Eric Anholt | 2014-10-17 | 1 | -0/+9
    The other driver does this manually before calling into each tile, but we can just let it get binned into the tiles (saving repeated kernel validation on the packet).
    Fixes simulator assertion failures on polygon-mode and non-auto texwrap.
* vc4: Add some comments about state management. | Eric Anholt | 2014-10-17 | 1 | -0/+5
* vc4: Add support for having 0 vertex elements used. | Eric Anholt | 2014-10-14 | 1 | -6/+21
    You have to load at least 1, according to the simulator. Fixes 4 piglit tests and even more ES2 conformance tests.
* vc4: Don't look up the compiled shaders unless state has changed. | Eric Anholt | 2014-10-10 | 1 | -0/+5
    Improves simulated norast performance on a little benchmark by 38.0965% +/- 3.27534% (n=11).
* vc4: Actually clear the context's dirty flags. | Eric Anholt | 2014-10-10 | 1 | -0/+1
    I was trying to skip state updates when !dirty, and suspiciously everything was always dirty.
* vc4: Split the coordinate shader to its own vc4_compiled_shader. | Eric Anholt | 2014-10-09 | 1 | -10/+6
    Merging VS and CS into the same struct wasn't winning us anything except for not allocating a separate BO (but if we want to pack programs into BOs, we should pack not just those 2 programs together). What it was getting us was a bunch of code duplication about hash table lookups and propagating vc4_compile contents into a vc4_compiled_shader. I was about to make the situation worse with indirect uniform buffer access.
* vc4: Compute max_index instead of trusting the rest of userspace. | Eric Anholt | 2014-09-24 | 1 | -5/+13
    max_index was coming from either the user telling us as part of glDrawRangeElements, or from an incidental calculation as part of some sort of primitive conversion fallback. Sometimes, it was just set to the default "I don't know" ~0 value. If it wasn't set to the actual max index, then the kernel would reject the draw call for allowing out-of-bounds VBO reads.
    So, compute the max index from the sizes of the VBOs, which isn't too expensive (unlike mapping and reading the index buffer) and is reliable.
    Fixes piglit vao-element-array-buffer.
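One way to derive a safe max index from VBO sizes is sketched below. The commit message does not give the exact formula, so the struct, its field names, and the treatment of zero-stride bindings are all assumptions made for illustration: for each binding, the largest index whose element still lies inside the buffer is `(buffer_size - offset - elem_size) / stride`, and the draw's max index is the minimum over all bindings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one vertex buffer binding. */
struct demo_vertex_binding {
        uint32_t buffer_size;  /* bytes in the VBO */
        uint32_t offset;       /* byte offset of vertex 0's element */
        uint32_t stride;       /* bytes between consecutive vertices */
        uint32_t elem_size;    /* bytes read per vertex */
};

/* Writes the largest index that keeps every binding's reads in bounds.
 * Returns 0 on success, or -1 if even index 0 would read out of bounds. */
static int
demo_compute_max_index(const struct demo_vertex_binding *b, size_t count,
                       uint32_t *max_index)
{
        uint32_t result = UINT32_MAX;

        assert(max_index != NULL);
        assert(b != NULL || count == 0);

        for (size_t i = 0; i < count; i++) {
                /* 64-bit sum so offset + elem_size cannot wrap around. */
                uint64_t first_end = (uint64_t)b[i].offset + b[i].elem_size;

                if (first_end > b[i].buffer_size)
                        return -1;

                /* Assumption: stride 0 re-reads one element, so no limit. */
                if (b[i].stride == 0)
                        continue;

                uint32_t limit =
                        (uint32_t)((b[i].buffer_size - first_end) / b[i].stride);
                if (limit < result)
                        result = limit;
        }

        *max_index = result;
        return 0;
}
```

This only touches per-binding metadata, which matches the message's point that the calculation is cheap compared with mapping and scanning the index buffer.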
* vc4: Move shader record setup before the draw call. | Eric Anholt | 2014-09-24 | 1 | -38/+38
    The flush only happens after both are written, so we can do them in either order. This will let me compute max_index during the shader record setup.
* vc4: Add support for point size setting. | Eric Anholt | 2014-09-24 | 1 | -1/+5
    This is the support for both the global and per-vertex modes.
* vc4: Add support for stencil operations. | Eric Anholt | 2014-09-18 | 1 | -0/+2
    While depth test state is passed through the fragment shader as sideband data, the stencil test state has to be set by the fragment shader itself.
    Many tests are still failing, but this gets most of hiz/ passing.
* vc4: Fill out the stencil clear field. | Eric Anholt | 2014-09-09 | 1 | -0/+3
    The rest of stencil handling isn't done yet, but it documents an extra cl_u8(0) and helps make it obvious why we don't need to format clear_depth the same way the depth/stencil buffer is formatted.
* vc4: Flip around the depth/stencil fields. | Eric Anholt | 2014-09-09 | 1 | -1/+5
    After implementing depth stores, it looks like this is the way things actually are, according to hiz-depth-read-fbo-d24-s0's probes.
* vc4: Add a debug flag for flushing after every draw. | Eric Anholt | 2014-09-09 | 1 | -0/+3
    It was useful on i965, but it's even more useful for debugging tiled renderers.
* vc4: Include stdio/stdlib in headers so I don't have to include it per file. | Eric Anholt | 2014-08-22 | 1 | -2/+0
    There are a few tools I want to have always available, and fprintf() and abort() are among them.
* vc4: Consume the implicit varyings for points and lines. | Eric Anholt | 2014-08-15 | 1 | -1/+1
    We were triggering simulator assertion failures for not consuming these, and presumably we want to actually make use of them some day (for things like point/line antialiasing).
    Note that this has the qreg index as 0, which is the same index as the first GL varyings read. This doesn't matter currently, since that number isn't used for anything except dumping.
* vc4: Clean up the tile alloc buffer size. | Eric Anholt | 2014-08-11 | 1 | -1/+9
    This prevents some simulator assertion failures, but it does mean (since I've dropped the "* 16" padding) that on real hardware you need a kernel that does overflow memory management (currently, "drm/vc4: Add support for binner overflow memory allocation." in my kernel tree).
* vc4: Clarify some values implicitly chosen for binning config. | Eric Anholt | 2014-08-11 | 1 | -1/+4
    These #defines are 0, but it should help make math above make more sense.
* vc4: Drop VC4_PACKET_PRIMITIVE_LIST_FORMAT. | Eric Anholt | 2014-08-11 | 1 | -3/+0
    It's not relevant to our command streams any more.
    v2: Fix indentation and a typo in the comment.
* vc4: Add support for depth clears and tests within a tile. | Eric Anholt | 2014-08-11 | 1 | -1/+4
    This doesn't load/store the Z contents across submits yet. It also disables early Z, since it's going to require tracking of Z functions across multiple state updates to track the early Z direction and whether it can be used.
    v2: Move the key setup to before the search for the key.
* vc4: Drop the flush at the end of the draw | Eric Anholt | 2014-08-11 | 1 | -2/+0
    Now we actually get multiple draw calls per submit.
* vc4: Track clears versus uncleared draws, and the clear color. | Eric Anholt | 2014-08-11 | 1 | -13/+53
    This is a step toward queueing more than one draw per frame.
    Fixes piglit attribute0 test, since we get a working clear color now.
* vc4: Move the rest of RCL setup to flush time. | Eric Anholt | 2014-08-11 | 1 | -30/+0
    We only want to set up render target config and clear colors once per frame.
* vc4: Move render command list calls to vc4_flush() | Eric Anholt | 2014-08-11 | 1 | -40/+0
* vc4: Move bin command list ending commands to vc4_flush() | Eric Anholt | 2014-08-11 | 1 | -4/+0
* vc4: Rewrite the kernel ABI to support texture uniform relocation. | Eric Anholt | 2014-08-11 | 1 | -12/+10
    This required building a shader parser that would walk the program to find where the texturing-related uniforms are in the uniforms stream.
    Note that as of this commit, a new kernel is required for rendering on actual VC4 hardware (currently that commit is named "drm/vc4: Introduce shader validation and better command stream validation.", but is likely to be squashed as part of an eventual merge of the kernel driver).
* vc4: Switch simulator to using kernel validator | Eric Anholt | 2014-08-11 | 1 | -5/+0
    This ensures that when I'm using the simulator, I get a closer match to what behavior on real hardware will be. It lets me rapidly iterate on the kernel validation code (which otherwise has a several-minute turnaround time), and helps catch buffer overflow bugs in the userspace driver faster.
* vc4: Add support for texturing (under simulation) | Eric Anholt | 2014-08-11 | 1 | -0/+3
    Only rgba8888 works, and only a single texture unit, and it's only under simulation because I haven't built the kernel interface yet.
    v2: Rebase on helpers.
    v3: Fold in the don't-break-the-arm-build fix.
* vc4: Add support for swizzles of 32 bit float vertex attributes. | Eric Anholt | 2014-08-08 | 1 | -5/+0
    Some tests start working (useprogram-flushverts, for example) due to getting the right vertices now. Some that used to pass start failing with memory overflow during binning, which is weird (glsl-fs-texture2drect). And a couple stop rendering correctly (glsl-fs-bug25902).
    v2: Move the attribute format setup in the key from after search time to before the search.
    v3: Fix reading of attributes other than position (I forgot to respect attr and stored everything in inputs 0-3, i.e. position).
* vc4: Crank up the tile allocation BO size | Eric Anholt | 2014-08-08 | 1 | -2/+2
    This avoids a simulator assertion failure with glamor. I need to actually support resize, though.
* vc4: Add support for multiple attributes | Eric Anholt | 2014-08-08 | 1 | -14/+20
* vc4: Add WIP support for varyings. | Eric Anholt | 2014-08-08 | 1 | -1/+1
    It doesn't do all the interpolation yet, but more tests can run now.
    v2: Rebase on helpers.
* vc4: Add shader variant caching to handle FS output swizzle. | Eric Anholt | 2014-08-08 | 1 | -0/+2
* vc4: Load the tile buffer before incrementally drawing. | Eric Anholt | 2014-08-08 | 1 | -8/+22
    We will want to occasionally disable this again when we do clear support.
    v2: Squash with the previous commit (I accidentally committed at two stages of writing the change)
* vc4: Don't reallocate the tile alloc/state bos every frame. | Eric Anholt | 2014-08-08 | 1 | -10/+18
    This was a problem for the simulator since we don't free memory back to it, and it would soon just run out.
* vc4: Use the user's actual first vertex attribute. | Eric Anholt | 2014-08-08 | 1 | -35/+55
    This is hardcoded to read it as RGBA32F so far, but starts to get more tests working.
* vc4: Switch to actually generating vertex and fragment shader code from TGSI. | Eric Anholt | 2014-08-08 | 1 | -18/+15
    This introduces an IR (QIR, for QPU IR) to do optimization on. It's a scalar, SSA IR in general. It looks like optimization is pretty easy this way, though I haven't figured out if it's going to be good for our weird register allocation or not (or if I want to reduce to basically QPU instructions first), and I've got some problems with it having some multi-QPU-instruction opcodes (SEQ and CMP, for example) which I probably want to break down.
    Of course, this commit mostly doesn't work, since many other things are still hardwired, like the VBO data.
    v2: Rewrite to use a bunch of helpers (qir_OPCODE) for emitting QIR instructions into temporary values, and make qir_inst4 take the 4 args separately instead of an array (all later callers wanted individual args).
* vc4: Start converting the driver to use vertex shaders. | Eric Anholt | 2014-08-08 | 1 | -44/+48
    Note: This is the cutoff point where I switched from developing primarily on the Pi to developing on the simulator. As a result, from this point on the code is untested on the Pi (the kernel code I have currently wasn't rendering anything at this commit, though the simulator renders successfully, suggesting kernel bugs).
* vc4: Initial skeleton driver import. | Eric Anholt | 2014-08-08 | 1 | -0/+241
    This mostly just takes every draw call and turns it into a sequence of commands that clear the FBO and draw a single shaded triangle to it, regardless of the actual input vertices or shaders. I copied the initial driver skeleton mostly from freedreno, and I've preserved Rob Clark's copyright for those. I also based my initial hardcoded shaders and command lists on Scott Mansell (phire)'s "hackdriver" project, though the bit patterns of the shaders emitted end up being different.
    v2: Rebase on gallium megadrivers changes.
    v3: Rebase on PIPE_SHADER_CAP_MAX_CONSTS change.
    v4: Rely on simpenrose actually being installed when building for simulation.
    v5: Add more header duplicate-include guards.
    v6: Apply Emil's review (protection against vc4 sim and ilo at the same time, and dropping the dricommon drm bits) and fix a copyright header (thanks, Roland)