summaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers/vc4
Commit message (Collapse)AuthorAgeFilesLines
* vc4: Switch to using a separate ioctl for making shaders.Eric Anholt2015-07-174-12/+78
| | | | | | | | | This gives the kernel a chance to validate and lock down the data, without having to deal with mmap zapping. With this, GLBenchmark stops on a texture relocations, because we'd recycled a shader BO as another shader and failed to revalidate, since we weren't clearing the cached validation state on mmap faults.
* vc4: Fix printing of shader-db debug when shader-db isn't turned on.Eric Anholt2015-07-171-4/+6
|
* vc4: Add debugging on texture relocation validation failures.Eric Anholt2015-07-171-7/+13
|
* vc4: Also consider uniform 0 in uniform lowering.Eric Anholt2015-07-171-3/+3
| | | | The hash table considers key 0 to be the empty key.
* vc4: Use the pure/const attributes on a bunch of our QPU functions.Eric Anholt2015-07-172-18/+18
| | | | | | On a release build, this makes the rest of vc4_qpu_validate.c go away (the compiler didn't know that our qpu helper function calls had no side effects).
* gallium: add PIPE_CAP_MAX_SHADER_PATCH_VARYINGSMarek Olšák2015-07-161-0/+1
| | | | Reviewed-by: Ilia Mirkin <[email protected]>
* vc4: Cache the texture p1 for the sampler.Eric Anholt2015-07-143-49/+69
| | | | | Cuts another 12% of vc4_uniforms.o, in exchange for computing it at CSO creation time.
* vc4: Cache texture p0/p1 setup for the sampler view.Eric Anholt2015-07-143-28/+43
| | | | | In exchange for a bit of space and computation in CSO setup, we cut vc4_uniform.c (draw time) code size by 4.8%.
* vc4: Move uniforms handling to a separate file.Eric Anholt2015-07-143-314/+341
| | | | | The rest of vc4_program.c is about compiling, while this is about uniform emit at draw time.
* vc4: Fix some -Wdouble-promotion warnings.Eric Anholt2015-07-143-6/+6
| | | | | No code generation changes from this, but it'll be useful to have this next time I go checking -Wdouble-promotion.
* vc4: Fix compiler warnings on release builds.Eric Anholt2015-07-144-7/+14
|
* vc4: Add better debug for register allocation failure.Eric Anholt2015-07-141-1/+5
|
* vc4: Drop reloc_count tracking for debug asserts on non-debug builds.Eric Anholt2015-07-141-0/+10
| | | | Cuts another 88 bytes of compiled code.
* vc4: Rework cl handling to be friendlier to the compiler.Eric Anholt2015-07-146-152/+203
| | | | | Drops 680 bytes of code, from avoiding a bunch of extra updates to the next pointer in the struct.
* vc4: Make a helper function for getting the current offset in the CL.Eric Anholt2015-07-144-20/+21
| | | | | | I needed to rewrite this a bit for safety checking in the next commit. Despite being a static inline of the same thing that was being done, we lose 36 bytes of code for some reason.
* vc4: Drop separate cl*_reloc_hindex().Eric Anholt2015-07-141-18/+6
| | | | | | Now that RCL generation is in the kernel, we don't have any other callers. Oddly, the compiler generates another 8 bytes of code for this, but the simplification is worth it.
* vc4: Store reloc pointers as pointers, not offsets.Eric Anholt2015-07-141-5/+5
| | | | | | | Now that we don't resize the CL as we build (it's set up at the top by vc4_start_draw()), we can store the pointers instead of offsets from the base. Saves a bit of math in emitting relocs (about 60 bytes of code).
* vc4: Add perf debug for when we wait on BOs.Eric Anholt2015-07-144-43/+72
|
* vc4: unref old fenceRob Clark2015-07-101-0/+2
| | | | | | | | | Some, but not all, state trackers will explicitly unref (and set to NULL) the previous *fence before calling pipe->flush(). So driver should use fence_ref() which will unref the old fence if not NULL. Signed-off-by: Rob Clark <[email protected]> Acked-by: Eric Anholt <[email protected]>
* gallium: remove redundant pipe_context::fence_signalledMarek Olšák2015-07-051-11/+0
| | | | | | fence_finish(timeout=0) does the same thing Reviewed-by: Brian Paul <[email protected]>
* gallium/ttn: mark location specially in nir for color0-writes-allIlia Mirkin2015-07-031-0/+6
| | | | | | | | | | We need to distinguish a shader that has separate writes to each MRT from one which is supposed to write the data from MRT 0 to all the MRTs. In TGSI this is done with a property. NIR doesn't have that, so encode it as a funny location and decode on the other end. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* nir/from_ssa: add a flag to not convert everything from SSAConnor Abbott2015-06-301-1/+1
| | | | | | | | | | | | | We already don't convert constants out of SSA, and in our backend we'd like to have only one way of saying something is still in SSA. The one tricky part about this is that we may now leave some undef instructions around if they aren't part of a phi-web, so we have to be more careful about deleting them. v2: rename and flip meaning of flag (Jason) Reviewed-by: Jason Ekstrand <[email protected]>
* mesa: Enable subdir-objects globally.Matt Turner2015-06-261-2/+0
| | | | Reviewed-by: Emil Velikov <[email protected]>
* vc4: Also dump VC4_PACKET_LOAD_TILE_BUFFER_GENERAL.Eric Anholt2015-06-231-2/+14
|
* vc4: Add dumping for VC4_PACKET_LOAD/STORE_FULL_RES_TILE_BUFFER.Eric Anholt2015-06-232-2/+38
|
* vc4: Don't try to CSE color reads.Eric Anholt2015-06-231-1/+2
| | | | | | It returns a new value for each sample in the TLB. We've already avoided trying to get the same index's color multiple times at the vc4_program.c level, so we're not losing anything by doing this.
* vc4: Make a helper for TLB color writes, too.Eric Anholt2015-06-232-1/+2
| | | | We've done so for all the other QIR instruction generation in this file.
* vc4: Pull the blending operation out to a separate function.Eric Anholt2015-06-231-38/+50
| | | | | It's fairly separate from the rest of the TLB operations at frag end time, and we'll need to run it multiple times to support MSAA blending.
* vc4: Clarify size calculation for Z/S writes.Eric Anholt2015-06-231-1/+1
| | | | | It's the same value for loads and stores, because they're basically the same packet.
* vc4: Add an "args" temporary for RCL setup.Eric Anholt2015-06-231-24/+24
|
* vc4: Reuse (and extend) the packet.h sizes for dumping.Eric Anholt2015-06-232-51/+58
|
* vc4: Fix printfs for blit fallbacks.Eric Anholt2015-06-231-3/+3
|
* vc4: Use a defined t value for 1D textures.Eric Anholt2015-06-201-1/+3
| | | | | This doesn't fix the broken 1D cases of texsubimage, but it does prevent segfaulting when dumping the QIR code generated in fbo-1d.
* vc4: Fix write-only texsubimage when we had to align.Eric Anholt2015-06-201-1/+5
| | | | | | | | We need to make sure that when we store the aligned box, we've got initialized contents in the border. We could potentially just load the border area, but for now let's get text rendering working in X (and fix the GL_TEXTURE_2D errors in piglit's texsubimage test and gl-2.1-pbo/test_tex_image)
* vc4: Move tile state/alloc allocation into the kernel.Eric Anholt2015-06-179-101/+72
| | | | | | | This avoids a security issue where userspace could have written the tile state/tile alloc behind the GPU's back, and will apparently be necessary for fixing stability bugs (tile state buffers are missing some top bits for the tile alloc's address).
* vc4: Move RCL generation into the kernel.Eric Anholt2015-06-1711-676/+725
| | | | | There weren't that many variations of RCL generation, and this lets us skip all the in-kernel validation for what we generated.
* vc4: Add dumping of VC4_PACKET_TILE_BINNING_MODE_CONFIG.Eric Anholt2015-06-171-1/+32
|
* vc4: Fix memory leak from simple_list conversion.Eric Anholt2015-06-171-3/+2
| | | | | I accidentally shadowed the outside declaration, so we always returned NULL even when we'd found something in the cache.
* vc4: Track the number of BOs allocated and their size.Eric Anholt2015-06-172-7/+100
| | | | This is useful for BO leak debugging.
* vc4: Make sure that direct texture clamps have a minimum value of 0.Eric Anholt2015-06-162-25/+66
| | | | | | | I was thinking of the MIN opcode in terms of unsigned math, but it's signed, so if you used a negative array index, you could read before the UBO. Fixes segfaults under simulation in piglit array indexing tests with mprotect-based guard pages.
* vc4: Swap around which src we spill to ra31/rb31.Eric Anholt2015-06-161-4/+4
| | | | | | | | I wanted to assert that src1 came from a non-unspilled register in shader validation, and this easily gets us that. And, as a bonus: total instructions in shared programs: 93347 -> 92723 (-0.67%) instructions in affected programs: 60524 -> 59900 (-1.03%)
* vc4: R4 is not a valid register for clamped direct texturing.Eric Anholt2015-06-161-1/+1
| | | | | Our array only goes to R3, and R4 is a special case that shouldn't be used.
* vc4: Factor out the live clamp register getter.Eric Anholt2015-06-161-8/+24
|
* vc4: Drop the unused "stride" field of surfaces.Eric Anholt2015-06-161-1/+0
| | | | We're always looking at the slice anyway, when we would have needed it.
* vc4: Handle refcounting the exec BO like we do in the kernel.Eric Anholt2015-06-164-10/+34
| | | | | This reduces the diff to the kernel, and will be useful when I make the kernel allocate more BOs as part of validation.
* vc4: Use VC4_SET/GET_FIELD for some RCL packets.Eric Anholt2015-06-164-77/+89
|
* vc4: Make symbolic values for packet sizes.Eric Anholt2015-06-164-57/+86
|
* vc4: Use symbolic values in texture ptype validation.Eric Anholt2015-06-161-10/+13
|
* vc4: Move vc4_packet.h to the kernel/ directory, since it's also shared.Eric Anholt2015-06-164-3/+3
| | | | I want to notice discrepancies when I diff -u between Mesa and the kernel.
* vc4: Add support for building on Android.Eric Anholt2015-06-151-0/+37
| | | | | | | | v2: Add a comment explaining why we link libmesa_glsl. Drop warning option from freedreno. Add vc4 to the documentation for BOARD_GPU_DRIVERS. Reviewed-by: Emil Velikov <[email protected]>