path: root/src/gallium/drivers/vc4/kernel
Commit message | Author | Age | Files | Lines
* vc4: Add simulator kernel validation for multithreaded fragment shaders. | Jonas Pfeil | 2016-11-12 | 3 | -5/+76
  This is Jonas Pfeil's code from the kernel, brought back to Mesa by anholt.
* vc4: Add miptree/texture state support for ETC1 compressed textures. | Eric Anholt | 2016-11-03 | 1 | -0/+7
  The format isn't flagged as enabled at runtime yet, because we need kernel validation support.
* vc4: Fix termination of the initial scan for branch targets. | Eric Anholt | 2016-10-21 | 1 | -11/+8
  The loop is scanning until the original max_ip (size of the BO), but we want to not examine any code after the PROG_END's delay slots. There was a block trying to do that, except that we had some early continue statements if the signal wasn't a PROG_END or a BRANCH.

  The failure mode would be that a valid shader is rejected because some undefined memory after the PROG_END slots is parsed as a branch and the rest of its setup is illegal. I haven't seen this in the wild, but valgrind was complaining and the new userland simulator code started triggering it.
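The termination rule described in this message can be sketched as follows. This is only an illustration, not the Mesa or kernel code: the `enum sig` values, the function name `count_branches`, and the assumption of two delay slots after PROG_END are all hypothetical stand-ins. The point is that the PROG_END check shrinks the scan limit before any early `continue` can skip it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical decoded signals; the real encoding is not shown in the log. */
enum sig { SIG_NONE, SIG_BRANCH, SIG_PROG_END };

/* Assumption: two delay slots still execute after PROG_END is signaled. */
#define PROG_END_DELAY_SLOTS 2

/* Count branch instructions, ignoring anything past PROG_END's delay slots. */
static size_t count_branches(const enum sig *sigs, size_t max_ip)
{
    size_t limit = max_ip;   /* initially the whole buffer */
    size_t branches = 0;

    assert(sigs != NULL || max_ip == 0);

    for (size_t ip = 0; ip < limit; ip++) {
        if (sigs[ip] == SIG_PROG_END) {
            /* Shrink the scan window first, then move on. */
            size_t end = ip + 1 + PROG_END_DELAY_SLOTS;
            if (end < limit)
                limit = end;
            continue;
        }
        if (sigs[ip] != SIG_BRANCH)
            continue;
        branches++;
    }
    return branches;
}
```

With this shape, whatever bytes follow the delay slots are never decoded, so stale memory that happens to look like a branch cannot cause a valid shader to be rejected.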
* Introduce .editorconfig | Eric Engestrom | 2016-08-31 | 1 | -0/+2
  A few weeks ago, Jose Fonseca suggested [0] we use .editorconfig files to try and enforce the formatting of the code, to which Michel Dänzer suggested [1] we start by importing the existing .dir-locals.el settings. The first draft was discussed in the RFC [2].

  These .editorconfig are a first step, one that has the advantage of requiring little to no intervention from the devs once the settings files are in place, but the settings are very limited. This does have the advantage of applying while the code is being written. This doesn't replace the need for more comprehensive formatting tools such as clang-format & clang-tidy, but those reformat the code after the fact.

  [0] https://lists.freedesktop.org/archives/mesa-dev/2016-June/121545.html
  [1] https://lists.freedesktop.org/archives/mesa-dev/2016-June/121639.html
  [2] https://lists.freedesktop.org/archives/mesa-dev/2016-July/123431.html

  Acked-by: Nicolai Hähnle
  Acked-by: Eric Anholt
  Signed-off-by: Eric Engestrom
  Reviewed-by: Jose Fonseca
* vc4: Add kernel support for branching in shader validation. | Eric Anholt | 2016-07-12 | 3 | -17/+280
  We're already checking that branch instructions are within the contents of the shader and the proper PROG_END sequence is present. The other thing we need in the presence of branching is to verify that the shader doesn't overflow past the end of the uniforms stream.

  To do that, we require that any basic block reading uniforms start with the following instructions:

      load_imm temp, <offset within uniform stream>
      add unif_addr, temp, unif

  The instructions are generated by userspace, and the kernel verifies that the load_imm is of the expected offset, and that the add adds it to a uniform. We track which uniform in the stream that is, and at draw call time fix up the uniform stream to have the address of the start of the shader's uniforms for that draw call.

  Signed-off-by: Eric Anholt
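A rough sketch of the two halves described above: checking the instruction pair at the start of a block, and patching the uniform stream at draw time. The decoded `struct inst`, the `enum op` values, and both function names are hypothetical; real QPU instructions are packed 64-bit words, and how the expected offset is derived is not shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, already-decoded instruction form (illustration only). */
enum op { OP_OTHER, OP_LOAD_IMM, OP_ADD_UNIF_ADDR };

struct inst {
    enum op op;
    uint32_t imm;        /* immediate, meaningful for OP_LOAD_IMM */
    bool reads_uniform;  /* for OP_ADD_UNIF_ADDR: other operand is a uniform */
};

/* True if the block at `ip` starts with load_imm <expected>; add unif_addr. */
static bool block_resets_uniform_addr(const struct inst *insts, size_t count,
                                      size_t ip, uint32_t expected_offset)
{
    if (count < 2 || ip > count - 2)
        return false;   /* the pair would run past the shader */
    if (insts[ip].op != OP_LOAD_IMM || insts[ip].imm != expected_offset)
        return false;
    return insts[ip + 1].op == OP_ADD_UNIF_ADDR &&
           insts[ip + 1].reads_uniform;
}

/* At draw time, write the stream's base address into each tracked slot. */
static void patch_uniform_addrs(uint32_t *stream, size_t stream_len,
                                const size_t *slots, size_t n_slots,
                                uint32_t base_addr)
{
    for (size_t i = 0; i < n_slots; i++) {
        assert(slots[i] < stream_len);
        stream[slots[i]] = base_addr;
    }
}
```

Because the shader adds a validated constant offset to a base address that is only filled in per draw call, userspace never gets to choose the final uniform address itself.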
* vc4: Add a bitmap of branch targets in kernel validation. | Eric Anholt | 2016-07-12 | 1 | -2/+112
  This isn't used yet, it's just a first step toward loop validation. During the main parsing of instructions, we need to know when we hit a new basic block so that we can reset validated state.
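As an illustration of the idea, a branch-target bitmap might look like the sketch below. The fixed `MAX_IP` bound, the struct, and the function names are assumptions made for this example; the actual code sizes its bitmap from the shader. Only branch targets are treated as block starts here, which is a simplification.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_IP 256  /* hypothetical upper bound on instruction count */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define BITMAP_WORDS ((MAX_IP + BITS_PER_WORD - 1) / BITS_PER_WORD)

struct branch_targets {
    unsigned long bits[BITMAP_WORDS];
};

/* Record that some branch jumps to instruction `ip`. */
static void mark_target(struct branch_targets *bt, size_t ip)
{
    assert(ip < MAX_IP);
    bt->bits[ip / BITS_PER_WORD] |= 1UL << (ip % BITS_PER_WORD);
}

static bool is_target(const struct branch_targets *bt, size_t ip)
{
    assert(ip < MAX_IP);
    return (bt->bits[ip / BITS_PER_WORD] >> (ip % BITS_PER_WORD)) & 1UL;
}

/* Walk the shader; each branch target opens a new basic block. */
static size_t count_basic_blocks(const struct branch_targets *bt,
                                 size_t n_insts)
{
    size_t blocks = n_insts ? 1 : 0;

    for (size_t ip = 1; ip < n_insts; ip++) {
        if (is_target(bt, ip))
            blocks++;   /* a validator would reset per-block state here */
    }
    return blocks;
}
```

A first pass would call `mark_target` for every branch destination, and the main parsing pass would then test `is_target` at each instruction to decide when to discard state carried over from the previous block.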
* vc4: Track the current instruction into the validation_state. | Eric Anholt | 2016-07-12 | 1 | -24/+30
  This reduces how much we need to pass around as arguments, which was becoming more of a problem with looping validation.
* Remove wrongly repeated words in comments | Giuseppe Bilotta | 2016-06-23 | 1 | -1/+1
  Clean up misrepetitions ('if if', 'the the' etc) found throughout the comments.

  This has been done manually, after grepping case-insensitively for duplicate if, is, the, then, do, for, an, plus a few other typos corrected in fly-by

  v2:
      * proper commit message and non-joke title;
      * replace two 'as is' followed by 'is' to 'as-is'.
  v3:
      * 'a integer' => 'an integer' and similar (originally spotted by Jason Ekstrand, I fixed a few other similar ones while at it)

  Signed-off-by: Giuseppe Bilotta
  Reviewed-by: Chad Versace
* vc4: Fix validation of full res tile offset if used for non-MSAA. | Eric Anholt | 2016-04-22 | 3 | -2/+14
  There's no reason we couldn't do non-MSAA full resolution tile buffer load/stores, but we would have claimed buffer overflow was being attempted. Nothing does this currently.
* vc4: Add kernel RCL support for MSAA rendering. | Eric Anholt | 2015-12-08 | 3 | -38/+231
* vc4: Rename color_ms_write to color_write. | Eric Anholt | 2015-12-08 | 1 | -15/+14
  I was thinking this was the only MSAA resolve thing, so it should be noted separately, but actually load/store general also do MSAA resolve.
* vc4: Fix compiler warning from size_t change. | Eric Anholt | 2015-12-08 | 1 | -1/+1
  I missed this when bringing over the kernel changes.
* vc4: Bring over cleanups from submitting to the kernel. | Eric Anholt | 2015-12-05 | 3 | -86/+77
* vc4: Add debug dumping of MSAA surfaces. | Eric Anholt | 2015-12-04 | 1 | -0/+2
* vc4: Add support for loading sample mask. | Eric Anholt | 2015-12-04 | 1 | -0/+3
* vc4: Mark our shaders as single-threaded. | Eric Anholt | 2015-07-30 | 1 | -0/+5
  I had my understanding of this bit flipped. We're using the full register space, so we need to say so.
* vc4: Avoid overflowing various static tables. | Eric Anholt | 2015-07-30 | 1 | -1/+1
* vc4: Fix return values from recent validation changes. | Eric Anholt | 2015-07-30 | 1 | -4/+4
* vc4: Simplify vc4_use_bo and make sure it's not a shader. | Eric Anholt | 2015-07-28 | 3 | -36/+23
  Since the conversion to keeping validated shaders around for the BO's lifetime, we haven't been checking that rendering doesn't happen to shaders. Make vc4_use_bo check that always, and just don't use it for the VC4_MODE_SHADER case (so now modes are unused).
* vc4: Keep the validated shader around for the simulator execution. | Eric Anholt | 2015-07-28 | 1 | -13/+6
  This more closely matches the kernel behavior on shader validation now.
* vc4: Make the object be the return value from vc4_use_bo(). | Eric Anholt | 2015-07-28 | 3 | -23/+25
  Drops 40 bytes of code from validation.
* vc4: Ensure that the bin CL is properly capped by increment/flush. | Eric Anholt | 2015-07-28 | 3 | -26/+36
  We don't want anything to appear after we've kicked off the render (and thus job flush), since that might then get written out to the tile allocation state.

  Signed-off-by: Eric Anholt
* vc4: Drop NV shader reloc validation. | Eric Anholt | 2015-07-28 | 2 | -120/+60
  It wasn't validating enough, and we don't need the packet.
* vc4: Add debugging on texture relocation validation failures. | Eric Anholt | 2015-07-17 | 1 | -7/+13
* vc4: Add dumping for VC4_PACKET_LOAD/STORE_FULL_RES_TILE_BUFFER. | Eric Anholt | 2015-06-23 | 1 | -0/+10
* vc4: Clarify size calculation for Z/S writes. | Eric Anholt | 2015-06-23 | 1 | -1/+1
  It's the same value for loads and stores, because they're basically the same packet.
* vc4: Add an "args" temporary for RCL setup. | Eric Anholt | 2015-06-23 | 1 | -24/+24
* vc4: Reuse (and extend) the packet.h sizes for dumping. | Eric Anholt | 2015-06-23 | 1 | -0/+7
* vc4: Move tile state/alloc allocation into the kernel. | Eric Anholt | 2015-06-17 | 4 | -63/+66
  This avoids a security issue where userspace could have written the tile state/tile alloc behind the GPU's back, and will apparently be necessary for fixing stability bugs (tile state buffers are missing some top bits for the tile alloc's address).
* vc4: Move RCL generation into the kernel. | Eric Anholt | 2015-06-17 | 4 | -306/+544
  There weren't that many variations of RCL generation, and this lets us skip all the in-kernel validation for what we generated.
* vc4: Make sure that direct texture clamps have a minimum value of 0. | Eric Anholt | 2015-06-16 | 1 | -25/+63
  I was thinking of the MIN opcode in terms of unsigned math, but it's signed, so if you used a negative array index, you could read before the UBO.

  Fixes segfaults under simulation in piglit array indexing tests with mprotect-based guard pages.
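The arithmetic behind this fix can be shown with a small sketch. The function names are made up for illustration and this is plain C rather than QPU code: a signed minimum bounds an offset only from above, so a lower bound of 0 is needed as well.

```c
#include <assert.h>
#include <stdint.h>

/* Upper clamp only: a signed MIN leaves negative offsets untouched. */
static int32_t clamp_upper_only(int32_t offset, int32_t limit)
{
    return offset < limit ? offset : limit;
}

/* Full clamp: first raise to 0, then cap at `limit`. */
static int32_t clamp_offset(int32_t offset, int32_t limit)
{
    assert(limit >= 0);
    int32_t lo = offset > 0 ? offset : 0;
    return lo < limit ? lo : limit;
}
```

For example, `clamp_upper_only(-4, 60)` yields -4, which would address memory before the buffer, while `clamp_offset(-4, 60)` yields 0. Under unsigned comparison the same bit pattern would have been treated as a very large value and capped at 60, which is the mental model the message says was mistaken.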
* vc4: R4 is not a valid register for clamped direct texturing. | Eric Anholt | 2015-06-16 | 1 | -1/+1
  Our array only goes to R3, and R4 is a special case that shouldn't be used.
* vc4: Factor out the live clamp register getter. | Eric Anholt | 2015-06-16 | 1 | -8/+24
* vc4: Handle refcounting the exec BO like we do in the kernel. | Eric Anholt | 2015-06-16 | 2 | -0/+8
  This reduces the diff to the kernel, and will be useful when I make the kernel allocate more BOs as part of validation.
* vc4: Use VC4_SET/GET_FIELD for some RCL packets. | Eric Anholt | 2015-06-16 | 2 | -36/+40
* vc4: Make symbolic values for packet sizes. | Eric Anholt | 2015-06-16 | 2 | -34/+69
* vc4: Use symbolic values in texture ptype validation. | Eric Anholt | 2015-06-16 | 1 | -10/+13
* vc4: Move vc4_packet.h to the kernel/ directory, since it's also shared. | Eric Anholt | 2015-06-16 | 1 | -0/+335
  I want to notice discrepancies when I diff -u between Mesa and the kernel.
* vc4: Drop subdirectory in vc4 build. | Eric Anholt | 2015-06-09 | 2 | -46/+0
  Just because we put the source in a subdir, doesn't mean we need helper libraries in the build. This will also simplify the Android build setup.
* vc4: Update to current kernel validation code. | Eric Anholt | 2015-06-09 | 2 | -35/+36
  After profiling on real hardware, I found a few ways to cut down the kernel overhead.
* vc4: Allow submitting jobs with no bin CL in validation. | Eric Anholt | 2015-04-13 | 3 | -3/+9
  For blitting, we want to fire off an RCL-only job. This takes a bit of tweaking in our validation and the simulator support (and corresponding new code in the kernel).
* vc4: Fix off-by-one in branch target validation. | Eric Anholt | 2015-04-13 | 1 | -1/+1
* vc4: Sync with kernel changes to relax BCL versus RCL validation. | Eric Anholt | 2015-04-13 | 1 | -22/+3
  There was no reason to tie the two packets' values together.
* vc4: Write the alignment of level width consistently in validation. | Eric Anholt | 2015-03-24 | 1 | -2/+2
  16 / cpp happens to be the same as utile_w on the only raster format supported (4 bytes per pixel), but simulator/hw source code generally talks in terms of utiles.
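To make the coincidence concrete, here is a sketch of the two ways of writing the alignment. The `utile_width` table reflects my understanding of the vc4 microtile layout (64-byte utiles) and should be read as an assumption, as should the helper name `aligned_level_width`.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed utile widths in pixels for a given bytes-per-pixel (cpp). */
static uint32_t utile_width(int cpp)
{
    switch (cpp) {
    case 1:
    case 2:
        return 8;
    case 4:
        return 4;
    case 8:
        return 2;
    default:
        return 0;   /* unknown format */
    }
}

/* Round a miplevel width up to a whole number of utiles. */
static uint32_t aligned_level_width(uint32_t width, int cpp)
{
    uint32_t utile_w = utile_width(cpp);

    assert(utile_w != 0);
    return (width + utile_w - 1) / utile_w * utile_w;
}
```

Under this table, `16 / cpp` and `utile_width(cpp)` agree at 4 bytes per pixel (both give 4) but differ at 1 byte per pixel (16 versus 8), which is why expressing the alignment in utiles is the less fragile spelling.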
* vc4: Update to current kernel sources. | Eric Anholt | 2015-02-24 | 4 | -33/+39
  New BO create and mmap ioctls are added. The submit ABI gains a flags argument, and the pointers are fixed at 64-bit. Shaders are now fixed at the start of their BOs.
* dir-locals.el: Don't set variables for non-programming modes | Neil Roberts | 2015-02-02 | 1 | -1/+1
  This limits the style changes to modes inherited from prog-mode. The main reason to do this is to avoid setting fill-column for people using Emacs to edit commit messages because 78 characters is too many to make it wrap properly in git log.

  Note that makefile-mode also inherits from prog-mode so the fill column should continue to apply there.

  v2: Apply to all the .dir-locals.el files, not just the one in the root directory.

  Acked-by: Michel Dänzer
* vc4: Add support for turning constant uniforms into small immediates. | Eric Anholt | 2014-12-17 | 1 | -3/+14
  Small immediates have the downside of taking over the raddr B field, so you might have less chance to pack instructions together thanks to raddr B conflicts. However, it also reduces some register pressure since it lets you load 2 "uniform" values in one instruction (avoiding a previous load of the constant value to a register), and increases some pairing for the same reason.

  total uniforms in shared programs: 16231 -> 13374 (-17.60%)
  uniforms in affected programs: 10280 -> 7423 (-27.79%)
  total instructions in shared programs: 40795 -> 41168 (0.91%)
  instructions in affected programs: 25551 -> 25924 (1.46%)

  In a previous version of this patch I had a reduction in instruction count by forcing the other args alongside a SMALL_IMM to be in the A file or accumulators, but that increases register pressure and had a bug in handling FRAG_Z. In this patch I just use raddr conflict resolution, which is more expensive. I think I'd rather tweak allocation to have some way to slightly prefer good choices for files in general, rather than risk failing to register allocate by forcing things into register classes.
* vc4: Fix decision for whether the MIN operation writes to the B regfile. | Eric Anholt | 2014-12-08 | 1 | -3/+3
* vc4: Emit semaphore instructions for new kernel ABI. | Eric Anholt | 2014-11-18 | 2 | -3/+76
  Previously, the kernel would dispatch thread 0, wait, then dispatch thread 1. By insisting that the thread contents use semaphores in the right place, the kernel can sleep for longer by dispatching both threads at once.
* vc4: Add support for ARL and indirect register access on TGSI_FILE_CONSTANT. | Eric Anholt | 2014-10-28 | 3 | -26/+183
  Fixes 14 ARB_vp tests (which had no lowering done), and should improve performance of indirect uniform array access in GLSL.