| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
V3D 3.3 is a continuation of the 3D implementation in VC4 (v2.1 and v2.6).
V3D 3.3 introduces an MMU (no more CMA allocations) and support for
GLES3.1. This driver is not currently conformant, though that will be a
target as soon as possible.
V3D 3.x parts use a new texture tiling layout common across many Broadcom
graphics parts including and the HVS scanout engine. It also massively
changes the QPU instructions, introducing a common physical register file
(no more A/B split) and half-float instructions, while removing the 4x8
unorm instructions in favor of half-float for talking to fixed function
interfaces. Because so much has changed, vc5 is implemented in a separate
gallium driver, using only the XML code-generation support from vc4.
v2: Fix tile layout for 64bpp textures. Fix texture swizzling for 32-bit
returns. Fix up a bit of MRT setup. Sync the simulator to kernel
behavior a bit more. Improve uniform debugging code. Rebase on
QIR->VIR rename. Move texture state mostly to the CSOs. Improve
cache flushing on the simulator. Fix program deletion
use-after-frees.
Acked-by: Dave Airlie <[email protected]> (uabi plan)
Acked-by: Daniel Vetter <[email protected]> (uabi plan)
|
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
Fixes: 5b102160ae ("broadcom/genxml: Introduce a V3D packet/struct decoder.")
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If you don't pass this, the compiler refuses to compile the assembly for
pre-v7 CPUs. This also keeps us from building identical, non-NEON code on
aarch64 and x86.
Fixes: a373f77662c5 ("vc4: Use a wrapper file to set VC4_BUILD_NEON instead of CFLAGS.")
v2: Fix Android build by just appending NEON_C_SOURCES when
ARCH_ARM_HAVE_NEON.
Tested-by: Rob Herring <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The VC4_DEBUG_CL output goes from:
0x00000010 0x00000010: 0x06 VC4_PACKET_START_TILE_BINNING
0x00000011 0x00000011: 0x38 VC4_PACKET_PRIMITIVE_LIST_FORMAT
0x00000012 0x00000012: 0x12
0x00000013 0x00000013: 0x66 VC4_PACKET_CLIP_WINDOW
0x00000014 0x00000014: 0x00
0x00000015 0x00000015: 0x00
0x00000016 0x00000016: 0x00
0x00000017 0x00000017: 0x00
0x00000018 0x00000018: 0xfa
0x00000019 0x00000019: 0x00
0x0000001a 0x0000001a: 0xfa
0x0000001b 0x0000001b: 0x00
to:
0x00000010 0x00000010: 0x06 Start Tile Binning
0x00000011 0x00000011: 0x38 Primitive List Format
Data Type: 1 (16-bit index)
Primitive Type: 2 (Triangles List)
0x00000013 0x00000013: 0x66 Clip Window
Clip Window Height in pixels: 250
Clip Window Width in pixels: 250
Clip Window Bottom Pixel Coordinate: 0
Clip Window Left Pixel Coordinate: 0
v2: Squash in robher's fixes for Android
|
|
|
|
|
|
|
|
|
|
|
| |
Needing to get our uapi header from libdrm has only complicated things.
Follow intel's lead and drop our requirement for it.
Generated from the same commit mentioned in the README.
v2: Update Android.mk as well, move vc4_drm.h reference for distcheck.
Reviewed-by: Daniel Stone <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Android.mk was setting the flag across the entire driver, so we didn't
have non-NEON versions getting built. This was going to be a problem with
the next commit, when I start auto-detecting NEON support and use the
non-NEON version when appropriate.
Reviewed-by: Rob Herring <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Andreas Boll <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We had a lot of memcpy call overhead because gpu_stride wasn't being
inlined. But if you split out the stride==8 and stride==16 cases like
this code does while still using memcpy, you'd no longer have glibc's
NEON memcpy applied at which point we'd be doing 16 uncached reads
instead of 64/(NEON memcpy granularity), for about a 30% performance
hit. By hand writing the assembly, we can get a whole cacheline
loaded at a time.
Unfortunately, NEON intrinsics turned out to be unusable -- they
didn't have the vldm instruction available.
Note that, for now, the NEON code is only enabled when building for ARMv7
(Pi 2+). We may want to do runtime detection for the Raspbian case, in
the future.
Improves 1024x1024 GetTexImage by 208.256% +/- 7.07029% (n=10).
|
|
|
|
|
|
| |
This helps in debugging memory pressure. It would be nice if we could
tell valgrind about it all the way from allocation time to destroy, but we
need a pointer to hand to VALGRIND_MALLOCLIKE_BLOCK.
|
|
|
|
|
| |
The required version is set to .69 for the getparam ioctl that will be
used in the next commit.
|
|
|
|
|
|
| |
Signed-off-by: Emil Velikov <[email protected]>
Acked-by: Matt Turner <[email protected]>
Acked-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
| |
In the pipe-loader reworks, it was missed in one of the new directories it
was used.
Cc: [email protected]
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
| |
Cc: 10.6 <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
| |
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
| |
Silence the warnings about the future incompatibility with automake 2.0
Cc: Eric Anholt <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
|
| |
Just because we put the source in a subdir, doesn't mean we need helper
libraries in the build. This will also simplify the Android build setup.
|
|
|
|
| |
Now this whole setup matches the kernel's file layout much more closely.
|
|
|
|
|
|
|
|
|
| |
This will let me more reliably allocate a-file registers, which are going
to be even more in demand when I start using a-file unpacks.
Also fixes a bug where the reservation of payload registers (FRAG_Z/W) was
off by one but just caused failure to register allocate at all if the
off-by-one was fixed.
|
|
This mostly just takes every draw call and turns it into a sequence of
commands that clear the FBO and draw a single shaded triangle to it,
regardless of the actual input vertices or shaders. I copied the initial
driver skeleton mostly from freedreno, and I've preserved Rob Clark's
copyright for those. I also based my initial hardcoded shaders and
command lists on Scott Mansell (phire)'s "hackdriver" project, though the
bit patterns of the shaders emitted end up being different.
v2: Rebase on gallium megadrivers changes.
v3: Rebase on PIPE_SHADER_CAP_MAX_CONSTS change.
v4: Rely on simpenrose actually being installed when building for
simulation.
v5: Add more header duplicate-include guards.
v6: Apply Emil's review (protection against vc4 sim and ilo at the same
time, and dropping the dricommon drm bits) and fix a copyright header
(thanks, Roland)
|