| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This adds a freedreno backend for the a6xx generation GPUs, which at
the time of this commit is about 98% GLES2 conformant. Much remains to
be done - both performance work and feature work towards more recent
GLES versions, but this is a good start.
Signed-off-by: Kristian H. Kristensen <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
pull in a6xx registers
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Kristian H. Kristensen <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
Suggested-by: Emil Velikov <[email protected]>
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
| |
Instead of doing conservative guesses, we should report the max levels
based on the max sizes we get from GL on the host.
Signed-off-by: Erik Faye-Lund <[email protected]>
Reviewed-by: Jakob Bornecrantz <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These macro-names are also used for softpipe, so let's avoid confusion
by avoiding them. Besides, they are just used in one place in virgl, so
let's just inline them into the place they are used instead.
While we're at it, fixup an error in the comment for the 3D version.
Mesa subtracts computes max-size by doing by 2^(n-1), which means this
should be 256 cubed, not 512 cubed. The other comments are correct.
Signed-off-by: Erik Faye-Lund <[email protected]>
Reviewed-by: Jakob Bornecrantz <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This option allows us to remove additional s_waitcnt instructions
because s_barrier internally does s_waitcnt 0.
Though, apparently there is a problem with LDS accesses that
causes rendering issues with FFXV and DXVK. Disable this
optimization for now (RadeonSI still uses it).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107460
CC: 18.2 <[email protected]>
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
| |
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
| |
Tested-by: Dieter Nützel <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
| |
Tested-by: Dieter Nützel <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
| |
Tested-by: Dieter Nützel <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
| |
Tested-by: Dieter Nützel <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
| |
Tested-by: Dieter Nützel <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
| |
Tested-by: Dieter Nützel <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
| |
Tested-by: Dieter Nützel <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
| |
Tested-by: Dieter Nützel <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
| |
Tested-by: Dieter Nützel <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
| |
Tested-by: Dieter Nützel <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
| |
Tested-by: Dieter Nützel <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
| |
Tested-by: Dieter Nützel <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
| |
Tested-by: Dieter Nützel <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
| |
Tested-by: Dieter Nützel <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
| |
Tested-by: Dieter Nützel <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
| |
The diff looks like it moves code that I didn't touch.
Tested-by: Dieter Nützel <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
| |
Tested-by: Dieter Nützel <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
| |
Tested-by: Dieter Nützel <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
| |
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Starting with a6xx, half and full precision registers conflict. Which
makes things a bit more efficient, ie. if some parts of the shader are
heavy on half-precision and others on full precision, you don't have to
allocate the worst case for both. But it means we need to setup some
additional conflicts.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
Collapse is_temp() into it's only callsite, and pass compiler object as
struct rather than void. Just cleanups to reduce noise in next patch.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
We originally did this because at the time we didn't know all the
bitfields to configure where various frag shader sysval's went. But
we do.
So switch to using sysvals for all the frag shader inputs.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
This way, unused sysval inputs, like frag_vcoord, get the correct regid
value to disable the input.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
create_input()/create_input_compmask() should take the ctx as arg,
rather than block, to enforce that all inputs are created in the first
block, so that RA sees them as live at the start of the shader.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
Make it more clear that this is varying fetch related. Also fixup some
comments. Just cleanup for next patches.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
Move it from the compile ctx to the compiler object, before adding
new things for a6xx.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
Following patches will be doing further cleanup after calling
fd_context_destroy() so it is easier if we move the free() into
the per-gen backend code.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
this patch brings a number of changes to ir2:
-ir2 now generates CF clauses as necessary during assembly. this simplifies
fd2_program/fd2_compiler and is necessary to implement optimization passes
-ir2 now has separate vector/scalar instructions. this will make it easier
to implementing scheduling of scalar+vector instructions together. dst_reg
is also now seperate from src registers instead of a single list
-ir2 now implements register allocation. this makes it possible to compile
shaders which have more than 64 TGSI registers
-ir2 now implements the following optimizations: removal of IN/OUT MOV
instructions generated by TGSI and removal of unused instructions when
some exports are disabled
-ir2 now allows full 8-bit index for constants
-ir2_alloc no longer allocates 4 times too many bytes
Signed-off-by: Jonathan Marek <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Reviewed-by: Tomeu Vizoso <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Now that all the build scripts are compatible with both Python 2 and 3,
we can flip the switch and tell Meson to use the latter.
Since Meson already depends on Python 3 anyway, this means we don't need
two different Python stacks to build Mesa.
Signed-off-by: Mathieu Bridon <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
Reviewed-by: Dylan Baker <[email protected]>
|
|
|
|
|
|
| |
This avoids a memcpy into a temporary in the upload path.
Improves x11perf -putimage100 performance by 12.1586% +/- 1.38155% (n=145)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, we would load out the tile-aligned area, update the raster
copy, and store it back. This was a huge cost for XPutImage calls to the
screen under glamor.
Instead, implement a general load/store path that walks over the source
x/y writing into the corresponding pixel of the destination (using clever
math from
https://fgiesen.wordpress.com/2011/01/17/texture-tiling-and-swizzling/).
If things are aligned, we go through the previous utile-at-a-time loop.
Improves x11perf -putimage10 performance by 139.777% +/- 2.83464% (n=5)
Improves x11perf -putimage100 performance by 383.908% +/- 22.6297% (n=11)
Improves x11perf -getimage10 performance by 2.75731% +/- 0.585054% (n=145)
|
|
|
|
|
|
|
|
| |
For the partial load/store support I'm about to add, we want the memcpy to
be compiled out to a single load/store. This should also eliminate the
calls to vc4_utile_width/height().
Improves x11perf -putimage100 performance by 3.76344% +/- 1.16978% (n=15)
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
instead of the underlying texture's target. This fixes an issue
where the TGSI sampler type was not agreeing with the sampler view
target/type. In particular, this fixes a Mint 19 XFCE desktop
scaling issue because the TGSI code was using a RECT sampler but
the sampler view's underlying texture was PIPE_TEXTURE_2D.
We want to use the sampler view's type rather than the underlying
resource, as we do for the view's surface format.
No piglit regressions.
VMware issue 2156696.
Reviewed-by: Neha Bhende <[email protected]>
Reviewed-by: Charmaine Lee <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I'm not sure why we didn't support this in the past, but fillmode
is supported by all renderers nowadays.
Also fix the logic in svga_create_rasterizer_state() to avoid a few
swtnl case.
No piglit regressions
Reviewed-by: Neha Bhende <[email protected]>
Reviewed-by: Charmaine Lee <[email protected]>
|
|
|
|
|
|
|
|
| |
Fixes failed assertion running Piglit polygon-mode-face test.
Though, the test still does not pass.
Reviewed-by: Neha Bhende <[email protected]>
Reviewed-by: Charmaine Lee <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We won't have an FD if we're just having the server wait on a fence
created by eglCreateSyncKHR(). Our seqno fences will happen in order, so
server-side waits are no-ops in that case. Fixes
dEQP-EGL.functional.sharing.gles2.multithread.simple_egl_server_sync.buffers.gen_delete
Fixes: b0acc3a5628c ("broadcom/vc4: Native fence fd support")
|