| Commit message | Author | Age | Files | Lines |
|
|
|
|
|
| |
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Jakob Bornecrantz <[email protected]>
Reviewed-by: Christian König <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Jakob Bornecrantz <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
| |
Tested-by: Tom Stellard <[email protected]>
|
|
|
|
|
|
| |
function.
Tested-by: Tom Stellard <[email protected]>
|
|
|
|
|
|
| |
smart references.
Tested-by: Tom Stellard <[email protected]>
|
|
|
|
|
|
| |
element type.
Tested-by: Tom Stellard <[email protected]>
|
|
|
|
| |
Tested-by: Tom Stellard <[email protected]>
|
|
|
|
|
|
|
|
| |
Define some additional convenience operators, clean up the
implementation slightly, and rename it to 'intrusive_ptr' for reasons
that will be obvious in the next commit.
Tested-by: Tom Stellard <[email protected]>
|
|
|
|
| |
Tested-by: Tom Stellard <[email protected]>
|
|
|
|
|
|
|
|
| |
Fixes a build break in state_tracker/st_program.c
Signed-off-by: Jordan Justen <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75278
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The previous code relied on cpu denorm support for converting small float
formats (such as r11g11b10_float and r16_float) to floats, otherwise denorms
are flushed to zero. We worked around that in llvmpipe blend code by
reenabling denorms, but this did nothing for texture sampling. Now it would
be possible to reenable it there too but I'm not really a fan of messing
with fpu flags (and it seems we can't actually do it reliably with llvm in
any case looking at some bug reports). (Not to mention if you actually have
a lot of denorms in there, you can expect some order-of-magnitude slowdown
with x86 cpus.)
So instead use code which adjusts exponents etc. directly hence not relying
on cpu denorm support for the rescaling mul.
(We still need the fpu flag handling, as we can't do float-to-smallfloat
without using cpu denorms, at least for now. I actually wanted to keep
both the old and new code and use one or the other depending on where
it's called from, but that didn't work out, as the parameter would have to be
passed through more layers than I'd like.)
Reviewed-by: Zack Rusin <[email protected]>
Reviewed-by: Si Chen <[email protected]>
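The idea of adjusting exponents directly can be illustrated with a scalar sketch for r16_float. This is not the llvmpipe code (which builds vectorized LLVM IR); the function name half_to_float is a hypothetical helper, and the sketch only assumes the standard IEEE 754 half and single layouts. It uses integer operations only, so the result does not depend on whether the cpu flushes denorms to zero.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: convert a 16-bit float to a 32-bit float using
 * integer arithmetic only, so no cpu denorm support is needed. */
static float half_to_float(uint16_t h)
{
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exp  = (h >> 10) & 0x1fu;   /* 5-bit exponent, bias 15 */
    uint32_t man  = h & 0x3ffu;          /* 10-bit mantissa */
    uint32_t bits;
    float f;

    if (exp == 0x1fu) {
        /* Inf or NaN: all-ones exponent, keep the mantissa payload. */
        bits = sign | 0x7f800000u | (man << 13);
    } else if (exp != 0) {
        /* Normal number: rebias the exponent from 15 to 127. */
        bits = sign | ((exp + 112u) << 23) | (man << 13);
    } else if (man != 0) {
        /* Half denorm: value is man * 2^-24. Shift the mantissa up
         * until the implicit leading one appears, lowering the
         * exponent once per shift. 113 is 2^-14 in float32 bias. */
        uint32_t e = 113u;
        while (!(man & 0x400u)) {
            man <<= 1;
            e--;
        }
        man &= 0x3ffu;                   /* drop the implicit one */
        bits = sign | (e << 23) | (man << 13);
    } else {
        /* Signed zero. */
        bits = sign;
    }

    memcpy(&f, &bits, sizeof(f));        /* bit-cast without aliasing issues */
    return f;
}
```

Every half denorm is a normal number in float32, which is why the result can always be built without touching the fpu flags. The reverse direction (float-to-smallfloat) is not covered here.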
|
|
|
|
|
| |
Signed-off-by: Leo Liu <[email protected]>
Reviewed-by: Christian König <[email protected]>
|
|
|
|
| |
Signed-off-by: Christian König <[email protected]>
|
|
|
|
|
|
|
|
| |
Since we are now consuming two ringbuffers at a time, we probably want a
pool larger than 4, but we don't need each individual ringbuffer to be
so large, so offset the pool size increase by reducing rb size.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
It seems the write-after-read hazard that applies to texture fetch
instructions also applies to sfu instructions.
Also, cat5/cat6 instructions do not have a (ss) bit, so in these
cases we need to insert a dummy nop instruction with (ss) bit set.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes radeonsi emitting command streams to the kernel even when there
have been no draw calls before a flush, potentially powering up the GPU
needlessly.
Incidentally, this also cuts the runtime of piglit gpu.py roughly in half
on my Kaveri system, probably because an X11 client going away no longer
always results in a command stream being submitted to the kernel via
glamor.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=65761
Cc: "10.1" <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
Required for libdrm 2.4.37 and earlier. Both scons and automake
require version 2.4.38 now, so that guard is no longer needed.
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
| |
libvdpau, libselinux and libexpat are not used.
Signed-off-by: Kusanagi Kouichi <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
If built without llvm, the following error occurs with mplayer:
Failed to open VDPAU backend .../libvdpau_r600.so: undefined symbol: _ZTVN10__cxxabiv117__class_type_infoE
[vo/vdpau] Error when calling vdp_device_create_x11: 1
Cc: <[email protected]>
Signed-off-by: Kusanagi Kouichi <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Christian König <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
DRM_API_HANDLE_TYPE_SHARED is zero, so doesn't actually fix anything.
But we shouldn't rely on SHARED handle type being zero.
Signed-off-by: Rob Clark <[email protected]>
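A minimal sketch of the point above, assuming a handle struct that is zero-filled before use. The struct example_handle, its fields, and init_shared_handle() are hypothetical stand-ins for illustration; only the name DRM_API_HANDLE_TYPE_SHARED and its value of zero come from the commit message.

```c
#include <string.h>

/* Value taken from the commit message; defined locally so the
 * sketch is self-contained. */
#define DRM_API_HANDLE_TYPE_SHARED 0

/* Hypothetical stand-in for a winsys handle. */
struct example_handle {
    unsigned type;
    unsigned handle;
    unsigned stride;
};

/* Hypothetical helper: prepare a handle for a shared-name import. */
static void init_shared_handle(struct example_handle *h)
{
    memset(h, 0, sizeof(*h));
    /* Redundant today because SHARED is 0, but stays correct if the
     * enum value ever changes. */
    h->type = DRM_API_HANDLE_TYPE_SHARED;
}
```

The explicit assignment costs nothing and documents the intent, which the memset alone does not.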
|
|
|
|
|
|
| |
This lets multiple gallium drivers use XA.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
Build two versions of pipe-loader, with only the client version linking
in x11 client side dependencies. This will allow the XA state tracker
to use pipe-loader.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It seems texture sample instructions don't immediately consume their
src(s). In fact, some shaders from the blob compiler seem to indicate that
it does not even count the texture sample instructions when calculating
the number of delay slots to fill for non-sample instructions. (Although so
far it seems inconclusive as to whether this is required.)
In particular, when a src register of a previous texture sample
instruction is clobbered, the (ss) bit is needed to synchronize with the
tex pipeline to ensure it has picked up the previous values before they
are overwritten.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Was supposed to be a '+', otherwise we end up with a negative offset and
choosing registers below the assigned range.
This seems to fix the scheduling mystery "solved" by adding in extra
delay slots.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since 'kill' does not produce a result, the new compiler was happily
optimizing them out. We need to instead track 'kill's similarly to
outputs. But since there is no non-predicated kill instruction
(and for flattened if/else we do want them to be predicated), we need
to track the topmost branch condition on the stack and use that as src
arg to the kill. For a kill at the topmost level, we have to generate
an immediate 1.0 to feed into the cmps.f for setting the predicate
register.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Thanks to figuring out 32bit float render target, and adding regdump
test in fdre-a3xx, I can more easily play around with instructions to
figure out range of inputs/outputs/etc. And from this I can conclude
that cmps.f works more like expected and I can do something much more
simple in trans_cmp() (compared to before which was more closely
emulating the instruction sequence of the blob compiler).
And using sel.b32 (binary 0/1) often makes more sense than sel.f32
(+/- float) or sel.u32 (+/- uint) as it can use the output directly
from cmps.f without needing the 'add.s tmp0, tmp0, -1'.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
Requested by Marek.
Reviewed-by: Marek Olšák <[email protected]>
Cc: "10.1" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The linux winsys needs to know whether a surface is shared.
For guest-backed surfaces we need this information to avoid allocating a
mob out of the mob cache for shared surfaces, but instead allocate a shared
mob, that is never put in the mob cache, from the kernel.
Also previously, all surfaces were given the "shareable" attribute when
allocated from the kernel. This is too permissive for client-local surfaces.
Now that we have the needed info, only set the "shareable" attribute if the
client indicates that it needs to share the surface.
Signed-off-by: Thomas Hellstrom <[email protected]>
Reviewed-by: Jakob Bornecrantz <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Cc: "10.1" <[email protected]>
|
|
|
|
|
|
|
| |
This is a squash commit of many commits by Thomas Hellstrom.
Reviewed-by: Thomas Hellstrom <[email protected]>
Cc: "10.1" <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Thomas Hellstrom <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Cc: "10.1" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In some situations, it may be desirable to bypass the cache at buffer
creation but to insert the buffer in the cache at buffer destruction.
One such situation is where we already have a kernel representation of a
buffer that we want to use, but we also want to insert it in the cache when
it's freed up.
Signed-off-by: Thomas Hellstrom <[email protected]>
Reviewed-by: José Fonseca <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Cc: "10.1" <[email protected]>
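The behaviour described above can be sketched with a toy buffer cache. This is not the pipebuffer implementation; every name here (USAGE_BYPASS_CACHE, struct buf, struct buf_cache, buf_create, buf_destroy) is hypothetical, and plain malloc stands in for whatever provides the underlying buffer.

```c
#include <stddef.h>
#include <stdlib.h>

#define CACHE_SLOTS 4
#define USAGE_BYPASS_CACHE 0x1u   /* hypothetical usage flag */

struct buf {
    size_t size;
    unsigned usage;
};

struct buf_cache {
    struct buf *slots[CACHE_SLOTS];
};

/* Remove and return the first cached buffer that is large enough. */
static struct buf *cache_take(struct buf_cache *c, size_t size)
{
    for (int i = 0; i < CACHE_SLOTS; i++) {
        struct buf *b = c->slots[i];
        if (b && b->size >= size) {
            c->slots[i] = NULL;
            return b;
        }
    }
    return NULL;
}

/* Creation: skip the cache lookup when the bypass flag is set. */
static struct buf *buf_create(struct buf_cache *c, size_t size, unsigned usage)
{
    struct buf *b = NULL;

    if (!(usage & USAGE_BYPASS_CACHE))
        b = cache_take(c, size);
    if (!b) {
        b = malloc(sizeof(*b));
        if (!b)
            return NULL;
        b->size = size;
    }
    b->usage = usage;
    return b;
}

/* Destruction: always try to cache, regardless of how it was created. */
static void buf_destroy(struct buf_cache *c, struct buf *b)
{
    for (int i = 0; i < CACHE_SLOTS; i++) {
        if (!c->slots[i]) {
            c->slots[i] = b;
            return;
        }
    }
    free(b);   /* cache full */
}
```

A buffer created with the bypass flag never comes out of the cache, but once destroyed it becomes available to later non-bypass creations, which matches the case of wrapping an existing kernel buffer.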
|
|
|
|
|
|
|
|
| |
In some situations it's important to restrict the sizes of buffers that the
cached buffer manager is allowed to return.
Signed-off-by: Thomas Hellstrom <[email protected]>
Cc: "10.1" <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Thomas Hellstrom <[email protected]>
Cc: "10.1" <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Thomas Hellstrom <[email protected]>
Cc: "10.1" <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Thomas Hellstrom <[email protected]>
Cc: "10.1" <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Thomas Hellstrom <[email protected]>
Cc: "10.1" <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Thomas Hellstrom <[email protected]>
Cc: "10.1" <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Thomas Hellstrom <[email protected]>
Cc: "10.1" <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Thomas Hellstrom <[email protected]>
Cc: "10.1" <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Thomas Hellstrom <[email protected]>
Cc: "10.1" <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Thomas Hellstrom <[email protected]>
Cc: "10.1" <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Thomas Hellstrom <[email protected]>
Cc: "10.1" <[email protected]>
|
|
|
|
|
|
|
| |
And update some existing commands.
Reviewed-by: Thomas Hellstrom <[email protected]>
Cc: "10.1" <[email protected]>
|
|
|
|
|
|
|
|
| |
This adds new interface functions for guest-backed surfaces and
adds a mobid parameter to the surface_relocation() function.
Reviewed-by: Thomas Hellstrom <[email protected]>
Cc: "10.1" <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Thomas Hellstrom <[email protected]>
Cc: "10.1" <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The old svga3d_reg.h file is split into separate header files and we
add new items for guest-backed surfaces.
Plus some minor code fixes because of renamed symbols.
Reviewed-by: Thomas Hellstrom <[email protected]>
Cc: "10.1" <[email protected]>
|
|
|
|
| |
Reviewed-by: Christian König <[email protected]>
|