| Commit message | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Unfortunately there's only one RT_ARRAY_MODE setting for all
attachments, so clears were previously truncated to the minimum number
of layers any attachment had. Instead set the RT_ARRAY_MODE to 512 (the
max number of layers) before doing the clear. This fixes
gl-3.2-layered-rendering-clear-color-mismatched-layer-count.
Also fix clears of individual layered rt/zeta, in case it ever happens.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
Reviewed-by: Christoph Bumiller <[email protected]>
Cc: 10.1 <[email protected]>
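A minimal sketch of the truncation described above, as a toy model rather than driver code. It assumes the clear covers at most as many layers as the single array-mode value allows; `min_layer_count`, `layers_cleared` and `MAX_RT_LAYERS` are hypothetical names, and only the value 512 comes from the message.

```c
#include <assert.h>

#define MAX_RT_LAYERS 512u /* max layer count quoted in the commit message */

/* Smallest layer count among the attachments; n must be at least 1.
 * This models the value the old code effectively cleared with. */
static unsigned min_layer_count(const unsigned *layers, unsigned n)
{
    assert(n > 0);
    unsigned m = layers[0];
    for (unsigned i = 1; i < n; i++)
        if (layers[i] < m)
            m = layers[i];
    return m;
}

/* Layers of one attachment that a clear touches for a given array mode
 * (assumption: the clear covers at most array_mode layers). */
static unsigned layers_cleared(unsigned attachment_layers, unsigned array_mode)
{
    return attachment_layers < array_mode ? attachment_layers : array_mode;
}
```

With attachments of 2 and 4 layers, the minimum-based mode clears only 2 layers of the 4-layer attachment, while a mode of `MAX_RT_LAYERS` lets each attachment be cleared up to its own layer count.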
|
|
|
|
|
|
|
|
| |
Use tex->bo_format instead of zs->format in ilo_blitter_rectlist_clear_zs()
because the latter may be combined depth/stencil format. hiz_can_clear_zs()
is no-op for GEN7+, but move the GEN check so that the assertions are tested.
Finally, call the fast depth clear function from ilo_clear().
|
|
|
|
|
| |
It is needed for 3DSTATE_CLEAR_PARAMS, and can also be used to track what
value the slice has been cleared to.
|
|
|
|
|
| |
Improve comments for the flags, and explicitly separate their uses in slice
flags and resolve flags.
|
|
|
|
|
| |
3DSTATE_STENCIL_BUFFER inherits some states from 3DSTATE_DEPTH_BUFFER. We
need to emit both even if the surface is stencil-only.
|
|
|
|
| |
Layer offsetting is possible when it is level 0, layer 0.
|
|
|
|
| |
It happens to work because PIPE_USAGE_STAGING is 0x100.
|
|
|
|
|
| |
Assume the bo has been written by another process, which will trigger a HiZ
resolve.
|
|
|
|
|
|
| |
We were turning non-memory spill slots into NULL.
Cc: 10.1 <[email protected]>
|
|
|
|
|
|
|
|
| |
Since we are now consuming two ringbuffers at a time, we probably want a
pool larger than 4.. but we don't need each individual ringbuffer to be
so large, so offset the pool size increase by reducing rb size.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
It seems the write-after-read hazard that applies to texture fetch
instructions, also applies to sfu instructions.
Also, cat5/cat6 instructions do not have a (ss) bit, so in these
cases we need to insert a dummy nop instruction with (ss) bit set.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes radeonsi emitting command streams to the kernel even when there
have been no draw calls before a flush, potentially powering up the GPU
needlessly.
Incidentally, this also cuts the runtime of piglit gpu.py in about half
on my Kaveri system, probably because an X11 client going away no longer
always results in a command stream being submitted to the kernel via
glamor.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=65761
Cc: "10.1" [email protected]
Reviewed-by: Marek Olšák <[email protected]>
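The idea of skipping an empty submission can be sketched as follows. This is an illustration under assumptions, not the radeonsi code: `toy_cs`, `toy_draw` and `toy_flush` are hypothetical names, and a simple draw counter stands in for whatever state the driver actually checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical command-stream state for the sketch. */
struct toy_cs {
    unsigned num_draws;   /* draws recorded since the last submission */
    unsigned submissions; /* times a stream was handed to the kernel */
};

static void toy_draw(struct toy_cs *cs)
{
    assert(cs != NULL);
    cs->num_draws++;
}

/* Returns true if a command stream was actually submitted. */
static bool toy_flush(struct toy_cs *cs)
{
    assert(cs != NULL);
    if (cs->num_draws == 0)
        return false; /* nothing drawn: do not wake the GPU */
    cs->submissions++;
    cs->num_draws = 0;
    return true;
}
```

A flush with no preceding draw returns early, so a client that merely goes away does not cause a submission in this model.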
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Seems texture sample instructions don't immediately consume their
src(s). In fact, some shaders from the blob compiler seem to indicate that
it does not even count the texture sample instructions when calculating
the number of delay slots to fill for non-sample instructions. (Although so
far it seems inconclusive as to whether this is required.)
In particular, when a src register of a previous texture sample
instruction is clobbered, the (ss) bit is needed to synchronize with the
tex pipeline to ensure it has picked up the previous values before they
are overwritten.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Was supposed to be a '+', otherwise we end up with a negative offset and
choosing registers below the assigned range.
This seems to fix the scheduling mystery "solved" by adding in extra
delay slots.
Signed-off-by: Rob Clark <[email protected]>
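The effect of the wrong sign can be shown with a small sketch. The helper names and the `base`/`off` parameters are hypothetical and do not reflect the actual register-assignment code; the point is only that subtracting the offset lands below the start of the assigned range.

```c
#include <assert.h>

/* Intended behaviour: component `off` of a value whose assigned
 * register range starts at `base`. */
static int reg_for(int base, int off)
{
    assert(off >= 0);
    return base + off;
}

/* The sign mistake described in the message: the result falls
 * below `base` for any off > 0. */
static int reg_for_buggy(int base, int off)
{
    assert(off >= 0);
    return base - off;
}
```

For a range starting at register 8, component 2 should map to register 10; the buggy form yields 6, which belongs to whatever was assigned below the range.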
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since 'kill' does not produce a result, the new compiler was happily
optimizing them out. We need to instead track 'kill's similar to
outputs. But since there is no non-predicated kill instruction,
(and for flattened if/else we do want them to be predicated), we need
to track the topmost branch condition on the stack and use that as src
arg to the kill. For a kill at the topmost level, we have to generate
an immediate 1.0 to feed into the cmps.f for setting the predicate
register.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Thanks to figuring out 32bit float render target, and adding regdump
test in fdre-a3xx, I can more easily play around with instructions to
figure out range of inputs/outputs/etc. And from this I can conclude
that cmps.f works more like expected and I can do something much more
simple in trans_cmp() (compared to before which was more closely
emulating the instruction sequence of the blob compiler).
And using sel.b32 (binary 0/1) often makes more sense than sel.f32
(+/- float) or sel.u32 (+/- uint) as it can use the output directly
from cmps.f without needing the 'add.s tmp0, tmp0, -1'.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The linux winsys needs to know whether a surface is shared.
For guest-backed surfaces we need this information to avoid allocating a
mob out of the mob cache for shared surfaces, but instead allocate a shared
mob, that is never put in the mob cache, from the kernel.
Also previously, all surfaces were given the "shareable" attribute when
allocated from the kernel. This is too permissive for client-local surfaces.
Now that we have the needed info, only set the "shareable" attribute if the
client indicates that it needs to share the surface.
Signed-off-by: Thomas Hellstrom <[email protected]>
Reviewed-by: Jakob Bornecrantz <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Cc: "10.1" <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Thomas Hellstrom <[email protected]>
Cc: "10.1" <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Thomas Hellstrom <[email protected]>
Cc: "10.1" <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Thomas Hellstrom <[email protected]>
Cc: "10.1" <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Thomas Hellstrom <[email protected]>
Cc: "10.1" <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Thomas Hellstrom <[email protected]>
Cc: "10.1" <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Thomas Hellstrom <[email protected]>
Cc: "10.1" <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Thomas Hellstrom <[email protected]>
Cc: "10.1" <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Thomas Hellstrom <[email protected]>
Cc: "10.1" <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Thomas Hellstrom <[email protected]>
Cc: "10.1" <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Thomas Hellstrom <[email protected]>
Cc: "10.1" <[email protected]>
|
|
|
|
|
|
|
| |
And update some existing commands.
Reviewed-by: Thomas Hellstrom <[email protected]>
Cc: "10.1" <[email protected]>
|
|
|
|
|
|
|
|
| |
This adds new interface functions for guest-backed surfaces and
adds a mobid parameter to the surface_relocation() function.
Reviewed-by: Thomas Hellstrom <[email protected]>
Cc: "10.1" <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Thomas Hellstrom <[email protected]>
Cc: "10.1" <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The old svga3d_reg.h file is split into separate header files and we
add new items for guest-backed surfaces.
Plus some minor code fixes because of renamed symbols.
Reviewed-by: Thomas Hellstrom <[email protected]>
Cc: "10.1" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Change the flag to DBG_HYPERZ and reverse the logic
so setting the flag enables the feature. This disables
hyperz on r600g and radeonsi by default. It can be
enabled by setting the env var. There are just too
many issues with certain apps so leave it disabled for
now until we sort out the issues with the problematic
apps.
Bugs:
https://bugs.freedesktop.org/show_bug.cgi?id=58660
https://bugs.freedesktop.org/show_bug.cgi?id=64471
https://bugs.freedesktop.org/show_bug.cgi?id=66352
https://bugs.freedesktop.org/show_bug.cgi?id=68799
https://bugs.freedesktop.org/show_bug.cgi?id=72685
https://bugs.freedesktop.org/show_bug.cgi?id=73088
https://bugs.freedesktop.org/show_bug.cgi?id=74428
https://bugs.freedesktop.org/show_bug.cgi?id=74803
https://bugs.freedesktop.org/show_bug.cgi?id=74863
https://bugs.freedesktop.org/show_bug.cgi?id=74892
https://bugzilla.kernel.org/show_bug.cgi?id=70411
Signed-off-by: Alex Deucher <[email protected]>
Cc: "10.1" "10.0" <[email protected]>
Acked-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
v2 (chk): revert feedback buffer hack
v3 (slava): fixed bitstream size calculation
v4 (chk): always create buffers in the right domain
v5 (chk): flush async
v6 (chk): rework fw interface add version check
v7 (leo): implement cropping support
v8 (chk): add hw checks
Signed-off-by: Christian König <[email protected]>
Signed-off-by: Leo Liu <[email protected]>
Signed-off-by: Slava Grigorev <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Commit 246ca4b001 ("nv50: implement multiple viewports/scissors, enable
ARB_viewport_array") added dirty tracking to scissors/viewports. However
it neglected to mark them all as dirty on a context switch. This fixes
an apparent regression in webgl in chrome, but probably in any
application that switches contexts.
Signed-off-by: Ilia Mirkin <[email protected]>
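A sketch of the dirty-tracking issue, under assumptions: `toy_ctx`, the bitmask layout and `NUM_VIEWPORTS` are hypothetical and not taken from the nv50 driver. With per-slot dirty bits, a context switch has to set every bit, because the hardware state left behind by the other context cannot be trusted.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_VIEWPORTS 16 /* assumed slot count for this sketch */

/* Hypothetical context with one dirty bit per viewport/scissor slot. */
struct toy_ctx {
    uint32_t viewports_dirty;
    uint32_t scissors_dirty;
};

/* Normal path: only the slot that changed is re-emitted. */
static void mark_viewport_dirty(struct toy_ctx *ctx, unsigned idx)
{
    assert(idx < NUM_VIEWPORTS);
    ctx->viewports_dirty |= 1u << idx;
}

/* On a context switch, mark every slot dirty so all state is re-emitted. */
static void toy_context_switch(struct toy_ctx *ctx)
{
    const uint32_t all = (1u << NUM_VIEWPORTS) - 1;
    ctx->viewports_dirty = all;
    ctx->scissors_dirty = all;
}
```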
|
|
|
|
|
|
|
| |
Unused and unmaintained for quite a while.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Maarten Lankhorst <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Christoph Bumiller <[email protected]>
|
|
|
|
| |
Signed-off-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
| |
This fixes bug 73200 "vdpau-GL interop fails due to different screen
objects" in the same way radeon does.
Signed-off-by: Maarten Lankhorst <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
| |
It should be possible to make this be 16 on nvc0.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Christian König <[email protected]>
Signed-off-by: Leo Liu <[email protected]>
|
|
|
|
|
|
| |
Tested on rv635 and barts.
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Apparently some players are ill-prepared for us claiming that a decoder
exists only to have creating it fail, and express this poor preparation
with crashes (e.g. flash). Check that firmware is there to increase the
chances of there being a high correlation between reported capabilities
and ability to create a decoder.
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: 10.0 10.1 <[email protected]>
Tested-by: Emil Velikov <[email protected]>
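One way to picture the check is the sketch below. It is not the driver's implementation: the function names and the firmware path argument are hypothetical, and POSIX `access()` stands in for however the driver actually probes for firmware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

/* True if the firmware file at `path` exists and is readable. */
static bool firmware_present(const char *path)
{
    assert(path != NULL);
    return access(path, R_OK) == 0;
}

/* Report a decoder as supported only when the hardware can decode the
 * codec AND the firmware needed to create the decoder is available. */
static bool report_decoder_supported(bool hw_supports_codec,
                                     const char *fw_path)
{
    return hw_supports_codec && firmware_present(fw_path);
}
```

Tying the reported capability to the presence of firmware makes it less likely that a player is told a decoder exists and then fails to create it.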
|
|
|
|
|
|
|
|
| |
v2: adjust limits for radeonsi and llvmpipe
v3: add documentation
Cc: "10.1" <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
nvfx_fragprog_assign_generic only allows for up to 10/8 texcoords for
nv40/nv30. This fixes compilation of the varying-packing tests.
Furthermore it appears that the last 2 inputs on nv4x don't seem to
work in those tests, so just report 8 everywhere for now.
Tested on NV42, NV44. NV4B appears to have additional problems.
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: 9.1 9.2 10.0 10.1 <[email protected]>
|
|
|
|
| |
Cc: 10.1 <[email protected]>
|
|
|
|
| |
Cc: 10.1 <[email protected]>
|
|
|
|
| |
It's required for being able to use software methods now.
|
| |
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Christoph Bumiller <[email protected]>
|