| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
| |
In case ctx->stream == NULL the fail label gets executed where
pctx gets dereferenced - too bad pctx is NULL in that case.
Caught by Coverity, reported to me by imirkin.
Cc: "17.0" <[email protected]>
Signed-off-by: Christian Gmeiner <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
(cherry picked from commit a0b16a08905d68da07668a42eeb464b4f30bf3e5)
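
As an illustration of the bug class described above, here is a minimal sketch of an error path that only touches a pointer known to be valid. The names (struct context, struct stream, context_create, fail_stream_alloc) are hypothetical stand-ins, not the driver's actual code.

```c
#include <stdlib.h>

/* Hypothetical stand-ins; these are not the real driver types. */
struct stream { int unused; };

struct context {
   struct stream *stream;
   void (*destroy)(struct context *ctx);
};

static void context_destroy(struct context *ctx)
{
   free(ctx->stream); /* free(NULL) is a no-op */
   free(ctx);
}

/* fail_stream_alloc simulates the stream allocation failing. */
static struct context *context_create(int fail_stream_alloc)
{
   struct context *ctx = calloc(1, sizeof(*ctx));
   if (!ctx)
      return NULL;

   ctx->destroy = context_destroy;
   ctx->stream = fail_stream_alloc ? NULL : calloc(1, sizeof(*ctx->stream));
   if (!ctx->stream)
      goto fail;

   return ctx;

fail:
   /* Clean up through ctx, which is valid here, rather than through a
    * second pointer that has not been assigned yet on this path. */
   ctx->destroy(ctx);
   return NULL;
}
```

Calling context_create(1) exercises the failure path and returns NULL without dereferencing an unset pointer.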
|
|
|
|
|
|
|
|
|
|
|
| |
The same PS epilog workaround as for 8-bit integer formats is required,
since the CB doesn't do clamping.
Fixes GL45-CTS.gtf32.GL3Tests.packed_pixels.packed_pixels*.
Cc: [email protected]
Reviewed-by: Marek Olšák <[email protected]>
(cherry picked from commit 066a117be77fdc2b29c8eafabb4e2c2fa902a18e)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Also handle the GL_ARB_indirect_parameters case where the count itself
is in a buffer.
Use transfers rather than mapping the buffers directly. This anticipates
the possibility that the buffers are sparse (once ARB_sparse_buffer is
implemented), in which case they cannot be mapped directly.
Fixes GL45-CTS.gtf43.GL3Tests.multi_draw_indirect.multi_draw_indirect_type
on <= CIK.
v2:
- unmap the indirect buffer correctly
- handle the corner case where we have indirect draws, but all of them
have count 0.
Cc: [email protected]
Reviewed-by: Marek Olšák <[email protected]>
Acked-by: Edward O'Callaghan <[email protected]>
(cherry picked from commit 6a1d9684f4ec1e1eed49bc14749be7b7784277ec)
|
|
|
|
|
|
|
|
|
| |
It's OK for r300g (because r300g can't write to buffers via the GPU), but
not later hardware. This issue was spotted randomly.
Cc: [email protected]
Reviewed-by: Nicolai Hähnle <[email protected]>
(cherry picked from commit c8ef5123980f9f538c79e626b0092660a2256ae6)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
start can only be non-zero with MultiDrawElements, which is unlikely
to occur with UNSIGNED_BYTE indices.
v2: Also fix the util_shorten_ubyte_elts_to_userptr call.
Tested with the new piglit.
Cc: [email protected]
Reviewed-by: Nicolai Hähnle <[email protected]>
(cherry picked from commit a264fee6245856340fab9024e1a428626e966335)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <[email protected]>
Conflicts:
src/gallium/drivers/radeonsi/si_state_draw.c
|
|
|
|
|
|
|
|
|
|
|
| |
Empirically, this makes things work. Presumably this was originally
copied from the blob, which does make use of linked tsc mode.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99532
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
Cc: [email protected]
(cherry picked from commit 956556b3c30ce3d38d0af795f9383df3bc2cf8a2)
|
|
|
|
|
|
|
|
| |
Fixes GL45-CTS.compute_shader.conditional-dispatching
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: [email protected]
(cherry picked from commit 48f04862c1d74844db9534b32ef73e5a2bc0ae74)
|
|
|
|
|
|
|
|
| |
Fixes GL45-CTS.compute_shader.atomic-case1 on Maxwell
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: [email protected]
(cherry picked from commit 7e75f0913ab545be14feb233d1ed74dc48116fb8)
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The kernel will reject our shader if we emit one here, and having 4, 8, or
12 as the top end of our UBO clamp is rare enough that it's not worth
making the kernel let us.
Fixes piglit fs-const-array-of-struct and
fs-const-array-of-struct-of-array since recent GLSL linking changes made
us get this as an indirect load of a uniform, instead of a temporary.
Cc: "13.0 17.0" <[email protected]>
(cherry picked from commit b2309393039b2ec0cc00a8e6fd828c60c4ef1e11)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We want cached GTT for all non-persistent read mappings.
Set level = 0 on purpose.
Use dma_copy, because resource_copy_region causes a failure in the PBO
read of piglit/getteximage-luminance.
If Rocket League used the READ flag, it should get cached GTT.
v2: mask out UNSYNCHRONIZED
Cc: 13.0 17.0 <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
(cherry picked from commit d86099df0af7c22c8acfd48b38ad446d9c8df6bd)
|
|
|
|
|
|
|
|
|
|
|
|
| |
We just increased the max UBO, so we should also increase the clamp that
we do for robustness. Similarly, as we're including the fileIndex in the
new indirect value, we should reset fileIndex to 0 so that it is not
added in a second time.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
Cc: [email protected]
(cherry picked from commit c95f821cb4286f8163bfdf341be2b0940011585a)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Kepler and up unfortunately only support up to 8 constbufs. We work
around this by loading from constbufs as if they were storage buffers.
However we were not consistently applying limits to loads from these
buffers. Make sure to do the same thing we do for storage buffers.
Fixes GL45-CTS.robust_buffer_access_behavior.uniform_buffer
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
Cc: [email protected]
(cherry picked from commit 1acdd62847cf0da8a8e9c7915d698208d73a5be8)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Apparently GL 4.5 requires 14 of these (there's a "*" in the spec, but
it's unclear what it refers to). We need to expose an extra binding
point for the "program parameters", which means this must be 15. Remove
the last vestige of the "use c14 for immediates" idea.
Fixes GL45-CTS.shading_language_420pack.binding_uniform_block_array
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
Cc: [email protected]
(cherry picked from commit 59ca352fc573a37f9f70c1f6217e85dd3e31d38e)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Clipper::ClipScalar() is dead code and should be removed. It is causing
an error with gcc-7 because it references a now defunct member.
v2: includes bugzilla reference, same code change
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99633
CC: "13.0 17.0" <[email protected]>
Tested-by: Vinson Lee <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
Reviewed-by: Tim Rowley <[email protected]>
(cherry picked from commit bf29495dcdb290c8b15cacd2001603b8ae5d36c8)
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit bdd860e3076655519d45bd66936ef7be9b7dda63.
Requested by a game developer.
Cc: 17.0 <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
(cherry picked from commit dfe111368d11aaffae7f8738c858c335cdec1e9d)
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes a vertex data corruption issue if some of the vertex streams
go through the MMU and some don't.
Signed-off-by: Lucas Stach <[email protected]>
Tested-by: Philipp Zabel <[email protected]>
Acked-by: Christian Gmeiner <[email protected]>
(cherry picked from commit e158b7497103f145a9236a70183e07c37a9e13f7)
Nominated-by: Christian Gmeiner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes rendering of full-screen quads (and other screen-filling
geometry, e.g. ioquake3 walls up-close) on gc3000. It should be a no-op
on other hardware.
- It looks like SE_CLIP registers were not set at all.
I'm amazed that rendering worked without them. Emit them to
avoid issues on gc3000.
- Define constants
ETNA_SE_SCISSOR_MARGIN_RIGHT (0x1119)
ETNA_SE_SCISSOR_MARGIN_BOTTOM (0x1111)
ETNA_SE_CLIP_MARGIN_RIGHT (0xffff)
ETNA_SE_CLIP_MARGIN_BOTTOM (0xffff)
These demarcate the margin (fixp16) between the computed sizes and the
value sent to the chip. I have set these to the numbers used by the
Vivante driver for gc2000. I am not sure whether any old hardware was
relying on the old numbers, or whether those were just a guess. But if
so, these need to be moved to the _specs structure.
CC: <[email protected]>
Signed-off-by: Wladimir J. van der Laan <[email protected]>
Acked-by: Christian Gmeiner <[email protected]>
(cherry picked from commit 56314f5bafdfeb514adf8401c52f216bd430bbb2)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Shaders using sin/cos instructions were not working on GC3000.
The reason for this turns out to be that these chips implement sin/cos
in a different way (but using the same opcodes):
- Need their input scaled by 1/pi instead of 2/pi.
- Output an x and y component, which need to be multiplied to
get the result.
- tex_amode needs to be set to 1.
Add a new bit to the compiler specs and generate these instructions
as necessary.
CC: <[email protected]>
Signed-off-by: Wladimir J. van der Laan <[email protected]>
Acked-by: Christian Gmeiner <[email protected]>
(cherry picked from commit fe3bb8cdb519a01e6315ce6f142827aece3d4a41)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Exposing rb swapped (or other swizzled) formats for rendering would
involve swizzling in the pixel shader. This is not the case at the
moment, so reject requests for creating such surfaces.
(GPUs that need an extra resolve step anyway due to multiple pixel
pipes, such as gc2000, might also do this swap in the resolve operation.
But this would be tricky to keep track of)
CC: <[email protected]>
Signed-off-by: Wladimir J. van der Laan <[email protected]>
Acked-by: Christian Gmeiner <[email protected]>
(cherry picked from commit 658568941d5e232d690e1ffbcddbd6ea9685693a)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use of unsigned loop control variable with '>= 0' would lead
to infinite loop.
Reported by clang:
etnaviv_compiler.c:1024:39: warning: comparison of unsigned expression
>= 0 is always true [-Wtautological-compare]
for (unsigned sp = c->frame_sp; sp >= 0; sp--)
~~ ^ ~
v2: Simply use the same datatype as c->frame_sp is using.
CC: <[email protected]>
Reported-by: Rhys Kidd <[email protected]>
Signed-off-by: Christian Gmeiner <[email protected]>
Reviewed-by: Rhys Kidd <[email protected]>
(cherry picked from commit 82fe240a9912d78bc2eec513c1139c918c5f189f)
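
The pattern behind the warning above can be sketched as follows. This is a hypothetical helper (find_frame and its parameters are not from the driver), and it assumes the stack pointer is a signed int, as the v2 note suggests.

```c
/* Hypothetical helper: walk a frame stack from the top entry down to
 * index 0 and return the index of the first entry equal to `wanted`,
 * or -1 if there is none. */
static int find_frame(const int *frames, int frame_sp, int wanted)
{
   /* sp is signed, like frame_sp, so `sp >= 0` becomes false once sp
    * is decremented past 0. With `unsigned sp` the decrement would wrap
    * around to UINT_MAX and the condition would stay true forever. */
   for (int sp = frame_sp; sp >= 0; sp--) {
      if (frames[sp] == wanted)
         return sp;
   }
   return -1;
}
```

With a three-entry stack {5, 7, 9} and frame_sp = 2, searching for 7 returns index 1, and searching for a value that is absent terminates with -1.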
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes a bug uncovered by the 17-part patch series, specifically:
"gallium/radeon: merge dirty_fb_counter and dirty_tex_descriptor_counter"
If dirty_tex_counter has been updated and set_shader_image invokes DCC
decompression, the DCC decompression itself checks the counter and updates
descriptors, which in turn invokes the same DCC decompression. The blitter
can't handle the recursion and the driver eventually crashes.
Cc: 17.0 <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
(cherry picked from commit a0740d59aa97a08d89998cb57138e8217a331af6)
|
|
|
|
|
|
|
|
|
|
| |
Commit 7b5878ee0491e7a93914389a8369cd6752b9757d increased the number of
outputs to 64, but left the output array intact. This caused a stack
overflow when the number of outputs is greater than 32. Found by ASAN.
Cc: "12.0 13.0 17.0" <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
(cherry picked from commit a41f2527ae8ae5432b99c88863fbdf2f0b5f04ad)
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Analogous to previous commit.
Fixes: 4610e5ef28e "freedreno/ir3: fix sin/cos"
Cc: "12.0 13.0" <[email protected]>
Cc: Rob Clark <[email protected]>
Cc: Nicolas Dechesne <[email protected]>
Reported-by: Nicolas Dechesne <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
Tested-by: Nicolas Dechesne <[email protected]>
(cherry picked from commit a922c821255bfac22cf705244e5bd303a626bb55)
|
|
|
|
|
|
|
|
|
| |
This fixes R11G11B10_FLOAT, because it's in the category of "OTHER",
meaning that it doesn't have any channel description.
Cc: 17.0 <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
(cherry picked from commit eac7df43ca05abd9992b305e078e88fe7b7f8c91)
|
|
|
|
|
|
|
|
|
| |
Some CIK-VI docs say this is the default behavior on SI. That doesn't
answer whether it's also the default behavior on CIK-VI.
Cc: 17.0 13.0 <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
(cherry picked from commit 573bf0940a08e18a511e338de478f30fd95a1590)
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some query results struct contents are declared as cache line aligned.
Use aligned malloc, and align the whole struct, to be safe.
Fixes crash when compiling with clang.
CC: <[email protected]>
Reviewed-by: Bruce Cherniak <[email protected]>
(cherry picked from commit 00847e4f14dd237dfcdb2c3d15be1325a08ccf5a)
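
A minimal sketch of the idea above, assuming a C11 toolchain and a 64-byte cache line. The struct layout and the names query_result and query_result_alloc are hypothetical, and C11 aligned_alloc stands in for whatever aligned allocator the driver actually uses.

```c
#include <stdalign.h>
#include <stdint.h>
#include <stdlib.h>

#define CACHE_LINE 64 /* assumed cache line size */

/* Hypothetical result struct: one member is declared cache-line
 * aligned, which raises the alignment of the whole struct to 64. */
struct query_result {
   alignas(CACHE_LINE) uint64_t counters[8];
   uint32_t flags;
};

/* Allocate with the alignment the type requires. Plain malloc only
 * guarantees alignment suitable for fundamental types, which may be
 * less than CACHE_LINE. */
static struct query_result *query_result_alloc(void)
{
   /* aligned_alloc expects a size that is a multiple of the alignment;
    * round up to be safe. */
   size_t size = (sizeof(struct query_result) + CACHE_LINE - 1)
                 / CACHE_LINE * CACHE_LINE;
   return aligned_alloc(CACHE_LINE, size);
}
```

Memory returned by query_result_alloc is released with free(). A compiler is allowed to assume the declared alignment when it generates loads and stores, so an under-aligned allocation may fault with one compiler and appear to work with another.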
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
CalculateProcessorTopology tries to figure out system topology by
parsing /proc/cpuinfo to determine the number of threads, cores, and
NUMA nodes. There are some architectures where the "physical id" begins
with 1 rather than 0, which was creating an empty "0" node and causing a
crash in CreateThreadPool.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97102
Reviewed-By: George Kyriazis <[email protected]>
CC: <[email protected]>
(cherry picked from commit b829206b0739925501bcc68233437d6d03b79795)
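
One way to avoid an empty node when ids do not start at 0 is to map each distinct "physical id" to a dense index. The sketch below illustrates that idea only; it is not necessarily how this commit fixes the problem, and assign_dense_nodes and MAX_NODES are hypothetical names.

```c
#include <stddef.h>

#define MAX_NODES 16 /* hypothetical upper bound on distinct ids */

/* Map each CPU's raw physical id to a dense node index (0, 1, 2, ...)
 * in order of first appearance. Returns the number of distinct nodes,
 * or -1 if more than MAX_NODES distinct ids are seen. */
static int assign_dense_nodes(const int *physical_ids, size_t n,
                              int *node_index)
{
   int seen[MAX_NODES];
   int num_nodes = 0;

   for (size_t i = 0; i < n; i++) {
      int j;

      /* Look for an id we have already assigned an index to. */
      for (j = 0; j < num_nodes; j++) {
         if (seen[j] == physical_ids[i])
            break;
      }

      /* First time this id appears: give it the next dense index. */
      if (j == num_nodes) {
         if (num_nodes == MAX_NODES)
            return -1;
         seen[num_nodes++] = physical_ids[i];
      }

      node_index[i] = j;
   }
   return num_nodes;
}
```

For physical ids {1, 1, 2, 2} this yields two nodes with indices {0, 0, 1, 1}, so no node is left without CPUs.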
|
|
|
|
|
|
|
| |
Tested-by: Glenn Kennard <[email protected]>
Tested-by: James Harvey <[email protected]>
Cc: 17.0 <[email protected]>
(cherry picked from commit e4f8f9a638c1ffb9b76840b088290f11f0f91813)
|
|
|
|
|
|
|
|
|
| |
We will use it for DDIV.
Tested-by: Glenn Kennard <[email protected]>
Tested-by: James Harvey <[email protected]>
Cc: 17.0 <[email protected]>
(cherry picked from commit 488560cfe6ee2206f7a7f894694ebc43b419be61)
|
|
|
|
|
|
|
|
|
|
|
| |
It seems clear that trying to multiply two pairs of doubles would result
in the temporary register getting overwritten by the second pair. So
make the code more explicit.
Tested-by: Glenn Kennard <[email protected]>
Tested-by: James Harvey <[email protected]>
Cc: 17.0 <[email protected]>
(cherry picked from commit 76b02d2fe1df5351f67f53d07b37952043f0a84c)
|
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Cc: "17.0" <[email protected]>
(cherry picked from commit 31daeb5bf14334bc0d39f28c9102cd15d834abfc)
|
|
|
|
|
|
|
|
|
|
|
|
| |
What a3xx docs call IJPERSPCENTERREGID.. the xy coord passed into
bary.f. We were incorrectly setting both this and gl_FragCoord.xy to
the same register resulting in all sorts of hilarity.
Fixes stk, vdrift, 0ad, probably a bunch others.
Signed-off-by: Rob Clark <[email protected]>
Cc: "17.0" <[email protected]>
(cherry picked from commit 8d6af93e76bb9e592293b632b22b2b756cc0cae8)
|
|
|
|
|
|
|
|
|
| |
Note spritelist (POINTLIST_PSIZE) seems not to be a thing anymore on
a5xx.
Signed-off-by: Rob Clark <[email protected]>
Cc: "17.0" <[email protected]>
(cherry picked from commit 6cc93bedc15d09395ab6a92a0a129d06a8cd8ae8)
|
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Cc: "17.0" <[email protected]>
(cherry picked from commit 141a4f86d6b9c0c4dbde511b741576a103f8f7ff)
|
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Cc: "17.0" <[email protected]>
(cherry picked from commit 69fbb458cf59fbab5f6675ad256a266b04d54700)
|
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Cc: "17.0" <[email protected]>
(cherry picked from commit 16671e970444f154ffa60d2aaadee4d065eb6103)
|
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Cc: "17.0" <[email protected]>
(cherry picked from commit 4d9aa4f67d6316feea93901bf29b76a68c4333cd)
|
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Cc: "17.0" <[email protected]>
(cherry picked from commit 4c39458460075f6c1ea9e4607769513b96c6dd82)
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes VM faults. Discovered by Samuel Pitoiset.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98975
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99450
Cc: 17.0 13.0 <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
(cherry picked from commit e490b7812cae778c61004971d86dc8299b6cd240)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
At least on VI, texture gather doesn't work with a 24_8 data format, so
use 8_8_8_8 and a modified swizzle instead.
A bit of background: When creating a GL_STENCIL_INDEX8 texture, we select
the X24S8 pipe format because we don't support stencil-only render targets
properly. With mip-mapping this can lead to a setup where the tiling is
incompatible with stencil texturing, and a flushed stencil texture is
used. For the flushed stencil, a literal X24S8 is used because there were
issues with an 8bpp DB->CB copy.
Longer term, it would be good if we could get away from these workarounds,
i.e. properly support an S8 format for stencil-only rendering and flushed
stencil. Since stencil texturing is somewhat rare, it's not a high
priority.
Fixes GL45-CTS.texture_cube_map_array.sampling.
Cc: 17.0 <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Acked-by: Edward O'Callaghan <[email protected]>
(cherry picked from commit 3cd092c41508dde2e6259f09df1736911a828548)
|
|
|
|
|
|
|
|
|
|
|
| |
This commit makes si_update_poly_offset set poly_offset to NULL if
uses_poly_offset is false. This way poly_offset either points into the
currently queued rasterizer, or it is NULL.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99451
Cc: "13.0 17.0" <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
(cherry picked from commit d7d32b3bfe86bd89d94d59393907bce1cb9dab7c)
|
|
|
|
|
|
| |
v2: now it should be correct
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
Only vertex buffers use a separate bool flag.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
the mutex lock is inside util_range_add.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
| |
the next commit will use it in a clever way, because the CP DMA prefetch
doesn't need this.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
If the shader selector is created with a different context than
the shader variant, we should use the calling context's target machine
for the shader variant.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99419
Reviewed-by: Nicolai Hähnle <[email protected]>
|