| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We're not particularly concerned with memory usage, if the tradeoff is
shader recompiles. And it's common for apps to have a lot of shaders
nowadays (and, since our shaders include a LOT of context state of course
we may create quite a bit more shaders even).
So quadruple the amount of shaders draw will cache (from 128 to 512).
For llvmpipe (fs shaders) quadruple the number of instructions, keep the
number of variants the same for now (only with very simple, non-texturing
shaders the variant limit could really be reached), and simplify the
definition, it's probably easier to just have one different definition
per branch...
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
| |
Fixes:7319ff87("radeon/uvd: add YUYV format support for target buffer")
Signed-off-by: Leo Liu <[email protected]>
Reviewed-by: Christian König <[email protected]>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
| |
SWR rasterizer must remain C++11 compliant.
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
| |
Fixed properly in gcc-compatible fashion.
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
| |
For consistency and to support overloading.
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Needed to compensate for change to fetch jit requiring
alignment.
Fixes regressions in piglit: vertex-buffer-offsets and about
another hundred of the vs-input*byte* tests.
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When the HS wave is empty, the hardware writes the LS VGPRs starting at
v0 instead of v2. Workaround by shifting them back into place when
necessary. For simplicity, this is always done in the LS prolog.
According to the hardware team, this will be fixed in future chips,
so take that into account already.
Note that this is not a bug fix, as the bug was already worked
around by commit 166823bfd26 ("radeonsi/gfx9: add a temporary workaround
for a tessellation driver bug"). This change merely replaces the
workaround by one that should be better.
v2: add workaround code to shader only when necessary
v3: clarify the prefer_mono comment
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Acked-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
This fixes GL45-CTS.shader_image_load_store.basic-glsl-earlyFragTests.
Cc: [email protected]
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
The buffer bind flags can be promoted in svga_buffer_handle(), so
move the assertion after it. This has already been done for
vertex buffer in commit 6b4bf7e8be, but it misses the one for
index buffer.
Fixes assertion running WarThunder.
Reviewed-by: Neha Bhende <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Minor performance improvement in avoiding binding the same shader resource
or the same vertex buffer for the same slot.
Tested with MTT glretrace.
v2: Per Brian's suggestion, add a helper function to do vertex buffer
comparision.
v3: Change the helper function to vertex_buffers_equal().
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
| |
This increases performance, but it was tuned for Raven, not Vega.
We don't know yet how Vega will perform, hopefully not worse.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
| |
3 flags for primitive binning, 2 flags for out-of-order rasterization
(but that will be done some other time)
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
| |
The data is read when the render_cond_atom is emitted, so we must
delay emitting the atom until after the flush.
Fixes: 0fe0320dc074 ("radeonsi: use optimal packet order when doing a pipeline sync")
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
The result written by the shader workaround needs to be written back, or
the CP may read stale data.
Fixes: 78476cfe071a ("radeonsi: enable ARB_transform_feedback_overflow_query")
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Fixes: 420c438589c8 ("radeonsi: log draw and compute state into log context")
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
used
v2: remove some redundant checks
Acked-by: Roland Scheidegger <[email protected]> (v1)
Tested-by: Dieter Nützel <[email protected]> (v1)
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
| |
Acked-by: Roland Scheidegger <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
| |
Acked-by: Roland Scheidegger <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
| |
Acked-by: Roland Scheidegger <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
| |
Acked-by: Roland Scheidegger <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
| |
Acked-by: Roland Scheidegger <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
| |
Acked-by: Roland Scheidegger <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Most older drivers seem to just ignore the Dimension setting, so virtually
no changes should be needed.
Acked-by: Roland Scheidegger <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
| |
It adds the capacity to decode MJPEG stream with DRI marker
Signed-off-by: Leo Liu <[email protected]>
|
|
|
|
|
|
| |
Fixes: 130d1f456b8 ("radeon/uvd: reconstruct MJPEG bitstream")
Signed-off-by: Leo Liu <[email protected]>
|
|
|
|
|
|
|
|
| |
It is kind of pointless for compute, and avoids issues with apps kicking
off more than 32 compute shaders at once.
Signed-off-by: Rob Clark <[email protected]>
Cc: "17.2" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Accompanying patch "st/mesa: only try to create 1x msaa surfaces for
'fake' msaa" requires driver to report max_samples=1 to enable "fake"
msaa. Previously, 0 and 1 were treated equivalently in st_init_extensions()
and either could enable "fake" msaa.
This patch raises the swr default msaa_max_count from 0 to 1, so that
swr_is_format_supported will report max_samples=1.
Real msaa can still be enabled by exporting SWR_MSAA_MAX_COUNT with a
pow2 value between 2 and 16.
This patch is necessary to prevent an OpenSWR regression resulting from
the st/mesa patch.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102038
Acked-by: Brian Paul <[email protected]>
Reviewed-By: George Kyriazis <[email protected]>
|
|
|
|
|
|
|
|
| |
For radv, in order to report VM faults when detected.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On using builtin functions we have to move the input to registers $0 and $1, if
one of the input value is an immediate, we fail to propagate the immediate:
...
mov u32 $r477 0x00000003 (0)
...
mov u32 $r0 %r473 (0)
mov u32 $r1 $r477 (0)
call abs BUILTIN:0 (0)
mov u32 %r495 $r1 (0)
...
With this patch the immediate is propagated, potentially causing the first MOV
to be superfluous, which we'd remove in that case:
...
mov u32 $r0 %r473 (0)
mov u32 $r1 0x00000003 (0)
call abs BUILTIN:0 (0)
mov u32 %r495 $r1 (0)
...
Shaderdb stats:
total instructions in shared programs : 4893460 -> 4893324 (-0.00%)
total gprs used in shared programs : 582972 -> 582881 (-0.02%)
total local used in shared programs : 17960 -> 17960 (0.00%)
local gpr inst bytes
helped 0 91 112 112
hurt 0 0 0 0
v2:
implement some changes proposed by imirkin, the manual deletion of the dead
mov is necessary after ea22ac23e0 ("nvc0/ir: unlink values pre- and post-call
to division function") as the potentially dead mov is unlinked properly,
causing later passes to not notice the mov op at all and thus not cleaning it
up. That makes up a big chunk of the regression the above commit caused.
Keep the deletion of the op where it is, deleting it later unnecessarily blows
up size of the change.
Signed-off-by: Tobias Klausmann <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
cs_invocations are currently unsupported, but leaving the field uninitialized
is even worse.
fixes on nvc0:
* KHR-GL45.pipeline_statistics_query_tests_ARB.functional_default_qo_values
* KHR-GL45.pipeline_statistics_query_tests_ARB.functional_non_rendering_commands_do_not_affect_queries
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
Cc: [email protected]
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix loading of a 3x16 vector as a single 48-bit load
on big-endian systems (PPC64, S390).
Roland Scheidegger's commit e827d9175675aaa6cfc0b981e2a80685fb7b3a74
plus Ray Strode's patch reduce pre-Roland Piglit failures from ~4000 to ~2000. This patch fixes
three of the four regressions observed by Ray:
- draw-vertices
- draw-vertices-half-float
- draw-vertices-half-float_gles2
One regression remains:
- draw-vertices-2101010
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100613
Cc: "17.2" "17.1" <[email protected]>
Signed-off-by: Ben Crocker <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
lp_build_fetch_rgba_soa fetches a texel from a texture.
Part of that process involves first gathering the element
together from memory into a packed format, and then breaking
out the individual color channels into separate, parallel
arrays.
The code fails to account for endianess when reading the packed
values.
This commit attempts to correct the problem by reversing the order
the packed values are read on big endian systems.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100613
Cc: "17.2" "17.1" <[email protected]>
Signed-off-by: Ray Strode <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
When the kernel supports it set the local flag and
stop adding those BOs to the BO list.
Can probably be optimized much more.
v2: rename new flag to AMDGPU_GEM_CREATE_VM_ALWAYS_VALID
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
For lower overhead in the CS ioctl.
Winsys allocators are not used with interprocess-sharable resources.
v2: It shouldn't crash anymore, but the kernel will reject the new flag.
v3 (christian): Rename the flag, avoid sending those buffers in the BO list.
v4 (christian): Remove setting the kernel flag for now
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
Use MAX2() because sampleCount will be zero for non-MSAA surfaces.
No Piglit regressions.
Reviewed-by: Charmaine Lee <[email protected]>
|
|
|
|
|
|
|
| |
Only useful when that debug option is enabled.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This fixes a rendering issue with Hitman when bindless textures
are enabled.
Fixes: 2263610827 ("radeonsi: flush DB caches only when transitioning from DB to texturing")
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If llvmpipe_set_scissor_states() is never called, we still need to be sure
that derived scissor/clip state is updated. As of commit 743ad599a97d09b1
that function might not be called.
Fixes regressed Piglit gl-1.0-scissor-offscreen -fbo -auto test.
Reviewed-by: Roland Scheidegger <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101709
Fixes: 743ad599a97 ("st/mesa: don't set 16 scissors and 16 viewports
if they're unused")
Cc: "17.2" <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
This is still very simple, but it's better than before.
Loosely ported from Vulkan.
|