| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
This introduces a new separate option because the output can
be quite verbose. If spirv-dis is not found in the path, this
debug option is useless.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
At the moment, debugging radv is not really easy because the
driver doesn't report enough information when it hangs. This
new file will be the main location for all debug tools.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
Ported from RadeonSI (original patch by Marek).
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For software drivers where we want "fake" msaa support for GL 3.x, we
treat 1 sample as being msaa.
For drivers with real msaa support, start format probing at 2x msaa.
For drivers with fake msaa support, start format probing at 1x msaa.
This also tweaks the MaxSamples code in st_init_extensions() so that
we use MaxSamples=1 for fake msaa. This allows the format proble loops
to run at least one iteration.
This fixes a llvmpipe/VTK regression from commit 6839d3369905eb02151.
And for drivers with fake msaa support, calls such as
glTexImage2DMultisample(samples=1) will now succeed.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102038
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102125
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On using builtin functions we have to move the input to registers $0 and $1, if
one of the input value is an immediate, we fail to propagate the immediate:
...
mov u32 $r477 0x00000003 (0)
...
mov u32 $r0 %r473 (0)
mov u32 $r1 $r477 (0)
call abs BUILTIN:0 (0)
mov u32 %r495 $r1 (0)
...
With this patch the immediate is propagated, potentially causing the first MOV
to be superfluous, which we'd remove in that case:
...
mov u32 $r0 %r473 (0)
mov u32 $r1 0x00000003 (0)
call abs BUILTIN:0 (0)
mov u32 %r495 $r1 (0)
...
Shaderdb stats:
total instructions in shared programs : 4893460 -> 4893324 (-0.00%)
total gprs used in shared programs : 582972 -> 582881 (-0.02%)
total local used in shared programs : 17960 -> 17960 (0.00%)
local gpr inst bytes
helped 0 91 112 112
hurt 0 0 0 0
v2:
implement some changes proposed by imirkin, the manual deletion of the dead
mov is necessary after ea22ac23e0 ("nvc0/ir: unlink values pre- and post-call
to division function") as the potentially dead mov is unlinked properly,
causing later passes to not notice the mov op at all and thus not cleaning it
up. That makes up a big chunk of the regression the above commit caused.
Keep the deletion of the op where it is, deleting it later unnecessarily blows
up size of the change.
Signed-off-by: Tobias Klausmann <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
cs_invocations are currently unsupported, but leaving the field uninitialized
is even worse.
fixes on nvc0:
* KHR-GL45.pipeline_statistics_query_tests_ARB.functional_default_qo_values
* KHR-GL45.pipeline_statistics_query_tests_ARB.functional_non_rendering_commands_do_not_affect_queries
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
Cc: [email protected]
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix loading of a 3x16 vector as a single 48-bit load
on big-endian systems (PPC64, S390).
Roland Scheidegger's commit e827d9175675aaa6cfc0b981e2a80685fb7b3a74
plus Ray Strode's patch reduce pre-Roland Piglit failures from ~4000 to ~2000. This patch fixes
three of the four regressions observed by Ray:
- draw-vertices
- draw-vertices-half-float
- draw-vertices-half-float_gles2
One regression remains:
- draw-vertices-2101010
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100613
Cc: "17.2" "17.1" <[email protected]>
Signed-off-by: Ben Crocker <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
lp_build_fetch_rgba_soa fetches a texel from a texture.
Part of that process involves first gathering the element
together from memory into a packed format, and then breaking
out the individual color channels into separate, parallel
arrays.
The code fails to account for endianess when reading the packed
values.
This commit attempts to correct the problem by reversing the order
the packed values are read on big endian systems.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100613
Cc: "17.2" "17.1" <[email protected]>
Signed-off-by: Ray Strode <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
| |
Fixes build error when it's not.
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes some crashes in the dEQP-VK.memory.requirements.core.* tests.
I'm not sure whether or not passing out-of-bound formats into the query
is supposed to be allowed but there's no harm in protecting ourselves
from it.
Reviewed-by: Lionel Landwerlin <[email protected]>
Bugzilla: https://bugs.freedesktop.org/101956
Cc: [email protected]
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of saving primitive offset in the minmax cache key,
save the actual buffer offset which is used in the cache lookup.
Fixes rendering artifact seen with GoogleEarth when run with
VMware driver.
v2: Per Brian's comment, initialize offset to avoid compiler warning.
Cc: [email protected]
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
| |
error: incompatible pointer to integer conversion initializing 'VkFence'
(aka 'unsigned long long') with an expression of type 'void *' [-Werror,-Wint-conversion]
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
When the kernel supports it set the local flag and
stop adding those BOs to the BO list.
Can probably be optimized much more.
v2: rename new flag to AMDGPU_GEM_CREATE_VM_ALWAYS_VALID
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
For lower overhead in the CS ioctl.
Winsys allocators are not used with interprocess-sharable resources.
v2: It shouldn't crash anymore, but the kernel will reject the new flag.
v3 (christian): Rename the flag, avoid sending those buffers in the BO list.
v4 (christian): Remove setting the kernel flag for now
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
Improves performance of GFXBench4 tests at 1024x768 on a Kabylake GT2:
- Manhattan 3.1 by 1.32134% +/- 0.322734% (n=8).
- Car Chase by 1.25607% +/- 0.291262% (n=5).
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When we blit data into a buffer object, we may need to invalidate any
caches that might contain stale data, so the new data becomes visible.
For example, if the buffer object is bound as a vertex buffer, we need
to invalidate the vertex fetch cache.
While this flushing was missing, it usually happened implicitly for
non-obvious reasons: we're usually on the render ring, and calling
intel_emit_linear_blit() would require switching to the BLT ring,
causing an implicit flush. This likely provoked the kernel to do
PIPE_CONTROLs on our behalf. Although, Gen4-5 wouldn't have this
behavior. At any rate, we should do it ourselves.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Although we're phasing out brw_emit_mi_flush(), we still use it in some
places in order to "flush everything". In a number of those places, we
write data to a buffer that we may then bind as an image surface, SSBO,
or atomic buffer. Those usages require us to flush the data cache.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
| |
This exposes the new blorp_copy_buffer() functionality to i965.
It should be a drop-in replacement for intel_emit_linear_blit()
(other than the arguments being backwards, for consistency with BLORP).
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
| |
Gen4-6 can only handle surfaces up to 8192. Only Gen7+ can do 16384.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
I want to be able to copy between buffer objects using BLORP in the i965
driver. Anvil already had code to do this, in a reasonably efficient
manner - first using large bpp copies, then smaller bpp copies.
This patch moves that logic into BLORP as blorp_buffer_copy(), so we
can use it in both drivers.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently if table_size is 0, it's falling through to:
unreachable("hash table should never be full");
But table_size can be 0 when RADV_DEBUG=nocache is set, or when the
table allocation fails (which is not considered an error).
Fixes: f4e499ec791 "radv: add initial non-conformant radv vulkan driver"
Signed-off-by: Grazvydas Ignotas <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
Use MAX2() because sampleCount will be zero for non-MSAA surfaces.
No Piglit regressions.
Reviewed-by: Charmaine Lee <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We need to take some take here as brw->is_broxton has been used to
check whether the device is a low power gen9 (aka Atom gen9 platform).
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This reverts commit 13c23b19d0b3b965d666498eb759e63fc4a625d9.
Mesa CI was brought down by this commit, with:
mesa/drivers/dri/i965/brw_sync.c:491: brw_dri_create_fence_fd:
Assertion `brw->screen->has_exec_fence' failed.
|
|
|
|
|
|
|
|
|
|
|
| |
For Gen8, add 2xMSAA. For Gen9, add 2xMSAA and 16xMSAA.
Special thanks to Eero Tamminen for reporting rasterizer
numbers being twice what it should be for 2xMSAA under
a benchmark.
V2: Make pointer name less ugly + add 2xMSAA for Gen8
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This reverts commit f6d38785e8b28a6dd303884798b823e289817741.
Kevin's original patch accidentally didn't add 2x for Gen8; he sent
a v2 with a bunch of style fixes shortly after I pushed the original
patch, not knowing it was coming. Let's just revert this one, apply
v2, and move on.
|
|
|
|
|
|
|
| |
Fixes: 0ac78dc92582a59d4319 "util: move string_to_uint_map to glsl"
Cc: Emil Velikov <[email protected]>
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
dri2_display_destroy may be called by dri2_initialize_wayland_drm() if
initialization fails. In this case, these objects may not be initialized.
Cc: [email protected]
Signed-off-by: Michael Olbrich <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add plumbing to allow creation of per display surface out fence.
Currently enabled only on android, since the system expects a valid
fd in ANativeWindow::{queue,cancel}Buffer. We pass a fd of -1 with
which native applications such as flatland fail. The patch enables
explicit sync on android and fixes one of the functional issue for
apps or buffer consumers which depend upon fence and its timestamp.
v2: a) Also implement the fence in cancelBuffer.
b) The last sync fence is stored in drawable object
rather than brw context.
c) format clear.
v3: a) Save the last fence fd in DRI Context object.
b) Return the last fence if the batch buffer is empty and
nothing to be flushed when _intel_batchbuffer_flush_fence
c) Add the new interface in vbtl to set the retrieve fence
v3.1 a) close fd in the new vbtl interface on none Android platform
v4: a) The last fence is saved in brw context.
b) The retrieve fd is for all the platform but not just Android
c) Add a uniform dri2 interface to initialize the surface.
v4.1: a) make some changes of variable name.
b) the patch is broken into two patches.
v4.2: a) Add a deinit interface for surface to clear the out fence
v5: a) Add enable_out_fence to init, platform sets it true or
false
b) Change get fd to update fd and check for fence
c) Commit description updated
v6: a) Heading and commit description updated
b) enable_out_fence is set only if fence is supported
c) Review comments on function names
d) Test with standalone patch, resolves the bug
v6.1: Check for old display fence reverted
v6.2: enable_out_fence initialized to false by default,
dri2_surf_update_fence_fd updated, deinit changed to fini
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101655
Signed-off-by: Zhongmin Wu <[email protected]>
Signed-off-by: Yogesh Marathe <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
Reviewed-by: Tomasz Figa <[email protected]>
|
|
|
|
|
|
|
| |
Only useful when that debug option is enabled.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This fixes a rendering issue with Hitman when bindless textures
are enabled.
Fixes: 2263610827 ("radeonsi: flush DB caches only when transitioning from DB to texturing")
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This structure contains two fields, binding and index, that store the
binding in the descriptor set and the index inside the binding.
These structures are defined as uint8_t, but the types in Vulkan
specification are uint32_t, so big values are clamp.
This fixes dEQP-VK.binding_model.shader_access.*.multiple_arbitrary_descriptors.*
v2: use UINT32_MAX for index when having no render targets (Tapani)
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If llvmpipe_set_scissor_states() is never called, we still need to be sure
that derived scissor/clip state is updated. As of commit 743ad599a97d09b1
that function might not be called.
Fixes regressed Piglit gl-1.0-scissor-offscreen -fbo -auto test.
Reviewed-by: Roland Scheidegger <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101709
Fixes: 743ad599a97 ("st/mesa: don't set 16 scissors and 16 viewports
if they're unused")
Cc: "17.2" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Our initial size of 4kB is way too small to do anything useful, so we
end up growing it at least a few times. We may as well start it larger.
Some data points:
- Dinoshade (from Mesa Demos): hit 8kB.
- Chromium 60: hit 16kB after browsing a few things in Google Docs.
- GFXBench4 TRex/Manhattan 3.1: hit 128kB
- Unigine Valley 1.0: hit 512kB
It might make sense to start it even larger.
Acked-by: Matt Turner <[email protected]>
|