| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes GPUVM conflicts with non-4K page size.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92738
v2: Replace sanitization of VM base address alignment with comment why
that's not necessary.
v3: Use unsigned instead of long as the type for the size_align member.
(Marek)
Cc: [email protected]
Reviewed-by: Christian König <[email protected]> (v1)
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
This is more complicated, because tracking priority_usage needed changing
the relocs_bo type.
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
| |
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
| |
Use the priority flags and expand them.
This information will be used for debugging.
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
| |
Cc: 11.0 <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
|
|
|
|
|
|
|
| |
This is always false on amdgpu (set by calloc).
Reviewed-by: Alex Deucher <[email protected]>
Acked-by: Christian König <[email protected]>
|
|
|
|
| |
Reviewed-by: Alex Deucher <[email protected]>
|
|
|
|
|
| |
Acked-by: Christian König <[email protected]>
Acked-by: Alex Deucher <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This reverts commit 567394112d904096abff1d994ab952f475dfb444.
It regressed performance. It looks like smaller IBs are better, because
the GPU goes idle quicker and there is less waiting for buffers and fences.
Cc: 11.0 <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This is an internal project that Catalyst uses and now open source will do
too.
v2: squashed these commits in:
- winsys/amdgpu: fix warnings in addrlib
- winsys/amdgpu: set PIPE_CONFIG and NUM_BANKS in tiling_flags
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
v2: - lots of changes according to Emil Velikov's comments
- implemented radeon_winsys::read_registers
v3: - a lot of new work, many of them adapt to libdrm interface changes
Squashed patches:
winsys/amdgpu: implement radeon_winsys context support
winsys/amdgpu: add reference counting for contexts
winsys/amdgpu: add userptr support
winsys/amdgpu: allocate IBs like normal buffers
winsys/amdgpu: add IBs to the buffer list, adapt to interface changes
winsys/amdgpu: don't use KMS handles as reloc hash keys
winsys/amdgpu: sync buffer accesses to different rings
winsys/amdgpu: use dependencies instead of waiting for last fence v2
gallium/radeon: unify buffer_wait and buffer_is_busy in the winsys interface (amdgpu part)
winsys/amdgpu: track fences per ring and be thread-safe
winsys/amdgpu: simplify waiting on a variable in amdgpu_fence_wait
gallium/radeon: allow the winsys to choose the IB size (amdgpu part)
winsys/amdgpu: switch to new amdgpu_cs_query_fence_status interface
winsys/amdgpu: handle fence and dependencies merge
winsys/amdgpu follow libdrm change to move user fence into UMD
winsys/amdgpu: use amdgpu_bo_va_op for va map/unmap v2
winsys/amdgpu: use the new tiling flags
winsys/amdgpu: switch to new GTT_USWC definition
winsys/amdgpu: expose amdgpu_cs_query_reset_state to drivers
winsys/amdgpu: fix valgrind warnings
winsys/amdgpu: don't use VRAM with APUs that don't have much of it
winsys/amdgpu: require LLVM 3.6.1 for VI because of bug fixes there
winsys/amdgpu: remove amdgpu_winsys::num_cpus
winsys/amdgpu: align BO size to page size
winsys/amdgpu: reduce BO cache timeout
winsys/amdgpu: remove useless flushing and waiting in amdgpu_bo_set_tiling
winsys/amdgpu: use amdgpu_device_handle as a unique device ID instead of fd
winsys/amdgpu: use safer access to amdgpu_fence_wait::signalled
winsys/amdgpu: allow maximum IB size of 4 MB
winsys/amdgpu: add ip_instance into amdgpu_fence
gallium/radeon: add RING_COMPUTE instead of RADEON_FLUSH_COMPUTE
winsys/amdgpu: set the ring type at CS initilization
winsys/amdgpu: query the GART page size from the kernel
winsys/amdgpu: correctly wait for shared buffers to become idle
winsys/amdgpu: set the amdgpu_cs_fence structure only once at fence creation
winsys/amdgpu: add a specific error message for cs_submit -> -ENOMEM
winsys/amdgpu: check num_active_ioctls before calling amdgpu_bo_wait_for_idle
winsys/amdgpu: clear user fence BO after allocating it
winsys/amdgpu: fix user fences
winsys/amdgpu: make amdgpu_winsys_create public
winsys/amdgpu: remove thread offloading
winsys/amdgpu: flatten the amdgpu_cs_context structure and simplify more
v4: require libdrm 2.4.63
|
|
|
|
|
| |
Reviewed-by: Michel Dänzer <[email protected]>
Reviewed-by: Christian König <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Same idea as in libdrm_amdgpu.
A command stream can only be created for a specific context and it's always
submitted to that context.
This will mainly be used by amdgpu and it's required by the GPU reset status
query too.
(radeon only has a basic version of the query and thus doesn't need this)
Reviewed-by: Christian König <[email protected]>
|
|
|
|
|
|
| |
The timeout parameter covers both cases.
Reviewed-by: Alex Deucher <[email protected]>
|
|
|
|
| |
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
| |
Luckily, there is a kernel query, so use the size from that.
It currently returns 256KB. It can be increased in the kernel.
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
| |
Picked from the amdgpu branch.
Reviewed-by: Alex Deucher <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Earlier commit added an extra dup(fd) to fix a ZaphodHeads issue.
Although it did not consider the (very unlikely) case where we might end
up with the valid fd == 0.
Fixes: 28dda47ae4d(winsys/radeon: Use dup fd as key in drm-winsys hash
table to fix ZaphodHeads.)
Cc: 10.6 <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Reviewed-by: Mario Kleiner <[email protected]>
|
|
|
|
|
|
|
| |
This has been a no-op due to performance concerns. From now on, drivers
should decide when they don't want to unmap, not the winsys.
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Generated by running:
git grep -l INLINE src/gallium/ | xargs sed -i 's/\bINLINE\b/inline/g'
git grep -l INLINE src/mesa/state_tracker/ | xargs sed -i 's/\bINLINE\b/inline/g'
git checkout src/gallium/state_trackers/clover/Doxyfile
and manual edits to
src/gallium/include/pipe/p_compiler.h
src/gallium/README.portability
to remove mentions of the inline define.
Signed-off-by: Ilia Mirkin <[email protected]>
Acked-by: Marek Olšák <[email protected]>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Same problem and fix as for nouveau's ZaphodHeads trouble.
See patch ...
"nouveau: Use dup fd as key in drm-winsys hash table to fix ZaphodHeads."
... for reference.
Cc: "10.3 10.4 10.5 10.6" <[email protected]>
Signed-off-by: Mario Kleiner <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
1000 ms is an extreme value for typical interactive loads. A large
cache has some disadvantages. Search for reusable BOs can take a long
time and memory might get exhausted.
Let's be rather conservative and use half of the old value,
500ms. This is beneficial to some loads on my test system and there
are no regressions.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is the basic granularity for BO allocations. The alignment also
helps with BO reuse by the cached bufmgr.
This results in a huge 45% speedup in Metro 2033 Redux on my test
system. The game relies on buffer orphaning with very small buffers
(hundreds of bytes in size) and that did not work efficiently
before. This change may also affect other applications and games.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
But only when doing so is safe according to the
RADEON_INFO_VA_UNMAP_WORKING kernel query.
This avoids kernel GPU VM address range conflicts when the BO has other
references than the GEM handle being closed, e.g. when the BO is shared.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90537
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90873
Cc: "10.5 10.6" <[email protected]>
Reviewed-by: Christian König <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The linked list in gallium is pretty much the kernel list and we would like
to have a C-based linked list for all of mesa. Let's not duplicate and
just steal the gallium one.
Acked-by: Connor Abbott <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
| |
|
| |
|
|
|
|
| |
Reviewed-by: Alex Deucher <[email protected]>
|
|
|
|
| |
Reviewed-by: Alex Deucher <[email protected]>
|
|
|
|
| |
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
|
|
| |
... to manage the LIBDRM*_CFLAGS. The former is the recommended approach
by the Android build system developers while the latter has been
depreciated for quite some time.
Cc: "10.4 10.5" <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
| |
There is no other way to check for support.
Reviewed-by: Christian König <[email protected]>
|
|
|
|
|
|
| |
This is not required, but being user-friendly doesn't hurt.
Reviewed-by: Christian König <[email protected]>
|
|
|
|
|
|
| |
OpenGL requires this.
Reviewed-by: Christian König <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This limits the style changes to modes inherited from prog-mode. The
main reason to do this is to avoid setting fill-column for people
using Emacs to edit commit messages because 78 characters is too many
to make it wrap properly in git log. Note that makefile-mode also
inherits from prog-mode so the fill column should continue to apply
there.
v2: Apply to all the .dir-locals.el files, not just the one in the
root directory.
Acked-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
| |
The code was exactly the same, except util/ has c++ guards and a struct
simple_node declaration.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
This should fix this performance regression:
https://bugs.freedesktop.org/show_bug.cgi?id=88227
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
| |
Android builds Mesa from git, so there don't need to be in the tarball.
|
|
|
|
|
|
|
|
| |
All uses of this require that the value be at least one, so it's
easier to report at least one than having to wrap all uses
in MAX2(max_compute_units, 1).
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Harvested GPUs have some of their render backends disabled, so
in order to prevent the hardware from trying to render things
with these disabled backends we need to correctly program
the PA_SC_RASTER_CONFIG register.
v2:
- Write RASTER_CONFIG for all SEs.
v3:
- Set GRBM_GFX_INDEX.INSTANCE_BROADCAST_WRITES bit.
- Set GRBM_GFX_INFEX.SH_BROADCAST_WRITES bit when done setting
PA_SC_RASTER_CONFIG.
- Get num_se and num_sh_per_se from kernel.
v4:
- Get correct value for num_se
- Remove loop for setting PA_SC_RASTER_CONFIG
- Only compute raster config when a backend has been disabled.
v5: Michel Dänzer
- Fix computation for chips with multiple SEs
https://bugs.freedesktop.org/show_bug.cgi?id=60879
CC: "10.4 10.3" <[email protected]>
|
|
|
|
|
|
| |
It's annoying with octave. Reported by Michael Burian.
Cc: 10.2 10.3 <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The trick is to generate a unique buffer usage value for each possible
combination of domains and flags, with only one bit set each for the
domains and flags. This ensures pb_check_usage() only returns TRUE when
the domains and flags the cached buffer was created for exactly match
the requested ones.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
Otherwise the caching buffer manager may return a buffer which was created
with a different set of flags, which can cause trouble.
Cc: [email protected]
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
This lets the kernel know that such BOs can be pinned outside of the CPU
accessible part of VRAM.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
Making sure large file support is enabled across the tree even on 32-bit
systems.
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
| |
This allows the kernel to prevent such BOs from ever being stored in the
CPU inaccessible part of VRAM.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
This should make a machine which is running piglit more responsive at times.
e.g. streaming-texture-leak can easily eat 600 MB because of how fast it
creates new textures.
|