| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a direct port of Marek Olšáks patch
"radeonsi: increase performance for DRI PRIME
offloading if 2nd GPU is CIK or VI" to r600.
It uses SDMA for the detiling blit from renderoffload VRAM
to GTT, as SDMA is much faster for tiled->linear blits from
VRAM to GTT.
Testing on a dual Radeon HD-5770 setup reduced the time
for the render offload gpu to get its rendering into
system RAM from approximately 16 msecs for simple rendering
at 1920x1080 pixel 32 bpp to 5 msecs, a > 3x speedup!
This was measured using ftrace to trace the time the radeon kms
driver waited on the dmabuf fence of the renderoffload gpu to
complete.
All in all this brought the time for a flip down from 20 msecs
to 9 msecs, so the prime setup can display at full 60 fps instead
of barely 30 fps vsync'ed.
The current r600 implementation supports SDMA on Evergreen and
later, but not R600/R700 due to some bugs apparently present
in their SDMA implementation.
Signed-off-by: Mario Kleiner <[email protected]>
Cc: Marek Olšák <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
| |
v2:
* Check gen <= 7, rather than gen == 7. (Ian)
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
| |
v2:
* Cleanups suggested by Ian, Matt and Topi
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
For gen < 8, we can't sample from the stencil buffer, which is
required for the ARB_stencil_texturing extension. We'll make a copy of
the stencil data into a new texture that we can sample using the
R8_UINT surface type.
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
| |
v2:
* has_component (Ken); const bits_per_channel (Topi)
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
thread_width_max in the GPGPU walker command limits us to a maximum of
64 threads.
This fixes a crash on Haswell in the OpenGLES 3.1 conformance test
suite which tests the advertised limits of the max invocation counts.
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
| |
This file was supposed to be added with the previous "svga: add guest
statistic gathering interface" patch but went MIA for some reason.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
| |
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Marek Olšák <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
|
| |
If the kernel driver doesn't support it, it returns 0.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
| |
This was missed while rewriting the PIPE_DUMP flags.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
SDMA is much faster for tiled->linear blits from VRAM to GTT.
I have Bonaire in my second PCIe slot.
$ glxinfo | grep OpenGL.renderer
OpenGL renderer string: Gallium 0.4 on AMD TONGA ...
$ DRI_PRIME=1 glxinfo | grep OpenGL.renderer
OpenGL renderer string: Gallium 0.4 on AMD BONAIRE ...
Without SDMA:
$ DRI_PRIME=1 glxgears
8796 frames in 5.0 seconds = 1759.074 FPS
8899 frames in 5.0 seconds = 1779.672 FPS
With SDMA:
$ DRI_PRIME=1 glxgears
12765 frames in 5.0 seconds = 2552.788 FPS
12888 frames in 5.0 seconds = 2577.495 FPS
The 1st GPU is irrelevant. The improvement should be much lower at 60 fps,
but definitely measurable.
SI will get this once we add SDMA blit support for it.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
| |
It passes R600_DEBUG=testdma on Bonaire/radeon.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
| |
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
| |
there's no reason to separate these
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Fixes conditional jump depending on uninitialized value
in si_state_draw.c:593
Cc: <[email protected]>
Signed-off-by: Miklós Máté <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This regression is caused because of commit 3190c7ee9727161d627f107c2e7f8ec3a11941c1
Regression caused by following OpenGL 4.4 spec rules relates to
GL_FRAMEBUFFER_SRGB in Mesa.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Charmaine Lee <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Charmaine Lee <[email protected]>
|
|
|
|
| |
Both are zero, but the later is the right token.
|
| |
|
|
|
|
|
|
|
| |
String for SVGA_STATS_COUNT_TEXREADBACK was swapped
with the string for SVGA_STATS_COUNT_SURFACEWRITEFLUSH.
Trivial fix.
|
|
|
|
|
|
| |
Tested with Lightsmark2008, Heaven, MTT piglit, glretrace, viewperf, conform.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
| |
This patch adds a cleanup function to clean up sampler state at
context destruction time.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Fixes piglit spec/ext_transform_feedback/overflow-edge-cases segfaults
because the query's fence pointer was null.
Tested with Piglit, Sauerbraten, ETQW.
Reviewed-by: Charmaine Lee <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
We don't want to flush the command buffer or sync on the fence when ending
a query (that kind of defeats the whole purpose of async queries). Do that
instead in get_query_result().
Tested with Piglit, arbocclude, Sauerbraten game, Nobel Clinician Viewer,
ETQW.
Reviewed-by: Charmaine Lee <[email protected]>
|
|
|
|
|
|
|
|
| |
This patch avoid emitting redundant DXSetSamplers command.
Tested with Lightsmark2008, Heaven, MTT piglit, glretrace, viewperf.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Charmaine Lee <[email protected]>
|
|
|
|
|
|
|
| |
Put all the clearing related functions in svga_init_clear_functions()
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Charmaine Lee <[email protected]>
|
|
|
|
|
|
|
|
|
| |
define svga_init_clear_functions()
and svga_clear_texture as svga->pipe.clear_texture. This is part of
ARB_clear_texture extension
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Charmaine Lee <[email protected]>
|
|
|
|
|
|
|
|
|
| |
To clear texture this function can be used. This is part of
ARB_clear_texture extension. Basically this extension allows you to
clear texture with given color values.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Charmaine Lee <[email protected]>
|
|
|
|
|
|
|
|
| |
Saving all blitter states will be done in begin_blit() so that
begin_blit() can be used before performing any blit operation.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Charmaine Lee <[email protected]>
|
|
|
|
|
|
| |
For opt build, add VMX86_STATS to the list of cpp defines.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
With this patch, guest statistic gathering interface is added to
svga winsys interface that can be used to gather svga driver
statistic. The winsys module can then share the statistic info with
the VMX host via the mksstats interface.
The statistic enums used in the svga driver are defined in
svga_stats_count and svga_stats_time in svga_winsys.h
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
If the shader has indirect access to non-indexable temporaries,
convert these non-indexable temporaries to indexable temporary array.
This works around a bug in the GLSL->TGSI translator.
Fixes glsl-1.20/execution/fs-const-array-of-struct-of-array.shader_test
on DX11Renderer.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
| |
To silence unused var warning with MSVC, MinGW.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
From about kernel 4.9, GTT mmaps are virtually unlimited. A new
parameter, I915_PARAM_MMAP_GTT_VERSION, is added to advertise the
feature so query it and use it to avoid limiting tiled allocations to
only fit within the mappable aperture.
A couple of caveats:
- fence support is still limited by stride to 262144 and the stride
needs to be a multiple of tile_width (as before, and same limitation as
the current 3D pipeline in hardware)
- the max_gtt_map_object_size forcing untiled may be hiding a few bugs
in handling of large objects, though none were spotted in piglits.
See kernel commit 4cc6907501ed ("drm/i915: Add I915_PARAM_MMAP_GTT_VERSION
to advertise unlimited mmaps").
v2: Include some commentary on mmap virtual space vs CPU addressable
space.
Signed-off-by: Chris Wilson <[email protected]>
Cc: Kenneth Graunke <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Daniel Vetter <[email protected]>
|