| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Backing views/surfaces are used to handle the case when a resource is
bound both as a render target and as a sampler source (such as when
doing auto mipmap generation).
This patch fixes a bug where mapping a resource (to do a glReadPixels)
was reading the stale data in the original surface rather than the
backing surface which was rendered to.
We need to propagate the backing resource (which we rendered to) back
to the original resource before we read from it. The problem was the
svga_propagate_rendertargets() function was examining the wrong surface
views.
This fixes the "poc9" test described in VMware bug 1686661.
Also tested with Piglit, Cinebench, Lightsmark, etc.
Reviewed-by: Charmaine Lee <[email protected]>
|
|
|
|
|
|
|
| |
Put new svga_propagate_rendertargets() function where all the other
surface propagation code lives.
Reviewed-by: Charmaine Lee <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This should be all that is required for cull distances to work
on radeonsi.
v1.1: whitespace cleanup, add docs fix clipdist_mask usage.
Reviewed-by: Marek Olšák <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
| |
v2: document the new cap
v3: fix 80 char limit in screen.rst
Signed-off-by: Jan Vesely <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Acked-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Reviewed-by: Charmaine Lee <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Small code clean up that removes magic numbers where a TGSI
opcode has been defined.
No functional change expected as each opcode is unsupported on
the respective hardware.
Signed-off-by: Rhys Kidd <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Tested-by: James Harvey <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As reported by Clang, TGSI_OPCODE_DFMA (defined magic number 118) is
currently initialized twice for Cayman and Evergreen.
When Jan Vesely added double precision FMA opcode it did make sense
to locate it immediately after TGSI_OPCODE_DMAD, although this is
out of order.
This change cleans up the prior magic number definition and ensures
any later reordering of this struct will not create problems.
Prior change was:
commit 015e2e0fce3eea7884f8df275c2fadc35143a324
Author: Jan Vesely <[email protected]>
Date: Sat Jul 2 16:14:54 2016 -0400
r600g: Add double precision FMA ops
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96782
Fixes: 54c4d525da7c7fc1e103d7a3e6db015abb132d5d ("r600g: Enable FMA on chips that support it")
Signed-off-by: Jan Vesely <[email protected]>
Tested-by: James Harvey <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
Signed-off-by: Rhys Kidd <[email protected]>
Reviewed-by: Jan Vesely <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Tested-by: James Harvey <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Noticed this error in a debug message whilst reviewing
https://bugs.freedesktop.org/show_bug.cgi?id=97477
This patch doesn't go towards fixing that bug, but at
least may clarify future debug output.
Signed-off-by: Rhys Kidd <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
| |
I missed this while adding loop support because the discard test inside a
loop was crashing before, anyway. Fixes piglit glsl-fs-discard-04.
|
| |
|
|
|
|
|
|
|
| |
Only rasterize scissor edges if one or more scissor/viewport
rects are not hottile aligned.
Signed-off-by: Tim Rowley <[email protected]>
|
|
|
|
| |
Signed-off-by: Tim Rowley <[email protected]>
|
|
|
|
|
|
|
| |
Implement SCATTERPS as a dynamic loop based on mask set bits
instead of a static compile time loop.
Signed-off-by: Tim Rowley <[email protected]>
|
|
|
|
|
|
| |
Fixes upper left rule for scissors and viewport/scissor macrotile alignment.
Signed-off-by: Tim Rowley <[email protected]>
|
|
|
|
| |
Signed-off-by: Tim Rowley <[email protected]>
|
|
|
|
| |
Signed-off-by: Tim Rowley <[email protected]>
|
|
|
|
|
|
| |
Use dynamic memory allocation for per-thread data
Signed-off-by: Tim Rowley <[email protected]>
|
|
|
|
| |
Signed-off-by: Tim Rowley <[email protected]>
|
|
|
|
|
|
|
|
| |
- use per-primitive viewports throughout the pipeline.
- track whether all available scissor rects are tile aligned.
Causes failures, so not taken into account when choosing rasterizer yet.
Signed-off-by: Tim Rowley <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We were allocating global variables for the maximum LDS size
which made the compiler think we were using all of LDS, which
isn't the case.
Reviewed-By: Edward O'Callaghan <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
| |
|
| |
|
|
|
|
|
| |
Signed-off-by: Kai Wasserbäch <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Kai Wasserbäch <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Kai Wasserbäch <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
v1 → v2:
- Fixed indentation (noted by Brian Paul)
- Removed second assert from nouveau's switch statements (suggested by
Brian Paul)
Signed-off-by: Kai Wasserbäch <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
| |
This fixes: GL45-CTS.texture_barrier.*
Tested-by: Michel Dänzer <[email protected]>
|
|
|
|
| |
Reviewed-by: Neha Bhende <[email protected]>
|
|
|
|
|
|
|
| |
Move computation of zslice, layer inside the conditional where they're
used.
Reviewed-by: Neha Bhende <[email protected]>
|
|
|
|
|
|
| |
Rewrite the comment too.
Reviewed-by: Neha Bhende <[email protected]>
|
|
|
|
|
|
| |
Give more info about backing resources/surfaces.
Reviewed-by: Neha Bhende <[email protected]>
|
|
|
|
| |
Reviewed-by: Neha Bhende <[email protected]>
|
|
|
|
| |
Reviewed-by: Neha Bhende <[email protected]>
|
|
|
|
|
|
|
| |
We don't do this for other cast wrappers. And this will simplify some
code at call sites.
Reviewed-by: Neha Bhende <[email protected]>
|
|
|
|
|
|
| |
Use the tex variable instead of using svga_texture() again.
Reviewed-by: Neha Bhende <[email protected]>
|
|
|
|
| |
Reviewed-by: Neha Bhende <[email protected]>
|
|
|
|
|
|
| |
tex was already declared at the function body scope.
Reviewed-by: Neha Bhende <[email protected]>
|
|
|
|
|
|
| |
Put near other assignments to the svga_transfer variable.
Reviewed-by: Neha Bhende <[email protected]>
|
|
|
|
|
|
| |
Use local vars instead of jumping through a pointer.
Reviewed-by: Neha Bhende <[email protected]>
|
|
|
|
| |
Reviewed-by: Neha Bhende <[email protected]>
|
|
|
|
|
|
| |
To make it symmetric with the svga_texture() cast wrapper.
Reviewed-by: Neha Bhende <[email protected]>
|
|
|
|
|
|
| |
To simplify the code a bit.
Reviewed-by: Neha Bhende <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a direct port of Marek Olšáks patch
"radeonsi: increase performance for DRI PRIME
offloading if 2nd GPU is CIK or VI" to r600.
It uses SDMA for the detiling blit from renderoffload VRAM
to GTT, as SDMA is much faster for tiled->linear blits from
VRAM to GTT.
Testing on a dual Radeon HD-5770 setup reduced the time
for the render offload gpu to get its rendering into
system RAM from approximately 16 msecs for simple rendering
at 1920x1080 pixel 32 bpp to 5 msecs, a > 3x speedup!
This was measured using ftrace to trace the time the radeon kms
driver waited on the dmabuf fence of the renderoffload gpu to
complete.
All in all this brought the time for a flip down from 20 msecs
to 9 msecs, so the prime setup can display at full 60 fps instead
of barely 30 fps vsync'ed.
The current r600 implementation supports SDMA on Evergreen and
later, but not R600/R700 due to some bugs apparently present
in their SDMA implementation.
Signed-off-by: Mario Kleiner <[email protected]>
Cc: Marek Olšák <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
This file was supposed to be added with the previous "svga: add guest
statistic gathering interface" patch but went MIA for some reason.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
| |
If the kernel driver doesn't support it, it returns 0.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
| |
This was missed while rewriting the PIPE_DUMP flags.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
SDMA is much faster for tiled->linear blits from VRAM to GTT.
I have Bonaire in my second PCIe slot.
$ glxinfo | grep OpenGL.renderer
OpenGL renderer string: Gallium 0.4 on AMD TONGA ...
$ DRI_PRIME=1 glxinfo | grep OpenGL.renderer
OpenGL renderer string: Gallium 0.4 on AMD BONAIRE ...
Without SDMA:
$ DRI_PRIME=1 glxgears
8796 frames in 5.0 seconds = 1759.074 FPS
8899 frames in 5.0 seconds = 1779.672 FPS
With SDMA:
$ DRI_PRIME=1 glxgears
12765 frames in 5.0 seconds = 2552.788 FPS
12888 frames in 5.0 seconds = 2577.495 FPS
The 1st GPU is irrelevant. The improvement should be much lower at 60 fps,
but definitely measurable.
SI will get this once we add SDMA blit support for it.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
| |
It passes R600_DEBUG=testdma on Bonaire/radeon.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
| |
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|