| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes long standing bug on NV10 and NV20 where using a non-1x RGB or A
post-scale with GL_DOT3_RGB or GL_DOT3_RGBA texture environment would
not work.
The old combiner math uses HALF_BIAS_NORMAL and HALF_BIAS_NEGATE. The
GL_NV_register_combiners defines these as
HALF_BIAS_NORMAL_NV max(0.0, e) - 0.5
HALF_BIAS_NEGATE_NV -max(0.0, e) + 0.5
In order to get the correct result from the dot-product, the
intermediate dot-product must be multiplied by 4. This is a literal
implementation of the GL_ARB_texture_env_dot3 spec. It also requires
using the register combiner post-scale. As a result, the post-scale
cannot be used for the post-scale set by the application.
The new combiner math uses EXPAND_NORMAL and EXPAND_NEGATE. The
GL_NV_register_combiners defines these as
EXPAND_NORMAL_NV 2.0 * max(0.0, e) - 1.0
EXPAND_NEGATE_NV -2.0 * max(0.0, e) + 1.0
Since this fully expands the value to [-1, 1] range, the intermediate
dot-product result is the desired value. This leaves the register
combiner post-scale available for application use.
NOTE: I have not actually tested this.
Signed-off-by: Ian Romanick <[email protected]>
Acked-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Ian Romanick <[email protected]>
Suggested-by: Ilia Mirkin <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
|
| |
v2: Note that GL_KHR_blend_equation_advanced and
GL_KHR_blend_equation_advanced_coherent are done.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Changed "Mesa mailing list" to "mesa-dev mailing list" to clarify
which list patches should be sent to
* Added an explicit link to
https://lists.freedesktop.org/mailman/listinfo/mesa-dev to show
where to subscribe to the list
* Added a link to https://git-scm.com/docs/git-send-email to help new
users of that command
v2: add signed-off-by
Signed-off-by: Nicholas Bishop <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
| |
Reviewed-by: Neha Bhende <[email protected]>
|
|
|
|
|
|
|
| |
Move computation of zslice, layer inside the conditional where they're
used.
Reviewed-by: Neha Bhende <[email protected]>
|
|
|
|
|
|
| |
Rewrite the comment too.
Reviewed-by: Neha Bhende <[email protected]>
|
|
|
|
|
|
| |
Give more info about backing resources/surfaces.
Reviewed-by: Neha Bhende <[email protected]>
|
|
|
|
| |
Reviewed-by: Neha Bhende <[email protected]>
|
|
|
|
| |
Reviewed-by: Neha Bhende <[email protected]>
|
|
|
|
|
|
|
| |
We don't do this for other cast wrappers. And this will simplify some
code at call sites.
Reviewed-by: Neha Bhende <[email protected]>
|
|
|
|
|
|
| |
Use the tex variable instead of using svga_texture() again.
Reviewed-by: Neha Bhende <[email protected]>
|
|
|
|
| |
Reviewed-by: Neha Bhende <[email protected]>
|
|
|
|
|
|
| |
tex was already declared at the function body scope.
Reviewed-by: Neha Bhende <[email protected]>
|
|
|
|
|
|
| |
Put near other assignments to the svga_transfer variable.
Reviewed-by: Neha Bhende <[email protected]>
|
|
|
|
|
|
| |
Use local vars instead of jumping through a pointer.
Reviewed-by: Neha Bhende <[email protected]>
|
|
|
|
| |
Reviewed-by: Neha Bhende <[email protected]>
|
|
|
|
|
|
| |
To make it symmetric with the svga_texture() cast wrapper.
Reviewed-by: Neha Bhende <[email protected]>
|
|
|
|
|
|
| |
To simplify the code a bit.
Reviewed-by: Neha Bhende <[email protected]>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a direct port of Marek Olšáks patch
"radeonsi: increase performance for DRI PRIME
offloading if 2nd GPU is CIK or VI" to r600.
It uses SDMA for the detiling blit from renderoffload VRAM
to GTT, as SDMA is much faster for tiled->linear blits from
VRAM to GTT.
Testing on a dual Radeon HD-5770 setup reduced the time
for the render offload gpu to get its rendering into
system RAM from approximately 16 msecs for simple rendering
at 1920x1080 pixel 32 bpp to 5 msecs, a > 3x speedup!
This was measured using ftrace to trace the time the radeon kms
driver waited on the dmabuf fence of the renderoffload gpu to
complete.
All in all this brought the time for a flip down from 20 msecs
to 9 msecs, so the prime setup can display at full 60 fps instead
of barely 30 fps vsync'ed.
The current r600 implementation supports SDMA on Evergreen and
later, but not R600/R700 due to some bugs apparently present
in their SDMA implementation.
Signed-off-by: Mario Kleiner <[email protected]>
Cc: Marek Olšák <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
| |
v2:
* Check gen <= 7, rather than gen == 7. (Ian)
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
| |
v2:
* Cleanups suggested by Ian, Matt and Topi
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
For gen < 8, we can't sample from the stencil buffer, which is
required for the ARB_stencil_texturing extension. We'll make a copy of
the stencil data into a new texture that we can sample using the
R8_UINT surface type.
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
| |
v2:
* has_component (Ken); const bits_per_channel (Topi)
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
thread_width_max in the GPGPU walker command limits us to a maximum of
64 threads.
This fixes a crash on Haswell in the OpenGLES 3.1 conformance test
suite which tests the advertised limits of the max invocation counts.
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
| |
This file was supposed to be added with the previous "svga: add guest
statistic gathering interface" patch but went MIA for some reason.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
| |
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Marek Olšák <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
|
| |
If the kernel driver doesn't support it, it returns 0.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
| |
This was missed while rewriting the PIPE_DUMP flags.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
SDMA is much faster for tiled->linear blits from VRAM to GTT.
I have Bonaire in my second PCIe slot.
$ glxinfo | grep OpenGL.renderer
OpenGL renderer string: Gallium 0.4 on AMD TONGA ...
$ DRI_PRIME=1 glxinfo | grep OpenGL.renderer
OpenGL renderer string: Gallium 0.4 on AMD BONAIRE ...
Without SDMA:
$ DRI_PRIME=1 glxgears
8796 frames in 5.0 seconds = 1759.074 FPS
8899 frames in 5.0 seconds = 1779.672 FPS
With SDMA:
$ DRI_PRIME=1 glxgears
12765 frames in 5.0 seconds = 2552.788 FPS
12888 frames in 5.0 seconds = 2577.495 FPS
The 1st GPU is irrelevant. The improvement should be much lower at 60 fps,
but definitely measurable.
SI will get this once we add SDMA blit support for it.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
| |
It passes R600_DEBUG=testdma on Bonaire/radeon.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
| |
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
| |
there's no reason to separate these
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Fixes conditional jump depending on uninitialized value
in si_state_draw.c:593
Cc: <[email protected]>
Signed-off-by: Miklós Máté <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This regression is caused because of commit 3190c7ee9727161d627f107c2e7f8ec3a11941c1
Regression caused by following OpenGL 4.4 spec rules relates to
GL_FRAMEBUFFER_SRGB in Mesa.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Charmaine Lee <[email protected]>
|