| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Starting with Skylake, the display engine is capable of scanning out from
Y-tiled buffers. As such, we can and should use Y-tiling for better efficiency.
This also has the added benefit of being able to fast clear the winsys buffer.
Note that the buffer allocation done for mipmaps will already never allocate an
X-tiled buffer for GEN9.
This has an almost universal positive impact on benchmarks, some improving by as
much as 20%.
Signed-off-by: Ben Widawsky <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
no change in behavior, because both are defined the same
Acked-by: Jose Fonseca <[email protected]>
|
|
|
|
| |
Acked-by: Jose Fonseca <[email protected]>
|
|
|
|
| |
Acked-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
| |
Use PIPE_SWIZZLE_* everywhere.
Use X/Y/Z/W/0/1 instead of RED, GREEN, BLUE, ALPHA, ZERO, ONE.
The new enum is called pipe_swizzle.
Acked-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
| |
Reviewed-by: Edward O'Callaghan <[email protected]> (v1)
Reviewed-by: Roland Scheidegger <[email protected]> (v1)
Acked-by: Jose Fonseca <[email protected]> (v1)
v2: name enums
|
|
|
|
|
|
|
|
|
|
| |
and remove number assignments which are consecutive
Reviewed-by: Edward O'Callaghan <[email protected]> (v1)
Reviewed-by: Roland Scheidegger <[email protected]> (v1)
Acked-by: Jose Fonseca <[email protected]> (v1)
v2: name enums
|
|
|
|
|
|
|
|
| |
not used anymore
this is a follow-up to the RW buffer cleanup.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
const buffers are no longer used since the clip plane const buffer was
moved to RW buffers
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
| |
add it to the RW_BUFFERS descriptor array
now the slot masks don't have to have 64 bits
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
| |
this will be used in the next commit
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
| |
v2: also simplify invalidation of RW buffer bindings (squashed)
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
| |
- use an enum
- use a unique slot number regardless of the shader stage
(the per-stage slots will go away for RW buffers)
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Screwed up since 0753b135f6e83b171d8a1b08aea967374f3542bc.
(Only an issue with different min/mag filters, and then only in some cases,
which is probably why it went unnoticed for quite a while.
The effect should have simply been nearest mip filter instead of linear, iff
min was nearest, mag was linear, and all pixels hit the mignifying path.)
Fixes a bunch of dEQP failures.
Reviewed-by: Jose Fonseca <[email protected]>
Cc: "11.1 11.2" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In commit cda886a4851ab767fba40e8474d6fa8190347e4f, Neil made us stop
advertising RGBX formats on Gen9+, as the hardware apparently no longer
has working fast clear support for those formats. Instead, we just
fall back to RGBA formats, and use SCS to override alpha to 1.0.
This is fine, but had one unintended side effect: it made us fall back
to slow clears when the color mask disables alpha. Normally, we ignore
the color mask for non-existent channels. This includes alpha for XRGB
formats as writing garbage to the X channel is harmless. But, now that
we use RGBA, we think there's a real alpha channel, and can't do the
optimization.
To hack around this, check if _BaseFormat is GL_RGB and ignore alpha.
Improves WebGL Aquarium performance on Skylake GT3e by about 50%
by letting it use repclears instead of slow clears.
Cc: [email protected]
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ben Widawsky <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We do this in two steps: first we clip the dst rect and adjust the src
rect accordingly. Then we do it the other way around. In both passes
the adjustment part involves multiplying by a scale factor that can lead
to a small precision loss. This is breaking a few dEQP tests.
Specifically, the problem happens when we need to clip the same coordinate
twice. For example, if srcX0 and dstX0 need both to be clipped we want to
avoid the situation where we clip srcX0 first, then adjust dstX0 accordingly
but then we realize that the resulting dstX0 still needs to be clipped, so
we clip dstX0 and adjust srcX0 again. Each of these two passes can lead
to precission loss. What we want to do here is detect the rect that leads
to the largest clip (accounting for the scale factor involved), clip that
rect and adjust the other one. With this we ensure that the adjusted
coordinate does not need to be clipped again and we can skip a second pass,
improving precision.
Fixes the following 4 dEQP tests:
dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_src_x_nearest
dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_src_x_linear
dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_dst_x_nearest
dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_dst_x_linear
Reviewed-by: Kenneth Graunke <[email protected]>
Tested-by: Mark Janes <[email protected]>
|
|
|
|
|
|
|
| |
shader->config is not updated for compute kernels.
Signed-off-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Tom Stellard <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Float suffixes are allowed in all subsequent GLSL specifications, and
it's obvious what the user meant if they specify one. Accept it with a
warning to avoid breaking applications, like Planeshift (although it
looks like between 0.6.1 and 0.6.3 they might have removed the suffixes
from their shaders).
Reviewed-by: Lars Hamre <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit b449366587b5f3f64c6fb45fe22c39e4bc8a4309.
I removed the pass thinking that it was now not useful, but that was not
true. I believe I ran shader-db on HSW and saw no results, but HSW does
not use the unlit centroid workaround code and as a result does not emit
redundant MOV_DISPATCH_TO_FLAGS instructions.
On IVB, the shader-db results are:
total instructions in shared programs: 6650806 -> 6646303 (-0.07%)
instructions in affected programs: 106893 -> 102390 (-4.21%)
helped: 793
total cycles in shared programs: 56195538 -> 56103720 (-0.16%)
cycles in affected programs: 873048 -> 781230 (-10.52%)
helped: 553
HURT: 209
On SNB, the shader-db results are:
total instructions in shared programs: 7173074 -> 7168541 (-0.06%)
instructions in affected programs: 119757 -> 115224 (-3.79%)
helped: 799
total cycles in shared programs: 98128032 -> 98072938 (-0.06%)
cycles in affected programs: 1437104 -> 1382010 (-3.83%)
helped: 454
HURT: 237
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
|
|
|
|
|
|
| |
This was left out from the original gen8 upload introduction.
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
| |
There is not initial assignment, thus appending to it does not work.
Fixes: b27c85c4c08 "i965: add build rule for brw_nir_trig_workarounds.c"
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Add GBM_FORMAT_XBGR8888/__DRI_IMAGE_FORMAT_XBGR8888 format support which
is needed for Android.
Signed-off-by: Rob Herring <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
Add support for 32-bit RGBX/RGBA formats which are preferred for Android.
Signed-off-by: Rob Herring <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Add MESA_FORMAT_R8G8B8A8_UNORM and MESA_FORMAT_R8G8B8X8_UNORM formats as
these are the preferred formats for Android.
Signed-off-by: Rob Herring <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Commit bfd17c76c126 ("i965: Port INTEL_PRECISE_TRIG=1 to NIR.") added a
generated file brw_nir_trig_workarounds.c which broke the Android build.
Add the necessary makefiles to the Android build.
Cc: Kenneth Graunke <[email protected]>
Signed-off-by: Rob Herring <[email protected]>
Tested-by: Chih-Wei Huang <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Commit 4db8f15a2576 ("glsl: move the android build scripts a level up")
dropped a generated include path for glcpp. Add it back adjusting for the
new location.
Signed-off-by: Rob Herring <[email protected]>
Tested-by: Chih-Wei Huang <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Use dev_node_from_fd() with HAVE_LIBDRM to provide an implmentation
of loader_get_device_name_for_fd() for non-linux systems that
use libdrm but don't have udev or sysfs.
Signed-off-by: Jonathan Gray <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Use the defines Mesa configure sets to indicate presence of the bswap32
builtins. This lets i965 work on OpenBSD again after the changes that
were made in 0a5d8d9af42fd77fce1492d55f958da97816961a.
Signed-off-by: Jonathan Gray <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For systems without udev or sysfs that use drm ioctls in the loader
drm authentication must take place earlier or the loader will fail
"MESA-LOADER: failed to get param for i915".
Patch from Mark Kettenis.
Cc: "11.2 11.1" <[email protected]>
Signed-off-by: Mark Kettenis <[email protected]>
Signed-off-by: Jonathan Gray <[email protected]>
[Emil Velikov: remove gratuitous white-space]
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
We need to enable a bit in the CONTEXT_CONTROL packet for the
loads to work.
v2: Style issues.
Signed-off-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
| |
v2: Use field names provided by Nicolai.
v3: Updated to use CONTEXT_CONTROL prefix.
Signed-off-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The missing break caused the IB size to be overwritten with
the size of IB_CONST.
This was introduced in: 7201230582e060aa2eb79c825d3188b437ef7bb8
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Previously the vertex buffer consisted of eight floats per vertex
of which six where constants. These can be as easily provided by
vertex fetcher as it is capable of filling vertex elements with
constant one and zero. This reduces the size of the vertex buffer
from 3 * 8 * 4 = 96 to 3 * 2 * 4 = 24 bytes.
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Otherwise clearing with blorp will regress performance in some
synthetic test cases.
v2: Used vsize >= 2 instead of vsize > 0, and updated the comment.
Review by Ken in one of the earlier patches revealed this.
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
In case there is no source it means the program does a simple
clear or a resolve. In such case there is no need to program
sampling state or enable pixel kill in fragment shader.
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
This partially reverts 2f28a0dc23165123cf1e8b5942acad37878edd8a
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
Also add the additional render format check to the same utility.
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|