| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes a hang in
piglit/arb_blend_func_extended-fbo-extended-blend-pattern_gles2 on REDWOOD.
Cc: 11.0 <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
(cherry-picked from commit b5b87c4ed1dfd58aec8905e0514c9ba92ba83e1d)
Conflicts:
src/gallium/drivers/r600/r600_shader.c
|
|
|
|
|
|
|
|
|
| |
v2: set the behaviour default for future ASICs.
Signed-off-by: Boyuan Zhang <[email protected]>
Reviewed-by: Leo Liu <[email protected]>
Cc: [email protected]
(cherry picked from commit f55f134a033a61d67c2a71bbe57f85eb3484eec1)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use PKG_CHECK_MODULES to get the flags to link libelf
v2: keep AC_CHECK_LIB as a fallback for elfutils provided
libelf that doesn't install a pkg-config file.
Signed-off-by: Jonathan Gray <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
Tested-by: Michel Dänzer <[email protected]>
Cc: "11.0 11.1" <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
(cherry picked from commit 7f585a6a98d0553ec0ba48e18b1d9bac1256881a)
[Emil Velikov: squash trivial conflict]
Signed-off-by: Emil Velikov <[email protected]>
Conflicts:
src/gallium/targets/opencl/Makefile.am
|
|
|
|
|
|
|
|
|
|
| |
This fixes a long time ago memory leak (even before all my query
related changes).
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
Cc: "11.0 11.1" <[email protected]>
(cherry picked from commit 9aca60bfb07d87d82aff943a23cfa693e2712528)
|
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: "11.0 11.1" <[email protected]>
(cherry picked from commit 6aca7fecb7f7b6c67cf0315e781060a8d1d4b704)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is the recommended setting according to hw people and it makes Hyper-Z
stable. Just the two magic states.
This fixes Evergreen, Cayman, SI, CI, VI (using the Cayman code).
Cc: 11.0 11.1 <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
(cherry picked from commit d3c08309abd17b6e0d466b677af57e3cc74b0e00)
[Emil Velikov: s/radeon_set_context_reg/r600_write_context_reg/g]
Signed-off-by: Emil Velikov <[email protected]>
Conflicts:
src/gallium/drivers/r600/evergreen_state.c
src/gallium/drivers/radeon/cayman_msaa.c
|
|
|
|
|
|
| |
Cc: 11.0 11.1 <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
(cherry picked from commit 787ada6bf65a58b1bab5a30be86698e9b7b0797e)
|
|
|
|
|
|
|
|
|
|
| |
Both caused a crash due to a division by zero in that function.
This is an alternative fix.
Cc: 11.0 11.1 <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
(cherry picked from commit 0f9519b938d78ac55e8e5fdad5727a79baf18d42)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Always reset the vertex bufctx to make sure there's no pointer to
an already freed pipe_resource left after unbinding buffers.
Fixes use after free crash in nvc0_bufctx_fence().
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93004
Signed-off-by: Patrick Rudolph <[email protected]>
[imirkin: simplify nvc0 fix, apply to nv50]
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: "11.0 11.1" <[email protected]>
(cherry picked from commit 432a798cf5c7fab18a3e32d4073840df7d0d37cb)
|
|
|
|
|
|
|
|
|
| |
This handles loading doubles from LDS properly.
Reviewed-by: Michel Dänzer <[email protected]>
Cc: "11.0 11.1" <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
(cherry picked from commit 8c9e40ac22ce5a60753172a8f95a120d84a3ec4c)
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes:
glsl-1.50/execution/geometry/dynamic_input_array_index.shader_test
my profanity.
We need to load the AR register with the value from the index reg
Cc: "11.0 11.1" <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
(cherry picked from commit cce3864046be104933fd4f1bb7a4b36092ff4925)
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes:
gs-input-array-vec4-index-rd
The others run out of gprs unfortunately.
Cc: "11.0 11.1" <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
(cherry picked from commit 38542921c785efb37bae88db409d278990684fa4)
|
|
|
|
|
|
|
|
|
|
|
|
| |
These utilities are to be used to do things like integer adds and
multiplies to be used in calculating the LDS offsets etc.
It handles CAYMAN MULLO differences as well.
Signed-off-by: Dave Airlie <[email protected]>
(cherry picked from commit 0696ebc899d3aa125ae85b757c5fba137617ecaa)
[Emil Velikov: requred by the next commit]
Nominated-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This will be used in the tess shaders.
Reviewed-by: Oded Gabbay <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
(cherry picked from commit 4d64459a92a4c1a64fb7051fd1320c14c1854dcb)
[Emil Velikov: required by the commit after the next one]
Nominated-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The only effect here is a space savings - 822 programs in shader-db
affected with the following overall change:
total bytes used in shared programs : 44154976 -> 44139880 (-0.03%)
Fixes: 641eda0c (nv50/ir: r63 is only 0 if we are using less than 63 registers)
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: "11.0 11.1" <[email protected]>
(cherry picked from commit f920f8eb026d39c0adb547a90399e76b8351fec6)
|
|
|
|
|
|
|
|
|
| |
According to nvdisasm both the immediate and non-imm cases use the same
bits. Both of these flags are quite rarely set though.
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: "11.0 11.1" <[email protected]>
(cherry picked from commit 1d708aacb7631833b0f04e704481854428f60ba3)
|
|
|
|
|
|
|
|
|
|
|
| |
We actually leave the sampler unset for OP_TXF, which caused the GK104+
logic to treat some texel fetches as indirect. While this works, it's
incredibly wasteful. This only happened when the texture was > 0 (since
sampler remained == 0).
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: "11.0 11.1" <[email protected]>
(cherry picked from commit 63b850403c90f33c295d3ad6be4ad749d4ea6274)
|
|
|
|
|
|
|
|
| |
The elemental demo hits this case.
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: "11.0 11.1" <[email protected]>
(cherry picked from commit db072d20867426958153279575dfdc2049b5f595)
|
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: "11.0 11.1" <[email protected]>
(cherry picked from commit 2b98914fe01f1c7b2de8a096c8923b3ab0a69578)
|
|
|
|
|
|
|
|
|
|
|
| |
A situation where there's a 128-bit load where the last component gets
DCE'd causes a 96-bit load to be generated, which no GPU can actually
emit. Avoid generating such instructions by scaling back to 64-bit on
the first load when splitting.
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: "11.0 11.1" <[email protected]>
(cherry picked from commit 49692f86a1b77fac4634d2a3f0502ec7451c3435)
|
|
|
|
|
|
| |
Cc: 11.0 11.1 <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
(cherry picked from commit dd27825c8cf0e7b55ebaa139e299f275943d22f6)
|
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: "11.0 11.1" <[email protected]>
(cherry picked from commit 101e315cc167b0b00319aa70f64c49470e2525f8)
|
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: "11.0 11.1" <[email protected]>
(cherry picked from commit 06055121e6386bc74e4558a86ef690eae9556482)
|
|
|
|
|
|
|
|
| |
For example if it's $r63 (aka 0), there won't be a definition.
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: "11.0 11.1" <[email protected]>
(cherry picked from commit 11fcf46590129abfa2ca2117a320e8a8052761e4)
|
|
|
|
|
|
|
|
|
|
|
|
| |
For example if there are only returns, the break bb will not end up part
of the CFG. However there will have been a prebreak already emitted for
it, and when hitting the RET that comes after, we will try to insert the
current (i.e. break) BB into the graph even though it will be
unreachable. This makes the SSA code sad.
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: "11.0 11.1" <[email protected]>
(cherry picked from commit adcc547bfbef362067bb3b4e3aee75b287bc6189)
|
|
|
|
|
|
|
|
|
| |
SM20-SM50 can't emit a post-factor in the presence of a long immediate.
Make sure to fold it in.
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: "11.0 11.1" <[email protected]>
(cherry picked from commit ff61ac48387d3f42ede50a572c11f404f4cd3abb)
|
|
|
|
|
|
|
|
|
|
|
|
| |
streamout, gs rings bug on certain r600s, requires a wait idle
before each surface sync.
Reviewed-by: Marek Olšák <[email protected]>
Cc: "10.6 11.0 11.1" <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
(cherry picked from commit af4013d26b3203a794ae34fe0c98139bc1058273)
[Emil Velikov: s/radeon_set_config_reg/r600_write_config_reg/g ]
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Need to insert a SQ_NON_EVENT when ever geometry
shaders are enabled.
Reviewed-by: Marek Olšák <[email protected]>
Cc: "10.6 11.0 11.1" <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
(cherry picked from commit b63944e8b9177d231b3789bf84ea9e67b9629905)
|
|
|
|
|
|
|
|
|
| |
Just point the hw to valid memory.
This fixes hangs in piglit/depthstencil-render-miplevel tests.
What's even more bizzare is that the hanging tests report "skip".
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
| |
Tested.
(cherry picked from commit bfc14796b077444011c81f544ceec5d8592c5c77)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The compiler has more information and is able to optimize the bits
it sets in these registers.
Reviewed-by: Marek Olšák <[email protected]>
CC: <[email protected]>
(cherry picked from commit 89851a296536b89364fe6104d13330975788f960)
[Emil Velikov: squash trivial conflict]
Signed-off-by: Emil Velikov <[email protected]>
Conflicts:
src/gallium/drivers/radeonsi/si_compute.c
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In the future, these will be used by other shaders types.
Reviewed-by: Marek Olšák <[email protected]>
(cherry picked from commit 95e051091676584fd7bfba9d0316c3747bf17f35)
[Emil Velikov: squash trivial conflicts]
Signed-off-by: Emil Velikov <[email protected]>
Conflicts:
src/gallium/drivers/radeonsi/si_state_draw.c
src/gallium/drivers/radeonsi/si_state_shaders.c
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Looks like a4xx hw does this in a more standard way and we don't need to
hack around it like we do on a3xx. Fixes GL_ALPHA formats in
fbo-blending-formats, fbo-colormask-formats, and fbo-alphatest-formats.
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: [email protected]
(cherry picked from commit ff9450ecd1f7635f8917e3177f0ef18eb8f9f49b)
[Emil Velikov: squash trivial conflict]
Signed-off-by: Emil Velikov <[email protected]>
Conflicts:
src/gallium/drivers/freedreno/a4xx/fd4_program.c
|
|
|
|
|
|
|
|
|
| |
This fixes teximage-colors, fbo-generatemipmap-formats, and probably
others (in relation to the RGB5 formats, others still fail).
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: [email protected]
(cherry picked from commit 769b3ab6c5111f50502f9df0e8930c8d13f475c7)
|
|
|
|
|
|
|
|
|
|
| |
The lower layers assume that we support this, and it's been core since
GL 1.4. This fixes a slew of piglit tests, especially around
tex-miplevel-selection.
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: [email protected]
(cherry picked from commit 0a4462ad6ee921ed805ad51e330b819b95ee90d6)
|
|
|
|
|
|
|
|
|
|
|
| |
Curiously this has no actual effect. I think it's because the first 8
textures are bound in multiple slots for some reason. However seems
prudent to use these the same way as regular texturing, esp in the case
where there are more than 8 textures bound.
Signed-off-by: Ilia Mirkin <[email protected]>
(cherry picked from commit 5877a594d54fdd2b3aa329f4d35b3491a7ee8a33)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93110
|
|
|
|
|
|
| |
Signed-off-by: Leo Liu <[email protected]>
Cc: "11.0" <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We need to emit at least one cut/emit in every
geometry shader, the easiest workaround it to
stick a single CUT at the top of each geom shader.
Reviewed-by: Marek Olšák <[email protected]>
Cc: "10.6 11.0 11.1" <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
(cherry picked from commit 4f347225752b48f3dc5a59a6be71fe78616252a7)
[Emil Velikov: squash trivial conflict]
Signed-off-by: Emil Velikov <[email protected]>
Conflicts:
src/gallium/drivers/r600/r600_shader.c
|
|
|
|
|
|
|
|
|
| |
This is specified in the docs for rv670 to work properly.
Reviewed-by: Marek Olšák <[email protected]>
Cc: "10.6 11.0 11.1" <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
(cherry picked from commit 04efcc6c7adfda75b425f443588f0faab453ba3a)
|
|
|
|
|
|
|
|
|
|
|
| |
On some chips the GSVS itemsize needs to be aligned to a cacheline size.
This only applies to some of the r600 family chips.
Reviewed-by: Marek Olšák <[email protected]>
Cc: "10.6 11.0 11.1" <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
(cherry picked from commit 8168dfdd4e63457bd8a9ef04a5d49a1f2e202ab8)
|
|
|
|
|
|
|
|
|
|
| |
There is no 96-bit load/store operations, so we have to split it up
into a 32-bit parts, with a split/merge around it.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90348
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: "11.0 11.1" <[email protected]>
(cherry picked from commit 4deb118d06e96731f3481daa72c201d7258bfbbb)
|
|
|
|
|
|
|
|
|
|
| |
In case that the buffer has no bind at all, assume it can be a regular
buffer. This can happen on buffers created through the ARB_dsa
interfaces.
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: "11.0 11.1" <[email protected]>
(cherry picked from commit ad5f6b03e793b9390e3b9f3eca68bd43f9d809eb)
|
|
|
|
|
|
|
|
|
|
|
|
| |
With ARB_direct_state_access, buffers can be created without any binding
hints at all. We still need to allocate these buffers to VRAM or GART,
as we don't have logic down the line to place them into GPU-mappable
space. Ideally we'd be able to shift these things around based on usage.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92438
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: "11.0 11.1" <[email protected]>
(cherry picked from commit 079f713754a9e5d7802b655d54320bd37f24fbfa)
|
|
|
|
|
|
|
|
|
| |
Looks like this was forgotten in the commit which added the AFETCH
logic.
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: [email protected]
(cherry picked from commit 8e68113c1a78c48f26e820f4beb2dda9e4b99f32)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are currently two methods in llvmpipe code to calculate coeffs to
be used as inputs for the fragment shader. The two methods use slightly
different ways to do the floating point calculations and thus produce
slightly different results.
The decision which method to use is determined by the size of the vector
that is used by the platform.
For vectors with size of more than 128bit, a single-step method is used,
in which coeffs_init_simple() + attribs_update_simple() are called.
For vectors with size of 128bit or less, a two-step method is used, in
which coeffs_init() + attribs_update() are called.
This causes some piglit tests (clip-distance-bulk-copy,
interface-vs-unnamed-to-fs-unnamed) to fail when using platforms with
128bit vectors (such as ppc64le or x86-64 without AVX).
This patch makes platforms with 128bit vectors use the single-step
method (aka "simple" method) instead of the two-step method.
This would make the resulting coeffs identical between more platforms,
make sure the piglit tests passes, and make debugging and maintainability
a bit easier as the generated LLVM IR will be the same for more platforms.
The performance impact is negligible for x86-64 without AVX, and
basically non-existent for ppc64le, as it can be seen from the following
benchmarking results:
- glxspheres, on ppc64le:
- original code: 4.892745317 frames/sec 5.460303857 Mpixels/sec
- with the patch: 4.932083873 frames/sec 5.504205571 Mpixels/sec
- Additional 0.8% performance boost
- glxspheres, on x86-64 without AVX:
- original code: 20.16418809 frames/sec 22.50323395 Mpixels/sec
- with the patch: 20.31328989 frames/sec 22.66963152 Mpixels/sec
- Additional 0.74% performance boost
- glmark2, on ppc64le:
- original code: score of 58
- with my change: score of 57
- glmark2, on x86-64 without AVX:
- original code: score of 175
- with the patch: score of 167
- Impact of of -4.5% on performance
- OpenArena, on ppc64le:
- original code: 3398 frames 1719.0 seconds 2.0 fps
255.0/505.9/2773.0/0.0 ms
- with the patch: 3398 frames 1690.4 seconds 2.0 fps
241.0/497.5/2563.0/0.2 ms
- 29 seconds faster with the patch, which is about 2%
- OpenArena, on x86-64 without AVX:
- original code: 3398 frames 239.6 seconds 14.2 fps
38.0/70.5/719.0/14.6 ms
- with the patch: 3398 frames 244.4 seconds 13.9 fps
38.0/71.9/697.0/14.3 ms
- 0.3 fps slower with the patch (about 2%)
Additional details can be found at:
http://lists.freedesktop.org/archives/mesa-dev/2015-October/098635.html
Signed-off-by: Oded Gabbay <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
(cherry picked from commit 39b4dfe6ab1003863778a25c091c080e098833ec)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It looks like nir_lower_idiv is going to use it soon, so add support.
With Ilia's change, this fixes one case in fs-op-div-large-uint-uint (with
GL 3.0 forced on).
Cc: "11.0" <[email protected]>
(cherry picked from commit a4bf28178f064082d3b818d2cd48abf9075cc459)
[Emil Velikov: Resolve trivial conflicts]
Signed-off-by: Emil Velikov <[email protected]>
Conflicts:
src/gallium/drivers/vc4/vc4_qpu_emit.c
|
|
|
|
|
|
|
|
|
|
|
|
| |
Requires proper kernel tiling configuration so check the tiling
config registers.
v2: send the right version of the patch
Reviewed-by: Marek Olšák <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
(cherry picked from commit 00f554abba8c0f3b65af94365c15109c3b858486)
|
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: [email protected]
(cherry picked from commit f94e1d97381ec787c2abbbcd5265252596217e33)
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
otherwise the SX or CB blocks can go bananas
Reviewed-by: Nicolai Hähnle <[email protected]>
Cc: [email protected]
(cherry picked from commit 40912dd91e96376517fb41bb4dc228b45fd1a01c)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <[email protected]>
Conflicts:
src/gallium/drivers/radeonsi/si_state.c
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes the corruption on rendering that we are seeing in
certain geometry shaders.
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=91780
Reviewed-by: Alex Deucher <[email protected]>
Tested / Reviewed-by: Glenn Kennard <[email protected]>
Cc: "10.6" "11.0" <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
(cherry picked from commit df8af7d75155845d12d5a14a3a5ca644f07cb3b1)
|