| Commit message (Collapse) | Author | Age | Files | Lines |
| |
Analogous to the previous commit. See that commit's extensive
description and the bug report it references.
Cc: [email protected]
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
(cherry picked from commit 51accecce7755be9b7eb1baadaec7e4b7d1011af)
| |
instructions
The regioning parameters are now properly set by convert_to_hw_regs(),
so we don't need to fix them in the generator. The fix previously
done in the generator was, strictly speaking, wrong for any
non-identity regions.
Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
Cc: "17.1" <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
(cherry picked from commit f57e234fdd52331d0aa6656a36efdebea9d11e9d)
[Andres Gomez: resolve trivial conflicts]
Signed-off-by: Andres Gomez <[email protected]>
Conflicts:
src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
| |
On gen7, the swizzles used in DF align16 instructions work for an element
size of 32 bits, so we can address only 2 consecutive DFs. As we assumed that
in the rest of the code and prepare the instructions for this (scalarize_df()),
we need to set the width to two again.
However, for DF align1 instructions, a width of 2 is wrong as we are not
reading the data we want. For example, a uniform would have a region of
<0, 2, 1>, so it would repeat the first 2 DFs when we wanted to access
the first 4.
This patch sets the default width to 4 and then modifies the width of
align16 instructions' DF sources when we translate the logical swizzle
to the physical one.
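The effect of the region width can be illustrated with a small sketch. It assumes the usual <VertStride; Width, HorzStride> addressing, where channel i reads element (i / Width) * VertStride + (i % Width) * HorzStride. The helper name region_offsets is hypothetical and not part of Mesa.

```python
def region_offsets(vstride, width, hstride, exec_size):
    """Element offsets read by each channel of a <vstride; width, hstride> region."""
    return [(ch // width) * vstride + (ch % width) * hstride
            for ch in range(exec_size)]


# Width 2 repeats the first two DFs; width 4 reaches the first four.
print(region_offsets(0, 2, 1, 4))  # [0, 1, 0, 1]
print(region_offsets(0, 4, 1, 4))  # [0, 1, 2, 3]
```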
v2:
- Remove conditional (Curro).
Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
Cc: "17.1" <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
(cherry picked from commit aaeb1c99beed39d85c300ebdb8a7bf056ee6717c)
| |
From IVB PRM, vol4, part3, "General Restrictions on Regioning
Parameters":
"If ExecSize = Width and HorzStride ≠ 0, VertStride must
be set to Width * HorzStride."
In the next patch, we are going to modify the region parameters for
uniforms and VGRFs. Uniforms that are the source of
DF align1 instructions will have <0, 4, 1> regioning and
the execsize for those instructions will be 4, so they will break
the regioning rule. The same applies to VGRF sources where
we use the vstride == 0 exploit.
As we know we are not going to cross the GRF boundary with that
execsize and parameters (not even with the exploit), we just fix
the vstride here.
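As a rough sketch of the rule quoted above (not the actual Mesa change), the adjustment could look like the following. The function name fix_vstride is hypothetical, and strides are expressed in elements rather than in hardware encodings.

```python
def fix_vstride(exec_size, vstride, width, hstride):
    """Return a vstride that satisfies the quoted PRM restriction."""
    if exec_size == width and hstride != 0:
        # A single row covers every channel, so vstride is never used
        # to step to a second row; set it to the value the PRM demands.
        return width * hstride
    return vstride


# A <0, 4, 1> region with execsize 4 becomes <4, 4, 1>.
print(fix_vstride(4, 0, 4, 1))  # 4
```

Regions that do not hit the restricted case are returned unchanged.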
v2:
- Move is_align1_df() (Curro)
- Refactor exec_size == width calculation (Curro)
Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
Cc: "17.1" <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
(cherry picked from commit 7f728bce811fc283e672e3a07b008bb7b52de35e)
[Andres Gomez: use original is_align1_df]
Signed-off-by: Andres Gomez <[email protected]>
Conflicts:
src/mesa/drivers/dri/i965/brw_vec4.cpp
| |
Reviewed-by: Marek Olšák <[email protected]>
(cherry picked from commit b71ef173a5a61a667380dc77f5ae1f7e8c0c2fb8)
| |
Previously we were only making sure types were the same within a
single stage. This looks to have regressed with 953a0af8e3f73.
Fixes: 953a0af8e3f73 ("mesa: validate sampler uniforms during gluniform calls")
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
https://bugs.freedesktop.org/show_bug.cgi?id=97524
(cherry picked from commit d682f8aa8e0edd166166f87fcd774dd2d57b4180)
[Andres Gomez: there was an intermediate cleanup but this commit
basically brings everything that was missing back]
Signed-off-by: Andres Gomez <[email protected]>
Conflicts:
src/mesa/main/uniforms.c
| |
If VDPAU is installed in a non-default location, we'll fail to find
the headers and error out at build time.
../../src/gallium/include/state_tracker/vdpau_dmabuf.h:37:25: fatal error: vdpau/vdpau.h: No such file or directory
#include <vdpau/vdpau.h>
^
Fixes: faba96bc60b ("st/vdpau: add new interop interface")
Cc: Christian König <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
(cherry picked from commit 51c0c213b7fa53b249e9fcb9004a3ba1076fe773)
| |
opt_register_coalesce() was optimizing sequences such as:
mul(8) acc0:D, attr18.xyyy:D, attr19.xyyy:D
mach(8) vgrf5.xy:D, attr18.xyyy:D, attr19.xyyy:D
mov(8) m4.zw:F, vgrf5.xxxy:F
into:
mul(8) acc0:D, attr18.xyyy:D, attr19.xyyy:D
mach(8) m4.zw:D, attr18.xxxy:D, attr19.xxxy:D
This doesn't work - if we're going to reswizzle MACH, we'd need to
reswizzle the MUL as well. Here, the MUL fills the accumulator's .zw
components with attr18.yy * attr19.yy. But the MACH instruction expects
.z to contain attr18.x * attr19.x. Bogus results ensue.
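The mismatch can be reproduced with a toy model. This is only a sketch: the attribute values are arbitrary, and it models just the per-channel product that the MUL leaves behind, not the real MACH semantics.

```python
def swizzle(vec, sw):
    """Apply a four-letter swizzle such as "xyyy" to a 4-component vector."""
    return [vec["xyzw".index(c)] for c in sw]


attr18 = [2, 3, 5, 7]      # arbitrary illustrative values
attr19 = [11, 13, 17, 19]

# What the MUL leaves in the accumulator, per channel.
acc = [a * b for a, b in zip(swizzle(attr18, "xyyy"), swizzle(attr19, "xyyy"))]
# What the reswizzled MACH would need there, per channel.
needed = [a * b for a, b in zip(swizzle(attr18, "xxxy"), swizzle(attr19, "xxxy"))]

# Only .zw is written by the coalesced MACH.
mismatched = [c for i, c in enumerate("xyzw") if c in "zw" and acc[i] != needed[i]]
print(mismatched)  # ['z']
```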
No change in shader-db on Haswell. Prevents regressions in Timothy's
patches to use enhanced layouts for varying packing (which rearrange
code just enough to trigger this pre-existing bug, but were fine
themselves).
Acked-by: Timothy Arceri <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
(cherry picked from commit 2faf227ec2e22c7a37e0a54783a3f0a0062ac852)
Squashed with commit:
i965/vec4: Use reads_accumulator_implicitly(), not MACH checks.
Curro pointed out that I should not just check for MACH, but use
the reads_accumulator_implicitly() helper, which would also prevent
the same bug with MAC and SADA2 (if we ever decide to use them).
Cc: [email protected]
Reviewed-by: Francisco Jerez <[email protected]>
(cherry picked from commit 6b10c37b9c3a73add73f444fe1aee73c9ec82c94)
| |
Until now the spilling cost calculation was neglecting the amount of
data read from the register.
This caused it to make suboptimal decisions in some cases, leading to
higher memory bandwidth usage than necessary.
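A minimal sketch of the idea, assuming a simplified instruction representation that is not the one used in brw_fs_reg_allocate.cpp: both the registers written and the registers read contribute to a register's cost. The tuple layout and the name spill_costs are hypothetical.

```python
def spill_costs(instructions):
    """Accumulate a per-register cost from both reads and writes.

    Each instruction is (dest, regs_written, [(src, regs_read), ...]).
    """
    costs = {}
    for dest, regs_written, sources in instructions:
        # Reads count too: every read of a spilled register needs a fill.
        for src, regs_read in sources:
            costs[src] = costs.get(src, 0) + regs_read
        if dest is not None:
            costs[dest] = costs.get(dest, 0) + regs_written
    return costs


program = [("v1", 2, []), ("v2", 1, [("v1", 2)])]
print(spill_costs(program))  # {'v1': 4, 'v2': 1}
```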
Improves Unigine Heaven performance by ~4% on BDW, reversing an
unintended FPS regression from my previous commit
147e71242ce539ff28e282f009c332818c35f5ac with n=12 and statistical
significance 5%. In addition SynMark2 OglCSDof performance is
improved by an additional ~5% on SKL, and a Kerbal Space Program
apitrace around the Moho planet I can provide on request improves by
~20%.
Cc: <[email protected]>
Reviewed-by: Plamena Manolova <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
(cherry picked from commit 58324389be7bc7c5e10093b9cc0a8efa9b4c93a9)
[Andres Gomez: resolve trivial conflicts]
Signed-off-by: Andres Gomez <[email protected]>
Conflicts:
src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp
| |
This is what we use later on to compute the number of registers that
will actually get spilled to memory, so it's more likely to match
reality than the current open-coded approximation.
Cc: <[email protected]>
Reviewed-by: Plamena Manolova <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
(cherry picked from commit ecc19e12dca95d2571d3761dea6dec24b061013c)
[Andres Gomez: resolve trivial conflicts]
Signed-off-by: Andres Gomez <[email protected]>
Conflicts:
src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp
| |
Cc: <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
(cherry picked from commit 7cd6e2df65de9e2f0d77022a64c4e48ca2ebcb33)
| |
Fixes a bug in
KHR-GL45.shader_draw_parameters_tests.ShaderMultiDrawArraysParameters.
Cc: [email protected]
Reviewed-by: Marek Olšák <[email protected]>
(cherry picked from commit 51deba0eb35d0d27560bb7dad24b8d39abb58be6)
| |
When any count[i] is negative, we must skip all draws.
Moving to vbo makes the subsequent change easier.
v2:
- provide the function in all contexts, including GLES
- adjust validation accordingly to include the xfb check
v3:
- fix mix-up of pre- and post-xfb prim count (Nils Wallménius)
Cc: [email protected]
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
(cherry picked from commit 42d5465b9ba85b4918b9e6fb57994720e3c8a80b)
[Andres Gomez: resolve trivial conflicts]
Signed-off-by: Andres Gomez <[email protected]>
Conflicts:
src/mesa/main/varray.c
| |
The same logic needs to be applied to glMultiDrawArrays.
Cc: [email protected]
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
(cherry picked from commit 756e9ebbdd84018382908d3556973a62dbda09ca)
| |
Found by inspection.
Cc: [email protected]
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
(cherry picked from commit ea9a8940cadb30ac8d72a26b82bdb54872c0e199)
| |
We check these bitfields when computing the Haswell max GL version.
We need to set them ahead of time, or they won't exist, and all our
checks will fail. That sets the max core profile GL version to 4.2.
This introduces the bizarre situation where asking for a GL context
with version 4.3+ fails, but asking for a GL core profile context
with version <= 4.2 actually promotes you to a 4.5 context.
GLX_MESA_query_renderer also reported the bogus 4.2 value.
Now it shows 4.5.
Cc: "17.0" <[email protected]>
Reported-and-tested-by: Rafael Ristovski <[email protected]>
(cherry picked from commit 02ccd8f52cffcc25e5fefdd0f900cf04230395f4)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <[email protected]>
Conflicts:
src/mesa/drivers/dri/i965/intel_screen.c
| |
Detecting register write support by trial and error introduces a
stall at screen creation time, which it would be nice to avoid.
Certain command parser versions guarantee this will work (see the
giant comment in intelInitScreen2 below, or a few commits ago):
- Ivybridge: version >= 1 (kernel v3.16)
- Baytrail: version >= 2 (kernel v3.19)
- Haswell: version >= 7 (kernel v4.8)
For simplicity, we don't bother with version 1 in this patch.
This assumes that the user hasn't disabled aliasing PPGTT via a kernel
command line parameter. Don't do that - you're only breaking things.
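The version thresholds listed above can be written as a lookup table. This sketch encodes the list as stated, including the Ivybridge entry that the patch itself skips. The names MIN_PARSER_VERSION and register_writes_guaranteed are hypothetical and do not appear in intel_screen.c.

```python
# Minimum command parser version at which register writes are guaranteed,
# taken from the list in this commit message.
MIN_PARSER_VERSION = {"ivybridge": 1, "baytrail": 2, "haswell": 7}


def register_writes_guaranteed(platform, cmd_parser_version):
    """True if no trial-and-error probe is needed for this platform."""
    minimum = MIN_PARSER_VERSION.get(platform)
    return minimum is not None and cmd_parser_version >= minimum


print(register_writes_guaranteed("haswell", 7))  # True
print(register_writes_guaranteed("haswell", 6))  # False
```

Platforms missing from the table fall back to "not guaranteed", which in this sketch would mean probing as before.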
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
(cherry picked from commit 5e29af5f772c1e1b02a4cc46d2f7d3b5d2151ad8)
| |
If we can't write registers, then the effective command parser version
is 0 - it may exist, but it's not usefully enabling anything.
See kernel commit 1ca3712ca3429a617ed6c5f87718e4f6fe4ae0c6 (in v4.8)
where the kernel starts doing this for us. This makes us do more or
less the same thing on older kernels.
This should preserve a bit of sanity by allowing us to perform a
screen->cmd_parser_version >= N check to determine that we really can
use the features promised by command parser version N.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
(cherry picked from commit 31693a13f8fbc52d4f19f1e8800a4edabeecbe19)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <[email protected]>
Conflicts:
src/mesa/drivers/dri/i965/intel_screen.c
| |
This should help us figure out the complexities of which kernel
versions we need to get various features on various platforms.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
(cherry picked from commit 4a2ad6b145b4dd0d19a8e5e0ee6bed09e08ce0eb)
| |
Commit f938354362655a378d474c5f79c52cea9852ab91 recently increased the
alignment on vertex buffer data from 32 to 64. This caused us to
consume a bit more batch than we were before and we now go over the
estimate by a small amount on certain blits on gen8+. This commit bumps
the gen8 batch estimate by a bit to compensate. Haswell and older
still seem to be well within the limit.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100582
Reviewed-by: Iago Toral Quiroga <[email protected]>
Acked-by: Kenneth Graunke <[email protected]>
Cc: "13.0 17.0" <[email protected]>
(cherry picked from commit c9c39812b91c8104bc0bea16053312547846249c)
| |
Reviewed-by: Kenneth Graunke <[email protected]>
Cc: "13.0 17.0" <[email protected]>
(cherry picked from commit f938354362655a378d474c5f79c52cea9852ab91)
[Emil Velikov: brw_state_batch has different signature]
Signed-off-by: Emil Velikov <[email protected]>
Conflicts:
src/mesa/drivers/dri/i965/genX_blorp_exec.c
| |
We already provide a default LOD for textureQueryLevels and texture() on
non-fragment stages. However, there are more cases where one is needed
such as textureSize(gsampler2DMS*) in SPIR-V. Instead of trying to list
out all of the cases one at a time, just provide the default for all TXS
and TXL operations. This fixes a shader validation error in the new
Sascha deferredmultisampling demo which uses textureSize(gsampler2DMS).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100391
Reviewed-by: Anuj Phogat <[email protected]>
Cc: "13.0 17.0" <[email protected]>
(cherry picked from commit 3503b2714b98684a2ceba5f4fd9a5bfbfbcaad38)
| |
st_finalize_texture always accesses the image at face 0, but it may not be
set if we are working with a cubemap that had another face set.
This fixes a crash in piglit
same-attachment-glFramebufferTexture2D-GL_DEPTH_STENCIL_ATTACHMENT.
Cc: [email protected]
Reviewed-by: Nicolai Hähnle <[email protected]>
(cherry picked from commit 52f9ccefcb75a9d42307890d7714b1cd92e864cb)
| |
SEL can only convert between a few integer types, which we basically
never do.
Fixes fs/vs-double-uniform-array-direct-indirect-non-uniform-control-flow
Cc: [email protected]
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Acked-by: Francisco Jerez <[email protected]>
(cherry picked from commit 7dccd38b400d3a65da20ddefe282a7bb0b7ccb58)
| |
primcount must be a GLsizei as in the signature for MultiDrawElements
or bad things can happen.
Furthermore, an error should be flagged when primcount is negative.
Curiously, this code used to work somewhat correctly even when primcount
was negative, because the loop that checks count[i] would iterate out of
bounds and almost certainly hit a negative value at some point.
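A hedged sketch of the intended validation order follows. The function name and return shape are hypothetical, and the GL_INVALID_VALUE strings are an assumption about the error reported, not code taken from Mesa.

```python
def validate_multi_draw(counts, primcount):
    """Return (error, should_draw) for a MultiDraw-style call."""
    # Reject a negative primcount first, before touching counts at all.
    if primcount < 0:
        return "GL_INVALID_VALUE", False
    # Only look at the first primcount entries; never read past them.
    for count in counts[:primcount]:
        if count < 0:
            return "GL_INVALID_VALUE", False
    return None, True


print(validate_multi_draw([3, 4], -1))     # ('GL_INVALID_VALUE', False)
print(validate_multi_draw([3, -1, 5], 3))  # ('GL_INVALID_VALUE', False)
print(validate_multi_draw([3, -1, 5], 1))  # (None, True)
```

Checking primcount up front means the loop bound is always non-negative, so the out-of-bounds iteration described above cannot occur.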
Found by an ASAN error in
GL45-CTS.gtf32.GL3Tests.draw_elements_base_vertex.draw_elements_base_vertex_primcount
Note that the OpenGL spec seems to have s/primcount/drawcount/ at some
point, and the code still reflects the old language.
v2: provide the correct spec quotes (pointed out by Ian)
Cc: [email protected]
Reviewed-by: Marek Olšák <[email protected]> (v1)
Reviewed-by: Ian Romanick <[email protected]>
(cherry picked from commit c11dcfb5e9b051b9036949b3e40a9dc15138bd97)
| |
In commit d2590eb65ff28a9cbd592353d15d7e6cbd2c6fc6 I enabled GL 4.5
on Haswell...but failed to check if we could do indirect compute
shader dispatch...and query buffer objects.
Indirect compute shader dispatch requires command parser version 5
(kernel commit 7b9748cb513a6bef4af87b79f0da3ff7e8b56cd8, which is in
Linux v4.4). On earlier kernels we would have disabled
ARB_compute_shader, which is a mandatory part of OpenGL 4.3+.
Query buffer objects currently require MI_MATH and MI_LOAD_REGISTER_REG,
which mean command parser version 7 (Linux v4.8). On earlier kernels
we would have disabled ARB_query_buffer_object, which is a mandatory
part of OpenGL 4.4+.
The new version support looks like:
- Kernel 4.1 and older => OpenGL 3.3
- Kernel 4.2-4.3 => OpenGL 4.2
- Kernel 4.4-4.7 => OpenGL 4.3
- Kernel 4.8+ => OpenGL 4.5
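The table above maps directly onto a small lookup. This is only an illustration of the listed ranges; the function name is hypothetical, and the real driver keys off the command parser version rather than the kernel version.

```python
def haswell_max_gl_version(kernel):
    """kernel is a (major, minor) tuple; returns the GL version as a string."""
    if kernel >= (4, 8):
        return "4.5"
    if kernel >= (4, 4):
        return "4.3"
    if kernel >= (4, 2):
        return "4.2"
    return "3.3"


print(haswell_max_gl_version((4, 3)))  # 4.2
print(haswell_max_gl_version((4, 8)))  # 4.5
```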
Cc: "17.0" <[email protected]>
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
(cherry picked from commit 9b324e4dca4754801e5db59aba0ab559f2cf35ea)
| |
The PRMs state that this packet is 16 DWORDS long. Ensure that the last
three DWORDS are zeroed as required by the hardware when allocating a
null surface state.
Cc: <[email protected]>
Signed-off-by: Nanley Chery <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
(cherry picked from commit 7c50f9903f58ef04ff393505a383d06f499f1fdc)
| |
This prevents textureQueryLevels, which maps as LODQ, from ending up
with a xyzw writemask, which is illegal.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100061
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: [email protected]
Reviewed-by: Marek Olšák <[email protected]>
(cherry picked from commit dab88e9af7a35ebcdd0fc87df97f4b13e908552a)
| |
just as earlier gens do.
CC: "17.0 13.0" <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96743
Reviewed-by: Jason Ekstrand <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
(cherry picked from commit bd25d9670b466043cdb5d9668f82accbd587c889)
| |
Cc: [email protected]
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
(cherry picked from commit 077078ce77e8653725def01ed291eb486989a9ad)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <[email protected]>
Conflicts:
src/mesa/drivers/dri/i965/brw_defines.h
| |
The is_color_attachment variable is later read when handling two
separate error cases, where only one of the cases results in the
variable being initialized.
This can be avoided by giving the variable a safe default value.
Coverity-Id: 1398631
Cc: [email protected]
Signed-off-by: Robert Foss <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
(cherry picked from commit 88becf73022d780cfd0d7dbc5bb3911f8b0d2b11)
| |
Even though compute shaders cannot access the framebuffer, there is a
synchronization issue when a compute dispatch accesses a texture that
was previously bound and drawn to as a framebuffer.
Section 9.3 (Feedback Loops Between Textures and the Framebuffer) of
the OpenGL 4.5 spec rather implicitly clarifies that undefined behavior
results if the texture is still attached to the currently bound
framebuffer. However, the feedback loop is broken when the application
changes the framebuffer binding before a compute dispatch, and the
state tracker needs to let the driver know about this.
Fixes GL45-CTS.compute_shader.pipeline-post-fs on SI family Radeons.
Cc: [email protected]
Signed-off-by: Marek Olšák <[email protected]>
(cherry picked from commit 40c77bbf83a369f21c5a95f14417348aae2dbe42)
| |
exec_node::get_prev() does not guard against going past the beginning
of the list, so we need to add explicit checks here.
Found by ASAN in piglit arb_shader_storage_buffer_object-rendering.
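The kind of explicit check meant here can be sketched with a sentinel-based list. The classes below are loosely modeled on that idea and are not Mesa's exec_list API; every name is hypothetical.

```python
class Node:
    def __init__(self, value=None):
        self.value = value
        self.prev = None
        self.next = None

    def is_head_sentinel(self):
        # Only the head sentinel has no predecessor.
        return self.prev is None


def make_list(values):
    """Build a doubly linked list bracketed by head and tail sentinels."""
    head, tail = Node(), Node()
    head.next, tail.prev = tail, head
    for value in values:
        node = Node(value)
        node.prev, node.next = tail.prev, tail
        tail.prev.next = node
        tail.prev = node
    return head, tail


def previous_values(node):
    """Walk backwards from node, stopping explicitly at the head sentinel."""
    out = []
    cur = node.prev
    while not cur.is_head_sentinel():
        out.append(cur.value)
        cur = cur.prev
    return out


head, tail = make_list([1, 2, 3])
print(previous_values(tail))  # [3, 2, 1]
```

Without the is_head_sentinel() test, the walk would step from the head sentinel to a null predecessor, which is the out-of-bounds access ASAN reported.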
Cc: [email protected]
Signed-off-by: Marek Olšák <[email protected]>
(cherry picked from commit 911391bd70fe30ad970c5e56632b2d7ccc29d955)
| |
This was hiding bugs as it retyped the source to the destination's type.
Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
Cc: "17.0" <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
(cherry picked from commit 0dddad5b1bb3b05190074a71d274c04c0b5ea700)
| |
When generating the MOV INDIRECT instruction, the source type is ignored
and set to the destination's type. However, this is going to change in a
later patch, so we need to explicitly set the proper source type.
brw_vec8_grf() creates a float-typed fs_reg by default, when the
ICP handle is actually unsigned. This patch fixes these cases before
applying the aforementioned patch.
Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
Cc: "17.0" <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
(cherry picked from commit d8122128bc6bd291ff0abcb7f2e52d9cdc631527)
| |
The lowered BSW/BXT indirect move instructions had incorrect
source types, which luckily wasn't causing incorrect assembly to be
generated due to the bug fixed in the next patch, but would have
confused the remaining back-end IR infrastructure due to the mismatch
between the IR source types and the emitted machine code.
v2:
- Improve commit log (Curro)
- Fix read_size (Curro)
- Fix DF uniform array detection in assign_constant_locations() when
it is accessed with 32-bit MOV_INDIRECTs in BSW/BXT.
v3:
- Move changes in assign_constant_locations() to other patch.
Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
Cc: "17.0" <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
(cherry picked from commit 56266df7ed9dbdf63acfd58944442893b4cd0c0b)
| |
proper locations
Previously, if we had accesses with different sizes to the same uniform, we
might not push it aligned with the bigger one. This is a problem on BSW/BXT
when we access an array of DF uniforms with both direct and indirect
addressing, because for the latter we use 32-bit MOV INDIRECT instructions.
However, this problem can happen with other generations and bit sizes.
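The alignment idea can be sketched as follows. This only illustrates aligning to the largest access size; it is not the slot-assignment logic of assign_constant_locations(), and both function names are hypothetical.

```python
def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def place_uniform(next_free, access_sizes):
    """Place a uniform at an offset aligned to its largest access size (bytes)."""
    return align_up(next_free, max(access_sizes))


# A uniform accessed both as 32-bit and 64-bit must start on an 8-byte boundary.
print(place_uniform(4, [4, 8]))  # 8
print(place_uniform(4, [4]))     # 4
```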
Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
Cc: "17.0" <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
(cherry picked from commit a497ab6838ae5a9898abfed82f7bc8295b490911)
| |
This bug can cause us to not detect the end of a contiguous area
correctly and to push larger areas than the real ones.
Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
Cc: "17.0" <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
(cherry picked from commit 7427425247d80c9f59a3c3ad2dfeeb2429de6f67)
| |
v2: restore the state
Cc: 13.0 17.0 <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
(cherry picked from commit cc2f92b09f8ab0470106185585fdc1282da523e6)
| |
Cc: 13.0 17.0 <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
(cherry picked from commit a40b76143d8b929412bed6fbed04810902844c40)
| |
Found by inspection. However, I expect it fixes real bugs when using
blorp from Vulkan on little-core platforms.
Reviewed-by: Lionel Landwerlin <[email protected]>
Cc: "13.0 17.0" <[email protected]>
(cherry picked from commit 075ed20614e91110322aadff44dbd4c1ca2422e8)
| |
Found while running shader-db under valgrind.
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
Cc: "13.0 17.0" <[email protected]>
(cherry picked from commit a0ac118398c924f2ae75e5649fbaacd95abd231f)
| |
We can only do the optimization if the source *is* SSA.
Reviewed-by: Kenneth Graunke <[email protected]>
Cc: "13.0 17.0" <[email protected]>
(cherry picked from commit a4393bd97fe62e8299273bae769201c5c9c816ea)
Squashed with commit:
i965/fs: Remove the inline pack_double_2x32 optimization
It's broken in a number of ways. In particular, a bunch of the
conditions are backwards so it doesn't actually detect what it's
supposed to detect. Since it's been broken, it hasn't actually been
helping anything so just deleting it isn't a regression.
This (and removing another optimization) were done on master in commit
b07381161777ba5d5f4a1d713f7655bcaede4139.
Cc: "Kenneth Graunke" <[email protected]>
Cc: "Mark Janes" <[email protected]>
[Emil Velikov: patch is a backport of the below "cherry pick"]
Fixes: a4393bd97fe ("i965/fs: Fix the inline nir_op_pack_double optimization")
(cherry picked from commit b07381161777ba5d5f4a1d713f7655bcaede4139)
| |
Now that we have OES_tessellation_shader, the same situation can occur
in ES too, not just GL core profile.
Having a TCS but no TES may confuse drivers - i965 crashes, for example.
This prevents regressions in
ES31-CTS.core.tessellation_shader.single.xfb_captures_data_from_correct_stage
with some SSO pipeline validation changes I'm making.
v2: Add an ES spec citation (suggested by Alejandro)
Cc: "17.0" <[email protected]>
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Alejandro Piñeiro <[email protected]>
(cherry picked from commit 05a56893aa2570cb1f6e61e3c9cf365266ea1d3a)
| |
Fixes two GL ES 3.0 CTS tests on Sandy Bridge:
ES3-CTS.functional.texture.mipmap.cube.base_level.linear_linear
ES3-CTS.functional.texture.mipmap.cube.base_level.linear_nearest
Reviewed-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Cc: "17.0 13.0" <[email protected]>
(cherry picked from commit c59d1ea51bd0809761094e54c66bf3a200d964ff)
| |
Reviewed-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Cc: "17.0 13.0" <[email protected]>
(cherry picked from commit c4f8f395b291a88eb74b07b90a4028ef4f026f58)
| |
Reviewed-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Cc: "17.0" <[email protected]>
(cherry picked from commit 9df3778016e9153bc8759f84075db2d62a62a596)
| |
Fixes dEQP-GLES31.functional.stencil_texturing.misc.compare_mode_effect
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Cc: [email protected]
(cherry picked from commit 3970257cef5e0c7b5b31c023450f1ea55b784e88)
| |
This reverts commit 0bac2551e40410e2251daf4fd9faf69310ab34ce.
Now that we position the guardband correctly (applying translations
in addition to scaling) and made it as large (or larger) than the
render target, this shouldn't be necessary.
Now we leave guardband clipping enabled 100% of the time, like the
Windows driver does.
Fixes GL45-CTS.gtf21.GL2FixedTests.clip.clip. It tries to draw a
16384x64 rectangle, and it appears that some kind of numerical
imprecision in the clipper results in some edge pixels going missing.
The Windows driver passes this test because of guardband clipping.
Cc: "17.0" <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
(cherry picked from commit ce8a63de6dffd4a7bc704b63bdd48a63798a438e)
| |
Previously we disabled the guardband when the viewport was smaller than
the framebuffer on Gen6-7.5, to prevent portions of primitives from
being drawn outside of the viewport. On Gen8+, we relied on the viewport
extents test to effectively scissor this away for us.
We can simply always enable scissoring instead. We already include the
viewport in the scissor rectangle, so this will effectively do the
viewport extents test for us. (The only difference is that the scissor
rectangle doesn't support sub-pixel values. I think that's okay.)
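Folding the viewport into the scissor rectangle amounts to a rectangle intersection. The sketch below assumes integer (x_min, y_min, x_max, y_max) rectangles with an exclusive maximum; the function name is hypothetical and the driver's actual state setup is not shown.

```python
def intersect(a, b):
    """Intersect two (x_min, y_min, x_max, y_max) rectangles; None if empty."""
    x0, y0 = max(a[0], b[0]), max(a[1], b[1])
    x1, y1 = min(a[2], b[2]), min(a[3], b[3])
    return (x0, y0, x1, y1) if x0 < x1 and y0 < y1 else None


framebuffer = (0, 0, 1024, 768)
viewport = (100, 100, 300, 200)
# Drawing is limited to the viewport even with the guardband enabled.
print(intersect(framebuffer, viewport))  # (100, 100, 300, 200)
```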
Given that the viewport extents test is essentially a second scissor,
and is enabled for basically all 3D drawing on Gen8+, it stands to
reason that scissoring is cheap. Enabling the guardband reduces the
cost of clipping, which is expensive.
The Windows driver appears to never disable guardband clipping, and
appears to use scissoring in this case. I don't know if they leave
it on universally though.
This fixes misrendering in Blender, where the "floor plane" grid lines
started rendering at wrong angles after I disabled XY clipping of line
primitives. Enabling the guardband seems to solve the issue.
Cc: "17.0" <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99339
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
(cherry picked from commit ece0e535a44c228dd994861592deb155c14740d8)
|