| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since 779cabfc7d022de8b7b9bc7fdac0caffa8646c51 the same txformat table entries
are used for "normal" texturing as well as for blits. However, I forgot to put
in an entry for the bgrx8 (le) and xrgb8 (be) formats - the normal texturing
path can't hit them because the radeon tex format chooser will never chose
them, but we get that format from the dri buffers (at least I assume we got
it from there).
This is untested but essentially addressing the same bug as for radeon.
(I don't think that the second entry per le/be table is actually necessary,
but shouldn't hurt...)
Tested-by: Ian Romanick <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Cc: "11.0" <[email protected]>
(cherry picked from commit a2611ffe4b5f1852c59301f086b988233a1c62f3)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since d21320f6258b2e1780a15c1ca718963d8a15ca18 the same txformat table entries
are used for "normal" texturing as well as for blits. However, I forgot to put
in an entry for the bgrx8 (le) and xrgb8 (be) formats - the normal texturing
path can't hit them because the radeon tex format chooser will never chose
them, but we get that format from the dri buffers (at least I assume we got
it from there). This caused lots of piglit regressions (and probably lots of
trouble outside piglit too).
This fixes bug https://bugs.freedesktop.org/show_bug.cgi?id=92900.
Tested-by: Ian Romanick <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Cc: "11.0" <[email protected]>
(cherry picked from commit 983614dbede7b94cba1bad9f3e8627fc5e14bb91)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
fallback_required
Previously GL_FRAMEBUFFER was used. However, if GL_EXT_framebuffer_blit
is supported (note: it is supported by every Mesa driver), this is
*sometimes* an alias for GL_DRAW_FRAMEBUFFER (getters) and *sometimes*
an alias for *both* GL_DRAW_FRAMEBUFFER and GL_READ_FRAMEBUFFER
(setters). As a result, the code saved one binding but modified both.
If the bindings were different, the GL_READ_FRAMEBUFFER would be
incorrect on exit.
Fixes the piglit fbo-generatemipmap-versus-READ_FRAMEBUFFER test.
Ideally this function would use DSA functions and not modify the binding
at all. However, that would be a much more intrusive change because
_mesa_meta_bind_fbo_image would also need to be modified.
_mesa_meta_bind_fbo_image has a lot of callers. Much of this code is
about to get a major rework due to bug #92363, so I don't think it
matters too much. In fact, I discovered this bug while working on the
other bug. Le bon temps!
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
Cc: "10.6 11.0" <[email protected]>
(cherry picked from commit c40a88b6c5a698e5297957e28cccf2ce23820caa)
|
|
|
|
|
|
|
| |
Signed-off-by: Ian Romanick <[email protected]>
Cc: "10.6 11.0" <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
(cherry picked from commit 758f12fd98dea9a9682becf2d496bd38ef3959e5)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The comment in the code details the restriction. Thanks to Ken for having a very
helpful conversation with me, and spotting the blurb in the link I sent him :P.
There are still stability problems for me on GT4, but this definitely helps with
some of the failures.
v2: Comment fixes
Cc: [email protected]
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
(cherry picked from commit 55314c5be4cbf933ab7fbd20f6aa49207e04c946)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For compressed textures, the image size is not necessarily a multiple of
the block size (e.g. the last mip levels). Section 18.3.2 (Copying
Between Images) of the OpenGL 4.5 Core Profile spec says:
An INVALID_VALUE error is generated if the dimensions of either
subregion exceeds the boundaries of the corresponding image
object, or if the image format is compressed and the dimensions of
the subregion fail to meet the alignment constraints of the
format.
and Section 8.7 (Compressed Texture Images) says:
An INVALID_OPERATION error is generated if any of the following
conditions occurs:
* width is not a multiple of four, and width + xoffset is not
equal to the value of TEXTURE_WIDTH.
* height is not a multiple of four, and height + yoffset is not
equal to the value of TEXTURE_HEIGHT.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92860
Signed-off-by: Ilia Mirkin <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Cc: [email protected]
(cherry picked from commit 912babba7bf1abd3caa49f6372d581ae1afe7e84)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <[email protected]>
Conflicts:
src/mesa/main/copyimage.c
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 2294f6f3112f34c4685586f95cd567ce130ee1ab.
It introduces a regression in the following test
piglit.spec.oes_compressed_paletted_texture.basic api
In general this commit is needed to prevent regressions in
GL_KHR_texture_compression_astc_ldr, which... isn't in 11.0
Reported-by: Mark Janes <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Like other gen8+ hardware, the hardware automatically scales up thread counts.
We must be careful about the URB sizes since GT4 adds another slice.
One of the existing PCI IDs is actually mislabeled as GT3. Arguably this is a
real bug since the URB size will be wrong. Because this patch is simply meant to
add the missing IDs, that will be fixed in a later patch.
v2: No longer relevant.
v3: Update the wm thread count to support GT4. The WM thread count is used to
determine the maximum scratch space required. Currently the code always
allocates the maximum amount even though lower GT SKUs require less. The formula
is threads_per_psd * subslices_per_slice * slices
Cc: [email protected]
Reviewed-by: Jordan Justen <[email protected]>
Signed-off-by: Ben Widawsky <[email protected]>
(cherry picked from commit 7cbd6608f544591bc6aadf48877608b30a78ccb8)
|
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: [email protected]
(cherry picked from commit 985b51551a9bafec86604714d5faf3065dad4812)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Without the clamping by NumLevels, the state tracker would reallocate the
texture storage (incorrect) and even fail to copy the base level image
after reallocation, leading to the graphical glitch of
https://bugs.freedesktop.org/show_bug.cgi?id=91993 .
A piglit test has been submitted for review as well (subtest of
arb_texture_storage-texture-storage).
v2: also bypass all calls to st_finalize_texture (suggested by Marek Olšák)
Cc: [email protected]
Reviewed-by: Marek Olšák <[email protected]>
(cherry picked from commit 24c90888aeaf90b13700389b91b74bf63ee9f28d)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Consider the case of two nearly identical GLSL fragment shaders:
out vec4 color;
void main() { color = vec4(1); }
and
layout(early_fragment_tests) in;
out vec4 color;
void main() { color = vec4(1); }
These shaders compile to the exact same assembly, but have distinct
values for brw_wm_prog_data::early_fragment_tests.
Since these are two independent GLSL shaders, they have different
program keys - notably, brw_wm_prog_key::program_string_id differs.
When uploading the second, brw_upload_cache will find an existing copy
of the assembly in the cache BO, which means matching_data will be
non-NULL. Although we create a second cache item (with the new key
and prog_data), we set item->offset to the existing copy and avoid
re-uploading duplicate assembly.
However, brw_search_cache() would only flag BRW_NEW_*_PROG_DATA if
item->offset differed from the supplied offset. With reuse, both
programs have the same offset, but prog_data changed. We have to
flag it, but failed to.
To fix this, we simply need to check if the aux (prog_data) pointer
changed. If either the assembly or the prog_data differs, flag it.
This fixes a regression since 1bba29ed403e735ba0bf04ed8aa2e571884f,
where Topi fixed brw_upload_cache() to actually reuse identical
assembly. Prior to that, reuse basically never happened due to bugs.
Unfortunately, this code apparently wasn't prepared to handle reuse!
Fixes GPU hangs in Dolphin on Broadwell.
Huge thanks to Pierre Bourdon and Ilia Mirkin for debugging this
and helping track down the real issue.
Cc: "11.0" <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92623
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
Tested-by: Pierre Bourdon <[email protected]>
(cherry picked from commit bf05af3f0e8769f417bbd995470dc1b8083a0df9)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously we could create a renderbuffer with format
MESA_FORMAT_R8G8B8A8_UNORM, convert that renderbuffer to an EGLImage,
then FAIL to convert the EGLImage back to a renderbuffer because
reasons. Just use the same check in
intel_image_target_renderbuffer_storage that brw_render_target_supported
uses.
There are more checks in brw_render_target_supported, but I don't think
they are necessary here. A different approach would be to refactor
brw_render_target_supported to take rb->Format and rb->NumSamples as
parameters (instead of a gl_renderbuffer) and use the new function here.
Fixes:
ES2-CTS.gtf.GL2ExtensionTests.egl_image.egl_image
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
Tested-by: Tapani Pälli <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92476
Cc: "10.3 10.4 10.5 10.6 11.0" <[email protected]>
(cherry picked from commit 7070c8879adff2a1204d7473f119d8194eff919b)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The refactoring commit, c6bf1cd, accidentally reverted cd49b97
and 99b1f47. These changes caused more code to be added to the
function and removed the existing support for ASTC. This patch
reverts those modifications.
v2. Actually include ASTC support again.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92221
Cc: "11.0" <[email protected]>
Signed-off-by: Nanley Chery <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
(cherry picked from commit f1147a238ab35a56fa7d1c64f6025ff3b909dad8)
[Emil Velikov]
- Drop the KHR_texture_compression_astc_ldr check
- Add texcompress.h include.
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Patch also refactors name length queries which were using array size
in computation, this has to be done in same time to avoid regression in
arb_program_interface_query-resource-query Piglit test.
Fixes rest of the failures with
ES31-CTS.program_interface_query.no-locations
v2: make additional check only for GS inputs
v3: create helper function for resource name length
so that it gets calculated only in one place
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Martin Peres <[email protected]>
(cherry picked from commit c0722be9f58ef89dae98d8c459ec4f9589f97748)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
intel_update_winsys_renderbuffer_miptree() will release the existing
miptree when wrapping a new DRI2 buffer, so we can remove the early
release and so prevent a NULL mt dereference should importing the new
DRI2 name fail for any reason. (Reusing the old DRI2 name will result
in the rendering going astray, to a stale buffer, and not shown on the
screen, but it allows us to issue a warning and not crash much later in
innocent code.)
Signed-off-by: Chris Wilson <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86281
Reviewed-by: Martin Peres <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
(cherry picked from commit 70e91d61fde239e8ae58148cacd4ff891126e2aa)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The src_reg constructor that received the glsl_type was using it
only to build the swizzle, but not to fill this->type as dst_reg
is doing.
This caused some type mismatch between movs and alu operations
on the NIR path, so copy propagation optimization was not applied
to remove unneeded movs if negate modifier was involved. This was
first detected on minus (negate+add) operations.
Shader DB results (taking into account only vec4):
total instructions in shared programs: 20019 -> 19934 (-0.42%)
instructions in affected programs: 2918 -> 2833 (-2.91%)
helped: 79
HURT: 0
GAINED: 0
LOST: 0
Reviewed-by: Matt Turner <[email protected]>
(cherry picked from commit 4de86e1371b0d59a5b9a787b726be3d373024647)
Nominated-by: Christoph Brill <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
opt_register_coalesce stopped to check previous instructions to
coalesce with if somebody else was writing on the same
destination. This can be optimized to check if somebody else was
writing to the same channels of the same destination using the
writemask.
Shader DB results (taking into account only vec4):
total instructions in shared programs: 1781593 -> 1734957 (-2.62%)
instructions in affected programs: 1238390 -> 1191754 (-3.77%)
helped: 12782
HURT: 0
GAINED: 0
LOST: 0
v2: removed some parenthesis, fixed indentation, as suggested by
Matt Turner
v3: added brackets, for consistency, as suggested by Eduardo Lima
Reviewed-by: Matt Turner <[email protected]>
(cherry picked from commit d4e29af2344c06490913efc35430f93a966061bb)
Nominated-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Fixes assertion failure with new piglit
arb_draw_buffers_blend-state_set_get test.
Cc: [email protected]
Reviewed-by: Jose Fonseca <[email protected]>
(cherry picked from commit e24d04e436ed48d4a0aac90590cbaa40da936208)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This avoids a serious r600g bug leading to a GPU hang.
The chances this bug will get fixed are pretty low now.
I deeply regret listening to others and not pushing this patch, leaving
other users with a GPU-crashing driver. Yes, it should be fixed
in the compiler and it's ugly, but users couldn't care less about that.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86720
Cc: 11.0 10.6 <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
(cherry picked from commit 814f31457e9ae83d4f1e39236f704721b279b73d)
|
|
|
|
|
|
|
|
| |
This allows removing FLUSH_VERTICES in MatrixMode.
Cc: [email protected]
Reviewed-by: Brian Paul <[email protected]>
(cherry picked from commit 3c6156a4a7b647cc55cbe3a4c13d53b5ffe505e6)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Otherwise there are problems when user overrides version and application
such as Piglit wants to detect used api with glGetString(GL_VERSION).
This makes it currently impossible to run glslparsertest tests for
OpenGL ES when using version override.
Below is example when using MESA_GLES_VERSION_OVERRIDE=3.1.
Before:
"3.1 Mesa 11.1.0-devel (git-24a1a15)"
After:
"OpenGL ES 3.1 Mesa 11.1.0-devel (git-78042ff)"
v2: only include api prefix for OpenGL ES (Boyan Ding)
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
Cc: "11.0" <[email protected]>
(cherry picked from commit dc8c221e2890cc9913dfc99e1e0fcb73c89af52c)
|
|
|
|
|
|
|
|
|
|
| |
Cc: "10.6 11.0" <[email protected]>
Fixes: 669cfc267a1 (android: mesa: fix the path of the SSE4_1
optimisations)
Signed-off-by: Chih-Wei Huang <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
(cherry picked from commit 67d8518a0e5a3df400a6e70de667d69e4b6ce9c5)
|
|
|
|
|
|
|
|
|
|
|
|
| |
pipe_surface_reference have problems with deleted contexts,
so use of pipe_surface_release might be more appropriate.
Fixes Wasteland 2 Director's Cut crash on start.
Cc: [email protected]
Reviewed-by: Brian Paul <[email protected]>
(cherry picked from commit 14f7ce42484c31a45fcb6aabdf503f7496a9a94c)
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The variable 'i' is a value in [0, MAT_ATTRIB_MAX-1] so subtracting
VERT_ATTRIB_GENERIC0 gave a bogus value and we executed the default
switch clause for all loop iterations.
This doesn't fix any known issues but was clearly incorrect.
Cc: [email protected]
Reviewed-by: Marek Olšák <[email protected]>
(cherry picked from commit dd293d8aae324ac7b9d5297e33a1e732e1f3f4d3)
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is the way layout(binding=xxx) works from GLSL. The old method
just happened to work (and significantly predated support for
layout(binding=xxx)), but future changes will break this.
v2: Remove some stale comments. Suggested by Matt and Chris Forbes.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Cc: "10.6 11.0" <[email protected]>
(cherry picked from commit 8acce5d53af44a3d1d05a26e69559fd35f835de4)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The initial glGetUniformdv support didn't cover all the
casting cases that are apparantly legal, and cts seems to
test for them.
I've updated the piglit test to cover these cases now.
v2: fix indentation - it's all broken in this file (Ilia)
fix src/dst index tracking in light of fp64 support (Ilia)
cc: "11.0" <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
(cherry picked from commit bcfaab38858fdcfbd8ffeaf6b0e3da8a726f02e6)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The point is to avoid having to re-validate all image units when
_NEW_TEXTURE is flagged, which can be expensive if the driver exposes
a large number of image units. This has been reported to fix a 36%
performance regression in the Synmark2 Multithread benchmark on the
i965 driver which exposes 192 image units.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91788
Reported-by: Wendy Wang <[email protected]>
Tested-by: Ye Tian <[email protected]>
CC: "11.0" <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
(cherry picked from commit 7e441bf025cf8c5d088430d546acb4c0ed58d27b)
|
|
|
|
|
|
|
|
|
| |
gl_image_unit::_Valid will be removed in a future commit.
Tested-by: Ye Tian <[email protected]>
CC: "11.0" <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
(cherry picked from commit 2d97a78b37ddf325d90e056f5eefee0548092530)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The call to _mesa_test_texobj_completeness() is unnecessary if the
texture is already known to be complete. If the texture object is
dirtied in the meantime _BaseComplete and _MipmapComplete will be
reset to false. _mesa_is_image_unit_valid() will start to be called
more frequently in a future commit, so it seems desirable to avoid the
unnecessary work.
Tested-by: Ye Tian <[email protected]>
CC: "11.0" <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
(cherry picked from commit 25d3338be37ddbfe676716034ec5f29e27323704)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A future commit will remove all texture object-dependent derived state
from the image unit struct to make validation unnecessary on texture
state changes. Instead of checking gl_image_unit::_Valid drivers will
be required to call this function when needed to find out whether an
image unit is in a valid state and whether access from the shader is
allowed.
Tested-by: Ye Tian <[email protected]>
CC: "11.0" <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
(cherry picked from commit 5152db415f4047569822d648fda09bdde4171d6d)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The hardware documentation relating to the UAV HW-assisted coherency
mechanism and UAV access enable bits is scarce and sometimes
contradictory, and there's quite some guesswork behind this commit, so
let me summarize the background first: HSW and later hardware have
infrastructure to support a stricter form of data coherency between
shader invocations from separate primitives. The mechanism is
controlled by the "Accesses UAV" bits on 3DSTATE_VS, _HS, _DS, _GS and
_PS (or _PS_EXTRA on BDW+), and the "UAV Coherency Required" bit on
the 3DPRIMITIVE command.
Regardless of whether "UAV Coherency Required" is set, the hardware
fixed-function units will increment a per-stage semaphore for each
request received if "Accesses UAV" is set for the same or any lower
stage. An implicit DC flush is emitted by the lowermost stage with
"Accesses UAV" set once it's done processing the request, this also
happens regardless of the value of "UAV Coherency Required". The
completion of the DC flush will cause the same stage and all previous
ones to decrement the semaphore, marking the UAV accesses for the
primitive as coherent with L3.
The "UAV Coherency Required" 3DPRIMITIVE bit will cause a pipeline
stall before any threads are dispatched for the first FF stage with
"Accesses UAV" set until the semaphore is cleared for the same stage.
Effectively this guarantees that UAV memory accesses performed by
previous primitives from any stage will be strictly ordered (and
thanks to the implicit DC flush visible in memory) with UAV accesses
from the following primitives.
None of this is required by the usual image, atomic counter and SSBO
GL APIs which have very relaxed cross-primitive coherency and ordering
requirements, so we don't actually ever set the "UAV Coherency
Required" bit -- Ordering with respect to shader invocations from
previous stages on the same primitive where there is a data dependency
is of course already guaranteed as the spec requires, regardless of
this mechanism being enabled. We do set the "Accesses UAV" bits
though since my commit ac7664e493655e290783c23a0412b9c70936da50 (which
this patch partially reverts), mainly because of comments like the
following from the BDW PRM:
> 3DSTATE_GS
>[...]
> 12 Accesses UAV
> Format: Enable
> This field must be set when GS has a UAV access.
There are similar comments in the documentation for the other
3DSTATE_*S commands. The "must" part is misleading and unjustified
AFAIK. Most of the "Accesses UAV" bits don't seem to have any side
effects other than the implicit DC flushes and the related
book-keeping in anticipation for a subsequent primitive with "UAV
Coherency Required" set, so in most cases they are unnecessary and may
incur a performance penalty. There is an exception though. On Gen8+
the PS_EXTRA UAV access bit influences the calculation of the PS
UAV-only and ThreadDispatchEnable signals which on previous
generations were set explicitly by the driver, so we cannot always
avoid enabling it on the PS stage.
The primary motivation for this change is that in fact the hardware
coherency mechanism is buggy and will cause a rather non-deterministic
hang on Gen8 when VS is the only stage with "Accesses UAV" set and the
processing of a request terminates immediately after the implicit DC
flush is sent for a previous primitive with no additional vertices
being emitted for the second primitive, what will cause the hardware
to skip sending a second DC flush and cause the VS to stall
indefinitely waiting for a response from the DC (BDWGFX HSD 1912017).
This hardware bug can be reproduced on current master with the
spec@arb_shader_image_load_store@host-mem-barrier@Indirect/RaW piglit
subtest (if you have the patience to run it a few dozen times).
The proposed workaround is to insert CS STALLs speculatively between
3DPRIMITIVE commands when "Accesses UAV" is enabled for the VS stage
only. Because this would affect one of the hottest paths in the
driver and likely decrease performance even further due to the
unnecessary serialization, and because we don't actually need the
implicit DC flushes, it seems better to just disable them.
Cc: 11.0 <[email protected]>
(cherry picked from commit 5346c1167064d6429c6338974c6342f8346fd34b)
|
|
|
|
|
|
|
|
|
|
|
|
| |
Patch adds missing type (used with NV_read_depth) so that it gets
handled correctly. This fixes errors seen with following CTS test:
ES3-CTS.gtf.GL3Tests.packed_pixels.packed_pixels
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
Cc: "11.0" <[email protected]>
(cherry picked from commit d8d0e4a81e42678cc8c8b876dfee24d5c2f4ba38)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The EXT_texture_format_BGRA8888 extension (which mesa supports
unconditionally) adds a new format and internal format called GL_BGRA_EXT.
Previously, this was not really handled at all in
_mesa_ex3_error_check_format_and_type. When the checks were tightened in
commit f15a7f3c, we accidentally tightened things too far and GL_BGRA_EXT
would always cause an error to be thrown.
There were two primary issues here. First, is that
_mesa_es3_effective_internal_format_for_format_and_type didn't handle the
GL_BGRA_EXT format. Second is that it blindly uses _mesa_base_tex_format
which returns GL_RGBA for GL_BGRA_EXT. This commit fixes both of these
issues as well as adds explicit checks that GL_BGRA_EXT is only ever used
with GL_BGRA_EXT and GL_UNSIGNED_BYTE.
Signed-off-by: Jason Ekstrand <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92265
Reviewed-by: Ian Romanick <[email protected]>
Cc: "11.0" <[email protected]>
(cherry picked from commit 6ad9ebb073fc4ed245ef8e9db4479a52e818cb92)
|
|
|
|
|
|
|
|
|
|
|
|
| |
The result of POW for a negative base is undefined. Even when the result
is multiplied by zero (which is the case here whenever the base is
negative), the Inf and NaNs can propagate past that.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91342
Signed-off-by: Daniel Scharrer <[email protected]>
Cc: "10.6 11.0" <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
(cherry picked from commit b3f9c5cc0fab6d770d220efd87ba3746c6673875)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The old code had some significant problems with respect to
sampler2DArray textures. The biggest problem was that some of the code
would use vec3 for the texture coordinate type, and other parts of the
code would use vec2. The resulting shader would not even compile.
Since there were not tests for this path, nobody noticed.
The input to the fragment shader is always treated as a vec3. If the
source data is only vec2, the vertex puller will supply 0 for the .z
component. The texture coordinate passed to the fragment shader is
always a vec2 that comes from the .xy part of the vertex shader input.
The layer, taken from the .z of the vertex shader input is passed
separately as a flat integer. If the generated fragment shader does not
use the layer integer, the GLSL linker will eliminate all the dead code
in the vertex shader.
Fixes the new piglit tests "blit-scaled samples=2 with
gl_texture_2d_multisample_array", etc. on i965.
Note for stable maintainer: This patch may depend on 46037237, and that
patch should be safe for stable.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
Cc: Topi Pohjolainen <[email protected]>
Cc: Jordan Justen <[email protected]>
Cc: "10.6 11.0" <[email protected]>
(cherry picked from commit 9bd9cf1fa402bf948020ee5d560259a5cfd2a739)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bring over the following fix from i965:
commit fb3d62fe3d4fc40ba4ad9804d8b6f451316c9ae2
Author: Kenneth Graunke <[email protected]>
Date: Tue Aug 6 14:36:09 2013 -0700
i965: Remember to call intel_prepare_render() before blitting.
Fixes a crash in the following piglit tests:
bin/fbo-sys-blit -auto
bin/fbo-sys-sub-blit -auto
Signed-off-by: Ville Syrjälä <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Cc: "11.0" <[email protected]>
(cherry picked from commit a1a3f0961b20907b6948959c1f224bb055bd4f3d)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
i915 fragment programs utilize the texture coordinate registers
for both texture coordinates and varyings. Unfortunately the
code doesn't check if the same index might be in use for both.
It just naively uses the index to pick a texture unit, which
could lead to collisions.
Add an extra mapping step to allocate non conflicting texture
units for both uses.
The issue can be reproduced with a pair of simple shaders like
these:
attribute vec4 in_mod;
varying vec4 mod;
void main() {
mod = in_mod;
gl_TexCoord[0] = gl_MultiTexCoord0;
gl_Position = gl_ModelViewProjectionMatrix * gl_Vertex;
}
varying vec4 mod;
uniform sampler2D tex;
void main() {
gl_FragColor = texture2D(tex, vec2(gl_TexCoord[0])) * mod;
}
Fixes many piglit tests on i915:
glsl-link-varyings-2
glsl-orangebook-ch06-bump
interpolation-none-gl_frontcolor-smooth-fixed
interpolation-none-gl_frontcolor-smooth-none
interpolation-none-gl_frontcolor-smooth-vertex
interpolation-none-gl_frontsecondarycolor-smooth-fixed
interpolation-none-gl_frontsecondarycolor-smooth-vertex
interpolation-none-gl_frontsecondarycolor-smooth-none
interpolation-none-other-flat-fixed
interpolation-none-other-flat-none
interpolation-none-other-flat-vertex
interpolation-none-other-smooth-fixed
interpolation-none-other-smooth-none
interpolation-none-other-smooth-vertex
v2 [idr]: Minor formatting tweaks.
Signed-off-by: Ville Syrjälä <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Cc: "11.0" <[email protected]>
(cherry picked from commit c349031c27b7f66151f07d785625c585e10a92c2)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I830_UPLOAD_RASTER_RULES and I830_UPLOAD_TEX(0) are trying to occupy
the same bit. Move the texture bits upwards a bit to make room for
I830_UPLOAD_RASTER_RULES.
Now the driver will actually upload the raster rules which is rather
important to get the provoking vertex right. Fixes the appearance
of glxgears teeth on gen2.
Signed-off-by: Ville Syrjälä <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Cc: "10.6 11.0" <[email protected]>
(cherry picked from commit 9504740f3e6698d860ac93310a33f51f01c10c4f)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For 8-bit RGB(A) texture formats we set the PIPE_BIND_RENDER_TARGET flag
to try to get a hardware format which also supports rendering (for FBO
textures). Do the same thing for floating point formats.
This allows the Redway3D Flat demo to run.
Cc: 10.6 11.0 <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
(cherry picked from commit cb758b892a7e62ff1f6187f2ca9ac543ff70a096)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
IVB and VLV hang sporadically when an untyped surface read or write
message is used to access a surface of format other than RAW, as may
happen when there is a mismatch between the format qualifier of the
image uniform and the format of the actual image bound to the
pipeline. According to the spec this condition gives undefined
results but may not lead to program termination (which is one of the
possible outcomes of the hang). Fix it by checking at runtime whether
the surface is of the right type.
Fixes the "arb_shader_image_load_store.invalid/format mismatch" piglit
subtest.
Reported-by: Mark Janes <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91718
CC: [email protected]
Reviewed-by: Ian Romanick <[email protected]>
(cherry picked from commit b61292296bd7e1876fdb64725a783a7e96f6c4c1)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the immutable compressed texture didn't have the full mip pyramid,
this didn't work, because it tried to generate mip levels for non-existing
levels. _mesa_prepare_mipmap_level() would correctly handle this by returning
FALSE if the mip level didn't exist, however we actually created the
non-existing mip level right before that because we used _mesa_get_tex_image()
before calling _mesa_prepare_mipmap_level(). It would then proceed to crash
(we allocated the mip level, which is a bad idea on an immutable texture,
but didn't initialize the values, leading to assertion failures or segfaults).
Fix this by using _mesa_select_tex_image() instead and call it after
_mesa_prepare_mipmap_level(), as that function will allocate missing mip levels
for non-immutable textures already.
This fixes a (2 year old) crash with astromenace which was hack-fixed in ubuntu
packages instead: http://bugs.debian.org/718680 (I guess most apps do full mip
chains - I believe this app not doing it is actually unintentional, always one
level less than full mip chain...).
Cc: "10.6 11.0" <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
(cherry picked from commit 19604d30e1351868f7f54847c91ffec7b3fcd27e)
|
|
|
|
|
|
|
|
|
| |
Broken by: d082c5324914212f76e45be497229c7a0681f706
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92072
Cc: 10.6 11.0 <[email protected]>
Tested-by: Ilia Mirkin <[email protected]>
(cherry picked from commit f3a081953393c7d40bd8df9ec22a2551d01098f5)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When validating format+type+internalFormat for texture pixel operations
on GLES3, the effective internal format should be used if the one
specified is an unsized internal format. Page 127, section "3.8 Texturing"
of the GLES 3.0.4 spec says:
"if internalformat is a base internal format, the effective internal
format is a sized internal format that is derived from the format and
type for internal use by the GL. Table 3.12 specifies the mapping of
format and type to effective internal formats. The effective internal
format is used by the GL for purposes such as texture completeness or
type checks for CopyTex* commands. In these cases, the GL is required
to operate as if the effective internal format was used as the
internalformat when specifying the texture data."
v2: Per the spec, Luminance8Alpha8, Luminance8 and Alpha8 should not be
considered sized internal formats. Return the corresponding unsize format
instead.
v4: * Improved comments in
_mesa_es3_effective_internal_format_for_format_and_type().
* Splitted patch to separate chunk about reordering of
error_check_subtexture_dimensions() error check, which is not directly
related with this patch.
v5: Dropped the splitted patch because it was actually a work around 3
dEQP tests that are buggy:
dEQP-GLES2.functional.negative_api.texture.texsubimage2d_neg_offset
dEQP-GLES2.functional.negative_api.texture.texsubimage2d_offset_allowed
dEQP-GLES2.functional.negative_api.texture.texsubimage2d_neg_wdt_hgt
Cc: "11.0" <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Tested-by: Mark Janes <[email protected]>
(cherry picked from commit 5edd9961c15a80d557ba42f48c97a471b23d9c5e)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91582
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This function will be needed as part of validating the combination of format,
type and internal format of texture pixel operations, which happens in
glformats files. Specifically, we want to be able to obtain the base format
of a resolved effective internal format, to compare it with the original
internal format passed.
Also, since this function deals solely with GL formats, it fits better in
glformats where the rest of similar format functionality rests.
The function is moved as-is, without any modification.
Cc: "11.0" <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Tested-by: Mark Janes <[email protected]>
(cherry picked from commit c6bf1cd1467ea5d5370394ba99366dd8a59a385c)
Signed-off-by: Emil Velikov <[email protected]>
Conflicts:
src/mesa/main/teximage.c
src/mesa/main/teximage.h
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The more specific GLES constrains should be checked after the general
validation performed by _mesa_error_check_format_and_type(). This is also
for consistency with the error checks order of glTexSubImage ops.
v3: The change of order uncovered a bug that regresses a couple of piglit
tests written against OpenGL-ES 1.1 spec, which expects an INVALID_VALUE
instead of the INVALID_ENUM returned by _mesa_error_check_format_and_type()
when an invalid format is passed to glTexImage2D. This version of the patch
accounts for those cases.
Fixes 1 dEQP test:
* dEQP-GLES3.functional.negative_api.texture.teximage2d
Cc: "11.0" <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Tested-by: Mark Janes <[email protected]>
(cherry picked from commit 15ab968f62dd322ecda6d70b1069f52616fe39bb)
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When we assign hw regs to attributes, we don't incorporate the stride
and subreg_offset from the fs_reg. It's rarely used, but the integer
multiplication lowering uses unusual stride and subreg_offset
combination breaks when one source is an attribute.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91970
Cc: "10.6 11.0" <[email protected]>
Signed-off-by: Kristian Høgsberg Kristensen <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
(cherry picked from commit 2ea16966ae66d4dd5c5dcb996d7996d9c734bbee)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is used everywhere else in this file because it avoids problems
when count is zero (due to trimming).
No piglit regressions on i915 (G33) or radeon (Radeon 7500).
Signed-off-by: Ian Romanick <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=38109
Reviewed-by: Brian Paul <[email protected]>
Cc: Marius Predut <[email protected]>
Cc: "10.6 11.0" <[email protected]>
(cherry picked from commit 25543d8ec506ef32599af6f5e0dd735e01b39909)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This was missing in the HAVE_TRIANGLES path, and that could cause
incorrect rendering.
No piglit regressions on i915 (G33) or radeon (Radeon 7500).
Signed-off-by: Ian Romanick <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=38109
Reviewed-by: Brian Paul <[email protected]>
Cc: Marius Predut <[email protected]>
Cc: "10.6 11.0" <[email protected]>
(cherry picked from commit c0b3b2f7603eab210acdb2e654e5411fe912ca34)
|
|
|
|
|
|
|
|
|
| |
No piglit regressions on i915 (G33) or radeon (Radeon 7500).
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Cc: "10.6 11.0" <[email protected]>
(cherry picked from commit 0d475ee2b989ac1697720ca391913e9158156bdc)
|
|
|
|
|
|
|
|
|
| |
No piglit regressions on i915 (G33) or radeon (Radeon 7500).
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Cc: "10.6 11.0" <[email protected]>
(cherry picked from commit fad8d54de7e7f908cb0d06f0b54af8440e689928)
|