| Commit message (Collapse) | Author | Age | Files | Lines |
| |
This should be more efficient than the previous snprintf() solution.
But more importantly, it avoids a buffer overflow bug that could result
in crashes or unpredictable results when processing very large interface
blocks.
For the app in question, key->length = 103 for some interfaces. The check
if size >= sizeof(hash_key) was insufficient to prevent overflows of the
hash_key[128] array because it didn't account for the terminating zero.
In this case, this caused the call to hash_table_string_hash() to return
different results for identical inputs, and then shader linking failed.
This new solution also takes all structure fields into account instead
of just the first 15 when sizeof(pointer)==8.
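
A minimal sketch of the idea described above, assuming a simplified key layout: the hash is accumulated directly from the field count and every field's type pointer, so no fixed-size string buffer is involved. The names `record_key`, `field_types`, and `record_key_hash`, and the multiplier 13, are illustrative assumptions, not the code from the commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a record type key: only the members used here. */
struct record_key {
    unsigned length;                 /* number of structure fields */
    const void *const *field_types;  /* one type pointer per field */
};

/* Fold the field count and every field's type pointer into one value.
 * Nothing is formatted into a buffer, so there is nothing to overflow,
 * and all fields contribute regardless of pointer size. */
static uint64_t record_key_hash(const struct record_key *key)
{
    assert(key != NULL);

    uint64_t hash = key->length;
    for (unsigned i = 0; i < key->length; i++)
        hash = hash * 13u + (uint64_t)(uintptr_t)key->field_types[i];
    return hash;
}
```

Because the multiplier is odd, a change in any single field pointer changes the 64-bit result, including fields far beyond the fifteenth.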
Cc: [email protected]
Reviewed-by: Ian Romanick <[email protected]>
(cherry picked from commit 31667e6237d30188d0b29e17f5b9892f10c0d83a)
| |
Commit dd6f641303c(mesa: Build with subdir-objects.) removed the SRCDIR
variable, but forgot to update all references of it.
v2: Fix path - must be relative to LOCAL_PATH. (Chih-Wei)
Cc: "10.5" <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Chih-Wei Huang <[email protected]>
(cherry picked from commit 669cfc267a1102ff903b3e562f9aa45a410e0312)
| |
Required by the i965 driver.
v2:
- Split out the nir_builder_opcodes.h rules.
- Do not unconditionally hide the python command - use $(hide)
- Use LOCAL_EXPORT_C_INCLUDE_DIRS to manage includes for the generated
sources.
Cc: "10.5" <[email protected]>
[Emil Velikov: Split from a larger commit, v2]
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Chih-Wei Huang <[email protected]>
(cherry picked from commit 06619749a11651a50e353168c7c793082820585d)
| |
The dri modules depend on symbols provided by it.
Cc: "10.5" <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Chih-Wei Huang <[email protected]>
(cherry picked from commit 618885f71fcacb3d68bf37fa23be36830d4178d2)
| |
Required by the format_{un,}pack rework. Otherwise the build will fail
to locate the respective headers - format_{un,}pack.h
Cc: "10.5" <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Chih-Wei Huang <[email protected]>
(cherry picked from commit 0afbd2df0485cd480979d9f4cdae00262d1a3c62)
| |
Otherwise we'll fail to find the drm.h header.
Cc: "10.4 10.5" <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
(cherry picked from commit 8d90bfb724f89b04d703f869362cf2fc2a3d7567)
| |
... via local_shared_libraries. Otherwise the sync/sync.h header won't
be found.
Note: 10.5 and earlier will need similar change in st/egl.
v2: Append the library to the local_shared_libraries list. (Chih-Wei)
Cc: "10.4 10.5" <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Chih-Wei Huang <[email protected]>
(cherry picked from commit 2d06791f6f9e8ab37109be52e63d247bbbcb42d4)
| |
Missed out with commit e1fdcddafe9(mesa: Autogenerate format_unpack.c)
v2: Conditionally print the python commands - s/@/$(hide) / (Chih-Wei)
Cc: "10.5" <[email protected]>
[Emil Velikov: Split out from a larger commit.]
Signed-off-by: Emil Velikov <[email protected]>
(cherry picked from commit 5f7081eb90bc5a25f0740314fa22e04d189238ca)
| |
Many parts of mesa already have the include with others depending on it
but it's missing. Add it once at the top makefile and be done with it.
Cc: "10.4 10.5" <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Chih-Wei Huang <[email protected]>
(cherry picked from commit 6fb801786604c270fae99c3d665dcebaa0bff3a6)
| |
... to manage the LIBDRM*_CFLAGS. The former is the recommended approach
by the Android build system developers while the latter has been
deprecated for quite some time.
Cc: "10.4 10.5" <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
(cherry picked from commit 86919352e3da1c80409fdcb67c36f29a9687b7a9)
| |
Appears to fix shader compilation. Tested by starting the client,
dragging the "quality and speed" slider back and forth, and watching the
console output - instead of piles of "shader failed to compile", the CPU
seems to be busy compiling shaders. I haven't actually tried to play.
Signed-off-by: Kenneth Graunke <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69226
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71591
Cc: [email protected]
(cherry picked from commit 00bf7d2e9cd60dbd82d25b459c448e11c545a89a)
| |
Cc: 10.4 10.5 <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
(cherry picked from commit dcc74d47c40bf117f2dfaa359f9de7faef2c2200)
| |
This fixes piglit shaders@glsl-fs-uniform-array-loop-unroll with immediate
shader compilation - it's a compiler test, so it has never been translated
to TGSI before.
Cc: 10.4 10.5 <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
(cherry picked from commit 14c5bc3b9a6b03a8e42ef79da66d8b81b239cf96)
| |
The ir_tex opcode turns into a sample or sample_c message, which will try to
compute derivatives to determine the lod. This produces garbage for
non-fragment shaders where the sample coordinates don't correspond to
subspans.
We fix this by rewriting the opcode from ir_tex to ir_txl and setting the
lod to 0.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89457
Cc: "10.5" <[email protected]>
Signed-off-by: Kristian Høgsberg <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
(cherry picked from commit 993a6288f72fa98932df7cdb6f64d9dd645e670d)
| |
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
Cc: "10.5" <[email protected]>
(cherry picked from commit bc672e261c5f7ff56cd2b8f6b518ebfdc0163bb7)
| |
new_prim was declared as a stack variable within a nested scope; we
tried to retain a pointer to that data beyond the scope, which is bogus.
GCC with -O1 eliminated most of the code that set new_prim's fields.
Move the declaration to fix the bug.
v2: Also fix new_ib (thanks to Matt Turner and Ben Widawsky).
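
The lifetime problem can be illustrated with a small sketch. The struct and function names below (`prim`, `draw`, `submit`) are hypothetical and only mirror the shape of the bug: a pointer that outlives a nested scope must point at storage declared in the enclosing scope.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified primitive description. */
struct prim {
    int start;
    int count;
};

/* Stand-in for the code that later consumes the primitive. */
static int draw(const struct prim *p)
{
    return p->start + p->count;
}

static int submit(const struct prim *in, int needs_rewrite)
{
    assert(in != NULL);

    /* Declared in the enclosing scope, so it lives as long as 'p' below.
     * Declaring it inside the 'if' block would leave 'p' dangling once the
     * block ends, and the compiler may then drop the stores to its fields. */
    struct prim new_prim;
    const struct prim *p = in;

    if (needs_rewrite) {
        new_prim = *in;
        new_prim.start = 0;
        p = &new_prim;
    }

    return draw(p); /* 'p' is still valid here in both branches */
}
```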
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81025
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Ben Widawsky <[email protected]>
Cc: [email protected]
(cherry picked from commit 406df68736a213f17f21a38a7c2da4ea15acd053)
| |
We create textures internally for texsubimage, and we use
the values from sub image to create a new texture, however
we don't align these to valid sizes, and cube map arrays
must have an array size aligned to 6.
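
As a rough illustration of the alignment requirement, the sketch below rounds a layer count up to the next multiple of 6. Since 6 is not a power of two, the usual mask-based alignment trick does not apply. The helper names are assumptions made for this example, not the functions touched by the commit.

```c
#include <assert.h>

/* Round 'value' up to the next multiple of 'alignment'.
 * Works for any non-zero alignment, not only powers of two. */
static unsigned align_npot(unsigned value, unsigned alignment)
{
    assert(alignment != 0);
    return (value + alignment - 1) / alignment * alignment;
}

/* Hypothetical helper: number of layers to allocate for a temporary
 * cube map array so that it always holds whole cubes (6 faces each). */
static unsigned cube_array_layers(unsigned requested_layers)
{
    return align_npot(requested_layers, 6);
}
```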
This fixes texsubimage cube_map_array on CAYMAN at least,
(it was causing GPU hang and bad values), it probably
also fixes it on radeonsi and evergreen.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89957
Tested-by: Tom Stellard <[email protected]>
Cc: [email protected]
Reviewed-by: Marek Olšák <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
(cherry picked from commit cc5860e40787b3afe36856674f028e830685271b)
| |
Since we can subimage upload a number of cube map array layers,
that aren't a complete cube map array, we should specify things
as a 2D array and blit from that.
Suggested by Ilia Mirkin as an alternate fix for texsubimage
cube map array issues.
Seems to work just as well.
Cc: [email protected]
Reviewed-by: Marek Olšák <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
(cherry picked from commit 5ed79312ed99f3b141c35569b9767f82f5ba0a93)
| |
This change fixes a regression with timer queries introduced with
commit 3eb6258. There the pending batchbuffer is flushed
only if glEndQuery is executed. This present change adds such
a flush to glQueryCounter which also schedules a value query
just like glEndQuery does. The patch fixes GPU timer queries
going mad from within osgviewer.
Reviewed-by: Kenneth Graunke <[email protected]>
Signed-off-by: Mathias Froehlich <[email protected]>
Cc: [email protected]
(cherry picked from commit 1e1d5456ba3dff82301ad4bbdde2fb6e2f562fe3)
| |
Accidentally added with commit 64d0f0e3b24(radeonsi: Cache
LLVMTargetMachineRef in context instead of in screen)
Reported-by: Michel Dänzer <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
| |
Otherwise the scons build will fail.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89905
Signed-off-by: Emil Velikov <[email protected]>
| |
Increase the device info .urb.size for CHV to match the default URB
size (192kB).
Reviewed-by: Kenneth Graunke <[email protected]>
Signed-off-by: Ville Syrjälä <[email protected]>
(cherry picked from commit 970dc2360372a7859691d690bd2f1976c3c97fb0)
| |
When nvc0_push_vbo calls nouveau_scratch_done it does not mean
scratch buffers can be freed immediately. It means "when hardware
advances to this place in the command stream the scratch buffers
can be freed".
To fix it, just postpone scratch runout destruction until the current
fence is signalled.
The bug existed for a very long time. Nobody noticed, because
"scratch runout" code path is rarely executed.
Fixes hang at the very beginning of first mission in "Serious Sam 3"
on nve7/gk107. It manifested as:
nouveau E[ PFIFO][0000:01:00.0] read fault at 0x000a9e0000 [PTE] from GR/GPC0/PE_2 on channel 0x007f853000 [Sam3[17056]]
Cc: "10.4 10.5" <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
(cherry picked from commit f9e2295560f9b4869fa2a94933c1881ec7970af4)
| |
We limit y-tiling to 0x20 when depth is involved. However the function is
run for each miplevel, and the hardware expects miplevel 0 to have the
highest tiling settings. Perform the y-tiling limit on all levels of a
3d texture, not just the ones that have depth.
Fixes:
texelFetch fs sampler3D 98x129x1-98x129x9
Signed-off-by: Ilia Mirkin <[email protected]>
Tested-by: Nick Tenney <[email protected]> # GT216
Cc: "10.4 10.5" <[email protected]>
(cherry picked from commit ae720c66cb91c2640dfd6707446899694a24ab5b)
| |
Haswell hardware seems to ignore Render Stream Select bits from
3DSTATE_STREAMOUT packet when the SOL stage is disabled even if
the PRM says otherwise. Because of this, all primitives are sent
down the pipeline for rasterization, which is wrong. If SOL is
enabled, Render Stream Select is honored and primitives bound to
non-zero streams are discarded after stream output.
Since the only purpose of primitives sent to non-zero streams is to
be recorded by transform feedback, we can simply discard all geometry
bound to non-zero streams when transform feedback is disabled
to prevent it from ever reaching the rasterization stage.
Notice that this patch introduces a small change in the behavior we
get when a geometry shader emits more vertices than the maximum declared:
before, a vertex that was emitted to a non-zero stream when TF was
disabled would still count for the purposes of checking that we don't
exceed the maximum number of output vertices declared by the shader. With
this change, these vertices are completely ignored and won't increase
the output vertex count, making more room for other (hopefully more
useful) vertices.
Fixes piglit test arb_gpu_shader5-emitstreamvertex_nodraw on Haswell
and Broadwell.
v2 (Ken): Drop is_haswell check in favor of doing this unconditionally.
Broadwell needs the workaround as well, and it doesn't hurt to do it in
general. Also tweak comments - the Haswell PRM does actually mention
this ("Command Reference: Instructions" page 797).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83962
Reviewed-by: Kenneth Graunke <[email protected]>
Cc: [email protected]
(cherry picked from commit 2042a2f961a07e04eaca0347e42859c249325531)
| |
Fixes Piglit's arb_gpu_shader5-xfb-streams-without-invocations.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
Cc: [email protected]
(cherry picked from commit f368d0fa1fe37a58780ee555d4a9ccf15474782b)
| |
Jordan added this in commit 741782b5948bb3d01d699f062a37513c2e73b076 for
Gen7 platforms. I missed this when adding the Broadwell code.
Fixes Piglit's spec/arb_gpu_shader5/invocation-id-{basic,in-separate-gs}
with MESA_EXTENSION_OVERRIDE=GL_ARB_gpu_shader5 set.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
Cc: [email protected]
(cherry picked from commit f9e5dc0a85df8dbfb8213ff772dfeb218972db12)
| |
Commit 1a170980a09 started writing to q->data[4]/[5] but kept the
per-query space at 16, which meant that in some cases we would write
past the end of the buffer. Rotate by 32, like nvc0 does. This ensures
that we always have 32 bytes in front of us, and the data writes will go
within the allocated space.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89679
Signed-off-by: Ilia Mirkin <[email protected]>
Tested-by: Nick Tenney <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Tobias Klausmann <[email protected]>
Cc: "10.4 10.5" <[email protected]>
(cherry picked from commit ba353935a392d2a43422f1d258456336b40b60ea)
| |
This will allow us to finally remove python from the build time
dependencies list. Considering that you're building from a release
tarball of course :-)
Cc: Bernd Kuhls <[email protected]>
Reported-by: Bernd Kuhls <[email protected]>
Cc: "10.5" <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
(cherry picked from commit a665b9b3c89095923cf2251895afc69c9f79aafe)
| |
fails v2
v2:
- Don't use _errs map
Cc: 10.5 10.4 <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
(cherry picked from commit fda7558057a301a5a0ee1cb4d68f09ea39b03bb3)
| |
Fixes a crash in genymotion with several threads compiling shaders
concurrently.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89746
Cc: 10.5 <[email protected]>
Reviewed-by: Tom Stellard <[email protected]>
(cherry picked from commit d64adc3a79e419062432cfa8d1cbc437676a3fbd)
Conflicts:
src/gallium/drivers/radeonsi/si_shader.c
| |
The division is probably a holdover from the days when the fixed point
inline functions generated by headergen were broken.
Also reduce the maximum point size to 4092 (vs 4096), which is what the
blob does.
Cc: "10.4 10.5" <[email protected]>
Signed-off-by: Ilia Mirkin <[email protected]>
(cherry picked from commit 7fc5da8b9392042b5f8a989d2afa49ea1944f9a9)
| |
The SZ2 field contains the layer size of a lower miplevel. It only
contains 4 bits, which limits the maximum layer size it can describe. In
situations where the next miplevel would be too big, the hardware
appears to keep minifying the size until it hits one of that size.
Unfortunately the hardware's ideas about sizes can differ from
freedreno's, which can still lead to issues. Minimize those by stopping
the minification as soon as possible.
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: "10.4 10.5" <[email protected]>
(cherry picked from commit 738c8319ac85b175994b35d1fdc4860e18184b93)
| |
Cc: "10.4 10.5" <[email protected]>
Signed-off-by: Ilia Mirkin <[email protected]>
(cherry picked from commit 58030a8f99d94d6c1bab02ef113d93c6c2636216)
| |
Multiply operations can have a post-factor on them, which other ops
don't support. Only perform the peephole optimizations when there is no
post-factor involved.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89758
Cc: "10.4 10.5" <[email protected]>
Signed-off-by: Ilia Mirkin <[email protected]>
(cherry picked from commit 49b86007aa2bb599ada6cdbed7ff56246917f12e)
| |
Fixes the recently-sent gl-2.0-vertex-const-attr piglit test. Makes sure
to revalidate arrays when only the current attribute has been updated
via glVertexAttrib*.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89754
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Cc: "10.4 10.5" <[email protected]>
(cherry picked from commit 9d1b5febb62d74c9fc564635d4e0fa5207928c46)
| |
Don't propagate ARRAYs
This should fix:
https://bugs.freedesktop.org/show_bug.cgi?id=89759
v2: just specify arrays so we get input propagation
Signed-off-by: Dave Airlie <[email protected]>
Cc: [email protected]
Reviewed-by: Ilia Mirkin <[email protected]>
(cherry picked from commit 91e3533481d6921c4b46109742d6f67b7f897f86)
| |
in different fragment shaders. This also applies to a case when gl_FragCoord
is redeclared with no layout qualifiers in one fragment shader and not
declared but used in other fragment shader.
Signed-off-by: Anuj Phogat <[email protected]>
Khronos Bug#12957
Cc: "10.5" <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
(cherry picked from commit d8208312a3a200b4e6d71ce533d835b2d705234a)
| |
glXGetProcAddress("glFoo") ends up in stub_add_dynamic() to
create dynamic stubs for dynamic functions. stub_add_dynamic()
doesn't store the caller provided name string "Foo" in a mesa
private copy, but just stores a pointer to the "glFoo" string
passed to glXGetProcAddress - a pointer into arbitrary memory
outside mesa's control.
If the caller passes some dynamically allocated/changing
memory buffer to glXGetProcAddress(), or the caller gets unmapped
from memory, e.g., some dynamically loaded application
plugin which uses OpenGL, this ends badly - with a dangling
pointer.
strdup() the name string provided by the client to avoid
this problem.
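
A minimal sketch of the ownership change, assuming a simplified stub record. The `dynamic_stub` struct and `stub_set_name` helper are hypothetical. The copy is written with `malloc()` and `memcpy()`, which has the same effect as `strdup()`, only to keep the example within ISO C.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stub record. */
struct dynamic_stub {
    char *name; /* privately owned copy of the function name */
};

/* Store a private copy of 'name' instead of the caller's pointer, so the
 * stub stays valid even if the caller later frees, reuses, or unmaps the
 * original buffer. Returns 0 on success, -1 if allocation fails. */
static int stub_set_name(struct dynamic_stub *stub, const char *name)
{
    assert(stub != NULL && name != NULL);

    size_t len = strlen(name) + 1; /* include the terminating zero */
    char *copy = malloc(len);
    if (copy == NULL)
        return -1;

    memcpy(copy, name, len);
    stub->name = copy;
    return 0;
}
```

The stub now owns its name, so whoever destroys the stub would also need to `free()` it.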
Cc: "10.3 10.4 10.5" <[email protected]>
Signed-off-by: Mario Kleiner <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
(cherry picked from commit 1110113a7f0b6f9b21dd26dee8e95a021041c71c)
| |
The storage size for local kernel args can be queried before the
arguments are set by using the CL_KERNEL_LOCAL_MEM_SIZE param
of clGetKernelWorkGroupInfo().
The spec says that if local kernel arguments have not been specified,
then we should assume their size is 0.
v2:
- Implement using c++11 member initialization.
Reviewed-by: Jan Vesely <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
Cc: 10.5 10.4 <[email protected]>
(cherry picked from commit dfb1ae9d914b7723ef50fdd2efe811feebc045ad)
| |
Patch changes lowering pass to use unique name for each uniform
so that arrays from different stages cannot end up having same
name.
v2: instead of global counter, use pointer to achieve
unique name (Kenneth Graunke)
Signed-off-by: Tapani Pälli <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89590
Reviewed-by: Chris Forbes <[email protected]>
Cc: 10.5 10.4 <[email protected]>
(cherry picked from commit 3cf99701ba6c9e56c9126fdbb74107a31ffcbcfb)
| |
Both do_vs_prog and do_gs_prog initialize brw_stage_prog_data::nr_params to
the number of uniform *vectors* required by the shader rather than the number
of uniform components, contradicting the comment. This is inconsistent with
what the state upload code and scalar path expect but it happens to work until
Gen8 because vec4_visitor interprets it as a number of vectors on construction
and later on overwrites its original value with the number of uniform
components referenced by the shader.
Also there's no need to add the number of samplers, they're not actually
passed in as uniforms.
Fixes a memory corruption issue on BDW with SIMD8 VS.
Cc: "10.5" <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
(cherry picked from commit fd149628e142af769c1c0ec037bc297d8a3e871f)
[Emil Velikov: s/DIV_ROUND_UP/CEILING/]
Signed-off-by: Emil Velikov <[email protected]>
| |
radeon_llvm_emit_prepare_cube_coords uses coords[4] in some cases (TXB2 etc.)
Discovered by Coverity. Reported by Ilia Mirkin.
Cc: 10.5 10.4 <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
(cherry picked from commit a984abdad39df2d8ceb4c46e11f4ce1344c36c86)
| |
The code for emitting INTEL_swap_events swap completion
events needs to translate from 32-Bit sbc on the wire to
64-Bit sbc for the events and handle wraparound accordingly.
It assumed that events would be sent by the server in the
order their corresponding swap requests were emitted from
the client, iow. sbc count should be always increasing. This
was correct for DRI2.
This is not always the case under the DRI3/Present backend,
where the Present extension can execute presents and send out
completion events in a different order than the submission
order of the present requests, due to client code specifying
targetMSC target vblank counts which are not strictly
monotonically increasing. This confused the wraparound
handling. This patch fixes the problem by handling 32-Bit
wraparound in both directions. As long as successive swap
completion events real 64-Bit sbc's don't differ by more
than 2^30, this should be able to do the right thing.
How this is supposed to work:
awire->sbc contains the low 32-Bits of the true 64-Bit sbc
of the current swap event, transmitted over the wire.
glxDraw->lastEventSbc contains the low 32-Bits of the 64-Bit
sbc of the most recently processed swap event.
glxDraw->eventSbcWrap is a 64-Bit offset which tracks the upper
32-Bits of the current sbc. The final 64-Bit output sbc
aevent->sbc is computed from the sum of awire->sbc and
glxDraw->eventSbcWrap.
Under DRI3/Present, swap completion events can be received
slightly out of order due to non-monotonic targetMsc specified
by client code, e.g., present request submission:
Submission sbc: 1 2 3
targetMsc: 10 11 9
Reception of completion events:
Completion sbc: 3 1 2
The completion sequence 3, 1, 2 would confuse the old wraparound
handling made for DRI2 as 1 < 3 --> Assumes a 32-Bit wraparound
has happened when it hasn't.
The client can queue multiple present requests, in the case of
Mesa up to n requests for n-buffered rendering, e.g., n = 2-4 in
the current Mesa GLX DRI3/Present implementation. In the case of
direct Pixmap presents via xcb_present_pixmap() the number n is
limited by the amount of memory available.
We reasonably assume that the number of outstanding requests n is
much less than 2 billion due to memory constraints and common sense.
Therefore while the order of received sbc's can be a bit scrambled,
successive 64-Bit sbc's won't deviate by much, a given sbc may be
a few counts lower or higher than the previous received sbc.
Therefore any large difference between the incoming awire->sbc and
the last recorded glxDraw->lastEventSbc will be due to 32-Bit
wraparound and we need to adapt glxDraw->eventSbcWrap accordingly
to adjust the upper 32-Bits of the sbc.
Two cases, corresponding to the two if-statements in the patch:
a) Previous sbc event was below the last 2^32 boundary, in the previous
glxDraw->eventSbcWrap epoch, the new sbc event is in the next 2^32
epoch, therefore the low 32-Bit awire->sbc wrapped around to zero,
or close to zero --> awire->sbc is apparently much lower than the
glxDraw->lastEventSbc recorded for the previous epoch
--> We need to increment glxDraw->eventSbcWrap by 2^32 to adjust
the current epoch to be one higher than the previous one.
--> Case a) also handles the old DRI2 behaviour.
b) Previous sbc event was above closest 2^32 boundary, but now a
late event from the previous 2^32 epoch arrives, with a true sbc
that belongs to the previous 2^32 segment, so the awire->sbc of
this late event has a high count close to 2^32, whereas
glxDraw->lastEventSbc is closer to zero --> awire->sbc is much
greater than glxDraw->lastEventSbc.
--> We need to decrement glxDraw->eventSbcWrap by 2^32 to adjust
the current epoch back to the previous lower epoch of this late
completion event.
We assume such a wraparound to a higher (a) epoch or lower (b)
epoch has happened if awire->sbc and glxDraw->lastEventSbc differ
by more than 2^30 counts, as such a difference can only happen
on wraparound, or if somehow 2^30 present requests would be pending
for a given drawable inside the server, which is rather unlikely.
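
The two cases above can be sketched as follows. This is a simplified illustration under stated assumptions, not the patch itself: `drawable_state` stands in for the per-drawable fields `lastEventSbc` and `eventSbcWrap`, and `widen_sbc` is a hypothetical helper that returns the reconstructed 64-bit sbc for one incoming 32-bit wire value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-drawable bookkeeping, modelled on the description. */
struct drawable_state {
    uint32_t last_event_sbc; /* low 32 bits of the last processed sbc */
    int64_t event_sbc_wrap;  /* offset carrying the upper 32 bits */
};

static int64_t widen_sbc(struct drawable_state *d, uint32_t wire_sbc)
{
    assert(d != NULL);

    const int64_t threshold = INT64_C(1) << 30; /* "large difference" */
    const int64_t epoch = INT64_C(1) << 32;     /* one 32-bit wraparound */
    const int64_t cur = wire_sbc;
    const int64_t last = d->last_event_sbc;

    /* Case a): wire value is much lower than the last one, so the counter
     * wrapped into the next epoch. */
    if (cur < last - threshold)
        d->event_sbc_wrap += epoch;

    /* Case b): wire value is much higher than the last one, so this is a
     * late event that belongs to the previous epoch. */
    if (cur > last + threshold)
        d->event_sbc_wrap -= epoch;

    d->last_event_sbc = wire_sbc;
    return cur + d->event_sbc_wrap;
}
```

With this logic the out-of-order sequence 3, 1, 2 passes through unchanged, because no pair of successive values differs by more than 2^30.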
v2: Explain the reason for this patch and the new wraparound handling
much more extensively in the commit message, no code change wrt. the
initial version.
Cc: "10.3 10.4 10.5" <[email protected]>
Signed-off-by: Mario Kleiner <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
(cherry picked from commit cc5ddd584d17abd422ae4d8e83805969485740d9)
| |
Squash this silly typo introduced with commit c63eb5dd5ec(auxiliary/os: get
the mmap/munmap wrappers working with android)
Cc: "10.4 10.5" <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
(cherry picked from commit 55f0c0a29f788c5df4820e81c0cf93613ccedf5e)
| |
Required by fstat(), otherwise we'll error out due to implicit function
declaration.
Cc: "10.4 10.5" <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89530
Signed-off-by: Emil Velikov <[email protected]>
Reported-by: Vadim Rutkovsky <[email protected]>
Tested-by: Vadim Rutkovsky <[email protected]>
(cherry picked from commit 771cd266b9d00bdcf2cf7acaa3c8363c358d7478)
| |
Fix a3xx texture layer-size.
Signed-off-by: Rob Clark <[email protected]>
Cc: "10.4 10.5" <[email protected]>
(cherry picked from commit e92bc6b38e90339a394e95a562bcce35c3ee9696)
| |
For example if width were 65, the first slice would get 96 while the
second would get 32. However the hardware appears to expect the second
pitch to be 64, based on halving the 96 (and aligning up to 32).
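
The arithmetic in this example can be checked with a short sketch. It contrasts a pitch computed from the minified width at each level with a pitch derived by halving the previous level's pitch and aligning up to 32. The function names are hypothetical and the code only reproduces the numbers quoted above; it is not the driver's layout code.

```c
#include <assert.h>

/* Round 'value' up to a multiple of 'alignment'. */
static unsigned align_up(unsigned value, unsigned alignment)
{
    assert(alignment != 0);
    return (value + alignment - 1) / alignment * alignment;
}

/* Halve a dimension once per mip level, never going below 1. */
static unsigned minify(unsigned value, unsigned levels)
{
    unsigned m = value >> levels;
    return m ? m : 1;
}

/* Pitch taken from the minified width of each level independently. */
static unsigned pitch_from_width(unsigned width, unsigned level)
{
    return align_up(minify(width, level), 32);
}

/* Pitch derived from the previous level's pitch: halve it, then align. */
static unsigned pitch_from_previous(unsigned width, unsigned level)
{
    unsigned pitch = align_up(width, 32);
    for (unsigned l = 0; l < level; l++)
        pitch = align_up(minify(pitch, 1), 32);
    return pitch;
}
```

For a width of 65, both functions give 96 at level 0. At level 1 the first gives 32 while the second gives 64, the value the hardware appears to expect.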
This fixes texelFetch piglit tests on a3xx below a certain size. Going
higher they break again, but most likely due to unrelated reasons.
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: "10.4 10.5" <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
(cherry picked from commit 620e29b74821fd75b24495ab2bfddea53fc75350)
| |
We only program in one layer size per texture, so that means that all
levels must share one size. This makes the piglit test
bin/texelFetch fs sampler2DArray
have the same breakage as its non-array version instead of being
completely off, and makes
bin/ext_texture_array-gen-mipmap
start passing.
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: "10.4 10.5" <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
(cherry picked from commit 89b26d5a360ebde11a69f2cdefa66e4d6a2a13fd)
| |
The optimization done by commit 34ec1a24d did not take it into account.
Fixes:
dEQP-GLES3.functional.shaders.random.all_features.fragment.20
Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Cc: "10.4 10.5" <[email protected]>
(cherry picked from commit b43bbfa90ace596c8b2e0b3954a5f69924726c59)
|