| Commit message | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
| |
Commit dd6f641303c (mesa: Build with subdir-objects.) removed the SRCDIR
variable, but did not update all references to it.
v2: Fix path - must be relative to LOCAL_PATH. (Chih-Wei)
Cc: "10.5" <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Chih-Wei Huang <[email protected]>
(cherry picked from commit 669cfc267a1102ff903b3e562f9aa45a410e0312)
|
|
|
|
|
|
|
|
|
| |
The dri modules depend on symbols provided by it.
Cc: "10.5" <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Chih-Wei Huang <[email protected]>
(cherry picked from commit 618885f71fcacb3d68bf37fa23be36830d4178d2)
|
|
|
|
|
|
|
|
|
|
| |
Required by the format_{un,}pack rework. Otherwise the build will fail
to locate the respective headers - format_{un,}pack.h
Cc: "10.5" <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Chih-Wei Huang <[email protected]>
(cherry picked from commit 0afbd2df0485cd480979d9f4cdae00262d1a3c62)
|
|
|
|
|
|
|
|
| |
Otherwise we'll fail to find the drm.h header.
Cc: "10.4 10.5" <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
(cherry picked from commit 8d90bfb724f89b04d703f869362cf2fc2a3d7567)
|
|
|
|
|
|
|
|
|
|
|
|
| |
Missed out with commit e1fdcddafe9 (mesa: Autogenerate format_unpack.c)
v2: Conditionally print the python commands - s/@/$(hide) / (Chih-Wei)
Cc: "10.5" <[email protected]>
[Emil Velikov: Split out from a larger commit.]
Signed-off-by: Emil Velikov <[email protected]>
(cherry picked from commit 5f7081eb90bc5a25f0740314fa22e04d189238ca)
|
|
|
|
|
|
|
|
|
|
| |
Many parts of mesa already have the include, while others depend on it
but are missing it. Add it once in the top makefile and be done with it.
Cc: "10.4 10.5" <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Chih-Wei Huang <[email protected]>
(cherry picked from commit 6fb801786604c270fae99c3d665dcebaa0bff3a6)
|
|
|
|
|
|
|
|
|
|
| |
... to manage the LIBDRM*_CFLAGS. The former is the recommended approach
by the Android build system developers while the latter has been
deprecated for quite some time.
Cc: "10.4 10.5" <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
(cherry picked from commit 86919352e3da1c80409fdcb67c36f29a9687b7a9)
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Appears to fix shader compilation. Tested by starting the client,
dragging the "quality and speed" slider back and forth, and watching the
console output - instead of piles of "shader failed to compile", the CPU
seems to be busy compiling shaders. I haven't actually tried to play.
Signed-off-by: Kenneth Graunke <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69226
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71591
Cc: [email protected]
(cherry picked from commit 00bf7d2e9cd60dbd82d25b459c448e11c545a89a)
|
|
|
|
|
|
| |
Cc: 10.4 10.5 <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
(cherry picked from commit dcc74d47c40bf117f2dfaa359f9de7faef2c2200)
|
|
|
|
|
|
|
|
|
|
| |
This fixes piglit shaders@glsl-fs-uniform-array-loop-unroll with immediate
shader compilation - it's a compiler test, so it has never been translated
to TGSI before.
Cc: 10.4 10.5 <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
(cherry picked from commit 14c5bc3b9a6b03a8e42ef79da66d8b81b239cf96)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The ir_tex opcode turns into a sample or sample_c message, which will try to
compute derivatives to determine the lod. This produces garbage for
non-fragment shaders where the sample coordinates don't correspond to
subspans.
We fix this by rewriting the opcode from ir_tex to ir_txl and setting the
lod to 0.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89457
Cc: "10.5" <[email protected]>
Signed-off-by: Kristian Høgsberg <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
(cherry picked from commit 993a6288f72fa98932df7cdb6f64d9dd645e670d)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
new_prim was declared as a stack variable within a nested scope; we
tried to retain a pointer to that data beyond the scope, which is bogus.
GCC with -O1 eliminated most of the code that set new_prim's fields.
Move the declaration to fix the bug.
v2: Also fix new_ib (thanks to Matt Turner and Ben Widawsky).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81025
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Ben Widawsky <[email protected]>
Cc: [email protected]
(cherry picked from commit 406df68736a213f17f21a38a7c2da4ea15acd053)
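The scoping bug described above can be sketched in plain C. The names here (`struct prim`, `vertices_to_draw`) are hypothetical and are not the driver's own; the sketch only illustrates declaring the storage in a scope that outlives every pointer to it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a draw primitive; not the driver's struct. */
struct prim {
    int start;
    int count;
};

/*
 * Fixed pattern: new_prim is declared at function scope, so the pointer
 * kept in `use` stays valid until the function returns.
 *
 * The buggy variant declared new_prim inside the `if` block and still
 * read through `use` after that block had ended. That is undefined
 * behaviour, and it allows an optimizer to drop the stores to new_prim.
 */
int vertices_to_draw(const struct prim *orig, int restart)
{
    struct prim new_prim;            /* lives for the whole function */
    const struct prim *use = orig;

    assert(orig != NULL);

    if (restart) {
        new_prim.start = orig->start + 1;
        new_prim.count = orig->count - 1;
        use = &new_prim;             /* safe: new_prim is still alive below */
    }

    return use->count;
}
```

With a primitive of `{ start = 2, count = 5 }`, the function reads 5 through the original pointer and 4 through the adjusted copy.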
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We create textures internally for texsubimage, using the values from the
sub image to create a new texture. However, we don't align these to valid
sizes, and cube map arrays must have an array size aligned to 6.
This fixes texsubimage cube_map_array on CAYMAN at least (it was causing
GPU hangs and bad values). It probably also fixes it on radeonsi and
evergreen.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89957
Tested-by: Tom Stellard <[email protected]>
Cc: [email protected]
Reviewed-by: Marek Olšák <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
(cherry picked from commit cc5860e40787b3afe36856674f028e830685271b)
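As a rough illustration of the alignment requirement above, the rounding could look like the following. `align_cube_layers` is a hypothetical helper, not a function from the driver, and it assumes a nonzero layer count.

```c
#include <assert.h>

/*
 * Round a layer count up to the next multiple of 6, since one cube map
 * consists of 6 faces. Hypothetical helper for illustration only.
 */
unsigned align_cube_layers(unsigned layers)
{
    assert(layers > 0);
    return ((layers + 5u) / 6u) * 6u;
}
```

A sub-image covering 7 layers would therefore be backed by a 12-layer temporary, while a count that is already a multiple of 6 is left unchanged.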
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since we can subimage upload a number of cube map array layers that
don't form a complete cube map array, we should specify things
as a 2D array and blit from that.
Suggested by Ilia Mirkin as an alternate fix for texsubimage
cube map array issues.
Seems to work just as well.
Cc: [email protected]
Reviewed-by: Marek Olšák <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
(cherry picked from commit 5ed79312ed99f3b141c35569b9767f82f5ba0a93)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change fixes a regression in timer queries introduced by
commit 3eb6258. After that commit, the pending batchbuffer is flushed
only when glEndQuery is executed. This change adds the same
flush to glQueryCounter, which schedules a value query
just like glEndQuery does. The patch fixes GPU timer queries
going mad from within osgviewer.
Reviewed-by: Kenneth Graunke <[email protected]>
Signed-off-by: Mathias Froehlich <[email protected]>
Cc: [email protected]
(cherry picked from commit 1e1d5456ba3dff82301ad4bbdde2fb6e2f562fe3)
|
|
|
|
|
|
|
|
|
| |
Increase the device info .urb.size for CHV to match the default URB
size (192kB).
Reviewed-by: Kenneth Graunke <[email protected]>
Signed-off-by: Ville Syrjälä <[email protected]>
(cherry picked from commit 970dc2360372a7859691d690bd2f1976c3c97fb0)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Haswell hardware seems to ignore Render Stream Select bits from
3DSTATE_STREAMOUT packet when the SOL stage is disabled even if
the PRM says otherwise. Because of this, all primitives are sent
down the pipeline for rasterization, which is wrong. If SOL is
enabled, Render Stream Select is honored and primitives bound to
non-zero streams are discarded after stream output.
Since the only purpose of primitives sent to non-zero streams is to
be recorded by transform feedback, we can simply discard all geometry
bound to non-zero streams when transform feedback is disabled
to prevent it from ever reaching the rasterization stage.
Notice that this patch introduces a small change in the behavior we
get when a geometry shader emits more vertices than the maximum declared:
before, a vertex that was emitted to a non-zero stream when TF was
disabled would still count for the purposes of checking that we don't
exceed the maximum number of output vertices declared by the shader. With
this change, these vertices are completely ignored and won't increase
the output vertex count, making more room for other (hopefully more
useful) vertices.
Fixes piglit test arb_gpu_shader5-emitstreamvertex_nodraw on Haswell
and Broadwell.
v2 (Ken): Drop is_haswell check in favor of doing this unconditionally.
Broadwell needs the workaround as well, and it doesn't hurt to do it in
general. Also tweak comments - the Haswell PRM does actually mention
this ("Command Reference: Instructions" page 797).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83962
Reviewed-by: Kenneth Graunke <[email protected]>
Cc: [email protected]
(cherry picked from commit 2042a2f961a07e04eaca0347e42859c249325531)
|
|
|
|
|
|
|
|
|
| |
Fixes Piglit's arb_gpu_shader5-xfb-streams-without-invocations.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
Cc: [email protected]
(cherry picked from commit f368d0fa1fe37a58780ee555d4a9ccf15474782b)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Jordan added this in commit 741782b5948bb3d01d699f062a37513c2e73b076 for
Gen7 platforms. I missed this when adding the Broadwell code.
Fixes Piglit's spec/arb_gpu_shader5/invocation-id-{basic,in-separate-gs}
with MESA_EXTENSION_OVERRIDE=GL_ARB_gpu_shader5 set.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
Cc: [email protected]
(cherry picked from commit f9e5dc0a85df8dbfb8213ff772dfeb218972db12)
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This will allow us to finally remove python from the build time
dependencies list. Considering that you're building from a release
tarball of course :-)
Cc: Bernd Kuhls <[email protected]>
Reported-by: Bernd Kuhls <[email protected]>
Cc: "10.5" <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
(cherry picked from commit a665b9b3c89095923cf2251895afc69c9f79aafe)
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes the recently-sent gl-2.0-vertex-const-attr piglit test. Makes sure
to revalidate arrays when only the current attribute has been updated
via glVertexAttrib*.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89754
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Cc: "10.4 10.5" <[email protected]>
(cherry picked from commit 9d1b5febb62d74c9fc564635d4e0fa5207928c46)
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Don't propagate ARRAYs
This should fix:
https://bugs.freedesktop.org/show_bug.cgi?id=89759
v2: just specify arrays so we get input propagation
Signed-off-by: Dave Airlie <[email protected]>
Cc: [email protected]
Reviewed-by: Ilia Mirkin <[email protected]>
(cherry picked from commit 91e3533481d6921c4b46109742d6f67b7f897f86)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Both do_vs_prog and do_gs_prog initialize brw_stage_prog_data::nr_params to
the number of uniform *vectors* required by the shader rather than the number
of uniform components, contradicting the comment. This is inconsistent with
what the state upload code and scalar path expect but it happens to work until
Gen8 because vec4_visitor interprets it as a number of vectors on construction
and later on overwrites its original value with the number of uniform
components referenced by the shader.
Also there's no need to add the number of samplers, they're not actually
passed in as uniforms.
Fixes a memory corruption issue on BDW with SIMD8 VS.
Cc: "10.5" <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
(cherry picked from commit fd149628e142af769c1c0ec037bc297d8a3e871f)
[Emil Velikov: s/DIV_ROUND_UP/CEILING/]
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The piglit test glsl-fs-uniform-array-loop-unroll.shader_test was designed
to do an out-of-bounds access into a uniform array to make sure that we
handle that situation gracefully inside the driver. However, as Ken describes
in bug 79202, Valgrind reports that this leads to an out-of-bounds access
in fs_visitor::demote_pull_constants().
Before accessing the pull_constant_loc array we should make sure that
the uniform we are trying to access is valid.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79202
Reviewed-by: Matt Turner <[email protected]>
(cherry picked from commit 6ac1bc90c4a7a6f32901a9782e14b090f6fe5270)
Nominated-by: Matt Turner <[email protected]>
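The guard described above amounts to validating the index before touching the array. A minimal sketch, assuming a plain integer array: `lookup_pull_constant`, its parameters, and the -1 "invalid" return value are all hypothetical and not taken from fs_visitor.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the pull-constant location for `uniform`, or -1 when the index
 * falls outside the array. The check happens before the array is read,
 * so an out-of-range uniform never causes an out-of-bounds access.
 */
int lookup_pull_constant(const int *pull_constant_loc,
                         size_t num_uniforms, size_t uniform)
{
    assert(pull_constant_loc != NULL);

    if (uniform >= num_uniforms)
        return -1;                   /* invalid uniform: skip it */

    return pull_constant_loc[uniform];
}
```

A caller would treat -1 the same way as a uniform that was never demoted to a pull constant.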
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We used to loop over all color attachments, and emit FB writes for each
one, even if the shader didn't write to a corresponding output variable.
Those color attachments would be filled with garbage (undefined values).
Football Manager binds a framebuffer with 4 color attachments, but draws
to it using a shader that only writes to gl_FragData[0..2]. This meant
that color attachment 3 would be filled with garbage, resulting in
rendering artifacts. Now we skip writing to it, fixing rendering.
Writes to gl_FragColor initialize outputs[0..nr_color_regions-1] to
GRFs, while writes to gl_FragData[i] initialize outputs[i].
Thanks to Jason Ekstrand for tracking this down.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86747
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Cc: [email protected]
(cherry picked from commit e95969cd9548033250ba12f2adf11740319b41e7)
Conflicts:
src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
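The skip logic described in this commit message can be sketched as a loop that only emits a write for outputs the shader actually produced. `count_fb_writes` and the `output_written` array are hypothetical names; the real code emits instructions rather than counting them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Count the framebuffer writes that would be emitted: one per color
 * region whose corresponding shader output was written.
 */
unsigned count_fb_writes(const bool *output_written, size_t nr_color_regions)
{
    unsigned writes = 0;

    assert(output_written != NULL || nr_color_regions == 0);

    for (size_t i = 0; i < nr_color_regions; i++) {
        if (!output_written[i])
            continue;                /* never written: leave attachment alone */
        writes++;
    }

    return writes;
}
```

For the case in the message (4 color attachments, shader writes gl_FragData[0..2]), only three writes are produced and attachment 3 is left untouched.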
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, we emitted the shader-time epilogue from emit_fb_writes(),
during the middle of looping through color regions (or emit_urb_writes
for the VS). This is duplicated several times and rather awkward.
I need to fix a bug in our FB write handling, and it will be a lot
easier if we move emit_shader_time_end() out of there.
Now, we simply emit FB writes/URB writes, and subsequently have
emit_shader_time_end() insert instructions before the final SEND with
EOT. Not only is this simpler, it's actually a slight improvement:
we now include the MOVs to set up the final FB write payload in our
shader-time measurements.
Note that INTEL_DEBUG=shader_time only exists on Gen7+, and uses
send-from-GRF. (In the past, we might have hit trouble where both
attempt to use MRFs for messages; that's not a problem now.)
v2: Rebase on v3 of the previous patch and other shader_time fixes.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]> [v1]
Acked-by: Matt Turner <[email protected]>
Cc: [email protected]
(cherry picked from commit 4ebeb71573ad44f7657810dc5dd2c9030e3e63db)
Conflicts:
src/mesa/drivers/dri/i965/brw_fs.cpp
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This makes another part of the INTEL_DEBUG=shader_time code emittable
at arbitrary locations, rather than just at the end of the instruction
stream.
v2: Don't lose smear! Caught by Topi Pohjolainen.
v3: Don't set smear on the destination of the MOV. Thanks Topi!
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Cc: [email protected]
(cherry picked from commit e43af8d09f919d02b5ac0810c1c0f1783cbef6ef)
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of emit_shader_time_write, we now do emit(SHADER_TIME_ADD(...)).
The advantage is that we can also insert a shader time write at an
arbitrary location in the instruction stream, rather than being
restricted to emitting at the end.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Cc: [email protected]
(cherry picked from commit bea854c7f33cc10b8292f931f114afc4f88a8dd4)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The ADD(diff, diff, fs_reg(-2u)) instruction reads diff, which is a
width 1 register. We need to read it as <0,1,0> with a subreg of 0,
which is what smear accomplishes.
Fixes assertion:
brw_eu_emit.c:285: validate_reg: Assertion `hstride == 0' failed.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86974
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Cc: [email protected]
(cherry picked from commit f1adc45dbe649cdd4538fb96f6d2a27328bbfba1)
Conflicts:
src/mesa/drivers/dri/i965/brw_fs.cpp
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These computations don't have anything to do with the currently
executing channels, so they should use force_writemask_all.
This fixes assert failures.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86974
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Cc: [email protected]
(cherry picked from commit ef9cc7d0c176669c03130abf576f2b700be39514)
Conflicts:
src/mesa/drivers/dri/i965/brw_fs.cpp
|
|
|
|
|
|
|
|
|
|
| |
The yoffset needs to be interpreted as a slice offset for 1D array
textures. This patch implements that by moving the yoffset into
zoffset similar to how it moves the height into depth.
Reviewed-by: Jason Ekstrand <[email protected]>
Cc: "10.5" <[email protected]>
(cherry picked from commit 7286a6899176a8b26aa794097288eff941f5178c)
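A minimal sketch of the remapping described above, assuming a simple region struct. `struct region` and `fold_1d_array_region` are hypothetical and only mirror the wording of the message: the y values of a 1D array texture are really slice values, so they move to the z fields.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of a sub-image region. */
struct region {
    int yoffset, zoffset;
    int height, depth;
};

/*
 * For a 1D array texture, treat yoffset as the first slice and height as
 * the slice count by moving them into zoffset and depth.
 */
void fold_1d_array_region(struct region *r)
{
    assert(r != NULL);

    r->zoffset = r->yoffset;
    r->yoffset = 0;
    r->depth = r->height;
    r->height = 1;
}
```

After the call, a region starting at slice 3 and covering 5 slices is described by `zoffset == 3` and `depth == 5`, with a single row per slice.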
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Now that a layered source PBO is interpreted as a single tall 2D image
it's quite easy to accept the image height packing option by just
creating an image that is tall enough to include the image padding.
I'm not sure whether the image height property should affect 1D_ARRAY
textures. My intuition and interpretation of the GL spec (which is a
bit vague) would be that it shouldn't. However the software fallback
path in Mesa uses the property for packing but not for unpacking. The
binary NVidia driver uses it for both. This patch doesn't use it for
either case so it is different from the software fallback. There is
some discussion about this here:
http://lists.freedesktop.org/archives/mesa-dev/2015-February/077925.html
This is tested by the texsubimage Piglit test with the array and pbo
arguments. Previously this test was skipping this code path because it
always sets the image height.
I've also tested it by modifying the getteximage-targets test. It
wasn't using this code path before because it was using the default
texture object so this code couldn't successfully create a frame
buffer. I also modified it to add some image padding with the image
height in the PBO.
Reviewed-by: Jason Ekstrand <[email protected]>
Cc: "10.5" <[email protected]>
(cherry picked from commit a08bff1e98b8e630f8bdf341af1491cd99e7d104)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 546aba143d13ba3f993ead4cc30b2404abfc0202.
I think the changes to the calls to glBlitFramebuffer from this patch
are no different to what it was doing previously because it used to
set height to 1 before doing the blits. However it was introducing
some problems with the blit for layer 0 because this was no longer
special cased. It didn't fix problems with the yoffset which needs to
be interpreted as a slice offset. I think a better solution would be
to modify the original if statement to cope with the yoffset.
Conflicts:
src/mesa/drivers/common/meta_tex_subimage.c
Cc: "10.5" <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
(cherry picked from commit 7d10d2feee381739eef97f4720cbadbd65bb4fc6)
|
|
|
|
|
|
|
|
|
| |
A layered PBO image is now interpreted as a single tall 2D image so
the z argument in _mesa_meta_bind_fbo_image is ignored. Therefore this
was just redundantly rebinding the same image repeatedly.
Reviewed-by: Jason Ekstrand <[email protected]>
(cherry picked from commit a44606eb8164be2aa37eb288fd90894d74bd0935)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For some given GLSL IR like (+ (neg x) (* 1.2 x)), the try_emit_mad
function would see that one of the +'s sources was a negate expression
and set mul_negate = true without confirming that it was actually a
multiply.
Cc: 10.5 <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89315
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89095
Reviewed-by: Ian Romanick <[email protected]>
(cherry picked from commit d528907fd2950c7bb968fff66dd79863cd128890)
[Emil Velikov: drop the changes in brw_vec4_visitor.cpp]
Signed-off-by: Emil Velikov <[email protected]>
Conflicts:
src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
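The missing check can be illustrated on a toy expression tree. Everything below (`enum op`, `struct expr`, `find_mul`) is hypothetical and far simpler than GLSL IR; the sketch only shows that the negate flag is set after confirming the negated operand is a multiply.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum op { OP_VAR, OP_CONST, OP_NEG, OP_MUL, OP_ADD };

/* Toy expression node; unused sources are NULL. */
struct expr {
    enum op op;
    const struct expr *src[2];
};

/*
 * Given one operand of an add, return the multiply that could be fused
 * into a MAD, or NULL if there is none. *negate is set only when a
 * negate wraps an actual multiply, so (neg x) alone is rejected.
 */
const struct expr *find_mul(const struct expr *e, bool *negate)
{
    assert(e != NULL && negate != NULL);
    *negate = false;

    if (e->op == OP_NEG) {
        const struct expr *inner = e->src[0];
        if (inner == NULL || inner->op != OP_MUL)
            return NULL;             /* negate of a non-multiply */
        *negate = true;
        return inner;
    }

    return e->op == OP_MUL ? e : NULL;
}
```

For (+ (neg x) (* 1.2 x)), the first operand yields NULL with the flag cleared, and the second operand yields the multiply without negation.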
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes a dEQP test failure. In the test,
glCopyTexSubImage2D was called with target = 0 and failed to generate
GL_INVALID_ENUM. This failure was caused by _mesa_get_current_tex_object(ctx,
target) being called before the target checking. To remedy this, target
checking was separated from the main error-checking function and
called prior to _mesa_get_current_tex_object.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89312
Reviewed-by: Anuj Phogat <[email protected]>
(cherry picked from commit ca65764d6042d2ea220a1e3952490f79c226f3e0)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes a dEQP test failure. In the test,
glCompressedTexSubImage2D was called with target = 0 and failed to generate
GL_INVALID_ENUM. This failure was caused by _mesa_get_current_tex_object(ctx,
target) being called before the target checking. To remedy this, target
checking was made into its own function and called prior to
_mesa_get_current_tex_object.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89311
Reviewed-by: Anuj Phogat <[email protected]>
(cherry picked from commit 549078cb5a95e0ee381d036b8c36bc41506f21bc)
|
|
|
|
|
|
|
|
|
|
|
|
| |
Correctly set _BaseFormat field when creating a gl_renderbuffer
with EGLImage storage.
Change-Id: I8c9f7302d18b617f54fa68304d8ffee087ed8a77
Signed-off-by: Frank Henigman <[email protected]>
Reviewed-by: Stéphane Marchesin <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
(cherry picked from commit e43729943e67972e547a19123fb3afca6b77202b)
Nominated-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
| |
Cc: 10.4, 10.5 <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89224
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
(cherry picked from commit 0dfec59a2785cf7a87ee5128889ecebe810b611b)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A while back I switched intel_blit_framebuffer to prefer Meta over the
BLT. This meant that Gen8 platforms would start using the 3D engine
for blits, just like we do on Gen6-7.5.
However, I hadn't considered Gen4-5 when making that change. The BLT
engine appears to be substantially faster on 965GM than using Meta to
drive the 3D engine. This isn't too surprising: original Gen4 doesn't
support tile offsets (that came on G45), and the level/layer fields
don't work for cubemap rendering, so for inconvenient miplevel
alignments, we end up blitting or copying data to/from temporaries
in order to render to it. We may as well just use the blitter.
I chose to use the BLT on Gen4-5 because they use the same ring for
both 3D and BLT; Gen6+ splits it out.
Fixes regressions on 965GM due to botched tile offset code (we should
fix those properly as well, but they're longstanding bugs - for now,
put things back to the status quo).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89430
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
Cc: "10.5" <[email protected]>
(cherry picked from commit aa0705c06c03d2b882ac7b185ed123bc8a10d716)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The SSSE3 swizzling code was written for fast uploads to the GPU and
assumed the destination was always 16-byte aligned. When we began using
this code for fast downloads as well we didn't do anything to account
for the fact that the destination pointer given by glReadPixels() or
glGetTexImage() is not guaranteed to be suitably aligned.
With SSSE3 enabled (at compile-time), some applications would crash when
an SSE aligned-store instruction tried to store to an unaligned
destination (or an assertion that the destination is aligned would
trigger).
To remedy this, tell intel_get_memcpy() whether we're uploading or
downloading so that it can select whether to assume the destination or
source is aligned, respectively.
Cc: 10.5 <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89416
Tested-by: Uriy Zhuravlev <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
(cherry picked from commit 2e4c95dfe2cb205c327ceaa12b44a9273bdb20dc)
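The direction-dependent alignment assumption can be sketched without any SSE intrinsics. The names below (`enum copy_dir`, `aligned_side_ok`) are hypothetical and do not reflect the signature of intel_get_memcpy(); the sketch only shows which pointer is expected to be 16-byte aligned in each direction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum copy_dir { COPY_UPLOAD, COPY_DOWNLOAD };

/* True when the pointer value is a multiple of 16. */
static bool is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/*
 * On upload the destination is the side assumed to be aligned; on
 * download it is the source. The other pointer comes from the
 * application and may have any alignment.
 */
bool aligned_side_ok(enum copy_dir dir, const void *dst, const void *src)
{
    assert(dst != NULL && src != NULL);
    return is_aligned16(dir == COPY_UPLOAD ? dst : src);
}
```

A copy routine built on this would use aligned stores only for uploads and aligned loads only for downloads, falling back to unaligned accesses for the application-side pointer.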
|
|
|
|
|
|
|
| |
Cc: 10.5 <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89317
Reviewed-by: Kenneth Graunke <[email protected]>
(cherry picked from commit 1e128e9b69c6336762a2b6ee5d356c763b9ae3b0)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a fix for a regression introduced in commit a9f8296d ("i965/fs:
Preserve the CFG in a few more places.").
The errata this code works around is described in a comment before the function:
"[DevBW, DevCL] Errata: A destination register from a send can not be
used as a destination register until after it has been sourced by an
instruction with a different destination register."
The framebuffer write's sources must be in message registers, which SEND
instructions cannot have as a destination. There's no way for this
errata to affect anything at the end of the program. Just remove the
code.
Cc: 10.4, 10.5 <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84613
Reviewed-by: Kenneth Graunke <[email protected]>
(cherry picked from commit e214000f258ae564e64d839cccee9418526f226b)
|
|
|
|
|
|
|
|
|
| |
This takes "fbo-stencil blit GL_STENCIL_INDEX1/4/16" from crash to pass on
BDW.
Cc: 10.5 <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
(cherry picked from commit c4925d7f3b66d63fbdd7b7607cd809db1e58bee9)
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, there were bugs where a scissor set by the app could affect
the area of the texture that was downloaded. The framebuffer SRGB state
could potentially affect downloads as well. This ensures that both pieces
of state get saved/restored and can't affect the texture download.
Cc: 10.5 <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89292
Reviewed-by: Neil Roberts <[email protected]>
(cherry picked from commit b1ab02d9c0cc11ba8ef4efaba9452d644b6a0811)
|
|
|
|
|
|
|
|
|
| |
We could do better by tracking scratch reads and writes.
Cc: 10.5 <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88793
Reviewed-by: Jason Ekstrand <[email protected]>
(cherry picked from commit da20bf068ef0f816968d9bc4dfea81facf0fd680)
|
|
|
|
|
|
|
| |
Cc: "10.4, 10.5" <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
(cherry picked from commit 491d42135ad0e5670756216154f2ba9fc79d4ba7)
|
|
|
|
|
|
|
| |
Cc: "10.4, 10.5" <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
(cherry picked from commit 87109acbed9c9b52f33d58ca06d9048d0ac7a215)
|
|
|
|
|
|
|
|
|
| |
Always indenting break statements makes spotting missing ones easier.
Cc: "10.4, 10.5" <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
(cherry picked from commit 2b2fa1865248c6e3b7baec81c4f92774759b201f)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, we compared our new GS-out VUE map to the existing *VS*-out
VUE map, which is bogus.
This would mostly manifest as redundant dirty flagging where the GS is
in use but the VS and GS output layouts differ; but there is a scary
case where we would fail to flag a GS-out layout change if it happened
to match the VS-out layout.
Signed-off-by: Chris Forbes <[email protected]>
Cc: "10.5, 10.4" <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88885
(cherry picked from commit b51ff50a767cc78d678ed3d2c25995f5c4194fea)
|