| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
The dri modules depend on symbols provided by it.
Cc: "10.5" <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Chih-Wei Huang <[email protected]>
(cherry picked from commit 618885f71fcacb3d68bf37fa23be36830d4178d2)
|
|
|
|
|
|
|
|
| |
Otherwise we'll fail to find the drm.h header.
Cc: "10.4 10.5" <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
(cherry picked from commit 8d90bfb724f89b04d703f869362cf2fc2a3d7567)
|
|
|
|
|
|
|
|
|
|
| |
Many parts of mesa already have the include with others depending on it
but it's missing. Add it once at the top makefile and be done with it.
Cc: "10.4 10.5" <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Chih-Wei Huang <[email protected]>
(cherry picked from commit 6fb801786604c270fae99c3d665dcebaa0bff3a6)
|
|
|
|
|
|
|
|
|
|
| |
... to manage the LIBDRM*_CFLAGS. The former is the recommended approach
by the Android build system developers while the latter has been
depreciated for quite some time.
Cc: "10.4 10.5" <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
(cherry picked from commit 86919352e3da1c80409fdcb67c36f29a9687b7a9)
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Appears to fix shader compilation. Tested by starting the client,
dragging the "quality and speed" slider back and forth, and watching the
console output - instead of piles of "shader failed to compile", the CPU
seems to be busy compiling shaders. I haven't actually tried to play.
Signed-off-by: Kenneth Graunke <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69226
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71591
Cc: [email protected]
(cherry picked from commit 00bf7d2e9cd60dbd82d25b459c448e11c545a89a)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The ir_tex opcode turns into a sample or sample_c message, which will try to
compute derivatives to determine the lod. This produces garbage for
non-fragment shaders where the sample coordinates don't correspond to
subspans.
We fix this by rewriting the opcode from ir_tex to ir_txl and setting the
lod to 0.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89457
Cc: "10.5" <[email protected]>
Signed-off-by: Kristian Høgsberg <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
(cherry picked from commit 993a6288f72fa98932df7cdb6f64d9dd645e670d)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change fixes a regression with timer queries introduced with
commit 3eb6258. There the pending batchbuffer is flushed
only if glEndQuery is executed. This present change adds such
a flush to glQueryCounter which also schedules a value query
just like glEndQuery does. The patch fixes GPU timer queries
going mad from within osgviewer.
Reviewed-by: Kenneth Graunke <[email protected]>
Signed-off-by: Mathias Froehlich <[email protected]>
Cc: [email protected]
(cherry picked from commit 1e1d5456ba3dff82301ad4bbdde2fb6e2f562fe3)
|
|
|
|
|
|
|
|
|
| |
Increase the device info .urb.size for CHV to match the default URB
size (192kB).
Reviewed-by: Kenneth Graunke <[email protected]>
Signed-off-by: Ville Syrjälä <[email protected]>
(cherry picked from commit 970dc2360372a7859691d690bd2f1976c3c97fb0)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Haswell hardware seems to ignore Render Stream Select bits from
3DSTATE_STREAMOUT packet when the SOL stage is disabled even if
the PRM says otherwise. Because of this, all primitives are sent
down the pipeline for rasterization, which is wrong. If SOL is
enabled, Render Stream Select is honored and primitives bound to
non-zero streams are discarded after stream output.
Since the only purpose of primives sent to non-zero streams is to
be recorded by transform feedback, we can simply discard all geometry
bound to non-zero streams then transform feedback is disabled
to prevent it from ever reaching the rasterization stage.
Notice that this patch introduces a small change in the behavior we
get when a geometry shader emits more vertices than the maximum declared:
before, a vertex that was emitted to a non-zero stream when TF was
disabled would still count for the purposes of checking that we don't
exceed the maximum number of output vertices declared by the shader. With
this change, these vertices are completely ignored and won't increase
the output vertex count, making more room for other (hopefully more
useful) vertices.
Fixes piglit test arb_gpu_shader5-emitstreamvertex_nodraw on Haswell
and Broadwell.
v2 (Ken): Drop is_haswell check in favor of doing this unconditionally.
Broadwell needs the workaround as well, and it doesn't hurt to do it in
general. Also tweak comments - the Haswell PRM does actually mention
this ("Command Reference: Instructions" page 797).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83962
Reviewed-by: Kenneth Graunke <[email protected]>
Cc: [email protected]
(cherry picked from commit 2042a2f961a07e04eaca0347e42859c249325531)
|
|
|
|
|
|
|
|
|
| |
Fixes Piglit's arb_gpu_shader5-xfb-streams-without-invocations.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
Cc: [email protected]
(cherry picked from commit f368d0fa1fe37a58780ee555d4a9ccf15474782b)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Jordan added this in commit 741782b5948bb3d01d699f062a37513c2e73b076 for
Gen7 platforms. I missed this when adding the Broadwell code.
Fixes Piglit's spec/arb_gpu_shader5/invocation-id-{basic,in-separate-gs}
with MESA_EXTENSION_OVERRIDE=GL_ARB_gpu_shader5 set.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
Cc: [email protected]
(cherry picked from commit f9e5dc0a85df8dbfb8213ff772dfeb218972db12)
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This will allow us to finally remove python from the build time
dependencies list. Considering that you're building from a release
tarball of course :-)
Cc: Bernd Kuhls <[email protected]>
Reported-by: Bernd Kuhls <[email protected]>
Cc: "10.5" <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
(cherry picked from commit a665b9b3c89095923cf2251895afc69c9f79aafe)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Both do_vs_prog and do_gs_prog initialize brw_stage_prog_data::nr_params to
the number of uniform *vectors* required by the shader rather than the number
of uniform components, contradicting the comment. This is inconsistent with
what the state upload code and scalar path expect but it happens to work until
Gen8 because vec4_visitor interprets it as a number of vectors on construction
and later on overwrites its original value with the number of uniform
components referenced by the shader.
Also there's no need to add the number of samplers, they're not actually
passed in as uniforms.
Fixes a memory corruption issue on BDW with SIMD8 VS.
Cc: "10.5" <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
(cherry picked from commit fd149628e142af769c1c0ec037bc297d8a3e871f)
[Emil Velikov: s/DIV_ROUND_UP/CEILING/]
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The piglit test glsl-fs-uniform-array-loop-unroll.shader_test was designed
to do an out of bounds access into an uniform array to make sure that we
handle that situation gracefully inside the driver, however, as Ken describes
in bug 79202, Valgrind reports that this is leading to an out-of-bounds access
in fs_visitor::demote_pull_constants().
Before accessing the pull_constant_loc array we should make sure that
the uniform we are trying to access is valid.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79202
Reviewed-by: Matt Turner <[email protected]>
(cherry picked from commit 6ac1bc90c4a7a6f32901a9782e14b090f6fe5270)
Nominated-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We used to loop over all color attachments, and emit FB writes for each
one, even if the shader didn't write to a corresponding output variable.
Those color attachments would be filled with garbage (undefined values).
Football Manager binds a framebuffer with 4 color attachments, but draws
to it using a shader that only writes to gl_FragData[0..2]. This meant
that color attachment 3 would be filled with garbage, resulting in
rendering artifacts. Now we skip writing to it, fixing rendering.
Writes to gl_FragColor initialize outputs[0..nr_color_regions-1] to
GRFs, while writes to gl_FragData[i] initialize outputs[i].
Thanks to Jason Ekstrand for tracking this down.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86747
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Cc: [email protected]
(cherry picked from commit e95969cd9548033250ba12f2adf11740319b41e7)
Conflicts:
src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, we emitted the shader-time epilogue from emit_fb_writes(),
during the middle of looping through color regions (or emit_urb_writes
for the VS). This is duplicated several times and rather awkward.
I need to fix a bug in our FB write handling, and it will be a lot
easier if we move emit_shader_time_end() out of there.
Now, we simply emit FB writes/URB writes, and subsequently have
emit_shader_time_end() insert instructions before the final SEND with
EOT. Not only is this simpler, it's actually a slight improvement:
we now include the MOVs to set up the final FB write payload in our
shader-time measurements.
Note that INTEL_DEBUG=shader_time only exists on Gen7+, and uses
send-from-GRF. (In the past, we might have hit trouble where both
attempt to use MRFs for messages; that's not a problem now.)
v2: Rebase on v3 of the previous patch and other shader_time fixes.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]> [v1]
Acked-by: Matt Turner <[email protected]>
Cc: [email protected]
(cherry picked from commit 4ebeb71573ad44f7657810dc5dd2c9030e3e63db)
Conflicts:
src/mesa/drivers/dri/i965/brw_fs.cpp
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This makes another part of the INTEL_DEBUG=shader_time code emittable
at arbitrary locations, rather than just at the end of the instruction
stream.
v2: Don't lose smear! Caught by Topi Pohjolainen.
v3: Don't set smear on the destination of the MOV. Thanks Topi!
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Cc: [email protected]
(cherry picked from commit e43af8d09f919d02b5ac0810c1c0f1783cbef6ef)
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of emit_shader_time_write, we now do emit(SHADER_TIME_ADD(...)).
The advantage is that we can also insert a shader time write at an
arbitrary location in the instruction stream, rather than being
restricted to emitting at the end.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Cc: [email protected]
(cherry picked from commit bea854c7f33cc10b8292f931f114afc4f88a8dd4)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The ADD(diff, diff, fs_reg(-2u)) instruction reads diff, which is a
width 1 register. We need to read it as <0,1,0> with a subreg of 0,
which is what smear accomplishes.
Fixes assertion:
brw_eu_emit.c:285: validate_reg: Assertion `hstride == 0' failed.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86974
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Cc: [email protected]
(cherry picked from commit f1adc45dbe649cdd4538fb96f6d2a27328bbfba1)
Conflicts:
src/mesa/drivers/dri/i965/brw_fs.cpp
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These computations don't have anything to do with the currently
executing channels, so they should use force_writemask_all.
This fixes assert failures.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86974
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Cc: [email protected]
(cherry picked from commit ef9cc7d0c176669c03130abf576f2b700be39514)
Conflicts:
src/mesa/drivers/dri/i965/brw_fs.cpp
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For some given GLSL IR like (+ (neg x) (* 1.2 x)), the try_emit_mad
function would see that one of the +'s sources was a negate expression
and set mul_negate = true without confirming that it was actually a
multiply.
Cc: 10.5 <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89315
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89095
Reviewed-by: Ian Romanick <[email protected]>
(cherry picked from commit d528907fd2950c7bb968fff66dd79863cd128890)
[Emil Velikov: drop the changes in brw_vec4_visitor.cpp]
Signed-off-by: Emil Velikov <[email protected]>
Conflicts:
src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
|
|
|
|
|
|
|
|
|
|
|
|
| |
Correctly set _BaseFormat field when creating a gl_renderbuffer
with EGLImage storage.
Change-Id: I8c9f7302d18b617f54fa68304d8ffee087ed8a77
Signed-off-by: Frank Henigman <[email protected]>
Reviewed-by: Stéphane Marchesin <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
(cherry picked from commit e43729943e67972e547a19123fb3afca6b77202b)
Nominated-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
| |
Cc: 10.4, 10.5 <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89224
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
(cherry picked from commit 0dfec59a2785cf7a87ee5128889ecebe810b611b)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A while back I switched intel_blit_framebuffer to prefer Meta over the
BLT. This meant that Gen8 platforms would start using the 3D engine
for blits, just like we do on Gen6-7.5.
However, I hadn't considered Gen4-5 when making that change. The BLT
engine appears to be substantially faster on 965GM than using Meta to
drive the 3D engine. This isn't too surprising: original Gen4 doesn't
support tile offsets (that came on G45), and the level/layer fields
don't work for cubemap rendering, so for inconvenient miplevel
alignments, we end up blitting or copying data to/from temporaries
in order to render to it. We may as well just use the blitter.
I chose to use the BLT on Gen4-5 because they use the same ring for
both 3D and BLT; Gen6+ splits it out.
Fixes regressions on 965GM due to botched tile offset code (we should
fix those properly as well, but they're longstanding bugs - for now,
put things back to the status quo).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89430
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
Cc: "10.5" <[email protected]>
(cherry picked from commit aa0705c06c03d2b882ac7b185ed123bc8a10d716)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The SSSE3 swizzling code was written for fast uploads to the GPU and
assumed the destination was always 16-byte aligned. When we began using
this code for fast downloads as well we didn't do anything to account
for the fact that the destination pointer given by glReadPixels() or
glGetTexImage() is not guaranteed to be suitably aligned.
With SSSE3 enabled (at compile-time), some applications would crash when
an SSE aligned-store instruction tried to store to an unaligned
destination (or an assertion that the destination is aligned would
trigger).
To remedy this, tell intel_get_memcpy() whether we're uploading or
downloading so that it can select whether to assume the destination or
source is aligned, respectively.
Cc: 10.5 <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89416
Tested-by: Uriy Zhuravlev <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
(cherry picked from commit 2e4c95dfe2cb205c327ceaa12b44a9273bdb20dc)
|
|
|
|
|
|
|
| |
Cc: 10.5 <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89317
Reviewed-by: Kenneth Graunke <[email protected]>
(cherry picked from commit 1e128e9b69c6336762a2b6ee5d356c763b9ae3b0)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a fix for a regression introduced in commit a9f8296d ("i965/fs:
Preserve the CFG in a few more places.").
The errata this code works around is described in a comment before the function:
"[DevBW, DevCL] Errata: A destination register from a send can not be
used as a destination register until after it has been sourced by an
instruction with a different destination register.
The framebuffer write's sources must be in message registers, which SEND
instructions cannot have as a destination. There's no way for this
errata to affect anything at the end of the program. Just remove the
code.
Cc: 10.4, 10.5 <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84613
Reviewed-by: Kenneth Graunke <[email protected]>
(cherry picked from commit e214000f258ae564e64d839cccee9418526f226b)
|
|
|
|
|
|
|
|
|
| |
We could do better by tracking scratch reads and writes.
Cc: 10.5 <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88793
Reviewed-by: Jason Ekstrand <[email protected]>
(cherry picked from commit da20bf068ef0f816968d9bc4dfea81facf0fd680)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, we compared our new GS-out VUE map to the existing *VS*-out
VUE map, which is bogus.
This would mostly manifest as redundant dirty flagging where the GS is
in use but the VS and GS output layouts differ; but there is a scary
case where we would fail to flag a GS-out layout change if it happened
to match the VS-out layout.
Signed-off-by: Chris Forbes <[email protected]>
Cc: "10.5, 10.4" <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88885
(cherry picked from commit b51ff50a767cc78d678ed3d2c25995f5c4194fea)
|
|
|
|
|
|
|
|
|
| |
I broke this in commit 2881b123d. I must have misread i2b as b2i.
Cc: 10.5 <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88246
Reviewed-by: Ian Romanick <[email protected]>
(cherry picked from commit 43ef2657a08f850c5756f28520f2cbe506807f24)
|
|
|
|
|
|
|
|
|
|
| |
It appears that all the other instructions that need it already use it.
This one just got missed.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Cc: "10.5" <[email protected]>
(cherry picked from commit b8a1637119249c1d5e76c27d0053360bbb7f4e77)
|
|
|
|
|
|
|
|
|
|
|
| |
The header is included in ../xmlpool.h. With the latter of which used
directly in a number of places in mesa.
Note that we can also add it (alongside t_option.h) to noinst_HEADERS,
but neither solution fixes the issue that brough us here - namely:
Do not regenerate the headers, if it already exists.
Cc: "10.5" <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Skylake+ doesn't support setting a depth buffer to a 1D surface but it
does allow pretending it's a 2D texture with a height of 1 instead.
This fixes the GL_DEPTH_COMPONENT_* tests of the copyteximage piglit
test (and also seems to avoid a subsequent GPU hang).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89037
Reviewed-by: Kenneth Graunke <[email protected]>
(cherry picked from commit 5b29b2922afe2b8167a589fc2896a071fc85b693)
Nominated-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
| |
Cc: "10.5" <[email protected]>
Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=540962
(cherry picked from commit 0b6d43e329d194b01ab5cd554617f79a13f6669a)
|
|
|
|
|
|
|
|
|
|
| |
Previously we were using a B/UB source in an Align16 instruction, which
is illegal. It for some reason works on all platforms, except Broadwell.
Cc: "10.5" <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86811
Reviewed-by: Ian Romanick <[email protected]>
(cherry picked from commit e0137fd6f720e4977466b1760ac02a72c5abceb8)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The saturate propagation pass recognizes that the second instruction
below does not interfere with an attempt to propagate the saturate
modifier from instruction 3 to 1.
1: add(8) dst0 src0 src1
2: mov.sat(8) dst1 dst0
3: mov.sat(8) dst2 dst0
Unfortunately, we did not consider the case of instruction 2 having a
source modifier on dst0. Take for instance:
1: add(8) dst0 src0 src1
2: mov.sat(8) dst1 -dst0
3: mov.sat(8) dst2 dst0
Consider such an instruction to interfere. Increase instruction counts
in Anomaly 2, which could be a bug fix depending on the values the first
instruction produces.
instructions in affected programs: 53228 -> 53934 (1.33%)
HURT: 360
Cc: <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
(cherry picked from commit 7f8dd91d166e49d7da98f90d6428dc2705fb96d0)
|
|
|
|
|
|
|
|
| |
This is safer and matches the conditional_mod propagation pass.
Cc: <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
(cherry picked from commit 871ad3f08bc34e16fdd728e9a4821b9a83e509f0)
|
|
|
|
|
|
| |
Cc: <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
(cherry picked from commit bf3389ec49a158e0b66db8e038d801eacabd20f1)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There's some debate about whether we should use Meta or BLORP,
but either should run circles around the BLT engine.
In particular, this means that Gen8+ will use the 3D engine for blits,
like we do on Gen6-7.
Improves performance in "copypixrate -blit -back" (from Mesa demos)
by 232.037% +/- 3.15795% (n=10) on Broadwell GT3e.
v2: Rebase on Laura's changes.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Cc: "10.5" <[email protected]>
(cherry picked from commit d523fefa756eef9c7a2c0d91cf4c2df10b89ed2a)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
... instead of emit(BRW_OPCODE_CMP, ...). In commit 6b3a301f I changed
vec4_visitor::CMP to set the destination's type to that of src0. In the
following commit (2335153f) I removed an apparently now unnecessary work
around for Gen8 that did the same thing.
But there was a single place that emitted a CMP instruction without
using the vec4_visitor::CMP function. Use it there.
And change dst_null_d to dst_null_f for good measure, since ARB vp
doesn't have integers.
Cc: "10.5" <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89032
Reviewed-by: Kenneth Graunke <[email protected]>
(cherry picked from commit 72b9f8db2a035b52dbd59af0d2a6441cbde11a4c)
|
|
|
|
|
|
|
|
|
|
| |
+82 Piglits - 100% of border color tests now pass on Haswell.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
Cc: [email protected]
(cherry picked from commit 08a06b6b891df456902f5e170f1d82236d0c73d2)
|
|
|
|
|
|
|
|
|
|
|
|
| |
This should have no effect, but will make it easier to implement other
bug fixes.
v2: Eliminate "unsigned one" local; just use the value where necessary.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Cc: [email protected]
(cherry picked from commit e1e73443c572b5432ef66a923fe64b73467f411b)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The hardware's integer luminance formats are completely unusable;
currently we fall back to RGBA. This means we need to override
the texture swizzle to obtain the XXX1 values expected for luminance
formats.
Fixes spec/EXT_texture_integer/texwrap formats bordercolor [swizzled]
on Broadwell - 100% of border color tests now pass on Broadwell.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
Cc: [email protected]
(cherry picked from commit 8cb18760cccf2c89d94c50ff14b330ec2d5c4a3c)
|
|
|
|
|
|
|
|
|
|
|
|
| |
Blits to or from a y-tiled surface must always be a multiple of the tile size.
From page 16 of the HSW PRM
(https://01.org/linuxgraphics/sites/default/files/documentation/intel-gfx-prm-osrc-hsw-memory-views.pdf#16)
"The pitch of a tiled enclosing region must be an integral number of tile
widths"
Signed-off-by: Ben Widawsky <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
An upcoming patch is going to introduce some code here, and having this code
organized as the patch does makes it a bit easier to read later.
There should be no functional change here.
Signed-off-by: Ben Widawsky <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As it turns out, we were over-thinking the cause of the hang on
Cherryview. It's simply errata for Cherryview.
commit 88fea85f09e2252035bec66ab26c375b45b000f5
Author: Ben Widawsky <[email protected]>
Date: Fri Nov 21 10:47:41 2014 -0800
i965/vec4/gen8: Handle the MUL dest hazard exception
This is an explanation to why we never saw the hang on BDW.
NOTE: The problem the original patch was trying to fix does still exist. It will
have to be fixed at some point.
v2: Modify commit message, s/CHV/BDW
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84212
Signed-off-by: Ben Widawsky <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
We were incorrectly attributing VS time to FS8 on Gen8+, which now use
fs_visitor for vertex shaders.
We don't hit this for geometry shaders yet, but we may as well add
support now - the fix is obvious, and we'll just forget later.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, we special cased FB writes and URB writes in the register
allocation code. What we really wanted was to handle any message with
EOT set.
This saves us from extending the list with new opcodes in the future.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Ben Widawsky <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This helper function basically just checks inst->eot, but also asserts
that only opcodes we expect to terminate threads have EOT set. As far
as I'm aware, we've never had such a bug.
Removing it means that we don't have to extend the list for new opcodes.
Cherryview and Skylake introduce an optimization where sampler messages
can have EOT set; scalar GS/HS/DS will likely introduce new opcodes as
well.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Ben Widawsky <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|