| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In fact, the description of the workaround states that the mask field
doesn't work correctly on gen10, and we need to set it to 0xffff even we
we only want to update a single field:
"The mask bits are not implemented properly on 3DSTATE_3D_MODE. Driver
must always program bits 31:16 of DW1 a value of 0xFFFF. This means
if it is only updating 1 field, it must update all the fields to the
correct value."
So unless we want to change any of the fields of 3DSTATE_3D_MODE,
there's not need to emit. Additionally, it seems this workaround is not
required on gen11. And last but not least, this workaround is not
implemented on iris or anv, and it doesn't seem to be missed there.
So let's just remove the whole thing.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
We accidentally started copying a full 64-bit value rather than copying
a 32-bit offset and zeroing the top 32-bits. This caused us to compute
bogus vertex counts which could lead to GPU hangs in some cases.
Thanks to Clayton Craft for catching the regressions!
Fixes: 0e24d10ff5c ("iris: Use gen_mi_builder to handle CS ALU operations.")
|
|
|
|
|
|
|
|
|
|
|
|
| |
It's kind-of an anomaly that the Intel drivers are still treating
gl_FragCoord as an input. It also makes zero sense because we have to
special-case it in the back-end.
Because ANV is the only user of nir_lower_wpos_center, we go ahead and
just update it to look for nir_intrinsic_load_frag_coord as part of this
patch.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
This fixes glsl-fcoord-invariant-pass.shader_test on drivers that set
GLSLFragCoordIsSysVal which includes radeonsi among others.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Seems like RB_BLIT_SCISSOR needs to be aligned to (minimum?) tile size.
Fixes intermittent GPU hangs triggered by some of the three.js samples
on https://threejs.org/
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
fishgl.com has a shader which does roughly:
foo = texture(...);
if (bar)
foo = texture(...);
after lowering phi webs to regs we end up w/ a vec4 reg (array). But
since it was not an indirect access, we try to skip the extra mov. This
results that the per-component fanout (split) meta instructions store
directly to the reg (array). Which doesn't work out in RA.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
0.45 has a few annoying bugs (like the one in !358 [1]), and 0.46 is
well over a year old by now, so let's move to it.
[1] https://gitlab.freedesktop.org/mesa/mesa/merge_requests/358
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Dylan Baker <[email protected]>
|
|
|
|
|
|
|
|
| |
Arcturus CHIP enum is less than Navi10, since it's still gfx9,
but its VCN version belongs to VCN2.x
Signed-off-by: Leo Liu <[email protected]>
Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
|
|
|
|
|
|
|
| |
different internal registers offset from previous HW
Signed-off-by: Leo Liu <[email protected]>
Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
|
|
|
|
| |
Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
|
|
|
|
| |
Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
|
|
|
|
| |
Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Sonny Jiang <[email protected]>
Reviewed-by: Signed-off-by: James Zhu <[email protected]>
Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Sonny Jiang <[email protected]>
Reviewed-by: Signed-off-by: James Zhu <[email protected]>
Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
|
|
|
|
|
|
|
|
| |
Init graphic shader Only when PIPE_CAP_GRAPHICS is true.
Signed-off-by: James Zhu <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Alex Deucher <[email protected]>
Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Alex Deucher <[email protected]>
Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
|
|
|
|
|
|
|
| |
Natively supported on VI+.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
| |
Taken off https://android.googlesource.com/platform/frameworks/native/+/refs/tags/android-9.0.0_r45/vulkan/include/vulkan/vk_android_native_buffer.h
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
| |
aka. fix a s/||/&&/ typo
Fixes: 74063ee61aadd1371a9b ("intel/mi: Add a new gen_mi_store_if() helper.")
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
Now that helgrind is less upset and I've completed many successful
full shader-db runs, we should be able to enable freedreno shader-db
runs for Mesa checkins on the tiny public shader-db.
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Even if the data race wasn't real (I'm not great at reasoning about
this), helgrind is a nice enough tool that keeping noise out of it is
probably worthwhile. Besides, typing out the numbers keeps the data
in the read-only data section instead of emitting code to initialize
it every time.
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
|
|
|
|
|
|
| |
The value is only used for IR3_DBG_DISASM, but it cleans up the
helgrind output.
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Shaders are shared across contexts in gallium (part of making it so
that you get shader compile at link time, for shader-db and to reduce
compiles at draw time). So, we need to protect from variant creation
for a shader from multiple threads at the same time.
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There is a single ir3_compiler in the screen, and each context may be
compiling ir3 shaders, which call ir3_create. ralloc doesn't do any
locking on its own, so eventually you can end up racing to break
ralloc's linked lists.
We really don't want struct ir3 to live as long as the compiler (maybe
struct ir3_shader's lifetime, if anything), so you'd better be freeing
it anyway.
Fixes: 8fe20762433d ("freedreno/ir3: convert over to ralloc")
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
If the variable's going to be static, we shouldn't be memsetting it
from every thread and instead just have it in the data section.
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
Fixes: b5e04e9217b "radv: Support allocating variable size descriptor sets."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111019
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
| |
Cc: [email protected]
|
|
|
|
|
|
|
|
|
| |
On Haswell, the format works but it doesn't properly do an sRGB decode.
It appears to act identically to R8G8B8_UNORM. Only Vulkan uses this
format so this only affects Vulkan on HSW.
Cc: [email protected]
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
| |
Fixes: 9beb3391b55 ("pan/midgard: Tag SSA/reg")
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes:
dEQP-GLES3.functional.rasterizer_discard.basic.write_depth_points
dEQP-GLES3.functional.rasterizer_discard.basic.write_stencil_points
dEQP-GLES3.functional.rasterizer_discard.fbo.write_depth_points
dEQP-GLES3.functional.rasterizer_discard.fbo.write_stencil_points
dEQP-GLES3.functional.rasterizer_discard.scissor.write_depth_points
dEQP-GLES3.functional.rasterizer_discard.scissor.write_stencil_points
Signed-off-by: Gert Wollny <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
To select the correct layer the z-coordinate must be rounded before it
is multiplied by six.
Fixes a number of tests out of
dEQP-GLES31.functional.texture.filtering.cube_array.formats.*
Signed-off-by: Gert Wollny <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
v2: Eric's nits
v3: Reuse timespec utils (Daniel)
Deal with ppoll being interrupted by a signal (Daniel)
v4: Remove unnecessary time check
v5: Deal with EAGAIN from wl_display_prepare_read_queue() (Daniel)
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]> (v2)
Reviewed-by: Daniel Stone <[email protected]>
|
|
|
|
|
|
|
|
| |
Copied from Weston, upon Daniel's suggestion
Signed-off-by: Lionel Landwerlin <[email protected]>
Suggested-by: Daniel Stone <[email protected]>
Reviewed-by: Daniel Stone <[email protected]>
|
|
|
|
|
|
|
|
|
| |
For now, this keeps the "100 bytes" allocation; we can try to figure out
the correct size as a follow up.
Suggested-by: Lionel Landwerlin <[email protected]>
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
| |
It causes issues on GFX10.
This fixes rendering issues with vkmark and Wreckfest at least.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]
|
|
|
|
|
|
|
|
|
|
| |
This might happen when a pipeline doesn't define the vertex input
state, so the buffer data format is 0 (aka INVALID).
This fixes crashes when compiling some shaders on GFX10.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Seems to fix some hair artifacts in Max Payne 3:
https://github.com/daniel-schuermann/mesa/issues/76
Signed-off-by: Rhys Perry <[email protected]>
Fixes: f4e499ec791 ('radv: add initial non-conformant radv vulkan driver')
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Totals from affected shaders:
SGPRS: 376 -> 376 (0.00 %)
VGPRS: 620 -> 560 (-9.68 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 292 -> 292 (0.00 %) dwords per thread
Code Size: 20024 -> 20144 (0.60 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 25 -> 25 (0.00 %)
Wait states: 0 -> 0 (0.00 %)
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit rewrites opt_find_array_copies to be able to handle
an array copy sequence with other intervening operations in between. In
particular, this handles the case where we OpLoad an array of structs
and then OpStore it, which generates code like:
foo[0].a = bar[0].a
foo[0].b = bar[0].b
foo[1].a = bar[1].a
foo[1].b = bar[1].b
...
that wasn't recognized by the previous pass.
In order to correctly handle copying arrays of arrays, and in particular
to correctly handle copies involving wildcards, we need to use a tree
structure similar to lower_vars_to_ssa so that we can walk all the
partial array copies invalidated by a particular write, including
ones where one of the common indices is a wildcard. I actually think
that when factoring in the needed hashing/comparing code, a hash table
based approach wouldn't be a lot smaller anyways.
All of the changes come from tessellation control shaders in Strange
Brigade, where we're able to remove the DXVK-inserted copy at the
beginning of the shader. These are the result for radv:
Totals from affected shaders:
SGPRS: 4576 -> 4576 (0.00 %)
VGPRS: 13784 -> 5560 (-59.66 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 8696 -> 6876 (-20.93 %) dwords per thread
Code Size: 329940 -> 263268 (-20.21 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 330 -> 898 (172.12 %)
Wait states: 0 -> 0 (0.00 %)
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
| |
We print the size as decimal too, and using hex without a leading "0x"
was very confusing.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We were missing handling for a few other ops that rearrange their
sources somehow in codegen, namely complex2 and select.
This should fix [email protected]@execution@built-in-functions@vs-asin-vec3
and possibly other random regressions from the new scheduler which were
supposed to be fixed in the commit right after.
Fixes: 54434fe6706 ("lima/gpir: Rework the scheduler")
Signed-off-by: Connor Abbott <[email protected]>
Acked-by: Qiang Yu <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Remove an unnecessary nir_lower_regs_to_ssa as that should be done by
the state tracker, and add a missing DCE pass after running copy
propagation in order to remove the dead copies. This shouldn't fix
anything but the second part will reduce shader sizes.
Signed-off-by: Connor Abbott <[email protected]>
Reviewed-by: Qiang Yu <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In try_node(), we assume that the node we pick can still be scheduled
successfully after speculatively trying all the other nodes. Normally we
always undo every node after speculating it, so that when we finally
schedule best_node the scheduler state is exactly the same and it
succeeds. However, we also try to spill nodes, which can change the
state and in a corner case that can make scheduling best_node fail. In
particular, the following sequence of events happened with piglit
shaders@glsl-vs-if-nested: a partially-ready node N was spilled and a
register store node S, which is a use of N, was created and then later
the other uses of N were scheduled, so that S is now ready and N is
partially ready. First we try to schedule S and succeed, then we try to
schedule another node M, which fails, so we try to spill the remaining
uses of N. This succeeds, but scheduling M still fails so that best_node
is still S. However since one of the uses of N is one cycle ago, and
therefore we inserted a read dependent on S one cycle ago when spilling
N, S can no longer be scheduled as read-after-write latency is three
cycles.
While we could ad-hoc try to catch cases like this, or (the best option
but very complicated) treat the spill as speculative and roll it back if
we decide not to schedule the node, a simpler solution is to just
give up on spilling if we've already successfully speculatively
scheduled another node. We'd give up a few cases where we discover that
by spilling even harder we could schedule a more desirable node, but
that seems like it would be pretty rare in practice. With this we
guarantee that nothing has been touched after best_node was successfully
scheduled. We also cut down on pointless spilling, since if we already
scheduled a node it's unlikely that spilling harder will let us schedule
an even better node, and hence any spilling at this point is probably
useless.
While we're here, clean up the code around spilling by flattening the
two if's and getting rid of the second unnecessary check for INT_MIN.
Fixes: 54434fe6706 ("lima/gpir: Rework the scheduler")
Acked-by: Qiang Yu <[email protected]>
Signed-off-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
With OpenCL, kernels can take arguments and return values (?). However
in practice, there is no more TGSI compute implementation, and even if
there were, it would probably have named functions and no explicit main.
This improves RA considerably for compute shaders, since temps are not
kept around as return values.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Karol Herbst <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This can happen if it's e.g. a uniform or a function argument.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111217
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Karol Herbst <[email protected]>
Cc: [email protected]
|