| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
pctx->resource_copy_region() needs to fall back to sw copy for
non-renderable formats. But previously for things that we could
not use the blitter for, would fall back to 3d. Which won't work
if 3d can't render to the dst format either.
Instead rework things to fallback to fd_resource_copy_region(),
which will try 3d core and then fall back to memcpy().
Fixes (for example) dEQP-GLES3.functional.texture.format.sized.2d.rgb9_e5_pot
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
Code-motion prep for next patch.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
We already had one for is_snorm() but not unorm.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Fixes a crash with unsupported formats in dEQP-GLES3.functional.texture.format.sized.2d.rgb9_e5_pot
Also fixes gpu hangs with some formats that are supported, but which we
don't know what internal-format to use for the blitter, for ex
dEQP-GLES3.functional.texture.format.sized.2d_array.rgb10_a2_pot
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
| |
Fixes: 0d240c22141 freedreno/ir3: don't fetch unused tex components
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
Fixes a crash in dEQP-GLES3.functional.shaders.fragdepth.compare.fragcoord_z
Fixes: 0d240c22141 freedreno/ir3: don't fetch unused tex components
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
If we emit shader as a pointer to a GEM object, also set the RELOC_DUMP
flag as a hint to kernel that this is a useful buffer to snapshot for
debug dumps.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
With a recent enough kernel, set debug names for GEM BOs, which will
show up in $debugfs/gem
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Pull in updated UAPI and use kernel API version to enable softpin.
Since MSM_SUBMIT_BO_DUMP flag was added at same time, use that to
signal to kernel that cmdstream buffers are useful to dump for
debugging/cmdstream-traces.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
I needed the same function for v3d. This was originally in d3e046e76c06
("nir: Pull some of intel's image load/store format conversion to
nir_format.h") before we made am istake about simplifying the function.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
| |
This reverts commit 06fbcd2cd5cc5702c9039c26d20082a99bc157bf.
nir_pack_half_2x16_split *isn't* vectorizable, it's 1-component only, thus
why we had this split-scalar code in the first place.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
| |
This helps a lot when debugging image load/store lowering on large
testcases. Unfortunately the Mesa enum name stuff is under src/mesa and
we can't get at it from the compiler.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
| |
We have a NIR path, and V3D doesn't have TGSI input for compute (only what
TTN can handle for the various gallium-internal shaders).
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
If we gave this function 0 counter buffers, we'd still try and
access pCounterBuffers[0] as this check was incorrect.
Fixes crash with ext_transform_feedback-pipeline-basic-primgen
on zink on radv.
Fixes: 677b496b6 (radv: fix begin/end transform feedback with 0 counter buffers.)
Signed-off-by: Dave Airlie <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The pass should work for all bit sizes but it's less clear that the
extra instructions are worth it on small integers. Also, the hardware
doesn't do mul_high on anything other than 32-bit integers and, absent
any decent mechanism for testing the pass on 8 and 16-bit types, it's
probably best to just leave it disabled for now.
Shader-db results on Sky Lake:
total instructions in shared programs: 15105795 -> 15111403 (0.04%)
instructions in affected programs: 72774 -> 78382 (7.71%)
helped: 0
HURT: 265
Note that hurt here actually means helped because we're getting rid of
integer quotient operations (which are a send on some platforms!) and
replacing them with fairly cheap ALU ops.
Reviewed-by: Ian Romanick [email protected]
|
|
|
|
| |
Reviewed-by: Ian Romanick [email protected]
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
It's a reasonably well-known fact in the world of compilers that integer
divisions by constants can be replaced by a multiply, an add, and some
shifts. This commit adds such an optimization to NIR for easiest case
of udiv. Other division operations will be added in following commits.
In order to provide some additional driver control, the pass takes a
minimum bit size to optimize.
Reviewed-by: Ian Romanick [email protected]
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
| |
Reviewed-by: Ian Romanick [email protected]
|
|
|
|
| |
Reviewed-by: Ian Romanick [email protected]
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently we have the three dri "platforms" - drm, apple and windows.
Since xf86vidmode is a thing only for the drm one, adjust the
preprocessor guards and correctly check for the dependency.
v2: terminate the GLX_USE_WINDOWSGL hunk
Cc: Jon TURNEY <[email protected]>
Fixes: 5bc509363b6 ("glx: make xf86vidmode mandatory for direct rendering")
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Dylan Baker <[email protected]>
Acked-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
| |
To avoid the following warning:
./src/compiler/nir/nir_loop_analyze.c:807:16: warning: unused variable ‘ns’ [-Wunused-variable]
nir_shader *ns = impl->function->shader;
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Virglrenderer does the wrong thing when given an instance divisor;
it tries to use the element-index rather than the binding-index as
the argument to glVertexBindingDivisor(). This worked fine as long
as there was a 1:1 relationship between elements and bindings,
which was the case util 19a91841c34 "st/mesa: Use Array._DrawVAO in
st_atom_array.c.".
So let's detect instance divisors, and restore a 1:1 relationship in
that case. This will make old versions of virglrenderer behave
correctly. For newer versions, we can consider making a better
interface, where the instance divisor isn't specified per element,
but rather per binding. But let's save that for another day.
Signed-off-by: Erik Faye-Lund <[email protected]>
Fixes: 19a91841c34 "st/mesa: Use Array._DrawVAO in st_atom_array.c."
Reviewed-by: Mathias Fröhlich <[email protected]>
Tested-By: Gert Wollny <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This just has one member for now; the handle. But this is about to
change.
Signed-off-by: Erik Faye-Lund <[email protected]>
Reviewed-by: Mathias Fröhlich <[email protected]>
Tested-By: Gert Wollny <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Erik Faye-Lund <[email protected]>
Reviewed-by: Mathias Fröhlich <[email protected]>
Tested-By: Gert Wollny <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Erik Faye-Lund <[email protected]>
Reviewed-by: Mathias Fröhlich <[email protected]>
Tested-By: Gert Wollny <[email protected]>
|
|
|
|
| |
Signed-off-by: Juan A. Suarez Romero <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Juan A. Suarez Romero <[email protected]>
(cherry picked from commit e90429cc6dc5b5ada4253fc0a8517645d86a4f6c)
|
|
|
|
|
| |
Signed-off-by: Juan A. Suarez Romero <[email protected]>
(cherry picked from commit 419ee20097597bed77e73fd283e6d15e8dcb89e9)
|
|
|
|
|
|
|
| |
This is always TRUE if htile_size is not 0.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
When hile_size is 0, we can't enable HTILE. This doesn't change
anything, except not calling radv_image_alloc_htile().
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Otherwise, Yakuza hangs the GPU with DXVK. We don't know if
linetrip and pointlist are affected, so my point is to do that
only for triangle strips.
Cc: [email protected]
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Feral games aren't affected because they don't decompress DCC.
F1 2018 has one DCC decompression per frame, but I don't see
any performance improvements. This new predicate will be
probably more useful for DCC/MSAA.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
It's somehow similar to the FCE predicate.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
| |
This was noted by Jason in review when I tried to make a helper for the
old path.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
| |
I needed the same functions for v3d. Note that the color value in the
Intel lowering has already been cut down to image.chans num_components.
v2: Drop the half float one, since it was a 1-liner after cleanup.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
| |
Most of the bits were constant, but a few were missed. Avoids warnings
from v3d's upcoming static const bits declarations.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This allows loop analysis to detect inductions variables that
are incremented in both branches of an if rather than in a main
loop block. For example:
loop {
block block_1:
/* preds: block_0 block_7 */
vec1 32 ssa_8 = phi block_0: ssa_4, block_7: ssa_20
vec1 32 ssa_9 = phi block_0: ssa_0, block_7: ssa_4
vec1 32 ssa_10 = phi block_0: ssa_1, block_7: ssa_4
vec1 32 ssa_11 = phi block_0: ssa_2, block_7: ssa_21
vec1 32 ssa_12 = phi block_0: ssa_3, block_7: ssa_22
vec4 32 ssa_13 = vec4 ssa_12, ssa_11, ssa_10, ssa_9
vec1 32 ssa_14 = ige ssa_8, ssa_5
/* succs: block_2 block_3 */
if ssa_14 {
block block_2:
/* preds: block_1 */
break
/* succs: block_8 */
} else {
block block_3:
/* preds: block_1 */
/* succs: block_4 */
}
block block_4:
/* preds: block_3 */
vec1 32 ssa_15 = ilt ssa_6, ssa_8
/* succs: block_5 block_6 */
if ssa_15 {
block block_5:
/* preds: block_4 */
vec1 32 ssa_16 = iadd ssa_8, ssa_7
vec1 32 ssa_17 = load_const (0x3f800000 /* 1.000000*/)
/* succs: block_7 */
} else {
block block_6:
/* preds: block_4 */
vec1 32 ssa_18 = iadd ssa_8, ssa_7
vec1 32 ssa_19 = load_const (0x3f800000 /* 1.000000*/)
/* succs: block_7 */
}
block block_7:
/* preds: block_5 block_6 */
vec1 32 ssa_20 = phi block_5: ssa_16, block_6: ssa_18
vec1 32 ssa_21 = phi block_5: ssa_17, block_6: ssa_4
vec1 32 ssa_22 = phi block_5: ssa_4, block_6: ssa_19
/* succs: block_1 */
}
Unfortunatly GCM could move the addition out of the if for us
(making this patch unrequired) but we still cannot enable the GCM
pass without regressions.
This unrolls a loop in Rise of The Tomb Raider.
vkpipeline-db results (VEGA):
Totals from affected shaders:
SGPRS: 88 -> 96 (9.09 %)
VGPRS: 56 -> 52 (-7.14 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 2168 -> 4560 (110.33 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 4 -> 4 (0.00 %)
Wait states: 0 -> 0 (0.00 %)
Reviewed-by: Thomas Helland <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=32211
|
|
|
|
| |
Reviewed-by: Thomas Helland <[email protected]>
|
|
|
|
|
|
|
| |
This will allow us to improve analysis to find more induction
variables.
Reviewed-by: Thomas Helland <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Removing the last continue can allow more loops to unroll. Also
inserting code into the if branch can allow the various if opts
to progress further.
The insertion of some loops into the if branch also reduces VGPR
use in some shaders.
vkpipeline-db results (VEGA):
Totals from affected shaders:
SGPRS: 6552 -> 6576 (0.37 %)
VGPRS: 6544 -> 6532 (-0.18 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 481952 -> 478032 (-0.81 %) bytes
LDS: 13 -> 13 (0.00 %) blocks
Max Waves: 241 -> 242 (0.41 %)
Wait states: 0 -> 0 (0.00 %)
Shader-db results radeonsi (VEGA):
Totals from affected shaders:
SGPRS: 168 -> 168 (0.00 %)
VGPRS: 144 -> 140 (-2.78 %)
Spilled SGPRs: 157 -> 157 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 8524 -> 8488 (-0.42 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 7 -> 7 (0.00 %)
Wait states: 0 -> 0 (0.00 %)
v2: (Timothy Arceri):
- allow for continues in either branch
- move any trailing loops inside the if as well as blocks.
- leave nir_opt_trivial_continues() to actually remove the
continue.
Reviewed-by: Thomas Helland <[email protected]>
Signed-off-by: Timothy Arceri <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=32211
|
|
|
|
|
|
|
| |
Here we rework force_unroll_array_access() so that we can reuse
the induction variable detection in a following patch.
Reviewed-by: Thomas Helland <[email protected]>
|
|
|
|
|
| |
Acked-by: Jason Ekstrand <[email protected]>
Reviewed-by: Thomas Helland <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This documents a process for using GitLab Merge Requests as an second
way to submit code changes for Mesa. Only one of the two methods is
allowed for each patch series.
We will *not* require all patches to be emailed. Some code changes may
be reviewed and merged without any discussion on the mesa-dev email
list.
v2:
* No longer require email. Allow submitter to choose email or a
GitLab merge request.
* Various feedback from Brian, Daniel, Dylan, Eric, Erik, Jason,
Matt, Michel and Rob.
Signed-off-by: Jordan Justen <[email protected]>
Acked-by: Jason Ekstrand <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Acked-by: Dylan Baker <[email protected]>
Reviewed-by: Erik Faye-Lund <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
Acked-by: Bas Nieuwenhuizen <[email protected]>
Acked-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Error message building freedreno Gallium driver with meson:
../src/gallium/drivers/freedreno/freedreno_fence.c:27:21: fatal error: libsync.h: No such file or directory
\#include <libsync.h>
Fixes: 4aa69cc4257 ("meson: build freedreno")
Signed-off-by: Rhys Kidd <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
Reviewed-by: Dylan Baker <[email protected]>
|