| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
MaxPicOrderCntLsb should be at least 16 according to the spec,
therefore add minimum value check.
Also use poc value passed from st instead of calculation
in slice header encoding.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110673
Cc: [email protected]
V2: Fix typo
V3: Use MAX2 macro instead of coding. Also MaxPicOrderCntLsb
should be power of 2 according to spec.
Signed-off-by: Boyuan Zhang <[email protected]>
Acked-by: Leo Liu <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
MaxPicOrderCntLsb should be at least 16 according to the spec,
therefore add minimum value check.
Also use poc value passed from st instead of calculation
in slice header encoding.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110673
Cc: [email protected]
V2: Fix typo
V3: Use MAX2 macro instead of coding. Also MaxPicOrderCntLsb
should be power of 2 according to spec.
Signed-off-by: Boyuan Zhang <[email protected]>
Acked-by: Leo Liu <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We don't have calculate final quotient in order to calculate unsigned
modulo result. Once we are done with error correction we have partial
result which can be used to find out modulo operation result
Signed-off-by: Sagar Ghuge <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Make scalar scheduling onto vector units more aggressive (it can only
help while we schedule strictly in order). Also, allow imov on VLUT.
total bundles in shared programs: 2176 -> 2117 (-2.71%)
bundles in affected programs: 901 -> 842 (-6.55%)
helped: 24
HURT: 0
helped stats (abs) min: 1 max: 18 x̄: 2.46 x̃: 2
helped stats (rel) min: 2.08% max: 20.00% x̄: 8.68% x̃: 5.94%
95% mean confidence interval for bundles value: -3.93 -0.99
95% mean confidence interval for bundles %-change: -10.92% -6.45%
Bundles are helped.
total quadwords in shared programs: 3605 -> 3566 (-1.08%)
quadwords in affected programs: 1984 -> 1945 (-1.97%)
helped: 28
HURT: 5
helped stats (abs) min: 1 max: 3 x̄: 1.68 x̃: 2
helped stats (rel) min: 1.02% max: 14.29% x̄: 5.12% x̃: 2.94%
HURT stats (abs) min: 1 max: 3 x̄: 1.60 x̃: 1
HURT stats (rel) min: 0.57% max: 9.09% x̄: 6.40% x̃: 9.09%
95% mean confidence interval for quadwords value: -1.67 -0.69
95% mean confidence interval for quadwords %-change: -5.37% -1.37%
Quadwords are helped.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes inlining of integer constants.
total quadwords in shared programs: 3585 -> 3568 (-0.47%)
quadwords in affected programs: 625 -> 608 (-2.72%)
helped: 13
HURT: 0
helped stats (abs) min: 1 max: 2 x̄: 1.31 x̃: 1
helped stats (rel) min: 1.27% max: 9.52% x̄: 3.84% x̃: 2.94%
95% mean confidence interval for quadwords value: -1.60 -1.02
95% mean confidence interval for quadwords %-change: -5.60% -2.07%
Quadwords are helped.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We loosen the requirement of "no dependencies" to simply be "no
non-pipelined dependencies", so we check for what could be pipelined.
total bundles in shared programs: 2176 -> 2156 (-0.92%)
bundles in affected programs: 779 -> 759 (-2.57%)
helped: 20
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.33% max: 20.00% x̄: 6.47% x̃: 2.78%
95% mean confidence interval for bundles value: -1.00 -1.00
95% mean confidence interval for bundles %-change: -9.44% -3.50%
Bundles are helped.
total quadwords in shared programs: 3605 -> 3585 (-0.55%)
quadwords in affected programs: 1391 -> 1371 (-1.44%)
helped: 20
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.19% max: 14.29% x̄: 3.84% x̃: 1.64%
95% mean confidence interval for quadwords value: -1.00 -1.00
95% mean confidence interval for quadwords %-change: -5.73% -1.94%
Quadwords are helped.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Rather than bailing if we see something that's not SSA, do out the
analysis to check if we can pipeline and do so if we can.
total registers in shared programs: 392 -> 391 (-0.26%)
registers in affected programs: 3 -> 2 (-33.33%)
helped: 1
HURT: 0
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
| |
This facilitates analysis of vec4 registers (after going out-of-SSA).
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
| |
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Rather than always emitting an extra move for fragments, check the
actual criteria and emit accordingly. (This was lost during the RA
improvements at the end of May).
total bundles in shared programs: 2210 -> 2176 (-1.54%)
bundles in affected programs: 501 -> 467 (-6.79%)
helped: 34
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 1.59% max: 33.33% x̄: 13.13% x̃: 12.50%
95% mean confidence interval for bundles value: -1.00 -1.00
95% mean confidence interval for bundles %-change: -16.06% -10.21%
Bundles are helped.
total quadwords in shared programs: 3639 -> 3605 (-0.93%)
quadwords in affected programs: 795 -> 761 (-4.28%)
helped: 34
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.96% max: 33.33% x̄: 11.22% x̃: 8.33%
95% mean confidence interval for quadwords value: -1.00 -1.00
95% mean confidence interval for quadwords %-change: -14.31% -8.13%
Quadwords are helped.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Think of this pass as register coalescing part 2. After RA runs, but
before scheduling, we scan for code of the form:
mov rN, rN
and delete the move, since it's totally redundant. This pass helps
already, but it'd of course be much more effective paired with
register coalescing to encourage moves in general to end up in this
form. Nevertheless, even by itself:
total instructions in shared programs: 3665 -> 3613 (-1.42%)
instructions in affected programs: 2046 -> 1994 (-2.54%)
helped: 52
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.19% max: 25.00% x̄: 8.02% x̃: 4.00%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -10.26% -5.79%
Instructions are helped.
total bundles in shared programs: 2256 -> 2213 (-1.91%)
bundles in affected programs: 1154 -> 1111 (-3.73%)
helped: 43
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.33% max: 25.00% x̄: 9.10% x̃: 5.56%
95% mean confidence interval for bundles value: -1.00 -1.00
95% mean confidence interval for bundles %-change: -11.60% -6.60%
Bundles are helped.
total quadwords in shared programs: 3689 -> 3642 (-1.27%)
quadwords in affected programs: 2025 -> 1978 (-2.32%)
helped: 47
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.19% max: 25.00% x̄: 7.86% x̃: 3.85%
95% mean confidence interval for quadwords value: -1.00 -1.00
95% mean confidence interval for quadwords %-change: -10.30% -5.42%
Quadwords are helped.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
| |
To be used with redundant move elimination.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
total instructions in shared programs: 3916 -> 3665 (-6.41%)
instructions in affected programs: 1405 -> 1154 (-17.86%)
helped: 35
HURT: 0
helped stats (abs) min: 1 max: 21 x̄: 7.17 x̃: 3
helped stats (rel) min: 3.00% max: 28.57% x̄: 20.11% x̃: 21.74%
95% mean confidence interval for instructions value: -9.35 -4.99
95% mean confidence interval for instructions %-change: -22.75% -17.46%
Instructions are helped.
total bundles in shared programs: 2472 -> 2256 (-8.74%)
bundles in affected programs: 906 -> 690 (-23.84%)
helped: 32
HURT: 0
helped stats (abs) min: 1 max: 18 x̄: 6.75 x̃: 3
helped stats (rel) min: 5.56% max: 32.26% x̄: 20.83% x̃: 16.67%
95% mean confidence interval for bundles value: -9.09 -4.41
95% mean confidence interval for bundles %-change: -23.77% -17.89%
Bundles are helped.
total quadwords in shared programs: 3965 -> 3689 (-6.96%)
quadwords in affected programs: 1568 -> 1292 (-17.60%)
helped: 35
HURT: 0
helped stats (abs) min: 1 max: 21 x̄: 7.89 x̃: 3
helped stats (rel) min: 2.08% max: 28.57% x̄: 19.87% x̃: 20.00%
95% mean confidence interval for quadwords value: -10.38 -5.39
95% mean confidence interval for quadwords %-change: -22.57% -17.17%
Quadwords are helped.
total registers in shared programs: 411 -> 392 (-4.62%)
registers in affected programs: 76 -> 57 (-25.00%)
helped: 15
HURT: 0
helped stats (abs) min: 1 max: 2 x̄: 1.27 x̃: 1
helped stats (rel) min: 9.09% max: 50.00% x̄: 30.97% x̃: 33.33%
95% mean confidence interval for registers value: -1.52 -1.01
95% mean confidence interval for registers %-change: -39.12% -22.82%
Registers are helped.
total threads in shared programs: 426 -> 432 (1.41%)
threads in affected programs: 6 -> 12 (100.00%)
helped: 3
HURT: 0
helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The source and destination were incorrectly flipped in the move, but
some details of our internal regalloc made this function anyway. Now
that we're changing the regalloc, we need to fix this to avoid
regressing blend shaders.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We shouldn't try to schedule onto a vmul if the last unit was a smul;
that would force a break ("traveling back in time").
total bundles in shared programs: 2519 -> 2472 (-1.87%)
bundles in affected programs: 791 -> 744 (-5.94%)
helped: 20
HURT: 0
helped stats (abs) min: 1 max: 9 x̄: 2.35 x̃: 1
helped stats (rel) min: 1.52% max: 11.76% x̄: 7.94% x̃: 7.69%
95% mean confidence interval for bundles value: -3.47 -1.23
95% mean confidence interval for bundles %-change: -9.36% -6.51%
Bundles are helped.
total quadwords in shared programs: 4028 -> 3965 (-1.56%)
quadwords in affected programs: 1223 -> 1160 (-5.15%)
helped: 17
HURT: 0
helped stats (abs) min: 1 max: 17 x̄: 3.71 x̃: 2
helped stats (rel) min: 2.97% max: 10.64% x̄: 6.97% x̃: 7.14%
95% mean confidence interval for quadwords value: -5.71 -1.70
95% mean confidence interval for quadwords %-change: -8.03% -5.91%
Quadwords are helped.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
The swizzle should be taken on the masked component, rather than
unconditionally X.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
This is a special case of DCE designed to run after the out-of-ssa pass
to cleanup special register lowering.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
| |
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
| |
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
| |
We mixed up component_lo and full, which made it appear that we had
less freedom in RA than we actually do. Fix this to fix some
disassemblies as well as prepare for RA with the bias field.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
Following the RA work, we apply the same technique to eliminate the move
to r27 when loading cubemaps.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
As per previous commit, Meson doesn't support using uninstalled libs,
they're simply not ready until `ninja install` is ran, so delete them.
Suggested-by: Jason Ekstrand <[email protected]>
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]> # for anv
Reviewed-by: Eric Anholt <[email protected]> # for tu
Reviewed-by: Bas Nieuwenhuizen <[email protected]> # for radv
|
|
|
|
|
|
|
|
|
|
|
| |
We need 5 images:
1) CPU work
2) GPU work
3) idle
4) queued for flip
5) presenting
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Otherwise the wait only happens at flip time, which messes with
keeping idle buffers around if the GPU work makes the image miss
the next flip.
I decided not to use the wait fences as those are still xshm fences,
so that means we'd still have to wait in the application. Just doing
it before presenting makes things simpler.
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
| |
This allows doing a potential long blocking operation before present.
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
| |
Much easier to work with if we want to use them in the WS-specific
WSI implementation.
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Not only variables can be flagged as NonUniformEXT but also
expressions. We're currently ignoring it in an expression such as :
imageLoad(data[nonuniformEXT(rIndex)], 0)
The associated SPIRV :
OpDecorate %69 NonUniformEXT
...
%69 = OpLoad %61 %68
This changes propagates access qualifiers through ssa & pointers so
that when it hits a OpLoad/OpStore style instructions, qualifiers are
not forgotten.
Fixes failure the following tests :
dEQP-VK.descriptor_indexing.*
Signed-off-by: Lionel Landwerlin <[email protected]>
Fixes: 8ed583fe523703 ("spirv: Handle the NonUniformEXT decoration")
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This refactor allows for common code to apply decoration on all
ssa/pointer values. In particular this will allow to propagage access
qualifiers.
Signed-off-by: Lionel Landwerlin <[email protected]>
Suggested-by: Caio Marcelo de Oliveira Filho <[email protected]>
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
SPIRV added the ability to access variables and have expressions non
dynamically uniform and because spirv_to_nir generates deref
instructions, we'll need to have that access there.
Signed-off-by: Lionel Landwerlin <[email protected]>
Cc: <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
|
|
|
|
|
|
|
| |
Old program was overwritten without release of memory.
Signed-off-by: Yevhenii Kolesnikov <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
| |
Fixes leaking memory on iris.
Signed-off-by: Yevhenii Kolesnikov <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This needs to take the vertex count from the provided transform
feedback buffer.
v2:
- don't take the vertex count from the underlying buffer, instead,
take it from a v3d subclass of pipe_stream_output_target (Eric).
Fixes piglit tests:
spec/ext_transform_feedback2/draw-auto
spec/ext_transform_feedback2/draw-auto instanced
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
This reverts commit 4508f43eed5a4528f0e8ca9d1cfcdc78857043e0, which
broke a bunch of dEQP tests (e.g. in
dEQP-GLES2.functional.draw.draw_arrays.*)
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
It's totally implementable, it's just that the plumbing is a bit
different and we never hooked it up. Don't advertise a broken feature.
Fixes: 36ee2fd61c8 "anv: Implement the basic form of VK_EXT_transform_feedback"
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Iago Toral Quiroga fixed this in commit 94f740e3fce0cb26e4d90cb9de75b,
but it recently regressed in 0d8826f723cd8868b5271f17f18a1ab4548a1199.
Quoting Iago's original commit message for the fix:
GetTex*Image should return INVALID_ENUM if target is not valid, however,
GetTextureImage does not receive a target, and instead should return
INVALID_OPERATION if the effective target is not valid. From the
OpenGL 4.6 core profile spec, section 8.11 Texture Queries:
"An INVALID_OPERATION error is generated by GetTextureImage if the
effective target is not one of TEXTURE_1D, TEXTURE_2D, TEXTURE_3D,
TEXTURE_1D_ARRAY, TEXTURE_2D_ARRAY, TEXTURE_CUBE_MAP_ARRAY,
TEXTURE_RECTANGLE, or TEXTURE_CUBE_MAP (for GetTextureImage only)."
Note that this differs from the original ARB_direct_state_access spec.
However, the EXT_direct_state_access version does take a target
parameter, so it should continue reporting INVALID_ENUM.
Fixes KHR-GL45.direct_state_access.textures_image_query_errors.
Fixes: 0d8826f723c ("mesa: refactor get_texture_image to remove duplicate code")
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
In a few cases, we switch to MI_MATH instead of MI_PREDICATE,
just because we were already doing math and it's easier to chain
together.
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
|
|
|
|
|
|
| |
This tests that predicated stores work.
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
|
|
|
|
|
|
|
| |
This performs predicated MI_STORE_REGISTER_MEM commands, assuming that
the condition is already loaded into MI_PREDICATE_DATA.
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
|
|
|
|
|
|
| |
These provide comparisons against zero.
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
|
|
|
|
| |
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
|
|
|
|
|
|
|
|
| |
We might want to resolve something to be in a particular register,
so we can access it outside of the gen_mi framework...but it may already
be in that register, at which point there's no work to do.
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
|
|
|
|
|
|
| |
This will let us use Jason's new MI-builder shortly.
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
|
|
|
|
|
|
| |
It's going to be in genxml code shortly.
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
|
|
|
|
|
|
|
| |
This will let us put the genxml boilerplate in one place, before we
expand genxml to more files shortly. Like i965/genX_boilerplate.h.
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
|
|
|
|
|
|
|
| |
This lets us specify the prototypes once, instead of cut and pasting
them per generation. isl uses a similar approach (isl_genX_priv.h).
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
|
|
|
|
|
|
| |
Set the correct relative offset.
Fixes: f8b6c5a "radeonsi: rewrite si_get_opaque_metadata, also for gfx10 support"
|
|
|
|
| |
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
| |
I'm using this for shader-db analysis on x86_64 systems.
Reviewed-by: Rob Clark <[email protected]>
|