| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The driver only supports up to 8 samples, so it's useless to
create more pipelines than needed.
This fixes a conditional jump reported by Valgrind on GFX10:
==194282== Conditional jump or move depends on uninitialised value(s)
==194282== at 0xDBF925A: radv_gfx10_compute_bin_size (radv_pipeline.c:3242)
==194282== by 0xDBF95A6: radv_pipeline_generate_binning_state (radv_pipeline.c:3334)
==194282== by 0xDBFC1A0: radv_pipeline_generate_pm4 (radv_pipeline.c:4440)
==194282== by 0xDBFD15E: radv_pipeline_init (radv_pipeline.c:4764)
==194282== by 0xDBFD23E: radv_graphics_pipeline_create (radv_pipeline.c:4788)
==194282== by 0xDBB95A3: create_pipeline (radv_meta_clear.c:114)
==194282== by 0xDBB9AC5: create_color_pipeline (radv_meta_clear.c:297)
==194282== by 0xDBBCF05: radv_device_init_meta_clear_state (radv_meta_clear.c:1277)
==194282== by 0xDB9ACD9: radv_device_init_meta (radv_meta.c:363)
==194282== by 0xDB7FE3A: radv_CreateDevice (radv_device.c:2080
This is caused by an out of bound access of 'fmask_array' (ie. index
is 4 as for 16 samples).
Cc: <[email protected]>
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
v2: Introduce the appropriate pipe controls
Properly deal with changes in metric sets (using execbuf parameter)
Record marker at query end
v3: Fill out PerfCntr1&2
v4: Introduce vkUninitializePerformanceApiINTEL
v5: Use new execbuf extension mechanism
v6: Fix comments in genX_query.c (Rafael)
Use PIPE_CONTROL workarounds (Rafael)
Refactor on the last kernel series update (Lionel)
v7: Only I915_PERF_IOCTL_CONFIG when perf stream is already opened (Lionel)
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Rafael Antognolli <[email protected]>
|
|
|
|
|
|
|
| |
Those are not part of the OA reports.
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Rafael Antognolli <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Rafael Antognolli <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
We have 2 of those we can configure to source programmable events.
Those are not part of the OA reports. Configuration happens in i915
through the metric set selected by the application. On the Mesa side
we'll just sample those and do a diff.
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Rafael Antognolli <[email protected]>
|
|
|
|
|
|
|
| |
We use this as a communication mechanism between MDAPI & Anv.
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Rafael Antognolli <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Pull new updates from drm-next as of the following commit:
commit f1b4a9217efd61d0b84c6dc404596c8519ff6f59
Merge: 400e91347e1d f3a36d469621
Author: Dave Airlie <[email protected]>
Date: Tue Oct 22 15:04:00 2019 +1000
Merge tag 'du-next-20191016' of git://linuxtv.org/pinchartl/media into drm-next
Signed-off-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
| |
Will conflict with the genxml RPSTAT register.
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Rafael Antognolli <[email protected]>
|
|
|
|
|
|
|
|
| |
We want to query the content of register configurations from the
kernel. Let's pull this out of the query.
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Rafael Antognolli <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The Vulkan performance query extension is a bit lower level than the
GL one. Expose some of the functions to do the result accumulation
directly in the Anv driver.
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Rafael Antognolli <[email protected]>
|
|
|
|
|
|
|
| |
A simple utility to put the marker at the right location.
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Rafael Antognolli <[email protected]>
|
|
|
|
|
|
| |
Other debug_printf's in this file are in if (0) blocks.
Trivial.
|
|
|
|
|
|
|
|
| |
This is useful for PBO texture upload with GL_RGB and GL_UNSIGNED_BYTE.
v2: Vasily Khoruzhick provided an update for the Lima CI expectations.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
We're skipping the context destruction in some cases which is the
grand scheme of thing is not that important because closing device->fd
will destroy the associated context as well.
Signed-off-by: Lionel Landwerlin <[email protected]>
Reported-by: Jordan Justen <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
Cc: <[email protected]>
Fixes: b30e01aef56 ("anv: fix memory leak on device destroy")
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
or export"
We don't properly pass on ctx.lgkm_cnt/ctx.barrier_imm/etc, so this
waitcnt was necessary for barriers and correctly waiting for SMEM before
s_dcache_wb on GFX10.
Totals from affected shaders:
SGPRS: 33200 -> 33200 (0.00 %)
VGPRS: 31376 -> 31376 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 2431804 -> 2433956 (0.09 %) bytes
LDS: 316 -> 316 (0.00 %) blocks
Max Waves: 1609 -> 1609 (0.00 %)
This reverts commit 2c050b49b3d776f054f1265d5523cabb61f22fc3.
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Affects 30 shaders in the pipeline-db (all youngblood).
Totals from affected shaders:
SGPRS: 2656 -> 2456 (-7.53 %)
VGPRS: 2260 -> 2260 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 240680 -> 240944 (0.11 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 90 -> 90 (0.00 %)
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
p_extract_vector's second operand is in units of the definition size, not
dwords.
v2: move extract_subvector() to right before ds_write_helper
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
|
|
|
|
|
|
|
| |
We'll want these for GS, since VS->GS IO on Vega is done using LDS.
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
|
|
|
|
|
|
| |
we don't need more than that
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
for st/mesa
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
Small typo resulted in not converting footprint to vec4, meaning that we
could potentially ask for quite a few more registers than required
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the load_interpolated_input is scalarized, we would be too
conservative about deciding the tex instruction wasn't a candidate to
pre-fetch:
vec1 32 ssa_0 = load_const (0x00000000 /* 0.000000 */)
vec2 32 ssa_1 = intrinsic load_barycentric_pixel () (0) /* interp_mode=0 */
vec1 32 ssa_2 = intrinsic load_interpolated_input (ssa_1, ssa_0) (0, 0) /* base=0 */ /* component=0 */ /* packed:v_uv,v_uv1 */
vec1 32 ssa_3 = intrinsic load_interpolated_input (ssa_1, ssa_0) (0, 1) /* base=0 */ /* component=1 */ /* packed:v_uv,v_uv1 */
vec2 32 ssa_8 = vec2 ssa_2, ssa_3
vec4 32 ssa_9 = tex ssa_8 (coord), 0 (texture), 0 (sampler)
Really we don't care that the texcoord components come from different
load_interpolated_input instructions, just that they have consecutive
varying offsets.
Reported-by: Eduardo Lima Mitev <[email protected]>
Reviewed-by: Eduardo Lima Mitev <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Previously, we used one hashset per BB, so that we could
always initialize the current hashset from the immediate
dominator. This patch changes the behavior to a single
hashmap using the block index per instruction to resolve
dominance.
Reviewed-by: Rhys Perry <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Some of these lowerings aren't supported for drivers that supports
tesselation and geometry shaders. Let's add a couple of asserts to make
it obvious if these have been enabled when it's not possible.
Signed-off-by: Erik Faye-Lund <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
v2:
* Use LLVM 8 from buster-backports
v3:
* Use LLVM 7 again for armhf, llvmpipe is still broken there with LLVM 8
Acked-by: Eric Engestrom <[email protected]>
|
|
|
|
| |
Cross builds don't use the llvm-config path from the native file.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This allows running the regression tests.
One downside is that we can't easily build the Vulkan overlay layer,
because only x86 binaries of the glslang validator are available. If
that's important, we could either use those binaries via qemu, or build
it from source.
v2:
* Add :amd64 suffix to existing debian-9/10 job names (Eric Engestrom)
Acked-by: Eric Engestrom <[email protected]> # v1
|
|
|
|
|
|
|
| |
Apparently needs: in a definition overwrites inherited ones. So
.deqp-test effectively didn't declare needs: for debian-10, which means
any jobs based on .deqp-test could spuriously run after the debian-10
job failed or was cancelled.
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use https:// URLs in the APT configuration.
Drop --no-install-recommends, the image generation template disables
installation of recommended packages in /etc/apt/apt.conf.
Run apt-get autoremove at the end, cleaning up packages which were
installed to satisfy dependencies but are no longer needed.
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
| |
No functional change.
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On GFX9, the driver is able to do an optimized fast depth/stencil
clear with only one aspect (ie. clear the stencil part of a
depth/stencil image). When this happens, the driver should only
update the clear values of the given aspect.
Note that it's currently only supported on GFX9 but I have some
local patches that extend this optimized path for other gens.
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1967
Cc: 19.2 <[email protected]>
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Sagar Ghuge <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
| |
On Gen12, we support mixed mode HF/F operands, and also 3 source
instruction supports immediate value support, so keep immediate as it
is, if it fits properly in 16 bit field.
Signed-off-by: Sagar Ghuge <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
| |
On Gen >= 12, if src0 or src2 holds immediate value, we need set
src[0/2]_is_imm bits instead of register file.
Signed-off-by: Sagar Ghuge <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
| |
On Gen >= 10, Either src0 or src2 can use 16-bit immediate value, but
not both.
Signed-off-by: Sagar Ghuge <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
It's been throwing the following error today:
"<Fault -32603: 'Internal Server Error (contact server administrator
for details): could not extend file "base/17952/18226": No space left
on device\nHINT: Check free disk space.\n'>"
Reviewed-by: Daniel Stone <[email protected]>
|
|
|
|
|
|
|
| |
Fixes: 7ecfbd4f6d4 ("nir: Add alpha_to_coverage lowering pass")
Signed-off-by: Sagar Ghuge <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
If you set LP_NUM_THREADS=0 compute shaders would hang,
just execute the workloads in sequence if we have no threads
in the pool.
Fixes: 1b24e3ba75 ("llvmpipe: add compute threadpool + mutex")
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This file is created in 2a0d45ae6cf09d60c048d7854e3d082bf15e374f but
addition to android makefiles was omitted. It breaks the build with
missing references which are defined in this file.
List the file in ir3_SOURCES to make the build succeed.
Signed-off-by: Marijn Suijten <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes some crashes with dEQP-VK.descriptor_indexing.* when
read_first_invocation has its source from a descriptor.
Most of these tests still fail because of an LLVM bug (they work
with ACO).
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Totals from affected shaders:
SGPRS: 13920 -> 13656 (-1.90 %)
VGPRS: 12972 -> 12960 (-0.09 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 1005680 -> 1000648 (-0.50 %) bytes
LDS: 91 -> 91 (0.00 %) blocks
Max Waves: 688 -> 688 (0.00 %)
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
|
|
|
|
|
|
|
| |
v7: rename _nv50/_llvm to _fast/_precise
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
v2: make variable names snake_case
v2: minor cleanups in emit_udiv()
v2: fix Panfrost build failure
v3: use an enum instead of a boolean flag in nir_lower_idiv()'s signature
v4: remove nir_op_urcp
v5: drop nv50 path
v5: rebase
v6: add back nv50 path
v6: add comment for nir_lower_idiv_path enum
v7: rename _nv50/_llvm to _fast/_precise
v8: fix etnaviv build failure
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Remove emit_alpha_to_coverage workaround from backend compiler and start
using ported workaround from NIR.
v2: Copy comment from brw_fs_visitor (Caio Marcelo de Oliveira Filho)
Fixes piglit test on HSW:
- arb_sample_shading-builtin-gl-sample-mask-mrt-alpha-to-coverage-combinations
Signed-off-by: Sagar Ghuge <[email protected]>
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
|