| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
This pipe_screen function is not implemented by all backends.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110443
Signed-off-by: Julien Isorce <[email protected]>
Reviewed-by: Leo Liu <[email protected]>
|
|
|
|
|
|
|
|
| |
Both drivers are feature-complete and should be running more-or-less at
perf at this point. Drop the warning.
Acked-by: Kenneth Graunke <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Just don't emit the transform array at all if there are no transforms
v2:
- Don't use len(array) > 0 (Dylan)
- Keep using ARRAY_SIZE to make the generated C code easier to read
(Jason).
|
|
|
|
|
|
|
|
|
| |
st/mesa's PBO upload path binds a vertex shader that doesn't use any
textures, but leaves the existing sampler views bound in place. This
was tricking us into thinking the PBO destination might be bound for
texturing in some cases. In Civilization VI, this fixes a false self-
dependency issue that was preventing CCS_E compression on upload.
Fixing this slightly improves frame times.
|
|
|
|
|
|
|
| |
v2: - Use new with_shader_cache variable instead of
host_machine.system() == 'windows'
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
This allows the test to be built with MSVC, which doesn't support
VLAs.
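As an illustration of the kind of change involved (a minimal sketch, not the code from this commit; `sketch_sum_squares` and `SKETCH_MAX_ITEMS` are hypothetical names), a runtime-sized array can be swapped for a fixed upper bound plus an assertion:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical compile-time bound replacing a runtime-sized array. */
#define SKETCH_MAX_ITEMS 64

/* Sums the squares of 0..n-1 using a scratch buffer. */
static long sketch_sum_squares(size_t n)
{
    /* A VLA version would declare `int tmp[n];`, which MSVC rejects. */
    int tmp[SKETCH_MAX_ITEMS];
    long total = 0;

    /* The fixed bound must cover every size the test actually uses. */
    assert(n <= SKETCH_MAX_ITEMS);

    for (size_t i = 0; i < n; i++)
        tmp[i] = (int)(i * i);
    for (size_t i = 0; i < n; i++)
        total += tmp[i];

    return total;
}
```

Whether a fixed bound or a heap allocation fits better depends on how large the test's arrays can get; the bound here is only an assumption.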
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This makes nm optional: it is used if found but no longer required. In
general I imagine that this means that on Windows nm won't be found, and
on other platforms it will.
v2: - fix gbm and egl symbols check tests to only be run if nm is found
- reword commit message to reflect the code change
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
Somewhere down in the depths of the mingw headers 'interface' is
defined, so change it to iface, as a similar patch did.
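A minimal sketch of the clash, assuming the headers define `interface` as a macro expanding to `struct` (the `#define` below only stands in for that, and `sketch_object` and `sketch_get_id` are hypothetical names):

```c
/* Assumption: stands in for the macro pulled in by the mingw headers. */
#define interface struct

struct sketch_object {
    int id;
};

/* A parameter named `interface` would expand to `struct` and fail to
 * compile, so the parameter is named `iface` instead. */
static int sketch_get_id(const struct sketch_object *iface)
{
    return iface->id;
}
```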
Signed-off-by: Dylan Baker <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
This allows the identifier to be used even if shared-glapi isn't built,
which simplifies a bunch of things.
Signed-off-by: Dylan Baker <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Chuck Atkins <[email protected]>
Cc: mesa-stable <[email protected]>
Reviewed-by: Dylan Baker <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Because the new raw/struct intrinsics are buggy with LLVM 8
(they weren't marked as a source of divergence), we fall back to the
old intrinsics for atomic buffer operations only. This means we need
to apply the indexing workaround for GFX9. The load/store
operations still use the new LLVM 8 intrinsics.
The fact that we need another workaround is painful, but we should
be able to clean that up a bit once LLVM 7 support is dropped.
This fixes a GPU hang with AC Odyssey and some rendering problems
with Nioh.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110573
Fixes: 31164cf5f70 ("ac/nir: only use the new raw/struct image atomic intrinsics with LLVM 9+")
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Found while running Talos Principle.
As far as I can tell, running a draw call with a pipeline that has push
constants, without the application having called vkCmdPushConstants,
gives undefined push constant values.
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Cc: [email protected]
|
|
|
|
|
|
|
|
|
|
|
|
| |
This restores the previous behaviour before YCBCR landed. For D+S
formats, it returns the depth format.
This fixes an assertion with Thrones of Britannia.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110540
Fixes: 66507cc6563 ("radv: Add single plane image views & meta operations")
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Since 09f1de97a76 "anv,i965: Lower away image derefs in the driver"
the backend compiler is not expected to handle any derefs, so let's
assert on it.
This helps identify problems when a deref is not lowered and
"leaks" into the backend compiler.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
| |
Factoring code with resource_get_handle.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110443
Signed-off-by: Julien Isorce <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The MASK macro is used in the RANGE macro, and it should
return the pre-bitset word mask for the (b) value.
i.e.
BITSET_MASK(0) should be undefined since it's meaningless.
BITSET_MASK(31) should give 0x7fffffff
BITSET_MASK(32) should give 0xffffffff
BITSET_MASK(33) should give 0x00000001
BITSET_MASK(64) should give 0xffffffff
However, BITSET_RANGE then ends up broken for cases where
its (b) value is 0, 32 or 64, as in that case the lower
mask would be 0, not 0xffffffff.
This fixes the unit tests that I've added, and my code that
uses bitsets.
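The expected values listed above can be reproduced with a small sketch. This is not the actual bitset.h code; `sketch_mask` and `sketch_range` are hypothetical helpers that assume 32-bit words and only model the behaviour described in this message:

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_WORDBITS 32u

/* Mask of the low (b % 32) bits of a word. A multiple of 32 selects the
 * whole word; b == 0 is treated the same way here, even though the
 * message above calls that case meaningless. */
static uint32_t sketch_mask(unsigned b)
{
    unsigned rem = b % SKETCH_WORDBITS;
    return rem == 0 ? 0xffffffffu : (1u << rem) - 1u;
}

/* Mask covering bits b..e inclusive; both must live in the same word. */
static uint32_t sketch_range(unsigned b, unsigned e)
{
    assert(b <= e && b / SKETCH_WORDBITS == e / SKETCH_WORDBITS);

    /* Bits strictly below b, computed directly so that b == 0, 32, 64
     * yields 0 and nothing is cut from the bottom of the range. */
    uint32_t below_b = (1u << (b % SKETCH_WORDBITS)) - 1u;

    return sketch_mask(e + 1) & ~below_b;
}
```

Computing the lower part separately from `sketch_mask` is one way to sidestep the word-boundary case; whether the real macros do it the same way is not something this sketch claims.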
Reviewed-by: Jason Ekstrand <[email protected]>
Fixes: bb38cadb1c5f2 "More GLSL code"
Reviewed-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
| |
The last test here currently fails, as there is a bug in bitset.h.
Reviewed-by: Kristian H. Kristensen <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
| |
This has a couple of hardcoded vec4 limits in it, change them
to the proper sizing to avoid future issues.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
| |
The SPIR-V spec says this returns a bool.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a port of Danylo's eca4a6548d07bbbb02a7768edb397bad7b72cfc2
which fixed the hang on i965. It fixes GPU hangs in his new Piglit
test, arb_blend_func_extended-dual-src-blending-discard-without-src1.
I avoided my own review feedback here, and decided to simply adjust
3DSTATE_PS_BLEND rather than BLEND_STATE_ENTRY[0]. It has never been
clear to me which the hardware uses in every case. However, whacking
the enable in 3DSTATE_PS_BLEND seems to be sufficient to fix the hang,
and that packet is already dynamic, so it's easy to handle. I'd rather
avoid making BLEND_STATE_ENTRY[0] dynamic unless I have to.
|
|
|
|
|
|
|
| |
It is an input but it comes in as part of the shader payload and doesn't
count towards the limits.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
Fixes: 691d5a825a6 nir: rework tex instruction printing
Reviewed-by: Eric Anholt <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
Support nir_op_ftrunc by turning it into a mov with a round to integer
output modifier.
Signed-off-by: Erico Nunes <[email protected]>
Reviewed-by: Qiang Yu <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Add GBM_BO_IMPORT_FD_MODIFIER to documentation of supported foreign
object types
- Add newline before documentation block
- Improve language
Reviewed-by: Eric Engestrom <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Daniel Stone <[email protected]>
|
|
|
|
|
|
|
|
| |
Other GFX9 chips aren't affected.
Cc: "19.0" "19.1" <[email protected]>
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
- make sure compute shader derivatives are exposed for all extensions
- unify duplicated code
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It's done by:
- decreasing the number of frames in flight by 1
- flushing before throttling in SwapBuffers
(instead of wait-then-flush, do flush-then-wait)
The improvement is apparent with Unigine Heaven.
Previously:
draw frame 2
wait frame 0
flush frame 2
present frame 2
The input lag is 2 frames.
Now:
draw frame 2
flush frame 2
wait frame 1
present frame 2
The input lag is 1 frame. Flushing is done before waiting, because
otherwise the device would be idle after waiting.
Nine is affected because it also uses the pipe cap.
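The two orderings above can be modelled with a small sketch. It only records the sequence of steps per SwapBuffers call and is not driver code; every name in it is hypothetical, and the lag is taken as the distance between the presented frame and the frame being waited on, matching the numbers quoted above:

```c
#include <assert.h>

enum sketch_op { SKETCH_DRAW, SKETCH_WAIT, SKETCH_FLUSH, SKETCH_PRESENT };

struct sketch_event {
    enum sketch_op op;
    int frame;
};

/* Old order: wait for frame N-2, then flush frame N. */
static void sketch_swap_old(int frame, struct sketch_event out[4])
{
    assert(frame >= 2);
    out[0] = (struct sketch_event){ SKETCH_DRAW, frame };
    out[1] = (struct sketch_event){ SKETCH_WAIT, frame - 2 };
    out[2] = (struct sketch_event){ SKETCH_FLUSH, frame };
    out[3] = (struct sketch_event){ SKETCH_PRESENT, frame };
}

/* New order: flush frame N first, then wait for frame N-1. */
static void sketch_swap_new(int frame, struct sketch_event out[4])
{
    assert(frame >= 1);
    out[0] = (struct sketch_event){ SKETCH_DRAW, frame };
    out[1] = (struct sketch_event){ SKETCH_FLUSH, frame };
    out[2] = (struct sketch_event){ SKETCH_WAIT, frame - 1 };
    out[3] = (struct sketch_event){ SKETCH_PRESENT, frame };
}

/* Lag in frames between the presented frame and the one waited on. */
static int sketch_input_lag(const struct sketch_event ev[4])
{
    int waited = 0, presented = 0;
    for (int i = 0; i < 4; i++) {
        if (ev[i].op == SKETCH_WAIT)
            waited = ev[i].frame;
        if (ev[i].op == SKETCH_PRESENT)
            presented = ev[i].frame;
    }
    return presented - waited;
}
```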
|
|
|
|
|
|
|
|
|
|
| |
This way we can mark the dri_drivers and dri_link arrays as temporary,
as all knowledge about them is contained in a single build file with
a clearly visible, limited life-span.
Signed-off-by: Erik Faye-Lund <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
Acked-by: Dylan Baker <[email protected]>
|
|
|
|
|
|
|
| |
We just need to do a sequence of commands to flush the cache.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Wire up support to sample from the fb (and force GMEM rendering when we
have fb reads). The existing GLSL IR lowering for blend_equation_advanced
does the rest.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
|
| |
Lower load_output to txf_ms_fb and add support for the new texture fetch
instruction.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
| |
Needed for sampling from tile buffer (GMEM).
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Apparently we never hit this path. Or at least haven't for a rather
long time. But in either case (load_deref or load_frag_coord), we can
just directly use the intrinsic's ssa dest. So stop passing the
nir_variable (which would be NULL in the load_frag_coord case) around
and instead just use &intr->dest.ssa.
(This of course means we need to set up the cursor to insert *after* the
instruction, which seems to be another bug of the original
implementation.)
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
| |
The extra comma at the end was annoying me.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
| |
And add a comment: since we are mixing units of bytes/dwords/vec4,
hopefully this will avoid some unit confusion.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
It isn't quite as simple as not running the pass, since with packed
varyings we get load_ubo for block==0 (ie. the "real" uniforms). So
instead run the pass normally but decline to lower anything in
block > 0
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Since we emit UBO regions INDIRECTly (ie. not copied into cmdstream but
emitted by EXT_SRC_ADDR), we need to keep them 4*vec4 aligned. The
code already mostly did this, except for aligning the first UBO region itself
(ie. the one after block==0, which is the "real" uniforms).
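A minimal sketch of the alignment arithmetic, assuming offsets are tracked in vec4 units and that one vec4 is four dwords of four bytes each. The helper names are hypothetical and this is not the ir3 code:

```c
#include <assert.h>

/* Assumed unit sizes: one vec4 is four dwords, one dword is four bytes. */
#define SKETCH_DWORDS_PER_VEC4   4u
#define SKETCH_BYTES_PER_DWORD   4u
/* UBO regions must start on a 4*vec4 boundary. */
#define SKETCH_REGION_ALIGN_VEC4 4u

/* Round value up to a power-of-two alignment. */
static unsigned sketch_align(unsigned value, unsigned alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Offset (in vec4) at which the first UBO region may start, given the
 * size (in vec4) of the block==0 uniforms placed before it. */
static unsigned sketch_first_region_offset_vec4(unsigned uniforms_vec4)
{
    return sketch_align(uniforms_vec4, SKETCH_REGION_ALIGN_VEC4);
}

/* Convert a vec4 count to bytes, to keep the units explicit. */
static unsigned sketch_vec4_to_bytes(unsigned vec4)
{
    return vec4 * SKETCH_DWORDS_PER_VEC4 * SKETCH_BYTES_PER_DWORD;
}
```

For example, five vec4 of block==0 uniforms would push the first region to offset 8 vec4, which is 128 bytes under the assumed unit sizes.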
Fixes: 893425a607a freedreno/ir3: Push UBOs to constant file
Fixes: 3c8779af325 freedreno/ir3: Enable PIPE_CAP_PACKED_UNIFORMS
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Otherwise we zero out the state again, but all the UBO loads that we
could lower are already lowered. End result is that we didn't emit the
uniforms for lowered UBO access in any case where multiple shader
variants are used.
Fixes: 893425a607a freedreno/ir3: Push UBOs to constant file
Fixes: 3c8779af325 freedreno/ir3: Enable PIPE_CAP_PACKED_UNIFORMS
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
Keen on having other people contribute.
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
| |
And fix the unused CmdDrawIndirect.
Signed-off-by: Lionel Landwerlin <[email protected]>
|
|
|
|
| |
Signed-off-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
| |
This is useful for normalizing the numbers written into the output file,
as those numbers are accumulated over a period of time and a number of
frames.
Signed-off-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The output looks something like this (CSV style):
fps, frame, frame_timing(us), submit, draw_indexed, pipeline_graphics, acquire_timing(us), vert_invocations, frag_invocations, gpu_timing(ns)
480.55, 242, 501512, 247, 1444, 1204, 714, 5827272, 113043296, 121424174
467.80, 234, 500214, 234, 1412, 1176, 648, 5635680, 109436188, 117743760
424.37, 213, 501923, 213, 2130, 1704, 623, 5132448, 99657292, 105474683
472.15, 237, 501962, 237, 2370, 1896, 667, 5710752, 110924644, 122226004
411.32, 206, 500826, 206, 2060, 1648, 709, 4963776, 96491764, 95333273
458.87, 230, 501228, 230, 2300, 1840, 634, 5542080, 107758204, 123112090
475.01, 238, 501044, 238, 2380, 1904, 631, 5734848, 111477480, 122087426
471.08, 236, 500972, 236, 2360, 1888, 655, 5686656, 110498496, 114816162
Signed-off-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
| |
Looks a bit better.
Signed-off-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
| |
In case you're just interested in the data being recorded to the output
file.
Signed-off-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
v2: switch to VkBase{In,Out}Structure
v3: Add timestamps at begin/end of primary command buffers to estimate
gpu time spent per submission (Lionel)
Signed-off-by: Lionel Landwerlin <[email protected]>
Acked-by: Eric Engestrom <[email protected]> (v2)
|
|
|
|
|
|
|
|
|
|
|
|
| |
This significantly reworks how the displayed numbers are computed. We
accumulate operations written into command buffers and add those to
the device when submitted to a queue. These collected values are then
used to compute per-frame overlay data.
We also accumulate the data over the sampling fps period to produce
numbers for that period of time.
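The accumulation scheme described above could be sketched roughly as follows. This is only an illustration under assumed names (`sketch_stats`, `sketch_cmd_buffer`, `sketch_device` and the two example counters are hypothetical), not the overlay layer's actual data structures:

```c
#include <assert.h>
#include <stdint.h>

enum {
    SKETCH_STAT_DRAW,
    SKETCH_STAT_DRAW_INDEXED,
    SKETCH_STAT_COUNT
};

struct sketch_stats {
    uint64_t counts[SKETCH_STAT_COUNT];
};

struct sketch_cmd_buffer {
    struct sketch_stats stats;       /* operations recorded so far */
};

struct sketch_device {
    struct sketch_stats frame_stats; /* totals submitted this frame */
};

/* Count one operation as it is written into a command buffer. */
static void sketch_record(struct sketch_cmd_buffer *cmd, int stat)
{
    assert(stat >= 0 && stat < SKETCH_STAT_COUNT);
    cmd->stats.counts[stat]++;
}

/* On queue submission, add the command buffer's counters to the device. */
static void sketch_submit(struct sketch_device *dev,
                          const struct sketch_cmd_buffer *cmd)
{
    for (int i = 0; i < SKETCH_STAT_COUNT; i++)
        dev->frame_stats.counts[i] += cmd->stats.counts[i];
}

/* At the end of a frame, hand back the totals and start over. */
static struct sketch_stats sketch_end_frame(struct sketch_device *dev)
{
    struct sketch_stats frame = dev->frame_stats;
    for (int i = 0; i < SKETCH_STAT_COUNT; i++)
        dev->frame_stats.counts[i] = 0;
    return frame;
}
```

Counting at record time and adding at submit time means a command buffer submitted twice contributes twice, which matches the idea of measuring what the queue actually executes.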
Signed-off-by: Lionel Landwerlin <[email protected]>
|