| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
| |
Reviewed-by: Mathias Fröhlich <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
This commit fix support and adjusts the capabilities
returned by the SWR driver and the documentation
to correctly report the GL_ARB_copy_image extension.
Reviewed-by: Alok Hota <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Guido Günther <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Guido Günther <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Guido Günther <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Guido Günther <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Guido Günther <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Guido Günther <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
|
|
|
|
|
|
|
| |
This avoids a conflict with freedreno's bo_del().
Signed-off-by: Guido Günther <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
|
|
|
|
|
|
|
| |
This avoids a conflict with freedreno's table_lock
Signed-off-by: Guido Günther <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Guido Günther <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Guido Günther <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Guido Günther <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Guido Günther <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Guido Günther <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Guido Günther <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Guido Günther <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Guido Günther <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Guido Günther <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Guido Günther <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
|
|
|
|
|
|
|
| |
Two driver files had tabs mixed with spaces. Remove the tabs.
Signed-off-by: Guido Günther <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The compiler configuration was hardened to fail on format warnings and
things stopped building.
Fixes: c9c1e2610647 ("mesa: prevent common string formatting security issues")
Signed-off-by: Tomeu Vizoso <[email protected]>
Reviewed-By: Ryan Houdek <[email protected]>
|
| |
|
|
|
|
|
|
|
| |
Fixes: 33800f4612 ("panfrost/midgard: Implement "pipeline register"
prepass")
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
We do support TF now, so it's no longer valid. Besides, if we want this
optimization, we should probably have mesa/st doing it right for everyone.
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
We have the GLSL type, so we can just ask it how many coordinates there
are. The GLSL function already has Vulkan cases that we'd probably want
eventually.
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
When comparing our sin/cos behavior to the closed source driver, I
noticed that we were off by a bit (or, in the case of 1/2pi, 3 bits).
Fixes:
dEQP-GLES3.functional.shaders.random.trigonometric.vertex.52
dEQP-GLES3.functional.shaders.random.all_features.vertex.0
Reviewed-by: Kristian H. Kristensen <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
| |
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
| |
Cc: 19.1 <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Just provide a "(null)" literal in case driverName is NULL.
In file included from ../src/glx/dri3_glx.c:76:
../src/glx/dri3_glx.c: In function ‘dri3_create_screen’:
../src/glx/dri_common.h:70:36: error: ‘%s’ directive argument is null [-Werror=format-overflow=]
70 | #define CriticalErrorMessageF(...) dri_message(_LOADER_FATAL, __VA_ARGS__)
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../src/glx/dri3_glx.c:1002:4: note: in expansion of macro ‘CriticalErrorMessageF’
1002 | CriticalErrorMessageF("failed to load driver: %s\n", driverName);
| ^~~~~~~~~~~~~~~~~~~~~
../src/glx/dri3_glx.c:1002:50: note: format string is defined here
1002 | CriticalErrorMessageF("failed to load driver: %s\n", driverName);
| ^~
cc1: some warnings being treated as errors
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
When PIPE_TRANSFER_READ requires a resolve, we blit from the host
storage to a temporary storage, and do a format conversion from the
temporary storage to the guest storage. This change makes sure we
convert to the correct level of the guest storage.
Signed-off-by: Chia-I Wu <[email protected]>
Reviewed-by: Alexandros Frantzis <[email protected]>
|
|
|
|
|
|
|
|
|
| |
util_format_translate_3d expects the source box to be aligned to the
block size. When resolving, make sure the size of the staging
buffer is aligned to the block size.
Signed-off-by: Chia-I Wu <[email protected]>
Reviewed-by: Alexandros Frantzis <[email protected]>
|
|
|
|
|
|
|
| |
Some new flag setting disallows it due to being a security risk.
Fixes: c9c1e261064 "mesa: prevent common string formatting security issues"
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Reason:
meson.build:586:7: ERROR: Unknown variable "dep_libdrm".
if building without x11 platform.
This reverts commit 392c60928a5debbe6782ed1aa136597504bfbc5b.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A previous optimization converts fmax(x, 0.0) instructions to fmov.pos.
This pass then propagates the .pos from the move up to the source
instruction (when possible). From there, copy propagation will eliminate
the move.
In the future, we might prefer to do this in common NIR code like we do
for saturate, as Bifrost can also benefit.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Ryan Houdek <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Ryan Houdek <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This prepass, run after scheduling but before RA, specializes to
pipeline registers where possible. It walks the IR, checking whether
sources are ever used outside of the immediate bundle in which they are
written. If they are not, they are rewritten to a pipeline register (r24
or r25), valid only within the bundle itself. This has theoretical
benefits for power consumption and register pressure (and performance by
extension). While this is tested to work, it's not clear how much of a
win it really is, especially without an out-of-order scheduler (yet!).
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Ryan Houdek <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Ryan Houdek <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
First, this moves the scheduler and emitter out of midgard_compile.c
into their own dedicated files.
More interestingly, this slims down midgard_bundle to be essentially an
array of _pointers_ to midgard_instructions (plus some bundling
metadata), rather than the instructions and packing themselves. The
difference is critical, as it means that (within reason, i.e. as long as
it doesn't affect the schedule) midgard_instrucitons can now be modified
_after_ scheduling while having changes updated in the final binary.
On a more philosophical level, this removes an IR. Previously, the IR
before scheduling (MIR) was separate from the IR after scheduling
(post-schedule MIR), requiring a separate set of utilities to traverse,
using different idioms. There was no good reason for this, and it
restricts our flexibility with the RA. So unify all the things!
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Ryan Houdek <[email protected]>
|
|
|
|
|
|
|
| |
Trivial.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Ryan Houdek <[email protected]>
|
|
|
|
|
|
|
| |
These are more generally useful than the files they were constrained to.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Ryan Houdek <[email protected]>
|
|
|
|
|
|
|
|
| |
Mostly, this fixes a number of instances of lines >> 80 chars,
refactoring them into something legible.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Ryan Houdek <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This represents a major break with the former RA design. We now use
conflicting register classes to represent the subdivision of Midgard's
128-bit registers into varying sizes and arrangement. We determine class
based on the number of components in the instructions' masks. To support
this, we include a number of helpers in the RA to allow composing
swizzles and masks, such that MIR written implicitly assuming .xyzw
sources can be transformed to use actual (non-aligned) sources.
The net result is a marked decrease in register pressure on
non-vec4-exclusive shaders. We could still be doing much better. Not
implemented yet are:
- Register spilling
- Per-component liveness
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Ryan Houdek <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
These masks distinguish scalar/vec2/vec3 loads from the default vec4,
which helps with assembly readability (since it's immediately obvious
how many components are _actually_ affected, rather than doing
mysterious things to an unknown number of unused components). Later in
the series, this will enable smarter register allocation, as the unused
components will not be interpreted abnormally.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Ryan Houdek <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This fixes liveness analysis with respect to inline constants and
branching. in practice, the symptom is abnormally high register
pressure.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Ryan Houdek <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
These snippets of integer assembly are injected for various purposes.
Eventually, we'll want to implement these in NIR directly. Regardless,
the "default" output modifier is different between floats and ints, so
let's set the right one.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Ryan Houdek <[email protected]>
|
|
|
|
|
|
|
|
| |
These were static to midgard_compile.c but are more generally useful
across the compiler.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Ryan Houdek <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This mechanism is only used by blend shaders, so just use a move here.
Ideally, it'll be copy-propped and DCE'd away; this removes a source of
considerable indirection and will simplify RA logic.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Ryan Houdek <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This pattern was noticed in glmark's jellyfish scene.
v2: Add inexact qualifier due to NaN behaviour.
Minimal shader-db changes (slightly helped).
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Elie Tournier <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With 8 and 16-bit types and anything where we have to use non-trivial
strides registersto deal with restrictions, we end up with things that
look like partial writes even though we don't care about any values in
the register except those written by that instruction. This is
particularly important when dealing with loops because liveness sees
is_partial_write and the fact that an old version from a previous loop
iteration may be valid at that point and extends all purely partially
written values to the entire loop.
This commit adds a new UNDEF instruction which does nothing (the
generator doesn't emit anything) but which does a fake write to the
register. This informs liveness that we don't care about any values
before that point so it won't consider those registers to be falsely
live. We can safely emit UNDEF instructions for all SSA values that
come in from NIR and nearly all temporaries generated by various stages
of the compiler. In particular, we need to insert UNDEF instructions
when we handle region restrictions because the newly allocated registers
are almost guaranteed to be partially written.
No shader-db changes.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110432
Reviewed-by: Matt Turner <[email protected]>
|