| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
I'm using this for shader-db analysis on x86_64 systems.
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The goal is to enable testing of parts of drivers without depending on any
particular kernel version or hardware being present.
Simply set LD_PRELOAD=$PREFIX/lib/libv3d_drm_shim.so in your environment,
and we'll fake a /dev/dri/renderD128 (or whatever the next available node
is) using v3dv3. That node can then be used with the surfaceless or gbm
EGL platforms.
Acked-by: Iago Toral Quiroga <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When generating the error message for a missing function error where
all available overloads were missing due to a too low GLSL version, we
used to report something like this:
---8<---
0:224(14): error: no matching function for call to
`textureCubeLod(samplerCube, vec3, float)'; candidates are:
0:224(14): error: type mismatch
---8<---
This is a pretty confusing error message, and can throw people off when
debugging. So let's instead check if any overload is available before we
decide what to print. This allow us to report something like this
instead:
---8<---
0:224(14): error: no function with name 'textureCubeLod'
0:224(14): error: type mismatch
---8<---
This is arguably easier to understand for programmers, and doesn't send
you on a wild goose chase to figure out what argument is wrong just
because you stopped reading the message prematurely. I'm of course
referring to a friend, not me. For sure. I would never do that.
Signed-off-by: Erik Faye-Lund <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Without correct size, radeonsi assumes the metadata is incorrect,
which can and will cause issues.
Since the metadata is really incorrect without the size, let us
fix that.
Fixes: e43cc3e3afc "radv/gfx9: handle GFX9 opaque metadata"
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Report VK_EXTERNAL_MEMORY_HANDLE_TYPE_HOST_ALLOCATION_BIT_EXT as
supported for images. It was being shown supported for buffers, but not
images.
Fixes: 69cc6272fbc1 ("anv: Implement VK_EXT_external_memory_host")
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This fixes
dEQP-VK.rasterization.primitive_size.points.point_size_*
This also fixes some black squares with the Sascha SSAO demo.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It's coherent and faster. GFX7-GFX9 should also support this but
for now only uses L2 for GFX10 because it's untested on previous gens.
This fixes dEQP-VK.memory.pipeline_barrier.transfer_*
This also fixes some missing geometry in Dawn Of War III because
VBOs weren't updated correctly.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We add a new opt pass fusing perspective projection with varyings. Minor
win..? We don't combine non-varying projections, since if we're too
agressive, the extra load/store traffic will hurt us so it's not really
a win in practice.
total instructions in shared programs: 3915 -> 3913 (-0.05%)
instructions in affected programs: 76 -> 74 (-2.63%)
helped: 1
HURT: 0
total bundles in shared programs: 2520 -> 2519 (-0.04%)
bundles in affected programs: 46 -> 45 (-2.17%)
helped: 1
HURT: 0
total quadwords in shared programs: 4027 -> 4025 (-0.05%)
quadwords in affected programs: 80 -> 78 (-2.50%)
helped: 1
HURT: 0
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
| |
We don't use it yet, since it's actually a shader-db regression. This is
primarily helpful as an intermediate step for attaching projection to
varyings.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
| |
It doesn't make sense to use them with anything less.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
| |
We use a special conflicting register class.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
| |
While load/store ops like st_vary can take an argument in either
r26/r27, ops like those for perspective projection must specifically
take their argument in r27.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Now that all the piping is in place to do so without regressions, we
flip on automatic register allocation for varyings. Hooray!
total instructions in shared programs: 4025 -> 3915 (-2.73%)
instructions in affected programs: 1667 -> 1557 (-6.60%)
helped: 62
HURT: 0
helped stats (abs) min: 1 max: 3 x̄: 1.77 x̃: 2
helped stats (rel) min: 0.93% max: 20.00% x̄: 10.80% x̃: 10.64%
95% mean confidence interval for instructions value: -1.89 -1.66
95% mean confidence interval for instructions %-change: -12.50% -9.11%
Instructions are helped.
total bundles in shared programs: 2683 -> 2520 (-6.08%)
bundles in affected programs: 1066 -> 903 (-15.29%)
helped: 62
HURT: 0
helped stats (abs) min: 1 max: 3 x̄: 2.63 x̃: 3
helped stats (rel) min: 2.94% max: 42.86% x̄: 23.85% x̃: 22.50%
95% mean confidence interval for bundles value: -2.83 -2.43
95% mean confidence interval for bundles %-change: -27.73% -19.97%
Bundles are helped.
total quadwords in shared programs: 4192 -> 4027 (-3.94%)
quadwords in affected programs: 1584 -> 1419 (-10.42%)
helped: 62
HURT: 0
helped stats (abs) min: 1 max: 4 x̄: 2.66 x̃: 3
helped stats (rel) min: 1.85% max: 30.00% x̄: 16.49% x̃: 16.52%
95% mean confidence interval for quadwords value: -2.87 -2.46
95% mean confidence interval for quadwords %-change: -19.14% -13.84%
Quadwords are helped.
total registers in shared programs: 433 -> 411 (-5.08%)
registers in affected programs: 67 -> 45 (-32.84%)
helped: 23
HURT: 1
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 25.00% max: 50.00% x̄: 41.30% x̃: 50.00%
HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel) min: 14.29% max: 14.29% x̄: 14.29% x̃: 14.29%
95% mean confidence interval for registers value: -1.09 -0.74
95% mean confidence interval for registers %-change: -45.45% -32.52%
Registers are helped.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
| |
Fixes classes defaulting to vec4 in some cases.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
The load/store pipes can't take a uniform register in, so an explicit
move is necessary here.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
| |
Now that we have its registers handled normally like the rest of the IR.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
| |
Given the constraints on special registers, we add a helper for lowering
these by inserting moves (copies) where needed to satsify the ISA
constraints.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
We generalize the constant emission helper used in fragment writeout as
we'll also need it for vertex outputs.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
Specialized version of a rewrite that only rewrites a certain type of
instruction.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This ensures the rules for accessing special register classes are
satisfied. This is asserted as a prepass should have lowered offending
uses to something satisfying these rules. Special register classes are
*not* work registers and cannot be used for RMW operations; they are
essentially 1-way pipes straight into/from fixed-function logic in the
shader cores.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
| |
We reuse the same register spilling mechanism as for work->memory to
spill special->work registers, e.g. to allow writing out more than 2
vec4 varyings (without better scheduling anyway).
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
| |
These can consume sources now.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
| |
This does not yet support special->work spilling, nor does it support
multiclass breakup. These corner cases will be handled in succeeding
commits.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
We'll want to also handle load/store and texture registers in our RA
loop.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
| |
We also expose some utilities it uses as general MIR helpers.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
Checks for x/xy/xyz/xyzw style swizzles (slightly more general but you
get the idea).
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
| |
Helps as an optimization heuristic.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
| |
...rather than exposing it in the vendored compiler region.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
| |
Again, it's in shader_info for us!
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
| |
No need to track this ourselves!
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
We don't need to guesstimate this ourselves. This will help when we
bringup derivatives.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
| |
We include a zsbuf attachment function based on how the corresponding
MFBD code works, as well as extending cbufs to mipmapped rendering while
we're at it.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
| |
Just because we don't have the format codes to render to them yet.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
| |
We'll need it to specialize resource creation by chip.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
| |
Fixes invalid read errors reported by valgrind.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
| |
Complements the existing getters and the setter for node class. To be
used in the Panfrost RA refactor.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Tomeu Vizoso <[email protected]>
Acked-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes eglExportDMABUFImageQueryMESA() so it will report the
modififers of the underlying image. Without this information,
re-importing will likely be broken as it is rare these days that no
modifiers are used.
Reviewed-by: Eric Engestrom <[email protected]>
Reviewed-by: Daniel Stone <[email protected]>
Fixes: 8f7338f284cdb1fef64c ("egl: add initial EGL_MESA_image_dma_buf_export v2.4")
|
|
|
|
|
|
|
|
| |
Extend MESA_framebuffer_flip_y to be used with OpenGL versions 4.3 and
higher. OpenGL 4.3 adds FramebufferParameteri needed by this extension.
Reviewed-by: Fritz Koenig <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
| |
Fixes dEQP crashes.
Fixes: 2f93ecd654e ("panfrost: Fake CAPs for dEQP-GLES31")
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
| |
This is why we can't have nice things. I'm sure there's someway
to do this with {0} but I really don't have time for that.
Fixes: 2631fd3b0bf ("gallivm: rework lp_build_tgsi_soa to take a struct")
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
When 'x' is the result of a scmp op:
x != 0.0 or x == 1.0: passthrough
x == 0.0 or x != 1.0: invert
Signed-off-by: Jonathan Marek <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Add generic lowerings for fall_equalN/fany_nequalN. These should be optimal
for vec4 backends that doesn't have any special instructions for it, as
long as they support saturate.
Signed-off-by: Jonathan Marek <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Add simple fdot2 optimizations that are missing.
Signed-off-by: Jonathan Marek <[email protected]>
Reviewed-by: Thomas Helland <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
| |
For backends that don't have a 'fdph' instructions
Signed-off-by: Jonathan Marek <[email protected]>
Reviewed-by: Thomas Helland <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
| |
This version has less ops for the same precision.
Signed-off-by: Jonathan Marek <[email protected]>
Reviewed-by: Vasily Khoruzhick <[email protected]>
Acked-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
| |
This is to allow optimizations in nir_opt_algebraic not otherwise possible
Signed-off-by: Jonathan Marek <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Acked-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
| |
Previously we'd hit the unreachable() for uploading RGTC.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Structure itself wasn't freed during context tear-down, causing a
memory leak on iris.
Signed-off-by: Yevhenii Kolesnikov <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
This will effectively enable the optimization in anv.
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|