| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
| |
Ported from RadeonSI.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
Ported from RadeonSI.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
../src/intel/compiler/brw_reg.h: In function ‘bool brw_regs_negative_equal(const brw_reg*, const brw_reg*)’:
../src/intel/compiler/brw_reg.h:305:1: warning: control reaches end of non-void function [-Wreturn-type]
Introduced by 8f83eea71e233 ("i965: Add negative_equals methods").
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
From the SPIR-V spec, OpTypeImage:
"Depth is whether or not this image is a depth image. (Note that
whether or not depth comparisons are actually done is a property of
the sampling opcode, not of this type declaration.)"
The sampling opcodes that specify depth comparisons are
OpImageSample{Proj}Dref{Explicit,Implicit}Lod, so we should set
is_shadow only for these (we were using the deph property of the
image until now).
v2:
- Do the same for OpImageDrefGather.
- Set is_shadow to false if the sampling opcode is not one of these (Jason)
- Reuse an existing switch statement instead of adding a new one (Jason)
Fixes crashes in:
dEQP-VK.spirv_assembly.instruction.graphics.image_sampler.depth_property.*
Reviewed-by: Jason Ekstrand <[email protected]>
Cc: [email protected]
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Gen8+ use 48-bit address relocations so need to extend the sign
to 64-bit return value. Without it we have higher bits zeroed
and missing the negavive values.
Haswell and older use 32-bit deltas so are unaffected by this issue.
v2:
used int32_t fucntion parameter instead of explicit type conversion.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101408
Signed-off-by: Sergii Romantsov <[email protected]>
Tested-by: Andriy Khulap <[email protected]>
Tested-by: Stuart Young <[email protected]>
Reviewed-by: Chris Wilson <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Cc: "18.0 17.3" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Otherwise we may end up trying to coalesce in a case such as
ssa_1 = fadd r1, r2
r3.x = fneg(r2);
r3 = vec4(ssa_1, ssa_1.y, ...)
and that would cause us to move the writes to r3 from the vec to the
fadd which would re-order them with respect to the write from the fneg.
In order to solve this, we just don't coalesce if the destination of the
vec is not SSA. We could try to get clever and still coalesce if there
are no writes to the destination of the vec between the vec and the ALU
source. However, since registers only come from phi webs and indirects,
the chances of having a vec with a register destination that is actually
coalescable into its source is very slim.
Shader-db results on Haswell:
total instructions in shared programs: 13657906 -> 13659101 (<.01%)
instructions in affected programs: 149291 -> 150486 (0.80%)
helped: 0
HURT: 592
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105440
Fixes: 2458ea95c56 "nir/lower_vec_to_movs: Coalesce movs on-the-fly when possible"
Reported-by: Vadym Shovkoplias <[email protected]>
Tested-by: Vadym Shovkoplias <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
| |
If we close the fd before calling DRM_IOCTL_PRIME_FD_TO_HANDLE the kernel
will hit a -EBADF error. Move the close(fd) call to the end of
anv_CreateDmaBufImageINTEL().
Signed-off-by: Kevin Strasser <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes a bug in radeonsi where LLVM cannot handle the case where
a break exists but its not the last instruction in the block.
LLVM would fail with:
Terminator found in the middle of a basic block!
LLVM ERROR: Broken function found, compilation aborted!
Fixes: 96fe8834f539 "glsl_to_tgsi: do fewer optimizations with GLSLOptimizeConservatively"
Reviewed-by: Matt Turner <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105317
|
|
|
|
|
|
| |
Fixes: bfa22266cd ("vulkan/wsi/wayland: Add support for zwp_dmabuf")
Reviewed-by: Daniel Stone <[email protected]>
CC: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
| |
When running virgl on a GLES host the only sRGB formats that support
rendering is RGBA and RGBX. That pipe format is in the sRGB default
lists that the state tracker uses when mapping mesa formats.
Reviewed-by: Brian Paul <[email protected]>
Signed-off-by: Jakob Bornecrantz <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Prior to printing a decoded field, print out all dwords that field
belongs to. In particular with address fields spanning multiple
dwords, we want to have all the dwords presented before the field is
decoded to make it easier to read.
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Scott D Phillips <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
MI_LOAD_REGISTER_IMM can load multiple (register, value) tuples in one
command. In our drivers we only use one tuple at a time, but the
kernel might load more than one at a time.
Instead of making all the tuple part of a group, we leave out the
first tuple (the one we use in the generated packing structures).
This is particularly useful for looking at error stats generated by
the kernel.
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Scott D Phillips <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For example, a PIPE_CONTROL with DWordLength = 2 should look like
this :
0xffffe374: 0x7a000002: PIPE_CONTROL
0xffffe374: 0x7a000002 : Dword 0
DWord Length: 2
0xffffe378: 0x00800000 : Dword 1
Depth Cache Flush Enable: false
Stall At Pixel Scoreboard: false
State Cache Invalidation Enable: false
Constant Cache Invalidation Enable: false
VF Cache Invalidation Enable: false
DC Flush Enable: false
Pipe Control Flush Enable: false
Notify Enable: false
Indirect State Pointers Disable: false
Texture Cache Invalidation Enable: false
Instruction Cache Invalidate Enable: false
Render Target Cache Flush Enable: false
Depth Stall Enable: false
Post Sync Operation: 0 (No Write)
Generic Media State Clear: false
TLB Invalidate: false
Global Snapshot Count Reset: false
Command Streamer Stall Enable: false
Store Data Index: 0
LRI Post Sync Operation: 1 (MMIO Write Immediate Data)
Destination Address Type: 0 (PPGTT)
Flush LLC: false
0xffffe37c: 0x00000000 : Dword 2
Address: 0x00000000
0xffffe384: 0x05000000: MI_BATCH_BUFFER_END
Prior to this change, fields beyond the length of the command would be
decoded (notice the MI_BATCH_BUFFER_END decoded as part of the
previous PIPE_CONTROL) :
0xffffe374: 0x7a000002: PIPE_CONTROL
0xffffe374: 0x7a000002 : Dword 0
DWord Length: 2
0xffffe378: 0x00800000 : Dword 1
Depth Cache Flush Enable: false
Stall At Pixel Scoreboard: false
State Cache Invalidation Enable: false
Constant Cache Invalidation Enable: false
VF Cache Invalidation Enable: false
DC Flush Enable: false
Pipe Control Flush Enable: false
Notify Enable: false
Indirect State Pointers Disable: false
Texture Cache Invalidation Enable: false
Instruction Cache Invalidate Enable: false
Render Target Cache Flush Enable: false
Depth Stall Enable: false
Post Sync Operation: 0 (No Write)
Generic Media State Clear: false
TLB Invalidate: false
Global Snapshot Count Reset: false
Command Streamer Stall Enable: false
Store Data Index: 0
LRI Post Sync Operation: 1 (MMIO Write Immediate Data)
Destination Address Type: 0 (PPGTT)
Flush LLC: false
0xffffe37c: 0x00000000 : Dword 2
Address: 0x00000000
0xffffe380: 0x00000000 : Dword 3
0xffffe384: 0x05000000 : Dword 4
Immediate Data: 83886080
0xffffe384: 0x05000000: MI_BATCH_BUFFER_END
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Scott D Phillips <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
The kernel reports workaround batch buffers, but we're not presenting
them currently. Also they might not be useful for debugging purely
userspace driver issues, when problems arise because of interactions
between kernel & userspace drivers, it's nice to be able to decode
them.
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Scott D Phillips <[email protected]>
|
|
|
|
|
|
|
| |
Helpful to debug kernel workaround batchbuffers.
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Scott D Phillips <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This variable controls whether we link using the glsl code path or the
spirv path. It's set when we validate that all shaders are glsl or
spirv, but if there are no shaders attached to the program it will
remain unset, resulting in undefined behavior. We want to go down the
glsl path in that case, so initialize to false.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105820
Fixes: 16f6634e7fb5ada308e55b852cd49251e7f3f8b1
("mesa/program: Link SPIR-V shaders using the SPIR-V code-path")
Signed-off-by: Dylan Baker <[email protected]>
Tested-by: Mark Janes <[email protected]>
Reviewed-by: Alejandro Piñeiro <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
| |
The driver already supports exporting the Layer and ViewportIndex
built-ins from vertex or tessellation shaders.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Add helpers to get the number of src/dest components for an intrinsic,
and update spots that were open-coding this logic to use the helpers
instead.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Since we were MARK flag for both preventing loops, and tracking whether
instructions were used, we could end up in an infinite loop due to
bd2ca2bcdd. Instead invert the logic.. mark all instructions UNUSED
up front and clear the flag as we visit them.
Fixes: bd2ca2bcdd freedreno/ir3: eliminate unused false-deps
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Without this the return value will never get set to -1. This
was first added in 49866c8f3457 and copied in 2b396eeed983.
Fixes: 2b396eeed983 "gallium/pb_cache: add a copy of cache bufmgr independent of pb_manager"
Reviewed-by: Marek Olšák <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102342
|
|
|
|
|
|
|
|
|
| |
This reverts commit 41cf30b8bc55fdf36adac3311002dc32b6715949.
Commit caused regressions with KHR-GLES3.packed_pixels.* tests.
Signed-off-by: Tapani Pälli <[email protected]>
Suggested-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Include llvm-c/Transforms/Utils.h with the newest LLVM 7
Signed-of-by: Mike Lothian <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Include llvm-c/Transforms/Utils.h with the newest LLVM 7
Signed-of-by: Mike Lothian <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Include llvm-c/Transforms/Utils.h with the newest LLVM 7
Signed-of-by: Mike Lothian <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When allocating a buffer for DRI2, set the modifier to INVALID to inform
the backend that we have no supplied modifiers and it should do its own
thing. The missed initialisation forced linear, even if the
implementation had made other decisions.
This resulted in VC4 DRI2 clients failing with:
Modifier 0x0 vs. tiling (0x700000000000001) mismatch
Signed-off-by: Daniel Stone <[email protected]>
Reported-by: Andreas Müller <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Fixes: 3f8513172ff6 ("gallium/winsys/drm: introduce modifier field to winsys_handle")
|
|
|
|
|
|
|
| |
MSAA is supported using sample shading. Layered rendering and all texture
targets are also supported.
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
| |
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
|
|
| |
We'll need it for FBFETCH in both TGSI and NIR paths.
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
|
|
| |
For testing.
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
| |
Tested-by: Dieter Nützel <[email protected]>
|
| |
|
| |
|
| |
|
| |
|
|
|
|
| |
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
v2: rebased on top of subpass rework.
v3: rebased
v4:
- rebased
- reset pending clear views in one go rather one bit at a time (Caio)
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When multiview is active a subpass clear may only clear a subset of the
attachment layers. Other subpasses in the same render pass may also
clear too and we want to honor those clears as well, however, we need to
ensure that we only clear a layer once, on the first subpass that uses
a particular layer (view) of a given attachment.
This means that when we check if a subpass attachment needs to be cleared
we need to check if all the layers used by that subpass (as indicated by
its view_mask) have already been cleared in previous subpasses or not, in
which case, we must clear any pending layers used by the subpass, and only
those pending.
v2:
- track pending clear views in the attachment state (Jason)
- rebased on top of fast-clear rework.
v3:
- rebased on top of subpass rework.
v4: rebased.
v5 (Caio):
- Rebased.
- Initialize pending clear views to only have bits set for layers
that exist.
- Reset pending clear views in one go rather one bit at a time.
- Put "last subpass for this attachment" condition in a separate
function to simplify the conditional that resets pending_clear_aspects.
Fixes:
dEQP-VK.multiview.readback_implicit_clear.*
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
For now we skip SI && HAVE_LLVM < 0x0600 for simplicity. We also skip
setting the more accurate masks for builtin uniforms for now as it
causes some piglit regressions.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
This will be shared by the TGSI and NIR backends. For simplicity
we leave the SI LLVM 5.0 and lower work around only in the TGSI
backend.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Buffers can be large, so we probably don't want to make them all 32x
bigger. But they can't be rendered to (at least in GL) so we don't
need this workaround to prevent page faults on mem<->gmem.
Cc: "18.0" <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
We could alternatively fall back to using "old style" draw's for
mem<->gmem (ie. what <= a4xx do) when height is not aligned to 32,
but that is somewhat more work (and not really something that could
be applied to stable)
Cc: "18.0" <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes an issue that became possible when we started lowering phi webs to
regs (a7ea2b4e) (although was not really seen until we also switched to
using peephole select pass (ec8bc54a) instead of lowering *all* if/else
to select).
If texture coord (or anything else that uses create_collect() to collect
scalar values in a sequence of scalar registers) was consuming a value
produced on either side of an if/else (ie. a phi lowered to nir reg,
which in ir3 is an "array" of length 1) then register allocation would
happen incorrectly and we'd end up sampling from garbage coordinates.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
Some instructions require src/dst to be in full or half precision
register depending on src/dst type. So do a better job of propagating
register type.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
We'll also need to be able to create a half-precision immediate. So
re-work create_immed(). Prep work for following patch.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
Prep work for following patch.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously false-dependencies would get flagged as used, even if the
only "use" was a false dep to (for example) prevent a load from being
scheduled after a store.
In addition to being pointless instructions, in some cases they can
cause problems. For example, ldg (and similar instructions) depend on
an immed arg getting CP'd into the instruction, but this doesn't happen
if an instruction is otherwise unused. Which can result in undefined
results (overwriting unintended registers).
Signed-off-by: Rob Clark <[email protected]>
|