| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
| |
Only relevant for indirect contexts, so let's get that code out of the
common path.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
From the MEDIA_VFE_STATE docs:
"Starting with this configuration, the Maximum Number of Threads must
be set to (#EU * 8) for GPGPU dispatches.
Although there are only 7 threads per EU in the configuration, the
FFTID is calculated as if there are 8 threads per EU, which in turn
requires a larger amount of Scratch Space to be allocated by the
driver."
It's pretty clear that we need to increase this for scratch address
calculations, because the FFTID has a certain bit-pattern. The quote
above seems to indicate that we should increase the actual thread count
programmed in MEDIA_VFE_STATE as well, but we think the intention is to
only bump the scratch space.
Fixes GPU hangs in Bioshock Infinite and Synmark's CSDof on Icelake 8x8.
Fixes: 5ac804bd9ac ("intel: Add a preliminary device for Ice Lake")
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 729de1488f49033bc181b8123af5658228a51bf1.
It turns out that, although the register is in the logical context,
it isn't whitelisted, so we can't actually write it from userspace
batch buffers. The write just becomes a noop, which is why we saw
no performance changes.
I manually whitelisted it, and still observed no performance gains, but
it did regress KHR-GL46.texture_cube_map_array.color_depth_attachments
on the iris driver. So we might need to fix something before enabling
this. To prevent it randomly getting turned on should the kernel ever
whitelist this register, we revert the patch for now.
|
|
|
|
| |
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
| |
'α' has never appeared in any genxml files, so there's no need to
replace it with the word "alpha".
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
'α' has never appeared in any genxml files, so there's no need to
replace it with the word "alpha".
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Use VPC_SO_OVERRIDE to control whether we do streamout in binning or
draw pass. Normally we want to do streamout in binning pass, except
when there is a single tile and binning passed is skipped.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
We could bit doing streamout from binning pass. In this case we want to
use the full VS which doesn't have (potentially streamed out) varyings
stripped out.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
| |
This fixes VAAPI.
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
| |
This fixes some dEQP tests.
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Connor Abbott <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
This caused a failure in NIR validation.
Reviewed-by: Connor Abbott <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
| |
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
| |
It confuses radeonsi.
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
| |
Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
|
|
|
|
|
|
|
|
| |
v2: cleanup
Signed-off-by: Sonny Jiang <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
|
|
|
|
|
|
| |
PCI IDs for amdgpu will be removed from Mesa.
Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
|
|
|
|
|
| |
Cc: 19.2 <[email protected]>
Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
|
|
|
|
|
|
| |
trivial and urgent
Cc: 19.2 <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In a3268599f3c9, I attempted to fix nir_repair_ssa for unreachable
blocks. However, that commit missed the possibility that the use is in
a block which, itself, is unreachable. In this case, we can end up in
an infinite loop trying to replace a def with itself. Even though a
no-op replacement is a fine operation, it keeps extending the end of the
uses list as we're walking it. Instead of explicitly checking for the
group of conditions, just check if the phi builder gives us a different
def. That's guaranteed to be 100% reliable and, while it lacks symmetry
with the is_valid checks, should be more reliable.
Fixes: a3268599 "nir/repair_ssa: Repair dominance for unreachable..."
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
| |
Reviewed-by: Rhys Perry <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Rhys Kidd <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Rhys Kidd <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
pipe->clear() is not called for partial clears, which mesa emulates by
drawing a quad.
Furthermore, drivers should not use rasterizer state information for
scissor information (which was being used to handle the partial clears).
So, remove the partial clear support since it was not supposed to be
handled by pipe->clear() anyway.
This fixes issues with clearing after switching to different sized
framebuffers.
Signed-off-by: Erico Nunes <[email protected]>
Reviewed-by: Vasily Khoruzhick <[email protected]>
Reviewed-by: Qiang Yu <[email protected]>
|
|
|
|
| |
Signed-off-by: Boris Brezillon <[email protected]>
|
|
|
|
|
|
|
|
|
| |
->padded_count should be large enough to cover all vertices pointed by
the index array. Use the local vertex_count variable that contains the
updated vertex_count value for the indexed draw case.
Signed-off-by: Boris Brezillon <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
| |
fixes "sorry, unimplemented: non-trivial designated initializers not supported"
Fixes: deb04adf2ae ("clover: add support for passing kernels as nir to the driver")
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Incomplete attachments don't have an associated pipe_surface, so
this would crash.
Fixes a WebGL conformance test that uses incomplete attachments:
https://www.khronos.org/registry/webgl/sdk/tests/conformance2/renderbuffers/invalidate-framebuffer.html?webglVersion=2&quiet=0&quick=1
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111756
Reviewed-By: Tapani Pälli <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Allocating BOs is expensive, so we should avoid doing that by caching
freed BOs.
BO cache is modelled after one in v3d driver and works as follows:
- in lima_bo_create() check if we have matching BO in cache and return
it if there's one, allocate new BO otherwise.
- in lima_bo_unreference() (renamed from lima_bo_free()): put BO in
cache instead of freeing it and remove all stale BOs from cache
Reviewed-by: Qiang Yu <[email protected]>
Signed-off-by: Vasily Khoruzhick <[email protected]>
|
|
|
|
|
|
|
|
|
| |
os_time_get_absolute_timeout(0) returns current time, while kernel
driver expects 0 as value to poll BO status and return immediately.
Fix it by setting abs_timeout to 0 if timeout_ns is 0
Reviewed-by: Qiang Yu <[email protected]>
Signed-off-by: Vasily Khoruzhick <[email protected]>
|
|
|
|
|
| |
Reviewed-and-Tested-by: Vasily Khoruzhick <[email protected]>
Signed-off-by: Qiang Yu <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Some time weston set full damage region. It is
more effient to use the cached pp stream instead
of dynamically create one.
Reviewed-and-Tested-by: Vasily Khoruzhick <[email protected]>
Signed-off-by: Qiang Yu <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This extension set a damage region for each
buffer swap which can be used to reduce buffer
reload cost by only feed damage region's tile
buffer address for PP.
Reviewed-and-Tested-by: Vasily Khoruzhick <[email protected]>
Signed-off-by: Qiang Yu <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The PLBU expects the viewport's 4 borders' coordinates, however
currently we're feeding the coordinate of the left-bottom point and the
size to it, which leads to misrendering when the left-bottom point is
not (0,0).
Change the macros for the viewport PLBU command, and the data feed to
it. The code to calculate the 4 borders is ported from Panfrost.
Signed-off-by: Icenowy Zheng <[email protected]>
Reviewed-by: Qiang Yu <[email protected]>
|
|
|
|
|
|
|
|
| |
ACO depends on C++14, but radeonsi/radv with LLVM 8,9 do not. Let us
only require it for RADV, since that is the only user.
Fixes: a70a9987181 "radv/aco: Setup alternate path in RADV to support the experimental ACO compiler"
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
required for OpenCL
v2: adjust to changes in previous commits
v3: properly convert to NIR in nvc0_cp_state_create
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Pierre Moreau <[email protected]> (v1)
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
v2: minor formatting fixes
v3: call glsl_type_singleton_init_or_ref and glsl_type_singleton_decref
v4: capitalize and punctuate comments
fix text_executable -> text_intermediate in TODO
make glsl_type_singleton wrapper static
v5: rewrite how we run the nir passes
v6: fix unhandled case switch warning in st/mesa
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]> (v4)
|
|
|
|
|
|
|
|
|
|
|
| |
v2: rework arguments to compiler::compile_program
add assert to device::ir_format
v3: remove PIPE_SHADER_IR_SPIRV
change title
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]> (v2)
Reviewed-by: Pierre Moreau <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Most drivers have actually no binary format and just store the IR directly
as a single entry point blob.
v2: add a cap to switch between single or multi entry point binaries
v3: remove the entry_point field
v4: remove PIPE_CAP_MULTI_ENTRY_POINT_BINARIES
v5: remove supports_multiple_entry_points
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Pierre Moreau <[email protected]>
|
|
|
|
|
|
|
|
| |
v2: pass argument by value
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
Reviewed-by: Pierre Moreau <[email protected]>
|
|
|
|
|
|
|
|
| |
We want to use it for other formats as well, so give it a more generic name
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
Reviewed-by: Pierre Moreau <[email protected]>
|
|
|
|
|
|
|
|
| |
makes it easier to consume a IR_NATIVE binary
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
Reviewed-by: Pierre Moreau <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Karol Herbst <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Karol Herbst <[email protected]>
Acked-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
|
| |
v2 (Karol Herbst):
silence warnings about unhandled enum values
v3 (Karol Herbst):
added back array size parsing (needed for structs passed by value)
Acked-by: Francisco Jerez <[email protected]> (v2)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Changes since:
* v12:
- remove autotools (Karol Herbst)
- Remove the callback in format_validation_msg. (Francisco Jerez)
- Removed is_binary_spirv. (Francisco Jerez)
- Pass a string reference to is_valid_spirv instead of the
notification callback. (Francisco Jerez)
* v11: Fix compilation error introduced in v11.
* v10:
- Reuse format_validation_msg in is_valid_spirv.
- Remove LVL2STR macro in format_validation_msg.
* v9: Add `clover_cpp_std` to the overrides of the `libclspirv` target
in Meson.
* v7: Add DEFINES to libclspirv and libclover, in autotools, as they
would otherwise never know whether CLOVER_ALLOW_SPIRV has been
defined (Dave Airlie)
* v6: Update the dependency name (meson) and the libs variable
(Makefile) due to the replacement of llvm-spirv to the new
official SPIRV-LLVM-Translator.
* v5: Changed to match the updated “clover/llvm: Allow translating from
SPIR-V to LLVM IR” in the v6.
Reviewed-by: Karol Herbst <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Changes since:
* v12 (Karol Herbst):
- rename CLOVER_ALLOW_SPIRV to HAVE_CLOVER_SPIRV
* v11 (Karol Herbst):
- only set new defines for clover to speed up recompilation
- remove autotools
* v10:
- Add a new flag (`--enable-opencl-spirv` for autotools, and
`-Dopencl-spirv=true` for meson) for enabling SPIR-V support in
clover, and never automagically enable it without that flag. (Dylan Baker)
- When enabling the SPIR-V support, the SPIRV-Tools and
SPIRV-LLVM-Translator libraries are now required dependencies.
* v7:
- Properly align LLVMSPIRVLib comment (Dylan Baker)
- Only define CLOVER_ALLOW_SPIRV when **both** dependencies are found:
autotools was only requiring one or the other.
* v6: Replace the llvm-spirv repository by the new official
SPIRV-LLVM-Translator.
* v4: Add a comment saying where to find llvm-spirv (Karol Herbst).
* v3:
- make SPIRV-Tools and llvm-spirv optional (Francisco Jerez);
- bump requirement for llvm-spirv to version 0.2
* v2:
- Bump the required version of SPIRV-Tools to the latest release;
- Add a dependency on llvm-spirv.
Reviewed-by: Dylan Baker <[email protected]> (v10)
Reviewed-by: Karol Herbst <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Gen11 doesn't require us to bypass the L2 cache for BC* images anymore.
The documentation is a bit hard to follow on this point, but the Windows
driver clearly only applies this workaround on Gen9, and their commit
history indicates that this was an intentional change to drop the
workaround for Gen11+.
Reviewed-by: Jason Ekstrand <[email protected]>
|