| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
| |
It only applies to legacy BO fences.
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
| |
This is just a refactor, similar to what we did for semaphores, in
preparation for handling VK_KHR_external_fence.
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
| |
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This commit changes fences to work a bit more like BO semaphores.
Instead of the fence being a batch, it's simply a BO that gets added
to the validation list for the last execbuf call in the QueueSubmit
operation. It's a bit annoying finding the last submit in the execbuf
but this allows us to avoid the dummy execbuf.
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We didn't allow them before because it didn't look like the spec allowed
it. It certainly doesn't make much sense. However, there are CTS tests
that apparently hit this. What the spec actually says is:
"Importing a payload using handle types with copy transference
creates a duplicate copy of the payload at the time of import, but
makes no further reference to it. Fence signaling, waiting, and
resetting operations performed on the target of copy imports must
not affect any other fence or payload."
A SYNC_FD has copy transference but the import may be temporary or
permanent. If you do a permanent import of something with copy
transference, I guess it's supposed to work and end up resetting the
permanent state. In any case, there seems to be no real harm in
allowing it, so why not.
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Right now, OpenGL uses the GLSL lowering for shared variables and anv
uses NIR to lower them. For a long time, we've done this weird thing
where we do the NIR lowering unconditionally and then add the SLM sizes
from the two together. This works because one of them will always be 0
but it's a bit sketchy. Let's just move the NIR-based lowering into
anv_pipeline and get rid of the sketch.
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Probably harmless, but will overwrite errno with a failure status
code. Reported by coverity.
CID 1416600: Argument cannot be negative (NEGATIVE_RETURNS)
Fixes: 5c4e4932e02 (anv: Implement support for exporting semaphores as FENCE_FD)
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The anv_execbuf_add_bo() call can actually fail in practice, which
should cause the QueueSubmit operation to fail. Reported by Coverity.
CID: 1416606: Unchecked return value (CHECKED_RETURN)
Fixes: 017cdb10cf (anv: Submit a dummy batch when only semaphores are provided.)
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
| |
It's only used in the gen7_cmd_buffer_emit_scissor() function.
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
|
| |
|
|
|
|
|
|
| |
This fixes make check
Fixes: 5c4e4932e02164e18cba9ae2cf3ec56afa2f2a6b
|
|
|
|
| |
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
| |
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
| |
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
| |
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Chad Versace <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
| |
Vulkan allows you to do a submit whose only job is to wait on and
trigger semaphores. The easiest way for us to support that right
now is to insert a dummy execbuf.
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
| |
This patch adds an implementation based on DRM BOs. We don't actually
advertise the extension yet because we want to add a couple more paths
first.
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We can't sample from depth-stencil formats but on gen7 but we can sample
from depth-only formats.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102024
Reviewed-by: Juan A. Suarez Romero <[email protected]>
Cc: [email protected]
|
|
|
|
|
| |
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
| |
There is literally no work for us to do here. It already just works in
our driver.
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
| |
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
| |
This way everything stays in sync and we only have the one version
number.
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
| |
This is more lines of code but the python is far easier to read than the
sed expressions we were using before. Also, this allows us to pull the
API version from anv_entrypoints.py so it never gets out-of-sync.
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
| |
The VkVersion class is probably overkill but it makes it really easy to
compare versions in a way that's safe without the caller having to think
about patch vs. no patch.
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
| |
This way we can use "from anv_extensions import *" in the entrypoint
generator without worrying too much about pollution
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As time goes on, extension advertising is going to get more complex.
Today, we either implement an extension or we don't. However, in the
future, whether or not we advertise an extension will depend on kernel
or hardware features. This commit introduces a python codegen framework
that generates the anv_EnumerateFooExtensionProperties functions as well
as a pair of anv_foo_extension_supported functions for querying for the
support of a given extension string. Each extension has an "enable"
predicate that is any valid C expression. For device extensions, the
physical device is available as "device" so the expression could be
something such as "device->has_kernel_feature". For instance
extensions, the only option is VK_USE_PLATFORM defines.
This mechanism also means that we have a single one-line-per-entry table
for all extension declarations instead of the two tables we had in
anv_device.c and the one we had in anv_entrypoints_gen.py. The Python
code is smart and uses the XML to determine whether an extension is an
instance extension or device extension.
Reviewed-by: Emil Velikov <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
| |
This will allow us to keep everything in one place when it comes to
declaring what extensions are supported.
Reviewed-by: Emil Velikov <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The EU limit of 128 GRFs should allow 32 vertex elements of 4 GRFs.
However, the maximum allowed value of "Vertex URB Entry Read Length"
in SIMD8 is 15. And 15 * 8 = 120 gives us a limit of 30 vertex elements.
Because we also need to reserve a vertex buffer to upload
VertexIndex/InstanceIndex and another to upload DrawID when needed,
we can only expose 28.
Cc: "17.2" <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
| |
Cc: "17.2" <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
| |
This allows us to avoid some extra zeroing.
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
| |
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
| |
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
| |
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Image layouts only let us know that an image *may* be fast-cleared. For
this reason we can end up with redundant resolves. Testing has shown
that such resolves can measurably hurt performance and that predicating
them can avoid the penalty.
v2:
- Introduce additional resolve state management function (Jason Ekstrand).
- Enable easy retrieval of fast clear state fields.
v3: Use more descriptive field enums (Jason)
Signed-off-by: Nanley Chery <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Nanley Chery <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
| |
With an earlier patch from this series, resolves are additionally
performed on layout transitions. Remove the now unnecessary implicit
resolves within render passes.
Signed-off-by: Nanley Chery <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
v2: Expound on comment for the pipe controls (Jason Ekstrand).
v3:
- Cast base_layer to uint64_t to avoid overflow.
- Remove "seems" from the pipe control comment.
- Fix clamp of layer_count (Jason Ekstrand).
Signed-off-by: Nanley Chery <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
| |
Use the performance warning infrastructure to provide helpful
information when testing applications.
Signed-off-by: Nanley Chery <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
| |
For readability, bring the assignment of CCS closer to the assignment of
NONE and MCS.
Signed-off-by: Nanley Chery <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The lifespan of the fast-clear data will surpass the render pass scope.
We need CCS_D to be enabled in order to invalidate blocks previously
marked as cleared and to sample cleared data correctly.
v2: Avoid refactoring.
v3: Allow CCS_D for subpass resolves.
Signed-off-by: Nanley Chery <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The next patch enables the use of CCS_D even when the color attachment
will not be fast-cleared. Catch the gen7 case early to simplify the
changes required.
Signed-off-by: Nanley Chery <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
| |
v2: Rewrite functions, change location of synchronization.
Signed-off-by: Nanley Chery <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We'll be performing a GPU memcpy in more places to copy small amounts of
data. Add an alternate function that thrashes less state.
v2:
- Make a new function (Jason Ekstrand).
- Move the #define into the function.
v3:
- Update the function name (Jason).
- Update comments.
v4: Use an indirect drawing register as TEMP_REG (Jason Ekstrand).
Signed-off-by: Nanley Chery <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
| |
v2: Remove ::first_subpass_layout assertion (Jason Ekstrand).
v3: Allow some fast clears in the GENERAL layout.
v4: Remove extra '||' and adjust line break (Jason Ekstrand).
Signed-off-by: Nanley Chery <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
| |
v2: Don't pass in the command buffer (Jason Ekstrand).
v3: Remove an incorrect assertion and an if condition for gen7.
Signed-off-by: Nanley Chery <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
v2: Rewrite functions.
v3 (Jason Ekstrand):
- Don't set ResourceMinLOD.
- Fix clamp of level_count.
Signed-off-by: Nanley Chery <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
| |
v2: Update comments, function signatures, and add assertions.
Signed-off-by: Nanley Chery <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|