| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
The implementation of brw_miptree_layout was removed in bf24c3539e4b69.
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Originally, I had moved it to the caller to make some things easier when
adding the CCS modifier. However, this broke DRI2 because
intel_process_dri2_buffer calls intel_miptree_create_for_bo but never
calls intel_miptree_alloc_aux. Also, in hindsight, it should be pretty
easy to make the CCS modifier stuff work even if create_for_bo allocates
the CCS when DISABLE_AUX is not set.
Reviewed-by: Jordan Justen <[email protected]>
Cc: "17.2" <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The flag hasn't affected actual surface layout for some time. The only
purpose it served was to set bo->cache_coherent = false on the BO used
to create the miptree. This is fairly silly because we can just set
that directly from the caller where it makes much more sense.
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
| |
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We rename it to intel_miptree_supports_mcs and make the function
signature match intel_miptree_supports_ccs/hiz. We also move the sample
count check into the function so it returns false for single-sampled
surfaces.
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
| |
The one caller of is_mcs_supported passes 0 in as the layout_flags
unconditionally.
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We were calculating the total height of 2D surfaces by multiplying the
row pitch by the number of slices. This means that we actually request
slightly more space than actually needed since the padding on the last
slice is unnecessary. For tiled surfaces this is not likely to make a
difference. For linear surfaces, on the other hand, this means we may
require additional memory. In particular, this makes the i965 driver
reject EGL imports of buffers which do not have this extra padding.
Reviewed-by: Jordan Justen <[email protected]>
Cc: "17.2" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The docs contain a bunch of commentary about the need to pad various
surfaces out to multiples of something or other. However, all of those
requirements are about avoiding GTT errors due to missing pages when the
data port or sampler accesses slightly out-of-bounds. However, because
the kernel already fills all the empty space in our GTT with the scratch
page, we never have to worry about faulting due to OOB reads. There are
two caveats to this:
1) There is some potential for issues with caches here if extra data
ends up in a cache we don't expect due to OOB reads. However,
because we always trash the entire cache whenever we need to move
anything between cache domains, this shouldn't be an issue.
2) There is a potential issue if a surface gets placed at the very top
of the GTT by the kernel. In this case, the hardware could
potentially end up trying to read past the top of the GTT. If it
nicely wraps around at the 48-bit (or 32-bit) boundary, then this
shouldn't be an issue thanks to the scratch page. If it doesn't,
then we need to come up with something to handle it.
Up until some of the GL move to ISL, having the padding code in there
just caused us to harmlessly use a bit more memory in Vulkan. However,
now that we're using ISL sizes to validate external dma-buf images,
these padding requirements are causing us to reject otherwise valid
images due to the size of the BO being too small.
Acked-by: Kenneth Graunke <[email protected]>
Tested-by: Tapani Pälli <[email protected]>
Tested-by: Tomasz Figa <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
Cc: "17.2" <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We can't sample from depth-stencil formats but on gen7 but we can sample
from depth-only formats.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102024
Reviewed-by: Juan A. Suarez Romero <[email protected]>
Cc: [email protected]
|
|
|
|
|
|
|
|
|
|
| |
This ports the workaround from radeonsi, that was missing in radv.
This fixes Talos rendering when MSAA is enabled on my Tahiti card.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Fixes: f4e499ec7 (radv: add initial non-conformant radv vulkan driver)
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This mirrors what Marek has done for radeonsi, and uses
a separate counter to handle the fmask surface for MSAA
MRTs.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This just copies the code from the -pro shaders,
and fixes the tests on CIK.
With this CIK passes the same set of conformance
tests as VI.
Fixes: 83e58b03 (radv: flush f32->f16 conversion denormals to zero. (v2))
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
| |
R8_UNORM textures can be emulated by means of L8 and a swizzle.
Signed-off-by: Wladimir J. van der Laan <[email protected]>
Reviewed-by: Philipp Zabel <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds support for large shaders on GC3000. For example the "terrain"
glmark benchmark with a large fragment shader will work after this.
If the GPU supports ICACHE, shaders larger than the available state area will
be uploaded to a bo of their own and instructed to be loaded from memory on
demand. Small shaders will be uploaded in the usual way. This mimics the
behavior of the blob.
On GPUs that don't support ICACHE, this patch should make no difference.
Signed-off-by: Wladimir J. van der Laan <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
GC3000 has changed from a separate store for VS and PS uniforms
to a single, unified one. There is backwards compatibilty functionalty,
however this does not work correctly together with ICACHE.
This patch adds explicit support, although in the simplest way possible:
the PS/VS uniforms split is still fixed and hardcoded. It should
make no difference on hardware that does not have unified uniform
memory.
Signed-off-by: Wladimir J. van der Laan <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Wladimir J. van der Laan <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
|
|
|
|
| |
Trivial. There is no _gl_ in there.
|
|
|
|
|
|
|
|
|
| |
The argument here is a bitmask, so the old code selected .xy, which
got silently truncated to .x when constructing the vec4 from components,
instead of using .w.
Fixes: 588185eb6b7 "radv/meta: add srgb conversion to end of resolve shader."
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
| |
It justs works with the fragment shader resolve, so no need to do
a custom conversion. In fact with SRGB dest, it actually gives
wrong results.
Fixes: 69136f4e633 "radv/meta: add resolve pass using fragment/vertex shaders"
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
| |
These seem to store very bogus results. Luckily there is some code
that converts srgb->linear already, so just making the descriptor
format UNORM should work.
Fixes: 588185eb6b7 "radv/meta: add srgb conversion to end of resolve shader."
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
| |
v2: fix an indentation error
v3: don't enable for r600
Signed-off-by: Andres Rodriguez <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
| |
These need to match for interop compatibility queries.
Signed-off-by: Andres Rodriguez <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This is required for interop use cases. The same device must report
identical UUIDs through the GL and Vulkan APIs so that users can
identify when it is safe to perform a memory object import.
v2: use ac helpers to calculate the uuid
Signed-off-by: Andres Rodriguez <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
| |
These are just basic implementations.
Signed-off-by: Andres Rodriguez <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
| |
v2: move from r600_common to radeonsi
Signed-off-by: Andres Rodriguez <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We need vulkan and gl to produce the same UUIDs. Therefore we should
keep the mechanism to compute these in a common location to guarantee
they are updated in lockstep.
Signed-off-by: Andres Rodriguez <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
v2: respective changes for new gallium interface
v3: fix UUID size asserts
Signed-off-by: Andres Rodriguez <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
v2: remove unnecessary returns
v3 (Timothy Arceri): updated trace
v4 (Timothy Arceri): actually dump the params in trace
Signed-off-by: Andres Rodriguez <[email protected]>
Reviewed-by: Marek Olšák <[email protected]> (v2)
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These are used by EXT_external_objects to present UUIDs for the device
and the driver.
v2 (Timothy Arceri):
- remove extra break
- use _mesa_problem() rather the _mesa_error() for unimplemented
support for value types
Signed-off-by: Andres Rodriguez <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
v2: use PIPE_CAP_MEMOBJ to guard the extension
v3 (Timothy Arceri):
- expose extensions via the cap_mappings array
Signed-off-by: Andres Rodriguez <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Include no_error variants as well.
v2 (Timothy Arceri):
- reduced code churn by squashing some changes into
previous commits
v3 (Timothy Arceri):
- drop unused function declaration
v4 (Timothy Arceri):
- fix Driver function assert()
- add missing GL errors
Signed-off-by: Andres Rodriguez <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
| |
Use a memory object instead of user memory.
Signed-off-by: Andres Rodriguez <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
| |
v2: also consider gfx9 metadata
v3: ref/unref memobj->buf
v4: add refcount comment
Signed-off-by: Andres Rodriguez <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
Plumbing for importing memobj backed textures.
Signed-off-by: Andres Rodriguez <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of allocating memory to back a texture, use the provided memory
object.
v2: split off extension exposure logic
v3: de-duplicate code with st_AllocTextureStorage
Signed-off-by: Andres Rodriguez <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
| |
Plumbing for using memory objects as texture storage.
Signed-off-by: Andres Rodriguez <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
V2 (Timothy):
- error check memory == 0 before lookup
Signed-off-by: Andres Rodriguez <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
V2 (Timothy Arceri):
- formating fixes
V3 (Timothy):
- error check memory == 0 before lookup
Signed-off-by: Andres Rodriguez <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
v2: pass dedicated flag
v3 (Timothy Arceri):
- remove unrequired _mesa_init_memory_object_functions()
call in the state tracker.
Signed-off-by: Andres Rodriguez <[email protected]>
Reviewed-by: Marek Olšák <[email protected]> (v2)
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
v2: fix comment regarding fd ownership, define pipe_memory_object
v3: remove stray return
v4 (Timothy Arceri): update trace
v5 (Timothy Arceri): actually dump the params in trace
Reviewed-by: Marek Olšák <[email protected]> (v3)
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
V2 (Timothy Arceri):
- fix copy and paste error with error message
V3 (Timothy Arceri):
- drop the Protected field for now as its unused
Signed-off-by: Andres Rodriguez <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Used by EXT_external_objects and EXT_external_objects_fd
V2 (Timothy Arceri):
- Throw GL_OUT_OF_MEMORY error if CreateMemoryObjectsEXT()
fails.
- C99 tidy ups
- remove void cast (Constantine Kharlamov)
V3 (Timothy Arceri):
- rename mo -> memObj
- check that the object is not NULL before initializing
- add missing "EXT" in function error message
V4 (Timothy Arceri):
- remove checks for (memory objecy id == 0) and catch in
_mesa_lookup_memory_object() instead.
Signed-off-by: Andres Rodriguez <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
| |
Includes implementation stubs.
Signed-off-by: Andres Rodriguez <[email protected]>
Acked-by: Timothy Arceri <[email protected]>
Acked-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The device version is the maximum CL version that the device supports.
device_version and device_clc_version are not necessarily the same for
devices that support CL 1.0, but have a 1.1 compiler and the necessary
extensions.
Eventually, this will be based on the features/extensions of the actual
device, but for now move it a bit closer to its eventual destination.
Signed-off-by: Aaron Watry <[email protected]>
Reviewed-by: Jan Vesey <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a bug in the app, but I'd rather avoid hanging the GPU,
esp if someone is running in validation and it takes out their
development environment.
v2: get it right, reverse the polarity.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Cc: <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Having two callbacks to manage a single int seems like an overkill.
Use a cached copy and update that when needed.
Signed-off-by: Emil Velikov <[email protected]>
---
Might want to look if the dimensions dance in .query_surface ...
speaking of which close to nobody implements that ...
|
|
|
|
|
|
|
|
|
| |
If we get a xfixes v1.x we'll error out, without freeing the
xfixes_query reply.
Cc: <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently xmlconfig is conditionally used, only when --enable-dri is
available.
As the library has moved to src/util and has wider wisebase, this guard
is no longer correct. Strictly speaking - it wasn't since the
introduction of xmlconfig into st/nine a while ago.
Unconditionally enable xmlconfig and drop the linking. As said before
there's other users of the library, so depending on the configure
options we will get multiple definitions of said symbols.
NOTE: To avoid breaking other combinations, this commit adds the
xmlconfig link to the required places - throughout gallium and the DRI
loaders.
Cc: Aaron Watry <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The kernel only cares about whether the object is to be written to or
not, only reduces (reloc.read_domains, reloc.write_domain) down to just
!!reloc.write_domain. When we use NO_RELOC, the kernel doesn't even read
those relocs and instead userspace has to pass that information in the
execobject.flags. We can simplify our reloc api by also removing the
unused read/write domains and only pass the resultant flags.
The caveat to the above are when we need to make the kernel aware that
certain objects need to take into account different work arounds.
Previously, this was done using the magic (INSTRUCTION, INSTRUCTION)
reloc domains. NO_RELOC requires this to be passed in the execobject
flags as well, and now we push that up the callstack.
The API is more compact, more expressive of what happens underneath, but
unfortunately requires more knowledge of the system at the point of use.
Conversely it also means that knowledge is specific and not generally
applied and so not overused.
text data bss dec hex filename
8502991 356912 424944 9284847 8dacef lib/i965_dri.so (before)
8500455 356912 424944 9282311 8da307 lib/i965_dri.so (after)
v2: (by Ken) Rebase.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Based on a patch by Chris Wilson (who also wrote this commit message).
Passing the index of the target buffer via the reloc.target_handle is
marginally more efficient for the kernel (it can avoid some allocations,
and can use a direct lookup rather than a hash or search). It is also
useful for ourselves as we can use the index into our exec_bos for other
tasks.
v2: Only enable HANDLE_LUT if we can use BATCH_FIRST and thereby avoid
a post-processing loop to fixup the relocations.
v3: Move kernel probing from context creation to screen init.
Use batch->use_exec_lut as it more descriptive of what's going on (Daniel)
v4: Kernel features already exists, use it for BATCH_FIRST
Rename locals to preserve current flavouring
v5: Squash in "always insert batch bo first"
v6: (by Ken) Split out BATCH_FIRST from HANDLE_LUT.
|