| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With the advent of asynchronous maps, domain tracking doesn't make a
whole lot of sense. Buffers can be in use on both the CPU and GPU at
the same time. In order to avoid blocking, we stopped using set_domain
for asynchronous mappings, which means that the kernel's tracking has
lies. We can't properly track it in userspace either, as the kernel
can change domains on us spontaneously (for example, when un-swapping).
According to Chris Wilson, I915_GEM_SET_DOMAIN does the following:
1. pins the backing storage (acquiring pages outside of the
struct_mutex)
2. waits either for read/write access, including inter-device waits
3. updates the domain, clflushing as required
4. marks the object as used (for swapping)
5. turns off FBC/PSR/fancy scanout caching
Item (1) is not terribly important. Most BOs are recycled via the
BO cache, so they already have pages. Regardless, we fixed this
via an initial set_domain in the previous patch.
We implement item (2) with I915_GEM_WAIT. This has one downside:
we'll stall unnecessarily if we do a read-only mapping of a buffer
that the GPU is reading. I believe this is pretty uncommon. We
may want to extend the wait ioctl at some point.
Mesa already does item (3) itself. For cache-coherent buffers (most on
LLC systems), we don't need to do any clflushing - the CPU and GPU views
are coherent. For non-coherent buffers (most on non-LLC systems), we
currently only use the CPU for read-only maps, and we explicitly clflush
when necessary.
We don't care about item (4)...swapping has already killed performance.
Plus, with async maps, the kernel's domain tracking is already bogus,
so it can't do this accurately regardless.
Item (5) should be okay because we avoid cached maps of scanout buffers.
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
| |
Suggested by Chris Wilson.
v2: Set the write domain to 0 (suggested by Chris).
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
From the ARB_uniform_buffer_object spec:
""shared" uniform blocks, the default layout, ..."
This doesn't fix anything as the default layout is already applied
at this point but fixes the misleading code/comment.
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
| |
This was added in 2d03f48a65a666 and seems like it was intended
as a TODO comment in a function stub rather than a useful
code comment.
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
When we validate the texture sample count, pass the correct
pipe_texture_target for the texture, rather than PIPE_TEXTURE_2D.
Also add more comments about MSAA.
No piglit regressions with VMware driver.
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
| |
The NumSamples and FixedSampleLocation fields are set again later at
the end of the function so these earlier assignments aren't needed.
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
imm {128.0, -128.0, 2.0, 3.0} is used for lit instruction which
is not used very frequently. So allocate it only if lit instruction is used.
Tested with mtt piglit and mtt glretrace
v2: As per Charmaine's comment
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Charmaine Lee <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes the ordering of the constant indices for texcoord scale
factor and texture buffer size to match the order they were added to the
constant buffer in svga_get_extra_constants_common().
Tested with MTT piglit, glretrace.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Sometimes, converting unnormalized coordinates to normalized
coordinates requires an epsilon value to produce the right texels with
nearest filtering. Adding 0.0001 to the coordinates when the min/mag
filter is nearest fixes the issue.
Fixes piglit test fbo-blit-scaled-linear
Tested with mtt-piglit, mtt-glretrace
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Charmaine Lee <[email protected]>
|
|
|
|
|
|
|
| |
Skip 2x MSAA, for example, since it's seldom used and just bloats
the list of pixel formats.
Reviewed-by: Charmaine Lee <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Alejandro Piñeiro <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
| |
This allows us to override contexts to use no_error functionality
even if the applications themselves do not.
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
| |
We have a very specific row pitch that we want and we don't want ISL to
be changing it on us so just be explicit about it.
Fixes: a40f0430347c07bf2d5794642fe02f5dd248a473
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
| |
For some reason we left an empty file, rather than deleting it.
|
|
|
|
|
|
|
|
|
|
|
|
| |
fixes
missrendering in TombRaider
KHR-GL44.gpu_shader5.precise_qualifier
KHR-GL45.gpu_shader5.precise_qualifier
v4: disable opt only for MAD, it's fine for SAD
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Pierre Moreau <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Pierre Moreau <[email protected]>
|
|
|
|
|
|
|
| |
v4: initialize field with NULL
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Pierre Moreau <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
| |
v4: add comment about intermediate rounding step to MAD
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
| |
v2: use str_match_no_case to fix _SAT_PRECISE detection
v4: usd is_digit_alpha_underscore to match end of mods
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Only implemented for glsl->tgsi. Other converters just set precise to 0.
v2: remove precise paramter from ureg_tex_insn and ureg_memory_insn
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
| |
all subexpression inside an ir_assignment needs to be tagged as precise.
v2: make precise handling more global inside the visitor
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
| |
There's a second struct for Gen6+.
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
When I ported from libdrm, I forgot to add the line to reset
the sem, we just need to reset the context.
This fixes a regression in DOOM.
Fixes: 9ac1432a571 ("radv: port to new libdrm API.")
Reported-by: Grazvydas Ignotas <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With this patch, the st manager will maintain a hash table for
the active framebuffer interface objects. A destroy_drawable interface
is added to allow the state tracker to notify the st manager to remove
the associated framebuffer interface object from the hash table,
so the associated framebuffer and its resources can be deleted
at framebuffers purge time.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101829
Fixes: 147d7fb772a ("st/mesa: add a winsys buffers list in st_context")
Tested-by: Brad King <[email protected]>
Tested-by: Gert Wollny <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The two generators forked from each other, and they remain basically the
same. This rebases the radv version on the anv version, but with the
radv changes ported over. The result is that we get rid of the "cat |"
madness and gain mako, correct "generated by" attributions, and write
files out directly.
The only differences between the output is whitespace and comments.
Signed-off-by: Dylan Baker <[email protected]>
Acked-by: Dave Airlie <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This was only needed for checking gen6 stencil which is already
using isl. One could delete GEN6_HIZ_STENCIL layout altogether
but that will be gone with the rest after a while anyway.
The dim_layout converter is needed even after transition to isl
when setting up surface states - see brw_emit_surface_state().
Hence dropping the unneeded argument separately.
Reviewed-by: Jason Ekstrand <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
allowing graceful failure instead of crash on assert later on.
This can be hit, for example, on SNB when trying to allocate
8kx8k CUBE_MAP against isl: x-tiled buffer size becomes
2421161984 exceeding the maximum of 1 << 31 == 2147483648.
Another way to hit this on SNB is with multisampling of over
64-bit formats.
Reviewed-by: Jason Ekstrand <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Otherwise init_teximage_fields_ms() (called by
_mesa_init_teximage_fields()) will always assert as it can't
find valid base format.
Reviewed-by: Jason Ekstrand <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There is the same constraintg later on as assert in
isl_gen7_choose_image_alignment_el() so catch it earlier in order
to return error instead of crash.
Needed to avoid crashes with piglits on IVB and HSW:
arb_internalformat_query2.image_format_compatibility_type pname checks
arb_internalformat_query2.all internalformat_<x>_type pname checks
arb_internalformat_query2.max dimensions related pname checks
arb_copy_image.arb_copy_image-formats --samples=2/4/6/8
arb_texture_float.multisample-fast-clear gl_arb_texture_float
Reviewed-by: Jason Ekstrand <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These formats are already allowed by the i965 GL driver, and the
feature seems to work just fine.
There are tests for multisampled rendering in piglit:
tests/spec/ext_framebuffer_multisample which can be patched to
try 16I/32I in addition to GL_RGBA8I.
IvyBridge passed all tests with all sample numbers.
Reviewed-by: Jason Ekstrand <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These formats are already allowed by the i965 GL driver, and the
feature seems to work just fine.
There are tests for multisampled rendering in piglit:
tests/spec/ext_framebuffer_multisample which can be patched to
try GL_RGBA16F/32F/16I/16UI/32I/32UI in addition to GL_RGBA/8I.
IvyBridge passed all tests with all sample numbers and even
with 128-bit formats.
Reviewed-by: Jason Ekstrand <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
| |
in order to support blit engine.
Reviewed-by: Jason Ekstrand <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
| |
See brw_miptree_choose_tiling().
Reviewed-by: Jason Ekstrand <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We had some caller using LLVMAddInstrAttributes, which couldn't be
converted to lp_add_function_attr, because attributes were only handled
for functions in this case, so fix this.
For llvm >= 4.0, this already works correctly.
(radeonsi seems to avoid setting call site attributes prior to llvm 4.0,
the patch then citing it doesn't work when calling intrinsics. But at
least for calling external functions we always used that, albeit only
for actual call attributes, not call parameter attributes, though some
quick test shows llvm seems to handle that as well. The attribute index
is sort of iffy though, since attribute 0 of the call is the actual function,
attribute 1 corresponds to the first parameter of the called function.)
(Verified with GALLIVM_DEBUG=dumpbc plus llvm-dis that the correct
attributes are shown for calls, both for llvm 4.0 and 3.3.)
Reviewed-by: Jose Fonseca <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We can also use storage images internally for resolves, which don't
require TRANSFER_DST usage on the image, so currently we may not create
the needed descriptors.
Just create these descriptors unconditionally.
Fixes: 0e1886efb9e ("radv: Fix descriptors for cube images with VK_IMAGE_USAGE_STORAGE_BIT")
Reported-by: Grazvydas Ignotas <[email protected]>
Signed-off-by: Alex Smith <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Linux-specific gettid() syscall shouldn't be used in portable code.
Fix does assume a 1:1 thread:LWP architecture, but works for our
current target platforms and can be revisited later if needed.
Fixes unresolved symbol in linux scons builds.
v2: add comment in code about the 1:1 assumption.
Cc: [email protected]
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds support for sharing semaphores using kernel syncobjects.
Syncobj backed semaphores are used for any semaphore which is
created with external flags, and when a semaphore is imported,
otherwise we use the current non-kernel semaphores.
Temporary imports from syncobj fd are also available, these
just override the current user until the next wait, when the
temp syncobj is dropped.
v2: allocate more chunks upfront, fix off by one after
previous refactor of syncobj setup, remove unnecessary null
check.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
| |
This just adds syncobj create/destroy/export/import paths into
the winsys interface.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|