| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
Simplify the existing if/else + temporary variable into if (foo) return
X.
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
| |
Move the common extension check at the top.
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Will allow us to simplify existing code and make further improvements
short and simple.
No functional change intended.
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As per EGL_KHR_image_base:
If an attribute specified in <attrib_list> is not one of the
attributes listed in Table bbb, the error EGL_BAD_PARAMETER is
generated.
We should set the error as opposed to simply log it.
Currently we have a partial solution, whereby only some of the callers
call _eglError().
Since that has proven to be less robust, simply set the error by the
function itself and change the return type to EGLBoolean, updating the
callers.
So now the code is slightly simpler. Plus the follow-up fixes will be
easier to manage.
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Don't bother allocating any memory until we're finished parsing and
sanitising all the attributes.
As a nice side effect we now consistently set eglError when any of
the attrib/values are not correct.
Strangely enough the spec does not mention _anything_ about what error
should be set where, even if the implementation already sets the odd
one.
Cc: Kristian Høgsberg <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit bfe1e7737a76e3b046 changed how texture swizzles are set up.
This exposed a latent bug in the VMware driver: we were ignoring
the texture instruction's writemask when applying the 0 and 1
swizzle terms.
This wasn't caught by the Piglit texture swizzle test because it
only exercises fixed function (no write masking).
Fixes issues seen with ETQW apitrace.
CC: <[email protected]>
Reviewed-by: Charmaine Lee <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Valgrind doesn't actually implement VALGRIND_FREELIKE_BLOCK as the
exact inverse of VALGRIND_MALLOCLIKE_BLOCK. It makes the block
inaccessible, but still leaves it defined in its allocation tracker i.e.
it will report the mmap as lost despite the call to FREELIKE!
Instead of treating the mmap as an allocation, treat it as changing the
access bits upon the memory, i.e. that it becomes defined (because of
the buffer objects always contain valid content from the user's
perspective) upon mmap and inaccessible upon munmap. This makes memcheck
happy without leaving it thinking there is a very large leak.
Finally for consistency, we treat all the mmap/munmap paths the same
even though valgrind can intercept the regular mmap used for GTT. We
could move this in the drm_mmap/drm_munmap macros, but that quickly
looks ugly given the desire for those to support different OSes, but I
didn't try that hard!
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When using a read-only CPU mapping, we may encounter stale buffer
contents. For example, the Piglit primitive-restart test offers the
following scenario:
1. Read data via a CPU map.
2. Destroy that buffer.
3. Create a new buffer - obtaining the same one via the BO cache.
4. Call BufferSubData, which does a GTT map with MAP_WRITE | MAP_ASYNC.
(We avoid set_domain for async mappings, so no flushing occurs.)
5. Read data via a CPU map.
(Without explicit clflushing, this will contain data from step 1!)
Otherwise, everything ought to work, keeping in mind that we never use
CPU maps for writing - just read-only CPU maps.
This restores the performance gains after Matt's revert in commit
71651b3139c501f50e6547c21a1cdb816b0a9dde.
v2: Do the invalidate later, and even when asking for a brand new map.
v3: Add more comments from Chris.
Reviewed-by: Chris Wilson <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Just map the buffer and memcpy. This will do a CPU mmap, which should
be reasonably efficient, and doing this gives us full control over the
domains and caching instead of leaving it to the kernel.
This prevents regressions on Braswell in the next commit. Specifically
GL45-CTS.shader_atomic_counters.basic-buffer-operations. Because async
maps start skipping set-domain, the pread thought everything was nicely
still in the CPU domain, and returned stale data.
v2: Use _mesa_error_no_memory() if the map fails instead of crashing.
Reviewed-by: Chris Wilson <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
swr used to build and link the rasterizer to the driver, and to support
multiple architectures we needed to have multiple versions of the
driver/rasterizer combination, which needed to link in much of mesa.
Changing to having one instance of the driver and just building
architecture specific versions of the rasterizer gives a large reduction
in disk space.
libGL.so 6464 Kb -> 7000 Kb
libswrAVX.so 10068 Kb -> 5432 Kb
libswrAVX2.so 9828 Kb -> 5200 Kb
Total 26360 Kb -> 17632 Kb
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Use the SWR rasterizer API through the table returned from
SwrGetInterface rather than referencing the functions directly.
This will allow us to move to a model of having the driver dynamically
load the appropriate swr architecture library.
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
| |
Needed to expose SwrGetInterface
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Ben Widawsky <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
We could have used a single integer to store that value, but
Cannonlake has different number of subslices per slice depending on
the GT.
v2: Add CFL subslice numbers (Lionel)
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Ben Widawsky <[email protected]>
|
|
|
|
|
|
|
|
| |
Reduces IOCTL calls by 1, and provides a centralized place to override
such configurations if we have a need to do so.
Signed-off-by: Ben Widawsky <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
From KHR_fence_sync:
When the condition of the sync object is satisfied by the fence
command, the sync is signaled by the associated client API context,
causing any eglClientWaitSyncKHR commands (see below) blocking on
<sync> to unblock. The only condition currently supported is
EGL_SYNC_PRIOR_COMMANDS_COMPLETE_KHR, which is satisfied by
completion of the fence command corresponding to the sync object,
and all preceding commands in the associated client API context's
command stream. The sync object will not be signaled until all
effects from these commands on the client API's internal and
framebuffer state are fully realized. No other state is affected by
execution of the fence command.
If clients are passing the fence fd (from EGL_ANDROID_native_fence_sync)
to a compositor, that fence must only be signaled once the framebuffer
is resolved and not before as is currently the case.
v2: fixup assert to use GL_SYNC_GPU_COMMANDS_COMPLETE (Chad)
Reported-by: Sergi Granell <[email protected]>
Fixes: c636284ee8ee ("i965/sync: Implement DRI2_Fence extension")
Signed-off-by: Chris Wilson <[email protected]>
Cc: Sergi Granell <[email protected]>
Cc: Rob Clark <[email protected]>
Cc: Chad Versace <[email protected]>
Cc: Daniel Stone <[email protected]>
Cc: Kenneth Graunke <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Charmaine Lee <[email protected]>
Reviewed-by: Neha Bhende <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Charmaine Lee <[email protected]>
Reviewed-by: Neha Bhende <[email protected]>
|
|
|
|
|
|
|
| |
And s/unsigned/enum tgsi_interpolate_loc/
Reviewed-by: Charmaine Lee <[email protected]>
Reviewed-by: Neha Bhende <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Charmaine Lee <[email protected]>
Reviewed-by: Neha Bhende <[email protected]>
|
|
|
|
|
|
|
| |
Makes gdb debugging a little nicer.
Reviewed-by: Charmaine Lee <[email protected]>
Reviewed-by: Neha Bhende <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Using CPU maps of non-coherent buffers can get us in a lot of trouble,
and WC maps are a reasonable alternative anyway. Guard against shooting
ourselves in the foot by adding an assert, and comment.
Reviewed-by: Chris Wilson <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the user triggers an implicit batch flush while holding access to a
CPU mapped buffer, that mmapping will be invalidated by the kernel for
non-LLC devices. (The kernel when executing a batch will change the
cache domain of the buffers in that batch, which for non-LLC CPU access
will cause that buffer to be clflushed and any further CPU access to be
discarded.) To prevent this, simply disallow any CPU async mmap access.
The cases where async CPU access to a non-LLC buffer should continue to
be allowed via their preferred snooping path.
v2 (Ken): Reword the comment slightly.
Signed-off-by: Chris Wilson <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the buffer is being shared with an external client, our own state
tracking may be stale and in some cases we may wish to double check with
the kernel/hw state. At the moment, this is synonymous with not being
reusable, but the semantics between reusable and external are quite
different and we will have more examples of non-reusable buffers in the
near future.
Signed-off-by: Chris Wilson <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
| |
Thanks to Chris Wilson for pointing this out.
Reviewed-by: Daniel Vetter <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Acked-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
| |
I want to use these in the OpenGL driver as well.
v2: Add to COMMON_FILES in Makefile.sources (caught by Emil)
Reviewed-by: Daniel Vetter <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We were hitting the
unreachable("Invalid image opcode")
near the end of vtn_handle_image when parsing the
SpvOpAtomicCompareExchange opcode.
v2: Add stable CC.
v3: Ignore SpvOpAtomicCompareExchangeWeak. It requires the Kernel
capability which is not exposed in Vulkan, and spirv_to_nir is not used
for OpenCL which does support it.
Reviewed-by: Jason Ekstrand <[email protected]>
CC: <[email protected]>
|
|
|
|
|
|
|
| |
It can't parse "llu".
Reviewed-by: Thomas Helland <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, we use set_domain() to cause a stall on rendering. But the
set-domain ioctl has the side-effect of changing the kernel's cache
domain underneath the struct_mutex, which may perturb state if there was
no rendering to wait upon and in general is much heavier than the
lockless wait-ioctl. Historically libdrm used set-domain as we did not
have an explicit wait-ioctl (and the patches to teach it to use wait if
available were lost in the mists). Since mesa already depends upon a
kernel support the wait-ioctl, we do not need to supply a fallback.
Signed-off-by: Chris Wilson <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This query is supposed to return the max texture buffer size/width in
texels, not size in bytes. Divide by 16 (the largest format size) to
return texels.
Fixes Piglit arb_texture_buffer_object-max-size test.
Cc: [email protected]
Reviewed-by :Charmaine Lee <[email protected]>
|
|
|
|
|
|
|
|
| |
This fixes a regression in some piglit tests since commit 5e5d5f1a2eb.
I think I mis-resolved the merge conflict when cherry-picking that
commit to master.
Reviewed-by: Charmaine Lee <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The reason we were doing this was to ensure that the kernel did the
appropriate cross-ring synchronization and flushing. However, the
kernel only looks at EXEC_OBJECT_WRITE to determine whether or not to
insert a fence. It only cares about the domain for determining whether
or not it needs to clflush the BO before using it for scanout but the
domain automatically gets set to RENDER internally by the kernel if
EXEC_OBJECT_WRITE is set.
Reviewed-by: Chris Wilson <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
The register values depend on the currently set program, so make sure to
revalidate when the program changes.
Fixes glsl-1.10-fragdepth as well as
dEQP-GLES3.functional.shaders.fragdepth.compare.*
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Since radv uses compute rings and we can't know when we are setting
up the shaders what ring they are to be used on, we should just use
the default xnack setting. This may be suboptimal in some places,
but if we hit a problem, we likely should try and address this
between llvm and mesa.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
| |
Rather than using 64k, use what addrlib returns as the base
alignment for vulkan allocations.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
| |
Fixes a bunch of gl_BackColor interpolation tests that had explicit
interpolation specified on the fragment shader gl_Color.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
Better to just point at the bcolor_entry struct which has our current
understanding encoded into it. Also add an assert to ensure that the
struct remains the expected size.
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
| |
It should have been removed after 00c47e111c.
Cc: Jason Ekstrand <[email protected]>
Cc: Connor Abbott <[email protected]>
Signed-off-by: Andres Gomez <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
| |
Figured out the clear value when we have a combined depth stencil
surface.
Signed-off-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It could only handle indices 0/1, otherwise what happened was bad (accessing
array out of bounds, no crash but kind of random). This is enough for the gl
state tracker (primary/secondary color) but not enough for some other state
trackers (d3d9 has no limits on the number of color interpolants).
The complexity with color semantics are all due to the front/back mapping (2
outputs in the vs map to one input in the fs) so this isn't extended to
indices > 1 - d3d9 has no use for back colors, therefore this isn't needed and
still only 2 back colors can be handled correctly.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
By design pixel shaders can have up to 3 variants:
* The standard one.
* glDrawPixels variant.
* glBitmap variant.
However "shader_has_one_variant" ignores this fact, and therefore
st_update_fp would select the wrong variant if glDrawPixels or glBitmap
was ever called.
This patch fixes the problem. If the standard variant has been created,
calling glDrawPixels or glBitmap will append the variant to the second
entry of the linked list, so that st_update_fp still selects the right
one if shader_has_one_variant is set.
If the standard variant hasn't been created yet and glDrawPixel/Bitmap
has been called, st_update_fp will will see this and take the slow path
instead. The standard variant will then be added at the front of the
linked list, so that the next time the fast path is taken.
Blender in particular is hit by this bug.
v2: Marek - cosmetic changes
Fixes https://bugs.freedesktop.org/show_bug.cgi?id=101596
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This reverts commit 8aaa13467dc289d35dc7900ab9fab9a7689c4178, which was
based on an incorrect assumption. Unlike the restriction placed on image
views in the Vulkan API, OpenGL allows you to render to texture views
whose formats differ from the originals.
Bugzilla: https://bugzilla.freedesktop.org/show_bug.cgi?id=101677
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If we try to build a display list with just a glPrimitiveRestartNV()
call, we'd crash because of a null GLvertexformat::PrimitiveRestartNV
pointer. This change fixes that case.
The previous patch fixed the case of calling glPrimitiveRestartNV()
inside a glBegin/End pair.
v2: minor clean-up in save_PrimitiveRestartNV(), per Charmaine.
Reviewed-by: Charmaine Lee <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
glPrimitiveRestartNV crashes when it is called during the compilation
of a display list.
There are two reasons:
- ctx->Driver.CurrentSavePrimitive is not set to the current primitive
- save_PrimitiveRestartNV() calls _save_Begin() which only sets an
OpenGL error, instead of calling vbo_save_NotifyBegin().
This patch correctly calls vbo_save_NotifyBegin() but it detects
the current primitive mode by looking at the latest saved primitive.
Additional work by Brian Paul
Signed-off-by: Olivier Lauffenburger <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101464
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Charmaine Lee <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Charmaine Lee <[email protected]>
|