| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
Adds suppport for ARB_fragment_shader_interlock. We achieve
the interlock and fragment ordering by issuing a memory fence
via sendc.
Signed-off-by: Plamena Manolova <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This extension provides new GLSL built-in functions
beginInvocationInterlockARB() and endInvocationInterlockARB()
that delimit a critical section of fragment shader code. For
pairs of shader invocations with "overlapping" coverage in a
given pixel, the OpenGL implementation will guarantee that the
critical section of the fragment shader will be executed for
only one fragment at a time.
Signed-off-by: Plamena Manolova <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
| |
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106748
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Patch skips useless and possibly dangerous calls down to the driver
in case invalid arguments were given. I noticed this would be happening
with demo of Darwinia game. AFAIK this does not fix anything but makes
this path safer and more like how other API functions are implemented.
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
Gallium drivers don't expose this yet due to:
"st/mesa: use PIPE_CAP_GLSL_FEATURE_LEVEL_COMPATIBILITY"
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This requires layered FBOs from GL 3.2.
Gallium drivers don't expose this yet due to:
"st/mesa: use PIPE_CAP_GLSL_FEATURE_LEVEL_COMPATIBILITY"
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
| |
Gallium drivers don't expose this yet due to:
"st/mesa: use PIPE_CAP_GLSL_FEATURE_LEVEL_COMPATIBILITY"
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
| |
Bindless texture handles can be passed via vertex attribs using this type.
They use the double codepath, so don't use st_pipe_vertex_format.
Cc: 18.0 18.1 <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
| |
Bindless texture handles can be passed via vertex attribs using this type.
This fixes a bunch of bindless piglit tests on radeonsi.
Cc: 18.0 18.1 <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
| |
This is required for tessellation shader Compat profile support.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This just renames this as we want to add an shm handle which
isn't really drm related.
Originally by: Marc-André Lureau <[email protected]>
(airlied: I used this sed script instead)
This was generated with:
git grep -l 'DRM_API_' | xargs sed -i 's/DRM_API_/WINSYS_/g'
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 92f01fc5f914 ("i965: Emit VF cache invalidates for 48-bit
addressing bugs with softpin.") tried to only emit the VF invalidate if
the high bits changed, but it accidentally always set need_invalidate to
true; causing it to emit unconditionally emit the pipe control before
every primitive.
Fixes: 92f01fc5f914 ("i965: Emit VF cache invalidates for 48-bit addressing bugs with softpin.")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106708
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
---
v2: rebased on top of 432df741e0b85c021da0 "dri_util: Add
R10G10B10{A,X}2 translation between DRI and mesa_format."
|
|
|
|
|
|
|
|
|
|
|
|
| |
0 is not a valid value for the __DRI_IMAGE_FORMAT_* enum.
It is, however, the value of MESA_FORMAT_NONE, which two of the callers
(i915 & i965) checked for.
The other callers (that check for errors, ie. st/dri) already check for
__DRI_IMAGE_FORMAT_NONE.
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This reverts commit 5c33e8c7729edd5e16020ebb8703be96523e04f2. It broke
fixed function vertex programs on vc4 and v3d, and apparently caused
trouble for radeonsi's NIR paths as well.
Acked-by: Timothy Arceri <[email protected]>
https://bugs.freedesktop.org/show_bug.cgi?id=106673
|
|
|
|
|
|
|
|
|
| |
This reverts commit 79fe00efb474b3f3f0ba4c88826ff67c53a02aef.
This reverts commit f5e8b13f78a085bc95a1c0895e4a38ff6b87b375.
This reverts commit d21c086d819d78fb3f6abcbb14aa492970f442aa.
They broke the Android build and I'd rather not leave it broken
for the long holiday weekend.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Rename the (un)map_gtt functions to (un)map_map (map by
returning a map) and add new functions (un)map_tiled_memcpy that
return a shadow buffer populated with the intel_tiled_memcpy
functions.
Tiling/detiling with the cpu will be the only way to handle Yf/Ys
tiling, when support is added for those formats.
v2: Compute extents properly in the x|y-rounded-down case (Chris Wilson)
v3: Add units to parameter names of tile_extents (Nanley Chery)
Use _mesa_align_malloc for the shadow copy (Nanley)
Continue using gtt maps on gen4 (Nanley)
v4: Use streaming_load_memcpy when detiling
v5: (edited by Ken) Move map_tiled_memcpy above map_movntdqa, so it
takes precedence. Add intel_miptree_access_raw, needed after
rebasing on commit b499b85b0f2cc0c82b7c9af91502c2814fdc8e67.
Reviewed-by: Chris Wilson <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
We stream from a tiled and aligned source into an unaligned user buffer,
so we need to use _mm_storeu_si128.
Fixes: d21c086d819d78fb3f6abcbb14aa492970f442aa (i965/tiled_memcpy: inline movntdqa loads in tiled_to_linear)
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For certain EGLImage cases, we represent a single slice or LOD of an
image with a byte offset to a tile and X/Y intratile offsets to the
given slice. Most of i965 is fine with this but it breaks blorp. This
is a terrible way to represent slices of a surface in EGL and we should
stop some day but that's a very scary and thorny path. This gets blorp
to start working with those surfaces and fixes some dEQP EGL test bugs.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106629
Cc: [email protected]
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes shader images where we always bind stObj->pt and not individual
gl_texture_images.
Roughly based on i965 commit 845ad2667ab2466752f06ea30bdb9c837116c308
which does a similar thing but for a different reason.
This fixes GL CTS assertion failures introduced by Ilia.
Cc: 18.0 18.1 <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The reference for MOVNTDQA says:
For WC memory type, the nontemporal hint may be implemented by
loading a temporary internal buffer with the equivalent of an
aligned cache line without filling this data to the cache.
[...] Subsequent MOVNTDQA reads to unread portions of the WC
cache line will receive data from the temporary internal
buffer if data is available.
This hidden cache line sized temporary buffer can improve the
read performance from wc maps.
v2: Add mfence at start of tiled_to_linear for streaming loads (Chris)
Reviewed-by: Chris Wilson <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Acked-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
When glUseProgram is used, references to the included shaders are
added in ctx->Shader.ReferencedProgram. But those references are not
decreased when the shader data is deallocated. Thus, those shaders
are leaked.
Explicitely remove the pending references to these shaders.
Fixes: e6506b3cd23 ("mesa: retain gl_shader_programs after glDeleteProgram if they are in use")
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Functionality already covered by ARB_texture_view, patch also
adds missing 'gles guard' for enums (added in f1563e6392).
Tested via arb_texture_view.*_gles3 tests and individual app
utilizing texture view with ETC2.
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of directly using intel_obj->buffer. Among other things
intel_bufferobj_buffer() will update intel_buffer_object::
gpu_active_start/end, which are used by glBufferSubData() to decide
which path to take. Fixes a failure in the Piglit
ARB_shader_image_load_store-host-mem-barrier Buffer Update/WaW tests,
which could be reproduced with a non-standard glGetTexSubImage
implementation (see bug report).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105351
Reported-by: Nanley Chery <[email protected]>
Cc: [email protected]
Reviewed-by: Nanley Chery <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Otherwise the specified surface state will allow the GPU to access
memory up to BufferOffset bytes past the end of the buffer. Found by
inspection.
v2: Protect against out-of-range BufferOffset (Nanley).
Cc: [email protected]
Reviewed-by: Nanley Chery <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The buffer texture size calculations (should be easy enough, right?)
are repeated in three different places, each of them subtly broken in
a different way. E.g. the image load/store path was never fixed to
clamp to MaxTextureBufferSize, and none of them are taking into
account the buffer offset correctly. It's easier to fix it all in one
place.
Cc: [email protected]
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106481
Reviewed-by: Nanley Chery <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit c0ed52f6146c7e24e1275451773bd47c1eda3145. It was
preventing the image format validation from being done on buffer
textures, which is required to ensure that the application doesn't
attempt to bind a buffer texture with an internal format incompatible
with the image unit format (e.g. of different texel size), which is
not allowed by the spec (it's not allowed for *any* texture target,
whether or not there is spec wording restricting this behavior
specifically for buffer textures) and will cause the driver to
calculate texel bounds incorrectly and potentially crash instead of
the expected behavior.
Cc: [email protected]
Reviewed-by: Marek Olšák <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106465
Reviewed-by: Nanley Chery <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This patch adds {X,A}BGR2101010 entries to the list of supported
'intel_image_formats'.
Bug: https://crbug.com/776093
Reviewed-by: Chad Versace <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Add R10G10B10{A,X}2 translation between mesa_format and DRI format
to driGLFormatToImageFormat() and driImageFormatToGLFormat().
Bug: https://crbug.com/776093
Reviewed-by: Chad Versace <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
The only function that doesn't need to call access_raw is map_blit. If
it takes the blitter path, it will happen as part of intel_miptree_copy.
If map_blit takes the blorp path, brw_blorp_copy_miptrees will handle
doing whatever resolves are needed. This should save us resolves in
quite a few cases and will probably help performance a bit.
Reviewed-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
We still support the blitter on gen4-5 but it's on the same ring as 3D.
Reviewed-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
It's faster than the blitter and can handle things like stencil properly
so it doesn't require software fallbacks.
Reviewed-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
The blorp path (called first) can do anything the blitter path can do so
it's just dead code.
Reviewed-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
On gen4-5, we try the blitter before we even try blorp. On newer
platforms, blorp can do everything the blitter can so there's no point
in even having the blitter fall-back path.
Reviewed-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
This function is no longer used.
Reviewed-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Using meta for anything is fairly aweful and definitely has more CPU
overhead. However, it also uses the 3D pipe and is therefore likely
faster in terms of GPU time than the blitter. Also, the blitter code
has so many early returns that it's probably not buying us that much.
We may as well just use meta all the time instead of working over-time
to find the tiny case where we can use the blitter. We keep gen4-5
using the old blit paths to avoid perturbing old hardware too much.
Reviewed-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We'd like to start using soft-pin to assign BO addresses up front, and
never move them again. Our previous plan for dealing with 48-bit VF
cache bugs was to relocate vertex buffers to the low 4GB, so we'd never
have addresses that alias in the low 32 bits. But that requires moving
buffers dynamically.
This patch tracks the last seen BO address for each vertex/index buffer,
and emits a VF cache invalidate if the high bits change. (Ideally, we
won't hit this case very often.) This should work for the soft-pin
case, but unfortunately won't work in the relocation case, as we don't
actually know the addresses. So, we have to use both methods.
v2: Mention that the cache uses a <VertexBufferIndex, Address> tuple
more explicitly (suggested by Scott). Mention "single batch" too
(suggested by Chris).
Reviewed-by: Scott D Phillips <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We're planning to start managing the PPGTT in userspace in the near
future, rather than relying on the kernel to assign addresses. While
most buffers can go anywhere, some need to be restricted to within 4GB
of a base address.
This commit adds a "memory zone" parameter to the BO allocation
functions, which lets the caller specify which base address the BO will
be associated with, or BRW_MEMZONE_OTHER for the full 48-bit VMA.
Eventually, I hope to create a 4GB memory zone corresponding to each
state base address.
Reviewed-by: Scott D Phillips <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
| |
Just let the extension detection do its job as we will be adding
compat profile support in future, also we want these to work
with compat profile version overrides.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
This is useful for every user of ISL. Drop the comment along the way to
match similar functions in ISL.
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
intel_miptree_supports_{ccs,mcs,hiz} ensures the format is valid for the
color or depth miptree before the miptree is assigned an aux_usage.
alloc_aux switches on the aux_usage so don't assert that the format is
valid.
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Synchronize the requirements listed in isl_surf_get_ccs_surf with
intel_miptree_supports_ccs by importing a restriction from ISL. Some
implications:
* We successfully create every aux_surf in alloc_aux
* We only return false from alloc_aux if we run out of memory
Reviewed-by: Topi Pohjolainen <[email protected]>
|