| Commit message | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implement ir_binop_vector_extract using NIR operations. Based on SPIR-V
to NIR approach.
This fixes:
dEQP-GLES3.functional.shaders.indexing.moredynamic.with_value_from_indexing_expression_fragment
Piglit's glsl-fs-vec4-indexing-8.shader_test
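
As a rough illustration of extracting a vector component with a run-time index, the plain-C model below uses a compare-and-select chain. It is only a sketch: the function name and the use of float are assumptions for the example, and it does not call any NIR builder API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: pick vec[index] without indexing memory dynamically.
 * Start from component 0, then for each later component keep the current
 * result unless the index matches that component. */
static float vector_extract_dynamic(const float *vec, size_t num_components,
                                    size_t index)
{
    assert(num_components >= 1);

    float result = vec[0];
    for (size_t i = 1; i < num_components; i++)
        result = (index == i) ? vec[i] : result;

    /* An out-of-range index matches no comparison, so component 0 is kept. */
    return result;
}
```

In this model an out-of-range index simply yields the first component, which keeps the chain free of branches on invalid input.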
CC: [email protected]
Signed-off-by: Juan A. Suarez Romero <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Iago Toral <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Adds support for ARB_fragment_shader_interlock. We achieve
the interlock and fragment ordering by issuing a memory fence
via sendc.
Signed-off-by: Plamena Manolova <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This extension provides new GLSL built-in functions
beginInvocationInterlockARB() and endInvocationInterlockARB()
that delimit a critical section of fragment shader code. For
pairs of shader invocations with "overlapping" coverage in a
given pixel, the OpenGL implementation will guarantee that the
critical section of the fragment shader will be executed for
only one fragment at a time.
Signed-off-by: Plamena Manolova <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
After bebe3d626e5, b->fail_jump is prepared after vtn_create_builder,
which can longjmp(3) to it through its vtn_assert()s. This corrupts
the stack and creates confusing core dumps, so we need to avoid it.
While there, I decided to print the offending values for debuggability.
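
The ordering hazard can be sketched in plain C. All names below (struct builder, check, init_builder, parse) are hypothetical and only model the pattern; this is not the code from the patch. The point is that setjmp() has to arm the jump buffer before any code that may longjmp() to it runs.

```c
#include <setjmp.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical builder that carries its own failure jump target. */
struct builder {
    jmp_buf fail_jump;
    int version;
};

/* Assertion-style helper: prints the offending value, then jumps out. */
static void check(struct builder *b, bool cond, const char *what, int value)
{
    if (!cond) {
        fprintf(stderr, "validation failed: %s (value %d)\n", what, value);
        longjmp(b->fail_jump, 1);
    }
}

static void init_builder(struct builder *b, int version)
{
    check(b, version >= 1, "unsupported version", version);
    b->version = version;
}

/* Returns true on success, false if any check failed. */
static bool parse(int version)
{
    struct builder b;

    /* Arm the jump buffer first. Calling init_builder() before this
     * point would let check() jump through an uninitialized jmp_buf. */
    if (setjmp(b.fail_jump))
        return false;

    init_builder(&b, version);
    return true;
}
```

Here parse(1) succeeds, while parse(0) prints the rejected value and returns false through the armed jump buffer.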
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The driver must support at least one of
PIPE_CAP_TGSI_FS_COORD_ORIGIN_UPPER_LEFT
PIPE_CAP_TGSI_FS_COORD_ORIGIN_LOWER_LEFT
and one of
PIPE_CAP_TGSI_FS_COORD_PIXEL_CENTER_HALF_INTEGER
PIPE_CAP_TGSI_FS_COORD_PIXEL_CENTER_INTEGER
otherwise glsl_to_tgsi will fire an assert.
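
The requirement above amounts to a simple predicate. The sketch below assumes a hypothetical struct holding the four caps as booleans; it is not the actual state tracker check.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical container for the four fragment-coordinate caps. */
struct fs_coord_caps {
    bool origin_upper_left;
    bool origin_lower_left;
    bool pixel_center_half_integer;
    bool pixel_center_integer;
};

/* At least one origin convention and at least one pixel-center
 * convention must be reported. */
static bool fs_coord_caps_valid(const struct fs_coord_caps *caps)
{
    return (caps->origin_upper_left || caps->origin_lower_left) &&
           (caps->pixel_center_half_integer || caps->pixel_center_integer);
}

/* Models the failure mode described above: an assert fires when the
 * driver reports neither cap of a pair. */
static void require_fs_coord_caps(const struct fs_coord_caps *caps)
{
    assert(fs_coord_caps_valid(caps));
}
```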
ORIGIN_UPPER_LEFT is the default convention and is supported by
all mesa drivers, so it seems reasonable to always report this cap
as enabled. On gles, ORIGIN_LOWER_LEFT is generally not supported,
so we rely on the caps reported by the host, which depend on whether we
run on a GL or an EGL host.
For PIXEL_CENTER, what is supported depends entirely on the host driver.
Since we do not report the actual host driver capabilities,
it is best to mark both as supported; this is how it works for a GL
host too.
Fixes:
dEQP-GLES3.functional.shaders.builtin_variable.fragcoord_xyz
dEQP-GLES3.functional.shaders.metamorphic.bubblesort_flag.variant_1
dEQP-GLES3.functional.shaders.metamorphic.bubblesort_flag.variant_2
Reviewed-by: Gurchetan Singh <[email protected]>
Signed-off-by: Gert Wollny <[email protected]>
Signed-off-by: Jakob Bornecrantz <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The value returned by tgsi_util_get_texture_coord_dim() does not
account for the sample index. This means image_fetch_coords() will not
fetch it, leading to a null deref in ac_build_image_opcode() which
expects it to be present (the return value of ac_num_coords() *does*
include the sample index).
Signed-off-by: Alex Smith <[email protected]>
Cc: "18.1" <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This was not previously handled correctly. For example,
push_constant_stages might only contain MESA_SHADER_VERTEX because
only that stage was changed by CmdPushConstants or
CmdBindDescriptorSets.
In that case, if vertex has been merged with tess control, then the
push constant address wouldn't be updated since
pipeline->shaders[MESA_SHADER_VERTEX] would be NULL.
Use radv_get_shader() instead of getting the shader directly so that
we get the right shader if merged. Also, skip emitting the address
redundantly - if two merged stages are set in push_constant_stages
this change would have made the address get emitted twice.
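
A simplified model of the lookup and of the redundant-emit check might look like this. The types, the stage enum, and both function names are assumptions made for the example; the real radv_get_shader() may differ in its details.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stage list and pipeline, standing in for the driver types. */
enum stage {
    STAGE_VERTEX,
    STAGE_TESS_CTRL,
    STAGE_TESS_EVAL,
    STAGE_GEOMETRY,
    STAGE_FRAGMENT,
    STAGE_COUNT
};

struct shader { int id; };
struct pipeline { struct shader *shaders[STAGE_COUNT]; };

/* Return the shader that actually contains the given stage. A merged
 * stage has a NULL slot of its own and lives in the following stage. */
static struct shader *get_shader(const struct pipeline *p, enum stage s)
{
    assert(s < STAGE_COUNT);

    if (s == STAGE_VERTEX && !p->shaders[STAGE_VERTEX]) {
        if (p->shaders[STAGE_TESS_CTRL])
            return p->shaders[STAGE_TESS_CTRL];
        return p->shaders[STAGE_GEOMETRY];
    }
    if (s == STAGE_TESS_EVAL && !p->shaders[STAGE_TESS_EVAL])
        return p->shaders[STAGE_GEOMETRY];
    return p->shaders[s];
}

/* Count how many times the address would be emitted for a stage mask,
 * skipping a stage that resolves to the shader emitted just before it. */
static unsigned count_address_emits(const struct pipeline *p, unsigned mask)
{
    const struct shader *prev = NULL;
    unsigned emits = 0;

    for (int s = 0; s < STAGE_COUNT; s++) {
        if (!(mask & (1u << s)))
            continue;
        const struct shader *sh = get_shader(p, (enum stage)s);
        if (sh && sh != prev) {
            emits++;
            prev = sh;
        }
    }
    return emits;
}
```

With the vertex stage merged into tess control, a mask containing only the vertex bit still resolves to a shader, and a mask containing both bits produces a single emit.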
Signed-off-by: Alex Smith <[email protected]>
Cc: "18.1" <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This was being handled in a few different places, consolidate it into a
single radv_get_shader() function.
Signed-off-by: Alex Smith <[email protected]>
Cc: "18.1" <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
With GFX9 merged shaders, active_stages would be set to the original
stages specified if shaders were not cached, but to the stages still
present after merging if they were.
Be consistent and use the original stages.
Signed-off-by: Alex Smith <[email protected]>
Cc: "18.1" <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
| |
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106748
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
| |
v2 (Jason Ekstrand):
- Split the blorp bit into its own patch and re-order a bit
- Use anv_address helpers
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
| |
This is better than having BO and offset fields.
Reviewed-by: Scott D Phillips <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This commit renames add_surface_state_reloc to add_surface_reloc and
makes it take an address. We also rename add_image_view_relocs to
add_surface_state_relocs because it takes an anv_surface_state and
doesn't really care about the image view anymore.
Reviewed-by: Scott D Phillips <[email protected]>
|
|
|
|
| |
Reviewed-by: Scott D Phillips <[email protected]>
|
|
|
|
| |
Reviewed-by: Scott D Phillips <[email protected]>
|
|
|
|
|
|
|
|
| |
Instead of storing a BO and offset separately, use an anv_address. This
changes anv_fill_buffer_surface_state to use anv_address and we now call
anv_address_physical and pass that into ISL.
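
The idea of carrying a buffer and an offset together can be sketched as follows. The struct layouts and helper names are hypothetical stand-ins, and the resolution rule shown (buffer base plus offset) is an assumption for illustration; the driver's anv_address and anv_address_physical may behave differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical buffer object with a known GPU base address. */
struct buffer_object {
    uint64_t gpu_base;
};

/* Hypothetical address: a buffer (or NULL) plus an offset into it. */
struct address {
    struct buffer_object *bo;
    uint32_t offset;
};

/* Advance an address within its buffer, guarding against wrap-around. */
static struct address address_add(struct address addr, uint32_t delta)
{
    assert(delta <= UINT32_MAX - addr.offset);
    addr.offset += delta;
    return addr;
}

/* Resolve to a single 64-bit value. With no buffer attached, the offset
 * is treated as the address itself. */
static uint64_t address_physical(struct address addr)
{
    return addr.bo ? addr.bo->gpu_base + addr.offset : addr.offset;
}
```

Passing one value around instead of a separate BO and offset means a helper such as address_add() cannot be handed a mismatched pair.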
Reviewed-by: Scott D Phillips <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This refactors surface state filling to work entirely in terms of
anv_addresses instead of offsets. This should make things simpler for
when we go to soft-pin image buffers. Among other things,
add_image_view_relocs now only cares about the addresses in the surface
state and doesn't really need the image view anymore.
Reviewed-by: Scott D Phillips <[email protected]>
|
|
|
|
| |
Reviewed-by: Scott D Phillips <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These will be used to assign virtual addresses to soft pinned
buffers in a later patch.
Two allocators are added for separate 'low' and 'high' virtual
memory areas. Another alternative would have been to add a
double-sided allocator, which wasn't done here just because it
didn't appear to give any code complexity advantages.
v2 (Scott Phillips):
- rename has_exec_softpin to use_softpin (Jason)
- Only remove bottom one page and top 4 GiB from virt (Jason)
- refer to comment in anv_allocator about state address + size
overflowing 48 bits (Jason)
- Mention hi/lo allocators vs double-sided allocator in
commit message (Chris)
- assign state pool memory ranges statically (Jason)
v3 (Jason Ekstrand):
- Use (LOW|HIGH)_HEAP_(MIN|MAX)_ADDRESS rather than (1 << 31) for
determining which heap to use in anv_vma_free
- Only return de-canonicalized addresses to the heap
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Scott D Phillips <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Scott D Phillips <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The test pseudo-randomly makes allocations and deallocations with
the virtual memory allocator and checks that the results are
consistent. Specifically, we test that:
* no result from the allocator overlaps an already allocated range
* allocated memory fulfills the stated alignment requirement
* a failed result from the allocator could not have been fulfilled
* memory freed to the allocator can later be allocated again
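
The first two checks in the list can be expressed as small predicates. This is a sketch with hypothetical names, not the test's actual code; the overlap check is written to avoid overflow when ranges sit near the top of the 64-bit space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one allocation. */
struct range {
    uint64_t start;
    uint64_t size;
};

/* True if the two ranges share at least one byte. Comparing distances
 * instead of computing start + size avoids 64-bit overflow. */
static bool ranges_overlap(struct range a, struct range b)
{
    if (a.start < b.start)
        return b.start - a.start < a.size;
    return a.start - b.start < b.size;
}

/* True if addr is a multiple of align, which must be a power of two. */
static bool is_aligned(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr & (align - 1)) == 0;
}

/* A new result is consistent if it is aligned and overlaps nothing
 * that is currently allocated. */
static bool result_is_consistent(struct range result, uint64_t align,
                                 const struct range *live, size_t num_live)
{
    if (!is_aligned(result.start, align))
        return false;
    for (size_t i = 0; i < num_live; i++) {
        if (ranges_overlap(result, live[i]))
            return false;
    }
    return true;
}
```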
v2: - fix if() in test() to actually run fill()
v3: - add c++11 build flag (Jason)
- test the full 64-bit range (Jason)
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a simple linear-walk first-fit allocator roughly based on the
allocator in the radeon winsys code. This allocator has two primary
functional differences:
1) It cleanly returns 0 on allocation failure
2) It allocates addresses top-down instead of bottom-up.
The second one is needed for Intel because high addresses (with bit 47
set) need to be canonicalized in order to work properly. If we allocate
bottom-up, then high addresses will be very rare (if they ever happen).
We'd rather always have high addresses so that the canonicalization code
gets better testing.
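
For reference, canonicalizing a 48-bit address means copying bit 47 into bits 48 through 63. The helpers below are a minimal sketch with hypothetical names, not the driver's own functions.

```c
#include <assert.h>
#include <stdint.h>

#define ADDRESS_BITS 48
#define ADDRESS_MASK ((UINT64_C(1) << ADDRESS_BITS) - 1)

/* Sign-extend a 48-bit address: if bit 47 is set, set bits 48..63. */
static uint64_t canonicalize_address(uint64_t addr)
{
    assert((addr & ~ADDRESS_MASK) == 0);

    if (addr & (UINT64_C(1) << (ADDRESS_BITS - 1)))
        return addr | ~ADDRESS_MASK;
    return addr;
}

/* Drop the sign-extension bits again to get the plain 48-bit value. */
static uint64_t decanonicalize_address(uint64_t addr)
{
    return addr & ADDRESS_MASK;
}
```

Low addresses pass through unchanged, so a bottom-up allocator would rarely exercise the branch that sets the upper bits.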
v2: - [scott-ph] remove _heap_validate() if NDEBUG is defined (Jordan)
Reviewed-by: Scott D Phillips <[email protected]>
Tested-by: Scott D Phillips <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds a RADV_DEBUG=startup option to dump more info about
instance creation and device enumeration.
A common question end users have is why the driver is not loading
for them, and this has two common reasons:
1) They did not install the driver.
2) AMDGPU is not used for the card in the kernel.
This adds some info messages so we can easily get some useful
output from end users.
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Errors are uncommon, so we can accept a slight perf hit from
having to call a function and do a runtime check.
In turn, this makes debugging random errors that end users hit
easier, because they don't need to have a debug build on hand.
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
| |
They are only used in 1 file.
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Totals from affected shaders:
SGPRS: 80 -> 80 (0.00 %)
VGPRS: 48 -> 48 (0.00 %)
Code Size: 2120 -> 2096 (-1.13 %) bytes
Max Waves: 16 -> 16 (0.00 %)
Only two Rise of the Tomb Raider shaders are affected on my side.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This patch skips useless and possibly dangerous calls down to the driver
when invalid arguments are given. I noticed this happening
with the demo of the game Darwinia. AFAIK this does not fix anything, but it
makes this path safer and more like how other API functions are implemented.
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
CXXLD gallium_dri.la
../../../../src/broadcom/.libs/libbroadcom.a(clif_dump.o): In function `clif_dump_packet':
src/broadcom/clif/clif_dump.c:87: undefined reference to `v3d33_clif_dump_packet'
src/broadcom/clif/clif_dump.c:85: undefined reference to `v3d41_clif_dump_packet'
../../../../src/broadcom/.libs/libbroadcom.a(clif_dump.o): In function `clif_process_worklist':
src/broadcom/clif/clif_dump.c:140: undefined reference to `v3d41_clif_dump_gl_shader_state_record'
src/broadcom/clif/clif_dump.c:144: undefined reference to `v3d33_clif_dump_gl_shader_state_record'
Signed-off-by: Vinson Lee <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Gurchetan Singh <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
Signed-off-by: Jakob Bornecrantz <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jakob Bornecrantz <[email protected]>
Signed-off-by: Jakob Bornecrantz <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jakob Bornecrantz <[email protected]>
Signed-off-by: Jakob Bornecrantz <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jakob Bornecrantz <[email protected]>
Signed-off-by: Jakob Bornecrantz <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jakob Bornecrantz <[email protected]>
Signed-off-by: Jakob Bornecrantz <[email protected]>
|
|
|
|
|
|
|
|
| |
This makes this use all 32 bits, so future sets need to be
defined in a new struct.
Reviewed-by: Jakob Bornecrantz <[email protected]>
Signed-off-by: Jakob Bornecrantz <[email protected]>
|
|
|
|
|
|
|
| |
This avoids loop unrolling regressions in Wolfenstein II on DXVK
with an upcoming optimisation series from Samuel.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The current implementation depends on bpermute, which
is VI+.
Fixes: f2c6a550611 "radv: enable subgroup capabilities"
Reviewed-by: Daniel Schürmann <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This was terribly wrong: I forced the use of 32-bit pointers when
emitting shader descriptor pointers. This fixes GPU hangs with
LLVM 5 and 6, because 32-bit pointers are only supported with LLVM 7.
Fixes: 88d1ed0f81 ("radv: emit shader descriptor pointers consecutively")
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
| |
Fixes: f7604d8af52 ("st/dri: only expose config formats that are display targets")
Cc: "18.1" <[email protected]>
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
| |
Gallium drivers don't expose this yet due to:
"st/mesa: use PIPE_CAP_GLSL_FEATURE_LEVEL_COMPATIBILITY"
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This requires layered FBOs from GL 3.2.
Gallium drivers don't expose this yet due to:
"st/mesa: use PIPE_CAP_GLSL_FEATURE_LEVEL_COMPATIBILITY"
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
| |
Gallium drivers don't expose this yet due to:
"st/mesa: use PIPE_CAP_GLSL_FEATURE_LEVEL_COMPATIBILITY"
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
| |
Bindless texture handles can be passed via vertex attribs using this type.
They use the double codepath, so don't use st_pipe_vertex_format.
Cc: 18.0 18.1 <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
| |
Bindless texture handles can be passed via vertex attribs using this type.
This fixes a bunch of bindless piglit tests on radeonsi.
Cc: 18.0 18.1 <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
| |
This is required for tessellation shader Compat profile support.
Reviewed-by: Marek Olšák <[email protected]>
|