| Commit message | Author | Age | Files | Lines |
| |
v2: * Add a gl_shader_spirv_data member to gl_shader, which already
encapsulates a gl_spirv_module where the binary will be saved.
(Eduardo Lima)
* Just use the 'spirv_data' member to know whether a gl_shader has
the SPIR_V_BINARY_ARB state. (Timothy Arceri)
* Remove redundant argument checks. Move extension presence check
to API entry point where the rest of checks are. Retype 'n' and
'length' arguments to use the correct and more standard types.
(Ian Romanick)
* Fix some nitpicks. (Ian Romanick)
Reviewed-by: Ian Romanick <[email protected]>
| |
This is a per-shader structure holding the SPIR-V data associated with the
shader (binary module, specialization constants and entry-point).
This is needed because both gl_shader and gl_linked_shader need to share this
data. Instead of copying the data, we pass a reference to it upon program
linking. That's why it is reference-counted.
This struct is created and associated with the shader upon calling
glShaderBinary(), then subsequently filled up by the call to
glSpecializeShaderARB().
v2: Readability improvements (Ian Romanick)
Reviewed-by: Ian Romanick <[email protected]>
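
The sharing scheme described above (one reference-counted object referenced by
both the shader and the linked shader) could be sketched roughly as follows.
This is only an illustration: the struct name, its fields and both helper
functions are hypothetical and are not the names used in Mesa.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-shader SPIR-V data. The field names
 * are illustrative assumptions, not Mesa's actual layout. */
struct spirv_data_sketch {
   int refcount;
   const void *module;   /* binary module, opaque in this sketch */
};

/* Create the data with a single reference owned by the caller. */
static struct spirv_data_sketch *
spirv_data_create(const void *module)
{
   struct spirv_data_sketch *d = calloc(1, sizeof(*d));
   if (d == NULL)
      return NULL;
   d->refcount = 1;
   d->module = module;
   return d;
}

/* Make *dest point at src, adjusting both reference counts.
 * Passing src == NULL just drops the reference held in *dest. */
static void
spirv_data_reference(struct spirv_data_sketch **dest,
                     struct spirv_data_sketch *src)
{
   struct spirv_data_sketch *old = *dest;

   if (old == src)
      return;

   if (src != NULL)
      src->refcount++;

   *dest = src;

   if (old != NULL) {
      assert(old->refcount > 0);
      if (--old->refcount == 0)
         free(old);
   }
}
```

With helpers like these, linking would take a second reference instead of
copying the module, and the data is freed only when the last holder drops it.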
| |
v2: * Make the SPIR-V module struct part of a larger gl_shader_spirv_data
struct that will be introduced later, and don't reference it directly
in gl_shader. (Eduardo Lima)
* Readability improvements (Ian Romanick)
Reviewed-by: Ian Romanick <[email protected]>
| |
v2: * Add meson build bits (Eric Engestrom)
* Return INVALID_OPERATION error on SpecializeShaderARB (Ian Romanick)
v3: Include boilerplate for the GL 4.6 alias of glSpecializeShaderARB
(Neil Roberts)
Reviewed-by: Emil Velikov <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
| |
Reviewed-by: Ian Romanick <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101560
| |
Instead of calling vtn_add_case for the default case and then looping,
add an is_default variable and do everything inside the loop. This will
make the next commit easier.
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
| |
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
| |
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
| |
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
| |
This autogenerated pass will automatically find and set the type field
on all vtn_values. This way we always have the type and can use it for
validation and other checks.
Reviewed-by: Ian Romanick <[email protected]>
| |
At the moment, this just lets us drop the const_type for constants and
unify things a bit. Eventually, we will use this to store the types of
all SPIR-V SSA values.
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
| |
We can write to the same output but in different components, like
in this example:
layout(location = 0, component = 0) out ivec2 dEQP_FragColor_0;
layout(location = 0, component = 2) out ivec2 dEQP_FragColor_1;
Therefore, they are not two different outputs but only one.
Fixes:
dEQP-VK.glsl.440.linkage.varying.component.frag_out.*
v3:
- Remove FRAG_RESULT_MAX.
- Add const and use sizeof (Ian).
- Do three passes to properly set the locations of fragment
outputs when having arrays (Jason).
Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
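
To illustrate why the two declarations above form a single output, here is a
minimal sketch that merges the component ranges of all variables sharing a
location into one write mask. The struct and function names are hypothetical
and do not come from the driver.

```c
#include <assert.h>

/* Hypothetical description of one fragment output variable. */
struct frag_out_sketch {
   unsigned location;
   unsigned component;       /* first component written, 0..3 */
   unsigned num_components;  /* number of components, 1..4 */
};

/* OR together the component masks of every variable at 'location'. */
static unsigned
location_write_mask(const struct frag_out_sketch *outs, unsigned count,
                    unsigned location)
{
   unsigned mask = 0;

   for (unsigned i = 0; i < count; i++) {
      if (outs[i].location != location)
         continue;
      assert(outs[i].component + outs[i].num_components <= 4);
      mask |= ((1u << outs[i].num_components) - 1) << outs[i].component;
   }
   return mask;
}
```

For the example in this message, the two ivec2 variables at location 0
(components 0-1 and 2-3) combine into a mask covering all four components.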
| |
GLSL IR operation arguments can sometimes have an implicit swizzle as a
result of a vector arg and a scalar arg, where the scalar argument is
implicitly expanded to the size of the vector argument.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103955
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
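
As a rough illustration of the implicit expansion described above, the sketch
below builds the swizzle a scalar operand would need so that it matches a
vector operand. The function name and the array-based swizzle representation
are assumptions made for this example only.

```c
#include <assert.h>

/* Fill swz[0..dst_components-1] with the source channel to read for each
 * destination channel. A scalar source is broadcast (always channel 0);
 * a source that already matches the destination width maps one to one. */
static void
implicit_swizzle_sketch(unsigned src_components, unsigned dst_components,
                        unsigned swz[4])
{
   assert(dst_components >= 1 && dst_components <= 4);
   assert(src_components == 1 || src_components == dst_components);

   for (unsigned i = 0; i < dst_components; i++)
      swz[i] = (src_components == 1) ? 0 : i;
}
```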
| |
Care must be taken that all coords end up correct; the tests are very
sensitive to everything being correctly rounded. This doesn't matter
for bilinear filter (since picking a wrong texel with weight zero is
ok), and we could also switch the per-sample coords mistakenly.
While here, also optimize the coord_mirror helper a bit (we can do the
mirroring directly by exploiting float rounding, no need for fixing up
odd/even manually).
I did not touch the mirror_clamp and mirror_clamp_to_border modes.
In contrast to mirror_clamp_to_edge and mirror_repeat these are legacy
modes. They are specified against old GL rules, which actually do
the mirroring not per sample (so you get swapped order if the coord
is in the mirrored section). I think the idea though is that they should
follow the respecified mirror_clamp_to_edge rules so the order would be
correct.
Reviewed-by: Jose Fonseca <[email protected]>
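
One way to do mirroring through rounding, as mentioned above, is to subtract
the nearest even integer and take the absolute value. The scalar sketch below
shows that idea only; it is not the vectorized helper from this change, the
function name is made up, and the cast-based rounding is an assumption that
limits it to coordinates that fit in a long.

```c
#include <assert.h>

/* Mirror-repeat a normalized coordinate into [0, 1].
 * Subtracting twice the rounded half-coordinate folds the value into
 * [-1, 1]; the absolute value then reflects the negative half, so no
 * separate odd/even fixup is needed. */
static float
coord_mirror_sketch(float coord)
{
   float half = coord * 0.5f;
   /* Round to the nearest integer (ties away from zero) by truncation. */
   float nearest = (float)(long)(half + (half >= 0.0f ? 0.5f : -0.5f));
   float folded = coord - 2.0f * nearest;

   return folded < 0.0f ? -folded : folded;
}
```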
| |
Since we switched over to lowering SLM access directly in SPIR-V -> NIR,
we no longer have vtn_variables for SLM. It's all safe as with UBOs and
SSBOs but we need to let it through in the assert.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104213
Fixes: 8761a04d0d9332d9c0c99164faf855fc3c741f7c
Reviewed-by: Jordan Justen <[email protected]>
| |
Reviewed-by: Ian Romanick <[email protected]>
| |
Fixes: 93b4cb61eb2 "spirv: Allow OpPtrAccessChain for block indices"
Reviewed-by: Dave Airlie <[email protected]>
| |
To support the reindex intrinsic, we need the result to be
something on which we can adjust the index/address.
Since it is all within a basic block, the compiler should be
able to merge any extra loads.
v2: Change visit_get_buffer_size too.
Reviewed-by: Dave Airlie <[email protected]>
| |
Cc: 17.3 <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
| |
| |
If the app does not plan to put a buffer or image in it
(why? But it is allowed and CTS does it), it does not need to
allocate it with the dedicated allocation struct.
Fixes: a639d40f133 "radv: add support for local bos. (v3)"
Reviewed-by: Dave Airlie <[email protected]>
| |
There is no chain, so checking the length ends with a SEGFAULT.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103579
Cc: <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
| |
Push constants on Intel hardware are significantly more performant than
pull constants. Since most Vulkan applications don't actively use push
constants, or at least don't use them heavily, we're pulling way
more than we should be. By enabling pushing chunks of UBOs we can get
rid of a lot of those pulls.
On my SKL GT4e, this improves the performance of Dota 2 and Talos by
around 2.5% and improves Aztec Ruins by around 2%.
Reviewed-by: Jordan Justen <[email protected]>
| |
In Vulkan, we don't support classic pull constants and everything the
client asks us to push, we push. However, for pushed UBOs, we still
want to fall back to conventional pulls if we run out of space.
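
The fallback described above can be pictured with a simple greedy assignment:
a UBO range is pushed while it still fits in the remaining register budget and
is otherwise left to be pulled. This is a sketch under that assumption, not
the policy the driver implements; all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical UBO range candidate for pushing. */
struct ubo_range_sketch {
   unsigned length;   /* size in push registers */
   bool pushed;       /* false means the shader falls back to a pull */
};

/* Mark each range as pushed if it fits in what is left of 'budget'.
 * Returns the number of registers actually used. */
static unsigned
assign_push_ranges(struct ubo_range_sketch *ranges, unsigned count,
                   unsigned budget)
{
   unsigned used = 0;

   for (unsigned i = 0; i < count; i++) {
      if (ranges[i].length <= budget - used) {
         ranges[i].pushed = true;
         used += ranges[i].length;
      } else {
         ranges[i].pushed = false;
      }
   }
   return used;
}
```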
| |
Push constants work in terms of 32-byte chunks so if we want to be able
to push UBOs, everything needs to be 32-byte aligned. Currently, we
only require 16-byte alignment, which is too small.
Reviewed-by: Jordan Justen <[email protected]>
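
A small sketch of the arithmetic behind the 32-byte requirement: a byte range
inside a UBO is converted into whole 32-byte chunks, which only lines up with
the push hardware if the UBO base address is itself 32-byte aligned. The macro
and function names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

#define PUSH_CHUNK_BYTES 32u  /* push constants work in 32-byte chunks */

/* Round v up to the next multiple of a power-of-two alignment. */
static uint32_t
align_u32_sketch(uint32_t v, uint32_t a)
{
   assert(a != 0 && (a & (a - 1)) == 0);
   return (v + a - 1) & ~(a - 1);
}

/* Convert a byte range relative to the UBO base into whole push chunks. */
static void
ubo_range_to_chunks(uint32_t start_byte, uint32_t size_bytes,
                    uint32_t *first_chunk, uint32_t *num_chunks)
{
   uint32_t end = align_u32_sketch(start_byte + size_bytes, PUSH_CHUNK_BYTES);

   *first_chunk = start_byte / PUSH_CHUNK_BYTES;
   *num_chunks = end / PUSH_CHUNK_BYTES - *first_chunk;
}
```

With a base that is only 16-byte aligned, chunk boundaries computed this way
would not match the 32-byte units the hardware fetches.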
| |
In order to do this we have to modify push constant set up to handle
ranges. We also have to tweak the way we handle dirty bits a bit so
that we re-push whenever a descriptor set changes.
Reviewed-by: Jordan Justen <[email protected]>
| |
There are several places where we look up opcodes in an array of stages.
Assert that we don't end up going out-of-bounds.
Reviewed-by: Jordan Justen <[email protected]>
| |
Reviewed-by: Jordan Justen <[email protected]>
| |
We want to call brw_nir_analyze_ubo_ranges immediately after
anv_nir_apply_pipeline_layout and it badly wants constants. We could
run an optimization step and let constant folding do it but that's way
more expensive than needed. It's really easy to just handle constants
in apply_pipeline_layout.
Reviewed-by: Jordan Justen <[email protected]>
| |
This rewires the logic for assigning uniform locations to work in terms
of "complex alignments". The basic idea is that, as we walk the list of
instructions, we keep track of the alignment and continuity requirements
of each slot and assert that the alignments all match up. We then use
those alignments in the compaction stage to ensure that everything gets
placed at a properly aligned register. The old mechanism handled
alignments by special-casing each of the bit sizes and placing 64-bit
values first followed by 32-bit values.
The old scheme had the advantage of never leaving a hole since all the
64-bit values could be tightly packed and so could the 32-bit values.
However, the new scheme has no type size special cases so it handles not
only 32 and 64-bit types but should gracefully extend to 16 and 8-bit
types as the need arises.
Tested-by: Jose Maria Casanova Crespo <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
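
To make the idea of a "complex alignment" more concrete, the sketch below
models a requirement of the form "address % mul == offset" and finds the first
position that satisfies it. The struct and function names are illustrative
assumptions and may not match the code in this change.

```c
#include <assert.h>

/* Requirement of the form "address % mul == offset" (hypothetical type). */
struct cplx_align_sketch {
   unsigned mul;     /* power of two, in bytes */
   unsigned offset;  /* required remainder, 0 <= offset < mul */
};

/* Return the smallest address >= pos that satisfies the requirement. */
static unsigned
cplx_align_apply_sketch(struct cplx_align_sketch a, unsigned pos)
{
   assert(a.mul != 0 && (a.mul & (a.mul - 1)) == 0);
   assert(a.offset < a.mul);

   unsigned rem = pos & (a.mul - 1);
   unsigned base = pos - rem;

   return rem <= a.offset ? base + a.offset : base + a.mul + a.offset;
}
```

Placing a 32-bit value at byte 0 and then a 64-bit value (mul 8, offset 0)
starting from byte 4 lands the latter at byte 8, which is the kind of hole the
old sorted scheme avoided.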
| |
The testing for this extension is currently very poor. The CTS tests
only test accessing UBOs and SSBOs at dynamic offsets so none of our
constant-offset paths get triggered at all. Also, there's an assertion
in our handling of nir_intrinsic_load_uniform that offset % 4 == 0 which
is never triggered, indicating that nothing ever gets loaded from an
offset which is not dword-aligned. Both push constants and the constant
offset pull paths are complex enough that we really don't want to ship
without tests. We'll turn the extension back on once we have decent
tests.
| |
VCE processes IBs starting with the session and task info at the first
level; other commands are processed subsequently. The task info for
destroy is embedded in the destroy command, with the result that the
feedback command is not properly processed. This causes the kernel to emit
VM fault messages on Polaris and Vega10 cards when an encode application
finishes running.
The fix has also been verified on a card using VCE physical mode.
Signed-off-by: Leo Liu <[email protected]>
Cc: [email protected]
Acked-by: Christian König <[email protected]>
| |
Power8, Power8NV, and Power9 are supported on an equal footing
with X86.
Cc: "17.2" "17.3" <[email protected]>
Signed-off-by: Ben Crocker <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
[Eric: changed formatting, reworded a bit (with Ben's ack)]
Signed-off-by: Eric Engestrom <[email protected]>
| |
Signed-off-by: Emil Velikov <[email protected]>
| |
Signed-off-by: Emil Velikov <[email protected]>
| |
Signed-off-by: Emil Velikov <[email protected]>
(cherry picked from commit 49a612d1580b3316392273a069d20d93967126a8)
| |
Signed-off-by: Emil Velikov <[email protected]>
(cherry picked from commit 8d55da9f579463038f4305ed7d505aa7fffa0f37)
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
| |
Might be useful to know the VRAM/GTT usage, the number of VRAM
CPU page faults, etc. Nothing is currently using that new
interface, but it's a first step.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
| |
emit_fast_color_clear() already checks that.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
| |
Already checked by the respective callers.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
| |
dota2 binds a ton of index buffers but the type is always 16-bit.
Note that we have to invalidate the type when switching from
indexed draws to normal draws.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
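
The caching idea above can be sketched as follows: the last emitted index type
is remembered and re-emitted only when it changes, and a non-indexed draw
resets the cached value. All names here are hypothetical, and the 'emits'
counter merely stands in for writing a packet to the command stream.

```c
#include <assert.h>
#include <stdint.h>

#define INDEX_TYPE_INVALID (-1)

/* Hypothetical slice of command buffer state. */
struct cmd_state_sketch {
   int32_t last_index_type;  /* last value emitted, or INDEX_TYPE_INVALID */
   unsigned emits;           /* packets "emitted", for illustration only */
};

/* Emit the index type only if it differs from the cached one. */
static void
emit_index_type(struct cmd_state_sketch *s, int32_t type)
{
   if (s->last_index_type == type)
      return;

   s->emits++;
   s->last_index_type = type;
}

/* A non-indexed draw invalidates the cached type, as noted above. */
static void
draw_non_indexed(struct cmd_state_sketch *s)
{
   s->last_index_type = INDEX_TYPE_INVALID;
}
```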
| |
dota2 always calls vkResetCommandBuffer() before
vkBeginCommandBuffer() which is quite useless.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
| |
RADV_CMD_BUFFER_STATUS_INVALID is not used for now, but I think
it makes sense to declare it. Could be used later with better
command buffer error handling.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
| |
Copied from RadeonSI.
This fixes all CTS tests matching
dEQP-VK.renderpass.dedicated_allocation.formats.d32_sfloat_s8_uint.clear.*
and some other ones which use the same format.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
| |
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
Acked-by: Tapani Pälli <[email protected]>
| |
This resolves an apparent game bug described in 85564. The game
doesn't properly handle ARB_get_program_binary with 0 supported
formats.
V2 (Timothy Arceri):
- less driver code as more has been moved into the common helpers.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85564
Signed-off-by: Timothy Arceri <[email protected]>
Signed-off-by: Jordan Justen <[email protected]> (v1)
Reviewed-by: Tapani Pälli <[email protected]>
| |
The GL_ARB_get_program_binary extension spec says:
"If ProgramBinary fails to load a binary, no error is generated, but
any information about a previous link or load of that program object
is lost."
v2:
* Re-initialize shProg->data after clear. (Jordan)
(Required after 6a72eba755fea15a0d97abb913a6315d9d32e274)
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
| |
V2: call generic mesa_program_binary() helper rather than driver
function directly to allow greater code sharing.
Signed-off-by: Timothy Arceri <[email protected]>
Signed-off-by: Jordan Justen <[email protected]> (v1)
Reviewed-by: Nicolai Hähnle <[email protected]> (v1)
Reviewed-by: Tapani Pälli <[email protected]>
| |
V2: call generic _mesa_get_program_binary() helper rather than driver
function directly to allow greater code sharing.
Signed-off-by: Timothy Arceri <[email protected]>
Signed-off-by: Jordan Justen <[email protected]> (v1)
Reviewed-by: Nicolai Hähnle <[email protected]> (v1)
Reviewed-by: Tapani Pälli <[email protected]>