| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This function is equivalent to the linker.cpp
build_program_resource_list() but will extract the resources from NIR
shaders instead.
For now, only uniforms and program inputs are implemented.
v2: move from compiler/nir to compiler/glsl (Timothy Arceri)
v3: remove support for inputs, that is still WIP (spotted by Timothy
Arceri)
Signed-off-by: Eduardo Lima <[email protected]>
Signed-off-by: Alejandro Piñeiro <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
So it could be used by the GLSL and NIR linker.
v2: (Timothy Arceri)
* Moved from compiler to compiler/glsl
* Method renamed to link_util_add_program_resource
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
| |
This is based on link_uniform_initializers.cpp.
v2: move from compiler/nir to compiler/glsl (Timothy Arceri)
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This function will be the entry point for linking the uniforms from
the nir_shader objects associated with the gl_linked_shaders of a
program.
This patch includes initial support for linking uniforms from NIR
shaders. It is tailored for the ARB_gl_spirv needs, and it is far from
complete, but it should handle most cases of uniforms, array
uniforms, structs, samplers and images.
There are some FIXMEs related to specific features that will be
implemented in following patches, like atomic counters, UBOs and
SSBOs.
Also, note that ARB_gl_spirv makes mandatory explicit location for
normal uniforms, so this code only handles uniforms with explicit
location. But there are cases, like uniform atomic counters, that
doesn't have a location from the OpenGL point of view (they have a
binding), but that Mesa assign internally a location. That will be
handled on following patches.
A nir_linker.h file is also added. More NIR-linking related API will
be added in subsequent patches and those will include stuff from Mesa,
so reusing nir.h didn't seem a good idea.
v2: move from compiler/nir to compiler/glsl (Timothy Arceri)
v3: sets var->driver.location if the uniform was found from a previous
stage (Neil Roberts).
Signed-off-by: Eduardo Lima <[email protected]>
Signed-off-by: Neil Roberts <[email protected]
Signed-off-by: Alejandro Piñeiro <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Linker utilities common to the GLSL IR and NIR linker (the latter to
be used for ARB_gl_spirv).
We need to move it to a new header as the NIR linker doesn't need to
know about ir_variable, and others, included at linker.h.
v2: move from src/compiler to src/compiler/glsl (Timothy Arceri)
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
| |
When SpvDecorationBinding is encountered in the SPIR-V source it now
sets explicit_binding on the nir_variable. This will be used to
determine whether to initialise sampler and image uniforms with the
binding value.
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
vtn_variable_mode_image and _sampler are instead replaced with
vtn_variable_mode_uniform which encompasses both of them. In the few
places where it was neccessary to distinguish between the two, the
GLSL type of the pointer is used instead.
The main reason to do this is that on OpenGL it is permitted to put
images and samplers into structs and declare a uniform with them. That
means that variables can now have a mix of uniform, sampler and image
modes so picking a single one of those modes for a variable no longer
makes sense.
This fixes OpLoad on a sampler within a struct which was previously
using the variable mode to determine whether it was a sampler or not.
The type of the variable is a struct so it was not being considered to
be uniform mode even though the member being loaded should be sampler
mode.
The previous code appeared to be using var->interface_type as a place
to store the type of the variable without the enclosing array for
images and samplers. I guess this worked because opaque types can not
appear in interfaces so the interface_type is sort of unused. This
patch removes the overloading of var->interface_type and any places
that needed the type without the array can now just deduce it from
var->type.
v2: squash in this patch the changes to anv/nir (Timothy)
Signed-off-by: Eduardo Lima <[email protected]>
Signed-off-by: Neil Roberts <[email protected]
Signed-off-by: Alejandro Piñeiro <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
They are supported by SPIR-V for ARB_gl_spirv.
v2 (changes on top of Nicolai's original patch):
* Handle UniformConstant storage class for uniforms other than
samplers and images. (Eduardo Lima)
* Handle location decoration also for samplers and images. (Eduardo
Lima)
* Rebase update (spirv_to_nir options added, logging changes, and
others) (Alejandro Piñeiro)
Signed-off-by: Nicolai Hähnle <[email protected]>
Signed-off-by: Eduardo Lima <[email protected]>
Signed-off-by: Alejandro Piñeiro <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
| |
I think it is more accurate to call it a sampler target (?).
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
| |
It is basically a wrapper around glsl_type::component_slots().
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Vulkan has the concept of separate image and sampler objects in the
SPIR-V code whereas GL conflates them into one. nir_lower_samplers
contains an assert to verify that sampler operand is not being set on
the nir instruction. However when the code comes from spirv_to_nir the
sampler operand is always set. GL_arb_gl_spirv explicitly states that
OpTypeSampler is not supported so it retains the GL behaviour of not
being able to seperate them. Therefore the sampler will always be the
same as the texture. This GL version of the lowering code ignores
instr->sampler and sets instr->sampler_index to the same value as
instr->texture_index. Some other places in the code (such as in
nir_print) assume that once the instruction is lowered then both
instr->texture and instr->sampler will be NULL, so to keep this
behaviour we now set instr->sampler to NULL after ignoring it to fill
in instr->sampler_index.
Signed-off-by: Eduardo Lima <[email protected]>
Signed-off-by: Neil Roberts <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
| |
This is copied from the corresponding value in ir_variable. The
intention is to eventually use it in a pure-NIR linker.
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With the recent rework of converting the shell script to a python one
the check for actual tests was dropped.
Bring that back, since it was explicitly added considering we had a ~2
year period, during which the tests were not run.
v2: use raise Exception() over print() & return false (Dylan)
Fixes: db8cd8e36771 ("glcpp/tests: Convert shell scripts to a python
script")
Cc: Dylan Baker <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Bring back the "detection" of the said variables, to allow
standalone execution.
Fixes: db8cd8e36771 ("glcpp/tests: Convert shell scripts to a python
script")
Cc: Dylan Baker <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Dylan Baker <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
As of recently both of these have been reworked so they invoke a python
script. At the same time the latter can be executed with the combined
arguments of both scripts.
AKA we no longer need to have them separate.
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Dylan Baker <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Run this pass late (after opt loop) to move load_const instructions back
into the basic blocks which use the result, in cases where a load_const
is only consumed in a single block.
This helps reduce register usage in cases where the backend driver
cannot lower the load_const to a uniform.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
| |
Save the next person from digging through the code to figure out what
the indirect_mask parameter actually does.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
| |
Just something I stumbled across.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
| |
The prog->Shaders[i]->IsES check was accidentally removed causing
ES linking rules to be applied to desktop GLSL.
Fixes: 725b1a406dbe ("mesa/util: add allow_glsl_relaxed_es driconfig override")
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This relaxes a number of ES shader restrictions allowing shaders
to follow more desktop GLSL like rules.
This initial implementation relaxes the following:
- allows linking ES shaders with desktop shaders
- allows mismatching precision qualifiers
- always enables standard derivative builtins
These relaxations allow Google Earth VR shaders to compile.
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
Google Earth VR shaders uses builtins in constant expressions with
GLSL 1.10. That feature wasn't allowed until GLSL 1.20.
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
nir_ssa_def::parent_instr and nir_src::parent_instr have the same name,
but they mean really different things. I choose to save the next person
the hour+ that I just spent figuring that out. Even now that I know, I
doubt I'd notice in code review that someone typed foo->parent_instr
when they actually meant foo->ssa->parent_instr.
v2: Minor wording tweak in nir_ssa_def::parent_instr. Suggested by
Jason.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since SSBOs can be written by a different GPU thread, copy propagating a
read can cause the value to magically change. SSBO reads are also very
expensive, so doing it twice will be slower.
The same shader was helped by this patch and the previous.
Haswell, Broadwell, and Skylake had similar results. (Skylake shown)
total instructions in shared programs: 14399119 -> 14399113 (<.01%)
instructions in affected programs: 683 -> 677 (-0.88%)
helped: 1
HURT: 0
total cycles in shared programs: 532973113 -> 532971865 (<.01%)
cycles in affected programs: 524666 -> 523418 (-0.24%)
helped: 1
HURT: 0
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
Cc: [email protected]
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106774
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since SSBOs can be written by other GPU threads, copy propagating a read
can cause the value to magically change. SSBO reads are also very
expensive, so doing it twice will be slower.
Haswell, Broadwell, and Skylake had similar results. (Skylake shown)
total instructions in shared programs: 14399120 -> 14399119 (<.01%)
instructions in affected programs: 684 -> 683 (-0.15%)
helped: 1
HURT: 0
total cycles in shared programs: 532978931 -> 532973113 (<.01%)
cycles in affected programs: 530484 -> 524666 (-1.10%)
helped: 1
HURT: 0
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
Cc: [email protected]
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106774
|
|
|
|
|
|
|
|
|
|
|
| |
GLSL 4.60 offically added this but games and older CTS suites actually
had shaders that did this, we may as well enable it everywhere.
Adding stable because it appears apps in the wild do this.
Acked-by: Timothy Arceri <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Cc: <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
v2:
An attempt to support SpvExecutionModeStencilRefReplacingEXT's behavior
also follows, with the interpretation to said mode being we prevent
writes to the built-in FragStencilRefEXT variable when the execution
mode isn't set.
v3:
A more cautious reading of 1db44252d01bf7539452ccc2b5210c74b8dcd573 led
me to a missing change that would stop (what I later discovered were)
GPU hangs on the CTS test written to exercise this.
v4:
Turn FragStencilRefEXT decoration usage without StencilRefReplacingEXT
mode into a warning, instead of trying to make the variable read-only.
If we are to follow the originating extension on GL, the built-in
variable in question should never be readable anyway.
v5/v6: rebases.
v7:
Fix check for gen9 lost in rebase. (Ilia)
Reduce the scope of the bool used to track whether
SpvExecutionModeStencilRefReplacingEXT was used. Was in shader_info,
moved to vtn_builder. (Jason)
v8:
Assert for fragment shader handling StencilRefReplacingEXT execution
mode. (Caio)
Remove warning logic, since an entry point might not have
StencilRefReplacingEXT execution mode, but the global output variable
might still exist for another entry point in the module. (Jason)
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
| |
Add the missing nir intrinsic for the gl_GlobalInvocationID
compute shader variable.
Signed-off-by: Plamena Manolova <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This pass detects potential loop terminators and moves intructions
from the non breaking branch after the if-statement.
This enables both the new opt_if_simplification() pass and loop
unrolling to potentially progress further.
Unexpectedly this change speed up shader-db run times by ~3%
Ivy Bridge shader-db results (all changes in dolphin/ubershaders):
total instructions in shared programs: 9995662 -> 9995338 (-0.00%)
instructions in affected programs: 87845 -> 87521 (-0.37%)
helped: 27
HURT: 0
total cycles in shared programs: 230931495 -> 230925015 (-0.00%)
cycles in affected programs: 56391385 -> 56384905 (-0.01%)
helped: 27
HURT: 0
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
| |
We will use the helper while simplifying potential loop terminators
in the following patch.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
mesa/st decides whether to update samplers after a program change based on
whether num_textures is nonzero. By not counting samplers in a uniform
struct, we would segfault in
KHR-GLES3.shaders.struct.uniform.sampler_vertex if it was run in the same
context after a non-vertex-shader-uniform testcase (as is the case during
a full conformance run).
v2: Implement using two separate pure functions instead of updating
pointers.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This is basically the same as the GLSL lowering path.
v2: Fix typo in the link
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
| |
This is basically the same as the GLSL lowering path.
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
| |
This is based on the glsl/lower_instructions.cpp implementation, but
should be much more readable.
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
| |
There is a fairly simple relation to turn this into ufind_msb.
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
| |
ufind_msb is easily expressed in terms of clz, and we can reduce ifind_msb
to that.
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
| |
V3D doesn't have opcodes for ibfe/ubfe, so we need to lower similarly to
glsl/lower_instructions.cpp.
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
If you don't have HW to do bfi, then lowering bitfieldInsert to bfi makes
things harder than keeping the "bits" argument around.
This still uses bfm, but I've added the obvious lowering of bfm if you
need it.
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
| |
GLSL ES 1.0.17 specifies that "double" is a keyword reserved
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106823
Signed-off-by: zhaowei yuan <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This pass turns:
if (cond) {
} else {
do_work();
}
into:
if (!cond) {
do_work();
} else {
}
Here's the vkpipeline-db stats (from affected shaders) on Polaris10:
Totals from affected shaders:
SGPRS: 17272 -> 17296 (0.14 %)
VGPRS: 18712 -> 18740 (0.15 %)
Spilled SGPRs: 1179 -> 1142 (-3.14 %)
Code Size: 1503364 -> 1515176 (0.79 %) bytes
Max Waves: 916 -> 911 (-0.55 %)
This pass only affects Serious Sam 2017 (Vulkan) on my side. The
stats are not really good for now. Some shaders look quite dumb
but this will be improved with further NIR passes, like ifs
combination.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Rename and change the prototype for consistency regarding
nir_tex_instr_is_query(). This function will be used in the
following patch.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
| |
These wrappers were introduces, so start using them.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some trivial help now, but it also prevents ~40 regressions caused by
Samuel's "nir: implement the GLSL equivalent of if simplication in
nir_opt_if" patch.
All Gen4+ platforms had similar results. (Skylake shown)
total instructions in shared programs: 14369557 -> 14369555 (<.01%)
instructions in affected programs: 442 -> 440 (-0.45%)
helped: 2
HURT: 0
total cycles in shared programs: 532425772 -> 532425743 (<.01%)
cycles in affected programs: 6086 -> 6057 (-0.48%)
helped: 2
HURT: 0
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
d8d18516b0a and 03fb13f6467 added some patterns to undo conversions like
(('ior', ('flt', a, b), ('flt', a, c)), ('flt', a, ('fmax', b, c)))
If further optimization cause some of the operands to either be the same
or be constants, undoing the transformation can lead to further savings.
I don't know why these patterns were not added in those patches. I did
not check to see which specific patterns actually helped. I just added
all of them for symmetry. This prevents some loop unrolling regressions
Plane Shift caused by Samuel's "nir: implement the GLSL equivalent of if
simplication in nir_opt_if" patch.
Skylake and Broadwell had similar results. (Skylake shown)
total instructions in shared programs: 14369768 -> 14369557 (<.01%)
instructions in affected programs: 44076 -> 43865 (-0.48%)
helped: 141
HURT: 0
helped stats (abs) min: 1 max: 5 x̄: 1.50 x̃: 1
helped stats (rel) min: 0.07% max: 1.52% x̄: 0.66% x̃: 0.60%
95% mean confidence interval for instructions value: -1.67 -1.32
95% mean confidence interval for instructions %-change: -0.72% -0.59%
Instructions are helped.
total cycles in shared programs: 532430629 -> 532425772 (<.01%)
cycles in affected programs: 1170832 -> 1165975 (-0.41%)
helped: 101
HURT: 5
helped stats (abs) min: 1 max: 160 x̄: 48.54 x̃: 32
helped stats (rel) min: <.01% max: 8.49% x̄: 2.76% x̃: 2.03%
HURT stats (abs) min: 2 max: 22 x̄: 9.20 x̃: 4
HURT stats (rel) min: <.01% max: 0.05% x̄: 0.02% x̃: <.01%
95% mean confidence interval for cycles value: -53.64 -38.00
95% mean confidence interval for cycles %-change: -3.06% -2.20%
Cycles are helped.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implement ir_binop_vector_extract using NIR operations. Based on SPIR-V
to NIR approach.
This fixes:
dEQP-GLES3.functional.shaders.indexing.moredynamic.with_value_from_indexing_expression_fragment
Piglit's glsl-fs-vec4-indexing-8.shader_test
CC: [email protected]
Signed-off-by: Juan A. Suarez Romero <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Iago Toral <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This extension provides new GLSL built-in functions
beginInvocationInterlockARB() and endInvocationInterlockARB()
that delimit a critical section of fragment shader code. For
pairs of shader invocations with "overlapping" coverage in a
given pixel, the OpenGL implementation will guarantee that the
critical section of the fragment shader will be executed for
only one fragment at a time.
Signed-off-by: Plamena Manolova <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
After bebe3d626e5, b->fail_jump is prepared after vtn_create_builder
which can longjmp(3) to it through its vtx_assert()s. This corrupts
the stack and creates confusing core dumps, so we need to avoid it.
While there, I decided to print the offending values for debugability.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Totals from affected shaders:
SGPRS: 80 -> 80 (0.00 %)
VGPRS: 48 -> 48 (0.00 %)
Code Size: 2120 -> 2096 (-1.13 %) bytes
Max Waves: 16 -> 16 (0.00 %)
Only two Rise of Tomb Raider shaders are affected on my side.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
| |
This avoids loop unrolling regressions in Wolfenstein II on DXVK
with an upcoming optimisation series from Samuel.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
| |
v2 (Jose Maria Casanova Crespo <[email protected]>): add float16 support
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Jose Maria Casanova Crespo <[email protected]>
|