| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Noticed while reviewing Tim Arceri's NIR inlining series.
Without his series:
instructions in affected programs: 16 -> 14 (-12.50%)
helped: 2
With his series:
instructions in affected programs: 196 -> 174 (-11.22%)
helped: 22
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
| |
It can do 32-bit packing too now.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
| |
With 16-bit support we can now do 32-bit packing, a follow-up patch will
rename the pass to something more generic.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Noitice that we don't need 'split' versions of the 64-bit to / from
16-bit opcodes which we require during pack lowering to implement these
operations. This is because these operations can be expressed as a
collection of 32-bit from / to 16-bit and 64-bit to / from 32-bit
operations, so we don't need new opcodes specifically for them.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Not all bit-sizes may be supported natively in hardware for all operations.
This pass allows drivers to lower such operations to a bit-size that is
actually supported and then converts the result back to the original
bit-size.
Compiler backends control which operations and wich bit-sizes require
the lowering through a callback function.
v2: generalize this pass and make it available in NIR core (Rob, Jason)
v3: remove some temporaries and reduce nesting in instruction loop using
a continue statement (Jason)
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This behaviour was changed in 1e5b09f42f694687ac. The commit message
for that says it is just a “tidy up” so my assumption is that the
behaviour change was a mistake. It’s a little hard to decipher looking
at the diff, but the previous code before that patch was:
if (builtin == SpvBuiltInFragCoord || builtin == SpvBuiltInSamplePosition)
nir_var->data.origin_upper_left = b->origin_upper_left;
if (builtin == SpvBuiltInFragCoord)
nir_var->data.pixel_center_integer = b->pixel_center_integer;
After the patch the code was:
case SpvBuiltInSamplePosition:
nir_var->data.origin_upper_left = b->origin_upper_left;
/* fallthrough */
case SpvBuiltInFragCoord:
nir_var->data.pixel_center_integer = b->pixel_center_integer;
break;
Before the patch origin_upper_left affected both builtins and
pixel_center_integer only affected FragCoord. After the patch
origin_upper_left only affects SamplePosition and pixel_center_integer
affects both variables.
This patch tries to restore the previous behaviour by changing the
code to:
case SpvBuiltInFragCoord:
nir_var->data.pixel_center_integer = b->pixel_center_integer;
/* fallthrough */
case SpvBuiltInSamplePosition:
nir_var->data.origin_upper_left = b->origin_upper_left;
break;
This change will be important for ARB_gl_spirv which is meant to
support OriginLowerLeft.
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
Fixes: 1e5b09f42f694687ac "spirv: Tidy some repeated if checks..."
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
SPIR-V allows to define the shift, offset and count operands for
shift and bitfield opcodes with a bit-size different than 32 bits,
but in NIR the opcodes have that limitation. As agreed in the
mailing list, this patch adds a conversion to 32 bits to fix this.
For more info, see:
https://lists.freedesktop.org/archives/mesa-dev/2018-April/193026.html
v2:
- src_bit_size will have zero value for variable bit-size operands (Jason).
Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
nir_builder_opcodes.h also depends on nir_intrinsics.py for generating
the system-value builders.
Reported-by: Christoph Haag <[email protected]>
Reported-by: Kenneth Graunke <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This VS system value contains if the draw command used to start the
rendering was an indexed draw command or a non-indexed one
(~0/0 respectively). Useful to calculate the gl_BaseVertex as:
(SYSTEM_VALUE_IS_INDEXED_DRAW & SYSTEM_VALUE_FIRST_VERTEX).
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
| |
To silence warnings about unhandled switch values.
Untested otherwise.
v2: move the INT/UINT8 cases after the INT/UINT16 cases, per Eric.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
With this we should have no passes in src/compiler/nir with any
dependencies on headers from core GL Mesa.
Reviewed-by: Alejandro Piñeiro <[email protected]>
|
|
|
|
| |
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Jose Maria Casanova Crespo <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Jose Maria Casanova Crespo <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For all of the OpFOrd* comparisons except OpFOrdNotEqual the hardware
should probably already return false if one of the operands is NaN so
we don’t need to have an explicit check for it. This seems to at least
work on Intel hardware. This should reduce the number of instructions
generated for the most common comparisons.
For what it’s worth, the original code to handle this was added in
e062eb6415de3a. The commit message for that says that it was to fix
some CTS tests for OpFUnord* opcodes. Even if the hardware doesn’t
handle NaNs this patch shouldn’t affect those tests. At any rate they
have since been moved out of the mustpass list. Incidentally those
tests fail on the nvidia proprietary driver so it doesn’t seem like
handling NaNs correctly is a priority.
Reviewed-by: Iago Toral Quiroga <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Patch enables use of short and unsigned short data for texture uploads,
rendering and reading of framebuffers within the restrictions specified
in GL_EXT_texture_norm16 spec.
Patch also enables those 16bit format layout qualifiers listed in
GL_NV_image_formats that depend on EXT_texture_norm16.
v2: expose extension with dummy_true
fix layout qualifier map changes (Ilia Mirkin)
v3: use _mesa_has_EXT_texture_norm16, other fixes
and cleanup (Ilia Mirkin)
v4: fix rest of the issues found
Signed-off-by: Tapani Pälli <[email protected]>
Acked-by: Ilia Mirkin <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
meson has gotten pretty smart about tracking C and C++ dependencies
(internal and external), and using the right linker. This wasn't always
the case and we created empty c++ files to force the use of the c++
linker. We don't need that any more.
Signed-off-by: Dylan Baker <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
GLSL 4.6 spec describes hex constant as:
hexadecimal-constant:
0x hexadecimal-digit
0X hexadecimal-digit
hexadecimal-constant hexadecimal-digit
Right now if you have a shader with the following structure:
#if 0X1 // or any hex number with the 0X prefix
// some code
#endif
the code between #if and #endif gets removed because the checking is performed
only for "0x" prefix which results in strtoll being called with the base 8 and
after encountering the 'X' char the strtoll returns 0. Letting strtoll detect
the base makes this limitation go away and also makes code easier to read.
From the strtoll Linux man page:
"If base is zero or 16, the string may then include a "0x" prefix, and the
number will be read in base 16; otherwise, a zero base is taken as 10 (decimal)
unless the next character is '0', in which case it is taken as 8 (octal)."
This matches the behaviour in the GLSL spec.
This patch also adds a test for uppercase hex prefix.
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
I would have thought falling out of scope would allow the gc to collect
these, but apparently it doesn't, and this hits an fd limit on macos.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106133
Fixes: db8cd8e36771eed98eb638fd0593c978c3da52a9
("glcpp/tests: Convert shell scripts to a python script")
Signed-off-by: Dylan Baker <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Tested-by: Vinson Lee <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We seem to use progress for two cases:
1) When we lowered some returns.
2) When we remove unreachable code.
If just case 2 happens we assert as state->return_flag has not
been allocated yet, but we are still trying to do insert all
predicates based on it.
This splits the concerns. We only use progress internally for case 1
and then keep track of 2 in a separate variable to indicate progress
in the return value of the pass.
This is slightly better than transforming the assert into
if (!state->return_flag) return, as the solution in this patch avoids
inserting predicates even if some other part of the might need them.
Fixes: 6e22ad6edc "nir: return early when lowering a return at the end of a function"
CC: 18.1 <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106174
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Dylan Baker <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
It looks as if the structure fields array is fully initialized below,
but in fact at least gcc in debug builds will not actually overwrite
the unused bits of bit fields.
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
base_vertex will be zero for non-indexed calls and in that case we
need vertex_id to be offset by the ‘first’ parameter instead. That is
what we get with first_vertex. This is true for both GL and Vulkan.
The freedreno driver is also setting vertex_id_zero_based on
nir_options. In order to avoid breakage this patch switches the
relevant code to handle SYSTEM_VALUE_FIRST_VERTEX so that it can
retain the same behavior.
v2: change a3xx/fd3_emit.c and a4xx/fd4_emit.c from
SYSTEM_VALUE_BASE_VERTEX to SYSTEM_VALUE_FIRST_VERTEX (Kenneth).
Reviewed-by: Ian Romanick <[email protected]>
Cc: Rob Clark <[email protected]>
Acked-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The base vertex in Vulkan is different from GL in that for non-indexed
primitives the value is taken from the firstVertex parameter instead
of being set to zero. This coincides with the new SYSTEM_VALUE_FIRST_VERTEX
instead of BASE_VERTEX.
v2 (idr): Add comment describing why SYSTEM_VALUE_FIRST_VERTEX is used
for SpvBuiltInBaseVertex. Suggested by Jason.
Reviewed-by: Ian Romanick <[email protected]> [v1]
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This VS system value will contain the value passed as <basevertex> for
indexed draw calls or the value passed as <first> for non-indexed draw
calls. It can be used to calculate the gl_VertexID as
SYSTEM_VALUE_VERTEX_ID_ZERO_BASE plus SYSTEM_VALUE_FIRST_VERTEX.
From the OpenGL 4.6 spec, 10.4 "Drawing Commands Using Vertex Arrays":
- Page 352:
"The index of any element transferred to the GL by DrawArraysOneInstance
is referred to as its vertex ID, and may be read by a vertex shader as
gl_VertexID. The vertex ID of the ith element transferred is first +
i."
- Page 355:
"The index of any element transferred to the GL by
DrawElementsOneInstance is referred to as its vertex ID, and may be read
by a vertex shader as gl_VertexID. The vertex ID of the ith element
transferred is the sum of basevertex and the value stored in the
currently bound element array buffer at offset indices + i."
Currently the gl_VertexID calculation uses SYSTEM_VALUE_BASE_VERTEX but
this will have to change when the value of gl_BaseVertex is
fixed. Currently its value is broken for non-indexed draw calls because
it must be zero but we are setting it to <first>.
v2: use SYSTEM_VALUE_FIRST_VERTEX as name for the value, instead of
SYSTEM_VALUE_BASE_VERTEX_ID (Kenneth).
v3 (idr): Rebase on Rob Clark converting nir_intrinsics.h to be
generated. Reformat commit message to 72 columns.
Reviewed-by: Neil Roberts <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
| |
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
| |
Acked-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Dylan Baker <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This ports glcpp-test.sh and glcpp-test-cr-lf.sh to a python script that
accepts arguments for each line ending type. This should allow for
better reporting to users.
v2: - Use $PYTHON2 to be consistent with other tests in mesa
Signed-off-by: Dylan Baker <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
| |
Signed-off-by: Dylan Baker <[email protected]>
|
|
|
|
| |
Signed-off-by: Dylan Baker <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This patch converts optimization-test.sh to python, in this process it
removes external shell dependencies including diff. It replaces the
python script that generates shell scripts with a python library that
generates test cases and runs them using subprocess.
v2: - use $PYTHON2 to be consistent with other tests in mesa
Signed-off-by: Dylan Baker <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Dylan Baker <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reimplements the test in python with a shell script wrapper that
allows autotools to continue to run the test without realizing that
anything has changed.
Using python has two advantages, first it's portable so this test can be
run on windows as well as Linux since it just requires python, no more
diff, pwd or sh. It's also no longer tied to autotools implementation
details, like the environment variables $srcdir and $abs_builddir,
though the autotools shell wrapper still uses those, which makes it
possible to run the test in meson.
v2: - Use $PYTHON2 in script to be consistent with other scripts in mesa
Signed-off-by: Dylan Baker <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The SPIR-V spec doesn’t specify a size requirement for these and the
equivalent functions in the GLSL spec have explicit alternatives for
doubles. Refract is a little bit more complicated due to the fact that
the final argument is always supposed to be a scalar 32- or 16- bit
float regardless of the other operands. However in practice it seems
there is a bug in glslang that makes it convert the argument to 64-bit
if you actually try to pass it a 32-bit value while the other
arguments are 64-bit. This adds an optional conversion of the final
argument in order to support any type.
These have been tested against the automatically generated tests of
glsl-4.00/execution/built-in-functions using the ARB_gl_spirv branch
which tests it with quite a large range of combinations.
The issue with glslang has been filed here:
https://github.com/KhronosGroup/glslang/issues/1279
v2: Convert the eta operand of Refract from any size in order to make
it eventually cope with 16-bit floats.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The only change neccessary is to change the type of the constant used
to compare against.
This has been tested against the arb_gpu_shader_fp64/execution/
fs-isinf-dvec tests using the ARB_gl_spirv branch.
v2: Use nir_imm_floatN_t for the constant.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
There is an existing macro that is used to choose between either a
float or a double immediate constant based on the bit size of the
first operand to the builtin. This is now changed to use the new
nir_imm_floatN_t helper function to reduce the number of places that
make this decision.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
| |
This lets you easily build float immediates just given the bit size.
If we have this single place here to handle this then it will be
easier to add support for 16-bit floats later.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
| |
Otherwise we create unused conditional return flags and things
get unnecessarily ugly fast when lowering nested functions.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
fixes warnings like this:
[184/1137] Compiling C++ object 'src/compiler/glsl/glsl@sta/lower_jumps.cpp.o'.
In file included from ../src/mesa/main/mtypes.h:48,
from ../src/compiler/glsl_types.h:149,
from ../src/compiler/glsl/lower_jumps.cpp:59:
../src/compiler/glsl/lower_jumps.cpp: In member function '{anonymous}::block_record {anonymous}::ir_lower_jumps_visitor::visit_block(exec_list*)':
../src/compiler/glsl/list.h:650:17: warning: unnecessary parentheses in declaration of 'node' [-Wparentheses]
for (__type *(__inst) = (__type *)(__list)->head_sentinel.next; \
^
../src/compiler/glsl/lower_jumps.cpp:510:7: note: in expansion of macro 'foreach_in_list'
foreach_in_list(ir_instruction, node, list) {
^~~~~~~~~~~~~~~
Signed-off-by: Marc Dietrich <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
A couple spots were missed for handling of the new INT8/UINT8 base type.
Also de-duplicate get_base_type().. get_scalar_type() had nearly the
same switch statement, with the exception that anything with base_type
that was not scalar would return error_type. So just handle that one
special case in get_scalar_type().
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
ir_binop_gequal needs to be converted to nir_op_sge when native integers
are not supported in the driver.
Otherwise it becomes no different than ir_binop_less after the
conversion.
Signed-off-by: Erico Nunes <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
| |
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|