| Commit message | Author | Age | Files | Lines |
| |
| |
Reviewed-by: Jordan Justen <[email protected]>
| |
These built-ins have two "out" parameters, which makes implementing them
efficiently with our current compiler infrastructure difficult. Instead,
implement them in terms of the existing ir_binop_mul IR (to return the
low 32 bits) and a new ir_binop_mul64 which returns the high 32 bits.
v2: Rename mul64 -> imul_high as suggested by Ken.
Reviewed-by: Kenneth Graunke <[email protected]>
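A minimal sketch of the split described above, written as plain C++ rather than GLSL IR. The names `mul_low`, `umul_high`, `imul_high`, and `umul_extended` are illustrative assumptions, not Mesa functions; the point is only that the two "out" values can be computed by two independent single-result operations.

```cpp
#include <cassert>
#include <cstdint>

// Low 32 bits of the product (the role ir_binop_mul plays for 32-bit operands).
inline uint32_t mul_low(uint32_t a, uint32_t b) {
    return static_cast<uint32_t>(static_cast<uint64_t>(a) * b);
}

// High 32 bits of the unsigned 64-bit product.
inline uint32_t umul_high(uint32_t a, uint32_t b) {
    return static_cast<uint32_t>((static_cast<uint64_t>(a) * b) >> 32);
}

// High 32 bits of the signed 64-bit product. Right-shifting a negative value
// is arithmetic on mainstream compilers (and guaranteed from C++20 onward).
inline int32_t imul_high(int32_t a, int32_t b) {
    return static_cast<int32_t>((static_cast<int64_t>(a) * b) >> 32);
}

// Hypothetical stand-in for a built-in with two "out" parameters, expressed
// purely in terms of the two single-result operations above.
inline void umul_extended(uint32_t x, uint32_t y, uint32_t &msb, uint32_t &lsb) {
    msb = umul_high(x, y);
    lsb = mul_low(x, y);
}

// Usage: 0xFFFFFFFF * 0xFFFFFFFF = 0xFFFFFFFE00000001.
inline void umul_extended_example() {
    uint32_t msb = 0, lsb = 0;
    umul_extended(0xFFFFFFFFu, 0xFFFFFFFFu, msb, lsb);
    assert(msb == 0xFFFFFFFEu && lsb == 0x00000001u);
}
```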
| |
Calculates the carry out of the addition of two values and the
borrow from subtraction respectively. Will be used in uaddCarry() and
usubBorrow() built-in implementations.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
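As a rough illustration of what the carry and borrow operations compute, here is a plain C++ sketch. The helper names are assumptions made for this example; the real operations live in the GLSL IR, not in free functions like these.

```cpp
#include <cassert>
#include <cstdint>

// Carry out of x + y: unsigned addition wraps, so a wrapped sum is smaller
// than either operand.
inline uint32_t carry(uint32_t x, uint32_t y) {
    return static_cast<uint32_t>(x + y) < x ? 1u : 0u;
}

// Borrow from x - y: a borrow is needed exactly when y is larger than x.
inline uint32_t borrow(uint32_t x, uint32_t y) {
    return x < y ? 1u : 0u;
}

// Hypothetical shapes of the built-ins that would consume these operations.
inline uint32_t uadd_carry(uint32_t x, uint32_t y, uint32_t &carry_out) {
    carry_out = carry(x, y);
    return static_cast<uint32_t>(x + y);
}

inline uint32_t usub_borrow(uint32_t x, uint32_t y, uint32_t &borrow_out) {
    borrow_out = borrow(x, y);
    return static_cast<uint32_t>(x - y);
}

inline void carry_borrow_example() {
    uint32_t c = 0, b = 0;
    assert(uadd_carry(0xFFFFFFFFu, 2u, c) == 1u && c == 1u);
    assert(usub_borrow(0u, 1u, b) == 0xFFFFFFFFu && b == 1u);
}
```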
| |
This gives the compiler the chance to inline and not export class symbols
even in the absence of LTO. Saves about 60kb on disk.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
| |
lrp() can take a scalar as a third argument, and fma() cannot.
Reviewed-by: Kenneth Graunke <[email protected]>
| |
v2: Drop frexp. Rebase on builtins rewrite.
Reviewed-by: Paul Berry <[email protected]>
| |
It's a ?: that operates per-component on vectors. Will be used in
upcoming lowering pass for ldexp and the implementation of frexp.
    csel(selector, a, b):
        per-component result = selector ? a : b
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
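The per-component behavior of csel(selector, a, b) can be sketched in C++ as follows. Modelling vectors as `std::array` is an assumption made for this example only; it says nothing about how the IR stores them.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Per-component select: result[i] = selector[i] ? a[i] : b[i].
template <typename T, std::size_t N>
std::array<T, N> csel(const std::array<bool, N> &selector,
                      const std::array<T, N> &a,
                      const std::array<T, N> &b) {
    std::array<T, N> result{};
    for (std::size_t i = 0; i < N; ++i) {
        result[i] = selector[i] ? a[i] : b[i];
    }
    return result;
}

inline void csel_example() {
    const std::array<bool, 3> sel{{true, false, true}};
    const std::array<int, 3> a{{1, 2, 3}};
    const std::array<int, 3> b{{10, 20, 30}};
    const std::array<int, 3> r = csel(sel, a, b);
    assert(r[0] == 1 && r[1] == 20 && r[2] == 3);
}
```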
| |
v2: Add constant folding support.
Reviewed-by: Paul Berry <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
| |
Now that all the places that used to generate array dereferences of
vectors have been changed to generate either ir_binop_vector_extract or
ir_triop_vector_insert (or both), remove all support for dealing with
this deprecated construct.
As an added safeguard, modify ir_validate to reject ir_dereference_array
of a vector.
v2: Convert tabs to spaces. Suggested by Eric.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
| |
The new opcode is used to generate a new vector with a single field from
the source vector replaced. This will eventually replace
ir_dereference_array of vectors in the LHS of assignments.
v2: Convert tabs to spaces. Suggested by Eric.
v3: Add constant expression handling for ir_triop_vector_insert. This
prevents the constant matrix inversion tests from regressing. Duh.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
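A small sketch of the "new vector with a single field replaced" semantics, in plain C++. The `vec4` alias and the `vector_insert` helper are hypothetical names for this example; the key property is that the source vector is left untouched and a modified copy is returned.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using vec4 = std::array<float, 4>;

// Returns a copy of v with component `index` replaced by `scalar`.
// The argument is taken by value, so the caller's vector is not modified.
inline vec4 vector_insert(vec4 v, float scalar, std::size_t index) {
    assert(index < v.size());  // out-of-range handling is not modelled here
    v[index] = scalar;
    return v;
}
```

Because the result is a fresh value rather than a write through a dereference, an assignment such as `v[i] = x` can be expressed as `v = vector_insert(v, x, i)`.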
| |
The new opcode is used to get a single field from a vector. The field
index may not be constant. This will eventually replace
ir_dereference_array of vectors. This is similar to the extractelement
instruction in LLVM IR.
http://llvm.org/docs/LangRef.html#extractelement-instruction
v2: Convert tabs to spaces. Suggested by Eric.
v3: Add array index range checking to ir_binop_vector_extract constant
expression handling. Suggested by Ken.
v4: Use CLAMP instead of MIN2(MAX2()). Suggested by Ken.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
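A sketch of extracting one field with a possibly non-constant index, including the kind of index clamping mentioned in v3/v4. Exactly what the constant-expression code clamps against is an assumption here, as are the names `vec4`, `clamp_int`, and `vector_extract`.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using vec4 = std::array<float, 4>;

// Single clamp helper, in the spirit of CLAMP rather than MIN2(MAX2()).
inline int clamp_int(int x, int lo, int hi) {
    return x < lo ? lo : (x > hi ? hi : x);
}

// Returns component `index` of the first `components` fields of v.
// Assumption: an out-of-range index is clamped into [0, components - 1].
inline float vector_extract(const vec4 &v, int components, int index) {
    assert(components >= 1 && components <= 4);
    const int clamped = clamp_int(index, 0, components - 1);
    return v[static_cast<std::size_t>(clamped)];
}
```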
| |
i965/Gen7+ and Radeon/Evergreen+ have bfm/bfi instructions to implement
bitfieldInsert() from ARB_gpu_shader5.
v2: Add ir_binop_bfm and ir_triop_bfi to st_glsl_to_tgsi.cpp.
Remove spurious temporary assignment and dereference.
Reviewed-by: Chris Forbes <[email protected]>
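The mask-then-merge idea behind a bfm/bfi pair can be sketched in C++ as below. The operand order and the point at which the inserted value is shifted are assumptions for illustration; actual hardware instructions and the IR opcodes may arrange these differently.

```cpp
#include <cassert>
#include <cstdint>

// Bitfield mask: `bits` consecutive ones starting at bit `offset`.
inline uint32_t bfm(uint32_t bits, uint32_t offset) {
    assert(bits < 32 && offset < 32);  // full-width fields are not modelled
    return ((1u << bits) - 1u) << offset;
}

// Bitfield insert: take masked bits from `insert`, the rest from `base`.
inline uint32_t bfi(uint32_t mask, uint32_t insert, uint32_t base) {
    return (insert & mask) | (base & ~mask);
}

// Hypothetical composition implementing bitfieldInsert() semantics:
// replace `bits` bits of `base` at `offset` with the low bits of `insert`.
inline uint32_t bitfield_insert(uint32_t base, uint32_t insert,
                                uint32_t offset, uint32_t bits) {
    return bfi(bfm(bits, offset), insert << offset, base);
}
```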
| |
v2: Move use of ir_binop_bfm and ir_triop_bfi to a later patch.
Reviewed-by: Chris Forbes <[email protected]>
| |
Since half of ir_validate uses assert() (the other half uses printf()
followed by abort()), there's not much use in calling it in a release
build. Cuts 6.3% off the startup time of TF2.
NOTE: This is a candidate for the stable branches.
Reviewed-by: Kenneth Graunke <[email protected]>
| |
Many GPUs have an instruction to do linear interpolation which is more
efficient than simply performing the algebra necessary (two multiplies,
an add, and a subtract).
Pattern matching or peepholing this is more desirable, but can be
tricky. By using an opcode, we can at least make shaders which use the
mix() built-in get the more efficient behavior.
Currently, all consumers lower ir_triop_lrp. Subsequent patches will
actually generate different code.
v2 [mattst88]:
- Add LRP_TO_ARITH flag to ir_to_mesa.cpp. Will be removed in a
subsequent patch and ir_triop_lrp translated directly.
v3 [mattst88]:
- Move changes from the next patch to opt_algebraic.cpp to accept
3-src operations.
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Signed-off-by: Kenneth Graunke <[email protected]>
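For reference, the algebra that a lowering of the linear-interpolation opcode expands to (two multiplies, an add, and a subtract) looks like this in scalar C++. The function name `lrp` mirrors the opcode but is just a helper written for this example.

```cpp
#include <cassert>

// Linear interpolation as plain arithmetic: x * (1 - a) + y * a.
// This matches what mix(x, y, a) computes for floating-point arguments.
inline float lrp(float x, float y, float a) {
    return x * (1.0f - a) + y * a;
}

inline void lrp_example() {
    assert(lrp(2.0f, 4.0f, 0.0f) == 2.0f);  // a = 0 selects x
    assert(lrp(2.0f, 4.0f, 1.0f) == 4.0f);  // a = 1 selects y
    assert(lrp(2.0f, 4.0f, 0.5f) == 3.0f);  // midpoint
}
```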
| |
For each function {pack,unpack}{Snorm,Unorm}4x8, add a corresponding
opcode to enum ir_expression_operation. Validate the new opcodes in
ir_validate.cpp.
Reviewed-by: Chad Versace <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
| |
For each function {pack,unpack}{Snorm,Unorm,Half}2x16, add a corresponding
opcode to enum ir_expression_operation. Validate the new opcodes in
ir_validate.cpp.
Also, add opcodes for scalarized variants of the Half2x16 functions. (The
code generator for the i965 fragment shader requires that all vector
operations be scalarized. A lowering pass, to be added later, will
scalarize the Half2x16 functions).
v2: Fix assertion message in ir_to_mesa [for idr].
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Signed-off-by: Chad Versace <[email protected]>
| |
This patch replaces the three ir_variable_mode enums:
- ir_var_in
- ir_var_out
- ir_var_inout
with the following five:
- ir_var_shader_in
- ir_var_shader_out
- ir_var_function_in
- ir_var_function_out
- ir_var_function_inout
This eliminates a frustrating ambiguity: it used to be impossible to
tell whether an ir_var_{in,out} variable was a shader in/out or a
function in/out without seeing where the variable was declared in the
IR. This complicated some optimization and lowering passes, and would
have become a problem for implementing varying structs.
In the lisp-style serialization of GLSL IR to strings performed by
ir_print_visitor.cpp and ir_reader.cpp, I've retained the names "in",
"out", and "inout" for function parameters, to avoid introducing code
churn to the src/glsl/builtins/ir/ directory.
Note: a couple of comments in the code seemed to indicate that we were
planning for a possible future in which geometry shaders could have
shader-scope inout variables. Our GLSL grammar rejects shader-scope
inout variables, and I've been unable to find any evidence in the GLSL
standards documents (or extensions) that this will ever be allowed, so
I've eliminated these comments.
Reviewed-by: Carl Worth <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
| |
Reported by coverity scan.
v2: fix second case
Note: This is a candidate for stable branches.
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
| |
Drivers will probably want to be able to take UBO references in a
shader like:
    uniform ubo1 {
        float a;
        float b;
        float c;
        float d;
    };
    void main() {
        gl_FragColor = vec4(a, b, c, d);
    }
and generate a single aligned vec4 load out of the UBO. For intel,
this involves recognizing the shared offset of the aligned loads and
CSEing them out. Obviously that involves breaking things down to
loads from an offset from a particular UBO first. Thus, the driver
doesn't want to see
    variable_ref(ir_variable("a")),
and even more so does it not want to see
    array_ref(record_ref(variable_ref(ir_variable("a")),
                         "field1"), variable_ref(ir_variable("i"))).
where a.field1[i] is a row_major matrix.
Instead, we're going to make a lowering pass to break UBO references
down to expressions that are obvious to codegen, and amenable to
merging through CSE.
v2: Fix some partial thoughts in the ir_binop comment (review by Kenneth)
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
| |
Previously, we performed conversions from float->uint by a two step
process: float->int->uint. However, on platforms that use saturating
conversions (e.g. i965), this didn't work, because if the source value
was larger than the maximum representable int (0x7fffffff), then
converting it to an int would clamp it to 0x7fffffff.
This patch just adds the new opcode; further patches will adapt
optimization passes and back-ends to use it, and then finally the
ast_to_hir logic will be modified to emit the new opcode.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
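The failure mode of the two-step conversion can be reproduced in C++ by emulating saturating conversions explicitly. All helper names below are hypothetical, and the NaN and negative-input handling are assumptions made to keep the sketch well-defined.

```cpp
#include <cassert>
#include <cstdint>

// Emulated saturating float->int conversion (as on hardware that clamps).
inline int32_t saturating_f2i(float f) {
    if (f != f) return 0;                       // NaN: assumed to yield 0
    if (f >= 2147483648.0f) return INT32_MAX;   // clamps to 0x7fffffff
    if (f < -2147483648.0f) return INT32_MIN;
    return static_cast<int32_t>(f);
}

// Emulated direct float->uint conversion, the job of the new opcode.
inline uint32_t saturating_f2u(float f) {
    if (!(f > 0.0f)) return 0u;                 // negatives, zero, and NaN
    if (f >= 4294967296.0f) return UINT32_MAX;
    return static_cast<uint32_t>(f);
}

// The old two-step path: float -> int -> uint.
inline uint32_t f2u_via_int(float f) {
    return static_cast<uint32_t>(saturating_f2i(f));
}

inline void f2u_example() {
    // 3000000000 exceeds 0x7fffffff, so the two-step path loses the value.
    assert(f2u_via_int(3000000000.0f) == 0x7fffffffu);
    assert(saturating_f2u(3000000000.0f) == 3000000000u);
}
```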
| |
Signed-off-by: Olivier Galibert <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
| |
Variables have types, expression trees have types, but statements don't.
Rather than have a nonsensical field that stays NULL in the base class,
just move it to where it makes sense.
Fix up a few places that lazily used ir_instruction even though they
actually knew the particular subclass.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
| |
Previously, set_callee() performed some assertions about the type of the
ir_call; protecting the bare pointer ensured these checks would be run.
However, ir_call no longer has a type, so the getter and setter methods
don't actually do anything useful. Remove them in favor of accessing
callee directly, as is done with most other fields in our IR.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
| |
Aside from ir_call, our IR is cleanly split into two classes:
- Statements (typeless; used for side effects, control flow)
- Values (deeply nestable, pure, typed expression trees)
Unfortunately, ir_call confused all this:
- For void functions, we placed ir_call directly in the instruction
stream, treating it as an untyped statement. Yet, it was a subclass
of ir_rvalue, and no other ir_rvalue could be used in this way.
- For functions with a return value, ir_call could be placed in
arbitrary expression trees. While this fit naturally with the source
language, it meant that expressions might not be pure, making it
difficult to transform and optimize them. To combat this, we always
emitted ir_call directly in the RHS of an ir_assignment, only using
a temporary variable in expression trees. Many passes relied on this
assumption; the acos and atan built-ins violated it.
This patch makes ir_call a statement (ir_instruction) rather than a
value (ir_rvalue). Non-void calls now take an ir_dereference of a
variable, and store the return value there---effectively a call and
assignment rolled into one. They cannot be embedded in expressions.
All expression trees are now pure, without exception.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
| |
ir_validate.cpp: In member function ‘virtual ir_visitor_status ir_validate::visit_leave(ir_swizzle*)’:
ir_validate.cpp:458:66: warning: narrowing conversion of ‘ir->ir_swizzle::mask.ir_swizzle_mask::x’ from ‘unsigned int’ to ‘int’ inside { } is ill-formed in C++11 [-Wnarrowing]
ir_validate.cpp:458:66: warning: narrowing conversion of ‘ir->ir_swizzle::mask.ir_swizzle_mask::y’ from ‘unsigned int’ to ‘int’ inside { } is ill-formed in C++11 [-Wnarrowing]
ir_validate.cpp:458:66: warning: narrowing conversion of ‘ir->ir_swizzle::mask.ir_swizzle_mask::z’ from ‘unsigned int’ to ‘int’ inside { } is ill-formed in C++11 [-Wnarrowing]
ir_validate.cpp:458:66: warning: narrowing conversion of ‘ir->ir_swizzle::mask.ir_swizzle_mask::w’ from ‘unsigned int’ to ‘int’ inside { } is ill-formed in C++11 [-Wnarrowing]
Signed-off-by: Dave Airlie <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
| |
| |
This requires tracking a couple extra fields in ir_variable:
* A flag to indicate that a variable had an initializer.
* For non-const variables, a field to track the constant value of the
variable's initializer.
For variables with non-constant initializers, ir_variable::has_initializer
will be true, but ir_variable::constant_initializer will be NULL. The
linker can use the values of these fields to check adherence to the
GLSL 4.20 rules for shared global variables:
"If a shared global has multiple initializers, the initializers
must all be constant expressions, and they must all have the same
value. Otherwise, a link error will result. (A shared global
having only one initializer does not require that initializer to
be a constant expression.)"
Prior to 4.20, the GLSL spec simply said that initializers must have
the same value. In the case of non-constant initializers, this was
impossible to determine. As a result, no vendor actually implemented
that behavior. The 4.20 behavior matches the behavior of NVIDIA's
shipping implementations.
NOTE: This is a candidate for the 7.11 branch. This patch also needs
the preceding patch "glsl: Refactor generate_ARB_draw_buffers_variables
to use add_builtin_constant"
Signed-off-by: Ian Romanick <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=34687
Reviewed-by: Kenneth Graunke <[email protected]>
Acked-by: Paul Berry <[email protected]>
| |
| |
There is no ir_hierarchical_visitor::visit(ir_if *) method, since ir_if
is not a leaf node. Instead, there are visit_enter and visit_leave
methods. Use visit_enter arbitrarily (either would work fine, though
visit_enter will catch errors sooner).
Found thanks to a warning emitted by Clang.
Reviewed-by: Ian Romanick <[email protected]>
Signed-off-by: Kenneth Graunke <[email protected]>
| |
This patch extends ir_validate.cpp to check the following
characteristics of each ir_call:
- The number of actual parameters must match the number of formal
parameters in the signature.
- The type of each actual parameter must match the type of the
corresponding formal parameter in the signature.
- Each "out" or "inout" actual parameter must be an lvalue.
Reviewed-by: Chad Versace <[email protected]>
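The three checks listed above can be sketched as a standalone C++ function. The `Formal` and `Actual` structs and the string-based type comparison are simplifications invented for this example; the real validator walks IR nodes and compares glsl_type pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum class ParamMode { In, Out, InOut };

struct Formal { std::string type; ParamMode mode; };   // signature parameter
struct Actual { std::string type; bool is_lvalue; };   // call-site argument

// Returns true when the call satisfies all three rules.
inline bool validate_call(const std::vector<Formal> &formals,
                          const std::vector<Actual> &actuals) {
    // Rule 1: parameter counts must match.
    if (formals.size() != actuals.size()) return false;
    for (std::size_t i = 0; i < formals.size(); ++i) {
        // Rule 2: each actual's type must match the formal's type.
        if (formals[i].type != actuals[i].type) return false;
        // Rule 3: "out" and "inout" actuals must be lvalues.
        if (formals[i].mode != ParamMode::In && !actuals[i].is_lvalue) return false;
    }
    return true;
}
```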
| |
Reverts commit f41e1db3273a31285360241c4342f0a403ee0b03
"fix conversions from uint to bool and from float/bool to uint"
f2i and b2i should not accept uint types. Use i2u and u2i.
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
| |
These are necessary to handle int/uint constructor conversions. For
example, the following code currently results in a type mismatch:
    int x = 7;
    uint y = uint(x);
In particular, uint(x) still has type int.
This commit simply adds the new operations; it does not generate them,
nor does it add backend support for them.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
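A minimal C++ sketch of what int/uint conversion operations do, assuming they preserve the 32-bit pattern and only change the type. The helper names `i2u` and `u2i` echo the operations but are plain functions written for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// int -> uint: conversion to unsigned is defined modulo 2^32, which keeps
// the two's-complement bit pattern.
inline uint32_t i2u(int32_t x) {
    return static_cast<uint32_t>(x);
}

// uint -> int: copy the bits so the result is well-defined for all inputs.
inline int32_t u2i(uint32_t x) {
    int32_t result;
    std::memcpy(&result, &x, sizeof(result));
    return result;
}

inline void int_uint_example() {
    assert(i2u(7) == 7u);
    assert(i2u(-1) == 0xFFFFFFFFu);
    assert(u2i(0xFFFFFFFFu) == -1);
}
```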
| |
Reviewed-by: Kenneth Graunke <[email protected]>
Signed-off-by: Ian Romanick <[email protected]>
| |
| |
| |
The signature list in a function must contain only ir_function_signature nodes.
The target of an ir_call must be an ir_function_signature.
These were added while trying to debug Mesa bugzilla #34203.
| |
The return type can be void, and this is the case where a `_ret_val'
variable should not be declared.
| |
| |
Standard library functions in C++ are in the std namespace. When using
C++-style header files for the standard library, some compilers, such as
Sun Studio, provide symbols only for the std namespace and not for the
global namespace.
This patch adds using statements for standard library functions. Another
option could have been to prepend standard library function calls with
'std::'.
This patch fixes several compilation errors with Sun Studio.
| |
| |
The vector operator collects 2, 3, or 4 scalar components into a
vector. Doing this has several advantages. First, it will make
ud-chain tracking for components of vectors much easier. Second, a
later optimization pass could collect scalars into vectors to allow
generation of SWZ instructions (or similar as operands to other
instructions on R200 and i915). It also enables an easy way to
generate IR for SWZ instructions in the ARB_vertex_program assembler.
| |
They operate just like ir_unop_sin and ir_unop_cos except that they
expect their inputs to be limited to the range [-pi, pi]. Several
GPUs require this limited range for their sine and cosine
instructions, so having these as operations (along with a to-be-written
lowering pass) helps these architectures.
These new operations also match the semantics of the
GL_ARB_fragment_program SCS instruction. Having these as operations
helps in generating GLSL IR directly from assembly fragment programs.
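One way a lowering pass could bring an arbitrary angle into the limited range before using a reduced-range sine is sketched below. This is an assumption about how such a pass might work, not a description of the to-be-written pass; `std::sin` stands in for the hardware instruction, and both function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Maps x to an equivalent angle in [-pi, pi) by subtracting the nearest
// multiple of 2*pi.
inline float reduce_to_pi_range(float x) {
    const float two_pi = 6.2831853071795865f;
    return x - two_pi * std::floor(x / two_pi + 0.5f);
}

// Full-range sine built from a sine that only accepts reduced inputs.
inline float sin_via_reduced(float x) {
    return std::sin(reduce_to_pi_range(x));
}
```

For very large inputs the subtraction loses precision in single-precision floats, which is one reason a real lowering pass might choose a different reduction.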
| |
| |
| |
In ir_validate::visit_leave(), the cases for
- ir_binop_bit_and
- ir_binop_bit_xor
- ir_binop_bit_or
were incorrect. It was incorrectly asserted that both operands must be the
same type, when in fact one may be scalar and the other a vector. It was also
incorrectly asserted that the resultant type was the type of the left operand,
which in fact does not hold when the left operand is a scalar and the right
operand is a vector.
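The corrected typing rule can be sketched as a small C++ check. The `IntType` struct is invented for this example, and the requirement that both operands share a base type is an assumption drawn from the GLSL rules for bitwise operators rather than from the text above.

```cpp
#include <cassert>

enum class BaseType { Int, Uint };

// Simplified integer type: base type plus component count (1 = scalar).
struct IntType {
    BaseType base;
    unsigned components;
};

// Computes the result type of a bitwise and/xor/or. Returns false when the
// operand types are incompatible.
inline bool bit_logic_result_type(const IntType &lhs, const IntType &rhs,
                                  IntType &result) {
    if (lhs.base != rhs.base) return false;  // assumed: base types must match
    const bool lhs_scalar = lhs.components == 1;
    const bool rhs_scalar = rhs.components == 1;
    // Two vectors must have the same size; a scalar pairs with anything.
    if (!lhs_scalar && !rhs_scalar && lhs.components != rhs.components) return false;
    // The result is the vector operand's type, which may be the right one.
    result = lhs_scalar ? rhs : lhs;
    return true;
}
```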
| |
Implement by adding the following cases to ast_expression::hir():
- ast_lshift
- ast_rshift
Also, implement ir validation for the new operators by adding the following
cases to ir_validate::visit_leave():
- ir_binop_lshift
- ir_binop_rshift
| |
Caught the bug in the previous commit.
| |
It turns out that most people new to this IR are surprised when an
assignment to (say) 3 components on the LHS takes 4 components on the
RHS. It also makes for quite strange IR output:
    (assign (constant bool (1)) (x) (var_ref color) (swiz x (var_ref v) ))
    (assign (constant bool (1)) (y) (var_ref color) (swiz yy (var_ref v) ))
    (assign (constant bool (1)) (z) (var_ref color) (swiz zzz (var_ref v) ))
But even worse, even we get it wrong, as shown by this line of our
current step(float, vec4):
    (assign (constant bool (1)) (w)
      (var_ref t)
      (expression float b2f (expression bool >=
        (swiz w (var_ref x))(var_ref edge))))
where we try to assign a float to the writemasked-out x channel and
don't supply anything for the actual w channel we're writing. Drivers
right now just get lucky since ir_to_mesa spams the float value across
all the source channels of a vec4.
Instead, the RHS will now have a number of components equal to the
number of components actually being written. Hopefully this confuses
everyone less, and it also makes codegen for a scalar target simpler.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
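The new rule, that the RHS has exactly as many components as the write mask enables, can be expressed as a short C++ check. The bit layout (bit 0 = x through bit 3 = w) and both function names are assumptions for this sketch.

```cpp
#include <cassert>

// Counts the channels enabled in a 4-bit write mask (bit 0 = x ... bit 3 = w).
inline unsigned components_written(unsigned write_mask) {
    unsigned count = 0;
    for (unsigned bit = 0; bit < 4; ++bit) {
        count += (write_mask >> bit) & 1u;
    }
    return count;
}

// Under the new rule, the RHS must supply one component per written channel.
inline bool assignment_is_well_formed(unsigned write_mask, unsigned rhs_components) {
    return components_written(write_mask) == rhs_components;
}

inline void write_mask_example() {
    const unsigned w_only = 1u << 3;
    assert(assignment_is_well_formed(w_only, 1));  // float RHS written to .w
    assert(!assignment_is_well_formed(0x7u, 4));   // .xyz no longer takes a vec4
}
```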