| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This only operates on constant/uniform values for now, because otherwise I'd
have to deal with killing my available CSE entries when assignments happen,
and getting even this working in the tree ir was painful enough.
As is, it has the following effect in shader-db:
total instructions in shared programs: 1524077 -> 1521964 (-0.14%)
instructions in affected programs: 50629 -> 48516 (-4.17%)
GAINED: 0
LOST: 0
And, for tropics, that accounts for most of the effect, the FPS
improvement is 11.67% +/- 0.72% (n=3).
v2: Use read_only field of the variable, manually check the lod_info union
members, use get_num_operands(), rename cse_operands_visitor to
is_cse_candidate_visitor, move all is-a-candidate logic to that
function, and call it before checking for CSE on a given rvalue, more
comments, use private keyword.
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I made this a function (instead of a method of ir_variable) because it
made the change set smaller, and I expect that there will be an overload
that takes an ir_var_mode enum. Having both functions used the same way
seemed better.
v2: Add missing case for ir_var_system_value.
v3: Change the ir_var_mode_count case to just break. Move the assertion
and the return outside the switch-statment. In the unlikely event that
var->mode is an invalid value other than ir_var_mode_count, the
assertion will still fire, and in release builds we won't wind up
returning a garbage pointer. Suggested by Paul.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix the linker to deal with intrinsic functions which are undefined
all the way down to the driver back-end, and introduce intrinsic
definition helpers in the built-in generator.
We still need to figure out what kind of interface we want for drivers
to communicate to the GLSL front-end which of the supported intrinsics
should use a default GLSL implementation and which should use a
hardware-specific override. As there's no default GLSL implementation
for atomic ops, this seems like something we can worry about later on.
Reviewed-by: Ian Romanick <[email protected]>
v2: Define local helper function to generate ir_call nodes in the
builtin generator.
|
|
|
|
|
|
|
|
|
| |
v2: Fix GLSL version in which the type became available. Add
contains_atomic() convenience method. Split off atomic counter
comparison error checking to a separate patch that will handle all
opaque types. Include new ir_variable fields for atomic types.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
These variables will need to be treated specially by
program_resource_visitor, so that they can be addressed through the
API using their interface block name (and array index, for interface
block arrays).
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Future patches will need to call this function when there isn't an
ir_varible present to refer to.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
| |
This will be used by future patches to change an ir_variable's
interface type when the gl_PerVertex built-in interface block is
redeclared.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We were already setting the array size of unsized arrays that appeared
inside unnamed interface blocks, but we weren't updating
ir_variable::interface_type to reflect the new array size, causing
bogus link errors.
This patch causes array_sizing_visitor to keep track of all the
unnamed interface types it sees, and the ir_variables corresponding to
each one. After the visitor runs, a new function,
fixup_unnamed_interface_types(), adjusts each unnamed interface type
to correctly correspond with the array sizes in the ir_variables.
Fixes piglit tests:
- spec/glsl-1.50/execution/unsized-in-unnamed-interface-block-gs
- spec/glsl-1.50/execution/unsized-in-unnamed-interface-block-multiple
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Unsized arrays appearing inside named interface blocks now get a
proper size assigned by the array_sizing_visitor.
Fixes piglit tests:
- spec/glsl-1.50/execution/unsized-in-named-interface-block
- spec/glsl-1.50/execution/unsized-in-named-interface-block-gs
- spec/glsl-1.50/linker/unsized-in-named-interface-block
- spec/glsl-1.50/linker/unsized-in-named-interface-block-gs
- spec/glsl-1.50/linker/unsized-in-unnamed-interface-block-gs (*)
(*) is fixed by dumb luck--support for unsized arrays in unnamed
interface blocks will come in a later patch.
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
For interface blocks that contain arrays, this field will contain the
maximum element of each contained array that is accessed by the
shader. This is a first step toward supporting unsized arrays in
interface blocks.
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
| |
In a future patch, this will allow us to enforce invariants when the
interface type is updated.
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
These built-ins have two "out" parameters, which makes implementing them
efficiently with our current compiler infrastructure difficult. Instead,
implement them in terms of the existing ir_binop_mul IR (to return the
low 32-bits) and a new ir_binop_mul64 which returns the high 32-bits.
v2: Rename mul64 -> imul_high as suggested by Ken.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Calculates the carry out of the addition of two values and the
borrow from subtraction respectively. Will be used in uaddCarry() and
usubBorrow() built-in implementations.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
ARB_gpu_shader5 introduces new variants of textureGather* which have an
explicit component selector, rather than relying purely on the sampler's
swizzle state.
This patch adds the GLSL plumbing for the extra parameter.
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
V2 [Chris Forbes]:
- Add new pattern, fixup parameter reading.
V3: Rebase onto new builtins machinery
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
| |
Note the parameter name change in the int version of ir_constant, to
avoid the conflict with the loop iterator.
v2: Make analogous change to builtin_builder::imm().
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
| |
v2: Drop frexp. Rebase on builtins rewrite.
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
It's a ?: that operates per-component on vectors. Will be used in
upcoming lowering pass for ldexp and the implementation of frexp.
csel(selector, a, b):
per-component result = selector ? a : b
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
builtin_info was originally going to be a structure containing a bunch
of information, but after various rewrites, it turned into a boolean
availability predicate.
builtin_avail is a better name than builtin_info, since it doesn't
store any information other than availability.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This creates a new replacement for the existing built-in function code.
The new module lives in builtin_functions.cpp (not builtin_function.cpp)
and exists in parallel with the existing system. It isn't used yet.
The new built-in function code takes a significantly different approach:
Instead of implementing built-ins via printed IR, build time scripts,
and run time parsing, we now implement them directly in C++, using
ir_builder. This translates to faster load times, and a much less
complex build system.
It also takes a different approach to built-in availability: each
signature now stores a boolean predicate, which makes it easy to
construct arbitrary expressions based on _mesa_glsl_parse_state's
fields. This is much more flexible than the old system, and also
easier to use.
Built-ins are also now stored in a single gl_shader object, rather
than being spread out across a number of shaders that need to be linked.
When searching for a matching prototype, we simply consult the
availability predicate. This also simplifies the code.
v2: Incorporate Matt Turner's feedback: use the new fma() function rather
than expr(). Don't expose textureQueryLOD() in GLSL 4.00 (since it
was renamed to textureQueryLod()). Also correct some #undefs.
v3: Incorporate Paul Berry's feedback: rename legacy to compatibility;
add comments to explain a few things; fix uvec availability; include
shaderobj.h instead of repeating the _mesa_new_shader prototype.
v4: Fix lack of TEX_PROJECT on textureProjGrad[Offset] (caught by oglc).
Add an out_var convenience function (more feedback by Matt Turner).
v5: Rework availability predicates for Lod functions. They were broken.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Enthusiastically-acked-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
We already have ir_expression constructors for unary and binary
operations, which automatically infer the type based on the opcode and
operand types.
These are convenient and also required for ir_builder support.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This isn't strictly necessary, since creators of ir_texture objects
should set LOD when relevant. However, it's nice to have a NULL pointer
in case they forget.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
During compilation, we'll use this to determine built-in availability.
The plan is to have a single shader containing every built-in in every
version of the language, but filter out the ones that aren't actually
available to the shader being compiled.
At link time, we don't actually need this filtering capability: we've
already imported prototypes for every built-in that the shader actually
calls, and they're flagged as is_builtin(). The linker doesn't import
any additional prototypes, so it won't pull in any unavailable
built-ins. When resolving prototypes to function definitions, the
linker ensures the values of is_builtin() match, which means that a
shader can't trick the linker into importing the body of an unavailable
built-in by defining a suspiciously similar prototype.
In other words, during linking, we can just pass in NULL. It will work
out fine.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
We can simply call the stored predicate function. If state is NULL,
just report that the function is available.
v2: Add a comment (requested by Paul Berry).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
A signature is a built-in if and only if builtin_info != NULL, so we
don't actually need a separate flag bit. Making a boolean-valued
method allows existing code to ask the same question while not worrying
about the internal representation.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For the upcoming built-in function rewrite, we'll need to be able to
answer "Is this built-in function signature available?".
This is actually a somewhat complex question, since it depends on the
language version, GLSL vs. GLSL ES, enabled extensions, and the current
shader stage.
Storing such a set of constraints in a structure would be painful, so
instead we store a function pointer. When creating a signature, we
simply point to a predicate that inspects _mesa_glsl_parse_state and
answers whether the signature is available in the current shader.
Unfortunately, IR reader doesn't actually know when built-in functions
are available, so this patch makes it lie and say that they're always
present. This allows us to hook up the new functionality; it just won't
be useful until real data is populated. In the meantime, the existing
profile mechanism ensures built-ins are available in the right places.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
| |
v2: Add constant folding support.
Reviewed-by: Paul Berry <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit adds all of the parsing and semantics for GLSL 150 style
geometry shaders.
v2 (Paul Berry <[email protected]>): Add a few missing calls to
get_pipeline_stage(). Fix some signed/unsigned comparison warnings.
Fix handling of NULL consumer in assign_varying_locations().
v3 (Bryan Cain <[email protected]>): fix indexing order of 2D
arrays. Also, allow interpolation qualifiers in geometry shaders.
v4 (Paul Berry <[email protected]>): Eliminate
get_pipeline_stage()--it is no longer needed thanks to 030ca23 (mesa:
renumber shader indices according to their placement in pipeline).
Remove 2D stuff. Move vertices_per_prim() to ir.h, so that it will be
accessible from outside the linker. Remove
inject_num_vertices_visitor. Rework for GLSL 1.50.
Reviewed-by: Ian Romanick <[email protected]>
v5 (Paul Berry <[email protected]>): Split out
do_set_program_inouts() argument refactoring to a separate patch.
Move geom_array_resizing_visitor to later in the series.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These correspond to the EmitVertex and EndPrimitive functions in GLSL.
v2 (Paul Berry <[email protected]>): Add stub implementations of
new pure visitor functions to i965's vec4_visitor and fs_visitor
classes.
v3 (Paul Berry <[email protected]>): Rename classes to be more
consistent with the names used in the GL spec.
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
This will allow us to add geometry shader support without having to
add another boolean argument.
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
These are not used yet, but they exist and are copied appropriately.
v2: Add an explicit "int binding" variable rather than reusing
constant_value, as suggested by Paul Berry.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
No more forgetting to #include "ir_print_visitor.h" when doing temporary
debug code, or forgetting and leaving it in after removing your temporary
debug code. Also, available from C code so you don't need to move the
caller to C++ just to call it (see also: ir_to_mesa.cpp).
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The new opcode is used to generate a new vector with a single field from
the source vector replaced. This will eventually replace
ir_dereference_array of vectors in the LHS of assignments.
v2: Convert tabs to spaces. Suggested by Eric.
v3: Add constant expression handling for ir_triop_vector_insert. This
prevents the constant matrix inversion tests from regressing. Duh.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The new opcode is used to get a single field from a vector. The field
index may not be constant. This will eventually replace
ir_dereference_array of vectors. This is similar to the extractelement
instruction in LLVM IR.
http://llvm.org/docs/LangRef.html#extractelement-instruction
v2: Convert tabs to spaces. Suggested by Eric.
v3: Add array index range checking to ir_binop_vector_extract constant
expression handling. Suggested by Ken.
v4: Use CLAMP instead of MIN2(MAX2()). Suggested by Ken.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
i965/Gen7+ and Radeon/Evergreen+ have bfm/bfi instructions to implement
bitfieldInsert() from ARB_gpu_shader5.
v2: Add ir_binop_bfm and ir_triop_bfi to st_glsl_to_tgsi.cpp.
Remove spurious temporary assignment and dereference.
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
| |
v2: Move use of ir_binop_bfm and ir_triop_bfi to a later patch.
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
ir_print_visitor::visit(ir_variable *)'s mode[] array needs to match
the declaration of the enum ir_variable_mode. It's hard to verify
that at compile time, but at least we can use a STATIC_ASSERT to make
sure it's the right size.
This required adding ir_var_mode_count to the enum.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
v2 [mattst88]:
- Rebase.
- #define GL_ARB_texture_query_lod to 1.
- Remove comma after ir_lod in ir.h for MSVC.
- Handled ir_lod in ir_hv_accept.cpp, ir_rvalue_visitor.cpp,
opt_tree_grafting.cpp.
- Rename textureQueryLOD to textureQueryLod, see
https://www.khronos.org/bugzilla/show_bug.cgi?id=821
- Fix ir_reader of (lod ...).
v3 [mattst88]:
- Rename textureQueryLod to textureQueryLOD, pending resolution of
Khronos 821.
- Add ir_lod case to ir_to_mesa.cpp.
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch makes the following search-and-replace changes:
gl_frag_attrib -> gl_varying_slot
FRAG_ATTRIB_* -> VARYING_SLOT_*
FRAG_BIT_* -> VARYING_BIT_*
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Tested-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This patch makes the following search-and-replace changes:
gl_vert_result -> gl_varying_slot
VERT_RESULT_* -> VARYING_SLOT_*
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Tested-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
V2: - emit `sample` parameter properly for multisample texelFetch()
- fix spurious whitespace change
- introduce a new opcode ir_txf_ms rather than overloading the
existing ir_txf further. This makes doing the right thing in
the driver somewhat simpler.
V3: - fix weird whitespace
V4: - don't forget to include the new opcode in tex_opcode_strs[]
(thanks Kenneth for spotting this)
Signed-off-by: Chris Forbes <[email protected]>
[V2] Reviewed-by: Eric Anholt <[email protected]>
[V2] Reviewed-by: Paul Berry <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Many GPUs have an instruction to do linear interpolation which is more
efficient than simply performing the algebra necessary (two multiplies,
an add, and a subtract).
Pattern matching or peepholing this is more desirable, but can be
tricky. By using an opcode, we can at least make shaders which use the
mix() built-in get the more efficient behavior.
Currently, all consumers lower ir_triop_lrp. Subsequent patches will
actually generate different code.
v2 [mattst88]:
- Add LRP_TO_ARITH flag to ir_to_mesa.cpp. Will be removed in a
subsequent patch and ir_triop_lrp translated directly.
v3 [mattst88]:
- Move changes from the next patch to opt_algebraic.cpp to accept
3-src operations.
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Signed-off-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, we had separate constructors for one, two, and four operand
expressions. This patch consolidates them into a single constructor
which uses NULL default parameters.
The unary and binary operator constructors had assertions to verify that
the caller supplied the correct number of operands for the expression,
but the four-operand version did not. Since get_num_operands for
ir_quadop_vector returns the number of vector_elements, we can safely
add that without breaking the semantics of ir_quadop_vector.
This also paves the way for expressions with three operands. Currently,
none can be constructed since get_num_operands() never returns 3.
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Signed-off-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
Fixes uninitialized pointer field defect reported by Coverity.
Signed-off-by: Vinson Lee <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
For each function {pack,unpack}{Snorm,Unorm}4x8, add a corresponding
opcode to enum ir_expression_operation. Validate the new opcodes in
ir_validate.cpp.
Reviewed-by: Chad Versace <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
| |
v2: A previous patch contained a spurious hunk that removed an
assignment to ir_variable::uniform_block. That hunk was moved to this
patch. Suggested by Carl Worth.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|