| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This eliminates built-in varyings such as gl_Color, gl_SecondaryColor,
gl_TexCoord, and gl_FogFragCoord if they are unused by the next stage or
not written at all (e.g. gl_TexCoord elements). The gl_TexCoord array is
broken down into separate vec4s if needed.
v2: - use a switch statement in varying_info_visitor::visit(ir_variable*)
- use snprintf
- disable the optimization for GLES2
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
| |
We counted even the varyings which were later eliminated, which was
suboptimal.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This ensures that inter-shader outputs and inputs are properly eliminated
across 3 or more shader stages. The behavior is unchanged with 2 or less
shader stages.
For example, elimination of unused FS inputs causes elimination of matching
GS outputs, which causes elimination of the GS inputs that were needed for
evaluation of the eliminated GS outputs, which causes elimination of
matching VS outputs. An unused FS input is all that's needed to trigger
this chain reaction.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
| |
See my explanation in mtypes.h.
v2: don't do this in gallium
v3: also updated the comment at the gl_shader_type definition
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
| |
We were duplicating this code all over the place, and they all would need
updating for the next set of shader targets.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We were counting uniforms located in UBOs against the default uniform
block limit, while not doing any counting against the specific combined
limit.
Note that I couldn't quite find justification for the way I did this, but
I think it's the only sensible thing: The spec talks about components, so
each "float" in a std140 block would count as 1 component and a "vec4"
would count as 4, though they occupy the same amount of space. Since GPU
limits on uniform buffer loads are surely going to be about the size of
the blocks, I just counted them that way.
Fixes link failures in piglit
arb_uniform_buffer_object/maxuniformblocksize when ported to geometry
shaders on Paul's GS branch, since in that case the max block size is
bigger than the default uniform block component limit.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Verify that interface blocks match when linking separate shader
stages into a program.
Fixes piglit glsl-1.50 tests:
* linker/interface-blocks-vs-fs-member-count-mismatch.shader_test
* linker/interface-blocks-vs-fs-member-order-mismatch.shader_test
Signed-off-by: Kenneth Graunke <[email protected]>
Signed-off-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Verify that interface blocks match when combining compilation
units at the same stage. (For example, when merging all vertex
shaders.)
Fixes piglit glsl-1.50 test:
* linker/interface-blocks-multiple-vs-member-count-mismatch.shader_test
v5 (Ken): Rename to link_interface_blocks.cpp and drop the separate .h
file for consistency with other linker code. Remove "ok" variable.
Fold cross_validate_interface_blocks into its caller.
Signed-off-by: Jordan Justen <[email protected]>
Signed-off-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
Convert interface blocks with instance names into flat
interface blocks without an instance name.
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
do_common_optimization may need to make choices about whether to emit
certain kinds of instructions. gl_context::ShaderCompilerOptions
contains exactly that information, so it makes sense to pass it in.
Rather than passing the whole array, pass the structure for the stage
that's currently being worked on.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Const.MaxTextureImageUnits -> Const.FragmentProgram.MaxTextureImageUnits
Const.MaxVertexTextureImageUnits -> Const.VertexProgram.MaxTextureImageUnits
etc.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Since half of ir_validate uses asserts() (the other using printf() then
abort()), there's not much use to calling it in a release build. Cuts
6.3% of the startup time of TF2.
NOTE: This is a candidate for the stable branches.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch makes the following search-and-replace changes:
gl_frag_attrib -> gl_varying_slot
FRAG_ATTRIB_* -> VARYING_SLOT_*
FRAG_BIT_* -> VARYING_BIT_*
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Tested-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This patch makes the following search-and-replace changes:
gl_vert_result -> gl_varying_slot
VERT_RESULT_* -> VARYING_SLOT_*
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Tested-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The parsing logic is moved to a new function in the GLSL module,
parse_program_resource_name(). This name was chosen because it should
eventually be useful for handling everything that OpenGL 4.3 calls
"program resources" (e.g. uniforms, vertex inputs, fragment outputs,
and transform feedback varyings).
Future patches will make use of this function for linking transform
feedback varyings.
NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use the function added in the previous commit.
This temporarily causes gles3conform
uniform_buffer_object_index_of_not_active_block,
uniform_buffer_object_inherit_and_override_layouts, and
uniform_buffer_object_repeat_global_scope_layouts to assertion fail.
This is fixed in the next commit.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The way a variable is tested for this property is about to change, and
this makes the code easier to modify.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Carl Worth <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch replaces the three ir_variable_mode enums:
- ir_var_in
- ir_var_out
- ir_var_inout
with the following five:
- ir_var_shader_in
- ir_var_shader_out
- ir_var_function_in
- ir_var_function_out
- ir_var_function_inout
This eliminates a frustrating ambiguity: it used to be impossible to
tell whether an ir_var_{in,out} variable was a shader in/out or a
function in/out without seeing where the variable was declared in the
IR. This complicated some optimization and lowering passes, and would
have become a problem for implementing varying structs.
In the lisp-style serialization of GLSL IR to strings performed by
ir_print_visitor.cpp and ir_reader.cpp, I've retained the names "in",
"out", and "inout" for function parameters, to avoid introducing code
churn to the src/glsl/builtins/ir/ directory.
Note: a couple of comments in the code seemed to indicate that we were
planning for a possible future in which geometry shaders could have
shader-scope inout variables. Our GLSL grammar rejects shader-scope
inout variables, and I've been unable to find any evidence in the GLSL
standards documents (or extensions) that this will ever be allowed, so
I've eliminated these comments.
Reviewed-by: Carl Worth <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
This looks like a copy-and-paste left over.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
linker.cpp is getting pretty big, and we're about to add even more
varying packing code, so split out the linker code that concerns
varyings to its own file.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Previously this macro existed in 3 separate places, some inside the
intel driver and some outside of it. It makes more sense to have it
in main/macros.h
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
Not sure what was going on here, but running piglit with debug builds
might be a good plan :-)
Reviewed-by: Kenneth Graunke <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch implements varying packing between varyings.
Previously, each varying occupied components 0 through N-1 of its
assigned varying slot, so there was no way to pack two varyings into
the same slot. For example, if the varyings were a float, a vec2, a
vec3, and another vec2, they would be stored as follows:
<----slot1----> <----slot2----> <----slot3----> <----slot4----> slots
* * * * * * * * * * * * * * * *
flt x x x <vec2-> x x <--vec3---> x <vec2-> x x varyings
(Each * represents a varying component, and the "x"s represent wasted
space).
This change packs the varyings together to eliminate wasted space
between varyings, like so:
<----slot1----> <----slot2----> <----slot3----> <----slot4----> slots
* * * * * * * * * * * * * * * *
<vec2-> <vec2-> flt <--vec3---> x x x x x x x x varyings
Note that we take advantage of the sort order introduced in previous
patches (vec4's first, then vec2's, then scalars, then vec3's) to
minimize how often a varying is "double parked" (split across varying
slots).
Reviewed-by: Eric Anholt <[email protected]>
v2: Skip varying packing if ctx->Const.DisableVaryingPacking is true.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch implements varying packing within varyings that are
composed of multiple vectors of size less than 4 (e.g. arrays of
vec2's, or matrices with height less than 4).
Previously, such varyings used up a full 4-wide varying slot for each
constituent vector, meaning that some of the components of each
varying slot went unused. For example, a mat4x3 would be stored as
follows:
<----slot1----> <----slot2----> <----slot3----> <----slot4----> slots
* * * * * * * * * * * * * * * *
<-column1-> x <-column2-> x <-column3-> x <-column4-> x matrix
(Each * represents a varying component, and the "x"s represent wasted
space). In addition to wasting precious varying components, this
layout complicated transform feedback, since the constituents of the
varying are expected to be output to the transform feedback buffer
contiguously (e.g. without gaps between the columns, in the case of a
matrix).
This change packs the constituents of each varying together so that
all wasted space is at the end. For the mat4x3 example, this looks
like so:
<----slot1----> <----slot2----> <----slot3----> <----slot4----> slots
* * * * * * * * * * * * * * * *
<-column1-> <-column2-> <-column3-> <-column4-> x x x x matrix
Note that matrix columns 2 and 3 now cross a boundary between varying
slots (a characteristic I call "double parking" of a varying).
We don't bother trying to eliminate the wasted space at the end of the
varying, since the patch that follows will take care of that.
Since compiler back-ends don't (yet) support this packed layout, the
lower_packed_varyings function is used to rewrite the shader into a
form where each varying occupies a full varying slot. Later, if we
add native back-end support for varying packing, we can make this
lowering pass optional.
Reviewed-by: Eric Anholt <[email protected]>
v2: Skip varying packing if ctx->Const.DisableVaryingPacking is true.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch paves the way for varying packing by adding a sorting step
before varying assignment, which sorts the varyings into an order that
increases the likelihood of being able to find an efficient packing.
First, varyings are sorted into "packing classes" by considering
attributes that can't be mixed during varying packing--at the moment
this includes base type (float/int/uint/bool) and interpolation mode
(smooth/noperspective/flat/centroid), though later we will hopefully
be able to relax some of these restrictions. The number of packing
classes places an upper limit on the amount of space that must be
wasted by varying packing, since in theory a shader might nave 4n+1
components worth of varyings in each of m packing classes, resulting
in 3m components worth of wasted space.
Then, within each packing class, varyings are sorted by vector size,
with vec4's coming first, then vec2's, then scalars, and then finally
vec3's. The motivation for this order is that it ensures that the
only vectors that might be "double parked" (with part of the vector in
one varying slot and the remainder in another) are vec3's.
Note that the varyings aren't actually packed yet, merely placed in an
order that will facilitate packing.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch further subdivides the loop that assigns varying locations
into two phases: one phase to match up the varyings between shader
stages, and one phase to assign them varying locations.
In between the two phases the matched varyings are stored in a new
data structure called varying_matches. This will free us to be able
to assign varying locations in any order, which will pave the way for
packing varyings.
Note that the new varying_matches::assign_locations() function returns
the number of varying slots that were used; this return value will be
used in a future patch.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch subdivides the loop that assigns varying locations into two
phases: one phase to match up varyings between shader stages (and
assign them varying locations), and a second phase to record the
varying assignments for use by transform feedback.
This paves the way for varying packing, which will require us to
further subdivide the first phase.
In addition, it lets us avoid a clumsy O(n^2) algorithm, since we can
now record the locations of all transform feedback varyings in a
single pass through the tfeedback_decls array, rather than have to
iterate through the array after assigning each varying.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, the location of each varying is recorded in ir_variable as
a multiple of the size of a vec4. In order to pack varyings, we need
to be able to record, e.g. that a vec2 is stored in the second half of
a varying slot rather than the first half.
This patch introduces a field ir_variable::location_frac, which
represents the offset within a vec4 where a varying's value is stored.
Varyings that are not subject to packing will always have a
location_frac value of zero.
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, the linker used a value of -1 in ir_variable::location to
denote a generic input or output of the shader that had not yet been
matched up to a variable in another pipeline stage.
This patch introduces a new ir_variable field,
is_unmatched_generic_inout, for that purpose.
In future patches, this will allow us to separate the process of
matching varyings between shader stages from the processes of
assigning locations to those varying. That will in turn pave the way
for packing varyings.
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, link_invalidate_variable_locations() was only called
during assign_attribute_or_color_locations() and
assign_varying_locations(). This meant that in the corner case when
there was only a vertex shader, and varyings were being captured by
transform feedback, link_invalidate_variable_locations() wasn't being
called for the varyings.
This patch migrates the calls to link_invalidate_variable_locations()
to link_shaders(), so that they will be called in all circumstances.
In addition, it modifies the call semantics so that
link_invalidate_variable_locations() need only be called once per
shader stage (rather than once for inputs and once for outputs).
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch modifies the clip distance lowering pass so that the new
symbol it generates (glClipDistanceMESA) is added to the shader's
symbol table.
This will allow a later patch to modify the linker so that it finds
transform feedback varyings using the symbol table rather than having
to iterate through all the declarations in the shader.
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch updates the following linker checks to do the right thing
in GLSL 3.00 ES:
- Failing to write to gl_Position is allowed in GLSL 1.40+ as well as
GLSL 3.00 ES.
- It is an error to write to both gl_ClipVertex and gl_ClipDistance in
GLSL 1.30+. This does not apply to GLSL 3.00 ES.
- GLSL 3.00 ES uses the same varying counting rules as GLSL 1.00 ES.
- In GLSL 1.30 and GLSL 3.00 ES, "discard" terminates the shader.
- In GLSL 1.00 ES and GLSL 3.00 ES, both a fragment and a vertex
shader must be present.
[v2, idr]: Fix minro typo in a comment. Noticed by Ken.
[v3, idr]: s/IsEs(Shader|Prog)/IsES/ Suggested by Ken and Eric.
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Acked-by: Carl Worth <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously we recorded just the GLSL version (or the max version, if
GLSL 1.10 and GLSL 1.20 programs were linked together).
[v2, idr]: s/IsEs(Shader|Prog)/IsES/ Suggested by Ken and Eric.
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Acked-by: Carl Worth <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, we prohibited mixing of shading language versions if
min_version == 100 or max_version >= 130. This was technically
correct (since desktop GLSL 1.30 and beyond prohibit mixing of shading
language versions, as does GLSL 1.00 ES), but it was confusing. Also,
we asserted that all shading language versions were between 1.00 and
1.40, which was unnecessary (since the parser already checks shading
language versions) and doesn't work for GLSL 3.00 ES.
This patch changes the code to explicitly check that (a) ES shaders
aren't mixed with desktop shaders, (b) shaders aren't mixed between ES
versions, and (c) shaders aren't mixed between desktop GLSL versions
when at least one shader is GLSL 1.30 or greater. Also, it removes
the unnecessary assertion.
[v2, idr]: Slightly tweak the is_es_prog detection to occur outside the loop
instead of doing something special on the first loop iteration. Suggested by
Ken.
[v3, idr]: s/IsEs(Shader|Prog)/IsES/ Suggested by Ken and Eric.
Reviewed-by: Ian Romanick <[email protected]> [v1]
Reviewed-by: Kenneth Graunke <[email protected]>
Acked-by: Carl Worth <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The goal of that change was to skip counting things that aren't actually
outputs from the VS to the FS. However, explicit_location isn't set in
the case of linker-assigned locations (the common case), so basically
varying component counting got disabled. At this stage of the linker,
we've already ensured that var->location is set, so we can just look at
it without worrying.
Fixes i965 assertion failure with the new
piglit glsl-max-varyings --exceed-limits.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=51545
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Global initializers using the ?: operator with at least one non-constant
operand generate ir_if statements. For example,
float foo = some_boolean ? 0.0 : 1.0;
becomes:
(declare (temporary) float conditional_tmp)
(if (var_ref some_boolean)
((assign (x) (var_ref conditional_tmp) (constant float (0.0))))
((assign (x) (var_ref conditional_tmp) (constant float (1.0)))))
This pattern is necessary because the second or third arguments could be
function calls, which create statements (not expressions).
The linker moves these global initializers into the main() function.
However, it incorrectly had an assertion that global initializer
statements were only assignments, calls, or temporary variable
declarations. As demonstrated above, they can be if statements too.
Other than the assertion, everything works fine. So remove it.
Fixes new Piglit test condition-08.vert, as well as an upcoming
game that will be released on Steam.
NOTE: This is a candidate for stable release branches.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
Part of fixing piglit maxblocks.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
| |
Fixes piglit layout-std140.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
| |
This is a requirement for std140 uniform blocks, and optional for
packed/shared blocks.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
| |
This attempts error-checking, but the layout isn't done yet.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
| |
Acked-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, we were counting gl_FrontFacing, gl_FragCoord and gl_PointCoord
against the limit of varying variables. This prevented some valid shaders
from linking.
The other potential solution to this is to have the driver advertise
more varying vars or set the GLSLSkipStrictMaxVaryingLimitCheck flag.
But the above-mentioned variables aren't conventional varying attributes
so it doesn't seem right to count them.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
Previously, I tried implementing this in the i965 driver, but did so
in a way that violated the intent of the spec, and broke Tropics.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 4ec449a6ed1d2cea3bf83d6518b3b352ce5daceb.
I meant to not push this one. Review found that a link error is not
mandated: it should link, but you get undefined rendering if you rely
on a missing stage.
page 42/55 section 2.11 "Vertex Shaders":
"If the program object has no vertex shader, or no program object
is currently in use, the results of vertex shader execution are
undefined."
(and similar for page 160/173 section 3.9 "Fragment Shaders" for FS,
and page 45/58 section 2.11.2 "Program Objects" for program being 0)
It turns out the commit was broken anyway, because it was missing a
"goto done", so linkstatus got smashed back to true later and the
error just showed up as a warning in the infolog.
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds index support to the GLSL compiler.
I'm not 100% sure of my approach here, esp without how output ordering
happens wrt location, index pairs, in the "mark" function.
Since current hw doesn't ever have a location > 0 with an index > 0,
we don't have to work out if the output ordering the hw requires is
location, index, location, index or location, location, index, index.
But we have no hw to know, so punt on it for now.
v2: index requires layout - catch and error
setup explicit index properly.
v3: drop idx_offset stuff, assume index follow location
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, set_callee() performed some assertions about the type of the
ir_call; protecting the bare pointer ensured these checks would be run.
However, ir_call no longer has a type, so the getter and setter methods
don't actually do anything useful. Remove them in favor of accessing
callee directly, as is done with most other fields in our IR.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Aside from ir_call, our IR is cleanly split into two classes:
- Statements (typeless; used for side effects, control flow)
- Values (deeply nestable, pure, typed expression trees)
Unfortunately, ir_call confused all this:
- For void functions, we placed ir_call directly in the instruction
stream, treating it as an untyped statement. Yet, it was a subclass
of ir_rvalue, and no other ir_rvalue could be used in this way.
- For functions with a return value, ir_call could be placed in
arbitrary expression trees. While this fit naturally with the source
language, it meant that expressions might not be pure, making it
difficult to transform and optimize them. To combat this, we always
emitted ir_call directly in the RHS of an ir_assignment, only using
a temporary variable in expression trees. Many passes relied on this
assumption; the acos and atan built-ins violated it.
This patch makes ir_call a statement (ir_instruction) rather than a
value (ir_rvalue). Non-void calls now take a ir_dereference of a
variable, and store the return value there---effectively a call and
assignment rolled into one. They cannot be embedded in expressions.
All expression trees are now pure, without exception.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
| |
A later error prints this properly, fix this case to do the same.
v2: remove attribute as per Ian's suggestion
Signed-off-by: Dave Airlie <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Instead of the hard-coded value of 32. Note that MaxUnrollIterations
defaults to 32 so there's no net change. But the gallium state tracker
can override this.
NOTE: This is a candidate for the 8.0 branch.
Reviewed-by: Kenneth Graunke <[email protected]>
|