mesa.git

	Commit message	Author	Age	Files	Lines
*	glsl: Remove exec_list iterators now that nothing uses them.	Kenneth Graunke	2014-01-13	2	-88/+0
\| \| \| \| \| \|	Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
*	glsl: Replace iterators in ir_reader.cpp with ad-hoc list walking.	Kenneth Graunke	2014-01-13	1	-8/+10
\| \| \| \| \| \| \| \| \|	These can't use foreach_list since they want to skip over the first few list elements. Just doing the ad-hoc list walking isn't too bad. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
*	glsl: Use a new foreach_two_lists macro for walking two lists at once.	Kenneth Graunke	2014-01-13	11	-78/+65
	When handling function calls, we often want to walk through the list of formal parameters and list of actual parameters at the same time. (Both are guaranteed to be the same length.)

	Previously, we used a pattern of:

	   exec_list_iterator 1st_iter = <1st list>.iterator();
	   foreach_iter(exec_list_iterator, 2nd_iter, <2nd list>) {
	      ...
	      1st_iter.next();
	   }

	This was awkward, since you had to manually iterate through one of the two lists.

	This patch introduces a foreach_two_lists macro which safely walks through two lists at the same time, so you can simply do:

	   foreach_two_lists(1st_node, <1st list>, 2nd_node, <2nd list>) {
	      ...
	   }

	v2: Rename macro from foreach_list2 to foreach_two_lists, as suggested by Ian Romanick.

	Signed-off-by: Kenneth Graunke <[email protected]>
	Reviewed-by: Matt Turner <[email protected]>
	Reviewed-by: Ian Romanick <[email protected]>
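The lockstep walk described in this commit message can be sketched as follows. This is only an illustration under simplifying assumptions: `node`, `FOREACH_TWO` and `sum_products` are hypothetical names, and the list is a plain singly linked list rather than Mesa's exec_list, so the macro body is not the one the patch adds.

```cpp
#include <cassert>

// Hypothetical, simplified list node. Mesa's exec_node is a different,
// doubly linked structure with sentinel nodes.
struct node {
   int value;
   node *next;
};

// Illustrative macro: advance two cursors together and stop as soon as
// either list runs out, so neither cursor is dereferenced past the end.
#define FOREACH_TWO(a, list_a, b, list_b)            \
   for (node *a = (list_a), *b = (list_b);           \
        a != nullptr && b != nullptr;                \
        a = a->next, b = b->next)

// Example use: pair up "formal" and "actual" values element by element.
inline int sum_products(node *formals, node *actuals)
{
   int total = 0;
   FOREACH_TWO(f, formals, a, actuals) {
      total += f->value * a->value;
   }
   return total;
}
```

Because both cursors advance in the loop header, the loop body no longer needs a manual `next()` call on one of the lists.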
*	glsl: Statically cast parameter exec_node to ir_variable.	Kenneth Graunke	2014-01-13	1	-1/+1
\| \| \| \| \| \| \| \| \|	Formal function parameters are always ir_variable objects, not an arbitrary ir_instruction. So there's no need to dynamically cast here. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
*	glsl: Cast ir_call parameters to ir_rvalue, not ir_instruction.	Kenneth Graunke	2014-01-13	4	-6/+6
\| \| \| \| \| \| \| \| \| \|	A function call's parameters are always rvalues. ir_rvalue may not always be a subclass of ir_instruction in the future, so we should use the right one. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
*	glsl: Convert piles of foreach_iter to foreach_list_safe.	Kenneth Graunke	2014-01-13	12	-36/+36
\| \| \| \| \| \| \| \| \|	In these cases, we edit the list (or at least might be), so we use the foreach_list_safe variant. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
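Why a "safe" variant is needed when the list may be edited can be shown with a small sketch. It assumes a hypothetical circular doubly linked list with a sentinel (`node`, `list`, `remove_node` and `remove_odd` are illustrative names, not Mesa's API); the point is only that the successor is cached before the loop body runs.

```cpp
#include <cassert>

// Hypothetical node and list types, used only for this illustration.
struct node {
   int value;
   node *prev;
   node *next;
};

struct list {
   node head;  // sentinel: an empty list points at itself
   list() { head.value = 0; head.prev = &head; head.next = &head; }
   void push_tail(node *n) {
      n->prev = head.prev;
      n->next = &head;
      head.prev->next = n;
      head.prev = n;
   }
};

// Unlink a node and clear its pointers so stale use would be obvious.
inline void remove_node(node *n)
{
   n->prev->next = n->next;
   n->next->prev = n->prev;
   n->prev = n->next = nullptr;
}

// "Safe" walk: the successor is read before the body runs, so the body
// may unlink the current node without breaking the iteration.
inline int remove_odd(list &l)
{
   int removed = 0;
   for (node *n = l.head.next, *next = n->next;
        n != &l.head;
        n = next, next = n->next) {
      if (n->value % 2 != 0) {
         remove_node(n);
         removed++;
      }
   }
   return removed;
}
```

A plain walk that read `n->next` after the body would follow a cleared pointer as soon as a node was removed.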
*	glsl: Convert piles of foreach_iter to the newer foreach_list macro.	Kenneth Graunke	2014-01-13	23	-120/+113
\| \| \| \| \| \| \| \| \| \| \| \| \|	foreach_iter and exec_list_iterators have been deprecated for some time now; we just hadn't ever bothered to convert code to the newer foreach_list and foreach_list_safe macros. In these cases, we aren't editing the list, so we can use foreach_list rather than foreach_list_safe. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
*	glsl: Index into ctx->Const.Program[] rather than using ad-hoc code.	Paul Berry	2014-01-09	4	-87/+17
\| \| \| \| \|	Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	mesa: replace ctx->Const.{Vertex,Fragment,Geometry}Program with an array.	Paul Berry	2014-01-09	8	-109/+109
	These are replaced with ctx->Const.Program[MESA_SHADER_{VERTEX,FRAGMENT,GEOMETRY}]. In patches to follow, this will allow us to replace a lot of ad-hoc logic with a variable index into the array.

	With the exception of the changes to mtypes.h, this patch was generated entirely by the command:

	    find src -type f '(' -iname '*.c' -o -iname '*.cpp' -o -iname '*.py' \
	       -o -iname '*.y' ')' -print0 | xargs -0 sed -i \
	       -e 's/Const\.VertexProgram/Const.Program[MESA_SHADER_VERTEX]/g' \
	       -e 's/Const\.GeometryProgram/Const.Program[MESA_SHADER_GEOMETRY]/g' \
	       -e 's/Const\.FragmentProgram/Const.Program[MESA_SHADER_FRAGMENT]/g'

	Suggested-by: Brian Paul <[email protected]>
	Reviewed-by: Brian Paul <[email protected]>
	Reviewed-by: Kenneth Graunke <[email protected]>
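The benefit of an array indexed by stage can be sketched in a few lines. All names here (`shader_stage`, `program_limits`, `constants`, `total_texture_units`) are hypothetical stand-ins rather than Mesa's real types; the sketch only shows per-stage fields being replaced by a loop over a stage index.

```cpp
#include <cassert>

// Hypothetical stage enum; the last enumerator doubles as the array size.
enum shader_stage { STAGE_VERTEX, STAGE_GEOMETRY, STAGE_FRAGMENT, STAGE_COUNT };

// Hypothetical per-stage limits, standing in for a per-program constants block.
struct program_limits {
   unsigned max_texture_units;
};

struct constants {
   program_limits program[STAGE_COUNT];  // one entry per stage
};

// With an array, per-stage logic becomes a loop over the stage index
// instead of three hand-written field accesses.
inline unsigned total_texture_units(const constants &c)
{
   unsigned total = 0;
   for (int s = 0; s < STAGE_COUNT; s++)
      total += c.program[s].max_texture_units;
   return total;
}
```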
*	mesa: Namespace qualify fma to override ambiguity with fma from math.h	Thomas Sondergaard	2014-01-08	1	-1/+1
\| \| \| \| \| \| \|	MSVC 2013 version of math.h includes an fma() function. Cc: "10.0" <[email protected]> Reviewed-by: Brian Paul <[email protected]>
*	mesa: Fix compile error with MSVC 2013	Thomas Sondergaard	2014-01-08	1	-1/+1
\| \| \| \| \| \| \| \| \|	This fixes the following compile error: src\glsl\ir_constant_expression.cpp(1405) : error C2666: 'copysign' : 3 overloads have similar conversions Cc: "10.0" <[email protected]> Reviewed-by: Brian Paul <[email protected]>
*	mesa: Remove _mesa_progshader_enum_to_string(), which is no longer used.	Paul Berry	2014-01-08	2	-32/+0
\| \| \| \| \|	Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Brian Paul <[email protected]>
*	glsl: Make more use of gl_shader_stage enum in ir_set_program_inouts.cpp.	Paul Berry	2014-01-08	2	-15/+16
\| \| \| \| \|	Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Brian Paul <[email protected]>
*	glsl: Make more use of gl_shader_stage enum in lower_clip_distance.cpp.	Paul Berry	2014-01-08	1	-8/+8
\| \| \| \| \|	Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Brian Paul <[email protected]>
*	glsl: Make more use of gl_shader_stage enum in link_varyings.cpp.	Paul Berry	2014-01-08	1	-24/+24
\| \| \| \| \| \| \| \|	Reviewed-by: Kenneth Graunke <[email protected]> v2: Also rename "shaderType" param of is_varying_var() to "stage". Reviewed-by: Brian Paul <[email protected]>
*	glsl: Change _mesa_glsl_parse_state ctor to use gl_shader_stage enum.	Paul Berry	2014-01-08	5	-11/+9
\| \| \| \| \| \| \| \|	Reviewed-by: Kenneth Graunke <[email protected]> v2: Also rename "target" param to "stage". Reviewed-by: Brian Paul <[email protected]>
*	mesa: Use gl_shader::Stage instead of gl_shader::Type where possible.	Paul Berry	2014-01-08	6	-25/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reduces confusion since gl_shader::Type is sometimes GL_SHADER_PROGRAM_MESA but is more frequently GL_SHADER_{VERTEX,GEOMETRY,FRAGMENT}. It also has the advantage that when switching on gl_shader::Stage, the compiler will alert if one of the possible enum types is unhandled. Finally, many functions in src/glsl (especially those dealing with linking) already use gl_shader_stage to represent pipeline stages; using gl_shader::Stage in those functions avoids the need for a conversion. Note: in the process I changed _mesa_write_shader_to_file() so that if it encounters an unexpected shader stage, it will use a file suffix of "????" rather than "geom". Reviewed-by: Brian Paul <[email protected]> v2: Split from patch "mesa: Store gl_shader_stage enum in gl_shader objects." Reviewed-by: Kenneth Graunke <[email protected]>
*	mesa: Store gl_shader_stage enum in gl_shader objects.	Paul Berry	2014-01-08	4	-0/+4
\| \| \| \| \|	Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	glsl: make _mesa_shader_stage_to_string() available to non-C++ code.	Paul Berry	2014-01-08	1	-8/+7
\| \| \| \| \| \| \| \|	Reviewed-by: Brian Paul <[email protected]> v2: Split from patch "mesa: Store gl_shader_stage enum in gl_shader objects." Reviewed-by: Kenneth Graunke <[email protected]>
*	mesa: Clean up nomenclature for pipeline stages.	Paul Berry	2014-01-08	17	-148/+148
	Previously, we had an enum called gl_shader_type which represented pipeline stages in the order they occur in the pipeline (i.e. MESA_SHADER_VERTEX=0, MESA_SHADER_GEOMETRY=1, etc), and several inconsistently named functions for converting between it and other representations:

	- _mesa_shader_type_to_string: gl_shader_type -> string
	- _mesa_shader_type_to_index: GLenum (GL_*_SHADER) -> gl_shader_type
	- _mesa_program_target_to_index: GLenum (GL_*_PROGRAM) -> gl_shader_type
	- _mesa_shader_enum_to_string: GLenum (GL_*_{SHADER,PROGRAM}) -> string

	This patch tries to clean things up so that we use more consistent terminology: the enum is now called gl_shader_stage (to emphasize that it is in the order of pipeline stages), and the conversion functions are:

	- _mesa_shader_stage_to_string: gl_shader_stage -> string
	- _mesa_shader_enum_to_shader_stage: GLenum (GL_*_SHADER) -> gl_shader_stage
	- _mesa_program_enum_to_shader_stage: GLenum (GL_*_PROGRAM) -> gl_shader_stage
	- _mesa_progshader_enum_to_string: GLenum (GL_*_{SHADER,PROGRAM}) -> string

	In addition, MESA_SHADER_TYPES has been renamed to MESA_SHADER_STAGES, for consistency with the new name for the enum.

	Reviewed-by: Kenneth Graunke <[email protected]>

	v2: Also rename the "target" field of _mesa_glsl_parse_state and the "target" parameter of _mesa_shader_stage_to_string to "stage".

	Reviewed-by: Brian Paul <[email protected]>
*	glsl: Optimize pow(2, x) --> exp2(x).	Kenneth Graunke	2014-01-07	1	-0/+11
	On Haswell, POW takes 24 cycles, while EXP2 only takes 14. Plus, using POW requires putting 2.0 in a register, while EXP2 doesn't. I believe that EXP2 will be faster than POW on basically all GPUs, so it makes sense to optimize it.

	Looking at the savage2 subset of shader-db:

	total instructions in shared programs: 113225 -> 113179 (-0.04%)
	instructions in affected programs: 2139 -> 2093 (-2.15%)
	instances of 'math pow': 795 -> 749 (-6.14%)
	instances of 'math exp': 389 -> 435 (11.8%)

	Signed-off-by: Kenneth Graunke <[email protected]>
	Reviewed-by: Matt Turner <[email protected]>
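The rewrite itself is a small pattern match on the expression tree. The sketch below assumes a toy expression type (`expr`, `op`, `simplify`, `eval` and the constructors are all hypothetical names) and is not Mesa's algebraic optimization pass; it only illustrates replacing a power with constant base 2 by a base-2 exponential.

```cpp
#include <cassert>
#include <cmath>
#include <memory>

// Toy expression tree, assumed for illustration only.
enum class op { constant, variable, pow, exp2 };

struct expr;
using expr_ptr = std::shared_ptr<expr>;

struct expr {
   op kind;
   float value;    // only meaningful for op::constant
   expr_ptr a, b;  // operands; b is unused for op::exp2
};

inline expr_ptr constant(float v) {
   return std::make_shared<expr>(expr{op::constant, v, nullptr, nullptr});
}
inline expr_ptr variable() {
   return std::make_shared<expr>(expr{op::variable, 0.0f, nullptr, nullptr});
}
inline expr_ptr make_pow(expr_ptr base, expr_ptr exponent) {
   return std::make_shared<expr>(expr{op::pow, 0.0f, base, exponent});
}

// pow(2, x) --> exp2(x); every other expression is returned unchanged.
inline expr_ptr simplify(expr_ptr e)
{
   if (e->kind == op::pow && e->a->kind == op::constant && e->a->value == 2.0f)
      return std::make_shared<expr>(expr{op::exp2, 0.0f, e->b, nullptr});
   return e;
}

// Evaluate with the single variable bound to x, to compare both forms.
inline float eval(const expr_ptr &e, float x)
{
   switch (e->kind) {
   case op::constant: return e->value;
   case op::variable: return x;
   case op::pow:      return std::pow(eval(e->a, x), eval(e->b, x));
   case op::exp2:     return std::exp2(eval(e->a, x));
   }
   return 0.0f;
}
```

The rewritten tree no longer references the constant 2.0 at all, which matches the commit's point about not needing to load it into a register.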
*	glsl: Refactor is_zero/one/negative_one into an is_value() method.	Kenneth Graunke	2014-01-07	2	-68/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch creates a new generic is_value() method, which checks if an ir_constant has a particular value. (For vectors, it must have the single value repeated across all components.) It then rewrites the is_zero/is_one/is_negative_one methods to use this generic helper. All three were basically identical except for the value they checked for. The other difference is that is_negative_one rejects boolean types. The new is_value function maintains this behavior, only allowing boolean types when checking for 0 or 1. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
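A rough sketch of the refactoring described here, assuming a simplified constant type that stores every component as a float (`constant` and its layout are hypothetical; Mesa's ir_constant handles several base types):

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a constant IR node.
struct constant {
   bool is_bool;
   std::vector<float> components;  // booleans stored as 0.0f / 1.0f

   // True when every component equals f. Booleans are only compared
   // against 0 or 1, so a check for -1 always fails for them.
   bool is_value(float f) const {
      if (components.empty())
         return false;
      if (is_bool && f != 0.0f && f != 1.0f)
         return false;
      for (float c : components) {
         if (c != f)
            return false;
      }
      return true;
   }

   // The three former near-duplicates become one-line wrappers.
   bool is_zero() const { return is_value(0.0f); }
   bool is_one() const { return is_value(1.0f); }
   bool is_negative_one() const { return is_value(-1.0f); }
};
```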
*	glsl: Optimize pow(1.0, X) --> 1.0.	Kenneth Graunke	2014-01-07	1	-0/+6
\| \| \| \| \| \| \|	Surprisingly, this helps one vertex shader in 3DMMES. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	glsl: rename min(), max() functions to fix MSVC build	Brian Paul	2014-01-06	3	-7/+7
\| \| \| \| \| \| \| \|	Evidently, there's some other definition of "min" and "max" that causes MSVC to choke on these function names. Renaming to min2() and max2() fixes things. Reviewed-by: Kenneth Graunke <[email protected]>
*	mesa: enable AMD_shader_trinary_minmax	Maxence Le Doré	2014-01-06	1	-1/+1
\| \| \| \|	Reviewed-by: Kenneth Graunke <[email protected]>
*	glsl: implement mid3 built-in function	Maxence Le Doré	2014-01-06	1	-0/+38
\| \| \| \|	Reviewed-by: Kenneth Graunke <[email protected]>
*	glsl: implement max3 built-in function	Maxence Le Doré	2014-01-06	1	-0/+38
\| \| \| \|	Reviewed-by: Kenneth Graunke <[email protected]>
*	glsl: Implement min3 built-in function	Maxence Le Doré	2014-01-06	1	-0/+38
\| \| \| \|	Reviewed-by: Kenneth Graunke <[email protected]>
*	glsl: add min() and max() functions to builder.cpp	Maxence Le Doré	2014-01-06	2	-0/+13
\| \| \| \|	Reviewed-by: Kenneth Graunke <[email protected]>
*	glsl: add a shader_trinary_minmax predicate	Maxence Le Doré	2014-01-06	1	-0/+6
\| \| \| \|	Reviewed-by: Kenneth Graunke <[email protected]>
*	glsl: Add extension tracking for AMD_shader_trinary_minmax	Maxence Le Doré	2014-01-06	3	-0/+6
\| \| \| \|	Reviewed-by: Kenneth Graunke <[email protected]>
*	glcpp: error on multiple #else/#elif directives	Erik Faye-Lund	2014-01-02	6	-1/+51
	The preprocessor currently accepts multiple else/elif-groups per if-section. The GLSL-preprocessor is defined by the C++ specification, which defines the following parse-rule:

	if-section:
		if-group elif-groups(opt) else-group(opt) endif-line

	This clearly only allows a single else-group, that has to come after any elif-groups.

	So let's modify the code to follow the specification. Add test to prevent regressions.

	Reviewed-by: Ian Romanick <[email protected]>
	Reviewed-by: Kenneth Graunke <[email protected]>
	Reviewed-by: Carl Worth <[email protected]>
	Cc: 10.0 <[email protected]>
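The grammar rule quoted in this message can be enforced with one flag per open if-section. The following is a minimal sketch over a pre-tokenized list of directive names; `directives_ok` is a hypothetical helper and not how glcpp's parser is structured.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Returns true when every if-section has at most one else-group and no
// elif-group after it. Input is a flat list of directive names.
inline bool directives_ok(const std::vector<std::string> &directives)
{
   std::vector<bool> else_seen;  // one flag per currently open if-section
   for (const std::string &d : directives) {
      if (d == "if") {
         else_seen.push_back(false);
      } else if (d == "elif" || d == "else") {
         // Reject a stray directive, or anything following an #else.
         if (else_seen.empty() || else_seen.back())
            return false;
         if (d == "else")
            else_seen.back() = true;
      } else if (d == "endif") {
         if (else_seen.empty())
            return false;
         else_seen.pop_back();
      }
   }
   return else_seen.empty();
}
```

Keeping the flags on a stack lets nested if-sections each have their own single else-group.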
*	glcpp: Replace multi-line comment with a space (even as part of macro definition)	Carl Worth	2014-01-02	8	-9/+48
	The preprocessor has always replaced multi-line comments with a single space character, (as required by the specification), but as of commit bd55ba568b301d0f764cd1ca015e84e1ae932c8b the lexer also emitted a NEWLINE token for each newline within the comment, (in order to preserve line numbers).

	The emitting of NEWLINE tokens within the comment broke the rule of "replace a multi-line comment with a single space" as could be exposed by code like the following:

		#define FOO a/*
		*/b
		FOO

	Prior to commit bd55ba568b301d0f764cd1ca015e84e1ae932c8b, this code defined the macro FOO as "a b" as desired. Since that commit, this code instead defines FOO as "a" and leaves a stray "b" in the output.

	In this commit, we fix this by not emitting the NEWLINE tokens while lexing the comment, but instead merely counting them in the commented_newlines variable. Then, when the lexer next encounters a non-commented newline it switches to a NEWLINE_CATCHUP state to emit as many NEWLINE tokens as necessary (so that subsequent parsing stages still generate correct line numbers).

	Of course, it would have been more clear if we could have written a loop to emit all the newlines, but flex conventions prevent that, (we must use "return" for each token we emit).

	It similarly would have been clear to have a new rule restricted to the <NEWLINE_CATCHUP> state with an action much like the body of this if condition. The problem with that is that this rule must not consume any characters. It might be possible to write a rule that matches a single lookahead of any character, but then we would also need an additional rule to ensure for the <EOF> case where there are no additional characters available for the lookahead to match.

	Given those considerations, and given that the SKIP-state manipulation already involves a code block at the top of the lexer function, before any rules, it seems best to me to go with the implementation here which adds a similar pre-rule code block for the NEWLINE_CATCHUP.

	Finally, this commit also changes the expected output of a few, existing glcpp tests. The change here is that the space character resulting from the multi-line comment is now emitted before the newlines corresponding to that comment. (Previously, the newlines were emitted first, and the space character afterward.)

	Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72686
	Reviewed-by: Kenneth Graunke <[email protected]>
	Reviewed-by: Ian Romanick <[email protected]>
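The counting-and-catching-up idea can be sketched outside flex as a plain string transformation. This is an assumption-laden illustration: `strip_comments` is a hypothetical function that works on characters rather than tokens, and it uses an ordinary loop where the commit explains that the real lexer cannot.

```cpp
#include <cassert>
#include <string>

// Replace each /* ... */ comment with a single space. Newlines inside a
// comment are counted rather than emitted, then flushed right after the
// next ordinary newline so the total line count is preserved.
inline std::string strip_comments(const std::string &src)
{
   std::string out;
   std::string::size_type commented_newlines = 0;
   std::string::size_type i = 0;

   while (i < src.size()) {
      if (src.compare(i, 2, "/*") == 0) {
         std::string::size_type end = src.find("*/", i + 2);
         // An unterminated comment is treated as running to end of input.
         std::string::size_type stop =
            (end == std::string::npos) ? src.size() : end + 2;
         for (std::string::size_type j = i; j < stop; j++) {
            if (src[j] == '\n')
               commented_newlines++;
         }
         out += ' ';
         i = stop;
      } else if (src[i] == '\n') {
         out += '\n';
         out.append(commented_newlines, '\n');  // the "catch-up" step
         commented_newlines = 0;
         i++;
      } else {
         out += src[i++];
      }
   }
   return out;
}
```

On the example from the commit message, the macro body becomes "a b" on one line and the deferred newline follows it, so `FOO` still appears on the third line.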
*	glcpp: Add a more descriptive comment for the SKIP state manipulation	Carl Worth	2014-01-02	1	-5/+36
	Two things make this code confusing:

	1. The uncharacteristic manipulation of lexer start state outside of flex rules.

	2. The confusing semantics of the skip_stack (including the "lexing_if" override and the SKIP_NO_SKIP state).

	This new comment is intended to bring a bit more clarity for any readers. There is no intended behavioral change to the code here. The actual code changes include better indentation to avoid an excessively-long line, and using the more descriptive INITIAL rather than 0.

	Reviewed-by: Kenneth Graunke <[email protected]>
	Reviewed-by: Ian Romanick <[email protected]>
*	glsl: Fix gl_type of usamplerCube built-in type.	Paul Berry	2013-12-30	1	-1/+1
\| \| \| \| \| \| \|	I'm not aware of any piglit tests that this fixes, but the old code was obviously wrong. Reviewed-by: Kenneth Graunke <[email protected]>
*	mesa: Improve static error checking of arrays sized by MESA_SHADER_TYPES.	Paul Berry	2013-12-30	2	-7/+14
	This patch replaces the following pattern:

	    foo bar[MESA_SHADER_TYPES] = {
	       ...
	    };

	With:

	    foo bar[] = {
	       ...
	    };
	    STATIC_ASSERT(Elements(bar) == MESA_SHADER_TYPES);

	This way, when a new shader type is added in a future version of Mesa, we will get a compile error to remind us that the array needs to be updated.

	Reviewed-by: Brian Paul <[email protected]>
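The same pattern can be written with standard C++11 `static_assert`. In this sketch `shader_stage`, `elements`, `stage_names` and `stage_name` are hypothetical names chosen for illustration; Mesa's own STATIC_ASSERT and Elements macros are not reproduced here.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stage enum; the last enumerator is the number of stages.
enum shader_stage { STAGE_VERTEX, STAGE_GEOMETRY, STAGE_FRAGMENT, STAGE_COUNT };

// Compile-time element count of a built-in array.
template <typename T, std::size_t N>
constexpr std::size_t elements(const T (&)[N]) { return N; }

// The size is deduced from the initializer, not written explicitly...
static const char *const stage_names[] = { "vertex", "geometry", "fragment" };

// ...so adding a stage without adding a name fails to compile here,
// instead of silently leaving a null entry at the end of the array.
static_assert(elements(stage_names) == STAGE_COUNT,
              "stage_names must have exactly one entry per shader stage");

inline const char *stage_name(shader_stage s)
{
   assert(s < STAGE_COUNT);
   return stage_names[s];
}
```

With an explicit size in the declaration, a too-short initializer would be accepted and the missing entries zero-initialized, which is the failure mode the commit guards against.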
*	glsl: Remove extraneous shader_type argument from analyze_clip_usage().	Paul Berry	2013-12-30	1	-4/+5
\| \| \| \| \| \| \| \|	This argument was carrying the name of the shader target (as a string). We can get this just as easily by calling _mesa_shader_enum_to_string(). Reviewed-by: Brian Paul <[email protected]>
*	glsl: Get rid of hardcoded arrays of shader target names.	Paul Berry	2013-12-30	2	-15/+9
\| \| \| \| \| \| \|	We already have a function for converting a shader type index to a string: _mesa_shader_type_to_string(). Reviewed-by: Brian Paul <[email protected]>
*	Rename overloads of _mesa_glsl_shader_target_name().	Paul Berry	2013-12-30	5	-30/+30
\| \| \| \| \| \| \| \| \| \| \| \|	Previously, _mesa_glsl_shader_target_name() had an overload for GLenum and an overload for the gl_shader_type enum, each of which behaved differently. However, since GLenum is a synonym for unsigned int, and unsigned ints are often used in place of gl_shader_type (e.g. in loop indices), there was a big risk of calling the wrong overload by mistake. This patch gives the two overloads different names so that it's always clear which one we mean to call. Reviewed-by: Brian Paul <[email protected]>
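The hazard is easy to reproduce in isolation. This sketch uses hypothetical stand-ins (`GLenum_like`, `stage`, `which`, `call_from_loop`) rather than the real Mesa functions, and shows how an unsigned loop index silently selects the unsigned overload.

```cpp
#include <cassert>

typedef unsigned int GLenum_like;  // stand-in for GLenum, an unsigned int typedef
enum stage { STAGE_VERTEX = 0, STAGE_FRAGMENT = 1 };

// Two overloads with different meanings, distinguished only by argument type.
inline int which(GLenum_like) { return 1; }  // the "GLenum" overload
inline int which(stage) { return 2; }        // the stage-enum overload

// A loop index that is meant as a stage but declared unsigned picks the
// GLenum overload, because integers do not convert implicitly to an enum.
inline int call_from_loop()
{
   int picked = 0;
   for (unsigned i = 0; i < 1; i++)
      picked = which(i);
   return picked;
}
```

Giving the two functions different names, as the patch does, turns this silent mismatch into something visible at the call site.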
*	Report that no function found if signature lookup is empty	Kevin Rogovin	2013-12-20	1	-9/+16
\| \| \| \| \| \| \|	If no function signature is found for a function name, report that the function is not found instead of printing an empty list of candidates. Reviewed-by: Ian Romanick <[email protected]>
*	Use line number information from entire function expression	Kevin Rogovin	2013-12-20	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	This patch changes the error reporting behavior for incorrect function invocation (triggered by match_function_by_name() unable to find a matching function call) from using the line number information associated to the function name term to using the line number information of the entire function expression. Fixes bug #72264. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72264 Reviewed-by: Ian Romanick <[email protected]> Cc: "10.0" <[email protected]>
*	glsl: Replace _mesa_glsl_parser_targets enum with gl_shader_type.	Paul Berry	2013-12-17	6	-81/+75
\| \| \| \| \| \|	These enums were redundant. Reviewed-by: Brian Paul <[email protected]>
*	glsl: Don't return bad values from _mesa_shader_type_to_index.	Paul Berry	2013-12-17	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	This will avoid compiler warnings in the patch that follows. There should be no user-visible effect because the change only affects the behaviour when an invalid enum is passed to _mesa_shader_type_to_index(), and that can only happen if there is a bug elsewhere in Mesa. Reviewed-by: Brian Paul <[email protected]>
*	glsl: add gl_SampleMaskIn[] builtin	Chris Forbes	2013-12-14	1	-0/+4
\| \| \| \| \|	Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	glsl: modify ir_clone to use memcpy	Tapani Pälli	2013-12-12	1	-20/+3
\| \| \| \| \| \| \| \|	Patch copies the whole data structure at once instead of assigning individual variables. Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Paul Berry <[email protected]>
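Grouping the copyable state into one nested struct is what makes a single-call copy possible. The sketch below assumes a hypothetical `variable` type with a trivially copyable `data` member; the field names are borrowed from the commit messages in this series, but the layout is invented for illustration.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical variable type with its plain state grouped in `data`.
struct variable {
   const char *name;

   struct variable_data {
      unsigned used : 1;
      unsigned assigned : 1;
      unsigned mode : 4;
      int location;
   } data;

   // Copy the whole data block at once instead of field by field, so a
   // newly added field cannot be forgotten in the clone.
   variable clone() const {
      variable v;
      v.name = name;
      std::memcpy(&v.data, &data, sizeof(data));
      return v;
   }
};
```

This only works while everything inside `data` is trivially copyable; pointers to owned memory would still need separate handling.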
*	glsl: move variables in to ir_variable::data, part II	Tapani Pälli	2013-12-12	22	-357/+366
\| \| \| \| \| \| \| \| \| \| \| \| \|	This patch moves following bitfields and variables to the data structure: explicit_location, explicit_index, explicit_binding, has_initializer, is_unmatched_generic_inout, location_frac, from_named_ifc_block_nonarray, from_named_ifc_block_array, depth_layout, location, index, binding, max_array_access, atomic Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Paul Berry <[email protected]>
*	glsl: move variables in to ir_variable::data, part I	Tapani Pälli	2013-12-12	36	-272/+273
\| \| \| \| \| \| \| \| \| \|	This patch moves following bitfields in to the data structure: used, assigned, how_declared, mode, interpolation, origin_upper_left, pixel_center_integer Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Paul Berry <[email protected]>
*	glsl: introduce data section to ir_variable	Tapani Pälli	2013-12-12	17	-73/+81
\| \| \| \| \| \| \| \|	Data section helps serialization and cloning of a ir_variable. This patch includes the helper bits used for read only ir_variables. Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Paul Berry <[email protected]>
*	glsl/loops: Get rid of lower_bounded_loops and ir_loop::normative_bound.	Paul Berry	2013-12-09	40	-195/+35
\| \| \| \| \| \| \| \|	Now that loop_controls no longer creates normatively bound loops, there is no need for ir_loop::normative_bound or the lower_bounded_loops pass. Reviewed-by: Ian Romanick <[email protected]>
*	glsl/loops: Stop creating normatively bound loops in loop_controls.	Paul Berry	2013-12-09	3	-20/+41
	Previously, when loop_controls analyzed a loop and found that it had a fixed bound (known at compile time), it would remove all of the loop terminators and instead set the loop's normative_bound field to force the loop to execute the correct number of times.

	This made loop unrolling easy, but it had a serious disadvantage. Since most GPUs don't have a native mechanism for executing a loop a fixed number of times, in order to implement the normative bound, the back-ends would have to synthesize a new loop induction variable. As a result, many loops wound up having two induction variables instead of one. This caused extra register pressure and unnecessary instructions.

	This patch modifies loop_controls so that it doesn't set the loop's normative_bound anymore. Instead it leaves one of the terminators in the loop (the limiting terminator), so the back-end doesn't have to go to any extra work to ensure the loop terminates at the right time.

	This complicates loop unrolling slightly: when deciding whether a loop can be unrolled, we have to account for the presence of the limiting terminator. And when we do unroll the loop, we have to remove the limiting terminator first.

	For an example of how this results in more efficient back end code, consider the loop:

	    for (int i = 0; i < 100; i++) {
	      total += i;
	    }

	Previous to this patch, on i965, this loop would compile down to this (vec4) native code:

	          mov(8)       g4<1>.xD 0D
	          mov(8)       g8<1>.xD 0D
	    loop:
	          cmp.ge.f0(8) null g8<4;4,1>.xD 100D
	    (+f0) if(8)
	          break(8)
	          endif(8)
	          add(8)       g5<1>.xD g5<4;4,1>.xD g4<4;4,1>.xD
	          add(8)       g8<1>.xD g8<4;4,1>.xD 1D
	          add(8)       g4<1>.xD g4<4;4,1>.xD 1D
	          while(8) loop

	(notice that both g8 and g4 are loop induction variables; one is used to terminate the loop, and the other is used to accumulate the total).

	After this patch, the same loop compiles to:

	          mov(8)       g4<1>.xD 0D
	    loop:
	          cmp.ge.f0(8) null g4<4;4,1>.xD 100D
	    (+f0) if(8)
	          break(8)
	          endif(8)
	          add(8)       g5<1>.xD g5<4;4,1>.xD g4<4;4,1>.xD
	          add(8)       g4<1>.xD g4<4;4,1>.xD 1D
	          while(8) loop

	Reviewed-by: Ian Romanick <[email protected]>