path: root/src/glsl/Makefile.sources
Commit message | Author | Age | Files | Lines
* nir: Add a basic metadata management system | Jason Ekstrand | 2015-01-15 | 1 | -0/+1
    Reviewed-by: Connor Abbott <[email protected]>
* nir: Add a lower_vec_to_movs pass | Jason Ekstrand | 2015-01-15 | 1 | -0/+1
    Reviewed-by: Connor Abbott <[email protected]>
* nir: Add a naive from-SSA pass | Jason Ekstrand | 2015-01-15 | 1 | -0/+1
    This pass is kind of stupidly implemented but it should be enough to
    get us up and going. We probably want something better that doesn't
    generate all of the redundant moves eventually. However, the i965
    backend should be able to handle the movs, so I'm not too worried
    about it in the short term.
* nir: add an SSA-based dead code elimination pass | Connor Abbott | 2015-01-15 | 1 | -0/+1
    v2: Jason Ekstrand <[email protected]>: whitespace fixes
* nir: add an SSA-based copy propagation pass | Connor Abbott | 2015-01-15 | 1 | -0/+1
* nir: add a pass to convert to SSA | Connor Abbott | 2015-01-15 | 1 | -0/+1
    v2: Jason Ekstrand <[email protected]>: whitespace fixes
* nir: calculate dominance information | Connor Abbott | 2015-01-15 | 1 | -0/+1
* nir: add an optimization to turn global registers into local registers | Connor Abbott | 2015-01-15 | 1 | -0/+1
    After linking and inlining, this allows us to convert these registers
    into SSA values and optimise more code.
* nir: add a pass to lower atomics | Connor Abbott | 2015-01-15 | 1 | -0/+1
    v2: Jason Ekstrand <[email protected]>: whitespace fixes
* nir: add a pass to lower system value reads | Connor Abbott | 2015-01-15 | 1 | -0/+1
    v2: Jason Ekstrand <[email protected]>: whitespace fixes
* nir: add a pass to lower sampler instructions | Connor Abbott | 2015-01-15 | 1 | -0/+1
* nir: add a pass to remove unused variables | Connor Abbott | 2015-01-15 | 1 | -0/+1
    After we lower variables, we want to delete them in order to free up
    some memory.

    v2: Jason Ekstrand <[email protected]>: whitespace fixes
* nir: add a pass to lower variables for scalar backends | Connor Abbott | 2015-01-15 | 1 | -0/+1
* nir: add a glsl-to-nir pass | Connor Abbott | 2015-01-15 | 1 | -1/+2
    v2: Jason Ekstrand <[email protected]>:
        Make glsl_to_nir build again
        fix whitespace
* nir: add a validation pass | Connor Abbott | 2015-01-15 | 1 | -0/+1
    This is similar to ir_validate.cpp.

    v2: Jason Ekstrand <[email protected]>: whitespace fixes
* nir: add a printer | Connor Abbott | 2015-01-15 | 1 | -0/+1
    This is similar to ir_print_visitor.cpp.

    v2: Jason Ekstrand <[email protected]>: whitespace fixes
* nir: add core helper functions | Connor Abbott | 2015-01-15 | 1 | -3/+7
    These include functions for adding and removing various bits of IR and
    helpers for iterating over all the sources and destinations of an
    instruction. This is similar to ir.cpp.

    v2: Jason Ekstrand <[email protected]>: whitespace and automake fixes
* nir: add the core datastructures | Connor Abbott | 2015-01-15 | 1 | -0/+2
    This includes all the instructions, ifs, loops, functions, etc. This is
    similar to the information in ir.h.

    v2: Jason Ekstrand <[email protected]>:
        Include ralloc and hash_table from the util directory
        whitespace fixes

    Signed-off-by: Jason Ekstrand <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
    Reviewed-By glenn.kennard <[email protected]>
* nir: add a simple C wrapper around glsl_types.h | Connor Abbott | 2015-01-15 | 1 | -0/+3
    v2: Jason Ekstrand <[email protected]>: whitespace and automake fixes

    Reviewed-by: Eric Anholt <[email protected]>
* glsl: Add headers to distribution. | Matt Turner | 2014-12-12 | 1 | -2/+29
* glsl: Lower constant arrays to uniform arrays. | Kenneth Graunke | 2014-11-06 | 1 | -0/+1
    Consider GLSL code such as:

       const ivec2 offsets[] =
          ivec2[](ivec2(-1, -1), ivec2(-1, 0), ivec2(-1, 1),
                  ivec2(0, -1),  ivec2(0, 0),  ivec2(0, 1),
                  ivec2(1, -1),  ivec2(1, 0),  ivec2(1, 1));

       ivec2 offset = offsets[<non-constant expression>];

    Both i965 and nv50 currently handle this very poorly. On i965, this
    becomes a pile of MOVs to load the immediate constants into registers,
    a pile of scratch writes to move the whole array to memory, and one
    scratch read to actually access the value - effectively the same as if
    it were a non-constant array.

    We'd much rather upload large blocks of constant data as uniform data,
    so drivers can simply upload the data via constbufs, and not have to
    populate it via shader instructions.

    This is currently non-optional because both i965 and nouveau benefit
    from it, and according to Marek radeonsi would benefit today as well.
    (According to Tom, radeonsi may want to handle this itself in the long
    term, but we can always add a flag when it becomes useful.)

    Improves performance in a terrain rendering microbenchmark by about 2x,
    and cuts the number of instructions by about half. Helps a lot of
    "Natural Selection 2" shaders, as well as one "HOARD" shader.

    total instructions in shared programs: 5473459 -> 5471765 (-0.03%)
    instructions in affected programs:     5880 -> 4186 (-28.81%)

    v2: Use ir_var_hidden to avoid exposing the new uniform via the GL
        uniform introspection API.
    v3: Alphabetize Makefile.sources properly.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77957
    Signed-off-by: Kenneth Graunke <[email protected]>
* util: add _mesa_strtod and _mesa_strtof | Chia-I Wu | 2014-10-30 | 1 | -2/+1
    Both core mesa and glsl have their own wrappers for strtof_l. Merge
    and move them to util/.

    They are compiled with a C++ compiler so that we can make them
    thread-safe in a following commit.

    Signed-off-by: Chia-I Wu <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Optimize min/max expression trees | Iago Toral Quiroga | 2014-10-07 | 1 | -0/+1
    Original patch by Petri Latvala <[email protected]>:

    Add an optimization pass that drops min/max expression operands that
    can be proven to not contribute to the final result. The algorithm is
    similar to alpha-beta pruning on a minmax search, from the field of AI.

    This optimization pass can optimize min/max expressions where operands
    are min/max expressions. Such code can appear in shaders by itself, or
    as the result of clamp() or AMD_shader_trinary_minmax functions.

    This optimization pass improves the generated code for piglit's
    AMD_shader_trinary_minmax tests as follows:

    total instructions in shared programs: 75 -> 67 (-10.67%)
    instructions in affected programs:     60 -> 52 (-13.33%)
    GAINED: 0
    LOST:   0

    All tests (max3, min3, mid3) improved.

    A full shader-db run:

    total instructions in shared programs: 4293603 -> 4293575 (-0.00%)
    instructions in affected programs:     1188 -> 1160 (-2.36%)
    GAINED: 0
    LOST:   0

    Improvements happen in Guacamelee and Serious Sam 3. One shader from
    Dungeon Defenders is hurt by shader-db metrics (26 -> 28), because of
    dropping of a (constant float (0.00000)) operand, which was compiled to
    a saturate modifier.

    Version 2 by Iago Toral Quiroga <[email protected]>:

    Changes from review feedback:
    - Squashed various cosmetic changes sent by Matt Turner.
    - Make less_all_components return an enum rather than setting a class
      member. (Suggested by Matt Turner). Also, renamed it to
      compare_components.
    - Make less_all_components, smaller_constant and larger_constant
      static. (Suggested by Matt Turner)
    - Change minmax_range to call its limits "low" and "high" instead of
      "range[0]" and "range[1]". (Suggested by Connor Abbott).
    - Use ir_builder swizzle helpers in swizzle_if_required(). (Suggested
      by Connor Abbott).
    - Make the logic clearer by rearranging the code and commenting.
      (Suggested by Connor Abbott).
    - Added comment to explain why we need to recurse twice. (Suggested by
      Connor Abbott).
    - If we cannot prune an expression, do not return early. Instead,
      attempt to prune its children. (Suggested by Connor Abbott).

    Other changes:
    - Instead of having a global "valid" visitor member, let the various
      functions that can determine this status return a boolean and check
      for its value to decide what to do in each case. This is more
      flexible and allows recursing into children of parents that could
      not be pruned due to invalid ranges (so related to the last bullet
      in the review feedback).
    - Make sure we always check if a range is valid before working with
      it. Since any use of get_range, combine_range or range_intersection
      can invalidate a range we should check for this situation every time
      we use any of these functions.

    Version 3 by Iago Toral Quiroga <[email protected]>:

    Changes from review feedback:
    - Now we can make get_range, combine_range and range_intersection
      static too (suggested by Connor Abbott).
    - Do not return NULL when looking for the smaller or larger constant
      in mixed vector constants. Instead, produce a new constant by doing
      a component-wise minmax. With this we can also remove the
      validations when we call into these functions (suggested by Connor
      Abbott).
    - Add a comment explaining the meaning of the baserange argument in
      prune_expression (suggested by Connor Abbott).

    Other changes:
    - Eliminate minmax expressions operating on constant vectors with
      mixed values by resolving them.

    No piglit regressions observed with Version 3.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76861
    Reviewed-by: Connor Abbott <[email protected]>
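The pruning idea described in the min/max commit above can be sketched on scalars. This is a minimal illustration, not the code from the patch: the tuple encoding `(op, a, b)`, string variables, and the Python `prune`/`get_range`/`evaluate` helpers are assumptions made for this example (only the name `get_range` echoes the commit message). Each operand gets a conservative value range, and an operand is dropped when its range shows it can never decide the result, either against its sibling or against limits inherited from enclosing min/max nodes.

```python
import math

def get_range(e):
    """Conservative (low, high) bounds on the value of expression e."""
    if isinstance(e, tuple):
        op, a, b = e
        (la, ha), (lb, hb) = get_range(a), get_range(b)
        pick = min if op == "min" else max
        return pick(la, lb), pick(ha, hb)
    if isinstance(e, str):            # variable: value unknown
        return -math.inf, math.inf
    return e, e                       # numeric constant

def prune(e, lo=-math.inf, hi=math.inf):
    """Drop operands that cannot affect the result.

    lo/hi are limits inherited from enclosing nodes: values at or above
    hi (or at or below lo) are indistinguishable to the enclosing tree.
    """
    if not isinstance(e, tuple):
        return e
    op, a, b = e
    (la, ha), (lb, hb) = get_range(a), get_range(b)
    if op == "min":
        if lb >= ha or lb >= hi:      # b is never the smaller one that matters
            return prune(a, lo, hi)
        if la >= hb or la >= hi:
            return prune(b, lo, hi)
        a = prune(a, lo, min(hi, hb))
        # Use the range of the already-pruned sibling, not the original.
        b = prune(b, lo, min(hi, get_range(a)[1]))
    else:                             # "max"
        if hb <= la or hb <= lo:      # b is never the larger one that matters
            return prune(a, lo, hi)
        if ha <= lb or ha <= lo:
            return prune(b, lo, hi)
        a = prune(a, max(lo, lb), hi)
        b = prune(b, max(lo, get_range(a)[0]), hi)
    return (op, a, b)

def evaluate(e, env):
    """Evaluate an expression given variable values in env."""
    if isinstance(e, tuple):
        f = min if e[0] == "min" else max
        return f(evaluate(e[1], env), evaluate(e[2], env))
    return env[e] if isinstance(e, str) else e

# clamp(clamp(x, 0, 1), 0, 1) written as nested min/max
double_clamp = ("min", ("max", ("min", ("max", "x", 0.0), 1.0), 0.0), 1.0)
print(prune(double_clamp))
```

The second operand of each node is pruned against the range of the already-pruned first operand; pruning both siblings against each other's original ranges at once would wrongly turn `min(min(x, 5), min(y, 5))` into `min(x, y)`.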
* glsl: Eliminate unused built-in variables after compilation | Ian Romanick | 2014-09-30 | 1 | -0/+1
    After compilation (and before linking) we can eliminate quite a few
    built-in variables. Basically, any uniform or constant (e.g.,
    gl_MaxVertexTextureImageUnits) that isn't used (with one exception) can
    be eliminated. System values, vertex shader inputs (with one
    exception), and fragment shader outputs that are not used and not
    re-declared in the shader text can also be removed.

    gl_ModelViewProjectionMatrix and gl_Vertex are used by the built-in
    function ftransform. There are some complications with eliminating
    these variables (see the comment in the patch), so they are not
    eliminated.

    Valgrind massif results for a trimmed apitrace of dota2:

                      n        time(i)     total(B)  useful-heap(B)  extra-heap(B)  stacks(B)
    Before (32-bit): 46 40,661,487,174   75,116,800      68,854,065      6,262,735          0
    After  (32-bit): 50 40,564,927,443   69,185,408      63,683,871      5,501,537          0

    Before (64-bit): 64 37,200,329,700  104,872,672      96,514,546      8,358,126          0
    After  (64-bit): 59 36,822,048,449   96,526,888      89,113,000      7,413,888          0

    A real savings of 4.9MiB on 32-bit and 7.0MiB on 64-bit.

    v2: Don't remove any built-in with Transpose in the name.
    v3: Fix comment typo noticed by Anuj.

    Signed-off-by: Ian Romanick <[email protected]>
    Suggested-by: Eric Anholt <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
    Acked-by: Anuj Phogat <[email protected]>
    Cc: Eric Anholt <[email protected]>
* glsl: Add a lowering pass for gl_VertexID | Ian Romanick | 2014-09-10 | 1 | -0/+1
    Converts gl_VertexID to (gl_VertexIDMESA + gl_BaseVertex).
    gl_VertexIDMESA is backed by SYSTEM_VALUE_VERTEX_ID_ZERO_BASE, and
    gl_BaseVertex is backed by SYSTEM_VALUE_BASE_VERTEX.

    v2: Put the enum in struct gl_constants and properly resolve the scope
        in C++ code. Fix suggested by Marek.
    v3: Rebase on Matt's foreach_in_list changes (was using foreach_list).
    v4 (Ken): Use a systemvalue instead of a uniform because
        STATE_BASE_VERTEX has been removed.
    v5: Use a boolean to select lowering, and only allow one lowering
        method. Suggested by Ken.
    v6 (Ken): Replace strcmp against literal "gl_BaseVertex"/"gl_VertexID"
        with SYSTEM_VALUE enum checks, for efficiency.
    v7: Rebase on context constant initialization work.

    Signed-off-by: Ian Romanick <[email protected]>
    Signed-off-by: Kenneth Graunke <[email protected]>
* util: Move ralloc to a new src/util directory. | Kenneth Graunke | 2014-08-04 | 1 | -1/+0
    For a long time, we've wanted a place to put utility code which isn't
    directly tied to Mesa or Gallium internals. This patch creates a new
    src/util directory for exactly that purpose, and builds the contents as
    libmesautil.la.

    ralloc seemed like a good first candidate. These days, it's directly
    used by mesa/main, i965, i915, and r300g, so keeping it in src/glsl
    didn't make much sense.

    Signed-off-by: Kenneth Graunke <[email protected]>

    v2 (Jason Ekstrand): More realloc uses and some scons fixes

    Signed-off-by: Jason Ekstrand <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* glsl: Rebalance expression trees that are reduction operations. | Matt Turner | 2014-06-19 | 1 | -0/+1
    The intention of this pass was to give us better instruction scheduling
    opportunities, but it unexpectedly reduced some instruction counts as
    well:

    total instructions in shared programs: 1666639 -> 1666073 (-0.03%)
    instructions in affected programs:     54612 -> 54046 (-1.04%)
    (and trades 4 SIMD16 programs in SS3)
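To make the scheduling motivation concrete, here is a rough sketch of rebalancing a reduction chain. It is an illustration under assumptions, not the Mesa pass: the tuple encoding `(op, a, b)` and the `ASSOCIATIVE` set are invented for this example, and the sketch ignores the fact that floating-point addition is not strictly associative. A left-leaning chain such as `((a + b) + c) + d` has dependency depth 3; the balanced form `(a + b) + (c + d)` has depth 2, so the two inner adds no longer depend on each other.

```python
# Operators this sketch treats as freely reassociable (an assumption).
ASSOCIATIVE = {"+", "*", "min", "max"}

def flatten(e, op):
    """Collect the operands of a chain of the same operator, left to right."""
    if isinstance(e, tuple) and e[0] == op:
        return flatten(e[1], op) + flatten(e[2], op)
    return [e]

def build(op, leaves):
    """Build a balanced binary tree over the operands."""
    if len(leaves) == 1:
        return leaves[0]
    mid = len(leaves) // 2
    return (op, build(op, leaves[:mid]), build(op, leaves[mid:]))

def rebalance(e):
    if not isinstance(e, tuple):
        return e
    op = e[0]
    if op not in ASSOCIATIVE:
        # Not a reduction: keep the shape, but still visit the children.
        return (op, rebalance(e[1]), rebalance(e[2]))
    leaves = [rebalance(x) for x in flatten(e, op)]
    return build(op, leaves)

def depth(e):
    """Length of the longest dependency chain in the expression."""
    if not isinstance(e, tuple):
        return 0
    return 1 + max(depth(e[1]), depth(e[2]))

chain = ("+", ("+", ("+", "a", "b"), "c"), "d")
print(depth(chain), depth(rebalance(chain)))
```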
* glsl/i965: move lower_offset_array up to GLSL compiler level. | Dave Airlie | 2014-02-25 | 1 | -0/+1
    This lowering pass will be useful for gallium drivers as well, in order
    to support the GL TG4 oddity that is textureGatherOffsets.

    Reviewed-by: Chris Forbes <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* glsl: Vectorize multiple scalar assignments | Matt Turner | 2014-01-21 | 1 | -0/+1
    Reduces vertex shader instruction counts in DOTA2 by 6.42%, L4D2 by
    4.61%, and CS:GO by 5.71%.

    total instructions in shared programs: 1500153 -> 1498191 (-0.13%)
    instructions in affected programs:     59919 -> 57957 (-3.27%)

    Reviewed-by: Ian Romanick <[email protected]>
* glsl/loops: Get rid of lower_bounded_loops and ir_loop::normative_bound. | Paul Berry | 2013-12-09 | 1 | -1/+0
    Now that loop_controls no longer creates normatively bound loops, there
    is no need for ir_loop::normative_bound or the lower_bounded_loops
    pass.

    Reviewed-by: Ian Romanick <[email protected]>
* glsl/loops: consolidate bounded loop handling into a lowering pass. | Paul Berry | 2013-12-09 | 1 | -0/+1
    Previously, all of the back-ends (ir_to_mesa, st_glsl_to_tgsi, and the
    i965 fs and vec4 visitors) had nearly identical logic for handling
    bounded loops. This replaces the duplicate logic with an equivalent
    lowering pass that is used by all the back-ends.

    Note: on i965, there is a slight increase in instruction count. For
    example, a loop like this:

        for (int i = 0; i < 100; i++) {
          total += i;
        }

    would previously compile down to this (vec4) native code:

              mov(8)       g4<1>.xD 0D
              mov(8)       g8<1>.xD 0D
        loop:
              cmp.ge.f0(8) null g8<4;4,1>.xD 100D
        (+f0) break(8)
              add(8)       g5<1>.xD g5<4;4,1>.xD g4<4;4,1>.xD
              add(8)       g8<1>.xD g8<4;4,1>.xD 1D
              add(8)       g4<1>.xD g4<4;4,1>.xD 1D
              while(8) loop

    After this patch, the "(+f0) break(8)" turns into:

        (+f0) if(8)
              break(8)
              endif(8)

    because the back-end isn't smart enough to recognize that "if
    (condition) break;" can be done using a conditional break instruction.
    However, it should be relatively easy for a future peephole
    optimization to properly optimize this.

    Reviewed-by: Jordan Justen <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
* glsl: Move the CSE equality functions to the ir class. | Eric Anholt | 2013-11-15 | 1 | -0/+1
    I want to reuse them in opt_algebraic.

    v2: Merge in Chris Forbes's break fix.

    Reviewed-by: Jordan Justen <[email protected]>
* glsl: Linker support for ARB_shader_atomic_counters. | Francisco Jerez | 2013-11-07 | 1 | -0/+1
    v2: Add comments on the purpose of the auxiliary data structures.
        Check for atomic counter overlaps. Use the contains_atomic()
        convenience method. Add static assert with the number of expected
        shader stages.
    v3: Don't resize atomic arrays.
    v4: Add comment on the reason why we don't resize atomic counter
        arrays. Use 'strcmp(...) == 0' instead of '!strcmp(...)'.
    v5 (idr): Don't use STL in the linker.

    Signed-off-by: Francisco Jerez <[email protected]>
    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
* glsl: Add a CSE pass. | Eric Anholt | 2013-11-01 | 1 | -0/+1
    This only operates on constant/uniform values for now, because
    otherwise I'd have to deal with killing my available CSE entries when
    assignments happen, and getting even this working in the tree ir was
    painful enough.

    As is, it has the following effect in shader-db:

    total instructions in shared programs: 1524077 -> 1521964 (-0.14%)
    instructions in affected programs:     50629 -> 48516 (-4.17%)
    GAINED: 0
    LOST:   0

    And, for tropics, that accounts for most of the effect, the FPS
    improvement is 11.67% +/- 0.72% (n=3).

    v2: Use read_only field of the variable, manually check the lod_info
        union members, use get_num_operands(), rename cse_operands_visitor
        to is_cse_candidate_visitor, move all is-a-candidate logic to that
        function, and call it before checking for CSE on a given rvalue,
        more comments, use private keyword.

    Reviewed-by: Paul Berry <[email protected]>
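The restriction to constant/uniform operands is what keeps this kind of CSE simple: a read-only value cannot change between two uses, so an earlier result never has to be invalidated by an assignment. The sketch below illustrates that idea only; the statement encoding `(dest, expr)`, the `cse0`-style temporary names, and the helper names are assumptions for this example rather than the structures used in the patch.

```python
def is_candidate(e, read_only):
    """An expression qualifies if every leaf is a constant or read-only."""
    if isinstance(e, tuple):
        return all(is_candidate(x, read_only) for x in e[1:])
    if isinstance(e, str):
        return e in read_only
    return True                       # numeric constant

def cse(stmts, read_only):
    """Compute each candidate expression once and reuse it afterwards.

    stmts is a list of (dest, expr); expr is a nested tuple (op, args...).
    """
    temps = {}                        # expression -> temporary holding it
    out = []

    def visit(e):
        if not isinstance(e, tuple):
            return e
        if is_candidate(e, read_only):
            if e not in temps:
                temps[e] = f"cse{len(temps)}"      # hypothetical temp name
                out.append((temps[e], e))
            return temps[e]
        return (e[0],) + tuple(visit(x) for x in e[1:])

    for dest, expr in stmts:
        new_expr = visit(expr)        # may emit temporaries first
        out.append((dest, new_expr))
    return out

program = [
    ("a", ("*", ("+", "u", "v"), "x")),
    ("b", ("-", ("+", "u", "v"), "y")),
]
for stmt in cse(program, read_only={"u", "v"}):
    print(stmt)
```

Here `u + v` is computed once into a temporary, while the expressions involving the writable variables `x` and `y` are left alone.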
* glsl: Remove builtin_compiler from the build system. | Kenneth Graunke | 2013-09-09 | 1 | -15/+2
    We don't actually use anything from builtin_function.cpp, so we don't
    need to generate it anymore.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
    Acked-by: Paul Berry <[email protected]>
* glsl: Write a new built-in function module. | Kenneth Graunke | 2013-09-09 | 1 | -0/+1
    This creates a new replacement for the existing built-in function code.
    The new module lives in builtin_functions.cpp (not
    builtin_function.cpp) and exists in parallel with the existing system.
    It isn't used yet.

    The new built-in function code takes a significantly different
    approach: Instead of implementing built-ins via printed IR, build time
    scripts, and run time parsing, we now implement them directly in C++,
    using ir_builder. This translates to faster load times, and a much less
    complex build system.

    It also takes a different approach to built-in availability: each
    signature now stores a boolean predicate, which makes it easy to
    construct arbitrary expressions based on _mesa_glsl_parse_state's
    fields. This is much more flexible than the old system, and also easier
    to use.

    Built-ins are also now stored in a single gl_shader object, rather than
    being spread out across a number of shaders that need to be linked.
    When searching for a matching prototype, we simply consult the
    availability predicate. This also simplifies the code.

    v2: Incorporate Matt Turner's feedback: use the new fma() function
        rather than expr(). Don't expose textureQueryLOD() in GLSL 4.00
        (since it was renamed to textureQueryLod()). Also correct some
        #undefs.
    v3: Incorporate Paul Berry's feedback: rename legacy to compatibility;
        add comments to explain a few things; fix uvec availability;
        include shaderobj.h instead of repeating the _mesa_new_shader
        prototype.
    v4: Fix lack of TEX_PROJECT on textureProjGrad[Offset] (caught by
        oglc). Add an out_var convenience function (more feedback by Matt
        Turner).
    v5: Rework availability predicates for Lod functions. They were
        broken.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
    Enthusiastically-acked-by: Paul Berry <[email protected]>
* glsl/linker: eliminate unused and set-but-unused built-in varyings | Marek Olšák | 2013-07-02 | 1 | -0/+1
    This eliminates built-in varyings such as gl_Color, gl_SecondaryColor,
    gl_TexCoord, and gl_FogFragCoord if they are unused by the next stage
    or not written at all (e.g. gl_TexCoord elements). The gl_TexCoord
    array is broken down into separate vec4s if needed.

    v2: - use a switch statement in
          varying_info_visitor::visit(ir_variable*)
        - use snprintf
        - disable the optimization for GLES2

    Reviewed-by: Ian Romanick <[email protected]>
* glsl: Streamline the built-in type handling code. | Kenneth Graunke | 2013-06-26 | 1 | -0/+1
    Over the last few years, the compiler has grown to support 7 different
    language versions and 6 extensions that add new built-in types. With
    more and more features being added, some of our core code has devolved
    into an unmaintainable spaghetti of sorts.

    A few problems with the old code:

    1. Built-in types are declared...where exactly?

       The types in builtin_types.h were organized in arrays by the
       language version or extension they were introduced in. It's
       factored out to avoid duplicates---every type only exists in one
       array. But that means that sampler1D is declared in 110, sampler2D
       is in core types, sampler3D is a unique global not in a list...and
       so on.

    2. Spaghetti call-chains with weird parameters:

       generate_300ES_types calls generate_130_types which calls
       generate_120_types and generate_EXT_texture_array_types, which
       calls generate_110_types, which calls generate_100ES_types...and
       more

       Except that ES doesn't want 1D types, so we have a skip_1d
       parameter. add_deprecated also falls into this category.

    3. Missing type accessors.

       Common types have convenience pointers (like
       glsl_type::vec4_type), but others may not be accessible at all
       without a symbol table (for example, sampler types).

    4. Global variable declarations in a header file?

       #include "builtin_types.h" in two C++ files would break the build.

    The new code addresses these problems. All built-in types are declared
    together in a single table, independent of when they were introduced.
    The macro that declares a new built-in type also creates a convenience
    pointer, so every type is available and it won't get out of sync.

    The code to populate a symbol table with the appropriate types for a
    particular language version and set of extensions is now a single
    table-driven function. The table lists the type name and GL/ES
    versions when it was introduced (similar to how the lexer handles
    reserved words). A single loop adds types based on the language
    version. Explicit extension checks then add additional types. If they
    were already added based on the language version, glsl_symbol_table
    simply ignores the request to add them a second time, meaning we don't
    need to worry about duplicates and can simply list types where they
    belong.

    v2: Mark uvecs and shadow samplers as ES3 only, and 1DArrayShadow as
        unsupported in ES entirely. Add a touch more doxygen.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
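The table-driven approach described above can be sketched in a few lines. This is an illustration only: the `BUILTIN_TYPES` table, its three example rows, and `types_for` are hypothetical stand-ins, not the contents of the real table. Each row gives a type name plus the first desktop GLSL version and first GLSL ES version that provide it, with `None` meaning "never available there"; a single loop then selects the types for a given language version.

```python
# (name, first desktop GLSL version, first GLSL ES version)
# Illustrative rows only; None means the type never exists in that API.
BUILTIN_TYPES = [
    ("vec4",                 110, 100),
    ("uvec2",                130, 300),
    ("sampler1DArrayShadow", 130, None),
]

def types_for(version, is_es):
    """Names of the built-in types visible at a language version."""
    names = []
    for name, gl_version, es_version in BUILTIN_TYPES:
        introduced = es_version if is_es else gl_version
        if introduced is not None and version >= introduced:
            names.append(name)
    return names

print(types_for(300, is_es=True))
print(types_for(130, is_es=False))
```

Because availability lives in the row itself, special cases such as "ES never gets 1D types" become a `None` in the table rather than an extra parameter threaded through a chain of generator functions.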
* glsl linker: compare interface blocks during intrastage linking | Jordan Justen | 2013-05-23 | 1 | -0/+1
    Verify that interface blocks match when combining compilation units at
    the same stage. (For example, when merging all vertex shaders.)

    Fixes piglit glsl-1.50 test:
     * linker/interface-blocks-multiple-vs-member-count-mismatch.shader_test

    v5 (Ken): Rename to link_interface_blocks.cpp and drop the separate .h
        file for consistency with other linker code. Remove "ok" variable.
        Fold cross_validate_interface_blocks into its caller.

    Signed-off-by: Jordan Justen <[email protected]>
    Signed-off-by: Kenneth Graunke <[email protected]>
* glsl linker: remove interface block instance names | Jordan Justen | 2013-05-23 | 1 | -0/+1
    Convert interface blocks with instance names into flat interface blocks
    without an instance name.

    Signed-off-by: Jordan Justen <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Add lowering pass for ir_triop_vector_insert | Ian Romanick | 2013-05-13 | 1 | -0/+1
    This will eventually replace do_vec_index_to_cond_assign. This
    lowering pass is called in all the places where
    do_vec_index_to_cond_assign or do_vec_index_to_swizzle is called.

    v2: Use WRITEMASK_* instead of integer literals. Use a more concise
        method of generating broadcast_index. Both suggested by Eric.
    v3: Use a series of scalar compares instead of a single vector
        compare. Suggested by Eric and Ken. It still uses
        'if (cond) v.x = y;' instead of conditional assignments because
        ir_builder doesn't do conditional assignments, and I'd rather keep
        the code simple.

    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Add a pass to flip matrix/vector multiplies to use dot products. | Kenneth Graunke | 2013-05-12 | 1 | -0/+1
    This pass flips (matrix * vector) operations to (vector *
    matrixTranspose) for certain built-in matrices (currently
    gl_ModelViewProjectionMatrix and gl_TextureMatrix). This is
    equivalent, but results in dot products rather than multiplies and
    adds. On some hardware, this is more efficient.

    This pass is conditionalized on ctx->mvp_with_dp4, the flag drivers set
    to indicate they prefer dot products.

    Improves performance in Lightsmark by 1.01131% +/- 0.162069% (n = 10)
    on a Haswell GT2 system. Passes Piglit on Ivybridge.

    v2: Use struct gl_shader_compiler_options instead of plumbing through
        another boolean flag for this purpose.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
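The equivalence this pass relies on is easy to check numerically. The sketch below is plain arithmetic, not the pass itself, and the helper names are made up for the example. With column-major storage, as GLSL uses, `M * v` is a sum of columns scaled by the components of `v` (a multiply followed by multiply-adds), while `v * transpose(M)` computes each output component as one dot product of `v` with a column of the transposed matrix.

```python
def mat_times_vec(cols, v):
    """M * v with M given as a list of columns: multiplies and adds."""
    n = len(cols[0])
    out = [0.0] * n
    for col, s in zip(cols, v):
        for i in range(n):
            out[i] += col[i] * s      # scale a column, accumulate
    return out

def transpose(cols):
    """Columns of the transpose, i.e. the rows of the original matrix."""
    return [list(row) for row in zip(*cols)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def vec_times_mat(v, cols):
    """v * M: component i is dot(v, column i of M)."""
    return [dot(v, col) for col in cols]

M = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
v = [1, 0, 2, 1]
print(mat_times_vec(M, v))
print(vec_times_mat(v, transpose(M)))
```

Both calls yield the same vector; the second form maps onto four independent dot products, which is the shape the commit says some hardware prefers.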
* glsl: Refactor handling of ast_array_index to a separate function | Ian Romanick | 2013-04-08 | 1 | -0/+1
    I love 800+ line switch-statements as much as the next guy... Future
    commits will make changes to this part of the AST-to-HIR conversion,
    and extracting this code will make that a bit easier.

    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Add an optimization pass to flatten simple nested if blocks. | Kenneth Graunke | 2013-04-04 | 1 | -0/+1
    GLBenchmark 2.7's shaders contain conditional blocks like:

        if (x) {
            if (y) {
                ...
            }
        }

    where the outer conditional's then clause contains exactly one
    statement (the nested if) and there are no else clauses. This can
    easily be optimized into:

        if (x && y) {
            ...
        }

    This saves a few instructions in GLBenchmark 2.7:

    total instructions in shared programs: 11833 -> 11649 (-1.55%)
    instructions in affected programs:     8234 -> 8050 (-2.23%)

    It also helps CS:GO slightly (-0.05%/-0.22%). More importantly,
    however, it simplifies the control flow graph, which could enable other
    optimizations.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
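The rewrite rule in the commit above is small enough to sketch directly. The `If` dataclass and string statements below are a toy IR invented for this illustration, not Mesa's `ir_if`. The rule fires only under the conditions the message states: the outer then-branch holds exactly one statement, that statement is an `if`, and neither `if` has an else clause.

```python
from dataclasses import dataclass, field
from typing import List, Union

@dataclass
class If:
    cond: str
    then: List["Stmt"]
    orelse: List["Stmt"] = field(default_factory=list)

Stmt = Union[str, If]                 # plain statements are just strings

def flatten_nested_ifs(stmts):
    """Turn `if (x) { if (y) {...} }` into `if (x && y) {...}`."""
    out = []
    for s in stmts:
        if isinstance(s, If):
            # Handle inner blocks first so deeper nests collapse fully.
            s = If(s.cond,
                   flatten_nested_ifs(s.then),
                   flatten_nested_ifs(s.orelse))
            if (not s.orelse and len(s.then) == 1
                    and isinstance(s.then[0], If)
                    and not s.then[0].orelse):
                inner = s.then[0]
                s = If(f"({s.cond} && {inner.cond})", inner.then)
        out.append(s)
    return out

nested = [If("x", [If("y", ["body"])])]
print(flatten_nested_ifs(nested))
```

An outer `if` with an else clause, or with a second statement next to the nested `if`, is left untouched, since merging the conditions would change which code runs.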
* glsl: Add a visitor to determine whether a uniform block is ever used | Ian Romanick | 2013-01-25 | 1 | -0/+1
    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Jordan Justen <[email protected]>
* linker: Refactor intra-stage block compatibility testing | Ian Romanick | 2013-01-25 | 1 | -0/+1
    Also slightly change the compatibility test. Instead of comparing the
    offsets of the block variables, compare the packing mode of the blocks.
    Ideally we don't want to assign the offsets until a later stage of
    linking.

    This is put in a new file called link_uniform_blocks.cpp. Some new
    functions related to uniform blocks are going to live in that file as
    well.

    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Carl Worth <[email protected]>
    Reviewed-by: Chad Versace <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Add lowering pass for GLSL ES 3.00 pack/unpack operations (v4) | Chad Versace | 2013-01-24 | 1 | -0/+1
    Lower them to arithmetic and bit manipulation expressions.

    v2: Rewrite using ir_builder [for idr].
    v3: Comment typos. [for mattst88]
    v4: Fix arithmetic error in comments. Factor out a shift instruction.
        Don't heap allocate factory.instructions. [for paul]

    Reviewed-by: Ian Romanick <[email protected]> (v2)
    Reviewed-by: Matt Turner <[email protected]> (v3)
    Reviewed-by: Paul Berry <[email protected]> (v4)
    Signed-off-by: Chad Versace <[email protected]>
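As one example of what "arithmetic and bit manipulation" means here, the sketch below mirrors the GLSL ES 3.00 definitions of packUnorm2x16 and unpackUnorm2x16 in plain Python. It shows the arithmetic a lowered form has to reproduce, not the IR the pass emits, and the Python function names are made up for the illustration. One caveat: Python's `round` rounds halves to even, which may differ from a particular GPU's behaviour for inputs that land exactly on .5.

```python
def pack_unorm_2x16(x, y):
    """Clamp to [0, 1], scale to 16 bits, put x in the low half-word."""
    def convert(c):
        return int(round(min(max(c, 0.0), 1.0) * 65535.0))
    return (convert(y) << 16) | convert(x)

def unpack_unorm_2x16(p):
    """Split the two 16-bit fields and rescale them to [0, 1]."""
    return ((p & 0xFFFF) / 65535.0, (p >> 16) / 65535.0)

print(hex(pack_unorm_2x16(1.0, 0.0)))
print(unpack_unorm_2x16(0xFFFF0000))
```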
* glsl/Makefile.sources: Correct BUILTIN_COMPILER_CXX_FILES | Matt Turner | 2013-01-22 | 1 | -1/+1
    Squashed with two reverts:

    Revert "android: Update for builtin_stubs.cpp move"
    This reverts commit c0def90ede1e939173041b8785303de90f8fdc6c.

    Revert "scons: Update for builtin_stubs.cpp"
    This reverts commit 8ac4b82699ad0a59ae6ae6d3415702eaa5d4fe3b.

    Tested-by: Andreas Boll <[email protected]>
    Tested-on-Android-by: Chad Versace <[email protected]>
* glsl/build: Build glcpp via the glsl Makefile | Matt Turner | 2013-01-22 | 1 | -4/+4
    Removing the subdirectory recursion provides a small speed up.

    Tested-by: Andreas Boll <[email protected]>
* glsl: Separate varying linking code to its own file. | Paul Berry | 2013-01-08 | 1 | -0/+1
    linker.cpp is getting pretty big, and we're about to add even more
    varying packing code, so split out the linker code that concerns
    varyings to its own file.

    Reviewed-by: Kenneth Graunke <[email protected]>