summaryrefslogtreecommitdiffstats
path: root/src/glsl/Makefile.sources
Commit message (Collapse)AuthorAgeFilesLines
* glsl: Vectorize multiple scalar assignmentsMatt Turner2014-01-211-0/+1
| | | | | | | | | | Reduces vertex shader instruction counts in DOTA2 by 6.42%, L4D2 by 4.61%, and CS:GO by 5.71%. total instructions in shared programs: 1500153 -> 1498191 (-0.13%) instructions in affected programs: 59919 -> 57957 (-3.27%) Reviewed-by: Ian Romanick <[email protected]>
* glsl/loops: Get rid of lower_bounded_loops and ir_loop::normative_bound.Paul Berry2013-12-091-1/+0
| | | | | | | | Now that loop_controls no longer creates normatively bound loops, there is no need for ir_loop::normative_bound or the lower_bounded_loops pass. Reviewed-by: Ian Romanick <[email protected]>
* glsl/loops: consolidate bounded loop handling into a lowering pass.Paul Berry2013-12-091-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, all of the back-ends (ir_to_mesa, st_glsl_to_tgsi, and the i965 fs and vec4 visitors) had nearly identical logic for handling bounded loops. This replaces the duplicate logic with an equivalent lowering pass that is used by all the back-ends. Note: on i965, there is a slight increase in instruction count. For example, a loop like this: for (int i = 0; i < 100; i++) { total += i; } would previously compile down to this (vec4) native code: mov(8) g4<1>.xD 0D mov(8) g8<1>.xD 0D loop: cmp.ge.f0(8) null g8<4;4,1>.xD 100D (+f0) break(8) add(8) g5<1>.xD g5<4;4,1>.xD g4<4;4,1>.xD add(8) g8<1>.xD g8<4;4,1>.xD 1D add(8) g4<1>.xD g4<4;4,1>.xD 1D while(8) loop After this patch, the "(+f0) break(8)" turns into: (+f0) if(8) break(8) endif(8) because the back-end isn't smart enough to recognize that "if (condition) break;" can be done using a conditional break instruction. However, it should be relatively easy for a future peephole optimization to properly optimize this. Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* glsl: Move the CSE equality functions to the ir class.Eric Anholt2013-11-151-0/+1
| | | | | | | | I want to reuse them in opt_algebraic. v2: Merge in Chris Forbes's break fix. Reviewed-by: Jordan Justen <[email protected]>
* glsl: Linker support for ARB_shader_atomic_counters.Francisco Jerez2013-11-071-0/+1
| | | | | | | | | | | | | | | v2: Add comments on the purpose of the auxiliary data structures. Check for atomic counter overlaps. Use the contains_atomic() convenience method. Add static assert with the number of expected shader stages. v3: Don't resize atomic arrays. v4: Add comment on the reason why we don't resize atomic counter arrays. Use 'strcmp(...) == 0' instead of '!strcmp(...)'. v5 (idr): Don't use STL in the linker. Signed-off-by: Francisco Jerez <[email protected]> Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* glsl: Add a CSE pass.Eric Anholt2013-11-011-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | This only operates on constant/uniform values for now, because otherwise I'd have to deal with killing my available CSE entries when assignments happen, and getting even this working in the tree ir was painful enough. As is, it has the following effect in shader-db: total instructions in shared programs: 1524077 -> 1521964 (-0.14%) instructions in affected programs: 50629 -> 48516 (-4.17%) GAINED: 0 LOST: 0 And, for tropics, that accounts for most of the effect, the FPS improvement is 11.67% +/- 0.72% (n=3). v2: Use read_only field of the variable, manually check the lod_info union members, use get_num_operands(), rename cse_operands_visitor to is_cse_candidate_visitor, move all is-a-candidate logic to that function, and call it before checking for CSE on a given rvalue, more comments, use private keyword. Reviewed-by: Paul Berry <[email protected]>
* glsl: Remove builtin_compiler from the build system.Kenneth Graunke2013-09-091-15/+2
| | | | | | | | | We don't actually use anything from builtin_function.cpp, so we don't need to generate it anymore. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Acked-by: Paul Berry <[email protected]>
* glsl: Write a new built-in function module.Kenneth Graunke2013-09-091-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This creates a new replacement for the existing built-in function code. The new module lives in builtin_functions.cpp (not builtin_function.cpp) and exists in parallel with the existing system. It isn't used yet. The new built-in function code takes a significantly different approach: Instead of implementing built-ins via printed IR, build time scripts, and run time parsing, we now implement them directly in C++, using ir_builder. This translates to faster load times, and a much less complex build system. It also takes a different approach to built-in availability: each signature now stores a boolean predicate, which makes it easy to construct arbitrary expressions based on _mesa_glsl_parse_state's fields. This is much more flexible than the old system, and also easier to use. Built-ins are also now stored in a single gl_shader object, rather than being spread out across a number of shaders that need to be linked. When searching for a matching prototype, we simply consult the availability predicate. This also simplifies the code. v2: Incorporate Matt Turner's feedback: use the new fma() function rather than expr(). Don't expose textureQueryLOD() in GLSL 4.00 (since it was renamed to textureQueryLod()). Also correct some #undefs. v3: Incorporate Paul Berry's feedback: rename legacy to compatibility; add comments to explain a few things; fix uvec availability; include shaderobj.h instead of repeating the _mesa_new_shader prototype. v4: Fix lack of TEX_PROJECT on textureProjGrad[Offset] (caught by oglc). Add an out_var convenience function (more feedback by Matt Turner). v5: Rework availability predicates for Lod functions. They were broken. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Enthusiastically-acked-by: Paul Berry <[email protected]>
* glsl/linker: eliminate unused and set-but-unused built-in varyingsMarek Olšák2013-07-021-0/+1
| | | | | | | | | | | | | This eliminates built-in varyings such as gl_Color, gl_SecondaryColor, gl_TexCoord, and gl_FogFragCoord if they are unused by the next stage or not written at all (e.g. gl_TexCoord elements). The gl_TexCoord array is broken down into separate vec4s if needed. v2: - use a switch statement in varying_info_visitor::visit(ir_variable*) - use snprintf - disable the optimization for GLES2 Reviewed-by: Ian Romanick <[email protected]>
* glsl: Streamline the built-in type handling code.Kenneth Graunke2013-06-261-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Over the last few years, the compiler has grown to support 7 different language versions and 6 extensions that add new built-in types. With more and more features being added, some of our core code has devolved into an unmaintainable spaghetti of sorts. A few problems with the old code: 1. Built-in types are declared...where exactly? The types in builtin_types.h were organized in arrays by the language version or extension they were introduced in. It's factored out to avoid duplicates---every type only exists in one array. But that means that sampler1D is declared in 110, sampler2D is in core types, sampler3D is a unique global not in a list...and so on. 2. Spaghetti call-chains with weird parameters: generate_300ES_types calls generate_130_types which calls generate_120_types and generate_EXT_texture_array_types, which calls generate_110_types, which calls generate_100ES_types...and more Except that ES doesn't want 1D types, so we have a skip_1d parameter. add_deprecated also falls into this category. 3. Missing type accessors. Common types have convenience pointers (like glsl_type::vec4_type), but others may not be accessible at all without a symbol table (for example, sampler types). 4. Global variable declarations in a header file? #include "builtin_types.h" in two C++ files would break the build. The new code addresses these problems. All built-in types are declared together in a single table, independent of when they were introduced. The macro that declares a new built-in type also creates a convenience pointer, so every type is available and it won't get out of sync. The code to populate a symbol table with the appropriate types for a particular language version and set of extensions is now a single table-driven function. The table lists the type name and GL/ES versions when it was introduced (similar to how the lexer handles reserved words). A single loop adds types based on the language version. Explicit extension checks then add additional types. If they were already added based on the language version, glsl_symbol_table simply ignores the request to add them a second time, meaning we don't need to worry about duplicates and can simply list types where they belong. v2: Mark uvecs and shadow samplers as ES3 only, and 1DArrayShadow as unsupported in ES entirely. Add a touch more doxygen. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* glsl linker: compare interface blocks during intrastage linkingJordan Justen2013-05-231-0/+1
| | | | | | | | | | | | | | | | Verify that interface blocks match when combining compilation units at the same stage. (For example, when merging all vertex shaders.) Fixes piglit glsl-1.50 test: * linker/interface-blocks-multiple-vs-member-count-mismatch.shader_test v5 (Ken): Rename to link_interface_blocks.cpp and drop the separate .h file for consistency with other linker code. Remove "ok" variable. Fold cross_validate_interface_blocks into its caller. Signed-off-by: Jordan Justen <[email protected]> Signed-off-by: Kenneth Graunke <[email protected]>
* glsl linker: remove interface block instance namesJordan Justen2013-05-231-0/+1
| | | | | | | | Convert interface blocks with instance names into flat interface blocks without an instance name. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Add lowering pass for ir_triop_vector_insertIan Romanick2013-05-131-0/+1
| | | | | | | | | | | | | | | | | | This will eventually replace do_vec_index_to_cond_assign. This lowering pass is called in all the places where do_vec_index_to_cond_assign or do_vec_index_to_swizzle is called. v2: Use WRITEMASK_* instead of integer literals. Use a more concise method of generating broadcast_index. Both suggested by Eric. v3: Use a series of scalar compares instead of a single vector compare. Suggested by Eric and Ken. It still uses 'if (cond) v.x = y;' instead of conditional assignments because ir_builder doesn't do conditional assignments, and I'd rather keep the code simple. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Add a pass to flip matrix/vector multiplies to use dot products.Kenneth Graunke2013-05-121-0/+1
| | | | | | | | | | | | | | | | | | | | | | This pass flips (matrix * vector) operations to (vector * matrixTranspose) for certain built-in matrices (currently gl_ModelViewProjectionMatrix and gl_TextureMatrix). This is equivalent, but results in dot products rather than multiplies and adds. On some hardware, this is more efficient. This pass is conditionalized on ctx->mvp_with_dp4, the flag drivers set to indicate they prefer dot products. Improves performance in Lightsmark by 1.01131% +/- 0.162069% (n = 10) on a Haswell GT2 system. Passes Piglit on Ivybridge. v2: Use struct gl_shader_compiler_options instead of plumbing through another boolean flag for this purpose. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* glsl: Refactor handling of ast_array_index to a separate functionIan Romanick2013-04-081-0/+1
| | | | | | | | | I love 800+ line switch-statements as much as the next guy... Future commits will make changes to this part of the AST-to-HIR conversion, and extracting this code will make that a bit easier. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Add an optimization pass to flatten simple nested if blocks.Kenneth Graunke2013-04-041-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | GLBenchmark 2.7's shaders contain conditional blocks like: if (x) { if (y) { ... } } where the outer conditional's then clause contains exactly one statement (the nested if) and there are no else clauses. This can easily be optimized into: if (x && y) { ... } This saves a few instructions in GLBenchmark 2.7: total instructions in shared programs: 11833 -> 11649 (-1.55%) instructions in affected programs: 8234 -> 8050 (-2.23%) It also helps CS:GO slightly (-0.05%/-0.22%). More importantly, however, it simplifies the control flow graph, which could enable other optimizations. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* glsl: Add a visitor to determine whether a uniform block is ever usedIan Romanick2013-01-251-0/+1
| | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* linker: Refactor intra-stage block compatabililty testingIan Romanick2013-01-251-0/+1
| | | | | | | | | | | | | | | | Also slightly change the compatibility test. Instead of comparing the offsets of the block variables, compare the packing mode of the blocks. Ideally we don't want to assign the offsets until a later stage of linking. This is put in a new file called link_uniform_blocks.cpp. Some new functions related to uniform blocks are going to live in that file as well. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Carl Worth <[email protected]> Reviewed-by: Chad Versace <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Add lowering pass for GLSL ES 3.00 pack/unpack operations (v4)Chad Versace2013-01-241-0/+1
| | | | | | | | | | | | | | | | Lower them to arithmetic and bit manipulation expressions. v2: Rewrite using ir_builder [for idr]. v3: Comment typos. [for mattst88] v4: Fix arithmetic error in comments. Factor out a shift instruction. Don't heap allocate factory.instructions. [for paul] Reviewed-by: Ian Romanick <[email protected]> (v2) Reviewed-by: Matt Tuner <[email protected]> (v3) Reviewed-by: Paul Berry <[email protected]> (v4) Signed-off-by: Chad Versace <[email protected]>
* glsl/Makefile.sources: Correct BUILTIN_COMPILER_CXX_FILESMatt Turner2013-01-221-1/+1
| | | | | | | | | | | | | | | Squashed with two reverts: Revert "android: Update for builtin_stubs.cpp move" This reverts commit c0def90ede1e939173041b8785303de90f8fdc6c. Revert "scons: Update for builtin_stubs.cpp" This reverts commit 8ac4b82699ad0a59ae6ae6d3415702eaa5d4fe3b. Tested-by: Andreas Boll <[email protected]> Tested-on-Android-by: Chad Versace <[email protected]>
* glsl/build: Build glcpp via the glsl MakefileMatt Turner2013-01-221-4/+4
| | | | | | Removing the subdirectory recursion provides a small speed up. Tested-by: Andreas Boll <[email protected]>
* glsl: Separate varying linking code to its own file.Paul Berry2013-01-081-0/+1
| | | | | | | | linker.cpp is getting pretty big, and we're about to add even more varying packing code, so split out the linker code that concerns varyings to its own file. Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Add a lowering pass for packing varyings.Paul Berry2012-12-141-0/+1
| | | | | | | | | | | | | | This lowering pass generates GLSL code that manually packs varyings into vec4 slots, for the benefit of back-ends that don't support packed varyings natively. No functional change--the lowering pass is not yet used. Reviewed-by: Eric Anholt <[email protected]> v2: Don't use ir_hierarchical_visitor--just loop over instructions directly. Also, make the names of the packed varyings include the names of the original varyings that were packed into them.
* automake: Remove empty file variable.Brian Paul2012-11-121-2/+1
| | | | | | | | Fixes SCons build regression introduced with commit a665cf1226b80ec52a0c1a4a38378df4389e8ebf. Signed-off-by: Vinson Lee <[email protected]> Tested-by: Vinson Lee <[email protected]>
* automake: Merge *_CXX_FILES variables in the glsl build.Eric Anholt2012-11-121-5/+4
| | | | | Reviewed-by: Chad Versace <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* automake: Fix a comment typo.Eric Anholt2012-11-121-1/+1
| | | | | Reviewed-by: Chad Versace <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* build/glsl: fix location of generated files.Christopher James Halse Rogers2012-08-131-7/+8
| | | | | | | | Like in src/mesa, use GLSL_BUILDDIR/GLSL_SRCDIR to unambiguously distinguish between in-tree and generated files. Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Christopher James Halse Rogers <[email protected]>
* glsl: Add a lowering pass to turn complicated UBO references to vector loads.Eric Anholt2012-08-071-0/+1
| | | | | | | | | | | v2: Reduce the impenetrable code in emit_ubo_loads() by 23 lines by keeping the ir_variable as the variable part of the offset from handle_rvalue(), and track the constant offsets from that with a plain old integer value, avoiding a bunch of temporary variables in the array and struct handling. Also, fix file description doxygen. v3: Fix a row vs col typo, and fix spelling in a comment. Reviewed-by: Eric Anholt <[email protected]>
* automake: convert libglslJon TURNEY2012-07-131-2/+2
| | | | | | | | | | | | | | | | | | | | v2: Use AM_V_GEN to silence generated code rules. Add BUILT_SOURCES to CLEANFILES v3: - Fix an accidental // in a path - Use automake make rules for lex/yacc rather than writing our own - Update .gitignore appropriately - Build a libglcpp convenience library rather than awkwardly including the files in libglsl and delegating the generation - Remove libglsl.a compatibility link on clean v4: - Automake's rules for lex/yacc make .cc if source is .ll or .yy, and apparently we must use those extensions "because of scons", so update everywhere glsl_parser.cpp -> glsl_parser.cc and glsl_lexer.cpp -> glsl_lexer.cc. This fixes 'make tarballs' and building with dricore enabled. Signed-off-by: Jon TURNEY <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Tested-by: Matt Turner <[email protected]>
* scons: Fix scons build.José Fonseca2012-06-111-0/+2
|
* automake: Add a prefix variable for libglsl sources.Eric Anholt2012-06-111-86/+86
| | | | | See e86c40a84d241b954594f5ae7df9b9c3fc797a4e for reasoning. In the process I did s/:=/=/ to shut up automake about nonportable make syntax.
* glsl: Set initial values for uniforms in the linkerIan Romanick2012-05-231-0/+1
| | | | | | | | | | | | | | v2: Fix handling of arrays-of-structure. Thanks to Eric Anholt for pointing this out. v3: Minor comment change based on feedback from Ken. Fixes piglit glsl-1.20/execution/uniform-initializer/fs-structure-array and glsl-1.20/execution/uniform-initializer/vs-structure-array. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Implement the GLSL 1.30+ discard control flow rule in GLSL IR.Eric Anholt2012-05-141-0/+1
| | | | | | | Previously, I tried implementing this in the i965 driver, but did so in a way that violated the intent of the spec, and broke Tropics. Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Remove the opt_discard_simplification pass.Eric Anholt2012-05-141-1/+0
| | | | | | | This conflicts with the GLSL 1.30+ rules for derivatives after a discard has occurred. Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Create an ir_builder helper for hand-generating IR.Eric Anholt2012-04-131-0/+1
| | | | | | | | | | | | | | The C++ constructors with placement new, while functional, are extremely verbose, leading to generation of simple GLSL IR expressions like (a * b + c * d) expanding to many lines of code and using lots of temporary variables. By creating a new ir_builder.h that puts simple generators in our namespace and taking advantage of ralloc_parent(), we can generate much more compact code, at a minor runtime cost. v2: Replace ir_instruction usage with just ir_rvalue. v3: Drop remaining missed as_rvalue() in v2. Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Add an array splitting pass.Eric Anholt2012-04-111-0/+1
| | | | | | | | | | | | | | | | | I've had this code laying around almost done for a long time. The idea is like opt_structure_splitting, that we've got a bunch of transforms at the GLSL IR level that only understand scalars and vectors, which just skip complicated dereferences. While driver backends may manage some optimization after they split matrices up themselves, it would be better to bring all of our optimization to bear on the problem. While I wasn't expecting changes quite yet, a few programs end up winning: a gstreamer convolution shader, and the Humus dynamic branching demo: Total instructions: 269430 -> 269342 3/2148 programs affected (0.1%) 1498 -> 1410 instructions in affected programs (5.9% reduction)
* glsl: rename Makefile.sources' _SOURCES variablesMatt Turner2012-01-301-9/+9
| | | | | | | | automake uses variables named *_SOURCES. Reviewed-by: Eric Anholt <[email protected]> Tested-by: Eric Anholt <[email protected]> Signed-off-by: Matt Turner <[email protected]>
* glsl: Add a lowering pass to remove reads of shader output variables.Vincent Lejeune2012-01-061-0/+1
| | | | | | | | | This is similar to Gallium's existing glsl_to_tgsi::remove_output_read lowering pass, but done entirely inside the GLSL compiler. Signed-off-by: Vincent Lejeune <[email protected]> Signed-off-by: Kenneth Graunke <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* glsl: Move ir_variable.cpp to builtin_variables.cpp.Eric Anholt2011-11-111-1/+1
| | | | | | It's only about builtins, not variables in general. Reviewed-by: Ian Romanick <[email protected]>
* glsl: Refactor source lists to Makefile.sourcesChia-I Wu2011-11-021-0/+104
With the hope that Android.mk and SConscript can share the file to reduce future breakage. Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Chad Versace <[email protected]>