<HTML>

<TITLE>Shading Language Support</TITLE>

<link rel="stylesheet" type="text/css" href="mesa.css"></head>

<BODY>

<H1>Shading Language Support</H1>

<p>
This page describes the features and status of Mesa's support for the
<a href="http://opengl.org/documentation/glsl/" target="_parent">
OpenGL Shading Language</a>.
</p>

<p>
Last updated on 15 December 2008.
</p>

<p>
Contents
</p>
<ul>
<li><a href="#envvars">Environment variables</a>
<li><a href="#120">GLSL 1.20 support</a>
<li><a href="#unsup">Unsupported Features</a>
<li><a href="#notes">Implementation Notes</a>
<li><a href="#hints">Programming Hints</a>
<li><a href="#standalone">Stand-alone GLSL Compiler</a>
<li><a href="#implementation">Compiler Implementation</a>
<li><a href="#validation">Compiler Validation</a>
</ul>



<a name="envvars">
<h2>Environment Variables</h2>

<p>
The <b>MESA_GLSL</b> environment variable can be set to a comma-separated
list of keywords to control some aspects of the GLSL compiler and shader
execution.  These are generally used for debugging.
</p>
<ul>
<li>dump - print GLSL shader code to stdout at link time
<li>log - log all GLSL shaders to files.
    The filenames will be "shader_X.vert" or "shader_X.frag" where X
    the shader ID.
<li>nopt - disable compiler optimizations
<li>opt - force compiler optimizations
<li>uniform - print message to stdout when glUniform is called
<li>nopvert - force vertex shaders to be a simple shader that just transforms
    the vertex position with ftransform() and passes through the color and
    texcoord[0] attributes.
<li>nopfrag - force fragment shader to be a simple shader that passes
    through the color attribute.
</ul>
<p>
Example:  export MESA_GLSL=dump,nopt
</p>


<a name="120">
<h2>GLSL 1.20 support</h2>

<p>
GLSL version 1.20 is supported in Mesa 7.3 and later.
Among the features/differences of GLSL 1.20 are:
<ul>
<li><code>mat2x3, mat2x4</code>, etc. types and functions
<li><code>transpose(), outerProduct(), matrixCompMult()</code> functions
(but untested)
<li>precision qualifiers (lowp, mediump, highp)
<li><code>invariant</code> qualifier
<li><code>array.length()</code> method
<li><code>float[5] a;</code> array syntax
<li><code>centroid</code> qualifier
<li>unsized array constructors
<li>initializers for uniforms
<li>const initializers calling built-in functions
</ul>



<a name="unsup">
<h2>Unsupported Features</h2>

<p>
The following features of the shading language are not yet fully supported
in Mesa:
</p>

<ul>
<li>Linking of multiple shaders does not always work.  Currently, linking
    is implemented through shader concatenation and re-compiling.  This
    doesn't always work because of some #pragma and preprocessor issues.
<li>gl_ClipVertex
<li>The gl_Color and gl_SecondaryColor varying vars are interpolated
    without perspective correction
</ul>

<p>
All other major features of the shading language should function.
</p>


<a name="notes">
<h2>Implementation Notes</h2>

<ul>
<li>Shading language programs are compiled into low-level programs
    very similar to those of GL_ARB_vertex/fragment_program.
<li>All vector types (vec2, vec3, vec4, bvec2, etc) currently occupy full
    float[4] registers.
<li>Float constants and variables are packed so that up to four floats
    can occupy one program parameter/register.
<li>All function calls are inlined.
<li>Shaders which use too many registers will not compile.
<li>The quality of generated code is pretty good, register usage is fair.
<li>Shader error detection and reporting of errors (InfoLog) is not
    very good yet.
<li>The ftransform() function doesn't necessarily match the results of
    fixed-function transformation.
</ul>

<p>
These issues will be addressed/resolved in the future.
</p>


<a name="hints">
<h2>Programming Hints</h2>

<ul>
<li>Declare <em>in</em> function parameters as <em>const</em> whenever possible.
    This improves the efficiency of function inlining.
</li>
<br>
<li>To reduce register usage, declare variables within smaller scopes.
    For example, the following code:
<pre>
    void main()
    {
       vec4 a1, a2, b1, b2;
       gl_Position = expression using a1, a2.
       gl_Color = expression using b1, b2;
    }
</pre>
    Can be rewritten as follows to use half as many registers:
<pre>
    void main()
    {
       {
          vec4 a1, a2;
          gl_Position = expression using a1, a2.
       }
       {
          vec4 b1, b2;
          gl_Color = expression using b1, b2;
       }
    }
</pre>
    Alternately, rather than using several float variables, use
    a vec4 instead.  Use swizzling and writemasks to access the
    components of the vec4 as floats.
</li>
<br>
<li>Use the built-in library functions whenever possible.
    For example, instead of writing this:
<pre>
        float x = 1.0 / sqrt(y);
</pre>
    Write this:
<pre>
        float x = inversesqrt(y);
</pre>
<li>
   Use ++i when possible as it's more efficient than i++
</li>
</ul>


<a name="standalone">
<h2>Stand-alone GLSL Compiler</h2>

<p>
A unique stand-alone GLSL compiler driver has been added to Mesa.
<p>

<p>
The stand-alone compiler (like a conventional command-line compiler)
is a tool that accepts Shading Language programs and emits low-level
GPU programs.
</p>

<p>
This tool is useful for:
<p>
<ul>
<li>Inspecting GPU code to gain insight into compilation
<li>Generating initial GPU code for subsequent hand-tuning
<li>Debugging the GLSL compiler itself
</ul>

<p>
After building Mesa, the glslcompiler can be built by manually running:
</p>
<pre>
    make realclean
    make linux
    cd src/mesa/drivers/glslcompiler
    make
</pre>


<p>
Here's an example of using the compiler to compile a vertex shader and
emit GL_ARB_vertex_program-style instructions:
</p>
<pre>
    bin/glslcompiler --debug --numbers --fs progs/glsl/CH06-brick.frag.txt
</pre>
<p>
results in:
</p>
<pre>
# Fragment Program/Shader
  0: RCP TEMP[4].x, UNIFORM[2].xxxx;
  1: RCP TEMP[4].y, UNIFORM[2].yyyy;
  2: MUL TEMP[3].xy, VARYING[0], TEMP[4];
  3: MOV TEMP[1], TEMP[3];
  4: MUL TEMP[0].w, TEMP[1].yyyy, CONST[4].xxxx;
  5: FRC TEMP[1].z, TEMP[0].wwww;
  6: SGT.C TEMP[0].w, TEMP[1].zzzz, CONST[4].xxxx;
  7: IF (NE.wwww); # (if false, goto 9);
  8:    ADD TEMP[1].x, TEMP[1].xxxx, CONST[4].xxxx;
  9: ENDIF;
 10: FRC TEMP[1].xy, TEMP[1];
 11: SGT TEMP[2].xy, UNIFORM[3], TEMP[1];
 12: MUL TEMP[1].z, TEMP[2].xxxx, TEMP[2].yyyy;
 13: LRP TEMP[0], TEMP[1].zzzz, UNIFORM[0], UNIFORM[1];
 14: MUL TEMP[0].xyz, TEMP[0], VARYING[1].xxxx;
 15: MOV OUTPUT[0].xyz, TEMP[0];
 16: MOV OUTPUT[0].w, CONST[4].yyyy;
 17: END
</pre>

<p>
Note that some shading language constructs (such as uniform and varying
variables) aren't expressible in ARB or NV-style programs.
Therefore, the resulting output is not always legal by definition of
those program languages.
</p>
<p>
Also note that this compiler driver is still under development.
Over time, the correctness of the GPU programs, with respect to the ARB
and NV languagues, should improve.
</p>



<a name="implementation">
<h2>Compiler Implementation</h2>

<p>
The source code for Mesa's shading language compiler is in the
<code>src/mesa/shader/slang/</code> directory.
</p>

<p>
The compiler follows a fairly standard design and basically works as follows:
</p>
<ul>
<li>The input string is tokenized (see grammar.c) and parsed
(see slang_compiler_*.c) to produce an Abstract Syntax Tree (AST).
The nodes in this tree are slang_operation structures
(see slang_compile_operation.h).
The nodes are decorated with symbol table, scoping and datatype information.
<li>The AST is converted into an Intermediate representation (IR) tree
(see the slang_codegen.c file).
The IR nodes represent basic GPU instructions, like add, dot product,
move, etc. 
The IR tree is mostly a binary tree, but a few nodes have three or four
children.
In principle, the IR tree could be executed by doing an in-order traversal.
<li>The IR tree is traversed in-order to emit code (see slang_emit.c).
This is also when registers are allocated to store variables and temps.
<li>In the future, a pattern-matching code generator-generator may be
used for code generation.
Programs such as L-BURG (Bottom-Up Rewrite Generator) and Twig look for
patterns in IR trees, compute weights for subtrees and use the weights
to select the best instructions to represent the sub-tree.
<li>The emitted GPU instructions (see prog_instruction.h) are stored in a
gl_program object (see mtypes.h).
<li>When a fragment shader and vertex shader are linked (see slang_link.c)
the varying vars are matched up, uniforms are merged, and vertex
attributes are resolved (rewriting instructions as needed).
</ul>

<p>
The final vertex and fragment programs may be interpreted in software
(see prog_execute.c) or translated into a specific hardware architecture
(see drivers/dri/i915/i915_fragprog.c for example).
</p>

<h3>Code Generation Options</h3>

<p>
Internally, there are several options that control the compiler's code
generation and instruction selection.
These options are seen in the gl_shader_state struct and may be set
by the device driver to indicate its preferences:

<pre>
struct gl_shader_state
{
   ...
   /** Driver-selectable options: */
   GLboolean EmitHighLevelInstructions;
   GLboolean EmitCondCodes;
   GLboolean EmitComments;
};
</pre>

<ul>
<li>EmitHighLevelInstructions
<br>
This option controls instruction selection for loops and conditionals.
If the option is set high-level IF/ELSE/ENDIF, LOOP/ENDLOOP, CONT/BRK
instructions will be emitted.
Otherwise, those constructs will be implemented with BRA instructions.
</li>

<li>EmitCondCodes
<br>
If set, condition codes (ala GL_NV_fragment_program) will be used for
branching and looping.
Otherwise, ordinary registers will be used (the IF instruction will
examine the first operand's X component and do the if-part if non-zero).
This option is only relevant if EmitHighLevelInstructions is set.
</li>

<li>EmitComments
<br>
If set, instructions will be annoted with comments to help with debugging.
Extra NOP instructions will also be inserted.
</br>

</ul>


<a name="validation">
<h2>Compiler Validation</h2>

<p>
A <a href="http://glean.sf.net" target="_parent">Glean</a> test has
been create to exercise the GLSL compiler.
</p>
<p>
The <em>glsl1</em> test runs over 170 sub-tests to check that the language
features and built-in functions work properly.
This test should be run frequently while working on the compiler to catch
regressions.
</p>
<p>
The test coverage is reasonably broad and complete but additional tests
should be added.
</p>


</BODY>
</HTML>