| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
| |
This is taken from the ogl-math project, with Inverse renamed to adj
(since it's not actually the inverse), transposed, and our types
plugged in. There are potential CSE opportunities in this code
(particularly for hardware with RCP but not DIV), but we should be
doing CSE anyway, so don't hand-optimize.
Fixes piglit inverse tests.
Acked-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
This reverts commit 9947656168d09f9019600fccc42ca8e0de49b83a.
|
|
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
Tested-by: Eric Anholt <[email protected]>
Signed-off-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
| |
automake uses variables named *_SOURCES.
Reviewed-by: Eric Anholt <[email protected]>
Tested-by: Eric Anholt <[email protected]>
Signed-off-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
| |
With the hope that Android.mk and SConscript can share the file to reduce
future breakage.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ian Romanick <[email protected]>
Signed-off-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In i965 GEN6+ (and I suspect most other hardware), gl_ClipDistance
needs to be laid out as a pair of vec4's (the first containing clip
distances 0-3, and the second containing clip distances 4-7).
However, it is declared in GLSL as an array of 8 floats.
This lowering pass acts at the GLSL level, modifying the declaration
of gl_ClipDistance so that it is an array of vec4's rather than an
array of floats, and renaming it to gl_ClipDistanceMESA. In addition,
it modifies all accesses to the array so that they access the
appropiate component of one of the vec4's.
Since some hardware may not internally represent gl_ClipDistance as a
pair of vec4's, this lowering pass is optional. To enable it, set the
LowerClipDistance flag in gl_shader_compiler_options to true.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
builtin_stubs.cpp is only supposed to be used for builtin_compiler. It
contains a stub version of _mesa_glsl_initialize_functions() that does
nothing.
libglsl.a already contains builtin_function.cpp, the generated file that
contains a version of _mesa_glsl_initialize_functions() that actually
initializes all the built-in functions.
By mistakenly linking to builtin_stubs, glsl_compiler and glsl_test are
unable to compile any shaders that use built-in functions.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
| |
Reviewed-By: Christopher James Halse Rogers
<[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds a new build artifact, glsl_test, which can be used for
testing optimization passes in isolation.
I'm hoping that we will be able to add other useful standalone tests
to this executable in the future. Accordingly, it is built in a
modular fashion: the main() function uses its first argument to
determine which test function to invoke, removes that argument from
argv[], and then calls that function to interpret the rest of the
command line arguments and perform the test. Currently the only test
function is "optpass", which tests optimization passes.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch moves the following functions from main.cpp (the main cpp
file for the standalone executable that is used to create the built-in
functions) to standalone_scaffolding.cpp, so that they can be re-used
in other standalone executables:
- initialize_context()*
- _mesa_new_shader()
- _mesa_reference_shader()
*initialize_context contained some code that was specific to main.cpp,
so it was split into two functions: initialize_context() (which
remains in main.cpp), and initialize_context_from_defaults() (which is
in standalone_scaffolding.cpp).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The GLSL 1.20 and later specs say:
"Recursion is not allowed, not even statically. Static recursion is
present if the static function call graph of the program contains
cycles."
Recursion is detected and rejected both a compile-time and at
link-time. The complie-time check happens to detect some cases that
may be removed by various optimization passes. The spec doesn't seem
to allow this, but other vendors (e.g., NVIDIA) appear to only check
at link-time after all optimizations.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=33885
Reviewed-by: Paul Berry <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
| |
|
|
|
|
|
|
| |
Fixes https://bugs.freedesktop.org/show_bug.cgi?id=36651
NOTE: This is a candidate for the 7.10 branch.
|
|
|
|
| |
Signed-off-by: Tobias Droste <[email protected]>
|
|
|
|
|
| |
SCons has built-in support for .ll and .yy, but not .lpp and .ypp. Since
there's no real benefit to using the old names, change them.
|
|
|
|
|
|
| |
Fixes bug #34346.
Signed-off-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Hopefully should fix bug #34468.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This an adds --enable-shared-dricore option to configure. When enabled,
DRI modules will link against a shared copy of the common mesa routines
rather than statically linking these.
This saves about 30MB on disc with a full complement of classic DRI
drivers.
v2: Only enable with a gcc-compatible compiler that handles rpath
Handle DRI_CFLAGS without filter-out magic
Build shared libraries with the full mklib voodoo
Fix typos
v3: Resolve conflicts with talloc removal patches
Signed-off-by: Christopher James Halse Rogers <[email protected]>
|
|
|
|
|
|
| |
Broken since e0c1fc32832b66b52e6352ba563288ee48a1face.
Signed-off-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Also, add a 'glcpp' target so you can type 'make glcpp' instead of
'make glcpp/glcpp'.
|
|
|
|
|
|
|
|
|
|
| |
This patch cleans up many of the extra copies in GLSL IR introduced by
i965's scalarizing passes. It doesn't result in a statistically
significant performance difference on nexuiz high settings (n=3) or my
demo (n=10), due to brw_fs.cpp's register coalescing covering most of
those extra moves anyway. However, it does make the debug of wine's
GLSL shaders much more tractable, and reduces instruction count of
glsl-fs-convolution-2 from 376 to 288.
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
| |
Python is already necessary for other parts of Mesa, so there's no
reason we can't just generate it. This patch updates both make and
SCons to do so.
|
|
|
|
|
|
|
|
| |
We always want to use '.' as the decimal point.
See http://bugs.freedesktop.org/show_bug.cgi?id=24531
NOTE: this is a candidate for the 7.10 branch.
|
|
|
|
|
|
|
| |
This should allow lower_if_to_cond_assign to work in the presence of
discards, fixing bug #31690 and likely #31983.
NOTE: This is a candidate for the 7.9 branch.
|
|
|
|
| |
NOTE: This is a candidate for the 7.9 branch.
|
|
|
|
|
|
|
| |
This should save on the overhead of tree-walking and provide a
convenient place to add more instruction lowering in the future.
Signed-off-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The vector operator collects 2, 3, or 4 scalar components into a
vector. Doing this has several advantages. First, it will make
ud-chain tracking for components of vectors much easier. Second, a
later optimization pass could collect scalars into vectors to allow
generation of SWZ instructions (or similar as operands to other
instructions on R200 and i915). It also enables an easy way to
generate IR for SWZ instructions in the ARB_vertex_program assembler.
|
|
|
|
|
| |
This helps distinguish between lowering passes, optimization passes, and
other compiler code.
|
|
|
|
|
|
|
|
|
|
|
| |
First, it changes autoconf to use a "python2" binary when available,
rather than plain "python" (which is ambiguous). Secondly, it changes
the Makefiles to use $(PYTHON) $(PYTHON_FLAGS) rather than calling
python directly.
Signed-off-by: Xavier Chantry <[email protected]>
Signed-off-by: Matthew William Cox <[email protected]>
Signed-off-by: Kenneth Graunke <[email protected]>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currenly GLSL happily generates indirect addressing of any kind of
arrays.
Unfortunately DirectX 9 GPUs are not guaranteed to support any of them in
general.
This pass fixes that by lowering such constructs to a binary search on the
values, followed at the end by vectorized generation of equality masks, and
4 conditional assignments for each mask generation.
Note that this requires the ir_binop_equal change so that we can emit SEQ
to generate the boolean masks.
Unfortunately, ir_structure_splitting is too dumb to turn the resulting
constant array references to individual variables, so this will need to
be added too before this pass can actually be effective for temps.
Several patches in the glsl2-lower-variable-indexing were squashed
into this commit. These patches fix bugs in Luca's original
implementation, and the individual patches can be seen in that branch.
This was done to aid bisecting in the future.
Signed-off-by: Ian Romanick <[email protected]>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Changes in v2:
- Base class renamed to ir_control_flow_visitor
- Tried to comply with coding style
This is a new pass that supersedes ir_if_return and "lowers" jumps
to if/else structures.
Currently it causes no regressions on softpipe and nv40, but I'm not sure
whether the piglit glsl tests are thorough enough, so consider this
experimental.
It can be asked to:
1. Pull jumps out of ifs where possible
2. Remove all "continue"s, replacing them with an "execute flag"
3. Replace all "break" with a single conditional one at the end of the loop
4. Replace all "return"s with a single return at the end of the function,
for the main function and/or other functions
This gives several great benefits:
1. All functions can be inlined after this pass
2. nv40 and other pre-DX10 chips without "continue" can be supported
3. nv30 and other pre-DX10 chips with no control flow at all are better supported
Note that for full effect we should also teach the unroller to unroll
loops with a fixed maximum number of iterations but with the canonical
conditional "break" that this pass will insert if asked to.
Continues are lowered by adding a per-loop "execute flag", initialized to
TRUE, that when cleared inhibits all execution until the end of the loop.
Breaks are lowered to continues, plus setting a "break flag" that is checked
at the end of the loop, and trigger the unique "break".
Returns are lowered to breaks/continues, plus adding a "return flag" that
causes loops to break again out of their enclosing loops until all the
loops are exited: then the "execute flag" logic will ignore everything
until the end of the function.
Note that "continue" and "return" can also be implemented by adding
a dummy loop and using break.
However, this is bad for hardware with limited nesting depth, and
prevents further optimization, and thus is not currently performed.
|
| |
|
| |
|
|
|
|
| |
This is the next step on the road to loop unrolling
|
|
|
|
| |
This is the first step eventually leading to loop unrolling.
|
|
|
|
|
|
|
|
|
|
|
|
| |
As of 1.20, variable names, function names, and structure type names all
share a single namespace, and should conflict with one another in the
same scope, or hide each other in nested scopes.
However, in 1.10, variables and functions can share the same name in the
same scope. Structure types, however, conflict with/hide both.
Fixes piglit tests redeclaration-06.vert, redeclaration-11.vert,
redeclaration-19.vert, and struct-05.vert.
|
|
|
|
| |
this has make clean remove all the objects.
|
|
|
|
|
|
|
|
|
|
| |
I had used pkg-config from the Makefile because I didn't want to screw
around with the non-autoconf build, but that doesn't work because the
PKG_CONFIG_PATH or TALLOC_LIBS/TALLOC_CFLAGS that people set at
configure time needs to be respected and may not be present at build
time.
Bug #29585
|
|
|
|
|
|
|
|
|
| |
This copies over a dummy builtin_functions.cpp and rebuilds a
bootstrapped version of the compiler, then uses that to generate the
proper list of builtins. Finally, it rebuilds the compiler with the new
list.
Unfortunately, it's no longer automatic, but at least it works.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Each language version/extension and target now has a "profile" containing
all of the available builtin function prototypes. These are written in
GLSL, and come directly out of the GLSL spec (except for expanding genType).
A new builtins/ir/ folder contains the hand-written IR for each builtin,
regardless of what version includes it. Only those definitions that have
prototypes in the profile will be included.
The autogenerated IR for texture builtins is no longer written to disk,
so there's no longer any confusion as to what's hand-written or
generated.
All scripts are now in python instead of perl.
|
|
|
|
|
|
| |
With the glsl2-965 branch, the optimization of glsl-algebraic-rcp-rcp
regressed due to noop swizzles hiding information from ir_algebraic.
This cleans up those noop swizzles for us.
|
|
|
|
|
| |
I keep copy and pasting this code all over, so consolidate it in one
place.
|
|
|
|
|
| |
Also remove the --never-interactive command line option for the
preprocessor lexer. This was already done for main compiler lexer.
|