path: root/src/glsl/builtins/ir
Commit message | Author | Age | Files | Lines
* glsl: Convert ir_call to be a statement rather than a value. | Kenneth Graunke | 2012-04-02 | 2 | -45/+58

    Aside from ir_call, our IR is cleanly split into two classes:
    - Statements (typeless; used for side effects, control flow)
    - Values (deeply nestable, pure, typed expression trees)

    Unfortunately, ir_call confused all this:
    - For void functions, we placed ir_call directly in the instruction
      stream, treating it as an untyped statement. Yet, it was a subclass
      of ir_rvalue, and no other ir_rvalue could be used in this way.
    - For functions with a return value, ir_call could be placed in
      arbitrary expression trees. While this fit naturally with the source
      language, it meant that expressions might not be pure, making it
      difficult to transform and optimize them. To combat this, we always
      emitted ir_call directly in the RHS of an ir_assignment, only using
      a temporary variable in expression trees. Many passes relied on this
      assumption; the acos and atan built-ins violated it.

    This patch makes ir_call a statement (ir_instruction) rather than a
    value (ir_rvalue). Non-void calls now take a ir_dereference of a
    variable, and store the return value there---effectively a call and
    assignment rolled into one. They cannot be embedded in expressions.

    All expression trees are now pure, without exception.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
* glsl/builtins: Add missing mix(genType, genType, bvec) built-ins. | Kenneth Graunke | 2012-01-06 | 1 | -1/+1

    The IR for mix(float, float, bool) was missing a write mask, causing
    the IR reader to die horribly. Furthermore, I neglected to add any of
    the new prototypes to the 1.30 profiles.

    Fixes oglconform's glsl-bif-com advanced.mix test cases.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=44477
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
* glsl: Add isinf() and isnan() builtins. | Paul Berry | 2011-10-31 | 2 | -0/+34

    The implementations are as follows:

        isinf(x) = (abs(x) == +infinity)
        isnan(x) = (x != x)

    Note: the latter formula is not necessarily obvious. It works because
    NaN is the only floating point number that does not equal itself.

    Fixes piglit tests "isinf-and-isnan fs_basic" and
    "isinf-and-isnan vs_basic".
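The two formulas above can be tried outside the shader compiler. This is a minimal Python sketch of the same comparisons, not the IR added by the commit; the function names simply mirror the GLSL built-ins.

```python
import math


def isnan(x: float) -> bool:
    # NaN is the only floating-point value that compares unequal to itself.
    return x != x


def isinf(x: float) -> bool:
    # abs() folds -infinity onto +infinity, so one comparison covers both.
    return abs(x) == math.inf


if __name__ == "__main__":
    for value in (0.0, 1e308, math.inf, -math.inf, math.nan):
        print(value, isinf(value), isnan(value))
```

Note that isinf(NaN) is false under this definition, because abs(NaN) is still NaN and NaN compares unequal to everything.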
* glsl: Add '.ir' extension to builtin IR files | Paul Berry | 2011-10-31 | 65 | -0/+0

    This patch adds the extension '.ir' to all the files in
    src/glsl/builtins/ir/, and changes generate_builtins.py so that it no
    longer globs on '*' to find the files to build. This prevents spurious
    files (such as EMACS' infamous *~ backup files) from breaking the
    build.
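The effect of narrowing the glob can be illustrated with a small Python sketch. It does not reproduce generate_builtins.py; the file names and the select() helper are hypothetical and exist only to show why matching on '*.ir' skips editor backup files.

```python
from fnmatch import fnmatch

# Hypothetical directory listing; "abs.ir~" stands in for an editor backup.
names = ["abs.ir", "abs.ir~", "acos.ir", "README"]


def select(names, pattern):
    # Keep only the names that match the shell-style pattern.
    return sorted(n for n in names if fnmatch(n, pattern))


if __name__ == "__main__":
    print(select(names, "*"))     # every file, including the backup
    print(select(names, "*.ir"))  # only files that end in ".ir"
```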
* glsl 1.30: Fix numerical instabilities in asinh | Paul Berry | 2011-09-28 | 1 | -4/+36

    The formula we were previously using for asinh:

        asinh x = ln(x + sqrt(x * x + 1))

    is numerically unstable: when x is a large negative value, the
    quantity

        x + sqrt(x * x + 1)

    is a small positive value (on the order of 1/(2|x|)). Since the
    logarithm function is very sensitive in this range, any error in the
    computation of the square root manifests as a large error in the
    result.

    This patch changes to the equivalent formula:

        asinh x = sign(x) * ln(abs(x) + sqrt(x * x + 1))

    which is only slightly more expensive to compute, and is numerically
    stable for all x.

    Fixes piglit tests
    spec/glsl-1.30/execution/built-in-functions/[fv]s-asinh-*.

    Reviewed-by: Chad Versace <[email protected]>
    Acked-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
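The cancellation described above is easy to observe numerically. The sketch below compares both formulas against Python's math.asinh, under the assumption that 64-bit floats are an acceptable stand-in; shaders work in 32-bit floats, where the loss would show up at much smaller |x|. The function names are illustrative only.

```python
import math


def asinh_naive(x):
    # Old formula: for large negative x the sum cancels almost completely.
    return math.log(x + math.sqrt(x * x + 1.0))


def asinh_stable(x):
    # New formula: evaluate on |x|, where no cancellation occurs,
    # then restore the sign of the input.
    return math.copysign(1.0, x) * math.log(abs(x) + math.sqrt(x * x + 1.0))


if __name__ == "__main__":
    x = -1.0e5
    reference = math.asinh(x)
    print("naive error: ", abs(asinh_naive(x) - reference))
    print("stable error:", abs(asinh_stable(x) - reference))
```

For much larger negative inputs the naive sum rounds to exactly zero and the logarithm is undefined, which is the extreme form of the same problem.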
* glsl/builtins: Fix invalid float constant in noise4 built-in. | Kenneth Graunke | 2011-09-07 | 1 | -2/+2

    Throwing away the extra numbers ought to match the existing behavior.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
* glsl/builtins: Fix invalid vecN constants in hyperbolic functions. | Kenneth Graunke | 2011-09-07 | 5 | -21/+21

    Each of these vecN constants only provided one component, which is
    illegal. The printed IR is meant to contain exactly as many components
    as are necessary; the IR reader does not splat single values.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Paul Berry <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
* glsl: improve the accuracy of the atan(x,y) builtin function. | Paul Berry | 2011-08-01 | 1 | -1/+3

    The previous formula for atan(x,y) returned a value of +/- pi whenever
    |x|<0.0001, and used a formula based on atan(y/x) otherwise. This
    broke in cases where both x and y were small (e.g. atan(1e-5, 1e-5)).

    This patch modifies the formula so that it returns a value of +/- pi
    whenever |x|<1e-8*|y|, and uses the formula based on atan(y/x)
    otherwise.
* glsl: improve the accuracy of the asin() builtin function. | Paul Berry | 2011-08-01 | 1 | -28/+40

    The previous formula for asin(x) was algebraically equivalent to:

        sign(x)*(pi/2 - sqrt(1-|x|)*(A + B|x| + C|x|^2))

    where A, B, and C were arbitrary constants determined by a curve fit.
    This formula had a worst case absolute error of 0.00448, an unbounded
    worst case relative error, and a discontinuity near x=0.

    Changed the formula to:

        sign(x)*(pi/2 - sqrt(1-|x|)*(pi/2 + (pi/4-1)|x| + A|x|^2 + B|x|^3))

    where A and B are arbitrary constants determined by a curve fit. This
    has a worst case absolute error of 0.00039, a worst case relative
    error of 0.000405, and no discontinuities.

    I don't expect a significant performance degradation, since the extra
    multiply-accumulate should be fast compared to the sqrt() computation.

    Fixes piglit tests {vs,fs}-asin-float and {vs,fs}-atan-*
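The shape of the new formula can be checked in isolation. The Python sketch below takes A and B as parameters (a and b), because the fitted values are not given in the commit message; with any choice of them, the fixed leading coefficients pi/2 and (pi/4-1) already pin the approximation to asin(0) = 0, asin(1) = pi/2, and a slope of 1 at the origin, which is what removes the discontinuity near x=0.

```python
import math


def sign(x):
    # GLSL-style sign(): -1, 0 or +1.
    return (x > 0) - (x < 0)


def asin_approx(x, a, b):
    # a and b stand for the curve-fit constants A and B; their real values
    # are not stated in the commit message, so the caller must supply them.
    ax = abs(x)
    poly = math.pi / 2 + (math.pi / 4 - 1) * ax + a * ax**2 + b * ax**3
    return sign(x) * (math.pi / 2 - math.sqrt(1 - ax) * poly)


if __name__ == "__main__":
    # With a = b = 0 only the endpoint and slope constraints are active.
    for x in (0.0, 1e-6, 0.5, 1.0):
        print(x, asin_approx(x, 0.0, 0.0), math.asin(x))
```

The values printed with a = b = 0 are not expected to match the accuracy figures quoted above; those depend on the fitted constants.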
* glsl: improve the accuracy of the radians() builtin function | Paul Berry | 2011-07-28 | 1 | -4/+4

    The constant used in the radians() function didn't have enough
    precision, causing a relative error of 1.676e-5, which is far worse
    than the precision of 32-bit floats. This patch reduces the relative
    error to 1.14e-9, which is the best we can do in 32 bits.

    Fixes piglit tests {fs,vs}-radians-{float,vec2,vec3,vec4}.

    Reviewed-by: Kenneth Graunke <[email protected]>
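A short Python sketch shows how a truncated conversion constant produces an error of the size quoted above. The value 0.017453 is an assumption chosen because it reproduces the 1.676e-5 figure; the commit message does not state the old constant. The sketch also rounds pi/180 to a 32-bit float to show that the rounding error stays within half a unit in the last place.

```python
import math
import struct


def to_float32(x):
    # Round a 64-bit Python float to the nearest 32-bit float.
    return struct.unpack("f", struct.pack("f", x))[0]


EXACT = math.pi / 180.0


def rel_err(constant):
    # Relative error of a degrees-to-radians constant against pi/180.
    return abs(constant - EXACT) / EXACT


if __name__ == "__main__":
    truncated = 0.017453  # assumed old constant, not taken from the commit
    print("truncated constant:", rel_err(truncated))
    print("float32 of pi/180: ", rel_err(to_float32(EXACT)))
```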
* glsl/builtins: Actually implement int/ivec variants of abs(). | Kenneth Graunke | 2011-06-14 | 1 | -0/+20

    Signed-off-by: Kenneth Graunke <[email protected]>

    NOTE: This is a candidate for stable release branches (and don't
    forget to re-run "make builtins" after cherry-picking.)
* glsl/builtins: Remove unnecessary (constant bool (1)) from assignments. | Kenneth Graunke | 2011-01-12 | 12 | -269/+265

    This isn't strictly necessary, but is definitely nicer.
* glsl/builtins: Compute the correct value for smoothstep(vec, vec, vec). | Kenneth Graunke | 2010-12-17 | 1 | -87/+34

    These mistakenly computed 't' instead of t * t * (3.0 - 2.0 * t).

    Also, properly vectorize the smoothstep(float, float, vec) variants.

    NOTE: This is a candidate for the 7.9 and 7.10 branches.
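For reference, the difference between 't' and the full Hermite expression can be sketched in Python. This follows the GLSL definition of smoothstep (clamp, then t * t * (3.0 - 2.0 * t)); smoothstep_vec is a hypothetical helper that stands in for the smoothstep(float, float, vec) variants by applying scalar edges to every component.

```python
def clamp(x, lo, hi):
    return max(lo, min(hi, x))


def smoothstep(edge0, edge1, x):
    # t alone is only the clamped linear ramp; the cubic gives the smooth curve.
    t = clamp((x - edge0) / (edge1 - edge0), 0.0, 1.0)
    return t * t * (3.0 - 2.0 * t)


def smoothstep_vec(edge0, edge1, xs):
    # Scalar edges applied component-wise to a vector argument.
    return [smoothstep(edge0, edge1, x) for x in xs]


if __name__ == "__main__":
    print(smoothstep(0.0, 1.0, 0.25))
    print(smoothstep_vec(0.0, 2.0, [-1.0, 1.0, 3.0]))
```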
* glsl: Reimplement the "cross" built-in without ir_binop_cross. | Kenneth Graunke | 2010-11-17 | 1 | -3/+5

    We are not aware of any GPU that actually implements the cross product
    as a single instruction. Hence, there's no need for it to be an
    opcode. Future commits will remove it entirely.
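A cross product needs only multiplies and a subtract, which is why a dedicated opcode is unnecessary. The Python sketch below uses the common swizzle decomposition a.yzx * b.zxy - a.zxy * b.yzx; this is an assumption about how such a lowering usually looks, not a transcription of the IR in the commit.

```python
def cross(a, b):
    ax, ay, az = a
    bx, by, bz = b
    # Component-wise: a.yzx * b.zxy - a.zxy * b.yzx
    return (
        ay * bz - az * by,
        az * bx - ax * bz,
        ax * by - ay * bx,
    )


if __name__ == "__main__":
    print(cross((1, 0, 0), (0, 1, 0)))
    print(cross((1, 2, 3), (4, 5, 6)))
```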
* glsl: Implement the asinh, acosh, and atanh built-in functions. | Kenneth Graunke | 2010-11-15 | 3 | -0/+79
* glsl/builtins: Clean up some ugly autogenerated code in atan. | Kenneth Graunke | 2010-11-03 | 1 | -20/+5

    In particular, calling the abs function is silly, since there's
    already an expression opcode for that. Also, assigning to temporaries
    then assigning those to the final location is rather redundant.
* glsl/builtins: Rename 'x' to 'y_over_x' in atan(float) implementation. | Kenneth Graunke | 2010-11-03 | 1 | -4/+4

    For consistency with the vec2/vec3/vec4 variants.
* glsl: Add support for GLSL 1.30's modf built-in. | Kenneth Graunke | 2010-10-21 | 1 | -0/+41
* glsl: Add support for the 1.30 round() built-in. | Kenneth Graunke | 2010-10-14 | 1 | -0/+21

    This implements round() via the ir_unop_round_even opcode, rather than
    adding a new opcode. We may wish to add one in the future, since it
    might enable a small performance increase on some hardware, but for
    now, this should suffice.
* glsl: Add front-end support for GLSL 1.30's roundEven built-in. | Kenneth Graunke | 2010-10-14 | 1 | -0/+21

    Implemented using the op-code introduced in the previous commit.
* glsl: Add front-end support for the "trunc" built-in. | Kenneth Graunke | 2010-10-14 | 1 | -0/+21
* glsl: Rework assignments with write_masks to have LHS chan count match RHS. | Eric Anholt | 2010-09-22 | 3 | -37/+37

    It turns out that most people new to this IR are surprised when an
    assignment to (say) 3 components on the LHS takes 4 components on the
    RHS. It also makes for quite strange IR output:

        (assign (constant bool (1)) (x) (var_ref color) (swiz x (var_ref v) ))
        (assign (constant bool (1)) (y) (var_ref color) (swiz yy (var_ref v) ))
        (assign (constant bool (1)) (z) (var_ref color) (swiz zzz (var_ref v) ))

    But even worse, even we get it wrong, as shown by this line of our
    current step(float, vec4):

        (assign (constant bool (1)) (w) (var_ref t) (expression float b2f (expression bool >= (swiz w (var_ref x))(var_ref edge))))

    where we try to assign a float to the writemasked-out x channel and
    don't supply anything for the actual w channel we're writing. Drivers
    right now just get lucky since ir_to_mesa spams the float value across
    all the source channels of a vec4.

    Instead, the RHS will now have a number of components equal to the
    number of components actually being written. Hopefully this confuses
    everyone less, and it also makes codegen for a scalar target simpler.

    Reviewed-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
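The new rule (one RHS component per written channel) can be modelled in a few lines of Python. This is a toy model for illustration only; assign() and CHANNELS are hypothetical names and do not correspond to anything in the Mesa sources.

```python
CHANNELS = "xyzw"


def assign(dst, write_mask, rhs):
    # New rule: the RHS supplies exactly one value per channel being written.
    if len(rhs) != len(write_mask):
        raise ValueError("RHS component count must match the write mask")
    out = list(dst)
    for value, channel in zip(rhs, write_mask):
        out[CHANNELS.index(channel)] = value
    return out


if __name__ == "__main__":
    t = [0.0, 0.0, 0.0, 0.0]
    # Writing only .w now takes a single scalar on the right-hand side.
    print(assign(t, "w", [1.0]))
    print(assign(t, "xz", [2.0, 3.0]))
```

Under the old convention, the step(float, vec4) line quoted above would have needed a four-component RHS whose w slot carried the value; with the new one, a scalar RHS for a single-channel mask is exactly right.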
* glsl/builtins: Switch comparison functions to just return an expression. | Kenneth Graunke | 2010-09-18 | 4 | -180/+36
* glsl/builtins: Fix equal and notEqual builtins. | Kenneth Graunke | 2010-09-18 | 2 | -24/+24

    Commit 309cd4115b7cba669a0bf858e7809cb6dae90ddf incorrectly converted
    these to all_equal and any_nequal, which is the wrong operation.
* glsl2: Port equal() and notEqual() to ir_unop_all_equal and ir_unop_any_nequal | Ian Romanick | 2010-09-13 | 2 | -120/+24
* glsl2: Implement noise[1234] built-in functions using ir_unop_noise | Ian Romanick | 2010-09-09 | 4 | -52/+229
* glsl/builtins: normalize of a negative scalar should be -1.0. | Kenneth Graunke | 2010-09-09 | 1 | -1/+1
* glsl: Fix for scalar float built-in definitions. | Kenneth Graunke | 2010-09-08 | 2 | -2/+2

    These need abs, and we need more tests.
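Read together with the neighbouring scalar commits in this log (normalize of a negative scalar is -1.0; the simplified scalar cases of length and distance), the corrected scalar definitions can be sketched in Python as follows. Which two built-ins this commit touched is not stated; treating them as length and distance is an assumption, and the behaviour of normalize at zero is not modelled.

```python
import math


def length(x):
    # Scalar length is |x|, not x.
    return abs(x)


def distance(x, y):
    # Scalar distance is |x - y|, not x - y.
    return abs(x - y)


def normalize(x):
    # x / |x| for a nonzero scalar: +1.0 or -1.0 depending on the sign.
    return math.copysign(1.0, x)


if __name__ == "__main__":
    print(length(-2.0), distance(1.0, 3.0), normalize(-5.0))
```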
* glsl: Fix typo in builtin step() using a wrong channel. | Eric Anholt | 2010-09-08 | 1 | -1/+1
* glsl/builtins: Don't use ir_binop_dot on floating point values. | Kenneth Graunke | 2010-09-08 | 4 | -6/+6

    ir_binop_dot is only defined for vector types. Use ir_binop_mul.
* glsl/builtins: Simplify degenerate scalar float cases. | Kenneth Graunke | 2010-09-08 | 3 | -5/+3

    The code being generated was just stupid, considering that:
    - normalize(x) = 1.0
    - length(x) = x
    - distance(x, y) = x - y
* glsl/builtins: Convert assignments to new format (with write mask). | Kenneth Graunke | 2010-09-04 | 15 | -398/+389
* glsl: Add forgotten implementations of equal/notEqual on bvecs. | Kenneth Graunke | 2010-09-01 | 2 | -0/+60
* glsl2: fix bug in atan(y, x) function | Brian Paul | 2010-08-31 | 1 | -7/+3

    When x==0, the result was wrong.

    Fixes piglit glsl-fs-atan-1.shader_test
* mesa: Add new ir_unop_any() expression operation. | Eric Anholt | 2010-08-23 | 1 | -3/+3

    The previous any() implementation would generate arg0.x || arg0.y ||
    arg0.z. Having an expression operation for this makes it easy for the
    backend to generate something easier (DPn + SNE for 915 FS, .any
    predication on 965 VS)
* glsl2: Rework builtin function generation. | Kenneth Graunke | 2010-08-13 | 58 | -0/+2884

    Each language version/extension and target now has a "profile"
    containing all of the available builtin function prototypes. These are
    written in GLSL, and come directly out of the GLSL spec (except for
    expanding genType).

    A new builtins/ir/ folder contains the hand-written IR for each
    builtin, regardless of what version includes it. Only those
    definitions that have prototypes in the profile will be included.

    The autogenerated IR for texture builtins is no longer written to
    disk, so there's no longer any confusion as to what's hand-written or
    generated.

    All scripts are now in python instead of perl.