summaryrefslogtreecommitdiffstats
path: root/src/glsl/builtin_functions.cpp
Commit message (Collapse)AuthorAgeFilesLines
* glsl: improve accuracy of atan()Erik Faye-Lund2014-10-101-10/+55
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Our current atan()-approximation is pretty inaccurate at 1.0, so let's try to improve the situation by doing a direct approximation without going through atan. This new implementation uses an 11th degree polynomial to approximate atan in the [-1..1] range, and the following identitiy to reduce the entire range to [-1..1]: atan(x) = 0.5 * pi * sign(x) - atan(1.0 / x) This range-reduction idea is taken from the paper "Fast computation of Arctangent Functions for Embedded Applications: A Comparative Analysis" (Ukil et al. 2011). The polynomial that approximates atan(x) is: x * 0.9999793128310355 - x^3 * 0.3326756418091246 + x^5 * 0.1938924977115610 - x^7 * 0.1173503194786851 + x^9 * 0.0536813784310406 - x^11 * 0.0121323213173444 This polynomial was found with the following GNU Octave script: x = linspace(0, 1); y = atan(x); n = [1, 3, 5, 7, 9, 11]; format long; polyfitc(x, y, n) The polyfitc function is not built-in, but too long to include here. It can be downloaded from the following URL: http://www.mathworks.com/matlabcentral/fileexchange/47851-constraint-polynomial-fit/content/polyfitc.m This fixes the following piglit test: shaders/glsl-const-folding-01 Signed-off-by: Erik Faye-Lund <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* glsl: Fix memory leak in builtin_builder::_image_prototype.Iago Toral Quiroga2014-10-021-3/+5
| | | | | | in_var calls the ir_variable constructor, which dups the variable name. Reviewed-by: Ilia Mirkin <[email protected]>
* glsl: Allow texture2DProjLod and textureCubeLod in GL ESKalyan Kondapally2014-09-291-3/+3
| | | | | | | | | | According to GLES (i.e. 1.0 and above) spec textureCubeLod and texture2DProjLod are built in functions. We seem to disable support for these functions with GLES. This patch enables the support. Signed-off-by: Kalyan Kondapally <[email protected]> Reviewed-by: Matt Turner <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84355
* glsl: Use bit-flags image attributes and uint16_t for the image formatIan Romanick2014-08-291-5/+5
| | | | | | | | | | | | | | | | | | | | | | All of the GL image enums fit in 16-bits. Also move the fields from the anonymous "image" structucture to the next higher structure. This will enable packing the bits with the other bitfield. Valgrind massif results for a trimmed apitrace of dota2: n time(i) total(B) useful-heap(B) extra-heap(B) stacks(B) Before (32-bit): 76 40,572,916,873 68,831,248 63,328,783 5,502,465 0 After (32-bit): 70 40,577,421,777 68,487,584 62,973,695 5,513,889 0 Before (64-bit): 60 36,822,640,058 96,526,824 88,735,296 7,791,528 0 After (64-bit): 74 37,124,603,758 95,891,808 88,466,712 7,425,096 0 A real savings of 346KiB on 32-bit and 262KiB on 64-bit. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: add ARB_derivative control supportIlia Mirkin2014-08-141-0/+48
| | | | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Chris Forbes <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* glsl: Make it possible to ignore built-ins when matching signatures.Kenneth Graunke2014-08-041-1/+2
| | | | | | | | | | | | | | | | | | Historically, we've implemented the rules for overriding built-in functions by creating multiple ir_functions and relying on the symbol table to hide the one containing built-in functions. That works, but has a few drawbacks, so the next patch will change it. Instead, we'll have a single ir_function for a particular name, which will contain both built-in and user-defined signatures. Passing an extra parameter to matching_signature makes it easy to ignore built-ins when they're supposed to be hidden. I didn't add the parameter to exact_matching_signature since it wasn't necessary. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* glsl: add new interpolateAt* builtin functionsChris Forbes2014-07-121-0/+67
| | | | | | | | | | | | V2: - Don't assume everyone wants interpolateAtSample() lowered to interpolateAtOffset. It turns out this isn't what we want most of the time for i965. Lowering can be added later in an ir pass which drivers opt into, rather than bolting it straight into the builtin definition. - Only expose the interpolateAt* builtins in the fragment language. Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* glsl: Use typed foreach_in_list instead of foreach_list.Matt Turner2014-07-011-2/+1
| | | | Reviewed-by: Ian Romanick <[email protected]>
* glsl: Add support for EmitStreamVertex() and EndStreamPrimitive().Iago Toral Quiroga2014-06-301-0/+58
| | | | Reviewed-by: Chris Forbes <[email protected]>
* glsl: Modify ir_end_primitive to have a stream.Iago Toral Quiroga2014-06-301-1/+2
| | | | | | | This will be necessary to implement EndStreamPrimitive(). EndPrimitive() will produce an ir_end_primitive with the default stream 0. Reviewed-by: Chris Forbes <[email protected]>
* glsl: Modify ir_emit_vertex to have a stream.Iago Toral Quiroga2014-06-301-1/+2
| | | | | | | This will be necessary to implement EmitStreamVertex(). EmitVertex() will produce an ir_emit_vertex with the default stream 0. Reviewed-by: Chris Forbes <[email protected]>
* glsl: simplify the M_PI*f macros, fixes build on OpenBSDJonathan Gray2014-05-131-5/+3
| | | | | | | | | | | | | | | | | | The M_PI*f macros used a preprocessor paste to append 'f' to M_PI defines, which works if the values are only numbers but breaks on OpenBSD where M_PI definitions have casts and brackets to meet requirements of a future version of POSIX, http://austingroupbugs.net/view.php?id=801 http://austingroupbugs.net/view.php?id=828 Simplify the M_PI*f macros by using casts directly in the defines as suggested by Kenneth Graunke. Cc: "10.2" <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78665 Reviewed-by: Matt Turner <[email protected]> Signed-off-by: Jonathan Gray <[email protected]>
* glsl: Use M_PI_* macros.Matt Turner2014-04-151-7/+13
| | | | | Notice our multiple values for M_PI_2, which rounded ...32 up to ...4 and ...5.
* glsl: Clean up "unused parameter" warningsIan Romanick2014-03-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ../../src/glsl/builtin_functions.cpp:72:1: warning: unused parameter 'state' [-Wunused-parameter] ../../src/glsl/ir_clone.cpp:31:1: warning: unused parameter 'ht' [-Wunused-parameter] ../../src/glsl/ir_equals.cpp:44:1: warning: unused parameter 'ir' [-Wunused-parameter] ../../src/glsl/ir_equals.cpp:50:1: warning: unused parameter 'ignore' [-Wunused-parameter] ../../src/glsl/ir_equals.cpp:68:1: warning: unused parameter 'ignore' [-Wunused-parameter] ../../src/glsl/ir_print_visitor.cpp:149:6: warning: unused parameter 'ir' [-Wunused-parameter] ../../src/glsl/ir_print_visitor.cpp:556:1: warning: unused parameter 'ir' [-Wunused-parameter] ../../src/glsl/ir_print_visitor.cpp:562:1: warning: unused parameter 'ir' [-Wunused-parameter] ../../src/glsl/link_uniforms.cpp:213:1: warning: unused parameter 'record_type' [-Wunused-parameter] ../../src/glsl/loop_analysis.cpp:225:1: warning: unused parameter 'ir' [-Wunused-parameter] ../../src/glsl/loop_unroll.cpp:73:30: warning: unused parameter 'ir' [-Wunused-parameter] ../../src/glsl/loop_unroll.cpp:79:30: warning: unused parameter 'ir' [-Wunused-parameter] ../../src/glsl/loop_unroll.cpp:85:30: warning: unused parameter 'ir' [-Wunused-parameter] ../../src/glsl/opt_copy_propagation_elements.cpp:189:1: warning: unused parameter 'ir' [-Wunused-parameter] ../../src/glsl/opt_cse.cpp:402:1: warning: unused parameter 'ir' [-Wunused-parameter] ../../src/glsl/opt_dead_code_local.cpp:117:30: warning: unused parameter 'ir' [-Wunused-parameter] ../../src/glsl/opt_redundant_jumps.cpp:53:1: warning: unused parameter 'ir' [-Wunused-parameter] ../../src/glsl/opt_vectorize.cpp:301:1: warning: unused parameter 'ir' [-Wunused-parameter] Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Allow dot() on scalars, and throw out dotlike().Matt Turner2014-03-181-4/+4
| | | | | | | | | | In all uses of dotlike() we're writing generic code that operates on 1-4 component vectors. That our IR requires ir_binop_dot expressions' operands to be 2+ component vectors is an implementation detail that's not important when implementing built-in functions with dot(), which is defined for scalar floats in GLSL. Reviewed-by: Eric Anholt <[email protected]>
* glsl: Match whitespace changes from previous patch.Matt Turner2014-03-181-4/+4
| | | | | Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Expose pack/unpack built-ins for ARB_gpu_shader5.Matt Turner2014-03-181-9/+17
| | | | | | | | ARB_gpu_shader5 and ES 3.0 expose different subsets of ARB_shading_language_packing. Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: switch to c11 mutex functionsBrian Paul2014-03-031-7/+7
| | | | Reviewed-by: José Fonseca <[email protected]>
* glsl: rename _restrict to restrict_flagBrian Paul2014-02-121-1/+1
| | | | | | | | | | | To fix MSVC compile breakage. Evidently, _restrict is an MSVC keyword, though the docs only mention __restrict (with two underscores). Note: we may want to also rename _volatile to volatile_flag to be consistent. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74900 Reviewed-by: Ian Romanick <[email protected]>
* glsl: Add image built-in function generator.Francisco Jerez2014-02-121-0/+251
| | | | | | | | | | | | | | | | | | Because of the combinatorial explosion of different image built-ins with different image dimensionalities and base data types, enumerating all the 242 possibilities would be annoying and a waste of .text space. Instead use a special path in the built-in builder that loops over all the known image types. v2: Generate built-ins on GLSL version 4.20 too. Rename '_has_float_data_type' to '_supports_float_data_type'. Avoid duplicating enumeration of image built-ins in create_intrinsics() and create_builtins(). v3: Use a more orthodox approach for passing image built-in generator parameters. v4: Cosmetic changes. Acked-by: Paul Berry <[email protected]>
* glsl: Add helper methods to glsl_type for dealing with images.Francisco Jerez2014-02-121-1/+1
| | | | | | | | | | | | Add predicates to query if a GLSL type is or contains an image. Rename sampler_coordinate_components() to coordinate_components(). v2: Use assert instead of unreachable. v3: No need to use a separate code-path for images in coordinate_components() after merging image and sampler fields in the glsl_type structure. Reviewed-by: Paul Berry <[email protected]>
* glsl: Add locking to builtin_builder singletonDaniel Kurtz2014-02-111-1/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Consider a multithreaded program with two contexts A and B, and the following scenario: 1. Context A calls initialize(), which allocates mem_ctx and starts building built-ins. 2. Context B calls initialize(), which sees mem_ctx != NULL and assumes everything is already set up. It returns. 3. Context B calls find(), which fails to find the built-in since it hasn't been created yet. 4. Context A finally finishes initializing the built-ins. This will break at step 3. Adding a lock ensures that subsequent callers of initialize() will wait until initialization is actually complete. Similarly, if any thread calls release while another thread is still initializing, or calling find(), the mem_ctx/shader would get free'd while from under it, leading to corruption or use-after-free crashes. Fixes sporadic failures in Piglit's glx-multithread-shader-compile. Bugzilla: https://bugs.freedesktop.org/69200 Signed-off-by: Daniel Kurtz <[email protected]> Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: "10.1 10.0" <[email protected]>
* glsl: Simplify built-in generator functions for min3/max3/mid3.Kenneth Graunke2014-01-241-77/+60
| | | | | | | | | The type of all three parameters are identical, so we don't need to specify it three times. The predicate is always identical too, so we don't need to make it a parameter, either. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* glsl: Statically cast parameter exec_node to ir_variable.Kenneth Graunke2014-01-131-1/+1
| | | | | | | | | Formal function parameters are always ir_variable objects, not an arbitrary ir_instruction. So there's no need to dynamically cast here. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* glsl: Convert piles of foreach_iter to the newer foreach_list macro.Kenneth Graunke2014-01-131-2/+2
| | | | | | | | | | | | | foreach_iter and exec_list_iterators have been deprecated for some time now; we just hadn't ever bothered to convert code to the newer foreach_list and foreach_list_safe macros. In these cases, we aren't editing the list, so we can use foreach_list rather than foreach_list_safe. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* mesa: Namespace qualify fma to override ambiguity with fma from math.hThomas Sondergaard2014-01-081-1/+1
| | | | | | | MSVC 2013 version of math.h includes an fma() function. Cc: "10.0" <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* mesa: Clean up nomenclature for pipeline stages.Paul Berry2014-01-081-11/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, we had an enum called gl_shader_type which represented pipeline stages in the order they occur in the pipeline (i.e. MESA_SHADER_VERTEX=0, MESA_SHADER_GEOMETRY=1, etc), and several inconsistently named functions for converting between it and other representations: - _mesa_shader_type_to_string: gl_shader_type -> string - _mesa_shader_type_to_index: GLenum (GL_*_SHADER) -> gl_shader_type - _mesa_program_target_to_index: GLenum (GL_*_PROGRAM) -> gl_shader_type - _mesa_shader_enum_to_string: GLenum (GL_*_{SHADER,PROGRAM}) -> string This patch tries to clean things up so that we use more consistent terminology: the enum is now called gl_shader_stage (to emphasize that it is in the order of pipeline stages), and the conversion functions are: - _mesa_shader_stage_to_string: gl_shader_stage -> string - _mesa_shader_enum_to_shader_stage: GLenum (GL_*_SHADER) -> gl_shader_stage - _mesa_program_enum_to_shader_stage: GLenum (GL_*_PROGRAM) -> gl_shader_stage - _mesa_progshader_enum_to_string: GLenum (GL_*_{SHADER,PROGRAM}) -> string In addition, MESA_SHADER_TYPES has been renamed to MESA_SHADER_STAGES, for consistency with the new name for the enum. Reviewed-by: Kenneth Graunke <[email protected]> v2: Also rename the "target" field of _mesa_glsl_parse_state and the "target" parameter of _mesa_shader_stage_to_string to "stage". Reviewed-by: Brian Paul <[email protected]>
* glsl: rename min(), max() functions to fix MSVC buildBrian Paul2014-01-061-3/+3
| | | | | | | | Evidently, there's some other definition of "min" and "max" that causes MSVC to choke on these function names. Renaming to min2() and max2() fixes things. Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: implement mid3 built-in functionMaxence Le Doré2014-01-061-0/+38
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: implement max3 built-in functionMaxence Le Doré2014-01-061-0/+38
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Implement min3 built-in functionMaxence Le Doré2014-01-061-0/+38
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: add a shader_trinary_minmax predicateMaxence Le Doré2014-01-061-0/+6
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Replace _mesa_glsl_parser_targets enum with gl_shader_type.Paul Berry2013-12-171-11/+11
| | | | | | These enums were redundant. Reviewed-by: Brian Paul <[email protected]>
* glsl: Simplify the built-in function linking code.Kenneth Graunke2013-12-011-2/+1
| | | | | | | | | | | | | | | | | | Previously, we stored an array of up to 16 additional shaders to link, as well as a count of how many each shader actually needed. Since the built-in functions rewrite, all the built-ins are stored in a single shader. So all we need is a boolean indicating whether a shader needs to link against built-ins or not. During linking, we can avoid creating the temporary array if none of the shaders being linked need built-ins. Otherwise, it's simply a copy of the array that has one additional element. This is much simpler. This patch saves approximately 128 bytes of memory per gl_shader object. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* glsl: Create an accessor for the built-in function shader.Kenneth Graunke2013-12-011-2/+10
| | | | | Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* glsl: Enable dFdx, dFdy, and fwidth by default in GLSL ES 3.00.Kenneth Graunke2013-11-071-1/+2
| | | | | | | | | | | | | | | Previously, we only exposed them in desktop GL or with: #extension GL_OES_standard_derivatives : enable GLSL ES 3.00 includes these without an extension, so we need to expose them by default. Note that the above #extension line results in an error or desktop GL, so we don't need to worry about this. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* glsl: Add built-in functions and constants required for ↵Francisco Jerez2013-10-291-0/+58
| | | | | | | | ARB_shader_atomic_counters. v2: Represent atomics as GLSL intrinsics. Reviewed-by: Ian Romanick <[email protected]>
* glsl: Basic support for built-in intrinsics.Francisco Jerez2013-10-291-3/+46
| | | | | | | | | | | | | | | | | Fix the linker to deal with intrinsic functions which are undefined all the way down to the driver back-end, and introduce intrinsic definition helpers in the built-in generator. We still need to figure out what kind of interface we want for drivers to communicate to the GLSL front-end which of the supported intrinsics should use a default GLSL implementation and which should use a hardware-specific override. As there's no default GLSL implementation for atomic ops, this seems like something we can worry about later on. Reviewed-by: Ian Romanick <[email protected]> v2: Define local helper function to generate ir_call nodes in the builtin generator.
* glsl: add signatures for textureGatherOffsets()Chris Forbes2013-10-261-0/+30
| | | | | Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: add support for texture functions with offset arraysChris Forbes2013-10-261-0/+9
| | | | | | | This is needed for textureGatherOffsets() Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Add new textureGather[Offset]() overloads for shadow samplersChris Forbes2013-10-261-0/+10
| | | | | | Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Add support for separate reference Z for shadow samplersChris Forbes2013-10-261-5/+15
| | | | | | | | | | ARB_gpu_shader5's textureGather*() functions which take shadow samplers have a separate `refz` parameter rather than adding it to the coordinate. Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: relax const offset requirement for textureGatherOffsetChris Forbes2013-10-261-20/+41
| | | | | | | | | | Prior to ARB_gpu_shader5 / GLSL 4.0, the offset is required to be a constant expression. With that extension, it is relaxed to be an arbitrary expression. Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* glsl: Add ARB_gpu_shader5 textureGatherOffset signaturesChris Forbes2013-10-261-0/+16
| | | | | | | | - gsampler2DRect - optional `comp` parameter Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* glsl: Implement [iu]mulExtended() built-ins for ARB_gpu_shader5.Matt Turner2013-10-071-0/+31
| | | | | | | | | | These built-ins have two "out" parameters, which makes implementing them efficiently with our current compiler infrastructure difficult. Instead, implement them in terms of the existing ir_binop_mul IR (to return the low 32-bits) and a new ir_binop_mul64 which returns the high 32-bits. v2: Rename mul64 -> imul_high as suggested by Ken. Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Implement usubBorrow() built-in for ARB_gpu_shader5.Matt Turner2013-10-071-0/+21
| | | | | | | | | | | | | | i965 implements this with a single (multiple destination) instruction, SUBB. Emitting SUBB directly from usubBorrow() would be ideal, but our optimization passes don't know how to copy with expressions with side-effects. Radeon has an SUBB_UINT instruction that only generates the borrow bit. I've chosen to go this route and implement usubBorrow() by doing the subtraction and the borrow operations separately. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* glsl: Implement uaddCarry() built-in for ARB_gpu_shader5.Matt Turner2013-10-071-0/+21
| | | | | | | | | | | | | | i965 implements this with a single (multiple destination) instruction, ADDC. Emitting ADDC directly from uaddCarry() would be ideal, but our optimization passes don't know how to copy with expressions with side-effects. Radeon has an ADDC_UINT instruction that only generates the carry bit. I've chosen to go this route and implement uaddCarry() by doing the addition and the carry operations separately. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* glsl: add ARB_gpu_shader5's additional textureGather signaturesChris Forbes2013-10-061-1/+26
| | | | | | | | | | | - gsampler2DRect support - optional `comp` parameter Future patches will add shadow sampler support and textureGatherOffsets(). Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Add support for specifying the component in textureGatherChris Forbes2013-10-061-0/+13
| | | | | | | | | | | ARB_gpu_shader5 introduces new variants of textureGather* which have an explicit component selector, rather than relying purely on the sampler's swizzle state. This patch adds the GLSL plumbing for the extra parameter. Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: add plumbing for GL_ARB_texture_query_levelsChris Forbes2013-10-051-0/+56
| | | | | Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Matt Turner <[email protected]>