path: root/src/compiler/glsl/builtin_functions.cpp
Commit message | Author | Age | Files | Lines
* mesa: Add GL/GLSL plumbing for INTEL_fragment_shader_ordering | Kevin Rogovin | 2018-08-28 | 1 | -0/+17
    This extension provides a new GLSL built-in function, beginFragmentShaderOrderingINTEL(), that guarantees (taking the wording of the GL_INTEL_fragment_shader_ordering extension) that any memory transactions issued by shader invocations from previous primitives mapped to the same xy window coordinates (and the same sample when per-sample shading is active) complete and are visible to the shader invocation that called beginFragmentShaderOrderingINTEL().

    One advantage of INTEL_fragment_shader_ordering over ARB_fragment_shader_interlock is that it provides a function that operates as a memory barrier (instead of defining a critical section) and that can be called under arbitrary control flow from any function. In contrast, the begin/end functions of ARB_fragment_shader_interlock may only be called once, from main(), under no control flow.

    Signed-off-by: Kevin Rogovin <[email protected]>
    Reviewed-by: Plamena Manolova <[email protected]>
* mesa: expose AMD_gpu_shader_int64 | Marek Olšák | 2018-08-24 | 1 | -1/+2
    because the closed driver exposes it.

    It's equivalent to ARB_gpu_shader_int64. In this patch, I did everything the same as we do for ARB_gpu_shader_int64.

    Reviewed-by: Ian Romanick <[email protected]>
* glsl: Add built-in functions for INTEL_shader_atomic_float_minmax | Ian Romanick | 2018-08-22 | 1 | -1/+32
    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* glsl: Add built-in functions for NV_shader_atomic_float | Ian Romanick | 2018-08-22 | 1 | -3/+48
    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* Add NV_fragment_shader_interlock support. | Kevin Rogovin | 2018-08-20 | 1 | -0/+18
    The main purpose of having the NV_fragment_shader_interlock extension is that it is also available for GLES31, while the ARB extension is for GL only.

    Reviewed-by: Plamena Manolova <[email protected]>
* mesa/util: add allow_glsl_relaxed_es driconfig override | Timothy Arceri | 2018-06-19 | 1 | -1/+2
    This relaxes a number of ES shader restrictions allowing shaders to follow more desktop GLSL like rules.

    This initial implementation relaxes the following:

    - allows linking ES shaders with desktop shaders
    - allows mismatching precision qualifiers
    - always enables standard derivative builtins

    These relaxations allow Google Earth VR shaders to compile.

    Reviewed-by: Dave Airlie <[email protected]>
* mesa: Add GL/GLSL plumbing for ARB_fragment_shader_interlock. | Plamena Manolova | 2018-06-01 | 1 | -0/+54
    This extension provides new GLSL built-in functions beginInvocationInterlockARB() and endInvocationInterlockARB() that delimit a critical section of fragment shader code. For pairs of shader invocations with "overlapping" coverage in a given pixel, the OpenGL implementation will guarantee that the critical section of the fragment shader will be executed for only one fragment at a time.

    Signed-off-by: Plamena Manolova <[email protected]>
    Reviewed-by: Francisco Jerez <[email protected]>
* mesa: include mtypes.h less | Marek Olšák | 2018-04-12 | 1 | -1/+1
    - remove mtypes.h from most header files
    - add main/menums.h for often used definitions
    - remove main/core.h

    v2: fix radv build

    Reviewed-by: Brian Paul <[email protected]>
* mesa: implement ARB_compatibility | Marek Olšák | 2018-02-23 | 1 | -1/+1
    Tested-by: Dieter Nützel <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
* mesa: add OES_EGL_image_external_essl3 support | Ilia Mirkin | 2018-02-06 | 1 | -0/+17
    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Tapani Pälli <[email protected]>
* glsl: Remove ir_binop_greater and ir_binop_lequal expressions | Ian Romanick | 2017-10-30 | 1 | -7/+16
    NIR does not have these instructions. TGSI and Mesa IR both implement them using < and >=, respectively. Removing them deletes a bunch of code and means I don't have to add code to the SPIR-V generator for them.

    v2: Rebase on 2+ years of change... and fix a major bug added in the rebase.

       text    data     bss     dec     hex filename
    8255291  268856  294072 8818219  868e2b 32-bit i965_dri.so before
    8254235  268856  294072 8817163  868a0b 32-bit i965_dri.so after
    7815339  345592  420592 8581523  82f193 64-bit i965_dri.so before
    7813995  345560  420592 8580147  82ec33 64-bit i965_dri.so after

    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
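The removal above relies on a plain operand swap: `a > b` is `b < a`, and `a <= b` is `b >= a`. A minimal Python sketch of that rewrite follows. The helper names `less` and `gequal` are hypothetical stand-ins for the two operations that remain; this is an illustration, not Mesa code.

```python
def less(a, b):
    """Stand-in for the remaining 'less than' operation."""
    return a < b


def gequal(a, b):
    """Stand-in for the remaining 'greater than or equal' operation."""
    return a >= b


def greater(a, b):
    # a > b expressed as b < a: same operation, operands swapped.
    return less(b, a)


def lequal(a, b):
    # a <= b expressed as b >= a: same operation, operands swapped.
    return gequal(b, a)
```

Because only the operand order changes, the swapped forms also agree with the originals for NaN inputs, where every ordered comparison is false.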
* glsl: stop cloning builtin fuctions _mesa_glsl_find_builtin_function() | Timothy Arceri | 2017-08-11 | 1 | -10/+1
    The cloning was introduced in f81ede469910d to fix a problem with shaders including IR that was owned by builtins. However, the approach of cloning the whole function each time we reference a builtin led to a significant reduction in the GLSL IR compiler's performance.

    The previous patch fixes the ownership problem in a more precise way, so we can now remove this cloning.

    Testing on a Ryzen 7 1800X shows a ~15% decrease in the time to compile the Deus Ex: Mankind Divided shaders on radeonsi (which take 5min+ on some machines). Looking just at the GLSL IR compiler, the speed up is ~40%.

    Tested-by: Dieter Nützel <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
* glsl: update the extensions/functions that are enabled for 460 | Samuel Pitoiset | 2017-08-07 | 1 | -17/+94
    Other ones are either unsupported or don't have any helper function checks.

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
* glsl: check if any of the named builtins are available first | Ilia Mirkin | 2017-07-05 | 1 | -2/+11
    _mesa_glsl_has_builtin_function is used to determine whether any variant of a builtin is available, for the purpose of enforcing the GLSL ES 3.00+ rule that overloads or overrides of builtins are disallowed.

    However, the builtin_builder contains information on all builtins, irrespective of parse state, versions, or extension enablement. As a result we would say that a builtin existed even if it was not actually available.

    To resolve this, first check if at least one signature is available for a builtin before returning true.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101666
    Signed-off-by: Ilia Mirkin <[email protected]>
    Cc: [email protected]
    Reviewed-by: Timothy Arceri <[email protected]>
    Acked-by: Lionel Landwerlin <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
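The check described above, "a builtin exists only if at least one of its signatures is available in the current parse state", can be sketched in a few lines of Python. The data model here (`Signature`, a plain `dict` as the parse state, the function name `has_builtin`) is hypothetical and only mirrors the idea, not Mesa's actual structures.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Signature:
    # Predicate deciding whether this overload is usable in a given parse state.
    is_available: Callable[[dict], bool]


def has_builtin(builtins: Dict[str, List[Signature]], state: dict, name: str) -> bool:
    """Return True only if some signature of `name` is available in `state`.

    Merely finding `name` in the table is not enough, because the table
    lists every builtin regardless of version or enabled extensions.
    """
    return any(sig.is_available(state) for sig in builtins.get(name, []))
```

With a table such as `{"someOp": [Signature(lambda s: s["version"] >= 450)]}`, the function reports `someOp` as missing for a state with version 300 and as present for version 450, even though the name is in the table both times.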
* glsl: rename image_* qualifiers to memory_* | Samuel Pitoiset | 2017-05-04 | 1 | -15/+15
    It doesn't make sense to prefix them with 'image' because they are called "Memory Qualifiers" and they can be applied to members of storage buffer blocks.

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Andres Gomez <[email protected]>
* glsl: implement arb_shader_ballot builtins using intrinsics | Nicolai Hähnle | 2017-04-28 | 1 | -3/+83
* glsl: implement arb_shader_group_vote builtins via intrinsics | Nicolai Hähnle | 2017-04-28 | 1 | -6/+32
    Reviewed-by: Timothy Arceri <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* glsl: make use of glsl_type::is_double() | Samuel Pitoiset | 2017-04-21 | 1 | -6/+6
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
    Reviewed-by: Edward O'Callaghan <[email protected]>
* glsl: use the BA1 macro for textureQueryLevels() | Samuel Pitoiset | 2017-04-11 | 1 | -32/+33
    For both consistency and new bindless sampler types.

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Timothy Arceri <[email protected]>
* glsl: use the BA1 macro for textureSamples() | Samuel Pitoiset | 2017-04-11 | 1 | -9/+10
    For both consistency and new bindless sampler types.

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Timothy Arceri <[email protected]>
* glsl: use the BA1 macro for textureCubeArrayShadow() | Samuel Pitoiset | 2017-04-11 | 1 | -5/+6
    For both consistency and new bindless sampler types.

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Timothy Arceri <[email protected]>
* glsl: add ARB_shader_ballot builtin functions | Nicolai Hähnle | 2017-04-05 | 1 | -0/+77
    Reviewed-by: Marek Olšák <[email protected]>
* glsl: use -O1 optimization for builtin_functions.cpp with MinGW | Brian Paul | 2017-03-31 | 1 | -0/+20
    Some versions of MinGW-w64 such as 5.3.1 and 6.2.0 produce bad code with -O2 or -O3 causing a random driver crash when running programs that use GLSL. Most Mesa demos in the glsl/ directory trigger the bug, but not the fragcoord.c test.

    Use a #pragma to force -O1 for this file for later MinGW versions. Luckily, this is basically one-time setup code. I suspect the bug is related to the sheer size of this file.

    This should let us move to newer versions of MinGW-w64 for Mesa.

    Reviewed-by: Jose Fonseca <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
* glsl: fix clockARB builtin function | Nicolai Hähnle | 2017-03-31 | 1 | -1/+1
    The underlying intrinsic is defined to always have a uvec2 return type.

    Reviewed-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* glsl: builtin: always return clones of the builtins | Lionel Landwerlin | 2017-03-09 | 1 | -5/+17
    Builtins are created once and allocated using their own private ralloc context. When reparenting IR that includes builtins, we might steal bits of builtins. This is problematic because these builtins might now be freed when the shader that includes them last is disposed. This might also lead to inconsistent ralloc trees/lists if shaders are created on multiple threads.

    Rather than including builtins directly into a shader's IR, we should include clones of them in the ralloc context of the shader that requires them. This fixes double free issues we've been seeing when running shader-db on a big multicore (72 threads) server.

    v2: Also rename _mesa_glsl_find_builtin_function_by_name() to better reflect how this function is used. (Ken)

    v3: Rename ctx to mem_ctx (Ken)

    Signed-off-by: Lionel Landwerlin <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: correct compute shader checks for memoryBarrier functions | Marc Di Luzio | 2017-02-06 | 1 | -6/+12
    As per the spec: "The functions memoryBarrierShared() and groupMemoryBarrier() are available only in compute shaders; the other functions are available in all shader types."

    Conform to this by adding another delegate that checks for compute shader support, instead of only checking whether the current stage is compute.

    This allows some fragment shaders in Dirt Rally to compile.

    Cc: "17.0" <[email protected]>
    Reviewed-by: Anuj Phogat <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
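The distinction drawn above, "compute shaders are supported" versus "the current stage is a compute shader", can be illustrated with two availability predicates. This is a Python sketch under assumptions: the predicate names, the `dict` parse state, and the choice of `memoryBarrierBuffer` as the example of an "other" barrier function are all illustrative and not taken from the patch.

```python
def compute_shader_supported(state: dict) -> bool:
    """True when the context supports compute shaders at all."""
    return state["has_compute"]


def in_compute_stage(state: dict) -> bool:
    """True only while compiling a compute shader."""
    return state["has_compute"] and state["stage"] == "compute"


# Hypothetical availability table following the quoted spec sentence:
# the two compute-only barriers need the stage check, the rest do not.
AVAILABILITY = {
    "memoryBarrierShared": in_compute_stage,
    "groupMemoryBarrier": in_compute_stage,
    "memoryBarrierBuffer": compute_shader_supported,
}


def is_available(name: str, state: dict) -> bool:
    return AVAILABILITY[name](state)
```

Under this model a fragment shader on a compute-capable context can call `memoryBarrierBuffer` but not `groupMemoryBarrier`, which matches the behavior the spec sentence asks for.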
* glsl: Implement IEEE-compliant handling of atan2(±∞, ±∞). | Francisco Jerez | 2017-01-31 | 1 | -1/+21
    Reviewed-by: Ian Romanick <[email protected]>
    Reviewed-by: Juan A. Suarez Romero <[email protected]>
* glsl: Rewrite atan2 implementation to fix accuracy and handling of zero/infinity. | Francisco Jerez | 2017-01-31 | 1 | -36/+60
    This addresses several issues of the current atan2 implementation:

    - Negative zero (and negative denorms which end up getting flushed to zero) isn't handled correctly by the current implementation. The reason is that it does 'y >= 0' and 'x < 0' comparisons to decide on which side of the branch cut the argument is, which causes us to return incorrect results (off by up to 2π) for very small negative values.

    - There is a serious precision problem for x values of large enough magnitude introduced by the floating point division operation being implemented as a mul+rcp sequence. This can lead to the quotient getting flushed to zero in some cases, introducing an error of over 8e6 ULP in the result, or in the most catastrophic case will cause us to return NaN instead of the correct value ±π/2 for y=±∞ and x very large. We can fix this easily by scaling down both arguments when the absolute value of the denominator goes above a certain threshold. The error of this atan2 implementation remains below 25 ULP in most of its domain except for a neighborhood of y=0 where it reaches a maximum error of about 180 ULP.

    - It emits a bunch of instructions including no less than three if-else branches per scalar component that don't seem to get optimized out later on. This implementation uses about 13% fewer instructions on Intel SKL hardware and doesn't emit any control flow instructions.

    v2: Fix up argument scaling to take into account the range and precision of exotic FP24 hardware. Flip coordinate system for arguments along the vertical line as if they were on the left half-plane in order to avoid division by zero which may give unspecified results on non-GLSL 4.1-capable hardware. Sprinkle in some more comments.

    Reviewed-by: Ian Romanick <[email protected]>
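The negative-zero problem from the first bullet is easy to reproduce outside a shader. The Python sketch below is not the Mesa implementation; both helper functions are hypothetical and only cover the left half-plane (x < 0), where the branch cut lies. The first picks the side of the cut with a `y >= 0` comparison, the second uses the sign bit of y.

```python
import math


def naive_left_half_plane(y: float, x: float) -> float:
    """atan2 for x < 0 using a 'y >= 0' comparison to pick the side of the cut."""
    offset = math.pi if y >= 0 else -math.pi  # -0.0 >= 0 is True, so -0.0 goes the wrong way
    return math.atan(y / x) + offset


def sign_aware_left_half_plane(y: float, x: float) -> float:
    """atan2 for x < 0 using the sign bit of y, which distinguishes -0.0 from +0.0."""
    return math.atan(y / x) + math.copysign(math.pi, y)


reference = math.atan2(-0.0, -1.0)                 # IEEE result: -pi
naive = naive_left_half_plane(-0.0, -1.0)          # comes out as +pi
fixed = sign_aware_left_half_plane(-0.0, -1.0)     # matches the reference
```

The naive result differs from the reference by the full 2π mentioned in the commit message, and the same happens for any tiny negative y that a GPU flushes to negative zero before the comparison.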
* glsl: Add "built-in" functions to do 64%64 => 64 modulus | Ian Romanick | 2017-01-20 | 1 | -0/+8
    These functions are directly available in shaders. A #define is added to detect the presence. This allows these functions to be tested using piglit regardless of whether the driver uses them for lowering. The GLSL spec says that functions and macros beginning with __ are reserved for use by the implementation... hey, that's us!

    v2: Use function inlining.

    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* glsl: Add "built-in" functions to do 64/64 => 64 division | Ian Romanick | 2017-01-20 | 1 | -0/+8
    These functions are directly available in shaders. A #define is added to detect the presence. This allows these functions to be tested using piglit regardless of whether the driver uses them for lowering. The GLSL spec says that functions and macros beginning with __ are reserved for use by the implementation... hey, that's us!

    v2: Use function inlining.

    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* glsl: Add "built-in" function for 64-bit integer sign() | Ian Romanick | 2017-01-20 | 1 | -0/+4
    These functions are directly available in shaders. A #define is added to detect the presence. This allows these functions to be tested using piglit regardless of whether the driver uses them for lowering. The GLSL spec says that functions and macros beginning with __ are reserved for use by the implementation... hey, that's us!

    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* glsl: Add "built-in" functions to do 64x64 => 64 multiplication | Ian Romanick | 2017-01-20 | 1 | -0/+9
    These functions are directly available in shaders. A #define is added to detect the presence. This allows these functions to be tested using piglit regardless of whether the driver uses them for lowering. The GLSL spec says that functions and macros beginning with __ are reserved for use by the implementation... hey, that's us!

    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
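For readers unfamiliar with what a "64x64 => 64" lowering involves, here is a generic Python sketch of building a 64-bit product from 32-bit halves. The commit message does not describe the algorithm Mesa uses, so the function name `umul64` and the decomposition below are assumptions chosen for illustration.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def umul64(a: int, b: int) -> int:
    """Low 64 bits of a * b, computed only from 32-bit halves of the inputs."""
    a_lo, a_hi = a & MASK32, (a >> 32) & MASK32
    b_lo, b_hi = b & MASK32, (b >> 32) & MASK32

    # Full 64-bit product of the two low words.
    low = a_lo * b_lo
    # Cross terms land at bit 32, so only their low 32 bits can survive.
    cross = (a_lo * b_hi + a_hi * b_lo) & MASK32
    # a_hi * b_hi would land at bit 64 and is discarded entirely.
    return (low + (cross << 32)) & MASK64
```

The result can be checked against Python's arbitrary-precision product masked to 64 bits, for example `umul64(a, b) == (a * b) & MASK64`.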
* glsl: Move builtin_function related prototypes to a separate file | Ian Romanick | 2017-01-20 | 1 | -0/+1
    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* glsl: Add interaction between ARB_gpu_shader_int64 and ARB_shader_clock | Ian Romanick | 2017-01-20 | 1 | -1/+19
    If ARB_gpu_shader_int64 is supported, ARB_shader_clock also adds clockARB() that returns a uint64_t. Rather than add new opcodes and intrinsics for this, just wrap the existing intrinsic with a packUint2x32.

    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
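The wrapping described above amounts to combining two 32-bit words into one 64-bit value. A small Python sketch of the packing semantics follows; GLSL's packUint2x32 places the first component in the least significant 32 bits. The names `pack_uint_2x32` and `clock64`, and the `read_clock_2x32` callable standing in for the existing two-word intrinsic, are hypothetical.

```python
from typing import Callable, Tuple


def pack_uint_2x32(v: Tuple[int, int]) -> int:
    """Combine (low word, high word) into one unsigned 64-bit integer."""
    lo, hi = v
    return ((hi & 0xFFFFFFFF) << 32) | (lo & 0xFFFFFFFF)


def clock64(read_clock_2x32: Callable[[], Tuple[int, int]]) -> int:
    """64-bit clock built by wrapping an existing two-word clock read."""
    return pack_uint_2x32(read_clock_2x32())
```

For example, `clock64(lambda: (0x89ABCDEF, 0x01234567))` yields `0x0123456789ABCDEF`.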
* glsl: Add 64-bit integer functions | Dave Airlie | 2017-01-20 | 1 | -3/+174
    These are all the allowed 64-bit functions from ARB_gpu_shader_int64 spec.

    v2: restrict int64/double functions better.

    v3 (idr): Delete spurious blank lines. Suggested by Matt.

    Signed-off-by: Dave Airlie <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* glsl: Do not allow scalar types in vector relational functions | Boyan Ding | 2017-01-09 | 1 | -19/+10
    According to OpenGL Shading Language 4.50 spec, Section 8.7 "Vector Relational Functions", functions of this type do not operate on scalar types, so remove scalar types from signature definitions to make the behavior consistent with glslangValidator and other drivers.

    Reviewed-by: Matt Turner <[email protected]>
    Signed-off-by: Boyan Ding <[email protected]>
* treewide: s/comparitor/comparator/ | Ilia Mirkin | 2016-12-12 | 1 | -6/+6
    git grep -l comparitor | xargs sed -i 's/comparitor/comparator/g'

    Just happened to notice this in a patch that was sent and included one of the tokens in question.

    Signed-off-by: Ilia Mirkin <[email protected]>
    Acked-by: Nicolai Hähnle <[email protected]>
* glsl: Use a simpler formula for tanh | Jason Ekstrand | 2016-12-09 | 1 | -8/+10
    The formula we have used in the past is a trivial reduction from the definition by simply multiplying both the numerator and denominator of the formula by 2. However, multiplying by e^x, you can further reduce it. This allows us to get rid of one side of the clamp and two of the exponential functions, which should make it faster. The new formula still passes the dEQP precision tests for tanh so it should be fine.

    Reviewed-by: Roland Scheidegger <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
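Multiplying the numerator and denominator of (e^x - e^-x) / (e^x + e^-x) by e^x gives (e^2x - 1) / (e^2x + 1), which needs a single exponential. The Python sketch below illustrates that reduction; it is not the code from the patch. The clamp threshold of 10 is borrowed from the earlier tanh precision fix in this log, and keeping only the upper clamp is an inference: for very negative x the exponential simply underflows to 0 and the quotient becomes -1 on its own.

```python
import math


def tanh_sketch(x: float) -> float:
    """tanh via (e^(2x) - 1) / (e^(2x) + 1) with a one-sided clamp."""
    # Clamp from above only: without it, e^(2x) overflows for large x
    # and the quotient would turn into inf/inf.
    t = min(x, 10.0)
    e2 = math.exp(2.0 * t)
    return (e2 - 1.0) / (e2 + 1.0)
```

In double precision the clamped result for large x differs from 1.0 by about 4e-9, which is below single-precision resolution near 1.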
* compiler/glsl: fix precision problem of tanh | Haixia Shi | 2016-12-09 | 1 | -2/+10
    Clamp input scalar value to range [-10, +10] to avoid precision problems when the absolute value of input is too large.

    Fixes dEQP-GLES3.functional.shaders.builtin_functions.precision.tanh.* test failures.

    v2: added more explanation in the comment.
    v3: fixed a typo in the comment.

    Signed-off-by: Haixia Shi <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
    Cc: "13.0" <[email protected]>
* glsl: Disable textureOffset(sampler2DArrayShadow, ...) in GLSL ES. | Kenneth Graunke | 2016-10-16 | 1 | -1/+7
    This has apparently never existed in GLSL ES.

    Fixes dEQP-GLES3.functional.shaders.texture_functions.invalid
    .textureoffset_sampler2darrayshadow_vec4_ivec2_vertex and
    .textureoffset_sampler2darrayshadow_vec4_ivec2_fragment

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98244
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
* glsl: Kill __intrinsic_atomic_sub | Ian Romanick | 2016-10-04 | 1 | -8/+46
    Just generate an __intrinsic_atomic_add with a negated parameter.

    Some background on the non-obvious reasons for the big change to builtin_builder::call()... this is cribbed from some discussion with Ilia on mesa-dev.

    Why change builtin_builder::call() to allow taking dereferences and create them here rather than just feeding in the ir_variables directly?

    The problem is the neg_data ir_variable node would have to be in two lists at the same time: the instruction stream and parameters. The ir_variable node is automatically added to the instruction stream by the call to make_temp. Restructuring the code so that the ir_variables could be in parameters then move them to the instruction stream would have been pretty terrible.

    ir_call in the instruction stream has an exec_list that contains ir_dereference_variable nodes.

    The builtin_builder::call method previously took an exec_list of ir_variables and created a list of ir_dereference_variable. All of the original users of that method wanted to make a function call using exactly the set of parameters passed to the built-in function (i.e., call __intrinsic_atomic_add using the parameters to atomicAdd). For these users, the list of ir_variables already existed: the list of parameters in the built-in function signature.

    This new caller doesn't do that. It wants to call a function with a parameter from the function and a value calculated in the function. So, I changed builtin_builder::call to take a list that could either be a list of ir_variable or a list of ir_dereference_variable. In the former case it behaves just as it previously did. In the latter case, it uses (and removes from the input list) the ir_dereference_variable nodes instead of creating new ones.

       text    data     bss     dec     hex filename
    6036395  283160   28608 6348163  60dd83 lib64/i965_dri.so before
    6036923  283160   28608 6348691  60df93 lib64/i965_dri.so after

    Signed-off-by: Ian Romanick <[email protected]>
    Acked-by: Ilia Mirkin <[email protected]>
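The first sentence of the commit above, subtracting by adding a negated value, works because 32-bit integer arithmetic wraps modulo 2^32. A minimal Python sketch, with a plain list standing in for the memory location and hypothetical function names (real atomics would of course execute indivisibly on the GPU):

```python
MASK32 = 0xFFFFFFFF


def atomic_add(mem: list, index: int, data: int) -> int:
    """Add with 32-bit wraparound and return the previous value."""
    old = mem[index]
    mem[index] = (old + data) & MASK32
    return old


def atomic_sub(mem: list, index: int, data: int) -> int:
    """Subtract by adding the two's-complement negation of `data`."""
    neg_data = (-data) & MASK32
    return atomic_add(mem, index, neg_data)
```

Starting from `mem = [5]`, `atomic_sub(mem, 0, 7)` returns the old value 5 and leaves `0xFFFFFFFE` in memory, the same bit pattern as 5 - 7 in 32-bit arithmetic.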
* glsl: Remove ir_function_signature::_is_intrinsic field | Ian Romanick | 2016-10-04 | 1 | -2/+0
       text    data     bss     dec     hex filename
    6036491  283160   28608 6348259  60dde3 lib64/i965_dri.so before
    6036395  283160   28608 6348163  60dd83 lib64/i965_dri.so after

    Signed-off-by: Ian Romanick <[email protected]>
    Acked-by: Ilia Mirkin <[email protected]>
* glsl: Add ir_function_signature::is_intrinsic() method | Ian Romanick | 2016-10-04 | 1 | -2/+2
    This necessitated renaming the is_intrinsic field to _is_intrinsic. The next commit will remove the field.

       text    data     bss     dec     hex filename
    6036507  283160   28608 6348275  60ddf3 lib64/i965_dri.so before
    6036491  283160   28608 6348259  60dde3 lib64/i965_dri.so after

    Signed-off-by: Ian Romanick <[email protected]>
    Acked-by: Ilia Mirkin <[email protected]>
* glsl: Track a unique intrinsic ID with each intrinsic function | Ian Romanick | 2016-10-04 | 1 | -72/+136
       text    data     bss     dec     hex filename
    6037483  283160   28608 6349251  60e1c3 lib64/i965_dri.so before
    6038043  283160   28608 6349811  60e3f3 lib64/i965_dri.so after

    Signed-off-by: Ian Romanick <[email protected]>
    Acked-by: Ilia Mirkin <[email protected]>
* glsl: Delete ftransform support from builtin_functions.cpp. | Kenneth Graunke | 2016-09-23 | 1 | -26/+4
    This is now handled directly by ast_function.cpp.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
* mesa: add EXT_texture_cube_map_array support | Ilia Mirkin | 2016-08-28 | 1 | -0/+2
    This is identical to OES_texture_cube_map_array support. dEQP has tests which use this extension. Also it is part of AEP.

    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
* mesa: Add support for OES_texture_cube_map_array | Ian Romanick | 2016-08-26 | 1 | -10/+19
    This has a separate enable flag because this extension also requires OES_geometry_shader. It is possible that some drivers may support OpenGL ES 3.1 and ARB_texture_cube_map but not support OES_geometry_shader.

    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Add and use has_texture_cube_map_array helper | Ian Romanick | 2016-08-26 | 1 | -4/+2
    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* MESA_shader_integer_functions: Expose new built-in functions | Ian Romanick | 2016-07-19 | 1 | -11/+20
    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* glsl/main: remove unused params and make function static | Timothy Arceri | 2016-06-30 | 1 | -1/+1
    Reviewed-by: Iago Toral Quiroga <[email protected]>