aboutsummaryrefslogtreecommitdiffstats
path: root/src/compiler/glsl/builtin_functions.cpp
Commit message (Collapse)AuthorAgeFilesLines
* glsl: Enable textureSize for samplerExternalOESYevhenii Kolesnikov2019-10-311-0/+2
| | | | | | | | | From OES_EGL_image_external_essl3 Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1901 Signed-off-by: Yevhenii Kolesnikov <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* glsl/builtin: Add alternate versions of atan using new opsNeil Roberts2019-10-121-2/+31
| | | | | | | | | | Adds alternate versions of the atan builtin functions that use ir_unop_atan and ir_binop_atan2 instead of inlining to the IR implementation of the function. These alternatives are selected if the IR is going to be consumed by NIR. In that case the IR ops will be translated to the appropriate NIR op. Reviewed-by: Kristian H. Kristensen <[email protected]>
* glsl: Add helperInvocationEXT() builtinCaio Marcelo de Oliveira Filho2019-09-301-0/+36
| | | | | | | | | | | From EXT_demote_to_helper_invocation, implemented with the existing nir_intrinsic_is_helper_invocation. Such builtin is necessary when using `demote` because we can't redefine the value of gl_HelperInvocation (since it is an input variable). Reviewed-by: Kenneth Graunke <[email protected]>
* mesa/compiler: rework tear down of builtin/typesLionel Landwerlin2019-08-211-6/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | The issue we're running into when running CTS is that glsl types are deleted while builtins depending on them are not. This happens because on one hand we have glsl types ref counted, but builtins are not. Instead builtins are destroyed when unloading libGL or explicitly calling glReleaseShaderCompiler(). This change removes almost entirely any dealing with glsl types ref/unref by letting the builtins deal with it instead. In turn we introduce a builtin ref count mechanism. Each GL context takes a reference on the builtins when compiling a shader for the first time. It releases the reference when the context is destroyed. It can also explicitly release those when glReleaseShaderCompiler() is called. Finally we also take a reference on the glsl types when loading libGL to avoid recreating glsl types too often. v2: Ensure we take a reference if we don't have one in link step (Lionel) Signed-off-by: Lionel Landwerlin <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110796 Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
* glsl: add EXT_shader_image_load_store new image functionsPierre-Eric Pelloux-Prayer2019-08-061-0/+12
| | | | | | | | This extension has 2 functions that are missing from the ARB versions: - imageAtomicIncWrap - imageAtomicDecWrap Reviewed-by: Marek Olšák <[email protected]>
* glsl: handle differences between ARB/EXT versions of shader_image_load_storePierre-Eric Pelloux-Prayer2019-08-061-1/+13
| | | | Reviewed-by: Marek Olšák <[email protected]>
* glsl: Add builtin functions for EXT_texture_shadow_lodPaulo Zanoni2019-07-301-0/+26
| | | | | | | | | | | | | With the help of Sagar, Ian and Ivan. v2: Fix dependencies (Ian Romanick) v3: 1) fix function name (Marek Olsak) 2) Add check for extension enable (Marek Olsak) Signed-off-by: Paulo Zanoni <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Allow _textureCubeArrayShadow function to accept ir_texture_opcodePaulo Zanoni2019-07-301-4/+19
| | | | | | | | | | | This will be used to support one of the function from Ext_texture_shadow_lod specification. With the help of Sagar, Ian and Ivan. Signed-off-by: Paulo Zanoni <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: fix and clean up NV_compute_shader_derivatives supportMarek Olšák2019-05-021-54/+24
| | | | | | | - make sure compute shader derivatives are exposed for all extensions - unify duplicated code Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* glsl: handle interactions between EXT_gpu_shader4 and texture extensionsMarek Olšák2019-04-241-281/+354
| | | | | | also, EXT_texture_buffer_object has to be enabled separately. Reviewed-by: Eric Anholt <[email protected]>
* glsl: add texture builtin functions for EXT_gpu_shader4Marek Olšák2019-04-241-25/+667
| | | | | | | | | v2: some fixes to texture functions thanks to piglit tests Reviewed-by: Timothy Arceri <[email protected]> (v1) Reviewed-by: Ian Romanick <[email protected]> (v1) Tested-by: Dieter Nützel <[email protected]> (v1) Reviewed-by: Eric Anholt <[email protected]>
* glsl: add arithmetic builtin functions for EXT_gpu_shader4Marek Olšák2019-04-241-13/+35
| | | | | | | Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Tested-by: Dieter Nützel <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* glsl: Enable texture builtins for NV_compute_shader_derivativesCaio Marcelo de Oliveira Filho2019-04-081-140/+153
| | | | | | | Renamed a few predicates from "fs_only" to be "derivative_only" (or similar pairs). Reviewed-by: Ian Romanick <[email protected]>
* glsl: Enable derivative builtins for NV_compute_shader_derivativesCaio Marcelo de Oliveira Filho2019-04-081-9/+25
| | | | Reviewed-by: Ian Romanick <[email protected]>
* glsl: [u/i]mulExtended optimization for GLSLSagar Ghuge2019-03-041-2/+30
| | | | | | | | | | | | | | | Optimize mulExtended to use 32x32->64 multiplication. Drivers which are not based on NIR, they can set the MUL64_TO_MUL_AND_MUL_HIGH lowering flag in order to have same old behavior. v2: Add missing condition check (Jason Ekstrand) Signed-off-by: Sagar Ghuge <[email protected]> Suggested-by: Matt Turner <Matt Turner <[email protected]> Suggested-by: Jason Ekstrand <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* mesa: Expose EXT_texture_query_lod and add support for its use shadersGert Wollny2019-03-031-1/+2
| | | | | | | | | | EXT_texture_query_lod provides the same functionality for GLES like the ARB extension with the same name for GL. v2: Set ES 3.0 as minimum GLES version as required by the extension Signed-off-by: Gert Wollny <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* mesa: expose AMD_texture_texture4Marek Olšák2018-12-041-0/+10
| | | | | | because the closed driver exposes it. Tested by piglit. Reviewed-by: Ilia Mirkin <[email protected]>
* mesa: Revert INTEL_fragment_shader_ordering supportMatt Turner2018-12-031-17/+0
| | | | | | | | | | | | | | | | This extension is not properly tested (testing for GL_ARB_fragment_shader_interlock is not sufficient), and since this was noted in review on August 28th no tests have been sent. Revert "i965: Add INTEL_fragment_shader_ordering support." Revert "mesa: Add GL/GLSL plumbing for INTEL_fragment_shader_ordering" This reverts commit 03ecec9ed2099f6e2b62994b33dc948dc731e7b8. This reverts commit 119435c8778dd26cb7c8bcde9f04b3982239fe60. Cc: [email protected] Acked-by: Jason Ekstrand <[email protected]> Acked-by: Eric Anholt <[email protected]>
* mesa: Add GL/GLSL plumbing for INTEL_fragment_shader_orderingKevin Rogovin2018-08-281-0/+17
| | | | | | | | | | | | | | | | | | | | | | | | This extension provides new GLSL built-in function beginFragmentShaderOrderingIntel() that guarantees (taking wording of GL_INTEL_fragment_shader_ordering extension) that any memory transactions issued by shader invocations from previous primitives mapped to same xy window coordinates (and same sample when per-sample shading is active), complete and are visible to the shader invocation that called beginFragmentShaderOrderingINTEL(). One advantage of INTEL_fragment_shader_ordering over ARB_fragment_shader_interlock is that it provides a function that operates as a memory barrie (instead of a defining a critcial section) that can be called under arbitary control flow from any function (in contrast the begin/end of ARB_fragment_shader_interlock may only be called once, from main(), under no control flow. Signed-off-by: Kevin Rogovin <[email protected]> Reviewed-by: Plamena Manolova <[email protected]>
* mesa: expose AMD_gpu_shader_int64Marek Olšák2018-08-241-1/+2
| | | | | | | | | because the closed driver exposes it. It's equivalent to ARB_gpu_shader_int64. In this patch, I did everything the same as we do for ARB_gpu_shader_int64. Reviewed-by: Ian Romanick <[email protected]>
* glsl: Add built-in functions for INTEL_shader_atomic_float_minmaxIan Romanick2018-08-221-1/+32
| | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* glsl: Add built-in functions for NV_shader_atomic_floatIan Romanick2018-08-221-3/+48
| | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* Add NV_fragment_shader_interlock support.Kevin Rogovin2018-08-201-0/+18
| | | | | | | | The main purpose for having NV_fragment_shader_interlock extension is because that extension is also for GLES31 while the ARB extension is for GL only. Reviewed-by: Plamena Manolova <[email protected]>
* mesa/util: add allow_glsl_relaxed_es driconfig overrideTimothy Arceri2018-06-191-1/+2
| | | | | | | | | | | | | | | This relaxes a number of ES shader restrictions allowing shaders to follow more desktop GLSL like rules. This initial implementation relaxes the following: - allows linking ES shaders with desktop shaders - allows mismatching precision qualifiers - always enables standard derivative builtins These relaxations allow Google Earth VR shaders to compile. Reviewed-by: Dave Airlie <[email protected]>
* mesa: Add GL/GLSL plumbing for ARB_fragment_shader_interlock.Plamena Manolova2018-06-011-0/+54
| | | | | | | | | | | | | This extension provides new GLSL built-in functions beginInvocationInterlockARB() and endInvocationInterlockARB() that delimit a critical section of fragment shader code. For pairs of shader invocations with "overlapping" coverage in a given pixel, the OpenGL implementation will guarantee that the critical section of the fragment shader will be executed for only one fragment at a time. Signed-off-by: Plamena Manolova <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* mesa: include mtypes.h lessMarek Olšák2018-04-121-1/+1
| | | | | | | | | | - remove mtypes.h from most header files - add main/menums.h for often used definitions - remove main/core.h v2: fix radv build Reviewed-by: Brian Paul <[email protected]>
* mesa: implement ARB_compatibilityMarek Olšák2018-02-231-1/+1
| | | | | Tested-by: Dieter Nützel <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* mesa: add OES_EGL_image_external_essl3 supportIlia Mirkin2018-02-061-0/+17
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
* glsl: Remove ir_binop_greater and ir_binop_lequal expressionsIan Romanick2017-10-301-7/+16
| | | | | | | | | | | | | | | | | | | NIR does not have these instructions. TGSI and Mesa IR both implement them using < and >=, repsectively. Removing them deletes a bunch of code and means I don't have to add code to the SPIR-V generator for them. v2: Rebase on 2+ years of change... and fix a major bug added in the rebase. text data bss dec hex filename 8255291 268856 294072 8818219 868e2b 32-bit i965_dri.so before 8254235 268856 294072 8817163 868a0b 32-bit i965_dri.so after 7815339 345592 420592 8581523 82f193 64-bit i965_dri.so before 7813995 345560 420592 8580147 82ec33 64-bit i965_dri.so after Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* glsl: stop cloning builtin fuctions _mesa_glsl_find_builtin_function()Timothy Arceri2017-08-111-10/+1
| | | | | | | | | | | | | | | | | | | | | The cloning was introduced in f81ede469910d to fix a problem with shaders including IR that was owned by builtins. However the approach of cloning the whole function each time we reference a builtin lead to a significant reduction in the GLSL IR compilers performance. The previous patch fixes the ownership problem in a more precise way. So we can now remove this cloning. Testing on a Ryzen 7 1800X shows a ~15% decreases in compiling the Deus Ex: Mankind Divided shaders on radeonsi (which take 5min+ on some machines). Looking just at the GLSL IR compiler the speed up is ~40%. Tested-by: Dieter Nützel <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* glsl: update the extensions/functions that are enabled for 460Samuel Pitoiset2017-08-071-17/+94
| | | | | | | | Other ones are either unsupported or don't have any helper function checks. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* glsl: check if any of the named builtins are available firstIlia Mirkin2017-07-051-2/+11
| | | | | | | | | | | | | | | | | | | | | _mesa_glsl_has_builtin_function is used to determine whether any variant of a builtin are available, for the purpose of enforcing the GLSL ES 3.00+ rule that overloads or overrides of builtins are disallowed. However the builtin_builder contains information on all builtins, irrespective of parse state, or versions, or extension enablement. As a result we would say that a builtin existed even if it was not actually available. To resolve this, first check if at least one signature is available for a builtin before returning true. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101666 Signed-off-by: Ilia Mirkin <[email protected]> Cc: [email protected] Reviewed-by: Timothy Arceri <[email protected]> Acked-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: rename image_* qualifiers to memory_*Samuel Pitoiset2017-05-041-15/+15
| | | | | | | | | It doesn't make sense to prefix them with 'image' because they are called "Memory Qualifiers" and they can be applied to members of storage buffer blocks. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Andres Gomez <[email protected]>
* glsl: implement arb_shader_ballot builtins using intrinsicsNicolai Hähnle2017-04-281-3/+83
|
* glsl: implement arb_shader_group_vote builtins via intrinsicsNicolai Hähnle2017-04-281-6/+32
| | | | | Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* glsl: make use of glsl_type::is_double()Samuel Pitoiset2017-04-211-6/+6
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* glsl: use the BA1 macro for textureQueryLevels()Samuel Pitoiset2017-04-111-32/+33
| | | | | | | For both consistency and new bindless sampler types. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* glsl: use the BA1 macro for textureSamples()Samuel Pitoiset2017-04-111-9/+10
| | | | | | | For both consistency and new bindless sampler types. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* glsl: use the BA1 macro for textureCubeArrayShadow()Samuel Pitoiset2017-04-111-5/+6
| | | | | | | For both consistency and new bindless sampler types. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* glsl: add ARB_shader_ballot builtin functionsNicolai Hähnle2017-04-051-0/+77
| | | | Reviewed-by: Marek Olšák <[email protected]>
* glsl: use -O1 optimization for builtin_functions.cpp with MinGWBrian Paul2017-03-311-0/+20
| | | | | | | | | | | | | | | | Some versions of MinGW-w64 such as 5.3.1 and 6.2.0 produce bad code with -O2 or -O3 causing a random driver crash when running programs that use GLSL. Most Mesa demos in the glsl/ directory trigger the bug, but not the fragcoord.c test. Use a #pragma to force -O1 for this file for later MinGW versions. Luckily, this is basically one-time setup code. I suspect the bug is related to the sheer size of this file. This should let us move to newer versions of MinGW-w64 for Mesa. Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* glsl: fix clockARB builtin functionNicolai Hähnle2017-03-311-1/+1
| | | | | | | The underlying intrinsic is defined to always have a uvec2 return type. Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* glsl: builtin: always return clones of the builtinsLionel Landwerlin2017-03-091-5/+17
| | | | | | | | | | | | | | | | | | | | | | Builtins are created once and allocated using their own private ralloc context. When reparenting IR that includes builtins, we might be steal bits of builtins. This is problematic because these builtins might now be freed when the shader that includes then last is disposed. This might also lead to inconsistent ralloc trees/lists if shaders are created on multiple threads. Rather than including builtins directly into a shader's IR, we should include clones of them in the ralloc context of the shader that requires them. This fixes double free issues we've been seeing when running shader-db on a big multicore (72 threads) server. v2: Also rename _mesa_glsl_find_builtin_function_by_name() to better reflect how this function is used. (Ken) v3: Rename ctx to mem_ctx (Ken) Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: correct compute shader checks for memoryBarrier functionsMarc Di Luzio2017-02-061-6/+12
| | | | | | | | | | | | | | | | As per the spec - "The functions memoryBarrierShared() and groupMemoryBarrier() are available only in compute shaders; the other functions are available in all shader types." Conform to this by adding another delegate to check for compute shader support instead of only whether the current stage is compute This allows some fragment shaders in Dirt Rally to compile Cc: "17.0" <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Implement IEEE-compliant handling of atan2(±∞, ±∞).Francisco Jerez2017-01-311-1/+21
| | | | | Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Juan A. Suarez Romero <[email protected]>
* glsl: Rewrite atan2 implementation to fix accuracy and handling of ↵Francisco Jerez2017-01-311-36/+60
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | zero/infinity. This addresses several issues of the current atan2 implementation: - Negative zero (and negative denorms which end up getting flushed to zero) isn't handled correctly by the current implementation. The reason is that it does 'y >= 0' and 'x < 0' comparisons to decide on which side of the branch cut the argument is, which causes us to return incorrect results (off by up to 2π) for very small negative values. - There is a serious precision problem for x values of large enough magnitude introduced by the floating point division operation being implemented as a mul+rcp sequence. This can lead to the quotient getting flushed to zero in some cases introducing an error of over 8e6 ULP in the result -- Or in the most catastrophic case will cause us to return NaN instead of the correct value ±π/2 for y=±∞ and x very large. We can fix this easily by scaling down both arguments when the absolute value of the denominator goes above certain threshold. The error of this atan2 implementation remains below 25 ULP in most of its domain except for a neighborhood of y=0 where it reaches a maximum error of about 180 ULP. - It emits a bunch of instructions including no less than three if-else branches per scalar component that don't seem to get optimized out later on. This implementation uses about 13% less instructions on Intel SKL hardware and doesn't emit any control flow instructions. v2: Fix up argument scaling to take into account the range and precision of exotic FP24 hardware. Flip coordinate system for arguments along the vertical line as if they were on the left half-plane in order to avoid division by zero which may give unspecified results on non-GLSL 4.1-capable hardware. Sprinkle in some more comments. Reviewed-by: Ian Romanick <[email protected]>
* glsl: Add "built-in" functions to do 64%64 => 64 modulusIan Romanick2017-01-201-0/+8
| | | | | | | | | | | | | These functions are directly available in shaders. A #define is added to detect the presence. This allows these functions to be tested using piglit regardless of whether the driver uses them for lowering. The GLSL spec says that functions and macros beginning with __ are reserved for use by the implementation... hey, that's us! v2: Use function inlining. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* glsl: Add "built-in" functions to do 64/64 => 64 divisionIan Romanick2017-01-201-0/+8
| | | | | | | | | | | | | These functions are directly available in shaders. A #define is added to detect the presence. This allows these functions to be tested using piglit regardless of whether the driver uses them for lowering. The GLSL spec says that functions and macros beginning with __ are reserved for use by the implementation... hey, that's us! v2: Use function inlining. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* glsl: Add "built-in" function for 64-bit integer sign()Ian Romanick2017-01-201-0/+4
| | | | | | | | | | | These functions are directly available in shaders. A #define is added to detect the presence. This allows these functions to be tested using piglit regardless of whether the driver uses them for lowering. The GLSL spec says that functions and macros beginning with __ are reserved for use by the implementation... hey, that's us! Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* glsl: Add "built-in" functions to do 64x64 => 64 multiplicationIan Romanick2017-01-201-0/+9
| | | | | | | | | | | These functions are directly available in shaders. A #define is added to detect the presence. This allows these functions to be tested using piglit regardless of whether the driver uses them for lowering. The GLSL spec says that functions and macros beginning with __ are reserved for use by the implementation... hey, that's us! Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>