path: root/src/compiler
Commit message (Author, Age, Files, Lines)
* glsl: fix lexer for unreachable defines (Timothy Arceri, 2018-09-06, 2 files, -23/+38)
| | | | | | | | | | | | | | | | | | | | | | | | If we have something like: #ifdef NOT_DEFINED #define A_MACRO(x) \ if (x) #endif The # on the #define is not skipped but the define itself is so this then gets recognised as #if. Until 28a3731e3f this didn't happen because we ended up in <HASH>{NONSPACE} where BEGIN INITIAL was called stopping the problem from happening. This change makes sure we never call RETURN_TOKEN_NEVER_SKIP for if/else/endif when processing a define. Cc: Ian Romanick <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107772 Tested-By: Eero Tamminen <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* glsl: avoid lowering texcoord array except in simple cases (Ilia Mirkin, 2018-08-29, 1 file, -0/+6)
| | | | | | | | | With compat creeping up to geometry and tess shaders, lowering texcoord accesses/writes becomes more complicated. Since it's an optimization anyways, just avoid the complication for now. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* glsl: add a mechanism to allow layout qualifiers on function params (Timothy Arceri, 2018-08-30, 3 files, -0/+20)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The spec is quite clear this is not allowed: From Section 4.4. (Layout Qualifiers) of the GLSL 4.60 spec: "Layout qualifiers can appear in several forms of declaration. They can appear as part of an interface block definition or block member, as shown in the grammar in the previous section. They can also appear with just an interface-qualifier to establish layouts of other declarations made with that qualifier: layout-qualifier interface-qualifier ; Or, they can appear with an individual variable declared with an interface qualifier: layout-qualifier interface-qualifier declaration ;" From Section 4.10 (Memory Qualifiers) of the GLSL 4.60 spec: "Layout qualifiers cannot be used on formal function parameters, and layout qualification is not included in parameter matching." However, on the Nvidia binary driver shaders actually fail to compile if image function params don't have a layout qualifier. This results in applications such as No Man's Sky using layout qualifiers on params. I've submitted a CTS test to expose this problem in the Nvidia driver, but until that is resolved this patch will help Mesa drivers work around the issue. Reviewed-by: Marek Olšák <[email protected]>
* glsl: skip stringification in preprocessor if in unreachable branch (Timothy Arceri, 2018-08-30, 1 file, -2/+4)
| | | | | | | This fixes compilation of some "No Man's Sky" shaders where the stringification happens in branches intended for DX12. Reviewed-by: Ian Romanick <[email protected]>
* anv,i965: Lower away image derefs in the driver (Jason Ekstrand, 2018-08-29, 1 file, -3/+3)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, the back-end compiler turned image access into magic uniform reads and there was a complex contract between the back-end compiler and the driver about setting up and filling out those params. As of this commit, both drivers now lower image_deref_load_param_intel intrinsics to load_uniform intrinsics controlled by the driver and lower the other image_deref_* intrinsics to image_* intrinsics which take an actual binding table index. There are still "magic" uniforms but they are now added and controlled entirely by the driver and that contract no longer spans components. This also has the side-effect of making most image use compile-time binding table indices. Previously, all image access pulled the binding table index from a uniform. Part of the reason for this was that the magic uniforms made it difficult to decouple binding table indices from the uniforms and, since they are indexed completely differently (especially in Vulkan), it was hard to pull them apart. Now that the driver is handling both, it's trivial to decouple the two and provide actual binding table indices. Shader-db results on Kaby Lake: total instructions in shared programs: 15166872 -> 15164293 (-0.02%) instructions in affected programs: 115834 -> 113255 (-2.23%) helped: 191 HURT: 0 total cycles in shared programs: 571311495 -> 571196465 (-0.02%) cycles in affected programs: 4757115 -> 4642085 (-2.42%) helped: 73 HURT: 67 total spills in shared programs: 10951 -> 10926 (-0.23%) spills in affected programs: 742 -> 717 (-3.37%) helped: 7 HURT: 0 total fills in shared programs: 22226 -> 22201 (-0.11%) fills in affected programs: 1146 -> 1121 (-2.18%) helped: 7 HURT: 0 Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Add handle/index-based image intrinsics (Jason Ekstrand, 2018-08-29, 3 files, -20/+82)
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Use a bitfield for image access qualifiers (Jason Ekstrand, 2018-08-29, 6 files, -29/+39)
| | | | | | | | | | This commit expands the current memory access enum to contain the extra two bits provided for images. We choose to follow the SPIR-V convention of NonReadable and NonWriteable because readonly implies that you *can* read so readonly + writeonly doesn't make as much sense as NonReadable + NonWriteable. Reviewed-by: Kenneth Graunke <[email protected]>
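The NonReadable/NonWriteable convention described above can be sketched as a two-bit flag set. This is only an illustration: the class name, member names, and bit values below are hypothetical and are not taken from the NIR headers.

```python
from enum import IntFlag


class ImageAccess(IntFlag):
    # Hypothetical names and bit positions, chosen only for illustration.
    NON_READABLE = 1 << 0
    NON_WRITEABLE = 1 << 1


def from_glsl_qualifiers(readonly: bool, writeonly: bool) -> ImageAccess:
    """Translate GLSL 'readonly'/'writeonly' into negative access bits."""
    access = ImageAccess(0)
    if readonly:
        # 'readonly' means the image may not be written.
        access |= ImageAccess.NON_WRITEABLE
    if writeonly:
        # 'writeonly' means the image may not be read.
        access |= ImageAccess.NON_READABLE
    return access
```

Expressed this way, setting both qualifiers simply sets both bits, which reads naturally as "neither readable nor writeable".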
* glsl/link,i965: Make ImageAccess four-state (Jason Ekstrand, 2018-08-29, 2 files, -6/+10)
| | | | | | | | | | | | | | | | | The GLSL spec allows you to set both the "readonly" and "writeonly" qualifiers on images to indicate that it can only be used with imageSize. However, we had no way of representing this in the linked shader and flagged it as GL_READ_ONLY. This is good from a "does it use this buffer?" perspective but not from a format and access lowering perspective. By using GL_NONE if "readonly" and "writeonly" are both set, we can detect this case in the driver and handle it correctly. Nothing currently relies on the type of surface in the "readonly" + "writeonly" case but that's about to change. i965 is the only driver which uses the ImageAccess field and gl_bindless_image::access is currently unused. Reviewed-by: Kenneth Graunke <[email protected]>
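A minimal sketch of the four-state mapping this message describes. The GL enum values are the standard OpenGL constants; the function name `image_access` is hypothetical and does not appear in the linker.

```python
# Standard OpenGL enum values.
GL_NONE = 0
GL_READ_ONLY = 0x88B8
GL_WRITE_ONLY = 0x88B9
GL_READ_WRITE = 0x88BA


def image_access(readonly: bool, writeonly: bool) -> int:
    """Map the two GLSL qualifiers onto four distinct access states."""
    if readonly and writeonly:
        # Only imageSize() is usable; neither loads nor stores are allowed.
        return GL_NONE
    if readonly:
        return GL_READ_ONLY
    if writeonly:
        return GL_WRITE_ONLY
    return GL_READ_WRITE
```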
* intel/compiler: Do image load/store lowering to NIR (Jason Ekstrand, 2018-08-29, 1 file, -0/+9)
| | | | | | | | | | | | | | | | | | | | | | | This commit moves our storage image format conversion codegen into NIR instead of doing it in the back-end. This has the advantage of letting us run it through NIR's optimizer which is pretty effective at shrinking things down. In the common case of rgba8, the number of instructions emitted after NIR is done with it is half of what it was with the lowering happening in the back-end. On the downside, the back-end's lowering is able to directly use predicates and the NIR lowering has to use IFs. Shader-db results on Kaby Lake: total instructions in shared programs: 15166910 -> 15166872 (<.01%) instructions in affected programs: 5895 -> 5857 (-0.64%) helped: 15 HURT: 0 Clearly, we don't have that much image_load_store happening in the shaders in shader-db.... Reviewed-by: Kenneth Graunke <[email protected]>
* nir/types: Add a wrapper for coordinate_components (Jason Ekstrand, 2018-08-29, 2 files, -0/+8)
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Make image load/store intrinsics variable-width (Jason Ekstrand, 2018-08-29, 3 files, -4/+11)
| | | | | | | | | Instead of requiring 4 components, this allows them to potentially use fewer. Both the SPIR-V and GLSL paths still generate vec4 intrinsics so drivers which assume 4 components should be safe. However, we want to be able to shrink them for i965. Reviewed-by: Kenneth Graunke <[email protected]>
* nir/format_convert: Fix a bitmask in unpack_11f11f10f (Jason Ekstrand, 2018-08-29, 1 file, -1/+1)
| | | | | | Fixes: 4e337b42f9a2 "nir/format_convert: Add pack/unpack for R11F_G11F_B10F" Reviewed-by: Kenneth Graunke <[email protected]>
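For context on where a bitmask matters in this kind of unpack, here is a rough sketch of decoding an R11F_G11F_B10F word. It assumes the standard OpenGL packed-float layout (R in bits 0-10, G in bits 11-21, B in bits 22-31, each unsigned with a 5-bit exponent). Widening each field to a half-float bit pattern is one convenient way to decode it and is not necessarily how the NIR helper does it; the function names are illustrative.

```python
import struct


def half_bits_to_float(bits: int) -> float:
    """Reinterpret a 16-bit pattern as an IEEE half-precision float."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]


def unpack_11f11f10f(packed: int):
    """Decode a 32-bit R11F_G11F_B10F word into three Python floats."""
    r = packed & 0x7FF          # 11 bits: 5 exponent, 6 mantissa
    g = (packed >> 11) & 0x7FF  # 11 bits: 5 exponent, 6 mantissa
    b = (packed >> 22) & 0x3FF  # 10 bits: 5 exponent, 5 mantissa
    # Shifting left pads the mantissa to 10 bits, giving a half float
    # with the sign bit clear.
    return (
        half_bits_to_float(r << 4),
        half_bits_to_float(g << 4),
        half_bits_to_float(b << 5),
    )
```

A wrong mask on any one field (for example 0x3FF where 0x7FF is needed) would silently drop the top exponent bit, which is the sort of error a one-line fix like this addresses.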
* nir/format_convert: Rename pack_r11g11b10f to pack_11f11f10f (Jason Ekstrand, 2018-08-29, 1 file, -1/+1)
| | | | | | This matches the unpack function. Reviewed-by: Kenneth Graunke <[email protected]>
* nir/format_convert: Add [us]norm conversion helpers (Jason Ekstrand, 2018-08-29, 1 file, -0/+56)
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
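The commit gives no detail, so as background only: the conventional UNORM/SNORM mappings can be sketched as below. The function names are hypothetical, and the rounding mode (Python's `round`, which rounds halves to even) is an assumption rather than what the NIR helpers emit.

```python
def float_to_unorm(f: float, bits: int) -> int:
    """Clamp to [0, 1] and scale to an unsigned integer of 'bits' bits."""
    max_val = (1 << bits) - 1
    clamped = min(max(f, 0.0), 1.0)
    return int(round(clamped * max_val))


def unorm_to_float(u: int, bits: int) -> float:
    """Inverse mapping: divide by the largest representable value."""
    return u / ((1 << bits) - 1)


def float_to_snorm(f: float, bits: int) -> int:
    """Clamp to [-1, 1] and scale to a signed integer of 'bits' bits."""
    max_val = (1 << (bits - 1)) - 1
    clamped = min(max(f, -1.0), 1.0)
    return int(round(clamped * max_val))


def snorm_to_float(s: int, bits: int) -> float:
    """Inverse mapping; the most negative code also maps to -1.0."""
    max_val = (1 << (bits - 1)) - 1
    return max(s / max_val, -1.0)
```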
* nir/format_convert: Rename nir_format_bitcast_uint_vec (Jason Ekstrand, 2018-08-29, 1 file, -2/+3)
| | | | | | | | We have a name for that, it's called a uvec. This just makes the function name a bit shorter. While we're here, we also add an assert for one of the assumptions this function makes. Reviewed-by: Kenneth Graunke <[email protected]>
* nir/format_convert: Add vec mask and sign-extend helpers (Jason Ekstrand, 2018-08-29, 1 file, -8/+27)
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* nir/format_convert: Add support for unpacking signed integers (Jason Ekstrand, 2018-08-29, 1 file, -8/+29)
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* nir/opcodes: Make unpack_half_2x16_split_* variable-width (Jason Ekstrand, 2018-08-29, 1 file, -4/+4)
| | | | | | | There is nothing inherent about these opcodes that requires them to only take scalars. It's very convenient if we let them take vectors as well. Reviewed-by: Kenneth Graunke <[email protected]>
* nir/algebraic: Add some max/min optimizations (Jason Ekstrand, 2018-08-29, 1 file, -0/+6)
| | | | | | | | | | | | | | | Found by inspection. This doesn't help much now but we'll see this pattern with images if you load UNORM and then store UNORM. Shader-db results on Kaby Lake: total instructions in shared programs: 15166916 -> 15166910 (<.01%) instructions in affected programs: 761 -> 755 (-0.79%) helped: 6 HURT: 0 Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
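The message does not list the patterns it adds. One simplification consistent with the "load UNORM, then store UNORM" description is that clamping twice to the same bounds is the same as clamping once. The sketch below checks that identity on sample values; it is an assumed example, not the rule text from the algebraic pass.

```python
def clamp(a, lo, hi):
    """min(max(a, lo), hi); assumes lo <= hi."""
    return min(max(a, lo), hi)


def clamp_twice(a, lo, hi):
    """What a UNORM load followed by a UNORM store would compute."""
    return clamp(clamp(a, lo, hi), lo, hi)
```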
* nir/algebraic: Add more extract_[iu](8|16) optimizations (Jason Ekstrand, 2018-08-29, 1 file, -0/+10)
| | | | | | | | | | | | | | | | This adds the "(a << N) >> M" family of mask or sign-extensions. Not a huge win right now but this pattern will soon be generated by NIR format lowering code. Shader-db results on Kaby Lake: total instructions in shared programs: 15166918 -> 15166916 (<.01%) instructions in affected programs: 36 -> 34 (-5.56%) helped: 2 HURT: 0 Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
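To illustrate the "(a << N) >> M" family, the sketch below models 32-bit shifts in Python and checks that a shift pair selects the same byte as an extract. The helper names mirror NIR opcode names, but the implementations are stand-ins written for this example.

```python
MASK32 = 0xFFFFFFFF


def ishl(a: int, n: int) -> int:
    """32-bit left shift."""
    return (a << n) & MASK32


def ushr(a: int, n: int) -> int:
    """32-bit logical right shift."""
    return (a & MASK32) >> n


def ishr(a: int, n: int) -> int:
    """32-bit arithmetic right shift, result as an unsigned bit pattern."""
    a &= MASK32
    signed = a - (1 << 32) if a & 0x80000000 else a
    return (signed >> n) & MASK32


def extract_u8(a: int, byte: int) -> int:
    """Zero-extended byte 'byte' of a 32-bit value."""
    return ((a & MASK32) >> (8 * byte)) & 0xFF


def extract_i8(a: int, byte: int) -> int:
    """Sign-extended byte 'byte' of a 32-bit value, as a 32-bit pattern."""
    v = extract_u8(a, byte)
    return (v - 0x100 if v & 0x80 else v) & MASK32
```

With these definitions, `ushr(ishl(a, 24), 24)` matches `extract_u8(a, 0)` and `ishr(ishl(a, 16), 24)` matches `extract_i8(a, 1)`: the left shift discards the bytes above the one of interest, and the right shift discards those below while zero- or sign-extending.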
* nir/algebraic: Be more careful converting ushr to extract_u8/16 (Jason Ekstrand, 2018-08-29, 1 file, -2/+2)
| | | | | | | | | | If it's not the right bit-size, it may not actually be the correct extraction. For now, we'll only worry about 32-bit versions. Fixes: 905ff8619824 "nir: Recognize open-coded extract_u16" Fixes: 76289fbfa84a "nir: Recognize open-coded extract_u8" Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
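A small sketch of why the bit-size matters: shifting right by 24 isolates the top byte only when the value is 32 bits wide. The helpers take an explicit bit size and are illustrative stand-ins, not NIR code.

```python
def ushr(a: int, n: int, bit_size: int) -> int:
    """Logical right shift of a value that is 'bit_size' bits wide."""
    return (a & ((1 << bit_size) - 1)) >> n


def extract_u8(a: int, byte: int) -> int:
    """Zero-extended byte number 'byte' of a."""
    return (a >> (8 * byte)) & 0xFF


value32 = 0x55667788
value64 = 0x1122334455667788

# For a 32-bit value the shift leaves exactly one byte.
same_for_32 = ushr(value32, 24, 32) == extract_u8(value32, 3)

# For a 64-bit value the shift leaves five bytes, so the rewrite is wrong.
same_for_64 = ushr(value64, 24, 64) == extract_u8(value64, 3)
```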
* glsl/linker: Link all out vars from shader objects on a single stage (vadym.shovkoplias, 2018-08-29, 1 file, -0/+37)
| | | | | | | | | During intra-stage linking, some out variables can be dropped because they are not used in the shader with the main function. But these out vars can be referenced in later stages, which can lead to further linking errors. Signed-off-by: Vadym Shovkoplias <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105731
* nir: add loop unroll support for wrapper loops (Timothy Arceri, 2018-08-29, 1 file, -0/+77)
| | | | | | | | | | | | | | | | | | | | This adds support for unrolling the classic do { // ... } while (false) that is used to wrap multi-line macros. GLSL IR also wraps switch statements in a loop like this. shader-db results IVB: total loops in shared programs: 2515 -> 2512 (-0.12%) loops in affected programs: 33 -> 30 (-9.09%) helped: 3 HURT: 0 Reviewed-by: Jason Ekstrand <[email protected]>
* nir/opt_loop_unroll: Remove unneeded phis if we make progress (Timothy Arceri, 2018-08-29, 1 file, -1/+9)
| | | | | | | | | | | Now that SSA values can be derefs and they have special rules, we have to be a bit more careful about our LCSSA phis. In particular, we need to clean up in case LCSSA ended up creating a phi node for a deref. This avoids validation issues with some CTS tests with the following patch, but it's possible we could also see the same problem with the existing unrolling passes. Reviewed-by: Jason Ekstrand <[email protected]>
* nir: add complex_loop bool to loop info (Timothy Arceri, 2018-08-29, 2 files, -2/+12)
| | | | | | | | | | | In order to be sure loop_terminator_list is an accurate representation of all the jumps in the loop we need to be sure we didn't encounter any other complex behaviour such as continues, nested breaks, etc during analysis. This will be used in the following patch. Reviewed-by: Jason Ekstrand <[email protected]>
* nir: always attempt to find loop terminators (Timothy Arceri, 2018-08-29, 1 file, -7/+7)
| | | | | | | This will help later patches with unrolling loops that end with a break, i.e. loops that always exit on their first iteration. Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Remove outdated comment (Caio Marcelo de Oliveira Filho, 2018-08-28, 1 file, -3/+0)
| | | | Reviewed-by: Jason Ekstrand <[email protected]>
* mesa: Add GL/GLSL plumbing for INTEL_fragment_shader_ordering (Kevin Rogovin, 2018-08-28, 6 files, -0/+28)
| | | | | | | | | | | | | | | | | | | | | | | | This extension provides a new GLSL built-in function beginFragmentShaderOrderingINTEL() that guarantees (taking wording of the GL_INTEL_fragment_shader_ordering extension) that any memory transactions issued by shader invocations from previous primitives mapped to same xy window coordinates (and same sample when per-sample shading is active), complete and are visible to the shader invocation that called beginFragmentShaderOrderingINTEL(). One advantage of INTEL_fragment_shader_ordering over ARB_fragment_shader_interlock is that it provides a function that operates as a memory barrier (instead of defining a critical section) that can be called under arbitrary control flow from any function (in contrast, the begin/end of ARB_fragment_shader_interlock may only be called once, from main(), under no control flow). Signed-off-by: Kevin Rogovin <[email protected]> Reviewed-by: Plamena Manolova <[email protected]>
* glsl/linker: Allow unused in blocks which are not declared on previous stage (vadym.shovkoplias, 2018-08-27, 2 files, -3/+9)
| | | | | | | | | | | | | | | | | | | | From Section 4.3.4 (Inputs) of the GLSL 1.50 spec: "Only the input variables that are actually read need to be written by the previous stage; it is allowed to have superfluous declarations of input variables." Fixes: * interstage-multiple-shader-objects.shader_test v2: Update comment in ir.h since the usage of "used" field has been extended. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101247 Signed-off-by: Vadym Shovkoplias <[email protected]> Reviewed-by: Alejandro Piñeiro <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* nir: Pull block_ends_in_jump into nir.h (Jason Ekstrand, 2018-08-27, 3 files, -23/+13)
| | | | | | | We had two different implementations in different files. May as well have one and put it in nir.h. Reviewed-by: Timothy Arceri <[email protected]>
* Revert "configure: allow building with python3" (Emil Velikov, 2018-08-24, 5 files, -5/+5)
| | | | | | | | | | | | | | This reverts commit ae7898dfdbe5c8dab7d11c71862353f1ae43feb0. Turns out the python scripts are _not_ fully python 3 compatible. As Ilia reported using get_xmlpool.py with LANG=C produces some weird output - see the link for details. Even though the issue was spotted with the autoconf build, it exposes a genuine problem with the script (and lack of lang handling of the meson build.) https://lists.freedesktop.org/archives/mesa-dev/2018-August/203508.html
* mesa: expose AMD_gpu_shader_int64 (Marek Olšák, 2018-08-24, 5 files, -12/+18)
| | | | | | | | | because the closed driver exposes it. It's equivalent to ARB_gpu_shader_int64. In this patch, I did everything the same as we do for ARB_gpu_shader_int64. Reviewed-by: Ian Romanick <[email protected]>
* mesa: expose ARB_post_depth_coverage in the Compatibility profile (Marek Olšák, 2018-08-24, 1 file, -0/+1)
| | | | | | It only contains GLSL changes. v2: allow the layout qualifier on GLSL <= 1.30
* nir: Add an array copy optimization (Jason Ekstrand, 2018-08-23, 4 files, -0/+415)
| | | | | | | | | | | | This peephole optimization looks for a series of load/store_deref or copy_deref instructions that copy an array from one variable to another and turns it into a copy_deref that copies the entire array. The pattern it looks for is extremely specific but it's good enough to pick up on the input array copies in DXVK and should also be able to pick up the sequence generated by spirv_to_nir for a OpLoad of a large composite followed by OpStore. It can always be improved later if needed. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* nir: Add an array-of-vector variable shrinking pass (Jason Ekstrand, 2018-08-23, 2 files, -0/+718)
| | | | | | | This pass looks for variables with vector or array-of-vector types and narrows the type to only the components used. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* nir: Add an array splitting pass (Jason Ekstrand, 2018-08-23, 2 files, -0/+584)
| | | | | | | | | | | | | | | | | | | | | | | This pass looks for array variables where at least one level of the array is never indirected and splits it into multiple smaller variables. This pass doesn't really do much now because nir_lower_vars_to_ssa can already see through arrays of arrays and can detect indirects on just one level or even see that arr[i][0][5] does not alias arr[i][1][j]. This pass exists to help other passes more easily see through arrays of arrays. If a back-end does implement arrays using scratch or indirects on registers, having more smaller arrays is likely to have better memory efficiency. v2 (Jason Ekstrand): - Better comments and naming (some from Caio) - Rework to use one hash map instead of two v2.1 (Jason Ekstrand): - Fix a couple of bugs that were added in the rework including one which basically prevented it from running Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
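As a toy model of the idea only: if the inner level of `arr[i][c]` is always indexed by a constant `c`, the variable can be replaced by one smaller array per `c`. The sketch uses plain Python lists and a hypothetical function name; the real pass operates on NIR variables and deref chains.

```python
def split_constant_level(arr):
    """Split arr[i][c], where c is never indirect, into one array per c."""
    inner = len(arr[0])
    return [[row[c] for row in arr] for c in range(inner)]


# Three outer elements, two inner elements each.
arr = [[10, 11], [20, 21], [30, 31]]
parts = split_constant_level(arr)
# An access arr[i][1] becomes parts[1][i]; the outer index may stay indirect.
```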
* nir: Add a structure splitting pass (Jason Ekstrand, 2018-08-23, 4 files, -0/+278)
| | | | | | | | | | | This pass doesn't really do much now because nir_lower_vars_to_ssa can already see through structures and considers them to be "split". This pass exists to help other passes more easily see through structure variables. If a back-end does implement arrays using scratch or indirects on registers, having more smaller arrays is likely to have better memory efficiency. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* nir/types: Add array_or_matrix helpers (Jason Ekstrand, 2018-08-23, 2 files, -0/+17)
| | | | Reviewed-by: Thomas Helland <[email protected]>
* glsl: fix error checking against MAX_UNIFORM_LOCATIONS (Marek Olšák, 2018-08-23, 1 file, -2/+6)
| | | | Tested-by: Dieter Nützel <[email protected]>
* mesa: add ctx->Const.MaxGeometryShaderInvocations (Marek Olšák, 2018-08-23, 2 files, -1/+3)
| | | | | | | radeonsi wants to report a different value Reviewed-by: Ian Romanick <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* configure: allow building with python3 (Emil Velikov, 2018-08-23, 5 files, -5/+5)
| | | | | | | | | | | | Pretty much all of the scripts are python2+3 compatible. Check and allow using python3, while adjusting the PYTHON2 refs. Note: - python3.4 is used as it's the earliest supported version - python3 chosen prior to python2 Signed-off-by: Emil Velikov <[email protected]> Acked-by: Eric Engestrom <[email protected]>
* glsl: remove execute bit and shebang from python tests (Emil Velikov, 2018-08-23, 3 files, -3/+0)
| | | | | | | | | | | | | Just like the rest of the tree - these should be run either as part of the build system check target, or at the very least with an explicitly versioned python executable. Fixes: db8cd8e3677 ("glcpp/tests: Convert shell scripts to a python script") Fixes: 97c28cb0823 ("glsl/tests: Convert optimization-test.sh to pure python") Fixes: 3b52d292273 ("glsl/tests: reimplement warnings-test in python") Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Dylan Baker <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* nir: Add floating point atomic min, max, and compare-swap intrinsics (Ian Romanick, 2018-08-22, 4 files, -8/+50)
| | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* nir: Add floating point atomic add intrinsics (Ian Romanick, 2018-08-22, 5 files, -5/+22)
| | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* glsl: Add support for lowering shared-variable float atomics (Ian Romanick, 2018-08-22, 1 file, -3/+3)
| | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* glsl: Add support for lowering SSBO float atomics (Ian Romanick, 2018-08-22, 1 file, -3/+3)
| | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* glsl: Add built-in functions for INTEL_shader_atomic_float_minmax (Ian Romanick, 2018-08-22, 1 file, -1/+32)
| | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* mesa: Extension boilerplate for INTEL_shader_atomic_float_minmax (Ian Romanick, 2018-08-22, 2 files, -0/+3)
| | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* glsl: Add built-in functions for NV_shader_atomic_float (Ian Romanick, 2018-08-22, 1 file, -3/+48)
| | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* mesa: Extension boilerplate for NV_shader_atomic_float (Ian Romanick, 2018-08-22, 2 files, -0/+3)
| | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>