aboutsummaryrefslogtreecommitdiffstats
path: root/src/compiler
Commit message (Collapse)AuthorAgeFilesLines
* glsl: Refactor implicit conversion into its own helperAndres Gomez2016-08-051-80/+86
| | | | | | | | v2: Refactor also the conversion to constant and replacement code (Timothy). Reviewed-by: Timothy Arceri <[email protected]> Signed-off-by: Andres Gomez <[email protected]>
* glsl/types: disallow implicit conversions before GLSL 1.20Andres Gomez2016-08-051-4/+4
| | | | | | | | | Implicit conversions were added in the GLSL 1.20 spec version. v2: Join the checks for GLSL 1.10 and ESSL (Timothy). Reviewed-by: Timothy Arceri <[email protected]> Signed-off-by: Andres Gomez <[email protected]>
* nir: Make nir_opt_remove_phis see through moves.Kenneth Graunke2016-08-041-1/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I found a shader in Tales of Maj'Eyal that contains: if ssa_21 { block block_1: /* preds: block_0 */ ...instructions that prevent the select peephole... vec1 32 ssa_23 = imov ssa_4 vec1 32 ssa_24 = imov ssa_4.y vec1 32 ssa_25 = imov ssa_4.z /* succs: block_3 */ } else { block block_2: /* preds: block_0 */ vec1 32 ssa_26 = imov ssa_4 vec1 32 ssa_27 = imov ssa_4.y vec1 32 ssa_28 = imov ssa_4.z /* succs: block_3 */ } block block_3: /* preds: block_1 block_2 */ vec1 32 ssa_29 = phi block_1: ssa_23, block_2: ssa_26 vec1 32 ssa_30 = phi block_1: ssa_24, block_2: ssa_27 vec1 32 ssa_31 = phi block_1: ssa_25, block_2: ssa_28 Here, copy propagation will bail because phis cannot perform swizzles, and CSE won't do anything because there is no dominance relationship between the imovs. By making nir_opt_remove_phis handle identical moves, we can eliminate the phis and rewrite everything to use ssa_4 directly, so all the moves become dead and get eliminated. I don't think we need to check "exact" - just the alu sources. Presumably phi sources should match in their exactness. On Broadwell: total instructions in shared programs: 11639872 -> 11638535 (-0.01%) instructions in affected programs: 134222 -> 132885 (-1.00%) helped: 338 HURT: 0 v2: Fix return value to be NULL, not false (caught by Iago). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* nir: Make nir_alu_srcs_equal non-static.Kenneth Graunke2016-08-042-1/+4
| | | | | Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* nir: Turn imov/fmov of undef into undef.Kenneth Graunke2016-08-041-1/+3
| | | | | | | | | | | | | | | | | | | | | | On Broadwell: total instructions in shared programs: 11640214 -> 11639872 (-0.00%) instructions in affected programs: 17744 -> 17402 (-1.93%) helped: 78 HURT: 0 total spills in shared programs: 2924 -> 2922 (-0.07%) spills in affected programs: 104 -> 102 (-1.92%) helped: 1 HURT: 0 total fills in shared programs: 4394 -> 4389 (-0.11%) fills in affected programs: 237 -> 232 (-2.11%) helped: 1 HURT: 0 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* nir: Allow opt_peephole_select to work on empty blocks.Eric Anholt2016-08-031-7/+0
| | | | | | | | | | | | | | nir_opt_peephole_select has the job of removing IF statements with no side effects. However, if the IF statement's successor didn't have any instructions in it, we were skipping it, which occurred in mupen64 on vc4 with glsl_to_nir enabled: instructions in affected programs: 6134 -> 4120 (-32.83%) total uniforms in shared programs: 38268 -> 38219 (-0.13%) No changes on Haswell shader-db. Reviewed-by: Jason Ekstrand <[email protected]>
* nir: fix validation messageTimothy Arceri2016-08-031-2/+2
| | | | | | | Looks like a copy and paste error from f752effa087 Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Alejandro Piñeiro <[email protected]>
* ast: Updated AST_NUM_OPERATORS for coherence with ast_operatorsAndres Gomez2016-08-022-9/+10
| | | | | | | | | | | | | | | AST_NUM_OPERATORS stores the dimension of the ast_operators enumeration but was not updated after its last modification. This doesn't add any real modification for any code paths but it makes sense for coherence. v2 (Eric Engestrom): Just place the define at the end of the enumeration, not below. Signed-off-by: Andres Gomez <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* glsl: be more strict on block qualifiersTimothy Arceri2016-07-311-11/+73
| | | | | | | V2: Add spec references and allow patch qualifier (Ken) Reviewed-by: Kenneth Graunke <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96528
* glsl: add name param to validate_flags()Timothy Arceri2016-07-313-10/+9
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: add component to ast_type_qualifier::validate_flagsTimothy Arceri2016-07-311-1/+2
| | | | | | | | This was added with ARB_enhanced_layouts. V2: Add an extra format specifier for the new qualifier. Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: fix uninitialized instance variableJan Ziak2016-07-291-0/+1
| | | | | | | | Valgrind detected that variable ir_copy_propagation_visitor::killed_all is uninitialized. Signed-off-by: Jan Ziak (http://atom-symbol.net) <[email protected]> Signed-off-by: Rob Clark <[email protected]>
* glsl: fix optimization of discard nested multiple levelsNicolai Hähnle2016-07-281-1/+8
| | | | | | | | | | | | The order of optimizations can lead to the conditional discard optimization being applied twice to the same discard statement. In this case, we must ensure that both conditions are applied. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96762 Cc: [email protected] Tested-by: Kai Wasserbäch <[email protected]> Reviewed-by: Tapani Pälli <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: free hash tables earlierTimothy Arceri2016-07-281-7/+3
| | | | | | | These are only used by get_matching_input() which has been call at this point so free the hash tables. Reviewed-by: Iago Toral Quiroga <[email protected]>
* glsl: Remove references to tail_pred.Matt Turner2016-07-261-9/+9
|
* glsl: Avoid aliasing violations.Matt Turner2016-07-262-10/+4
| | | | | Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* glsl: Separate overlapping sentinel nodes in exec_list.Matt Turner2016-07-2622-134/+162
| | | | | | | | | | | I do appreciate the cleverness, but unfortunately it prevents a lot more cleverness in the form of additional compiler optimizations brought on by -fstrict-aliasing. No difference in OglBatch7 (n=20). Co-authored-by: Davin McCall <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* glsl: reuse main extension table to appropriately restrict extensionsIlia Mirkin2016-07-238-328/+205
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously we were only restricting based on ES/non-ES-ness and whether the overall enable bit had been flipped on. However we have been adding more fine-grained restrictions, such as based on compat profiles, as well as specific ES versions. Most of the time this doesn't matter, but it can create awkward situations and duplication of logic. Here we separate the main extension table into a separate object file, linked to the glsl compiler, which makes use of it with a custom function which takes the ES-ness of the shader into account (thus allowing desktop shaders to properly use ES extensions that would otherwise have been disallowed.) We can also now use this logic to generate #define's for all supported extensions automatically, removing the duplicate (and often inaccurate) list in glcpp. The effect of this change should be nil in most cases. However in some situations, extensions like GL_ARB_gpu_shader5 which were formerly available in compat contexts on the GLSL side of things will now become inaccessible. This regresses two ES CTS tests: ES3-CTS.shaders.shader_integer_mix.define ES31-CTS.shader_integer_mix.define however that is due to them using #version 100 instead of 300 es. As the extension is only defined for ES3, I believe this is the correct behavior. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> (v2) v2 -> v3: integrate glcpp defines into the same mechanism
* nir: Lower interp_var_at_* like a normal load_var for flat inputs.Kenneth Graunke2016-07-221-0/+4
| | | | | | | | | | "flat centroid" and "flat sample" both just mean "flat", so we should ignore interpolateAtCentroid/Sample and just return the flat value. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97032 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
* spirv/nir: Add support for ImageQuerySamplesJason Ekstrand2016-07-221-0/+3
| | | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: "12.0" <[email protected]>
* spirv/nir: Handle texture projectorsJason Ekstrand2016-07-221-0/+15
| | | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: "12.0" <[email protected]>
* nir/spirv: Refactor coordinate handling in handle_textureJason Ekstrand2016-07-221-29/+28
| | | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: "12.0" <[email protected]>
* spirv/nir: Refactor type handling in handle_textureJason Ekstrand2016-07-221-5/+8
| | | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: "12.0" <[email protected]>
* spirv/nir: Move opcode selection higher up in handle_textureJason Ekstrand2016-07-221-48/+48
| | | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: "12.0" <[email protected]>
* spirv/nir: Don't increment coord_components for array lod queriesJason Ekstrand2016-07-221-1/+1
| | | | | | | | | For lod query instructions, we really don't care whether or not the sampler is an array type because that doesn't factor into the LOD. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: "12.0" <[email protected]>
* nir/lower_tex: Add support for lowering coordinate offsetsJason Ekstrand2016-07-222-0/+64
| | | | | | | | | | On i965, we can't support coordinate offsets for texelFetch or rectangle textures. Previously, we were doing this with a GLSL pass but we need to do it in NIR if we want those workarounds for SPIR-V. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: "12.0" <[email protected]>
* nir/lower_tex: Add some helpers for working with tex sourcesJason Ekstrand2016-07-221-16/+30
| | | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: "12.0" <[email protected]>
* nir: Add a helper for determining the type of a texture sourceJason Ekstrand2016-07-221-0/+44
| | | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: "12.0" <[email protected]>
* spirv/nir: Properly handle gather componentsJason Ekstrand2016-07-221-1/+11
| | | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: "12.0" <[email protected]>
* spirv/nir: Add support for shadow samplers that return vec4Jason Ekstrand2016-07-221-1/+2
| | | | | | | | | | While SPIR-V technically doesn't support "old style" shadow, the shadow-compare gather instruction does return a vec4 so we need to be able to set the old_style_shadow bit in NIR. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: "12.0" <[email protected]>
* spirv/nir: Fix some texture opcode assertsJason Ekstrand2016-07-221-2/+2
| | | | | | | | | We can't get an lod with txf_ms and SPIR-V considers textureGrad to be an explicit-LOD texturing instruction. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: "12.0" <[email protected]>
* glsl: subroutine types cannot be comparedAndres Gomez2016-07-221-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | subroutine variables are to be used just in the way functions are called. Although the spec doesn't say it explicitely, this means that these variables are not to be used in any other way than those left for function calls. Therefore, a comparison between 2 subroutine variables should also cause a compilation error. From The OpenGL® Shading Language 4.40, page 117: " To use subroutines, a subroutine type is declared, one or more functions are associated with that subroutine type, and a subroutine variable of that type is declared. The function currently assigned to the variable function is then called by using function calling syntax replacing a function name with the name of the subroutine variable. Subroutine variables are uniforms, and are assigned to specific functions only through commands (UniformSubroutinesuiv) in the OpenGL API." From The OpenGL® Shading Language 4.40, page 118: " Subroutine uniform variables are called the same way functions are called. When a subroutine variable (or an element of a subroutine variable array) is associated with a particular function, all function calls through that variable will call that particular function." Fixes GL44-CTS.shader_subroutine.subroutines_cannot_be_assigned_float_int_values_or_be_compared Signed-off-by: Andres Gomez <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Dave Airlie <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* nir: Add a base const_index to shared atomic intrinsics.Kenneth Graunke2016-07-211-10/+10
| | | | | | | | | | | | | | | | | | | Commit 52e75dcb8c04c0dde989970c4c587cbe8313f7cf made nir_lower_io start using nir_intrinsic_set_base instead of writing const_index[0] directly. However, those intrinsics apparently don't /have/ a base, so this caused assert failures. However, the old code was happily setting non-existent const_index fields, so it was pretty bogus too. Jason pointed out that load_shared and store_shared have a base, and that the i965 driver uses that field. So presumably atomics should have one as well, so that loads/stores/atomics all refer to variables with consistent addressing. Cc: "12.0" <[email protected]> Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* glsl: re-enable varying packing in GL4.4+Timothy Arceri2016-07-221-30/+24
| | | | | | | | We can still do packing we just need to get the packing type from the consumer rather than the producer. Reviewed-by: Iago Toral Quiroga <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97033
* nir: add doubles component packing supportTimothy Arceri2016-07-211-0/+20
| | | | | | | | | | | This makes sure we give the correct driver location for doubles when using component packing. Specifically it handles packing a dvec3 with a double which is the only packing scenario allowed which spans across two locations. Acked-by: Kenneth Graunke <[email protected]> Reviewed-by: Alejandro Piñeiro <[email protected]>
* nir/inline: Constant-initialize local variables in the callee if neededJason Ekstrand2016-07-201-2/+40
| | | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: "12.0" <[email protected]>
* nir: Add a nir_deref_foreach_leaf helperJason Ekstrand2016-07-202-0/+120
| | | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: "12.0" <[email protected]>
* nir: Add nir_load_interpolated_input lowering code.Kenneth Graunke2016-07-202-5/+100
| | | | | | | | | | | | | | Now nir_lower_io can optionally produce load_interpolated_input and load_barycentric_* intrinsics for fragment shader inputs. flat inputs continue using regular load_input. v2: Use a nir_shader_compiler_options flag rather than ad-hoc boolean passing (in response to review feedback from Chris Forbes). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Add new intrinsics for fragment shader input interpolation.Kenneth Graunke2016-07-205-0/+51
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Backends can normally handle shader inputs solely by looking at load_input intrinsics, and ignore the nir_variables in nir->inputs. One exception is fragment shader inputs. load_input doesn't capture the necessary interpolation information - flat, smooth, noperspective mode, and centroid, sample, or pixel for the location. This means that backends have to interpolate based on the nir_variables, then associate those with the load_input intrinsics (say, by storing a map of which variables are at which locations). With GL_ARB_enhanced_layouts, we're going to have multiple varyings packed into a single vec4 location. The intrinsics make this easy: simply load N components from location <loc, component>. However, working with variables and correlating the two is very awkward; we'd much rather have intrinsics capture all the necessary information. Fragment shader input interpolation typically works by producing a set of barycentric coordinates, then using those to do a linear interpolation between the values at the triangle's corners. We represent this by introducing five new load_barycentric_* intrinsics: - load_barycentric_pixel (ordinary variable) - load_barycentric_centroid (centroid qualified variable) - load_barycentric_sample (sample qualified variable) - load_barycentric_at_sample (ARB_gpu_shader5's interpolateAtSample()) - load_barycentric_at_offset (ARB_gpu_shader5's interpolateAtOffset()) Each of these take the interpolation mode (smooth or noperspective only) as a const_index, and produce a vec2. The last two also take a sample or offset source. We then introduce a new load_interpolated_input intrinsic, which is like a normal load_input intrinsic, but with an additional barycentric coordinate source. The intention is that flat inputs will still use regular load_input intrinsics. This makes them distinguishable from normal inputs that need fancy interpolation, while also providing all the necessary data. This nicely unifies regular inputs and interpolateAt functions. Qualifiers and variables become irrelevant; there are just load_barycentric intrinsics that determine the interpolation. v2: Document the interp_mode const_index value, define a new BARYCENTRIC() helper rather than using SYSTEM_VALUE() for some of them (requested by Jason Ekstrand). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Fix uninitialized use of 'replacement'.Kenneth Graunke2016-07-191-1/+1
| | | | | | | | | | | For intrinsics we don't care about, just skip to the next loop iteration and process the next instruction. We don't want to execute the rest of the code. This was a bug in commit cdfc05ea6e8c87876cdbf588aa8e03d70f3da4bb. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
* nir/algebraic: Optimize fabs(u2f(x))Ian Romanick2016-07-191-0/+1
| | | | | | | | | | I noticed this when I tried to do frexp(float(some_unsigned)) in the ir_unop_find_lsb lowering pass. The code generated for frexp() uses fabs, and this resulted in an extra instruction. Ultimately I ended up not using frexp. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* glsl: Add lowering pass for ir_bin_imul_highIan Romanick2016-07-192-0/+150
| | | | | | | | | | This isn't the lowering pass you want. Most GPUs that can support GLSL 1.30 have a multiply unit that can do something more interesting than 32x32->32. Many have 32x16->48. Any GPU that does, should do the lowering in the backend. This is just the thing that will always work. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* glsl: Add lowering pass for ir_unop_find_msbIan Romanick2016-07-192-0/+107
| | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* glsl: Add lowering pass for ir_unop_find_lsbIan Romanick2016-07-192-0/+87
| | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* glsl: Add lowering pass for ir_unop_bitfield_reverseIan Romanick2016-07-192-0/+92
| | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* glsl: Add lowering pass for ir_quadop_bitfield_insertIan Romanick2016-07-192-0/+74
| | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* glsl: Add lowering pass for ir_triop_bitfield_extractIan Romanick2016-07-192-0/+81
| | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* glsl: Add lowering pass for ir_unop_bit_countIan Romanick2016-07-192-0/+54
| | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* MESA_shader_integer_functions: Allow new function overload matching rulesIan Romanick2016-07-191-5/+7
| | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* MESA_shader_integer_functions: Allow implicit int->uint conversionsIan Romanick2016-07-192-6/+10
| | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>