path: root/src/compiler
Commit message | Author | Age | Files | Lines
* nir: add pass to move load_const | Rob Clark | 2018-06-19 | 4 | -0/+144

  Run this pass late (after opt loop) to move load_const instructions back into the basic blocks which use the result, in cases where a load_const is only consumed in a single block. This helps reduce register usage in cases where the backend driver cannot lower the load_const to a uniform.

  Signed-off-by: Rob Clark <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
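Note (illustrative, not part of the commit): the decision this pass makes can be sketched in a few lines of Python. This is only a model of the rule stated in the message (move a constant when exactly one block consumes it); the function name `plan_load_const_moves` and the dictionary-based data layout are hypothetical and do not correspond to the NIR C API.

```python
def plan_load_const_moves(defining_block, uses):
    """Return {const: target_block} for constants consumed in exactly one block.

    defining_block: maps a constant name to the block that currently holds it.
    uses: maps a constant name to the list of blocks that read it.
    Both structures are hypothetical stand-ins for NIR data.
    """
    moves = {}
    for const, blocks in uses.items():
        consumers = set(blocks)
        # Only a single consuming block qualifies for a move.
        if len(consumers) == 1:
            (target,) = consumers
            # Nothing to do if the constant already lives there.
            if target != defining_block[const]:
                moves[const] = target
    return moves
```

A constant read from two different blocks stays where it is, which matches the "only consumed in a single block" condition in the message.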
* nir: add comment for loop_unroll pass | Rob Clark | 2018-06-19 | 1 | -0/+4

  Save the next person from digging through the code to figure out what the indirect_mask parameter actually does.

  Signed-off-by: Rob Clark <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
* glsl: fix random typo | Rob Clark | 2018-06-19 | 1 | -1/+1

  Just something I stumbled across.

  Signed-off-by: Rob Clark <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
* glsl: fix desktop glsl linking regression | Timothy Arceri | 2018-06-19 | 1 | -1/+2

  The prog->Shaders[i]->IsES check was accidentally removed, causing ES linking rules to be applied to desktop GLSL.

  Fixes: 725b1a406dbe ("mesa/util: add allow_glsl_relaxed_es driconfig override")
* mesa/util: add allow_glsl_relaxed_es driconfig override | Timothy Arceri | 2018-06-19 | 2 | -10/+15

  This relaxes a number of ES shader restrictions, allowing shaders to follow more desktop-GLSL-like rules. This initial implementation relaxes the following:

  - allows linking ES shaders with desktop shaders
  - allows mismatching precision qualifiers
  - always enables standard derivative builtins

  These relaxations allow Google Earth VR shaders to compile.

  Reviewed-by: Dave Airlie <[email protected]>
* mesa/util: add allow_glsl_builtin_const_expression driconf override | Timothy Arceri | 2018-06-19 | 1 | -1/+2

  Google Earth VR shaders use builtins in constant expressions with GLSL 1.10. That feature wasn't allowed until GLSL 1.20.

  Reviewed-by: Dave Airlie <[email protected]>
* nir: Document a couple instances of parent_instr | Ian Romanick | 2018-06-15 | 1 | -0/+2

  nir_ssa_def::parent_instr and nir_src::parent_instr have the same name, but they mean really different things. I choose to save the next person the hour+ that I just spent figuring that out. Even now that I know, I doubt I'd notice in code review that someone typed foo->parent_instr when they actually meant foo->ssa->parent_instr.

  v2: Minor wording tweak in nir_ssa_def::parent_instr. Suggested by Jason.

  Signed-off-by: Ian Romanick <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* glsl: Don't copy propagate elements from SSBO or shared variables either | Ian Romanick | 2018-06-14 | 1 | -0/+8

  Since SSBOs can be written by a different GPU thread, copy propagating a read can cause the value to magically change. SSBO reads are also very expensive, so doing it twice will be slower.

  The same shader was helped by this patch and the previous.

  Haswell, Broadwell, and Skylake had similar results. (Skylake shown)
  total instructions in shared programs: 14399119 -> 14399113 (<.01%)
  instructions in affected programs: 683 -> 677 (-0.88%)
  helped: 1
  HURT: 0

  total cycles in shared programs: 532973113 -> 532971865 (<.01%)
  cycles in affected programs: 524666 -> 523418 (-0.24%)
  helped: 1
  HURT: 0

  Signed-off-by: Ian Romanick <[email protected]>
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
  Cc: [email protected]
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106774
* glsl: Don't copy propagate from SSBO or shared variables either | Ian Romanick | 2018-06-14 | 1 | -0/+2

  Since SSBOs can be written by other GPU threads, copy propagating a read can cause the value to magically change. SSBO reads are also very expensive, so doing it twice will be slower.

  Haswell, Broadwell, and Skylake had similar results. (Skylake shown)
  total instructions in shared programs: 14399120 -> 14399119 (<.01%)
  instructions in affected programs: 684 -> 683 (-0.15%)
  helped: 1
  HURT: 0

  total cycles in shared programs: 532978931 -> 532973113 (<.01%)
  cycles in affected programs: 530484 -> 524666 (-1.10%)
  helped: 1
  HURT: 0

  Signed-off-by: Ian Romanick <[email protected]>
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
  Cc: [email protected]
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106774
* glsl: allow standalone semicolons outside main() | Dave Airlie | 2018-06-14 | 1 | -0/+1

  GLSL 4.60 officially added this, but games and older CTS suites already had shaders that did this, so we may as well enable it everywhere.

  Adding stable because it appears apps in the wild do this.

  Acked-by: Timothy Arceri <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
  Cc: <[email protected]>
* spirv: add/hookup SpvCapabilityStencilExportEXT | Gustavo Lima Chaves | 2018-06-08 | 3 | -0/+13

  v2: An attempt to support SpvExecutionModeStencilRefReplacingEXT's behavior also follows, with the interpretation to said mode being we prevent writes to the built-in FragStencilRefEXT variable when the execution mode isn't set.

  v3: A more cautious reading of 1db44252d01bf7539452ccc2b5210c74b8dcd573 led me to a missing change that would stop (what I later discovered were) GPU hangs on the CTS test written to exercise this.

  v4: Turn FragStencilRefEXT decoration usage without StencilRefReplacingEXT mode into a warning, instead of trying to make the variable read-only. If we are to follow the originating extension on GL, the built-in variable in question should never be readable anyway.

  v5/v6: rebases.

  v7: Fix check for gen9 lost in rebase. (Ilia)
      Reduce the scope of the bool used to track whether SpvExecutionModeStencilRefReplacingEXT was used. Was in shader_info, moved to vtn_builder. (Jason)

  v8: Assert for fragment shader handling StencilRefReplacingEXT execution mode. (Caio)
      Remove warning logic, since an entry point might not have StencilRefReplacingEXT execution mode, but the global output variable might still exist for another entry point in the module. (Jason)

  Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Add global invocation id intrinsic. | Plamena Manolova | 2018-06-07 | 2 | -0/+5

  Add the missing nir intrinsic for the gl_GlobalInvocationID compute shader variable.

  Signed-off-by: Plamena Manolova <[email protected]>
  Reviewed-by: Jordan Justen <[email protected]>
* nir: add opt_if_loop_terminator() | Timothy Arceri | 2018-06-07 | 1 | -0/+68

  This pass detects potential loop terminators and moves instructions from the non-breaking branch after the if-statement.

  This enables both the new opt_if_simplification() pass and loop unrolling to potentially progress further.

  Unexpectedly, this change sped up shader-db run times by ~3%.

  Ivy Bridge shader-db results (all changes in dolphin/ubershaders):

  total instructions in shared programs: 9995662 -> 9995338 (-0.00%)
  instructions in affected programs: 87845 -> 87521 (-0.37%)
  helped: 27
  HURT: 0

  total cycles in shared programs: 230931495 -> 230925015 (-0.00%)
  cycles in affected programs: 56391385 -> 56384905 (-0.01%)
  helped: 27
  HURT: 0

  Reviewed-by: Ian Romanick <[email protected]>
* nir: move ends_in_break() helper to nir_loop_analyze.h | Timothy Arceri | 2018-06-07 | 2 | -13/+13

  We will use the helper while simplifying potential loop terminators in the following patch.

  Reviewed-by: Ian Romanick <[email protected]>
* nir: Look into uniform structs for samplers when counting num_textures. | Eric Anholt | 2018-06-06 | 1 | -12/+44

  mesa/st decides whether to update samplers after a program change based on whether num_textures is nonzero. By not counting samplers in a uniform struct, we would segfault in KHR-GLES3.shaders.struct.uniform.sampler_vertex if it was run in the same context after a non-vertex-shader-uniform testcase (as is the case during a full conformance run).

  v2: Implement using two separate pure functions instead of updating pointers.

  Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Add lowering for nir_op_bit_count. | Eric Anholt | 2018-06-06 | 2 | -0/+38

  This is basically the same as the GLSL lowering path.

  v2: Fix typo in the link

  Reviewed-by: Matt Turner <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
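Note (illustrative, not part of the commit): a common way to count set bits without a dedicated opcode is the parallel "sum adjacent fields" trick. The Python sketch below models it on 32-bit values; it is an assumption that the committed lowering follows this general shape, and the name `lowered_bit_count` is hypothetical.

```python
MASK32 = 0xFFFFFFFF

def lowered_bit_count(x):
    """Count set bits of a 32-bit value using shifts, masks, adds and one multiply."""
    x &= MASK32
    # Each 2-bit field now holds the count of its two original bits.
    x = x - ((x >> 1) & 0x55555555)
    # Sum adjacent 2-bit fields into 4-bit fields.
    x = (x & 0x33333333) + ((x >> 2) & 0x33333333)
    # Sum adjacent 4-bit fields into bytes.
    x = (x + (x >> 4)) & 0x0F0F0F0F
    # The multiply accumulates all four byte counts into the top byte.
    return ((x * 0x01010101) & MASK32) >> 24
```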
* nir: Add lowering for nir_op_bitfield_reverse. | Eric Anholt | 2018-06-06 | 2 | -1/+48

  This is basically the same as the GLSL lowering path.

  Reviewed-by: Matt Turner <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
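Note (illustrative, not part of the commit): bit reversal is typically lowered by swapping progressively larger groups of bits. The sketch below models that on 32-bit values in Python; whether the committed code uses exactly these steps is an assumption, and `lowered_bitfield_reverse` is a hypothetical name.

```python
MASK32 = 0xFFFFFFFF

def lowered_bitfield_reverse(x):
    """Reverse the bit order of a 32-bit value with shifts, masks and ORs."""
    x &= MASK32
    x = ((x & 0x55555555) << 1) | ((x >> 1) & 0x55555555)   # swap neighbouring bits
    x = ((x & 0x33333333) << 2) | ((x >> 2) & 0x33333333)   # swap bit pairs
    x = ((x & 0x0F0F0F0F) << 4) | ((x >> 4) & 0x0F0F0F0F)   # swap nibbles
    x = ((x & 0x00FF00FF) << 8) | ((x >> 8) & 0x00FF00FF)   # swap bytes
    return ((x << 16) | (x >> 16)) & MASK32                  # swap half-words
```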
* nir: Add an ALU lowering pass for mul_high. | Eric Anholt | 2018-06-06 | 4 | -0/+170

  This is based on the glsl/lower_instructions.cpp implementation, but should be much more readable.

  Reviewed-by: Matt Turner <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
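Note (illustrative, not part of the commit): the high half of a 32x32-bit unsigned product can be built from four 16x16-bit partial products, so that no intermediate exceeds 32 bits. The Python sketch below shows that idea under the assumption that the pass works along these lines; `lowered_umul_high` is a hypothetical name and the signed variant is not covered.

```python
MASK32 = 0xFFFFFFFF

def lowered_umul_high(a, b):
    """Upper 32 bits of an unsigned 32x32 multiply using only 32-bit-sized pieces."""
    a &= MASK32
    b &= MASK32
    a_lo, a_hi = a & 0xFFFF, a >> 16
    b_lo, b_hi = b & 0xFFFF, b >> 16

    # Four partial products; each fits in 32 bits.
    lo_lo = a_lo * b_lo
    hi_lo = a_hi * b_lo
    lo_hi = a_lo * b_hi
    hi_hi = a_hi * b_hi

    # Carry out of bits 16..31 of the full product.
    mid = (lo_lo >> 16) + (hi_lo & 0xFFFF) + (lo_hi & 0xFFFF)

    return (hi_hi + (hi_lo >> 16) + (lo_hi >> 16) + (mid >> 16)) & MASK32
```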
* nir: Add lowering for find_lsb. | Eric Anholt | 2018-06-06 | 2 | -0/+6

  There is a fairly simple relation to turn this into ufind_msb.

  Reviewed-by: Matt Turner <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
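Note (illustrative, not part of the commit): one such relation is that `x & -x` isolates the lowest set bit, after which the most significant set bit and the least significant set bit coincide. The sketch assumes this is the relation meant; the Python helpers only model 32-bit semantics, with -1 returned for an input of zero.

```python
MASK32 = 0xFFFFFFFF

def ufind_msb(x):
    """Index of the highest set bit of a 32-bit value, or -1 if none is set."""
    return (x & MASK32).bit_length() - 1

def find_lsb(x):
    """Index of the lowest set bit, expressed through ufind_msb."""
    x &= MASK32
    lowest_bit_only = x & (-x & MASK32)   # two's-complement trick
    return ufind_msb(lowest_bit_only)
```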
* nir: Add lowering for ifind_msb to ufind_msb. | Eric Anholt | 2018-06-06 | 2 | -0/+6

  ufind_msb is easily expressed in terms of clz, and we can reduce ifind_msb to that.

  Reviewed-by: Matt Turner <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
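Note (illustrative, not part of the commit): following GLSL findMSB semantics, a negative input is searched for its highest zero bit, which is the same as inverting it and searching for the highest one bit. The Python model below assumes that reduction; `clz32` stands in for a hardware count-leading-zeros instruction.

```python
MASK32 = 0xFFFFFFFF

def clz32(x):
    """Count leading zeros of a 32-bit value (32 for zero)."""
    return 32 - (x & MASK32).bit_length()

def ufind_msb(x):
    """Highest set bit index, or -1 for zero, expressed through clz."""
    return 31 - clz32(x)

def ifind_msb(x):
    """Signed variant: invert negative inputs, then reuse ufind_msb."""
    x &= MASK32
    if x & 0x80000000:          # sign bit set, so the value is negative
        x = ~x & MASK32
    return ufind_msb(x)
```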
* nir: Add lowering from ibitfield_extract/ubitfield_extract to shifts. | Eric Anholt | 2018-06-06 | 2 | -0/+19

  V3D doesn't have opcodes for ibfe/ubfe, so we need to lower similarly to glsl/lower_instructions.cpp.

  Reviewed-by: Matt Turner <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
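Note (illustrative, not part of the commit): a bitfield can be extracted by shifting it to the top of the word and then shifting it back down, with a logical shift for the unsigned form and an arithmetic shift for the signed form. The Python sketch assumes this two-shift scheme and treats a width of zero as yielding zero; the helper names are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def _to_signed32(x):
    """Reinterpret a 32-bit pattern as a signed integer."""
    return x - (1 << 32) if x & 0x80000000 else x

def ubitfield_extract(value, offset, bits):
    """Unsigned extract: shift left to drop high bits, logical shift right."""
    if bits == 0:
        return 0
    shifted = (value << (32 - bits - offset)) & MASK32
    return shifted >> (32 - bits)

def ibitfield_extract(value, offset, bits):
    """Signed extract: same left shift, then an arithmetic shift right."""
    if bits == 0:
        return 0
    shifted = (value << (32 - bits - offset)) & MASK32
    # Python's >> on a negative int is arithmetic, which sign-extends.
    return _to_signed32(shifted) >> (32 - bits)
```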
* nir: Add lowering for bitfieldInsert without using bfi. | Eric Anholt | 2018-06-06 | 2 | -0/+19

  If you don't have HW to do bfi, then lowering bitfieldInsert to bfi makes things harder than keeping the "bits" argument around. This still uses bfm, but I've added the obvious lowering of bfm if you need it.

  Reviewed-by: Matt Turner <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
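Note (illustrative, not part of the commit): with a mask built from "bits" and "offset", an insert is a clear-then-OR. The Python sketch assumes bfm means "a run of `bits` ones starting at `offset`" and that the lowering has this mask-and-merge form; it does not model any hardware corner cases for out-of-range arguments.

```python
MASK32 = 0xFFFFFFFF

def bfm(bits, offset):
    """Bitfield mask: `bits` consecutive ones starting at bit `offset`."""
    return (((1 << bits) - 1) << offset) & MASK32

def bitfield_insert(base, insert, offset, bits):
    """Replace `bits` bits of `base` at `offset` with the low bits of `insert`."""
    mask = bfm(bits, offset)
    return ((base & ~mask) | ((insert << offset) & mask)) & MASK32
```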
* glsl: Take 'double' as reserved after GLSL ES 1.0 | zhaowei yuan | 2018-06-05 | 1 | -1/+1

  GLSL ES 1.0.17 specifies that "double" is a reserved keyword.

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106823
  Signed-off-by: zhaowei yuan <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* nir: implement the GLSL equivalent of if simplication in nir_opt_if | Samuel Pitoiset | 2018-06-04 | 1 | -5/+92

  This pass turns:

      if (cond) {
      } else {
          do_work();
      }

  into:

      if (!cond) {
          do_work();
      } else {
      }

  Here's the vkpipeline-db stats (from affected shaders) on Polaris10:

  Totals from affected shaders:
  SGPRS: 17272 -> 17296 (0.14 %)
  VGPRS: 18712 -> 18740 (0.15 %)
  Spilled SGPRs: 1179 -> 1142 (-3.14 %)
  Code Size: 1503364 -> 1515176 (0.79 %) bytes
  Max Waves: 916 -> 911 (-0.55 %)

  This pass only affects Serious Sam 2017 (Vulkan) on my side. The stats are not really good for now. Some shaders look quite dumb but this will be improved with further NIR passes, like ifs combination.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
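Note (illustrative, not part of the commit): the transformation described in this message can be modelled on a toy if-node. The `If` dataclass and `simplify_if` function below are hypothetical and unrelated to the real nir_if structure; the sketch only shows the rule "empty then-branch plus non-empty else-branch: negate the condition and swap".

```python
from dataclasses import dataclass, field

@dataclass
class If:
    """Toy stand-in for an if-node: a condition string and two statement lists."""
    cond: str
    then_block: list = field(default_factory=list)
    else_block: list = field(default_factory=list)

def simplify_if(node):
    """Swap branches when only the else-branch does work."""
    if not node.then_block and node.else_block:
        return If("!" + node.cond, list(node.else_block), [])
    # Anything else is left untouched.
    return node
```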
* nir: make is_comparison() a non-static helper function | Samuel Pitoiset | 2018-06-04 | 2 | -25/+25

  Rename and change the prototype for consistency regarding nir_tex_instr_is_query().

  This function will be used in the following patch.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
* nir: use num_components wrappers in print/validate. | Dave Airlie | 2018-06-04 | 2 | -15/+5

  These wrappers were introduced, so start using them.

  Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Lower !f2b(x) to x == 0.0 | Ian Romanick | 2018-06-01 | 1 | -0/+1

  Some trivial help now, but it also prevents ~40 regressions caused by Samuel's "nir: implement the GLSL equivalent of if simplication in nir_opt_if" patch.

  All Gen4+ platforms had similar results. (Skylake shown)
  total instructions in shared programs: 14369557 -> 14369555 (<.01%)
  instructions in affected programs: 442 -> 440 (-0.45%)
  helped: 2
  HURT: 0

  total cycles in shared programs: 532425772 -> 532425743 (<.01%)
  cycles in affected programs: 6086 -> 6057 (-0.48%)
  helped: 2
  HURT: 0

  Signed-off-by: Ian Romanick <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
  Reviewed-by: Iago Toral Quiroga <[email protected]>
* nir: Add some missing "optimization undo" patterns | Ian Romanick | 2018-06-01 | 1 | -1/+19

  d8d18516b0a and 03fb13f6467 added some patterns to undo conversions like

      (('ior', ('flt', a, b), ('flt', a, c)), ('flt', a, ('fmax', b, c)))

  If further optimizations cause some of the operands to either be the same or be constants, undoing the transformation can lead to further savings.

  I don't know why these patterns were not added in those patches. I did not check to see which specific patterns actually helped. I just added all of them for symmetry.

  This prevents some loop unrolling regressions in Plane Shift caused by Samuel's "nir: implement the GLSL equivalent of if simplication in nir_opt_if" patch.

  Skylake and Broadwell had similar results. (Skylake shown)
  total instructions in shared programs: 14369768 -> 14369557 (<.01%)
  instructions in affected programs: 44076 -> 43865 (-0.48%)
  helped: 141
  HURT: 0
  helped stats (abs) min: 1 max: 5 x̄: 1.50 x̃: 1
  helped stats (rel) min: 0.07% max: 1.52% x̄: 0.66% x̃: 0.60%
  95% mean confidence interval for instructions value: -1.67 -1.32
  95% mean confidence interval for instructions %-change: -0.72% -0.59%
  Instructions are helped.

  total cycles in shared programs: 532430629 -> 532425772 (<.01%)
  cycles in affected programs: 1170832 -> 1165975 (-0.41%)
  helped: 101
  HURT: 5
  helped stats (abs) min: 1 max: 160 x̄: 48.54 x̃: 32
  helped stats (rel) min: <.01% max: 8.49% x̄: 2.76% x̃: 2.03%
  HURT stats (abs) min: 2 max: 22 x̄: 9.20 x̃: 4
  HURT stats (rel) min: <.01% max: 0.05% x̄: 0.02% x̃: <.01%
  95% mean confidence interval for cycles value: -53.64 -38.00
  95% mean confidence interval for cycles %-change: -3.06% -2.20%
  Cycles are helped.

  Signed-off-by: Ian Romanick <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
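Note (illustrative, not part of the commit): the two sides of the quoted pattern compute the same boolean for ordinary (non-NaN) inputs, which is what makes converting in either direction legal. The Python check below brute-forces that over a small set of floats; the function names are hypothetical and NaN behaviour is deliberately not modelled.

```python
import itertools

def split_form(a, b, c):
    """ior(flt(a, b), flt(a, c))"""
    return (a < b) or (a < c)

def fused_form(a, b, c):
    """flt(a, fmax(b, c))"""
    return a < max(b, c)

def forms_agree(values):
    """True if both forms match for every (a, b, c) drawn from `values`."""
    return all(split_form(a, b, c) == fused_form(a, b, c)
               for a, b, c in itertools.product(values, repeat=3))
```

When b and c turn out to be equal, or one of them is a constant, the split form can simplify further, which is the saving the message refers to.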
* glsl: Add ir_binop_vector_extract in NIR | Juan A. Suarez Romero | 2018-06-01 | 1 | -0/+9

  Implement ir_binop_vector_extract using NIR operations. Based on SPIR-V to NIR approach.

  This fixes:
  dEQP-GLES3.functional.shaders.indexing.moredynamic.with_value_from_indexing_expression_fragment
  Piglit's glsl-fs-vec4-indexing-8.shader_test

  CC: [email protected]
  Signed-off-by: Juan A. Suarez Romero <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Iago Toral <[email protected]>
* mesa: Add GL/GLSL plumbing for ARB_fragment_shader_interlock. | Plamena Manolova | 2018-06-01 | 12 | -1/+191

  This extension provides new GLSL built-in functions beginInvocationInterlockARB() and endInvocationInterlockARB() that delimit a critical section of fragment shader code. For pairs of shader invocations with "overlapping" coverage in a given pixel, the OpenGL implementation will guarantee that the critical section of the fragment shader will be executed for only one fragment at a time.

  Signed-off-by: Plamena Manolova <[email protected]>
  Reviewed-by: Francisco Jerez <[email protected]>
* compiler/spirv: reject invalid shader code properly | Martin Pelikán | 2018-06-01 | 2 | -5/+38

  After bebe3d626e5, b->fail_jump is prepared after vtn_create_builder which can longjmp(3) to it through its vtx_assert()s. This corrupts the stack and creates confusing core dumps, so we need to avoid it.

  While there, I decided to print the offending values for debuggability.

  Reviewed-by: Jason Ekstrand <[email protected]>
* nir: optimize iand(ieq(a, 0), ieq(b, 0)) to ieq(ior(a, b), 0) | Samuel Pitoiset | 2018-05-31 | 1 | -0/+2

  Totals from affected shaders:
  SGPRS: 80 -> 80 (0.00 %)
  VGPRS: 48 -> 48 (0.00 %)
  Code Size: 2120 -> 2096 (-1.13 %) bytes
  Max Waves: 16 -> 16 (0.00 %)

  Only two Rise of the Tomb Raider shaders are affected on my side.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
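Note (illustrative, not part of the commit): the rewrite in the subject line is valid because the OR of two integers is zero exactly when both are zero. A small Python check of that identity, with hypothetical function names:

```python
import itertools

def original(a, b):
    """iand(ieq(a, 0), ieq(b, 0))"""
    return (a == 0) and (b == 0)

def optimized(a, b):
    """ieq(ior(a, b), 0): one comparison instead of two."""
    return (a | b) == 0

def rewrite_is_equivalent(values):
    """True if both forms agree for every pair drawn from `values`."""
    return all(original(a, b) == optimized(a, b)
               for a, b in itertools.product(values, repeat=2))
```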
* nir: add unsigned comparison simplifications | Timothy Arceri | 2018-05-30 | 1 | -0/+2

  This avoids loop unrolling regressions in Wolfenstein II on DXVK with an upcoming optimisation series from Samuel.

  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* glsl: parse #version XXX compatibility | Marek Olšák | 2018-05-29 | 1 | -4/+8

  Reviewed-by: Nicolai Hähnle <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
* nir/print: fix printing of 8/16 bit constant variables | Karol Herbst | 2018-05-29 | 1 | -0/+31

  v2 (Jose Maria Casanova Crespo <[email protected]>): add float16 support

  Signed-off-by: Karol Herbst <[email protected]>
  Reviewed-by: Jose Maria Casanova Crespo <[email protected]>
* nir: Implement optional b2f->iand lowering | Alyssa Rosenzweig | 2018-05-18 | 2 | -1/+7

  This pass is required by the Midgard compiler; our instruction set uses NIR-style booleans (~0 for true) but lacks a dedicated b2f instruction.

  Normally, this lowering pass would be implemented in a backend-specific algebraic pass, but this conflicts with the existing iand->b2f pass in nir_opt_algebraic.py, hanging the compiler. This patch thus makes the existing pass optional (default on -- all other backends should remain unaffected), adding an optional pass for lowering the opposite direction.

  v2: Defer lowering until late algebraic optimisations to allow optimising the b2f instruction itself.

  Signed-off-by: Alyssa Rosenzweig <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
  Reviewed-by: Christian Gmeiner <[email protected]>
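Note (illustrative, not part of the commit): with booleans stored as 0 or ~0, ANDing with the bit pattern of 1.0f yields the bits of either 0.0f or 1.0f, so no conversion instruction is needed. The Python sketch models that with `struct`; it is an assumption that the pass uses exactly the 1.0f constant, and `b2f_via_iand` is a hypothetical name.

```python
import struct

# Bit pattern of the IEEE-754 single-precision value 1.0.
ONE_BITS = struct.unpack("<I", struct.pack("<f", 1.0))[0]

def b2f_via_iand(b):
    """Convert a NIR-style boolean (0 or 0xFFFFFFFF) to a float with one AND."""
    bits = b & ONE_BITS
    # Reinterpret the resulting integer bits as a 32-bit float.
    return struct.unpack("<f", struct.pack("<I", bits))[0]
```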
* spirv: fix visiting inner loops with same break/continue block | Samuel Pitoiset | 2018-05-15 | 1 | -0/+13

  We should stop walking through the CFG when the inner loop's break block ends up as the same block as the outer loop's continue block because we are already going to visit it.

  This fixes the following assertion, which ends up crashing in RADV or ANV:

  SPIR-V parsing FAILED:
      In file ../src/compiler/spirv/vtn_cfg.c:381
      block->node.link.next == NULL
      0 bytes into the SPIR-V binary

  This also fixes a crash with a camera shader from SteamVR.

  v2: make use of vtn_get_branch_type() and add an assertion

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106090
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106504
  CC: 18.0 18.1 <[email protected]>
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* meson: remove dependency antipattern | Eric Engestrom | 2018-05-14 | 1 | -1/+1

  `dep_valgrind != []` now (0.45) produces a warning that is quite explicit:

      WARNING: Trying to compare values of different types (DependencyHolder, list) using !=.
      The result of this is undefined and will become a hard error in a future Meson release.

  `dep_valgrind = []` used to be the recommended way to deal with a non-existent dependency, but these don't work with `.found()`, so now the recommended way is to declare an impossible dependency, which null_dep does for us in Mesa.

  In short, we don't need and shouldn't check for `!= []` anywhere anymore.

  Reviewed-by: Dylan Baker <[email protected]>
  Signed-off-by: Eric Engestrom <[email protected]>
* anv,nir: add generated files to .gitignore(s) | Rhys Perry | 2018-05-12 | 1 | -0/+2

  Reviewed-by: Jason Ekstrand <[email protected]>
* nir/format_convert: Add code for bitcasting vectors | Jason Ekstrand | 2018-05-09 | 1 | -0/+53

  This is a fairly direct port from blorp. The only real change is that the nir_format_convert version doesn't assume that everything is a vec4.

  Reviewed-by: Topi Pohjolainen <[email protected]>
* nir/format_convert: Add a function to pack RGB9_E5 formats | Jason Ekstrand | 2018-05-09 | 1 | -0/+64

  Reviewed-by: Topi Pohjolainen <[email protected]>
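Note (illustrative, not part of the commit): RGB9_E5 stores three 9-bit mantissas that share one 5-bit exponent. The Python sketch below follows the encoding procedure from the EXT_texture_shared_exponent specification (clamp, pick the shared exponent from the largest channel, then round each channel). It is an assumption that the NIR helper computes the same result by other means; the function names are hypothetical and NaN inputs are not handled.

```python
import math

N_BITS = 9       # mantissa bits per channel
BIAS = 15        # exponent bias
MAX_EXP = 31     # largest biased exponent
MAX_VALUE = (2**N_BITS - 1) / 2**N_BITS * 2**(MAX_EXP - BIAS)   # 65408.0

def pack_rgb9e5(r, g, b):
    """Pack three non-negative floats into a 32-bit shared-exponent word."""
    clamped = [max(0.0, min(MAX_VALUE, c)) for c in (r, g, b)]
    max_c = max(clamped)
    # Preliminary shared exponent, taken from the largest channel.
    if max_c > 0.0:
        exp = max(-BIAS - 1, math.floor(math.log2(max_c))) + 1 + BIAS
    else:
        exp = 0
    scale = 2.0 ** (exp - BIAS - N_BITS)
    # If rounding pushes the largest mantissa to 2^N, bump the exponent.
    if math.floor(max_c / scale + 0.5) == 2**N_BITS:
        exp += 1
        scale *= 2.0
    rs, gs, bs = (math.floor(c / scale + 0.5) for c in clamped)
    return rs | (gs << 9) | (bs << 18) | (exp << 27)

def unpack_rgb9e5(packed):
    """Inverse of pack_rgb9e5, up to the format's precision."""
    scale = 2.0 ** ((packed >> 27) - BIAS - N_BITS)
    return tuple(((packed >> shift) & 0x1FF) * scale for shift in (0, 9, 18))
```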
* nir/format_convert: Add pack/unpack for R11F_G11F_B10F | Jason Ekstrand | 2018-05-09 | 1 | -0/+38

  Reviewed-by: Topi Pohjolainen <[email protected]>
* nir/format_convert: Add linear <-> sRGB helpers | Jason Ekstrand | 2018-05-09 | 1 | -0/+26

  Reviewed-by: Topi Pohjolainen <[email protected]>
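Note (illustrative, not part of the commit): the standard sRGB transfer function is piecewise, with a linear segment near zero and a power curve elsewhere. The Python sketch below implements those standard formulas on normalized [0, 1] values; it is an assumption that the NIR helpers encode the same curve, and the function names are hypothetical.

```python
def linear_to_srgb(linear):
    """Encode a linear-light value in [0, 1] with the sRGB transfer function."""
    if linear <= 0.0031308:
        return 12.92 * linear
    return 1.055 * linear ** (1.0 / 2.4) - 0.055

def srgb_to_linear(srgb):
    """Decode an sRGB-encoded value in [0, 1] back to linear light."""
    if srgb <= 0.04045:
        return srgb / 12.92
    return ((srgb + 0.055) / 1.055) ** 2.4
```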
* nir: Add the start of a format conversion helper header | Jason Ekstrand | 2018-05-09 | 3 | -0/+108

  Reviewed-by: Topi Pohjolainen <[email protected]>
* glsl: change ast_type_qualifier bitset size to work around GCC 5.4 bug | Brian Paul | 2018-05-08 | 1 | -1/+7

  Change the size of the bitset from 128 bits to 96. This works around an apparent GCC 5.4 bug in which bad SSE code is generated, leading to a crash in ast_type_qualifier::validate_in_qualifier() (ast_type.cpp:654).

  This can be repro'd with the Piglit test tests/spec/glsl-1.50/execution/varying-struct-basic-gs-fs.shader_test

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105497
  Cc: [email protected]
  Reviewed-by: Charmaine Lee <[email protected]>
  Tested-by: Charmaine Lee <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
* nir: Transform discard_if(true) into discard | Matt Turner | 2018-05-07 | 1 | -1/+16

  Noticed while reviewing Tim Arceri's NIR inlining series.

  Without his series:

  instructions in affected programs: 16 -> 14 (-12.50%)
  helped: 2

  With his series:

  instructions in affected programs: 196 -> 174 (-11.22%)
  helped: 22

  Reviewed-by: Jason Ekstrand <[email protected]>
* compiler/spirv: add implementation to check for SpvCapabilityInt16 support | Iago Toral Quiroga | 2018-05-03 | 2 | -1/+4

  Reviewed-by: Jason Ekstrand <[email protected]>
* compiler/spirv: implement 16-bit bitcasts | Iago Toral Quiroga | 2018-05-03 | 1 | -9/+22

  Reviewed-by: Jason Ekstrand <[email protected]>
* compiler/lower_64bit_packing: rename the pass to be more generic | Iago Toral Quiroga | 2018-05-03 | 4 | -6/+6

  It can do 32-bit packing too now.

  Reviewed-by: Jason Ekstrand <[email protected]>
* nir/lower_64bit_packing: extend the pass to handle packing from / to 16-bit. | Iago Toral Quiroga | 2018-05-03 | 1 | -5/+59

  With 16-bit support we can now do 32-bit packing. A follow-up patch will rename the pass to something more generic.

  Reviewed-by: Jason Ekstrand <[email protected]>
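Note (illustrative, not part of the commit): packing two 16-bit values into one 32-bit word comes down to a shift and an OR, and unpacking to a mask and a shift. The Python sketch assumes the first component lands in the low half; the function names are hypothetical and are not the NIR opcode names.

```python
def pack_32_from_2x16(lo, hi):
    """Combine two 16-bit values into one 32-bit word (lo in bits 0..15)."""
    return ((hi & 0xFFFF) << 16) | (lo & 0xFFFF)

def unpack_32_to_2x16(word):
    """Split a 32-bit word back into its (lo, hi) 16-bit halves."""
    return word & 0xFFFF, (word >> 16) & 0xFFFF
```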