path: root/src/compiler
Commit message (Author, Age, Files, Lines)
* spirv: don't discard access set by vtn_pointer_dereference (Lionel Landwerlin, 2019-07-30, 1 file, -1/+1)
  We can have an access flag already set here, so just augment the existing ones.
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Fixes: 0fb61dfdeb ("spirv: propagate access qualifiers through ssa & pointer")
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* glsl: Add builtin functions for EXT_texture_shadow_lod (Paulo Zanoni, 2019-07-30, 1 file, -0/+26)
  With the help of Sagar, Ian and Ivan.
  v2: Fix dependencies (Ian Romanick)
  v3: 1) Fix function name (Marek Olsak)
      2) Add check for extension enable (Marek Olsak)
  Signed-off-by: Paulo Zanoni <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Allow _textureCubeArrayShadow function to accept ir_texture_opcode (Paulo Zanoni, 2019-07-30, 1 file, -4/+19)
  This will be used to support one of the functions from the EXT_texture_shadow_lod specification.
  With the help of Sagar, Ian and Ivan.
  Signed-off-by: Paulo Zanoni <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: extension boilerplate for EXT_texture_shadow_lod (Paulo Zanoni, 2019-07-30, 2 files, -0/+3)
  With the help of Sagar, Ian and Ivan.
  Signed-off-by: Paulo Zanoni <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* nir/find_array_copies: Use correct parent array length (Connor Abbott, 2019-07-30, 1 file, -2/+3)
  instr->type is the type of the array element, not the type of the array being dereferenced. Rather than fishing out the parent type, just use parent->num_children, which should be the length plus 1.
  While we're here, add another assert for the issue fixed by the previous commit.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111251
  Fixes: 156306e5e62 ("nir/find_array_copies: Handle wildcards and overlapping copies")
  Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Fix comparison for nir_deref_instr_is_known_out_of_bounds() (Connor Abbott, 2019-07-30, 1 file, -1/+1)
  There was an off-by-one error.
  Fixes: 156306e5e62 ("nir/find_array_copies: Handle wildcards and overlapping copies")
  Reviewed-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
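  The commit message does not say which way the comparison was off. As a hedged illustration only, assuming the usual form of this mistake (treating `index == length` as in bounds), a constant-index bounds test looks like the sketch below. `is_known_out_of_bounds` is a hypothetical Python stand-in, not the NIR function.

```python
def is_known_out_of_bounds(index: int, length: int) -> bool:
    """Hypothetical stand-in: True when a constant array index cannot be valid.

    Valid indices are 0 .. length - 1, so the boundary case index == length
    must count as out of bounds. Writing `index > length` would miss it.
    """
    return index >= length
```

  With a four-element array, index 3 is the last valid element and index 4 is already out of bounds.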
* intel: Use a system value for gl_FragCoord (Jason Ekstrand, 2019-07-29, 1 file, -13/+3)
  It's kind-of an anomaly that the Intel drivers are still treating gl_FragCoord as an input. It also makes zero sense because we have to special-case it in the back-end. Because ANV is the only user of nir_lower_wpos_center, we go ahead and just update it to look for nir_intrinsic_load_frag_coord as part of this patch.
  Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Treat gl_FragCoord as a varying even when it's a system value (Jason Ekstrand, 2019-07-29, 1 file, -1/+3)
  This fixes glsl-fcoord-invariant-pass.shader_test on drivers that set GLSLFragCoordIsSysVal, which includes radeonsi among others.
  Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Fix helgrind complaints about data race in trivial_swizzle init. (Eric Anholt, 2019-07-29, 1 file, -3/+3)
  Even if the data race wasn't real (I'm not great at reasoning about this), helgrind is a nice enough tool that keeping noise out of it is probably worthwhile. Besides, typing out the numbers keeps the data in the read-only data section instead of emitting code to initialize it every time.
  Reviewed-by: Iago Toral Quiroga <[email protected]>
* nir/find_array_copies: Handle wildcards and overlapping copies (Connor Abbott, 2019-07-29, 3 files, -185/+405)
  This commit rewrites opt_find_array_copies to be able to handle an array copy sequence with other intervening operations in between. In particular, this handles the case where we OpLoad an array of structs and then OpStore it, which generates code like:

      foo[0].a = bar[0].a
      foo[0].b = bar[0].b
      foo[1].a = bar[1].a
      foo[1].b = bar[1].b
      ...

  that wasn't recognized by the previous pass.

  In order to correctly handle copying arrays of arrays, and in particular to correctly handle copies involving wildcards, we need to use a tree structure similar to lower_vars_to_ssa so that we can walk all the partial array copies invalidated by a particular write, including ones where one of the common indices is a wildcard. I actually think that when factoring in the needed hashing/comparing code, a hash table based approach wouldn't be a lot smaller anyways.

  All of the changes come from tessellation control shaders in Strange Brigade, where we're able to remove the DXVK-inserted copy at the beginning of the shader. These are the results for radv:

  Totals from affected shaders:
  SGPRS: 4576 -> 4576 (0.00 %)
  VGPRS: 13784 -> 5560 (-59.66 %)
  Spilled SGPRs: 0 -> 0 (0.00 %)
  Spilled VGPRs: 0 -> 0 (0.00 %)
  Private memory VGPRs: 0 -> 0 (0.00 %)
  Scratch size: 8696 -> 6876 (-20.93 %) dwords per thread
  Code Size: 329940 -> 263268 (-20.21 %) bytes
  LDS: 0 -> 0 (0.00 %) blocks
  Max Waves: 330 -> 898 (172.12 %)
  Wait states: 0 -> 0 (0.00 %)

  Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Print array deref indices as decimal (Connor Abbott, 2019-07-29, 1 file, -1/+1)
  We print the size as decimal too, and using hex without a leading "0x" was very confusing.
  Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Optimize umod lowering (Sagar Ghuge, 2019-07-26, 1 file, -25/+23)
  We don't have to calculate the final quotient in order to calculate the unsigned modulo result. Once we are done with error correction, we have a partial result which can be used to find the result of the modulo operation.
  Signed-off-by: Sagar Ghuge <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* spirv: propagate access qualifiers through ssa & pointer (Lionel Landwerlin, 2019-07-26, 3 files, -4/+62)
  Not only variables can be flagged as NonUniformEXT but also expressions. We're currently ignoring it in an expression such as:

      imageLoad(data[nonuniformEXT(rIndex)], 0)

  The associated SPIRV:

      OpDecorate %69 NonUniformEXT
      ...
      %69 = OpLoad %61 %68

  This change propagates access qualifiers through ssa & pointers so that when it hits OpLoad/OpStore style instructions, qualifiers are not forgotten.

  Fixes failures in the following tests:

      dEQP-VK.descriptor_indexing.*

  Signed-off-by: Lionel Landwerlin <[email protected]>
  Fixes: 8ed583fe523703 ("spirv: Handle the NonUniformEXT decoration")
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* spirv: wrap push ssa/pointer values (Lionel Landwerlin, 2019-07-26, 4 files, -69/+89)
  This refactor allows common code to apply decorations on all ssa/pointer values. In particular, this will allow us to propagate access qualifiers.
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Suggested-by: Caio Marcelo de Oliveira Filho <[email protected]>
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* nir: add access to image_deref intrinsics (Lionel Landwerlin, 2019-07-26, 1 file, -0/+3)
  SPIRV added the ability to access variables and have expressions that are not dynamically uniform, and because spirv_to_nir generates deref instructions, we'll need to have that access there.
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Cc: <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* glsl: report no function instead of empty candidate list (Erik Faye-Lund, 2019-07-25, 1 file, -2/+17)
  When generating the error message for a missing function error where all available overloads were missing due to a too low GLSL version, we used to report something like this:

  ---8<---
  0:224(14): error: no matching function for call to `textureCubeLod(samplerCube, vec3, float)'; candidates are:
  0:224(14): error: type mismatch
  ---8<---

  This is a pretty confusing error message, and can throw people off when debugging. So let's instead check if any overload is available before we decide what to print. This allows us to report something like this instead:

  ---8<---
  0:224(14): error: no function with name 'textureCubeLod'
  0:224(14): error: type mismatch
  ---8<---

  This is arguably easier to understand for programmers, and doesn't send you on a wild goose chase to figure out what argument is wrong just because you stopped reading the message prematurely.

  I'm of course referring to a friend, not me. For sure. I would never do that.

  Signed-off-by: Erik Faye-Lund <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* nir/algebraic: add scmp algebraic optimizations (Jonathan Marek, 2019-07-24, 1 file, -0/+16)
  When 'x' is the result of a scmp op:
      x != 0.0 or x == 1.0: passthrough
      x == 0.0 or x != 1.0: invert
  Signed-off-by: Jonathan Marek <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
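  The two rules above can be checked exhaustively, since a scmp-style comparison only ever yields 0.0 or 1.0. The Python sketch below is an illustration under that assumption, not the rules from nir_opt_algebraic.py: `seq` and `sne` are local models of float comparisons returning 0.0/1.0, "invert" is modeled as `1.0 - x`, and `check_scmp_identities` is a hypothetical helper name.

```python
def seq(a: float, b: float) -> float:
    """Model of a set-on-equal comparison: 1.0 when equal, else 0.0."""
    return 1.0 if a == b else 0.0


def sne(a: float, b: float) -> float:
    """Model of a set-on-not-equal comparison: 1.0 when different, else 0.0."""
    return 1.0 if a != b else 0.0


def check_scmp_identities() -> bool:
    """Verify the passthrough and invert cases for every possible scmp result."""
    for x in (0.0, 1.0):
        # Passthrough: comparing against 0.0 (!=) or 1.0 (==) gives x back.
        if sne(x, 0.0) != x or seq(x, 1.0) != x:
            return False
        # Invert: comparing against 0.0 (==) or 1.0 (!=) gives the opposite.
        if seq(x, 0.0) != 1.0 - x or sne(x, 1.0) != 1.0 - x:
            return False
    return True
```

  How the commit itself expresses "invert" (for example, by swapping in the opposite comparison) is not shown in the message, so the `1.0 - x` form here is only one way to state it.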
* nir/algebraic: add option to lower fall_equalN/fany_nequalN (Jonathan Marek, 2019-07-24, 2 files, -0/+9)
  Add generic lowerings for fall_equalN/fany_nequalN. These should be optimal for vec4 backends that don't have any special instructions for it, as long as they support saturate.
  Signed-off-by: Jonathan Marek <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
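  The message does not spell out the lowerings. One plausible shape, given the mention of saturate, is sketched below in Python as an assumption rather than as the rules the commit adds: compare per component, take a dot product of the 0.0/1.0 results with themselves to count mismatches, and saturate to clamp the count to 1.0. All function names here are local models.

```python
def sne(a: float, b: float) -> float:
    """Model of set-on-not-equal: 1.0 when different, else 0.0."""
    return 1.0 if a != b else 0.0


def fsat(x: float) -> float:
    """Model of saturate: clamp to the range [0.0, 1.0]."""
    return min(max(x, 0.0), 1.0)


def fany_nequal(a, b) -> float:
    """1.0 if any component differs (assumed lowering: fsat(dot(d, d)))."""
    d = [sne(x, y) for x, y in zip(a, b)]
    return fsat(sum(v * v for v in d))


def fall_equal(a, b) -> float:
    """1.0 if all components match, expressed through fany_nequal."""
    return 1.0 if fany_nequal(a, b) == 0.0 else 0.0
```

  The dot product is what makes this a good fit for vec4 hardware: the per-component comparison and the reduction are each a single vector operation.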
* nir/algebraic: add fdot2 optimizations (Jonathan Marek, 2019-07-24, 1 file, -0/+3)
  Add simple fdot2 optimizations that are missing.
  Signed-off-by: Jonathan Marek <[email protected]>
  Reviewed-by: Thomas Helland <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* nir/algebraic: add option to lower fdph (Jonathan Marek, 2019-07-24, 2 files, -1/+6)
  For backends that don't have an 'fdph' instruction.
  Signed-off-by: Jonathan Marek <[email protected]>
  Reviewed-by: Thomas Helland <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
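  For readers unfamiliar with the opcode: assuming fdph is the homogeneous dot product (a three-component dot product plus the w component of the second operand), a backend without it can fall back to a dot product and an add. The Python sketch below only models that arithmetic; `fdot3` and `fdph_lowered` are illustrative names, and the exact rule in the commit is not quoted here.

```python
def fdot3(a, b) -> float:
    """Three-component dot product."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def fdph_lowered(a, b) -> float:
    """Assumed lowering: dot(a.xyz, b.xyz) + b.w, with `b` a 4-component vector."""
    return fdot3(a, b) + b[3]
```

  For example, a = (1, 2, 3) and b = (4, 5, 6, 7) gives 4 + 10 + 18 + 7.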
* nir: replace lower_sincos with algebraic opt (Jonathan Marek, 2019-07-24, 5 files, -143/+12)
  This version has fewer ops for the same precision.
  Signed-off-by: Jonathan Marek <[email protected]>
  Reviewed-by: Vasily Khoruzhick <[email protected]>
  Acked-by: Matt Turner <[email protected]>
* nir/algebraic: allow swizzle in nir_algebraic replace expression (Jonathan Marek, 2019-07-24, 4 files, -6/+22)
  This is to allow optimizations in nir_opt_algebraic not otherwise possible.
  Signed-off-by: Jonathan Marek <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
  Acked-by: Matt Turner <[email protected]>
* nir,intel: lower if (cond) demote() to new intrinsic demote_if(cond) (Daniel Schürmann, 2019-07-24, 4 files, -21/+35)
  This will effectively enable the optimization in anv.
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* nir/lower_subgroups: Properly lower masks when subgroup_size == 0 (Jason Ekstrand, 2019-07-24, 1 file, -5/+11)
  Instead of building a constant mask (which depends on knowing the subgroup size), we build an expression. Because the pass uses the nir_shader_lower_instructions helper, subgroup lowering will be run on any newly emitted instructions as well as the previously existing instructions. In particular, if the subgroup size is known, the newly emitted subgroup_size intrinsic will get turned into a constant and a later constant folding pass will clean it up.
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
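  To make "build an expression" concrete, the Python sketch below models a mask computed from a subgroup size that is only a value, not a compile-time constant. It is an illustration under assumptions: the 64-bit ballot width, the shift-based formula, and the names `subgroup_mask` and `ge_mask` are not taken from the commit.

```python
def subgroup_mask(subgroup_size: int, bit_size: int = 64) -> int:
    """Mask with one bit set per invocation: all-ones >> (bit_size - size)."""
    all_ones = (1 << bit_size) - 1
    return all_ones >> (bit_size - subgroup_size)


def ge_mask(invocation: int, subgroup_size: int, bit_size: int = 64) -> int:
    """Bits for invocations >= `invocation`, limited to the live subgroup."""
    all_ones = (1 << bit_size) - 1
    # Shift left to clear the lower bits, then truncate to bit_size bits.
    upper = (all_ones << invocation) & all_ones
    return upper & subgroup_mask(subgroup_size, bit_size)
```

  Because the size enters only through a shift amount, the same expression works whether the size is known up front (and folds to a constant) or is read at run time.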
* nir: Add lowering for nir_op_irem and nir_op_imod (Sagar Ghuge, 2019-07-24, 1 file, -2/+16)
  Tested on Gen > 9.
  v2: 1) Fix lowering
      2) Keep a consistent i/u order (Matt Turner)
  Signed-off-by: Sagar Ghuge <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
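  The two opcodes differ only in sign handling. Assuming irem takes the sign of the dividend (truncated division) and imod takes the sign of the divisor (floored division), the Python sketch below models both and derives imod from irem with one correction step. It illustrates the semantics only and is not the lowering from the commit.

```python
def irem(a: int, b: int) -> int:
    """Remainder of truncated division: result has the sign of `a`."""
    q = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        q = -q  # truncate toward zero instead of flooring
    return a - q * b


def imod(a: int, b: int) -> int:
    """Modulo of floored division: result has the sign of `b`."""
    r = irem(a, b)
    # If the remainder is non-zero and its sign disagrees with the divisor,
    # adding the divisor moves it into the divisor's sign range.
    if r != 0 and (r < 0) != (b < 0):
        r += b
    return r
```

  Python's own `%` operator follows the floored convention, which gives a convenient cross-check for `imod`.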
* nir/lower_io: Return SSA defs from helpers (Jason Ekstrand, 2019-07-23, 1 file, -25/+42)
  I can't find a single place where nir_lower_io is called after going out of SSA, which is the only real reason why you wouldn't do this. Returning SSA defs is more idiomatic and is required for the next commit.
  Reviewed-by: Matt Turner <[email protected]>
* nir/gather_info: Look for uses of helper invocations (Jason Ekstrand, 2019-07-23, 2 files, -0/+27)
  The one obvious omission here is gl_HelperInvocation itself. However, the spec doesn't require that we generate them when gl_HelperInvocation is used, it merely mandates that we report them if they are there.
  Reviewed-by: Alyssa Rosenzweig <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* nir/gather_info: Move setting uses_64bit out of the switch (Jason Ekstrand, 2019-07-23, 1 file, -5/+6)
  Otherwise, as we add things to the switch, we're going to forget and add some 64-bit op at some point in the future and it'll stop getting flagged. There's no reason why we can't do the check for derivatives.
  Reviewed-by: Alyssa Rosenzweig <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Add a nir_tex_instr_has_implicit_derivatives helper (Jason Ekstrand, 2019-07-23, 2 files, -11/+14)
  Reviewed-by: Alyssa Rosenzweig <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Move nir_alu_instr_is_comparison to the ALU section (Jason Ekstrand, 2019-07-23, 1 file, -23/+23)
  Reviewed-by: Alyssa Rosenzweig <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* nir: use | instead of || operator (Andrii Simiklit, 2019-07-23, 1 file, -1/+1)
  warning: use of logical '||' with constant operand
  note: use '|' for a bitwise operation
  Fixes: 758fdce9fee ("nir: Add some generic helpers for writing lowering passes")
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
  Reviewed-by: Alyssa Rosenzweig <[email protected]>
  Reviewed-by: Eric Engestrom <[email protected]>
  Signed-off-by: Andrii Simiklit <[email protected]>
* nir: don't return void (Eric Engestrom, 2019-07-23, 1 file, -1/+2)
  Fixes: 14531d676b11999123c0 ("nir: make nir_const_value scalar")
  Signed-off-by: Eric Engestrom <[email protected]>
  Reviewed-by: Karol Herbst <[email protected]>
* nir: Remove a bunch of large stack arrays (Jason Ekstrand, 2019-07-22, 4 files, -6/+15)
  Reviewed-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* util: use standard name for snprintf() (Eric Engestrom, 2019-07-19, 6 files, -28/+27)
  Signed-off-by: Eric Engestrom <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Emil Velikov <[email protected]>
* nir: Only rematerialize comparisons with all SSA sources (Jason Ekstrand, 2019-07-19, 1 file, -0/+15)
  Otherwise, you may end up moving a register read and that could result in an incorrect shader. This commit fixes a rendering issue in Elite: Dangerous.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111152
  Fixes: 3ee2e84c60 "nir: Rematerialize compare instructions"
  Reviewed-by: Matt Turner <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
* spirv: Fix order of barriers in SpvOpControlBarrier (Daniel Schürmann, 2019-07-19, 1 file, -4/+4)
  Semantically, the memory barrier has to come first to wait for the completion of pending memory requests. Afterwards, the workgroups can be synchronized.
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* nir: use a switch when printing intrinsic indices (Caio Marcelo de Oliveira Filho, 2019-07-19, 1 file, -8/+32)
  Reviewed-by: Eric Engestrom <[email protected]>
* nir/algebraic: mark a few comparison simplifications as precise (Rhys Perry, 2019-07-19, 1 file, -2/+2)
  No vkpipeline-db changes found.
  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Alyssa Rosenzweig [email protected]
  Reviewed-by: Connor Abbott <[email protected]>
* nir/algebraic: optimize contradictory iand operands (Rhys Perry, 2019-07-19, 1 file, -0/+6)
  Some of these were found in a few GTAV, Rise of the Tomb Raider and Shadow of the Tomb Raider shaders.

  Results from vkpipeline-db run with ACO:

  Totals from affected shaders:
  SGPRS: 376 -> 376 (0.00 %)
  VGPRS: 220 -> 220 (0.00 %)
  Spilled SGPRs: 0 -> 0 (0.00 %)
  Spilled VGPRs: 0 -> 0 (0.00 %)
  Private memory VGPRs: 0 -> 0 (0.00 %)
  Scratch size: 0 -> 0 (0.00 %) dwords per thread
  Code Size: 13492 -> 11560 (-14.32 %) bytes
  LDS: 6 -> 6 (0.00 %) blocks
  Max Waves: 69 -> 69 (0.00 %)
  Wait states: 0 -> 0 (0.00 %)

  v2: use False instead of 0
  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Alyssa Rosenzweig [email protected]
  Reviewed-by: Connor Abbott <[email protected]>
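  The message does not list the patterns. As an assumed example of what "contradictory operands" means, an AND of two integer comparisons that can never both hold, such as `a < b` with `b < a` or `a == b` with `a != b`, is always false and can be replaced by a constant. The Python sketch below checks that claim by brute force; `contradictory_pairs_always_false` is a hypothetical helper.

```python
from itertools import product


def contradictory_pairs_always_false(values) -> bool:
    """Return True if no pair (a, b) satisfies either contradictory conjunction."""
    for a, b in product(list(values), repeat=2):
        if (a < b) and (b < a):
            return False  # would disprove the "always false" claim
        if (a == b) and (a != b):
            return False
    return True
```

  Replacing such an expression with False also lets later passes delete whatever code was guarded by it, which is consistent with the code size drop reported above.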
* nir/lower_clip: add support for geometry shaders (Timothy Arceri, 2019-07-19, 2 files, -0/+58)
  This will be used to enable compat profile support for geometry shaders.
  Reviewed-by: Kenneth Graunke <[email protected]>
* nir/lower_clip: add lower_clip_outputs() helper (Timothy Arceri, 2019-07-19, 1 file, -42/+51)
  This will be reused in the following patch to add support for clip vertex lowering in geometry shaders.
  Reviewed-by: Kenneth Graunke <[email protected]>
* nir/lower_clip: add create_clipdist_vars() helper (Timothy Arceri, 2019-07-19, 1 file, -16/+18)
  Reviewed-by: Kenneth Graunke <[email protected]>
* nir/lower_clip: add a find_clipvertex_and_position_outputs() helper (Timothy Arceri, 2019-07-19, 1 file, -24/+35)
  This will allow code sharing in a following patch that adds support for lowering in geometry shaders. It also allows us to exit early if there is no lowering to do, which allows a small code tidy up.
  Reviewed-by: Kenneth Graunke <[email protected]>
* nir/large_constants: De-duplicate constants (Caio Marcelo de Oliveira Filho, 2019-07-18, 1 file, -21/+75)
  If a function has a constant and is called more than once, after inlining we may end up with different variables representing the same constant. This commit looks into the data and de-duplicates them.

  The first pass now will collect the constant data in a per variable buffer, then de-duplication happens (by sorting then linear walk), and the second pass will use the data in var->data.location.

  One side-effect of the current implementation is that constants will be reordered. If this turns out to be a problem, it can be fixed.

  An alternative strategy considered was to perform this on a per-function basis and then merge the results; the problem is that we would have to fix up the offsets during the merge. Given the data we have, the current patch is good enough.

  Reviewed-by: Jason Ekstrand <[email protected]>
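  The "sorting then linear walk" step can be sketched as follows. This is a simplified Python model under stated assumptions, not the pass itself: constants are plain byte strings keyed by a made-up variable name, alignment is ignored, and `dedup_constants` is a hypothetical helper.

```python
def dedup_constants(constants: dict) -> tuple:
    """Pack constant byte strings into one blob, sharing identical data.

    Returns (blob, offsets), where offsets maps each name to its byte offset.
    """
    # Sorting by content places identical constants next to each other.
    ordered = sorted(constants.items(), key=lambda item: item[1])
    blob = bytearray()
    offsets = {}
    prev_data = None
    prev_offset = 0
    for name, data in ordered:
        if data != prev_data:
            # First time this content is seen: append it to the blob.
            prev_offset = len(blob)
            blob += data
            prev_data = data
        # Duplicates reuse the offset of the previous identical entry.
        offsets[name] = prev_offset
    return bytes(blob), offsets
```

  Note that the output order follows the sort, not the input order, which mirrors the reordering side effect the message mentions.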
* nir/large_constants: Use ralloc for var_infos (Caio Marcelo de Oliveira Filho, 2019-07-18, 1 file, -3/+3)
  This will be used later on to allocate constant data for each variable (and then deduplicate). Also drop initializing found_read, as it is already implicitly false in the literal.
  Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Allow internal changes to the instr in nir_shader_lower_instructions(). (Eric Anholt, 2019-07-18, 2 files, -1/+11)
  v3d's NIR txf_ms lowering wants to swizzle around the input coordinates in NIR, but doesn't generate a new txf_ms instruction as a replacement. It's pretty easy to allow that in nir_shader_lower_instructions, and it may be common in lowering passes.
  Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Add gl_PointCoord system value (Andreas Baierl, 2019-07-18, 3 files, -0/+6)
  gl_PointCoord handling needs some special bits set in lima/ppir code generation. Treating gl_PointCoord as a system value makes it easier to distinguish from a regular varying.
  Signed-off-by: Andreas Baierl <[email protected]>
  Reviewed-by: Qiang Yu <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* glsl: Optionally declare gl_PointCoord as a system value (Andreas Baierl, 2019-07-18, 3 files, -2/+8)
  Signed-off-by: Andreas Baierl <[email protected]>
  Reviewed-by: Qiang Yu <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* nir/lower_viewport: Check variable mode first (Connor Abbott, 2019-07-18, 1 file, -1/+2)
  The location is unused for shader_temp and function_temp variables, and due to the way nir_lower_io_to_temporaries demotes shader_out variables to shader_temp variables, it happened to equal VARYING_SLOT_POS for the gl_Position temporary, which made this pass fail with the offline compiler due to this coming before vars_to_ssa.
  Reviewed-by: Qiang Yu <[email protected]>
* nir: add a V3D-specific intrinsic for per-sample color writes (Iago Toral Quiroga, 2019-07-18, 1 file, -0/+9)
  For per-sample color writes we need the output intrinsic to pack the sample index, which is not provided with regular store_output intrinsics unless we figured out a way to encode it into the base or the offset.
  v2:
   - Drop the writemask (Eric)
  Reviewed-by: Eric Anholt <[email protected]>