summaryrefslogtreecommitdiffstats
path: root/src/compiler/nir
Commit message (Collapse)AuthorAgeFilesLines
* nir: Separate a weird compare with zero to two compares with zeroIan Romanick2018-01-301-0/+2
| | | | | | | | | | | min(a+b, c+d) >= 0 becomes (a+b >= 0 && c+d >= 0). No shader-db changes, but it does prevent 6 to 12 instruction regressions in the next patch on all measured Intel platforms. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Elie Tournier <[email protected]>
* nir: Simplify min and max of b2fIan Romanick2018-01-301-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | v2: Rebase on almost 2 years. Require that one of the arguments to fmin or fmax be used only once. This prevents some regressions. shader-db results: Skylake and Broadwell had similar results. Skylake shown. total instructions in shared programs: 14526021 -> 14525913 (<.01%) instructions in affected programs: 4613 -> 4505 (-2.34%) helped: 31 HURT: 0 helped stats (abs) min: 1 max: 4 x̄: 3.48 x̃: 4 helped stats (rel) min: 0.62% max: 6.67% x̄: 3.31% x̃: 2.42% total cycles in shared programs: 533118710 -> 533118403 (<.01%) cycles in affected programs: 34334 -> 34027 (-0.89%) helped: 24 HURT: 0 helped stats (abs) min: 4 max: 24 x̄: 12.79 x̃: 14 helped stats (rel) min: 0.25% max: 2.40% x̄: 1.08% x̃: 1.03% No changes on GM45, Iron Lake, Sandy Bridge, Ivy Bridge, or Haswell. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Elie Tournier <[email protected]>
* nir: Undo possible damage caused by rearranging or-compounded float comparesIan Romanick2018-01-301-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | shader-db results: Skylake and Broadwell had similar results (Skylake shown) total instructions in shared programs: 14525898 -> 14525836 (<.01%) instructions in affected programs: 1964 -> 1902 (-3.16%) helped: 14 HURT: 0 helped stats (abs) min: 1 max: 25 x̄: 4.43 x̃: 1 helped stats (rel) min: 0.68% max: 9.77% x̄: 2.10% x̃: 0.86% 95% mean confidence interval for instructions value: -9.46 0.60 95% mean confidence interval for instructions %-change: -3.97% -0.24% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 533119892 -> 533115756 (<.01%) cycles in affected programs: 96061 -> 91925 (-4.31%) helped: 13 HURT: 1 helped stats (abs) min: 60 max: 596 x̄: 318.77 x̃: 300 helped stats (rel) min: 1.15% max: 5.49% x̄: 4.27% x̃: 4.42% HURT stats (abs) min: 8 max: 8 x̄: 8.00 x̃: 8 HURT stats (rel) min: 0.46% max: 0.46% x̄: 0.46% x̃: 0.46% 95% mean confidence interval for cycles value: -379.43 -211.43 95% mean confidence interval for cycles %-change: -4.84% -3.01% Cycles are helped. Haswell, Ivy Bridge and Sandy Bridge had similar results (Haswell shown). total instructions in shared programs: 9033948 -> 9033898 (<.01%) instructions in affected programs: 535 -> 485 (-9.35%) helped: 2 HURT: 0 total cycles in shared programs: 84631402 -> 84628949 (<.01%) cycles in affected programs: 63197 -> 60744 (-3.88%) helped: 13 HURT: 2 helped stats (abs) min: 1 max: 594 x̄: 189.62 x̃: 140 helped stats (rel) min: 0.07% max: 5.04% x̄: 3.79% x̃: 4.01% HURT stats (abs) min: 4 max: 8 x̄: 6.00 x̃: 6 HURT stats (rel) min: 0.17% max: 0.45% x̄: 0.31% x̃: 0.31% 95% mean confidence interval for cycles value: -253.40 -73.67 95% mean confidence interval for cycles %-change: -4.24% -2.25% Cycles are helped. No changes on GM45 or Iron Lake. v2: Add a couple more tautological compares. Suggested by Elie. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Elie Tournier <[email protected]>
* nir: Be more conservative about rearranging or-compounded comparesIan Romanick2018-01-301-4/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If both comparisons are used as sources for instructions other than the ior, this transformation is detrimental. If the non-identical value in both compares is constant, the fmin or fmax will be constant-folded away, so the transformation is always a win. shader-db results: Skylake total instructions in shared programs: 14526147 -> 14525898 (<.01%) instructions in affected programs: 70239 -> 69990 (-0.35%) helped: 102 HURT: 0 helped stats (abs) min: 1 max: 8 x̄: 2.44 x̃: 1 helped stats (rel) min: 0.07% max: 2.30% x̄: 0.38% x̃: 0.20% 95% mean confidence interval for instructions value: -2.86 -2.02 95% mean confidence interval for instructions %-change: -0.46% -0.31% Instructions are helped. total cycles in shared programs: 533120531 -> 533119892 (<.01%) cycles in affected programs: 994875 -> 994236 (-0.06%) helped: 76 HURT: 26 helped stats (abs) min: 1 max: 324 x̄: 27.09 x̃: 13 helped stats (rel) min: <.01% max: 4.21% x̄: 0.45% x̃: 0.18% HURT stats (abs) min: 1 max: 167 x̄: 54.62 x̃: 26 HURT stats (rel) min: <.01% max: 4.36% x̄: 1.01% x̃: 0.39% 95% mean confidence interval for cycles value: -19.44 6.91 95% mean confidence interval for cycles %-change: -0.30% 0.15% Inconclusive result (value mean confidence interval includes 0). Broadwell total instructions in shared programs: 14816005 -> 14815787 (<.01%) instructions in affected programs: 64658 -> 64440 (-0.34%) helped: 97 HURT: 0 helped stats (abs) min: 1 max: 8 x̄: 2.25 x̃: 1 helped stats (rel) min: 0.07% max: 2.30% x̄: 0.38% x̃: 0.20% 95% mean confidence interval for instructions value: -2.62 -1.87 95% mean confidence interval for instructions %-change: -0.45% -0.30% Instructions are helped. total cycles in shared programs: 559340386 -> 559339907 (<.01%) cycles in affected programs: 1090491 -> 1090012 (-0.04%) helped: 66 HURT: 28 helped stats (abs) min: 2 max: 198 x̄: 23.83 x̃: 16 helped stats (rel) min: 0.01% max: 4.21% x̄: 0.47% x̃: 0.27% HURT stats (abs) min: 2 max: 226 x̄: 39.07 x̃: 11 HURT stats (rel) min: <.01% max: 4.61% x̄: 0.64% x̃: 0.20% 95% mean confidence interval for cycles value: -15.94 5.75 95% mean confidence interval for cycles %-change: -0.35% 0.07% Inconclusive result (value mean confidence interval includes 0). LOST: 0 GAINED: 1 Haswell total instructions in shared programs: 9034106 -> 9033948 (<.01%) instructions in affected programs: 24096 -> 23938 (-0.66%) helped: 38 HURT: 0 helped stats (abs) min: 1 max: 8 x̄: 4.16 x̃: 4 helped stats (rel) min: 0.42% max: 2.29% x̄: 0.71% x̃: 0.64% 95% mean confidence interval for instructions value: -4.71 -3.60 95% mean confidence interval for instructions %-change: -0.84% -0.58% Instructions are helped. total cycles in shared programs: 84631628 -> 84631402 (<.01%) cycles in affected programs: 148674 -> 148448 (-0.15%) helped: 14 HURT: 14 helped stats (abs) min: 1 max: 114 x̄: 22.14 x̃: 12 helped stats (rel) min: 0.02% max: 2.98% x̄: 0.66% x̃: 0.21% HURT stats (abs) min: 1 max: 10 x̄: 6.00 x̃: 5 HURT stats (rel) min: 0.01% max: 0.20% x̄: 0.12% x̃: 0.11% 95% mean confidence interval for cycles value: -19.42 3.28 95% mean confidence interval for cycles %-change: -0.59% 0.05% Inconclusive result (value mean confidence interval includes 0). Ivy Bridge total instructions in shared programs: 10015456 -> 10015293 (<.01%) instructions in affected programs: 27701 -> 27538 (-0.59%) helped: 38 HURT: 0 helped stats (abs) min: 1 max: 9 x̄: 4.29 x̃: 4 helped stats (rel) min: 0.33% max: 2.79% x̄: 0.66% x̃: 0.52% 95% mean confidence interval for instructions value: -4.87 -3.71 95% mean confidence interval for instructions %-change: -0.82% -0.51% Instructions are helped. total cycles in shared programs: 87524771 -> 87524569 (<.01%) cycles in affected programs: 112324 -> 112122 (-0.18%) helped: 6 HURT: 12 helped stats (abs) min: 2 max: 111 x̄: 44.67 x̃: 20 helped stats (rel) min: 0.02% max: 2.94% x̄: 1.45% x̃: 1.26% HURT stats (abs) min: 1 max: 16 x̄: 5.50 x̃: 5 HURT stats (rel) min: <.01% max: 0.16% x̄: 0.08% x̃: 0.08% 95% mean confidence interval for cycles value: -29.14 6.69 95% mean confidence interval for cycles %-change: -0.93% 0.08% Inconclusive result (value mean confidence interval includes 0). LOST: 0 GAINED: 2 Sandy Bridge total instructions in shared programs: 10545655 -> 10545465 (<.01%) instructions in affected programs: 37198 -> 37008 (-0.51%) helped: 42 HURT: 0 helped stats (abs) min: 1 max: 8 x̄: 4.52 x̃: 4 helped stats (rel) min: 0.31% max: 2.15% x̄: 0.58% x̃: 0.49% 95% mean confidence interval for instructions value: -5.14 -3.91 95% mean confidence interval for instructions %-change: -0.68% -0.47% Instructions are helped. total cycles in shared programs: 146113059 -> 146112427 (<.01%) cycles in affected programs: 423514 -> 422882 (-0.15%) helped: 32 HURT: 10 helped stats (abs) min: 4 max: 162 x̄: 24.34 x̃: 12 helped stats (rel) min: 0.06% max: 2.74% x̄: 0.37% x̃: 0.11% HURT stats (abs) min: 12 max: 19 x̄: 14.70 x̃: 14 HURT stats (rel) min: 0.10% max: 0.18% x̄: 0.16% x̃: 0.14% 95% mean confidence interval for cycles value: -26.03 -4.07 95% mean confidence interval for cycles %-change: -0.43% -0.05% Cycles are helped. Iron Lake total instructions in shared programs: 7886959 -> 7886925 (<.01%) instructions in affected programs: 1340 -> 1306 (-2.54%) helped: 4 HURT: 0 helped stats (abs) min: 2 max: 15 x̄: 8.50 x̃: 8 helped stats (rel) min: 0.63% max: 4.30% x̄: 2.45% x̃: 2.43% 95% mean confidence interval for instructions value: -20.44 3.44 95% mean confidence interval for instructions %-change: -5.78% 0.89% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 178116996 -> 178116888 (<.01%) cycles in affected programs: 6262 -> 6154 (-1.72%) helped: 2 HURT: 2 helped stats (abs) min: 44 max: 78 x̄: 61.00 x̃: 61 helped stats (rel) min: 3.31% max: 3.94% x̄: 3.62% x̃: 3.62% HURT stats (abs) min: 6 max: 8 x̄: 7.00 x̃: 7 HURT stats (rel) min: 0.34% max: 0.68% x̄: 0.51% x̃: 0.51% 95% mean confidence interval for cycles value: -93.27 39.27 95% mean confidence interval for cycles %-change: -5.38% 2.27% Inconclusive result (value mean confidence interval includes 0). GM45 total instructions in shared programs: 4857887 -> 4857870 (<.01%) instructions in affected programs: 674 -> 657 (-2.52%) helped: 2 HURT: 0 total cycles in shared programs: 122180816 -> 122180744 (<.01%) cycles in affected programs: 3764 -> 3692 (-1.91%) helped: 1 HURT: 1 helped stats (abs) min: 78 max: 78 x̄: 78.00 x̃: 78 helped stats (rel) min: 3.94% max: 3.94% x̄: 3.94% x̃: 3.94% HURT stats (abs) min: 6 max: 6 x̄: 6.00 x̃: 6 HURT stats (rel) min: 0.34% max: 0.34% x̄: 0.34% x̃: 0.34% Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Elie Tournier <[email protected]>
* nir: See through an fneg to apply existing optimizationsIan Romanick2018-01-301-3/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Doing the same for the existing feq and fne transformations didn't help anything in shader-db. shader-db results: Broadwell and Skylake (Skylake shown) total instructions in shared programs: 14529463 -> 14526147 (-0.02%) instructions in affected programs: 402420 -> 399104 (-0.82%) helped: 2136 HURT: 131 helped stats (abs) min: 1 max: 10 x̄: 1.61 x̃: 1 helped stats (rel) min: 0.03% max: 16.22% x̄: 3.14% x̃: 1.12% HURT stats (abs) min: 1 max: 2 x̄: 1.01 x̃: 1 HURT stats (rel) min: 0.13% max: 7.69% x̄: 0.75% x̃: 0.57% 95% mean confidence interval for instructions value: -1.51 -1.41 95% mean confidence interval for instructions %-change: -3.06% -2.78% Instructions are helped. total cycles in shared programs: 533146915 -> 533120531 (<.01%) cycles in affected programs: 10356261 -> 10329877 (-0.25%) helped: 1933 HURT: 844 helped stats (abs) min: 1 max: 490 x̄: 29.44 x̃: 16 helped stats (rel) min: <.01% max: 28.57% x̄: 3.43% x̃: 1.88% HURT stats (abs) min: 1 max: 423 x̄: 36.17 x̃: 12 HURT stats (rel) min: <.01% max: 23.75% x̄: 1.90% x̃: 0.59% 95% mean confidence interval for cycles value: -11.78 -7.22 95% mean confidence interval for cycles %-change: -1.98% -1.65% Cycles are helped. Haswell total instructions in shared programs: 9037416 -> 9034106 (-0.04%) instructions in affected programs: 389831 -> 386521 (-0.85%) helped: 2184 HURT: 120 helped stats (abs) min: 1 max: 11 x̄: 1.57 x̃: 1 helped stats (rel) min: 0.03% max: 25.00% x̄: 2.73% x̃: 1.02% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.19% max: 7.69% x̄: 0.81% x̃: 0.57% 95% mean confidence interval for instructions value: -1.49 -1.39 95% mean confidence interval for instructions %-change: -2.68% -2.41% Instructions are helped. total cycles in shared programs: 84636243 -> 84631628 (<.01%) cycles in affected programs: 4745058 -> 4740443 (-0.10%) helped: 1904 HURT: 960 helped stats (abs) min: 1 max: 466 x̄: 30.21 x̃: 18 helped stats (rel) min: 0.02% max: 36.36% x̄: 3.57% x̃: 2.38% HURT stats (abs) min: 1 max: 1080 x̄: 55.11 x̃: 14 HURT stats (rel) min: 0.02% max: 51.33% x̄: 2.77% x̃: 0.81% 95% mean confidence interval for cycles value: -4.51 1.29 95% mean confidence interval for cycles %-change: -1.64% -1.25% Inconclusive result (value mean confidence interval includes 0). LOST: 1 GAINED: 0 Sandy Bridge and Ivy Bridge (Ivy Bridge shown) total instructions in shared programs: 10018873 -> 10015456 (-0.03%) instructions in affected programs: 512820 -> 509403 (-0.67%) helped: 2268 HURT: 162 helped stats (abs) min: 1 max: 11 x̄: 1.62 x̃: 1 helped stats (rel) min: 0.03% max: 25.00% x̄: 2.47% x̃: 0.88% HURT stats (abs) min: 1 max: 4 x̄: 1.59 x̃: 1 HURT stats (rel) min: 0.09% max: 7.69% x̄: 0.86% x̃: 0.50% 95% mean confidence interval for instructions value: -1.46 -1.35 95% mean confidence interval for instructions %-change: -2.38% -2.12% Instructions are helped. total cycles in shared programs: 87538223 -> 87524771 (-0.02%) cycles in affected programs: 5435520 -> 5422068 (-0.25%) helped: 1916 HURT: 946 helped stats (abs) min: 1 max: 1392 x̄: 29.44 x̃: 18 helped stats (rel) min: <.01% max: 34.51% x̄: 3.34% x̃: 1.97% HURT stats (abs) min: 1 max: 633 x̄: 45.41 x̃: 11 HURT stats (rel) min: 0.02% max: 25.95% x̄: 2.41% x̃: 0.62% 95% mean confidence interval for cycles value: -7.34 -2.06 95% mean confidence interval for cycles %-change: -1.62% -1.26% Cycles are helped. LOST: 1 GAINED: 0 Iron Lake total instructions in shared programs: 7888446 -> 7886959 (-0.02%) instructions in affected programs: 331581 -> 330094 (-0.45%) helped: 1160 HURT: 97 helped stats (abs) min: 1 max: 10 x̄: 1.37 x̃: 1 helped stats (rel) min: 0.02% max: 9.68% x̄: 0.93% x̃: 0.43% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.17% max: 4.17% x̄: 0.37% x̃: 0.25% 95% mean confidence interval for instructions value: -1.25 -1.12 95% mean confidence interval for instructions %-change: -0.91% -0.75% Instructions are helped. total cycles in shared programs: 178130766 -> 178116996 (<.01%) cycles in affected programs: 12534564 -> 12520794 (-0.11%) helped: 1856 HURT: 187 helped stats (abs) min: 2 max: 202 x̄: 7.78 x̃: 4 helped stats (rel) min: <.01% max: 6.47% x̄: 0.28% x̃: 0.11% HURT stats (abs) min: 2 max: 26 x̄: 3.55 x̃: 2 HURT stats (rel) min: 0.01% max: 2.14% x̄: 0.08% x̃: 0.02% 95% mean confidence interval for cycles value: -7.41 -6.07 95% mean confidence interval for cycles %-change: -0.28% -0.22% Cycles are helped. GM45 total instructions in shared programs: 4858912 -> 4857887 (-0.02%) instructions in affected programs: 237565 -> 236540 (-0.43%) helped: 867 HURT: 57 helped stats (abs) min: 1 max: 10 x̄: 1.25 x̃: 1 helped stats (rel) min: 0.02% max: 9.38% x̄: 0.87% x̃: 0.43% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.16% max: 3.85% x̄: 0.34% x̃: 0.22% 95% mean confidence interval for instructions value: -1.18 -1.04 95% mean confidence interval for instructions %-change: -0.88% -0.71% Instructions are helped. total cycles in shared programs: 122189118 -> 122180816 (<.01%) cycles in affected programs: 8776418 -> 8768116 (-0.09%) helped: 1213 HURT: 166 helped stats (abs) min: 2 max: 202 x̄: 7.30 x̃: 4 helped stats (rel) min: <.01% max: 6.43% x̄: 0.25% x̃: 0.11% HURT stats (abs) min: 2 max: 26 x̄: 3.35 x̃: 2 HURT stats (rel) min: 0.01% max: 2.14% x̄: 0.06% x̃: 0.02% 95% mean confidence interval for cycles value: -6.78 -5.26 95% mean confidence interval for cycles %-change: -0.24% -0.18% Cycles are helped. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Elie Tournier <[email protected]>
* nir: add lower_all_io_to_temps flagTimothy Arceri2018-01-311-0/+2
| | | | | | | This will be used for freedreno and vc4 which require all inputs and outputs to be copied to temps. Reviewed-by: Marek Olšák <[email protected]>
* nir/st_glsl_to_nir: add param to disable splitting of inputsTimothy Arceri2018-01-312-10/+15
| | | | | | | | | We need this because we will always copy fs outputs to temps and split the arrays, but do not want to do either of these with fs inputs as it is unnessisary and makes handling interpolateAt builtins difficult. Reviewed-by: Marek Olšák <[email protected]>
* nir: partially revert c2acf97fcc9b32eTimothy Arceri2018-01-301-6/+23
| | | | | | | | | | | | | | | | | | c2acf97fcc9b32e changed the use of double_inputs_read to be inconsitent with its previous meaning. Here we re-enable the gather info code that was removed as the modified code from c2acf97fcc9b32e now uses the double_inputs member rather than double_inputs_read. This change allows us to use double_inputs_read with gallium drivers without impacting double_inputs which is used by i965. We also make use of the compiler option vs_inputs_dual_locations to allow for the difference in behaviour between drivers that handle vs inputs as taking up two locations for doubles, versus those that treat them as taking a single location. Reviewed-by: Karol Herbst <[email protected]>
* nir: add vs_inputs_dual_locations compiler optionTimothy Arceri2018-01-301-0/+6
| | | | | | | | | | | | | Allows nir drivers to either use a single or dual locations for vs double inputs. i965 uses dual locations for both OpenGL and Vulkan drivers, for now gallium OpenGL drivers only use a single location. The following patch will also make use of this option when calling nir_shader_gather_info(). Reviewed-by: Karol Herbst <[email protected]>
* compiler: tidy up double_inputs_read usesTimothy Arceri2018-01-301-2/+6
| | | | | | | | | | | | | First we move double_inputs_read into a vs struct in the union, double_inputs_read is only used for vs inputs so this will save space and also allows us to add a new double_inputs field. We add the new field because c2acf97fcc9b changed the behaviour of double_inputs_read, and while it's no longer used to track actual reads in i965 we do still want to track this for gallium drivers. Reviewed-by: Marek Olšák <[email protected]>
* nir: mark unused space in packed_tex_dataTapani Pälli2018-01-291-0/+1
| | | | | | | | | | | | | | | | | | | | | | This change cleans following scary warnings in valgrind output when disk cache is being written: ==6532== Uninitialised byte(s) found during client check request ==6532== at 0x14423FAD: blob_write_bytes (blob.c:152) ==6532== by 0x144240FB: blob_write_uint32 (blob.c:194) ==6532== by 0x144001A5: write_tex (nir_serialize.c:613) and later (loads of): ==6532== Use of uninitialised value of size 8 ==6532== at 0x62FCD9E: crc32_z (in /usr/lib64/libz.so.1.2.11) ==6532== by 0x13F65014: util_hash_crc32 (crc32.c:127) ==6532== by 0x13F5DABA: cache_put (disk_cache.c:947) Signed-off-by: Tapani Pälli <[email protected]> Cc: [email protected] Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* nir: add a 'const' qualifier to nir_ssa_def_components_read()Samuel Pitoiset2018-01-122-2/+2
| | | | | | | | To avoid compilation warnings and because this helper shouldn't update anything. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* meson: Use dependencies for nirDylan Baker2018-01-111-3/+15
| | | | | | | | | | | | | | | | | This creates two new internal dependencies, idep_nir_headers and idep_nir. The former encapsulates the generation of nir_opcodes.h and nir_builder_opcodes.h and adding src/compiler/nir as an include path. This ensures that any target that needs nir headers will have the includes and that the generated headers will be generated before the target is build. The second, idep_nir, includes the first and additionally links to libnir. This is intended to make it easier to avoid race conditions in the build when using nir, since the number of consumers for libnir and it's headers are quite high. Acked-by: Eric Engestrom <[email protected]> Signed-off-by: Dylan Baker <[email protected]>
* meson: Use consistent style for testsDylan Baker2018-01-111-9/+10
| | | | | | | Don't use intermediate variables, use consistent whitespace. Acked-by: Eric Engestrom <[email protected]> Signed-off-by: Dylan Baker <[email protected]>
* nir: Silence unused parameter warningsIan Romanick2018-01-101-2/+2
| | | | | | | | | | | | | | | | In file included from src/compiler/nir/nir_opt_algebraic.c:4:0: src/compiler/nir/nir_search_helpers.h: In function ‘is_not_const’: src/compiler/nir/nir_search_helpers.h:118:59: warning: unused parameter ‘num_components’ [-Wunused-parameter] is_not_const(nir_alu_instr *instr, unsigned src, unsigned num_components, ^~~~~~~~~~~~~~ src/compiler/nir/nir_search_helpers.h:119:29: warning: unused parameter ‘swizzle ’ [-Wunused-parameter] const uint8_t *swizzle) ^~~~~~~ Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Alejandro Piñeiro <[email protected]>
* nir: add missing local_group_size intrinsicRob Clark2017-12-302-0/+5
| | | | | | | | | For GL_ARB_compute_variable_group_size Reported-by: Karol Herbst <[email protected]> Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir/linking: always set the used_across_stages/outputs_read bitsDave Airlie2017-12-191-6/+7
| | | | | | | | | | | | If we don't remap and output this code would trample the outputs read bits. This fixes a regression in dEQP-VK.tessellation.shader_input_output.barrier Fixes: 1c9c42d16b4c (nir: add varying component packing helpers) Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* nir: Add a new lowering option to lower all txd to txl.Eric Anholt2017-12-142-6/+14
| | | | | | VC5 requires that all txd are lowered in the shader. Reviewed-by: Ian Romanick <[email protected]>
* nir: Fix interaction of GL_CLAMP lowering with texture offsets.Eric Anholt2017-12-141-33/+42
| | | | | | | | | | | We want the clamping of the coordinate to apply after the offset, so we need to do math to lower the offset out of the instruction. Fixes texwrap offset cases for GL_CLAMP with GL_NEAREST on vc5. Note: I moved the get_texture_size() verbatim, so that it was defined before use. Reviewed-by: Ian Romanick <[email protected]>
* nir: fix shift for uint64_tTimothy Arceri2017-12-131-2/+2
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* spirv: Add a prepass to set types on vtn_valuesJason Ekstrand2017-12-111-1/+10
| | | | | | | | This autogenerated pass will automatically find and set the type field on all vtn_values. This way we always have the type and can use it for validation and other checks. Reviewed-by: Ian Romanick <[email protected]>
* nir/opcodes: Fix constant-folding of bitfield_insertJames Legg2017-12-071-2/+2
| | | | | | | Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104119 CC: <[email protected]> CC: Samuel Pitoiset <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* nir: Handle fp16 rounding modes at nir_type_conversion_opJose Maria Casanova Crespo2017-12-063-4/+25
| | | | | | | | | | | | | | | | | nir_type_conversion enables new operations to handle rounding modes to convert to fp16 values. Two new opcodes are enabled nir_op_f2f16_rtne and nir_op_f2f16_rtz. The undefined behaviour doesn't has any effect and uses the original nir_op_f2f16 operation. v2: Indentation fixed (Jason Ekstrand) v3: Use explicit case for undefined rounding and assert if rounding mode is used for non 16-bit float conversions (Jason Ekstrand) Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Populate conversion opcodes to 16-bit typesEduardo Lima Mitev2017-12-061-1/+1
| | | | | | | | | | | | | | | This will include the following NIR ALU opcodes: * nir_op_i2i16 * nir_op_i2f16 * nir_op_u2u16 * nir_op_u2f16 * nir_op_f2i16 * nir_op_f2u16 * nir_op_f2f16 v2: Remove "from" 16-bit in commit subject (Topi Pohjolainen) Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Add rounding modes enumJose Maria Casanova Crespo2017-12-061-0/+10
| | | | | | | v2: Added comments describing each of the rounding modes. (Jason Ekstrand) Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Add support for 16-bit types (half float, int16 and uint16)Eduardo Lima Mitev2017-12-063-0/+21
| | | | | | | | | v2: Renamed glsl_half_float_type() to glsl_float16_t_type(). (Jason Ekstrand) Signed-off-by: Jose Maria Casanova Crespo <[email protected]> Signed-off-by: Eduardo Lima <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Add a vulkan_resource_reindex intrinsicJason Ekstrand2017-12-051-1/+8
| | | | | | | | This is required for being able to handle OpPtrAccessChain in SPIR-V where the base type of the incoming pointer requires us to add to the block index instead of the byte offset. Reviewed-by: Kristian H. Kristensen <[email protected]>
* nir: allow builin arrays to be loweredTimothy Arceri2017-12-041-7/+10
| | | | | | | Galliums nir drivers expect this to be done. Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* nir: add array lowering function that assumes there are no indirectsTimothy Arceri2017-12-042-1/+44
| | | | | | | | | | The gallium glsl->nir pass currently lowers away all indirects on both inputs and outputs. This fuction allows us to lower vs inputs and fs outputs and also lower things one stage at a time as we don't need to worry about indirects on the other side of the shaders interface. Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* nir: fix support for scalar arrays in nir_lower_io_types()Timothy Arceri2017-12-041-7/+3
| | | | | | | This was just recreating the same vector type we alreay had and hitting an assert for scalars. Reviewed-by: Nicolai Hähnle <[email protected]>
* nir: add varying component packing helpersTimothy Arceri2017-12-042-0/+332
| | | | | | | | | | | | v2: update shader info input/output masks when pack components v3: make sure interpolation loc matches, this is required for the radeonsi NIR backend. v4: 33dca36f4f28 fixed nir_gather_info to update outputs_read correct, make sure we also adjust this correctly when packing components. Reviewed-by: Bas Nieuwenhuizen <[email protected]> (v1) Reviewed-by: Nicolai Hähnle <[email protected]> (v3)
* nir: add varying array splitting passTimothy Arceri2017-12-043-0/+385
| | | | | | | | | | | | | V2: - fix matrix support, non-array matrices were being skipped in v1 v3: - handle lowering of tcs output loads correctly - correctly mark indirect locations for either in or out not both when processing a stage. - use nir_src_copy() when lowering stores. Reviewed-by: Nicolai Hähnle <[email protected]>
* compiler: fix typoEric Engestrom2017-11-281-1/+1
| | | | | Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* compiler: use NDEBUG to guard assertsEric Engestrom2017-11-283-6/+6
| | | | | | | | | nir_validate.c's #endif already had the correct NDEBUG comment Fixes: dcb1acdea00a8f2c29777 "nir/validate: Only build in debug mode" Fixes: 9ff71b649b4b3808a9e17 "i965/nir: Validate that NIR passes call nir_metadata_preserve()" Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* nir: fill outputs_read field and add patch outputs read (v2)Dave Airlie2017-11-271-12/+28
| | | | | | | | This is to be used for TCS optimisations on radv. v2: don't set written on reads (nha) Reviewed-by: Timothy Arceri <[email protected]>
* nir: allow texture offsets with cube mapsIlia Mirkin2017-11-251-2/+13
| | | | | | | | | | GL doesn't have this, but some hardware supports it. This is convenient for lowering tg4 to plain texture calls, which is necessary on Adreno A4xx hardware. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* nir/gather_info: recognize load_patch_vertices_in as a system valueIago Toral Quiroga2017-11-221-0/+1
| | | | | | | | This intrinsic is produced to load SYSTEM_VALUE_VERTICES_IN, which is generated to load gl_PatchVerticesIn in the SPIR-V path for both Vulkan and OpenGL. Reviewed-by: Marek Olšák <[email protected]>
* nir/spirv: tg4 requires a samplerAlex Smith2017-11-131-1/+0
| | | | | | | | | | Gather operations in both GLSL and SPIR-V require a sampler. Fixes gathers returning garbage when using separate texture/samplers (on AMD, was using an invalid sampler descriptor). Signed-off-by: Alex Smith <[email protected]> Cc: "17.2 17.3" <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir: add streams to nir dataTimothy Arceri2017-11-121-0/+8
| | | | | | This will be used by gallium drivers. Reviewed-by: Marek Olšák <[email protected]>
* nir: handle get_buffer_size in nir_lower_atomics_to_ssboRob Clark2017-11-101-0/+1
| | | | | | | | | Overlooked initially, be we need to remap the SSBO index for this as well. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Don't print swizzles when there are more than 4 componentsMatt Turner2017-11-081-1/+1
| | | | | | | | | ... as can happen with various types like mat4, or else we'll smash the stack writing past the end of components_local[]. Fixes: 5a0d3e1129b7 ("nir: Print the components referenced for split or packed shader in/outs.") Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Validate base types on array dereferencesJason Ekstrand2017-11-071-2/+16
| | | | | | | | | We were already validating that the parent type goes along with the child type but we weren't actually validating that the parent type is reasonable. This fixes that. Acked-by: Lionel Landwerlin <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* nir,intel/compiler: Use a fixed subgroup sizeJason Ekstrand2017-11-072-24/+13
| | | | | | | | | | | | | | | | The GL_ARB_shader_ballot spec says that gl_SubGroupSizeARB is declared as a uniform. This means that it cannot change across an invocation such as a draw call or a compute dispatch. For compute shaders, we're ok because we only ever use one dispatch size. For fragment, however, the hardware dynamically chooses between SIMD8 and SIMD16 which violates the spec. Instead, let's just pick a subgroup size based on the shader stage. The fixed size we choose for compute shaders is a bit higher than strictly needed but there's no real harm in that. The advantage is that, if they do anything interesting with the value, NIR will see it as an immediate and can optimize better. Acked-by: Lionel Landwerlin <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* nir/lower_subgroups: Lower ballot intrinsics to the specified bit sizeJason Ekstrand2017-11-073-30/+83
| | | | | | | | | | | | | | Ballot intrinsics return a bitfield of subgroups. In GLSL and some SPIR-V extensions, they return a uint64_t. In SPV_KHR_shader_ballot, they return a uvec4. Also, some back-ends would rather pass around 32-bit values because it's easier than messing with 64-bit all the time. To solve this mess, we make nir_lower_subgroups take a new parameter called ballot_bit_size and it lowers whichever thing it gets in from the source language (uint64_t or uvec4) to a scalar with the specified number of bits. This replaces a chunk of the old lowering code. Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* nir/builder: Add a nir_imm_intN_t helperJason Ekstrand2017-11-071-0/+12
| | | | | | | This lets you easily build integer immediates of arbitrary bit size. Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* nir/lower_system_values: Lower SUBGROUP_*_MASK based on typeJason Ekstrand2017-11-071-2/+3
| | | | | | | | | The SUBGROUP_*_MASK system values are uint64_t when coming in from GLSL but uvec4 when coming in from SPIR-V. Lowering based on type allows us to nicely handle both. Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* nir: Make ballot intrinsics variable-sizeJason Ekstrand2017-11-072-6/+7
| | | | | | | | This way they can return either a uvec4 or a uint64_t. At the moment, this is a no-op since we still always return a uint64_t. Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* nir: Add a ssa_dest_init_for_type helperJason Ekstrand2017-11-071-0/+9
| | | | | | | This would be useful a number of places Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* nir: Add a new subgroups lowering passJason Ekstrand2017-11-075-181/+200
| | | | | | | | | | | | This commit pulls nir_lower_read_invocations_to_scalar along with most of the guts of nir_opt_intrinsics (which mostly does subgroup lowering) into a new nir_lower_subgroups pass. There are various other bits of subgroup lowering that we're going to want to do so it makes a bit more sense to keep it all together in one pass. We also move it in i965 to happen after nir_lower_system_values to ensure that because we want to handle the subgroup mask system value intrinsics here. Reviewed-by: Iago Toral Quiroga <[email protected]>
* intel/cs: Push subgroup ID instead of base thread IDJason Ekstrand2017-11-071-3/+1
| | | | | | | | | | We're going to want subgroup ID for SPIR-V subgroups eventually anyway. We really only want to push one and calculate the other from it. It makes a bit more sense to push the subgroup ID because it's simpler to calculate and because it's a real API thing. The only advantage to pushing the base thread ID is to avoid a single SHL in the shader. Reviewed-by: Iago Toral Quiroga <[email protected]>