aboutsummaryrefslogtreecommitdiffstats
path: root/src/compiler
Commit message (Collapse)AuthorAgeFilesLines
* glsl/linker: fix bug when checking precision qualifierSamuel Iglesias Gonsálvez2018-02-281-8/+3
| | | | | | | | | | | | | | | According to GLSL ES 3.2 spec, see table in 9.2.1 "Linked Shaders" section, the precision qualifier should match for uniform variables. This also applies to previous GLSL ES 3.x specs. This 'if' checks the condition for uniform variables, while for UBOs it is checked in link_interface_blocks.cpp. Fixes: b50b82b8a553 ("glsl/es31: precision qualifier doesn't need to match in shader interface block members") Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
* nir: fix interger divide by zero crash during constant foldingTimothy Arceri2018-02-281-2/+2
| | | | | | | | | | | | From the GLSL 4.60 spec Section 5.9 (Expressions): "Dividing by zero does not cause an exception but does result in an unspecified value." Fixes: 89285e4d47a6 "nir: add new constant folding infrastructure" Reviewed-by: Jason Ekstrand <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105271
* nir: add lower_ldexp to nir compiler optionsTimothy Arceri2018-02-282-2/+4
| | | | Reviewed-by: Marek Olšák <[email protected]>
* util/bitset: Make C++ wrapper trivially constructible.Francisco Jerez2018-02-272-3/+0
| | | | | | | | | | | | | | | In order to fix a build failure on compilers not implementing unrestricted unions, which is a C++11 feature. v2: Provide signed integer comparison and assignment operators instead of BITSET_WORD ones to avoid spurious ambiguity warnings on comparisons with a signed integer literal. Fixes: ba79a90fb52e1e81fb "glsl: Switch ast_type_qualifier to a 128-bit bitset." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105238 Tested-by: Roland Scheidegger <[email protected]> Tested-By: George Kyriazis <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* glsl: Silence warnings when reading from a framebuffer fetch output.Francisco Jerez2018-02-241-0/+1
| | | | | | | Framebuffer fetch outputs are implicitly initialized upon entry to the fragment shader. Reviewed-by: Plamena Manolova <[email protected]>
* glsl: Specify framebuffer fetch coherency mode in ↵Francisco Jerez2018-02-242-2/+3
| | | | | | | | | | | | | | | | | | | | | lower_blend_equation_advanced(). This requires passing an extra argument to the lowering pass because the KHR_blend_equation_advanced specification doesn't seem to define any mechanism for the implementation to determine at compile-time whether coherent blending can ever be used (not even an "#extension KHR_blend_equation_advanced_coherent" directive seems to be required in the shader source AFAICT). In the long run we'll probably want to do state-dependent recompiles based on the value of ctx->Color.BlendCoherent, but right now there would be no benefit from that because the only driver that supports coherent framebuffer fetch is i965 on SKL+ hardware, which are unable to support the non-coherent path for the moment because of texture layout issues, so framebuffer fetch coherency is always enabled for them. Reviewed-by: Plamena Manolova <[email protected]>
* glsl: Add support for the framebuffer fetch layout(noncoherent) qualifier.Francisco Jerez2018-02-245-2/+61
| | | | | | | | | | This allows the application to request framebuffer fetch coherency with per-fragment output granularity. Coherent framebuffer fetch outputs (which is the default if no qualifier is present for compatibility with older versions of the EXT_shader_framebuffer_fetch extension) will have ir_variable_data::memory_coherent set to true. Reviewed-by: Plamena Manolova <[email protected]>
* glsl: Allow layout token for EXT_shader_framebuffer_fetch_non_coherent.Francisco Jerez2018-02-241-1/+2
| | | | | | | EXT_shader_framebuffer_fetch_non_coherent requires layout qualifiers even on GL(ES) 2. Reviewed-by: Plamena Manolova <[email protected]>
* glsl: Initialize ir_variable_data::fb_fetch_output earlier for GL(ES) 2.Francisco Jerez2018-02-241-2/+7
| | | | | | | | | | At the same point where it is initialized on GL(ES) 3.0+ so we can implement some common layout qualifier handling in a future commit. Until now the fb_fetch_output flag would be inherited from the original implicit gl_LastFragData declaration at a later point in the AST to GLSL IR translation. Reviewed-by: Plamena Manolova <[email protected]>
* glsl: Replace MESA_shader_framebuffer_fetch extension flags with EXT ones.Francisco Jerez2018-02-242-6/+4
| | | | Reviewed-by: Plamena Manolova <[email protected]>
* glsl: Switch ast_type_qualifier to a 128-bit bitset.Francisco Jerez2018-02-243-4/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | This should end the drought of bits in the ast_type_qualifier object. The bitset_t type works pretty much as a drop-in replacement for the current uint64_t bitset. The only catch is that the bitset_t type as defined in the previous commit doesn't have a trivial constructor (because it has a user-defined constructor), so it cannot be used as union member without providing a user-defined constructor for the union (which causes it in turn to be non-trivially constructible). This annoyance could be easily addressed in C++11 by declaring the default constructor of bitset_t to be the implicitly defined one -- IMO one more reason to drop support for GCC 4.2-4.3. The other minor change was required because glsl_parser_extras.cpp was hard-coding the type of bitset temporaries as uint64_t, which (unlike would have been the case if the uint64_t had been replaced with e.g. an __int128) would otherwise have caused a build failure, because the boolean conversion operator of bitset_t is marked explicit (if C++11 is available), so the bitset won't be silently truncated down to 1 bit in order to use it to initialize the uint64_t temporaries (yikes). Reviewed-by: Plamena Manolova <[email protected]>
* mesa: implement ARB_compatibilityMarek Olšák2018-02-235-3/+6
| | | | | Tested-by: Dieter Nützel <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* nir: lower fexp2(fmul(flog2(a), 2)) to fmul(a, a)Samuel Pitoiset2018-02-221-0/+2
| | | | | | | | | Similar for the 4 case. Suggested by Bas. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir: add is_used_once for fmul(fexp2(a), fexp2(b)) to fexp2(fadd(a, b))Samuel Pitoiset2018-02-221-1/+1
| | | | | | | | Otherwise the code size increases because the original fexp2() instructions can't be deleted. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* spirv: apply memory qualifiers to imagesSamuel Pitoiset2018-02-221-3/+9
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* glsl: Parse 'layout' as a token with advanced blending or bindlessKenneth Graunke2018-02-211-0/+2
| | | | | | | | | | | | | Both KHR_blend_equation_advanced and ARB_bindless_texture provide layout qualifiers, and are exposed in compatibility contexts. We need to parse the layout qualifier as a token in order for those to work, but forgot to extend this check. ARB_shader_image_load_store would need a similar treatment, but we don't expose that in legacy OpenGL contexts. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105161 Reviewed-by: Iago Toral Quiroga <[email protected]>
* nir: remove old assertTimothy Arceri2018-02-221-1/+0
| | | | | | | | | | | | This was originally intended to make sure the remap location was not -1. However the code has changed alot since then, the location is now never set to -1 and we also handle components meaning this old assert has been doing comparisions with the pointer to the array of component data. Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105183
* glsl/tests: Fix a compiler warning about signed/unsigned loop comparison.Eric Anholt2018-02-201-1/+1
| | | | | Fixes: d32956935edf ("glsl: Walk a list of ir_dereference_array to mark array elements as accessed") Reviewed-by: Ian Romanick <[email protected]>
* glsl: Silence warnings in the uniform initializer test about 16-bit typesEric Anholt2018-02-201-0/+9
| | | | | | | | | They should probably get unit tests implemented, but this cleans up a bunch of warnings in my build for now. Fixes: 59f458cd8703 ("glsl: Add 16-bit types") Cc: Eduardo Lima Mitev <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* glsl/nir: add pixel_center_integer to shader infoTimothy Arceri2018-02-192-0/+7
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: fix sizeof(pointer) bugEric Engestrom2018-02-161-1/+1
| | | | | | | | | | Doesn't really change anything to the test though ¯\_(ツ)_/¯ CID: 1429511 Fixes: e8495646afb06a9dd7786 "glsl/tests: changes to test_disk_cache_create test" Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
* mesa: move STATE_LENGTH to shader_enums.h and use it everywhereMarek Olšák2018-02-134-7/+13
| | | | Reviewed-by: Brian Paul <[email protected]>
* mesa,glsl,nir: reduce gl_state_index size to 2 bytesMarek Olšák2018-02-137-12/+15
| | | | | | | | | Let's use the new gl_state_index16 type everywhere and remove the typecasts. This helps reduce the size of gl_program_parameter. Reviewed-by: Brian Paul <[email protected]>
* mesa: reduce the size of gl_programMarek Olšák2018-02-131-5/+4
| | | | | | gl_program: 1456 -> 976 bytes Reviewed-by: Brian Paul <[email protected]>
* glsl/tests: Fix strict aliasing warning about int64/double.Eric Anholt2018-02-121-3/+19
| | | | | Fixes: 4bf986274728 ("glsl/tests: Add UINT64 and INT64 types") Reviewed-by: Rhys Kidd <[email protected]>
* glsl/linker: remove ubo explicit binding handlingAlejandro Piñeiro2018-02-091-56/+2
| | | | | | | | | | | | This is already handled at link_uniform_blocks, specifically at process_block_array_leaf. Additionally, this code was not handling correctly arrays of arrays. When creating the name of the block to set the binding, it only took into account the first level, so any attempt to set a explicit binding on a array of array ubo would trigger an assertion. Reviewed-by: Timothy Arceri <[email protected]>
* meson: Add build option for toolsScott D Phillips2018-02-082-3/+6
| | | | | | | | | | | | | | | Add a build option to control building some of the misc tools we have. Also set the executables to install, presumably you want that if you're asking for the build. v2: set 'install:' to the with_tools value, not true (Jordan) handle 'all' in a the comma list (Dylan) Add freedreno's tools (Dylan) Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Dylan Baker <[email protected]>
* glsl/tests: changes to test_disk_cache_create testTapani Pälli2018-02-071-6/+22
| | | | | | | | | | | | | | | Next patch will allow disk_cache instance to be created without path set for it, modify some test cases that assume disk_cache creation to fail with invalid path. Creation should succeed but simple put/get test fail. v2: leave tests as is but check that both cache struct exists and try simple put/get that should fail with invalid path set (Emil) Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Jordan Justen <[email protected]> (v1) Reviewed-by: Emil Velikov <[email protected]>
* glsl/tests: move utility functions in cache_testTapani Pälli2018-02-071-35/+35
| | | | | | | | | Patch moves functions higher so that we can utilize them from test_disk_cache_create which is modified by next patch. Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* radv: run nir_opt_shrink_loadSamuel Pitoiset2018-02-061-0/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | LLVM can't shrink loads. Polaris10: Totals from affected shaders: SGPRS: 62528 -> 59955 (-4.11 %) VGPRS: 44708 -> 44616 (-0.21 %) Spilled SGPRs: 16 -> 8 (-50.00 %) Code Size: 1355504 -> 1355172 (-0.02 %) bytes Max Waves: 11710 -> 11670 (-0.34 %) Vega10: Totals from affected shaders: SGPRS: 51448 -> 50371 (-2.09 %) VGPRS: 39140 -> 39048 (-0.24 %) Spilled SGPRs: 16 -> 16 (0.00 %) Code Size: 1307188 -> 1304296 (-0.22 %) bytes Max Waves: 11312 -> 11292 (-0.18 %) This reduces SGPRs spilling in MadMax, and it also reduces number of SGPRs in DOW3 and F12017. The number of waves slightly decreases in F1 but I don't see any performance changes after benchmarking it. Talos and Serious Sam are not affected because they don't use any push constants. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir: add nir_opt_shrink_load passSamuel Pitoiset2018-02-064-0/+71
| | | | | | | | | This is a very simple pass that just shrinks load_push_constant intrinsics when some components are unused. For now, it can just shrink vec4 to vec3, vec3 to vec2 and so on. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* spirv: split constant initializers on in/out structsIago Toral Quiroga2018-02-061-0/+8
| | | | | | | | | The SPIR-V parser splits in/out struct variables and creates a separate variable for each first-level member of the struct. When the struct variable has an initializer this means that we also need to split the initializer. Reviewed-by: Jason Ekstrand <[email protected]>
* mesa: add OES_EGL_image_external_essl3 supportIlia Mirkin2018-02-065-2/+23
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
* glsl/linker: check same name is not used in block and outsideJuan A. Suarez Romero2018-02-051-23/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | According with OpenGL GLSL 3.20 spec, section 4.3.9: "It is a link-time error if any particular shader interface contains: - two different blocks, each having no instance name, and each having a member of the same name, or - a variable outside a block, and a block with no instance name, where the variable has the same name as a member in the block." This fixes a previous commit 9b894c8 ("glsl/linker: link-error using the same name in unnamed block and outside") that covered this case, but did not take in account that precision qualifiers are ignored when comparing blocks with no instance name. With this commit, the original tests KHR-GL*.shaders.uniform_block.common.name_matching keep fixed, and also dEQP-GLES31.functional.shaders.linkage.uniform.block.differing_precision regression is fixed, which was broken by previous commit. v2: use helper varibles (Matteo Bruni) Fixes: 9b894c8 ("glsl/linker: link-error using the same name in unnamed block and outside") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104668 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104777 CC: Mark Janes <[email protected]> CC: "18.0" <[email protected]> Tested-by: Matteo Bruni <[email protected]> Reviewed-by: Tapani Pälli <[email protected]> Signed-off-by: Juan A. Suarez Romero <[email protected]>
* mesa: Put materials at the end of the generic block.Mathias Fröhlich2018-02-011-1/+6
| | | | | | | | | | | | | | | | | | | The materials are now moved to the end of the generic attributes block to the range 4-15. Before, the way the position and generic 0 attribute is handled was dependent on the presence and kind of the currently attached vertex program. With this change the way the position attribute and the generic 0 attribute is treated only depends on the enabled flag of those two arrays. This will later help to untangle the update dependencies between enabled arrays and shader inputs. v2: s,VERT_ATTRIB_MAT_OFFSET,VERT_ATTRIB_MAT0,g Signed-off-by: Mathias Fröhlich <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* mesa: Use defines for the aliased material array attributes.Mathias Fröhlich2018-02-011-0/+10
| | | | | | | | | | Instead of just assuming that the material attributes just overlap with the generic attributes 0-12, give them symbolic defines so that we can easier move them to an other range. Signed-off-by: Mathias Fröhlich <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* nir: Distribute binary operations with constants into bcselIan Romanick2018-01-301-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This was specifically designed to simplify 1+mix(0, a-1, condition) to mix(1, a, condition) by pushing the 1+ inside. Skylake, Broadwell, and Haswell had similar results. Skylake shown. total instructions in shared programs: 14521753 -> 14521716 (<.01%) instructions in affected programs: 10619 -> 10582 (-0.35%) helped: 51 HURT: 14 helped stats (abs) min: 1 max: 12 x̄: 1.43 x̃: 1 helped stats (rel) min: 0.20% max: 3.58% x̄: 1.01% x̃: 0.95% HURT stats (abs) min: 1 max: 11 x̄: 2.57 x̃: 1 HURT stats (rel) min: 0.22% max: 1.75% x̄: 1.20% x̃: 1.32% 95% mean confidence interval for instructions value: -1.31 0.17 95% mean confidence interval for instructions %-change: -0.80% -0.27% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 533000205 -> 533003533 (<.01%) cycles in affected programs: 110610 -> 113938 (3.01%) helped: 43 HURT: 28 helped stats (abs) min: 6 max: 440 x̄: 27.12 x̃: 16 helped stats (rel) min: 0.39% max: 4.84% x̄: 1.60% x̃: 1.67% HURT stats (abs) min: 2 max: 3066 x̄: 160.50 x̃: 14 HURT stats (rel) min: 0.08% max: 77.78% x̄: 5.16% x̃: 0.62% 95% mean confidence interval for cycles value: -43.81 137.56 95% mean confidence interval for cycles %-change: -1.47% 3.60% Inconclusive result (value mean confidence interval includes 0). Ivy Bridge total instructions in shared programs: 10018840 -> 10018713 (<.01%) instructions in affected programs: 9431 -> 9304 (-1.35%) helped: 51 HURT: 3 helped stats (abs) min: 1 max: 80 x̄: 2.76 x̃: 1 helped stats (rel) min: 0.20% max: 16.43% x̄: 1.16% x̃: 0.81% HURT stats (abs) min: 1 max: 12 x̄: 4.67 x̃: 1 HURT stats (rel) min: 0.22% max: 1.33% x̄: 0.59% x̃: 0.22% 95% mean confidence interval for instructions value: -5.36 0.66 95% mean confidence interval for instructions %-change: -1.66% -0.46% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 87571944 -> 87572785 (<.01%) cycles in affected programs: 117234 -> 118075 (0.72%) helped: 42 HURT: 23 helped stats (abs) min: 2 max: 114 x̄: 51.90 x̃: 30 helped stats (rel) min: 0.11% max: 11.01% x̄: 4.45% x̃: 2.74% HURT stats (abs) min: 1 max: 2341 x̄: 131.35 x̃: 10 HURT stats (rel) min: 0.06% max: 37.11% x̄: 2.75% x̃: 0.61% 95% mean confidence interval for cycles value: -61.05 86.93 95% mean confidence interval for cycles %-change: -3.47% -0.33% Inconclusive result (value mean confidence interval includes 0). Sandy Bridge total instructions in shared programs: 10542933 -> 10542844 (<.01%) instructions in affected programs: 11487 -> 11398 (-0.77%) helped: 52 HURT: 3 helped stats (abs) min: 1 max: 40 x̄: 1.96 x̃: 1 helped stats (rel) min: 0.08% max: 8.16% x̄: 0.90% x̃: 0.72% HURT stats (abs) min: 1 max: 11 x̄: 4.33 x̃: 1 HURT stats (rel) min: 0.22% max: 1.22% x̄: 0.55% x̃: 0.22% 95% mean confidence interval for instructions value: -3.17 -0.07 95% mean confidence interval for instructions %-change: -1.13% -0.52% Instructions are helped. total cycles in shared programs: 146098397 -> 146097094 (<.01%) cycles in affected programs: 128140 -> 126837 (-1.02%) helped: 47 HURT: 8 helped stats (abs) min: 2 max: 333 x̄: 29.21 x̃: 18 helped stats (rel) min: 0.13% max: 5.04% x̄: 1.18% x̃: 0.95% HURT stats (abs) min: 1 max: 16 x̄: 8.75 x̃: 9 HURT stats (rel) min: 0.08% max: 0.43% x̄: 0.30% x̃: 0.34% 95% mean confidence interval for cycles value: -37.49 -9.90 95% mean confidence interval for cycles %-change: -1.22% -0.71% Cycles are helped. Iron Lake total instructions in shared programs: 7886711 -> 7886509 (<.01%) instructions in affected programs: 10425 -> 10223 (-1.94%) helped: 50 HURT: 2 helped stats (abs) min: 1 max: 78 x̄: 4.08 x̃: 1 helped stats (rel) min: 0.34% max: 15.38% x̄: 1.12% x̃: 0.54% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.86% max: 0.91% x̄: 0.89% x̃: 0.89% 95% mean confidence interval for instructions value: -8.05 0.28 95% mean confidence interval for instructions %-change: -1.83% -0.26% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 178115324 -> 178114612 (<.01%) cycles in affected programs: 765726 -> 765014 (-0.09%) helped: 39 HURT: 1 helped stats (abs) min: 2 max: 276 x̄: 18.31 x̃: 8 helped stats (rel) min: <.01% max: 8.47% x̄: 0.39% x̃: 0.04% HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 0.03% max: 0.03% x̄: 0.03% x̃: 0.03% 95% mean confidence interval for cycles value: -32.07 -3.53 95% mean confidence interval for cycles %-change: -0.86% 0.10% Inconclusive result (%-change mean confidence interval includes 0). GM45 total instructions in shared programs: 4857762 -> 4857661 (<.01%) instructions in affected programs: 5523 -> 5422 (-1.83%) helped: 25 HURT: 1 helped stats (abs) min: 1 max: 78 x̄: 4.08 x̃: 1 helped stats (rel) min: 0.34% max: 13.61% x̄: 1.04% x̃: 0.52% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.86% max: 0.86% x̄: 0.86% x̃: 0.86% 95% mean confidence interval for instructions value: -9.99 2.22 95% mean confidence interval for instructions %-change: -2.01% 0.08% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 122179674 -> 122179194 (<.01%) cycles in affected programs: 530162 -> 529682 (-0.09%) helped: 22 HURT: 1 helped stats (abs) min: 2 max: 292 x̄: 21.91 x̃: 7 helped stats (rel) min: <.01% max: 8.65% x̄: 0.44% x̃: 0.04% HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 0.03% max: 0.03% x̄: 0.03% x̃: 0.03% 95% mean confidence interval for cycles value: -46.56 4.82 95% mean confidence interval for cycles %-change: -1.20% 0.36% Inconclusive result (value mean confidence interval includes 0). Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Elie Tournier <[email protected]>
* nir: Rearrange logic op-compounded integer comparesIan Romanick2018-01-301-0/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Skylake and Broadwell had similar results. Skylake shown. total instructions in shared programs: 14521769 -> 14521753 (<.01%) instructions in affected programs: 8782 -> 8766 (-0.18%) helped: 16 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.12% max: 0.40% x̄: 0.20% x̃: 0.18% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -0.23% -0.16% Instructions are helped. total cycles in shared programs: 533000376 -> 533000205 (<.01%) cycles in affected programs: 447035 -> 446864 (-0.04%) helped: 9 HURT: 9 helped stats (abs) min: 2 max: 40 x̄: 35.78 x̃: 40 helped stats (rel) min: 0.02% max: 0.18% x̄: 0.10% x̃: 0.09% HURT stats (abs) min: 1 max: 52 x̄: 16.78 x̃: 10 HURT stats (rel) min: <.01% max: 1.11% x̄: 0.29% x̃: 0.12% 95% mean confidence interval for cycles value: -25.07 6.07 95% mean confidence interval for cycles %-change: -0.08% 0.27% Inconclusive result (value mean confidence interval includes 0). No changes on GM45, Iron Lake, Sandy Bridge, Ivy Bridge, or Haswell. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Elie Tournier <[email protected]>
* nir: Rearrange and-compounded float comparesIan Romanick2018-01-301-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If both comparisons are used as sources for instructions other than the iand, this transformation is detrimental. If the non-identical value in both compares is constant, the fmin or fmax will be constant-folded away, so the transformation is always a win. It is interesting to me that on Iron Lake only 81 shaders have instruction counts changed, but 726 shaders have cycle counts changed. shader-db results: Skylake total instructions in shared programs: 14525728 -> 14521017 (-0.03%) instructions in affected programs: 1164726 -> 1160015 (-0.40%) helped: 1692 HURT: 5 helped stats (abs) min: 1 max: 637 x̄: 2.79 x̃: 2 helped stats (rel) min: 0.07% max: 16.36% x̄: 0.81% x̃: 0.33% HURT stats (abs) min: 1 max: 12 x̄: 3.20 x̃: 1 HURT stats (rel) min: 0.38% max: 2.86% x̄: 2.36% x̃: 2.86% 95% mean confidence interval for instructions value: -3.52 -2.03 95% mean confidence interval for instructions %-change: -0.86% -0.74% Instructions are helped. total cycles in shared programs: 533115449 -> 532991404 (-0.02%) cycles in affected programs: 119401803 -> 119277758 (-0.10%) helped: 1145 HURT: 467 helped stats (abs) min: 1 max: 34644 x̄: 145.92 x̃: 18 helped stats (rel) min: <.01% max: 45.33% x̄: 1.58% x̃: 0.42% HURT stats (abs) min: 1 max: 1590 x̄: 92.15 x̃: 15 HURT stats (rel) min: <.01% max: 13.48% x̄: 1.26% x̃: 0.39% 95% mean confidence interval for cycles value: -122.16 -31.74 95% mean confidence interval for cycles %-change: -0.94% -0.57% Cycles are helped. total spills in shared programs: 9597 -> 9534 (-0.66%) spills in affected programs: 403 -> 340 (-15.63%) helped: 1 HURT: 1 total fills in shared programs: 13904 -> 13790 (-0.82%) fills in affected programs: 1627 -> 1513 (-7.01%) helped: 2 HURT: 1 LOST: 0 GAINED: 2 Broadwell total instructions in shared programs: 14816966 -> 14812590 (-0.03%) instructions in affected programs: 1499885 -> 1495509 (-0.29%) helped: 1672 HURT: 15 helped stats (abs) min: 1 max: 455 x̄: 2.70 x̃: 2 helped stats (rel) min: 0.05% max: 16.36% x̄: 0.81% x̃: 0.33% HURT stats (abs) min: 1 max: 21 x̄: 9.20 x̃: 8 HURT stats (rel) min: 0.08% max: 2.86% x̄: 1.06% x̃: 0.53% 95% mean confidence interval for instructions value: -3.14 -2.05 95% mean confidence interval for instructions %-change: -0.85% -0.73% Instructions are helped. total cycles in shared programs: 559353622 -> 559345595 (<.01%) cycles in affected programs: 139893703 -> 139885676 (<.01%) helped: 921 HURT: 697 helped stats (abs) min: 1 max: 42424 x̄: 143.45 x̃: 18 helped stats (rel) min: <.01% max: 36.23% x̄: 2.02% x̃: 0.87% HURT stats (abs) min: 1 max: 2370 x̄: 178.03 x̃: 38 HURT stats (rel) min: <.01% max: 17.35% x̄: 0.71% x̃: 0.14% 95% mean confidence interval for cycles value: -59.64 49.72 95% mean confidence interval for cycles %-change: -1.02% -0.66% Inconclusive result (value mean confidence interval includes 0). total spills in shared programs: 78902 -> 78861 (-0.05%) spills in affected programs: 2418 -> 2377 (-1.70%) helped: 1 HURT: 11 total fills in shared programs: 83782 -> 83678 (-0.12%) fills in affected programs: 3515 -> 3411 (-2.96%) helped: 2 HURT: 11 LOST: 0 GAINED: 5 Haswell and Ivy Bridge had similar results. Haswell shown. total instructions in shared programs: 9033898 -> 9032010 (-0.02%) instructions in affected programs: 308064 -> 306176 (-0.61%) helped: 921 HURT: 4 helped stats (abs) min: 1 max: 20 x̄: 2.05 x̃: 1 helped stats (rel) min: 0.17% max: 17.54% x̄: 0.80% x̃: 0.35% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 3.23% max: 3.23% x̄: 3.23% x̃: 3.23% 95% mean confidence interval for instructions value: -2.21 -1.87 95% mean confidence interval for instructions %-change: -0.88% -0.68% Instructions are helped. total cycles in shared programs: 84628949 -> 84620520 (<.01%) cycles in affected programs: 2164913 -> 2156484 (-0.39%) helped: 518 HURT: 359 helped stats (abs) min: 1 max: 440 x̄: 41.52 x̃: 20 helped stats (rel) min: <.01% max: 17.17% x̄: 1.95% x̃: 1.01% HURT stats (abs) min: 1 max: 586 x̄: 36.43 x̃: 8 HURT stats (rel) min: 0.04% max: 18.65% x̄: 1.47% x̃: 0.40% 95% mean confidence interval for cycles value: -15.17 -4.05 95% mean confidence interval for cycles %-change: -0.77% -0.32% Cycles are helped. LOST: 0 GAINED: 4 Sandy Bridge total instructions in shared programs: 10544860 -> 10542933 (-0.02%) instructions in affected programs: 360019 -> 358092 (-0.54%) helped: 931 HURT: 4 helped stats (abs) min: 1 max: 20 x̄: 2.07 x̃: 1 helped stats (rel) min: 0.11% max: 15.52% x̄: 0.68% x̃: 0.30% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 3.33% max: 3.33% x̄: 3.33% x̃: 3.33% 95% mean confidence interval for instructions value: -2.23 -1.89 95% mean confidence interval for instructions %-change: -0.76% -0.58% Instructions are helped. total cycles in shared programs: 146106820 -> 146098397 (<.01%) cycles in affected programs: 3435047 -> 3426624 (-0.25%) helped: 572 HURT: 329 helped stats (abs) min: 1 max: 1289 x̄: 32.52 x̃: 15 helped stats (rel) min: <.01% max: 26.29% x̄: 0.97% x̃: 0.33% HURT stats (abs) min: 1 max: 1714 x̄: 30.93 x̃: 6 HURT stats (rel) min: 0.02% max: 41.31% x̄: 1.13% x̃: 0.19% 95% mean confidence interval for cycles value: -16.85 -1.85 95% mean confidence interval for cycles %-change: -0.39% -0.01% Cycles are helped. LOST: 1 GAINED: 0 Iron Lake total instructions in shared programs: 7886925 -> 7886711 (<.01%) instructions in affected programs: 25763 -> 25549 (-0.83%) helped: 75 HURT: 6 helped stats (abs) min: 1 max: 13 x̄: 3.33 x̃: 1 helped stats (rel) min: 0.35% max: 17.57% x̄: 1.96% x̃: 0.53% HURT stats (abs) min: 1 max: 16 x̄: 6.00 x̃: 1 HURT stats (rel) min: 2.86% max: 4.79% x̄: 3.49% x̃: 2.86% 95% mean confidence interval for instructions value: -3.69 -1.60 95% mean confidence interval for instructions %-change: -2.54% -0.57% Instructions are helped. total cycles in shared programs: 178116888 -> 178115324 (<.01%) cycles in affected programs: 5858790 -> 5857226 (-0.03%) helped: 484 HURT: 242 helped stats (abs) min: 2 max: 76 x̄: 5.27 x̃: 6 helped stats (rel) min: 0.01% max: 10.70% x̄: 0.18% x̃: 0.06% HURT stats (abs) min: 2 max: 76 x̄: 4.07 x̃: 2 HURT stats (rel) min: 0.01% max: 3.99% x̄: 0.19% x̃: 0.03% 95% mean confidence interval for cycles value: -2.76 -1.55 95% mean confidence interval for cycles %-change: -0.12% 0.01% Inconclusive result (%-change mean confidence interval includes 0). GM45 total instructions in shared programs: 4857870 -> 4857762 (<.01%) instructions in affected programs: 13994 -> 13886 (-0.77%) helped: 39 HURT: 5 helped stats (abs) min: 1 max: 13 x̄: 3.28 x̃: 2 helped stats (rel) min: 0.33% max: 17.11% x̄: 1.86% x̃: 0.48% HURT stats (abs) min: 1 max: 16 x̄: 4.00 x̃: 1 HURT stats (rel) min: 2.86% max: 4.71% x̄: 3.23% x̃: 2.86% 95% mean confidence interval for instructions value: -3.86 -1.05 95% mean confidence interval for instructions %-change: -2.61% 0.04% Inconclusive result (%-change mean confidence interval includes 0). total cycles in shared programs: 122180744 -> 122179674 (<.01%) cycles in affected programs: 3686646 -> 3685576 (-0.03%) helped: 273 HURT: 141 helped stats (abs) min: 2 max: 76 x̄: 5.81 x̃: 6 helped stats (rel) min: 0.01% max: 10.70% x̄: 0.18% x̃: 0.06% HURT stats (abs) min: 2 max: 76 x̄: 3.66 x̃: 2 HURT stats (rel) min: 0.01% max: 3.99% x̄: 0.16% x̃: 0.02% 95% mean confidence interval for cycles value: -3.42 -1.75 95% mean confidence interval for cycles %-change: -0.15% 0.03% Inconclusive result (%-change mean confidence interval includes 0). Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Elie Tournier <[email protected]>
* nir: Separate a weird compare with zero to two compares with zeroIan Romanick2018-01-301-0/+2
| | | | | | | | | | | min(a+b, c+d) >= 0 becomes (a+b >= 0 && c+d >= 0). No shader-db changes, but it does prevent 6 to 12 instruction regressions in the next patch on all measured Intel platforms. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Elie Tournier <[email protected]>
* nir: Simplify min and max of b2fIan Romanick2018-01-301-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | v2: Rebase on almost 2 years. Require that one of the arguments to fmin or fmax be used only once. This prevents some regressions. shader-db results: Skylake and Broadwell had similar results. Skylake shown. total instructions in shared programs: 14526021 -> 14525913 (<.01%) instructions in affected programs: 4613 -> 4505 (-2.34%) helped: 31 HURT: 0 helped stats (abs) min: 1 max: 4 x̄: 3.48 x̃: 4 helped stats (rel) min: 0.62% max: 6.67% x̄: 3.31% x̃: 2.42% total cycles in shared programs: 533118710 -> 533118403 (<.01%) cycles in affected programs: 34334 -> 34027 (-0.89%) helped: 24 HURT: 0 helped stats (abs) min: 4 max: 24 x̄: 12.79 x̃: 14 helped stats (rel) min: 0.25% max: 2.40% x̄: 1.08% x̃: 1.03% No changes on GM45, Iron Lake, Sandy Bridge, Ivy Bridge, or Haswell. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Elie Tournier <[email protected]>
* nir: Undo possible damage caused by rearranging or-compounded float comparesIan Romanick2018-01-301-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | shader-db results: Skylake and Broadwell had similar results (Skylake shown) total instructions in shared programs: 14525898 -> 14525836 (<.01%) instructions in affected programs: 1964 -> 1902 (-3.16%) helped: 14 HURT: 0 helped stats (abs) min: 1 max: 25 x̄: 4.43 x̃: 1 helped stats (rel) min: 0.68% max: 9.77% x̄: 2.10% x̃: 0.86% 95% mean confidence interval for instructions value: -9.46 0.60 95% mean confidence interval for instructions %-change: -3.97% -0.24% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 533119892 -> 533115756 (<.01%) cycles in affected programs: 96061 -> 91925 (-4.31%) helped: 13 HURT: 1 helped stats (abs) min: 60 max: 596 x̄: 318.77 x̃: 300 helped stats (rel) min: 1.15% max: 5.49% x̄: 4.27% x̃: 4.42% HURT stats (abs) min: 8 max: 8 x̄: 8.00 x̃: 8 HURT stats (rel) min: 0.46% max: 0.46% x̄: 0.46% x̃: 0.46% 95% mean confidence interval for cycles value: -379.43 -211.43 95% mean confidence interval for cycles %-change: -4.84% -3.01% Cycles are helped. Haswell, Ivy Bridge and Sandy Bridge had similar results (Haswell shown). total instructions in shared programs: 9033948 -> 9033898 (<.01%) instructions in affected programs: 535 -> 485 (-9.35%) helped: 2 HURT: 0 total cycles in shared programs: 84631402 -> 84628949 (<.01%) cycles in affected programs: 63197 -> 60744 (-3.88%) helped: 13 HURT: 2 helped stats (abs) min: 1 max: 594 x̄: 189.62 x̃: 140 helped stats (rel) min: 0.07% max: 5.04% x̄: 3.79% x̃: 4.01% HURT stats (abs) min: 4 max: 8 x̄: 6.00 x̃: 6 HURT stats (rel) min: 0.17% max: 0.45% x̄: 0.31% x̃: 0.31% 95% mean confidence interval for cycles value: -253.40 -73.67 95% mean confidence interval for cycles %-change: -4.24% -2.25% Cycles are helped. No changes on GM45 or Iron Lake. v2: Add a couple more tautological compares. Suggested by Elie. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Elie Tournier <[email protected]>
* nir: Be more conservative about rearranging or-compounded comparesIan Romanick2018-01-301-4/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If both comparisons are used as sources for instructions other than the ior, this transformation is detrimental. If the non-identical value in both compares is constant, the fmin or fmax will be constant-folded away, so the transformation is always a win. shader-db results: Skylake total instructions in shared programs: 14526147 -> 14525898 (<.01%) instructions in affected programs: 70239 -> 69990 (-0.35%) helped: 102 HURT: 0 helped stats (abs) min: 1 max: 8 x̄: 2.44 x̃: 1 helped stats (rel) min: 0.07% max: 2.30% x̄: 0.38% x̃: 0.20% 95% mean confidence interval for instructions value: -2.86 -2.02 95% mean confidence interval for instructions %-change: -0.46% -0.31% Instructions are helped. total cycles in shared programs: 533120531 -> 533119892 (<.01%) cycles in affected programs: 994875 -> 994236 (-0.06%) helped: 76 HURT: 26 helped stats (abs) min: 1 max: 324 x̄: 27.09 x̃: 13 helped stats (rel) min: <.01% max: 4.21% x̄: 0.45% x̃: 0.18% HURT stats (abs) min: 1 max: 167 x̄: 54.62 x̃: 26 HURT stats (rel) min: <.01% max: 4.36% x̄: 1.01% x̃: 0.39% 95% mean confidence interval for cycles value: -19.44 6.91 95% mean confidence interval for cycles %-change: -0.30% 0.15% Inconclusive result (value mean confidence interval includes 0). Broadwell total instructions in shared programs: 14816005 -> 14815787 (<.01%) instructions in affected programs: 64658 -> 64440 (-0.34%) helped: 97 HURT: 0 helped stats (abs) min: 1 max: 8 x̄: 2.25 x̃: 1 helped stats (rel) min: 0.07% max: 2.30% x̄: 0.38% x̃: 0.20% 95% mean confidence interval for instructions value: -2.62 -1.87 95% mean confidence interval for instructions %-change: -0.45% -0.30% Instructions are helped. total cycles in shared programs: 559340386 -> 559339907 (<.01%) cycles in affected programs: 1090491 -> 1090012 (-0.04%) helped: 66 HURT: 28 helped stats (abs) min: 2 max: 198 x̄: 23.83 x̃: 16 helped stats (rel) min: 0.01% max: 4.21% x̄: 0.47% x̃: 0.27% HURT stats (abs) min: 2 max: 226 x̄: 39.07 x̃: 11 HURT stats (rel) min: <.01% max: 4.61% x̄: 0.64% x̃: 0.20% 95% mean confidence interval for cycles value: -15.94 5.75 95% mean confidence interval for cycles %-change: -0.35% 0.07% Inconclusive result (value mean confidence interval includes 0). LOST: 0 GAINED: 1 Haswell total instructions in shared programs: 9034106 -> 9033948 (<.01%) instructions in affected programs: 24096 -> 23938 (-0.66%) helped: 38 HURT: 0 helped stats (abs) min: 1 max: 8 x̄: 4.16 x̃: 4 helped stats (rel) min: 0.42% max: 2.29% x̄: 0.71% x̃: 0.64% 95% mean confidence interval for instructions value: -4.71 -3.60 95% mean confidence interval for instructions %-change: -0.84% -0.58% Instructions are helped. total cycles in shared programs: 84631628 -> 84631402 (<.01%) cycles in affected programs: 148674 -> 148448 (-0.15%) helped: 14 HURT: 14 helped stats (abs) min: 1 max: 114 x̄: 22.14 x̃: 12 helped stats (rel) min: 0.02% max: 2.98% x̄: 0.66% x̃: 0.21% HURT stats (abs) min: 1 max: 10 x̄: 6.00 x̃: 5 HURT stats (rel) min: 0.01% max: 0.20% x̄: 0.12% x̃: 0.11% 95% mean confidence interval for cycles value: -19.42 3.28 95% mean confidence interval for cycles %-change: -0.59% 0.05% Inconclusive result (value mean confidence interval includes 0). Ivy Bridge total instructions in shared programs: 10015456 -> 10015293 (<.01%) instructions in affected programs: 27701 -> 27538 (-0.59%) helped: 38 HURT: 0 helped stats (abs) min: 1 max: 9 x̄: 4.29 x̃: 4 helped stats (rel) min: 0.33% max: 2.79% x̄: 0.66% x̃: 0.52% 95% mean confidence interval for instructions value: -4.87 -3.71 95% mean confidence interval for instructions %-change: -0.82% -0.51% Instructions are helped. total cycles in shared programs: 87524771 -> 87524569 (<.01%) cycles in affected programs: 112324 -> 112122 (-0.18%) helped: 6 HURT: 12 helped stats (abs) min: 2 max: 111 x̄: 44.67 x̃: 20 helped stats (rel) min: 0.02% max: 2.94% x̄: 1.45% x̃: 1.26% HURT stats (abs) min: 1 max: 16 x̄: 5.50 x̃: 5 HURT stats (rel) min: <.01% max: 0.16% x̄: 0.08% x̃: 0.08% 95% mean confidence interval for cycles value: -29.14 6.69 95% mean confidence interval for cycles %-change: -0.93% 0.08% Inconclusive result (value mean confidence interval includes 0). LOST: 0 GAINED: 2 Sandy Bridge total instructions in shared programs: 10545655 -> 10545465 (<.01%) instructions in affected programs: 37198 -> 37008 (-0.51%) helped: 42 HURT: 0 helped stats (abs) min: 1 max: 8 x̄: 4.52 x̃: 4 helped stats (rel) min: 0.31% max: 2.15% x̄: 0.58% x̃: 0.49% 95% mean confidence interval for instructions value: -5.14 -3.91 95% mean confidence interval for instructions %-change: -0.68% -0.47% Instructions are helped. total cycles in shared programs: 146113059 -> 146112427 (<.01%) cycles in affected programs: 423514 -> 422882 (-0.15%) helped: 32 HURT: 10 helped stats (abs) min: 4 max: 162 x̄: 24.34 x̃: 12 helped stats (rel) min: 0.06% max: 2.74% x̄: 0.37% x̃: 0.11% HURT stats (abs) min: 12 max: 19 x̄: 14.70 x̃: 14 HURT stats (rel) min: 0.10% max: 0.18% x̄: 0.16% x̃: 0.14% 95% mean confidence interval for cycles value: -26.03 -4.07 95% mean confidence interval for cycles %-change: -0.43% -0.05% Cycles are helped. Iron Lake total instructions in shared programs: 7886959 -> 7886925 (<.01%) instructions in affected programs: 1340 -> 1306 (-2.54%) helped: 4 HURT: 0 helped stats (abs) min: 2 max: 15 x̄: 8.50 x̃: 8 helped stats (rel) min: 0.63% max: 4.30% x̄: 2.45% x̃: 2.43% 95% mean confidence interval for instructions value: -20.44 3.44 95% mean confidence interval for instructions %-change: -5.78% 0.89% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 178116996 -> 178116888 (<.01%) cycles in affected programs: 6262 -> 6154 (-1.72%) helped: 2 HURT: 2 helped stats (abs) min: 44 max: 78 x̄: 61.00 x̃: 61 helped stats (rel) min: 3.31% max: 3.94% x̄: 3.62% x̃: 3.62% HURT stats (abs) min: 6 max: 8 x̄: 7.00 x̃: 7 HURT stats (rel) min: 0.34% max: 0.68% x̄: 0.51% x̃: 0.51% 95% mean confidence interval for cycles value: -93.27 39.27 95% mean confidence interval for cycles %-change: -5.38% 2.27% Inconclusive result (value mean confidence interval includes 0). GM45 total instructions in shared programs: 4857887 -> 4857870 (<.01%) instructions in affected programs: 674 -> 657 (-2.52%) helped: 2 HURT: 0 total cycles in shared programs: 122180816 -> 122180744 (<.01%) cycles in affected programs: 3764 -> 3692 (-1.91%) helped: 1 HURT: 1 helped stats (abs) min: 78 max: 78 x̄: 78.00 x̃: 78 helped stats (rel) min: 3.94% max: 3.94% x̄: 3.94% x̃: 3.94% HURT stats (abs) min: 6 max: 6 x̄: 6.00 x̃: 6 HURT stats (rel) min: 0.34% max: 0.34% x̄: 0.34% x̃: 0.34% Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Elie Tournier <[email protected]>
* nir: See through an fneg to apply existing optimizationsIan Romanick2018-01-301-3/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Doing the same for the existing feq and fne transformations didn't help anything in shader-db. shader-db results: Broadwell and Skylake (Skylake shown) total instructions in shared programs: 14529463 -> 14526147 (-0.02%) instructions in affected programs: 402420 -> 399104 (-0.82%) helped: 2136 HURT: 131 helped stats (abs) min: 1 max: 10 x̄: 1.61 x̃: 1 helped stats (rel) min: 0.03% max: 16.22% x̄: 3.14% x̃: 1.12% HURT stats (abs) min: 1 max: 2 x̄: 1.01 x̃: 1 HURT stats (rel) min: 0.13% max: 7.69% x̄: 0.75% x̃: 0.57% 95% mean confidence interval for instructions value: -1.51 -1.41 95% mean confidence interval for instructions %-change: -3.06% -2.78% Instructions are helped. total cycles in shared programs: 533146915 -> 533120531 (<.01%) cycles in affected programs: 10356261 -> 10329877 (-0.25%) helped: 1933 HURT: 844 helped stats (abs) min: 1 max: 490 x̄: 29.44 x̃: 16 helped stats (rel) min: <.01% max: 28.57% x̄: 3.43% x̃: 1.88% HURT stats (abs) min: 1 max: 423 x̄: 36.17 x̃: 12 HURT stats (rel) min: <.01% max: 23.75% x̄: 1.90% x̃: 0.59% 95% mean confidence interval for cycles value: -11.78 -7.22 95% mean confidence interval for cycles %-change: -1.98% -1.65% Cycles are helped. Haswell total instructions in shared programs: 9037416 -> 9034106 (-0.04%) instructions in affected programs: 389831 -> 386521 (-0.85%) helped: 2184 HURT: 120 helped stats (abs) min: 1 max: 11 x̄: 1.57 x̃: 1 helped stats (rel) min: 0.03% max: 25.00% x̄: 2.73% x̃: 1.02% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.19% max: 7.69% x̄: 0.81% x̃: 0.57% 95% mean confidence interval for instructions value: -1.49 -1.39 95% mean confidence interval for instructions %-change: -2.68% -2.41% Instructions are helped. total cycles in shared programs: 84636243 -> 84631628 (<.01%) cycles in affected programs: 4745058 -> 4740443 (-0.10%) helped: 1904 HURT: 960 helped stats (abs) min: 1 max: 466 x̄: 30.21 x̃: 18 helped stats (rel) min: 0.02% max: 36.36% x̄: 3.57% x̃: 2.38% HURT stats (abs) min: 1 max: 1080 x̄: 55.11 x̃: 14 HURT stats (rel) min: 0.02% max: 51.33% x̄: 2.77% x̃: 0.81% 95% mean confidence interval for cycles value: -4.51 1.29 95% mean confidence interval for cycles %-change: -1.64% -1.25% Inconclusive result (value mean confidence interval includes 0). LOST: 1 GAINED: 0 Sandy Bridge and Ivy Bridge (Ivy Bridge shown) total instructions in shared programs: 10018873 -> 10015456 (-0.03%) instructions in affected programs: 512820 -> 509403 (-0.67%) helped: 2268 HURT: 162 helped stats (abs) min: 1 max: 11 x̄: 1.62 x̃: 1 helped stats (rel) min: 0.03% max: 25.00% x̄: 2.47% x̃: 0.88% HURT stats (abs) min: 1 max: 4 x̄: 1.59 x̃: 1 HURT stats (rel) min: 0.09% max: 7.69% x̄: 0.86% x̃: 0.50% 95% mean confidence interval for instructions value: -1.46 -1.35 95% mean confidence interval for instructions %-change: -2.38% -2.12% Instructions are helped. total cycles in shared programs: 87538223 -> 87524771 (-0.02%) cycles in affected programs: 5435520 -> 5422068 (-0.25%) helped: 1916 HURT: 946 helped stats (abs) min: 1 max: 1392 x̄: 29.44 x̃: 18 helped stats (rel) min: <.01% max: 34.51% x̄: 3.34% x̃: 1.97% HURT stats (abs) min: 1 max: 633 x̄: 45.41 x̃: 11 HURT stats (rel) min: 0.02% max: 25.95% x̄: 2.41% x̃: 0.62% 95% mean confidence interval for cycles value: -7.34 -2.06 95% mean confidence interval for cycles %-change: -1.62% -1.26% Cycles are helped. LOST: 1 GAINED: 0 Iron Lake total instructions in shared programs: 7888446 -> 7886959 (-0.02%) instructions in affected programs: 331581 -> 330094 (-0.45%) helped: 1160 HURT: 97 helped stats (abs) min: 1 max: 10 x̄: 1.37 x̃: 1 helped stats (rel) min: 0.02% max: 9.68% x̄: 0.93% x̃: 0.43% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.17% max: 4.17% x̄: 0.37% x̃: 0.25% 95% mean confidence interval for instructions value: -1.25 -1.12 95% mean confidence interval for instructions %-change: -0.91% -0.75% Instructions are helped. total cycles in shared programs: 178130766 -> 178116996 (<.01%) cycles in affected programs: 12534564 -> 12520794 (-0.11%) helped: 1856 HURT: 187 helped stats (abs) min: 2 max: 202 x̄: 7.78 x̃: 4 helped stats (rel) min: <.01% max: 6.47% x̄: 0.28% x̃: 0.11% HURT stats (abs) min: 2 max: 26 x̄: 3.55 x̃: 2 HURT stats (rel) min: 0.01% max: 2.14% x̄: 0.08% x̃: 0.02% 95% mean confidence interval for cycles value: -7.41 -6.07 95% mean confidence interval for cycles %-change: -0.28% -0.22% Cycles are helped. GM45 total instructions in shared programs: 4858912 -> 4857887 (-0.02%) instructions in affected programs: 237565 -> 236540 (-0.43%) helped: 867 HURT: 57 helped stats (abs) min: 1 max: 10 x̄: 1.25 x̃: 1 helped stats (rel) min: 0.02% max: 9.38% x̄: 0.87% x̃: 0.43% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.16% max: 3.85% x̄: 0.34% x̃: 0.22% 95% mean confidence interval for instructions value: -1.18 -1.04 95% mean confidence interval for instructions %-change: -0.88% -0.71% Instructions are helped. total cycles in shared programs: 122189118 -> 122180816 (<.01%) cycles in affected programs: 8776418 -> 8768116 (-0.09%) helped: 1213 HURT: 166 helped stats (abs) min: 2 max: 202 x̄: 7.30 x̃: 4 helped stats (rel) min: <.01% max: 6.43% x̄: 0.25% x̃: 0.11% HURT stats (abs) min: 2 max: 26 x̄: 3.35 x̃: 2 HURT stats (rel) min: 0.01% max: 2.14% x̄: 0.06% x̃: 0.02% 95% mean confidence interval for cycles value: -6.78 -5.26 95% mean confidence interval for cycles %-change: -0.24% -0.18% Cycles are helped. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Elie Tournier <[email protected]>
* nir: add lower_all_io_to_temps flagTimothy Arceri2018-01-311-0/+2
| | | | | | | This will be used for freedreno and vc4 which require all inputs and outputs to be copied to temps. Reviewed-by: Marek Olšák <[email protected]>
* nir/st_glsl_to_nir: add param to disable splitting of inputsTimothy Arceri2018-01-312-10/+15
| | | | | | | | | We need this because we will always copy fs outputs to temps and split the arrays, but do not want to do either of these with fs inputs as it is unnessisary and makes handling interpolateAt builtins difficult. Reviewed-by: Marek Olšák <[email protected]>
* nir: partially revert c2acf97fcc9b32eTimothy Arceri2018-01-301-6/+23
| | | | | | | | | | | | | | | | | | c2acf97fcc9b32e changed the use of double_inputs_read to be inconsitent with its previous meaning. Here we re-enable the gather info code that was removed as the modified code from c2acf97fcc9b32e now uses the double_inputs member rather than double_inputs_read. This change allows us to use double_inputs_read with gallium drivers without impacting double_inputs which is used by i965. We also make use of the compiler option vs_inputs_dual_locations to allow for the difference in behaviour between drivers that handle vs inputs as taking up two locations for doubles, versus those that treat them as taking a single location. Reviewed-by: Karol Herbst <[email protected]>
* nir: add vs_inputs_dual_locations compiler optionTimothy Arceri2018-01-302-5/+15
| | | | | | | | | | | | | Allows nir drivers to either use a single or dual locations for vs double inputs. i965 uses dual locations for both OpenGL and Vulkan drivers, for now gallium OpenGL drivers only use a single location. The following patch will also make use of this option when calling nir_shader_gather_info(). Reviewed-by: Karol Herbst <[email protected]>
* compiler: tidy up double_inputs_read usesTimothy Arceri2018-01-304-9/+20
| | | | | | | | | | | | | First we move double_inputs_read into a vs struct in the union, double_inputs_read is only used for vs inputs so this will save space and also allows us to add a new double_inputs field. We add the new field because c2acf97fcc9b changed the behaviour of double_inputs_read, and while it's no longer used to track actual reads in i965 we do still want to track this for gallium drivers. Reviewed-by: Marek Olšák <[email protected]>
* nir: mark unused space in packed_tex_dataTapani Pälli2018-01-291-0/+1
| | | | | | | | | | | | | | | | | | | | | | This change cleans following scary warnings in valgrind output when disk cache is being written: ==6532== Uninitialised byte(s) found during client check request ==6532== at 0x14423FAD: blob_write_bytes (blob.c:152) ==6532== by 0x144240FB: blob_write_uint32 (blob.c:194) ==6532== by 0x144001A5: write_tex (nir_serialize.c:613) and later (loads of): ==6532== Use of uninitialised value of size 8 ==6532== at 0x62FCD9E: crc32_z (in /usr/lib64/libz.so.1.2.11) ==6532== by 0x13F65014: util_hash_crc32 (crc32.c:127) ==6532== by 0x13F5DABA: cache_put (disk_cache.c:947) Signed-off-by: Tapani Pälli <[email protected]> Cc: [email protected] Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Emil Velikov <[email protected]>