Understanding configure.py
============================
.. highlight:: none
Botan's build is handled with a custom Python script, ``configure.py``.
This document tries to explain how configure works.
.. note::
You only need to read this if you are modifying the library,
or debugging some problem with your build. For how to use it,
see :ref:`building`.
Build Structure
--------------------
A module is a group of related source and header files, which can be
individually enabled or disabled at build time. Modules can depend on
other modules; if a dependency is not available then the module itself
is also removed from the list. Examples of modules in the existing
codebase are ``asn1`` and ``x509``. Since ``x509`` depends on (among
other things) ``asn1``, disabling ``asn1`` will also disable ``x509``.
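The removal rule described above can be sketched as a small fixed-point loop. This is only an illustration, not the code in ``configure.py``: the function name ``prune_unavailable``, the dictionary layout, and the module ``hypothetical_mod`` are assumptions made for the example.

```python
def prune_unavailable(modules, disabled):
    """Return the set of module names that remain usable.

    modules:  dict mapping module name -> set of dependency names
              (a hypothetical layout chosen for this sketch)
    disabled: iterable of module names that were switched off
    """
    enabled = set(modules) - set(disabled)
    changed = True
    while changed:
        changed = False
        # sorted() makes a copy, so discarding while looping is safe
        for name in sorted(enabled):
            if not modules[name] <= enabled:
                # at least one dependency is missing: drop this module too
                enabled.discard(name)
                changed = True
    return enabled


# x509 -> asn1 comes from the text; hypothetical_mod is made up
deps = {"asn1": set(), "x509": {"asn1"}, "hypothetical_mod": {"x509"}}
print(sorted(prune_unavailable(deps, disabled=["asn1"])))
```

The loop repeats until nothing changes, so a module that only depends indirectly on a disabled module is removed as well.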
Most modules define one or more macros, which application code can use
to detect the module's presence or absence. The value of each macro is
a datestamp in the form YYYYMMDD, which indicates the last time this
module changed in a way that would be visible to an application. For
example, if a class gains a new function, the datestamp should be
updated. That allows applications to detect if the new feature is
available.
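Because a YYYYMMDD datestamp is an ordinary integer, a later change always compares greater than an earlier one. The sketch below illustrates that idea in Python. The helper names are hypothetical, and the ``BOTAN_HAS_`` prefix is an assumption about how the macro names are formed; it is not stated in this document.

```python
def render_defines(defines, prefix="BOTAN_HAS_"):
    """Turn a macro -> datestamp map into #define lines (prefix is assumed)."""
    return ["#define %s%s %d" % (prefix, name, stamp)
            for name, stamp in sorted(defines.items())]


def feature_at_least(defines, name, minimum):
    """True if the macro exists and its datestamp is at least `minimum`."""
    return defines.get(name, 0) >= minimum


defines = {"DEFINE1": 20180104, "DEFINE2": 20190301}
for line in render_defines(defines):
    print(line)
```

An application would make the equivalent comparison in the C++ preprocessor, testing that the macro is defined and that its value is not older than the datestamp at which the feature appeared.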
What ``configure.py`` does
-----------------------------
First, all command line options are parsed.
Then all of the files giving information about target CPUs, compilers,
etc. are parsed and sanity checked.
In ``calculate_cc_min_version`` the compiler version is detected using
the preprocessor.
Then in ``check_compiler_arch`` the target architecture is detected, again
using the preprocessor.
Now that the target is identified and options have been parsed, the modules to
include into the artifact are picked, in ``ModulesChooser``.
In ``create_template_vars``, a dictionary of variables is created which describe
different aspects of the build. These are serialized to
``build/build_config.json``.
Up until this point no changes have been made on disk. All writes occur in
``do_io_for_build``. Build output directories are created, and header files are
linked into ``build/include/botan``. Templates are processed to create the
Makefile, ``build.h`` and other artifacts.
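The separation between computing the build description and touching the disk can be sketched as two phases. This is a simplified illustration under stated assumptions: ``create_vars`` and ``do_io`` are hypothetical stand-ins, the variable names are invented, and the sketch targets Python 3 only.

```python
import json
import os
import tempfile


def create_vars(options):
    """Phase 1: pure computation, nothing is written to disk."""
    return {
        "build_dir": options["build_dir"],
        "modules": sorted(options["modules"]),
    }


def do_io(variables):
    """Phase 2: the only place that touches the filesystem."""
    build_dir = variables["build_dir"]
    os.makedirs(os.path.join(build_dir, "include", "botan"), exist_ok=True)
    config_path = os.path.join(build_dir, "build_config.json")
    with open(config_path, "w") as out:
        json.dump(variables, out, indent=2, sort_keys=True)
    return config_path


# Run against a scratch directory so the example leaves nothing behind
with tempfile.TemporaryDirectory() as scratch:
    planned = create_vars({"build_dir": os.path.join(scratch, "build"),
                           "modules": {"x509", "asn1"}})
    path = do_io(planned)
    with open(path) as handle:
        reloaded = json.load(handle)
```

Keeping all writes in one function means a configuration error found during the earlier steps cannot leave a half-written build directory.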
When Modifying ``configure.py``
--------------------------------
For now, any changes to ``configure.py`` must work under both CPython 2.7 and
CPython 3.x. In a future major release, support for CPython 2 will be dropped,
but until then, when making modifications, verify that the code works as
expected on both versions.
Run ``./src/scripts/ci_build.py lint`` to run Pylint checks after any change.
Template Language
--------------------
Various output files are generated by processing input files using a simple
template language. All input files are stored in ``src/build-data`` and use the
suffix ``.in``. Anything not recognized as a template command is passed through
to the output unmodified. The template elements are:
* Variable substitution, ``%{variable_name}``. The configure script creates
  many variables for various purposes; this syntax inserts a variable's value
  into the output. If a variable is not defined, an error occurs.
If a variable reference ends with ``|upper``, the value is uppercased before
being inserted into the template output.
* Iteration, ``%{for variable} block %{endfor}``. This iterates over a list and
  repeats the block once for each element. Variables within the block
are expanded. The two template elements ``%{for ...}`` and ``%{endfor}`` must
appear on lines with no text before or after.
* Conditional inclusion, ``%{if variable} block %{endif}``. If the variable
named is defined and true (in the Python sense of the word; if the variable
is empty or zero it is considered false), then the block will be included and
any variables expanded. As with the for loop syntax, both the start and end
of the conditional must be on their own lines with no additional text.
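The three template elements can be illustrated with a minimal processor. This is a sketch written for this document, not the implementation in ``configure.py``. In particular, exposing the current loop element as ``%{i}`` inside a ``for`` block is an assumption of the sketch, and ``render`` and its helpers are hypothetical names.

```python
import re

VAR_RE = re.compile(r"%\{([a-z][a-z_0-9]*)(\|upper)?\}")
BLOCK_RE = re.compile(r"%\{(if|for) ([a-z][a-z_0-9]*)\}")


def substitute(text, variables):
    """Expand %{name} and %{name|upper}; undefined names are an error."""
    def repl(match):
        name, upper = match.group(1), match.group(2)
        if name not in variables:
            raise KeyError("undefined template variable: " + name)
        value = str(variables[name])
        return value.upper() if upper else value
    return VAR_RE.sub(repl, text)


def find_end(lines, start, kind):
    """Index of the %{endif}/%{endfor} matching the opener at `start`."""
    depth = 0
    for idx in range(start, len(lines)):
        text = lines[idx].strip()
        if text.startswith("%{" + kind + " "):
            depth += 1
        elif text == "%{end" + kind + "}":
            depth -= 1
            if depth == 0:
                return idx
    raise ValueError("missing %{end" + kind + "}")


def render_lines(lines, variables):
    out, pos = [], 0
    while pos < len(lines):
        match = BLOCK_RE.fullmatch(lines[pos].strip())
        if match is None:
            # not a template command: pass through with variables expanded
            out.append(substitute(lines[pos], variables))
            pos += 1
            continue
        kind, name = match.groups()
        end = find_end(lines, pos, kind)
        body = lines[pos + 1:end]
        if kind == "if":
            if variables.get(name):  # undefined, empty or zero counts as false
                out.extend(render_lines(body, variables))
        else:
            for item in variables[name]:
                # assumption: the loop element is visible as %{i}
                out.extend(render_lines(body, dict(variables, i=item)))
        pos = end + 1
    return out


def render(template, variables):
    return "\n".join(render_lines(template.splitlines(), variables))


template = "\n".join([
    "CXX = %{cxx}",
    "%{if debug}",
    "FLAGS = -g",
    "%{endif}",
    "%{for libs}",
    "LIB %{i|upper}",
    "%{endfor}",
])
print(render(template, {"cxx": "g++", "debug": 0, "libs": ["m", "dl"]}))
```

The block commands are only recognized when they are alone on a line, which mirrors the rule that ``%{for ...}``, ``%{endfor}``, ``%{if ...}`` and ``%{endif}`` must appear on their own lines.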
Adding a new module
--------------------
Create a directory in the appropriate place and create an ``info.txt`` file.
Syntax of ``info.txt``
------------------------
.. warning::
The syntax described here is documented to make it easier to use
and understand, but it is not considered part of the public API
contract. That is, the developers are allowed to change the syntax
at any time on the assumption that all users are contained within
the library itself. If that happens this document will be updated.
Modules and files describing information about the system use the same
parser and have common syntactical elements.
Comments begin with '#' and continue to end of line.
There are three main types: maps, lists, and variables.
A map has a syntax like::

   <MAP_NAME>
   NAME1 -> VALUE1
   NAME2 -> VALUE2
   ...
   </MAP_NAME>
The interpretation of the names and values will depend on the map's name
and what type of file is being parsed.
A list has similar syntax, but without values::

   <LIST_NAME>
   ELEM1
   ELEM2
   ...
   </LIST_NAME>
Lastly, there are single-value variables like::
VAR1 SomeValue
VAR2 "Quotes Can Be Used (And will be stripped out)"
VAR3 42
Variables can have string, integer or boolean values. Boolean values
are specified with 'yes' or 'no'.
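The line-level rules above (comments, ``NAME -> VALUE`` map entries, and typed variables) can be sketched as follows. The helper names are hypothetical and the sketch makes simplifying assumptions: a ``#`` always starts a comment, even inside quotes, and anything that is not quoted, ``yes``/``no``, or an integer is kept as a string.

```python
import re


def strip_comment(line):
    """Drop everything from '#' to end of line (simplified: ignores quoting)."""
    return line.split("#", 1)[0].strip()


def parse_value(text):
    """Convert a variable value to str, bool or int."""
    if len(text) >= 2 and text[0] == '"' and text[-1] == '"':
        return text[1:-1]          # quotes are stripped
    if text in ("yes", "no"):
        return text == "yes"       # booleans are spelled yes/no
    if re.fullmatch(r"-?[0-9]+", text):
        return int(text)
    return text


def parse_variable(line):
    """Parse 'NAME value' into (name, typed value)."""
    name, _, value = strip_comment(line).partition(" ")
    return name, parse_value(value.strip())


def parse_map_line(line):
    """Parse 'NAME -> VALUE' into (name, value)."""
    key, sep, value = strip_comment(line).partition("->")
    if not sep:
        raise ValueError("not a map entry: " + line)
    return key.strip(), value.strip()


print(parse_variable('VAR2 "Quotes Can Be Used (And will be stripped out)"'))
print(parse_variable("VAR3 42"))
print(parse_map_line("NAME1 -> VALUE1"))
```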
Module Syntax
---------------------
The ``info.txt`` files have the following elements. Not all are required; a minimal
file for a module with no dependencies might just contain a macro define.
Lists:
* ``comment`` and ``warning`` provide block comments which
are displayed to the user at build time.
* ``requires`` is a list of module dependencies. An OS feature can be
  specified as a condition for needing the dependency by writing it before
  the module name, separated by a ``?``, e.g. ``rtlgenrandom?dyn_load``.
* ``header:internal`` is the list of headers (from the current module)
which are internal-only.
* ``header:public`` is the list of headers (from the
  current module) which should be exported for public use. If neither
  ``header:internal`` nor ``header:public`` is used then all headers
  in the current directory are assumed public.
.. note:: If you omit a header from both internal and public lists, it will
be ignored.
* ``header:external`` is used when naming headers which are included
in the source tree but might be replaced by an external version. This is used
for the PKCS11 headers.
* ``arch`` is a list of architectures this module may be used on.
* ``isa`` lists ISA features which must be enabled to use this module.
  Can be preceded by an ``arch`` name followed by a ``:`` if it is only needed
on a specific architecture, e.g. ``x86_64:ssse3``.
* ``cc`` is a list of compilers which can be used with this module. If the
compiler name is suffixed with a version (like "gcc:5.0") then only compilers
with that minimum version can use the module.
* ``os_features`` is a list of OS features which are required in order to use this
module. Each line can specify one or more features combined with ','. Alternatives
can be specified on additional lines.
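Several of the list entries above carry a small amount of syntax: ``feature?module`` in ``requires``, ``name:version`` in ``cc``, and comma-joined alternatives in ``os_features``. The sketch below shows one way these could be interpreted. The function names are hypothetical, and treating an empty ``cc`` or ``os_features`` list as "no restriction" is an assumption of the sketch.

```python
def needed_dependencies(requires, os_features):
    """Resolve 'feature?module' entries against the available OS features."""
    deps = []
    for entry in requires:
        condition, sep, module = entry.partition("?")
        if not sep:
            deps.append(entry)           # unconditional dependency
        elif condition in os_features:
            deps.append(module)          # needed only when the feature exists
    return deps


def os_features_satisfied(alternatives, available):
    """Each line is an AND of features; the lines are alternatives (OR)."""
    if not alternatives:
        return True
    available = set(available)
    for line in alternatives:
        needed = {feat.strip() for feat in line.split(",")}
        if needed <= available:
            return True
    return False


def parse_version(text):
    return tuple(int(part) for part in text.split("."))


def compiler_allowed(cc_list, name, version):
    """Check 'name' or 'name:minimum_version' entries from a cc list."""
    if not cc_list:
        return True
    for entry in cc_list:
        cc_name, _, minimum = entry.partition(":")
        if cc_name == name:
            return not minimum or parse_version(version) >= parse_version(minimum)
    return False


print(needed_dependencies(["rtlgenrandom?dyn_load"], {"rtlgenrandom"}))
print(os_features_satisfied(["posix1,getentropy", "win32"], {"win32"}))
print(compiler_allowed(["gcc:5.0"], "gcc", "4.9"))
```

Versions are compared as tuples of integers, so that ``4.10`` is treated as newer than ``4.9``.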
Maps:
* ``defines`` is a map from macros to datestamps. These macros will be defined in
the generated ``build.h``.
* ``libs`` specifies additional libraries which should be linked if this module is
  included. It maps from the OS name to a list of libraries (comma separated).
* ``frameworks`` is a macOS/iOS specific feature which maps from an OS name to
a framework.
Variables:
* ``load_on`` can take on values ``never``, ``always``, ``auto``, ``dep`` or ``vendor``.
  TODO describe the behavior of these
* ``endian`` gives the required endianness for the module (``any`` (default), ``little``, ``big``).
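An example::

   # Disable this by default
   load_on never

   <isa>
   sse2
   </isa>

   <defines>
   DEFINE1 -> 20180104
   DEFINE2 -> 20190301
   </defines>

   <comment>
   I have eaten
   the plums
   that were in
   the icebox
   </comment>

   <warning>
   There are no more plums
   </warning>

   <header:public>
   header1.h
   </header:public>

   <header:internal>
   header_helper.h
   whatever.h
   </header:internal>

   <arch>
   x86_64
   </arch>

   <cc>
   gcc:4.9 # gcc 4.8 doesn't work for
   clang
   </cc>

   # Can work with POSIX+getentropy or Win32
   <os_features>
   posix1,getentropy
   win32
   </os_features>

   <frameworks>
   macos -> FramyMcFramerson
   </frameworks>

   <libs>
   qnx -> foo,bar,baz
   solaris -> socket
   </libs>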
Supporting a new CPU type
---------------------------
CPU information is stored in ``src/build-data/arch``.
There is also a file ``src/build-data/detect_arch.cpp`` which is used
for build-time architecture detection using the compiler preprocessor.
Supporting this is optional but recommended.
Lists:
* ``aliases`` is a list of alternative names for the CPU architecture.
* ``isa_extensions`` is a list of possible ISA extensions that can be used on
this architecture. For example x86-64 has extensions "sse2", "ssse3",
"avx2", "aesni", ...
Variables:
* ``endian`` if defined should be "little" or "big". This can also be
controlled or overridden at build time.
* ``family`` can specify a family group for several related architectures.
For example both x86_32 and x86_64 use ``family`` of "x86".
* ``wordsize`` is the default wordsize, which controls the size of limbs
in the multi precision integers. If not set, defaults to 32.
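As a rough illustration of how these fields might be consumed, the sketch below resolves a user-supplied name through ``aliases`` and applies the documented ``wordsize`` default of 32. The table contents (including the alias names) and the function ``find_arch`` are hypothetical; only the field names and the default come from the text above.

```python
# Hypothetical, hand-written stand-in for parsed architecture files
ARCHES = {
    "x86_64": {"aliases": ["amd64", "x64"], "family": "x86", "wordsize": 64},
    "x86_32": {"aliases": ["i386"], "family": "x86"},
}


def find_arch(name, arch_infos):
    """Return (canonical name, family, wordsize) for a name or alias."""
    for canonical, info in arch_infos.items():
        if name == canonical or name in info.get("aliases", ()):
            # wordsize defaults to 32 when the file does not set it
            return canonical, info.get("family"), info.get("wordsize", 32)
    raise KeyError("unknown architecture: " + name)


print(find_arch("amd64", ARCHES))
print(find_arch("i386", ARCHES))
```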
Supporting a new compiler
---------------------------
Compiler information is stored in ``src/build-data/cc``. Looking over
those files will probably help understanding, especially the ones for
GCC and Clang, which are the most complete.
In addition to the info file, for compilers there is a file
``src/build-data/detect_version.cpp``. The ``configure.py`` script runs the
preprocessor over this file to attempt to detect the compiler
version. Supporting this is not strictly necessary.
Maps:
* ``binary_link_commands`` gives the command used to run the linker.
  It maps from operating system name to the command to use, with
  the entry "default" used for any OS not otherwise listed.
* ``cpu_flags_no_debug`` is unused and will be removed.
* ``cpu_flags`` is used to emit CPU-specific flags; for example, the LLVM
  bitcode target uses the ``-emit-llvm`` flag. Rarely needed.
* ``isa_flags`` maps from CPU extensions (like NEON or AES-NI) to
compiler flags which enable that extension. These have the same name
as the ISA flags listed in the architecture files.
* ``lib_flags`` has a single possible entry "debug" which if set maps
to additional flags to pass when building a debug library.
Rarely needed.
* ``mach_abi_linking`` specifies flags to enable when building and
  linking on a particular CPU. These are usually flags that modify the
  ABI. There is a special syntax supported here,
"all!os1,arch1,os2,arch2" which allows setting ABI flags which are
used for all but the named operating systems and/or architectures.
* ``sanitizers`` is a map of sanitizers the compiler supports. It must
include "default" which is a list of sanitizers to include by default
when sanitizers are requested. The other keys should map to compiler
flags.
* ``so_link_commands`` maps from operating system to the command to
use to build a shared object.
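The ``all!os1,arch1`` exclusion syntax of ``mach_abi_linking`` can be sketched as a key-matching function. This is an illustration only: ``abi_flags_for`` is a hypothetical name, the example map is invented, and treating a plain ``all`` key as "applies everywhere" is an assumption of the sketch.

```python
def abi_flags_for(mach_abi_linking, os_name, arch):
    """Collect the ABI flags that apply to one OS/architecture pair."""
    flags = []
    for key, value in mach_abi_linking.items():
        if key.startswith("all!"):
            # flags for everything except the named OSes/architectures
            excluded = set(key[len("all!"):].split(","))
            if os_name not in excluded and arch not in excluded:
                flags.append(value)
        elif key in ("all", os_name, arch):
            flags.append(value)
    return flags


# Invented example data, not taken from any real compiler file
example = {"all!os1,arch1": "-flag-for-most", "arch2": "-flag-for-arch2"}
print(abi_flags_for(example, "os2", "arch2"))
print(abi_flags_for(example, "os1", "arch2"))
```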
Variables:
* ``binary_name`` the default name of the compiler binary.
* ``linker_name`` the name of the linker to use with this compiler.
* ``macro_name`` a macro of the form ``BOTAN_BUILD_COMPILER_IS_XXX``
will be defined.
* ``output_to_object`` (default "-o") gives the compiler option used to
name the output object.
* ``output_to_exe`` (default "-o") gives the compiler option used to
  name the output executable.
* ``add_include_dir_option`` (default "-I") gives the compiler option used
to specify an additional include dir.
* ``add_lib_dir_option`` (default "-L") gives the compiler option used
to specify an additional library dir.
* ``add_sysroot_option`` gives the compiler option used to specify the sysroot.
* ``add_lib_option`` (default "-l%s") gives the compiler option to
link in a library. ``%s`` will be replaced with the library name.
* ``add_framework_option`` (default "-framework") gives the compiler option
to add a macOS framework.
* ``preproc_flags`` (default "-E") gives the compiler option used to run
the preprocessor.
* ``compile_flags`` (default "-c") gives the compiler option used to compile a file.
* ``debug_info_flags`` (default "-g") gives the compiler option used to enable debug info.
* ``optimization_flags`` gives the compiler optimization flags to use.
* ``size_optimization_flags`` gives compiler optimization flags to use when
compiling for size. If not set then ``--optimize-for-size`` will use
the default optimization flags.
* ``sanitizer_optimization_flags`` gives compiler optimization flags to use
when building with sanitizers.
* ``coverage_flags`` gives the compiler flags to use when generating coverage
information.
* ``stack_protector_flags`` gives compiler flags to enable stack overflow checking.
* ``shared_flags`` gives compiler flags to use when generating shared libraries.
* ``lang_flags`` gives compiler flags used to enable the required version of C++.
* ``warning_flags`` gives warning flags to enable.
* ``maintainer_warning_flags`` gives extra warning flags to enable during maintainer
mode builds.
* ``visibility_build_flags`` gives compiler flags to control symbol visibility
  when generating shared libraries.
* ``visibility_attribute`` gives the attribute to use in the ``BOTAN_DLL`` macro
  to specify visibility when generating shared libraries.
* ``ar_command`` gives the command to build static libraries.
* ``ar_options`` gives the options to pass to ``ar_command``. If not set here,
  the value is taken from the OS-specific information.
* ``ar_output_to`` gives the flag to pass to ``ar_command`` to specify where to
output the static library.
* ``werror_flags`` gives the compiler flags to treat warnings as errors.
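To show how option variables such as ``add_lib_option`` and ``add_lib_dir_option`` could be combined into a link command, here is a minimal sketch. The function ``link_flags`` is hypothetical. Joining the directory option directly to the path, and passing the framework name as a separate argument, are assumptions rather than documented behavior.

```python
def link_flags(cc_info, lib_dirs, libs, frameworks=()):
    """Build linker arguments from compiler option variables."""
    dir_opt = cc_info.get("add_lib_dir_option", "-L")
    lib_opt = cc_info.get("add_lib_option", "-l%s")
    fw_opt = cc_info.get("add_framework_option", "-framework")

    flags = [dir_opt + path for path in lib_dirs]
    # %s in add_lib_option is replaced with the library name
    flags += [lib_opt % lib for lib in libs]
    for framework in frameworks:
        flags += [fw_opt, framework]
    return flags


print(link_flags({}, ["/opt/lib"], ["socket"]))
# A compiler file could override the pattern (hypothetical value):
print(link_flags({"add_lib_option": "%s.lib"}, [], ["socket"]))
```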
Supporting a new OS
---------------------------
Operating system information is stored in ``src/build-data/os``.
Lists:
* ``aliases`` is a list of alternative names which will be accepted.
* ``target_features`` is a list of target-specific OS features. Some of these
  are supported by many OSes (for example "posix1"), while others are specific to
  just one or two OSes (such as "getauxval"). Adding a value here causes a new
macro ``BOTAN_TARGET_OS_HAS_XXX`` to be defined at build time. Use
``configure.py --list-os-features`` to list the currently defined OS
features.
* ``feature_macros`` is a list of macros to define.
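The mapping from ``target_features`` entries to ``BOTAN_TARGET_OS_HAS_XXX`` macros can be sketched in one function. That ``XXX`` is simply the upper-cased feature name is an assumption of this sketch, and ``os_feature_macros`` is a hypothetical helper.

```python
def os_feature_macros(target_features):
    """Derive macro names from OS feature names (upper-casing is assumed)."""
    return ["BOTAN_TARGET_OS_HAS_" + feature.upper()
            for feature in sorted(target_features)]


for macro in os_feature_macros(["posix1", "getauxval"]):
    print("#define " + macro)
```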
Variables:
* ``ar_command`` gives the command to build static libraries
* ``ar_options`` gives the options to pass to ``ar_command``
* ``ar_output_to`` gives the flag to pass to ``ar_command`` to specify where to
output the static library.
* ``bin_dir`` (default "bin") specifies where binaries should be installed,
relative to install_root.
* ``cli_exe_name`` (default "botan") specifies the name of the command line utility.
* ``default_compiler`` specifies the default compiler to use for this OS.
* ``doc_dir`` (default "doc") specifies where documentation should be installed,
relative to install_root
* ``header_dir`` (default "include") specifies where include files
should be installed, relative to install_root
* ``install_root`` (default "/usr/local") specifies where to install
by default.
* ``lib_dir`` (default "lib") specifies where the library should be installed,
relative to install_root.
* ``lib_prefix`` (default "lib") prefix to add to the library name
* ``library_name``
* ``man_dir`` specifies where man files should be installed, relative to install_root
* ``obj_suffix`` (default "o") specifies the suffix used for object files
* ``program_suffix`` (default "") specifies the suffix used for executables
* ``shared_lib_symlinks`` (default "yes") specifies if symbolic links should be
  created from the base and patch soname to the library name.
* ``soname_pattern_abi``
* ``soname_pattern_base``
* ``soname_pattern_patch``
* ``soname_suffix`` file extension to use for shared library if ``soname_pattern_base``
is not specified.
* ``static_suffix`` (default "a") file extension to use for static library.
* ``use_stack_protector`` (default "true") specifies whether stack smashing
  protections should be enabled by default.
* ``uses_pkg_config`` (default "yes") specifies whether a pkg-config file
  should be created by default.
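Several of the directory variables above are interpreted relative to ``install_root``. The sketch below applies the documented defaults and joins the paths. The function ``install_paths`` is hypothetical, and the override values in the second call are invented for the example.

```python
import posixpath

# Defaults as listed above
DEFAULTS = {
    "install_root": "/usr/local",
    "bin_dir": "bin",
    "lib_dir": "lib",
    "header_dir": "include",
    "doc_dir": "doc",
}


def install_paths(os_info):
    """Resolve the install directories for one OS description."""
    info = dict(DEFAULTS)
    info.update(os_info)
    root = info["install_root"]
    return {key: posixpath.join(root, info[key])
            for key in ("bin_dir", "lib_dir", "header_dir", "doc_dir")}


print(install_paths({})["lib_dir"])
print(install_paths({"install_root": "/opt/example", "lib_dir": "lib64"})["lib_dir"])
```

Note that ``posixpath.join`` discards the root if a directory value is itself an absolute path, so the sketch only behaves as described for relative values.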