Python packaging guide
This guide is intended to cover the packaging of Python projects: source trees having legacy setup.py or modern pyproject.toml (or both) files.
Packaging a Python distribution involves two mandatory steps:
- build: produce an artifact, which can be either bare source files or a binary distribution, also known as a wheel
- install: install the produced build artifact
There are optional steps as well:
Build
Python projects can be one of two types:
- pure Python distributions, which don’t require any actual build process and can simply be copied to the destination as is
- platform-dependent distributions, which require a dedicated build process and can’t simply be copied to the destination as is
Sources
There are two supported options for sources:
VCS and gear
A source tree produced by gear is not a valid VCS checkout because the SCM’s internal data (e.g. the .git directory and its content) is missing.
Some packaging plugins may rely on SCM. For example,
- To ensure all relevant files are packaged when running the sdist command.
- When using include_package_data to include package data as part of the build or bdist_wheel.
Thus, an actual SCM source tree is required in this case; otherwise some expected data (everything besides Python packages or modules) may not be packaged.
For example, a basic git configuration may be done in the RPM specfile with:
    if [ ! -d .git ]; then
        git init
        git config user.email author@example.com
        git config user.name author
        git add .
        git commit -m 'release'
        git tag '%version'
    fi
pypi.org
Though it’s possible to use wheels from pypi.org as a source for installation (skipping the build step), it’s generally a bad idea to just unpack such a wheel, for reasons such as:
- wheels may be platform-specific
- wheels are prebuilt distributions whose builders may not follow the distro’s build rules and policies, may link against whatever they want, and so on
- wheels are usually stripped (there are no tests, docs, etc.)
Thus, only a source distribution from pypi.org can be used for packaging, though it’s recommended to use the source the sdist was made from, e.g. code fetched from a code hosting service.
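Whether a wheel is platform-specific can usually be read from its filename, which PEP427 defines as {distribution}-{version}(-{build tag})?-{python tag}-{abi tag}-{platform tag}.whl. The following is a minimal sketch of such a check, assuming well-formed filenames; the function names and the example filename are hypothetical, and the check is a heuristic based on tags only.

```python
def parse_wheel_tags(filename):
    """Split a wheel filename into its PEP 427 components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    # 5 components without the optional build tag, 6 with it.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename: {filename!r}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "distribution": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


def looks_pure_python(filename):
    """Heuristic: no ABI requirement and no platform restriction."""
    tags = parse_wheel_tags(filename)
    return tags["abi"] == "none" and tags["platform"] == "any"


if __name__ == "__main__":
    # Hypothetical filename, used only for illustration.
    print(looks_pure_python("my_distribution-1.0-py3-none-any.whl"))
```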
Build roles
With PEP517 accepted, the build process is more standardized and is split into two responsibilities:
- the build frontend takes an arbitrary source tree and calls the build backend on it. A user can use one frontend for building any PEP517-aware project.
- the build backend actually builds the sdist or wheel. The backend is defined per project.
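The split of roles can be illustrated with a toy frontend. This is a minimal sketch, not the way pyproject-installer works: it resolves a build-backend string of the form "module.path:object.path" and calls the PEP517 build_wheel hook in the current process, whereas real frontends typically run hooks in a subprocess. The function names are hypothetical.

```python
import importlib


def load_backend(build_backend):
    """Resolve a PEP 517 'module.path:object.path' backend reference."""
    module_path, _, object_path = build_backend.partition(":")
    backend = importlib.import_module(module_path)
    # The optional part after ':' names an object inside the module.
    for attr in filter(None, object_path.split(".")):
        backend = getattr(backend, attr)
    return backend


def build_wheel(build_backend, wheel_directory, config_settings=None):
    """Call the mandatory build_wheel hook of the backend.

    Must be run with the source tree as the current working directory.
    Returns the basename of the wheel created in wheel_directory.
    """
    backend = load_backend(build_backend)
    return backend.build_wheel(wheel_directory, config_settings)
```

For example, build_wheel("setuptools.build_meta", "dist") would ask setuptools to build a wheel into dist, provided setuptools is installed and the current directory is a source tree.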
Build dependencies
With PEP518 accepted, the format of static build dependencies is standardized: they are configured in TOML via the pyproject.toml file.
For the vast majority of projects the PEP517/518 configuration will be:
    [build-system]
    requires = ["setuptools", "wheel"]
    build-backend = "setuptools.build_meta"
- the distro packager is responsible for specifying these requirements in the RPM specfile. These dependencies may not be enough, though, since a backend may have dynamic ones. See get-requires-for-build-wheel for details
- the distro packager may add any other required build dependency but must follow the «necessary and sufficient» rule
- the presence of %pyproject_build or %pyproject_install in the RPM specfile automatically adds a build dependency on pyproject-installer. But if that distribution is used for something other than the aforementioned macros, the corresponding requirement should be set explicitly
- the presence of %tox_check or %tox_check_pyproject in the RPM specfile automatically adds a build dependency on tox and its required plugins. But if that distribution is used for something other than the aforementioned macros, the corresponding requirement should be set explicitly
- dependencies required for check or for building docs must be RPM-conditional, i.e. it must be possible to opt out of them by RPM means
- sources of upstream’s dependencies can be managed with the help of the %pyproject_deps RPM macros. See https://www.altlinux.org/Management_of_Python_dependencies_sources for details.
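The difference between static and dynamic build dependencies can be sketched as follows. The static list comes from pyproject.toml, while the dynamic part is returned by the optional PEP517 hook get_requires_for_build_wheel. FakeBackend and the requirement name example-build-helper are hypothetical and exist only to make the example self-contained.

```python
def collect_build_requires(static_requires, backend, config_settings=None):
    """Merge static requirements with those reported by the backend."""
    requires = list(static_requires)
    # The hook is optional; a backend without it has no dynamic requirements.
    hook = getattr(backend, "get_requires_for_build_wheel", None)
    if hook is not None:
        for requirement in hook(config_settings):
            if requirement not in requires:
                requires.append(requirement)
    return requires


class FakeBackend:
    """Hypothetical backend standing in for a real one."""

    @staticmethod
    def get_requires_for_build_wheel(config_settings=None):
        return ["wheel", "example-build-helper"]


if __name__ == "__main__":
    print(collect_build_requires(["setuptools", "wheel"], FakeBackend))
```

Note that a real frontend can call the hook only after the static requirements are installed, since the backend itself is one of them.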
Build with RPM
rpm-build-python3 provides RPM macro %pyproject_build as a CLI interface for build frontend (See pyproject-installer for details).
Implementation details
Install
A wheel is a regular zip archive with a specially formed name and the «.whl» suffix. Thus, an installer mostly just verifies the content of a wheel, unpacks it and lays out the unpacked files where necessary.
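The verification step can be sketched with stdlib only. The example below checks the sha256 hashes listed in the RECORD file of a wheel, as described in PEP427; it is a simplified illustration (it ignores sizes and other hash algorithms) and not the code of any particular installer. The demo wheel is built in memory and its names are hypothetical.

```python
import base64
import csv
import hashlib
import io
import zipfile


def record_hash(data):
    """Hash in RECORD notation: urlsafe base64 of sha256 without padding."""
    digest = hashlib.sha256(data).digest()
    return "sha256=" + base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def verify_wheel(wheel_file):
    """Return the paths whose content does not match RECORD."""
    mismatched = []
    with zipfile.ZipFile(wheel_file) as whl:
        record_name = next(
            name for name in whl.namelist() if name.endswith(".dist-info/RECORD")
        )
        rows = csv.reader(io.StringIO(whl.read(record_name).decode("utf-8")))
        for row in rows:
            if not row:
                continue
            path, hash_spec, _size = row
            if not hash_spec:  # RECORD itself is listed without a hash
                continue
            if record_hash(whl.read(path)) != hash_spec:
                mismatched.append(path)
    return mismatched


def make_demo_wheel(tamper=False):
    """Build a tiny wheel-like archive in memory (hypothetical content)."""
    source = b"print('hello')\n"
    record = f"demo.py,{record_hash(source)},{len(source)}\ndemo-1.0.dist-info/RECORD,,\n"
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as whl:
        whl.writestr("demo.py", source + (b"# changed\n" if tamper else b""))
        whl.writestr("demo-1.0.dist-info/RECORD", record)
    buffer.seek(0)
    return buffer
```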
Install with RPM
rpm-build-python3 provides RPM macro %pyproject_install as a CLI interface for installer (See pyproject-installer for details).
- installation should be done with the help of this macro unless doing otherwise is absolutely necessary
Implementation details
Bytecompilation
PEP427 says that wheel installers should compile any installed .py to .pyc (see compiled-python-files and pyc directories for details). However, rpm-build-python3 already does this by default with the help of the python3.compileall.py script, which compiles Python modules with the given optimization level: disabled, -O (removes assert statements) or -OO (removes both assert statements and __doc__ strings).
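Each optimization level produces its own .pyc file. The following minimal sketch uses stdlib's py_compile rather than the python3.compileall.py script itself, so it only illustrates the levels and the resulting file names; the function name is hypothetical.

```python
import py_compile
import tempfile
from pathlib import Path


def compile_all_levels(source):
    """Byte-compile one module at levels 0, 1 (-O) and 2 (-OO).

    Returns a mapping of optimization level to the path of the .pyc file.
    """
    return {
        level: py_compile.compile(str(source), optimize=level, doraise=True)
        for level in (0, 1, 2)
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        module = Path(tmp) / "mod.py"
        module.write_text("assert True\n")
        for level, pyc in compile_all_levels(module).items():
            print(level, Path(pyc).name)
```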
Check
Testing is vital in the Python ecosystem. Never skip or disable tests unless it’s absolutely necessary.
For security reasons the hasher in ALT’s build farm is a network-isolated environment, so testing in such an env can be very tricky or not possible at all. However, most unit tests require nothing special to run, so check should be enabled by default. In contrast, integration tests usually require Internet access and therefore should be skipped. Nowadays the vast majority of projects use pytest or stdlib’s unittest as the test runner and tox to automate testing.
Check with RPM
It’s often required to run a project’s tests during downstream packaging in a global isolated environment. For example, the user of such an environment is unprivileged (can’t install distributions into global sitepackages) and has no access to the Internet (can’t install distributions from package indexes). On the other hand, built Python distributions should not be unintentionally installed into user sitepackages, to avoid interference with any further distributions being tested. Though some tests can be run in the current Python environment without any change (e.g. flat layout and pure Python package), some may require setting the PYTHONPATH environment variable (e.g. src layout and pure Python package). But things may be more complex in the case of arch-dependent Python packages, .pth hooks or entry_points plugins, where the PYTHONPATH way can’t help. This is one of the reasons why venv (stdlib) and virtualenv (third-party) exist. There is a really nice tool, tox, for automating the testing process. But it’s overkill for the aforementioned task, for example:
- by default tox downloads and installs the dependencies of the test env and of the package (though this can be overcome with options and external plugins)
- tox has many runtime dependencies
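The simple PYTHONPATH approach mentioned above (src layout, pure Python package) can be sketched as follows. This is an illustration only, with a hypothetical helper name; it does not cover arch-dependent packages, .pth hooks or entry_points plugins.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def run_with_pythonpath(command, extra_path, cwd=None):
    """Run a command with extra_path prepended to PYTHONPATH."""
    env = dict(os.environ)
    existing = env.get("PYTHONPATH")
    env["PYTHONPATH"] = os.pathsep.join(filter(None, [str(extra_path), existing]))
    return subprocess.run(command, env=env, cwd=cwd, check=False)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical src-layout project with a single package.
        package = Path(tmp) / "src" / "demo_pkg"
        package.mkdir(parents=True)
        (package / "__init__.py").write_text("VALUE = 1\n")
        done = run_with_pythonpath(
            [sys.executable, "-c", "import demo_pkg"], Path(tmp) / "src", cwd=tmp
        )
        print(done.returncode)
```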
Check with pyproject_installer
The recommended way to run tests is one of the RPM macros based on pyproject_installer's run command:
- %pyproject_run (i.e. python3 -m pyproject_installer run) allows execution of an arbitrary command within a non-isolated venv-based Python virtual environment:
  - creates a minimal virtual environment with the help of stdlib's venv that:
    - has no pip or setuptools installed
    - has access to system and user site packages
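A virtual environment of this kind can be approximated with stdlib's venv alone. The sketch below is not the implementation of pyproject_installer's run command; it only shows the two venv options that correspond to the properties listed above. The function name is hypothetical.

```python
import tempfile
import venv
from pathlib import Path


def create_minimal_venv(env_dir):
    """Create a venv without pip that can see system site packages."""
    builder = venv.EnvBuilder(
        system_site_packages=True,  # non-isolated: system packages are visible
        with_pip=False,             # do not bootstrap pip into the env
    )
    builder.create(env_dir)
    return Path(env_dir)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        env = create_minimal_venv(Path(tmp) / "venv")
        print((env / "pyvenv.cfg").read_text())
```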
Check with tox
rpm-build-python3 provides several RPM macros as a CLI interface for tox:
- %tox_check:
  - sets the tox and pip env variables required for running in a network-isolated env
  - passes the NO_INTERNET env variable down to the tox env. This variable can be used by the test runner to make tests requiring the Internet conditional
  - runs tox with system site distributions
  - relaxes all dependencies of the tox env
  - generates console scripts for every system site distribution that defines them
- %tox_check_pyproject:
  - same as %tox_check but tox will use a wheel built by %pyproject_build
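On the project side, the NO_INTERNET variable can be used to skip network-dependent tests. The following is a minimal sketch with stdlib's unittest; it assumes that any non-empty value of NO_INTERNET means the network is unavailable, and the test case itself is hypothetical.

```python
import os
import unittest


def network_disabled():
    """Assumption: any non-empty NO_INTERNET value disables network tests."""
    return bool(os.environ.get("NO_INTERNET"))


class DownloadTests(unittest.TestCase):
    def test_offline_logic(self):
        # Needs nothing but the interpreter, so it always runs.
        self.assertEqual("a-b".replace("-", "_"), "a_b")

    def test_fetch_index(self):
        if network_disabled():
            self.skipTest("NO_INTERNET is set")
        # A real test would open a network connection here.
        self.assertTrue(True)


if __name__ == "__main__":
    unittest.main()
```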
Packaging
Subpackages
- if docs or tests are required by something or somebody else they must be subpackaged into their own RPM packages, otherwise they must not be shipped.
Metadata
If a project is meant to be reusable it must publish the corresponding meta information. Generating such data is the job of the build backend. But there are projects intended only for internal usage which don't want to produce any meta information (usually such projects are not published on an index such as pypi.org).
- installed metadata (the content of the dist-info directory) shouldn't be stripped any more than it already is. Nowadays, only METADATA and entry_points.txt are actually used by metadata parsers (see importlib.metadata for details). This data is crucial for many aspects of the Python ecosystem.
- the %pyproject_distinfo RPM macro can help if finer-grained control over the packaged metadata is desired or required. This macro normalizes a given name according to PEP427 rules. Note: not every build backend follows these rules; some produce a wheel whose filename has a . (dot character) and/or upper case characters in the distribution name part. See https://discuss.python.org/t/amending-pep-427-and-pep-625-on-package-normalization-rules/17226 for details. Hence it's not always possible to predict the produced wheel filename.
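The kind of name normalization discussed above can be sketched as follows. The rule shown is the one from the current binary distribution format specification (runs of -, _ and . collapsed, lower-cased, then - replaced with _); whether %pyproject_distinfo applies exactly this rule is an assumption, and the function names are hypothetical.

```python
import re


def normalize_dist_name(name):
    """Normalize a distribution name for use in wheel and dist-info names."""
    # Collapse runs of '-', '_' and '.' and lower-case the result,
    # then use '_' because '-' separates the filename components.
    return re.sub(r"[-_.]+", "-", name).lower().replace("-", "_")


def dist_info_dirname(name, version):
    """Name of the dist-info directory for a normalized distribution name."""
    return f"{normalize_dist_name(name)}-{version}.dist-info"


if __name__ == "__main__":
    print(dist_info_dirname("My.Distribution", "1.0"))
```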
Migration to PEP517 in RPM specfile
More and more Python projects are migrating away from legacy setup.py-based packaging and removing this file from their tree completely. This makes it impossible to package such projects with the old Python RPM macros, which unconditionally rely on an existing setup.py.
New RPM macros
In most cases the legacy RPM macros for building or installing a Python project can be (and must be) replaced with the modern ones, e.g.:
legacy macro            modern macro
python3_build           pyproject_build
python3_build_debug     pyproject_build
python3_install         pyproject_install
python3_build_install   pyproject_build and pyproject_install
New build dependencies
As mentioned before, the distro packager is responsible for specifying build requirements in the RPM specfile.
This can be done manually or automatically. This section describes only the manual way.
All the static requirements for building a project are given in pyproject.toml, for example:
    [build-system]
    requires = ["setuptools"]
    build-backend = "setuptools.build_meta"
setuptools in turn requires the wheel distribution to build a wheel. Therefore, the following build dependencies must be specified in the RPM specfile explicitly:
    BuildRequires: python3(setuptools)
    BuildRequires: python3(wheel)
Note: Building an sdist or wheel may require additional distributions, the names of which can only be obtained from the build backend.
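The manual translation from a requires list to specfile lines can be sketched as follows. This is an illustration under assumptions: it takes the python3(name) form from the example above, keeps the name exactly as written, and ignores version specifiers, extras and environment markers. The function name is hypothetical.

```python
import re

# Leading project name of a PEP 508 requirement string.
_NAME = re.compile(r"^\s*([A-Za-z0-9](?:[A-Za-z0-9._-]*[A-Za-z0-9])?)")


def to_build_requires(requires):
    """Render requirement names as BuildRequires lines."""
    lines = []
    for requirement in requires:
        match = _NAME.match(requirement)
        if match is None:
            raise ValueError(f"cannot parse requirement: {requirement!r}")
        lines.append(f"BuildRequires: python3({match.group(1)})")
    return lines


if __name__ == "__main__":
    print("\n".join(to_build_requires(["setuptools>=61", "wheel"])))
```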
Legacy setup.py-formatted projects
Even if a project still uses legacy packaging format it's possible to build such a project with PEP517/518-aware RPM macros.
The following default configuration
    [build-system]
    requires = ["setuptools", "wheel"]
    build-backend = "setuptools.build_meta:__legacy__"
Additionally, the legacy setuptools.build_meta:__legacy__ backend will be used if:
Note: Those default build dependencies must be specified in the RPM specfile explicitly:
    BuildRequires: python3(setuptools)
    BuildRequires: python3(wheel)
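The general fallback rules of PEP517 and PEP518 can be sketched as follows. This is an illustration of the PEPs only; the RPM macros may apply additional conditions. The function takes pyproject.toml content that is already parsed into a dict (e.g. with stdlib's tomllib on Python 3.11+) or None when the file is absent; its name is hypothetical.

```python
DEFAULT_REQUIRES = ["setuptools", "wheel"]
LEGACY_BACKEND = "setuptools.build_meta:__legacy__"


def resolve_build_system(pyproject):
    """Return the effective build requirements and build backend."""
    build_system = (pyproject or {}).get("build-system")
    if build_system is None:
        # No pyproject.toml or no [build-system] table: legacy defaults.
        return {"requires": list(DEFAULT_REQUIRES), "build-backend": LEGACY_BACKEND}
    if "requires" not in build_system:
        # PEP 518 makes 'requires' mandatory once the table exists.
        raise ValueError("[build-system] table has no 'requires' key")
    return {
        "requires": list(build_system["requires"]),
        # A missing build-backend key means the legacy setuptools backend.
        "build-backend": build_system.get("build-backend", LEGACY_BACKEND),
    }


if __name__ == "__main__":
    print(resolve_build_system(None))
```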
Installed metadata
The new format of metadata (dist-info) will be generated and packaged on migrating to PEP517-aware RPM macros.
For example, egg-info (previous format) vs dist-info:
    - my_distribution-1.0-py3.10.egg-info
    - my_distribution-1.0-py3.10.egg-info/dependency_links.txt
    - my_distribution-1.0-py3.10.egg-info/entry_points.txt
    - my_distribution-1.0-py3.10.egg-info/not-zip-safe
    - my_distribution-1.0-py3.10.egg-info/PKG-INFO
    - my_distribution-1.0-py3.10.egg-info/requires.txt
    - my_distribution-1.0-py3.10.egg-info/SOURCES.txt
    - my_distribution-1.0-py3.10.egg-info/top_level.txt
    + my_distribution-1.0.dist-info
    + my_distribution-1.0.dist-info/entry_points.txt
    + my_distribution-1.0.dist-info/METADATA
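The packaged dist-info directory is what stdlib's importlib.metadata reads. The following minimal sketch builds a throwaway dist-info directory with the two files listed above and parses it; the distribution name, version and entry point are hypothetical.

```python
import tempfile
from importlib import metadata
from pathlib import Path


def describe_dist_info(dist_info_dir):
    """Read name, version and entry points from a dist-info directory."""
    dist = metadata.Distribution.at(dist_info_dir)
    return {
        "name": dist.metadata["Name"],
        "version": dist.version,
        "entry_points": {ep.name: ep.value for ep in dist.entry_points},
    }


def make_demo_dist_info(root):
    """Create a hypothetical dist-info directory under root."""
    dist_info = Path(root) / "my_distribution-1.0.dist-info"
    dist_info.mkdir()
    (dist_info / "METADATA").write_text(
        "Metadata-Version: 2.1\nName: my_distribution\nVersion: 1.0\n"
    )
    (dist_info / "entry_points.txt").write_text(
        "[console_scripts]\nmy-tool = my_distribution.cli:main\n"
    )
    return dist_info


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(describe_dist_info(make_demo_dist_info(tmp)))
```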
setuptools
setuptools is a mandatory dependency of the legacy RPM macros, but only an optional build backend for the new ones. Thus, setuptools is no longer pulled into the build environment just by using the modern macros and should be specified explicitly if it's still needed.