I was fiddling around with my git-squash utility and, after making it Python 3 friendly, I noticed that building a wheel got slower once I added a pyproject.toml to the project. With pip 21.1.2 and Python 3.9.4 on macOS I measured the time it took to build a wheel with and without a pyproject.toml in the root. All of the results below are from the hyperfine benchmarking tool.
Without a pyproject.toml file it takes about 3 seconds.
Next I added a pyproject.toml file containing only the defaults prescribed in PEP 518: a build-system table that requires setuptools and wheel, without any version constraints.
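A minimal sketch of that file, assuming "the defaults" means the requires list that PEP 518 tells tools to fall back on. It is written from a short Python script so that the examples in this post share one language; the DEFAULT_PYPROJECT name and the write_default_pyproject helper are illustrative and not part of any tool.

```python
from pathlib import Path

# Assumed contents: the fallback build requirements described in PEP 518.
DEFAULT_PYPROJECT = """\
[build-system]
requires = ["setuptools", "wheel"]
"""


def write_default_pyproject(project_root="."):
    """Write the default build-system table into project_root/pyproject.toml."""
    path = Path(project_root) / "pyproject.toml"
    path.write_text(DEFAULT_PYPROJECT)
    return path
```

Calling write_default_pyproject() from the project root creates the file; deleting it again restores the old build path.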
With this file in the root of the package it takes about 6 seconds to build the package.
git-squash is a package with exactly one file, so building a wheel should be straightforward and fast. Roughly doubling the build time is very surprising.
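The measurements in this post come from hyperfine. As a rough stand-in for readers without it, the sketch below times a command from Python with a plain wall-clock timer. It does none of hyperfine's warmup runs or statistics, and the build command shown is an assumption, since the exact flags used for the numbers above are not given.

```python
import subprocess
import sys
import time


def time_command(argv, runs=5):
    """Run argv `runs` times and return each wall-clock duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # Discard the command's output; only the elapsed time matters here.
        subprocess.run(argv, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return durations


# Assumed build command; the exact flags used for the post are not known.
BUILD_COMMAND = [sys.executable, "-m", "pip", "wheel",
                 "--no-deps", "--wheel-dir", "dist", "."]

# Example (run from the project root):
#   timings = time_command(BUILD_COMMAND)
#   print(f"mean: {sum(timings) / len(timings):.2f}s over {len(timings)} runs")
```

Running it once with pyproject.toml in the root and once without gives a comparable pair of numbers.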
The results were so alarming that I ended up filing a GitHub issue against pip. It turns out I am not the first person to observe that PEP 518 style builds with
pip are a lot slower than what came before.
It’s really concerning because using a
pyproject.toml file is the future of Python packaging and allows the ecosystem to move beyond
setuptools as the default ‘backend’ for building wheels and
pip as the default ‘frontend’ for invoking setuptools. Further, this behaviour will eventually become the default for all builds, so even removing the
pyproject.toml file from the root will have no effect in the future.
The rest of this post will dive into some background, why pip is slower, and what can be done to speed up pip without giving up the benefits of pyproject.toml.
pip and setuptools are the de-facto Python tools for packaging.
pip implements dependency resolution and can fetch wheels from PyPI or build wheels from source locally. The latter step has a hard dependency on setuptools.
While both pip and setuptools can be improved, alternatives to both have emerged. Dependency resolution can be done with
poetry, while wheels can be built with
enscons. As the ecosystem is growing, PEP 517 and PEP 518 dictate how these tools can interoperate.
Primarily, these PEPs allow a package to have a pyproject.toml file which lists the packages needed to build it. For example, a pyproject.toml file can declare that the package needs at least version 3 of flit to build a wheel.
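As an illustration of such a declaration, the sketch below renders a build-system table from a list of requirement strings. The build_system_table helper is hypothetical, and the exact requirement string is an assumption based on the phrase "at least version 3"; a real project would also declare a build backend as described in PEP 517.

```python
import json


def build_system_table(requires):
    """Render a [build-system] table listing the given build requirements."""
    # For plain ASCII requirement strings, a JSON list of strings is also a
    # valid TOML array, so json.dumps is enough for this sketch.
    return "[build-system]\nrequires = " + json.dumps(list(requires)) + "\n"


print(build_system_table(["flit >=3"]))
# [build-system]
# requires = ["flit >=3"]
```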
This simple declarative method lets each package build its wheels with its preferred mechanism, using any number of tools. This is a good thing and will allow competitors to
setuptools to flourish, hopefully improving the entire ecosystem.
With the above in mind we can take a look at what
pip is doing. Prior to the invention of pyproject.toml,
pip assumed that every package had to be built with
setuptools, and packages had no way of indicating which version of
setuptools they needed. This meant that the user had to install some version of pip and
setuptools adjacent to each other, and then ask
pip to build a wheel.
pip would use whatever version of
setuptools was available and that was the best we had.
This led to a few problems which
pyproject.toml fixed. For example, if a package was using features introduced in a newer
setuptools release, it could not declare the minimum version of
setuptools it required, meaning the build could fail with an older one installed. Another issue was that it was possible to publish a package that no one else could build. For example, the
setup.py invoked by
setuptools could import other packages, and unless the author documented what those packages were and how to get them, no one else could build the package.
pip’s implementation of PEP 518 solves the above problems by creating an empty environment for every package it turns into a wheel. Then it resolves the dependencies listed in the
build-system section of
pyproject.toml and installs them into the newly created environment. Since the defaults prescribed in PEP 518 are setuptools and
wheel without any version constraints,
pip has to resolve those dependencies from scratch by reaching out to PyPI and downloading the latest versions of each. Even having
flit > 3 would require
pip to reach out to PyPI to see what the latest
flit release is.
We can see this in practice: having a
pyproject.toml in the root of the package and passing --no-index to
pip wheel results in a failure to build the wheel.
This happens even though I already have
setuptools installed in my local environment. This also means that if I don’t have internet access the build would fail as well.
With PEP 518 solving a lot of issues, I think it’s best to speed
pip up rather than abandon pyproject.toml. Because the additional time is spent creating clean environments and re-resolving dependencies, we can cut the work by avoiding dependency resolution entirely. This can be done by downloading the wheels for the build requirements ahead of time and pointing
pip to this set of wheels while disabling dependency resolution. Fortunately
pip comes with all of the pieces required to do this.
First we can download the setuptools and
wheel wheels to a local directory.
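A sketch of that download step, assuming the build requirements are setuptools and wheel and that the target is the dist/build-wheel directory used in the next step. The download_command helper is hypothetical; the flags are standard pip download options.

```python
import shlex
import sys


def download_command(dest="dist/build-wheel",
                     requirements=("setuptools", "wheel")):
    """Build the pip invocation that fetches the build requirements as wheels."""
    return [
        sys.executable, "-m", "pip", "download",
        "--only-binary", ":all:",  # wheels only, never source distributions
        "--dest", dest,
        *requirements,
    ]


# Print the command for inspection instead of running it.
print(shlex.join(download_command()))

# To actually download (needs network access):
#   import subprocess
#   subprocess.check_call(download_command())
```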
Then when we build the package with the
pyproject.toml that requires setuptools and
wheel, we instruct
pip to not reach out to PyPI and instead only look at the
dist/build-wheel directory for dependencies. This will force
pip to use the exact wheels we downloaded.
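The build step might then look like the sketch below. The offline_wheel_command helper and the dist output directory are assumptions, as is the use of --no-deps to keep pip from trying to resolve the package's runtime dependencies while the index is disabled.

```python
import shlex
import sys


def offline_wheel_command(source=".", find_links="dist/build-wheel",
                          wheel_dir="dist"):
    """Build the pip invocation that builds a wheel using only local wheels."""
    return [
        sys.executable, "-m", "pip", "wheel",
        "--no-index",                # never contact PyPI
        "--find-links", find_links,  # look for requirements here instead
        "--no-deps",                 # assumption: skip runtime dependencies
        "--wheel-dir", wheel_dir,
        source,
    ]


# Print the command for inspection instead of running it.
print(shlex.join(offline_wheel_command()))
```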
This brings the build time down to just above 5 seconds. That is a little faster than before, but not much.
There is also an escape hatch from building an isolated environment every time: the
--no-build-isolation flag. Using it is risky, but combining this flag with cached wheels gives the best results.
Carefully creating fresh virtual environments for builds by hand and using the above flag could speed up building wheels immensely.
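One way to do that by hand is sketched below: create a virtual environment once, install the cached build requirements into it, then build with isolation turned off. The manual_build_commands helper, the .build-env directory name, and the requirement list are all assumptions for illustration; this is a manual approximation and not how pip implements isolation internally.

```python
import os
import sys


def manual_build_commands(env_dir=".build-env", find_links="dist/build-wheel",
                          requirements=("setuptools", "wheel"),
                          source=".", wheel_dir="dist"):
    """Return the three commands for a hand-rolled, non-isolated build."""
    bin_dir = "Scripts" if os.name == "nt" else "bin"
    env_python = os.path.join(env_dir, bin_dir, "python")
    offline = ["--no-index", "--find-links", find_links]
    return [
        # 1. Create a fresh virtual environment (done once, then reused).
        [sys.executable, "-m", "venv", env_dir],
        # 2. Install the build requirements from the local wheel cache.
        [env_python, "-m", "pip", "install", *offline, *requirements],
        # 3. Build the wheel inside that environment without isolation.
        [env_python, "-m", "pip", "wheel", "--no-build-isolation", *offline,
         "--no-deps", "--wheel-dir", wheel_dir, source],
    ]


# To run the steps in order:
#   import subprocess
#   for command in manual_build_commands():
#       subprocess.check_call(command)
```

Because the environment is reused, only the third command needs to run on later builds, at the cost of having to recreate the environment whenever the build requirements change.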