Building wheels with pip is getting slower
I was fiddling around with my git-squash utility and, after making it Python 3 friendly, I observed that building a wheel got slower after adding a pyproject.toml to the project. With pip 21.1.2 and Python 3.9.4 on macOS, I measured the time it took to build a wheel with and without a pyproject.toml in the root. All of the results below are from the hyperfine benchmarking tool.
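For readers without hyperfine, a rough equivalent of this kind of measurement can be sketched in Python. This is not the tool used for the numbers in this post; the helper name time_command and the WHEEL_CMD invocation are assumptions made for illustration, and the exact command benchmarked here may differ.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs=5):
    """Run cmd `runs` times and return (mean, stdev) of wall-clock seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True so a failing build is not silently counted as a fast one.
        subprocess.run(
            cmd,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        samples.append(time.perf_counter() - start)
    # stdev needs at least two samples.
    spread = statistics.stdev(samples) if len(samples) > 1 else 0.0
    return statistics.mean(samples), spread


# Hypothetical invocation: build a wheel for the package in the current
# directory and write it to ./dist.
WHEEL_CMD = [sys.executable, "-m", "pip", "wheel", "--no-deps", "--wheel-dir", "dist", "."]
```

Calling time_command(WHEEL_CMD) once with and once without a pyproject.toml in the root would give a comparable pair of timings, although hyperfine also handles warmup runs and outlier detection that this sketch does not.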
Without a pyproject.toml file it takes about 3 seconds:
Adding a pyproject.toml file with the defaults prescribed in PEP 518 results in a file that looks like:
With this file in the root of the package it takes about 6 seconds to build the package:
Considering git-squash is a package with exactly one file, building a wheel should be very straightforward and fast. Almost doubling the time is very surprising.
The results were so alarming that I ended up filing a GitHub issue against pip. It turns out I am not the first person to observe that PEP 518 style builds with pip are a lot slower than builds were before.
It’s really concerning because using a pyproject.toml file is the future of Python packaging and allows the ecosystem to move beyond setuptools as the default ‘backend’ for building wheels and pip as the default ‘frontend’ for invoking setuptools. Furthermore, this behaviour will eventually become the default for all builds, so even removing a pyproject.toml file from the root will have no effect in the future.
The rest of this post will dive into some background, why pip is slower, and what can be done to speed up pip without giving up the benefits of pyproject.toml.
Background
pip and setuptools are the de facto Python tools for packaging. pip implements dependency resolution and can fetch wheels from PyPI or build wheels from source locally. The latter step has a hard dependency on setuptools.
While pip and setuptools can be improved, alternatives to both have emerged. Dependency resolution can be done with pipenv and poetry, while wheels can be built with flit or enscons. As the ecosystem grows, PEP 517 and PEP 518 dictate how these tools can interoperate.
Primarily these PEPs allow a package to have a pyproject.toml file which lists what packages are needed to build it. For example a pyproject.toml file with:
indicates that the package needs at least version 3 of flit to build a wheel.
This simple declarative method allows packages to be built by their preferred mechanism, using any number of tools. This is a good thing and will allow competitors to pip and setuptools to flourish, hopefully improving the entire ecosystem.
What is pip Doing?
With the above in mind we can take a look at what pip is doing. Prior to the invention of pyproject.toml, pip assumed that every package had to be built with setuptools, and packages had no way of indicating which version of setuptools they needed. This meant that the user had to install some version of pip and setuptools adjacent to each other, and then ask pip to build a wheel. pip would use whatever version of setuptools was available and that was the best we had.
This led to a few problems which pyproject.toml fixed. For example, if a package was using features introduced in a newer setuptools, it could not indicate the minimum version of setuptools required, meaning it was possible for the build to fail. Another issue was that it was possible to publish a package that no one else could build. For example, the setup.py invoked by setuptools could import other packages, and unless the author documented what those packages were and how to get them, no one else could build the package.
pip’s implementation of PEP 518 solves the above problems by creating an empty environment for every package it turns into a wheel. It then resolves the dependencies listed in the build-system section of pyproject.toml and installs them into the newly created environment. Since the defaults prescribed in PEP 518 are setuptools and wheel without any version constraints, pip has to resolve those dependencies from scratch by reaching out to PyPI and downloading the latest versions of each. Even having flit > 3 would require pip to reach out to PyPI to see what the latest flit release is.
We can see that having a pyproject.toml in the root of the package and specifying --no-index to pip wheel results in a failure to build the wheel:
This happens even though I already have setuptools installed in my local environment. This also means that if I don’t have internet access the build would fail as well.
Speeding pip up
With PEP 518 solving a lot of issues, I think it’s best to speed pip up instead. Since the additional time is spent creating clean environments and re-resolving dependencies, we can reduce the work required entirely by avoiding dependency resolution. This can be done by downloading the wheels for the build requirements ahead of time and pointing pip to this set of wheels while disabling dependency resolution. Fortunately pip comes with all of the pieces required to do this.
First we can download the setuptools and wheel wheels to a local directory.
Then when we build the package with the pyproject.toml that requires setuptools and wheel, we instruct pip to not reach out to PyPI and instead only look at the dist/build-wheel directory for dependencies. This will force pip to use the exact wheels we downloaded.
This brings the build down to just above 5s. This is a little bit faster than before but not much.
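The two steps above can be sketched as a pair of pip invocations assembled in Python. The helper names are hypothetical, and the exact commands used for the timings in this post may have differed. The flags themselves (pip download --dest, pip wheel --no-index and --find-links) are standard pip options.

```python
import subprocess
import sys

WHEEL_CACHE = "dist/build-wheel"  # local directory holding the pre-downloaded wheels


def download_build_requirements(requirements=("setuptools", "wheel"), dest=WHEEL_CACHE):
    """Command that fetches the build requirements into a local directory."""
    return [sys.executable, "-m", "pip", "download", "--dest", dest, *requirements]


def offline_wheel_build(source_dir=".", find_links=WHEEL_CACHE):
    """Command that builds a wheel using only the wheels found in find_links."""
    return [
        sys.executable, "-m", "pip", "wheel",
        "--no-index",                # never contact PyPI
        "--find-links", find_links,  # look for requirements here instead
        source_dir,
    ]


def run(cmd):
    """Execute one of the commands above, raising if pip exits with an error."""
    subprocess.run(cmd, check=True)
```

Running run(download_build_requirements()) once, while online, populates the directory. After that, run(offline_wheel_build()) can be repeated without network access.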
There is also an escape hatch from building an isolated environment every time: the --no-build-isolation flag. Using it is risky, but combining this flag with cached wheels gives the best results:
Carefully creating fresh virtual environments for builds by hand, combined with the above flag, could speed up building wheels immensely.
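One way this could look is sketched below, under the assumption that the build requirements are already cached in a local directory of wheels. The function names are made up for illustration. The idea is to create the environment once with the standard venv module, install the build requirements into it offline, and then reuse that interpreter for builds with --no-build-isolation.

```python
import os
import venv


def venv_python(env_dir):
    """Path to the interpreter inside a virtual environment at env_dir."""
    if os.name == "nt":
        return os.path.join(env_dir, "Scripts", "python.exe")
    return os.path.join(env_dir, "bin", "python")


def create_build_env(env_dir):
    """Create a fresh virtual environment with pip and return its interpreter."""
    venv.create(env_dir, with_pip=True, clear=True)
    return venv_python(env_dir)


def install_build_requirements(python, find_links, requirements=("setuptools", "wheel")):
    """Command that installs the build requirements from local wheels only."""
    return [python, "-m", "pip", "install",
            "--no-index", "--find-links", find_links, *requirements]


def wheel_without_isolation(python, find_links, source_dir="."):
    """Command that builds a wheel in the given environment, skipping pip's isolation."""
    return [python, "-m", "pip", "wheel", "--no-build-isolation",
            "--no-index", "--find-links", find_links, source_dir]
```

Each returned command can be passed to subprocess.run(cmd, check=True). The environment only has to be created once, so repeated builds pay neither the isolation cost nor the network cost. The trade-off is that keeping the environment clean becomes the builder's responsibility.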