Building wheels with pip is getting slower

I was fiddling around with my git-squash utility and, after making it Python 3 friendly, I noticed that building a wheel got slower once I added a pyproject.toml to the project. With pip 21.1.2 and Python 3.9.4 on macOS, I measured the time it took to build a wheel with and without a pyproject.toml in the root. All of the results below are from the hyperfine benchmarking tool.

Without a pyproject.toml file it takes about 3 seconds:

Benchmark #1: pip wheel -w dist/ --only-binary ":all:" --no-deps .
  Time (mean ± σ):      3.082 s ±  0.178 s    [User: 1.228 s, System: 1.764 s]
  Range (min … max):    2.909 s …  3.383 s    10 runs

Adding a pyproject.toml file with the defaults prescribed in PEP 518 gives a file that looks like:

[build-system]
requires = ["setuptools", "wheel"]

With this file in the root of the package it takes about 6 seconds to build the package:

Benchmark #1: pip wheel -w dist/ --only-binary ":all:" --no-deps .
  Time (mean ± σ):      5.987 s ±  0.037 s    [User: 3.807 s, System: 2.035 s]
  Range (min … max):    5.939 s …  6.067 s    10 runs

Considering git-squash is a package with exactly one file, building a wheel should be very straightforward and fast. Almost doubling the build time is very surprising.

The results were so alarming that I ended up filing a GitHub issue against pip. It turns out I am not the first person to observe that PEP 518 style builds with pip are a lot slower than the builds that came before.

It’s really concerning because using a pyproject.toml file is the future of Python packaging and allows the ecosystem to move beyond setuptools as the default ‘backend’ for building wheels and pip as the default ‘frontend’ for invoking setuptools. Further, this behaviour will eventually become the default for all builds, so even removing a pyproject.toml file from the root will have no effect in the future.

The rest of this post will dive into some background, why pip is slower and what can be done to speed up pip without giving up the benefits of pyproject.toml.

Background

pip and setuptools are the de-facto Python tools for packaging. pip implements dependency resolution and can fetch wheels from PyPI or build wheels from source locally. The latter step has a hard dependency on setuptools.

While pip and setuptools can be improved, alternatives to both have emerged. Dependency resolution can be done with pipenv and poetry, while wheels can be built with flit or enscons. As the ecosystem is growing, PEP 517 and PEP 518 dictate how these tools can interoperate.

Primarily, these PEPs allow a package to have a pyproject.toml file which lists the packages needed to build it. For example, a pyproject.toml file with:

[build-system]
requires = ["flit > 3"]

indicates that the package needs a version of flit newer than 3 to build a wheel.

This simple declarative method allows each package to be built with its preferred tool, whichever one that is. This is a good thing and will allow competitors to pip and setuptools to flourish, hopefully improving the entire ecosystem.

What is pip Doing?

With the above in mind, we can take a look at what pip is doing. Prior to the invention of pyproject.toml, pip assumed that every package had to be built with setuptools, and packages had no way of indicating which version of setuptools they needed. This meant that the user had to install some version of pip and setuptools next to each other, and then ask pip to build a wheel. pip would use whatever version of setuptools was available, and that was the best we had.

This led to a few problems which pyproject.toml fixed. For example, if a package used features introduced in a newer setuptools, it could not indicate the minimum version of setuptools it required, meaning the build could fail. Another issue was that it was possible to publish a package that no one else could build. For example, the setup.py invoked by setuptools could import other packages, and unless the author documented what those packages were and how to get them, no one else could build the package.

pip’s implementation of PEP 518 solves the above problems by creating an empty environment for every package it turns into a wheel. It then resolves the dependencies listed in the build-system section of pyproject.toml and installs them into the newly created environment. Since the defaults prescribed in PEP 518 are setuptools and wheel without any version constraints, pip has to resolve those dependencies from scratch by reaching out to PyPI and downloading the latest version of each. Even having flit > 3 would require pip to reach out to PyPI to see what the latest flit release is.

We can see that having a pyproject.toml in the root of the package and specifying --no-index to pip wheel results in failure of building the wheel:

$ pip wheel -w dist/ --only-binary ":all:" --no-deps --no-index .

...

  ERROR: Could not find a version that satisfies the requirement setuptools (from versions: none)
  ERROR: No matching distribution found for setuptools

This happens even though I already have setuptools installed in my local environment. This also means that if I don’t have internet access the build would fail as well.

Speeding pip up

With PEP 518 solving a lot of issues, I think it’s best to speed pip up instead. Since the additional time is spent creating clean environments and re-resolving dependencies, we can reduce the work required by avoiding dependency resolution entirely. This can be done by downloading the wheels for the build requirements ahead of time and pointing pip to this set of wheels while disabling dependency resolution. Fortunately, pip comes with all of the pieces required to do this.

First, we can download the setuptools and wheel wheels to a local directory.

$ cat requirements-build.txt
wheel==0.36.2
setuptools==57.0.0
$ pip download -r requirements-build.txt -d dist/build-wheels --only-binary ':all:'
...
$ ls dist/build-wheels
setuptools-57.0.0-py3-none-any.whl wheel-0.36.2-py2.py3-none-any.whl

Then, when we build the package with the pyproject.toml that requires setuptools and wheel, we instruct pip to not reach out to PyPI and instead only look at the dist/build-wheels directory for dependencies. This forces pip to use the exact wheels we downloaded.

$ pip wheel -w dist/ --only-binary ":all:" --no-deps --no-index --find-links ./dist/build-wheels .
...
Successfully built git-squash

This brings the build down to just above 5 seconds, which is a little faster than before but not by much.

Benchmark #1: pip wheel -w dist/ --only-binary ":all:" --no-deps --no-index --find-links ./dist/build-wheels .
  Time (mean ± σ):      5.394 s ±  0.043 s    [User: 3.345 s, System: 1.997 s]
  Range (min … max):    5.335 s …  5.465 s    10 runs

There is also an escape hatch from building an isolated environment every time: the --no-build-isolation flag. Using it is risky, but combining this flag with cached wheels gives the best results:

Benchmark #1: pip wheel -w dist/ --only-binary ":all:" --no-deps --no-index --find-links ./dist/build-wheels --no-build-isolation .
  Time (mean ± σ):      2.868 s ±  0.071 s    [User: 1.178 s, System: 1.653 s]
  Range (min … max):    2.787 s …  3.034 s    10 runs
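To put the four measurements side by side, here is a small sketch that compares each mean against the build without a pyproject.toml. The numbers are the hyperfine means reported in this post; the labels are just shorthand chosen for this example.

```python
# Mean wall-clock times in seconds, copied from the hyperfine runs in this post.
mean_seconds = {
    "no pyproject.toml": 3.082,
    "pyproject.toml": 5.987,
    "pyproject.toml + local wheels": 5.394,
    "pyproject.toml + local wheels + --no-build-isolation": 2.868,
}

# Express every configuration relative to the build without pyproject.toml.
baseline = mean_seconds["no pyproject.toml"]
for label, seconds in mean_seconds.items():
    ratio = seconds / baseline
    print(f"{label:<55} {seconds:.3f} s  {ratio:.2f}x")
```

Pointing pip at local wheels only trims the overhead slightly, while skipping build isolation brings the time back to roughly the baseline.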

Creating fresh virtual environments for builds by hand and carefully using the above flag could speed up building wheels immensely.
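One way to do that by hand is sketched below, assuming the dist/build-wheels directory and requirements-build.txt file from earlier in this post. The build-env directory name and the helper functions are made up for this example. The script only prints the commands; call run(plan) to actually execute them.

```python
import os
import subprocess
import sys
from pathlib import Path


def env_python(env_dir: Path) -> Path:
    """Path to the interpreter inside a virtual environment."""
    if os.name == "nt":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def build_plan(env_dir: Path, wheel_cache: Path, requirements: Path) -> list[list[str]]:
    """Commands that build a wheel from '.' without pip's build isolation."""
    python = str(env_python(env_dir))
    return [
        # 1. Create a fresh virtual environment by hand.
        [sys.executable, "-m", "venv", str(env_dir)],
        # 2. Install the pinned build requirements from the local wheels only.
        [python, "-m", "pip", "install", "--no-index",
         "--find-links", str(wheel_cache), "-r", str(requirements)],
        # 3. Build the wheel, reusing that environment instead of an isolated one.
        [python, "-m", "pip", "wheel", "-w", "dist/", "--only-binary", ":all:",
         "--no-deps", "--no-index", "--find-links", str(wheel_cache),
         "--no-build-isolation", "."],
    ]


def run(plan: list[list[str]]) -> None:
    """Execute each command in order, stopping at the first failure."""
    for command in plan:
        subprocess.run(command, check=True)


if __name__ == "__main__":
    plan = build_plan(Path("build-env"), Path("dist/build-wheels"),
                      Path("requirements-build.txt"))
    for command in plan:
        print(" ".join(command))
```

Because the environment is created from scratch and only receives the pinned wheels, this keeps much of the reproducibility that build isolation provides, while paying the setup cost once rather than on every build.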