
Managing Python Environments in 2022 (for the 99%)

April 25, 2022 • python best-practices

TL;DR: Install mambaforge, use mamba to install packages, and pin direct dependencies inside your environment.yml.

Pip, venv, virtualenv, pyenv, pipenv, micropipenv, pip-tools, conda, miniconda, mamba, micromamba, poetry, hatch, pdm, pyflow 🤯 These days even the most senior Python developer is confused about all the options to manage environments.

Instead of untangling the whole convoluted mess that is the Python environment management ecosystem with a thorough explanation, I decided to take a different route, not unlike what another Alexander did in the past:

The Legend of the Gordian Knot

In 333 B.C. Alexander the Great encountered an oxcart tied to a post with a complicated knot that seemed to have no beginning and no end. Legend had it that whoever solved the problem of the knot would rule all of Asia. As his goal was to rule the world, this was a challenge he could not pass up. He studied the knot for several days and could find no way to untie it. Asking “Does it matter how I solve the problem?” he drew his sword and cut the knot, exposing the ends required to untie the knot.

That night there was a great storm and the prophets took it as a sign that Zeus was pleased. Alexander went on to rule not only Asia, but much of the world. (source)

Alexander cuts through the Gordian Knot

So in true Alexandrian fashion, I’ll cut through the mess and give you a single recommendation.

If you’re unsure, use Mambaforge

Some people get really worked up about The Right Way to Manage Environments. If you are happy with your current workflow, by all means, stick to it. But the differences aren’t as pronounced as people make them out to be, and many of the perceived differences stem from misconceptions. You can make most of the tools work for most cases, and what remains mostly boils down to personal preference.

For almost everyone else, the 99% of developers, my suggestion is:

Install mambaforge, use mamba to install packages, and pin direct dependencies inside your environment.yml.
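
To make the last part of that suggestion concrete, here is a minimal sketch of a check that flags direct dependencies without a version constraint. The file contents, the package names and versions, and the helper `dependencies_without_version` are all made up for illustration. The line-based parsing assumes a simple environment.yml without a nested pip section; it is not a full YAML parser.

```python
import re

# Hypothetical environment.yml contents; names and versions are examples only.
ENVIRONMENT_YML = """\
name: my-project
channels:
  - conda-forge
dependencies:
  - python=3.10
  - numpy=1.22
  - pandas
"""


def dependencies_without_version(yml_text: str) -> list[str]:
    """Return entries under `dependencies:` that carry no version constraint.

    Assumption: a flat list of conda specs, one per line, no nested `pip:` block.
    Any constraint operator (=, <, >, ~, !) counts as "pinned" here.
    """
    unpinned = []
    in_dependencies = False
    for line in yml_text.splitlines():
        # A line starting with a word character is a top-level key.
        if re.match(r"^\w", line):
            in_dependencies = line.startswith("dependencies:")
            continue
        # List items look like "  - spec"; ignore trailing comments.
        match = re.match(r"^\s*-\s+([^\s#]+)", line)
        if in_dependencies and match:
            spec = match.group(1)
            if not re.search(r"[=<>~!]", spec):
                unpinned.append(spec)
    return unpinned


print(dependencies_without_version(ENVIRONMENT_YML))  # ['pandas']
```

In this example only `pandas` is reported, because it is the one direct dependency listed without a version. A check like this could run before committing changes to environment.yml, though how strictly to pin is a judgment call.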

Mambaforge is a so-called “conda-forge distribution,” combining mamba (a drop-in replacement for conda) with conda-forge. This gives you access to the following things:

Possible Objections

But Alex, I’m managing dependencies for my Docker image; installing mambaforge will blow up my image size!

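No, you’re doing it wrong: use multi-stage builds, so that the final image only has to contain the environment you built rather than everything that was needed to build it.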

But Alex, I’m developing a Python library that I want to distribute through PyPI, and poetry is much better at that.

If you’re creating a (PyPI) package, then you’ll want to add something like poetry or hatch to your stack. So if that’s your main concern and you don’t need any of the benefits mambaforge gives you, then it might make sense to ditch mamba and use poetry for everything. Otherwise, I’d stick to my recommendation: let mambaforge take care of both package and environment management, and add poetry on top purely for its packaging capabilities.

Conda Best Practices

There are a few things you should know for managing your environment with conda (mamba) on a day-to-day basis, so keep in mind the following best practices:

An imaginary tale of how we got ourselves into this

To understand how we ended up with so many tools, I think it helps to think in terms of successive improvements on the status quo. Now I don’t know about the actual genesis, so I’m taking some poetic license to sacrifice historical accuracy for the sake of understanding, but you could imagine the progression of tools as follows:



Alexander Seifert

Hi, I'm Alex and I write this blog. Here you'll find articles and tutorials mostly about Natural Language Processing and related areas.
