Build/manage Python and R virtual environments with conda

Before working with conda environments, start by installing conda.

Note

We recommend using mamba as a drop-in replacement for conda when creating environments or installing packages. You will need to install mamba first by running:

conda install -n base -c conda-forge mamba
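Because mamba accepts the same subcommands as conda for the tasks on this page, a script can choose whichever tool is present. The following is a minimal sketch; the function name pick_installer is hypothetical and not part of conda or mamba.

```shell
# Hypothetical helper: report which installer is available on PATH.
# Prefers mamba, falls back to conda, and prints "none" if neither is found.
pick_installer() {
    if command -v mamba >/dev/null 2>&1; then
        echo mamba
    elif command -v conda >/dev/null 2>&1; then
        echo conda
    else
        echo none
    fi
}

installer=$(pick_installer)
echo "using: $installer"
```

A script could then run "$installer" install ... in place of a hard-coded mamba or conda command.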

After installing conda, the only environment is the default one, called base. You can install anything you like in base, but we recommend limiting it to the essential tools you use most often, because the larger an environment gets, the more complicated its dependencies become.

List available environments

conda env list

# Example output
# conda environments:
#
# base                  *  /home/user/miniconda3
# datasci                  /home/user/miniconda3/envs/datasci
# plantcv                  /home/user/miniconda3/envs/plantcv

The star indicates the currently active environment, which is also shown next to your command-line prompt, for example: (base) [user@stargate ~]$
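If a script needs the name of the active environment, the starred row can be picked out of the listing. This is a sketch that assumes the column layout shown in the example output; the function name active_env_from_list is hypothetical.

```shell
# Hypothetical helper: read `conda env list` output on stdin and print the
# name of the row whose second field is "*" (the active environment).
active_env_from_list() {
    awk '!/^#/ && $2 == "*" { print $1 }'
}

# Sample listing taken from the example output (rows shown without the "# ").
sample='# conda environments:
#
base                  *  /home/user/miniconda3
datasci                  /home/user/miniconda3/envs/datasci
plantcv                  /home/user/miniconda3/envs/plantcv'

printf '%s\n' "$sample" | active_env_from_list   # prints: base

# With conda installed, the same filter can read live output:
#   conda env list | active_env_from_list
# conda also sets CONDA_DEFAULT_ENV in an activated shell:
#   echo "$CONDA_DEFAULT_ENV"
```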

Activate an environment

Activating an environment alters your shell session so that your search paths point to the environment's executable directory and libraries.

Activate the datasci environment in the example above:

conda activate datasci

You should note the change to your command-line prompt: (datasci) [user@stargate ~]$
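The effect on the search path can be illustrated without conda. This sketch is a simplification, not conda's actual activation code (which also sets variables such as CONDA_PREFIX and updates the prompt); the environment path is the one from the example listing.

```shell
# Simplified illustration only: activation puts the environment's bin
# directory at the front of PATH so its executables are found first.
env_prefix="/home/user/miniconda3/envs/datasci"   # path from the example listing

# Build the PATH an activated shell would see (the real PATH is not modified).
new_path="$env_prefix/bin:$PATH"

# The first entry is everything before the first colon.
first_entry="${new_path%%:*}"
echo "$first_entry"   # prints: /home/user/miniconda3/envs/datasci/bin
```

In a real session you can check the result of activation with commands such as which python or echo "$CONDA_PREFIX".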

Deactivate an environment

You can turn off the currently active environment by deactivating it.

conda deactivate

Delete an environment

Environments can be removed. In the example below, we delete the datasci environment:

conda env remove -n datasci
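Before removing an environment in a script, it can help to confirm that the name actually appears in the environment list. The sketch below assumes the listing format shown earlier; the function name env_exists is hypothetical.

```shell
# Hypothetical helper: succeed if the environment named in $1 appears as the
# first field of a non-comment row in `conda env list` output read from stdin.
env_exists() {
    awk -v name="$1" '!/^#/ && $1 == name { found = 1 } END { exit !found }'
}

# Sample listing based on the example output earlier on this page.
listing='base                  *  /home/user/miniconda3
datasci                  /home/user/miniconda3/envs/datasci
plantcv                  /home/user/miniconda3/envs/plantcv'

if printf '%s\n' "$listing" | env_exists datasci; then
    echo "datasci exists"
fi
if ! printf '%s\n' "$listing" | env_exists missing_env; then
    echo "missing_env does not exist"
fi

# With conda installed, the check can guard the removal:
#   conda env list | env_exists datasci && conda env remove -n datasci
```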

Working with conda packages

Packages installed in a conda environment are downloaded from the Anaconda package repository. conda packages are maintained by a variety of sources, including core packages in the defaults channel maintained by Anaconda and the commonly used community channels conda-forge and bioconda. Regardless of channel, all of these packages can be found through the search interface at Anaconda. For example, if you search for the R package ggplot2, you will find that the conda package is named r-ggplot2 and is available from the conda-forge channel (it is also available in other channels, but the versions there are older).

Install one or more packages in the active environment

When working in an environment, you can install additional packages. For example:

A new Python package:

mamba install -c conda-forge plantcv

A couple of R packages:

mamba install -c conda-forge r-ggplot2 r-dplyr

A specific version of a program:

mamba install -c bioconda samtools=1.15.1

Note

You can install packages or software in an environment even if there is no available conda package. For example, in Python pip install seaborn would install seaborn from PyPI instead of Anaconda, or in R install.packages("ggplot2") would install ggplot2 from CRAN.

Create a new environment with a command

For relatively simple environments, you can specify all the packages on the command line:

mamba create -n myenv -c conda-forge -c bioconda 'r-base>=4' r-ggplot2 samtools
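The quotes around 'r-base>=4' in the command above matter: without them, the shell interprets > as an output redirect. The sketch below demonstrates this with echo in a scratch directory, so it runs without conda or mamba.

```shell
# Use a scratch directory so the stray file is easy to find and clean up.
workdir=$(mktemp -d)

# Unquoted: the shell parses ">" as a redirect, so echo receives only
# "r-base" and its output is written to a file named "=4".
( cd "$workdir" && echo r-base>=4 )

unquoted_file_created=no
[ -e "$workdir/=4" ] && unquoted_file_created=yes

# Quoted: the whole version spec reaches the command as a single argument.
quoted_arg=$(printf '%s' 'r-base>=4')

echo "stray file created by unquoted spec: $unquoted_file_created"
echo "quoted spec passed through as: $quoted_arg"

rm -rf "$workdir"
```

The same applies to any version constraint containing > or <, so quoting such specs is a safe habit.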

Create a new environment from a configuration file

For larger environments and for better documentation and reproducibility, you can build a simple configuration file with environment specifications. The environment files are encoded in YAML, for example environment.yml:

name: myenv
channels:
    - conda-forge
    - bioconda
dependencies:
    - python=3.10
    - matplotlib
    - numpy
    - pandas
    - scipy
    - scikit-image
    - scikit-learn
    - opencv>=4

Then to create the environment:

mamba env create -f environment.yml
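The environment that gets created takes its name from the name key in the file. As a sketch, a script can read that key before building, for example to check for a name clash. The shortened file contents and variable names here are illustrative assumptions, and the awk filter assumes a simple single-line name: entry.

```shell
# Write a shortened example environment file to a temporary location.
env_file=$(mktemp)
cat > "$env_file" <<'EOF'
name: myenv
channels:
    - conda-forge
dependencies:
    - python=3.10
    - numpy
EOF

# Read the value of the top-level "name:" key (first match only).
env_name=$(awk '$1 == "name:" { print $2; exit }' "$env_file")
echo "environment file defines: $env_name"   # prints: environment file defines: myenv

# With mamba installed, the environment would then be built with:
#   mamba env create -f "$env_file"
rm -f "$env_file"
```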

Additional examples are available here: /bioinfo/envs/configs/

Using conda environments with HTCondor jobs

See using conda environments in HTCondor.