We recommend running panpipes within a virtual environment to prevent conflicts. In the following, we provide instructions on how to do this using conda, mamba, or python venv.
Note: For installation instructions on Apple machines with M-series chips, scroll down. If you are working on a Windows machine, please use Windows Subsystem for Linux (WSL) to run panpipes, in order to have access to the latest r-base version. Oxford BMRC Rescomp users can find additional installation advice here.
To run panpipes, we install it in a conda environment together with R and Python. Panpipes has many dependencies, so you may want to consider the faster mamba instead of conda for installation.
Panpipes can be installed in different ways, either from PyPI or from the GitHub repository. Option 1 describes the installation via PyPI in a manually configured conda environment (no cloning of the repository necessary).
First, create a conda environment to run panpipes in. For that, we replicate the suggestions made here:

```shell
conda config --add channels conda-forge
conda config --set channel_priority strict
# you should remove the strict priority afterwards!
conda search r-base
conda create --name pipeline_env python=3.10 r-base=4.3.0
```
Next, we activate the environment:

```shell
conda activate pipeline_env
```
Let's first install the R packages:

```shell
conda install -c conda-forge r-tidyverse r-optparse r-ggforce r-ggraph r-xtable r-hdf5r r-clustree r-cowplot
```
Finally, we install panpipes. You can install it either from PyPI (shown here) or as a nightly version from the GitHub repository (shown in Option 2):

```shell
# Install panpipes from PyPI
pip install panpipes
```
If you intend to use panpipes for spatial analysis, instead install:

```shell
pip install 'panpipes[spatial]'
```
The extra `[spatial]` includes the `squidpy`, `cell2location`, and `tangram-sc` packages.
The packages we support are under active development. We provide installation extras to deal with known issues, which we test in our GitHub Actions:

- To install a version of panpipes with a working MultiVI, install panpipes with `pip install 'panpipes[multivipatch]'`
- To install a version of panpipes that can run old scvi-tools models in the refmap workflow, install panpipes with `pip install 'panpipes[refmap_old]'`
If you prefer to use the most recent development version, install the nightly panpipes version from the GitHub repository. To make the installation easier, we provide a minimal conda config file in `pipeline_env.yaml`.
First, clone the panpipes repository and navigate to the root directory of the repository:

```shell
git clone https://github.com/DendrouLab/panpipes.git
cd panpipes
```
Then, create the conda environment and install the nightly version of panpipes using the following commands:

```shell
conda env create --file=pipeline_env.yaml
conda activate pipeline_env
pip install -e .
```
To install the spatial dependencies from the repo, use:

```shell
pip install '.[spatial]'
```
The packages we support are under active development. We provide installation extras to deal with known issues, which we test in our GitHub Actions:

- To install a version of panpipes with a working MultiVI, install panpipes with `pip install '.[multivipatch]'`
- To install a version of panpipes that can run old scvi-tools models in the refmap workflow, install panpipes with `pip install '.[refmap_old]'`
Panpipes requires the Unix package `time`. You can check whether it is installed with `dpkg-query -W time`. If `time` is not already installed, you can install it using:

```shell
conda install time
```

or

```shell
apt-get install time
```
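As a quick check, the sketch below reports whether a standalone `time` executable is on your `PATH`. This is a generic bash snippet, not part of panpipes; note that bash's built-in `time` keyword is not the same as the standalone binary the pipeline needs:

```shell
# The pipeline needs a standalone `time` executable; bash also has a
# built-in `time` keyword, which is not sufficient. `type -P` reports
# only executables found on PATH, ignoring the keyword.
if TIME_BIN="$(type -P time)"; then
    echo "standalone time binary: $TIME_BIN"
else
    echo "standalone time binary not found"
fi
```

If the binary is missing, install it with one of the commands above.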
As an alternative to a conda environment, you can also install panpipes in a Python virtual environment. Navigate to where you want to create your virtual environment and follow the steps below to create a `pip` virtual environment.
```shell
# Create a panpipes/venv folder
python3 -m venv --prompt=panpipes python3-venv-panpipes/
```
Activate the environment:

```shell
source python3-venv-panpipes/bin/activate
```
As explained above, you can install panpipes from PyPI with:

```shell
pip install panpipes
```
Alternatively, you can install a nightly version of panpipes by cloning the GitHub repository (see instructions above).
If you are using a venv virtual environment, the pipeline will call a local R installation, so make sure R is installed, and install the required packages with the command we provide below. (The R executable requires that you specify a CRAN mirror in your `.Rprofile`.) For example, add the following line to your `.Rprofile` to automatically fetch the preferred mirror. Remember to customise it to your preferred CRAN mirror:

```R
options(repos = c(CRAN="https://cran.uni-muenster.de/"))
```
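If you prefer to set the mirror from the shell, here is a minimal sketch. It writes to a temporary file so it can be run safely anywhere; point the variable at `$HOME/.Rprofile` for real use, and the mirror URL shown is just an example:

```shell
# Sketch: append a CRAN mirror setting to an .Rprofile from the shell.
# A temporary file stands in for ~/.Rprofile; the mirror URL is an example.
RPROFILE="$(mktemp)"
echo 'options(repos = c(CRAN="https://cloud.r-project.org/"))' >> "$RPROFILE"
cat "$RPROFILE"
rm "$RPROFILE"
```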
Now, to automatically install the R dependencies, run:

```shell
panpipes install_r_dependencies
```

If you want more control over your installation, use the script on GitHub.
Running R with the option `--vanilla` or `--no-site-file` prevents it from reading your `.Renviron` or `.Rprofile`, in case you want to use settings different from your local R installation. You can expect the installation of the R libraries to take quite some time; this is not related to panpipes but to how R manages its libraries and dependencies!
To check that the installation was successful, run the following line:

```shell
panpipes --help
```

A list of available pipelines should appear!
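As an additional sanity check, you can confirm that a package is visible to `pip` in the active environment. This is plain `pip`, nothing panpipes-specific; the snippet below demonstrates it with `pip` itself so it runs anywhere — substitute `panpipes` after installation:

```shell
# Generic check that a package is installed in the active environment.
# Shown with `pip` itself; replace pkg with "panpipes" after installation.
pkg="pip"
python3 -m pip show "$pkg" | head -n 1   # → Name: pip
```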
If you intend to install panpipes via conda on a macOS machine with an M-series chip, you might face issues when installing or using certain panpipes workflows. This is because panpipes relies on `scvi-tools`, which currently only supports execution on Apple Silicon machines when installed using a native Python version (due to a dependency on JAX).
Follow these steps to install panpipes on an Apple Silicon machine:
- Install Homebrew
- Install the Apple Silicon version of Mambaforge (if you already have Anaconda/Miniconda installed, make sure that having both mamba and conda won't cause conflicts). Additionally, we need clang, which is included in llvm, so we install that as well:

```shell
brew install --cask mambaforge
brew install llvm
```
- Create a new environment using mamba (here with Python 3.10) and activate it:

```shell
conda config --add channels conda-forge
conda config --set channel_priority strict
# you should remove the strict priority afterwards!
mamba search r-base
mamba create --name pipeline_env
mamba activate pipeline_env
```
- Add the osx-64 channel to the environment, then install Python and R. Because not all R packages are available via the ARM64 channel, we need to specify the osx-64 channel to install all required R packages:

```shell
conda config --env --set subdir osx-64
mamba install python=3.10 r-base=4.3.0
```
- Install the R dependencies and panpipes itself:

```shell
mamba install -c conda-forge r-tidyverse r-optparse r-ggforce r-ggraph r-xtable r-hdf5r r-clustree r-cowplot
pip install panpipes # or install the nightly version by cloning the repository as described above
```
- Ensure the `time` package is installed. If the `time` package is not already installed, you can install it using:

```shell
mamba install time
```
You're all set to run panpipes on your local machine.

Note: If you encounter an issue with the `jax` dependency when running panpipes, try reinstalling it via conda-forge:

```shell
pip uninstall jax jaxlib
mamba install -c conda-forge jaxlib
mamba install -c conda-forge jax
```
If you want to configure panpipes on an HPC server, follow the instructions in the following section.

This section is for users who want to run panpipes on a High-Performance Computing (HPC) cluster with a job scheduler such as SGE or SLURM.

Note: You only need this configuration step if you want to use an HPC to dispatch individual tasks as separate parallel jobs. You won't need this for a local installation of panpipes.
Create a yml file for the cgat-core pipeline software to read:

```shell
vim ~/.cgat.yml
```
For SGE servers:

```yaml
cluster:
    queue_manager: sge
    queue: short
    condaenv:
```
For Slurm servers:

```yaml
cluster:
    queue_manager: slurm
    queue: short
```
There are likely other options that relate to your specific server. These are added as:

```yaml
cluster:
    queue_manager: slurm
    queue: short
    options: --qos=xxx --exclude=compute-node-0[02-05,08-19],compute-node-010-0[05,07,35,37,51,64,68-71]
```
See the cgat-core documentation for cluster-specific additional configuration instructions. Note that there is extra information on the `.cgat.yml` for Oxford BMRC Rescomp users in `docs/installation_rescomp`.
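Before pointing the pipeline at your edited configuration, it can help to confirm the YAML still parses. This sketch assumes Python's PyYAML package is available (it ships with cgat-core environments) and inlines a fragment for illustration; in practice, point it at your `~/.cgat.yml`:

```shell
# Sketch: validate a cluster config fragment before cgat-core reads it.
# The YAML is inlined here for illustration; use ~/.cgat.yml in practice.
cfg="$(mktemp)"
cat > "$cfg" <<'EOF'
cluster:
    queue_manager: slurm
    queue: short
EOF
# Prints the queue manager if the file parses, or a traceback if it doesn't.
python3 -c "import sys, yaml; print(yaml.safe_load(open(sys.argv[1]))['cluster']['queue_manager'])" "$cfg"
rm "$cfg"
```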
If `echo $DRMAA_LIBRARY_PATH` does not return anything, add `DRMAA_LIBRARY_PATH` to your bash environment (the line below is specific to the Rescomp server). You need to find the path to the `libdrmaa.so.1.0` file on your server; ask your sysadmin if you cannot find it!

```shell
PATH_TO_DRMAA=""
echo "export DRMAA_LIBRARY_PATH=$PATH_TO_DRMAA/libdrmaa.so.1.0" >> ~/.bashrc
```
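As a concrete sketch of what gets appended to `~/.bashrc`, with a hypothetical `/usr/lib` path standing in for the real location of `libdrmaa.so.1.0` on your server:

```shell
# Hypothetical path for illustration; locate libdrmaa.so.1.0 on your server.
PATH_TO_DRMAA="/usr/lib"
EXPORT_LINE="export DRMAA_LIBRARY_PATH=${PATH_TO_DRMAA}/libdrmaa.so.1.0"
echo "$EXPORT_LINE"   # → export DRMAA_LIBRARY_PATH=/usr/lib/libdrmaa.so.1.0
```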
If using conda environments, you can use one single big environment (the instructions provided do just that) or create one for each of the workflows in panpipes (i.e. one workflow = one environment). The environment(s) should be specified in the `.cgat.yml` global configuration file or in each workflow's `pipeline.yml` configuration file, and it will be picked up by the pipeline as the default environment. Please note that if you specify the conda environment in a workflow's configuration file, it will take precedence when running that pipeline.

If no environment is specified, the default behaviour of the pipeline is to inherit environment variables from the node where the pipeline is run. However, there have been reports of SLURM clusters where this was not the default behaviour. In such instances, we recommend adding the conda environment parameter to the `.cgat.yml` file or to each `pipeline.yml` independently.
```yaml
cluster:
    queue_manager: slurm
    queue: cpu_p
    options: --qos=xxx --exclude=compute-node-0[02-05,08-19],compute-node-010-0[05,07,35,37,51,64,68-71]
    condaenv: /path/to/pipeline_env
```

or:
```yaml
# ----------------------- #
# Visualisation pipeline DendrouLab
# ----------------------- #
# written by Charlotte Rich-Griffin and Fabiola Curion
# WARNING: Do not edit any line with the format `continuous_vars: &continuous_vars` or `continuous_vars: *continuous_vars`
# ------------------------
# compute resource options
# ------------------------
resources:
    # Number of threads used for parallel jobs
    # this must be enough memory to load your mudata and do computationally intensive tasks
    threads_high: 1
    # this must be enough memory to load your mudata and do computationally light tasks
    threads_medium: 1
    # this must be enough memory to load text files and do plotting, requires much less memory than the other two
    threads_low: 1
# path to conda env, leave blank if running native or your cluster automatically inherits the login node environment
condaenv: /path/to/env