Accessing software
Because different software packages sometimes require conflicting dependencies, we will handle installing software. You will then access (i.e., load or initialize) software in your job scripts using one of the following mechanisms:
- Modules
- Conda environments
Here we will briefly describe each of these mechanisms.
Modules
Environment Modules (more commonly just "modules") is a system for dynamically loading software packages using a command-line tool called module.
Using modules in a job script
If your software is accessible via a module, you just add something like the following to your job script:
# Load newest version of a module
module load moduleName
# Load a specific version of a module
module load moduleName/version
We recommend that your job script clear any loaded modules before loading other ones. This can be done using module purge.
For example:
module load heimdall
# RUN HEIMDALL
module purge
module load miniconda
# DO SOMETHING WITH CONDA
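The purge-then-load pattern above can be wrapped in a small shell function so that each step of a job script starts from a clean module environment. This is only a sketch: load_only is a hypothetical helper name, not something provided by the cluster, and it assumes the module command is already available in the job's shell.

```shell
#!/bin/bash

# Hypothetical helper (not provided by the cluster): unload every
# currently loaded module, then load only the modules given as arguments.
load_only() {
    module purge
    module load "$@"
}

# Possible use in a job script, with the module names from the example above:
#   load_only heimdall
#   # RUN HEIMDALL
#   load_only miniconda
#   # DO SOMETHING WITH CONDA
```

Passing all arguments through "$@" means a specific version such as gcc/8.3.1 can be requested in the same way as a bare module name.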
Some modules require additional lines in the job script before they can be used:
- Miniconda: add
eval "$(conda shell.bash hook)"
- Gaussian16: add (including the period at the beginning):
. $g16root/g16/bsd/g16.profile
Showing available modules
You can see which modules are available on the cluster using module avail.
The output will look something like:
------------------------------------------- /usr/share/modulefiles -------------------------------------------
pmi/pmix-x86_64
---------------------------------------------- /opt/modulefiles ----------------------------------------------
gcc/8.3.1       heimdall                mpich/3.3.1-gcc-8.4.1 (D)   openmpi/3.1.4-gcc-8.4.1
gcc/8.4.1       miniconda               mvapich2/2.3.2-gcc-8.3.1    openmpi/4.0.2-gcc-8.3.1
gcc/8.5.0 (D)   mpich/3.3.1-gcc-8.3.1   openmpi/3.1.4-gcc-8.3.1     openmpi/4.0.2-gcc-8.4.1 (D)
Where:
D: Default Module
Use "module spider" to find all possible modules and extensions.
Use "module keyword key1 key2 ..." to search for all possible modules matching any of the "keys".
In this output there are 7 different modules:
- pmi
- gcc
- heimdall
- miniconda
- mpich
- mvapich2
- openmpi
The (D) beside a module marks the default version, which is the one loaded by module load moduleName. For example, module load gcc will load GCC version 8.5.0, while module load gcc/8.3.1 loads version 8.3.1 of GCC.
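To make the role of the (D) marker concrete, the sketch below scans module avail output for the entry of a given module that is immediately followed by (D). default_version is a hypothetical helper name, and the function reads the listing from standard input so it can be tried on saved output. The "Use module spider" hint in the listing suggests the cluster runs Lmod, which typically writes the listing to standard error, so the usage line redirects it; treat that as an assumption about this cluster.

```shell
#!/bin/bash

# Hypothetical helper: read `module avail` output on standard input and
# print the entry for module "$1" that is immediately followed by "(D)".
default_version() {
    awk -v name="$1" '
        {
            for (i = 1; i < NF; i++) {
                # An entry is the default when the next field is "(D)".
                if ($(i + 1) == "(D)" && index($i, name "/") == 1) {
                    print $i
                }
            }
        }
    '
}

# Possible use on the cluster (assumes the listing goes to standard error):
#   module avail 2>&1 | default_version gcc
```

Modules that have only one version, such as heimdall in the listing above, carry no (D) marker, so the function prints nothing for them.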
Conda environments
Conda is an open-source package and environment management system that was originally designed for Python software management. Most, but not all, of the scientific software loaded through Conda is written in Python.
Using Conda in a job script
The cluster uses Miniconda for software management, which must be loaded using the module command (as described above) before it can be used. Once loaded, you can activate an environment with:
conda activate environmentName
replacing environmentName with the actual name of the environment (e.g., conda activate neuron). Once the environment is activated, run your code in your job script as you normally would.
You can deactivate the current Conda environment using the command conda deactivate (you do not need to give the environment name).
Only a single Conda environment can be active at one time. If you try to activate another one using conda activate, the current environment will be deactivated, meaning its software will no longer be available to run.
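Putting the pieces from this page together, the Conda part of a job script might be organized as below. This is a sketch rather than a required layout: use_conda_env is a hypothetical helper name, while the module name miniconda, the shell hook line, and the environment name neuron are taken from the examples on this page. The script name my_analysis.py is made up for illustration.

```shell
#!/bin/bash

# Hypothetical helper: start from a clean module environment, load
# Miniconda, initialize Conda for this shell, and activate one environment.
use_conda_env() {
    module purge
    module load miniconda
    eval "$(conda shell.bash hook)"   # extra line required for Miniconda
    conda activate "$1"
}

# Possible use in a job script:
#   use_conda_env neuron
#   python my_analysis.py    # made-up script name
#   conda deactivate
```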
Showing available Conda environments
Assuming Miniconda has been loaded as described above, you can list all environments using conda env list. It will output something like the following:
# conda environments:
#
base         *  /opt/apps/miniconda3
cudatoolkit     /opt/apps/miniconda3/envs/cudatoolkit
fetch           /opt/apps/miniconda3/envs/fetch
neuron          /opt/apps/miniconda3/envs/neuron
psrchive        /opt/apps/miniconda3/envs/psrchive
pysigproc       /opt/apps/miniconda3/envs/pysigproc
The left column is the environment name, which is the name you will use with the conda activate command. The right column is the path (i.e., folder) where that environment is located, but the actual location is not important. What matters most for your job scripts is the environment name.
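Since job scripts depend only on the environment name, it can be useful to check that a name appears in the left column before trying to activate it. The sketch below does this by reading conda env list output from standard input; env_exists is a hypothetical helper name, not part of Conda.

```shell
#!/bin/bash

# Hypothetical helper: read `conda env list` output on standard input and
# succeed only if the first column of some line equals the name in "$1".
env_exists() {
    awk -v name="$1" '
        $1 == name { found = 1 }
        END { exit(found ? 0 : 1) }
    '
}

# Possible use in a job script:
#   if conda env list | env_exists neuron; then
#       conda activate neuron
#   else
#       echo "Conda environment neuron not found" >&2
#       exit 1
#   fi
```

Comparing the whole first column, rather than searching for the name anywhere in the line, avoids false matches against the path in the right column.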