Getting started with the Sleep Revolution Slurm cluster
Slurm is a workload manager and job scheduler. The cluster is built from compute nodes: one login node, which also acts as the master node, and several worker nodes.
The login node is called cheetah.sleep.ru.is (130.208.209.30).
The cluster is a Linux environment; connect to it with SSH.
Open a terminal and log in with: ssh <myUser>@cheetah.sleep.ru.is
About the user environment:
Each user has a home folder at /mount/home/<username>.
The home folder contains two files: a readMe.lst file and a template Slurm script (exampleJob.sh).
- The first step for users is to make sure their working environment fits their needs.
- For those using Python, we recommend a Python virtual environment for ease of use. Users can then install a specific version of Python and all the modules their work needs.
- Alternatively, users can install Python modules locally without a virtual environment by running pip3 install --user <module>, which installs into their home folder.
- Every user needs a script file that tells the cluster what to do.
- The cluster gives users access to memory, CPUs, and GPUs. To execute your job on the cluster, first update exampleJob.sh to run your code (or make a copy of it).
- Then submit your job to the queue with: sbatch exampleJob.sh
Your template script file (exampleJob.sh) looks like this:
#!/bin/bash
#SBATCH --account=staff
#SBATCH --job-name=sleepJob
#SBATCH --gpus-per-node=1
#SBATCH --mem-per-cpu=2G
#SBATCH --output=Slurm.log
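As a sketch, an edited copy of the template might look like the following. Everything here beyond the #SBATCH directives is an assumption for illustration: "myenv" and "my_script.py" are placeholder names, not files provided on the cluster.

```shell
#!/bin/bash
#SBATCH --account=staff
#SBATCH --job-name=sleepJob
#SBATCH --gpus-per-node=1
#SBATCH --mem-per-cpu=2G
#SBATCH --output=Slurm.log

# Lines below the #SBATCH directives run on the worker node.
# Activate your virtual environment (if you made one), then start your program.
source ~/myenv/bin/activate
python3 my_script.py
```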
Useful commands
| Command | Description |
|---|---|
| sbatch <exampleJob.sh> | Submit a job |
| sacct | Show my jobs |
| squeue | Show all jobs in the queue |
| sinfo | Show information about the cluster |
| srun | Run a job interactively |
| scancel <id> | Cancel the job with ID <id> |
Examples of Python virtual environment tools:
- Python venv
- Miniconda
- Anaconda
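A minimal sketch of the venv route (the environment name "myenv" and the module "numpy" are example placeholders):

```shell
# Create a virtual environment in your home folder
python3 -m venv ~/myenv

# Activate it for the current shell session
source ~/myenv/bin/activate

# Install the modules your job needs
pip3 install numpy

# Leave the environment when you are done
deactivate
```

Remember to activate the same environment inside your job script (source ~/myenv/bin/activate) so the worker node uses the Python and modules you installed.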