Here are the rules of engagement for the cluster.

The cluster is a very versatile tool that can help us work together to achieve miraculous goals. But as with all shared resources, there must be certain rules that apply to all of us to make the cooperation possible, and as headache-free as possible.

0) Do not download data.

The cluster is for the express purpose of handling access to and the processing of the Sleep Revolution data. Therefore you should under no circumstances attempt to download data from the cluster. Failure to heed that rule will result in an instant ban from the cluster and a review of your activities on the Sleep Revolution infrastructure.

1) Do not copy datasets around.

The cluster can store an impressive amount of data, but not an infinite amount. If everyone copied datasets around, we'd fill up the storage space pretty quickly. We also have to be very careful that we know exactly which versions of the datasets we are working with, and where they are stored.

This does not apply to “processing checkpoints”, for example saving an intermediate step while you're processing the data, so that you do not have to start from scratch whenever you make changes to your code.
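A processing checkpoint of this kind could look roughly like the sketch below. It is only an illustration: the function names `load_or_compute` and `expensive_step`, the file name, and the use of `pickle` are all assumptions, not an established cluster convention. The idea is to reuse a saved intermediate result if it exists, and otherwise compute it once and save it.

```python
import pickle
import tempfile
from pathlib import Path


def load_or_compute(checkpoint_path, compute):
    """Return the saved result if the checkpoint exists, else compute and save it."""
    checkpoint_path = Path(checkpoint_path)
    if checkpoint_path.exists():
        with checkpoint_path.open("rb") as f:
            return pickle.load(f)

    result = compute()
    # Write to a temporary file first, then rename it, so an interrupted run
    # does not leave a half-written checkpoint behind.
    tmp_path = checkpoint_path.with_suffix(checkpoint_path.suffix + ".tmp")
    with tmp_path.open("wb") as f:
        pickle.dump(result, f)
    tmp_path.replace(checkpoint_path)
    return result


def expensive_step():
    # Hypothetical stand-in for a slow processing step.
    return [x * x for x in range(5)]


if __name__ == "__main__":
    # Hypothetical location; use whatever working directory your project has.
    path = Path(tempfile.gettempdir()) / "step1_checkpoint.pkl"
    print(load_or_compute(path, expensive_step))
```

Keep checkpoints small and delete them when they are no longer needed, so that they do not turn into dataset copies by another name.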

2) Do not (try to) hog the system resources.

This one can be tricky, since it can happen completely involuntarily. For instance, some Python packages, such as NumPy, tend to spawn extra processes or threads to speed up computation.

Processing power, like everything else, is not an infinite resource, and we must be careful not to hog all of it.
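One common way to keep numerical packages from grabbing every core is to cap their thread count through environment variables before they are imported. The sketch below assumes your packages use one of the usual backends (OpenMP, OpenBLAS, Intel MKL, numexpr). The helper name `limit_threads` and the value 4 are illustrative only, not a cluster policy.

```python
import os

# Environment variables read by common numerical backends
# (OpenMP, OpenBLAS, Intel MKL, numexpr).
THREAD_ENV_VARS = (
    "OMP_NUM_THREADS",
    "OPENBLAS_NUM_THREADS",
    "MKL_NUM_THREADS",
    "NUMEXPR_NUM_THREADS",
)


def limit_threads(n_threads):
    """Cap the thread count for numerical libraries; call before importing them."""
    for name in THREAD_ENV_VARS:
        os.environ[name] = str(n_threads)
    return {name: os.environ[name] for name in THREAD_ENV_VARS}


# The value 4 is an arbitrary example.
limit_threads(4)
# Import NumPy (or any package built on it) only after this point,
# e.g. `import numpy as np`, so the backend sees the limits at load time.
```

These variables are generally read when the library is first loaded, so setting them after the import usually has no effect.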

3) Do not try to "cheat the system".

If any facet of the cluster seems off or counterintuitive to you, please do not try to hack around it. The cluster's configuration is very specific and tailored to the work we do in the Sleep Revolution.