cluster:rules_of_engagement

Differences

This shows you the differences between two versions of the page.

Old: cluster:rules_of_engagement [2023/07/19 14:01] benedikt
New: cluster:rules_of_engagement [2023/07/19 14:03] (current) – [1) Do not copy datasets around.] benedikt
Line 10:
 The cluster can store an impressive amount of data, but not an infinite amount. If everyone copied datasets around, we'd fill up the storage space pretty quickly. We also have to be very careful that we know exactly which versions of the datasets we are working with, and where they are stored.
  
 +This does not apply to "processing checkpoints", for example saving an intermediate step while you're processing the data so that you do not have to start from scratch whenever you change your code.
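
A processing checkpoint of this kind could look roughly like the sketch below. It is only an illustration: the helper `load_or_compute`, the stand-in function `expensive_step` and the checkpoint location are hypothetical names, not something the cluster provides. The idea is that the derived intermediate result is saved, while the original dataset is read in place and never copied.

```python
import pickle
import tempfile
from pathlib import Path


def load_or_compute(checkpoint_path, compute):
    """Return the saved result if the checkpoint exists, else compute and save it."""
    checkpoint_path = Path(checkpoint_path)
    if checkpoint_path.exists():
        # Reuse the intermediate result from an earlier run.
        with checkpoint_path.open("rb") as fh:
            return pickle.load(fh)
    result = compute()
    checkpoint_path.parent.mkdir(parents=True, exist_ok=True)
    with checkpoint_path.open("wb") as fh:
        pickle.dump(result, fh)
    return result


def expensive_step():
    # Stand-in for a slow processing step on the shared dataset.
    return [x * x for x in range(5)]


# Hypothetical location; use your own working directory instead.
checkpoint = Path(tempfile.gettempdir()) / "my_analysis" / "squares.pkl"
squares = load_or_compute(checkpoint, expensive_step)
```

Delete the checkpoint file whenever the processing step itself changes, otherwise the stale result is loaded again.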
 ===== 2) Do not (try to) hog the system resources. =====
 This one can be tricky, since it can happen completely involuntarily. For instance, some Python packages, such as NumPy, tend to spawn additional threads or processes to speed up computation.
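
One common way to keep NumPy from using more cores than intended is to cap the thread pools of its numerical backends through environment variables. The sketch below assumes a limit of 4 threads, which is only a placeholder. Which of these variables is actually honoured depends on how NumPy was built (OpenMP, OpenBLAS or MKL), so treat this as a starting point and not as cluster policy.

```python
import os

# Assumed limit; choose a value that matches the resources you are meant to use.
MAX_THREADS = "4"

# Thread-count variables read by common NumPy backends (OpenMP, OpenBLAS, MKL).
THREAD_VARS = ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS")

for var in THREAD_VARS:
    os.environ[var] = MAX_THREADS

# Import NumPy only after the limits are set, because the backends
# read these variables when they are first loaded:
# import numpy as np
```

The same variables can also be exported in the shell before starting Python, which avoids having to worry about import order.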
  • cluster/rules_of_engagement.1689775306.txt.gz
  • Last modified: 2023/07/19 14:01
  • by benedikt