====== Tracking Machine Learning Experiments with MLflow ======
A how-to guide for setting up an MLflow server, connecting to it, tracking experiments, and analysing them.
MLflow structure:
{{:cluster:usage_tips:mlflow_with_sleepvae.png?600|}}
==== Installing MLflow ====
1) Activate your conda environment (conda activate your-custom-environment, e.g. lukaenv) and install MLflow into it:
pip install mlflow
2) Run the following commands to set up a tracking server on the cluster:
ARTIFACT_ROOT="file:///home/your-username@sleep.ru.is/your-directory"
Choose any free port number, e.g. 5001. The port must not already be in use by another user on the cluster.
mlflow server --host 0.0.0.0 --port=5001 --default-artifact-root "$ARTIFACT_ROOT"
3) Run the MLflow version of the autoencoder (or set up your own experiment code with whatever you want to track; see the quick instructions under "How to use MLflow" below)
4) Create an SSH tunnel from the MLflow server port on the cluster (your-portnumber, the port chosen in step 2) to port 5000 on your own computer, so that you can view and compare the experiments in your own browser:
Run the following command in your computer's command prompt to map the ports:
ssh -L localhost:5000:130.208.209.4:your-portnumber your-username@130.208.209.4
5) Open http://localhost:5000 in your browser
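The setup steps above have a few places where small mistakes are easy to make: the form of the artifact URI, a port that is already taken, and a tunnel that is not actually up. The following is a minimal sketch of helper checks using only the Python standard library. The function names (artifact_root_uri, port_is_free, server_is_reachable) are hypothetical and not part of MLflow; the directory in the usage note is a placeholder.

```python
import socket
import urllib.error
import urllib.request
from pathlib import PurePosixPath


def artifact_root_uri(directory: str) -> str:
    """Turn an absolute directory on the cluster into a file:// URI."""
    path = PurePosixPath(directory)
    if not path.is_absolute():
        raise ValueError(f"artifact directory must be absolute: {directory!r}")
    # "file://" followed by a path starting with "/" gives the three-slash form.
    return "file://" + str(path)


def port_is_free(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if nothing is currently bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def server_is_reachable(url: str, timeout: float = 3.0) -> bool:
    """Return True if an HTTP server answers at url without an error status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except (urllib.error.URLError, OSError):
        return False
```

For example, port_is_free(5001) can be run on the cluster before starting the server, and server_is_reachable("http://localhost:5000") on your own computer once the tunnel is open. artifact_root_uri("/home/your-directory") returns a URI with three slashes after "file:".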
==== How to use MLflow ====
1) Configure the run at the beginning of your training code:
import mlflow
mlflow_tracking_uri = 'http://localhost:your-port'
mlflow_experiment = 'test'
mlflow.set_tracking_uri(mlflow_tracking_uri)
mlflow.set_experiment(mlflow_experiment)
2) Wrap your training code in a "with mlflow.start_run():" block, so that everything logged inside it belongs to one run
3) Start the tracking
Track parameters by putting these lines in the code of your model:
mlflow.log_param('testparameter', testparam)
Track metrics (these can change every epoch, so pass the epoch as the step):
mlflow.log_metric('testmetric', testmetric, step=epoch)
Track artefacts (figures, code, etc.; files must be saved to disk before they can be logged) using:
mlflow.log_artifact('/tmp/test_fig.png')
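The three steps above can be put together in one place. The following is a minimal sketch, not the autoencoder code itself: the function train_with_tracking, the decaying dummy loss, and the file name loss_summary.txt are all made up for illustration. The tracker argument is assumed to be the mlflow module; passing it in as an argument is only a convenience so the loop can be tried without a running server.

```python
import math
import os
import tempfile


def train_with_tracking(tracker, n_epochs=3, learning_rate=0.01):
    """Run a dummy training loop and log it through `tracker`.

    `tracker` is expected to be the `mlflow` module, or any object that
    offers start_run, log_param, log_metric and log_artifact.
    """
    with tracker.start_run():
        # Parameters are logged once per run.
        tracker.log_param("learning_rate", learning_rate)
        tracker.log_param("n_epochs", n_epochs)

        losses = []
        for epoch in range(n_epochs):
            # Placeholder for a real training step: a loss that decays per epoch.
            loss = math.exp(-learning_rate * epoch)
            losses.append(loss)
            # Metrics are logged once per epoch, with the epoch as the step.
            tracker.log_metric("loss", loss, step=epoch)

        # Artifacts must exist on disk before they can be logged.
        with tempfile.TemporaryDirectory() as tmp_dir:
            summary_path = os.path.join(tmp_dir, "loss_summary.txt")
            with open(summary_path, "w") as handle:
                handle.write("\n".join(f"{value:.6f}" for value in losses))
            tracker.log_artifact(summary_path)
    return losses


# Usage with a running server (not executed here):
#   import mlflow
#   mlflow.set_tracking_uri("http://localhost:your-port")
#   mlflow.set_experiment("test")
#   train_with_tracking(mlflow, n_epochs=10)
```

After such a run, the parameters, the per-epoch loss curve, and the summary file should appear under the chosen experiment in the MLflow UI.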