====== Tracking Machine Learning Experiments with MLflow ======

How-to guide for setting up and connecting to MLflow, tracking experiments, and analysing them.

MLflow structure:

{{:cluster:usage_tips:mlflow_with_sleepvae.png?600|}}

==== Installing MLflow ====

1) Install MLflow into your own conda environment (e.g. lukaenv):

  conda activate your-custom-environment
  pip install mlflow

2) Run the following commands on the cluster to set up a tracking server. Choose any free port number, e.g. 5001; you cannot use a port that another user on the cluster is already using.

  ARTIFACT_ROOT="file:///home/your-username@sleep.ru.is/your-directory"
  mlflow server --host 0.0.0.0 --port=5001 --default-artifact-root "$ARTIFACT_ROOT"

3) Run the MLflow version of the autoencoder, or set up your own experiment code with whatever you want to track (see the quick instructions under "How to use MLflow" below).

4) Create a tunnel from your MLflow port on the cluster to port 5000 on your own machine, so that you can view and compare the experiments in your own browser. Run the following in your computer's command prompt to map the ports:

  ssh -L localhost:5000:130.208.209.4:your-portnumber your-username@130.208.209.4

5) Open localhost:5000 in your browser.

==== How to use MLflow ====

1) Configure the run at the beginning of your training code:

  import mlflow
  mlflow_tracking_uri = 'http://localhost:your-port'
  mlflow_experiment = 'test'
  mlflow.set_tracking_uri(mlflow_tracking_uri)
  mlflow.set_experiment(mlflow_experiment)

2) Add "with mlflow.start_run():" before your training code and indent the training code under it.

3) Start the tracking.

Track parameters by putting lines like this in the code of your model:

  mlflow.log_param('testparameter', testparam)

Track metrics (these can change every epoch):

  mlflow.log_metric('testmetric', testmetric, step=epoch)

Track artifacts (figures, code, etc.; these need to be saved to disk before logging):

  mlflow.log_artifact('/tmp/test_fig.png')
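
The tracking steps above can be put together roughly as follows. This is only a minimal sketch, not the autoencoder code: ''simulated_loss'' and ''run_tracked_training'' are hypothetical names, the decaying loss is a stand-in for a real training loop, and the file name ''losses.txt'' is made up for illustration. It assumes the tracking server from the installation steps is already running on your port.

```python
import math
import os
import tempfile


def simulated_loss(epoch, learning_rate):
    """Hypothetical stand-in for one epoch of training: a loss that decays with the epoch."""
    return math.exp(-learning_rate * epoch)


def run_tracked_training(tracking_uri, experiment, learning_rate=0.1, n_epochs=5):
    """Run the toy training loop and log parameters, metrics and one artifact to MLflow."""
    # Imported here so that simulated_loss can be used where MLflow is not installed.
    import mlflow

    # Step 1: point the client at the tracking server and pick an experiment.
    mlflow.set_tracking_uri(tracking_uri)
    mlflow.set_experiment(experiment)

    # Step 2: everything logged inside this block belongs to one run.
    with mlflow.start_run():
        # Step 3: parameters are logged once per run.
        mlflow.log_param("learning_rate", learning_rate)
        mlflow.log_param("n_epochs", n_epochs)

        losses = []
        for epoch in range(n_epochs):
            loss = simulated_loss(epoch, learning_rate)
            losses.append(loss)
            # Metrics can be logged every epoch; step gives the x-axis in the UI.
            mlflow.log_metric("loss", loss, step=epoch)

        # Artifacts must exist on disk before they can be logged.
        with tempfile.TemporaryDirectory() as tmp_dir:
            path = os.path.join(tmp_dir, "losses.txt")
            with open(path, "w") as f:
                f.write("\n".join(f"{value:.6f}" for value in losses))
            mlflow.log_artifact(path)

    return losses
```

On the cluster this could be called as run_tracked_training('http://localhost:your-port', 'test'), with your-port replaced by the port chosen for the server. The run should then appear under the experiment "test" in the browser view.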