====== Tracking Machine Learning Experiments with MLflow ======

How-to guide for setting up and connecting to MLflow, tracking experiments, and analysing them.

MLflow structure:

{{:cluster:usage_tips:mlflow_with_sleepvae.png?600|}}

==== Installing MLflow ====

1) Install MLflow into your conda environment, e.g. lukaenv (activate it first with conda activate your-custom-environment):

<code>
pip install mlflow
</code>

2) Run the following commands on the cluster to set up a tracking server:

<code>
ARTIFACT_ROOT="file:///home/your-username@sleep.ru.is/your-directory"
</code>

Choose any port number, e.g. 5001. You cannot use a port that another user on the cluster is already using.

<code>
mlflow server --host 0.0.0.0 --port=5001 --default-artifact-root "$ARTIFACT_ROOT"
</code>

3) Run the MLflow version of the autoencoder, or set up your own experiment code with whatever you want to track (see the quick instructions under "How to use MLflow" below).

4) Create an SSH tunnel from port 5000 on your own computer to the port your MLflow server uses on the cluster, so that you can view and compare the experiments in your own browser. Run the following in your computer's command prompt:

<code>
ssh -L localhost:5000:130.208.209.4:your-portnumber your-username@130.208.209.4
</code>

5) Open localhost:5000 in your browser.

==== How to use MLflow ====

1) Configure the run at the beginning of your training code:

<code>
import mlflow

mlflow_tracking_uri = 'http://localhost:your-port'
mlflow_experiment = 'test'
mlflow.set_tracking_uri(mlflow_tracking_uri)
mlflow.set_experiment(mlflow_experiment)
</code>

2) Add "with mlflow.start_run():" before your training code and indent the training code under it.

3) Start the tracking.

Track parameters by putting lines like this in your model code:

<code>
mlflow.log_param('testparameter', testparam)
</code>

Track metrics (these can change every epoch):

<code>
mlflow.log_metric('testmetric', testmetric, step=epoch)
</code>
Track artefacts (figures, code, etc.; these must be saved to disk before logging) using:

<code>
mlflow.log_artifact('/tmp/test_fig.png')
</code>

cluster/usage_tips/mlflow.txt · Last modified: 2022/11/28 05:48 by matiasrus