Track experiments and deploy models in Azure Machine Learning
In this article, learn how to add logging code to your training script with the MLflow API and track the experiment in Azure Machine Learning. You can monitor run metrics, to enhance the model creation process.
This diagram shows that with MLflow Tracking, you track the run metrics of an experiment, and store model artifacts in your Azure Machine Learning workspace:
Prerequisites
Create a new notebook
The Azure Machine Learning and MLFlow SDK are preinstalled on the Data Science Virtual Machine (DSVM). You can access these resources in the azureml_py36_* conda environment. In JupyterLab, select on the launcher and select this kernel:
Set up the workspace
Go to the Azure portal and select the workspace you provisioned as part of the prerequisites. Note the Download config.json configuration file, as shown in the next image. Download this file, and store it in your working directory on the DSVM.
The config file contains workspace name, subscription, etc. information. You don't need to hard code these parameters with this file.
Track DSVM runs
To set the Azure Machine Learning workspace object, add this code to your notebook or script:
import mlflow
from azureml.core import Workspace
ws = Workspace.from_config()
mlflow.set_tracking_uri(ws.get_mlflow_tracking_uri())
Note
The tracking URI is valid for up to one hour. If you restart your script after some idle time, use the get_mlflow_tracking_uri API to get a new URI.
Load the data
This example uses the diabetes dataset, a well-known small dataset included with scikit-learn. This cell loads the dataset and splits it into random training and testing sets.
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
import joblib
X, y = load_diabetes(return_X_y = True)
columns = ['age', 'gender', 'bmi', 'bp', 's1', 's2', 's3', 's4', 's5', 's6']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
data = {
"train":{"X": X_train, "y": y_train},
"test":{"X": X_test, "y": y_test}
}
print ("Data contains", len(data['train']['X']), "training samples and",len(data['test']['X']), "test samples")
Add tracking
Add experiment tracking using the Azure Machine Learning SDK, and upload a persisted model into the experiment run record. This code sample adds logs, and uploads a model file to the experiment run. The model is also registered in the Azure Machine Learning model registry:
# Get an experiment object from Azure Machine Learning
from azureml.mlflow import register_model
experiment_name = 'experiment_with_mlflow'
mlflow.set_experiment(experiment_name)
with mlflow.start_run():
# Log the algorithm parameter alpha to the run
mlflow.log_param('alpha', 0.03)
# Create, fit, and test the scikit-learn Ridge regression model
regression_model = Ridge(alpha=0.03)
regression_model.fit(data['train']['X'], data['train']['y'])
preds = regression_model.predict(data['test']['X'])
# Output the Mean Squared Error to the notebook and to the run
print('Mean Squared Error is', mean_squared_error(data['test']['y'], preds))
mlflow.log_metric('mse', mean_squared_error(data['test']['y'], preds))
# Save the model
model_file_name = 'model.pkl'
joblib.dump(value = regression_model, filename = model_file_name)
# upload the model file explicitly into artifacts
mlflow.log_artifact(model_file_name)
# register the model
register_model(mlflow.active_run(), 'diabetes_model', 'model.pkl', model_framework="ScikitLearn")
View runs in Azure Machine Learning
You can view the experiment run in Azure Machine Learning studio. Select Experiments in the left-hand menu, and select the 'experiment_with_mlflow'. If you decided to name your experiment differently in the above snippet, select the name that you chose:
The logged Mean Squared Error (MSE) should become visible:
If you select the run, you can view other details, and the selected model, in the Outputs+logs.
Deploy model in Azure Machine Learning
This section describes how to deploy models, trained on a DSVM, to Azure Machine Learning.
Step 1: Create Inference Compute
On the left-hand menu in Azure Machine Learning studio select Compute, as shown in this screenshot:
In the New Inference cluster pane, fill in the details for
- Compute Name
- Kubernetes Service - select create new
- Select the region
- Select the virtual machine size (for the purposes of this tutorial, the default of Standard_D3_v2 is sufficient)
- Cluster Purpose - select Dev-test
- Number of nodes should equal 1
- Network Configuration - Basic
as shown in this screenshot:
Select Create.
Step 2: Deploy no-code inference service
When we registered the model in our code using register_model
, we specified the framework as sklearn. Azure Machine Learning supports no code deployments for these frameworks:
- scikit-learn
- Tensorflow SaveModel format
- ONNX model format
No-code deployment means that you can deploy straight from the model artifact. You don't need to specify any specific scoring script.
To deploy the diabetes model, go to the left-hand menu in the Azure Machine Learning studio and select Models. Next, select the registered diabetes_model:
Next, select the Deploy button in the model details pane:
The model will deploy to the Inference Cluster (Azure Kubernetes Service) we created in step 1. Provide a name for the service, and the name of the AKS compute cluster (created in step 1), to fill in the details. We also recommend that you increase the CPU reserve capacity from 0.1 to 1, and the Memory reserve capacity from 0.5 to 1. Select Advanced and fill in the details to set this increase. Then select Deploy, as shown in this screenshot:
Step 3: Consume
When the model successfully deploys, select Endpoints from the left-hand menu, then select the name of the deployed service. The model details pane should become visible, as shown in this screenshot:
The deployment state should change from transitioning to healthy. Additionally, the details section provides the REST endpoint and Swagger URLs that application developers can use to integrate your ML model into their apps.
You can test the endpoint with Postman, or you can use the Azure Machine Learning SDK:
APPLIES TO: Python SDK azureml v1
from azureml.core import Webservice
import json
# if you called your service differently then change the name below
service = Webservice(ws, name="diabetes-service")
input_payload = json.dumps({
'data': X_test[0:2].tolist(),
'method': 'predict' # If you have a classification model, you can get probabilities by changing this to 'predict_proba'.
})
output = service.run(input_payload)
print(output)
Step 4: Clean up
Delete the Inference Compute you created in Step 1, to avoid ongoing compute charges. To do this, on the left-hand menu in the Azure Machine Learning studio, select Compute > Inference Clusters > Select the specific Inference Compute Resource > Delete.
Next Steps
- Learn more about deploying models in Azure Machine Learning