.github/docs/architecture-guide.md
22 additions & 13 deletions
@@ -1,6 +1,8 @@
 # Architecture Guide

-This example scenario demonstrates how to use Azure Databricks and Azure Kubernetes Service to develop an [ML Ops](https://docs.microsoft.com/en-us/azure/machine-learning/concept-model-management-and-deployment) platform for real-time model inference. This solution can manage the end-to-end machine learning life cycle and incorporates important [ML Ops](https://docs.microsoft.com/en-us/azure/machine-learning/concept-model-management-and-deployment) principles when developing, deploying, and monitoring machine learning models at scale.
+This repository illustrates an end-to-end proof-of-concept scenario that demonstrates how to use Azure Databricks and Azure Kubernetes Service to develop an [MLOps](https://docs.microsoft.com/en-us/azure/machine-learning/concept-model-management-and-deployment) platform for online inference workloads. This solution can manage the end-to-end machine learning life cycle and incorporates important [MLOps](https://docs.microsoft.com/en-us/azure/machine-learning/concept-model-management-and-deployment) principles when developing, deploying, and monitoring machine learning models at scale.
+
+This approach can easily be extended to address batch inference workloads and incorporate other useful services when managing APIs at scale such as [Azure API Management](https://docs.microsoft.com/en-us/azure/api-management/api-management-key-concepts).

 ## Potential use cases

@@ -12,20 +14,20 @@ This approach is best suited for:

 ## Architecture

+A holistic high-level architecture for an MLOps Platform based on the approach outlined in this repository is as follows.
+


 At a high level, this solution design addresses each stage of the machine learning lifecycle:

-- Data Preparation: this includes sourcing, cleaning, and transforming the data for processing and analysis. Data can live in a data lake or data warehouse and be stored in a feature store after it's curated.
-- Model Development: this includes core components of the model development process such as experiment tracking and model registration using [MLflow](https://docs.microsoft.com/en-us/azure/databricks/applications/mlflow/).
-- Model Deployment: this includes implementing a CI/CD pipeline to containerize machine learning models as API services. These services will be deployed to Azure Kubernetes clusters for end-users to consume.
-- Model Monitoring: this includes monitoring the API performance and model data drift by analyzing log telemetry with Azure Monitor.
+- **Data Preparation:** this includes sourcing, cleaning, and transforming the data for processing and analysis. Data can live in a data lake or data warehouse and be stored in a feature store after it's curated.
+- **Model Development:** this includes core components of the model development process such as experiment tracking and model registration using [MLFlow](https://docs.microsoft.com/en-us/azure/databricks/applications/mlflow/).
+- **Model Deployment:** this includes implementing a CI/CD pipeline to containerize machine learning models as API services. These services will be deployed to Azure Kubernetes clusters for end-users to consume.
+- **Model Monitoring:** this includes monitoring the API performance and model data drift by analyzing log telemetry with Azure Monitor.

 > **NOTE:**
 >
->- When implementing a [CI/CD pipeline](https://docs.microsoft.com/en-us/azure/architecture/microservices/ci-cd) different tools such as Azure DevOps Pipelines or GitHub Actions can be used.
->- The services covered by this architecture are only a subset of a much larger family of Azure services.
->- Specific business requirements for your analytics use case could require the use of different services or features that are not considered in this design.
+> The proof-of-concept that is focused on in this repository and documented in the implementation guide only addresses online (or real-time) inference workloads depicted in the above high-level design. Batch inference workloads are not covered as part of this repository.
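
To make the online inference path concrete, here is a minimal sketch of a scoring service of the kind the Model Deployment stage would containerize. It is not the repository's implementation: the `WEIGHTS` list, the `predict` and `handle_request` functions, the `features` request field, and port 8080 are all hypothetical placeholders, and the standard-library HTTP server stands in for whatever web framework the project actually uses.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Any, Dict, List, Tuple

# Hypothetical coefficients standing in for a trained model artifact.
WEIGHTS: List[float] = [0.5, -0.25, 1.0]


def predict(features: List[float]) -> float:
    """Score one row with the placeholder linear model."""
    return sum(w * x for w, x in zip(WEIGHTS, features))


def handle_request(body: bytes) -> Tuple[int, Dict[str, Any]]:
    """Validate a JSON request body and return (HTTP status, response payload)."""
    try:
        features = [float(x) for x in json.loads(body)["features"]]
    except (ValueError, KeyError, TypeError):
        return 400, {"error": "body must be JSON with a numeric 'features' list"}
    if len(features) != len(WEIGHTS):
        return 400, {"error": f"expected {len(WEIGHTS)} features"}
    return 200, {"prediction": predict(features)}


class ScoringHandler(BaseHTTPRequestHandler):
    """Thin HTTP wrapper so the scoring logic stays testable without a socket."""

    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        status, response = handle_request(self.rfile.read(length))
        data = json.dumps(response).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def serve(port: int = 8080) -> None:
    """Start the service; a container entrypoint would call this (assumed port)."""
    HTTPServer(("0.0.0.0", port), ScoringHandler).serve_forever()
```

Keeping `handle_request` separate from the HTTP handler means the request validation can be unit tested in a CI pipeline before the image is built and deployed.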

 ## Components

@@ -36,15 +38,21 @@ The following components are used as part of this design:
 - [Azure Container Registry](https://docs.microsoft.com/en-us/azure/container-registry/container-registry-intro): managed and private Docker registry service based on the open-source Docker.
 - [Azure Data Lake Gen 2](https://docs.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-introduction): scalable solution optimized for storing massive amounts of unstructured data.
 - [Azure Monitor](https://docs.microsoft.com/en-us/azure/azure-monitor/overview): a comprehensive solution for collecting, analyzing, and acting on telemetry from your workloads.
-- [MLflow](https://docs.microsoft.com/en-us/azure/databricks/applications/mlflow): open-source solution integrated within Databricks for managing the end-to-end machine learning life cycle.
+- [MLFlow](https://docs.microsoft.com/en-us/azure/databricks/applications/mlflow): open-source solution integrated within Databricks for managing the end-to-end machine learning life cycle.
 - [Azure DevOps](https://azure.microsoft.com/solutions/devops/) or [GitHub](https://azure.microsoft.com/products/github/): solutions for implementing DevOps practices to enforce automation and compliance with your workload development and deployment pipelines.

+> **NOTE:**
+>
+>- When implementing a [CI/CD pipeline](https://docs.microsoft.com/en-us/azure/architecture/microservices/ci-cd) different tools such as Azure DevOps Pipelines or GitHub Actions can be used.
+>- The services covered in this design are only a subset of a much larger family of Azure services.
+>- Specific business requirements for your analytics use case could require the use of different services or features that are not considered in this design.
+
 ## Considerations

-Before implementing this solution some factors you might want to consider, include:
+Before implementing this solution some factors you might want to consider, include:

 - This solution is designed for teams who require a high degree of customization and have extensive expertise deploying and managing Kubernetes workloads. If your data science team does not have this expertise consider deploying models to another service like [Azure Machine Learning](https://azure.microsoft.com/services/machine-learning).
-- The [Machine Learning DevOps Guide](https://docs.microsoft.com/en-us/azure/cloud-adoption-framework/ready/azure-best-practices/ai-machine-learning-mlops#machine-learning-devops-mlops-best-practices-with-azure-machine-learning) presents best practices and learnings on adopting ML operations (ML Ops) in the enterprise with Machine Learning.
+- The [Machine Learning DevOps Guide](https://docs.microsoft.com/en-us/azure/cloud-adoption-framework/ready/azure-best-practices/ai-machine-learning-mlops#machine-learning-devops-mlops-best-practices-with-azure-machine-learning) presents best practices and learnings on adopting ML operations (MLOps) in the enterprise with Machine Learning.
 - Follow the recommendations and guidelines defined in the [Azure Well-Architected Framework](https://docs.microsoft.com/en-us/azure/architecture/framework) to improve the quality of your Azure solutions.
 - When implementing a [CI/CD pipeline](/azure/architecture/microservices/ci-cd) different tools such as Azure Pipelines or GitHub Actions can be used.
 - Specific business requirements for your analytics use case could require the use of different services or features that are not considered in this design.
@@ -55,9 +63,9 @@ All services deployed in this solution use a consumption-based pricing model. Th

 ## Deploy this scenario

-A proof-of-concept implementation of this scenario is available at the [ML Ops Platform using Databricks and Kubernetes](https://github.com/nfmoore/databricks-kubernetes-mlops-poc) repository. This sample illustrates:
+A proof-of-concept implementation of this scenario is available at the [MLOps Platform using Databricks and Kubernetes](https://github.com/nfmoore/databricks-kubernetes-mlops-poc) repository. This sample illustrates:

-- How an ML Flow model can be trained on Databricks.
+- How an MLFlow model can be trained on Databricks.
 - How to package models as a web service using open-source tools.
 - How to deploy to Kubernetes via CI/CD.
 - How to monitor API performance and model data drift.
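
The last item in the list above, monitoring model data drift, can be illustrated with one common drift statistic, the Population Stability Index (PSI). The guide does not say which drift measure the proof-of-concept uses, so treat this as an assumed stand-in: the function name, the equal-width binning over the baseline range, and the `eps` floor for empty bins are illustrative choices only.

```python
import math
from typing import List, Sequence


def population_stability_index(
    baseline: Sequence[float],
    current: Sequence[float],
    bins: int = 10,
    eps: float = 1e-6,
) -> float:
    """Return the PSI between two samples of one numeric feature."""
    if not baseline or not current:
        raise ValueError("both samples must be non-empty")

    lo, hi = min(baseline), max(baseline)
    # Fall back to a unit width when every baseline value is identical.
    width = (hi - lo) / bins or 1.0

    def proportions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            # Values outside the baseline range are clamped into the edge bins.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # Floor each share at eps so the logarithm below is always defined.
        return [max(c / len(values), eps) for c in counts]

    p = proportions(baseline)
    q = proportions(current)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


# Example: training-time feature values versus values logged at inference time.
training = [i / 100 for i in range(100)]
serving = [0.5 + i / 200 for i in range(100)]
drift_score = population_stability_index(training, serving)
```

A PSI of zero means the binned distributions match. Practitioners often treat values above roughly 0.2 as worth investigating, but that cutoff is a rule of thumb rather than something this guide prescribes. In this architecture the `serving` values would come from the inference log telemetry collected by Azure Monitor.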
@@ -69,3 +77,4 @@ You may also find these Architecture Center articles useful:
 - [Team Data Science Process for data scientists](https://docs.microsoft.com/en-us/azure/architecture/data-science-process/overview)
 - [Modern analytics architecture with Azure Databricks](https://docs.microsoft.com/en-us/azure/architecture/solution-ideas/articles/azure-databricks-modern-analytics-architecture)
+- [Building A Clinical Data Drift Monitoring System With Azure DevOps, Azure Databricks, And MLflow](https://devblogs.microsoft.com/cse/2020/10/29/building-a-clinical-data-drift-monitoring-system-with-azure-devops-azure-databricks-and-mlflow/)