Commit 0de3819

Updated documentation (#6)
1 parent 9384785 commit 0de3819

28 files changed, with 127 additions and 115 deletions

.github/docs/architecture-guide.md

Lines changed: 22 additions & 13 deletions
@@ -1,6 +1,8 @@
 # Architecture Guide
 
-This example scenario demonstrates how to use Azure Databricks and Azure Kubernetes Service to develop an [ML Ops](https://docs.microsoft.com/en-us/azure/machine-learning/concept-model-management-and-deployment) platform for real-time model inference. This solution can manage the end-to-end machine learning life cycle and incorporates important [ML Ops](https://docs.microsoft.com/en-us/azure/machine-learning/concept-model-management-and-deployment) principles when developing, deploying, and monitoring machine learning models at scale.
+This repository illustrates an end-to-end proof-of-concept scenario that demonstrates how to use Azure Databricks and Azure Kubernetes Service to develop an [MLOps](https://docs.microsoft.com/en-us/azure/machine-learning/concept-model-management-and-deployment) platform for online inference workloads. This solution can manage the end-to-end machine learning life cycle and incorporates important [MLOps](https://docs.microsoft.com/en-us/azure/machine-learning/concept-model-management-and-deployment) principles when developing, deploying, and monitoring machine learning models at scale.
+
+This approach can easily be extended to address batch inference workloads and incorporate other useful services when managing APIs at scale such as [Azure API Management](https://docs.microsoft.com/en-us/azure/api-management/api-management-key-concepts).
 
 ## Potential use cases
 

@@ -12,20 +14,20 @@ This approach is best suited for:
 
 ## Architecture
 
+A holistic high-level architecture for an MLOps Platform based on the approach outlined in this repository is as follows.
+
 ![design](./images/architecture.png)
 
 At a high level, this solution design addresses each stage of the machine learning lifecycle:
 
-- Data Preparation: this includes sourcing, cleaning, and transforming the data for processing and analysis. Data can live in a data lake or data warehouse and be stored in a feature store after it's curated.
-- Model Development: this includes core components of the model development process such as experiment tracking and model registration using [MLflow](https://docs.microsoft.com/en-us/azure/databricks/applications/mlflow/).
-- Model Deployment: this includes implementing a CI/CD pipeline to containerize machine learning models as API services. These services will be deployed to Azure Kubernetes clusters for end-users to consume.
-- Model Monitoring: this includes monitoring the API performance and model data drift by analyzing log telemetry with Azure Monitor.
+- **Data Preparation:** this includes sourcing, cleaning, and transforming the data for processing and analysis. Data can live in a data lake or data warehouse and be stored in a feature store after it's curated.
+- **Model Development:** this includes core components of the model development process such as experiment tracking and model registration using [MLFlow](https://docs.microsoft.com/en-us/azure/databricks/applications/mlflow/).
+- **Model Deployment:** this includes implementing a CI/CD pipeline to containerize machine learning models as API services. These services will be deployed to Azure Kubernetes clusters for end-users to consume.
+- **Model Monitoring:** this includes monitoring the API performance and model data drift by analyzing log telemetry with Azure Monitor.
 
 > **NOTE:**
 >
->- When implementing a [CI/CD pipeline](https://docs.microsoft.com/en-us/azure/architecture/microservices/ci-cd) different tools such as Azure DevOps Pipelines or GitHub Actions can be used.
->- The services covered by this architecture are only a subset of a much larger family of Azure services.
->- Specific business requirements for your analytics use case could require the use of different services or features that are not considered in this design.
+> The proof-of-concept that is focused on in this repository and documented in the implementation guide only addresses online (or real-time) inference workloads depicted in the above high-level design. Batch inference workloads are not covered as part of this repository.
 
 ## Components
 
@@ -36,15 +38,21 @@ The following components are used as part of this design:
 - [Azure Container Registry](https://docs.microsoft.com/en-us/azure/container-registry/container-registry-intro): managed and private Docker registry service based on the open-source Docker.
 - [Azure Data Lake Gen 2](https://docs.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-introduction): scalable solution optimized for storing massive amounts of unstructured data.
 - [Azure Monitor](https://docs.microsoft.com/en-us/azure/azure-monitor/overview): a comprehensive solution for collecting, analyzing, and acting on telemetry from your workloads.
-- [MLflow](https://docs.microsoft.com/en-us/azure/databricks/applications/mlflow): open-source solution integrated within Databricks for managing the end-to-end machine learning life cycle.
+- [MLFlow](https://docs.microsoft.com/en-us/azure/databricks/applications/mlflow): open-source solution integrated within Databricks for managing the end-to-end machine learning life cycle.
 - [Azure DevOps](https://azure.microsoft.com/solutions/devops/) or [GitHub](https://azure.microsoft.com/products/github/): solutions for implementing DevOps practices to enforce automation and compliance with your workload development and deployment pipelines.
 
+> **NOTE:**
+>
+>- When implementing a [CI/CD pipeline](https://docs.microsoft.com/en-us/azure/architecture/microservices/ci-cd) different tools such as Azure DevOps Pipelines or GitHub Actions can be used.
+>- The services covered in this design are only a subset of a much larger family of Azure services.
+>- Specific business requirements for your analytics use case could require the use of different services or features that are not considered in this design.
+
 ## Considerations
 
-Before implementing this solution some factors you might want to consider, include:
+Before implementing this solution some factors you might want to consider, include:
 
 - This solution is designed for teams who require a high degree of customization and have extensive expertise deploying and managing Kubernetes workloads. If your data science team does not have this expertise consider deploying models to another service like [Azure Machine Learning](https://azure.microsoft.com/services/machine-learning).
-- The [Machine Learning DevOps Guide](https://docs.microsoft.com/en-us/azure/cloud-adoption-framework/ready/azure-best-practices/ai-machine-learning-mlops#machine-learning-devops-mlops-best-practices-with-azure-machine-learning) presents best practices and learnings on adopting ML operations (ML Ops) in the enterprise with Machine Learning.
+- The [Machine Learning DevOps Guide](https://docs.microsoft.com/en-us/azure/cloud-adoption-framework/ready/azure-best-practices/ai-machine-learning-mlops#machine-learning-devops-mlops-best-practices-with-azure-machine-learning) presents best practices and learnings on adopting ML operations (MLOps) in the enterprise with Machine Learning.
 - Follow the recommendations and guidelines defined in the [Azure Well-Architected Framework](https://docs.microsoft.com/en-us/azure/architecture/framework) to improve the quality of your Azure solutions.
 - When implementing a [CI/CD pipeline](/azure/architecture/microservices/ci-cd) different tools such as Azure Pipelines or GitHub Actions can be used.
 - Specific business requirements for your analytics use case could require the use of different services or features that are not considered in this design.
@@ -55,9 +63,9 @@ All services deployed in this solution use a consumption-based pricing model. Th
 
 ## Deploy this scenario
 
-A proof-of-concept implementation of this scenario is available at the [ML Ops Platform using Databricks and Kubernetes](https://github.com/nfmoore/databricks-kubernetes-mlops-poc) repository. This sample illustrates:
+A proof-of-concept implementation of this scenario is available at the [MLOps Platform using Databricks and Kubernetes](https://github.com/nfmoore/databricks-kubernetes-mlops-poc) repository. This sample illustrates:
 
-- How an ML Flow model can be trained on Databricks.
+- How an MLFlow model can be trained on Databricks.
 - How to package models as a web service using open-source tools.
 - How to deploy to Kubernetes via CI/CD.
 - How to monitor API performance and model data drift.
@@ -69,3 +77,4 @@ You may also find these Architecture Center articles useful:
 - [Machine Learning Operations maturity model](https://docs.microsoft.com/en-us/azure/architecture/example-scenario/mlops/mlops-maturity-model)
 - [Team Data Science Process for data scientists](https://docs.microsoft.com/en-us/azure/architecture/data-science-process/overview)
 - [Modern analytics architecture with Azure Databricks](https://docs.microsoft.com/en-us/azure/architecture/solution-ideas/articles/azure-databricks-modern-analytics-architecture)
+- [Building A Clinical Data Drift Monitoring System With Azure DevOps, Azure Databricks, And MLflow](https://devblogs.microsoft.com/cse/2020/10/29/building-a-clinical-data-drift-monitoring-system-with-azure-devops-azure-databricks-and-mlflow/)