
Commit 217140e

Fix deployment error (#7)
1 parent 0de3819 commit 217140e

7 files changed: +1146, -426 lines

.github/docs/architecture-guide.md

Lines changed: 4 additions & 11 deletions
@@ -38,7 +38,9 @@ The following components are used as part of this design:
  - [Azure Container Registry](https://docs.microsoft.com/en-us/azure/container-registry/container-registry-intro): managed and private Docker registry service based on the open-source Docker Registry.
  - [Azure Data Lake Gen 2](https://docs.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-introduction): scalable solution optimized for storing massive amounts of unstructured data.
  - [Azure Monitor](https://docs.microsoft.com/en-us/azure/azure-monitor/overview): a comprehensive solution for collecting, analyzing, and acting on telemetry from your workloads.
- - [MLFlow](https://docs.microsoft.com/en-us/azure/databricks/applications/mlflow): open-source solution integrated within Databricks for managing the end-to-end machine learning life cycle.
+ - [MLflow](https://docs.microsoft.com/en-us/azure/databricks/applications/mlflow): open-source solution integrated within Databricks for managing the end-to-end machine learning life cycle.
+ - [Azure API Management](https://docs.microsoft.com/en-us/azure/api-management/api-management-key-concepts): a fully managed service that enables customers to publish, secure, transform, maintain, and monitor APIs.
+ - [Azure Application Gateway](https://docs.microsoft.com/en-us/azure/application-gateway/overview): a web traffic load balancer that enables you to manage traffic to your web applications.
  - [Azure DevOps](https://azure.microsoft.com/solutions/devops/) or [GitHub](https://azure.microsoft.com/products/github/): solutions for implementing DevOps practices to enforce automation and compliance with your workload development and deployment pipelines.

  > **NOTE:**
@@ -61,20 +63,11 @@ Before implementing this solution some factors you might want to consider, inclu

  All services deployed in this solution use a consumption-based pricing model. The [Azure pricing calculator](https://azure.microsoft.com/pricing/calculator) can be used to estimate costs for a specific scenario. For other considerations, see [Cost Optimization](https://docs.microsoft.com/en-us/azure/architecture/framework/#cost-optimization) in the Well-Architected Framework.

- ## Deploy this scenario
-
- A proof-of-concept implementation of this scenario is available at the [MLOps Platform using Databricks and Kubernetes](https://github.com/nfmoore/databricks-kubernetes-mlops-poc) repository. This sample illustrates:
-
- - How an MLFlow model can be trained on Databricks.
- - How to package models as a web service using open-source tools.
- - How to deploy to Kubernetes via CI/CD.
- - How to monitor API performance and model data drift.
-
  ## Related resources

  You may also find these Architecture Center articles useful:

  - [Machine Learning Operations maturity model](https://docs.microsoft.com/en-us/azure/architecture/example-scenario/mlops/mlops-maturity-model)
  - [Team Data Science Process for data scientists](https://docs.microsoft.com/en-us/azure/architecture/data-science-process/overview)
  - [Modern analytics architecture with Azure Databricks](https://docs.microsoft.com/en-us/azure/architecture/solution-ideas/articles/azure-databricks-modern-analytics-architecture)
- - [Building A Clinical Data Drift Monitoring System With Azure DevOps, Azure Databricks, And MLflow](https://devblogs.microsoft.com/cse/2020/10/29/building-a-clinical-data-drift-monitoring-system-with-azure-devops-azure-databricks-and-mlflow/)
+ - [Building A Clinical Data Drift Monitoring System With Azure DevOps, Azure Databricks, And MLflow](https://devblogs.microsoft.com/cse/2020/10/29/building-a-clinical-data-drift-monitoring-system-with-azure-devops-azure-databricks-and-mlflow/)
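
Editor's note: the removed "Deploy this scenario" section summarized how an MLflow model is packaged as a web service before being deployed to Kubernetes. A minimal sketch of that packaging step, not part of this commit; the model URI, route, and port are illustrative assumptions, and the POC repository's actual service code may differ:

```python
# Minimal scoring service wrapping a registered MLflow model.
# The model URI "models:/employee-attrition/1" is a hypothetical example.
import mlflow.pyfunc
import pandas as pd
from flask import Flask, jsonify, request

app = Flask(__name__)
model = mlflow.pyfunc.load_model("models:/employee-attrition/1")

@app.route("/predict", methods=["POST"])
def predict():
    # Expects a JSON array of records matching the model's training schema.
    records = pd.DataFrame(request.get_json())
    return jsonify(model.predict(records).tolist())

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```

In this design, such a service would be containerized, pushed to Azure Container Registry, and served from AKS behind Azure Application Gateway.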

.github/docs/implementation-guide.md

Lines changed: 4 additions & 10 deletions
@@ -9,7 +9,7 @@

  ### 1.1. Create repository

- Log in to your GitHub account, navigate to the [databricks-kubernetes-real-time-mlflow-model-deployment-poc](https://github.com/nfmoore/databricks-kubernetes-real-time-mlflow-model-deployment-poc) repository and click `use this template` to create a new repository from this template. Rename the template and leave it public. Use [these](https://docs.github.com/en/github/creating-cloning-and-archiving-repositories/creating-a-repository-from-a-template) instructions for more details about creating a repository from a template.
+ Log in to your GitHub account, navigate to the [databricks-kubernetes-mlops-poc](https://github.com/nfmoore/databricks-kubernetes-mlops-poc) repository and click `use this template` to create a new repository from this template. Rename the template and leave it public. Use [these](https://docs.github.com/en/github/creating-cloning-and-archiving-repositories/creating-a-repository-from-a-template) instructions for more details about creating a repository from a template.

  ### 1.2. Deploy resources

@@ -25,12 +25,6 @@ To deploy the resources for this proof-of-concept in your Azure environment clic

  After the resources have been successfully deployed some services need to be configured before you can train, register, deploy and monitor the machine learning models.

- #### Log Analytics Workspace
-
- For the Log Analytics workspace, Azure Monitor for Containers needs to be enabled. To enable this, click on an AKS cluster deployed as part of 1.2 above, click on the Logs tab in the monitoring section, then select your Log Analytics workspace and click enable. This process is shown in the image below. Ensure to repeat this process for the second AKS cluster in your resource group.
-
- ![1-2](.github/../images/implementation/1-2.png)
-
  #### Azure Databricks

  For Azure Databricks you need to enable the [Files in Repo](https://docs.microsoft.com/en-us/azure/databricks/repos#enable-support-for-arbitrary-files-in-databricks-repos) feature (which is not enabled by default at the time of developing this proof-of-concept), generate a new [Databricks Access Token](https://docs.microsoft.com/en-au/azure/databricks/dev-tools/api/latest/authentication), and create a [cluster with custom libraries](https://docs.microsoft.com/en-au/azure/databricks/libraries/cluster-libraries).
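
Editor's note: the cluster-library step above can also be scripted rather than configured through the UI. A minimal sketch against the Databricks Libraries REST API, not part of this commit; the `DATABRICKS_CLUSTER_ID` variable and the package list are illustrative assumptions:

```python
# Install PyPI libraries on an existing Databricks cluster via the REST API.
import os
import requests

host = os.environ["DATABRICKS_HOST"]
token = os.environ["DATABRICKS_TOKEN"]
cluster_id = os.environ["DATABRICKS_CLUSTER_ID"]  # hypothetical: your cluster's ID

response = requests.post(
    f"{host}/api/2.0/libraries/install",
    headers={"Authorization": f"Bearer {token}"},
    json={
        "cluster_id": cluster_id,
        "libraries": [{"pypi": {"package": "mlflow"}}],  # illustrative package list
    },
    timeout=30,
)
response.raise_for_status()
```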
@@ -97,8 +91,8 @@ You need to create the following secrets:

  | Secret name | How to find secret value |
  |:------------|:-------------------------|
- | AZURE_CREDENTIALS | A JSON object with details of your Azure Service Principal. [This](https://github.com/marketplace/actions/azure-login#configure-deployment-credentials) document will help you configure a service principal with a secret. The value will look something like: ` { "clientId": "<GUID>", "clientSecret": "<GUID>", "subscriptionId": "<GUID>", "tenantId": "<GUID>", ... }`|
- | DATABRICKS_HOST | This is the `instance name` or `per-workspace URL` of your Azure Databricks service. Its value can be found from the Databricks service page on the Azure Portal under the `URL` parameter. For more information [this](https://docs.microsoft.com/en-us/azure/databricks/workspace/workspace-details#per-workspace-url) resource can be used. The value will look something like ` https://adb-5555555555555555.19.azuredatabricks.net`|
+ | AZURE_CREDENTIALS | A JSON object with details of your Azure Service Principal. [This](https://github.com/marketplace/actions/azure-login#configure-deployment-credentials) document will help you configure a service principal with a secret. The value will look something like: `{ "clientId": "<GUID>", "clientSecret": "<GUID>", "subscriptionId": "<GUID>", "tenantId": "<GUID>", ... }`|
+ | DATABRICKS_HOST | This is the `instance name` or `per-workspace URL` of your Azure Databricks service. Its value can be found from the Databricks service page on the Azure Portal under the `URL` parameter. For more information [this](https://docs.microsoft.com/en-us/azure/databricks/workspace/workspace-details#per-workspace-url) resource can be used. The value will look something like `https://adb-5555555555555555.19.azuredatabricks.net`|
  | DATABRICKS_TOKEN | This is the value of the `Access Token` you created in `1.3`. The value should look something like `dapi55555555555555555555555555555555-2`. |
  | CONTAINER_REGISTRY_NAME | The name of the ACR service deployed in template two. |
  | CONTAINER_REGISTRY_PASSWORD | This can be found in the access keys section of the ACR service page. The Admin Account section of [this](https://docs.microsoft.com/en-us/azure/container-registry/container-registry-authentication?tabs=azure-cli#admin-account) document contains more information. |
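
Editor's note: before storing the `DATABRICKS_HOST` and `DATABRICKS_TOKEN` values as GitHub secrets, it is worth confirming they authenticate. A minimal sketch, not part of this commit, that calls a lightweight authenticated Databricks endpoint:

```python
# Sanity-check Databricks credentials before storing them as GitHub secrets.
import os
import requests

host = os.environ["DATABRICKS_HOST"]    # e.g. https://adb-5555555555555555.19.azuredatabricks.net
token = os.environ["DATABRICKS_TOKEN"]  # e.g. dapi55555555555555555555555555555555-2

response = requests.get(
    f"{host}/api/2.0/clusters/list",
    headers={"Authorization": f"Bearer {token}"},
    timeout=30,
)
response.raise_for_status()  # a 403 here usually means a bad or expired token
clusters = response.json().get("clusters", [])
print(f"Token accepted; workspace has {len(clusters)} cluster(s)")
```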
@@ -259,4 +253,4 @@ Lower values for a feature indicate a greater likelihood of drift and values bel

  Custom charts can also be developed using this data by selecting the `Chart` tab and changing the values in the `Chart formatting` section.

- ![4-3](.github/../images/implementation/4-3.png)
+ ![4-3](.github/../images/implementation/4-3.png)

README.md

Lines changed: 5 additions & 12 deletions
@@ -2,13 +2,16 @@

  ## Overview

+ > For additional insights into applying this approach to operationalize your machine learning workloads refer to this article — [Machine Learning at Scale with Databricks and Kubernetes](https://medium.com/@nfmoore/machine-learning-at-scale-with-databricks-and-kubernetes-9fa59232bfa6)
  This repository contains resources for an end-to-end proof of concept which illustrates how an MLFlow model can be trained on Databricks, packaged as a web service, deployed to Kubernetes via CI/CD, and monitored within Microsoft Azure. A high-level solution design is shown below:

  ![workflow](.github/docs/images/workflow.png)

- For more information on a generic solution design see the [Architecture Guide](.github/docs/architecture-guide.md)
+ Within Azure Databricks, the `IBM HR Analytics Employee Attrition & Performance` [dataset](https://www.kaggle.com/pavansubhasht/ibm-hr-analytics-attrition-dataset) available from Kaggle will be used to develop and register a machine learning model. This model will predict the likelihood of attrition for an employee along with metrics capturing data drift and outliers to assess the model's validity.

- > For additional insights into applying this approach to operationalize your machine learning workloads refer to this article — [Machine Learning at Scale with Databricks and Kubernetes](https://medium.com/@nfmoore/machine-learning-at-scale-with-databricks-and-kubernetes-9fa59232bfa6)
+ This model will then be deployed as an API for real-time inference using Azure Kubernetes Service. This API can be integrated with external applications used by HR teams to provide additional insights into the likelihood of attrition for a given employee within the organization. This information can be used to determine if a high-impact employee is likely to leave the organization and hence provide HR with the ability to proactively incentivize the employee to stay.
+
+ The design covered in this proof-of-concept can be generalized to many machine learning workloads. For more information on a generic solution design see the [Architecture Guide](.github/docs/architecture-guide.md).

  ## Getting Started

@@ -24,16 +27,6 @@ This repository contains detailed step-by-step instructions on how to implement
2427

2528
For detailed step-by-step instructions see the [Implementation Guide](.github/docs/implementation-guide.md).
2629

27-
## Scenario
28-
29-
This proof-of-concept will be based on a common problem in HR analytics - employee attrition. Employee Attrition refers to the process by which employees leave an organization – for example, through resignation for personal reasons or retirement – and are not immediately replaced.
30-
31-
Within this proof-of-concept, a machine learning model will be developed to predict the likelihood of attrition for an employee along with metrics capturing data drift and outliers to access the model's validity. This implementation uses the `IBM HR Analytics Employee Attrition & Performance` [dataset](https://www.kaggle.com/pavansubhasht/ibm-hr-analytics-attrition-dataset) available from Kaggle.
32-
33-
The scenario in this repository will first develop a machine learning model which will then be deployed as an API for online inference. This API can be integrated with external applications used by HR teams to provide additional insights into the likelihood of attrition for a given employee within the organization.
34-
35-
The scenario in this repository will first develop a machine learning model which will then be deployed as an API for online inference. This API can be integrated with external applications used by HR teams to provide additional insights into the likelihood of attrition for a given employee within the organization. This information can be used to determine if a high-impact employee is likely to leave the organization and hence provide HR with the ability to proactively incentivize the employee to stay.
36-
3730
## License
3831

3932
Details on licensing for the project can be found in the [LICENSE](./LICENSE) file.
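
Editor's note: the train-and-register flow described in the new overview can be sketched in a few lines of notebook code. This is illustrative only and not part of this commit; the file path, feature handling, and registered model name are assumptions rather than the POC's actual notebooks:

```python
# Train an attrition classifier and register it with MLflow.
import mlflow
import mlflow.sklearn
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

df = pd.read_csv("ibm-hr-analytics-attrition.csv")  # hypothetical local copy of the Kaggle dataset
X = pd.get_dummies(df.drop(columns=["Attrition"]))  # one-hot encode categorical features
y = (df["Attrition"] == "Yes").astype(int)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

with mlflow.start_run():
    model = RandomForestClassifier(n_estimators=100, random_state=42)
    model.fit(X_train, y_train)
    mlflow.log_metric("test_accuracy", model.score(X_test, y_test))
    # Registration makes the model addressable from the CI/CD deployment pipeline.
    mlflow.sklearn.log_model(model, "model", registered_model_name="employee-attrition")
```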

infrastructure/main.bicep

Lines changed: 38 additions & 0 deletions
@@ -1,3 +1,32 @@
+ //********************************************************
+ // General Parameters
+ //********************************************************
+
+ @description('Resource Location')
+ param resourceLocation string = resourceGroup().location
+
+ @description('Virtual Network IP Address Prefixes')
+ param vNetIPAddressPrefixesForFirstDeployment array = [
+   '192.168.0.0/16'
+ ]
+
+ @description('AKS Subnet IP Address Prefix')
+ param subnetAksIpAddressPrefixForFirstDeployment string = '192.168.0.0/24'
+
+ @description('App Gateway IP Address Prefix')
+ param subnetAppGwIpAddressPrefixForFirstDeployment string = '192.168.1.0/24'
+
+ @description('Virtual Network IP Address Prefixes')
+ param vNetIPAddressPrefixesForSecondDeployment array = [
+   '192.167.0.0/16'
+ ]
+
+ @description('AKS Subnet IP Address Prefix')
+ param subnetAksIpAddressPrefixForSecondDeployment string = '192.167.0.0/24'
+
+ @description('App Gateway IP Address Prefix')
+ param subnetAppGwIpAddressPrefixForSecondDeployment string = '192.167.1.0/24'
+
  //********************************************************
  // Modules
  //********************************************************
@@ -6,20 +35,29 @@ module m_databricks './modules/databricks.bicep' = {
    name: 'm_databricks'
    params: {
      resourceInstance: '01'
+     location: resourceLocation
    }
  }

  module m_microservices_01 './modules/microservices.bicep' = {
    name: 'm_microservices_01'
    params: {
      resourceInstance: '01'
+     location: resourceLocation
+     vNetIPAddressPrefixes: vNetIPAddressPrefixesForFirstDeployment
+     subnetAksIpAddressPrefix: subnetAksIpAddressPrefixForFirstDeployment
+     subnetAppGwIpAddressPrefix: subnetAppGwIpAddressPrefixForFirstDeployment
    }
  }

  module m_microservices_02 './modules/microservices.bicep' = {
    name: 'm_microservices_02'
    params: {
      resourceInstance: '02'
+     location: resourceLocation
+     vNetIPAddressPrefixes: vNetIPAddressPrefixesForSecondDeployment
+     subnetAksIpAddressPrefix: subnetAksIpAddressPrefixForSecondDeployment
+     subnetAppGwIpAddressPrefix: subnetAppGwIpAddressPrefixForSecondDeployment
      useExistingContainerRegistry: true
      useExistingLogAnalyticsWorkspace: true
      containerRegistryName: m_microservices_01.outputs.containerRegistryName
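
Editor's note: the new parameters give each microservices deployment its own virtual network address space. A small standard-library check, illustrative and not part of the commit, confirms that the subnet prefixes nest inside their virtual networks and that the two address spaces stay disjoint:

```python
# Validate the CIDR layout defined by the new Bicep parameters.
import ipaddress

deployments = {
    "first": ("192.168.0.0/16", "192.168.0.0/24", "192.168.1.0/24"),
    "second": ("192.167.0.0/16", "192.167.0.0/24", "192.167.1.0/24"),
}

vnets = []
for name, (vnet, subnet_aks, subnet_appgw) in deployments.items():
    vnet_net = ipaddress.ip_network(vnet)
    aks_net = ipaddress.ip_network(subnet_aks)
    appgw_net = ipaddress.ip_network(subnet_appgw)
    # Each subnet must sit inside its virtual network's prefix...
    assert aks_net.subnet_of(vnet_net) and appgw_net.subnet_of(vnet_net), name
    # ...and the AKS and App Gateway subnets must not overlap each other.
    assert not aks_net.overlaps(appgw_net), name
    vnets.append(vnet_net)

# The two deployments' virtual networks use disjoint address spaces.
assert not vnets[0].overlaps(vnets[1])
print("Address plan is consistent")
```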
