96 changes: 96 additions & 0 deletions .github/workflows/use-visitor-counter.yml
@@ -0,0 +1,96 @@
name: Use Visitor Counter Logic

on:
  pull_request:
    branches:
      - main
  schedule:
    - cron: '0 0 * * *' # Runs daily at midnight
  workflow_dispatch: # Allows manual triggering

permissions:
  contents: write
  pull-requests: write

jobs:
  update-visitor-count:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout current repository
        uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Shallow clone visitor counter logic
        run: git clone --depth=1 https://github.com/brown9804/github-visitor-counter.git

      - name: Set up Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'

      - name: Install dependencies for github-visitor-counter
        run: |
          cd github-visitor-counter
          npm ci

      - name: Run visitor counter logic (updates markdown badges and metrics.json)
        run: node github-visitor-counter/update_repo_views_counter.js
        env:
          TRAFFIC_TOKEN: ${{ secrets.TRAFFIC_TOKEN }}
          REPO: ${{ github.repository }}

      - name: Move generated metrics.json to root
        run: mv github-visitor-counter/metrics.json .

      - name: List files for debugging
        run: |
          ls -l
          ls -l github-visitor-counter

      - name: Clean up visitor counter logic
        run: rm -rf github-visitor-counter

      - name: Configure Git author
        run: |
          git config --global user.name "github-actions[bot]"
          git config --global user.email "github-actions[bot]@users.noreply.github.com"

      # Commit and push logic for PR events
      - name: Commit and push changes (PR)
        if: github.event_name == 'pull_request'
        env:
          TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: |
          git fetch origin
          git checkout ${{ github.head_ref }}
          git add "*.md" metrics.json
          git commit -m "Update visitor count" || echo "No changes to commit"
          git remote set-url origin https://x-access-token:${TOKEN}@github.com/${{ github.repository }}
          git pull --rebase origin ${{ github.head_ref }} || echo "No rebase needed"
          git push origin HEAD:${{ github.head_ref }}

      # Commit and push logic for non-PR events (schedule, workflow_dispatch)
      - name: Commit and push changes (non-PR)
        if: github.event_name != 'pull_request'
        env:
          TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: |
          git fetch origin
          git checkout ${{ github.ref_name }} || git checkout -b ${{ github.ref_name }} origin/${{ github.ref_name }}
          git add "*.md" metrics.json
          git commit -m "Update visitor count" || echo "No changes to commit"
          git remote set-url origin https://x-access-token:${TOKEN}@github.com/${{ github.repository }}
          git pull --rebase origin ${{ github.ref_name }} || echo "No rebase needed"
          git push origin HEAD:${{ github.ref_name }}

      - name: Create Pull Request (non-PR)
        if: github.event_name != 'pull_request'
        uses: peter-evans/create-pull-request@v6
        with:
          token: ${{ secrets.GITHUB_TOKEN }}
          branch: update-visitor-count
          title: "Update visitor count"
          body: "Automated update of visitor count"
          base: main
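
The "Run visitor counter logic" step delegates to `update_repo_views_counter.js` in the cloned repository, and that script's source is not part of this diff. Purely as an illustration, the Python sketch below shows one way a script could regenerate the block between the `<!-- START BADGE -->` and `<!-- END BADGE -->` markers that appear in the Markdown files touched by this PR. The function names `render_badge_block` and `update_badge` are hypothetical, and the real script may work differently.

```python
import re
from datetime import date

# Markers copied from the Markdown files in this PR; everything else is hypothetical.
START, END = "<!-- START BADGE -->", "<!-- END BADGE -->"
BADGE_RE = re.compile(re.escape(START) + r".*?" + re.escape(END), re.DOTALL)


def render_badge_block(total_views: int, refresh: date) -> str:
    """Build the HTML block that sits between the two badge markers."""
    return "\n".join([
        START,
        '<div align="center">',
        f'  <img src="https://img.shields.io/badge/Total%20views-{total_views}-limegreen" alt="Total views">',
        f"  <p>Refresh Date: {refresh.isoformat()}</p>",
        "</div>",
        END,
    ])


def update_badge(markdown: str, total_views: int, refresh: date) -> str:
    """Replace an existing badge block, or append one if none is present."""
    block = render_badge_block(total_views, refresh)
    if BADGE_RE.search(markdown):
        # A callable replacement avoids backslash-escape handling in the block text.
        return BADGE_RE.sub(lambda _match: block, markdown, count=1)
    return markdown.rstrip("\n") + "\n\n" + block + "\n"
```

Calling `update_badge(text, 2, date(2025, 7, 17))` on a file's contents would produce a badge block with the same view count and refresh date as the new lines in the Markdown hunks of this PR.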
31 changes: 18 additions & 13 deletions Purview/Cost-Estimation.md
@@ -30,6 +30,7 @@ Last updated: 2025-07-17

> [!IMPORTANT]
> The general formula to keep in mind for estimating the cost of Microsoft Purview is: <br/>
>
> - **Cost of Data Map**: Calculated based on the number of capacity units and the price per capacity unit per hour. <br/>
> - **Cost of Scanning**: Calculated based on the total duration (in minutes) of all scans in a month, divided by 60 minutes per hour, multiplied by the number of vCores per scan, and the price per vCore per hour. <br/>
> - **Cost of Resource Set**: Calculated based on the total duration (in hours) of processing resource set data assets in a month, multiplied by the price per vCore per hour.
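
The three formulas above can be sketched in Python as follows. This is a minimal illustration, not an official calculator. The default rates are the unit prices this file quotes for each component (\$0.411, \$0.63, and \$0.21 per hour), the 730-hour month is the figure used for the Data Map, and the names `PurviewRates` and `monthly_estimate` are hypothetical.

```python
from dataclasses import dataclass

HOURS_PER_MONTH = 730  # always-on hours per month, as used for the Data Map


@dataclass(frozen=True)
class PurviewRates:
    # Unit prices quoted in this document; illustrative, not current list prices.
    capacity_unit_per_hour: float = 0.411
    scan_vcore_per_hour: float = 0.63
    resource_set_vcore_per_hour: float = 0.21


def data_map_cost(capacity_units: int, rates: PurviewRates) -> float:
    """Capacity units x hours in a month x price per capacity unit per hour."""
    return capacity_units * HOURS_PER_MONTH * rates.capacity_unit_per_hour


def scanning_cost(scan_minutes: float, vcores_per_scan: int, rates: PurviewRates) -> float:
    """(Scan minutes / 60) x vCores per scan x price per vCore per hour."""
    return (scan_minutes / 60) * vcores_per_scan * rates.scan_vcore_per_hour


def resource_set_cost(processing_hours: float, rates: PurviewRates) -> float:
    """Processing hours x price per vCore per hour."""
    return processing_hours * rates.resource_set_vcore_per_hour


def monthly_estimate(
    capacity_units: int,
    scan_minutes: float,
    vcores_per_scan: int,
    processing_hours: float,
    rates: PurviewRates = PurviewRates(),
) -> dict:
    """Return each cost component and their sum, in US dollars."""
    parts = {
        "data_map": data_map_cost(capacity_units, rates),
        "scanning": scanning_cost(scan_minutes, vcores_per_scan, rates),
        "resource_set": resource_set_cost(processing_hours, rates),
    }
    parts["total"] = sum(parts.values())
    return parts
```

With hypothetical inputs of 1 capacity unit, 120 scan minutes at 32 vCores, and 10 resource-set hours, `monthly_estimate(1, 120, 32, 10)` gives about \$300.03 + \$40.32 + \$2.10 = \$342.45.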
@@ -38,26 +39,29 @@ Last updated: 2025-07-17
$$

1. Data Map (Always on):
- Number of Capacity Units: Typically 1
- Total Hours in a Month: 730 hours
- Price per Capacity Unit per Hour: \$0.411

- Number of Capacity Units: Typically 1
- Total Hours in a Month: 730 hours
- Price per Capacity Unit per Hour: \$0.411

$$
\text{Total Cost for Data Map} = \text{Number of Capacity Units} \times \text{Total Hours in a Month} \times \text{Price per Capacity Unit per Hour}
$$

2. Scanning (Pay as you go):
- Total Minutes of Scanning in a Month: [M] minutes
- Number of vCores per Scan: 32
- Price per vCore per Hour: \$0.63

- Total Minutes of Scanning in a Month: [M] minutes
- Number of vCores per Scan: 32
- Price per vCore per Hour: \$0.63

$$
\text{Total Cost for Scanning} = \left( \frac{\text{Total Minutes of Scanning in a Month}}{60} \right) \times \text{Number of vCores per Scan} \times \text{Price per vCore per Hour}
$$

3. Resource Set:
- Total Hours of Processing in a Month: [H] hours
- Price per vCore per Hour: \$0.21

- Total Hours of Processing in a Month: [H] hours
- Price per vCore per Hour: \$0.21

$$
\text{Total Cost for Resource Set} = \text{Total Hours of Processing in a Month} \times \text{Price per vCore per Hour}
@@ -117,10 +121,11 @@ $$
## Cost Estimation for Different Metadata Volumes

> [!IMPORTANT]
> Microsoft Purview `scans metadata to classify, label, and protect data assets`. It does `not scan the actual data content but rather the information about the data`. <br/>
> `The size of the data itself does not directly` impact the cost of `metadata scanning unless it affects the amount of metadata generated`. The `number of metadata assets and their complexity` are the primary factors influencing costs.
> Microsoft Purview `scans metadata to classify, label, and protect data assets`. It does `not scan the actual data content but rather the information about the data`. <br/>
> `The size of the data itself does not directly` impact the cost of `metadata scanning unless it affects the amount of metadata generated`. The `number of metadata assets and their complexity` are the primary factors influencing costs.

Assumptions:

- The number of metadata assets is assumed based on the data volume, with an average size of 1 MB per metadata asset.
- The average size of each metadata asset is assumed to be 1 MB.
- These estimates are based on the assumption that the governed assets and data management costs are applied for 100 hours per month. Actual costs may vary based on specific agreements with Microsoft, usage patterns, etc.
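
The first assumption can be turned into a rough asset count, as in the sketch below. The helper name `estimated_metadata_assets` is hypothetical. The file does not say whether "1 MB" is decimal or binary, so the decimal default here is itself an assumption.

```python
MB = 10**6  # decimal megabyte; swap in 2**20 if volumes are measured in binary units


def estimated_metadata_assets(data_volume_bytes: int, avg_asset_size_bytes: int = MB) -> int:
    """Estimate the asset count as data volume / average asset size, rounded up."""
    if data_volume_bytes < 0:
        raise ValueError("data_volume_bytes must be non-negative")
    if avg_asset_size_bytes <= 0:
        raise ValueError("avg_asset_size_bytes must be positive")
    # Ceiling division using integers only, so large volumes stay exact.
    return -(-data_volume_bytes // avg_asset_size_bytes)
```

Under these assumptions, a 1 TB volume (`10**12` bytes) maps to 1,000,000 metadata assets.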
@@ -145,7 +150,7 @@ Assumptions:
- Include **Managed Virtual Network** and **data transfer** costs if applicable.
- Get a **real-time, region-specific estimate** (e.g., for Costa Rica or any other region).

https://github.com/user-attachments/assets/05521c11-6666-4fc8-9046-14d6958798ef
<https://github.com/user-attachments/assets/05521c11-6666-4fc8-9046-14d6958798ef>

## Additional Considerations

@@ -161,7 +166,7 @@ Assumptions:

<!-- START BADGE -->
<div align="center">
<img src="https://img.shields.io/badge/Total%20views-31-limegreen" alt="Total views">
<p>Refresh Date: 2025-07-16</p>
<img src="https://img.shields.io/badge/Total%20views-2-limegreen" alt="Total views">
<p>Refresh Date: 2025-07-17</p>
</div>
<!-- END BADGE -->
4 changes: 2 additions & 2 deletions Purview/DLP-Overview.md
@@ -79,7 +79,7 @@ Last updated: 2025-07-17

<!-- START BADGE -->
<div align="center">
<img src="https://img.shields.io/badge/Total%20views-31-limegreen" alt="Total views">
<p>Refresh Date: 2025-07-16</p>
<img src="https://img.shields.io/badge/Total%20views-2-limegreen" alt="Total views">
<p>Refresh Date: 2025-07-17</p>
</div>
<!-- END BADGE -->
14 changes: 5 additions & 9 deletions Purview/DLP-implementation.md
@@ -31,7 +31,6 @@ Last updated: 2025-07-17

<img width="550" alt="image" src="https://github.com/user-attachments/assets/da6ef80f-7ca8-456a-993d-a6d40bb28c53" />


- Select `Policies` > `Create policy`:

<img width="550" alt="image" src="https://github.com/user-attachments/assets/85f706eb-276e-4f7f-998f-f44bcf8fbfc3" />
@@ -40,7 +39,6 @@ Last updated: 2025-07-17

<img width="550" alt="image" src="https://github.com/user-attachments/assets/b4128763-851e-46ee-8a74-fdf189fa8762">


## Define Policy Scope

- Select the locations where the policy will apply (e.g., Exchange email, SharePoint sites, OneDrive accounts, Teams chat).
@@ -56,9 +54,7 @@ Last updated: 2025-07-17

<img width="550" alt="image" src="https://github.com/user-attachments/assets/bab1c3cf-2bc0-4646-aa8f-92805d68e30f" />


https://github.com/user-attachments/assets/a9165b97-f197-4f37-877e-9776015a3297

<https://github.com/user-attachments/assets/a9165b97-f197-4f37-877e-9776015a3297>

## Set Up Alerts and Notifications

@@ -71,7 +67,7 @@ Last updated: 2025-07-17

## Customize access and override settings

https://github.com/user-attachments/assets/eb3d57d3-5bef-43f2-b069-1d25c3ef047b
<https://github.com/user-attachments/assets/eb3d57d3-5bef-43f2-b069-1d25c3ef047b>

## Test and Deploy the Policy

@@ -81,7 +77,7 @@ https://github.com/user-attachments/assets/eb3d57d3-5bef-43f2-b069-1d25c3ef047b

<img width="550" alt="image" src="https://github.com/user-attachments/assets/944945f0-ad0c-49ea-b157-47613c48590b" />

https://github.com/user-attachments/assets/0a38b331-33e8-4e15-96be-3edbe79119f6
<https://github.com/user-attachments/assets/0a38b331-33e8-4e15-96be-3edbe79119f6>

## Monitor and Manage Policies

@@ -95,7 +91,7 @@ https://github.com/user-attachments/assets/eb3d57d3-5bef-43f2-b069-1d25c3ef047b

<!-- START BADGE -->
<div align="center">
<img src="https://img.shields.io/badge/Total%20views-31-limegreen" alt="Total views">
<p>Refresh Date: 2025-07-16</p>
<img src="https://img.shields.io/badge/Total%20views-2-limegreen" alt="Total views">
<p>Refresh Date: 2025-07-17</p>
</div>
<!-- END BADGE -->
17 changes: 10 additions & 7 deletions Purview/Free-and-Enterprise.md
@@ -31,12 +31,12 @@ Last updated: 2025-07-17
- [Free vs Enterprise](#free-vs-enterprise)
- [Overview](#overview)
- [Microsoft Purview Pricing Model](#microsoft-purview-pricing-model)
- [Key Differences](#key-differences)
- [Key Differences](#key-differences)
- [How Microsoft Purview can be used](#how-microsoft-purview-can-be-used)
- [Scenario 1: Data Governance for a Financial Institution](#scenario-1-data-governance-for-a-financial-institution)
- [Scenario 2: Data Protection for a Healthcare Provider](#scenario-2-data-protection-for-a-healthcare-provider)
- [Scenario 3: Data Analytics for an E-commerce Company](#scenario-3-data-analytics-for-an-e-commerce-company)
- [Scenario 4: Compliance Management for a Global Enterprise](#scenario-4-compliance-management-for-a-global-enterprise)
- [Scenario 1: Data Governance for a Financial Institution](#scenario-1-data-governance-for-a-financial-institution)
- [Scenario 2: Data Protection for a Healthcare Provider](#scenario-2-data-protection-for-a-healthcare-provider)
- [Scenario 3: Data Analytics for an E-commerce Company](#scenario-3-data-analytics-for-an-e-commerce-company)
- [Scenario 4: Compliance Management for a Global Enterprise](#scenario-4-compliance-management-for-a-global-enterprise)
- [Examples of use cases](#examples-of-use-cases)
- [Collect metadata information from Apache Airflow](#collect-metadata-information-from-apache-airflow)

@@ -72,6 +72,7 @@ Last updated: 2025-07-17
## Overview

> Keypoints of Microsoft Purview: <br/>
>
> 1. `Integration with Microsoft Ecosystem`: Purview offers deep integration with Azure, Power BI, and Microsoft 365, providing a seamless experience for organizations already using these tools. <br/>
> 2. `Advanced Governance and Compliance`: Purview provides robust governance and compliance features, ensuring your data management practices meet regulatory standards. <br/>
> 3. `AI-Powered Search and Discovery`: With AI-driven capabilities, Purview enhances data discovery and classification, making it easier to find and manage data assets. <br/>
@@ -521,18 +522,20 @@ Find below different scenarios to manage data governance, protection, and compli
> This capability is currently in public preview and is achieved through integration with **OpenLineage**, an open framework for data lineage collection and analysis.

How it works:

1. **Enable OpenLineage in Airflow**: By enabling OpenLineage in your Airflow instance, metadata and lineage information about jobs and datasets are automatically tracked as Directed Acyclic Graphs (DAGs) execute.
2. **Azure Event Hubs**: The tracked metadata and lineage information are sent to an Azure Event Hubs instance that you configure.
3. **Microsoft Purview**: Purview subscribes to the events from Azure Event Hubs, parses them, and ingests the metadata and lineage into the data map.

This integration supports capturing metadata such as:

- Airflow workspace
- Airflow DAG
- Airflow task
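
To make the flow above more concrete, the sketch below builds a minimal OpenLineage-style run event for one Airflow task, using only the Python standard library. In practice Airflow's OpenLineage integration emits these events itself, and sending them to Azure Event Hubs is not shown. The function name `build_run_event`, the `example-airflow` namespace, the producer URL, and the exact `schemaURL` version are illustrative assumptions, not values taken from this document.

```python
import json
import uuid
from datetime import datetime, timezone


def build_run_event(dag_id: str, task_id: str, event_type: str = "COMPLETE") -> dict:
    """Assemble a minimal OpenLineage-style run event for one Airflow task."""
    return {
        "eventType": event_type,  # e.g. START, COMPLETE, FAIL
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "run": {"runId": str(uuid.uuid4())},
        # Namespace and job naming are hypothetical placeholders.
        "job": {"namespace": "example-airflow", "name": f"{dag_id}.{task_id}"},
        "inputs": [],   # datasets read by the task (left empty in this sketch)
        "outputs": [],  # datasets written by the task (left empty in this sketch)
        "producer": "https://example.com/hypothetical-producer",
        # Illustrative spec version; check the OpenLineage spec for the current one.
        "schemaURL": "https://openlineage.io/spec/1-0-5/OpenLineage.json#/definitions/RunEvent",
    }


# Serialized bytes such as these are what would travel through the event stream.
payload = json.dumps(build_run_event("daily_sales", "load_orders")).encode("utf-8")
```

The `job.name` field carries the DAG and task identifiers, which correspond to the Airflow DAG and Airflow task entries in the list of captured metadata.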

<!-- START BADGE -->
<div align="center">
<img src="https://img.shields.io/badge/Total%20views-31-limegreen" alt="Total views">
<p>Refresh Date: 2025-07-16</p>
<img src="https://img.shields.io/badge/Total%20views-2-limegreen" alt="Total views">
<p>Refresh Date: 2025-07-17</p>
</div>
<!-- END BADGE -->
40 changes: 19 additions & 21 deletions README.md
@@ -18,7 +18,6 @@ Last updated: 2025-07-17

</details>


> [!IMPORTANT]
> The [Azure Databases Advisor Tool](https://microsoftcloudessentials-learninghub.github.io/Azure-Databases-Purview-Advisor/) is designed to help users select the most suitable Azure database service based on their specific use case. It provides recommendations by analyzing user inputs such as data type, scalability needs, latency requirements, and more.
> The information provided and any document (such as scripts, sample codes, etc.) is provided `AS-IS` and `WITH ALL FAULTS`. Pricing estimates are for `demonstration purposes only and do not reflect final pricing`. `Microsoft assumes no liability` for your use of this information and makes no guarantees or warranties, expressed or implied, regarding its accuracy or completeness, including any pricing details. `Please note that these demos are intended as a guide and are based on personal experiences. For official guidance, support, or more detailed information, please refer to Microsoft's official documentation or contact Microsoft directly`: [Microsoft Sales and Support](https://support.microsoft.com/contactus?ContactUsExperienceEntryPointAssetId=S.HP.SMC-HOME)
@@ -33,30 +32,29 @@ Last updated: 2025-07-17
<summary><b>Details</b> (Click to expand)</summary>

> - **Formats**<br/>
> - Structured: Stored in predefined formats like rows and columns with consistent schema enforcement.<br/>
> - Unstructured: Exists in diverse formats like free text, images, audio, video, and documents that lack a formal structure.<br/>
> - Structured: Stored in predefined formats like rows and columns with consistent schema enforcement.<br/>
> - Unstructured: Exists in diverse formats like free text, images, audio, video, and documents that lack a formal structure.<br/>
> - **Storage Model**<br/>
> - Structured: Uses rigid, predefined schemas in relational databases ensuring integrity and data validation.<br/>
> - Unstructured: Stored in flexible formats such as object storage, document stores, or blob storage without a fixed schema.<br/>
> - Structured: Uses rigid, predefined schemas in relational databases ensuring integrity and data validation.<br/>
> - Unstructured: Stored in flexible formats such as object storage, document stores, or blob storage without a fixed schema.<br/>
> - **Databases**<br/>
> - Structured: Managed through SQL-based systems like Azure SQL, MySQL, and PostgreSQL.<br/>
> - Unstructured: Supported by NoSQL systems like Cosmos DB, MongoDB, and cloud-native data lakes.<br/>
> - Structured: Managed through SQL-based systems like Azure SQL, MySQL, and PostgreSQL.<br/>
> - Unstructured: Supported by NoSQL systems like Cosmos DB, MongoDB, and cloud-native data lakes.<br/>
> - **Ease of Search**<br/>
> - Structured: Easily queried using SQL, indexing, and standardized query languages.<br/>
> - Unstructured: Requires more advanced approaches like keyword extraction, OCR, or AI-assisted search tools.<br/>
> - Structured: Easily queried using SQL, indexing, and standardized query languages.<br/>
> - Unstructured: Requires more advanced approaches like keyword extraction, OCR, or AI-assisted search tools.<br/>
> - **Analysis Methods**<br/>
> - Structured: Suited for quantitative techniques, including statistical modeling, trend analysis, and aggregation.<br/>
> - Unstructured: Often analyzed with qualitative approaches like NLP, sentiment analysis, topic modeling, or deep learning.<br/>
> - Structured: Suited for quantitative techniques, including statistical modeling, trend analysis, and aggregation.<br/>
> - Unstructured: Often analyzed with qualitative approaches like NLP, sentiment analysis, topic modeling, or deep learning.<br/>
> - **Tools and Technologies**<br/>
> - Structured: RDBMS (SQL Server, Oracle), OLTP systems, CRM platforms, and OLAP tools for analytics.<br/>
> - Unstructured: NoSQL DBMS, data mining frameworks, ML pipelines, AI services, and visualization platforms like Power BI.<br/>
> - Structured: RDBMS (SQL Server, Oracle), OLTP systems, CRM platforms, and OLAP tools for analytics.<br/>
> - Unstructured: NoSQL DBMS, data mining frameworks, ML pipelines, AI services, and visualization platforms like Power BI.<br/>
> - **Specialists**<br/>
> - Structured: Typically handled by business analysts, software engineers, solution architects, and DBAs.<br/>
> - Unstructured: Requires data scientists, AI/ML specialists, information architects, and advanced data engineers.<br/>
> - Structured: Typically handled by business analysts, software engineers, solution architects, and DBAs.<br/>
> - Unstructured: Requires data scientists, AI/ML specialists, information architects, and advanced data engineers.<br/>

</details>
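
The "Ease of Search" contrast can be illustrated with a small sketch. The table, column names, and sample text below are all hypothetical. The structured half uses an in-memory SQLite database as a stand-in for any SQL system, and the unstructured half uses a plain keyword match as a stand-in for the more advanced techniques the list mentions.

```python
import re
import sqlite3

# Structured: a fixed schema lets a single SQL statement answer the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders (customer, total) VALUES (?, ?)",
    [("alice", 120.0), ("bob", 35.5), ("alice", 80.0)],
)
alice_total = conn.execute(
    "SELECT SUM(total) FROM orders WHERE customer = ?", ("alice",)
).fetchone()[0]

# Unstructured: free text has no schema, so search falls back to pattern matching.
notes = [
    "Customer called about a refund for a damaged item.",
    "Meeting notes: discussed the Q3 roadmap.",
    "Refund approved after reviewing the photos.",
]


def keyword_hits(documents: list[str], keyword: str) -> list[int]:
    """Return the indexes of documents containing the keyword as a whole word."""
    pattern = re.compile(rf"\b{re.escape(keyword)}\b", re.IGNORECASE)
    return [i for i, text in enumerate(documents) if pattern.search(text)]
```

Here the SQL query sums Alice's orders directly, while `keyword_hits(notes, "refund")` only finds which notes mention the word and says nothing about what they mean.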


## Products/Services

```mermaid
@@ -127,7 +125,7 @@ Click here to read more about a [quick guide on SQL Server on Azure Virtual Mach
<details>
<summary><b>Azure Database for PostgreSQL</b> (PaaS) - Click to expand </summary>

> Enterprise-ready community PostgreSQL database service, fully managed by Microsoft.
> Enterprise-ready community PostgreSQL database service, fully managed by Microsoft.

> - **Benefits:** High availability with up to 99.99% SLA, built-in security, and scalability.<br/>
> - **Differentiators:** Supports PostgreSQL extensions and advanced indexing options.<br/>
@@ -169,7 +167,7 @@ Click here to read more about a [quick guide on Oracle Database on Azure](./sql/
<details>
<summary><b>SQL Server 2022</b> (IaaS) - Click to expand </summary>

> Latest release of SQL Server with built-in hybrid and cloud-connected capabilities.
> Latest release of SQL Server with built-in hybrid and cloud-connected capabilities.

> - **Benefits:** Brings innovations like ledger tables, Synapse Link, and built-in security enhancements.<br/>
> - **Differentiators:** Full hybrid flexibility for modern apps with backward compatibility.<br/>
@@ -197,7 +195,7 @@ Click here to read more about a [quick guide on Azure Cosmos DB](./nosql/azure-c
<details>
<summary><b>Azure Managed Instance for Apache Cassandra</b> (PaaS) - Click to expand </summary>

> Managed Cassandra database service designed for massive scale and availability.
> Managed Cassandra database service designed for massive scale and availability.

> - **Benefits:** Built-in automation, scalability, and hybrid deployment options.<br/>
> - **Differentiators:** Supports native Cassandra drivers and schemas with Azure-managed benefits.<br/>
@@ -252,7 +250,7 @@ Click here to read more about a [quick guide on Azure Cache for Redis](./nosql/a

<!-- START BADGE -->
<div align="center">
<img src="https://img.shields.io/badge/Total%20views-31-limegreen" alt="Total views">
<p>Refresh Date: 2025-07-16</p>
<img src="https://img.shields.io/badge/Total%20views-2-limegreen" alt="Total views">
<p>Refresh Date: 2025-07-17</p>
</div>
<!-- END BADGE -->