Commit 8ee3eca

Merge pull request #4 from MicrosoftCloudEssentials-LearningHub/update-count
comments added - updated accordingly
2 parents 6c2c988 + a19d30f commit 8ee3eca

2 files changed: +88 -41 lines changed

README.md

Lines changed: 86 additions & 39 deletions
@@ -8,7 +8,7 @@ Costa Rica
 [![GitHub](https://img.shields.io/badge/--181717?logo=github&logoColor=ffffff)](https://github.com/)
 [brown9804](https://github.com/brown9804)

-Last updated: 2025-07-29
+Last updated: 2025-07-30

 -----------------------------

@@ -274,34 +274,68 @@ Last updated: 2025-07-29

 ## Function App: Configure/Validate the Environment variables

+> [!IMPORTANT]
+> `All environment variable names must exactly match between` your `Terraform deployment configuration` (in `main.tf`) and your `Function App environment settings`. Any mismatch will cause runtime failures when the application tries to access Azure resources.
+
 > [!NOTE]
 > This example is using system-assigned managed identity to assign RBACs (Role-based Access Control).

-- Under `Settings`, go to `Environment variables`. And `+ Add` the following variables:
-
-- `COSMOS_DB_ENDPOINT`: Your Cosmos DB account endpoint 🡢 `Review the existence of this, if not create it`
-- `COSMOS_DB_KEY`: Your Cosmos DB account key 🡢 `Review the existence of this, if not create it`
-- `COSMOS_DB_CONNECTION_STRING`: Your Cosmos DB connection string 🡢 `Review the existence of this, if not create it`
-- `invoicecontosostorage_STORAGE`: Your Storage Account connection string 🡢 `Review the existence of this, if not create it`
-- `FORM_RECOGNIZER_ENDPOINT`: For example: `https://<your-form-recognizer-endpoint>.cognitiveservices.azure.com/` 🡢 `Review the existence of this, if not create it`
-- `FORM_RECOGNIZER_KEY`: Your Documment Intelligence Key (Form Recognizer). 🡢
-- `FUNCTIONS_EXTENSION_VERSION`: `~4` 🡢 `Review the existence of this, if not create it`
-- `WEBSITE_RUN_FROM_PACKAGE`: `1` 🡢 `Review the existence of this, if not create it`
-- `FUNCTIONS_WORKER_RUNTIME`: `python` 🡢 `Review the existence of this, if not create it`
-- `FUNCTIONS_NODE_BLOCK_ON_ENTRY_POINT_ERROR`: `true` (This setting ensures that all entry point errors are visible in your application insights logs). 🡢 `Review the existence of this, if not create it`
+- Under `Settings`, go to `Environment variables`. And `+ Add` the following variables. For example:

-<img width="550" alt="image" src="https://github.com/user-attachments/assets/31d813e7-38ba-46ff-9e4b-d091ae02706a">
+<img width="550" alt="image" src="https://github.com/user-attachments/assets/ec5d60f3-5136-489d-8796-474b7250865d">

-<img width="550" alt="image" src="https://github.com/user-attachments/assets/45313857-b337-4231-9184-d2bb46e19267">
+- Click on `Apply` to save your configuration.
+
+<img width="550" alt="image" src="https://github.com/user-attachments/assets/437b44bb-7735-4d17-ae49-e211eca64887">

-<img width="550" alt="image" src="https://github.com/user-attachments/assets/074d2fa5-c64d-43bd-8ed7-af6da46d86a2">
+- Here are a few examples of how to get those values. `If a Terraform deployment template was used, these are linked automatically`, so please remember to review them.

-<img width="550" alt="image" src="https://github.com/user-attachments/assets/ec5d60f3-5136-489d-8796-474b7250865d">
+<img width="550" alt="image" src="https://github.com/user-attachments/assets/31d813e7-38ba-46ff-9e4b-d091ae02706a">
+
+<img width="550" alt="image" src="https://github.com/user-attachments/assets/45313857-b337-4231-9184-d2bb46e19267">
+
+<img width="550" alt="image" src="https://github.com/user-attachments/assets/074d2fa5-c64d-43bd-8ed7-af6da46d86a2">
+
+> `These values depend on the specific you choose and deploy, like the AI models`, you can also adjust `LLM_MAX_TOKENS` based on your model's capabilities and `LLM_TEMPERATURE` based on your use case requirements.

-- Click on `Apply` to save your configuration.
+- `FUNCTIONS_EXTENSION_VERSION`: `~4` 🡢 `Review the existence of this, if not create it`
+- `WEBSITE_RUN_FROM_PACKAGE`: `1` 🡢 `Review the existence of this, if not create it`
+- `FUNCTIONS_WORKER_RUNTIME`: `python` 🡢 `Review the existence of this, if not create it`
+- `FUNCTIONS_NODE_BLOCK_ON_ENTRY_POINT_ERROR`: `true` (This setting ensures that all entry point errors are visible in your application insights logs) 🡢 `Review the existence of this, if not create it`
+- `COSMOS_DB_ENDPOINT`: Your Cosmos DB account endpoint 🡢 `Review the existence of this, if not create it`

-<img width="550" alt="image" src="https://github.com/user-attachments/assets/437b44bb-7735-4d17-ae49-e211eca64887">
+<details>
+<summary><b> </b> Click to see more</summary>
+
+- `COSMOS_DB_KEY`: Your Cosmos DB account key 🡢 `Review the existence of this, if not create it`
+- `COSMOS_DB_CONNECTION_STRING`: Your Cosmos DB connection string 🡢 `Review the existence of this, if not create it`
+- `invoicecontosostorage_STORAGE`: Your Storage Account connection string 🡢 `Review the existence of this, if not create it`
+- `FORM_RECOGNIZER_ENDPOINT`: For example: `https://<your-form-recognizer-endpoint>.cognitiveservices.azure.com/` 🡢 `Review the existence of this, if not create it`
+- `FORM_RECOGNIZER_KEY`: Your Document Intelligence Key (Form Recognizer) 🡢 `Review the existence of this, if not create it`
+- `APPINSIGHTS_INSTRUMENTATIONKEY`: Your Application Insights instrumentation key 🡢 `Review the existence of this, if not create it`
+- `APPLICATIONINSIGHTS_CONNECTION_STRING`: Your Application Insights connection string 🡢 `Review the existence of this, if not create it`
+- `VISION_API_ENDPOINT`: Your Azure AI Vision endpoint for visual cue detection 🡢 `Review the existence of this, if not create it`
+- `VISION_API_KEY`: Your Azure AI Vision API key 🡢 `Review the existence of this, if not create it`
+- `VISION_API_VERSION`: `2024-02-01` (Latest stable API version) 🡢 `Review the existence of this, if not create it`. These values depend on the specific you choose and deploy
+- `AZURE_OPENAI_ENDPOINT`: Your Azure OpenAI service endpoint 🡢 `Review the existence of this, if not create it`
+- `AZURE_OPENAI_KEY`: Your Azure OpenAI API key 🡢 `Review the existence of this, if not create it`
+- `AZURE_OPENAI_API_VERSION`: e.g `2025-04-14` 🡢 `Review the existence of this, if not create it`. These values depend on the specific you choose and deploy
+- `AZURE_OPENAI_GPT4_DEPLOYMENT`: Your e.g GPT-4 deployment name for complex reasoning and analysis 🡢 `Review the existence of this, if not create it`. These values depend on the specific you choose and deploy
+- `AZURE_OPENAI_GPT4O_DEPLOYMENT`: Your e.g GPT-4o deployment name for advanced multimodal processing 🡢 `Review the existence of this, if not create it`. These values depend on the specific you choose and deploy
+- `AZURE_OPENAI_EMBEDDING_DEPLOYMENT`: Your text embedding deployment name for semantic search 🡢 `Review the existence of this, if not create it`
+- `AI_HUB_NAME`: Your AI Studio Hub name for model management 🡢 `Review the existence of this, if not create it`
+- `AI_PROJECT_NAME`: Your AI Studio Project name 🡢 `Review the existence of this, if not create it`
+- `AI_HUB_WORKSPACE_URL`: Your AI Hub workspace URL 🡢 `Review the existence of this, if not create it`
+- `AI_PROJECT_WORKSPACE_URL`: Your AI Project workspace URL 🡢 `Review the existence of this, if not create it`
+- `AI_STORAGE_ACCOUNT_NAME`: Your AI storage account name for model artifacts 🡢 `Review the existence of this, if not create it`
+- `AI_STORAGE_CONNECTION`: Your AI storage connection string 🡢 `Review the existence of this, if not create it`
+- `ENABLE_LLM_PROCESSING`: `true` (Enable LLM-powered PDF processing features) 🡢 `Review the existence of this, if not create it`
+- `LLM_MAX_TOKENS`: `4000` (Maximum tokens per request - adjust based on your model choice) 🡢 `Review the existence of this, if not create it`
+- `LLM_TEMPERATURE`: `0.1` (Low temperature for consistent extraction - adjust based on use case) 🡢 `Review the existence of this, if not create it`
+- `LLM_TIMEOUT_SECONDS`: `120` (Timeout for LLM requests - may need adjustment depending on model response time) 🡢 `Review the existence of this, if not create it`

+</details>
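Since a name mismatch between `main.tf` and the Function App settings only surfaces at runtime, the settings listed in this hunk could be checked once at startup. The following is a minimal sketch, not code from the demo's `function_app.py`: the helpers `missing_settings` and `load_llm_settings` are hypothetical, the choice of which names count as required is an assumption, and the fallback values simply mirror the example values shown in the list.

```python
import os

# Names copied from the settings list; treating exactly these as required
# is an assumption, adjust to the services you actually deploy.
REQUIRED_SETTINGS = [
    "COSMOS_DB_ENDPOINT",
    "FORM_RECOGNIZER_ENDPOINT",
    "FORM_RECOGNIZER_KEY",
]


def missing_settings(required, env=None):
    """Return the required names that are absent or empty in the environment."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


def load_llm_settings(env=None):
    """Parse the LLM tuning values, falling back to the listed example values."""
    env = os.environ if env is None else env
    return {
        "enabled": env.get("ENABLE_LLM_PROCESSING", "true").strip().lower() == "true",
        "max_tokens": int(env.get("LLM_MAX_TOKENS", "4000")),
        "temperature": float(env.get("LLM_TEMPERATURE", "0.1")),
        "timeout_seconds": int(env.get("LLM_TIMEOUT_SECONDS", "120")),
    }
```

Calling `missing_settings(REQUIRED_SETTINGS)` before creating any client would let the function log the exact missing names instead of failing later with a less specific error.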
+
 ## Function App: Develop the logic

 - You need to install [VSCode](https://code.visualstudio.com/download)
@@ -339,7 +373,7 @@ Last updated: 2025-07-29

 <img width="550" alt="image" src="https://github.com/user-attachments/assets/0a4ed541-a693-485c-b6ca-7d5fb55a61d2">

-- Provide a function name, like `BlobTriggerPDFMultiLayoutsDocIntelligence`:
+- Provide a function name, like `BlobTriggerPDFsMultiLayoutsDocIntelligence`:

 <img width="550" alt="image" src="https://github.com/user-attachments/assets/263cef5c-4460-46cb-8899-fb609b191d81">

@@ -361,22 +395,35 @@ Last updated: 2025-07-29

 - Now we need to update the function code to extract data from PDFs and store it in Cosmos DB, use this an example:

-> 1. **PDF Upload**: A PDF file is uploaded to the Azure Blob Storage container (`pdfinvoices`).
-> 2. **Trigger Azure Function**: The upload triggers the Azure Function `BlobTriggerContosoPDFLayoutsDocIntelligence`.
-> 3. **Initialize Clients**: Sets up connections to Azure Document Intelligence and Cosmos DB.
-> - Initializes the `DocumentAnalysisClient` using the `FORM_RECOGNIZER_ENDPOINT` and `FORM_RECOGNIZER_KEY` environment variables.
-> - Initializes the `CosmosClient` using Azure Active Directory (AAD) via `DefaultAzureCredential`.
-> 4. **Read PDF from Blob Storage**: Reads the PDF content from the blob into a byte stream.
-> 5. **Analyze PDF**: Uses Azure Document Intelligence to analyze the layout of the PDF.
-> - Calls `begin_analyze_document` with the `prebuilt-layout` model.
-> - Waits for the analysis to complete and retrieves the layout result.
-> 6. **Extract Layout Data**: Parses and structures the layout data from the analysis result.
-> - Extracts lines, tables, and selection marks from each page.
-> - Logs styles (e.g., handwritten content) and organizes data into a structured dictionary.
-> 7. **Save Data to Cosmos DB**: Saves the structured layout data to Cosmos DB.
-> - Ensures the database (`ContosoDBDocIntellig`) and container (`Layouts`) exist or creates them.
-> - Inserts or updates the layout data using `upsert_item`.
-> 8. **Logging (Process and Errors)**: Logs each step of the process, including success messages and detailed error handling for debugging and monitoring.
+> 1. **PDF Upload**: A PDF file is uploaded to the Azure Blob Storage container (`pdfinvoices`).
+> 2. **Trigger Azure Function**: The upload triggers the Azure Function `BlobTriggerPDFsMultiLayoutsAIDocIntelligence`.
+> 3. **Initialize Clients**: Sets up connections to Azure Document Intelligence, AI Vision, OpenAI, and Cosmos DB.
+> - Initializes the `DocumentAnalysisClient` using the `FORM_RECOGNIZER_ENDPOINT` and `FORM_RECOGNIZER_KEY` environment variables.
+> - Initializes the `AzureOpenAI` client for LLM analysis using OpenAI deployment details.
+> - Configures the Vision API for visual cue detection.
+> - Sets up the `CosmosClient` for data storage.
+> 4. **Read PDF from Blob Storage**: Reads the PDF content from the blob into a byte stream.
+> 5. **Analyze PDF**: Uses Azure Document Intelligence to analyze the layout of the PDF.
+> - Calls `begin_analyze_document` with the `prebuilt-layout` model.
+> - Waits for the analysis to complete and retrieves the layout result.
+> 6. **Extract Layout Data**: Parses and structures the layout data from the analysis result.
+> - Extracts lines, tables, and selection marks from each page.
+> - Identifies visual selection cues using AI Vision for enhanced form recognition.
+> - Logs styles (e.g., handwritten content) and organizes data into a structured dictionary.
+> 7. **Enhance with AI Vision**: Analyzes visual elements for additional insights.
+> - Detects and processes visual selection cues that Document Intelligence might miss.
+> - Combines visual analysis with document structure understanding.
+> 8. **Apply LLM Analysis**: Uses Azure OpenAI for semantic understanding of document content.
+> - Prepares structured content for the LLM with meaningful context.
+> - Analyzes content relationships and extracts high-level insights.
+> 9. **Save Data to Cosmos DB**: Saves the structured layout data to Cosmos DB.
+> - Ensures the database (`DocumentAnalysisDB`) and container (`ProcessedDocuments`) exist or creates them.
+> - Prepares document for storage with metadata and timestamps.
+> - Inserts or updates the layout data using `upsert_item`.
+> 10. **Logging (Process and Errors)**: Logs each step of the process, including success messages and detailed error handling for debugging and monitoring.
+> - Uses structured logging for better traceability.
+> - Includes processing time metrics for performance analysis.
+> - Provides comprehensive error handling with meaningful diagnostics.
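Steps 9 and 10 above mention storing metadata, timestamps, and processing-time metrics alongside the extracted layout. A rough sketch of that preparation step is shown here using only the standard library. `prepare_document` and its field names are hypothetical and are not the schema used in `./src/function_app.py`; the layout is assumed to already be a plain dictionary, and using a random UUID as the item `id` is also an assumption.

```python
import time
import uuid
from datetime import datetime, timezone


def prepare_document(blob_name, layout, started_at, finished_at=None):
    """Wrap extracted layout data with metadata before it is stored.

    `started_at` and `finished_at` are readings from time.monotonic();
    if `finished_at` is omitted, the current reading is used.
    """
    finished_at = time.monotonic() if finished_at is None else finished_at
    return {
        "id": str(uuid.uuid4()),  # assumption: one random id per processed PDF
        "source_blob": blob_name,
        "processed_at": datetime.now(timezone.utc).isoformat(),
        "processing_seconds": round(finished_at - started_at, 3),
        "layout": layout,
    }
```

The returned dictionary is the kind of item that could then be handed to the `upsert_item` call from step 9, and `processing_seconds` gives the per-document timing that step 10 refers to.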

 - Update the function_app.py, for example [see the code used in this demo](./src/function_app.py):

@@ -444,15 +491,15 @@ Last updated: 2025-07-29

 - Check all the logs, and traces generated. Also review the information parsed:

-<img width="550" alt="image" src="https://github.com/user-attachments/assets/8f4631cc-162e-4c3b-913d-d146ea4e36b3">
+<img width="550" alt="image" src="https://github.com/user-attachments/assets/2f28cc69-d389-4ef6-9209-57f76c9a09aa" />

 - Validate that the information was uploaded to the Cosmos DB. Under `Data Explorer`, check your `Database`.

 <img width="550" alt="image" src="https://github.com/user-attachments/assets/27309a6d-c654-4c76-bbc1-990a9338973c">

 <!-- START BADGE -->
 <div align="center">
-<img src="https://img.shields.io/badge/Total%20views-1616-limegreen" alt="Total views">
-<p>Refresh Date: 2025-07-29</p>
+<img src="https://img.shields.io/badge/Total%20views-1710-limegreen" alt="Total views">
+<p>Refresh Date: 2025-07-30</p>
 </div>
 <!-- END BADGE -->

terraform-infrastructure/README.md

Lines changed: 2 additions & 2 deletions
@@ -109,7 +109,7 @@ graph TD;

 <!-- START BADGE -->
 <div align="center">
-<img src="https://img.shields.io/badge/Total%20views-1616-limegreen" alt="Total views">
-<p>Refresh Date: 2025-07-29</p>
+<img src="https://img.shields.io/badge/Total%20views-1710-limegreen" alt="Total views">
+<p>Refresh Date: 2025-07-30</p>
 </div>
 <!-- END BADGE -->
