## Function App: Configure/Validate the Environment variables
276
276
277
+
> [!IMPORTANT]
278
+
> `All environment variable names must exactly match between` your `Terraform deployment configuration` (in `main.tf`) and your `Function App environment settings`. Any mismatch will cause runtime failures when the application tries to access Azure resources.
279
+
277
280
> [!NOTE]
278
281
> This example uses a system-assigned managed identity to assign RBAC (role-based access control) roles.
279
282
280
-
- Under `Settings`, go to `Environment variables`. And `+ Add` the following variables:
281
-
282
-
-`COSMOS_DB_ENDPOINT`: Your Cosmos DB account endpoint 🡢 `Review the existence of this, if not create it`
283
-
-`COSMOS_DB_KEY`: Your Cosmos DB account key 🡢 `Review the existence of this, if not create it`
284
-
-`COSMOS_DB_CONNECTION_STRING`: Your Cosmos DB connection string 🡢 `Review the existence of this, if not create it`
285
-
-`invoicecontosostorage_STORAGE`: Your Storage Account connection string 🡢 `Review the existence of this, if not create it`
286
-
-`FORM_RECOGNIZER_ENDPOINT`: For example: `https://<your-form-recognizer-endpoint>.cognitiveservices.azure.com/` 🡢 `Review the existence of this, if not create it`
287
-
-`FORM_RECOGNIZER_KEY`: Your Documment Intelligence Key (Form Recognizer). 🡢
288
-
-`FUNCTIONS_EXTENSION_VERSION`: `~4` 🡢 `Review the existence of this, if not create it`
289
-
-`WEBSITE_RUN_FROM_PACKAGE`: `1` 🡢 `Review the existence of this, if not create it`
290
-
-`FUNCTIONS_WORKER_RUNTIME`: `python` 🡢 `Review the existence of this, if not create it`
291
-
-`FUNCTIONS_NODE_BLOCK_ON_ENTRY_POINT_ERROR`: `true` (This setting ensures that all entry point errors are visible in your application insights logs). 🡢 `Review the existence of this, if not create it`
283
+
- Under `Settings`, go to `Environment variables`, then use `+ Add` to create the following variables. For example:
- Here are a few examples of how to get those values. `If a Terraform deployment template was used, these are linked automatically`, so please remember to review them.
> `These values depend on the specific resources you choose and deploy, such as the AI models`. You can also adjust `LLM_MAX_TOKENS` based on your model's capabilities and `LLM_TEMPERATURE` based on your use case requirements.
300
300
301
-
- Click on `Apply` to save your configuration.
301
+
-`FUNCTIONS_EXTENSION_VERSION`: `~4` 🡢 `Review the existence of this, if not create it`
302
+
-`WEBSITE_RUN_FROM_PACKAGE`: `1` 🡢 `Review the existence of this, if not create it`
303
+
-`FUNCTIONS_WORKER_RUNTIME`: `python` 🡢 `Review the existence of this, if not create it`
304
+
-`FUNCTIONS_NODE_BLOCK_ON_ENTRY_POINT_ERROR`: `true` (This setting ensures that all entry point errors are visible in your Application Insights logs) 🡢 `Review the existence of this, if not create it`
305
+
-`COSMOS_DB_ENDPOINT`: Your Cosmos DB account endpoint 🡢 `Review the existence of this, if not create it`
-`COSMOS_DB_KEY`: Your Cosmos DB account key 🡢 `Review the existence of this, if not create it`
311
+
-`COSMOS_DB_CONNECTION_STRING`: Your Cosmos DB connection string 🡢 `Review the existence of this, if not create it`
312
+
-`invoicecontosostorage_STORAGE`: Your Storage Account connection string 🡢 `Review the existence of this, if not create it`
313
+
-`FORM_RECOGNIZER_ENDPOINT`: For example: `https://<your-form-recognizer-endpoint>.cognitiveservices.azure.com/` 🡢 `Review the existence of this, if not create it`
314
+
-`FORM_RECOGNIZER_KEY`: Your Document Intelligence Key (Form Recognizer) 🡢 `Review the existence of this, if not create it`
315
+
-`APPINSIGHTS_INSTRUMENTATIONKEY`: Your Application Insights instrumentation key 🡢 `Review the existence of this, if not create it`
316
+
-`APPLICATIONINSIGHTS_CONNECTION_STRING`: Your Application Insights connection string 🡢 `Review the existence of this, if not create it`
317
+
-`VISION_API_ENDPOINT`: Your Azure AI Vision endpoint for visual cue detection 🡢 `Review the existence of this, if not create it`
318
+
-`VISION_API_KEY`: Your Azure AI Vision API key 🡢 `Review the existence of this, if not create it`
319
+
-`VISION_API_VERSION`: `2024-02-01` (Latest stable API version) 🡢 `Review the existence of this, if not create it`. This value depends on the specific resource you choose and deploy.
320
+
-`AZURE_OPENAI_ENDPOINT`: Your Azure OpenAI service endpoint 🡢 `Review the existence of this, if not create it`
321
+
-`AZURE_OPENAI_KEY`: Your Azure OpenAI API key 🡢 `Review the existence of this, if not create it`
322
+
-`AZURE_OPENAI_API_VERSION`: e.g., `2025-04-14` 🡢 `Review the existence of this, if not create it`. This value depends on the specific resource you choose and deploy.
323
+
-`AZURE_OPENAI_GPT4_DEPLOYMENT`: Your deployment name (e.g., GPT-4) for complex reasoning and analysis 🡢 `Review the existence of this, if not create it`. This value depends on the specific model you choose and deploy.
324
+
-`AZURE_OPENAI_GPT4O_DEPLOYMENT`: Your deployment name (e.g., GPT-4o) for advanced multimodal processing 🡢 `Review the existence of this, if not create it`. This value depends on the specific model you choose and deploy.
325
+
-`AZURE_OPENAI_EMBEDDING_DEPLOYMENT`: Your text embedding deployment name for semantic search 🡢 `Review the existence of this, if not create it`
326
+
-`AI_HUB_NAME`: Your AI Studio Hub name for model management 🡢 `Review the existence of this, if not create it`
327
+
-`AI_PROJECT_NAME`: Your AI Studio Project name 🡢 `Review the existence of this, if not create it`
328
+
-`AI_HUB_WORKSPACE_URL`: Your AI Hub workspace URL 🡢 `Review the existence of this, if not create it`
329
+
-`AI_PROJECT_WORKSPACE_URL`: Your AI Project workspace URL 🡢 `Review the existence of this, if not create it`
330
+
-`AI_STORAGE_ACCOUNT_NAME`: Your AI storage account name for model artifacts 🡢 `Review the existence of this, if not create it`
331
+
-`AI_STORAGE_CONNECTION`: Your AI storage connection string 🡢 `Review the existence of this, if not create it`
332
+
-`ENABLE_LLM_PROCESSING`: `true` (Enable LLM-powered PDF processing features) 🡢 `Review the existence of this, if not create it`
333
+
-`LLM_MAX_TOKENS`: `4000` (Maximum tokens per request - adjust based on your model choice) 🡢 `Review the existence of this, if not create it`
334
+
-`LLM_TEMPERATURE`: `0.1` (Low temperature for consistent extraction - adjust based on use case) 🡢 `Review the existence of this, if not create it`
335
+
-`LLM_TIMEOUT_SECONDS`: `120` (Timeout for LLM requests - may need adjustment depending on model response time) 🡢 `Review the existence of this, if not create it`
304
336
337
+
</details>
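
Because a mismatched or missing variable name only surfaces at runtime, it can help to check the settings before the function does any work. The following is a minimal sketch, not part of the demo code: `REQUIRED_SETTINGS`, `missing_settings`, and `load_llm_settings` are hypothetical names, and the list of required variables is an assumption that should be trimmed to the services you actually deployed. The fallback values mirror the ones listed above.

```python
import os

# Hypothetical selection for illustration; adjust to the services you deployed.
REQUIRED_SETTINGS = [
    "FUNCTIONS_EXTENSION_VERSION",
    "FUNCTIONS_WORKER_RUNTIME",
    "COSMOS_DB_ENDPOINT",
    "FORM_RECOGNIZER_ENDPOINT",
    "FORM_RECOGNIZER_KEY",
    "VISION_API_ENDPOINT",
    "AZURE_OPENAI_ENDPOINT",
]


def missing_settings(env, required=REQUIRED_SETTINGS):
    """Return the required names that are absent or blank in `env`."""
    return [name for name in required if not str(env.get(name, "")).strip()]


def load_llm_settings(env):
    """Parse the LLM tuning settings, falling back to the values listed above."""
    enabled = str(env.get("ENABLE_LLM_PROCESSING", "true")).strip().lower()
    return {
        "enabled": enabled == "true",
        "max_tokens": int(env.get("LLM_MAX_TOKENS", "4000")),
        "temperature": float(env.get("LLM_TEMPERATURE", "0.1")),
        "timeout_seconds": int(env.get("LLM_TIMEOUT_SECONDS", "120")),
    }


if __name__ == "__main__":
    missing = missing_settings(os.environ)
    if missing:
        print("Missing settings: " + ", ".join(missing))
    else:
        print("All required settings are present.")
    print(load_llm_settings(os.environ))
```

Running this locally with the values from `local.settings.json` exported, or calling `missing_settings(os.environ)` at the top of the function, is one way to catch a name that differs between `main.tf` and the Function App settings.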
338
+
305
339
## Function App: Develop the logic
306
340
307
341
- You need to install [VSCode](https://code.visualstudio.com/download)
- Now we need to update the function code to extract data from PDFs and store it in Cosmos DB. Use this as an example:
363
397
364
-
> 1.**PDF Upload**: A PDF file is uploaded to the Azure Blob Storage container (`pdfinvoices`).
365
-
> 2.**Trigger Azure Function**: The upload triggers the Azure Function `BlobTriggerContosoPDFLayoutsDocIntelligence`.
366
-
> 3.**Initialize Clients**: Sets up connections to Azure Document Intelligence and Cosmos DB.
367
-
> - Initializes the `DocumentAnalysisClient` using the `FORM_RECOGNIZER_ENDPOINT` and `FORM_RECOGNIZER_KEY` environment variables.
368
-
> - Initializes the `CosmosClient` using Azure Active Directory (AAD) via `DefaultAzureCredential`.
369
-
> 4.**Read PDF from Blob Storage**: Reads the PDF content from the blob into a byte stream.
370
-
> 5.**Analyze PDF**: Uses Azure Document Intelligence to analyze the layout of the PDF.
371
-
> - Calls `begin_analyze_document` with the `prebuilt-layout` model.
372
-
> - Waits for the analysis to complete and retrieves the layout result.
373
-
> 6.**Extract Layout Data**: Parses and structures the layout data from the analysis result.
374
-
> - Extracts lines, tables, and selection marks from each page.
375
-
> - Logs styles (e.g., handwritten content) and organizes data into a structured dictionary.
376
-
> 7.**Save Data to Cosmos DB**: Saves the structured layout data to Cosmos DB.
377
-
> - Ensures the database (`ContosoDBDocIntellig`) and container (`Layouts`) exist or creates them.
378
-
> - Inserts or updates the layout data using `upsert_item`.
379
-
> 8.**Logging (Process and Errors)**: Logs each step of the process, including success messages and detailed error handling for debugging and monitoring.
398
+
> 1.**PDF Upload**: A PDF file is uploaded to the Azure Blob Storage container (`pdfinvoices`).
399
+
> 2.**Trigger Azure Function**: The upload triggers the Azure Function `BlobTriggerPDFsMultiLayoutsAIDocIntelligence`.
400
+
> 3.**Initialize Clients**: Sets up connections to Azure Document Intelligence, AI Vision, OpenAI, and Cosmos DB.
401
+
> - Initializes the `DocumentAnalysisClient` using the `FORM_RECOGNIZER_ENDPOINT` and `FORM_RECOGNIZER_KEY` environment variables.
402
+
> - Initializes the `AzureOpenAI` client for LLM analysis using OpenAI deployment details.
403
+
> - Configures the Vision API for visual cue detection.
404
+
> - Sets up the `CosmosClient` for data storage.
405
+
> 4.**Read PDF from Blob Storage**: Reads the PDF content from the blob into a byte stream.
406
+
> 5.**Analyze PDF**: Uses Azure Document Intelligence to analyze the layout of the PDF.
407
+
> - Calls `begin_analyze_document` with the `prebuilt-layout` model.
408
+
> - Waits for the analysis to complete and retrieves the layout result.
409
+
> 6.**Extract Layout Data**: Parses and structures the layout data from the analysis result.
410
+
> - Extracts lines, tables, and selection marks from each page.
411
+
> - Identifies visual selection cues using AI Vision for enhanced form recognition.
412
+
> - Logs styles (e.g., handwritten content) and organizes data into a structured dictionary.
413
+
> 7.**Enhance with AI Vision**: Analyzes visual elements for additional insights.
414
+
> - Detects and processes visual selection cues that Document Intelligence might miss.
415
+
> - Combines visual analysis with document structure understanding.
416
+
> 8.**Apply LLM Analysis**: Uses Azure OpenAI for semantic understanding of document content.
417
+
> - Prepares structured content for the LLM with meaningful context.
418
+
> - Analyzes content relationships and extracts high-level insights.
419
+
> 9.**Save Data to Cosmos DB**: Saves the structured layout data to Cosmos DB.
420
+
> - Ensures the database (`DocumentAnalysisDB`) and container (`ProcessedDocuments`) exist or creates them.
421
+
> - Prepares document for storage with metadata and timestamps.
422
+
> - Inserts or updates the layout data using `upsert_item`.
423
+
> 10.**Logging (Process and Errors)**: Logs each step of the process, including success messages and detailed error handling for debugging and monitoring.
424
+
> - Uses structured logging for better traceability.
425
+
> - Includes processing time metrics for performance analysis.
426
+
> - Provides comprehensive error handling with meaningful diagnostics.
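
The layout analysis, extraction, and storage steps above can be sketched roughly as follows. This is an illustrative outline under stated assumptions, not the code in `function_app.py`: the helper names (`analyze_layout`, `layout_to_dict`, `build_cosmos_item`) and the item fields other than `id` are hypothetical, and the AI Vision and LLM steps are omitted. `analyze_layout` assumes the `azure-ai-formrecognizer` package is installed; the other helpers use only the standard library.

```python
import time
import uuid
from datetime import datetime, timezone
from types import SimpleNamespace


def analyze_layout(pdf_bytes, endpoint, key):
    """Run the prebuilt-layout model (needs azure-ai-formrecognizer)."""
    # Imported lazily so the helpers below work without the SDK installed.
    from azure.ai.formrecognizer import DocumentAnalysisClient
    from azure.core.credentials import AzureKeyCredential

    client = DocumentAnalysisClient(endpoint, AzureKeyCredential(key))
    poller = client.begin_analyze_document("prebuilt-layout", document=pdf_bytes)
    return poller.result()  # blocks until the analysis completes


def layout_to_dict(result):
    """Collect lines, selection marks, and table cells into plain dicts."""
    pages = []
    for page in result.pages:
        pages.append({
            "page_number": page.page_number,
            "lines": [line.content for line in (page.lines or [])],
            "selection_marks": [
                {"state": str(mark.state), "confidence": mark.confidence}
                for mark in (page.selection_marks or [])
            ],
        })
    tables = []
    for table in (result.tables or []):
        tables.append({
            "row_count": table.row_count,
            "column_count": table.column_count,
            "cells": [
                {"row": c.row_index, "column": c.column_index, "content": c.content}
                for c in table.cells
            ],
        })
    return {"pages": pages, "tables": tables}


def build_cosmos_item(blob_name, layout, started_at):
    """Wrap the layout with an id, a UTC timestamp, and the elapsed time."""
    return {
        "id": str(uuid.uuid4()),  # Cosmos DB items need a string "id"
        "source_blob": blob_name,  # hypothetical field name
        "processed_at": datetime.now(timezone.utc).isoformat(),
        "processing_seconds": round(time.monotonic() - started_at, 3),
        "layout": layout,
    }


# Offline demo with a stand-in object shaped like an analysis result.
started = time.monotonic()
fake_result = SimpleNamespace(
    pages=[SimpleNamespace(
        page_number=1,
        lines=[SimpleNamespace(content="Invoice 42")],
        selection_marks=[SimpleNamespace(state="selected", confidence=0.9)],
    )],
    tables=[],
)
item = build_cosmos_item("sample.pdf", layout_to_dict(fake_result), started)
print(item["layout"]["pages"][0]["lines"])
```

In the function itself, the dictionary returned by `build_cosmos_item` would be what gets passed to `upsert_item` on the Cosmos DB container.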
380
427
381
428
- Update `function_app.py`; for example, [see the code used in this demo](./src/function_app.py):
382
429
@@ -444,15 +491,15 @@ Last updated: 2025-07-29
444
491
445
492
- Check all the generated logs and traces, and review the parsed information: