Revise README for clarity on AI development phases
Updated the README to enhance clarity on the development process from basic coding to AI agents, including detailed phases and important considerations for production environments.
<details>
<summary><b>List of References</b> (Click to expand)</summary>
Last updated: 2025-11-03
</details>
> In the context of developing an end-to-end (E2E) solution or application, each stage builds confidence (technical, functional, and strategic) until we're ready to scale and support the solution in the real world. Think of them as milestones in the journey from idea to production:
`Idea → PoC → PoV → MVP → Dev → Test → UAT → Prod → Continuous Improvement`
<details>
<summary><b>Detailed phases</b> (Click to expand)</summary>
| Phase | Goal | What Happens | Focus | Audience | Example |
|---|---|---|---|---|---|
|**PoC (Proof of Concept)**| Validate technical feasibility | Build a minimal version to prove that the core idea or technology can work | Infrastructure setup, basic UI, simple workflows, mock data | Internal tech teams, architects | Can we integrate this new AI model into our system? |
|**PoV (Proof of Value)**| Demonstrate business value | Expand the PoC to show how the solution aligns with business goals and KPIs | Real use cases, measurable outcomes, stakeholder engagement | Business leaders, sponsors, decision-makers | Does this solution reduce processing time by 30% as expected? |
|**MVP (Minimum Viable Product)**| Deliver a usable product with core features | Build the smallest set of features that delivers value and can be deployed | Real users, feedback loops, iterative improvements | Early adopters, pilot users | A working app with login, dashboard, and one key feature |
|**Dev (Development)**| Build and refine the product | Full-scale development of features, integrations, and backend logic | Code quality, version control, collaboration | Developers, QA, product managers | — |
|**Test (System/Integration Testing)**| Ensure the system works as expected | Run automated/manual tests, fix bugs, validate integrations | Functional testing, regression testing, performance | QA teams, developers | — |
|**UAT (User Acceptance Testing)**| Validate with real users before go-live | Business users test the system in a near-production environment | Usability, business rules, edge cases | End users, business analysts, stakeholders | — |
|**Prod (Production)**| Go live and deliver value | Deploy the solution to the live environment for real users | Stability, monitoring, support, feedback | All users, support teams, business owners | — |
> `How do we move from basic coding all the way to AI agents?`
>
> - We all `start with scripting`; no matter the language, it's the first step: `simple or complex instructions, written line by line` to get something done.
> - Then comes `machine learning`. At this stage we're not reinventing the math; we're `leveraging powerful packages built on deep statistical and mathematical foundations.` These tools let us `automate smarter processes, like reviewing claims with predictive analytics. You're not just coding anymore; you're building systems that learn and adapt.`
> - Next come `LLMs`. This is what most people mean when they say `AI`. Think of `yourself as the architect and the LLM as your strategic engine. You can plug into it via an API key or through integrated services. It's not just about automation; it's about reasoning, understanding, and generating human-like responses.`
> - And finally, `agents`. These are LLMs with the `ability to act`. They don't just respond; `they take initiative. They can create code, trigger workflows, make decisions, and interact with tools and other agents. It's where intelligence meets execution.`
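The progression above can be sketched in miniature. This is a hedged illustration only: `fake_llm` and `get_weather` are hypothetical stand-ins, since a real agent would call a hosted LLM and real tools.

```python
def fake_llm(prompt: str) -> str:
    # Stub standing in for a real LLM API call (hypothetical).
    # A real model would reason over the prompt; here we pattern-match.
    if "weather" in prompt.lower():
        return "TOOL:get_weather"
    return "ANSWER:I can respond directly."

def get_weather() -> str:
    # A "tool" the agent can invoke (hypothetical stand-in).
    return "Sunny, 22 C"

def agent(user_input: str) -> str:
    # Agent loop: the LLM decides whether to answer or act via a tool.
    decision = fake_llm(user_input)
    if decision.startswith("TOOL:get_weather"):
        return get_weather()
    return decision.split("ANSWER:", 1)[1]

print(agent("What is the weather today?"))
```

The scripting stage is the function bodies themselves; the agent stage is the loop that lets the model pick an action instead of just returning text.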
</details>
> [!NOTE]
> How to query from a `SharePoint Library`: [GPT-RAG Data Ingestion](https://github.com/Azure/gpt-rag-ingestion/tree/main)
<details>
<summary><b> Details </b> (Click to expand)</summary>
> - Access & Authentication: Integration uses a `service principal account` registered in Microsoft Entra ID to authenticate and access the SharePoint document library via the Microsoft Graph API. This avoids using personal accounts for programmatic access.
> - Data Ingestion Flow: The RAG system connects to the SharePoint library using the provided credentials, retrieves documents (mainly PDFs), and processes them for indexing.
> - Code Structure: Key integration logic resides in files such as:
SharePoint Site → Metadata Streamer → Document Downloader → Chunker → Az
Deleted Items Checker → Purge Deleted Items
```
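As a rough illustration of the service-principal flow described above, here is a stdlib-only sketch of acquiring an app-only Microsoft Graph token via the client-credentials grant and building the drive-listing URL. The tenant, client, site, and drive IDs are placeholders, and the actual repository code is more involved.

```python
import urllib.parse
import urllib.request

def token_request(tenant: str, client_id: str, secret: str) -> urllib.request.Request:
    # OAuth2 client-credentials grant: app-only token for Microsoft Graph,
    # issued by the Entra ID token endpoint for the tenant.
    url = f"https://login.microsoftonline.com/{tenant}/oauth2/v2.0/token"
    body = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": secret,
        "scope": "https://graph.microsoft.com/.default",
    }).encode()
    return urllib.request.Request(url, data=body, method="POST")

def list_drive_items_url(site_id: str, drive_id: str) -> str:
    # Graph endpoint that enumerates files in a SharePoint document library.
    return f"https://graph.microsoft.com/v1.0/sites/{site_id}/drives/{drive_id}/root/children"

# With a valid token, the ingestion job would GET this URL with an
# "Authorization: Bearer <token>" header and download each returned file.
print(list_drive_items_url("<site-id>", "<drive-id>"))
```

In practice a library such as MSAL handles the token flow; this sketch only shows the shape of the requests involved.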
</details>
> [!NOTE]
> How to query from `SQL on-prem`: <br/>
<details>
<summary><b> Details </b> (Click to expand)</summary>
> This process `involves converting natural language to SQL: we integrated the SQL database with the agentic framework. When a user submits a query from the frontend, the system extracts relevant schema details from the AI Search index to generate a SQL query with a few example cases. The query is then executed on the SQL server to fetch the records, and the results are displayed in natural language on the UI using an LLM.` Here is more about how it works: [GPT-RAG Orchestrator](https://github.com/Azure/gpt-rag-orchestrator)
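A minimal sketch of that flow, assuming a stubbed LLM (`fake_llm_sql` is hypothetical) and an in-memory SQLite table standing in for the on-prem SQL Server:

```python
import sqlite3

def fake_llm_sql(question: str, schema: str) -> str:
    # Stub for the LLM step (hypothetical): a real system prompts the model
    # with the schema plus few-shot examples and receives SQL back.
    return "SELECT COUNT(*) FROM claims WHERE status = 'open'"

def answer(question: str, conn: sqlite3.Connection) -> str:
    # 1) retrieve schema context, 2) generate SQL, 3) execute, 4) summarize.
    schema = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type='table'").fetchone()[0]
    sql = fake_llm_sql(question, schema)
    (count,) = conn.execute(sql).fetchone()
    # The final NL summary would normally also come from an LLM.
    return f"There are {count} open claims."

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE claims (id INTEGER, status TEXT)")
conn.executemany("INSERT INTO claims VALUES (?, ?)",
                 [(1, "open"), (2, "closed"), (3, "open")])
print(answer("How many claims are open?", conn))
```

The real orchestrator adds guardrails (schema retrieval from the search index, query validation, error handling) that this sketch omits.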
</details>
> [!IMPORTANT]
> Disclaimer: This repository contains an example of a Retrieval-Augmented Generation (RAG) chatbot with a basic architecture (designed for scenarios without network isolation) and a standard Zero-Trust Architecture deployment. This is `just a guide`, not an official solution. For official guidance, support, or more detailed information, please refer to [RAG with Zero-Trust – Architecture Reference in Microsoft's official documentation](https://github.com/Azure/GPT-RAG) or contact Microsoft directly: [Microsoft Sales and Support](https://support.microsoft.com/contactus?ContactUsExperienceEntryPointAssetId=S.HP.SMC-HOME)
### Important Considerations for Production Environment
<details>
<summary>Click to expand</summary>
> Some considerations:
<details>
<summary>Public Network Site</summary>
</details>
</details>
## Zero Trust Architecture
> Zero Trust AI architecture in Microsoft Azure is a `security framework designed to protect data, applications, and infrastructure by assuming that threats can come from both inside and outside the network`. This model operates on the principle of "never trust, always verify", meaning `every access request is thoroughly authenticated and authorized based on all available data points, regardless of its origin. The architecture integrates multiple layers of security, including strong identity verification, device compliance checks, and least privilege access, ensuring that only authorized users and devices can access sensitive resources`. By continuously monitoring and validating each request, Zero Trust AI architecture helps organizations minimize risks and enhance their overall security posture.
> The Azure Developer CLI (azd) is an `open-source tool` designed to streamline the end-to-end developer workflow on Azure. It provides `high-level commands` that simplify common developer tasks such as `project initialization, infrastructure provisioning, code deployment, and monitoring`.
<details>
<summary><b> Details </b> (Click to expand)</summary>
> More detailed technical information:
<details>
<summary><strong>Key Features</strong></summary>
> PowerShell 7 `complements Azure Developer CLI (azd) by providing robust automation capabilities that enhance the development and deployment workflows on Azure`. With PowerShell 7, you can `automate tasks such as provisioning resources, deploying applications, and managing configurations, which are integral to azd's operations.` For instance, you can use PowerShell scripts to automate the azd provision command, ensuring consistent infrastructure setup across different environments. PowerShell 7's ability to execute commands remotely aligns with azd's remote environment support, allowing seamless management of Azure resources from any location. By integrating PowerShell 7 scripts into azd workflows, developers can streamline their processes, improve efficiency, and maintain greater control over their Azure deployments.
<details>
<summary><b> Visual reference here </b> (Click to expand)</summary>
> Update the information in the `GPT-RAG_SolutionAccelerator/infra/main.parameters.json` file, and make sure to save your changes before proceeding with the infrastructure deployment.
### Step 2: Enable network isolation
> Azure network isolation is a security strategy that segments a network into distinct subnets or segments, each functioning as its own small network. This approach enhances security by preventing unauthorized access and data leakage. In Azure, network isolation can be achieved using Virtual Networks (VNets), Network Security Groups (NSGs), and Private Link, allowing precise control over inbound and outbound traffic.
<details>
<summary><b> Details </b> (Click to expand)</summary>
> The `azd provision` command in the Azure Developer CLI (azd) automates the deployment of the Azure resources an application needs. It uses infrastructure-as-code templates to set up Azure services, ensuring consistent and repeatable deployments across different environments.
<details>
<summary><b> Details </b> (Click to expand)</summary>
```
azd provision
```
<summary><b> Details </b> (Click to expand)</summary>
> After logging into Windows, [install PowerShell](https://learn.microsoft.com/en-us/powershell/scripting/install/installing-powershell-on-windows?view=powershell-7.4#installing-the-msi-package), as all other necessary components are already set up on the VM.
<summary><b> Details </b> (Click to expand)</summary>
> Please review these configurations: <br/>
>
> - RemoteFX USB Device Redirection: Allows USB devices connected to your local computer to be used in the remote desktop session. `You can access and use local USB devices like storage drives, printers, or other peripherals directly from the remote session.` <br/>
> When executing the `azd init for the app` and `azd env refresh` commands, ensure that the `environment name, subscription, and region are consistent` with those used during the `initial infrastructure provisioning`.
3. Sets up a new project using the Azure GPT-RAG template: `azd init -t azure/gpt-rag`
4. Logs you into Azure Developer CLI: `azd auth login`.
> [!NOTE]
> Ensure your admin account is correctly configured with Authenticator.
> **If you encounter an error with `azd deploy`:**
```
ERROR: getting target resource: getting default resource groups for environment:
gpt-rag-resource-group: resource not found: 0 resource groups with prefix or suf
```
> For example: <br/>
> If `main.parameters.json` contains `"location": "westus2"`, make sure your environment has `AZURE_LOCATION=westus2`.
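Concretely, the two places must agree (values here are illustrative):

```
# infra/main.parameters.json (excerpt)
"location": "westus2"

# matching azd environment variable
AZURE_LOCATION=westus2
```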
</details>
> [!NOTE]
> A `golden dataset` for RAG is your trusted, `curated set of documents or files that the system retrieves from when answering questions`. It's a clean, accurate, and `representative subset of all possible data, free of noise and errors`, so the model always pulls reliable context. It is a `subset of files plus known Q&A pairs chosen from the larger data source.` These are the "benchmark" `questions whose correct answers are already known`, so they can be `used later to measure system accuracy and performance`. Other `expert users are free to ask additional questions during testing, but those will still pull context from the same curated files in the golden dataset (subset data source)`. In short, it's the trusted evaluation set for, say, a proof of concept.
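As a rough sketch of how such a set is used for scoring (all questions, answers, and the `rag_answer` stub are made up; a real evaluation would call the deployed RAG pipeline):

```python
# Hypothetical golden dataset: benchmark questions with known answers,
# all answerable from the curated subset of source files.
golden = [
    {"question": "What is the claim limit?", "expected": "USD 5,000"},
    {"question": "Who approves claims?",     "expected": "The claims manager"},
]

def rag_answer(question: str) -> str:
    # Stub for the retrieve-then-generate pipeline (hypothetical).
    return {"What is the claim limit?": "USD 5,000",
            "Who approves claims?": "The supervisor"}.get(question, "")

def accuracy(dataset) -> float:
    # Exact-match scoring; real evaluations often use fuzzier metrics
    # (semantic similarity, LLM-as-judge, groundedness checks).
    hits = sum(rag_answer(d["question"]) == d["expected"] for d in dataset)
    return hits / len(dataset)

print(accuracy(golden))  # one of two stubbed answers matches → 0.5
```

Because the expected answers are fixed up front, the same script can be rerun after every change to the index or prompts to track accuracy over time.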