
Commit 0e07a84

Revise README for clarity on AI development phases
Updated the README to enhance clarity on the development process from basic coding to AI agents, including detailed phases and important considerations for production environments.
1 parent c66f603 commit 0e07a84

File tree

1 file changed: +103 −14 lines changed

README.md

Lines changed: 103 additions & 14 deletions
@@ -11,12 +11,6 @@ Last updated: 2025-11-03
----------

<details>
<summary><b>List of References</b> (Click to expand)</summary>
@@ -62,9 +56,38 @@ Last updated: 2025-11-03
</details>

> In the context of developing an E2E solution or application, each stage builds confidence (technical, functional, and strategic) until we are ready to scale and support the solution in the real world. Think of these stages as milestones on the journey from idea to production:

`Idea → PoC → PoV → MVP → Dev → Test → UAT → Prod → Continuous Improvement`
<details>
<summary><b>Detailed phases</b> (Click to expand)</summary>
| Phase | Goal | What Happens | Focus | Audience | Example |
|-------|------|--------------|-------|----------|---------|
| **PoC (Proof of Concept)** | Validate technical feasibility | Build a minimal version to prove that the core idea or technology can work | Infrastructure setup, basic UI, simple workflows, mock data | Internal tech teams, architects | Can we integrate this new AI model into our system? |
| **PoV (Proof of Value)** | Demonstrate business value | Expand the PoC to show how the solution aligns with business goals and KPIs | Real use cases, measurable outcomes, stakeholder engagement | Business leaders, sponsors, decision-makers | Does this solution reduce processing time by 30% as expected? |
| **MVP (Minimum Viable Product)** | Deliver a usable product with core features | Build the smallest set of features that delivers value and can be deployed | Real users, feedback loops, iterative improvements | Early adopters, pilot users | A working app with login, dashboard, and one key feature |
| **Dev (Development)** | Build and refine the product | Full-scale development of features, integrations, and backend logic | Code quality, version control, collaboration | Developers, QA, product managers | |
| **Test (System/Integration Testing)** | Ensure the system works as expected | Run automated/manual tests, fix bugs, validate integrations | Functional testing, regression testing, performance | QA teams, developers | |
| **UAT (User Acceptance Testing)** | Validate with real users before go-live | Business users test the system in a near-production environment | Usability, business rules, edge cases | End users, business analysts, stakeholders | |
| **Prod (Production)** | Go live and deliver value | Deploy the solution to the live environment for real users | Stability, monitoring, support, feedback | All users, support teams, business owners | |

> `How do we move from basic coding all the way to AI agents?`
>
> - We all `start with scripting`; no matter the language, it’s the first step: `simple/complex instructions, written line by line`, to get something done.
> - Then comes `machine learning`. At this stage, we’re not reinventing the math; we’re `leveraging powerful packages built on deep statistical and mathematical foundations.` These tools let us `automate smarter processes, like reviewing claims with predictive analytics. You’re not just coding anymore; you’re building systems that learn and adapt.`
> - `LLMs`. This is what most people mean when they say `AI`. Think of `yourself as the architect, and the LLM as your strategic engine. You can plug into it via an API/key, or through integrated services. It’s not just about automation, it’s about reasoning, understanding, and generating human-like responses.`
> - And finally, `agents`. These are LLMs with the `ability to act`. They don’t just respond, `they take initiative. They can create code, trigger workflows, make decisions, and interact with tools and other agents. It’s where intelligence meets execution.`
</details>
> [!NOTE]
> How to query from a `SharePoint Library`: [GPT-RAG Data Ingestion](https://github.com/Azure/gpt-rag-ingestion/tree/main)

<details>
<summary><b> Details </b> (Click to expand)</summary>

> - Access & Authentication: Integration uses a `service principal account` registered in Azure Entra ID to authenticate and access the SharePoint document library via Microsoft Graph API. This avoids using personal accounts for programmatic access.
> - Data Ingestion Flow: The RAG system connects to the SharePoint library using the provided credentials, retrieves documents (mainly PDFs), and processes them for indexing.
> - Code Structure: Key integration logic resides in files such as:

@@ -86,11 +109,19 @@ SharePoint Site → Metadata Streamer → Document Downloader → Chunker → Az

Deleted Items Checker → Purge Deleted Items
```
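The pipeline stages above can be sketched in a few lines. This is a minimal illustration, not the accelerator's actual code: the function names are hypothetical, and the Graph URL follows the shape of Microsoft Graph's drive API for a site's default document library.

```python
# Hypothetical sketch of two pipeline stages: building a Graph request for
# a SharePoint document library, and chunking extracted text for indexing.

def build_drive_items_request(site_id: str, token: str) -> dict:
    """Return URL + headers to list files in a site's default drive via
    Microsoft Graph, using an app-only token from the service principal."""
    return {
        "url": f"https://graph.microsoft.com/v1.0/sites/{site_id}/drive/root/children",
        "headers": {"Authorization": f"Bearer {token}"},
    }

def chunk_text(text: str, size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into overlapping chunks, the way a chunker stage
    prepares documents for a search index."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

request = build_drive_items_request("contoso.sharepoint.com,abc,def", "<token>")
chunks = chunk_text("x" * 2500, size=1000, overlap=200)
```

The overlap keeps sentence fragments from being stranded at chunk boundaries, so adjacent chunks share context when retrieved.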
</details>
> [!NOTE]
> How to query from `SQL on-prem`: <br/>
<details>
<summary><b> Details </b> (Click to expand)</summary>
> This process `involves converting natural language to SQL, where we integrate the SQL database with the agentic framework. When a user submits a query from the frontend, the system extracts relevant schema details from the AI search index to generate a SQL query with a few example cases. The query is then executed on the SQL server to fetch the records, and the results are displayed in natural language on the UI using an LLM.` Here is more about how it works: [GPT-RAG Orchestrator](https://github.com/Azure/gpt-rag-orchestrator)
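The prompt-assembly step in that flow can be sketched as follows. This is an illustrative example only: `build_sql_prompt` and its inputs are made-up names, not the orchestrator's actual API; the real system retrieves the schema snippet and example cases from the AI search index.

```python
# Illustrative NL-to-SQL step: schema details plus a few example cases are
# assembled into a prompt for the LLM. All names here are hypothetical.

def build_sql_prompt(schema: str, examples: list[tuple[str, str]], question: str) -> str:
    """Compose a few-shot prompt asking the LLM to translate a question to SQL."""
    shots = "\n".join(f"Q: {q}\nSQL: {sql}" for q, sql in examples)
    return (
        "Translate the question into T-SQL using only this schema.\n"
        f"Schema:\n{schema}\n\nExamples:\n{shots}\n\nQ: {question}\nSQL:"
    )

prompt = build_sql_prompt(
    schema="claims(claim_id INT, status VARCHAR(20), amount DECIMAL)",
    examples=[("How many claims are open?",
               "SELECT COUNT(*) FROM claims WHERE status = 'open';")],
    question="What is the total amount of closed claims?",
)
```

The generated SQL would then be executed against the server and the result set summarized back into natural language by the LLM.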
</details>

> [!IMPORTANT]
> Disclaimer: This repository contains an example of a Retrieval-Augmented Generation (RAG) chat bot with a basic architecture (designed for scenarios without network isolation) and a standard Zero-Trust Architecture deployment. This is `just a guide`, not an official solution. For official guidance, support, or more detailed information, please refer to [RAG with Zero-Trust – Architecture Reference in Microsoft's official documentation](https://github.com/Azure/GPT-RAG) or contact Microsoft directly: [Microsoft Sales and Support](https://support.microsoft.com/contactus?ContactUsExperienceEntryPointAssetId=S.HP.SMC-HOME)

@@ -134,6 +165,11 @@ SharePoint Site → Metadata Streamer → Document Downloader → Chunker → Az

### Important Considerations for Production Environment

<details>
<summary>Click to expand</summary>

> Some considerations:

<details>
<summary>Public Network Site</summary>

@@ -189,6 +225,9 @@ SharePoint Site → Metadata Streamer → Document Downloader → Chunker → Az

</details>

</details>

## Zero Trust Architecture
> Zero Trust AI architecture in Microsoft Azure is a `security framework designed to protect data, applications, and infrastructure by assuming that threats can come from both inside and outside the network`. This model operates on the principle of "never trust, always verify", meaning `every access request is thoroughly authenticated and authorized based on all available data points, regardless of its origin. The architecture integrates multiple layers of security, including strong identity verification, device compliance checks, and least privilege access, ensuring that only authorized users and devices can access sensitive resources`. By continuously monitoring and validating each request, Zero Trust AI architecture helps organizations minimize risks and enhance their overall security posture.
@@ -214,6 +253,11 @@ From [Standard Zero-Trust Architecture](https://github.com/Azure/GPT-RAG/blob/ma
> The Azure Developer CLI (azd) is an `open-source tool` designed to streamline the end-to-end developer workflow on Azure. It provides `high-level commands` that simplify common developer tasks such as `project initialization, infrastructure provisioning, code deployment, and monitoring`.

<details>
<summary><b> Details </b> (Click to expand)</summary>

> More detailed technical information:

<details>
<summary><strong>Key Features</strong></summary>

@@ -270,14 +314,24 @@ From [Standard Zero-Trust Architecture](https://github.com/Azure/GPT-RAG/blob/ma
<https://github.com/user-attachments/assets/2a9d7c6b-1260-4ad1-8889-ce01057d2b44>

</details>

### Step 0.2: Install PowerShell 7
> PowerShell 7 `complements Azure Developer CLI (azd) by providing robust automation capabilities that enhance the development and deployment workflows on Azure`. With PowerShell 7, you can `automate tasks such as provisioning resources, deploying applications, and managing configurations, which are integral to azd's operations.` For instance, you can use PowerShell scripts to automate the azd provision command, ensuring consistent infrastructure setup across different environments. PowerShell 7's ability to execute commands remotely aligns with azd's remote environment support, allowing seamless management of Azure resources from any location. By integrating PowerShell 7 scripts into azd workflows, developers can streamline their processes, improve efficiency, and maintain greater control over their Azure deployments.

<details>
<summary><b> Visual reference here </b> (Click to expand)</summary>

<https://github.com/user-attachments/assets/9bb475e4-7fef-46d9-9147-a28e806b4e1c>

</details>

### Step 1: Download the repository

<details>
<summary><b> Details </b> (Click to expand)</summary>

> Standard orchestrator
```
@@ -292,21 +346,31 @@ From [Standard Zero-Trust Architecture](https://github.com/Azure/GPT-RAG/blob/ma
<https://github.com/user-attachments/assets/22d2c66b-fd1e-4967-9f6c-ae02e27b2036>

</details>

> [!IMPORTANT]
> Update the information in the `GPT-RAG_SolutionAccelerator/infra/main.parameters.json` file, and make sure to save your changes before proceeding with the infrastructure deployment.
### Step 2: Enable network isolation
> Azure network isolation is a security strategy that segments a network into distinct subnets or segments, each functioning as its own small network. This approach enhances security by preventing unauthorized access and data leakage. In Azure, network isolation can be achieved using Virtual Networks (VNets), Network Security Groups (NSGs), and Private Link, allowing precise control over inbound and outbound traffic.

<details>
<summary><b> Details </b> (Click to expand)</summary>

```
azd env set AZURE_NETWORK_ISOLATION true
```
<https://github.com/user-attachments/assets/4f493506-970d-4b1d-aee2-1b0972a365d7>

</details>

### Step 3: Login to Azure

<details>
<summary><b> Details </b> (Click to expand)</summary>

> Make sure you log in to both:
1. Azure Developer CLI:
@@ -323,10 +387,15 @@ From [Standard Zero-Trust Architecture](https://github.com/Azure/GPT-RAG/blob/ma
<https://github.com/user-attachments/assets/ed8833ee-5edc-4d28-8b45-2d6ae75e2bf6>

</details>

### Step 4: Deploy the infrastructure
> `azd provision` command in Azure Developer CLI (azd) automates the deployment of necessary Azure resources for an application. It uses infrastructure-as-code templates to set up Azure services, ensuring consistent and repeatable deployments across different environments.

<details>
<summary><b> Details </b> (Click to expand)</summary>

```
azd provision
```
@@ -343,8 +412,15 @@ From [Standard Zero-Trust Architecture](https://github.com/Azure/GPT-RAG/blob/ma
<img src="https://github.com/user-attachments/assets/b4976132-2b3e-4bdf-b02f-aa8b0643455d" alt="Centered Image" style="border: 2px solid #4CAF50; border-radius: 5px; padding: 5px;"/>
</div>

</details>

### Step 5: VM login

> AI/Data Science VM

<details>
<summary><b> Details </b> (Click to expand)</summary>

1. To proceed with the deployment, use the Virtual Machine connected via Bastion (set up in step 4).
<img width="550" alt="image" src="https://github.com/user-attachments/assets/aee21cda-d047-4f6d-a568-8c9772639ca2" />
@@ -355,14 +431,24 @@ From [Standard Zero-Trust Architecture](https://github.com/Azure/GPT-RAG/blob/ma
<https://github.com/user-attachments/assets/25ec1fb6-d999-41e4-ac17-0c16b14a946d>

</details>

### Step 6: Install PowerShell 7 on the VM

<details>
<summary><b> Details </b> (Click to expand)</summary>

> After logging into Windows, [install PowerShell](https://learn.microsoft.com/en-us/powershell/scripting/install/installing-powershell-on-windows?view=powershell-7.4#installing-the-msi-package), as all other necessary components are already set up on the VM.
<https://github.com/user-attachments/assets/c089a26e-8b31-466b-a052-a05d73d488fb>

</details>

### Step 7: Update azd on the VM

<details>
<summary><b> Details </b> (Click to expand)</summary>

> Launch the `Command Prompt` and enter the following command to update azd to its latest version:
```
@@ -371,9 +457,13 @@ From [Standard Zero-Trust Architecture](https://github.com/Azure/GPT-RAG/blob/ma
<https://github.com/user-attachments/assets/777cdd6e-fa8f-49c2-9398-f94ac45be711>

</details>

### Step 8: Application deployment

<details>
<summary><b> Details </b> (Click to expand)</summary>

> Please review these configurations: <br/>
>
> - RemoteFX USB Device Redirection: Allows USB devices connected to your local computer to be used in the remote desktop session. `You can access and use local USB devices like storage drives, printers, or other peripherals directly from the remote session.` <br/>
@@ -396,7 +486,6 @@ From [Standard Zero-Trust Architecture](https://github.com/Azure/GPT-RAG/blob/ma
<https://github.com/user-attachments/assets/8ea84df0-ac9a-4cad-be91-bfb24548d1d1>
> When executing the `azd init` (for the app) and `azd env refresh` commands, ensure that the `environment name, subscription, and region are consistent` with those used during the `initial infrastructure provisioning`.
3. Sets up a new project using the Azure GPT-RAG template: `azd init -t azure/gpt-rag`
@@ -405,7 +494,6 @@ From [Standard Zero-Trust Architecture](https://github.com/Azure/GPT-RAG/blob/ma
4. Logs you into Azure Developer CLI: `azd auth login`.
> Ensure your admin account is correctly configured with Authenticator.
<https://github.com/user-attachments/assets/f34a13eb-b045-40c9-8f99-edc4dc9d0d15>
@@ -445,8 +533,7 @@ From [Standard Zero-Trust Architecture](https://github.com/Azure/GPT-RAG/blob/ma
<https://github.com/user-attachments/assets/aa248d9b-b1eb-42e3-9e6c-5e41bfdf5484>

> **If you encounter an error with `azd deploy`:**

```
ERROR: getting target resource: getting default resource groups for environment:
@@ -467,8 +554,10 @@ gpt-rag-resource-group: resource not found: 0 resource groups with prefix or suf
> For example: <br/>
> If `main.parameters.json` contains `"location": "westus2"`, make sure your environment has `AZURE_LOCATION=westus2`.
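One quick way to catch that mismatch before redeploying is to compare the parameter file against the environment values. A minimal sketch, assuming the flat `"location"` key shown in the example above (the helper name is made up; real azd parameter files may nest values differently):

```python
import json

def check_location(parameters_json: str, env: dict) -> bool:
    """Return True when AZURE_LOCATION in the azd environment matches the
    "location" value in main.parameters.json (assumed flat key here)."""
    params = json.loads(parameters_json)
    return env.get("AZURE_LOCATION") == params.get("location")

ok = check_location('{"location": "westus2"}', {"AZURE_LOCATION": "westus2"})
bad = check_location('{"location": "westus2"}', {"AZURE_LOCATION": "eastus"})
```

Running a check like this before `azd deploy` surfaces the resource-group lookup error earlier, when it is cheap to fix.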

</details>

> [!NOTE]
> A `golden dataset` for RAG is your trusted `curated set of documents or files that the system retrieves from when answering questions`. It’s a clean, accurate, and `representative subset of all possible data, free of noise and errors`, so the model always pulls reliable context. It’s a `subset of files, for example, and known Q&A pairs chosen from the larger data source.` These are the “benchmark” `questions where the correct answers are already known`, so they can be `used later to measure system accuracy and performance`. Other `expert users are free to ask additional questions during testing, but those will still pull context from the same curated files in the golden dataset (subset datasource)`. In short, it’s the trusted evaluation set for your proof of concept.
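The measurement side of a golden dataset can be sketched as scoring the known Q&A pairs against whatever answering function the system exposes. The names below (`evaluate_golden`, `answer_fn`, the sample pairs) are stand-ins for illustration, not the accelerator's actual API:

```python
# Sketch of measuring accuracy against a golden dataset: benchmark questions
# with known answers are scored against the system's answering function.
# `answer_fn` and the sample pairs are hypothetical stand-ins.

def evaluate_golden(golden: list[tuple[str, str]], answer_fn) -> float:
    """Fraction of benchmark questions whose known answer appears in the response."""
    hits = sum(1 for q, expected in golden if expected.lower() in answer_fn(q).lower())
    return hits / len(golden)

golden_pairs = [
    ("What is the claim limit?", "$10,000"),
    ("Who approves escalations?", "the claims manager"),
]
stub = lambda q: "The claim limit is $10,000." if "limit" in q else "Unknown."
score = evaluate_golden(golden_pairs, stub)  # 0.5 with this stub
```

Substring matching is the crudest possible scorer; in practice teams swap in an LLM-based or semantic-similarity judge, but the loop over known pairs stays the same.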
<img width="411" height="243" alt="Untitled Diagram drawio" src="https://github.com/user-attachments/assets/40682ec2-77e4-4413-88e5-d343f036f084" />