|
5 | 5 | "id": "c277c688", |
6 | 6 | "metadata": {}, |
7 | 7 | "source": [ |
8 | | - "<div style=\"background: linear-gradient(135deg, #034694 0%, #1E8449 50%, #D4AC0D 100%); color: white; padding: 20px; border-radius: 10px; box-shadow: 0 4px 8px rgba(0,0,0,0.2);\">\n", |
9 | | - " <h1 style=\"color: #FFF; text-shadow: 1px 1px 3px rgba(0,0,0,0.5);\">💬 | Step 1: Customize The Tone & Style With SFT </h1>\n", |
10 | | - " <p style=\"font-size: 16px; line-height: 1.6;\">\n", |
11 | | - " We used few shot examples to prompt-engineer a better tone. We used RAG to ground responses in our data. But this keeps growing our prompt lengths (increasing token costs and reduce effective context window available for output). How can we improve the tone and style of our bot with _more examples_ and shorter prompt length?\n", |
12 | | - " </p>\n", |
13 | | - "</div>" |
| 8 | + "# 🎯 | Cora-For-Zava: Model Customization with Fine-Tuning\n", |
| 9 | + "\n", |
| 10 | + "Welcome! This notebook will guide you through customizing an AI model using Supervised Fine-Tuning (SFT) to improve tone, style, and response consistency for your specific use case.\n", |
| 11 | + "\n", |
| 12 | + "## 🛒 Our Zava Scenario\n", |
| 13 | + "\n", |
| 14 | + "**Cora** is a customer service chatbot for **Zava** - a fictitious retailer of home improvement goods for DIY enthusiasts. While we've used few-shot examples and RAG to improve Cora's responses, these approaches increase prompt length (raising token costs and reducing available context window). Fine-tuning allows us to embed better tone and style directly into the model with shorter prompts.\n", |
| 15 | + "\n", |
| 16 | + "## 🎯 What You'll Build\n", |
| 17 | + "\n", |
| 18 | + "By the end of this notebook, you'll have:\n", |
| 19 | + "- ✅ Prepared and validated training datasets for fine-tuning\n", |
| 20 | + "- ✅ Analyzed token usage to optimize training efficiency\n", |
| 21 | + "- ✅ Uploaded training data to Azure OpenAI for processing\n", |
| 22 | + "- ✅ Submitted and monitored a fine-tuning job\n", |
| 23 | + "- ✅ Deployed a custom fine-tuned model for testing\n", |
| 24 | + "- ✅ Evaluated the improved tone and style consistency\n", |
| 25 | + "\n", |
| 26 | + "## 💡 What You'll Learn\n", |
| 27 | + "\n", |
| 28 | + "- How to prepare JSONL datasets for supervised fine-tuning\n", |
| 29 | + "- How to validate token counts and optimize training data\n", |
| 30 | + "- How to submit and monitor fine-tuning jobs in Azure OpenAI\n", |
| 31 | + "- How to deploy and test fine-tuned models\n", |
| 32 | + "- How fine-tuning reduces prompt length while improving consistency\n", |
| 33 | + "- When to use fine-tuning vs. few-shot prompting vs. RAG\n", |
| 34 | + "\n", |
| 35 | + "> **Note**: Fine-tuning customizes model behavior at the foundation level, allowing shorter prompts while maintaining quality and consistency.\n", |
| 36 | + "\n", |
| 37 | + "Ready to customize your model? Let's get started! 🚀\n", |
| 38 | + "\n", |
| 39 | + "---" |
14 | 40 | ] |
15 | 41 | }, |
16 | 42 | { |
17 | 43 | "cell_type": "markdown", |
18 | 44 | "id": "57e60caa", |
19 | 45 | "metadata": {}, |
20 | 46 | "source": [ |
21 | | - "---\n", |
22 | | - "### 1. Check Environment Variables" |
| 47 | + "## Step 1: Verify Environment Variables\n", |
| 48 | + "\n", |
| 49 | + "The following environment variables should already be configured in your `.env` file from the earlier setup steps:\n", |
| 50 | + "\n", |
| 51 | + "- **AZURE_OPENAI_API_KEY**: Your Azure OpenAI API key\n", |
| 52 | + "- **AZURE_OPENAI_ENDPOINT**: Your Azure OpenAI service endpoint\n", |
| 53 | + "- **AZURE_OPENAI_API_VERSION**: The API version to use (2025-02-01-preview for fine-tuning)\n", |
| 54 | + "- **AZURE_SUBSCRIPTION_ID**: Your Azure subscription ID\n", |
| 55 | + "- **AZURE_RESOURCE_GROUP**: Your Azure resource group name\n", |
| 56 | + "- **AZURE_AI_PROJECT_NAME**: Your Azure AI Foundry project name\n", |
| 57 | + "\n", |
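| | + "A minimal sanity check (a sketch assuming `python-dotenv`, as used in the earlier labs):\n", |
| | + "\n", |
| | + "```python\n", |
| | + "import os\n", |
| | + "from dotenv import load_dotenv\n", |
| | + "\n", |
| | + "load_dotenv()  # read variables from the local .env file\n", |
| | + "\n", |
| | + "required = [\n", |
| | + "    \"AZURE_OPENAI_API_KEY\", \"AZURE_OPENAI_ENDPOINT\", \"AZURE_OPENAI_API_VERSION\",\n", |
| | + "    \"AZURE_SUBSCRIPTION_ID\", \"AZURE_RESOURCE_GROUP\", \"AZURE_AI_PROJECT_NAME\",\n", |
| | + "]\n", |
| | + "missing = [name for name in required if not os.getenv(name)]\n", |
| | + "print(\"All variables set!\" if not missing else f\"Missing: {missing}\")\n", |
| | + "```\n", |
| | + "\n", |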
| 58 | + "> **Important**: Fine-tuning requires specific API versions and model availability. Currently, `gpt-4o-2024-08-06` can be fine-tuned in Sweden Central and North Central US regions." |
23 | 59 | ] |
24 | 60 | }, |
25 | 61 | { |
|
48 | 84 | "id": "85ae8408", |
49 | 85 | "metadata": {}, |
50 | 86 | "source": [ |
51 | | - "---\n", |
52 | | - "### 2. Validate Training Dataset" |
| 87 | + "## Step 2: Prepare Training Dataset\n", |
| 88 | + "\n", |
| 89 | + "Fine-tuning requires carefully prepared training data in JSONL format. Each line contains a conversation example with the desired tone and style. Our dataset includes:\n", |
| 90 | + "\n", |
| 91 | + "- **Training Set** (`31-basic_training.jsonl`): Examples for model learning\n", |
| 92 | + "- **Validation Set** (`31-basic_validation.jsonl`): Examples for performance monitoring during training\n", |
| 93 | + "\n", |
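| | + "For illustration, here's what one training record might look like (a hypothetical example; the wording in the lab files differs):\n", |
| | + "\n", |
| | + "```python\n", |
| | + "import json\n", |
| | + "\n", |
| | + "# Hypothetical record; each json.dumps() output occupies exactly one line of the .jsonl file\n", |
| | + "example = {\n", |
| | + "    \"messages\": [\n", |
| | + "        {\"role\": \"system\", \"content\": \"You are Cora, Zava's friendly DIY assistant.\"},\n", |
| | + "        {\"role\": \"user\", \"content\": \"Do you carry cordless drills?\"},\n", |
| | + "        {\"role\": \"assistant\", \"content\": \"🔧 Great question! Yes, we stock several cordless drills. Can I help you choose one?\"},\n", |
| | + "    ]\n", |
| | + "}\n", |
| | + "print(json.dumps(example))\n", |
| | + "```\n", |
| | + "\n", |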
| 94 | + "> **Key Format Requirements**:\n", |
| 95 | + "> - Each example must have a `messages` array with conversation turns\n", |
| 96 | + "> - Messages include `role` (system/user/assistant) and `content`\n", |
| 97 | + "> - Training examples should demonstrate the desired Zava tone: polite, factual, helpful" |
53 | 98 | ] |
54 | 99 | }, |
55 | 100 | { |
|
103 | 148 | "id": "345304de", |
104 | 149 | "metadata": {}, |
105 | 150 | "source": [ |
106 | | - "---\n", |
107 | | - "### 3. Assess Token Counts For Data" |
| 151 | + "## Step 3: Assess Token Counts For Data\n", |
| 152 | + "\n", |
| 153 | + "Token analysis is crucial for fine-tuning cost estimation and quality. We'll analyze:\n", |
| 154 | + "\n", |
| 155 | + "- **Total Tokens**: Complete conversation length including system, user, and assistant messages\n", |
| 156 | + "- **Assistant Tokens**: Only the model's responses (what we're training to improve)\n", |
| 157 | + "- **Distribution Statistics**: Min/max, mean/median, and percentile analysis\n", |
| 158 | + "\n", |
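| | + "A rough sketch of the counting approach with `tiktoken` (approximate, since per-message formatting overhead is ignored):\n", |
| | + "\n", |
| | + "```python\n", |
| | + "import json\n", |
| | + "import tiktoken\n", |
| | + "\n", |
| | + "enc = tiktoken.get_encoding(\"o200k_base\")  # tokenizer family used by gpt-4o models\n", |
| | + "\n", |
| | + "with open(\"31-basic_training.jsonl\", encoding=\"utf-8\") as f:\n", |
| | + "    examples = [json.loads(line) for line in f]\n", |
| | + "\n", |
| | + "totals = [sum(len(enc.encode(m[\"content\"])) for m in ex[\"messages\"]) for ex in examples]\n", |
| | + "assistant = [\n", |
| | + "    sum(len(enc.encode(m[\"content\"])) for m in ex[\"messages\"] if m[\"role\"] == \"assistant\")\n", |
| | + "    for ex in examples\n", |
| | + "]\n", |
| | + "print(f\"examples={len(examples)}, total tokens: min={min(totals)}, max={max(totals)}\")\n", |
| | + "print(f\"assistant tokens: min={min(assistant)}, max={max(assistant)}\")\n", |
| | + "```\n", |
| | + "\n", |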
| 159 | + "> **Best Practices**:\n", |
| 160 | + "> - Keep conversations focused and concise\n", |
| 161 | + "> - Aim for consistent token lengths across examples\n", |
| 162 | + "> - Monitor assistant token ratio for cost optimization" |
108 | 163 | ] |
109 | 164 | }, |
110 | 165 | { |
|
171 | 226 | "id": "b023e173", |
172 | 227 | "metadata": {}, |
173 | 228 | "source": [ |
174 | | - "---\n", |
175 | | - "### 4. Upload Fine-Tuning Data To Cloud" |
| 229 | + "## Step 4: Upload Fine-Tuning Data To Azure\n", |
| 230 | + "\n", |
| 231 | + "Upload your prepared datasets to Azure OpenAI for processing. The uploaded files will be:\n", |
| 232 | + "\n", |
| 233 | + "- **Stored securely** in your Azure OpenAI resource\n", |
| 234 | + "- **Validated automatically** for format compliance\n", |
| 235 | + "- **Accessible** for fine-tuning job creation\n", |
| 236 | + "- **Visible** in Azure AI Foundry Portal under 'Data Files'\n", |
| 237 | + "\n", |
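| | + "A minimal upload sketch with the `openai` SDK (assumes the environment variables from Step 1):\n", |
| | + "\n", |
| | + "```python\n", |
| | + "import os\n", |
| | + "from openai import AzureOpenAI\n", |
| | + "\n", |
| | + "client = AzureOpenAI(\n", |
| | + "    api_key=os.getenv(\"AZURE_OPENAI_API_KEY\"),\n", |
| | + "    azure_endpoint=os.getenv(\"AZURE_OPENAI_ENDPOINT\"),\n", |
| | + "    api_version=os.getenv(\"AZURE_OPENAI_API_VERSION\"),\n", |
| | + ")\n", |
| | + "\n", |
| | + "# purpose=\"fine-tune\" tells the service these files are training data\n", |
| | + "training_file = client.files.create(file=open(\"31-basic_training.jsonl\", \"rb\"), purpose=\"fine-tune\")\n", |
| | + "validation_file = client.files.create(file=open(\"31-basic_validation.jsonl\", \"rb\"), purpose=\"fine-tune\")\n", |
| | + "print(training_file.id, validation_file.id)\n", |
| | + "```\n", |
| | + "\n", |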
| 238 | + "> **Security Note**: Data is encrypted at rest and in transit. Files are only accessible within your Azure OpenAI resource." |
176 | 239 | ] |
177 | 240 | }, |
178 | 241 | { |
|
223 | 286 | "id": "59bee32c", |
224 | 287 | "metadata": {}, |
225 | 288 | "source": [ |
226 | | - "---\n", |
227 | | - "### 5. Submit The Fine-Tuning Job" |
| 289 | + "## Step 5: Submit The Fine-Tuning Job\n", |
| 290 | + "\n", |
| 291 | + "Create and submit a fine-tuning job with your uploaded datasets. Key parameters:\n", |
| 292 | + "\n", |
| 293 | + "- **Base Model**: `gpt-4o-2024-08-06` (fine-tuning compatible version)\n", |
| 294 | + "- **Training File**: Your uploaded training dataset\n", |
| 295 | + "- **Validation File**: Your uploaded validation dataset \n", |
| 296 | + "- **Seed**: For reproducible results (optional but recommended)\n", |
| 297 | + "\n", |
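| | + "A submission sketch (assumes `client` and the file objects from Step 4):\n", |
| | + "\n", |
| | + "```python\n", |
| | + "job = client.fine_tuning.jobs.create(\n", |
| | + "    model=\"gpt-4o-2024-08-06\",        # fine-tuning-compatible base model\n", |
| | + "    training_file=training_file.id,\n", |
| | + "    validation_file=validation_file.id,\n", |
| | + "    seed=105,                          # any fixed value makes runs reproducible\n", |
| | + ")\n", |
| | + "print(f\"Job submitted: {job.id} (status: {job.status})\")\n", |
| | + "```\n", |
| | + "\n", |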
| 298 | + "> **Important**: Fine-tuning jobs typically take 10-30 minutes depending on dataset size. The job will automatically create training checkpoints and monitor validation loss." |
228 | 299 | ] |
229 | 300 | }, |
230 | 301 | { |
|
266 | 337 | "source": [ |
267 | 338 | "---\n", |
268 | 339 | "\n", |
269 | | - "### 6. Track Fine-Tuning Job Status" |
| 340 | + "## Step 6: Track Fine-Tuning Job Status\n", |
| 341 | + "\n", |
| 342 | + "Monitor your fine-tuning job progress in real-time. The job progresses through these stages:\n", |
| 343 | + "\n", |
| 344 | + "1. **Validating**: Checking data format and compatibility\n", |
| 345 | + "2. **Running**: Active training with your dataset\n", |
| 346 | + "3. **Succeeded**: Training completed successfully\n", |
| 347 | + "4. **Failed**: Training encountered an error\n", |
| 348 | + "\n", |
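| | + "A simple polling sketch (assumes `client` and `job` from Step 5):\n", |
| | + "\n", |
| | + "```python\n", |
| | + "import time\n", |
| | + "\n", |
| | + "status = client.fine_tuning.jobs.retrieve(job.id).status\n", |
| | + "while status not in (\"succeeded\", \"failed\", \"cancelled\"):\n", |
| | + "    time.sleep(10)  # the 10-second refresh described above\n", |
| | + "    status = client.fine_tuning.jobs.retrieve(job.id).status\n", |
| | + "    print(f\"Status: {status}\")\n", |
| | + "```\n", |
| | + "\n", |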
| 349 | + "> **Monitoring**: The cell will automatically refresh every 10 seconds until completion. You can also view progress in the Azure AI Foundry Portal." |
270 | 350 | ] |
271 | 351 | }, |
272 | 352 | { |
|
314 | 394 | "source": [ |
315 | 395 | "---\n", |
316 | 396 | "\n", |
317 | | - "### 7. List Fine-Tuning Events" |
| 397 | + "## Step 7: Review Fine-Tuning Events\n", |
| 398 | + "\n", |
| 399 | + "Examine detailed training events and logs to understand the training process:\n", |
| 400 | + "\n", |
| 401 | + "- **Training Progress**: Step-by-step training updates\n", |
| 402 | + "- **Loss Metrics**: Training and validation loss evolution\n", |
| 403 | + "- **Completion Status**: Final training results and model location\n", |
| 404 | + "\n", |
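| | + "Events can be listed with the SDK (a sketch; assumes `client` and `job` from the earlier steps):\n", |
| | + "\n", |
| | + "```python\n", |
| | + "events = client.fine_tuning.jobs.list_events(fine_tuning_job_id=job.id, limit=10)\n", |
| | + "for event in events.data:  # newest events come first\n", |
| | + "    print(f\"{event.created_at}: {event.message}\")\n", |
| | + "```\n", |
| | + "\n", |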
| 405 | + "> **Debugging**: Events help troubleshoot any training issues and verify successful completion." |
318 | 406 | ] |
319 | 407 | }, |
320 | 408 | { |
|
335 | 423 | "source": [ |
336 | 424 | "---\n", |
337 | 425 | "\n", |
338 | | - "### 8. List Fine-Tuning Checkpoints" |
| 426 | + "## Step 8: Review Fine-Tuning Checkpoints\n", |
| 427 | + "\n", |
| 428 | + "Examine training checkpoints created during the fine-tuning process:\n", |
| 429 | + "\n", |
| 430 | + "- **Checkpoint Analysis**: Model state at different training steps\n", |
| 431 | + "- **Performance Metrics**: Validation loss at each checkpoint\n", |
| 432 | + "- **Model Selection**: Identify the best performing checkpoint\n", |
| 433 | + "\n", |
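| | + "A listing sketch (assumes `client` and a completed `job`):\n", |
| | + "\n", |
| | + "```python\n", |
| | + "checkpoints = client.fine_tuning.jobs.checkpoints.list(job.id)\n", |
| | + "for cp in checkpoints.data:\n", |
| | + "    print(f\"step {cp.step_number}: {cp.fine_tuned_model_checkpoint}\")\n", |
| | + "```\n", |
| | + "\n", |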
| 434 | + "> **Advanced Usage**: Checkpoints allow you to select optimal training stopping points and analyze training dynamics." |
339 | 435 | ] |
340 | 436 | }, |
341 | 437 | { |
|
362 | 458 | "id": "42313dcc", |
363 | 459 | "metadata": {}, |
364 | 460 | "source": [ |
365 | | - "### 9. Retrieve Fine-Tuned Model Name" |
| 461 | + "## Step 9: Retrieve Fine-Tuned Model Name\n", |
| 462 | + "\n", |
| 463 | + "Get your completed fine-tuned model identifier for deployment:\n", |
| 464 | + "\n", |
| 465 | + "- **Model ID**: Unique identifier for your custom model\n", |
| 466 | + "- **Training Stats**: Final training metrics and completion details\n", |
| 467 | + "- **Deployment Ready**: Model is ready for Azure deployment\n", |
| 468 | + "\n", |
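| | + "Retrieving the identifier (a sketch; `fine_tuned_model` is only populated after the job succeeds):\n", |
| | + "\n", |
| | + "```python\n", |
| | + "result = client.fine_tuning.jobs.retrieve(job.id)\n", |
| | + "print(f\"Fine-tuned model ID: {result.fine_tuned_model}\")\n", |
| | + "```\n", |
| | + "\n", |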
| 469 | + "> **Next Step**: Use this model ID to create a deployment in Azure AI Foundry Portal." |
366 | 470 | ] |
367 | 471 | }, |
368 | 472 | { |
|
387 | 491 | "source": [ |
388 | 492 | "---\n", |
389 | 493 | "\n", |
390 | | - "### 10. Deploy Fine-Tuned Model For Testing\n", |
| 494 | + "## Step 10: Deploy Fine-Tuned Model For Testing\n", |
| 495 | + "\n", |
| 496 | + "Deploy your fine-tuned model to test the improved tone and style:\n", |
| 497 | + "\n", |
| 498 | + "1. **Azure AI Foundry Portal**: Navigate to your project's Model deployments\n", |
| 499 | + "2. **Deploy Custom Model**: Select your fine-tuned model ID\n", |
| 500 | + "3. **Developer Tier**: Use for testing (inference costs only, no hosting fees)\n", |
| 501 | + "4. **Configure Deployment**: Set deployment name and resource allocation\n", |
391 | 502 | "\n", |
392 | | - "For now - we deployed this manually from the Azure AI Foundry Portal - using the **developer tier** which allows us to test our fine-tuned model for the cost of just inferencing. Once we deploy it, we can try it out" |
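| | + "Once the deployment is live, a quick smoke test confirms the new tone (a sketch; `gpt-4o-ft-zava` is a placeholder for your actual deployment name):\n", |
| | + "\n", |
| | + "```python\n", |
| | + "# Hypothetical deployment name; replace `gpt-4o-ft-zava` with yours\n", |
| | + "response = client.chat.completions.create(\n", |
| | + "    model=\"gpt-4o-ft-zava\",\n", |
| | + "    messages=[\n", |
| | + "        {\"role\": \"system\", \"content\": \"You are Cora, Zava's friendly DIY assistant.\"},\n", |
| | + "        {\"role\": \"user\", \"content\": \"Do you sell paint brushes?\"},\n", |
| | + "    ],\n", |
| | + ")\n", |
| | + "print(response.choices[0].message.content)\n", |
| | + "```\n", |
| | + "\n", |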
| 503 | + "> **Cost Optimization**: Developer tier allows testing without hosting costs. Upgrade to standard deployment for production use." |
393 | 504 | ] |
394 | 505 | }, |
395 | 506 | { |
|
419 | 530 | "id": "4a39c0bb", |
420 | 531 | "metadata": {}, |
421 | 532 | "source": [ |
422 | | - "**Insights**\n", |
| 533 | + "## 🎯 Fine-Tuning Results & Insights\n", |
423 | 534 | "\n", |
424 | | - "In both the examples above we can note that the response now accurately follows our Zava guidelines for \"polite, factual and helpful\"\n", |
425 | | - "- Every response starts with an emoji\n", |
426 | | - "- The first sentence is always an acknowledgement of the user (\"polite\")\n", |
427 | | - "- The next sentence is always an informative segment (\"factual\")\n", |
428 | | - "- The final senteance is always an offer to follow up (\"helpful\")\n", |
| 535 | + "**Zava Tone Consistency Achieved!**\n", |
429 | 536 | "\n", |
430 | | - "And note that we have the succinct responses we were looking for _without adding few-shot examples_, making the prompts shorter and thus saving both token costs and processing latency." |
431 | | - ] |
432 | | - }, |
433 | | - { |
434 | | - "cell_type": "markdown", |
435 | | - "id": "b08deb97", |
436 | | - "metadata": {}, |
437 | | - "source": [ |
438 | | - "---\n", |
439 | | - "### Teardown\n", |
| 537 | + "In both examples above, the fine-tuned model now consistently follows Zava's brand guidelines for \"polite, factual, and helpful\" responses:\n", |
440 | 538 | "\n", |
441 | | - "Once you are done with this lab, don't forget to tear down the infrastructure. The developer tier model will be torn down automatically (after 24 hours?) but it is better to proactively delete the resource group and release all model quota." |
| 539 | + "### ✅ Consistent Structure\n", |
| 540 | + "- **Emoji Opening**: Every response starts with a relevant emoji\n", |
| 541 | + "- **Polite Acknowledgment**: First sentence acknowledges the customer's need\n", |
| 542 | + "- **Factual Information**: Middle section provides specific product details and pricing\n", |
| 543 | + "- **Helpful Follow-up**: Final sentence offers additional assistance\n", |
| 544 | + "\n", |
| 545 | + "### 🚀 Key Benefits Achieved\n", |
| 546 | + "- **Shorter Prompts**: No need for few-shot examples in every request\n", |
| 547 | + "- **Lower Token Costs**: Reduced prompt length saves on API costs\n", |
| 548 | + "- **Faster Processing**: Less context to process means faster responses\n", |
| 549 | + "- **Consistent Quality**: Every response follows the trained pattern\n", |
| 550 | + "\n", |
| 551 | + "### 📊 Performance Comparison\n", |
| 552 | + "| Aspect | Before Fine-Tuning | After Fine-Tuning |\n", |
| 553 | + "|--------|--------------------|--------------------|\n", |
| 554 | + "| Prompt Length | ~800 tokens (with examples) | ~200 tokens (system prompt only) |\n", |
| 555 | + "| Tone Consistency | Variable (depends on examples) | Consistent (embedded in model) |\n", |
| 556 | + "| Response Time | Slower (longer prompt) | Faster (shorter prompt) |\n", |
| 557 | + "| Cost per Request | Higher (more input tokens) | Lower (fewer input tokens) |\n", |
| 558 | + "\n", |
| 559 | + "> **Best Practice**: Fine-tuning is most effective when you have consistent patterns you want the model to learn, rather than just providing factual knowledge (which is better handled by RAG)." |
442 | 560 | ] |
443 | 561 | }, |
444 | 562 | { |
445 | 563 | "cell_type": "markdown", |
446 | | - "id": "8ee386b6", |
| 564 | + "id": "7621cf2b", |
447 | 565 | "metadata": {}, |
448 | 566 | "source": [ |
| 567 | + "## Step 11: Next Steps\n", |
| 568 | + "\n", |
| 569 | + "You've successfully fine-tuned a model for better tone and style! Here are your next steps:\n", |
| 570 | + "\n", |
| 571 | + "> **Key Insight**: Fine-tuning excels at embedding consistent patterns (like tone and style) while RAG excels at providing up-to-date factual information. Combining both creates powerful, cost-effective AI systems.\n", |
| 572 | + "\n", |
| 573 | + "---\n", |
449 | 574 | "\n", |
450 | | - "<div style=\"display: flex; align-items: center; justify-content: left; padding: 5px; height: 40px; background: linear-gradient(90deg, #7873f5 0%, #ff6ec4 100%); border-radius: 8px; box-shadow: 0 2px 8px rgba(0,0,0,0.12); font-size: 1.5em; font-weight: bold; color: #fff;\">\n", |
451 | | - " Next: Be More Cost-Effective With Distillation\n", |
452 | | - "</div>" |
| 575 | + "**Great work! You've mastered fine-tuning for tone and style customization.** 🎉" |
453 | 576 | ] |
454 | 577 | }, |
455 | 578 | { |
|