4 | 4 | "cell_type": "markdown", |
5 | 5 | "metadata": {}, |
6 | 6 | "source": [ |
7 | | - "## Fine-Tuning and Evaluating LLMs with SageMaker Pipelines and MLflow" |
| 7 | + "## Coordinating FMOps Steps into a Fine-Tuning and Model Evaluation Pipeline" |
8 | 8 | ] |
9 | 9 | }, |
10 | 10 | { |
11 | 11 | "cell_type": "markdown", |
12 | 12 | "metadata": {}, |
13 | 13 | "source": [ |
14 | | - "Running hundreds of experiments, comparing the results, and keeping a track of the ML lifecycle can become very complex. This is where MLflow can help streamline the ML lifecycle, from data preparation to model deployment. By integrating MLflow into your LLM workflow, you can efficiently manage experiment tracking, model versioning, and deployment, providing reproducibility. With MLflow, you can track and compare the performance of multiple LLM experiments, identify the best-performing models, and deploy them to production environments with confidence. \n", |
| 14 | + "In this notebook, we stitch together the components of FMOps into a full FMOps pipeline on SageMaker AI. This capability creates a Directed-Acyclic Graph of steps, orchestrated by SageMaker AI and Managed MLFlow 3.0 on Amazon SageMaker.\n", |
| 15 | + "\n", |
| 16 | + "Running hundreds of experiments, comparing the results, and keeping a track of the ML lifecycle can become very complex. This is where MLflow can help streamline the ML lifecycle, from data preparation to model deployment. By integrating MLflow into your LLM workflow, you can efficiently manage experiment tracking, model versioning, and deployment, providing reproducibility of steps. With MLflow, you can track and compare the performance of multiple LLM experiments, identify the best-performing models, and deploy them to production environments with confidence. \n", |
15 | 17 | "\n", |
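| | + "As a minimal sketch of the tracking pattern (not code from this notebook; the tracking server ARN, experiment name, and logged values are placeholders), an experiment run is recorded like this:\n", |
| | + "\n", |
| | + "```python\n", |
| | + "import mlflow\n", |
| | + "\n", |
| | + "# Point MLflow at the SageMaker managed tracking server (placeholder ARN)\n", |
| | + "mlflow.set_tracking_uri(\"arn:aws:sagemaker:us-east-1:111122223333:mlflow-tracking-server/my-server\")\n", |
| | + "mlflow.set_experiment(\"llm-fine-tuning\")\n", |
| | + "\n", |
| | + "with mlflow.start_run(run_name=\"finetune-run-1\"):\n", |
| | + "    mlflow.log_params({\"epochs\": 3, \"learning_rate\": 2e-4, \"lora_r\": 8})\n", |
| | + "    mlflow.log_metric(\"eval_rougeL\", 0.42)\n", |
| | + "```\n", |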
16 | 18 | "You can create workflows with SageMaker Pipelines that enable you to prepare data, fine-tune models, and evaluate model performance with simple Python code for each step. \n", |
17 | 19 | "\n", |
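| | + "For example, a single pipeline step is just a Python function decorated with `@step` (a minimal sketch with a placeholder function name, body, and instance type, not one of the exact steps defined later in this notebook):\n", |
| | + "\n", |
| | + "```python\n", |
| | + "from sagemaker.workflow.function_step import step\n", |
| | + "\n", |
| | + "@step(name=\"preprocess\", instance_type=\"ml.m5.xlarge\")\n", |
| | + "def preprocess(dataset_s3_uri: str) -> str:\n", |
| | + "    # In a real step this would reformat the raw dataset into the prompt\n", |
| | + "    # template and upload it; here we only return a placeholder output URI.\n", |
| | + "    processed_s3_uri = f\"{dataset_s3_uri.rstrip('/')}/processed\"\n", |
| | + "    return processed_s3_uri\n", |
| | + "```\n", |
| | + "\n", |
| | + "Calling the decorated function does not run it locally; it returns a delayed reference that SageMaker resolves when the pipeline executes.\n", |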
|
237 | 239 | "os.environ[\"pipeline_name\"] = pipeline_name" |
238 | 240 | ] |
239 | 241 | }, |
| 242 | + { |
| 243 | + "cell_type": "markdown", |
| 244 | + "metadata": {}, |
| 245 | + "source": [ |
| 246 | + "This section provides blanket configuration for how remote functions should be executed in a SageMaker environment. This configuration helps to streamline remote function execution which is particularly useful for optimizing the execution of pipelines." |
| 247 | + ] |
| 248 | + }, |
240 | 249 | { |
241 | 250 | "cell_type": "code", |
242 | 251 | "execution_count": 4, |
|
658 | 667 | "\n", |
659 | 668 | "**Creating the Pipeline**\n", |
660 | 669 | "\n", |
661 | | - "The pipeline object is created with all defined steps." |
| 670 | + "The pipeline object is created with all defined steps.\n", |
| 671 | + "\n", |
| 672 | + "1. Preprocessing Step -- Reformat all of the fine-tuning data to the prompt format required for the fine-tuning job.\n", |
| 673 | + "2. Training Step -- Execute the model fine-tuning job using the preprocessed data.\n", |
| 674 | + "3. Deploy Step -- Deploy the model to a SageMaker AI Managed Endpoint for testing fine-tuning performance.\n", |
| 675 | + "4. Quantitative Evaluation Step -- Evaluate the model's performance using ROUGE scores.\n", |
| 676 | + "5. Qualitative Evaluation Step -- Evaluate the model's performance using LLM-as-a-Judge.\n", |
| 677 | + "6. Conditionally Register Model -- Register the model if the quantitative and qualitative evaluations meet criteria." |
662 | 678 | ] |
663 | 679 | }, |
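| | + { |
| | + "cell_type": "markdown", |
| | + "metadata": {}, |
| | + "source": [ |
| | + "As a rough sketch of how these steps can be wired together (the function names, variables, and thresholds below are placeholders rather than the exact definitions used in this notebook), each `@step` output feeds the next step, and model registration is gated behind a `ConditionStep`:\n", |
| | + "\n", |
| | + "```python\n", |
| | + "from sagemaker.workflow.condition_step import ConditionStep\n", |
| | + "from sagemaker.workflow.conditions import ConditionGreaterThanOrEqualTo\n", |
| | + "from sagemaker.workflow.pipeline import Pipeline\n", |
| | + "\n", |
| | + "# Assume preprocess, finetune, deploy, evaluate_rouge, evaluate_llm_judge,\n", |
| | + "# and register_model are @step-decorated functions defined earlier.\n", |
| | + "data = preprocess(raw_data_s3_uri)\n", |
| | + "model = finetune(data)\n", |
| | + "endpoint = deploy(model)\n", |
| | + "rouge_score = evaluate_rouge(endpoint)\n", |
| | + "judge_score = evaluate_llm_judge(endpoint)\n", |
| | + "\n", |
| | + "condition_step = ConditionStep(\n", |
| | + "    name=\"CheckEvaluationScores\",\n", |
| | + "    conditions=[\n", |
| | + "        ConditionGreaterThanOrEqualTo(left=rouge_score, right=0.4),\n", |
| | + "        ConditionGreaterThanOrEqualTo(left=judge_score, right=7.0),\n", |
| | + "    ],\n", |
| | + "    if_steps=[register_model(model)],\n", |
| | + "    else_steps=[],\n", |
| | + ")\n", |
| | + "\n", |
| | + "# Upstream steps are inferred from the data dependencies of condition_step\n", |
| | + "# (depending on SDK version, they may also be listed explicitly in steps).\n", |
| | + "pipeline = Pipeline(name=pipeline_name, steps=[condition_step])\n", |
| | + "pipeline.upsert(role_arn=role)\n", |
| | + "execution = pipeline.start()\n", |
| | + "```" |
| | + ] |
| | + }, |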
664 | 680 | { |
|
0 commit comments