|
| 1 | +{ |
| 2 | + "cells": [ |
| 3 | + { |
| 4 | + "cell_type": "markdown", |
| 5 | + "id": "170d0303-3d1e-4cec-bcbf-0aa2c55a3b08", |
| 6 | + "metadata": { |
| 7 | + "tags": [] |
| 8 | + }, |
| 9 | + "source": [ |
| 10 | + "# AI Generated Images for your Roboflow Project using Stable Diffusion\n", |
| 11 | + "\n", |
| 12 | + "[Stable Diffusion](https://stability.ai/blog/stable-diffusion-public-release) is a deep learning, text-to-image model released in 2022. It is primarily used to generate detailed images conditioned on text descriptions, though it can also be applied to other tasks such as inpainting, outpainting, and generating image-to-image translations guided by a text prompt. In this notebook, we'll use Stable Diffusion to generate images for your Computer Vision project, and push those images into your [Roboflow](https://blog.roboflow.com/synthetic-data-with-stable-diffusion-a-guide/) project for annotating. We will be using [Amazon SageMaker Studio Lab](https://studiolab.sagemaker.aws/). At the end, we'll have quality, AI generated, representative data to further enrich your dataset and strengthen your model.\n", |
| 13 | + "\n", |
| 14 | + "Many thanks to the CompVis group LMU Munich, Runway and Stability AI for releasing [the code](https://github.com/CompVis/stable-diffusion).\n", |
| 15 | + "\n", |
| 16 | + "## **Steps Covered in this Tutorial**\n", |
| 17 | + "\n", |
| 18 | + "To generate our images we will cover the following:\n", |
| 19 | + "\n", |
| 20 | + "* Install Stable Diffusion dependencies\n", |
| 21 | + "* Getting latest model version - runwayml/stable-diffusion-v1-5 - from Hugging Face hub\n", |
| 22 | + "* Create function to generate images\n", |
| 23 | + "* Generate images based on your prompt of choice\n", |
| 24 | + "* Upload images to your Roboflow project using Roboflow PIP package\n", |
| 25 | + "* Begin Annotating" |
| 26 | + ] |
| 27 | + }, |
| 28 | + { |
| 29 | + "cell_type": "markdown", |
| 30 | + "id": "97ac9a9a-a80a-4e02-9c55-293446f816a8", |
| 31 | + "metadata": {}, |
| 32 | + "source": [ |
| 33 | + "## Installing Roboflow and other dependencies\n", |
| 34 | + "\n", |
| 35 | + "We'll be using [Roboflow](https://roboflow.com/?ref=studiolab) to push our images up to after we have generated them for annotating (and, optionally, to use the [Roboflow Annotate tool](https://roboflow.com/annotate).\n", |
| 36 | + "\n", |
| 37 | + "The [`roboflow` pip package](https://blog.roboflow.com/pip-install-roboflow/) will allow us to upload our batch of generated images." |
| 38 | + ] |
| 39 | + }, |
| 40 | + { |
| 41 | + "cell_type": "code", |
| 42 | + "execution_count": 4, |
| 43 | + "id": "9be19ca5-5898-461a-9d6a-1125176d71f6", |
| 44 | + "metadata": {}, |
| 45 | + "outputs": [], |
| 46 | + "source": [ |
| 47 | + "%%sh\n", |
| 48 | + "pip install -q --upgrade pip\n", |
| 49 | + "pip install -q --upgrade diffusers transformers scipy ftfy huggingface_hub roboflow" |
| 50 | + ] |
| 51 | + }, |
| 52 | + { |
| 53 | + "cell_type": "markdown", |
| 54 | + "id": "455c668a-aecc-4eb6-81cc-94bb557907b4", |
| 55 | + "metadata": {}, |
| 56 | + "source": [ |
| 57 | + "## Authenticating with the Hugging Face Hub\n", |
| 58 | + "\n", |
| 59 | + "We'll be using [Hugging Face](https://huggingface.co/) to pull down our Stable Diffusion model, so we must authenticate using our [Hugging Face Access Token](https://huggingface.co/docs/hub/security-tokens) \n", |
| 60 | + "\n", |
| 61 | + "When we run the below cell, we enter our token, click login and we are authenticated.\n", |
| 62 | + "\n", |
| 63 | + "_***Note: You don't get a confirmation of token accepted once you click login. You can however confirm you are authenticated by looking at the SageMaker Studio Lab logs in the terminal at the bottom of your screen.***_" |
| 64 | + ] |
| 65 | + }, |
| 66 | + { |
| 67 | + "cell_type": "code", |
| 68 | + "execution_count": 8, |
| 69 | + "id": "76e38afc-87dd-45e5-ac6b-d506e7fdcaa7", |
| 70 | + "metadata": {}, |
| 71 | + "outputs": [], |
| 72 | + "source": [ |
| 73 | + "from huggingface_hub import notebook_login\n", |
| 74 | + "\n", |
| 75 | + "# Required to get access to stable diffusion model\n", |
| 76 | + "notebook_login()" |
| 77 | + ] |
| 78 | + }, |
| 79 | + { |
| 80 | + "cell_type": "markdown", |
| 81 | + "id": "fa4e5def-2785-4f73-a005-cf65b0d73e78", |
| 82 | + "metadata": {}, |
| 83 | + "source": [ |
| 84 | + "## Accepting License Terms\n", |
| 85 | + "\n", |
| 86 | + "Before we load this model from the Hugging Face Hub, we have to make sure that we accept the license of the runwayml/stable-diffusion-v1-5 project. You can accept the license by clicking on the Agree and access repository button on the [model page](https://huggingface.co/runwayml/stable-diffusion-v1-5)." |
| 87 | + ] |
| 88 | + }, |
| 89 | + { |
| 90 | + "cell_type": "markdown", |
| 91 | + "id": "3a993173-4278-47de-8c16-13c8d2f33a27", |
| 92 | + "metadata": { |
| 93 | + "tags": [] |
| 94 | + }, |
| 95 | + "source": [ |
| 96 | + "## Using the Hugging Face StableDiffusionPipeline Class\n", |
| 97 | + "\n", |
| 98 | + "Here we will create our [Hugging Face Stable Diffusion pipeline](https://huggingface.co/docs/diffusers/api/pipelines/stable_diffusion), as well as ensure we are running on cuda. Hugging Face pipelines are an easy way to use your Hugging Face models for [inference](https://huggingface.co/docs/transformers/main_classes/pipelines)" |
| 99 | + ] |
| 100 | + }, |
| 101 | + { |
| 102 | + "cell_type": "code", |
| 103 | + "execution_count": 9, |
| 104 | + "id": "60d68fc2-5569-46db-b0b2-a2b544442b4e", |
| 105 | + "metadata": {}, |
| 106 | + "outputs": [], |
| 107 | + "source": [ |
| 108 | + "import torch\n", |
| 109 | + "from diffusers import StableDiffusionPipeline\n", |
| 110 | + "\n", |
| 111 | + "pipeline = StableDiffusionPipeline.from_pretrained(\n", |
| 112 | + " \"runwayml/stable-diffusion-v1-5\", torch_dtype=torch.float16, revision=\"fp16\"\n", |
| 113 | + ")\n", |
| 114 | + "\n", |
| 115 | + "pipeline = pipeline.to(\"cuda\")" |
| 116 | + ] |
| 117 | + }, |
| 118 | + { |
| 119 | + "cell_type": "markdown", |
| 120 | + "id": "d54f6997-db44-4ce6-9312-ea92ef776fcc", |
| 121 | + "metadata": {}, |
| 122 | + "source": [ |
| 123 | + "## Creating our Generate Images Function\n", |
| 124 | + "\n", |
| 125 | + "After we have created our pipeline, we will create our function to generate images. Here we define some parameters:\n", |
| 126 | + "\n", |
| 127 | + "* prompt = The prompt used to generate your images\n", |
| 128 | + "* num_images_to_generate = Total number of images to generate\n", |
| 129 | + "* num_images_per_prompt = The number of images to generate in one iteration\n", |
| 130 | + "* guidance_scale = The guidance scale defines how much freedom you want to give the model. Higher guidance scale encourages the model to generate images that are closely linked to the text prompt, usually at the expense of lower image quality. [Guidance scale as defined in Classifier-Free Diffusion Guidance](https://arxiv.org/abs/2207.12598)\n", |
| 131 | + "* output_dir = This is the location you want to save the images to (The location will be created when creating the images)\n", |
| 132 | + "* display_images = Defines if you want to display the images inline after creation\n", |
| 133 | + "\n", |
| 134 | + "You can read more about the parameters associated with the Stable Diffusion pipeline [here](https://huggingface.co/docs/diffusers/api/pipelines/stable_diffusion)\n", |
| 135 | + "\n", |
| 136 | + "In the function we iterate through all images created, based on our total defined images" |
| 137 | + ] |
| 138 | + }, |
| 139 | + { |
| 140 | + "cell_type": "code", |
| 141 | + "execution_count": 11, |
| 142 | + "id": "ac716acb-e9f0-4990-a018-d698f9a276c2", |
| 143 | + "metadata": {}, |
| 144 | + "outputs": [], |
| 145 | + "source": [ |
| 146 | + "import os\n", |
| 147 | + "\n", |
| 148 | + "from IPython.display import Image, display\n", |
| 149 | + "\n", |
| 150 | + "\n", |
| 151 | + "def generate_images(\n", |
| 152 | + " prompt,\n", |
| 153 | + " num_images_to_generate,\n", |
| 154 | + " num_images_per_prompt=4,\n", |
| 155 | + " guidance_scale=8,\n", |
| 156 | + " output_dir=\"generated_images\",\n", |
| 157 | + " display_images=False,\n", |
| 158 | + "):\n", |
| 159 | + "\n", |
| 160 | + " num_iterations = num_images_to_generate // num_images_per_prompt\n", |
| 161 | + " os.makedirs(output_dir, exist_ok=True)\n", |
| 162 | + "\n", |
| 163 | + " for i in range(num_iterations):\n", |
| 164 | + " images = pipeline(\n", |
| 165 | + " prompt, num_images_per_prompt=num_images_per_prompt, guidance_scale=guidance_scale\n", |
| 166 | + " )\n", |
| 167 | + " for idx, image in enumerate(images.images):\n", |
| 168 | + " image_name = f\"{output_dir}/image_{(i*num_images_per_prompt)+idx}.png\"\n", |
| 169 | + " image.save(image_name)\n", |
| 170 | + " if display_images:\n", |
| 171 | + " display(Image(filename=image_name, width=128, height=128))" |
| 172 | + ] |
| 173 | + }, |
| 174 | + { |
| 175 | + "cell_type": "code", |
| 176 | + "execution_count": 13, |
| 177 | + "id": "c830c486-68b0-4375-812a-3ba96883bcf2", |
| 178 | + "metadata": {}, |
| 179 | + "outputs": [], |
| 180 | + "source": [ |
| 181 | + "# 1000 images takes 2-3 hours on a SageMaker Studio Lab GPU instance. \n", |
| 182 | + "# You can adjust the total image number below\n", |
| 183 | + "\n", |
| 184 | + "generate_images(\"aerial view of cattle\", 12, guidance_scale=4, display_images=True)\n", |
| 185 | + "\n" |
| 186 | + ] |
| 187 | + }, |
| 188 | + { |
| 189 | + "cell_type": "markdown", |
| 190 | + "id": "48f3894f-2916-4e60-a4c5-6ba980e0315a", |
| 191 | + "metadata": { |
| 192 | + "tags": [] |
| 193 | + }, |
| 194 | + "source": [ |
| 195 | + "## Sign up for a free Roboflow Account\n", |
| 196 | + "_***Note: If you already have a Roboflow account and project created, you can skip the below steps and go right to pushing your generated images up to your Roboflow project.***_" |
| 197 | + ] |
| 198 | + }, |
| 199 | + { |
| 200 | + "cell_type": "markdown", |
| 201 | + "id": "1213d4b4-7cdd-4393-8098-bd6bfce028d3", |
| 202 | + "metadata": { |
| 203 | + "tags": [] |
| 204 | + }, |
| 205 | + "source": [ |
| 206 | + "### Sign up for a free Roboflow account\n", |
| 207 | + "\n", |
| 208 | + "[Roboflow](https://roboflow.com/?ref=studiolab) is an end-to-end computer vision platform. It helps you [create](https://docs.roboflow.com/quick-start?ref=studiolab), [understand](https://blog.roboflow.com/dataset-search/?ref=studiolab), and [use](https://docs.roboflow.com/exporting-data?ref=studiolab) image datasets to train and deploy custom models.\n", |
| 209 | + "\n", |
| 210 | + "Roboflow strives to be broadly interoperable and can import and export object detection datasets in [dozens of formats](https://roboflow.com/formats?ref=studiolab). They maintain [training notebooks](https://models.roboflow.com/?ref=studiolab) (like this one) for many state of the art computer vision models, and also offer [AutoML training](https://docs.roboflow.com/train?ref=studiolab) which can be useful for [prototyping](https://blog.roboflow.com/deploy-tab/?ref=studiolab), [model assisted labeling](https://roboflow.com/annotate?ref=studiolab), and even [deploying to a wide range of targets and edge devices](https://roboflow.com/deploy?ref=studiolab).\n", |
| 211 | + "\n", |
| 212 | + "In this tutorial, we will use Roboflow to annotate a custom dataset and export it for use with YOLOv7 in this notebook. But we encourage you to explore [its other features](https://roboflow.com/features?ref=studiolab) as well.\n", |
| 213 | + "\n", |
| 214 | + "### Step 2: Create a Public Workspace\n", |
| 215 | + "\n", |
| 216 | + "Roboflow offers a [generous free tier](https://roboflow.com/pricing?ref=studiolab) if your data can be shared publicly with others on [Roboflow Universe](https://universe.roboflow.com/?ref=studiolab). There are also paid plans available for private data.\n", |
| 217 | + "\n", |
| 218 | + "For this tutorial you'll need to create a Public workspace. Be sure to give it a good name; it will serve as your Universe username where you can showcase your work.\n", |
| 219 | + "\n", |
| 220 | + "<div><img src=\"https://i.imgur.com/zfE5MZL.png\" style=\"max-width: 600px;\"></div>\n", |
| 221 | + " \n", |
| 222 | + "### Step 3: Create a Project\n", |
| 223 | + "\n", |
| 224 | + "Then, create an `Object Detection` project (be sure to give it a descriptive name, and fill in the `What will your model predict?` section since they will make your project more understandable and be pulled in via the API later, and can serve as helpful metadata later for advanced use-cases like automated prompt engineering for zero-shot models).\n", |
| 225 | + "\n", |
| 226 | + "<div><img src=\"https://i.imgur.com/O2xDyxQ.png\" style=\"max-width: 500px;\"></div>\n" |
| 227 | + ] |
| 228 | + }, |
| 229 | + { |
| 230 | + "cell_type": "markdown", |
| 231 | + "id": "6b978e09-a6a0-45a8-8010-7dc2a8f3338d", |
| 232 | + "metadata": { |
| 233 | + "tags": [] |
| 234 | + }, |
| 235 | + "source": [ |
| 236 | + "## Push Generated Data to your Roboflow Project" |
| 237 | + ] |
| 238 | + }, |
| 239 | + { |
| 240 | + "cell_type": "markdown", |
| 241 | + "id": "cbda9f05-d552-4163-8c40-3a9cbda57213", |
| 242 | + "metadata": {}, |
| 243 | + "source": [ |
| 244 | + "### Upload your Images to Roboflow via PIP package\n", |
| 245 | + "\n", |
| 246 | + "Here we will use the [Upload API](https://docs.roboflow.com/adding-data/upload-api?ref=studiolab) to push your images up to your Roboflow project. Pushing your images programatically instead of the web UI is especially important in the world of [Active Learning](https://docs.roboflow.com/python/active-learning).\n", |
| 247 | + "\n", |
| 248 | + "You also have the ability to push those images up to your S3 bucket before sending over to Roboflow [load images from an S3 bucket](https://blog.roboflow.com/how-to-use-s3-computer-vision-pipeline/?ref=studiolab).\n", |
| 249 | + "\n", |
| 250 | + "_**Note:** To get good, generalizable results you will need lots of images covering a wide variety of situations and edge cases. Exactly how many images you need [depends on a wide variety of factors](https://blog.roboflow.com/images-train-model/?ref=studiolab), but we recommend starting out with at least 200 for most use-cases. If you need more images, try sourcing from open source datasets on [Roboflow Universe](https://universe.roboflow.com/?ref=studiolab) with images similar to yours._\n" |
| 251 | + ] |
| 252 | + }, |
| 253 | + { |
| 254 | + "cell_type": "markdown", |
| 255 | + "id": "ef18ec54-d9c1-404c-bb19-27b681241166", |
| 256 | + "metadata": {}, |
| 257 | + "source": [ |
| 258 | + "Set HOME path" |
| 259 | + ] |
| 260 | + }, |
| 261 | + { |
| 262 | + "cell_type": "code", |
| 263 | + "execution_count": 14, |
| 264 | + "id": "d233b775-6c1d-4360-8400-5ade9f4388c4", |
| 265 | + "metadata": {}, |
| 266 | + "outputs": [], |
| 267 | + "source": [ |
| 268 | + "import os\n", |
| 269 | + "HOME = os.getcwd()" |
| 270 | + ] |
| 271 | + }, |
| 272 | + { |
| 273 | + "cell_type": "markdown", |
| 274 | + "id": "2b801fdb-7dc0-4af0-94dc-c47e726907ed", |
| 275 | + "metadata": {}, |
| 276 | + "source": [ |
| 277 | + "Once your images are generated, you call the [Roboflow Upload API](https://docs.roboflow.com/adding-data/upload-api) to push our images into our Roboflow project. \n", |
| 278 | + "\n", |
| 279 | + "In the cell below, you should substitute your API key and project name with a project and API that you create in Roboflow. \n", |
| 280 | + "\n", |
| 281 | + "To get your API key, you can follow our tutorial on [retrieving an API key from the Roboflow dashboard](https://docs.roboflow.com/rest-api#obtaining-your-api-key)." |
| 282 | + ] |
| 283 | + }, |
| 284 | + { |
| 285 | + "cell_type": "code", |
| 286 | + "execution_count": 15, |
| 287 | + "id": "87b8a427-e145-4078-b643-bb001fefa9ec", |
| 288 | + "metadata": {}, |
| 289 | + "outputs": [], |
| 290 | + "source": [ |
| 291 | + "from roboflow import Roboflow\n", |
| 292 | + "import glob\n", |
| 293 | + "import os\n", |
| 294 | + "\n", |
| 295 | + "# glob params\n", |
| 296 | + "image_dir = os.path.join(HOME, \"generated_images\", \"\")\n", |
| 297 | + "file_extension_type = \".png\"\n", |
| 298 | + "\n", |
| 299 | + "# roboflow pip params\n", |
| 300 | + "rf = Roboflow(api_key=\"YOUR_API_KEY\")\n", |
| 301 | + "upload_project = rf.workspace().project(\"YOUR_PROJECT_NAME\")\n", |
| 302 | + "\n", |
| 303 | + "# glob images\n", |
| 304 | + "image_glob = glob.glob(image_dir + '/*' + file_extension_type)\n", |
| 305 | + "\n", |
| 306 | + "# perform upload\n", |
| 307 | + "for image in image_glob:\n", |
| 308 | + " upload_project.upload(image, num_retry_uploads=3)\n", |
| 309 | + " print(\"*** Processing image [\" + str(len(image_glob)) + \"] - \" + image + \" ***\")" |
| 310 | + ] |
| 311 | + }, |
| 312 | + { |
| 313 | + "cell_type": "markdown", |
| 314 | + "id": "b9e9004e-5620-4edb-89f7-ef8656b4100e", |
| 315 | + "metadata": { |
| 316 | + "tags": [] |
| 317 | + }, |
| 318 | + "source": [ |
| 319 | + "## Check Images\n", |
| 320 | + "\n", |
| 321 | + "Once the above cell is complete, all images should now be in your project, under the annotate section with the designation of _unassigned <> PIP Package Upload_.\n", |
| 322 | + "\n", |
| 323 | + "<div><img src=\"https://i.imgur.com/fqAlyM2.png\" style=\"max-width: 700px;\"></div>\n", |
| 324 | + "\n", |
| 325 | + "When you click in the _PIP Package Upload_ batch, you will be able to see all your generated data\n", |
| 326 | + "\n", |
| 327 | + "<div><img src=\"https://i.imgur.com/EJxLTLq.png\" style=\"max-width: 700px;\"></div>" |
| 328 | + ] |
| 329 | + }, |
| 330 | + { |
| 331 | + "cell_type": "markdown", |
| 332 | + "id": "7244ebbc-d1b0-4e7b-81bf-c06d3f501243", |
| 333 | + "metadata": {}, |
| 334 | + "source": [ |
| 335 | + "## Roboflow Annotate\n", |
| 336 | + "\n", |
| 337 | + "Now, we'll use [Roboflow Annotate](https://roboflow.com/annotate?ref=studiolab) to create annotations that will teach our model what we're trying to detect in our images. Since your model will learn to mimic your annotations, it's important that you give some thought to how you label your images ahead of time. We've compiled a list of [best practices to consider when labeling images](https://blog.roboflow.com/tips-for-how-to-label-images/?ref=studiolab)." |
| 338 | + ] |
| 339 | + }, |
| 340 | + { |
| 341 | + "cell_type": "markdown", |
| 342 | + "id": "90f0c902-b140-40c0-a736-6f9327583f7e", |
| 343 | + "metadata": {}, |
| 344 | + "source": [ |
| 345 | + "## Further Enhancements \n", |
| 346 | + "\n", |
| 347 | + "You can take these text-to-image capabilites a step further, further simplify this workflow and really take advantage of [transfer learning](https://en.wikipedia.org/wiki/Transfer_learning). If you locate an open source model on [Roboflow Universe](https://universe.roboflow.com/) that meets your models criteria, you can use this model to help do automated annotations.\n", |
| 348 | + "\n", |
| 349 | + "Roboflow Universe has over 15k pre-trained and fine tuned models for use. Once you locate a useful model, you can call that model to get predictions. In the outout of that prediction is the predicted bounding box coordinates for your AI generated images.\n", |
| 350 | + "\n", |
| 351 | + "You can then setup a workflow which calls the Universe model, takes all predictions with a confidence of over 50% and use the output of the prediction to auto label your images and get annotations. You will then upload all images and corresponding annotations into your Roboflow project via the API. \n", |
| 352 | + "\n", |
| 353 | + "This workflow would greatly reduce the manual efforts of annotating your generated images. I can't wait to see how people take this notebook further! " |
| 354 | + ] |
| 355 | + } |
| 356 | + ], |
| 357 | + "metadata": { |
| 358 | + "kernelspec": { |
| 359 | + "display_name": "default:Python", |
| 360 | + "language": "python", |
| 361 | + "name": "conda-env-default-py" |
| 362 | + }, |
| 363 | + "language_info": { |
| 364 | + "codemirror_mode": { |
| 365 | + "name": "ipython", |
| 366 | + "version": 3 |
| 367 | + }, |
| 368 | + "file_extension": ".py", |
| 369 | + "mimetype": "text/x-python", |
| 370 | + "name": "python", |
| 371 | + "nbconvert_exporter": "python", |
| 372 | + "pygments_lexer": "ipython3", |
| 373 | + "version": "3.9.13" |
| 374 | + } |
| 375 | + }, |
| 376 | + "nbformat": 4, |
| 377 | + "nbformat_minor": 5 |
| 378 | +} |