Commit 8916ba3

feat: QA grammarly
1 parent 3eb3280 commit 8916ba3

1 file changed: +17 -17 lines changed

custom_pytorch_yolov5/custom_pytorch.md

Lines changed: 17 additions & 17 deletions
@@ -8,7 +8,7 @@ In this article, we're going to learn how to load a YOLOv5 model into PyTorch, a
 2. Cropping and saving detections
 3. Counting Detected Objects
 
-If you're a little confused on how we got here from the very beginning, you can check out the first and second article (this article's predecessors) here:
+If you're a little confused about how we got here from the very beginning, you can check out the first and second articles (this article's predecessors) here:
 
 - [Creating a CMask Detection Model on OCI with YOLOv5: Data Labeling with RoboFlow](https://medium.com/oracledevs/creating-a-cmask-detection-model-on-oci-with-yolov5-data-labeling-with-roboflow-5cff89cf9b0b)
 - [Creating a Mask Model on OCI with YOLOv5: Training and Real-Time Inference](https://medium.com/oracledevs/creating-a-mask-model-on-oci-with-yolov5-training-and-real-time-inference-3534c7f9eb21)
@@ -23,21 +23,21 @@ I decided to create a "modified" version of what YOLOv5 does, by taking advantag
 
 I believed custom PyTorch code would be great, because simply using YOLOv5's repository didn't give you 100% flexibility and responsiveness (real-time), so I decided to __very slightly__ add some *extra* functionalities (we'll talk about them below). If you're trying to use [the standard GitHub repository for YOLOv5](https://github.com/ultralytics/yolov5), you'll find that you can use their code like [this detector](https://github.com/ultralytics/yolov5/blob/master/detect.py) to post-process video or image files. You can also use it directly with a YouTube video, and an integrated YouTube downloader will download frames and process them.
 
-But what is the definition of real time? I want every frame that I see in my computer, somehow (be it either a camera frame from my webcam or a YouTube video, or even my screen) to display the results of my detection immediately. This is why I created my own custom code to detect with PyTorch.
+But what is the definition of real-time? I want every frame that I see on my computer, somehow (be it a camera frame from my webcam, a YouTube video, or even my screen) to display the results of my detection immediately. This is why I created my own custom code to detect with PyTorch.
 
-Finally, I'd like to mention my journey through a *very painful* road of finding *a few* bugs on the Windows Operating System and trying to *virtualize* your webcam feed. There's this great plugin that could "replicate" your camera feed into a virtual version that you could use in any program (you could give your computer any program / input and feed it into the webcam stream, so that it looked like your webcam feed was coming from somewhere else), and it was really great:
+Finally, I'd like to mention my journey through a *very painful* road of finding *a few* bugs on the Windows Operating System and trying to *virtualize* your webcam feed. There's this great plugin that could "replicate" your camera feed into a virtual version that you could use in any program (you could give your computer any program/input and feed it into the webcam stream so that it looked like your webcam feed was coming from somewhere else), and it was really great:
 
 ![OBS great outdated plugin](./images/obs_1.PNG)
 
-This was an [OBS (Open Broadcaster Software)](https://obsproject.com/) plugin. OBS is the go-to program to use when you're planning to make a livestream. However, this plugin was discontinued in OBS version 28, and all problems came with this update. I prepared this bug-compilation image so you can feel the pain too:
+This was an [OBS (Open Broadcaster Software)](https://obsproject.com/) plugin. OBS is the go-to program to use when you're planning to make a live stream. However, this plugin was discontinued in OBS version 28, and all problems came with this update. I prepared this bug-compilation image so you can feel the pain too:
 
 ![OBS bug compilation](./images/obs_errors.PNG)
 
-So, once we've established that there are several roadblocks that prevent us from happily developing in a stable environment, we finally understand the "why" of this article. Let's begin implementing.
+So, once we've established that several roadblocks prevent us from happily developing in a stable environment, we finally understand the "why" of this article. Let's begin implementing.
 
 ## Implementation
 
-We are going to focus on the three problems explained in the Introduction: cropping and saving objects, counting objects and sorting them. These techniques can be re-used in any computer vision projects, so once you understand how to implement them once, you're good to go.
+We are going to focus on the three problems explained in the Introduction: cropping and saving objects, counting objects, and sorting them. These techniques can be re-used in any computer vision project, so once you understand how to implement them once, you're good to go.
 
 Technical requirements are Python 3.8 or higher, and PyTorch 1.7 or higher. [Here's a list](https://github.com/oracle-devrel/devo.publishing.other/custom_pytorch_yolov5/files/requirements.txt) of the project's requirements if you want to reuse the code (which you can find in [the GitHub repository](https://github.com/oracle-devrel/devo.publishing.other/custom_pytorch_yolov5)), along with everything we publish.

@@ -48,7 +48,7 @@ First, we will use the `argparse` to include additional parameters to our Python
 ![argparse](./images/argparse.PNG)
 > **Note**: the confidence threshold will only display detected objects if the confidence score of the model's prediction is higher than the given value (0.0-1.0).
 
-These argparse parameters' default values can always be modified. The _frequency_ parameter will determine how many frames to detect (e.g. a frame step). If the user specifies any number N, then only 1 frame every N frames will be used for detection. This can be useful if you're expecting your data to be very similar within sequential frames, as the detection of one object, in one frame, will suffice. Specifying this frame step is also beneficial to avoid cost overages when making these predictions (electricity bill beware, or OCI costs if you're using Oracle Cloud).
+These argparse parameters' default values can always be modified. The _frequency_ parameter will determine how many frames to detect (i.e., a frame step). If the user specifies any number N, then only 1 frame every `N` frames will be used for detection. This can be useful if you're expecting your data to be very similar within sequential frames, as the detection of one object, in one frame, will suffice. Specifying this frame step is also beneficial to avoid cost overages when making these predictions (electricity bill beware, or OCI costs if you're using Oracle Cloud).
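For reference, a minimal sketch of what this argparse setup could look like; the actual code is in the screenshot above, and the flag names here are illustrative assumptions:

```python
import argparse

# Minimal sketch of the argparse setup; flag names are assumptions,
# the article's actual code is in the screenshot above.
parser = argparse.ArgumentParser(description='Custom YOLOv5 real-time detector')
parser.add_argument('--conf-threshold', type=float, default=0.5,
                    help='only keep detections with confidence above this value (0.0-1.0)')
parser.add_argument('--frequency', type=int, default=1,
                    help='frame step N: run detection on only 1 out of every N frames')
args = parser.parse_args()

# Inside the main loop, the frame step would then gate inference like this:
# if frame_count % args.frequency == 0:
#     results = model(frame, size=640)
```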

 After this initial configuration, we're ready to load our custom model. You can find the pre-trained custom model's weights for the Mask Detection Model being featured [in this link](https://www.kaggle.com/datasets/jasperan/covid-19-mask-detection?select=best.pt). You'll need to have this file within reach of your Python coding/execution environment for the code to work.
 

@@ -57,7 +57,7 @@ So, we now load the custom weights file:
 ![loading PyTorch model](./images/load_model.PNG)
 > **Note**: we specify the model as a custom YOLO detector, and give it the model's weights file as input.
 
-Now, we're ready to get started. We create our main loop, which constantly gets a new image from the input source (either a webcam feed or a screenshot of what we're seeing in our screen) and displays it with the bounding box detections in place:
+Now, we're ready to get started. We create our main loop, which constantly gets a new image from the input source (either a webcam feed or a screenshot of what we're seeing on our screen) and displays it with the bounding box detections in place:
 
 ![main loop](./images/main_loop.PNG)
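Taken together, a minimal sketch of these two steps (loading the custom weights through PyTorch Hub, then the capture-infer-display loop) might look like this, assuming a webcam source and a `best.pt` file in the working directory:

```python
import cv2
import torch

# Load the custom weights as a YOLOv5 "custom" detector, as described above.
model = torch.hub.load('ultralytics/yolov5', 'custom', path='best.pt')

cap = cv2.VideoCapture(0)  # webcam feed; a screen-capture source works the same way
while True:
    ret, frame = cap.read()
    if not ret:
        break
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)  # OpenCV gives BGR, the model expects RGB
    results = model(rgb, size=640)
    annotated = cv2.cvtColor(results.render()[0], cv2.COLOR_RGB2BGR)  # boxes drawn in place
    cv2.imshow('detections', annotated)
    if cv2.waitKey(1) & 0xFF == ord('q'):  # quit on 'q'
        break
cap.release()
cv2.destroyAllWindows()
```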

@@ -75,22 +75,22 @@ For this, I chose a `SCALE_FACTOR` variable to hold this value (between 0-1). Cu
 Now that we have our downscaled image, we pass it to the model, and it returns the object we wanted:
 
 ![infer 3](./images/infer_3.PNG)
-> **Note**: the `size=640` option tells the model we're going to pass it images with that width, so the model will predict results of those dimensions.
+> **Note**: the `size=640` option tells the model we're going to pass images with that width, so the model will predict the results of those dimensions.
 
 The last thing we do is draw the bounding boxes that we obtained into the image, and return the image to display it later.
 
 ![infer return](./images/infer_5.PNG)
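A sketch of this inference step; the function name `detect` and drawing on the downscaled frame are illustrative (upscaling the boxes onto the original frame is covered in section 2 below):

```python
import cv2

def detect(model, small):
    # Run the model on the already-downscaled frame; size=640 matches
    # the width we promised the model.
    results = model(cv2.cvtColor(small, cv2.COLOR_BGR2RGB), size=640)
    # The PyTorch-pandas object: one row per detection, with columns
    # xmin, ymin, xmax, ymax, confidence, class, name.
    detections = results.pandas().xyxy[0]
    for _, det in detections.iterrows():
        cv2.rectangle(small, (int(det.xmin), int(det.ymin)),
                      (int(det.xmax), int(det.ymax)), (0, 255, 0), 2)
    return small, detections
```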

 ## 1. Sorting Detections
 
-This first technique is the simplest, and can be useful to add value to the standard YOLO functionality in an unique way. The idea is to quickly manipulate the PyTorch-pandas object to sort values according to one of the columns.
+This first technique is the simplest and can be useful to add value to the standard YOLO functionality in a unique way. The idea is to quickly manipulate the PyTorch-pandas object to sort values according to one of the columns.
 
 For this, I suggest two ideas: sorting by confidence score, or by detection coordinates. To illustrate how either of these techniques is useful, let's look at the following image:
 
 ![speed figure](./images/figure_speed.png)
 > **Note**: this image illustrates how sorting detections can be useful. [(image credits)](https://www.linkedin.com/in/muhammad-moin-7776751a0/)
 
-In the image above, an imaginary line is drawn between both sides of the roadway, in this case **horizontally**. Any object passing from one equator to the other in a specific direction is counted as an "inward" or "downward" vehicle. This can be achieved by specifying (x,y) bounds, and any item in the PyTorch-pandas object that surpasses it in any direction is detected.
+In the image above, an imaginary line is drawn between both sides of the roadway, in this case, **horizontally**. Any object passing from one side of the line to the other in a specific direction is counted as an "inward" or "downward" vehicle. This can be achieved by specifying (x,y) bounds, and any item in the PyTorch-pandas object that surpasses it in any direction is detected.
 
 For processing purposes, sorting these values from the lowest y coordinate to the highest will return all cars in order, from top to bottom of the image, which facilitates their processing in an ordered manner.
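Because the detections come back as a pandas DataFrame, both sorts are one-liners; here's a minimal sketch (`frame.jpg` is a hypothetical input image):

```python
import torch

model = torch.hub.load('ultralytics/yolov5', 'custom', path='best.pt')
detections = model('frame.jpg', size=640).pandas().xyxy[0]

# Highest-confidence detections first.
by_confidence = detections.sort_values('confidence', ascending=False)

# Lowest ymin first: objects come back ordered from the top of the
# image to the bottom, like the cars in the figure above.
top_to_bottom = detections.sort_values('ymin')
```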

@@ -111,7 +111,7 @@ This approach wouldn't work if we gave the whole image to the OCR, as it wouldn'
 To implement this, we will base everything we do on **bounding boxes**. Our PyTorch code will return an object with bounding box coordinates for detected objects (and the detection's confidence scores), and we will use this object to create newly cropped images with the bounding box sizes.
 > **Note**: you can always modify the range of pixels you want to crop in each image, by being either more **permissive** (getting extra pixels around the bounding box) or more **restrictive**, removing the edges of the detected object.
 
-An important consideration is that, since we're passing images to our model with a width of 640 pixels, we need to keep our previously-mentioned `SCALE_FACTOR` variable. The problem is that the original image has a higher size than the downscaled image (the one we pass the model), so bounding box detection coordinates will also be downscaled. We need to multiply these detections by the scale factor in order to _draw_ these bounding boxes over the original image; and then display it:
+An important consideration is that, since we're passing images to our model with a width of 640 pixels, we need to keep our previously-mentioned `SCALE_FACTOR` variable. The problem is that the original image is larger than the downscaled image (the one we pass the model), so bounding box detection coordinates will also be downscaled. We need to multiply these detections by the scale factor in order to _draw_ these bounding boxes over the original image, and then display it:
 
 ![infer 4](./images/infer_4.PNG)
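Concretely, the rescaling might look like the sketch below; since `SCALE_FACTOR` is the 0-1 downscale ratio, dividing by it is the same as multiplying the coordinates by its inverse (the helper name is illustrative):

```python
def upscale_box(det, scale_factor):
    # Map one detection row from the downscaled image's coordinate
    # space back onto the original, full-size frame.
    return (int(det.xmin / scale_factor), int(det.ymin / scale_factor),
            int(det.xmax / scale_factor), int(det.ymax / scale_factor))
```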

@@ -120,22 +120,22 @@ Inside this function, we will **upscale** bounding box detections. Also, we'll o
 
 ![save cropped images](./images/save_cropped_images.PNG)
 
-Last thing we do is save the cropped image with OpenCV:
+The last thing we do is save the cropped image with OpenCV:
 
 ![save image](./images/save_image.PNG)
 
 And we successfully implemented the functionality.
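Putting the section together, a sketch of the crop-and-save step; the `margin` parameter implements the permissive/restrictive choice from the note above, and the function name and signature are assumptions, not the article's actual code:

```python
import cv2

def save_crop(original, det, scale_factor, path, margin=0):
    h, w = original.shape[:2]
    # Upscale the bounding box back onto the original frame, then widen
    # (positive margin) or shrink (negative margin) the crop region.
    x1 = max(int(det.xmin / scale_factor) - margin, 0)
    y1 = max(int(det.ymin / scale_factor) - margin, 0)
    x2 = min(int(det.xmax / scale_factor) + margin, w)
    y2 = min(int(det.ymax / scale_factor) + margin, h)
    cv2.imwrite(path, original[y1:y2, x1:x2])  # save the cropped detection
```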

 ## 3. Counting Detected Objects
 
-This last technique we're going to learn about is very straightforward and easy to implement: since we want to count the number of detected objects in the screen, we need to use a global variable (in memory) or a database of some sort to store this variable. We can either design the variable to either:
+This last technique we're going to learn about is straightforward and easy to implement: since we want to count the number of detected objects on the screen, we need to use a global variable (in memory) or a database of some sort to store this variable. We can design the variable to either:
 1. Always increment, and keep a global value of all detected objects since we started executing our Python program
 2. Only hold the value of currently detected objects on the screen
 
 Depending on the problem, you may want to choose one of these two options. In our case, we'll implement the second option:
 
 ![draw 2](./images/draw_1.PNG)
-> **Note**: to implement the first option, you just need to *increment* the variable every time, instead of setting it. However, you might benefit from looking at implementations like [DeepSORT](https://github.com/ZQPei/deep_sort_pytorch) or [Zero-Shot Tracking](https://github.com/roboflow/zero-shot-object-tracking), which is able to recognize the same object/detection from sequential frames, and only count them as one; not separate entities.
+> **Note**: to implement the first option, you just need to *increment* the variable every time, instead of setting it. However, you might benefit from looking at implementations like [DeepSORT](https://github.com/ZQPei/deep_sort_pytorch) or [Zero-Shot Tracking](https://github.com/roboflow/zero-shot-object-tracking), which can recognize the same object/detection across sequential frames and only count them as one, not as separate entities.
 
 With our newly-created global variable, we'll hold a value of our liking. For example, in the code above, I'm detecting the _`mask`_ class. Then, I just need to draw the number of detected objects with OpenCV, along with the bounding boxes on top of the original image:
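A sketch of that counting-and-drawing step, implementing the second option (the first option would increment instead of overwrite; all names here are illustrative):

```python
import cv2

mask_count = 0  # global counter; a database would also work, as noted above

def draw_count(detections, frame):
    global mask_count
    # Option 2: overwrite with the number of 'mask' objects visible right now.
    # (Option 1 would accumulate instead: mask_count += ...)
    mask_count = int((detections['name'] == 'mask').sum())
    cv2.putText(frame, f'masks: {mask_count}', (10, 30),
                cv2.FONT_HERSHEY_SIMPLEX, 1.0, (0, 255, 0), 2)
    return frame
```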

@@ -150,7 +150,7 @@ Note that I tested this on my own computer with an RTX 3080, I got about 25 FPS
 
 ## Conclusions
 
-I've shown three additional features not currently present in YOLO models, by just adding a Python layer to it. PyTorch's Model Hub ultimately made this possible, as well as RoboFlow (made creating and exporting the mask detection model easy).
+I've shown three additional features not currently present in YOLO models, by just adding a Python layer to them. PyTorch's Model Hub ultimately made this possible, as well as RoboFlow (which made creating and exporting the mask detection model easy).
 
 In the future, I'm planning on releasing an implementation with either DeepSORT or Zero-Shot Tracking, together with YOLO, to track objects. If you'd like to see any additional use cases or features implemented, let me know in the comments!

@@ -161,4 +161,4 @@ Stay tuned...
 ## Acknowledgments
 
 * **Author** - [Nacho Martinez](https://www.linkedin.com/in/ignacio-g-martinez/), Data Science Advocate @ Oracle Developer Relations
-* **Last Updated By/Date** - February 7th, 2023
+* **Last Updated By/Date** - February 8th, 2023
