Handwritten digit detection using a YOLOv8 detection model and ONNX pre/post processing. An example of how the model works in a real-world scenario can be viewed at https://thawro.github.io/web-object-detector/.
The dataset consists of images created with the use of the HWD+ dataset.
The HWD+ dataset consists of grayscale images of single handwritten digits in high resolution (500x500 pixels).
The yolo_HWD+ dataset is composed of images produced from the HWD+ dataset. Each yolo_HWD+ image contains many single digits, and each digit is annotated in YOLO format (`class x_center y_center width height`). The processing of HWD+ to obtain yolo_HWD+ is as follows:
- Cut the digit from each image (`HWD+` images have a lot of white background around the digit).
- Create a background image of size `imgsz` and apply a transform to it (`pre_transform` attribute), e.g. RGB shift/shuffle.
- Take `nrows * ncols` digit images and form an `nrows x ncols` grid.
- For each digit:
	- Apply a transform (`obj_transform` attribute), e.g. invert color, RGB shift/shuffle.
	- Randomly place the digit in the (i, j) cell and save its label and location as the annotation.
- Apply a transform to the fully formed grid (`post_transform` attribute), e.g. rotation.
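The grid-composition step above can be sketched as follows. This is an illustrative simplification, not the repo's actual code: `make_grid`, its arguments, and the blank (rather than transformed) background are assumptions, and the `pre_transform`/`obj_transform`/`post_transform` hooks are only marked by comments.

```python
import numpy as np

def make_grid(digits, imgsz=640, nrows=2, ncols=2, rng=None):
    """Place nrows*ncols digit crops on a background image and return the
    image plus YOLO-format labels (class, x_center, y_center, width, height;
    coordinates normalized to [0, 1]).

    `digits` is a list of (class_id, HxW grayscale ndarray) crops.
    """
    rng = rng or np.random.default_rng(0)
    # blank background; the real dataset applies `pre_transform` here
    canvas = np.zeros((imgsz, imgsz), dtype=np.uint8)
    cell = imgsz // max(nrows, ncols)
    labels = []
    for idx, (cls, crop) in enumerate(digits[: nrows * ncols]):
        i, j = divmod(idx, ncols)  # (i, j) cell of the grid
        h, w = crop.shape
        # `obj_transform` would be applied to `crop` at this point
        # random offset inside the (i, j) cell, clipped so the crop fits
        y0 = i * cell + int(rng.integers(0, max(cell - h, 1)))
        x0 = j * cell + int(rng.integers(0, max(cell - w, 1)))
        canvas[y0 : y0 + h, x0 : x0 + w] = crop
        # YOLO annotation: class and normalized center/size of the box
        labels.append(
            (cls, (x0 + w / 2) / imgsz, (y0 + h / 2) / imgsz, w / imgsz, h / imgsz)
        )
    # `post_transform` (e.g. rotation) would be applied to `canvas` here
    return canvas, labels

# four dummy white "digit" crops on a 200x200 background, 2x2 grid
digits = [(d, np.full((40, 30), 255, dtype=np.uint8)) for d in range(4)]
img, labels = make_grid(digits, imgsz=200, nrows=2, ncols=2)
```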
Example below:
- PyTorch - neural network architectures and dataset classes
- ONNX - all processing steps used in the pipeline
- ONNX Runtime - pipeline inference
- OpenCV - image processing for server-side model inference (optional)
- React - web application used to test object detection models on real-world examples
Each pipeline step is implemented as an ONNX model. The complete inference pipeline is the following:
- Image preprocessing - resize and pad the image to match the model input size (preprocessing)
- Object detection - detect objects with the YOLOv8 model (yolo)
- Non-Maximum Suppression - apply NMS to the YOLO output (nms)
- Postprocessing - apply postprocessing to the filtered boxes (postprocessing)
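Two of the steps above, resize-and-pad (letterboxing) and NMS, can be sketched in plain NumPy. In the project these steps run as ONNX graphs; the helper names, padding value, and thresholds below are illustrative assumptions.

```python
import numpy as np

def letterbox(img, size=640, pad_value=114):
    """Resize (nearest-neighbor) keeping aspect ratio, then pad to size x size.

    Returns the padded image plus the scale and (top, left) offsets needed
    to map detected boxes back to the original image coordinates.
    """
    h, w = img.shape[:2]
    scale = size / max(h, w)
    nh, nw = int(round(h * scale)), int(round(w * scale))
    # nearest-neighbor resize via index lookup (stands in for a real resize op)
    ys = (np.arange(nh) / scale).astype(int).clip(0, h - 1)
    xs = (np.arange(nw) / scale).astype(int).clip(0, w - 1)
    resized = img[ys][:, xs]
    out = np.full((size, size) + img.shape[2:], pad_value, dtype=img.dtype)
    top, left = (size - nh) // 2, (size - nw) // 2
    out[top : top + nh, left : left + nw] = resized
    return out, scale, (top, left)

def nms(boxes, scores, iou_thr=0.45):
    """Greedy class-agnostic NMS over xyxy boxes; returns kept indices."""
    order = scores.argsort()[::-1]  # highest score first
    keep = []
    while order.size:
        i = order[0]
        keep.append(i)
        # IoU of the top box against the remaining candidates
        xx1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        yy1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        xx2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        yy2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        areas = (boxes[order[1:], 2] - boxes[order[1:], 0]) * (
            boxes[order[1:], 3] - boxes[order[1:], 1]
        )
        iou = inter / (area_i + areas - inter + 1e-9)
        order = order[1:][iou <= iou_thr]  # drop heavily overlapping boxes
    return keep

# 480x640 image -> 640x640 letterboxed input
padded, scale, (top, left) = letterbox(np.zeros((480, 640), dtype=np.uint8), size=640)
# two overlapping boxes and one distant box; NMS keeps the best of the pair
kept = nms(
    np.array([[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60.0]]),
    np.array([0.9, 0.8, 0.7]),
)
```

The returned `scale` and offsets matter because YOLO boxes are predicted in the padded 640x640 frame and must be shifted and rescaled back to the original image during postprocessing.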