
Commit 8b23e81

chore: add en readme

1 parent 097e2df · commit 8b23e81

File tree

2 files changed: +198 −1 lines changed


README.md

Lines changed: 3 additions & 1 deletion

```diff
@@ -7,6 +7,8 @@
 <a href="https://semver.org/"><img alt="SemVer2.0" src="https://img.shields.io/badge/SemVer-2.0-brightgreen"></a>
 <a href="https://github.com/psf/black"><img src="https://img.shields.io/badge/code%20style-black-000000.svg"></a>
 <a href="https://github.com/RapidAI/TableStructureRec/blob/c41bbd23898cb27a957ed962b0ffee3c74dfeff1/LICENSE"><img alt="GitHub" src="https://img.shields.io/badge/license-Apache 2.0-blue"></a>
+
+[English](README_en.md) | 简体中文
 </div>

 ### 最近更新
@@ -40,7 +42,7 @@
 🔍 使用在线体验找到适合你场景的模型组合

 ### 在线体验
-[modelscope](https://www.modelscope.cn/studios/jockerK/RapidTableDetDemo)
+[modelscope](https://www.modelscope.cn/studios/jockerK/RapidTableDetDemo) [huggingface](https://huggingface.co/spaces/Joker1212/RapidTableDetection)
 ### 效果展示

 ![res_show.jpg](readme_resource/res_show.jpg)![res_show2.jpg](readme_resource/res_show2.jpg)
```

README_en.md

Lines changed: 195 additions & 0 deletions
<div align="center">
  <div align="center">
    <h1><b>📊RapidTableDetection</b></h1>
  </div>
  <a href=""><img src="https://img.shields.io/badge/Python->=3.8,<3.12-aff.svg"></a>
  <a href=""><img src="https://img.shields.io/badge/OS-Linux%2C%20Mac%2C%20Win-pink.svg"></a>
  <a href="https://semver.org/"><img alt="SemVer2.0" src="https://img.shields.io/badge/SemVer-2.0-brightgreen"></a>
  <a href="https://github.com/psf/black"><img src="https://img.shields.io/badge/code%20style-black-000000.svg"></a>
  <a href="https://github.com/RapidAI/TableStructureRec/blob/c41bbd23898cb27a957ed962b0ffee3c74dfeff1/LICENSE"><img alt="GitHub" src="https://img.shields.io/badge/license-Apache 2.0-blue"></a>
</div>

### Recent Updates

- **2024.10.15**
  - Completed the initial version of the code, including three modules: object detection, semantic segmentation, and corner direction recognition.
- **2024.11.2**
  - Added new YOLOv11 object detection and edge detection models.
  - Added automatic model downloading and reduced the package size.
  - Added ONNX-GPU inference support and published benchmark results.
  - Added an online demo.

### Introduction

💡✨ RapidTableDetection is a powerful and efficient table detection system that supports many table types, including those found in papers, journals, magazines, invoices, receipts, and sign-in sheets.

🚀 It offers both PaddlePaddle- and YOLO-derived models. The default combination needs only 1.2 s per image for CPU inference; the smallest ONNX-GPU (V100) combination takes 0.4 s, and the PaddlePaddle-GPU version 0.2 s.

🛠️ The three modules can be freely combined and independently trained and optimized; ONNX conversion scripts and fine-tuning training recipes are provided.

🌟 The whl package is easy to integrate and use, and provides strong support for downstream OCR, table recognition, and data collection.

The implementation follows the [2nd-place solution of the Baidu Table Detection Competition](https://aistudio.baidu.com/projectdetail/5398861?searchKeyword=%E8%A1%A8%E6%A0%BC%E6%A3%80%E6%B5%8B%E5%A4%A7%E8%B5%9B&searchTab=ALL), retrained with a large amount of real-world data.
![img.png](readme_resource/structure.png) \
The training datasets are credited in the Acknowledgments section. The author maintains this open-source project in their spare time; please show your support with a star.

### Usage Recommendations

- Document scenarios: no perspective distortion or rotation; use object detection only.
- Photographed scenarios with small rotation (-90° to 90°): the top-left corner is assumed by default; corner direction recognition is not needed.
- Use the online demo to find the model combination that fits your scenario.
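
When corner direction recognition is disabled and the top-left corner is assumed, the four table corners still need a consistent lt/rt/rb/lb ordering. A minimal, self-contained sketch of a common sum/difference heuristic for that ordering (an illustration only, not this library's internal logic):

```python
def order_corners(pts):
    """Order four (x, y) points as (lt, rt, rb, lb).

    Common heuristic: the top-left corner has the smallest x + y and the
    bottom-right the largest; the top-right has the smallest y - x and
    the bottom-left the largest.
    """
    by_sum = sorted(pts, key=lambda p: p[0] + p[1])
    lt, rb = by_sum[0], by_sum[-1]
    by_diff = sorted(pts, key=lambda p: p[1] - p[0])
    rt, lb = by_diff[0], by_diff[-1]
    return lt, rt, rb, lb


print(order_corners([(10, 90), (90, 95), (95, 10), (5, 5)]))
# ((5, 5), (95, 10), (90, 95), (10, 90))
```

Under large rotations such a heuristic mislabels the corners, which is what the corner direction classification module addresses for photographed scenes.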

### Online Experience

[modelscope](https://www.modelscope.cn/studios/jockerK/RapidTableDetDemo) [huggingface](https://huggingface.co/spaces/Joker1212/RapidTableDetection)

### Effect Demonstration

![res_show.jpg](readme_resource/res_show.jpg)![res_show2.jpg](readme_resource/res_show2.jpg)

### Installation

Models are downloaded automatically, or you can fetch them manually from the [modelscope model repository](https://www.modelscope.cn/models/jockerK/TableExtractor).

```bash
pip install rapid-table-det
```

#### Parameter Explanation

Constructor defaults:

- `use_cuda=False`: enable GPU acceleration for inference.
- `obj_model_type="yolo_obj_det"`: object detection model type.
- `edge_model_type="yolo_edge_det"`: edge detection model type.
- `cls_model_type="paddle_cls_det"`: corner direction classification model type.

Since ONNX GPU acceleration is limited, it is still recommended to use YOLOX directly or to install PaddlePaddle for faster model execution (the full workflow can be provided on request).
Because of quantization, the PaddlePaddle "s" models are actually slower and less accurate, but their file size is much smaller.

| `model_type` | Task Type | Training Source | Size | Single-Table Inference Time (V100-16G, CUDA 12, cuDNN 9, Ubuntu) |
|:---|:---|:---|:---|:---|
| **yolo_obj_det** | Table object detection | `yolo11-l` | `100m` | `cpu:570ms, gpu:400ms` |
| `paddle_obj_det` | Table object detection | `paddle yoloe-plus-x` | `380m` | `cpu:1000ms, gpu:300ms` |
| `paddle_obj_det_s` | Table object detection | `paddle yoloe-plus-x + quantization` | `95m` | `cpu:1200ms, gpu:1000ms` |
| **yolo_edge_det** | Semantic segmentation | `yolo11-l-segment` | `108m` | `cpu:570ms, gpu:200ms` |
| `yolo_edge_det_s` | Semantic segmentation | `yolo11-s-segment` | `11m` | `cpu:260ms, gpu:200ms` |
| `paddle_edge_det` | Semantic segmentation | `paddle-dbnet` | `99m` | `cpu:1200ms, gpu:120ms` |
| `paddle_edge_det_s` | Semantic segmentation | `paddle-dbnet + quantization` | `25m` | `cpu:860ms, gpu:760ms` |
| **paddle_cls_det** | Direction classification | `paddle pplcnet` | `6.5m` | `cpu:70ms, gpu:60ms` |

Inference-call parameters:

- `det_accuracy=0.7`
- `use_obj_det=True`
- `use_edge_det=True`
- `use_cls_det=True`
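
Judging by the name, `det_accuracy` presumably acts as a confidence threshold on detections. A hypothetical post-filter sketching that idea; the `score` field and the result shape here are assumptions for illustration, not the library's actual output format:

```python
def filter_detections(results, det_accuracy=0.7):
    # Keep only detections whose confidence meets the threshold.
    return [r for r in results if r.get("score", 0.0) >= det_accuracy]


dets = [
    {"box": [0, 0, 640, 480], "score": 0.92},
    {"box": [5, 5, 20, 12], "score": 0.41},
]
print(filter_detections(dets))  # only the score-0.92 entry remains
```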

### Quick Start

```python
from rapid_table_det.inference import TableDetector

img_path = "tests/test_files/chip.jpg"
table_det = TableDetector()

result, elapse = table_det(img_path)
obj_det_elapse, edge_elapse, rotate_det_elapse = elapse
print(
    f"obj_det_elapse:{obj_det_elapse}, edge_elapse={edge_elapse}, rotate_det_elapse={rotate_det_elapse}"
)
# Output visualization
# import os
# import cv2
# from rapid_table_det.utils.visuallize import img_loader, visuallize, extract_table_img
#
# img = img_loader(img_path)
# img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
# file_name_with_ext = os.path.basename(img_path)
# file_name, file_ext = os.path.splitext(file_name_with_ext)
# out_dir = "rapid_table_det/outputs"
# if not os.path.exists(out_dir):
#     os.makedirs(out_dir)
# extract_img = img.copy()
# for i, res in enumerate(result):
#     box = res["box"]
#     lt, rt, rb, lb = res["lt"], res["rt"], res["rb"], res["lb"]
#     # Draw the detection box and top-left corner position
#     img = visuallize(img, box, lt, rt, rb, lb)
#     # Perspective transformation to extract the table image
#     wrapped_img = extract_table_img(extract_img.copy(), lt, rt, rb, lb)
#     cv2.imwrite(f"{out_dir}/{file_name}-extract-{i}.jpg", wrapped_img)
# cv2.imwrite(f"{out_dir}/{file_name}-visualize.jpg", img)
```
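
Each entry in `result` exposes the four corner points, so simple post-processing is straightforward. For example, the shoelace formula gives the area of a detected table quadrilateral, useful for discarding tiny spurious detections (a sketch built on the documented corner fields, not part of the library):

```python
def quad_area(lt, rt, rb, lb):
    # Shoelace formula over the corners in lt -> rt -> rb -> lb order.
    pts = [lt, rt, rb, lb]
    s = 0.0
    for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1]):
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0


print(quad_area((0, 0), (100, 0), (100, 50), (0, 50)))  # 5000.0
```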

### Using the PaddlePaddle Version

You must download the models yourself and specify their locations!

```bash
# The default installation is the GPU version; you can override it with the CPU version of paddlepaddle.
pip install rapid-table-det-paddle
```
```python
from rapid_table_det_paddle.inference import TableDetector

img_path = "tests/test_files/chip.jpg"

table_det = TableDetector(
    obj_model_path="models/obj_det_paddle",
    edge_model_path="models/edge_det_paddle",
    cls_model_path="models/cls_det_paddle",
    use_obj_det=True,
    use_edge_det=True,
    use_cls_det=True,
)
result, elapse = table_det(img_path)
obj_det_elapse, edge_elapse, rotate_det_elapse = elapse
print(
    f"obj_det_elapse:{obj_det_elapse}, edge_elapse={edge_elapse}, rotate_det_elapse={rotate_det_elapse}"
)
# Handle more than one table in one image
# import os
# import cv2
# Import img_loader, visuallize, extract_table_img as in the onnx example
# (the paddle package is assumed to ship the same helpers).
#
# img = img_loader(img_path)
# file_name_with_ext = os.path.basename(img_path)
# file_name, file_ext = os.path.splitext(file_name_with_ext)
# out_dir = "rapid_table_det_paddle/outputs"
# if not os.path.exists(out_dir):
#     os.makedirs(out_dir)
# extract_img = img.copy()
# for i, res in enumerate(result):
#     box = res["box"]
#     lt, rt, rb, lb = res["lt"], res["rt"], res["rb"], res["lb"]
#     # Draw the detection box and top-left corner position
#     img = visuallize(img, box, lt, rt, rb, lb)
#     # Perspective transformation to extract the table image
#     wrapped_img = extract_table_img(extract_img.copy(), lt, rt, rb, lb)
#     cv2.imwrite(f"{out_dir}/{file_name}-extract-{i}.jpg", wrapped_img)
# cv2.imwrite(f"{out_dir}/{file_name}-visualize.jpg", img)
```
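
`extract_table_img` applies the perspective transform for you; if you implement your own warp, a common convention for choosing the output crop size is to take the longer of each pair of opposite edges. A small sketch of that convention (an assumption for illustration, not the library's exact logic):

```python
import math


def target_size(lt, rt, rb, lb):
    # Width: longer of top and bottom edges; height: longer of left and right.
    width = max(math.dist(lt, rt), math.dist(lb, rb))
    height = max(math.dist(lt, lb), math.dist(rt, rb))
    return round(width), round(height)


print(target_size((0, 0), (200, 10), (210, 110), (10, 100)))  # (200, 100)
```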

## FAQ (Frequently Asked Questions)

1. **Q: How do I fine-tune the model for a specific scenario?**
   - A: Refer to this project, which provides detailed visualization steps and datasets. You can get the PaddlePaddle inference model from the [Baidu Table Detection Competition](https://aistudio.baidu.com/projectdetail/5398861?searchKeyword=%E8%A1%A8%E6%A0%BC%E6%A3%80%E6%B5%8B%E5%A4%A7%E8%B5%9B&searchTab=ALL). For YOLOv11, use the official script, which is simple enough; convert your data to COCO format and train per the official guidelines.
2. **Q: How do I export ONNX?**
   - A: For PaddlePaddle models, use the `onnx_transform.ipynb` notebook in the `tools` directory of this project. For YOLOv11, follow the official method, which takes a single line.
3. **Q: Can distorted images be corrected?**
   - A: This project only handles rotation and perspective when extracting tables. For distorted (warped) images, correct the distortion first.

### Acknowledgments

- [2nd-Place Solution of the Baidu Table Detection Competition](https://aistudio.baidu.com/projectdetail/5398861?searchKeyword=%E8%A1%A8%E6%A0%BC%E6%A3%80%E6%B5%8B%E5%A4%A7%E8%B5%9B&searchTab=ALL)
- [WTW natural-scene table dataset](https://tianchi.aliyun.com/dataset/108587)
- [FinTabNet PDF document table dataset](https://developer.ibm.com/exchanges/data/all/fintabnet/)
- [TableBank table dataset](https://doc-analysis.github.io/tablebank-page/)
- [TableGeneration table auto-generation tool](https://github.com/WenmuZhou/TableGeneration)

### Contribution Guidelines

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

If you have other good suggestions or integration scenarios, the author will actively respond and support them.

### Open Source License

This project is licensed under the [Apache 2.0](https://github.com/RapidAI/TableStructureRec/blob/c41bbd23898cb27a957ed962b0ffee3c74dfeff1/LICENSE) license.
