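  </div>

  ### Recent updates
- - **2024.11.22**
-   - Added a single-character matching scheme; requires RapidOCR>=1.4.0
  - **2024.12.25**
    - Added document dewarping, deblurring, shadow removal, and binarization, which can be used as preprocessing: [RapidUnDistort](https://github.com/Joker1212/RapidUnWrap)
  - **2025.1.9**
-   - RapidTable now supports the unitable model, which is more accurate and supports torch inference; evaluation data added
+   - RapidTable now supports the unitable model, which is more accurate and supports torch inference; evaluation data added
+ - **2025.3.30**
+   - Input and output formats aligned with RapidTable
+   - Models are downloaded automatically
+   - Added a new table classification model from paddle
+   - Added evaluation scores for the latest PaddleX table recognition model
+   - Supports rapidocr 2.0 and removes duplicate OCR detection
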
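  ### Introduction
  💖 This repository is an inference library for structured recognition of tables in documents. It includes the wired and lineless table recognition models from Alibaba DuGuang, a wired table model contributed by llaipython (WeChat), the table classification model built into NetEase QAnything, and others.\
  Surya-Tabled uses its built-in OCR module. Its table model recognizes rows and columns only and cannot detect merged cells, which lowers its score.
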
  | Method | TEDS | TEDS-only-structure |
- | :--- | :---: | :---: |
- | [surya-tabled(--skip-detect)](https://github.com/VikParuchuri/tabled) | 0.33437 | 0.65865 |
- | [surya-tabled](https://github.com/VikParuchuri/tabled) | 0.33940 | 0.67103 |
- | [deepdoctection(table-transformer)](https://github.com/deepdoctection/deepdoctection?tab=readme-ov-file) | 0.59975 | 0.69918 |
- | [ppstructure_table_master](https://github.com/PaddlePaddle/PaddleOCR/tree/main/ppstructure) | 0.61606 | 0.73892 |
- | [ppstructure_table_engine](https://github.com/PaddlePaddle/PaddleOCR/tree/main/ppstructure) | 0.67924 | 0.78653 |
- | [StructEqTable](https://github.com/UniModal4Reasoning/StructEqTable-Deploy) | 0.67310 | 0.81210 |
- | [RapidTable(SLANet)](https://github.com/RapidAI/RapidTable) | 0.71654 | 0.81067 |
- | table_cls + wired_table_rec v1 + lineless_table_rec | 0.75288 | 0.82574 |
- | table_cls + wired_table_rec v2 + lineless_table_rec | 0.77676 | 0.84580 |
- | [RapidTable(SLANet-plus)](https://github.com/RapidAI/RapidTable) | 0.84481 | 0.91369 |
- | [RapidTable(unitable)](https://github.com/RapidAI/RapidTable) | **0.86200** | **0.91813** |
+ | :--- | :---: | :---: |
+ | [surya-tabled(--skip-detect)](https://github.com/VikParuchuri/tabled) | 0.33437 | 0.65865 |
+ | [surya-tabled](https://github.com/VikParuchuri/tabled) | 0.33940 | 0.67103 |
+ | [deepdoctection(table-transformer)](https://github.com/deepdoctection/deepdoctection?tab=readme-ov-file) | 0.59975 | 0.69918 |
+ | [ppstructure_table_master](https://github.com/PaddlePaddle/PaddleOCR/tree/main/ppstructure) | 0.61606 | 0.73892 |
+ | [ppstructure_table_engine](https://github.com/PaddlePaddle/PaddleOCR/tree/main/ppstructure) | 0.67924 | 0.78653 |
+ | [StructEqTable](https://github.com/UniModal4Reasoning/StructEqTable-Deploy) | 0.67310 | 0.81210 |
+ | [RapidTable(SLANet)](https://github.com/RapidAI/RapidTable) | 0.71654 | 0.81067 |
+ | table_cls + wired_table_rec v1 + lineless_table_rec | 0.75288 | 0.82574 |
+ | table_cls + wired_table_rec v2 + lineless_table_rec | 0.77676 | 0.84580 |
+ | [PaddleX(SLANeXt+RT-DETR)](https://github.com/PaddlePaddle/PaddleX) | 0.79900 | **0.92222** |
+ | [RapidTable(SLANet-plus)](https://github.com/RapidAI/RapidTable) | 0.84481 | 0.91369 |
+ | [RapidTable(unitable)](https://github.com/RapidAI/RapidTable) | **0.86200** | 0.91813 |

  ### Usage recommendations
  wired_table_rec_v2 (highest accuracy on wired tables): wired tables in general scenarios (papers, magazines, journals, receipts, vouchers, bills)
@@ -75,63 +80,93 @@ wired_table_rec_v2 works best on images up to 1500px, so resolutions above
  SLANet-plus/unitable (highest overall accuracy): tables in document scenarios (tables in papers, magazines, and journals)

  ### Installation
-
+ rapidocr 2.0 and later supports switching among multiple inference engines (torch, onnx, paddle, openvino). See the [rapidocr documentation](https://rapidai.github.io/RapidOCRDocs/main/install_usage/rapidocr/usage/) for details.
  ```python {linenos=table}
  pip install wired_table_rec lineless_table_rec table_cls
+ pip install rapidocr
  ```

  ### Quick start
-
+ > ⚠️ Note: from `wired_table_rec/table_cls` >= 1.2.0 and `lineless_table_rec` > 0.1.0 onward, the input and output formats are identical to RapidTable's.
  ```python {linenos=table}
- import os
+ from pathlib import Path

- from lineless_table_rec import LinelessTableRecognition
- from lineless_table_rec.utils_table_recover import format_html, plot_rec_box_with_logic_info, plot_rec_box
+ from wired_table_rec.utils.utils import VisTable
  from table_cls import TableCls
- from wired_table_rec import WiredTableRecognition
- from rapidocr_onnxruntime import RapidOCR
-
- lineless_engine = LinelessTableRecognition()
- wired_engine = WiredTableRecognition()
- # The small yolo model is the default (0.1s); switch to the more accurate yolox (0.25s) or the faster qanything (0.07s) model
- table_cls = TableCls()  # TableCls(model_type="yolox"), TableCls(model_type="q")
- img_path = f'images/img14.jpg'
-
- cls, elasp = table_cls(img_path)
- if cls == 'wired':
-     table_engine = wired_engine
- else:
-     table_engine = lineless_engine
-
- html, elasp, polygons, logic_points, ocr_res = table_engine(img_path)
- print(f"elasp: {elasp}")
-
- # Use a different OCR model
- # ocr_engine = RapidOCR(det_model_path="xxx/det_server_infer.onnx", rec_model_path="xxx/rec_server_infer.onnx")
- # ocr_res, _ = ocr_engine(img_path)
- # html, elasp, polygons, logic_points, ocr_res = table_engine(img_path, ocr_result=ocr_res)
- # output_dir = f'outputs'
- # complete_html = format_html(html)
- # os.makedirs(os.path.dirname(f"{output_dir}/table.html"), exist_ok=True)
- # with open(f"{output_dir}/table.html", "w", encoding="utf-8") as file:
- #     file.write(complete_html)
- # # Visualize the table recognition boxes plus logical row/column info
- # plot_rec_box_with_logic_info(
- #     img_path, f"{output_dir}/table_rec_box.jpg", logic_points, polygons
- # )
- # # Visualize the OCR recognition boxes
- # plot_rec_box(img_path, f"{output_dir}/ocr_box.jpg", ocr_res)
+ from wired_table_rec.main import WiredTableInput, WiredTableRecognition
+ from lineless_table_rec.main import LinelessTableInput, LinelessTableRecognition
+ from rapidocr import RapidOCR
+
+
+ if __name__ == "__main__":
+     # Init
+     wired_input = WiredTableInput()
+     lineless_input = LinelessTableInput()
+     wired_engine = WiredTableRecognition(wired_input)
+     lineless_engine = LinelessTableRecognition(lineless_input)
+     viser = VisTable()
+     # The small yolo model is the default (0.1s); switch to the more accurate yolox (0.25s), the faster qanything (0.07s), or the paddle model (0.03s)
+     table_cls = TableCls()
+     img_path = f"tests/test_files/table.jpg"
+
+     cls, elasp = table_cls(img_path)
+     if cls == "wired":
+         table_engine = wired_engine
+     else:
+         table_engine = lineless_engine
+
+     # Pass RapidOCR output as input
+     ocr_engine = RapidOCR()
+     rapid_ocr_output = ocr_engine(img_path, return_word_box=True)
+     ocr_result = list(
+         zip(rapid_ocr_output.boxes, rapid_ocr_output.txts, rapid_ocr_output.scores)
+     )
+     table_results = table_engine(
+         img_path, ocr_result=ocr_result
+     )
+
+     # Use single-character recognition
+     # word_results = rapid_ocr_output.word_results
+     # ocr_result = [
+     #     [word_result[2], word_result[0], word_result[1]] for word_result in word_results
+     # ]
+     # table_results = table_engine(
+     #     img_path, ocr_result=ocr_result, enhance_box_line=False
+     # )
+
+     # Save
+     # save_dir = Path("outputs")
+     # save_dir.mkdir(parents=True, exist_ok=True)
+     #
+     # save_html_path = f"outputs/{Path(img_path).stem}.html"
+     # save_drawed_path = f"outputs/{Path(img_path).stem}_table_vis{Path(img_path).suffix}"
+     # save_logic_path = (
+     #     f"outputs/{Path(img_path).stem}_table_vis_logic{Path(img_path).suffix}"
+     # )
+
+     # Visualize table rec result
+     # vis_imged = viser(
+     #     img_path, table_results, save_html_path, save_drawed_path, save_logic_path
+     # )
+
  ```
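
The quick-start code hands OCR results to the table engine as a list of (box, text, score) triples built with `zip`. The sketch below shows only that reshaping step. `FakeOCROutput` is a hypothetical stand-in for the object RapidOCR returns, carrying just the three fields used above, so the example runs without any model or image.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = List[List[float]]  # four (x, y) corner points of one text box


@dataclass
class FakeOCROutput:
    """Hypothetical stand-in for the RapidOCR result object (assumption)."""
    boxes: List[Box]
    txts: List[str]
    scores: List[float]


def to_ocr_result(output: FakeOCROutput) -> List[Tuple[Box, str, float]]:
    """Pair each box with its text and score, as the quick start does with zip()."""
    return list(zip(output.boxes, output.txts, output.scores))


sample = FakeOCROutput(
    boxes=[
        [[0, 0], [40, 0], [40, 12], [0, 12]],
        [[50, 0], [90, 0], [90, 12], [50, 12]],
    ],
    txts=["Name", "Score"],
    scores=[0.98, 0.95],
)
ocr_result = to_ocr_result(sample)
```

Each element of `ocr_result` keeps the box first, then the text, then the score, which is the order the quick-start code passes as `ocr_result=`.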

  #### Single-character OCR matching
+
  ```python
  # Convert single-character boxes into the same structure as line-level recognition results
- from rapidocr_onnxruntime import RapidOCR
- from wired_table_rec.utils_table_recover import trans_char_ocr_res
+ from rapidocr import RapidOCR
  img_path = "tests/test_files/wired/table4.jpg"
- ocr_engine = RapidOCR()
- ocr_res, _ = ocr_engine(img_path, return_word_box=True)
- ocr_res = trans_char_ocr_res(ocr_res)
+ ocr_engine = RapidOCR()
+ rapid_ocr_output = ocr_engine(img_path, return_word_box=True)
+ word_results = rapid_ocr_output.word_results
+ ocr_result = [
+     [word_result[2], word_result[0], word_result[1]] for word_result in word_results
+ ]
  ```
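
The list comprehension above reorders each word-level entry by index. A minimal sketch of the same reordering on made-up data follows, assuming each entry of `word_results` is ordered (text, score, box). That ordering is inferred from the indices `[2]`, `[0]`, `[1]` used above and has not been checked against the rapidocr documentation; `words_to_ocr_result` is a hypothetical helper name.

```python
from typing import List, Sequence, Tuple

Box = List[List[float]]  # four (x, y) corner points


def words_to_ocr_result(word_results: Sequence[Tuple[str, float, Box]]) -> List[list]:
    """Reorder (text, score, box) entries into [box, text, score] (assumed layout)."""
    return [[box, text, score] for text, score, box in word_results]


# Made-up word-level results for two characters
word_results = [
    ("A", 0.99, [[0, 0], [10, 0], [10, 10], [0, 10]]),
    ("B", 0.97, [[12, 0], [22, 0], [22, 10], [12, 10]]),
]
ocr_result = words_to_ocr_result(word_results)
```

Unpacking by name instead of by index makes the assumed field order explicit, so a mismatch with the real output is easier to spot.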

  #### Table rotation and perspective correction
@@ -177,24 +212,53 @@ for i, res in enumerate(result):

  ### Core parameters
  ```python
- wired_table_rec = WiredTableRecognition()
- html, elasp, polygons, logic_points, ocr_res = wired_table_rec(
+ # Input (WiredTableInput/LinelessTableInput)
+ @dataclass
+ class WiredTableInput:
+     model_type: Optional[str] = "unet"  # unet/cycle_center_net
+     model_path: Union[str, Path, None, Dict[str, str]] = None
+     use_cuda: bool = False
+     device: str = "cpu"
+
+ @dataclass
+ class LinelessTableInput:
+     model_type: Optional[str] = "lore"  # lore
+     model_path: Union[str, Path, None, Dict[str, str]] = None
+     use_cuda: bool = False
+     device: str = "cpu"
+
+ # Output (WiredTableOutput/LinelessTableOutput)
+ @dataclass
+ class WiredTableOutput:
+     pred_html: Optional[str] = None
+     cell_bboxes: Optional[np.ndarray] = None
+     logic_points: Optional[np.ndarray] = None
+     elapse: Optional[float] = None
+
+ @dataclass
+ class LinelessTableOutput:
+     pred_html: Optional[str] = None
+     cell_bboxes: Optional[np.ndarray] = None
+     logic_points: Optional[np.ndarray] = None
+     elapse: Optional[float] = None
+ ```
+
+ ```python
+ wired_table_rec = WiredTableRecognition(WiredTableInput())
+ table_results = wired_table_rec(
      img,  # image: Union[str, np.ndarray, bytes, Path, PIL.Image.Image]
      ocr_result,  # RapidOCR recognition result; if omitted, the built-in rapidocr model is used
-     version="v2",  # uses the v2 line-frame model by default; set to v1 to switch to the Alibaba DuGuang model
      enhance_box_line=True,  # enhanced splitting of recognition boxes (off avoids extra splits, on reduces missed splits); default True
      col_threshold=15,  # boxes whose left-edge x coordinates differ by less than col_threshold are treated as the same column
      row_threshold=10,  # boxes whose top-edge y coordinates differ by less than row_threshold are treated as the same row
      rotated_fix=True,  # supported by wiredV2; corrects slight rotation (-45° to 45°); default True
      need_ocr=True,  # whether to run OCR; default True
-     rec_again=True,  # whether to crop table cells with no recognized text and recognize them again; default True
  )
- lineless_table_rec = LinelessTableRecognition()
- html, elasp, polygons, logic_points, ocr_res = lineless_table_rec(
+ lineless_table_rec = LinelessTableRecognition(LinelessTableInput())
+ table_results = lineless_table_rec(
      img,  # image: Union[str, np.ndarray, bytes, Path, PIL.Image.Image]
      ocr_result,  # RapidOCR recognition result; if omitted, the built-in rapidocr model is used
      need_ocr=True,  # whether to run OCR; default True
-     rec_again=True,  # whether to crop table cells with no recognized text and recognize them again; default True
  )
  ```
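
The `col_threshold` and `row_threshold` comments describe grouping boxes whose edge coordinates differ by less than a threshold. The sketch below illustrates one simple way such grouping could work: sort the coordinates and start a new group once a value is at least `threshold` away from the first value of the current group. It is an illustration only and is not taken from the wired_table_rec source; `group_by_threshold` and the sample coordinates are hypothetical.

```python
from typing import List, Optional


def group_by_threshold(values: List[float], threshold: float) -> List[int]:
    """Assign a group index to each value.

    Values are visited in ascending order; a value joins the current group
    when it is less than `threshold` above that group's first value.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    groups = [0] * len(values)
    current = -1
    anchor: Optional[float] = None
    for i in order:
        if anchor is None or values[i] - anchor >= threshold:
            current += 1          # open a new column/row
            anchor = values[i]    # its first value becomes the reference
        groups[i] = current
    return groups


# Left-edge x coordinates of five cell boxes, grouped into columns
left_edges = [12.0, 101.0, 15.0, 98.0, 210.0]
cols = group_by_threshold(left_edges, threshold=15)

# Top-edge y coordinates of the same boxes, grouped into rows
top_edges = [5.0, 7.0, 40.0, 44.0, 41.0]
rows = group_by_threshold(top_edges, threshold=10)
```

With these inputs `cols` is `[0, 1, 0, 1, 2]` and `rows` is `[0, 0, 1, 1, 1]`. A larger threshold merges neighbouring columns or rows, while a smaller one splits them, which is the trade-off the two parameters expose.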
200264
@@ -225,7 +289,7 @@ html, elasp, polygons, logic_points, ocr_res = lineless_table_rec(
  ```mermaid
  flowchart TD
      A[/Table image/] --> B([Table classification table_cls])
-     B --> C([Wired table recognition wired_table_rec]) & D([Lineless table recognition lineless_table_rec]) --> E([Text recognition rapidocr_onnxruntime])
+     B --> C([Wired table recognition wired_table_rec]) & D([Lineless table recognition lineless_table_rec]) --> E([Text recognition rapidocr])
      E --> F[/Structured HTML output/]
  ```
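
The flowchart's classify-then-route step can be sketched with plain callables. Everything below is a stand-in written for illustration: `recognize_table`, `fake_classify`, and the two fake engines are hypothetical and do not call table_cls, wired_table_rec, or lineless_table_rec.

```python
from typing import Callable, Dict, Tuple


def recognize_table(
    img_path: str,
    classify: Callable[[str], Tuple[str, float]],
    engines: Dict[str, Callable[[str], str]],
) -> str:
    """Classify the table image, then route it to the matching recognizer."""
    label, _elapsed = classify(img_path)
    # Any label other than "wired" falls back to the lineless engine
    engine = engines["wired"] if label == "wired" else engines["lineless"]
    return engine(img_path)


def fake_classify(img_path: str) -> Tuple[str, float]:
    """Stand-in classifier: decides from the file name instead of the pixels."""
    return ("wired" if "wired" in img_path else "lineless", 0.0)


def fake_wired_engine(img_path: str) -> str:
    return "<table data-kind='wired'></table>"


def fake_lineless_engine(img_path: str) -> str:
    return "<table data-kind='lineless'></table>"


engines = {"wired": fake_wired_engine, "lineless": fake_lineless_engine}
wired_html = recognize_table("tests/wired_sample.jpg", fake_classify, engines)
lineless_html = recognize_table("tests/plain_sample.jpg", fake_classify, engines)
```

Passing the classifier and engines in as arguments keeps the routing logic testable without loading any model.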