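Question for the experts: after OCR has recognized the content, is there a model or method that solves the semantic alignment problem of "unstructured text → structured fields"? #15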

@fssgh

1. Recognize the nameplate image with an OCR engine, for example:

[Image]

[Image]

2. Pass the OCR-recognized nameplate content through some lightweight model that converts it into a fixed JSON structure, for example:
{"manufacturer": "江苏科兴电器有限公司","phone": "0523-87565243","address": "江苏省泰兴市根思乡根思产业园区文昌路 131","manufactureDate": "2021 年 11 月","serialNumber": "211123021","insulationClass": "E 级","cosφ": "0.8","standard": "GB/T20840.1.2","shortTimeThermalCurrent": "31.5kA/s","dynamicStableCurrent": "80kA","model": "LZZBJ9-10E2","equipmentType": "电流互感器","frequency": "50Hz","ratedInsulationLevel": "12/42/75 (kV)","certification": "MC","manufacturingStandard": "苏制 12831412","secondaryWireMarks": {"1S1-1S2": {"ratedCurrentRatio": "300/1","accuracyClass": "10P40","ratedOutput": "7.5 (VA)"},"2S1-2S2": {"ratedCurrentRatio": "300/1","accuracyClass": "10P40","ratedOutput": "7.5 (VA)"},"3S1-3S2": {"ratedCurrentRatio": "300/1","accuracyClass": "10P40","ratedOutput": "7.5 (VA)"},"4S1-4S2": {"ratedCurrentRatio": "300/1","accuracyClass": "0.5","ratedOutput": "10 (VA)"},"5S1-5S2": {"ratedCurrentRatio": "300/1","accuracyClass": "0.2S","ratedOutput": "5 (VA)"}}}
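
To make the target mapping concrete, here is a minimal rule-based baseline sketch in Python. It is not a proposed solution, only an illustration of the "OCR text → fixed keys" step. It assumes the OCR engine returns one `label: value` pair per line; the `FIELD_ALIASES` table, the label strings in it, and the function names are all hypothetical. Only the target keys and sample values come from the JSON above.

```python
import json
import re

# Hypothetical alias table: target JSON key -> label texts that might
# appear on a nameplate. The labels are assumptions, not taken from the image.
FIELD_ALIASES = {
    "model": ["型号", "model"],
    "frequency": ["额定频率", "频率", "frequency"],
    "serialNumber": ["出厂编号", "serial no"],
    "insulationClass": ["绝缘等级", "insulation class"],
    "standard": ["标准代号", "standard"],
}


def normalize(label):
    """Remove all whitespace and lowercase, so '型 号' matches '型号'."""
    return re.sub(r"\s+", "", label).lower()


def build_lookup(aliases):
    """Invert the alias table into {normalized label: target key}."""
    lookup = {}
    for key, labels in aliases.items():
        for label in labels:
            lookup[normalize(label)] = key
    return lookup


def extract_fields(ocr_lines, aliases=FIELD_ALIASES):
    """Map 'label: value' OCR lines onto the fixed target keys.

    Lines without a colon, or with an unknown label, are ignored.
    """
    lookup = build_lookup(aliases)
    result = {}
    for line in ocr_lines:
        # Split on the first half-width or full-width colon.
        parts = re.split(r"[::]", line, maxsplit=1)
        if len(parts) != 2:
            continue
        key = lookup.get(normalize(parts[0]))
        value = parts[1].strip()
        if key is not None and value:
            result[key] = value
    return result


if __name__ == "__main__":
    # Hypothetical OCR output; the values are copied from the JSON above.
    sample = ["型 号: LZZBJ9-10E2", "额定频率:50Hz", "出厂编号: 211123021"]
    print(json.dumps(extract_fields(sample), ensure_ascii=False, indent=2))
```

A baseline like this only covers labels that are listed in advance and printed on the same line as their value. Nested structures such as `secondaryWireMarks`, table layouts, and unlabeled values are not handled, and that gap is the semantic alignment problem this question is about.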
