Commit a0e5c82

Merge branch 'master' into ocrd-processors
2 parents 81ba7cf + 75796b5 commit a0e5c82

3 files changed: 38 additions and 2 deletions


README.md

Lines changed: 36 additions & 1 deletion
@@ -1,7 +1,11 @@
 # TSV - Processing Tools
 
+Create .tsv files that can be viewed and edited with [neat](https://github.com/qurator-spk/neat).
+
 ## Installation:
 
+Clone this project and the [SBB-utils](https://github.com/qurator-spk/sbb_utils).
+
 Setup virtual environment:
 ```
 virtualenv --python=python3.6 venv
@@ -19,7 +23,8 @@ pip install -U pip
 
 Install package together with its dependencies in development mode:
 ```
-pip install -e ./
+pip install -e sbb_utils
+pip install -e page2tsv
 ```
 
 ## PAGE-XML to TSV Transformation:
@@ -59,3 +64,33 @@ Create a URL-annotated TSV file from an existing TSV file:
 ```
 annotate-tsv enp_DE.tsv enp_DE-annotated.tsv
 ```
+
+# Command-line interface:
+
+```
+page2tsv [OPTIONS] PAGE_XML_FILE TSV_OUT_FILE
+
+Options:
+  --purpose [NERD|OCR]      Purpose of output tsv file.
+
+                            NERD: NER/NED application/ground-truth creation.
+
+                            OCR: OCR application/ground-truth creation.
+
+                            default: NERD.
+  --image-url TEXT
+  --ner-rest-endpoint TEXT  REST endpoint of sbb_ner service. See
+                            https://github.com/qurator-spk/sbb_ner for
+                            details. Only applicable in case of NERD.
+  --ned-rest-endpoint TEXT  REST endpoint of sbb_ned service. See
+                            https://github.com/qurator-spk/sbb_ned for
+                            details. Only applicable in case of NERD.
+  --noproxy                 disable proxy. default: enabled.
+  --scale-factor FLOAT      default: 1.0
+  --ned-threshold FLOAT
+  --min-confidence FLOAT
+  --max-confidence FLOAT
+  --ned-priority INTEGER
+  --help                    Show this message and exit.
+
+```
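
Reviewer note: the options documented in the new command-line section can be combined as in the following sketch. It only assembles an argument list from flags that appear in the help text above (`--purpose`, `--scale-factor`, `--ner-rest-endpoint`, `--noproxy`). The helper name `build_page2tsv_command`, the file names `page.xml` and `page.tsv`, and the check that rejects `--ner-rest-endpoint` outside NERD mode are assumptions made for illustration and are not part of this commit. Actually running the command additionally assumes that `page2tsv` is installed and on the `PATH`.

```python
import shlex
from typing import List, Optional


def build_page2tsv_command(
    page_xml_file: str,
    tsv_out_file: str,
    purpose: str = "NERD",
    scale_factor: Optional[float] = None,
    ner_rest_endpoint: Optional[str] = None,
    noproxy: bool = False,
) -> List[str]:
    """Assemble an argv list for the page2tsv CLI (hypothetical helper)."""
    # The help text lists exactly two purposes.
    if purpose not in ("NERD", "OCR"):
        raise ValueError("purpose must be 'NERD' or 'OCR'")
    # The help text says the NER endpoint is only applicable in case of NERD;
    # rejecting it otherwise is this sketch's choice, not documented behavior.
    if ner_rest_endpoint is not None and purpose != "NERD":
        raise ValueError("--ner-rest-endpoint only applies to purpose NERD")

    cmd = ["page2tsv", "--purpose", purpose]
    if scale_factor is not None:
        cmd += ["--scale-factor", str(scale_factor)]
    if ner_rest_endpoint is not None:
        cmd += ["--ner-rest-endpoint", ner_rest_endpoint]
    if noproxy:
        cmd.append("--noproxy")
    # Positional arguments come last: PAGE_XML_FILE TSV_OUT_FILE.
    cmd += [page_xml_file, tsv_out_file]
    return cmd


if __name__ == "__main__":
    # Placeholder file names; replace with real paths.
    cmd = build_page2tsv_command(
        "page.xml", "page.tsv", purpose="OCR", scale_factor=0.5
    )
    print(shlex.join(cmd))
    # To execute: import subprocess; subprocess.run(cmd, check=True)
```

Building the command as a list keeps file names with spaces intact when the list is later passed to `subprocess.run`, and `shlex.join` gives a copy-pasteable shell form for logging.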

requirements.txt

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
 ocrd >= 2.23.2
 pandas
 matplotlib
-qurator-sbb-tools
+qurator-sbb-utils

tsvtools/cli.py

Lines changed: 1 addition & 0 deletions
@@ -19,6 +19,7 @@
 from qurator.utils.ner import ner
 from qurator.utils.ned import ned
 
+
 @click.command()
 @click.argument('tsv-file', type=click.Path(exists=True), required=True, nargs=1)
 @click.argument('url-file', type=click.Path(exists=False), required=True, nargs=1)
