Commit 939522a

committed
added download model script download.py - Adithya S K
1 parent 4429e95 commit 939522a

3 files changed: +41 -7 lines changed

README.md

Lines changed: 19 additions & 6 deletions
@@ -95,6 +95,17 @@ python server.py --host 0.0.0.0 --port 8000 --documents --media --web
 - `--media`: Load in Whisper model to transcribe audio and video files.
 - `--web`: Set up selenium crawler.
 
+Download Models:
+If you want to download the models before starting the server
+
+```bash
+python download.py --documents --media --web
+```
+
+- `--documents`: Load in all the models that help you parse and ingest documents (Surya OCR series of models and Florence-2).
+- `--media`: Load in Whisper model to transcribe audio and video files.
+- `--web`: Set up selenium crawler.
+
 ## Supported Data Types
 
 | Type | Supported Extensions |
@@ -280,14 +291,16 @@ Arguments:
 ## Limitations
 There is a need for a GPU with 8~10 GB minimum VRAM as we are using deep learning models.
 \
+
 Document Parsing Limitations
 \
-[Marker](https://github.com/VikParuchuri/marker) which is the underlying PDF parser will not convert 100% of equations to LaTeX because it has to detect and then convert them.
-Tables are not always formatted 100% correctly; text can be in the wrong column.
-Whitespace and indentations are not always respected.
-Not all lines/spans will be joined properly.
-This works best on digital PDFs that won't require a lot of OCR. It's optimized for speed, and limited OCR is used to fix errors.
-To fit all the models in the GPU, we are using the smallest variants, which might not offer the best-in-class performance.
+- [Marker](https://github.com/VikParuchuri/marker) which is the underlying PDF parser will not convert 100% of equations to LaTeX because it has to detect and then convert them.
+- It is good at parsing english but might struggle for languages such as Chinese
+- Tables are not always formatted 100% correctly; text can be in the wrong column.
+- Whitespace and indentations are not always respected.
+- Not all lines/spans will be joined properly.
+- This works best on digital PDFs that won't require a lot of OCR. It's optimized for speed, and limited OCR is used to fix errors.
+- To fit all the models in the GPU, we are using the smallest variants, which might not offer the best-in-class performance.
 
 ## License
 OmniParse is licensed under the GPL-3.0 license. See `LICENSE` for more information.

download.py

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+"""
+Script to download models
+"""
+import argparse
+from omniparse import load_omnimodel
+
+def download_models():
+
+    parser = argparse.ArgumentParser(description="Download models for omniparse")
+
+    parser.add_argument("--documents", action='store_true', help="Load document models")
+    parser.add_argument("--media", action='store_true', help="Load media models")
+    parser.add_argument("--web", action='store_true', help="Load web models")
+    args = parser.parse_args()
+
+
+    load_omnimodel(args.documents, args.media, args.web)
+
+
+if __name__ == '__main__':
+    download_models()
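
Review note: the script parses `sys.argv` and calls `load_omnimodel` directly, so it cannot be exercised without triggering the downloads. A minimal sketch of one possible restructuring follows. It assumes, as the diff suggests, that `load_omnimodel` takes the three booleans positionally. The `build_parser` helper and the `argv` and `loader` parameters are hypothetical additions for illustration and are not part of this commit.

```python
import argparse
from typing import Callable, Optional, Sequence


def build_parser() -> argparse.ArgumentParser:
    """Build the same three-flag parser that download.py defines."""
    parser = argparse.ArgumentParser(description="Download models for omniparse")
    parser.add_argument("--documents", action="store_true", help="Load document models")
    parser.add_argument("--media", action="store_true", help="Load media models")
    parser.add_argument("--web", action="store_true", help="Load web models")
    return parser


def download_models(
    argv: Optional[Sequence[str]] = None,
    loader: Optional[Callable[[bool, bool, bool], object]] = None,
) -> argparse.Namespace:
    """Parse the flags and hand them to the loader.

    `argv` and `loader` are hypothetical test hooks. With both left as None,
    sys.argv is parsed and omniparse.load_omnimodel is used, as in the commit.
    """
    args = build_parser().parse_args(argv)
    if loader is None:
        # Imported lazily so the parser can be tested without omniparse installed.
        from omniparse import load_omnimodel as loader
    loader(args.documents, args.media, args.web)
    return args
```

With this shape, a test can pass a recording stub as `loader` and an explicit `argv` list, while the command-line entry point would still call `download_models()` with no arguments.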

server.py

Lines changed: 1 addition & 1 deletion
@@ -34,7 +34,7 @@ def add(app: FastAPI):
 
 def main():
     # Parse command-line arguments
-    parser = argparse.ArgumentParser(description="Run the marker-api server.")
+    parser = argparse.ArgumentParser(description="Run the omniparse server.")
     parser.add_argument("--host", default="0.0.0.0", help="Host IP address")
     parser.add_argument("--port", type=int, default=8000, help="Port number")
     parser.add_argument("--documents", action='store_true', help="Load document models")
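
Review note: the `description` string is what `argparse` prints near the top of `--help`, so this one-line change is user-visible. The sketch below recreates only the arguments visible in the hunk above to show where the text surfaces. The `build_server_parser` helper is a hypothetical name, and the real `server.py` defines further arguments that are not shown in this diff.

```python
import argparse


def build_server_parser() -> argparse.ArgumentParser:
    """Recreate only the arguments visible in the server.py hunk."""
    parser = argparse.ArgumentParser(description="Run the omniparse server.")
    parser.add_argument("--host", default="0.0.0.0", help="Host IP address")
    parser.add_argument("--port", type=int, default=8000, help="Port number")
    parser.add_argument("--documents", action="store_true", help="Load document models")
    return parser


parser = build_server_parser()
# format_help() returns the same text that `--help` would print.
help_text = parser.format_help()
# Parsing an empty argument list yields the declared defaults.
defaults = parser.parse_args([])
```

Printing `help_text` shows the new description between the usage line and the option list.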