Commit 3947265

[fix] readme fixes
1 parent faacb7a commit 3947265

7 files changed, +23 -8 lines changed

.env.example

Lines changed: 2 additions & 0 deletions
@@ -3,6 +3,7 @@ REDIS_CACHE_URL=redis://redis:6379/1
 OLLAMA_HOST=http://ollama:11434
 STORAGE_PROFILE_PATH=./storage_profiles
 LLAMA_VISION_PROMPT="You are OCR. Convert image to markdown."
+REMOTE_API_URL=
 
 # CLI settings
 OCR_URL=http://localhost:8000/ocr/upload
@@ -15,3 +16,4 @@ LOAD_FILE_URL=http://localhost:8000/storage/load
 DELETE_FILE_URL=http://localhost:8000/storage/delete
 OCR_REQUEST_URL=http://localhost:8000/ocr/request
 OCR_UPLOAD_URL=http://localhost:8000/ocr/upload
+

.env.localhost.example

Lines changed: 1 addition & 0 deletions
@@ -2,6 +2,7 @@
 REDIS_CACHE_URL=redis://localhost:6379/1
 LLAMA_VISION_PROMPT="You are OCR. Convert image to markdown."
 DISABLE_LOCAL_OLLAMA=0
+REMOTE_API_URL=
 
 # CLI settings
 OCR_URL=http://localhost:8000/ocr/upload

README.md

Lines changed: 12 additions & 4 deletions
@@ -8,7 +8,7 @@ The API is built with FastAPI and uses Celery for asynchronous task processing.
 
 ## Features:
 - **No Cloud/external dependencies** all you need: PyTorch based OCR (EasyOCR) + Ollama are shipped and configured via `docker-compose` no data is sent outside your dev/server environment,
-- **PDF/Office to Markdown** conversion with very high accuracy using different OCR strategies including [llama3.2-vision](https://ai.meta.com/blog/llama-3-2-connect-2024-vision-edge-mobile-devices/), [easyOCR](https://github.com/JaidedAI/EasyOCR), [minicpm-v](https://github.com/OpenBMB/MiniCPM-o?tab=readme-ov-file#minicpm-v-26), [marker-pdf](https://github.com/VikParuchuri/marker)
+- **PDF/Office to Markdown** conversion with very high accuracy using different OCR strategies including [llama3.2-vision](https://ai.meta.com/blog/llama-3-2-connect-2024-vision-edge-mobile-devices/), [easyOCR](https://github.com/JaidedAI/EasyOCR), [minicpm-v](https://github.com/OpenBMB/MiniCPM-o?tab=readme-ov-file#minicpm-v-26), remote URL strategies including [marker-pdf](https://github.com/VikParuchuri/marker)
 - **PDF/Office to JSON** conversion using Ollama supported models (eg. LLama 3.1)
 - **LLM Improving OCR results** LLama is pretty good with fixing spelling and text issues in the OCR text
 - **Removing PII** This tool can be used for removing Personally Identifiable Information out of document - see `examples`
@@ -215,12 +215,20 @@ pip install -U uvicorn fastapi python-multipart
 marker_server --port 8002
 ```
 
-**Note: *** you might run `marker_server` on different port - then just make sure you export a proper env setting beffore starting off `text-extract-api` server:
+Set the Remote API Url:
+
+**Note: *** you might run `marker_server` on different port or server - then just make sure you export a proper env setting beffore starting off `text-extract-api` server:
 
 ```bash
 export REMOTE_API_URL=http://localhost:8002/marker/upload
 ```
 
+Run the `text-extract-api`:
+
+```bash
+make run
+```
+
 Please do use the `strategy=remote` CLI and URL parameters to use it. For example:
 
 ```bash
@@ -477,7 +485,7 @@ apiClient.uploadFile(formData).then(response => {
 - **Method**: POST
 - **Parameters**:
 - **file**: PDF, image or Office file to be processed.
-- **strategy**: OCR strategy to use (`llama_vision`, `minicpm_v`, `marker` or `easyocr`). See the [available strategies](#text-extract-stratgies)
+- **strategy**: OCR strategy to use (`llama_vision`, `minicpm_v`, `remote` or `easyocr`). See the [available strategies](#text-extract-stratgies)
 - **ocr_cache**: Whether to cache the OCR result (true or false).
 - **prompt**: When provided, will be used for Ollama processing the OCR result
 - **model**: When provided along with the prompt - this model will be used for LLM processing
@@ -496,7 +504,7 @@ curl -X POST -H "Content-Type: multipart/form-data" -F "file=@examples/example-m
 - **Method**: POST
 - **Parameters** (JSON body):
 - **file**: Base64 encoded PDF file content.
-- **strategy**: OCR strategy to use (`llama_vision`, `minicpm_v`, marker or `easyocr`). See the [available strategies](#text-extract-stratgies)
+- **strategy**: OCR strategy to use (`llama_vision`, `minicpm_v`, `remote` or `easyocr`). See the [available strategies](#text-extract-stratgies)
 - **ocr_cache**: Whether to cache the OCR result (true or false).
 - **prompt**: When provided, will be used for Ollama processing the OCR result.
 - **model**: When provided along with the prompt - this model will be used for LLM processing.
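
For illustration, the JSON body described in the last README hunk could be assembled as follows. This is a minimal sketch and not code from the commit: `build_ocr_request` is a hypothetical helper, the field names come from the parameter list above, and sending `ocr_cache` as a JSON boolean is an assumption. Nothing is sent over the network.

```python
import base64
import json


def build_ocr_request(pdf_bytes: bytes, strategy: str = "remote", ocr_cache: bool = True) -> str:
    """Return a JSON body for POST /ocr/request (hypothetical helper, not part of the repo)."""
    payload = {
        # The README states that `file` carries Base64 encoded PDF content.
        "file": base64.b64encode(pdf_bytes).decode("ascii"),
        # After this commit the strategy is selected as `remote`, no longer `marker`.
        "strategy": strategy,
        # Assumption: the flag is accepted as a JSON boolean.
        "ocr_cache": ocr_cache,
    }
    return json.dumps(payload)


body = build_ocr_request(b"%PDF-1.4 example bytes")
```

The resulting string could then be posted to the `OCR_REQUEST_URL` from `.env.example` with any HTTP client.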

config/strategies.yaml

Lines changed: 1 addition & 1 deletion
@@ -6,4 +6,4 @@ strategies:
   easyocr:
     class: text_extract_api.extract.strategies.easyocr.EasyOCRStrategy
   remote:
-    class: text_extract_api.extract.strategies.marker.RemoteStrategy
+    class: text_extract_api.extract.strategies.remote.RemoteStrategy
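
The `class:` value has to follow the module rename, because a dotted path like this is typically resolved by importing the module and then looking up the class. The commit does not show how `text-extract-api` loads strategies, so the following is only a sketch of the general technique: `load_class` is a hypothetical helper, and a standard-library class stands in for `RemoteStrategy`.

```python
import importlib


def load_class(dotted_path: str) -> type:
    """Resolve 'package.module.ClassName' to the class object (hypothetical helper)."""
    # Split off the final component: everything before it is the module path.
    module_name, _, class_name = dotted_path.rpartition(".")
    # import_module raises ModuleNotFoundError for a stale path such as the old `...marker` module.
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


# A standard-library class is used here so the sketch runs without the project installed.
strategy_class = load_class("collections.OrderedDict")
```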

docker-compose.gpu.yml

Lines changed: 1 addition & 0 deletions
@@ -19,6 +19,7 @@ services:
       - LOAD_FILE_URL=${LOAD_FILE_URL-http://localhost:8000/storage/load}
       - DELETE_FILE_URL=${DELETE_FILE_URL-http://localhost:8000/storage/delete}
       - LLAMA_VISION_PROMPT=${LLAMA_VISION_PROMPT-"You are OCR. Convert image to markdown."}
+      - REMOTE_API_URL=${REMOTE_API_URL}
     depends_on:
       - redis
       - ollama

docker-compose.yml

Lines changed: 1 addition & 0 deletions
@@ -19,6 +19,7 @@ services:
       - LOAD_FILE_URL=${LOAD_FILE_URL-http://localhost:8000/storage/load}
       - DELETE_FILE_URL=${DELETE_FILE_URL-http://localhost:8000/storage/delete}
       - LLAMA_VISION_PROMPT=${LLAMA_VISION_PROMPT-"You are OCR. Convert image to markdown."}
+      - REMOTE_API_URL=${REMOTE_API_URL}
     depends_on:
       - redis
       - ollama

text_extract_api/extract/strategies/marker.py renamed to text_extract_api/extract/strategies/remote.py

Lines changed: 5 additions & 3 deletions
@@ -16,7 +16,7 @@ class RemoteStrategy(Strategy):
 
     @classmethod
     def name(cls) -> str:
-        return "marker"
+        return "remote"
 
     def extract_text(self, file_format: FileFormat, language: str = 'en') -> ExtractResult:
 
@@ -40,7 +40,9 @@ def extract_text(self, file_format: FileFormat, language: str = 'en') -> Extract
             raise ValueError("No PDF file found - conversion error.")
 
         try:
-            url = os.getenv("REMOTE_API_URL", "http://localhost:8002/marker/upload")
+            url = os.getenv("REMOTE_API_URL", "")
+            if not url:
+                raise Exception('Please do set the REMOTE_API_URL environment variable: export REMOTE_API_URL=http://...')
             files = {'file': ('document.pdf', pdf_files[0].binary, 'application/pdf')}
             data = {
                 'page_range': None,
@@ -64,6 +66,6 @@ def extract_text(self, file_format: FileFormat, language: str = 'en') -> Extract
             extracted_text = response.json().get('output', '')
         except Exception as e:
             print('Error:', e)
-            raise Exception("Failed to generate text with Marker PDF API. Make sure marker-pdf server is up and running: marker_server --port 8002. Details: https://github.com/VikParuchuri/marker")
+            raise Exception("Failed to generate text with Remote API. Make sure the remote server is up and running")
 
         return ExtractResult.from_text(extracted_text)
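
One detail worth noting in the hunks above: the new `REMOTE_API_URL` check sits inside the `try:` block, so its message appears to be caught by `except Exception as e`, printed, and then replaced by the generic "Failed to generate text with Remote API" error. A possible alternative, sketched here under assumptions, is to validate the variable before entering the `try`. `resolve_remote_url` is a hypothetical helper and is not part of the commit.

```python
import os
from typing import Mapping


def resolve_remote_url(env: Mapping[str, str] = os.environ) -> str:
    """Return REMOTE_API_URL or fail fast with a specific message (hypothetical helper)."""
    url = env.get("REMOTE_API_URL", "")
    if not url:
        # Raised outside any broad try/except, so the caller sees this exact message.
        raise RuntimeError(
            "REMOTE_API_URL is not set: export REMOTE_API_URL=http://..."
        )
    return url


# An empty mapping simulates an environment where the variable is missing.
try:
    resolve_remote_url({})
except RuntimeError as exc:
    error_message = str(exc)

# The URL below is the example value from the README hunk.
configured = resolve_remote_url({"REMOTE_API_URL": "http://localhost:8002/marker/upload"})
```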
