Commit 99a3a0a

[feat]: readme
1 parent 8d63383 commit 99a3a0a

1 file changed

+3
-3
lines changed

README.md

Lines changed: 3 additions & 3 deletions
@@ -8,7 +8,7 @@ The API is built with FastAPI and uses Celery for asynchronous task processing.

 ## Features:
 - **No Cloud/external dependencies** all you need: PyTorch based OCR (EasyOCR) + Ollama are shipped and configured via `docker-compose` no data is sent outside your dev/server environment,
-- **PDF/Office to Markdown** conversion with very high accuracy using different OCR strategies including [llama3.2-vision](https://ai.meta.com/blog/llama-3-2-connect-2024-vision-edge-mobile-devices/), [easyOCR](https://github.com/JaidedAI/EasyOCR)
+- **PDF/Office to Markdown** conversion with very high accuracy using different OCR strategies including [llama3.2-vision](https://ai.meta.com/blog/llama-3-2-connect-2024-vision-edge-mobile-devices/), [easyOCR](https://github.com/JaidedAI/EasyOCR), [minicpm-v](https://github.com/OpenBMB/MiniCPM-o?tab=readme-ov-file#minicpm-v-26)
 - **PDF/Office to JSON** conversion using Ollama supported models (eg. LLama 3.1)
 - **LLM Improving OCR results** LLama is pretty good with fixing spelling and text issues in the OCR text
 - **Removing PII** This tool can be used for removing Personally Identifiable Information out of document - see `examples`
@@ -410,7 +410,7 @@ apiClient.uploadFile(formData).then(response => {
 - **Method**: POST
 - **Parameters**:
 - **file**: PDF, image or Office file to be processed.
-- **strategy**: OCR strategy to use (`llama_vision` or `easyocr`).
+- **strategy**: OCR strategy to use (`llama_vision`, `minicpm_v` or `easyocr`).
 - **ocr_cache**: Whether to cache the OCR result (true or false).
 - **prompt**: When provided, will be used for Ollama processing the OCR result
 - **model**: When provided along with the prompt - this model will be used for LLM processing
@@ -429,7 +429,7 @@ curl -X POST -H "Content-Type: multipart/form-data" -F "file=@examples/example-m
 - **Method**: POST
 - **Parameters** (JSON body):
 - **file**: Base64 encoded PDF file content.
-- **strategy**: OCR strategy to use (`llama_vision` or `easyocr`).
+- **strategy**: OCR strategy to use (`llama_vision`, `minicpm_v` or `easyocr`).
 - **ocr_cache**: Whether to cache the OCR result (true or false).
 - **prompt**: When provided, will be used for Ollama processing the OCR result.
 - **model**: When provided along with the prompt - this model will be used for LLM processing.
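
For reviewers who want to try the new `minicpm_v` strategy, below is a minimal sketch of how a client might assemble the JSON body described in the last hunk. The helper name `build_ocr_request` is hypothetical and not part of this repository. The sketch also assumes that `ocr_cache` is sent as a JSON boolean and that `model` is only meaningful together with `prompt`, as the README wording suggests. The endpoint URL is not visible in this diff, so the example stops at building the payload.

```python
import base64
import json

# Strategy names as listed in the updated README text.
ALLOWED_STRATEGIES = ("llama_vision", "minicpm_v", "easyocr")


def build_ocr_request(file_bytes, strategy="minicpm_v", ocr_cache=True,
                      prompt=None, model=None):
    """Hypothetical helper: build the JSON body for the Base64 upload endpoint."""
    if strategy not in ALLOWED_STRATEGIES:
        raise ValueError(f"unknown OCR strategy: {strategy!r}")

    body = {
        # The README asks for Base64 encoded file content.
        "file": base64.b64encode(file_bytes).decode("ascii"),
        "strategy": strategy,
        # Assumption: the API accepts a JSON boolean here.
        "ocr_cache": ocr_cache,
    }
    # `model` is documented as applying only when a prompt is provided.
    if prompt is not None:
        body["prompt"] = prompt
        if model is not None:
            body["model"] = model
    return json.dumps(body)
```

The resulting string can be sent with any HTTP client as the body of a POST request with `Content-Type: application/json`, for example `build_ocr_request(open("doc.pdf", "rb").read(), strategy="minicpm_v")`.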
