[Enhancement] End-to-end support for images (as well as PDFs)

This sample was originally created for multi-page PDF documents, but related use cases (such as ID document or receipt extraction) may operate on single-page images, photographs, or scans instead.

Today, some parts of the pipeline support images, but others assume PDF input. It would be great to round out support for images as source documents, particularly the common JPEG and PNG formats, which have good native support in e.g. Amazon Textract, SageMaker Ground Truth, and web browsers.

- [X] 1. (Believe so but need to double-check) Core Textract state machine component supports OCRing image files
- [ ] 2. Notebook entity recognition data prep flow supports image files
- [ ] 3. (Need to check) OCR pipeline trigger and Textract orchestration support image files
- [ ] 4. (Known gap) A2I human review UI supports image files
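
As a rough illustration of what checking items 2 and 3 might involve, here is a minimal sketch of classifying an incoming object key by file extension so that later steps can branch on PDF vs. image input. The function name, the extension sets, and the example keys are all hypothetical and not taken from this sample's code; the real pipeline may detect file types differently (for example from content type rather than extension).

```python
from pathlib import PurePosixPath

# Hypothetical extension sets for illustration only: the issue names JPEG and
# PNG as the image formats of interest, alongside the existing PDF support.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}
PDF_EXTENSIONS = {".pdf"}


def classify_source_document(s3_key: str) -> str:
    """Return "image", "pdf", or "unsupported" based on the key's file extension.

    This is an assumed helper, not part of the sample. It only inspects the
    final extension of the key and ignores the actual file contents.
    """
    # Lower-case the suffix so that e.g. ".JPG" and ".jpg" are treated alike.
    suffix = PurePosixPath(s3_key).suffix.lower()
    if suffix in IMAGE_EXTENSIONS:
        return "image"
    if suffix in PDF_EXTENSIONS:
        return "pdf"
    return "unsupported"
```

A component that currently assumes PDF could call something like `classify_source_document(key)` up front and either skip PDF-specific steps (such as page splitting) for images or fail fast with a clear message for unsupported types.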
