Commit 2e6ef33

refactor: deduplicate image extraction logic
Extract shared image processing and retrieval logic into reusable helper functions, eliminating ~150 lines of duplicate code between extractImagesFromPage and extractPageContent.

**Changes:**
- Add processImageData() - converts raw PDF.js image data to ExtractedImage
- Add retrieveImageData() - handles image retrieval strategy (commonObjs -> sync -> async with timeout)
- Refactor extractImagesFromPage to use shared helpers
- Refactor extractPageContent to use shared helpers while preserving yPosition

**Benefits:**
- Reduces code duplication by ~150 lines
- Improves maintainability - fixes/improvements in one place
- Increases test coverage from 90.7% to 95.37%
- Consistent error handling and timeout behavior across both functions
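
The retrieval order named above (commonObjs -> sync -> async with timeout) might look roughly like the sketch below. This is an illustration only, not the code from this commit: the `ObjectStore` interface is a hypothetical minimal stand-in for the PDF.js object stores, and the parameter list and the 1000 ms default timeout are assumptions.

```typescript
// Hypothetical minimal shape of a PDF.js-style object store
// (stand-in for page.commonObjs / page.objs; not the real typings).
interface ObjectStore {
  has(id: string): boolean;
  get(id: string, callback?: (data: unknown) => void): unknown;
}

// Sketch of the fallback chain: commonObjs first, then a synchronous
// lookup in objs, then a callback-based lookup bounded by a timeout.
// Resolves to null if the image never becomes available.
async function retrieveImageData(
  commonObjs: ObjectStore,
  objs: ObjectStore,
  id: string,
  timeoutMs: number = 1000, // assumed default, not taken from the commit
): Promise<unknown> {
  // 1. Shared objects that are already resolved.
  if (commonObjs.has(id)) return commonObjs.get(id);

  // 2. Page-level objects that are already resolved.
  if (objs.has(id)) return objs.get(id);

  // 3. Wait for the object to resolve, but give up after timeoutMs.
  return new Promise<unknown>((resolve) => {
    const timer = setTimeout(() => resolve(null), timeoutMs);
    objs.get(id, (data) => {
      clearTimeout(timer);
      resolve(data ?? null);
    });
  });
}
```

Keeping the timeout inside a single helper is what lets both extractImagesFromPage and extractPageContent share the same failure behavior, in line with the "consistent error handling and timeout behavior" benefit listed above.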
1 parent 7893cf6 commit 2e6ef33

1 file changed: +164 -266 lines changed
