Open
Describe the bug
When extracting tables from this 250+ page PDF, processing hangs on one specific page (98), inside the 'Detect' method.
To Reproduce
Using 40927R03.pdf
I've tried with 0.1.3 and 0.1.4-alpha001, and both hang at the same spot.
Using .NET 6.0, C#.
using UglyToad.PdfPig; // PdfDocument, ParsingOptions

// content.Stream is the stream holding 40927R03.pdf
using var pdoc = PdfDocument.Open(content.Stream, new ParsingOptions { SkipMissingFonts = true, UseLenientParsing = true });
var da = new Tabula.Detectors.SimpleNurminenDetectionAlgorithm();
var area = Tabula.ObjectExtractor.ExtractPage(pdoc, 98 /* hangs on this page */);
var regions = da.Detect(area); // <-- this line hangs
Expected behavior
Detect should return for every page, including page 98, so that all tables in the document can be parsed.
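For anyone trying to narrow this down, one possible way to keep a batch run from stalling on page 98 is to put a time limit around the Detect call. The following is only a minimal sketch: `TimeoutRunner` and `TryRun` are hypothetical names of my own, not part of Tabula or PdfPig, and the 30-second limit in the usage comment is an arbitrary assumption. It only stops waiting; it does not cancel the hung call, which keeps running on a thread-pool thread.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper, not part of Tabula or PdfPig.
public static class TimeoutRunner
{
    // Runs `work` on a thread-pool thread and waits at most `timeout`.
    // Returns true and sets `result` if the work finished in time,
    // otherwise returns false and leaves `result` at its default value.
    // On timeout the work is NOT cancelled; it continues in the background.
    // If `work` throws, Task.Wait rethrows it wrapped in an AggregateException.
    public static bool TryRun<T>(Func<T> work, TimeSpan timeout, out T result)
    {
        var task = Task.Run(work);
        if (task.Wait(timeout))
        {
            result = task.Result;
            return true;
        }

        result = default!;
        return false;
    }
}

// Usage with the names from the snippet above (30 s is an arbitrary choice):
//
// if (!TimeoutRunner.TryRun(() => da.Detect(area), TimeSpan.FromSeconds(30), out var regions))
//     Console.WriteLine("Detect timed out on page 98");
```

Because the stuck call is never aborted, this is mainly useful for finding which pages hang. A long-running service would probably need to isolate the detection in a separate process instead.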