Closed
Description
For one of our examples, http://tudigit.ulb.tu-darmstadt.de/show/Gue-11660-24 , we are struggling with line segmentation.
Of course, the page having three columns separated by ruling lines doesn't make things easy.
If we run e.g. the following workflow:
'anybaseocr-binarize -I OCR-D-IMG -O OCR-D-BIN' \
'anybaseocr-crop -I OCR-D-BIN -O OCR-D-CROP' \
'anybaseocr-deskew -I OCR-D-CROP -O OCR-D-DESKEW' \
'tesserocr-segment-region -I OCR-D-DESKEW -O OCR-D-PAGE-SEG' \
'segment-repair -I OCR-D-PAGE-SEG -O OCR-D-PAGE-SEG-REPAIR -P plausibilize true' \
'tesserocr-deskew -I OCR-D-PAGE-SEG-REPAIR -O OCR-D-REG-DESKEW' \
'tesserocr-segment-line -I OCR-D-REG-DESKEW -O OCR-D-LINE-SEG' \
'tesserocr-recognize -I OCR-D-LINE-SEG -O OCR-D-OCR -P model Fraktur'
we get a table for the main content. We find this reasonable. However, line segmentation is not performed for any of the table cells. This is independent of the line-segmentation processor, i.e. it also happens with cis-ocropy-segment.
Is this expected behaviour?
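To show which cells are affected, a check along the following lines could be run on the PAGE-XML files in OCR-D-LINE-SEG. This is only a minimal sketch: it assumes that table cells are stored as TextRegion elements nested inside a TableRegion, and the function name `table_cells_without_lines` and the inline sample document are made up for illustration. Tags are matched by local name so that the PAGE schema version does not matter.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Strip the XML namespace so the check works across PAGE schema versions."""
    return tag.rsplit('}', 1)[-1]


def table_cells_without_lines(page_xml):
    """Return IDs of TextRegions inside a TableRegion that have no TextLine child.

    Assumption: table cells are TextRegion elements nested in TableRegion.
    """
    root = ET.fromstring(page_xml)
    missing = []
    for table in root.iter():
        if local(table.tag) != 'TableRegion':
            continue
        for cell in table.iter():
            if local(cell.tag) != 'TextRegion':
                continue
            has_line = any(local(child.tag) == 'TextLine' for child in cell)
            cell_id = cell.get('id')
            # Skip duplicates that would arise from nested tables.
            if not has_line and cell_id not in missing:
                missing.append(cell_id)
    return missing


# Hypothetical sample: one ordinary region with a line, one table with two cells.
SAMPLE = """<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15">
  <Page imageFilename="example.png">
    <TextRegion id="r0"><TextLine id="r0_l0"/></TextRegion>
    <TableRegion id="t0">
      <TextRegion id="t0_c0"/>
      <TextRegion id="t0_c1"><TextLine id="t0_c1_l0"/></TextRegion>
    </TableRegion>
  </Page>
</PcGts>"""

print(table_cells_without_lines(SAMPLE))  # → ['t0_c0']
```

For a real file, the string passed in would be the content of one PAGE-XML file from the output file group. In our case we would expect every cell ID of the table to be listed.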