
only use multiprocessing if max_workers > 1 (partial revert of f95111ff) #1352

Merged
kba merged 1 commit into OCR-D:master from bertsky:mp-iff-parallel on Feb 4, 2026

Conversation

bertsky (Collaborator) commented Jan 21, 2026

This was a self-own: in f95111f I tried to be smarter than my past self and changed the criterion for when to use the ProcessPool executor:

  • from max_workers > 1 (i.e. whether OCRD_MAX_PARALLEL_PAGES was requested and the processor implementation supports that)
  • to isinstance(workspace.mets, ClientSideOcrdMets) (i.e. whether the workspace can be processed in parallel)

But for important cases such as TensorFlow, this is clearly wrong: there, multiprocessing is impossible because the CUDA context cannot be shared (unless you put the model in a singleton background process connected to the page workers via queues). The processor implementation has to be able to prohibit multiprocessing by setting max_workers = 1.
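
To illustrate the restored criterion, here is a minimal sketch. It is not the actual code in core: InlineExecutor, effective_max_workers and executor_class are hypothetical names, and the way the requested parallelism is combined with a processor-side limit is an assumption. Only ProcessPoolExecutor, Executor and Future come from the standard library.

```python
from concurrent.futures import Executor, Future, ProcessPoolExecutor


class InlineExecutor(Executor):
    """Hypothetical stand-in: runs every task in the calling process."""

    def __init__(self, max_workers=1):
        # accepted only so both executors can be constructed the same way
        self.max_workers = max_workers

    def submit(self, fn, /, *args, **kwargs):
        # run the task immediately and wrap the outcome in a Future
        future = Future()
        try:
            future.set_result(fn(*args, **kwargs))
        except Exception as exc:
            future.set_exception(exc)
        return future


def effective_max_workers(requested, processor_limit=None):
    """Cap the requested parallelism by a processor-side limit (assumed semantics)."""
    workers = max(1, requested)
    if processor_limit is not None:
        workers = min(workers, processor_limit)
    return workers


def executor_class(max_workers):
    """Restored criterion: use multiprocessing only if max_workers > 1."""
    return ProcessPoolExecutor if max_workers > 1 else InlineExecutor


if __name__ == "__main__":
    # a processor that pins max_workers = 1 (e.g. because it holds a CUDA
    # context) stays in-process even if 4 parallel pages were requested
    workers = effective_max_workers(requested=4, processor_limit=1)
    with executor_class(workers)(max_workers=workers) as executor:
        print(executor.submit(pow, 2, 5).result())
```

The point of the sketch is that the decision depends only on the effective worker count, so a processor can veto multiprocessing regardless of what kind of METS object the workspace carries.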

bertsky requested a review from kba January 21, 2026 17:58
kba force-pushed the mp-iff-parallel branch from 1961d68 to beed714 February 3, 2026 17:46
kba merged commit beed714 into OCR-D:master Feb 4, 2026
24 of 25 checks passed

2 participants