@bertsky
Collaborator

@bertsky bertsky commented Jan 21, 2026

This was a self-own: in f95111f I tried to be smarter than my past self and changed the criterion for when to use the ProcessPool executor:

  • from max_workers > 1 (i.e. whether OCRD_MAX_PARALLEL_PAGES was requested and the processor implementation supports that)
  • to isinstance(workspace.mets, ClientSideOcrdMets) (i.e. whether the workspace can be processed in parallel)

But this is clearly wrong for important cases like TensorFlow, where multiprocessing is impossible because the CUDA context cannot be shared (unless you put the model in a singleton background process connected via queues to the page workers). The processor implementation has to be able to prohibit multiprocessing (via max_workers = 1).
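To make the difference between the two criteria concrete, here is a minimal sketch. The function names and the boolean `mets_is_client_side` (standing in for `isinstance(workspace.mets, ClientSideOcrdMets)`) are hypothetical and not taken from the actual code; the combined variant is only one way the decision could be expressed and is an assumption, not necessarily what this PR does.

```python
def use_process_pool_before(max_workers: int) -> bool:
    """Criterion before f95111f: parallel pages were requested
    and the processor implementation allows them."""
    return max_workers > 1


def use_process_pool_f95111f(mets_is_client_side: bool) -> bool:
    """Criterion introduced in f95111f: only asks whether the
    workspace can be processed in parallel, ignoring max_workers."""
    return mets_is_client_side


def use_process_pool_combined(max_workers: int, mets_is_client_side: bool) -> bool:
    """Assumed combination (hypothetical): the processor can still
    veto multiprocessing by setting max_workers = 1."""
    return max_workers > 1 and mets_is_client_side


# A TensorFlow-style processor that sets max_workers = 1,
# run on a workspace whose METS is served client-side:
print(use_process_pool_before(1))          # False: veto respected
print(use_process_pool_f95111f(True))      # True: veto ignored
print(use_process_pool_combined(1, True))  # False: veto respected
```

With the f95111f criterion, the processor's `max_workers = 1` has no effect on the decision, which is exactly the problem described above.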

@bertsky bertsky requested a review from kba January 21, 2026 17:58