Releases: OCR-D/core

v3.12.1

04 Feb 15:38
@kba kba

Fixed:

  • only use multiprocessing if max_workers > 1, not just when using METS Server, #1352
  • ensure that file paths are relative to workspace directory, #1213, #1353
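
The path fix above can be illustrated with a minimal sketch. It is not the code from #1353; the helper name `relative_to_workspace` and the example paths are hypothetical, and it only shows one plausible way to normalize a file path against a workspace directory using the standard library.

```python
from pathlib import Path


def relative_to_workspace(path, workspace_dir):
    """Return `path` as a POSIX-style path relative to `workspace_dir`.

    Hypothetical helper for illustration only (not part of the OCR-D API).
    Raises ValueError if the path lies outside the workspace.
    """
    workspace = Path(workspace_dir).resolve()
    candidate = Path(path)
    if not candidate.is_absolute():
        # Interpret relative inputs as relative to the workspace, not the CWD
        candidate = workspace / candidate
    # Collapse ".." segments and symlinks before comparing
    candidate = candidate.resolve()
    return candidate.relative_to(workspace).as_posix()
```

Rejecting paths outside the workspace with an exception, rather than silently keeping them absolute, is a design choice of this sketch and may differ from what the library does.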

Removed:

  • DOCKER_RABBIT_MQ_FEATURES env var not needed anymore, #1354

v3.11.0

20 Jan 17:59
@kba kba

Fixed:

  • 🔥 CUDA base images working with newer Python, Ubuntu, Tensorflow, Torch and Numpy, #1350

v3.10.1

19 Jan 12:59
@kba kba

Fixed:

  • Missing imports in shell_processor.py

v3.10.0

16 Jan 17:52
@kba kba

Removed:

  • 🔥 Drop support for Python 3.8, #1345

Changed:

  • 🔥 Upgrade from Ubuntu 20.04 to 22.04 for the docker base image, #1345
  • 🔥 Restrict supported tensorflow version to < 2.16 (to keep v1 compat support), #1345
  • 🔥 Upgrade PAGE XML API to include PRImA-Research-Lab/PAGE-XML#24 for properly recursive RegionType, #1341

Fixed:

  • Support timeouts (OCRD_PROCESSING_PAGE_TIMEOUT) for processor calls in more cases, #1345
    • In multi-processing setup, via pebble (which replaces loky) ProcessFuture and ProcessPool
    • With some C python extensions (such as tesserocr)
    • In non-networked local processor calls via cysignals
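
To illustrate the idea of a per-page timeout for local calls, here is a rough sketch using only Python's standard `signal` module. It is a stand-in technique, not the pebble or cysignals mechanism named above; the names `page_timeout` and `process_with_timeout` are hypothetical. It assumes a Unix system and that the call runs in the main thread.

```python
import signal
import time
from contextlib import contextmanager


@contextmanager
def page_timeout(seconds):
    """Raise TimeoutError in the main thread after `seconds` (Unix only)."""
    def _on_alarm(signum, frame):
        raise TimeoutError(f"page processing exceeded {seconds}s")

    previous = signal.signal(signal.SIGALRM, _on_alarm)
    # ITIMER_REAL accepts fractional seconds, unlike signal.alarm()
    signal.setitimer(signal.ITIMER_REAL, seconds)
    try:
        yield
    finally:
        # Always cancel the timer and restore the previous handler
        signal.setitimer(signal.ITIMER_REAL, 0)
        signal.signal(signal.SIGALRM, previous)


def process_with_timeout(func, seconds):
    """Run `func()`; return its result, or None if it timed out."""
    try:
        with page_timeout(seconds):
            return func()
    except TimeoutError:
        return None
```

A Python-level signal handler like this only runs between bytecode instructions, so a long-running call inside a C extension may not be interrupted promptly. That limitation is presumably why the release mentions separate handling for C extensions.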

Merged PRs

  • replace page.xsd with current PRImA master; regenerate PAGE API by @bertsky in #1341
  • Processor: replace loky with pebble to enforce worker timeouts by @bertsky in #1345

Full Changelog: v3.9.2...v3.10.0

v3.9.2

06 Jan 16:41
@kba kba

Fixed:

  • Require beanie version compatible with pydantic >=2, #1342, #1349

v3.9.1

19 Dec 12:31
@kba kba

Added:

  • ocrd network client check-status has a --verbose flag for more detailed job status, #1348

v3.9.0

19 Dec 12:30
@kba kba

Changed:

  • Support multiple output file groups for processors, #1344
    • OcrdPageResult: replaced by proxy class OcrdPageResultVariadicListWrapper with list semantics and variadic constructor (the original class is now named SingleOcrdPageResult)
    • Processor.process_page_file: handle results from process_page_pcgts as lists:
      • split output_file_grp with commas (just as input_file_grp)
      • iterate over output file groups and OcrdPageResult
      • log error if there are more results than output file groups (the surplus results will be lost)
      • raise FileExistsError (in order to skip actual computation) iff output file exists for all output file groups
      • make output files (and file IDs), and save images etc for each output independently
  • PAGE API: get_AllRegions available for all region types, not just PAGE root, #1344
  • ocrd_network: Update RabbitMQ from 3.12 to (latest) 4.2, #1348
  • ocrd_network: Fix and improve logging for network integration tests, #1348
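
The multi-output handling listed under #1344 can be sketched roughly as follows. This is a simplified illustration, not the actual `Processor.process_page_file` code: the function `plan_outputs`, the file naming scheme and the `existing_paths` set are assumptions made for the example.

```python
import logging

log = logging.getLogger("multi_output_sketch")


def plan_outputs(output_file_grp, results, page_id, existing_paths):
    """Pair each output file group with one result (hypothetical helper).

    output_file_grp: comma-separated group names, e.g. "OCR-D-BIN,OCR-D-SEG"
    results: list of per-group results (any objects)
    existing_paths: set of output paths that already exist
    """
    # Split on commas, just as for input file groups
    groups = output_file_grp.split(",")
    if len(results) > len(groups):
        log.error("%d results but only %d output file groups; surplus results are dropped",
                  len(results), len(groups))
    # Assumed naming scheme, for illustration only
    targets = [f"{grp}/{grp}_{page_id}.xml" for grp in groups]
    # Skip the computation only if every group already has its output
    if all(target in existing_paths for target in targets):
        raise FileExistsError(f"all outputs for page {page_id} already exist")
    # zip() stops at the shorter sequence, so surplus results are ignored
    return list(zip(targets, results))
```

For example, `plan_outputs("OCR-D-BIN,OCR-D-SEG", ["bin", "seg"], "p0001", set())` pairs each group's target path with its result, and each pair could then be written independently.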

Fixed:

  • 🔥 do not log RabbitMQ credentials, #1346, #1348

Added:

  • test combinations of OCRD_* config variables and multi-output, #1344

v3.8.1

16 Dec 11:15
@kba kba

Fixed:

  • Include ocrd-command and ocrd-merge in the ocrd-all-tool.json, #1347

Merged PR:

  • add ocrd-command and ocrd-merge to distributed ocrd-all-tool.json by @bertsky in #1347

v3.8.0

10 Dec 15:15
@kba kba

Added:

  • ocrd-command processor to run arbitrary PAGE transformation CLIs, #1343
  • various parameter presets for ocrd-command, #1343
  • ocrd-merge processor to join multiple PAGE inputs by concatenation, #1343
  • test coverage for ocrd-filter, ocrd-command, and ocrd-merge, #1343
  • Resource Manager Server as the ocrd_network analogue of ocrd.cli.resmgr, #1309
    • ocrd network resmgr-server for triggering Resource Manager Server (RMS) in the background
    • Processing Server also deploys RMS on each processing host

Fixed:

  • Page.get_ReadingOrderGroups: sort by index, use OrderedDict as result
  • OcrdAgent.notes: convert to dict to accommodate pydantic 2 with older lxml
  • ocrd.resource_manager: ensure necessary + reduce unnecessary updates of user database
  • ocrd.resource_manager: deduplicate entries (newer wins) before updating user database
  • ocrd resmgr download: extract archives independent of whether they are URLs or local paths
  • ocrd resmgr download: if --overwrite is given, ensure the old resource gets removed
  • ocrd resmgr download: default to data location instead of first in list of allowed
  • ocrd_utils.list_all_resources: filter module non-resource files with more anti-patterns
  • ocrd_utils.list_all_resources: no subpaths except for cwd location, OCR-D/spec#263, #1315
  • ocrd_utils.list_all_resources: filter resources via media (MIME) type, if specified, #1315
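
Two of the resource fixes above, deduplication where the newer entry wins and filtering by media type, can be sketched with the standard library. This is not the OCR-D implementation: the function names, the `name` key and the convention that later list entries are newer are all assumptions for illustration.

```python
import mimetypes


def deduplicate_newer_wins(entries):
    """Keep one entry per name; later (assumed newer) entries replace earlier ones."""
    by_name = {}
    for entry in entries:
        # Re-assigning an existing key keeps its position but swaps in the newer value
        by_name[entry["name"]] = entry
    return list(by_name.values())


def filter_by_mimetype(paths, mimetype=None):
    """Return the paths whose guessed media type matches; all paths if none is given."""
    if mimetype is None:
        return list(paths)
    return [p for p in paths if mimetypes.guess_type(p)[0] == mimetype]
```

Guessing the media type from the file extension is a simplification; files with unknown extensions get no type and are filtered out whenever a type is requested.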

Full Changelog: v3.7.0...v3.8.0

v3.7.0

02 Dec 13:49
@kba kba

Changed:

  • 🔥 upgrade and adapt ocrd_network to pydantic v2, #1342

Removed:

  • 🔥 drop bashlib processors, retain only the Python API, #1339