Releases: OCR-D/core
v3.12.1
v3.11.0
v3.10.1
v3.10.0
Removed:
- 🔥 Drop support for Python 3.8, #1345
Changed:
- 🔥 Upgrade from Ubuntu 20.04 to 22.04 for the docker base image, #1345
- 🔥 Restrict supported tensorflow version to `< 2.16` (for support for v1 compat), #1345
- 🔥 Upgrade PAGE XML API to include PRImA-Research-Lab/PAGE-XML#24 for properly recursive `RegionType`, #1341
Fixed:
- Support timeouts (`OCRD_PROCESSING_PAGE_TIMEOUT`) for processor calls in more cases, #1345
  - In multi-processing setup, via pebble (which replaces loky) `ProcessFuture` and `ProcessPool`
  - With some C python extensions (such as `tesserocr`)
  - In non-networked local processor calls via cysignals
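The idea behind a hard per-page timeout can be illustrated with the standard library alone. The sketch below is not how OCR-D/core implements it (the release notes name pebble and cysignals for that); it only shows the general technique of running work in a separate process that is killed once the limit is exceeded. The helper names `timeout_from_env` and `run_page_job` are hypothetical, and treating the variable as a number of seconds where a non-positive value means "no limit" is an assumption.

```python
import subprocess
import sys
from typing import Mapping, Optional


def timeout_from_env(environ: Mapping[str, str]) -> Optional[float]:
    """Hypothetical helper: read the page timeout; None means no limit.

    Assumes the value is given in seconds and that values <= 0 disable it.
    """
    value = float(environ.get("OCRD_PROCESSING_PAGE_TIMEOUT", "0"))
    return value if value > 0 else None


def run_page_job(code: str, timeout: Optional[float]) -> str:
    """Run a stand-in page job in a child interpreter, killing it on timeout."""
    try:
        subprocess.run([sys.executable, "-c", code], timeout=timeout, check=True)
    except subprocess.TimeoutExpired:
        # subprocess.run has already killed and reaped the child at this point
        return "timeout"
    return "done"
```

A thread-based timeout could only stop waiting for the result; the worker itself would keep running. Using a separate process is what makes it possible to actually reclaim the resources of a hanging job.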
Merged PRs
- replace page.xsd with current PRImA master; regenerate PAGE API by @bertsky in #1341
- Processor: replace loky with pebble to enforce worker timeouts by @bertsky in #1345
Full Changelog: v3.9.2...v3.10.0
v3.9.2
v3.9.1
v3.9.0
Changed:
- Support multiple output file groups for processors, #1344
  - `OcrdPageResult`: replace by proxy class `OcrdPageResultVariadicListWrapper` with list semantics and variadic constructor (with the original class now under `SingleOcrdPageResult`)
  - `Processor.process_page_file`: handle results from `process_page_pcgts` as lists:
    - split `output_file_grp` with commas (just as `input_file_grp`)
    - iterate over output file groups and `OcrdPageResult`
    - log error if there are more results than output file groups (that will get lost)
    - raise `FileExistsError` (in order to skip actual computation) iff output file exists for all output file groups
    - make output files (and file IDs), and save images etc for each output independently
- PAGE API: `get_AllRegions` available for all region types, not just PAGE root, #1344
- `ocrd_network`: Update RabbitMQ from 3.12 to (latest) 4.2, #1348
- `ocrd_network`: Fix and improve logging for network integration tests, #1348
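The multi-output handling listed above can be sketched in plain Python. This is an illustration of the described steps only, not the code in `Processor.process_page_file`; the function names, the use of strings as stand-ins for results, and the `existing` set are all hypothetical.

```python
import logging
from typing import List, Sequence, Set, Tuple

log = logging.getLogger("multi_output_sketch")


def pair_results_with_groups(
    output_file_grp: str, results: Sequence[str]
) -> List[Tuple[str, str]]:
    """Pair each result with one output file group (hypothetical helper)."""
    # Split on commas, mirroring how input file groups are written
    groups = output_file_grp.split(",")
    if len(results) > len(groups):
        # Surplus results have no group to go to and are dropped by zip()
        log.error(
            "%d results but only %d output file groups",
            len(results), len(groups),
        )
    return list(zip(groups, results))


def skip_if_all_outputs_exist(groups: Sequence[str], existing: Set[str]) -> None:
    """Raise FileExistsError only if every group already has its output."""
    if all(group in existing for group in groups):
        raise FileExistsError("output exists for all output file groups")
```

For example, `pair_results_with_groups("OCR-D-BIN,OCR-D-SEG", ["bin", "seg"])` pairs the first result with `OCR-D-BIN` and the second with `OCR-D-SEG`; each pair can then be written out independently.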
Fixed:
Added:
- test combinations of `OCRD_*` config variables and multi-output, #1344
v3.8.1
v3.8.0
Added:
- `ocrd-command` processor to run arbitrary PAGE transformation CLIs, #1343
- various parameter presets for ocrd-command, #1343
- `ocrd-merge` processor to join multiple PAGE inputs by concatenation, #1343
- test coverage for ocrd-filter, ocrd-command, and ocrd-merge, #1343
- Resource Manager Server as `ocrd_network` analogue of `ocrd.cli.resmgr`, #1309
  - `ocrd network resmgr-server` for triggering Resource Manager Server (RMS) in the background
  - Processing Server also deploys RMS on each processing host
Fixed:
- `Page.get_ReadingOrderGroups`: sort by index, use `OrderedDict` as result
- `OcrdAgent.notes`: convert to dict to accommodate pydantic 2 with older lxml
- `ocrd.resource_manager`: ensure necessary + reduce unnecessary updates of user database
- `ocrd.resource_manager`: deduplicate entries (newer wins) before updating user database
- `ocrd resmgr download`: extract archives independent of whether they are URLs or local paths
- `ocrd resmgr download`: if `--overwrite`, ensure the old res gets removed
- `ocrd resmgr download`: default to `data` location instead of first in list of allowed
- `ocrd_utils.list_all_resources`: filter module non-resource files w/ more anti-patterns
- `ocrd_utils.list_all_resources`: no subpaths except for `cwd` location, OCR-D/spec#263, #1315
- `ocrd_utils.list_all_resources`: filter resources via media (MIME) type, if specified, #1315
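The "newer wins" deduplication mentioned for `ocrd.resource_manager` can be pictured with a small sketch. It is not the resource manager's actual code; the function name is hypothetical, and it assumes that entries are dictionaries identified by a `name` field and that later list positions are newer.

```python
from typing import Dict, Iterable, List


def deduplicate_newer_wins(
    entries: Iterable[Dict[str, str]], key: str = "name"
) -> List[Dict[str, str]]:
    """Keep one entry per key value; later (newer) entries replace earlier ones."""
    latest: Dict[str, Dict[str, str]] = {}
    for entry in entries:
        # Re-assigning an existing key keeps its original position in the dict
        # but swaps in the newer entry
        latest[entry[key]] = entry
    return list(latest.values())
```

Because Python dictionaries preserve insertion order, the result keeps the order in which each name first appeared while carrying the most recent data for that name.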
Merged PRs
- drop ocrd-distributed resource_list.yml for good by @bertsky in #1322
- resmgr download: implement git clone by @bertsky in #1340
- Continuation of #1309: Implementation of the resource manager server (issue #1294) by @MehmedGIT in #1319
- add builtin processors ocrd-command and ocrd-merge by @bertsky in #1343
Full Changelog: v3.7.0...v3.8.0