Merged
39 commits
edcf855
Dockerfile: Use virtual environment
stweil Mar 15, 2023
5c003fa
give up workaround for shapely-CUDA issue
Apr 1, 2023
2250550
rehash after pip upgrade
Apr 1, 2023
00a0f6f
keep gcc, no autoremove
Apr 1, 2023
410783f
docker-cuda: change base image, no multi-CUDA runtimes
Apr 1, 2023
de86e0f
reinstate workaround for shapely, but more robust
Apr 12, 2023
c1178f9
core-cuda: use CUDA 11.8, install cuDNN via pip and make available sy…
Apr 15, 2023
d3d54bf
core-cuda: install more CUDA libs via pip and ld.so.conf, simplify Do…
Apr 20, 2023
357e729
make install on py36: prefer binary OpenCV/Numpy via pip config inste…
Apr 20, 2023
2640b71
make install on py36: fix prefer-binary syntax
Apr 21, 2023
b3618a9
make install on py36: revert to prefer-binary via install
Apr 21, 2023
b713030
:package: v2.50.0
kba Apr 24, 2023
c0c153e
Merge branch 'master' of https://github.com/OCR-D/core into reduce-cuda
Apr 24, 2023
209fa21
Merge branch 'pr-1008' into reduce-cuda
Apr 27, 2023
a6cf5ff
core-cuda: use same CUDA libs as needed for Torch anyway
Apr 27, 2023
1da66f4
Fix mongodb credentials usage
joschrew May 10, 2023
8530bd9
Add skip_deployment flag for queue and database
joschrew May 10, 2023
b4e6576
Make deployer optionally skip mongo and rabbitmq
joschrew May 10, 2023
8da7868
Remove redundant rabbitmq availability check
joschrew May 11, 2023
f749c24
Correct spelling
joschrew May 15, 2023
f5e212a
docker-cuda: rewrite…
Jun 1, 2023
85a5d16
docker-cuda: improve (reduce size) again…
Jun 2, 2023
47eff22
remove out-dated processor resources
Jun 2, 2023
12e781c
Revert "Merge remote-tracking branch 'hnesk/no-more-pkg_resources' in…
Jun 2, 2023
bac1a45
make help: improve description
Jun 2, 2023
95062b0
:memo: changelog
kba Jun 7, 2023
c636f2c
:package: v2.51.0
kba Jun 7, 2023
8c7b761
docker-image: reuse local ghcr.io image instead of docker.io
bertsky Jun 7, 2023
4bfac5e
disable logging tests until properly fixed
kba Jun 7, 2023
ca5f342
test_workspace_bagger: use ocr-d.de instead of google.com for testing
kba Jun 7, 2023
3f05745
test bashlib: /usr/bin/env bash instead of /bin/bash
kba Jun 8, 2023
af38a2c
debug gh actions
kba Jun 8, 2023
c63ab4c
readme: remove dockerhub/travis badge, add GH actions badge
kba Jun 8, 2023
555dece
ci: disable upterm for gh actions
kba Jun 8, 2023
d76409e
docker-cuda: move recipe to reusable makefile target deps-cuda
Jun 8, 2023
2538979
make deps-cuda: Set MAMBA_ROOT_PREFIX to CONDA_PREFIX
kba Jun 9, 2023
79ad301
:memo: changelog
kba Jun 9, 2023
6708624
Merge pull request #1055 from bertsky/deps-cuda
kba Jun 9, 2023
821f70c
Merge branch 'master' into ocrd-network-optional-deploy
kba Jun 21, 2023
4 changes: 3 additions & 1 deletion .circleci/config.yml
@@ -16,7 +16,9 @@ jobs:
steps:
- checkout
- run: HOMEBREW_NO_AUTO_UPDATE=1 brew install imagemagick geos bash
- run: make install
- run: which bash
- run: bash --version
- run: make install
- run: make deps-test test benchmark

test-python37:
4 changes: 3 additions & 1 deletion .github/workflows/docker-image.yml
@@ -23,9 +23,11 @@ jobs:
name: Set up Docker Buildx
uses: docker/setup-buildx-action@v2
- name: Build the Docker image
# default tag uses docker.io, so override on command-line
run: make docker DOCKER_TAG=${{ env.DOCKER_TAG }}
- name: Build the Docker image with GPU support
run: make docker-cuda DOCKER_TAG=${{ env.DOCKER_TAG }}-cuda
# default tag uses docker.io, so override on command-line
run: make docker-cuda DOCKER_TAG=${{ env.DOCKER_TAG }}-cuda DOCKER_BASE_IMAGE=${{ env.DOCKER_TAG }}
- name: Login to GitHub Container Registry
uses: docker/login-action@v2
with:
24 changes: 24 additions & 0 deletions CHANGELOG.md
@@ -5,6 +5,27 @@ Versioned according to [Semantic Versioning](http://semver.org/).

## Unreleased

Added:

* `make deps-cuda`: Makefile target to set up a working CUDA installation, both for native and Dockerfile.cuda, #1055

## [2.51.0] - 2023-06-07

Changed:

* `core-cuda` Docker: CUDA base image working again, based on `ocrd/core` not `nvidia/cuda` in a separate `Dockerfile.cuda`, #1041
* `core-cuda` Docker: adopt #1008 (venv under /usr/local, as in ocrd_all, instead of dist-packages), #1041
* `core-cuda` Docker: use conda ([micromamba](https://mamba.readthedocs.io/en/latest/user_guide/micromamba.html)) for CUDA toolkit, and [nvidia-pyindex](https://pypi.org/project/nvidia-pyindex/) for CUDA libs – instead of [nvidia/cuda](https://hub.docker.com/r/nvidia/cuda) base image, #1041
* more robust workaround for shapely#1598, #1041

Removed:

* Revert #882 (fastentrypoints) as it enforces deps versions at runtime
* Drop `ocrd_utils.package_resources` and use `pkg_resources.*` directly, #1041
* `ocrd resmgr`: Drop redundant (processor-provided) entries in the central `resource_list.yml`.

## [2.50.0] - 2023-04-24

Added:

* :fire: `ocrd_network`: Components related to OCR-D Web API, #974
@@ -1729,6 +1750,9 @@ Fixed
Initial Release

<!-- link-labels -->
[2.51.0]: ../../compare/v2.51.0..v2.50.0
[2.50.0]: ../../compare/v2.50.0..v2.49.0
[2.49.0]: ../../compare/v2.49.0..v2.48.1
[2.48.1]: ../../compare/v2.48.1..v2.48.0
[2.48.0]: ../../compare/v2.48.0..v2.47.4
[2.47.4]: ../../compare/v2.47.4..v2.47.3
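The link labels in this hunk follow a single pattern: each release label points at a compare view between its own tag and the tag of the release before it. A minimal Python sketch of that pattern follows; `compare_labels` is a hypothetical helper for illustration, not something the repository provides, and the version strings are copied from the hunk.

```python
def compare_labels(versions):
    """Return link-label lines for a newest-first list of version strings.

    Mirrors the format seen in the hunk: "[NEW]: ../../compare/vNEW..vOLD".
    """
    lines = []
    # Pair every version with the one released directly before it.
    for newer, older in zip(versions, versions[1:]):
        lines.append(f"[{newer}]: ../../compare/v{newer}..v{older}")
    return lines


if __name__ == "__main__":
    for line in compare_labels(["2.51.0", "2.50.0", "2.49.0", "2.48.1"]):
        print(line)
```

The oldest version in the input gets no label of its own, since it has no predecessor in the list.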
11 changes: 5 additions & 6 deletions Dockerfile
@@ -6,7 +6,7 @@ ENV DEBIAN_FRONTEND noninteractive
ENV PYTHONIOENCODING utf8
ENV LC_ALL=C.UTF-8
ENV LANG=C.UTF-8
ENV PIP=pip3
ENV PIP=pip

WORKDIR /build-ocrd
COPY ocrd ./ocrd
@@ -24,7 +24,6 @@ RUN apt-get update && apt-get -y install software-properties-common \
&& apt-get update && apt-get -y install \
ca-certificates \
python3-dev \
python3-pip \
python3-venv \
gcc \
make \
@@ -34,11 +33,11 @@ RUN apt-get update && apt-get -y install software-properties-common \
sudo \
git \
&& make deps-ubuntu \
&& pip3 install --upgrade pip setuptools \
&& python3 -m venv /usr/local \
&& hash -r \
&& pip install --upgrade pip setuptools \
&& make install \
&& apt-get remove -y gcc \
&& apt-get autoremove -y \
&& $FIXUP \
&& eval $FIXUP \
&& rm -rf /build-ocrd

WORKDIR /data
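Review note: with `python3 -m venv /usr/local`, the `pip` that `make install` picks up should belong to a virtual environment rather than to the system dist-packages. A small Python sketch for checking this inside a built container is below. `in_virtualenv` is a hypothetical helper name; comparing `sys.prefix` with `sys.base_prefix` is the usual way to detect a venv on Python 3.

```python
import sys


def in_virtualenv():
    """Return True if the running interpreter belongs to a venv."""
    # Inside a venv, sys.prefix points at the venv directory while
    # sys.base_prefix still points at the base installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("prefix:     ", sys.prefix)
    print("base prefix:", sys.base_prefix)
    print("in venv:    ", in_virtualenv())
```

Assuming the image is built as in this hunk, one would expect the reported prefix to be `/usr/local`.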
22 changes: 22 additions & 0 deletions Dockerfile.cuda
@@ -0,0 +1,22 @@
ARG BASE_IMAGE
FROM $BASE_IMAGE

ENV MAMBA_EXE=/usr/local/bin/conda
ENV MAMBA_ROOT_PREFIX=/conda
ENV PATH=$MAMBA_ROOT_PREFIX/bin:$PATH
ENV CONDA_EXE=$MAMBA_EXE
ENV CONDA_PREFIX=$MAMBA_ROOT_PREFIX
ENV CONDA_SHLVL='1'

WORKDIR /build

COPY Makefile .

RUN make deps-cuda

WORKDIR /data

RUN rm -fr /build

CMD ["/usr/local/bin/ocrd", "--help"]

127 changes: 81 additions & 46 deletions Makefile
@@ -19,7 +19,8 @@ help:
@echo ""
@echo " Targets"
@echo ""
@echo " deps-ubuntu Dependencies for deployment in an ubuntu/debian linux"
@echo " deps-cuda Dependencies for deployment with GPU support via Conda"
@echo " deps-ubuntu Dependencies for deployment in an Ubuntu/Debian Linux"
@echo " deps-test Install test python deps via pip"
@echo " install (Re)install the tool"
@echo " install-dev Install with pip install -e"
@@ -32,31 +33,75 @@ help:
@echo " docs-clean Clean docs"
@echo " docs-coverage Calculate docstring coverage"
@echo " docker Build docker image"
@echo " docker-cuda Build docker GPU / CUDA image"
@echo " cuda-ubuntu Install native CUDA toolkit in different versions"
@echo " docker-cuda Build docker image for GPU / CUDA"
@echo " pypi Build wheels and source dist and twine upload them"
@echo ""
@echo " Variables"
@echo ""
@echo " DOCKER_TAG Docker tag. Default: '$(DOCKER_TAG)'."
@echo " DOCKER_BASE_IMAGE Docker base image. Default: '$(DOCKER_BASE_IMAGE)'."
@echo " DOCKER_TAG Docker target image tag. Default: '$(DOCKER_TAG)'."
@echo " DOCKER_BASE_IMAGE Docker source image tag. Default: '$(DOCKER_BASE_IMAGE)'."
@echo " DOCKER_ARGS Additional arguments to docker build. Default: '$(DOCKER_ARGS)'"
@echo " PIP_INSTALL pip install command. Default: $(PIP_INSTALL)"

# END-EVAL

# Docker tag. Default: '$(DOCKER_TAG)'.
DOCKER_TAG = ocrd/core

# Docker base image. Default: '$(DOCKER_BASE_IMAGE)'.
DOCKER_BASE_IMAGE = ubuntu:20.04

# Additional arguments to docker build. Default: '$(DOCKER_ARGS)'
DOCKER_ARGS =

# pip install command. Default: $(PIP_INSTALL)
PIP_INSTALL = $(PIP) install

deps-cuda: CONDA_EXE ?= /usr/local/bin/conda
deps-cuda: export CONDA_PREFIX ?= /conda
deps-cuda: PYTHON_PREFIX != $(PYTHON) -c 'import sysconfig; print(sysconfig.get_paths()["purelib"])'
deps-cuda:
curl -Ls https://micro.mamba.pm/api/micromamba/linux-64/latest | tar -xvj bin/micromamba
mv bin/micromamba $(CONDA_EXE)
# Install Conda system-wide (for interactive / login shells)
echo 'export MAMBA_EXE=$(CONDA_EXE) MAMBA_ROOT_PREFIX=$(CONDA_PREFIX) CONDA_PREFIX=$(CONDA_PREFIX) PATH=$(CONDA_PREFIX)/bin:$$PATH' >> /etc/profile.d/98-conda.sh
mkdir -p $(CONDA_PREFIX)/lib $(CONDA_PREFIX)/include
echo $(CONDA_PREFIX)/lib >> /etc/ld.so.conf.d/conda.conf
# Get CUDA toolkit, including compiler and libraries with dev,
# however, the Nvidia channels do not provide (recent) cudnn (needed for Torch, TF etc):
#MAMBA_ROOT_PREFIX=$(CONDA_PREFIX) \
#conda install -c nvidia/label/cuda-11.8.0 cuda && conda clean -a
#
# The conda-forge channel has cudnn and cudatoolkit but no cudatoolkit-dev anymore (and we need both!),
# so let's combine nvidia and conda-forge (will be same lib versions, no waste of space),
# but omitting cuda-cudart-dev and cuda-libraries-dev (as these will be pulled by pip for torch anyway):
MAMBA_ROOT_PREFIX=$(CONDA_PREFIX) \
conda install -c nvidia/label/cuda-11.8.0 \
cuda-nvcc \
cuda-cccl \
&& conda clean -a \
&& find $(CONDA_PREFIX) -name "*_static.a" -delete
#conda install -c conda-forge \
# cudatoolkit=11.8.0 \
# cudnn=8.8.* && \
#conda clean -a && \
#find $(CONDA_PREFIX) -name "*_static.a" -delete
#
# Since Torch will pull in the CUDA libraries (as Python pkgs) anyway,
# let's jump the shark and pull these via NGC index directly,
# but then share them with the rest of the system so native compilation/linking
# works, too:
$(PIP) install nvidia-pyindex \
&& $(PIP) install nvidia-cudnn-cu11==8.6.0.163 \
nvidia-cublas-cu11 \
nvidia-cusparse-cu11 \
nvidia-cusolver-cu11 \
nvidia-curand-cu11 \
nvidia-cufft-cu11 \
nvidia-cuda-runtime-cu11 \
nvidia-cuda-nvrtc-cu11 \
&& for pkg in cudnn cublas cusparse cusolver curand cufft cuda_runtime cuda_nvrtc; do \
for lib in $(PYTHON_PREFIX)/nvidia/$$pkg/lib/lib*.so.*; do \
base=`basename $$lib`; \
ln -s $$lib $(CONDA_PREFIX)/lib/$$base.so; \
ln -s $$lib $(CONDA_PREFIX)/lib/$${base%.so.*}.so; \
done \
&& ln -s $(PYTHON_PREFIX)/nvidia/$$pkg/include/* $(CONDA_PREFIX)/include/; \
done \
&& ldconfig
# gputil/nvidia-smi would be nice, too – but that drags in Python as a conda dependency...
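Review note: the nested loop in `deps-cuda` creates two symlinks in `$(CONDA_PREFIX)/lib` for every versioned library shipped in the pip packages. The Python sketch below only reproduces how the two link names are derived from one file name, so the effect of `$$base.so` and `$${base%.so.*}.so` is easier to see. `link_names` is a hypothetical helper, and the file names in the usage part are illustrative examples rather than a listing of what the packages actually contain.

```python
def link_names(lib_filename):
    """Return the two link names the shell loop derives from one library file."""
    # First link: "$base.so", i.e. the file name with ".so" appended.
    first = lib_filename + ".so"
    # Second link: "${base%.so.*}.so". The "%" operator strips the shortest
    # suffix matching ".so.*", which starts at the last ".so." in the name.
    cut = lib_filename.rfind(".so.")
    stem = lib_filename[:cut] if cut != -1 else lib_filename
    return first, stem + ".so"


if __name__ == "__main__":
    # Example names only; real package contents may differ.
    for name in ("libcudnn.so.8", "libcublas.so.11"):
        print(name, "->", link_names(name))
```

The second, unversioned name is the one a linker looks for when given `-lcudnn` or similar, which matches the stated goal of making native compilation and linking work against the pip-installed libraries.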

# Dependencies for deployment in an ubuntu/debian linux
deps-ubuntu:
apt-get install -y python3 imagemagick libgeos-dev
@@ -68,12 +113,13 @@ deps-test:

# (Re)install the tool
install:
$(PIP) install -U pip wheel setuptools fastentrypoints
$(PIP) install -U pip wheel setuptools
@# speedup for end-of-life builds
@# we cannot use pip config here due to pip#11988
if $(PYTHON) -V | fgrep -e 3.5 -e 3.6; then $(PIP) install --prefer-binary opencv-python-headless numpy; fi
for mod in $(BUILD_ORDER);do (cd $$mod ; $(PIP_INSTALL) .);done
@# workaround for shapely#1598
$(PIP) install --no-binary shapely --force-reinstall shapely
$(PIP) config set global.no-binary shapely
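Review note: the `install` recipe decides whether to prefer binary wheels by running `fgrep -e 3.5 -e 3.6` on the output of `$(PYTHON) -V`, which is a plain substring test. The Python sketch below compares that test with a check on the parsed version tuple. Both function names are hypothetical, and the version strings in the usage part are made-up inputs chosen to show where the two checks can disagree.

```python
import sys


def is_end_of_life(version_info=None):
    """Return True for Python 3.5 or 3.6, the versions the recipe targets."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only major and minor, ignoring the patch level.
    return tuple(version_info[:2]) in ((3, 5), (3, 6))


def fgrep_matches(version_string):
    """Emulate `fgrep -e 3.5 -e 3.6` applied to the output of `python -V`."""
    return "3.5" in version_string or "3.6" in version_string


if __name__ == "__main__":
    print(fgrep_matches("Python 3.6.9"), is_end_of_life((3, 6, 9)))
    print(fgrep_matches("Python 3.8.10"), is_end_of_life((3, 8, 10)))
    # A patch level can also contain the substring "3.5".
    print(fgrep_matches("Python 3.13.5"), is_end_of_life((3, 13, 5)))
```

For the interpreters this project supported at the time, the substring test and the tuple check should agree; the difference only matters for version strings whose later components happen to contain `3.5` or `3.6`.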

# Install with pip install -e
install-dev: uninstall
@@ -149,9 +195,16 @@ assets: repo/assets
.PHONY: test
# Run all unit tests
test: assets
$(PYTHON) -m pytest --continue-on-collection-errors --durations=10\
--ignore=$(TESTDIR)/test_logging.py \
--ignore=$(TESTDIR)/test_logging_conf.py \
--ignore-glob="$(TESTDIR)/**/*bench*.py" \
$(TESTDIR)
#$(MAKE) test-logging

test-logging:
HOME=$(CURDIR)/ocrd_utils $(PYTHON) -m pytest --continue-on-collection-errors -k TestLogging $(TESTDIR)
HOME=$(CURDIR) $(PYTHON) -m pytest --continue-on-collection-errors -k TestLogging $(TESTDIR)
$(PYTHON) -m pytest --continue-on-collection-errors --durations=10 --ignore=$(TESTDIR)/test_logging.py --ignore-glob="$(TESTDIR)/**/*bench*.py" $(TESTDIR)

benchmark:
$(PYTHON) -m pytest $(TESTDIR)/model/test_ocrd_mets_bench.py
@@ -214,40 +267,22 @@ pyclean:

.PHONY: docker docker-cuda

# Additional arguments to docker build. Default: '$(DOCKER_ARGS)'
DOCKER_ARGS =

# Build docker image
docker docker-cuda:
docker build -t $(DOCKER_TAG) --build-arg BASE_IMAGE=$(DOCKER_BASE_IMAGE) $(DOCKER_ARGS) .
docker: DOCKER_BASE_IMAGE = ubuntu:20.04
docker: DOCKER_TAG = ocrd/core
docker: DOCKER_FILE = Dockerfile

# Build docker GPU / CUDA image
docker-cuda: DOCKER_BASE_IMAGE = nvidia/cuda:11.3.1-cudnn8-runtime-ubuntu20.04
docker-cuda: DOCKER_BASE_IMAGE = ocrd/core
docker-cuda: DOCKER_TAG = ocrd/core-cuda
docker-cuda: DOCKER_ARGS += --build-arg FIXUP="make cuda-ubuntu cuda-ldconfig"

#
# CUDA
#

.PHONY: cuda-ubuntu cuda-ldconfig

# Install native CUDA toolkit in different versions
cuda-ubuntu: cuda-ldconfig
apt-get -y install --no-install-recommends cuda-runtime-11-0 cuda-runtime-11-1 cuda-runtime-11-3 cuda-runtime-11-7 cuda-runtime-12-1
docker-cuda: DOCKER_FILE = Dockerfile.cuda

cuda-ldconfig: /etc/ld.so.conf.d/cuda.conf
ldconfig
docker-cuda: docker

/etc/ld.so.conf.d/cuda.conf:
@echo > $@
@echo /usr/local/cuda-11.0/lib64 >> $@
@echo /usr/local/cuda-11.0/targets/x86_64-linux/lib >> $@
@echo /usr/local/cuda-11.1/lib64 >> $@
@echo /usr/local/cuda-11.1/targets/x86_64-linux/lib >> $@
@echo /usr/local/cuda-11.3/lib64 >> $@
@echo /usr/local/cuda-11.3/targets/x86_64-linux/lib >> $@
@echo /usr/local/cuda-11.7/lib64 >> $@
@echo /usr/local/cuda-11.7/targets/x86_64-linux/lib >> $@
@echo /usr/local/cuda-12.1/lib64 >> $@
@echo /usr/local/cuda-12.1/targets/x86_64-linux/lib >> $@
docker docker-cuda:
docker build --progress=plain -f $(DOCKER_FILE) -t $(DOCKER_TAG) --build-arg BASE_IMAGE=$(DOCKER_BASE_IMAGE) $(DOCKER_ARGS) .
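Review note: after this change, `docker` and `docker-cuda` share one recipe and differ only in their target-specific variables, which a caller can still override on the `make` command line (as the GitHub workflow does with `DOCKER_TAG` and `DOCKER_BASE_IMAGE`). The Python sketch below models how those settings end up in the `docker build` call. `build_command` and the `TARGETS` table are hypothetical illustrations built from the values in this hunk; the sketch deliberately ignores the `docker-cuda: docker` prerequisite and looks at one target at a time.

```python
import shlex

# Target-specific defaults as set in the hunk above.
TARGETS = {
    "docker": {
        "DOCKER_BASE_IMAGE": "ubuntu:20.04",
        "DOCKER_TAG": "ocrd/core",
        "DOCKER_FILE": "Dockerfile",
    },
    "docker-cuda": {
        "DOCKER_BASE_IMAGE": "ocrd/core",
        "DOCKER_TAG": "ocrd/core-cuda",
        "DOCKER_FILE": "Dockerfile.cuda",
    },
}


def build_command(target, docker_args="", **overrides):
    """Return the docker build argv for one target.

    Keyword overrides play the role of VAR=value on the make command line,
    which takes precedence over assignments made in the Makefile.
    """
    settings = dict(TARGETS[target])
    settings.update(overrides)
    return [
        "docker", "build", "--progress=plain",
        "-f", settings["DOCKER_FILE"],
        "-t", settings["DOCKER_TAG"],
        "--build-arg", "BASE_IMAGE=" + settings["DOCKER_BASE_IMAGE"],
        *shlex.split(docker_args),
        ".",
    ]


if __name__ == "__main__":
    # "example/..." names are placeholders, not real image tags.
    print(shlex.join(build_command(
        "docker-cuda",
        DOCKER_TAG="example/core-cuda",
        DOCKER_BASE_IMAGE="example/core",
    )))
```

Passing the freshly built core image as `DOCKER_BASE_IMAGE` is what lets the CUDA image build on a locally available tag instead of pulling `ocrd/core` from docker.io.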

# Build wheels and source dist and twine upload them
pypi: uninstall install
3 changes: 1 addition & 2 deletions README.md
@@ -3,10 +3,9 @@
> Python modules implementing [OCR-D specs](https://github.com/OCR-D/spec) and related tools

[![image](https://img.shields.io/pypi/v/ocrd.svg)](https://pypi.org/project/ocrd/)
[![image](https://travis-ci.org/OCR-D/core.svg?branch=master)](https://travis-ci.org/OCR-D/core)
[![image](https://circleci.com/gh/OCR-D/core.svg?style=svg)](https://circleci.com/gh/OCR-D/core)
[![Docker Image CI](https://github.com/OCR-D/core/actions/workflows/docker-image.yml/badge.svg)](https://github.com/OCR-D/core/actions/workflows/docker-image.yml)
[![image](https://scrutinizer-ci.com/g/OCR-D/core/badges/build.png?b=master)](https://scrutinizer-ci.com/g/OCR-D/core)
[![Docker Automated build](https://img.shields.io/docker/automated/ocrd/core.svg)](https://hub.docker.com/r/ocrd/core/tags/)
[![image](https://codecov.io/gh/OCR-D/core/branch/master/graph/badge.svg)](https://codecov.io/gh/OCR-D/core)
[![image](https://scrutinizer-ci.com/g/OCR-D/core/badges/quality-score.png?b=master)](https://scrutinizer-ci.com/g/OCR-D/core)

2 changes: 1 addition & 1 deletion ocrd/ocrd/constants.py
@@ -1,7 +1,7 @@
"""
Constants for ocrd.
"""
from ocrd_utils.package_resources import resource_filename
from pkg_resources import resource_filename

__all__ = [
'TMP_PREFIX',
4 changes: 2 additions & 2 deletions ocrd/ocrd/processor/builtin/dummy_processor.py
@@ -1,6 +1,6 @@
# pylint: disable=missing-module-docstring,invalid-name
from os.path import join, basename
from ocrd_utils.package_resources import resource_string
from pkg_resources import resource_string

import click

@@ -17,7 +17,7 @@
)
from ocrd_modelfactory import page_from_file

OCRD_TOOL = parse_json_string_with_comments(resource_string(__name__, 'ocrd-tool.json').decode('utf8'))
OCRD_TOOL = parse_json_string_with_comments(resource_string(__name__, 'dummy/ocrd-tool.json').decode('utf8'))

class DummyProcessor(Processor):
"""
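Review note: the two `resource_string` calls in the `dummy_processor.py` hunk differ only in the resource name (`ocrd-tool.json` versus `dummy/ocrd-tool.json`). With `pkg_resources`, a resource name is resolved relative to the directory of the module passed as the first argument. The Python sketch below approximates that lookup for ordinary on-disk packages. `resource_path` is a hypothetical helper; the real `pkg_resources` provider also handles zipped distributions, which this sketch does not.

```python
import os
import sys


def resource_path(module_name, resource_name):
    """Approximate where pkg_resources would look for a resource on disk.

    Assumes the module is already imported and lives on the file system.
    """
    module = sys.modules[module_name]
    base = os.path.dirname(module.__file__)
    # Resource names always use "/" as separator, regardless of platform.
    return os.path.join(base, *resource_name.split("/"))


if __name__ == "__main__":
    # Uses the stdlib "os" module only as a stand-in for a real package.
    print(resource_path("os", "dummy/ocrd-tool.json"))
```

Under this reading, `dummy/ocrd-tool.json` refers to a `dummy` subdirectory next to the module file, while the bare `ocrd-tool.json` would have to sit in the same directory as the module.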
16 changes: 0 additions & 16 deletions ocrd/ocrd/resource_list.yml
@@ -59,19 +59,3 @@ ocrd-sbb-binarize:
type: archive
path_in_archive: models
size: 1654623597
ocrd-sbb-textline-detector:
- url: https://qurator-data.de/sbb_textline_detector/models.tar.gz
description: default models provided by github.com/qurator-spk
name: default
type: archive
size: 1194551551
ocrd-kraken-segment:
- url: https://github.com/mittagessen/kraken/raw/master/kraken/blla.mlmodel
description: Pretrained baseline segmentation model
name: blla.mlmodel
size: 5046835
ocrd-kraken-recognize:
- url: https://zenodo.org/record/2577813/files/en_best.mlmodel?download=1
name: en_best.mlmodel
description: This model has been trained on a large corpus of modern printed English text\naugmented with ~10000 lines of historical pages
size: 2930723
2 changes: 1 addition & 1 deletion ocrd/ocrd/workspace_bagger.py
@@ -9,6 +9,7 @@
import sys
from bagit import Bag, make_manifests, _load_tag_file, _make_tag_file, _make_tagmanifest_file # pylint: disable=no-name-in-module
from distutils.dir_util import copy_tree
from pkg_resources import get_distribution

from ocrd_utils import (
pushd_popd,
@@ -22,7 +23,6 @@
from ocrd_validators.constants import BAGIT_TXT, TMP_BAGIT_PREFIX, OCRD_BAGIT_PROFILE_URL
from ocrd_modelfactory import page_from_file
from ocrd_models.ocrd_page import to_xml
from ocrd_utils.package_resources import get_distribution

from .workspace import Workspace

1 change: 0 additions & 1 deletion ocrd/setup.py
@@ -1,5 +1,4 @@
# -*- coding: utf-8 -*-
import fastentrypoints
from setuptools import setup, find_packages
from ocrd_utils import VERSION

2 changes: 1 addition & 1 deletion ocrd_models/ocrd_models/constants.py
@@ -1,7 +1,7 @@
"""
Constants for ocrd_models.
"""
from ocrd_utils.package_resources import resource_string
from pkg_resources import resource_string
import re

__all__ = [