HTSNet

Handwritten Text Separation via End-to-End Learning of Convolutional Neural Network

Junho Jo, Hyung Il Koo, Jae Woong Soh, Nam Ik Cho

Environments

python 3.6
scipy
opencv
numpy
tqdm

Abstract

We present a method that separates handwritten and machine-printed components that are mixed and overlapped in documents. Many conventional methods addressed this problem by extracting connected components (CCs) and classifying the extracted CCs into two classes. They were based on the assumption that two types of components are not overlapping each other, while we are focusing on more challenging and realistic cases where the components are often overlapping or touching each other. For this, we propose a new method that performs pixel-level classification with a convolutional neural network. Unlike conventional neural network methods, our method works in an end-to-end manner and does not require any preprocessing steps (e.g., foreground extraction, handcrafted feature extraction, and so on). For the training of our network, we develop a cross-entropy based loss function to alleviate the class imbalance problem. Regarding the training dataset, although there are some datasets of mixed printed characters and handwritten scripts, most of them do not have many overlapping cases and do not provide pixel-level annotations. Hence, we also propose a data synthesis method that generates realistic pixel-level training samples having many overlappings of printed and handwritten characters.
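The abstract mentions a cross-entropy based loss that counters class imbalance, but this README does not give its formula. As a rough illustration of the general idea only, the sketch below shows a generic class-weighted pixel-wise cross-entropy in NumPy. The function names, the three-class setup, and the inverse-frequency weighting are assumptions made for this example and are not taken from the paper or this repository.

```python
import numpy as np


def inverse_frequency_weights(labels, num_classes):
    """Hypothetical helper: weight each class by 1 / (its pixel frequency).

    Classes that do not occur in `labels` get weight 0.
    """
    counts = np.bincount(np.asarray(labels).ravel(), minlength=num_classes)
    counts = counts.astype(np.float64)
    freq = counts / counts.sum()
    return np.where(counts > 0, 1.0 / np.maximum(freq, 1e-12), 0.0)


def weighted_cross_entropy(probs, labels, class_weights, eps=1e-12):
    """Weighted mean of the per-pixel cross-entropy.

    probs:         (H, W, C) per-pixel class probabilities (e.g. softmax output)
    labels:        (H, W) integer class ids in [0, C)
    class_weights: (C,) one weight per class
    """
    probs = np.asarray(probs, dtype=np.float64)
    labels = np.asarray(labels)
    rows, cols = np.indices(labels.shape)
    # Probability the model assigned to the true class of every pixel.
    picked = probs[rows, cols, labels]
    per_pixel = -np.log(picked + eps)
    # Look up the weight of each pixel's true class.
    weights = np.asarray(class_weights, dtype=np.float64)[labels]
    return float((weights * per_pixel).sum() / weights.sum())
```

With weights of this kind, rare classes such as handwritten strokes contribute more per pixel than the dominant background, so a model cannot reach a low loss by predicting background everywhere.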

Synthesis method

Prepare datasets

For data synthesis, machine-printed and handwritten crops with their pixel-wise annotations are required.
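To illustrate why both kinds of crops and their pixel-wise annotations are needed, here is a minimal sketch of how two same-sized crops could be overlaid into one training sample. It is not the logic of data_generation.py: the function name, the pixel-wise minimum used for compositing (which assumes dark ink on light paper), and the label ids 0 to 3 are all assumptions made for this example.

```python
import numpy as np


def overlay_crops(printed_gray, handwritten_gray, printed_mask, handwritten_mask):
    """Composite two grayscale crops and derive a per-pixel label map.

    printed_gray, handwritten_gray: (H, W) uint8 images, dark ink on light paper
    printed_mask, handwritten_mask: (H, W) boolean text annotations
    Returns (composite, labels) with assumed label ids:
        0 = background, 1 = machine-printed, 2 = handwritten, 3 = overlap
    """
    printed_gray = np.asarray(printed_gray, dtype=np.uint8)
    handwritten_gray = np.asarray(handwritten_gray, dtype=np.uint8)
    if printed_gray.shape != handwritten_gray.shape:
        raise ValueError("both crops must have the same shape")

    # Keeping the darker pixel lets ink from either source stay visible.
    composite = np.minimum(printed_gray, handwritten_gray)

    printed = np.asarray(printed_mask, dtype=bool)
    handwritten = np.asarray(handwritten_mask, dtype=bool)
    labels = np.zeros(printed.shape, dtype=np.uint8)
    labels[printed] = 1
    labels[handwritten] = 2
    # The overlap class can only be labeled when both annotations are known.
    labels[printed & handwritten] = 3
    return composite, labels
```

The overlap label is the reason each source needs its own annotation: once the crops are merged, the composite image alone no longer reveals which pixels belong to both.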

Run synthesis code

python data_generation.py --data_root ${YOUR_DATA_ROOT} --save_dir ${YOUR_SAVE_DIR} --patch_size 256

Running the above command generates scribbled document patches in ${YOUR_SAVE_DIR}, as shown in the following figures. The first row shows synthesized patches and the second row shows the corresponding pixel-level annotations. Blue, red, and green denote background, machine-printed, and handwritten text pixels, respectively. Yellow marks overlapping areas.
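The color coding described above can be reproduced from two binary masks, which may help when inspecting or converting the annotations. The sketch below is an illustration under assumptions: the exact RGB values, the RGB channel order (OpenCV loads images as BGR), and the function name are hypothetical and not read from data_generation.py.

```python
import numpy as np

# Assumed pure RGB colors for the four annotation classes.
BACKGROUND = (0, 0, 255)    # blue
PRINTED = (255, 0, 0)       # red
HANDWRITTEN = (0, 255, 0)   # green
OVERLAP = (255, 255, 0)     # yellow


def render_annotation(printed_mask, handwritten_mask):
    """Turn two (H, W) boolean masks into an (H, W, 3) uint8 RGB annotation."""
    printed = np.asarray(printed_mask, dtype=bool)
    handwritten = np.asarray(handwritten_mask, dtype=bool)
    if printed.shape != handwritten.shape:
        raise ValueError("both masks must have the same shape")

    out = np.empty(printed.shape + (3,), dtype=np.uint8)
    out[...] = BACKGROUND
    out[printed] = PRINTED
    out[handwritten] = HANDWRITTEN
    # Overlap is written last so it overrides both single-class colors.
    out[printed & handwritten] = OVERLAP
    return out
```

If the image is written with OpenCV, the channels would need to be reversed first (for example with `out[..., ::-1]`), since `cv2.imwrite` expects BGR order.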

Files used in the bachelor thesis

All requirements can be found in the file requirements.txt.

The file binim.py contains the code that was used to binarize the color images of the wgm dataset.
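This README does not describe how binim.py binarizes images. For readers unfamiliar with the step, the sketch below shows one common approach, global Otsu thresholding, written in plain NumPy. It is a swapped-in technique for illustration, not the contents of binim.py. The function names and the RGB luminance weights are assumptions.

```python
import numpy as np


def to_gray(rgb):
    """Convert an (H, W, 3) RGB image to uint8 grayscale (assumed luma weights)."""
    rgb = np.asarray(rgb, dtype=np.float64)
    gray = rgb @ np.array([0.299, 0.587, 0.114])
    return np.clip(np.rint(gray), 0, 255).astype(np.uint8)


def otsu_threshold(gray):
    """Return the threshold t that maximizes the between-class variance.

    Pixels <= t form one class and pixels > t form the other.
    """
    gray = np.asarray(gray, dtype=np.uint8)
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    levels = np.arange(256, dtype=np.float64)
    w0 = np.cumsum(hist)                 # pixel count at or below each level
    w1 = hist.sum() - w0                 # pixel count above each level
    sum0 = np.cumsum(hist * levels)      # intensity sum at or below each level
    valid = (w0 > 0) & (w1 > 0)
    mean0 = sum0 / np.maximum(w0, 1.0)
    mean1 = (sum0[-1] - sum0) / np.maximum(w1, 1.0)
    between = np.where(valid, w0 * w1 * (mean0 - mean1) ** 2, 0.0)
    return int(np.argmax(between))


def binarize(rgb):
    """Binarize a color image: 255 for pixels above the threshold, else 0."""
    gray = to_gray(rgb)
    t = otsu_threshold(gray)
    return np.where(gray > t, 255, 0).astype(np.uint8)
```

A global threshold like this works best on evenly lit scans. Documents with shadows or stains usually need a local (adaptive) method instead.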

The synthesis method is defined in the file data_generation.py.
