29 commits
035f66a
Add PaddleDetection-based Layout Model (#54)
an1018 Aug 17, 2021
262ef0b
Minor fixes for typos and bugs (#58)
lolipopshock Aug 20, 2021
f027525
Add the MFD Model for detecting mathematic formula regions (#59)
lolipopshock Aug 21, 2021
c6a5c6f
[feat] Allowing specifying box_alpha for draw_box (#60)
lolipopshock Aug 27, 2021
2afd2a6
Re-organize the elements to individual files (#62)
lolipopshock Aug 31, 2021
e8d5488
Re-org the OCR Model Files (#64)
lolipopshock Sep 8, 2021
9b73ff1
[feat] Dynamic import based on the available dependencies (#65)
lolipopshock Sep 9, 2021
b4b4fea
Add efficientdet models (#67)
lolipopshock Sep 9, 2021
06fca71
Fix file utils and add tests (#68)
lolipopshock Sep 9, 2021
4ff55fa
[feat] AutoLayoutModel and flexible model configs (#69)
lolipopshock Sep 10, 2021
c0044a0
[fix] Add License Information (#70)
Sep 11, 2021
341d3fc
[feat] Add pdf loader (#71)
lolipopshock Sep 13, 2021
4f75c5a
[feat] Add shape operation tools (#72)
lolipopshock Sep 13, 2021
f5d129e
Fix readthedocs build
lolipopshock Sep 13, 2021
d4bb55a
Update READMEs (#73)
lolipopshock Sep 13, 2021
ea3cc11
version bump 0.3.0
lolipopshock Sep 13, 2021
73e3015
[fix] Remove detectron2 from extras_require (#74)
lolipopshock Sep 13, 2021
52ce56b
[fix] set label_map in Detectron2LayoutModel (#75)
lolipopshock Sep 14, 2021
f89a18f
Update the installation command
lolipopshock Sep 14, 2021
867b89e
version bump 0.3.1
lolipopshock Sep 15, 2021
b9fd396
Improve the installation instructions
lolipopshock Sep 23, 2021
e70bf05
[fix] Improve dependencies for multi-backend support (#79)
lolipopshock Sep 23, 2021
29fb2fb
version bump 0.3.2
lolipopshock Sep 23, 2021
6651da5
Minor update to Deep Learning Parser example notebook (#56)
Jim-Salmons Jan 12, 2022
87e5e72
Set `inplace` to True in sorting function in the example code (#104)
yusanshi Jan 12, 2022
79bd1af
CI for PR as well
lolipopshock Feb 2, 2022
cd295de
Robust pdf loading for empty pages (#115)
lolipopshock Feb 2, 2022
0809fa8
fix to issue #94 (#95)
kforcodeai Feb 2, 2022
f230971
Add notebook for customizing LayoutParser Models with Label Studio An…
lolipopshock Apr 2, 2022
Binary file modified .github/layout-parser.png
65 changes: 59 additions & 6 deletions .github/workflows/ci.yml
@@ -1,17 +1,70 @@
name: CI

on: [push]
on: [push, pull_request]

jobs:
  build:

  test_only_effdet_backend:

    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-python@v2
        with:
          python-version: '3.7'

      - name: Test Dependency Support
        run: |
          pip install pytest
          pip install -e . # The bare layoutparser module
          pytest tests_deps/test_file_utils.py

      - name: Install only effdet deps
        run: |
          pip install pytest
          pip install -e ".[effdet]"
          pytest tests_deps/test_only_effdet.py

  test_only_detectron2_backend:

    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-python@v2
        with:
          python-version: '3.7'

      - name: Install only Detectron2 deps
        run: |
          pip install pytest
          pip install -e .
          pip install torchvision && pip install "git+https://github.com/facebookresearch/detectron2.git@v0.5#egg=detectron2"
          pytest tests_deps/test_only_detectron2.py

  test_only_paddledetection_backend:

    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-python@v2
        with:
          python-version: '3.7'

      - name: Install only PaddleDetection deps
        run: |
          pip install pytest
          pip install -e ".[paddledetection]"
          pytest tests_deps/test_only_paddledetection.py

  test_all_methods_all_backends:
    needs: [test_only_effdet_backend, test_only_detectron2_backend, test_only_paddledetection_backend]
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: [3.7, 3.8]

    steps:
      - uses: actions/checkout@v2

      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v2
        with:
@@ -21,7 +74,6 @@ jobs:
        run: |
          python -m pip install --upgrade pip
          pip install .
          pip install 'git+https://github.com/facebookresearch/detectron2.git@v0.1.3#egg=detectron2'

      - name: Lint with flake8
        run: |
@@ -30,9 +82,10 @@ jobs:
          flake8 . --count --select=E9,F63,F7,F82 --show-source --statistics --ignore F821
          # exit-zero treats all errors as warnings. The GitHub editor is 127 chars wide
          flake8 . --count --exit-zero --max-complexity=10 --max-line-length=127 --statistics

      - name: Test with pytest
        run: |
          # Install additional requirements when running tests
          pip install ".[effdet]"
          pip install -r dev-requirements.txt
          pytest
          pytest tests
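The per-backend jobs in this workflow install one optional dependency set at a time and then run the matching `tests_deps` test. Below is a minimal sketch of the kind of availability check such a setup relies on, assuming the standard-library `importlib.util.find_spec` approach. The helper names `is_package_available` and `available_backends`, and the module names in `BACKEND_MODULES`, are hypothetical and are not taken from layoutparser's own `file_utils`.

```python
import importlib.util

# Hypothetical mapping from a backend label to the top-level module it needs.
# These module names are assumptions for illustration only.
BACKEND_MODULES = {
    "effdet": "effdet",
    "detectron2": "detectron2",
    "paddledetection": "paddle",
}


def is_package_available(module_name: str) -> bool:
    """Return True if a top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


def available_backends(backend_modules=BACKEND_MODULES):
    """List the backend labels whose required module is installed."""
    return sorted(
        name
        for name, module in backend_modules.items()
        if is_package_available(module)
    )


if __name__ == "__main__":
    # The result depends on what is installed in the current environment.
    print(available_backends())
```

A check like this lets the bare `pip install -e .` job pass even when no deep learning backend is present, which is what the first CI step appears to exercise.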
3 changes: 3 additions & 0 deletions .gitignore
@@ -1,3 +1,6 @@
# Examples files
examples/Customizing Layout Models with Label Studio Annotation/downloaded-annotations

*.bak
.gitattributes
.last_checked
2 changes: 2 additions & 0 deletions .readthedocs.yml
@@ -18,4 +18,6 @@ python:
  install:
    - method: pip
      path: .
      extra_requirements:
        - effdet
    - requirements: dev-requirements.txt
128 changes: 90 additions & 38 deletions README.md
@@ -1,68 +1,120 @@
<p align="center">
<img src="https://github.com/Layout-Parser/layout-parser/raw/master/.github/layout-parser.png" alt="Layout Parser Logo" width="35%">
<p align="center">
<h3 align="center">
A unified toolkit for Deep Learning Based Document Image Analysis
</p>
</h3>
</p>

<p align=center>
<a href="https://arxiv.org/abs/2103.15348"><img src="https://img.shields.io/badge/arXiv-2103.15348-b31b1b.svg" title="Layout Parser Paper"></a>
<a href="https://layout-parser.github.io"><img src="https://img.shields.io/badge/website-layout--parser.github.io-informational.svg" title="Layout Parser Paper"></a>
<a href="https://layout-parser.readthedocs.io/en/latest/"><img src="https://img.shields.io/badge/doc-layout--parser.readthedocs.io-light.svg" title="Layout Parser Documentation"></a>
<a href="https://pypi.org/project/layoutparser/"><img src="https://img.shields.io/pypi/v/layoutparser?color=%23099cec&label=PyPI%20package&logo=pypi&logoColor=white" title="The current version of Layout Parser"></a>
<a href="https://github.com/Layout-Parser/layout-parser/blob/master/LICENSE"><img src="https://img.shields.io/pypi/l/layoutparser" title="Layout Parser uses Apache 2 License"></a>
<img alt="PyPI - Downloads" src="https://img.shields.io/pypi/dm/layoutparser">
</p>

<p align=center>
<a href="https://pypi.org/project/layoutparser/"><img src="https://img.shields.io/pypi/v/layoutparser?color=%23099cec&label=PyPI%20package&logo=pypi&logoColor=white" title="The current version of Layout Parser"></a>
<a href="https://pypi.org/project/layoutparser/"><img src="https://img.shields.io/pypi/pyversions/layoutparser?color=%23099cec&" alt="Python 3.6 3.7 3.8" title="Layout Parser supports Python 3.6 and above"></a>
<img alt="PyPI - Downloads" src="https://img.shields.io/pypi/dm/layoutparser">
<a href="https://github.com/Layout-Parser/layout-parser/blob/master/LICENSE"><img src="https://img.shields.io/pypi/l/layoutparser" title="Layout Parser uses Apache 2 License"></a>
<a href="https://arxiv.org/abs/2103.15348"><img src="https://img.shields.io/badge/paper-2103.15348-b31b1b.svg" title="Layout Parser Paper"></a>
<a href="https://layout-parser.github.io"><img src="https://img.shields.io/badge/website-layout--parser.github.io-informational.svg" title="Layout Parser Paper"></a>
<a href="https://layout-parser.readthedocs.io/en/latest/"><img src="https://img.shields.io/badge/doc-layout--parser.readthedocs.io-light.svg" title="Layout Parser Documentation"></a>
</p>

---

## Installation
## What is LayoutParser

![Example Usage](.github/example.png)

You can find detailed installation instructions in [installation.md](installation.md). But generally, it's just `pip install`
some libraries:
LayoutParser aims to provide a wide range of tools that streamline Document Image Analysis (DIA) tasks. Please check the LayoutParser [demo video](https://youtu.be/8yA5xB4Dg8c) (1 min) or [full talk](https://www.youtube.com/watch?v=YG0qepPgyGY) (15 min) for details. Here are some key features:

- LayoutParser provides a rich repository of deep learning models for layout detection as well as a set of unified APIs for using them. For example,

<details>
<summary>Perform DL layout detection in 4 lines of code</summary>

```python
import layoutparser as lp
model = lp.AutoLayoutModel('lp://EfficientDete/PubLayNet')
# image = Image.open("path/to/image")
layout = model.detect(image)
```

</details>

- LayoutParser comes with a set of layout data structures with carefully designed APIs that are optimized for document image analysis tasks. For example,

<details>
<summary>Selecting layout/textual elements in the left column of a page</summary>

```python
image_width = image.size[0]
left_column = lp.Interval(0, image_width/2, axis='x')
layout.filter_by(left_column, center=True) # select objects in the left column
```

</details>

<details>
<summary>Performing OCR for each detected Layout Region</summary>

```python
ocr_agent = lp.TesseractAgent()
for layout_region in layout:
    image_segment = layout_region.crop(image)
    text = ocr_agent.detect(image_segment)
```

</details>

<details>
<summary>Flexible APIs for visualizing the detected layouts</summary>

```python
lp.draw_box(image, layout, box_width=1, show_element_id=True, box_alpha=0.25)
```

</details>

<details>
<summary>Loading layout data stored in json, csv, and even PDFs</summary>

```python
layout = lp.load_json("path/to/json")
layout = lp.load_csv("path/to/csv")
pdf_layout = lp.load_pdf("path/to/pdf")
```

</details>

- LayoutParser is also an open platform that enables the sharing of layout detection models and DIA pipelines among the community.
<details>
<summary><a href="https://layout-parser.github.io/platform/">Check</a> the LayoutParser open platform</summary>
</details>

<details>
<summary><a href="https://github.com/Layout-Parser/platform">Submit</a> your models/pipelines to LayoutParser</summary>
</details>

```bash
pip install -U layoutparser
## Installation

# Install Detectron2 for using DL Layout Detection Model
# Please make sure the PyTorch version is compatible with
# the installed Detectron2 version.
pip install 'git+https://github.com/facebookresearch/detectron2.git@v0.4#egg=detectron2'
After several major updates, layoutparser provides various functionalities and deep learning models from different backends. It is still easy to install, and the installation is designed so that you can install only the dependencies your project needs:

# Install the ocr components when necessary
pip install layoutparser[ocr]
```bash
pip install layoutparser # Install the base layoutparser library with
pip install "layoutparser[layoutmodels]" # Install DL layout model toolkit
pip install "layoutparser[ocr]" # Install OCR toolkit
```

**For Windows Users:** Please read [installation.md](installation.md) for details about installing Detectron2.
Extra steps are needed if you want to use Detectron2-based models. Please check [installation.md](installation.md) for additional details on layoutparser installation.

## Quick Start
## Examples

We provide a series of examples to help you start using the layout parser library:

1. [Table OCR and Results Parsing](https://github.com/Layout-Parser/layout-parser/blob/master/examples/OCR%20Tables%20and%20Parse%20the%20Output.ipynb): `layoutparser` can be used to conveniently OCR documents and convert the output into structured data.

2. [Deep Layout Parsing Example](https://github.com/Layout-Parser/layout-parser/blob/master/examples/Deep%20Layout%20Parsing.ipynb): With the help of Deep Learning, `layoutparser` supports the analysis of very complex documents and the processing of the hierarchical structure in their layouts.


## DL Assisted Layout Prediction Example

![Example Usage](.github/example.png)

*The images shown in the figure above are: a screenshot of [this paper](https://arxiv.org/abs/2004.08686), an image from the [PRIMA Layout Analysis Dataset](https://www.primaresearch.org/dataset/), a screenshot of the [WSJ website](http://wsj.com), and an image from the [HJDataset](https://dell-research-harvard.github.io/HJDataset/).*

With only 4 lines of code in `layoutparser`, you can unlock information from complex documents that existing tools could not provide. You can either choose a deep learning model from the [ModelZoo](https://github.com/Layout-Parser/layout-parser/blob/master/docs/notes/modelzoo.md), or load a model that you trained on your own. Then use the following code to predict the layout as well as visualize it:

```python
>>> import layoutparser as lp
>>> model = lp.Detectron2LayoutModel('lp://PrimaLayout/mask_rcnn_R_50_FPN_3x/config')
>>> layout = model.detect(image) # You need to load the image somewhere else, e.g., image = cv2.imread(...)
>>> lp.draw_box(image, layout,) # With extra configurations
```

## Contributing

We encourage you to contribute to Layout Parser! Please check out the [Contributing guidelines](.github/CONTRIBUTING.md) for how to proceed. Join us!
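The README snippet that selects the left column with `lp.Interval(0, image_width/2, axis='x')` and `filter_by(..., center=True)` can be illustrated without the library. The sketch below assumes that `center=True` keeps a block when its center point falls inside the interval; the `Box` alias and both helper functions are hypothetical and only mimic that idea on plain tuples.

```python
from typing import List, Tuple

# A box is (x_1, y_1, x_2, y_2); this alias is an assumption for the sketch.
Box = Tuple[float, float, float, float]


def center_x(box: Box) -> float:
    """Horizontal center of a box."""
    return (box[0] + box[2]) / 2


def filter_by_x_interval(boxes: List[Box], start: float, end: float) -> List[Box]:
    """Keep the boxes whose horizontal center lies within [start, end]."""
    return [box for box in boxes if start <= center_x(box) <= end]


image_width = 1000
boxes = [
    (50, 100, 450, 200),   # center x = 250, left half
    (400, 300, 700, 400),  # center x = 550, straddles the middle
    (550, 100, 950, 200),  # center x = 750, right half
]

# Select boxes whose center is in the left half of the page.
left_column = filter_by_x_interval(boxes, 0, image_width / 2)
```

Testing the center rather than the full extent means a block that slightly overhangs the column boundary is still assigned to exactly one column.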
5 changes: 4 additions & 1 deletion dev-requirements.txt
@@ -1,4 +1,5 @@
pytest
torch
numpy
opencv-python
pandas
@@ -10,4 +11,6 @@ sphinx_rtd_theme
google-cloud-vision==1
pytesseract
pycocotools
git+https://github.com/facebookresearch/detectron2.git@v0.4#egg=detectron2
git+https://github.com/facebookresearch/detectron2.git@v0.4#egg=detectron2
paddlepaddle
effdet
10 changes: 8 additions & 2 deletions docs/api_doc/io.rst
@@ -2,22 +2,28 @@ Load and Export Layout Data
================================


DataFrame and CSV
`Dataframe` and CSV
--------------------------------

.. autofunction:: layoutparser.io.load_dataframe

.. autofunction:: layoutparser.io.load_csv


Dictionary and JSON
`Dict` and JSON
--------------------------------

.. autofunction:: layoutparser.io.load_dict

.. autofunction:: layoutparser.io.load_json


PDF
--------------------------------

.. autofunction:: layoutparser.io.load_pdf


Other Formats
--------------------------------
Stay tuned! We are working on supporting more formats.
14 changes: 14 additions & 0 deletions docs/conf.py
@@ -1,3 +1,17 @@
# Copyright 2021 The Layout Parser team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Configuration file for the Sphinx documentation builder.
#
# This file only contains a selection of the most common options. For a full
4 changes: 2 additions & 2 deletions docs/example/deep_layout_parsing/index.rst
@@ -118,10 +118,10 @@ Finally sort the text regions and assign ids:
left_interval = lp.Interval(0, w/2*1.05, axis='x').put_on_canvas(image)

left_blocks = text_blocks.filter_by(left_interval, center=True)
left_blocks.sort(key = lambda b:b.coordinates[1])
left_blocks.sort(key = lambda b:b.coordinates[1], inplace=True)

right_blocks = [b for b in text_blocks if b not in left_blocks]
right_blocks.sort(key = lambda b:b.coordinates[1])
right_blocks.sort(key = lambda b:b.coordinates[1], inplace=True)

# And finally combine the two list and add the index
# according to the order
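This change passes `inplace=True`, which suggests the layout container's `sort` returns a new object by default instead of reordering the existing one. One thing worth noting: `right_blocks` is built with a list comprehension, so it is a plain Python `list`, and `list.sort` does not accept an `inplace` keyword. A minimal sketch of both behaviours follows, using a hypothetical stand-in class `SketchLayout` (not layoutparser's own `Layout`) so that it runs without the library:

```python
class SketchLayout:
    """Hypothetical stand-in for a layout container; not layoutparser's class."""

    def __init__(self, blocks):
        self._blocks = list(blocks)

    def sort(self, key=None, reverse=False, inplace=False):
        # With inplace=True, reorder this object and return None;
        # otherwise leave it untouched and return a sorted copy.
        if inplace:
            self._blocks.sort(key=key, reverse=reverse)
            return None
        return SketchLayout(sorted(self._blocks, key=key, reverse=reverse))

    def __iter__(self):
        return iter(self._blocks)


blocks = SketchLayout([(0, 30), (0, 10), (0, 20)])

# Without inplace=True the return value is discarded and nothing changes.
blocks.sort(key=lambda b: b[1])
order_after_plain_call = list(blocks)

# With inplace=True the container itself is reordered.
blocks.sort(key=lambda b: b[1], inplace=True)
order_after_inplace_call = list(blocks)

# A built-in list rejects the `inplace` keyword.
try:
    [3, 1, 2].sort(key=None, inplace=True)
    list_accepts_inplace = True
except TypeError:
    list_accepts_inplace = False
```

If the same holds for the real `Layout` class, wrapping the comprehension in a layout object before sorting would be one way to make the second call valid; that is an assumption, not something this diff confirms.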
5 changes: 4 additions & 1 deletion docs/notes/modelzoo.md
@@ -28,6 +28,8 @@ model.detect(image)
| [NewspaperNavigator](https://news-navigator.labs.loc.gov/) | [faster_rcnn_R_50_FPN_3x](https://www.dropbox.com/s/wnido8pk4oubyzr/config.yml?dl=1) | lp://NewspaperNavigator/faster_rcnn_R_50_FPN_3x/config | |
| [TableBank](https://doc-analysis.github.io/tablebank-page/index.html) | [faster_rcnn_R_50_FPN_3x](https://www.dropbox.com/s/7cqle02do7ah7k4/config.yaml?dl=1) | lp://TableBank/faster_rcnn_R_50_FPN_3x/config | 89.78 [eval.csv](https://www.dropbox.com/s/1uwnz58hxf96iw2/eval.csv?dl=0) |
| [TableBank](https://doc-analysis.github.io/tablebank-page/index.html) | [faster_rcnn_R_101_FPN_3x](https://www.dropbox.com/s/h63n6nv51kfl923/config.yaml?dl=1) | lp://TableBank/faster_rcnn_R_101_FPN_3x/config | 91.26 [eval.csv](https://www.dropbox.com/s/e1kq8thkj2id1li/eval.csv?dl=0) |
| [Math Formula Detection(MFD)](http://transcriptorium.eu/~htrcontest/MathsICDAR2021/) | [faster_rcnn_R_50_FPN_3x](https://www.dropbox.com/s/ld9izb95f19369w/config.yaml?dl=1) | lp://MFD/faster_rcnn_R_50_FPN_3x/config | 79.68 [eval.csv](https://www.dropbox.com/s/1yvrs29jjybrlpw/eval.csv?dl=0) |


* For PubLayNet models, we suggest using `mask_rcnn_X_101_32x8d_FPN_3x` model as it's trained on the whole training set, while others are only trained on the validation set (the size is only around 1/50). You could expect a 15% AP improvement using the `mask_rcnn_X_101_32x8d_FPN_3x` model.

@@ -39,4 +41,5 @@ model.detect(image)
| [PubLayNet](https://github.com/ibm-aur-nlp/PubLayNet) | `{0: "Text", 1: "Title", 2: "List", 3:"Table", 4:"Figure"}` |
| [PrimaLayout](https://www.primaresearch.org/dataset/) | `{1:"TextRegion", 2:"ImageRegion", 3:"TableRegion", 4:"MathsRegion", 5:"SeparatorRegion", 6:"OtherRegion"}` |
| [NewspaperNavigator](https://news-navigator.labs.loc.gov/) | `{0: "Photograph", 1: "Illustration", 2: "Map", 3: "Comics/Cartoon", 4: "Editorial Cartoon", 5: "Headline", 6: "Advertisement"}` |
| [TableBank](https://doc-analysis.github.io/tablebank-page/index.html) | `{0: "Table"}` |
| [TableBank](https://doc-analysis.github.io/tablebank-page/index.html) | `{0: "Table"}` |
| [MFD](http://transcriptorium.eu/~htrcontest/MathsICDAR2021/) | `{1: "Equation"}` |
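The label maps in this table translate a model's integer class ids into names, and the new MFD entry starts at `1` while TableBank starts at `0`. Below is a minimal sketch of a lookup that tolerates ids missing from a map. The `LABEL_MAPS` dictionary copies three rows from the table, while the `name_for` helper and its `"Unknown"` default are hypothetical.

```python
# Label maps copied from the table rows above.
LABEL_MAPS = {
    "PubLayNet": {0: "Text", 1: "Title", 2: "List", 3: "Table", 4: "Figure"},
    "TableBank": {0: "Table"},
    "MFD": {1: "Equation"},
}


def name_for(dataset: str, class_id: int, default: str = "Unknown") -> str:
    """Map a predicted class id to its label name for the given dataset."""
    return LABEL_MAPS[dataset].get(class_id, default)


if __name__ == "__main__":
    print(name_for("MFD", 1))
    print(name_for("MFD", 0))
```

Falling back to a default instead of raising keeps a pipeline running when a model emits an id that the chosen map does not cover.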