For text mining of biological research articles using DeepDive
Tong Shu Li and Sandip Chatterjee of the Su Lab
- Python 3.4+
- Java 1.8 for various NER/NLP annotators
- PBS/Torque cluster for cluster workflows
- Python package dependencies in `requirements.txt`
  - `lxml` may require an external `libxml2` installation (using a tool like `apt-get`)
- See DeepDive main page for the latest installation instructions
- Run `bash <(curl -fsSL git.io/getdeepdive)`
- Install DeepDive by selecting option from menu
- Install PostgreSQL by selecting option from menu
- On Ubuntu (14.04), run `sudo apt-get install -y python3-pip`
- Install `lxml` dependencies using: `sudo apt-get install -y libxml2 libxml2-dev libxslt1-dev lib32z1-dev`
- Clone the repo and `cd bioshovel`
- Create a virtualenv: `$ python3 -m venv venv`
- Activate virtualenv: `$ source venv/bin/activate`
- Install dependencies: `(venv) $ pip install -r requirements.txt`
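
After the steps above, a quick sanity check of the environment can save debugging time later. The following is only a minimal sketch and is not part of bioshovel: the function name `check_environment` and its parameters are hypothetical. It checks the Python version, whether a `java` executable is on the `PATH`, and whether `lxml` can be imported. It does not verify that the Java version is 1.8.

```python
import importlib.util
import shutil
import sys


def check_environment(min_python=(3, 4), java_cmd="java", packages=("lxml",)):
    """Return a list of problem descriptions; an empty list means every check passed.

    Hypothetical helper for illustration only (not part of bioshovel).
    """
    problems = []

    # The interpreter running this script must be at least min_python.
    if sys.version_info[:2] < min_python:
        problems.append("Python %d.%d+ is required" % min_python)

    # shutil.which returns None when the command is not found on PATH.
    if shutil.which(java_cmd) is None:
        problems.append("%r was not found on PATH" % java_cmd)

    # find_spec returns None for a top-level package that cannot be located.
    for name in packages:
        if importlib.util.find_spec(name) is None:
            problems.append("package %r is not importable" % name)

    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

Run it with the virtualenv activated so that the packages installed from `requirements.txt` are the ones being checked.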
- Modules should be run from the `src` directory
- Use `(venv) $ python3 -m [package_name].[module_name] [args]`
- See preprocess and downloaders packages for more information
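
To illustrate the `python3 -m` invocation pattern, here is a minimal sketch of a module laid out so it can be run that way. Everything in it is hypothetical: the package name `example_pkg`, the module name `word_count`, and the functions are placeholders, not actual bioshovel code. The assumed location is `src/example_pkg/word_count.py`.

```python
"""Hypothetical module, assumed to live at src/example_pkg/word_count.py."""
import argparse


def count_words(text):
    """Count whitespace-separated tokens in text."""
    return len(text.split())


def build_parser():
    parser = argparse.ArgumentParser(description="Count whitespace-separated words.")
    # Zero or more positional words; they are joined back into one string.
    parser.add_argument("words", nargs="*", help="words to count")
    return parser


def main(argv=None):
    # argv=None makes argparse fall back to sys.argv[1:].
    args = build_parser().parse_args(argv)
    print(count_words(" ".join(args.words)))
    return 0


if __name__ == "__main__":
    main()
```

With this layout, `(venv) $ python3 -m example_pkg.word_count gene binds protein` run from `src` would execute the module. Because `-m` resolves the package relative to the current directory, launching it from elsewhere would typically fail with an import error.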
- Tests should be run from the `src` directory
- Run test discovery using `python3 -m unittest`
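
For reference, test discovery by default picks up files matching `test*.py`. The sketch below shows the general shape of a test file that discovery could collect. The helper `normalize_whitespace` and the test class are hypothetical and exist only for this illustration; they are not taken from the bioshovel test suite.

```python
import unittest


def normalize_whitespace(text):
    """Hypothetical helper: collapse runs of whitespace into single spaces."""
    return " ".join(text.split())


class NormalizeWhitespaceTest(unittest.TestCase):
    def test_collapses_runs_of_whitespace(self):
        self.assertEqual(
            normalize_whitespace("BRCA1 \n  binds\tBARD1"),
            "BRCA1 binds BARD1",
        )

    def test_empty_string(self):
        self.assertEqual(normalize_whitespace(""), "")
```

No `unittest.main()` call is needed here, since `python3 -m unittest` locates and runs the `TestCase` subclasses itself.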