- Download the repository and enter it:

      git clone https://github.com/f-maury/MAKAAO_core
      cd MAKAAO_core

- Create a Conda environment with the needed packages:

      conda create -n makaao python=3.11
      conda activate makaao
      python -m pip install -r requirements.txt

- Go to the `scripts/` folder and run the MAKAAO scripts in the following order:

      cd scripts
      python 00_xlsx_to_csv.py
      python 01_process_makaao_core_to_tables.py
      python 02_create_enrichment_tables.py
      python 03_build_kg_from_tables.py
      python 04_make_lite_graph_from_makaao-kg.py

- If everything runs without issue:
  - MAKAAO KG and Lite KG will be located in the `kg/` folder.
  - Normalized tables will be in `data/processed_tables/`.
  - Enrichment tables will be in `data/enrichment_tables/`.
- `tests/` contains the `shacl_shapes.ttl` file used by the GitHub CI workflow to validate some constraints on the knowledge graph. `tests/` also contains the unit tests used to check each script.
- Sample data are located in `data/makaao_sample.csv`.
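
The script order listed above matters, since each step presumably consumes the output of the previous one. As a rough sketch only, the sequence could be wrapped in a small Python runner that stops at the first failing script. The function name `run_pipeline` and its `runner` parameter are hypothetical and not part of the repository; the script names are copied from the steps above, and the sketch assumes it is called from the repository root.

```python
import subprocess
import sys

# Script names copied from the steps above, in execution order.
SCRIPTS = [
    "00_xlsx_to_csv.py",
    "01_process_makaao_core_to_tables.py",
    "02_create_enrichment_tables.py",
    "03_build_kg_from_tables.py",
    "04_make_lite_graph_from_makaao-kg.py",
]


def run_pipeline(scripts_dir="scripts", runner=subprocess.run):
    """Run each script in order from inside scripts_dir.

    Hypothetical helper, not shipped with the repository.
    Returns the name of the first script that exits with a non-zero
    status, or None if every script succeeded. `runner` defaults to
    subprocess.run and can be swapped out for a dry run.
    """
    for name in SCRIPTS:
        # Use the interpreter of the active (Conda) environment.
        result = runner([sys.executable, name], cwd=scripts_dir)
        if result.returncode != 0:
            return name
    return None
```

Calling `run_pipeline()` from the repository root, with the `makaao` environment active, returns `None` when all five scripts succeed and the name of the failing script otherwise.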
For licensing reasons, some files that you would need to correctly run the scripts are not included in this repository. These are:
- `data/en_product4.xml`, obtained from Orphanet (https://www.orphadata.com/data/xml/en_product4.xml)
- `data/MRCONSO.RRF`, obtained from the UMLS (https://www.nlm.nih.gov/research/umls/licensedcontent/umlsknowledgesources.html)
- `data/makaao_core.xlsx`
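
Because these files must be obtained separately, it may help to confirm they are in place before running the scripts. The following is a minimal sketch of such a check; the function name `missing_inputs` is hypothetical and not part of the repository, and the paths are taken from the list above, relative to the repository root.

```python
from pathlib import Path

# Files listed above as not distributed with the repository.
REQUIRED_FILES = [
    "data/en_product4.xml",
    "data/MRCONSO.RRF",
    "data/makaao_core.xlsx",
]


def missing_inputs(repo_root="."):
    """Return the required input files that are absent under repo_root.

    Hypothetical helper: an empty list means all three files were found.
    """
    root = Path(repo_root)
    return [rel for rel in REQUIRED_FILES if not (root / rel).is_file()]
```

Running `missing_inputs()` from the repository root returns an empty list once all three files have been placed under `data/`.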