Graph-KIR

Graph-KIR is a tool for KIR (Killer Immunoglobulin-like Receptor) typing from short-read FASTQ files.

This repo contains two main programs:

graphkir - Main Typing Tool

graphkir reads FASTQ files, listed either in a CSV file or directly as command-line arguments. By default, it writes copy number estimates to cohort.cn.tsv and allele typing results to cohort.allele.tsv (both are TSV files). More details about the algorithm and its concepts can be found in the paper.

kirpipe - KIR Typing Pipeline

kirpipe is an aggregation tool that automates the KIR typing pipeline. It includes five published tools: graphkir, PING, Sakaue's KIR, T1K, and KIR*KPI.

(Note: Currently, kirpipe requires podman or docker to execute)

Version

version 1.0
- Biorxiv: https://doi.org/10.1101/2023.11.29.568665
- github tag: v1.0
version 2.0 (latest)
- github tag: v2.0

Docker Version

We have prepared a Docker version of Graph-KIR for easy setup and reproducibility. You can build and use the Docker image as follows:

docker build -t linnil1/graphkir .
docker run -it --rm -v "$PWD":/data linnil1/graphkir graphkir --help

This will run Graph-KIR inside a container, mounting your current directory to /data in the container. Adjust the command and volume as needed for your workflow.

Requirements (Local Installation)

To run Graph-KIR locally with the default engine (--engine local), you need:

Python >= 3.10
MUSCLE >= 5.1 (required only for index building stage)
HISAT2 >= 2.2.1
samtools >= 1.15.1
BWA-MEM >= 0.7.17 (needed only for the WGS extraction stage)
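
Before the first run, it can help to confirm that these tools are reachable from your shell. The following is a minimal sketch, not part of Graph-KIR: the executable names (muscle, hisat2, samtools, bwa) are assumptions based on how these tools are usually installed, and the script only checks for presence on PATH, not for the minimum versions listed above.

```python
import shutil
import sys

# Assumed executable names; adjust them if your installation differs.
REQUIRED_TOOLS = {
    "muscle": "index building only",
    "hisat2": "required",
    "samtools": "required",
    "bwa": "WGS extraction only",
}


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the names of the tools that cannot be found on PATH."""
    return [name for name in tools if which(name) is None]


def python_ok(version_info=sys.version_info, minimum=(3, 10)):
    """Return True if the interpreter meets the minimum Python version."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    if not python_ok():
        print("Python >= 3.10 is required")
    for name in missing_tools():
        print(f"not found on PATH: {name} ({REQUIRED_TOOLS[name]})")
```

Tools marked as needed for a single stage can be ignored if you never run that stage, for example when using a pre-built index.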

Example: Create a Conda Environment for Local Engine

You can use conda to set up the required environment and install the necessary tools:

conda create -n graphkir_env python=3.14
conda activate graphkir_env
conda install -c bioconda muscle=5.3 hisat2=2.2.1 samtools=1.22.1 bwa=0.7.19

Then install Graph-KIR:

pip install .

Using Container Tools

You can also use Graph-KIR with containerization tools for easier setup and reproducibility. Supported engines:

podman
docker
singularity

Specify the engine with the --engine argument, e.g. --engine podman.

Note: If you select a container engine (podman, docker, or singularity) with --engine, you still need to install Graph-KIR on your local machine with pip install . because the container is used only to run the external tools, while the main program runs locally.

Usage (Main)

Download the pre-built Graph-KIR index:

wget https://graphkir.c4lab.tw/download/example_index.tar.gz
tar xvf example_index.tar.gz
# If kirpipe is used, rename it
# ln -s example_index graphkir_alpha

Install Graph-KIR:

git clone https://github.com/linnil1/KIR_graph
cd KIR_graph
pip install .
graphkir --help

Run Graph-KIR (If the index does not exist, it will be auto-built):

graphkir \
    --thread 2 \
    --r1 example/test00.read1.fq.gz \
    --r2 example/test00.read2.fq.gz \
    --r1 example/test01.read1.fq.gz \
    --r2 example/test01.read2.fq.gz \
    --index-folder example_index \
    --output-folder example_data \
    --output-cohort-name example_data/cohort
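
For more than a few samples, repeating --r1 and --r2 by hand gets error-prone. The sketch below assembles the same command from a list of FASTQ pairs. The helper build_graphkir_command is hypothetical (it is not part of Graph-KIR) and uses only the flags shown in the command above.

```python
import shlex


def build_graphkir_command(pairs, index_folder, output_folder, cohort_name, threads=2):
    """Build a graphkir argument list from (read1, read2) FASTQ pairs."""
    cmd = ["graphkir", "--thread", str(threads)]
    for r1, r2 in pairs:
        # Each sample contributes one --r1/--r2 pair, in order.
        cmd += ["--r1", r1, "--r2", r2]
    cmd += [
        "--index-folder", index_folder,
        "--output-folder", output_folder,
        "--output-cohort-name", cohort_name,
    ]
    return cmd


if __name__ == "__main__":
    pairs = [
        ("example/test00.read1.fq.gz", "example/test00.read2.fq.gz"),
        ("example/test01.read1.fq.gz", "example/test01.read2.fq.gz"),
    ]
    cmd = build_graphkir_command(pairs, "example_index", "example_data", "example_data/cohort")
    # Print the command for review; to execute it, pass the list to
    # subprocess.run(cmd, check=True) instead.
    print(shlex.join(cmd))
```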

Or, if you have an input CSV file (e.g., cohort.csv) containing the list of samples:

graphkir \
    --thread 2 \
    --input-csv example/cohort.csv \
    --index-folder example_index \
    --allele-strategy exonfirst \
    --output-cohort-name example_data/cohort \
    --log-level DEBUG

The CSV should have four columns:

name: The output prefix of the sample.
r1 and r2: Paths to the fastq files.
cnfile: You can assign a copy number file for the sample. Leave it empty for Graph-KIR to assign automatically.

name,r1,r2,cnfile
example_data/linnil1.00,example/test00.read1.fq.gz,example/test00.read2.fq.gz,example/test00.assigned.cn.tsv
example_data/linnil1.01,example/test01.read1.fq.gz,example/test01.read2.fq.gz,
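
A CSV in this layout can also be generated with a short script. This is a minimal sketch using Python's standard csv module; the helper write_cohort_csv is hypothetical, and only the four column names come from the description above. Samples without a copy number file get an empty cnfile field.

```python
import csv
import sys

FIELDS = ["name", "r1", "r2", "cnfile"]


def write_cohort_csv(samples, handle):
    """Write sample dictionaries as a cohort CSV with the four expected columns."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for sample in samples:
        # An empty cnfile leaves copy number assignment to Graph-KIR.
        writer.writerow({"cnfile": "", **sample})


if __name__ == "__main__":
    samples = [
        {
            "name": "example_data/linnil1.00",
            "r1": "example/test00.read1.fq.gz",
            "r2": "example/test00.read2.fq.gz",
            "cnfile": "example/test00.assigned.cn.tsv",
        },
        {
            "name": "example_data/linnil1.01",
            "r1": "example/test01.read1.fq.gz",
            "r2": "example/test01.read2.fq.gz",
        },
    ]
    write_cohort_csv(samples, sys.stdout)
```

To write to a file instead, open it with open("cohort.csv", "w", newline="") and pass the handle in place of sys.stdout.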

The final results for all samples are aggregated into files whose prefix is given by --output-cohort-name. In the example above, example_data/cohort.cn.tsv and example_data/cohort.allele.tsv are generated.

Some useful arguments include:

--ref-genome: Reference genome for WGS extraction: hg19 (hs37d5) or hg38 (GRCh38_no_alt). (default: hg19)
--step-skip-extraction: Skip whole genome mapping and KIR read extraction. Use this if your input reads are already filtered for KIR regions.
--allele-strategy exonfirst: Denoted as 'exon_only' in the manuscript for 3-digit or 5-digit typing. This mode prioritizes exon-level information and is designed to enhance exon-level typing accuracy.
--cn-3dl3-not-diploid: Estimate CN without assuming KIR3DL3 CN is 2. By default, Graph-KIR assumes KIR3DL3 is diploid and adjusts CN estimation accordingly.
--cn-diploid-gene: Use a diploid gene (VDR/RYR1/EGFR) to normalize CN estimation. Leave empty for no normalization. Requires --cn-3dl3-not-diploid.
--cn-cohort: Estimate CN while considering the entire cohort. In cohort mode, diploid gene information is not considered.
--plot: Generate CN result plots.
--cn-dist-dev: Adjust CN distribution model deviation (e.g., 0.06).
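
Some of the copy number options depend on each other, most notably --cn-diploid-gene, which requires --cn-3dl3-not-diploid. The sketch below shows one way to enforce that dependency when assembling arguments in a wrapper script. The helper cn_options is hypothetical, and it assumes that --cn-diploid-gene takes the gene name (VDR, RYR1, or EGFR) as its value; check graphkir --help for the exact form.

```python
DIPLOID_GENES = ("VDR", "RYR1", "EGFR")


def cn_options(diploid_gene=None, assume_3dl3_diploid=True, cohort=False, plot=False):
    """Return copy number flags, rejecting combinations described as invalid."""
    if diploid_gene is not None:
        if diploid_gene not in DIPLOID_GENES:
            raise ValueError(f"unsupported diploid gene: {diploid_gene}")
        if assume_3dl3_diploid:
            raise ValueError("--cn-diploid-gene requires --cn-3dl3-not-diploid")
    args = []
    if not assume_3dl3_diploid:
        args.append("--cn-3dl3-not-diploid")
    if diploid_gene is not None:
        # Assumption: the flag takes the gene name as its value.
        args += ["--cn-diploid-gene", diploid_gene]
    if cohort:
        args.append("--cn-cohort")
    if plot:
        args.append("--plot")
    return args


if __name__ == "__main__":
    print(cn_options(diploid_gene="VDR", assume_3dl3_diploid=False, plot=True))
```

The returned list can be appended to the rest of the graphkir arguments before the command is run.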

Usage (`kirpipe` pipeline for other KIR tools)

ln -s ../example/test00.read1.fq.gz example_data/test.00.read.1.fq.gz
ln -s ../example/test00.read2.fq.gz example_data/test.00.read.2.fq.gz
ln -s ../example/test01.read1.fq.gz example_data/test.01.read.1.fq.gz
ln -s ../example/test01.read2.fq.gz example_data/test.01.read.2.fq.gz
kirpipe example_data/test.{} --tools t1k
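
The ln -s commands above rename each FASTQ file to the form prefix.ID.read.N.fq.gz, with {} in the kirpipe argument standing in for the sample ID. For larger cohorts the links can be created with a script. This is a sketch only: the helper link_for_kirpipe is hypothetical, and the naming convention is inferred from the commands above rather than from a kirpipe specification.

```python
import os
from pathlib import Path


def link_for_kirpipe(source, dest_prefix, sample_id, read_number):
    """Symlink one FASTQ file as <dest_prefix>.<sample_id>.read.<read_number>.fq.gz."""
    link = Path(f"{dest_prefix}.{sample_id}.read.{read_number}.fq.gz")
    link.parent.mkdir(parents=True, exist_ok=True)
    # Use a target relative to the link's folder, like the ln -s commands above.
    target = os.path.relpath(source, start=link.parent)
    # symlink_to raises FileExistsError if the link already exists.
    link.symlink_to(target)
    return link
```

For example, link_for_kirpipe("example/test00.read1.fq.gz", "example_data/test", "00", 1) creates example_data/test.00.read.1.fq.gz pointing to ../example/test00.read1.fq.gz.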

Usage (for paper)

If you want to develop or rerun the code related to the Graph-KIR research, check out the research/ directory.

Most of these scripts are not automated and require manual configuration or linking to your cohort (e.g., HPRC). You may also need to adjust arguments to run Graph-KIR with different configurations.

Requirements:

pip install .[paper]
podman (other container tools are not tested)

To build and preview the documentation locally, use: mkdocs serve

research/kg_main.py My work for simulated data (100 samples)
research/kg_real.py My work for real data (HPRC)
research/other_kir.py Run other KIR tools for HPRC or 100 samples
research/kg_dev_* Scripts for development purposes (not used in the paper)
research/kg_eval_* Compare the results

Evaluation code and data for v2:

research/kg_eval_hprc_alldigit.py
research/kg_eval_hprc_remove_novel.py
research/groundtruth/hprc_annotation_skirt.tsv

Related tools

star_allele_comp: https://github.com/linnil1/star_alleles_comparator

The star allele comparator accepts KIR/HLA alleles as input. This module is inspired by research/kg_eval.py.

pyhlamsa: https://github.com/linnil1/pyHLAMSA

A tool for easily manipulating MSA data. It reads from IPD-KIR or IPD-HLA database formats, merges exons, calculates consensus, writes data in specific formats, and more.

filenameflow: https://github.com/linnil1/FileNameFlow

A lightweight tool for executing pipelines. It uses filenames as auto-versioning keys, which is convenient when tuning arguments or switching parts frequently. Note that this research uses version 0.0.7, so clone the repository, run git checkout v0.0.7, and then run pip install . inside it.

Changelog

github tag: v1.0: Initial release on bioRxiv and open-sourced the Graph-KIR code.
latest: Current Version
- Improved the algorithm for assuming KIR3DL3 is diploid. We now treat KIR3DL3's depth as a probability of 2x depth instead of assuming an exact 2x depth, which enhances the clustering results for copy number estimation. Special thanks to Ting-Jian Wang, one of the authors of the original paper.

LICENSE

LGPL

::: graphkir
