MicrohapDB (HaploSearch)

MicrohapDB (HaploSearch) is a web application for managing and analyzing microhaplotype data, built for researchers and institutions working in population genetics and genomics.

🧬 What is MicrohapDB?

MicrohapDB is a specialized database system for storing, managing, and analyzing microhaplotypes: short DNA segments that contain multiple closely linked polymorphic sites. These genetic markers are valuable for:

Population genetics studies
Ancestry inference
Forensic genetics
Evolutionary research
Biogeographical analysis
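
To make the definition concrete, the sketch below treats a microhaplotype allele as the string of bases observed at each linked site on one chromosome copy, then tallies allele frequencies. The sample data and the function name `allele_frequencies` are illustrative assumptions and are not taken from the MicrohapDB codebase.

```python
from collections import Counter


def allele_frequencies(haplotypes):
    """Return {allele: relative frequency} for a list of haplotype strings."""
    counts = Counter(haplotypes)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


# Hypothetical phased calls at one locus with three linked SNPs.
# Each string holds the bases at the three sites on one chromosome copy.
observed = ["ACG", "ACG", "ATG", "GCG", "ACG", "ATG", "ACG", "GCG"]

freqs = allele_frequencies(observed)
for allele, freq in sorted(freqs.items()):
    print(f"{allele}: {freq:.3f}")
```

A single biallelic SNP distinguishes at most two alleles, whereas the combined locus in this toy data shows three, which is the property that makes such markers informative for the applications listed above.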

🏗️ Architecture Overview

🌐 Frontend (Vue.js)

Location: microhapDB-frontend/

A modern, responsive web interface built with Vue.js that provides:

Data Upload & Management: Upload PAV (Presence/Absence Variation) and MADC (Microhaplotype Allele Data Collection) files
Interactive Visualizations: Charts and graphs for genetic data analysis
User Authentication: ORCID-based login system for researchers
Admin Dashboard: Administrative tools for data management
Search & Filter: Advanced search capabilities across genetic datasets

Key Features:

Real-time data validation during upload
Batch processing for large datasets
Export functionality for analysis results
Role-based access control (Admin/User)
Integration with ORCID for researcher authentication

🔧 Backend (FastAPI)

Location: microhapDB-backend/

A robust REST API built with FastAPI and Python that handles:

Database Management: PostgreSQL database with optimized schemas
File Processing: Handles PAV and MADC file formats with validation
Authentication: ORCID OAuth integration
Data Analysis: Statistical computations and genetic analysis algorithms
API Documentation: Auto-generated OpenAPI/Swagger documentation
Performance: Optimized queries and caching for large datasets

Key Features:

RESTful API with comprehensive endpoints
Database migrations with Alembic
Background task processing
Data validation and error handling
Comprehensive logging and monitoring
Docker containerization for deployment
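
As a rough illustration of how file processing can stay manageable on large datasets, the sketch below splits a stream of parsed rows into fixed-size chunks. It is a generic pattern written with the standard library only; the helper name `batched`, the `allele_id` field, and the chunk size are assumptions, and the actual backend may handle this differently.

```python
from itertools import islice


def batched(rows, size):
    """Yield successive lists of at most `size` items from any iterable."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


# Stand-in for parsed file rows; a real handler would write each chunk
# to the database in one transaction instead of collecting sizes.
rows = ({"allele_id": i} for i in range(10))
chunk_sizes = [len(chunk) for chunk in batched(rows, 4)]
print(chunk_sizes)
```

Because the rows are consumed lazily, memory use is bounded by the chunk size rather than by the size of the uploaded file.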

🚀 Deployment Architecture

Production Infrastructure

Backend: AWS EC2 instance with Docker containers
Frontend: AWS S3 + CloudFront for global CDN
Database: PostgreSQL (configurable for AWS RDS)
CI/CD: GitHub Actions for automated deployment
Infrastructure: Terraform for Infrastructure as Code

Supported Deployment Methods

GitHub Actions (Recommended): Automated CI/CD pipeline
Terraform: Infrastructure as Code management
Manual EC2: Direct deployment to AWS EC2 instances

📊 Data Types Supported

PAV Files (Presence/Absence Variation)

Binary genetic variation data
Population frequency information
Geographic metadata
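
The sketch below shows how population frequencies could be derived from binary presence/absence calls. The column layout (`sample`, `population`, then one 0/1 column per marker) is a hypothetical format chosen for illustration; the actual PAV file layout is not described here.

```python
import csv
import io
from collections import defaultdict

# Hypothetical PAV-style table: one row per sample, 1 = present, 0 = absent.
PAV_TEXT = """sample,population,marker_1,marker_2
S1,north,1,0
S2,north,1,1
S3,south,0,1
S4,south,0,1
"""


def presence_frequency(text):
    """Return {population: {marker: fraction of samples with the marker}}."""
    present = defaultdict(lambda: defaultdict(int))
    sample_counts = defaultdict(int)
    for row in csv.DictReader(io.StringIO(text)):
        pop = row["population"]
        sample_counts[pop] += 1
        for column, value in row.items():
            if column in ("sample", "population"):
                continue
            # Reject anything that is not a strict binary call.
            if value not in ("0", "1"):
                raise ValueError(
                    f"{row['sample']}/{column}: expected 0 or 1, got {value!r}"
                )
            present[pop][column] += int(value)
    return {
        pop: {marker: count / sample_counts[pop] for marker, count in markers.items()}
        for pop, markers in present.items()
    }


result = presence_frequency(PAV_TEXT)
print(result)
```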

MADC Files (Microhaplotype Allele Data Collection)

Detailed allele frequency data
Multi-population comparisons
Statistical analysis results

🔐 Security Features

ORCID Authentication: Secure researcher authentication
Role-based Access: Admin and user permission levels
Data Validation: Comprehensive input validation and sanitization
Secure File Upload: Validated file processing with size limits
Environment Variables: Secure configuration management
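
A minimal sketch of how a size limit read from an environment variable might gate file uploads. The variable name `MAX_UPLOAD_BYTES`, the default limit, and the allowed extensions are all assumptions made for illustration; the project's real configuration keys may differ.

```python
import os

DEFAULT_MAX_UPLOAD_BYTES = 50 * 1024 * 1024  # assumed default of 50 MiB
ALLOWED_EXTENSIONS = {".csv", ".tsv", ".txt"}  # assumed, not from the project


def max_upload_bytes(env=os.environ):
    """Read the size limit from MAX_UPLOAD_BYTES (hypothetical variable name)."""
    raw = env.get("MAX_UPLOAD_BYTES")
    if raw is None:
        return DEFAULT_MAX_UPLOAD_BYTES
    limit = int(raw)
    if limit <= 0:
        raise ValueError("MAX_UPLOAD_BYTES must be positive")
    return limit


def check_upload(filename, size_bytes, env=os.environ):
    """Return a list of problems; an empty list means the upload may proceed."""
    problems = []
    extension = os.path.splitext(filename)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported file type: {extension or '(none)'}")
    if size_bytes > max_upload_bytes(env):
        problems.append("file exceeds the configured size limit")
    return problems


print(check_upload("calls.csv", 1024, env={"MAX_UPLOAD_BYTES": "2048"}))
print(check_upload("calls.exe", 4096, env={"MAX_UPLOAD_BYTES": "2048"}))
```

Passing the environment as a parameter keeps the check easy to test without modifying the real process environment.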

🛠️ Technology Stack

Frontend

Vue.js 3: Progressive JavaScript framework
Vue Router: Client-side routing
Axios: HTTP client for API communication
Chart.js: Data visualization
Bootstrap/CSS: Responsive styling

Backend

FastAPI: Modern Python web framework
SQLAlchemy: Database ORM
Alembic: Database migrations
PostgreSQL: Primary database
Pydantic: Data validation
Uvicorn: ASGI server

DevOps & Deployment

Docker: Containerization
GitHub Actions: CI/CD pipeline
Terraform: Infrastructure as Code
AWS: Cloud infrastructure (EC2, S3, CloudFront)
Nginx: Reverse proxy (in production)

📈 Use Cases

For Researchers

Upload and manage genetic datasets
Perform population genetics analysis
Visualize allele frequencies and distributions
Export data for further analysis
Collaborate with other researchers

For Institutions

Centralized genetic data repository
Multi-user access with role management
Standardized data formats and validation
Scalable infrastructure for large datasets
Integration with existing research workflows

🌍 Target Audience

Population Geneticists: Researchers studying genetic variation in populations
Forensic Scientists: Professionals using genetic markers for identification
Evolutionary Biologists: Scientists studying genetic evolution and ancestry
Academic Institutions: Universities and research centers
Government Agencies: Organizations involved in genetic research and policy

📚 Getting Started

For Users

1. Visit the deployed application
2. Sign in with your ORCID account
3. Upload your genetic data files (PAV/MADC format)
4. Explore visualizations and analysis tools
5. Export results for your research

For Developers

1. Clone the repository
2. Set up the backend (see microhapDB-backend/README.md)
3. Set up the frontend (see microhapDB-frontend/README.md)
4. Configure environment variables
5. Run locally for development

🤝 Contributing

We welcome contributions from the genetics and bioinformatics community! Please see our contributing guidelines and feel free to submit issues or pull requests.

📄 License

This project is licensed under the MIT License; see the LICENSE file for details.

🔗 Links

Live Application: [Deployed URL]
API Documentation: [Backend URL]/docs
GitHub Repository: https://github.com/tylerslonecki/microhapDB
Issues & Support: GitHub Issues

MicrohapDB is developed to support the global genetics research community in advancing our understanding of human genetic diversity and evolution.
