eoinleen/PDB-tools

PDB-Tools

A collection of well-annotated Python notebooks for PDB file analysis and manipulation, designed for Google Colab. The tools have been used in real research workflows and cover common protein structure tasks.

🎯 What This Collection Offers

A curated set of Jupyter notebooks that handle various aspects of protein structure analysis:

  • Structure manipulation - Chain renaming, merging, splitting
  • Quality control - ANISOU record removal, missing atom detection, validation
  • Format conversion - PDB to FASTA, robust input handling
  • Analysis tools - Polar/charged residue analysis, structural assessments
  • ProteinMPNN utilities - Selective labeling, design preparation
  • Specialized workflows - Ubiquitin analysis, PyRosetta integration

🚀 Quick Start

  1. Open in Google Colab - Each notebook is designed to run directly in Colab
  2. Upload your PDB files - Use Colab's file upload or Google Drive integration
  3. Configure settings - Edit the clearly marked configuration sections
  4. Run and download - Execute cells and download processed files

📂 Notebook Categories

🔧 Structure Manipulation

  • Chain operations - Renaming, merging, splitting protein chains
  • File cleaning - Removing unwanted records, standardizing formats
  • Robust conversion - PDB to FASTA with error handling
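
As an illustration of the chain operations and PDB-to-FASTA conversion listed above, here is a minimal sketch that works on the fixed-column PDB text format using only the standard library. The function names `rename_chain` and `pdb_to_fasta` are hypothetical and are not taken from the notebooks; non-standard residues are written as `X`, which is an assumption rather than the notebooks' documented behaviour.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}

COORD_RECORDS = ("ATOM", "HETATM", "ANISOU", "TER")


def rename_chain(pdb_text, old, new):
    """Return pdb_text with chain ID `old` replaced by `new` (PDB column 22)."""
    if len(old) != 1 or len(new) != 1:
        raise ValueError("chain IDs must be single characters")
    out = []
    for line in pdb_text.splitlines():
        if line.startswith(COORD_RECORDS) and len(line) > 21 and line[21] == old:
            line = line[:21] + new + line[22:]
        out.append(line)
    return "\n".join(out) + "\n"


def pdb_to_fasta(pdb_text, name="structure"):
    """Build one FASTA record per chain from the ATOM records."""
    chains = {}   # chain ID -> list of one-letter codes, in file order
    seen = set()  # (chain, residue number, insertion code) already counted
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM") or len(line) < 27:
            continue
        chain = line[21]
        key = (chain, line[22:26], line[26])
        if key in seen:
            continue
        seen.add(key)
        code = THREE_TO_ONE.get(line[17:20].strip(), "X")
        chains.setdefault(chain, []).append(code)
    if not chains:
        raise ValueError("no ATOM records found; is this a PDB file?")
    records = []
    for chain, seq in chains.items():
        records.append(f">{name}_{chain}\n{''.join(seq)}\n")
    return "".join(records)
```

The sequence comes from residues that have coordinates, so unresolved loops are silently absent from the FASTA output; reading SEQRES records instead would give the full construct sequence.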

🔍 Analysis & Quality Control

  • Structure validation - Missing atoms, coordinate issues
  • Residue analysis - Polar, charged, and structural properties
  • ANISOU removal - Stripping anisotropic temperature factor (ANISOU) records
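
The two quality-control tasks above can be sketched in a few lines of plain Python. This is an illustrative example under stated assumptions, not the notebooks' code: the helper names are hypothetical, and "missing atoms" is narrowed here to the four backbone atoms (N, CA, C, O) of each residue in the ATOM records.

```python
BACKBONE = {"N", "CA", "C", "O"}


def strip_anisou(pdb_text):
    """Drop every ANISOU record and keep all other lines unchanged."""
    kept = [line for line in pdb_text.splitlines() if not line.startswith("ANISOU")]
    return "\n".join(kept) + "\n"


def missing_backbone_atoms(pdb_text):
    """Map (chain, residue id, residue name) to its missing backbone atoms.

    Only residues that lack at least one of N, CA, C, O are reported.
    """
    found = {}  # residue key -> set of atom names seen for that residue
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and len(line) >= 27:
            # Columns 23-27 hold the residue number plus insertion code.
            key = (line[21], line[22:27].strip(), line[17:20].strip())
            found.setdefault(key, set()).add(line[12:16].strip())
    report = {}
    for key, atoms in found.items():
        missing = BACKBONE - atoms
        if missing:
            report[key] = sorted(missing)
    return report
```

Running `strip_anisou` first keeps the atom check from depending on whether the file was refined anisotropically, although the check itself only reads ATOM lines.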

🧬 ProteinMPNN Workflows

  • Selective labeling - Distance-based FIXED residue assignment
  • Design preparation - Input formatting for sequence design
  • Validation tools - Post-design analysis and verification
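
Distance-based selection can be sketched as follows: collect the residues of one chain that have any atom within a cutoff of another chain. Which side of that split a notebook marks as FIXED, the 5.0 Å default, and the function names are all assumptions made for illustration; the sketch returns plain residue numbers and does not write any ProteinMPNN input format.

```python
import math


def parse_atoms(pdb_text):
    """Yield (chain, residue number, (x, y, z)) for each ATOM record."""
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and len(line) >= 54:
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            yield line[21], int(line[22:26]), xyz


def residues_near_chain(pdb_text, design_chain, reference_chain, cutoff=5.0):
    """Return sorted residue numbers in design_chain with any atom within
    `cutoff` angstroms of any atom in reference_chain."""
    atoms = list(parse_atoms(pdb_text))
    reference = [xyz for chain, _, xyz in atoms if chain == reference_chain]
    near = set()
    for chain, resseq, xyz in atoms:
        if chain != design_chain or resseq in near:
            continue
        if any(math.dist(xyz, ref) <= cutoff for ref in reference):
            near.add(resseq)
    return sorted(near)
```

The all-pairs loop is quadratic in the number of atoms, which is fine for a single complex; for large assemblies a spatial index such as a k-d tree would be the usual replacement.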

🔬 Specialized Applications

  • Ubiquitin analysis - Domain-specific structural tools
  • PyRosetta integration - Interface with Rosetta workflows
  • Custom protocols - Task-specific analysis pipelines

💡 Key Features

  • Google Colab Ready - No local installation required
  • Well Documented - Extensive comments and usage instructions
  • Production Tested - Used extensively in real research projects
  • Error Handling - Robust file processing with clear error messages
  • Flexible Configuration - Easy-to-modify settings sections
  • Claude AI Enhanced - Developed with AI assistance, with usability in mind

🛠️ Usage Pattern

Many notebooks follow this structure:

# ===== CONFIGURATION SECTION =====
INPUT_FILES = "path/to/your/files"
OUTPUT_FORMAT = "desired_format"
PROCESSING_OPTIONS = {...}
# ================================

# Processing code with detailed annotations
# Clear section headers and progress indicators
# Error handling and validation
# Results summary and download links
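
A concrete version of this pattern might look like the sketch below. The setting names (`INPUT_DIR`, `INPUT_PATTERN`, `OUTPUT_DIR`) and the helpers are hypothetical stand-ins, and the processing step is a placeholder that copies each file through unchanged; the real notebooks define their own settings and logic.

```python
from pathlib import Path

# ===== CONFIGURATION SECTION (hypothetical names) =====
INPUT_DIR = "pdb_inputs"
INPUT_PATTERN = "*.pdb"
OUTPUT_DIR = "processed"
# ======================================================


def collect_inputs(input_dir, pattern):
    """Validate the settings and return the matching files, sorted by name."""
    folder = Path(input_dir)
    if not folder.is_dir():
        raise FileNotFoundError(f"input folder not found: {folder}")
    files = sorted(folder.glob(pattern))
    if not files:
        raise FileNotFoundError(f"no files matching {pattern!r} in {folder}")
    return files


def run(input_dir=INPUT_DIR, pattern=INPUT_PATTERN, output_dir=OUTPUT_DIR):
    """Process every input file, print progress, and return the file count."""
    files = collect_inputs(input_dir, pattern)
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    for i, path in enumerate(files, start=1):
        print(f"[{i}/{len(files)}] processing {path.name}")
        # Placeholder step: a real notebook would transform the text here.
        (out / path.name).write_text(path.read_text())
    print(f"done: {len(files)} file(s) written to {out}")
    return len(files)
```

Failing early in `collect_inputs` means a mistyped folder or pattern is reported before any output is written.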

📋 Requirements

  • Google Colab (recommended) or Jupyter environment
  • Python 3.8+
  • Common libraries (NumPy and Pandas are usually pre-installed in Colab; Biopython may need to be installed with pip install biopython):
    • Biopython (for enhanced PDB parsing)
    • NumPy, Pandas (for data handling)
    • Standard library modules
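
A quick way to confirm these requirements before running a notebook is to check the Python version and look up each import name. This is a small sketch; the helper name `missing_modules` is hypothetical. Note that Biopython is imported as `Bio`.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the top-level import names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_is_supported(minimum=(3, 8)):
    """True when the running interpreter meets the minimum version."""
    return sys.version_info[:2] >= minimum


if __name__ == "__main__":
    print("Python OK:", python_is_supported())
    absent = missing_modules(["Bio", "numpy", "pandas"])
    print("Missing:", ", ".join(absent) if absent else "none")
```

`find_spec` only locates the module without importing it, so the check stays fast even for heavy packages.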

🎓 Getting Started

  1. Browse the collection - Find notebooks relevant to your task
  2. Read the headers - Each notebook has detailed usage instructions
  3. Configure settings - Edit the marked configuration sections
  4. Run step-by-step - Execute cells sequentially
  5. Download results - Use provided download links

📚 Documentation Style

Each notebook includes:

  • Clear purpose statement - What the tool does
  • Usage instructions - How to configure and run
  • Example workflows - Typical use cases
  • Error handling - What to do when things go wrong
  • Output descriptions - What files you'll get

🔬 Research Applications

These tools have been used for:

  • Protein design validation
  • Structure-function analysis
  • Design pipeline preparation
  • Quality control workflows
  • Custom analysis protocols

🤝 Development Notes

  • AI-Assisted Development - Created with assistance from Claude AI
  • User-Tested - Refined through extensive real-world usage
  • Iterative Improvement - Continuously updated based on practical needs
  • Community Focused - Designed for easy sharing and collaboration

⚠️ Usage Notes

  • Google Colab Optimized - Some notebooks may require modification for local use
  • File Upload Required - Most tools expect you to upload PDB files to the session
  • Session Temporary - Remember to download results before closing Colab sessions
  • Resource Limits - Large datasets may hit Colab's computational limits
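
Because session storage is temporary, one convenient habit is to bundle the output folder into a single zip before the session ends. The sketch below uses only the standard library; the function name and default archive name are hypothetical.

```python
import shutil
from pathlib import Path


def archive_results(output_dir, archive_name="results"):
    """Zip the contents of output_dir and return the path of the archive.

    archive_name is given without the .zip suffix; make_archive appends it.
    Keep the archive outside output_dir so it does not try to include itself.
    """
    folder = Path(output_dir)
    if not folder.is_dir() or not any(folder.iterdir()):
        raise FileNotFoundError(f"nothing to archive in {folder}")
    return shutil.make_archive(str(archive_name), "zip", root_dir=str(folder))
```

In a Colab session the returned path can then be passed to `google.colab.files.download`, or the archive can be copied to a mounted Google Drive folder.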

📧 Support

These notebooks are provided as-is, with comprehensive documentation. The extensive comments and annotations should guide you through most use cases. For complex modifications, refer to the configuration sections and example workflows provided in each notebook.

📄 License

MIT License - feel free to use, modify, and share these tools in your research.


A collection of practical protein structure analysis tools, refined through real research applications and developed with AI assistance.
