Home
PROBEst is a tool for generating nucleotide probes with tailored properties. It is aimed at researchers and bioinformaticians who need probes with specific universality and specificity for applications such as PCR, hybridization, and sequencing. PROBEst wraps probe generation in an evolutionary algorithm that refines candidate probes iteratively so that the final probes meet stringent biological and computational criteria.
At the core of PROBEst is an AI-enhanced workflow that combines Primer3 for initial oligonucleotide generation, BLASTn for specificity and universality checks, and a mutation module for probe optimization. Users supply target sequences, select reference files for universality and specificity validation, and customize layouts for probe design. In each iteration, the evolutionary algorithm introduces mutations into the probes and evaluates the resulting candidates, steering the output toward probes that are specific to the target yet applicable across related sequences. This iterative approach is intended to make probe generation more efficient and accurate.
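The mutate-and-evaluate loop described above can be illustrated with a minimal sketch. This is not PROBEst's implementation: a simple mismatch-tolerant window scan stands in for the BLASTn checks, and every name here (`mutate`, `window_hit`, `fitness`, `evolve`, and their parameters) is hypothetical.

```python
import random

BASES = "ACGT"


def mutate(probe, rate, rng):
    """Redraw each base with probability `rate` (the redraw may pick the same base)."""
    return "".join(
        rng.choice(BASES) if rng.random() < rate else base
        for base in probe
    )


def window_hit(probe, sequence, max_mismatches):
    """Return True if any window of `sequence` matches `probe` within the mismatch budget.

    Toy stand-in for a BLASTn hit; assumption, not the real check.
    """
    k = len(probe)
    for i in range(len(sequence) - k + 1):
        mismatches = sum(a != b for a, b in zip(probe, sequence[i:i + k]))
        if mismatches <= max_mismatches:
            return True
    return False


def fitness(probe, targets, off_targets, max_mismatches=1):
    """Fraction of targets hit (universality) minus fraction of off-targets hit."""
    universality = sum(window_hit(probe, s, max_mismatches) for s in targets) / len(targets)
    off_rate = sum(window_hit(probe, s, max_mismatches) for s in off_targets) / len(off_targets)
    return universality - off_rate


def evolve(initial, targets, off_targets, generations=20, offspring=10,
           keep=5, rate=0.1, seed=0):
    """Mutate the population, score parents and children, keep the best `keep` probes."""
    rng = random.Random(seed)
    population = list(initial)
    for _ in range(generations):
        children = [mutate(p, rate, rng) for p in population for _ in range(offspring)]
        pool = set(population) | set(children)  # parents stay in the pool (elitism)
        # Sort by descending fitness; ties are broken alphabetically for reproducibility.
        population = sorted(
            pool, key=lambda p: (-fitness(p, targets, off_targets), p)
        )[:keep]
    return population
```

Because parents remain in the selection pool, the best score in this sketch never decreases from one generation to the next. In the real pipeline, the initial probes would come from Primer3 and the scoring from BLASTn results rather than from these toy functions.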
Warning: This tool is under active development.
PROBEst consists of two main components: databases and their parsers, and the probe generation tool described in the README. Below is a detailed breakdown of each part:
This part of the tool focuses on managing and processing data related to nucleotide probes. It includes:
- Parsed Probes Databases:
  - Several pre-parsed databases containing information about nucleotide probes.
  - These databases are used to validate and optimize probe universality and specificity.
- Large Language Models (LLMs):
  - LLMs are employed to extract information about nucleotide probes and their testing from scientific articles.
  - The goal is to gather valuable insights from previous research to enhance the probe generation process.
This part is responsible for generating nucleotide probes with specified properties. It involves the following steps:
- Users need to prepare a BLAST database and a TSV table of genome-contig links.
- Use the provided script to set up the database:
bash scripts/prep_db.sh
- Once the database is prepared, users can run the probe generation pipeline using:
python pipeline.py [arguments]
- The pipeline integrates Primer3, BLASTn, and an evolutionary algorithm to generate and optimize probes.
- To fine-tune the probe generation process, users can run:
python gridsearch.py [arguments]
- This script helps identify the optimal parameters for the pipeline.
- Developers can refer to the Testing and Contribution Guide for detailed instructions on running tests, contributing to the codebase, and extending the tool's functionality.
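As a rough illustration of the parameter-tuning step, the general technique of an exhaustive grid search can be sketched as follows. This is only an assumption about the approach, not a description of what `gridsearch.py` does internally: the parameter names (`mutation_rate`, `top_k`) and the `evaluate` callback are hypothetical and are not the script's actual arguments.

```python
from itertools import product


def grid_search(param_grid, evaluate):
    """Try every combination in `param_grid` and return the best-scoring one.

    param_grid: dict mapping a parameter name to a list of candidate values.
    evaluate:   callable taking a dict of parameters and returning a score
                (higher is better); in practice this would wrap a pipeline run.
    """
    names = sorted(param_grid)  # fixed order keeps the search reproducible
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(params):
    """Hypothetical scoring function that peaks at mutation_rate=0.1, top_k=5."""
    return -abs(params["mutation_rate"] - 0.1) - abs(params["top_k"] - 5)


best, score = grid_search(
    {"mutation_rate": [0.05, 0.1, 0.2], "top_k": [3, 5]},
    toy_score,
)
print(best, score)
```

The cost grows with the product of the list lengths, so a coarse grid is usually a sensible starting point when each evaluation means a full pipeline run.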
By combining these two parts, PROBEst provides a comprehensive solution for generating high-quality nucleotide probes tailored to specific research needs. For further details, refer to the README and the Contribution Guide.
To install PROBEst, follow these steps:
git clone https://github.com/CTLab-ITMO/PROBEst/
cd PROBEst
pip install -e .
To check the installation, run:
bash test_run_generator.sh
Additional information for developers is available on the Testing page.
We welcome contributions from the community! To contribute:
- Fork the repository.
- Create a new branch for your feature or bugfix.
- Commit your changes and push to the branch.
- Open a pull request with a detailed description of your changes.
Please read the Contribution Guidelines for more details.
This project is licensed under the MIT License - see the LICENSE file for details.
For questions, suggestions, or feedback, feel free to reach out:
- Email: dvsmutin@itmo.com
- GitHub Issues: Create an issue
Thank you for using PROBEst! We hope this tool enhances your research and look forward to your contributions.