An AI-powered assistant for querying your personal Zotero library and synthesizing literature across sources like PubMed using GPT-4.
This tool lets you interrogate your own research collection using natural language, and combines it with up-to-date PubMed searches to generate integrated answers, identify gaps, and recommend papers to add.
- ✅ Query your Zotero library using natural language
- 🔍 Automatically search PubMed for new or related findings
- 🧠 Synthesize findings from both sources with GPT-4 (or GPT-4o)
- 🗃️ Local vector search with FAISS for your own library
- 🧾 Logs and stores queries and results in a local SQLite database
- 📬 Watch PubMed for new papers on past queries and send email alerts
- 📄 Prompts factored into editable Markdown files in /prompts
- Python 3.10+
- A Zotero library (.bib file)
- Conda (Miniconda or Anaconda)
- An OpenAI API key with access to:
  - text-embedding-3-large
  - gpt-4 or gpt-4o
git clone https://github.com/BridgesLab/ResearchAssistant.git
cd ResearchAssistant
You can either:
- Export manually from Zotero: File → Export Library → Format: BibTeX. Save the file as library.bib in the project root.
- Or create a symbolic link to your .bib file:
ln -s /path/to/your/library.bib ./library.bib
Create a .env file in the root folder containing at minimum:
OPENAI_API_KEY=your-api-key-here
# For email alerts (adjust according to your SMTP provider)
EMAIL_USER=your.email@example.com
EMAIL_PASS=your-email-password-or-app-token
EMAIL_SMTP=smtp.example.com
EMAIL_PORT=587
EMAIL_TO=your.email@example.com
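Before running any script, it can help to confirm that the .env file parses and contains the key the tool needs. The following is a minimal sketch, not the project's own loader: `parse_env`, `missing_keys`, and `REQUIRED_KEYS` are hypothetical names, and treating only `OPENAI_API_KEY` as mandatory is an assumption based on the "at minimum" wording above.

```python
# Hypothetical helper, not part of the repository: a dependency-free
# check that a .env file contains the settings listed above.

REQUIRED_KEYS = ("OPENAI_API_KEY",)  # assumption: email settings are optional


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


def missing_keys(settings: dict[str, str], required=REQUIRED_KEYS) -> list[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in required if not settings.get(key)]
```

For example, `missing_keys(parse_env(open(".env").read()))` returns an empty list when the API key is present.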
Run this once after setup:
python scripts/build_index.py
This creates:
- zotero.index — FAISS vector index of embedded papers
- zotero_meta.pkl — metadata used for retrieval and synthesis
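To illustrate how an index of vectors and a parallel metadata list work together at query time, here is a toy sketch that uses brute-force cosine similarity in plain Python. It is not the repository's code and deliberately swaps FAISS for an exhaustive scan; the function names, the two-dimensional vectors, and the metadata fields are all made up for illustration.

```python
import math

# Toy stand-in for an index/metadata pair: vectors[i] describes metadata[i].
# A real index would hold high-dimensional embeddings and use FAISS instead.


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, vectors, metadata, k=3):
    """Return the k (score, metadata) pairs most similar to the query."""
    scored = [(cosine(query_vec, v), m) for v, m in zip(vectors, metadata)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


vectors = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
metadata = [{"title": "A"}, {"title": "B"}, {"title": "C"}]
best = top_k([1.0, 0.0], vectors, metadata, k=2)
print([m["title"] for _, m in best])  # prints ['A', 'C']
```

Keeping metadata in a separate list indexed by position mirrors the split between an index file and a metadata file: the vector store only needs to return row numbers.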
To update your index, run this (or set up a cron job to run it regularly):
python scripts/update_index.py
Use the manager agent to query both Zotero and PubMed and get a synthesized answer:
python scripts/manager_agent.py "What is the relationship between calcium and cholesterol?"
This will:
- Log your query to a local SQLite database
- Search your Zotero library using semantic similarity
- Use GPT-4 to create a PubMed search string and query PubMed
- Synthesize results and highlight any gaps or missing references
- Store results and synthesis in the database for future reference
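The first and last steps above, logging a query and later storing its synthesis, could look roughly like the sketch below. It uses Python's standard `sqlite3` module; the table name, columns, and function names are assumptions for illustration and may not match the schema the scripts actually create.

```python
import sqlite3
from datetime import datetime, timezone


def open_log(path=":memory:"):
    """Open (or create) the query log. The schema here is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS queries (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               asked_at TEXT NOT NULL,
               question TEXT NOT NULL,
               synthesis TEXT
           )"""
    )
    return conn


def log_query(conn, question):
    """Record a question with a UTC timestamp and return its row id."""
    cur = conn.execute(
        "INSERT INTO queries (asked_at, question) VALUES (?, ?)",
        (datetime.now(timezone.utc).isoformat(), question),
    )
    conn.commit()
    return cur.lastrowid


def store_synthesis(conn, query_id, synthesis):
    """Attach the synthesized answer to a previously logged question."""
    conn.execute(
        "UPDATE queries SET synthesis = ? WHERE id = ?", (synthesis, query_id)
    )
    conn.commit()
```

Logging the question before any API call means a failed run still leaves a record that a later watcher pass can pick up.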
Periodically run the watcher script to search PubMed for new results related to your past queries and receive email alerts:
python scripts/watch_pubmed.py
Or, to check manually, run this to see whether any new publications are relevant to your logged queries (in this case, from the last 60 days):
python scripts/find_new_papers.py --days 60
Automate this with cron or Task Scheduler to run weekly or biweekly.
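For readers curious how a "last N days" PubMed check can be expressed, the sketch below builds a request URL for the NCBI E-utilities ESearch endpoint, whose `reldate` and `datetype` parameters restrict results to recent records. Whether the scripts in this project use E-utilities in this way is an assumption, and `recent_pubmed_url` is a hypothetical helper; the function only builds the URL and makes no network call.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities ESearch endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def recent_pubmed_url(term, days=60, max_results=20):
    """Build an ESearch URL for records published in the last `days` days."""
    params = {
        "db": "pubmed",
        "term": term,            # PubMed Boolean search string
        "reldate": days,         # look back this many days
        "datetype": "pdat",      # filter on publication date
        "retmax": max_results,   # cap the number of returned IDs
        "retmode": "json",
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


print(recent_pubmed_url("calcium AND cholesterol", days=60))
```

The `days` argument plays the same role as the `--days 60` flag in the command above.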
Launch a simple browser interface for querying your Zotero library:
streamlit run scripts/app.py
Open the provided local address in your browser.
All GPT prompts are stored in Markdown format under /prompts. These are easy to edit and version, and follow a structured format with role, task, input, output, and guardrails. Current prompts include:
- pubmed_search.md — converts questions to PubMed Boolean searches
- synthesis.md — integrates and summarizes findings
- zotero_search.md — (optional/future) for explainable Zotero querying
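Because the prompts live in plain Markdown files, loading one and filling in the user's question can be done with the standard library alone. The sketch below assumes a `$question`-style placeholder, which is a guess rather than the format the project's prompt files actually use; `render_prompt` and `load_prompt` are hypothetical names.

```python
from pathlib import Path
from string import Template


def render_prompt(template_text, **values):
    """Fill $placeholders in a prompt; unknown placeholders are left as-is."""
    return Template(template_text).safe_substitute(**values)


def load_prompt(name, prompts_dir="prompts", **values):
    """Read a Markdown prompt file from prompts_dir and fill it in."""
    text = (Path(prompts_dir) / name).read_text(encoding="utf-8")
    return render_prompt(text, **values)


example = "Task: convert the question to a PubMed search.\nInput: $question"
print(render_prompt(example, question="calcium and cholesterol"))
```

Using `safe_substitute` rather than `substitute` keeps a prompt usable even if it contains a literal dollar sign or a placeholder that the caller did not supply.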
- 🔁 Auto-add recommended PubMed papers to Zotero
- 📄 Export summaries or citations to BibTeX, Markdown, or Notion
- 📊 Interactive filters for Zotero results
- 🧪 Jupyter or VS Code extension for integrated research