DocNexus is a local-first, privacy-focused RAG (Retrieval-Augmented Generation) system designed for deep analysis of PDF documents. It combines document parsing, knowledge graph construction, and large language model inference to enable context-aware question answering over an uploaded document.
The project emphasizes:
- Local inference (no cloud APIs)
- Graph-based reasoning instead of pure vector similarity
- Transparency into how answers are derived
- PDF RAG workflow
  - Upload a PDF for focused analysis
- Graph-based retrieval
  - Documents are converted into a knowledge graph of concepts and relationships
- LLM-powered reasoning
  - Uses Ollama-hosted models (e.g., `llama3.2`) via LangChain
- Interactive Streamlit UI
  - Chat-style interface for querying the document
- Graph traversal visualization
  - Visual insight into how the system navigates the knowledge graph
- Local & private
  - No external API calls; all processing runs on your machine
The Streamlit UI allows users to upload a PDF and interactively query it using a chat-style interface.
This visualization shows how DocNexus traverses the knowledge graph to assemble context for answering a query.
High-level flow:

1. PDF Upload (Streamlit)
   - User uploads a PDF through the UI
2. Document Loading & Chunking
   - `PyPDFLoader` extracts text
   - Text is chunked for processing
3. Knowledge Graph Construction
   - Concepts and entities are extracted using an LLM
   - Nodes represent concepts
   - Edges represent semantic relationships with weights
4. Graph RAG Querying
   - A user query triggers graph traversal
   - Relevant nodes are prioritized and explored
   - Context is assembled from traversal results
5. LLM Answer Generation
   - The selected context is sent to Ollama
   - The final answer is generated and returned to the UI
6. Visualization
   - Traversal path and graph structure are rendered using NetworkX + Matplotlib
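Steps 3–5 above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names, the `(concept, relation, concept, weight)` triple format, and the best-first traversal strategy are all assumptions.

```python
# Minimal sketch of steps 3-5: build a weighted concept graph from
# LLM-extracted triples, traverse it from query-relevant seed nodes,
# and assemble context for the answer prompt. All names and the
# traversal strategy here are illustrative assumptions.
import heapq
from collections import defaultdict

def build_graph(triples):
    """Undirected adjacency map: node -> {neighbor: (relation, weight)}."""
    g = defaultdict(dict)
    for src, rel, dst, weight in triples:
        g[src][dst] = (rel, weight)
        g[dst][src] = (rel, weight)
    return g

def traverse(g, seeds, max_nodes=10):
    """Best-first exploration from the seed nodes, following
    higher-weight (more relevant) edges before lower-weight ones."""
    visited, order = set(), []
    frontier = [(0.0, s) for s in seeds if s in g]
    heapq.heapify(frontier)
    while frontier and len(order) < max_nodes:
        cost, node = heapq.heappop(frontier)
        if node in visited:
            continue
        visited.add(node)
        order.append(node)
        for nbr, (_, weight) in g[node].items():
            if nbr not in visited:
                # Negate the weight so heavier edges pop first.
                heapq.heappush(frontier, (cost - weight, nbr))
    return order

def assemble_context(g, order):
    """Render traversed relationships as text lines for the LLM prompt."""
    lines = []
    for a, b in zip(order, order[1:]):
        if b in g[a]:
            rel, _ = g[a][b]
            lines.append(f"{a} --{rel}--> {b}")
    return "\n".join(lines)

def make_prompt(context, question):
    # Step 5 would send a prompt like this to the Ollama-hosted model.
    return f"Use only this context:\n{context}\n\nQuestion: {question}"

triples = [
    ("RAG", "uses", "retrieval", 0.9),
    ("retrieval", "feeds", "LLM", 0.8),
    ("LLM", "generates", "answer", 0.7),
]
g = build_graph(triples)
order = traverse(g, seeds=["RAG"])
context = assemble_context(g, order)
```

The negative-weight priority queue is just one way to express "relevant nodes are prioritized and explored"; the real system may rank nodes differently.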
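Step 6's rendering with NetworkX + Matplotlib (the libraries the flow names) might look like the sketch below; the `render` helper, layout choice, and styling are assumptions.

```python
# Sketch of rendering the knowledge graph with the traversal path
# highlighted. NetworkX + Matplotlib are what the project names;
# this particular helper, layout, and styling are assumptions.

def path_edges(path):
    """Consecutive node pairs along a traversal path:
    ["a", "b", "c"] -> [("a", "b"), ("b", "c")]."""
    return list(zip(path, path[1:]))

def render(edges, path, outfile="traversal.png"):
    # Imports live here so path_edges stays usable without plotting deps.
    import matplotlib
    matplotlib.use("Agg")  # headless backend: render straight to a file
    import matplotlib.pyplot as plt
    import networkx as nx

    g = nx.Graph()
    g.add_edges_from(edges)
    pos = nx.spring_layout(g, seed=42)  # deterministic layout
    nx.draw_networkx(g, pos, node_color="lightgray")
    # Re-draw the traversed edges on top, emphasized.
    nx.draw_networkx_edges(g, pos, edgelist=path_edges(path),
                           edge_color="red", width=2.5)
    plt.axis("off")
    plt.savefig(outfile)
    plt.close()
```

In the Streamlit UI the figure would more likely be handed to `st.pyplot` than saved to disk; `savefig` here just keeps the sketch headless.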
- Docker & Docker Compose
- Optional: NVIDIA GPU + NVIDIA Container Toolkit (for GPU inference)
- Python 3.11 (inside container)
- Ollama
- Streamlit
```shell
git clone https://github.com/SeanClay10/doc-nexus
cd doc-nexus
docker compose up --build
```

After the first build, you can run:

```shell
docker compose up
```

- Streamlit UI: http://localhost:8501
- Ollama API: http://localhost:11434
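A compose file matching the ports above might look like the sketch below; the repository's actual service names, image tags, and build context may differ, so treat this as an illustration only.

```yaml
# Illustrative sketch only -- service names and images are assumptions.
services:
  ollama:
    image: ollama/ollama
    ports:
      - "11434:11434"
  app:
    build: .            # Streamlit app built from the repo's Dockerfile
    ports:
      - "8501:8501"
    depends_on:
      - ollama
```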
- Upload a PDF
- Wait for graph construction
- Ask questions in the chat interface
- Inspect traversal visualizations
DocNexus is intended as an experimental platform for exploring RAG, context-aware AI workflows, and local LLM deployment.

