Rag-pipeline-pdf

Rag-pipeline-pdf is a Python project that implements a Retrieval-Augmented Generation (RAG) pipeline for PDF documents. It combines information retrieval with AI-powered generation so that users can query large collections of PDFs and get answers grounded in the retrieved document text.

Key Features

PDF Text Extraction: Automatically extracts and preprocesses text from PDF documents.
Vector Database Storage: Uses ChromaDB to store embeddings of the extracted text for fast similarity-based retrieval.
RAG Pipeline: Integrates retrieval with AI models to provide context-aware answers from the documents.
Interactive Notebooks: Jupyter notebooks included for testing, exploring, and demonstrating the pipeline.
Scalable & Modular: Can be extended to larger datasets or integrated with other AI applications.
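
The extraction and retrieval features above can be illustrated with a minimal, standard-library sketch. It is not the project's actual code: `chunk_text`, `embed`, `cosine`, and `retrieve` are hypothetical names, the chunk sizes are arbitrary, and a bag-of-words count stands in for the learned embeddings that would normally be stored in ChromaDB. It assumes the PDF text has already been extracted into a plain string.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into overlapping character windows (hypothetical helper)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk.strip():
            chunks.append(chunk)
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


def embed(text):
    """Toy embedding: lowercase word counts, a stand-in for a real model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]
```

In a real pipeline the `embed` and `retrieve` steps would be delegated to the vector database; the sketch only shows the shape of the data flow from text to ranked chunks.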

What I Did

Built the end-to-end pipeline from PDF ingestion to AI query.
Implemented vectorization and storage for quick retrieval.
Enabled AI-assisted querying to fetch precise answers from PDFs.
Organized the project for easy experimentation and extension.
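
As a rough illustration of the query step, the sketch below assembles retrieved passages into a prompt and hands it to a generator. The function names `build_prompt` and `answer`, the prompt wording, and the `retriever` and `generator` callables are all assumptions made for this example; the repository may structure this differently.

```python
def build_prompt(question, contexts):
    """Format retrieved passages and the question into a single prompt."""
    numbered = "\n".join(
        f"[{i}] {c.strip()}" for i, c in enumerate(contexts, start=1)
    )
    return (
        "Answer the question using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{numbered}\n\n"
        f"Question: {question}\nAnswer:"
    )


def answer(question, retriever, generator, k=3):
    """Retrieve k passages, build the prompt, and call the generator.

    retriever: callable (question, k) -> list of passage strings (assumed).
    generator: callable (prompt) -> answer string (assumed), e.g. an LLM client.
    """
    contexts = retriever(question, k)
    prompt = build_prompt(question, contexts)
    return generator(prompt)
```

Passing the retriever and generator as plain callables keeps the sketch independent of any particular vector store or model provider, so either side can be swapped during experimentation.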

This project serves as a base for building intelligent document search systems and can be extended to handle multiple document types or integrated into web applications.

Folders and files

Rag-pipeline-pdf
data
notebook
src
.DS_Store
.gitattributes
.gitignore
.python-version
README.md
main.py
pyproject.toml
requirements.txt
uv.lock

manyasharma1008/Rag-pipeline-pdf
