Revolutionary Vector Database Distribution: Encode your knowledge base into MP4 video files and distribute them globally through CDNs for lightning-fast semantic search.
Ragged transforms how we think about vector database distribution by leveraging mature video streaming infrastructure. Instead of running complex database deployments, you upload an MP4 file to any CDN and get global semantic search.
# Install dependencies
poetry install
# Configure the R2 bucket by creating a .env file with the following variables. Any S3-compatible provider should work, but I have only tested with R2.
R2_BUCKET=ragged
R2_ENDPOINT=<cloudflare-r2-endpoint>
R2_ACCESS_KEY=<cloudflare-r2-access-key>
R2_SECRET_KEY=<cloudflare-r2-secret-key>
# Next, start the model server. This is optional, but warming the embedding model reduces processing time.
python3 ragged/video/model_server.py --start
# Now build the MP4 and other data from Wikipedia. This automatically uploads the files to R2.
python3 ragged/video/wiki_upload.py --max-articles 1000
# Search the knowledge base we just built
python3 ragged/video/search.py "machine" --show-performance --detailed
# If you want to run benchmarks
python3 ragged/video/benchmarks.py --benchmark
- The first search run will be slow because the FAISS index and the manifest are fetched once from the cloud. Even this can be warmed up (future enhancement).
- A separate model server helps a lot with performance. I strongly recommend running one.
- You might notice that the similarity scores are somewhat low. That is a fair critique, but the point of this library and demo is to show the MP4 storage and cloud retrieval functionality. People far smarter and more efficient than I am have solved those problems, and with some effort the quality of results can be improved (future enhancement).
a. Run model server --
Makua-202506-2815.38.10.mp4
b. Encode knowledge base --
Makua-202506-2815.43.32.mp4
c. Query first run --
Makua-202506-2814.59.46.mp4
d. Query subsequent runs --
Makua-202506-2815.03.31.mp4
- Complex server deployments
- Expensive hosting infrastructure
- Cold-start penalties
- Regional latency issues
- Database connection limits
- MP4 files → Upload anywhere (Cloudflare R2, AWS S3, etc.)
- CDN distribution → Global edge caching automatically
- HTTP range requests → Download only what you need
- Zero servers → Serverless and edge-computing ready
- Infinite scale → No connection limits
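As a rough illustration of the HTTP range request idea above, the sketch below fetches only a byte span of a remote file using Python's standard library. The function names `range_header` and `fetch_range` are hypothetical and are not part of Ragged's API. It assumes the server honors `Range` requests and answers with a partial `206` response.

```python
import urllib.request


def range_header(offset: int, length: int) -> str:
    """Build an HTTP Range header value for `length` bytes starting at `offset`."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    # Byte ranges are inclusive on both ends, hence the -1.
    return f"bytes={offset}-{offset + length - 1}"


def fetch_range(url: str, offset: int, length: int, timeout: float = 10.0) -> bytes:
    """Download only the requested byte span of a remote file (hypothetical helper)."""
    request = urllib.request.Request(
        url, headers={"Range": range_header(offset, length)}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # 206 means the server honored the range; 200 means it sent the whole file.
        if response.status != 206:
            raise RuntimeError(
                f"server did not return partial content (status {response.status})"
            )
        return response.read()
```

For example, `fetch_range(url, 4096, 1024)` would ask for bytes 4096 through 5119 of the file at `url`, so a client that knows a fragment's offset and size never has to download the whole MP4.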
graph TD
A[Documents] --> B[Text Chunking]
B --> C[Vector Encoding]
C --> D[MP4 Fragments]
D --> E[MP4 Container]
E --> F[CDN Distribution]
F --> G[Global Search]
H[FAISS Index] --> F
I[JSON Manifest] --> F
- Input: Your documents (PDFs, text files, web content)
- Processing: Smart chunking with overlap and topic extraction
- Encoding: Convert to vectors using sentence-transformers
- Packaging: Encode vectors into MP4 fragments with metadata
- Distribution: Upload to any CDN (Cloudflare R2, AWS CloudFront, etc.)
- Search: Lightning-fast semantic search from anywhere in the world
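The "chunking with overlap" step in the pipeline above can be pictured with a minimal sketch. This is a simple word-based sliding window written for illustration. The function name `chunk_text` and its default sizes are assumptions, and Ragged's own chunker (which also does topic extraction) may work differently.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 40) -> List[str]:
    """Split text into word-based chunks that share `overlap` words with the previous chunk."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the window already reached the end of the text
    return chunks
```

With `chunk_text("a b c d e f g h i j", chunk_size=4, overlap=1)` the result is `["a b c d", "d e f g", "g h i j"]`: each chunk repeats the last word of the previous one, so a sentence that straddles a boundary is still seen whole by at least one chunk.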
- Standards Compliant: ISO/IEC 14496-12 MP4 containers
- Fragment-Based: Optimized chunk sizes for CDN performance
- Rich Metadata: Topic classification, timestamps, source attribution
- Binary Efficiency: Float32 vectors with JSON metadata
- HTTP Range Requests: Surgical data access, download only needed fragments
- Intelligent Prefetching: Background loading of adjacent fragments
- Multi-Level Caching: Memory, disk, and CDN edge caching
- Global Performance: Consistent search speed worldwide
- Semantic Search: Natural language queries using sentence-transformers
- Topic Filtering: Search within specific topics or domains
- FAISS Integration: Exact and approximate similarity search
- Similarity Thresholds: Configurable result quality filtering
- Cold-Start Optimization: 3-second initialization vs 45+ seconds for traditional DBs
- Infinite Readers: No database connection limits
- Edge Computing: Works in Cloudflare Workers, AWS Lambda, etc.
- Bandwidth Efficient: 92% less data transfer for initial loads
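To make the "Float32 vectors with JSON metadata" item above more concrete, here is a small sketch of how a fragment payload could be serialized. The byte layout (a 4-byte length prefix, a JSON header, then raw little-endian float32 values) is an assumption chosen for illustration and is not Ragged's actual fragment format. `pack_fragment` and `unpack_fragment` are hypothetical names.

```python
import json
import struct
from typing import Dict, List, Tuple


def pack_fragment(vectors: List[List[float]], metadata: Dict) -> bytes:
    """Serialize vectors as little-endian float32 values behind a JSON header."""
    dim = len(vectors[0]) if vectors else 0
    if any(len(v) != dim for v in vectors):
        raise ValueError("all vectors must have the same dimension")
    # Record count and dim in the header so the payload can be reshaped on read.
    header = dict(metadata, count=len(vectors), dim=dim)
    header_bytes = json.dumps(header).encode("utf-8")
    flat = [x for v in vectors for x in v]
    payload = struct.pack(f"<{len(flat)}f", *flat)
    # Illustrative layout: [uint32 header length][JSON header][float32 payload]
    return struct.pack("<I", len(header_bytes)) + header_bytes + payload


def unpack_fragment(blob: bytes) -> Tuple[Dict, List[List[float]]]:
    """Inverse of pack_fragment: return (header, vectors)."""
    (header_len,) = struct.unpack_from("<I", blob, 0)
    header = json.loads(blob[4:4 + header_len].decode("utf-8"))
    count, dim = header["count"], header["dim"]
    flat = struct.unpack_from(f"<{count * dim}f", blob, 4 + header_len)
    vectors = [list(flat[i * dim:(i + 1) * dim]) for i in range(count)]
    return header, vectors
```

Each float32 costs 4 bytes, so a 384-dimensional embedding takes 1,536 bytes of payload. A blob like this would then be wrapped in an MP4 box by the encoder. That wrapping step is not shown here.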
ragged/
├── ragged/ # Core package
│   ├── video/ # Video encoding/decoding
│   │   ├── encoder.py # MP4 vector encoding
│   │   ├── decoder.py # CDN-optimized decoding
│   │   └── config.py # Video codec settings
│   ├── api/ # FastAPI web service
│   │   └── v1/endpoints/ # REST API endpoints
│   ├── services/ # Business logic
│   │   └── uploader/ # CDN upload services
│   ├── enterprise/ # Enterprise features (WIP)
│   └── main.py # FastAPI app entry point
├── examples/ # Usage examples
└── pyproject.toml # Dependencies
BENCHMARK SUMMARY: Obtained by running benchmarks.py against a random dataset. To be honest, I feel that while the system is very good, these numbers are a bit generous. Critiques of the benchmark script are welcome.
- Performance Grade: A (10.0 ms avg)
- Quality Grade: F (43.3% relevance)
- Throughput: 100.9 queries/sec
- Cache Hit Rate: 100.0%
Detailed Metrics:
- Cold Start p95: 129 ms
- Warm Search p95: 10 ms
- Query Encoding: 8.0 ms
- Result Diversity: 40.0%
- Memory Usage: 607.5 MB
P.S. Quality depends heavily on which articles you get from the wiki dataset, so it will vary from run to run.
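For readers who want to sanity-check figures like the average, p95, and throughput above, the sketch below shows one common way to derive them from per-query timings. It is not taken from benchmarks.py. The helper names and the nearest-rank percentile method are assumptions, and the real script may measure things differently.

```python
import math
import time
from typing import Callable, Dict, List


def percentile(samples: List[float], pct: float) -> float:
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def time_queries(search: Callable[[str], object], queries: List[str]) -> List[float]:
    """Run each query once and return the wall-clock latency of each call in milliseconds."""
    latencies = []
    for query in queries:
        start = time.perf_counter()
        search(query)
        latencies.append((time.perf_counter() - start) * 1000)
    return latencies


def summarize(latencies_ms: List[float]) -> Dict[str, float]:
    """Reduce a list of latencies to average, p95, and sequential throughput."""
    if not latencies_ms:
        raise ValueError("latencies_ms must not be empty")
    total_seconds = sum(latencies_ms) / 1000
    return {
        "avg_ms": sum(latencies_ms) / len(latencies_ms),
        "p95_ms": percentile(latencies_ms, 95),
        # Queries completed per second of total search time, run one after another.
        "queries_per_sec": len(latencies_ms) / total_seconds,
    }
```

Under this definition, sequential throughput is simply the reciprocal of the average latency, so an average near 10 ms implies roughly 100 queries per second. Timing cold and warm runs separately keeps a single slow first query from dominating the p95.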
- Knowledge Base Search: Documentation, FAQs, internal wikis
- RAG Applications: Retrieval-augmented generation systems
- Global Applications: Multi-region deployments with consistent performance
- Edge Computing: Serverless functions, IoT devices, mobile apps
- Cost-Sensitive Deployments: Startups, side projects, research
- Frequent Updates: Real-time indexing requirements
- Complex Queries: Multi-stage filtering, analytical workloads
- Traditional CRUD: Applications needing database transactions
- Core MP4 encoding/decoding
- CDN-optimized distribution
- FastAPI web service
- PDF upload pipeline
- Production deployment guides
- Performance optimization
- Multi-modal vectors (images, audio)
- Streaming updates (incremental changes)
- Advanced search (hybrid, faceted)
- Enterprise SSO integration
- Standard MP4 boxes for vectors
- P2P distribution networks
- Edge AI processing
- Ecosystem integrations
Welcome
- Update README for new features
- Add docstrings for new functions
- Include usage examples
Read our arXiv paper (ragged.pdf): "Ragged: Leveraging Video Container Formats for Efficient Vector Database Distribution"
This project was inspired by Memvid, which demonstrated storing data in video formats. Ragged extends this concept with vector-specific optimizations, CDN distribution, and semantic search capabilities.
- FAISS: Efficient similarity search
- Sentence Transformers: Text embedding models
- FastAPI: Modern web framework
MIT License - see LICENSE file for details.
- Video Streaming Community: For the mature CDN infrastructure we leverage
- FAISS Team: For efficient similarity search algorithms
- Sentence-Transformers: For high-quality text embeddings
- Memvid: For the initial inspiration of storing data in video formats
- Open Source Community: For the foundational libraries that make this possible
Star this repo if you find it useful!
Questions? Open an issue or start a discussion!
Fun Fact: Your entire knowledge base is now a video file that can be streamed, cached, and distributed just like any YouTube video, but instead of cat videos, it's semantic search!