Semantic Search for Autonomous Vehicle & Robotics Datasets
Find any frame in terabytes of sensor data using natural language. 100% local. Zero cloud dependencies.
Engineers working with autonomous vehicles and robotics generate terabytes of sensor data. Finding specific scenarios—a pedestrian jaywalking, a cyclist at dusk, a truck blocking an intersection—means hours of manual review or brittle keyword searches through metadata.
Prism lets you search your image and video datasets with natural language:
"red car turning left at intersection"
"pedestrian with umbrella crossing street"
"construction zone with orange cones"
Prism pairs two vision models, YOLOv8 for object detection and Google SigLIP for semantic embeddings, and runs both entirely on your local machine. Your data never leaves your network.
- Python 3.9+ (GPU recommended: CUDA or Apple MPS)
- Go 1.21+
# Clone the repository
git clone https://github.com/sjanney/prism.git
cd prism
# Install dependencies and build
make install
make build
# Launch Prism
./run_prism.sh
- Select Index New Data → enter data/sample → press Enter
- Wait for indexing to complete (~10 seconds)
- Select Search Dataset → type car → press Enter
You're now searching images with natural language.
💡 The first run downloads the AI models (~2GB). Later launches skip the download.
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Images/   │────▶│   YOLOv8    │────▶│   SigLIP    │
│   Videos    │     │  Detection  │     │  Embedding  │
└─────────────┘     └─────────────┘     └─────────────┘
                                               │
                                               ▼
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Search    │◀────│   Vector    │◀────│   SQLite    │
│   Results   │     │ Similarity  │     │   + NumPy   │
└─────────────┘     └─────────────┘     └─────────────┘
- Indexing: For each image/video frame, Prism detects objects (cars, pedestrians, signs) and generates semantic embeddings
- Storage: Embeddings are stored locally in SQLite with NumPy vector blobs
- Search: Your query is embedded and compared against all indexed frames using cosine similarity
- Results: Matching frames are ranked and displayed in the TUI
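The storage and search steps above can be sketched in a few lines of Python. This is a minimal illustration, not Prism's actual code: the `frames` table, its columns, and the helper names are hypothetical, the three-dimensional vectors stand in for real SigLIP embeddings, and the standard-library `array` module is swapped in for NumPy so the example runs without extra dependencies.

```python
import math
import sqlite3
from array import array


def to_blob(vector):
    """Pack a list of floats into a float32 byte blob."""
    return array("f", vector).tobytes()


def from_blob(blob):
    """Unpack a float32 byte blob back into a list of floats."""
    values = array("f")
    values.frombytes(blob)
    return list(values)


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(conn, query_vector, top_k=3):
    """Score every stored frame against the query and return the best matches."""
    rows = conn.execute("SELECT path, embedding FROM frames").fetchall()
    scored = [(cosine_similarity(query_vector, from_blob(blob)), path) for path, blob in rows]
    scored.sort(reverse=True)
    return scored[:top_k]


# Hypothetical schema: one row per indexed frame.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE frames (path TEXT PRIMARY KEY, embedding BLOB)")

# Toy embeddings; a real index would hold model-generated vectors.
toy_index = {
    "frame_car.jpg": [1.0, 0.0, 0.0],
    "frame_cyclist.jpg": [0.0, 1.0, 0.0],
    "frame_truck.jpg": [0.7, 0.7, 0.0],
}
conn.executemany(
    "INSERT INTO frames VALUES (?, ?)",
    [(path, to_blob(vec)) for path, vec in toy_index.items()],
)

# A query vector that points mostly along the "car" direction.
results = search(conn, [0.9, 0.1, 0.0], top_k=2)
for score, path in results:
    print(f"{score:.3f}  {path}")
```

Because every stored vector is scored against the query, search cost grows linearly with the number of indexed frames in this sketch.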
| Challenge | Prism's Approach |
|---|---|
| Data Sensitivity | Proprietary AV data stays on your machine—no cloud uploads |
| Cost | No egress fees, no API costs, no subscription |
| Speed | Sub-second queries on local hardware |
| Compliance | Full control for GDPR, SOC2, and enterprise security requirements |
| Offline | Works without internet after initial model download |
| Feature | Description |
|---|---|
| Natural Language Search | Query with plain English: "truck at loading dock" |
| Video Indexing | Automatically extract and index frames from MP4, AVI, MOV, MKV |
| Object-Aware | YOLOv8 detects 80+ object classes for context-rich indexing |
| Cross-Platform | Runs on macOS (MPS), Linux (CUDA), and Windows (CPU/CUDA) |
| Terminal UI | Beautiful, keyboard-driven interface with real-time progress |
| gRPC API | Integrate Prism into your existing pipelines |
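The cross-platform support listed above pairs with the `device` setting in the configuration file (`auto`, `cuda`, `mps`, `cpu`). How `auto` might resolve to a concrete backend can be sketched as follows. The function name, the preference order (CUDA, then MPS, then CPU), and the choice to raise an error for an unavailable device are all assumptions for illustration, not Prism's documented behavior.

```python
def resolve_device(requested, cuda_available, mps_available):
    """Map a `device` config value to a concrete backend name."""
    available = {"cuda": cuda_available, "mps": mps_available, "cpu": True}
    if requested == "auto":
        # Assumed preference order: CUDA, then Apple MPS, then CPU.
        for name in ("cuda", "mps", "cpu"):
            if available[name]:
                return name
    if requested not in available:
        raise ValueError(f"unknown device: {requested!r}")
    if not available[requested]:
        raise RuntimeError(f"device {requested!r} requested but not available")
    return requested


# Example: a Mac with MPS but no CUDA.
print(resolve_device("auto", cuda_available=False, mps_available=True))
# Example: forcing CPU on a CUDA machine.
print(resolve_device("cpu", cuda_available=True, mps_available=False))
```

With PyTorch installed, the two availability flags would typically come from `torch.cuda.is_available()` and `torch.backends.mps.is_available()`. They are passed in as plain booleans here so the sketch runs without a GPU stack.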
Prism samples frames from each video at a fixed rate and caps the total per video:
- 1 frame per second by default (configurable)
- Max 300 frames per video to prevent index bloat
- Frames reference source video with timestamps
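To make the sampling rules above concrete, the sketch below computes which timestamps would be sampled for a video of a given length. The function name is hypothetical, and it assumes the cap is applied by truncation, keeping the first `max_frames` samples. Prism may instead spread the samples across the whole video.

```python
import math


def sample_timestamps(duration_s, frames_per_second=1.0, max_frames=300):
    """Return the timestamps (in seconds) at which frames would be sampled."""
    if duration_s <= 0 or frames_per_second <= 0 or max_frames <= 0:
        return []
    interval = 1.0 / frames_per_second
    # Samples start at t=0 and stop before the end of the video,
    # or once the per-video cap is reached (assumed: truncation).
    count = min(math.ceil(duration_s * frames_per_second), max_frames)
    return [round(i * interval, 3) for i in range(count)]


short_clip = sample_timestamps(10)   # 10-second clip at the default rate
long_clip = sample_timestamps(600)   # 10-minute clip hits the 300-frame cap
print(len(short_clip), len(long_clip))
```

Under these assumptions, a 10-minute video at the default rate is only covered for its first 300 seconds. Lowering `frames_per_second` trades temporal resolution for coverage.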
# ~/.prism/config.yaml
video:
  enabled: true
  frames_per_second: 1.0
  max_frames_per_video: 300
device: auto  # auto, cuda, mps, cpu

| Guide | Description |
|---|---|
| Getting Started | Detailed installation and first run |
| Architecture | System design and data flow |
| Configuration | All configuration options |
| Benchmarks | Performance testing and diagnostics |
| API Reference | gRPC service documentation |
| Error Codes | Troubleshooting common issues |
| Component | Technology |
|---|---|
| Frontend | Go, Bubbletea, Lipgloss |
| Backend | Python, PyTorch, gRPC |
| Detection | YOLOv8 (Ultralytics) |
| Embeddings | SigLIP-SO400M (Google) |
| Storage | SQLite, NumPy |
- Prism Pro: Unlimited indexing, S3/GCP ingestion, remote GPU mode
- Export: YOLO/COCO format output for training pipelines
- Clustering: Automatic scene grouping and anomaly detection
- Web UI: Browser-based interface option
We welcome contributions! Please see CONTRIBUTING.md for guidelines.
# Run the test suite
make test
# Check formatting
make fmt
Apache 2.0; see LICENSE for details.
Built with ❤️ for the AV & robotics community by Shane Janney
