Overview

OpenData is a collection of open-source databases built on a common, object-native storage and infrastructure foundation. This shared foundation means every database has a virtually identical operational profile, which makes our database fleet materially easier and cheaper to operate than alternatives.

The common foundation has two distinct components:

SlateDB as the common storage layer: SlateDB is an object-store-native LSM tree that handles write batching, tiered caching, and compaction. It provides snapshot isolation via atomic manifest updates using S3's compare-and-set. Each individual OpenData database implements its own domain-specific data structure and query layer on top of SlateDB.

OpenData as the common infrastructure layer: OpenData is the shared foundation for the layers above storage: service infrastructure, service catalog, admin tooling, distributed state infrastructure, configuration systems, and testing frameworks.

Taken together, OpenData gives you one storage engine and one set of operational tooling to learn across all systems. By inheriting these components, each database can focus on what is unique to it: query semantics, data layout, and optimizations.
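
To make the compare-and-set idea behind SlateDB's manifest updates concrete, here is a minimal sketch in Rust. It is not SlateDB's API or manifest format: `ObjectStore`, `put_if_version`, `CasError`, and `add_sst` are hypothetical names, and the object store is an in-memory stand-in. The point is only that a writer's update lands if and only if the manifest version it read is still the current one, so readers always see a complete manifest.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory stand-in for an object store that supports
/// conditional writes. Each object carries a version, similar in spirit
/// to an ETag. This is an illustration, not SlateDB's real interface.
#[derive(Default)]
pub struct ObjectStore {
    /// key -> (version, list of SST names recorded in the manifest)
    pub objects: HashMap<String, (u64, Vec<String>)>,
}

#[derive(Debug, PartialEq)]
pub enum CasError {
    VersionMismatch { expected: Option<u64>, actual: Option<u64> },
}

impl ObjectStore {
    /// Read the current version and body of an object, if it exists.
    pub fn get(&self, key: &str) -> Option<(u64, Vec<String>)> {
        self.objects.get(key).cloned()
    }

    /// Write `body` only if the stored version equals `expected`.
    /// `None` means "the object must not exist yet".
    pub fn put_if_version(
        &mut self,
        key: &str,
        expected: Option<u64>,
        body: Vec<String>,
    ) -> Result<u64, CasError> {
        let actual = self.objects.get(key).map(|(v, _)| *v);
        if actual != expected {
            return Err(CasError::VersionMismatch { expected, actual });
        }
        let next = actual.map_or(1, |v| v + 1);
        self.objects.insert(key.to_string(), (next, body));
        Ok(next)
    }
}

/// Record a new SST in the manifest. If another writer updated the
/// manifest between our read and our write, re-read and try again.
pub fn add_sst(store: &mut ObjectStore, sst: &str) -> u64 {
    loop {
        let (expected, mut ssts) = match store.get("manifest") {
            Some((version, body)) => (Some(version), body),
            None => (None, Vec::new()),
        };
        ssts.push(sst.to_string());
        match store.put_if_version("manifest", expected, ssts) {
            Ok(version) => return version,
            Err(_) => continue, // lost the race: retry against the new version
        }
    }
}
```

A writer holding a stale version gets `CasError::VersionMismatch` back and leaves the manifest untouched, which is what keeps the update atomic in this sketch.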

OpenData Databases

TSDB: Object-store-native timeseries. Prometheus remote-write compatible.
Log: Event streaming with a replayable log per key.
Vector: SPANN-style ANN search. Centroids in memory, posting lists on disk.
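
As a rough illustration of the "centroids in memory, posting lists on disk" split described for Vector, the sketch below ranks centroids against the query, reads only the posting lists of the closest few, and returns the nearest vectors found there. It is not the Vector database's actual code: `Index`, `search`, `nprobe`, and `dist2` are hypothetical names, and a `HashMap` stands in for the on-disk posting lists.

```rust
use std::collections::HashMap;

/// Squared Euclidean distance between two vectors of equal length.
pub fn dist2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Hypothetical SPANN-style index layout.
pub struct Index {
    /// One centroid per cluster, kept in memory.
    pub centroids: Vec<Vec<f32>>,
    /// Cluster id -> (vector id, vector). Stand-in for posting lists
    /// that would live on disk or in object storage.
    pub postings: HashMap<usize, Vec<(u64, Vec<f32>)>>,
}

impl Index {
    /// Return the ids of up to `k` nearest vectors, scanning only the
    /// posting lists of the `nprobe` centroids closest to the query.
    pub fn search(&self, query: &[f32], nprobe: usize, k: usize) -> Vec<u64> {
        // Step 1: rank centroids by distance to the query (memory only).
        let mut order: Vec<usize> = (0..self.centroids.len()).collect();
        order.sort_by(|&a, &b| {
            dist2(&self.centroids[a], query).total_cmp(&dist2(&self.centroids[b], query))
        });

        // Step 2: read the selected posting lists and score each vector.
        let mut hits: Vec<(f32, u64)> = Vec::new();
        for cluster in order.into_iter().take(nprobe) {
            if let Some(list) = self.postings.get(&cluster) {
                for (id, vector) in list {
                    hits.push((dist2(vector, query), *id));
                }
            }
        }

        // Step 3: keep the k closest candidates.
        hits.sort_by(|a, b| a.0.total_cmp(&b.0));
        hits.into_iter().take(k).map(|(_, id)| id).collect()
    }
}
```

Raising `nprobe` trades more posting-list reads for better recall, which is the main tuning knob in this kind of design.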

Roadmap

Each database has its own roadmap, documented in its README. SlateDB has its own roadmap as well. Here is what's in the pipeline for the rest of the shared foundation:

Common service infrastructure: server runtime with pluggable protocols, shared metrics, health checks.
Service Registry: discover and browse deployed databases.
Admin tooling: deploy, teardown, upgrade, inspect, migrate databases.
Benchmark and regression testing frameworks.
Distributed mode: state sharding and request routing.

Bigger ideas for later

Shared ingest layer: all writes go through a shared service for better batching. A compactor framework creates queryable state.
Flexible deployment modes: embedded writers + hosted readers, fully embedded, fully distributed, etc.
Deterministic simulation testing.

Get Involved

We are early and building in the open, with most discussions happening on Discord.

Want to build? Check out the open RFCs, which are the active design efforts, or pick up an issue labeled good-first-issue to start coding right away. You can also simply file a bug or a feature request!

Have opinions? What databases should exist under OpenData? What operational problems matter most? Open an issue or find us on Discord.

Want to follow along? Star the repo, join Discord, or sign the manifesto.
