Foundry-ML simplifies access to machine learning-ready datasets in materials science and chemistry.
- Search & Load - Find and use curated datasets with a few lines of code
- Understand - Rich schemas describe what each field means
- Cite - Automatic citation generation for publications
- Publish - Share your datasets with the community
- AI-Ready - MCP server for Claude and other AI assistants

```bash
pip install foundry-ml
```

```python
from foundry import Foundry

# Connect
f = Foundry()

# Search
results = f.search("band gap", limit=5)

# Load
dataset = results.iloc[0].FoundryDataset
X, y = dataset.get_as_dict()['train']

# Understand
schema = dataset.get_schema()
print(schema['fields'])

# Cite
print(dataset.get_citation())
```

For Google Colab or remote Jupyter:
```python
f = Foundry(no_browser=True, no_local_server=True)
```

```bash
foundry search "band gap"
foundry schema 10.18126/abc123
foundry --help
```

```bash
foundry mcp install  # Add to Claude Code
```

| Feature | Description |
|---|---|
| Search | Find datasets by keyword, DOI, or browse catalog |
| Load | Automatic download, caching, and format conversion |
| PyTorch/TensorFlow | dataset.get_as_torch(), dataset.get_as_tensorflow() |
| CLI | Terminal-based workflows |
| MCP Server | AI assistant integration |
| HuggingFace Export | Publish to HuggingFace Hub |
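
To work with the field descriptions that a schema provides, the sketch below turns a schema-like dictionary into one readable line per field. It does not call Foundry-ML at all: the layout it expects (a `fields` list of dictionaries with `name`, `role`, `description`, and optional `units` keys) is an assumption made for illustration, and `summarize_fields` and `example_schema` are hypothetical names. Check the actual output of `dataset.get_schema()` before relying on these keys.

```python
# Sketch only: the schema layout used here (a "fields" list of dicts with
# "name", "role", "description", and optional "units" keys) is an assumption,
# not a documented Foundry-ML contract.

def summarize_fields(schema):
    """Return one readable line per field in a schema-like dict."""
    lines = []
    for field in schema.get("fields", []):
        name = field.get("name", "<unnamed>")
        role = field.get("role", "unknown role")
        units = field.get("units")
        description = field.get("description", "no description")
        # Only show units when the field declares them
        unit_text = f" [{units}]" if units else ""
        lines.append(f"{name} ({role}){unit_text}: {description}")
    return lines


# Hypothetical stand-in for the value returned by dataset.get_schema()
example_schema = {
    "fields": [
        {"name": "composition", "role": "input",
         "description": "Chemical formula"},
        {"name": "band_gap", "role": "target", "units": "eV",
         "description": "Computed band gap"},
    ]
}

for line in summarize_fields(example_schema):
    print(line)
```

If the real schema uses different keys, only the `field.get(...)` lookups need to change; missing keys fall back to placeholder text rather than raising an error.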

Browse datasets at Foundry-ML.org or:

```python
f = Foundry()
f.list(limit=20)  # See available datasets
```

If you use Foundry-ML, please cite:

```bibtex
@article{Schmidt2024,
  doi = {10.21105/joss.05467},
  year = {2024},
  publisher = {The Open Journal},
  volume = {9},
  number = {93},
  pages = {5467},
  author = {Kj Schmidt and Aristana Scourtas and Logan Ward and others},
  title = {Foundry-ML - Software and Services to Simplify Access to Machine Learning Datasets in Materials Science},
  journal = {Journal of Open Source Software}
}
```

Foundry is open source. To contribute:
- Fork from main
- Make your changes
- Open a Pull Request
See CONTRIBUTING.md for details.
This work was supported by the National Science Foundation under NSF Award Number: 1931306 "Collaborative Research: Framework: Machine Learning Materials Innovation Infrastructure".
Foundry integrates with Materials Data Facility, DLHub, and MAST-ML.