Weave moments into an everlasting cosmic memory: Enabling AI to reason, adapt, and recall across modalities
💬 Reason, interact, remember — all in one universe.
MemVerse is an open-source framework designed to provide continuous, multimodal memory for AI agents. It organizes and retrieves information from text, images, and other sensory inputs, enabling agents to remember past interactions, understand context, and adapt dynamically. By structuring memories as hierarchical knowledge graphs and combining them with fast parametric recall, MemVerse empowers AI to reason, interact, and provide coherent, personalized experiences across extended conversations.
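As a rough mental model of the hierarchical-graph-plus-parametric-recall design described above, the sketch below pairs a tree-structured memory node with a flat key-value cache. It is illustrative only; `MemoryNode` and `ParametricCache` are hypothetical names, not the MemVerse API.

```python
# Illustrative sketch only: MemoryNode and ParametricCache are hypothetical
# names, not part of the MemVerse API.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class MemoryNode:
    """One memory in a hierarchical knowledge graph."""
    content: str                  # text, or a reference to an image/audio/video asset
    modality: str = "text"
    children: list = field(default_factory=list)  # finer-grained sub-memories

class ParametricCache:
    """Fast flat key-value recall layered over the slower graph traversal."""
    def __init__(self) -> None:
        self._store: dict = {}

    def put(self, key: str, node: MemoryNode) -> None:
        self._store[key] = node

    def get(self, key: str) -> Optional[MemoryNode]:
        return self._store.get(key)

# A session memory with a nested multimodal detail, plus fast recall by key.
session = MemoryNode("Conversation on 2025-12-01")
session.children.append(MemoryNode("photo of a whiteboard diagram", modality="image"))
cache = ParametricCache()
cache.put("last_session", session)
print(cache.get("last_session").content)
```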
📄 Paper Available on arXiv — https://arxiv.org/abs/2512.03627
On benchmark evaluations, agents equipped with MemVerse achieve 84.48% accuracy on ScienceQA and 90.40% R@1 on MSR-VTT retrieval, demonstrating enhanced multimodal reasoning, adaptive learning, and coherent performance across extended interactions.
[2025-12-19] 🐳 MemVerse Docker v1.1.0 Released!
[2025-12-01] 🎉 🎉 🎉 MemVerse v1.0.0 Released!
Build AI memory that continuously evolves, remembering what matters while automatically correcting past mistakes, so every memory grows from accurate understanding rather than meaningless accumulation.
🌐 Fast Integration: One-line install. Works with any LLM framework. No complex setup required.
🗂️ Multimodal Support: Remember text, images, audio, and video. Process and retrieve across all modalities.
⚡ Cost Efficient: 90% token savings. Scale without breaking the bank.
- Python 3.10+
- At least 4GB of available RAM (for memory storage)
- Access to the MemVerse API
You can create a Conda environment and install dependencies using `requirements.txt`:
```bash
conda create --name memverse python=3.10
conda activate memverse
pip install -r requirements.txt
```

Or set up the environment with the provided YML:

```bash
conda env create -f environment.yml
```

- Start the MemVerse API server
```bash
uvicorn app:app --host 0.0.0.0 --port 8000 --reload
```

- Insert new memory
Send a POST request to `/insert` with text, image, video, or audio. Example using curl:

```bash
curl -X POST "http://127.0.0.1:8000/insert" \
  -F "query=Hello MemVerse!" \
  -F "image=@path/to/image.jpg" \
  -F "video=@path/to/video.mp4" \
  -F "audio=@path/to/audio.wav"
```
- Query memory

```bash
curl -X POST "http://127.0.0.1:8000/query" \
  -F "query=Hello MemVerse!"
```

For ease of deployment and reproducibility, MemVerse provides a pre-built Docker image that bundles both the FastAPI service and the MCP server.
- Pull the Docker Image

Make sure Docker is installed on your system.

```bash
docker pull yifeisunecust/memverse:v1.1.0
```
- Start MemVerse Services
Run the container with required environment variables:
```bash
docker run -d \
  --name memverse \
  -p 8000:8000 \
  -p 5250:5250 \
  -e OPENAI_API_KEY="YOUR_OPENAI_API_KEY" \
  -e OPENAI_API_BASE="http://35.220.164.252:3888/v1" \
  yifeisunecust/memverse:v1.1.0
```

This command will start:
- FastAPI server → http://localhost:8000
- MCP server → http://localhost:5250
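To confirm that both services came up, a quick connectivity check (a standard-library sketch; the ports are the ones mapped in the `docker run` command above):

```python
# Check that both containerized services are accepting connections.
# Uses only the Python standard library; ports match the -p mappings above.
import socket

for name, port in (("FastAPI", 8000), ("MCP", 5250)):
    try:
        with socket.create_connection(("localhost", port), timeout=5):
            print(f"{name} server is listening on port {port}")
    except OSError as exc:
        print(f"{name} server not reachable on port {port}: {exc}")
```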
- Use MCP Server
Terminal 2 — Run the MCP client on your host machine (outside Docker):

```bash
python mcp_client.py  # demo
```
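The bundled `mcp_client.py` is the authoritative client. As a rough illustration of what such a client does, below is a sketch using the official MCP Python SDK (`pip install mcp`). The SSE endpoint path `/sse` and the transport are assumptions; consult `mcp_client.py` for the actual values.

```python
# Sketch of an MCP client using the official `mcp` Python SDK
# (pip install mcp). The SSE endpoint path "/sse" is an assumption;
# mcp_client.py in this repo is the authoritative reference.
import asyncio

from mcp import ClientSession
from mcp.client.sse import sse_client

async def main() -> None:
    async with sse_client("http://localhost:5250/sse") as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Discover the memory tools the MemVerse MCP server exposes.
            tools = await session.list_tools()
            for tool in tools.tools:
                print(tool.name, "-", tool.description)

asyncio.run(main())
```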
- Use FastAPI
Terminal 3 — Try the FastAPI service from your host machine (outside Docker):
```bash
curl -X POST "http://127.0.0.1:8000/query" \
  -F "query=Hello MemVerse!"
```

If you are using a Linux system, you can install QEMU user-mode emulation:

```bash
sudo apt install -y qemu-user-static
```

Pull the arm64 architecture image:

```bash
docker pull --platform linux/arm64 yifeisunecust/memverse:v1.1.0
```

Start the servers:
```bash
docker run -d \
  --name memverse \
  -p 8000:8000 \
  -p 5250:5250 \
  -e OPENAI_API_KEY="YOUR_OPENAI_API_KEY" \
  -e OPENAI_API_BASE="http://35.220.164.252:3888/v1" \
  --platform linux/arm64 yifeisunecust/memverse:v1.1.0
```

The rest is the same as above.
ScienceQA: MemVerse-enhanced GPT-4o-mini achieves an accuracy of 84.48%, showing that parametric memory enables fast, context-aware reasoning even when questions have limited sequential dependencies. The model effectively integrates long-term knowledge for subject-specific reasoning in natural science, social science, and language tasks.
MSR-VTT: By leveraging a memory-based knowledge graph and semantic associations between captions, MemVerse achieves 90.40% R@1 in text-to-video retrieval and 89.20% R@1 in video-to-text retrieval. This demonstrates that structured memory greatly enhances multimodal semantic matching, enabling lightweight models to retrieve relevant information efficiently while capturing rich reasoning from large pretrained models.
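As a toy illustration of the caption-level semantic matching step (not the MemVerse pipeline itself, which layers the memory knowledge graph on top), text-to-video retrieval reduces to nearest-neighbor search over embedding vectors:

```python
# Toy illustration of embedding-based text-to-video matching. The embeddings
# are random stand-ins; MemVerse additionally organizes captions in a memory
# knowledge graph, which this sketch does not model.
import numpy as np

rng = np.random.default_rng(0)
caption_embeddings = rng.normal(size=(1000, 512))  # one vector per stored video caption
query_embedding = rng.normal(size=512)             # embedded text query

# Cosine similarity, then take the best match (R@1 checks whether it is correct).
caps = caption_embeddings / np.linalg.norm(caption_embeddings, axis=1, keepdims=True)
q = query_embedding / np.linalg.norm(query_embedding)
scores = caps @ q
best = int(np.argmax(scores))
print(f"top-1 video index: {best}, score: {scores[best]:.3f}")
```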
Ecosystem & Extensions
If you use MemVerse in your research, please cite our paper:
```bibtex
@misc{Liu_2025_MemVerse,
  title={MemVerse: Multimodal Memory for Lifelong Learning Agents},
  author={Junming Liu and Yifei Sun and Weihua Cheng and Haodong Lei and Yirong Chen and Licheng Wen and Xuemeng Yang and Daocheng Fu and Pinlong Cai and Nianchen Deng and Yi Yu and Shuyue Hu and Botian Shi and Ding Wang},
  year={2025},
  eprint={2512.03627},
  archivePrefix={arXiv},
  primaryClass={cs.AI},
  url={https://arxiv.org/abs/2512.03627},
}
```


