🎬 Daily Diary — AI That Turns Your Daily day into a beautiful memory

daily.dairy.demo-mini.mp4

What is this?

Daily Diary lets users create short daily videos just by talking. Before bed, you tell the AI about your day — it listens, understands the story, finds related photos and videos, adds narration and music, and produces a short “memory movie.” It captures not only what happened, but how it felt.

How does it work?

It is an intelligent photo memory assistant that combines real-time voice conversation with visual analysis. Upload photos and engage in meaningful conversations about your memories while the AI analyzes your images and asks thoughtful questions to help you reflect on your experiences.

Out put video

I took photos at the hackathon and made this video using Daily diary

vid_f0a24376-60af-4d98-833e-858d552aeaad.mp4

How Gemini models and Pipecat used

Gemini Integration:

Gemini 2.5 gemini-2.5-flash for real-time voice conversations via Pipecat's Gemini integration
Gemini 2.5 gemini-2.5-flash-image for intelligent photo analysis, generating empathetic responses about user memories
Custom prompts designed for emotional understanding and memory exploration

Pipecat Integration:

Real-time WebRTC voice communication through Daily.co transport
Custom pipeline handling photo uploads and analysis results
Voice UI Kit components for polished user experience

Other tools used

AWS S3, Lambda: Secure photo storage with presigned URL uploads, video generation
Daily.co: WebRTC infrastructure for real-time communication

What we built new during the hackathon

Feedback on tools used

Gemini:

Excellent: Natural conversation flow with minimal latency
Great: Easy integration with Pipecat's existing infrastructure
Suggestion: More examples of custom prompt engineering for specific use cases. Role is different from other models (Gemini only has "Model", "User") which was a bit confusing.

Pipecat:

Loved: Voice UI Kit components saved significant development time!
Challenge: While Pipecat is super flexible and has many ways to achieve something, I often was not sure what's the best way. For instance, when handling a frame, there is push_frame and queue_frame. I wasn't quite sure what's the difference between these two functions and I wish there were a good example that illustrates the behavior.

Project structure

client/: Next.js app with Voice UI Kit components and resizable layout
server/: Python bot integrating Gemini with Pipecat

Name		Name	Last commit message	Last commit date
Latest commit History 83 Commits
.claude		.claude
.github/workflows		.github/workflows
client		client
infrastructure		infrastructure
server		server
specs		specs
.gitignore		.gitignore
README.md		README.md
capture.gif		capture.gif

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

🎬 Daily Diary — AI That Turns Your Daily day into a beautiful memory

What is this?

How does it work?

Out put video

How Gemini models and Pipecat used

Other tools used

What we built new during the hackathon

Feedback on tools used

Project structure

About

Uh oh!

Releases

Packages

Contributors 2

Uh oh!

Languages

tomoima525/daily-diary

Folders and files

Latest commit

History

Repository files navigation

🎬 Daily Diary — AI That Turns Your Daily day into a beautiful memory

What is this?

How does it work?

Out put video

How Gemini models and Pipecat used

Other tools used

What we built new during the hackathon

Feedback on tools used

Project structure

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Uh oh!

Languages

Packages