fix: transcription timeout by Pertempto · Pull Request #76 · el-apps/Daily-Manna

Pertempto · 2026-02-06T09:27:17Z

No description provided.

The outer _apiTimeout (30s) was firing before the service's transcription timeout (60s) could complete, causing spurious TimeoutException errors for users.

Now relies on OpenRouterService's internal timeouts:
- 60s for audio transcription
- 30s for passage recognition

Co-authored-by: Shelley <shelley@exe.dev>
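The failure mode described in this commit can be reproduced in a few lines. This is a minimal Python sketch, not the app's Dart code: `fake_transcribe`, `with_outer_timeout`, `without_outer_timeout`, and the scaled-down durations are hypothetical stand-ins for the 30s outer and 60s inner timeouts.

```python
import asyncio

OUTER_TIMEOUT = 0.1  # stands in for the 30s outer _apiTimeout
INNER_TIMEOUT = 0.4  # stands in for the 60s transcription timeout


async def fake_transcribe(work_seconds: float) -> str:
    """Hypothetical service call that enforces its own (inner) timeout."""
    async def work() -> str:
        await asyncio.sleep(work_seconds)
        return "transcript"

    return await asyncio.wait_for(work(), timeout=INNER_TIMEOUT)


async def with_outer_timeout(work_seconds: float) -> str:
    # The outer limit is shorter than the inner one, so it fires first even
    # though the call would have finished within the service's own budget.
    try:
        return await asyncio.wait_for(
            fake_transcribe(work_seconds), timeout=OUTER_TIMEOUT
        )
    except asyncio.TimeoutError:
        return "spurious timeout"


async def without_outer_timeout(work_seconds: float) -> str:
    # Rely only on the service's internal timeout, as the commit describes.
    try:
        return await fake_transcribe(work_seconds)
    except asyncio.TimeoutError:
        return "timeout"
```

With a simulated request that takes 0.2s, the wrapped call gives up at 0.1s while the unwrapped call completes inside the 0.4s inner budget.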

User-facing:
- Clear message about slow internet connection

Error logs now include:
- Timeout duration (60s transcription, 30s recognition)
- Audio size in bytes and duration in seconds
- Transcription text length for recognition errors

Co-authored-by: Shelley <shelley@exe.dev>

Co-authored-by: Shelley <shelley@exe.dev>

Made transcriptionTimeout and recognitionTimeout public in OpenRouterService so error messages stay in sync with actual values. Co-authored-by: Shelley <shelley@exe.dev>

Added logInfo() to ErrorLoggerService. Now logs audio size (KB) and duration (s) before each transcription attempt, helping diagnose if large files cause timeouts on certain devices. Co-authored-by: Shelley <shelley@exe.dev>

Log now shows:
- WAV file size (KB)
- Base64 payload size (KB) - actual upload size
- Audio duration (s)
- Model name being used

Co-authored-by: Shelley <shelley@exe.dev>
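The gap between file size and upload size comes from base64 encoding, which emits 4 output bytes for every 3 input bytes. The following Python sketch illustrates that relationship; `base64_size` and `describe_upload` are hypothetical helpers and the log format is an assumption, not the app's actual output.

```python
import base64


def base64_size(raw_bytes: int) -> int:
    """Length in bytes of the padded base64 encoding of raw_bytes bytes."""
    return 4 * ((raw_bytes + 2) // 3)


def describe_upload(wav: bytes, duration_s: float, model: str) -> str:
    """Build a log line similar in spirit to the one described above."""
    payload = base64.b64encode(wav)
    return (
        f"wav={len(wav) / 1024:.1f}KB "
        f"base64={len(payload) / 1024:.1f}KB "
        f"duration={duration_s:.1f}s model={model}"
    )
```

Logging both numbers makes it visible that the request body is roughly a third larger than the recording on disk.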

Switch from PCM/WAV streaming to AAC file recording:
- AAC-LC at 64kbps is ~10-20x smaller than uncompressed PCM
- A 5.7 min recording: ~14MB WAV → ~700KB AAC
- Much more likely to upload within timeout on mobile networks

Changes:
- Record to temp .m4a file instead of streaming PCM chunks
- Send compressed audio directly (no WAV encoding needed)
- Playback uses file path instead of in-memory bytes
- Clean up temp files on discard/dispose

Co-authored-by: Shelley <shelley@exe.dev>

Co-authored-by: Shelley <shelley@exe.dev>

github-actions · 2026-02-06T09:29:17Z

Changes Requested

Consider adding a comment explaining that AAC format is supported by the transcription service, or note any potential quality differences from WAV
Add try/catch around File(_audioFilePath!).delete().ignore() in _clearAudio() method for safer file cleanup
Verify that using timestamp in temp filename doesn't introduce any security concerns (though it likely doesn't)
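The cleanup suggestion above amounts to a best-effort delete that never throws into the caller. This is a small Python sketch of that idea rather than the app's Dart; `delete_quietly` is a hypothetical helper name.

```python
import os
from typing import Optional


def delete_quietly(path: Optional[str]) -> bool:
    """Best-effort delete: True if a file was removed, False otherwise."""
    if path is None:
        return False
    try:
        os.remove(path)
        return True
    except OSError:
        # Missing file, permission problem, and similar errors are swallowed:
        # temp-file cleanup should not crash discard or dispose paths.
        return False
```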

Summary of Changes

Switched from PCM WAV recording to AAC compression for smaller audio files
Updated audio recording to use file-based storage instead of streaming byte chunks
Modified transcription process to handle AAC files with .m4a extension
Improved cleanup of temporary audio files
Adjusted transcription timeout error messages to reflect AAC format

Overall Feedback

Great work updating the audio recording to use AAC compression and file-based storage! This should significantly reduce the file sizes and improve performance. The switch from streaming byte chunks to direct file recording simplifies the code and makes it more efficient.

I have a few suggestions regarding error handling and documentation that would make this even better. @Pertempto

github-actions · 2026-02-06T09:37:55Z

Summary of Changes

Increased transcription timeout from 30s to 60s
Switched audio recording format from PCM to AAC for better compression
Implemented automatic audio segmentation every 4 minutes to stay under API limits
Added proper temporary file cleanup on dispose
Improved error logging with more detailed information

Overall Feedback

This PR introduces some solid improvements to the transcription feature. 👏 The switch to AAC compression and automatic segmentation should help with both storage and API limits, which is a big win! 💾 The extended timeout and improved error handling with detailed logs will make debugging much easier. 🛠️ I noticed the removal of the "Draft:" prefix in the PR title - nice touch for clarity! 😊 Overall, these changes seem well-thought-out and address real user experience issues. @Pertempto, good job tackling this!

- Add detailed error logging for API failures (full response body)
- Improve error detail extraction (include code and metadata)
- Show user-friendly message for recordings over 5 min that hit API limits
- Log shows 772s recording hit 400 error - likely Gemini audio limit

Co-authored-by: Shelley <shelley@exe.dev>

Long recordings (>4 min) are automatically split into segments:
- Timer monitors recording duration
- Auto-splits at 4 min boundaries (seamlessly continues recording)
- Each segment transcribed separately then concatenated
- Supports recordings of any length (12+ min tested)

Also improved error logging:
- Full response body logged on API errors
- Better error detail extraction with code/metadata

Co-authored-by: Shelley <shelley@exe.dev>

Keep recording as single continuous file for seamless playback. Split into 2MB chunks only when sending to API for transcription.
- Single file recording: full audio playback works
- Chunking on transcribe: splits by bytes, transcribes each, concatenates
- Simpler code: no timers or segment management during recording

Co-authored-by: Shelley <shelley@exe.dev>

github-actions · 2026-02-06T10:07:45Z

Changes Requested

Audio format compatibility
- Suggestion: Double-check OpenRouter API documentation to confirm 'm4a' format support for transcription input
Device compatibility
- Suggestion: Add a fallback to PCM if AAC encoding fails on some devices
Chunked transcription quality
- Suggestion: Test with boundary words to ensure chunk splitting doesn't break meaningful phrases

Summary of Changes

Switched audio recording from PCM16 to AAC format for better compression
Implemented file-based recording instead of in-memory chunks
Added chunking logic for large audio files during transcription
Improved error handling and logging in OpenRouter service
Updated app version to 0.15.4+28

Overall Feedback

@Pertempto This is a solid improvement to handle transcription timeouts! Moving from PCM to AAC should significantly reduce file sizes and using file paths instead of in-memory buffers is much cleaner. I particularly like the chunking approach for larger recordings.

The enhanced error logging in OpenRouterService will definitely help with debugging transcription issues. Just a few points to verify around audio format support and potential edge cases with chunked processing.

github-actions · 2026-02-06T10:15:17Z

📱 Preview APK built! 0.15.4-pr76-5bae63c

⬇️ Download APK

github-actions · 2026-02-06T10:19:44Z

Changes Requested

Add try-catch around file operations like file.readAsBytes() to gracefully handle cases where the temp file might not exist or be accessible
Ensure that audio chunking preserves codec frame boundaries (especially important for compressed formats like Opus) to avoid corrupting the audio data at split points
Consider improving the transcription joining logic to handle cases where sentence boundaries fall across chunk splits, possibly by overlapping chunks slightly or implementing smarter text merging
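One way to read the "smarter text merging" suggestion: if adjacent chunks overlap slightly in audio, the words repeated at each boundary can be dropped when joining. The Python sketch below shows that idea only; `merge_transcripts` is a hypothetical helper and nothing in this thread indicates the PR implements it.

```python
def merge_transcripts(parts: list[str]) -> str:
    """Join chunk transcripts, dropping words repeated across a boundary."""
    merged: list[str] = []
    for part in parts:
        words = part.split()
        overlap = 0
        # Look for the longest suffix of the text so far that matches a
        # prefix of the next chunk, comparing case-insensitively.
        for size in range(min(len(merged), len(words)), 0, -1):
            tail = [w.lower() for w in merged[-size:]]
            head = [w.lower() for w in words[:size]]
            if tail == head:
                overlap = size
                break
        merged.extend(words[overlap:])
    return " ".join(merged)
```

This only helps when the chunks really do overlap in audio. Without overlap it can wrongly collapse a word that is legitimately repeated across a boundary, so it would need testing against real recitations.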

Summary of Changes

Switched audio recording from PCM16 to Opus compression for smaller file sizes and better API compatibility
Implemented file-based recording using path_provider instead of streaming bytes
Added audio chunking logic to handle large recordings by splitting them into 2MB chunks before transcription
Improved error handling and logging in the OpenRouter service, including better parsing of API error responses
Updated version number from 0.15.3+27 to 0.15.4+28

Overall Feedback

Good improvements to the transcription workflow! 👍 Moving to file-based recording and compression should help with performance and reliability. The chunking logic is a smart way to handle API limits. Just keep an eye on how well the chunked transcriptions merge back together, especially with natural sentence breaks.

Gemini may not recognize 'm4a' as a valid format. M4A is just a container - the codec is AAC, so report as 'aac'. Co-authored-by: Shelley <shelley@exe.dev>

AAC/M4A wasn't being recognized by Gemini ('Model input cannot be empty'). Opus/OGG is compressed like AAC but has better API support.
- Record as .opus instead of .m4a
- Map opus -> ogg format for API

Co-authored-by: Shelley <shelley@exe.dev>

github-actions · 2026-02-06T11:34:43Z

📱 Preview APK built! 0.15.4-pr76-00e465b

⬇️ Download APK

github-actions · 2026-02-06T11:36:44Z

Changes Requested

Audio format handling in openrouter_service.dart
- Issue: m4a was removed from supportedFormats list, but is explicitly mapped to aac above.
- Suggestion: Either re-add m4a to supportedFormats or remove the explicit mapping for m4a to ensure consistent handling.
Audio chunking comment clarity in recitation_mode.dart
- Issue: Comment says WAV at 16kHz mono 16-bit = 32KB/s, so ~2 min per chunk = 3.8MB, but 4MB is used as the limit.
- Suggestion: Update comment to clarify that 4MB is used as a safe limit under API's ~20MB limit.
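The figures quoted in this review can be checked with a short calculation. The Python sketch below assumes uncompressed 16kHz mono 16-bit PCM, as the quoted comment states, and assumes "4MB" means 4 MiB, which the thread does not specify. It ignores the WAV header.

```python
def pcm_bytes_per_second(sample_rate_hz: int, channels: int, bits_per_sample: int) -> int:
    """Data rate of uncompressed PCM audio in bytes per second."""
    return sample_rate_hz * channels * bits_per_sample // 8


def chunk_duration_seconds(chunk_bytes: int, bytes_per_second: int) -> float:
    """How many seconds of audio fit in a chunk of the given size."""
    return chunk_bytes / bytes_per_second


rate = pcm_bytes_per_second(16_000, 1, 16)  # 32,000 bytes per second
two_minutes = rate * 120                    # 3,840,000 bytes, about 3.8MB
four_mib = 4 * 1024 * 1024                  # 4,194,304 bytes
seconds_per_chunk = chunk_duration_seconds(four_mib, rate)  # about 131 seconds
```

Under those assumptions a 4 MiB chunk holds a little over two minutes of audio, which is consistent with the "~2 min per chunk" wording.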

Summary of Changes

Increase transcription timeout from 60s to 120s
Update audio format handling for m4a and opus files
Improve error parsing from OpenRouter API
Refactor recitation mode to record directly to file instead of streaming bytes
Add chunking logic for large audio files
Update version to 0.15.4+28

Overall Feedback

The changes look good overall. Increasing the transcription timeout and adding chunking support for large audio files should help with transcription reliability. The refactor to record directly to a file simplifies the audio handling logic. @Pertempto, please check the audio format handling for m4a files and consider updating the chunking comment for clarity.

OpenRouter's input_audio only supports mp3 and wav formats. AAC/Opus were being rejected with 'Model input cannot be empty'.

Changes:
- Record as WAV instead of Opus
- Increase chunk size to 4MB (~2 min audio)
- Increase timeout to 120s for larger uploads

Co-authored-by: Shelley <shelley@exe.dev>
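Given that finding, a format check before upload would turn an opaque API rejection into an explicit local error. The Python sketch below is hypothetical: `audio_format_for` and `SUPPORTED_FORMATS` are illustrative names, and the supported set is taken from this commit message rather than verified against current API documentation.

```python
from pathlib import PurePath

# Taken from the commit message above; an assumption to re-check against
# the API documentation before relying on it.
SUPPORTED_FORMATS = {"wav", "mp3"}


def audio_format_for(path: str) -> str:
    """Return the format string to send for a file, or raise ValueError."""
    ext = PurePath(path).suffix.lower().lstrip(".")
    if ext not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported audio format: {ext or '(none)'}")
    return ext
```

A check like this is also easy to unit test, which is one of the things the reviews in this thread ask for.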

github-actions · 2026-02-07T13:10:27Z

Changes Requested

Audio format mapping in openrouter_service.dart
- Consider validating supported formats against API documentation or adding unit tests for the format detection logic.
- Add documentation or comments explaining why certain formats map to others (e.g. m4a -> aac).
Audio chunking implementation
- Add validation for WAV header structure to handle non-standard WAV files.
- Consider the impact of chunk boundaries on transcription accuracy and add appropriate overlap or padding.
- Fix calculation comment: WAV at 16kHz mono 16-bit = 32,000 bytes/sec (not 32KB/s as stated in comments).

Summary of Changes

Increased transcription timeout from 60s to 120s
Updated audio recording to use WAV format and file-based storage instead of in-memory streams
Added audio chunking support for large recordings with proper WAV header handling
Improved error logging and handling in OpenRouter service
Added detailed logging for transcription process

Overall Feedback

The audio chunking and improved error handling look good overall. The timeout increase and format support should help with transcription reliability. @Pertempto, be sure to verify the WAV chunking handles various edge cases in header formats. The logging improvements will definitely help with debugging transcription issues! 🎉

WAV files need headers for each chunk - can't just split raw bytes. Now each chunk gets a valid WAV header with correct size fields.

Added detailed logging to error log:
- Chunk sizes and estimated durations
- Preview of each chunk's transcription result (first/last 50 chars)
- Final combined transcription length

Co-authored-by: Shelley <shelley@exe.dev>
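The per-chunk header idea can be sketched as follows. This is a Python illustration under stated assumptions, not the PR's Dart code: the input is assumed to be 16kHz mono 16-bit PCM with a plain 44-byte header and no extra chunks, and `wav_header`, `split_wav`, and `frame_count` are hypothetical names.

```python
import io
import struct
import wave

HEADER_LEN = 44  # canonical PCM WAV header; files with extra chunks differ


def wav_header(data_len: int, sample_rate: int = 16_000,
               channels: int = 1, bits: int = 16) -> bytes:
    """Build a 44-byte PCM WAV header for data_len bytes of samples."""
    block_align = channels * bits // 8
    byte_rate = sample_rate * block_align
    fmt = struct.pack("<IHHIIHH", 16, 1, channels, sample_rate,
                      byte_rate, block_align, bits)
    return (b"RIFF" + struct.pack("<I", 36 + data_len) + b"WAVE"
            + b"fmt " + fmt
            + b"data" + struct.pack("<I", data_len))


def split_wav(wav: bytes, max_data_bytes: int, block_align: int = 2) -> list[bytes]:
    """Split a PCM WAV into standalone WAV chunks, each with its own header."""
    pcm = wav[HEADER_LEN:]
    # Round the step down to a whole number of frames so no sample is cut.
    step = max(block_align, max_data_bytes - max_data_bytes % block_align)
    chunks = []
    for start in range(0, len(pcm), step):
        data = pcm[start:start + step]
        chunks.append(wav_header(len(data)) + data)
    return chunks


def frame_count(chunk: bytes) -> int:
    """Parse a chunk with the standard library to confirm its header is valid."""
    with wave.open(io.BytesIO(chunk), "rb") as reader:
        return reader.getnframes()
```

Each returned chunk is a self-contained file, so it can be uploaded on its own. A real implementation would locate the `data` chunk rather than assume a fixed 44-byte offset, which is the edge case the review above raises.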

github-actions · 2026-02-07T13:55:27Z

📱 Preview APK built! 0.15.4-pr76-72d14c7

⬇️ Download APK

exe.dev user and others added 9 commits February 4, 2026 22:35

chore: bump version to 0.15.3+27

b8ccaff

Co-authored-by: Shelley <shelley@exe.dev>

refactor: expose timeout constants for error messages

c13b508

Made transcriptionTimeout and recognitionTimeout public in OpenRouterService so error messages stay in sync with actual values. Co-authored-by: Shelley <shelley@exe.dev>

feat(logging): log audio file size before upload

fb7ebb4

Added logInfo() to ErrorLoggerService. Now logs audio size (KB) and duration (s) before each transcription attempt, helping diagnose if large files cause timeouts on certain devices. Co-authored-by: Shelley <shelley@exe.dev>

feat(logging): add base64 size and model to transcription logs

9155ec7

Log now shows:
- WAV file size (KB)
- Base64 payload size (KB) - actual upload size
- Audio duration (s)
- Model name being used

Co-authored-by: Shelley <shelley@exe.dev>

merge: incorporate upstream main

6cb3f24

Co-authored-by: Shelley <shelley@exe.dev>

chore: bump version to 0.15.4+28

93b80f5

Co-authored-by: Shelley <shelley@exe.dev>

exe.dev user and others added 3 commits February 6, 2026 09:55

exe.dev user and others added 2 commits February 6, 2026 11:23

fix(transcription): map m4a to aac format for Gemini API

00e465b

Gemini may not recognize 'm4a' as a valid format. M4A is just a container - the codec is AAC, so report as 'aac'. Co-authored-by: Shelley <shelley@exe.dev>

Pertempto merged commit 727429d into main Feb 7, 2026
3 checks passed

Pertempto merged 16 commits into main from fix/transcription-timeout
