Conversation
The outer _apiTimeout (30s) was firing before the service's transcription timeout (60s) could complete, causing spurious TimeoutException errors for users.
Now relies on OpenRouterService's internal timeouts:
- 60s for audio transcription
- 30s for passage recognition
Co-authored-by: Shelley <shelley@exe.dev>
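For reference, the failure mode described in this commit can be reproduced in miniature. The sketch below is in Python rather than the app's own code, the function names are hypothetical, and the timeouts are scaled down from 30s/60s to 0.3s/0.6s. It only illustrates why a shorter outer timeout wrapped around a longer inner one fires first.

```python
import asyncio

# Scaled-down stand-ins for the 30s outer and 60s inner timeouts
# named in the commit message. All names here are hypothetical.
OUTER_TIMEOUT = 0.3
TRANSCRIPTION_TIMEOUT = 0.6


async def transcribe(work_seconds: float) -> str:
    """Pretend service call guarded by its own, longer timeout."""
    async def upload_and_wait() -> str:
        await asyncio.sleep(work_seconds)  # simulated upload + model latency
        return "transcript"

    return await asyncio.wait_for(upload_and_wait(), TRANSCRIPTION_TIMEOUT)


async def with_outer_timeout(work_seconds: float) -> str:
    """Old shape: a shorter outer timeout wraps the service call."""
    try:
        return await asyncio.wait_for(transcribe(work_seconds), OUTER_TIMEOUT)
    except asyncio.TimeoutError:
        return "timeout"


async def without_outer_timeout(work_seconds: float) -> str:
    """New shape: only the service's own timeout applies."""
    try:
        return await transcribe(work_seconds)
    except asyncio.TimeoutError:
        return "timeout"
```

With a simulated 0.45s transcription, `with_outer_timeout` gives up at 0.3s even though the service would have finished inside its own 0.6s budget, while `without_outer_timeout` returns the transcript.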
User-facing:
- Clear message about slow internet connection
Error logs now include:
- Timeout duration (60s transcription, 30s recognition)
- Audio size in bytes and duration in seconds
- Transcription text length for recognition errors
Co-authored-by: Shelley <shelley@exe.dev>
Co-authored-by: Shelley <shelley@exe.dev>
Made transcriptionTimeout and recognitionTimeout public in OpenRouterService so error messages stay in sync with actual values.
Co-authored-by: Shelley <shelley@exe.dev>
Added logInfo() to ErrorLoggerService. Now logs audio size (KB) and duration (s) before each transcription attempt, helping diagnose whether large files cause timeouts on certain devices.
Co-authored-by: Shelley <shelley@exe.dev>
Log now shows:
- WAV file size (KB)
- Base64 payload size (KB), the actual upload size
- Audio duration (s)
- Model name being used
Co-authored-by: Shelley <shelley@exe.dev>
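Base64 encodes every 3 bytes as 4 characters, so the uploaded payload is about a third larger than the WAV file itself, which is presumably why both sizes are logged. A minimal Python sketch of that calculation follows; the helper name is hypothetical and this is not the app's logging code.

```python
import base64


def payload_sizes_kb(audio_bytes: bytes) -> tuple[float, float]:
    """Return (raw size, Base64-encoded size) in KB for an audio buffer."""
    raw_kb = len(audio_bytes) / 1024
    # The encoded length is what actually travels in the request body.
    b64_kb = len(base64.b64encode(audio_bytes)) / 1024
    return raw_kb, b64_kb
```

For a 3 KB buffer this returns `(3.0, 4.0)`, matching the 4/3 expansion.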
Switch from PCM/WAV streaming to AAC file recording:
- AAC-LC at 64kbps is ~10-20x smaller than uncompressed PCM
- A 5.7 min recording: ~14MB WAV → ~700KB AAC
- Much more likely to upload within timeout on mobile networks
Changes:
- Record to temp .m4a file instead of streaming PCM chunks
- Send compressed audio directly (no WAV encoding needed)
- Playback uses file path instead of in-memory bytes
- Clean up temp files on discard/dispose
Co-authored-by: Shelley <shelley@exe.dev>
Co-authored-by: Shelley <shelley@exe.dev>
Co-authored-by: Shelley <shelley@exe.dev>
Changes Requested
Summary of Changes
Overall Feedback
Great work updating the audio recording to use AAC compression and file-based storage! This should significantly reduce the file sizes and improve performance. The switch from streaming byte chunks to direct file recording simplifies the code and makes it more efficient. I have a few suggestions regarding error handling and documentation that would make this even better. @Pertempto
Summary of Changes
Overall Feedback
This PR introduces some solid improvements to the transcription feature. 👏 The switch to AAC compression and automatic segmentation should help with both storage and API limits, which is a big win! 💾 The extended timeout and improved error handling with detailed logs will make debugging much easier. 🛠️ I noticed the removal of the "Draft:" prefix in the PR title - nice touch for clarity! 😊 Overall, these changes seem well-thought-out and address real user experience issues. @Pertempto, good job tackling this!
- Add detailed error logging for API failures (full response body)
- Improve error detail extraction (include code and metadata)
- Show user-friendly message for recordings over 5 min that hit API limits
- Log shows 772s recording hit 400 error - likely Gemini audio limit
Co-authored-by: Shelley <shelley@exe.dev>
Long recordings (>4 min) are automatically split into segments:
- Timer monitors recording duration
- Auto-splits at 4 min boundaries (seamlessly continues recording)
- Each segment transcribed separately then concatenated
- Supports recordings of any length (12+ min tested)
Also improved error logging:
- Full response body logged on API errors
- Better error detail extraction with code/metadata
Co-authored-by: Shelley <shelley@exe.dev>
Keep recording as single continuous file for seamless playback. Split into 2MB chunks only when sending to API for transcription.
- Single file recording: full audio playback works
- Chunking on transcribe: splits by bytes, transcribes each, concatenates
- Simpler code: no timers or segment management during recording
Co-authored-by: Shelley <shelley@exe.dev>
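The "splits by bytes, transcribes each, concatenates" flow above can be sketched roughly as follows. This is a Python illustration with hypothetical names, not the app's code; `transcribe` stands in for whatever call sends one chunk to the API. Note that a later commit in this thread points out that raw byte splitting alone is not enough for WAV, since each chunk needs its own header.

```python
from typing import Callable

CHUNK_SIZE = 2 * 1024 * 1024  # 2MB, the figure given in the commit message


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[bytes]:
    """Split a byte buffer into consecutive chunks of at most chunk_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def transcribe_in_chunks(data: bytes,
                         transcribe: Callable[[bytes], str],
                         chunk_size: int = CHUNK_SIZE) -> str:
    """Transcribe each chunk in order and join the non-empty results."""
    parts = [transcribe(chunk).strip()
             for chunk in split_into_chunks(data, chunk_size)]
    return " ".join(part for part in parts if part)
```

Joining with a single space is the simplest possible merge. It does nothing about words or sentences cut at a chunk boundary, which is the concern the reviews below raise.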
Changes Requested
Summary of Changes
Overall Feedback
@Pertempto This is a solid improvement to handle transcription timeouts! Moving from PCM to AAC should significantly reduce file sizes and using file paths instead of in-memory buffers is much cleaner. I particularly like the chunking approach for larger recordings. The enhanced error logging in OpenRouterService will definitely help with debugging transcription issues. Just a few points to verify around audio format support and potential edge cases with chunked processing.
📱 Preview APK built!
Changes Requested
Summary of Changes
Overall Feedback
Good improvements to the transcription workflow! 👍 Moving to file-based recording and compression should help with performance and reliability. The chunking logic is a smart way to handle API limits. Just keep an eye on how well the chunked transcriptions merge back together, especially with natural sentence breaks.
Gemini may not recognize 'm4a' as a valid format. M4A is just a container - the codec is AAC, so report as 'aac'.
Co-authored-by: Shelley <shelley@exe.dev>
AAC/M4A wasn't being recognized by Gemini ('Model input cannot be empty').
Opus/OGG is compressed like AAC but has better API support.
- Record as .opus instead of .m4a
- Map opus -> ogg format for API
Co-authored-by: Shelley <shelley@exe.dev>
📱 Preview APK built!
Changes Requested
Summary of Changes
Overall Feedback
The changes look good overall. Increasing the transcription timeout and adding chunking support for large audio files should help with transcription reliability. The refactor to record directly to a file simplifies the audio handling logic. @Pertempto, please check the audio format handling for m4a files and consider updating the chunking comment for clarity.
OpenRouter's input_audio only supports mp3 and wav formats. AAC/Opus were being rejected with 'Model input cannot be empty'.
Changes:
- Record as WAV instead of Opus
- Increase chunk size to 4MB (~2 min audio)
- Increase timeout to 120s for larger uploads
Co-authored-by: Shelley <shelley@exe.dev>
Changes Requested
Summary of Changes
Overall Feedback
The audio chunking and improved error handling look good overall. The timeout increase and format support should help with transcription reliability. @Pertempto, be sure to verify the WAV chunking handles various edge cases in header formats. The logging improvements will definitely help with debugging transcription issues! 🎉
WAV files need headers for each chunk - can't just split raw bytes. Now each chunk gets a valid WAV header with correct size fields.
Added detailed logging to error log:
- Chunk sizes and estimated durations
- Preview of each chunk's transcription result (first/last 50 chars)
- Final combined transcription length
Co-authored-by: Shelley <shelley@exe.dev>
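To illustrate what "a valid WAV header with correct size fields" involves, here is a minimal Python sketch, not the app's implementation. It assumes the canonical 44-byte PCM header and a 16 kHz, mono, 16-bit recording; that format is an assumption, though at 32,000 bytes per second a 4MB chunk would last about 131s, which fits the "~2 min" figure in the earlier commit. Function names are hypothetical.

```python
import struct


def wav_header(data_size: int, sample_rate: int = 16000,
               channels: int = 1, bits_per_sample: int = 16) -> bytes:
    """Build a 44-byte PCM WAV header for data_size bytes of samples."""
    block_align = channels * bits_per_sample // 8   # bytes per frame
    byte_rate = sample_rate * block_align           # bytes per second
    return struct.pack(
        "<4sI4s4sIHHIIHH4sI",
        b"RIFF", 36 + data_size, b"WAVE",           # RIFF size = file size - 8
        b"fmt ", 16, 1, channels, sample_rate,      # 16-byte fmt chunk, 1 = PCM
        byte_rate, block_align, bits_per_sample,
        b"data", data_size,                         # size of the samples only
    )


def split_pcm_into_wav_chunks(pcm: bytes, chunk_bytes: int,
                              sample_rate: int = 16000, channels: int = 1,
                              bits_per_sample: int = 16) -> list[bytes]:
    """Split raw PCM samples into standalone WAV files of ~chunk_bytes each."""
    block_align = channels * bits_per_sample // 8
    # Round down to a whole number of frames so no sample is cut in half.
    step = chunk_bytes - chunk_bytes % block_align
    if step <= 0:
        raise ValueError("chunk_bytes must hold at least one frame")
    chunks = []
    for start in range(0, len(pcm), step):
        body = pcm[start:start + step]
        header = wav_header(len(body), sample_rate, channels, bits_per_sample)
        chunks.append(header + body)
    return chunks
```

The input here is the sample data with the original file's header already stripped. Real recorders may write extra chunks before the data chunk, so a fixed 44-byte offset is not guaranteed, which is the header edge case the review above asks to verify.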
📱 Preview APK built!