Added a note about the maximum duration of realtime transcription sessions (3 hours) #105
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
|
|
@@ -3,26 +3,28 @@ | |||||
| description: "Media files limitations" | ||||||
| --- | ||||||
|
|
||||||
| We support almost all types of audio or video files with a tradeoff to be taken into account between the transfer time of specific formats that can generate big files and the time to convert the original format to the target one (WAV pcm 16KHz little-endian). | ||||||
|
|
||||||
| You can find an estimate of the conversion times in the table below. | ||||||
|
|
||||||
|
|
||||||
| ## Gladia API current limitations | ||||||
|
|
||||||
| <Tip> | ||||||
| Those limits will be gradually lifted to ensure the full stability and performance of the service for everyone. | ||||||
| </Tip> | ||||||
|
|
||||||
|
|
||||||
| - **Audio length**: The maximum length of audio that can be transcribed in a single request is currently | ||||||
| - **Audio length (pre-recorded)**: The maximum length of audio that can be transcribed in a single request is currently | ||||||
| 135 minutes. Attempts to transcribe longer audio files will result in an error. | ||||||
| Direct YouTube links are limited to 120 minutes instead of 135 minutes. | ||||||
|
|
||||||
| <Tip> | ||||||
| We support up to 4h15 audio length for enterprise plans. | ||||||
| </Tip> | ||||||
|
|
||||||
| - **Realtime session duration**: For [live (realtime) transcription](/chapters/live-stt/quickstart), a single WebSocket session cannot exceed **3 hours**. The session will be terminated after 3 hours; for longer events, start a new session before reaching the limit. | ||||||
|
|
||||||
|
Comment on lines +26 to +27 (Contributor):
Same terminology nit: "Realtime" → "Real-time". Same issue as in
Suggested fix
-- **Realtime session duration**: For [live (realtime) transcription](/chapters/live-stt/quickstart), a single WebSocket session cannot exceed **3 hours**. The session will be terminated after 3 hours; for longer events, start a new session before reaching the limit.
+- **Real-time session duration**: For [live (real-time) transcription](/chapters/live-stt/quickstart), a single WebSocket session cannot exceed **3 hours**. The session will be terminated after 3 hours; for longer events, start a new session before reaching the limit.
🧰 Tools
🪛 GitHub Check: Mintlify Validation (gladia-95) - vale-spellcheck
[warning] 26-26: chapters/limits-and-specifications/supported-formats.mdx#L26
||||||
| - **File size**: Audio files must not exceed 1000 MB in size. Larger files will not be accepted by the API. | ||||||
|
|
||||||
| ### Splitting oversize audio files | ||||||
|
|
@@ -31,7 +33,7 @@ | |||||
| Tools for Splitting Audio Files: | ||||||
|
|
||||||
| - **FFMPEG** : FFMPEG is a versatile command-line tool that can be used to manipulate audio and video files. It is a popular choice for splitting long audio files. | ||||||
| - **ffmpeg-python** : For Python users, ffmpeg-python is a wrapper around FFMPEG that provides a more Pythonic interface for interacting with FFMPEG. | ||||||
| - **prism-media** for Node.js : Node.js users can use prism-media for manipulating media files, including splitting audio files. | ||||||
| - **fluent-ffmpeg** for Node.js : Another option for Node.js users is fluent-ffmpeg, which offers a simpler and more fluent API for handling media files. | ||||||
|
|
||||||
|
|
@@ -41,16 +43,16 @@ | |||||
|
|
||||||
| | Source Format | Mime Type | Audio/Video | | ||||||
| |---------------|----------------|-------------| | ||||||
| | aac | audio/aac | Audio | | ||||||
| | ac3 | audio/ac3 | Audio | | ||||||
| | eac3 | audio/eac3 | Audio | | ||||||
| | flac | audio/flac | Audio | | ||||||
| | m4a | audio/mp4 | Audio | | ||||||
| | mp2 | audio/mpeg | Audio | | ||||||
| | mp3 | audio/mpeg | Audio | | ||||||
| | ogg | application/ogg| Audio | | ||||||
| | opus | audio/opus | Audio | | ||||||
| | wav | audio/wav | Audio | | ||||||
|
|
||||||
|
|
||||||
| ## Supported video formats | ||||||
|
|
@@ -59,13 +61,13 @@ | |||||
| |---------------|----------------------|-------------| | ||||||
| | 3g2 | video/3gpp2 | Video | | ||||||
| | 3gp | video/3gpp | Video | | ||||||
| | avi | video/x-msvideo | Video | | ||||||
| | flv | video/x-flv | Video | | ||||||
| | m4v | video/x-m4v | Video | | ||||||
| | matroska | video/x-matroska | Audio/Video | | ||||||
| | mov | video/quicktime | Video | | ||||||
| | mp4 | video/mp4 | Audio/Video | | ||||||
| | wmv | video/x-ms-wmv | Video | | ||||||
|
|
||||||
|
|
||||||
| ## Supported online video services | ||||||
|
|
@@ -77,10 +79,10 @@ | |||||
| | Instagram | Video | Released | | ||||||
| | Facebook | Video | Released | | ||||||
| | Vimeo | Video | Released | | ||||||
| | Dailymotion | Video | Released | | ||||||
| | LinkedIn | Video | Released | | ||||||
| | Sharechat | Video | Released | | ||||||
| | Likee | Video | Released | | ||||||
| | TikTok (Beta) | Video | Beta | | ||||||
| | Twitter (Beta) | Video | Beta | | ||||||
|
|
||||||
|
|
@@ -91,21 +93,21 @@ | |||||
| |---------------|------------------|-------------|------------------------------|------------------------------------| | ||||||
| | 3g2 | video/3gpp2 | Video | ~300 MB | ~30 seconds | | ||||||
| | 3gp | video/3gpp | Video | ~300 MB | ~40 seconds | | ||||||
| | aac | audio/aac | Audio | ~60 MB | ~36 seconds | | ||||||
| | ac3 | audio/ac3 | Audio | ~215 MB | ~42 seconds | | ||||||
| | avi | video/x-msvideo | Video | ~800 MB | ~1 minute | | ||||||
| | eac3 | audio/eac3 | Audio | ~215 MB | ~32 seconds | | ||||||
| | flac | audio/flac | Audio | ~260 MB | ~46 seconds | | ||||||
| | flv | video/x-flv | Video | ~400 MB | ~40 seconds | | ||||||
| | m4a | audio/m4a | Audio | ~60 MB | ~26 seconds | | ||||||
| | x-m4a | audio/x-m4a | Audio | ~60 MB | ~26 seconds | | ||||||
| | m4v | video/x-m4v | Video | ~800 MB | ~1 minute | | ||||||
| | matroska | video/x-matroska | Audio/Video | ~800 MB | ~1 minute | | ||||||
| | mov | video/quicktime | Video | ~800 MB | ~1 minute | | ||||||
| | mp2 | audio/mpeg | Audio | ~120 MB | ~42 seconds | | ||||||
| | mp3 | audio/mpeg | Audio | ~120 MB | ~37 seconds | | ||||||
| | mp4 | video/mp4 | Audio/Video | ~800 MB | ~1 minute | | ||||||
| | ogg | application/ogg | Audio | ~60 MB | ~1 minute | | ||||||
| | opus | audio/opus | Audio | ~30 MB | ~1 minute | | ||||||
| | wav | audio/wav | Audio | ~510 MB | N/A | | ||||||
| | wmv | video/x-ms-wmv | Video | ~800 MB | ~1 minute | | ||||||
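The pre-recorded limits quoted in this file (135 minutes per request) and the "Splitting oversize audio files" section can be illustrated with a short sketch. This is not part of the diff. The helper names `plan_chunks` and `ffmpeg_split_commands` and the output file pattern are hypothetical, and the choice of mono 16 kHz WAV output is an assumption based on the target format mentioned at the top of the page. Only the standard FFMPEG options `-ss`, `-t`, `-i`, `-ac`, and `-ar` are used.

```python
import math

# Pre-recorded limit quoted in the docs diff: 135 minutes per request.
MAX_CHUNK_SECONDS = 135 * 60


def plan_chunks(duration_s, max_chunk_s=MAX_CHUNK_SECONDS):
    """Return (start, length) pairs in seconds that cover the whole file.

    Chunks are equal-sized so that no chunk exceeds max_chunk_s.
    """
    if duration_s <= 0:
        return []
    count = math.ceil(duration_s / max_chunk_s)
    length = duration_s / count
    return [(i * length, length) for i in range(count)]


def ffmpeg_split_commands(src, duration_s, out_pattern="chunk_{:03d}.wav"):
    """Build one FFMPEG argument list per chunk (hypothetical helper).

    The commands are only built here, not executed; pass each one to
    subprocess.run() to actually split the file.
    """
    commands = []
    for i, (start, length) in enumerate(plan_chunks(duration_s)):
        commands.append([
            "ffmpeg",
            "-ss", f"{start:.3f}",   # seek to the chunk start in the input
            "-t", f"{length:.3f}",   # read only this many seconds
            "-i", src,
            "-ac", "1",              # assumption: mono output
            "-ar", "16000",          # assumption: 16 kHz, as in the target format
            out_pattern.format(i),
        ])
    return commands
```

For example, `plan_chunks(300 * 60)` splits a five-hour recording into three chunks of 100 minutes each, all under the 135-minute limit. The 1000 MB file size limit still has to be checked separately on each output file.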
| Original file line number | Diff line number | Diff line change | ||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| @@ -1,11 +1,15 @@ | ||||||||||||||
| --- | ||||||||||||||
| title: Quickstart | ||||||||||||||
| description: How to transcribe live audio with Gladia's Real-time speech-to-text (STT) API | ||||||||||||||
| --- | ||||||||||||||
|
|
||||||||||||||
| import LiveFlowFull from "/snippets/live-flow-full.mdx"; | ||||||||||||||
| import Samples from '/snippets/samples.mdx'; | ||||||||||||||
|
|
||||||||||||||
| <Note> | ||||||||||||||
| A single realtime transcription session cannot exceed **3 hours**. For longer events, start a new session before reaching the limit. See [Concurrency and rate limits](/chapters/limits-and-specifications/concurrency) and [Supported files & duration](/chapters/limits-and-specifications/supported-formats) for details. | ||||||||||||||
| </Note> | ||||||||||||||
|
Comment on lines +9 to +11 (Contributor):
Good placement and cross-referencing. The note is well-positioned at the top of the quickstart for high visibility, and the links to the limits pages are helpful.
One nit for consistency: line 3 of this very file says "Real-time speech-to-text" but line 10 uses "realtime." Prefer the hyphenated form here as well.
Suggested fix
 <Note>
-A single realtime transcription session cannot exceed **3 hours**. For longer events, start a new session before reaching the limit. See [Concurrency and rate limits](/chapters/limits-and-specifications/concurrency) and [Supported files & duration](/chapters/limits-and-specifications/supported-formats) for details.
+A single real-time transcription session cannot exceed **3 hours**. For longer events, start a new session before reaching the limit. See [Concurrency and rate limits](/chapters/limits-and-specifications/concurrency) and [Supported files & duration](/chapters/limits-and-specifications/supported-formats) for details.
 </Note>
||||||||||||||
|
|
||||||||||||||
| <LiveFlowFull /> | ||||||||||||||
| <Tip> | ||||||||||||||
| Want to know more about a specific feature? Check out our [Features chapter](/chapters/live-stt/features) for more details. | ||||||||||||||
|
|
||||||||||||||
Inconsistent terminology: "realtime" vs. "real-time".
The rest of this document (line 29) and the quickstart page title (line 3 of quickstart.mdx) use the hyphenated form "real-time," but lines 32 and 34 introduce "Realtime" / "realtime." The vale spellcheck also flags this. Consider using "Real-time" consistently to match the existing docs.
Also, a minor wording inconsistency: this file says "when approaching the limit" while the other two files say "before reaching the limit." Aligning the phrasing across all three pages would reduce reader confusion.
🧰 Tools
🪛 GitHub Check: Mintlify Validation (gladia-95) - vale-spellcheck
[warning] 32-32: chapters/limits-and-specifications/concurrency.mdx#L32
Did you really mean 'Realtime'?
[warning] 34-34: chapters/limits-and-specifications/concurrency.mdx#L34
Did you really mean 'realtime'?
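On the substance of the note under review ("start a new session before reaching the limit"), a client-side timer is one way to act on it. The sketch below is only an illustration and makes several assumptions: the class name `SessionTimer`, the five-minute safety margin, and the injected clock are all hypothetical, and nothing here calls the Gladia API. Only the 3-hour figure comes from the PR.

```python
import time

# 3-hour limit stated in the PR; the margin is a hypothetical safety buffer.
MAX_SESSION_SECONDS = 3 * 60 * 60
ROTATE_MARGIN_SECONDS = 5 * 60


class SessionTimer:
    """Tracks how long the current real-time session has been open."""

    def __init__(self, now=time.monotonic,
                 limit=MAX_SESSION_SECONDS,
                 margin=ROTATE_MARGIN_SECONDS):
        self._now = now                      # injectable clock, useful in tests
        self._rotate_after = limit - margin  # rotate this long after start
        self._started_at = now()

    def should_rotate(self):
        """True once the session is within the margin of the limit."""
        return self._now() - self._started_at >= self._rotate_after

    def restart(self):
        """Call right after the replacement session has been opened."""
        self._started_at = self._now()
```

A streaming loop could check `should_rotate()` between audio chunks, open a new session when it returns `True`, and then call `restart()`. How the new session is opened is left out on purpose, since that depends on the client code.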