From bd8135351fc1564c89062816d3c71d43d9464745 Mon Sep 17 00:00:00 2001
From: karamouche
Date: Wed, 28 Jan 2026 15:04:32 -0500
Subject: [PATCH] Improved custom vocabulary description to explain phoneme-based method

---
 chapters/live-stt/features/custom-vocabulary.mdx         | 4 +++-
 chapters/pre-recorded-stt/features/custom-vocabulary.mdx | 4 +++-
 snippets/custom-vocabulary-description.mdx               | 9 +++++++++
 3 files changed, 15 insertions(+), 2 deletions(-)
 create mode 100644 snippets/custom-vocabulary-description.mdx

diff --git a/chapters/live-stt/features/custom-vocabulary.mdx b/chapters/live-stt/features/custom-vocabulary.mdx
index c56daa6..e664000 100644
--- a/chapters/live-stt/features/custom-vocabulary.mdx
+++ b/chapters/live-stt/features/custom-vocabulary.mdx
@@ -4,9 +4,11 @@ description: "Boost recognition of domain-specific words and phrases in real tim
 ---
 
 import CustomVocabularyParams from '/snippets/custom-vocabulary-params.mdx'
+import CustomVocabularyDescription from '/snippets/custom-vocabulary-description.mdx'
 
-To enhance the precision of words you know will recur often in your transcription, use the `custom_vocabulary` feature.
+<CustomVocabularyDescription />
 
+## Example configuration
 ```json
 {
   "realtime_processing": {
diff --git a/chapters/pre-recorded-stt/features/custom-vocabulary.mdx b/chapters/pre-recorded-stt/features/custom-vocabulary.mdx
index ecfd8f6..4becafa 100644
--- a/chapters/pre-recorded-stt/features/custom-vocabulary.mdx
+++ b/chapters/pre-recorded-stt/features/custom-vocabulary.mdx
@@ -4,9 +4,11 @@ description: "Improve recognition of expected vocabulary in your files"
 ---
 
 import CustomVocabularyParams from '/snippets/custom-vocabulary-params.mdx'
+import CustomVocabularyDescription from '/snippets/custom-vocabulary-description.mdx'
 
-To enhance the precision of transcription, especially for recurring words or phrases, use `custom_vocabulary`.
+<CustomVocabularyDescription />
 
+## Example configuration
 ```json request data
 {
   "audio_url": "YOUR_AUDIO_URL",
diff --git a/snippets/custom-vocabulary-description.mdx b/snippets/custom-vocabulary-description.mdx
new file mode 100644
index 0000000..bafcc3a
--- /dev/null
+++ b/snippets/custom-vocabulary-description.mdx
@@ -0,0 +1,9 @@
+The custom vocabulary feature allows you to process your transcription results by replacing specific words with terms that better fit your domain. This is especially useful for company names, product names, technical terms, or uncommon words that are often mis-transcribed by speech to text models.
+
+### How it works
+
+Custom vocabulary operates at a **word level** and is based on **phoneme similarity**.
+
+Once the transcription is generated, Gladia compares the phonemes of the transcribed words with the phonemes of the words you provided in your custom vocabulary. If the similarity score is above a defined similarity, the word in the transcription is replaced.
+
+Alongside the word value, the pronunciations list allows you to define alternative ways a word can be pronounced. This helps cover a wider phoneme range without having to increase the similarity threshold, which could otherwise lead to false positives. It is especially useful for words with multiple common pronunciations, foreign words, or variations caused by accents.
\ No newline at end of file
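
Note on the "How it works" text added by this patch: the word-level matching it describes can be illustrated with a small Python sketch. This is not Gladia's implementation. The function names, the `threshold` default, and the dictionary keys `value` and `pronunciations` are assumptions chosen to mirror the wording of the snippet, and `rough_phonemes` is a crude spelling-based stand-in for a real grapheme-to-phoneme step.

```python
from difflib import SequenceMatcher

# Hypothetical stand-in for a real phonemizer: it only lowercases the word
# and collapses a few spellings that often sound alike in English.
_SOUND_ALIKE = (("ph", "f"), ("ck", "k"), ("c", "k"), ("z", "s"))


def rough_phonemes(word: str) -> str:
    """Return a crude 'sounds-like' key for a word (illustrative only)."""
    key = "".join(ch for ch in word.lower() if ch.isalpha())
    for spelling, sound in _SOUND_ALIKE:
        key = key.replace(spelling, sound)
    return key


def similarity(a: str, b: str) -> float:
    """Score two words between 0.0 and 1.0 by comparing their rough keys."""
    return SequenceMatcher(None, rough_phonemes(a), rough_phonemes(b)).ratio()


def apply_custom_vocabulary(transcript, vocabulary, threshold=0.8):
    """Replace each transcript word whose best match scores above threshold.

    Each vocabulary entry is a dict with a "value" and an optional
    "pronunciations" list (assumed shape). Punctuation is not handled.
    """
    out = []
    for word in transcript.split():
        best_value, best_score = word, threshold
        for entry in vocabulary:
            # The word value and every listed pronunciation are candidates.
            candidates = [entry["value"], *entry.get("pronunciations", [])]
            score = max(similarity(word, c) for c in candidates)
            if score > best_score:
                best_value, best_score = entry["value"], score
        out.append(best_value)
    return " ".join(out)


if __name__ == "__main__":
    vocab = [{"value": "Gladia", "pronunciations": ["gladya", "glodia"]}]
    # prints: welcome to Gladia speech api
    print(apply_custom_vocabulary("welcome to gladya speech api", vocab))
```

In this sketch, "gladya" scores about 0.83 against "Gladia" alone, so with a stricter `threshold=0.9` it is replaced only when "gladya" is listed under `pronunciations`. That is the trade-off the snippet describes: extra pronunciations widen coverage without loosening the threshold for every other word.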