Releases: Bridgeconn/vachan-api
v2.0.19
vachan-access v2.0.19
A. New APIs
Forgot Password (Recovery by Code): 3 APIs
- Initiate recovery
  POST /v2/auth/user/forgot-password
  Triggers a Kratos recovery flow (method=code) and emails a 6-digit code to the user.
- Verify recovery code
  POST /v2/auth/user/verify-recovery-code
  Validates the code and returns the settings_flow_id (a server-side step needed to proceed with the password change).
- Set new password
  POST /v2/auth/user/reset-password
  Completes the flow by posting the new password to the Kratos settings API.
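The three recovery endpoints are meant to be called in sequence. The following is a minimal client-side sketch of that sequence, not the project's own client. The three paths and the settings_flow_id response key come from the notes above; the request field names (`email`, `code`, `password`), the helper names, and the base URL are assumptions made for illustration and should be checked against the API documentation.

```python
import json
from urllib import request

# Endpoint paths as listed in the release notes.
FORGOT_PASSWORD = "/v2/auth/user/forgot-password"
VERIFY_CODE = "/v2/auth/user/verify-recovery-code"
RESET_PASSWORD = "/v2/auth/user/reset-password"


def http_send(base_url):
    """Return a sender that POSTs JSON to base_url + path and parses the reply.

    base_url is a placeholder supplied by the caller (assumption).
    """
    def send(path, payload):
        req = request.Request(
            base_url + path,
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        with request.urlopen(req) as resp:
            return json.load(resp)
    return send


def start_recovery(send, email):
    # Step 1: ask the server to email a 6-digit code.
    # The "email" field name is hypothetical.
    return send(FORGOT_PASSWORD, {"email": email})


def finish_recovery(send, email, code, new_password):
    # Step 2: verify the code; the response carries settings_flow_id.
    # The "email" and "code" field names are hypothetical.
    verified = send(VERIFY_CODE, {"email": email, "code": code})
    flow_id = verified["settings_flow_id"]
    # Step 3: set the new password using the flow id from step 2.
    # The "password" field name is hypothetical.
    return send(
        RESET_PASSWORD,
        {"settings_flow_id": flow_id, "password": new_password},
    )
```

Passing `send` in as a parameter keeps the sequencing logic separate from the transport, so the flow can be exercised with a stub before pointing `http_send` at a real deployment.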
B. Enhancement
Simplified password rules in kratos.yml:
- Users can choose very short passwords (6+ characters)
- Kratos will not check passwords against the public “Have I Been Pwned” breach database.
- Kratos will allow passwords similar to the username/email
v2.0.18
vachan-ai 25.10.16-beta
Text-to-Speech:
- Added new model Indic Parler-TTS
  - Supported languages: Assamese, Bengali, Bodo, Dogri, English, Gujarati, Hindi, Kannada, Konkani, Maithili, Malayalam, Manipuri, Marathi, Nepali, Odia, Sanskrit, Santali, Sindhi, Tamil, Telugu, and Urdu.
  - The 'language' field is not required for the indic-parler-tts model.
  - Speaker voice: Generated audio uses the model's default speaker voices. These can be changed through the description. For more details, refer to link.
  - New input field 'description': A detailed description of how the speech should sound, e.g., "Leela speaks in a high-pitched, fast-paced, and cheerful tone, full of energy and happiness. The recording is very high quality with no background noise." This field is relevant only for the indic-parler-tts model and can be left empty.
- Added finetuned Chatterbox model
  - Supported languages: English, Hindi, Telugu, and Malayalam
  - Voice customization: Audio can be generated in a desired voice by passing a reference audio
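The field rules for indic-parler-tts described above can be illustrated with a small payload builder. This is a sketch under stated assumptions: the field names 'language', 'description', and 'model_name' appear in these release notes, while the `text` field name and the helper itself are hypothetical, and the assumption that other models require 'language' is inferred from the note that it is not required for indic-parler-tts.

```python
def build_tts_payload(text, model_name, language=None, description=""):
    """Assemble a TTS request body following the rules in the notes above."""
    # "text" is a hypothetical field name for the input text.
    payload = {"text": text, "model_name": model_name}
    if model_name == "indic-parler-tts":
        # 'description' is only relevant for this model and may be empty.
        payload["description"] = description
        # 'language' is optional for this model.
        if language is not None:
            payload["language"] = language
    else:
        # Assumption: other models still need 'language'.
        if language is None:
            raise ValueError("language is required for " + model_name)
        payload["language"] = language
    return payload
```

For example, `build_tts_payload("Hello", "indic-parler-tts", description="Leela speaks in a cheerful tone.")` produces a body with no 'language' key at all.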
Text translation:
- The output of document translation can be downloaded in txt, docx, or html format
- Finetuned NLLB models can now be used via APIs
  - nllb-english-zeme (one-direction)
  - nllb-english-nagamese (bi-directional)
  - nllb-gujrathi-koli_kachchi (one-direction)
  - nllb-hindi-surjapuri (bi-directional)
  - nllb-gujrathi-kukna (bi-directional)
  - nllb-gujrathi-kutchi (bi-directional)
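A client may want to check which translation directions a finetuned model supports before submitting a job. The lookup below is a hypothetical client-side sketch: the model names and their one-direction/bi-directional flags are copied from the list above, while the helper and the assumption that a one-direction model translates from the first language in its name to the second are not confirmed by these notes.

```python
# True means bi-directional, False means one-direction (from the list above).
NLLB_MODELS = {
    "nllb-english-zeme": False,
    "nllb-english-nagamese": True,
    "nllb-gujrathi-koli_kachchi": False,
    "nllb-hindi-surjapuri": True,
    "nllb-gujrathi-kukna": True,
    "nllb-gujrathi-kutchi": True,
}


def supported_directions(model_name):
    """Return the (source, target) pairs a model is assumed to support."""
    bidirectional = NLLB_MODELS[model_name]  # KeyError for unknown models
    # Names follow the pattern nllb-<first>-<second>.
    _, first, second = model_name.split("-", 2)
    # Assumption: one-direction models translate first -> second.
    directions = [(first, second)]
    if bidirectional:
        directions.append((second, first))
    return directions
```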
Languages:
- Added a POST language API
- Added a new column ISO_639_1_code to the language table via an Alembic migration
Enhancements:
- Model dependencies are now loaded from an external configuration file, allowing deployment servers to modify them without changing the codebase.
- Upgraded MLflow
- Enhanced GPU prediction process
- Integrated Pyright type checks across the codebase and in the CI pipeline
Docs:
- Updated architecture diagram and documentation
Next release plan:
- Speech-to-Text: Optionally generate SRT/VTT files of the uploaded audio
- Speech-to-Speech: Cascaded pipeline using STT, MT and TTS APIs
- Text-to-Speech: Chatterbox model with more language support
- Python version update
v2.0.17
Vachan-AI
New features
- TTS: Integrated Orpheus wrapper
- STT: Punctuated transcripts using an LLM API
- Voice clone: Integrated Chatterbox wrapper
Enhancement
- Added a serving mode for VACHAN_AI_ENV to suppress warning logs during app startup
- Renamed the predict function parameter to model_input to address the MLflow warning that appears at app startup
- The Orpheus model works with or without reference audio
- Package updates: SQLAlchemy, pydantic, uvicorn, pitz, librosa, python-multipart, fastapi, pytest, pytest-mock, pylint, redis, rq, numpy, pydantic-settings, skypilot, llama-index, docx2txt, PyMuPDF, orjson
- Router modularization: split ai_apis into separate files (assets_apis.py, cloud_operations_apis.py, inference_apis.py, jobs_apis.py, languages_apis.py, model_apis.py)
- Added type hints to the predict function in all wrappers to remove warnings at app startup
- MT cloud fine-tuning: Included a training script and config file
- Updated pylint version to v3.3.7
- Changed the input parameter name in voice clone API to input_transcripts
Docs
- Converted the input params JSON to a valid format in the models CSV
- Modified architecture documentation
- Updated app folder structure
- Container diagram, component diagram: included Ray, modified training pipeline
- Moved auth csvs into a new folder within data
- Updated model csv and model documentation (Added chatterbox details)
v2.0.16
vachan-ai 25.4.22-beta
New features:
- API for audio enhancement using resemble-enhance
- Model logging feature for resemble-enhance
- New API for TTS
  - Accepts reference audio
  - Accepts text as a document (txt file), splits the text into sentences, and generates audio for each sentence
- Text-to-speech with ToucanTTS (pre-trained and fine-tuned models)
  - Audio can be generated based on a reference audio
  - Model logging feature for ToucanTTS
- Priority queue: Jobs can be handled based on assigned priorities
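The priority-based job handling mentioned above can be pictured with a small in-memory queue. This sketch uses the standard-library heapq module as a stand-in and is not the project's actual job backend; the class name and the convention that a lower number means higher priority are assumptions for illustration.

```python
import heapq
import itertools


class PriorityJobQueue:
    """In-memory illustration of handling jobs by assigned priority."""

    def __init__(self):
        self._heap = []
        # A running counter keeps jobs of equal priority in submission order.
        self._counter = itertools.count()

    def submit(self, job_id, priority):
        # Assumption: a lower number means the job is handled sooner.
        heapq.heappush(self._heap, (priority, next(self._counter), job_id))

    def next_job(self):
        # Return the most urgent job id, or None when the queue is empty.
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]
```

The counter acts as a tie-breaker so that two jobs with the same priority are processed first-in, first-out rather than being compared by job id.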
Enhancements:
- Audio can be generated in ogg format
- Enhanced audio generation
  - Enhancements using resemble-enhance
  - Sampling rate: 48 kHz
  - Bit depth: PCM_24
- Updated the languages CSV to include more languages (as the Toucan model supports 7000+ languages)
- The 'model_name' field of all APIs is now a drop-down listing the available models
- Detailed logs added
Code refactoring:
- Split the inference core (which held the core functions of different services) into small modules for the following features
  - Noise removal
  - Forced alignment
  - Audio segmentation
  - Speech-to-text
- Moved model specific env variables into a data class
- Included config file for the project
- Moved frequently changing env variables into the config file
- App can be configured with a config file that sits independent of the code base
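The combination of a data class for settings and a config file kept outside the code base could look roughly like the sketch below. The class name, field names, and JSON file format are hypothetical; the default values (ogg, 48 kHz, PCM_24) are taken from the audio settings listed in this release, not from the project's actual config.

```python
import json
from dataclasses import dataclass, fields
from pathlib import Path


@dataclass(frozen=True)
class AudioConfig:
    # Hypothetical field names; defaults mirror values mentioned in the notes.
    output_format: str = "ogg"
    sampling_rate: int = 48000
    subtype: str = "PCM_24"


def load_config(path):
    """Load settings from a JSON file, falling back to defaults."""
    config_path = Path(path)
    raw = {}
    if config_path.is_file():
        raw = json.loads(config_path.read_text(encoding="utf-8"))
    # Ignore keys the data class does not define.
    known = {f.name for f in fields(AudioConfig)}
    return AudioConfig(**{k: v for k, v in raw.items() if k in known})
```

Because the file is read at startup and missing keys fall back to defaults, a deployment can change one value without touching the code or restating the rest.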
v2.0.15
v2.0.14
vachan-ai
New features
- Device option added: APIs can be run on GPU or CPU @Jayasankar-kk
- API for retrieving details of served models @Jayasankar-kk
- Audio splitting utils @anjalyv
- Model logging support for LLMs @Jayasankar-kk
- API for forced alignment @Jayasankar-kk
Enhancements
- Output files of TTS, S2S, voice-clone and noise removal will have the same name as the input file @NoelSudhish
- Improved segmentation accuracy by post-processing the segments info @anjalyv
- Package update @shaziya
- Other fixes @NoelSudhish, @shaziya
Documentation
v2.0.13
v2.0.12
vachan-ai
- Model logging: Model wrapper for finetuned mms-tts models
- Jobs table: Included new columns User_id, Job_creation_time and Job_updation_time
- API for retrieving job history based on user_id
- Model serving using Ray Serve
- MT: Text splitting utils using nltk
- MT: Document translation
- Audio segmentation API
- Included model metadata along with the model while logging
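Document translation typically relies on the text-splitting step listed above: the document is broken into sentences, which are then grouped into chunks small enough for the MT model. The sketch below illustrates that idea with a simple regular expression in place of nltk, so it runs without extra downloads. It is not the project's splitting utility, and the function names and the `max_chars` limit are assumptions.

```python
import re

# Split after ., !, ? or the Devanagari danda when followed by whitespace.
_SENTENCE_END = re.compile(r"(?<=[.!?।])\s+")


def split_sentences(text):
    """Naive sentence splitter (a regex stand-in for nltk)."""
    return [part for part in _SENTENCE_END.split(text.strip()) if part]


def chunk_sentences(sentences, max_chars=200):
    """Group whole sentences into chunks of at most max_chars where possible."""
    chunks, current = [], ""
    for sentence in sentences:
        # Start a new chunk if adding this sentence would exceed the limit.
        if current and len(current) + 1 + len(sentence) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = (current + " " + sentence).strip()
    if current:
        chunks.append(current)
    return chunks
```

A single sentence longer than `max_chars` is kept whole in its own chunk rather than cut mid-sentence; a production splitter would need a policy for that case, and for abbreviations that the regex treats as sentence ends.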
v2.0.11
What's Changed
- Updated the make file by @Joel-C-Johnson in #864
- Modified AWS ENV variables by @NoelSudhish in #872
- New release of vachan-ai [0.0.0-alpha.15] by @NoelSudhish in #874
- API for background noise removal
- Enhancement in logging models from s3 bucket
- Model logging script for DeepFilterNet model (noise removal)
- Fixed gitlab pipeline issue with Kratos by using docker profile
Full Changelog: v2.0.10...v2.0.11
v2.0.10
What's Changed
- Adding more details in deployment steps by @AthulyaMS in #862
- commented out the vachan-api-test container and Run test in workflow by @KetanKBaboo in #863
- auth based on flag for vachan-ai by @NoelSudhish in #866
New Contributors
- @KetanKBaboo made their first contribution in #863
Full Changelog: v2.0.9...v2.0.10