Releases: Bridgeconn/vachan-api

v2.0.19

21 Nov 04:21

vachan-access v2.0.19

A. New APIs

Forgot Password (Recovery by Code) — 3 APIs

  1. Initiate recovery
    POST /v2/auth/user/forgot-password
    Triggers a Kratos recovery flow (method=code) and emails a 6-digit code to the user.

  2. Verify recovery code
    POST /v2/auth/user/verify-recovery-code
    Validates the code and returns the settings_flow_id, which the server uses to proceed with the password change.

  3. Set new password
    POST /v2/auth/user/reset-password
    Completes the flow by posting the new password to the Kratos settings API.
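
The three calls above can be sketched as a small client. This is only an illustration: the endpoint paths come from the notes above, while the host name and every JSON field name (`email`, `code`, `settings_flow_id`, `new_password`) are assumptions and may not match the actual request schema. The helpers only build the requests; nothing is sent.

```python
import json
from urllib.request import Request

BASE_URL = "https://vachan.example.org"  # hypothetical host, not from the release notes


def _post(path: str, payload: dict) -> Request:
    """Build (but do not send) a JSON POST request for the given path."""
    return Request(
        url=BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def initiate_recovery(email: str) -> Request:
    # Step 1: ask the server to start the recovery flow and email the code.
    # The "email" field name is an assumption.
    return _post("/v2/auth/user/forgot-password", {"email": email})


def verify_recovery_code(email: str, code: str) -> Request:
    # Step 2: submit the emailed code; the response should carry settings_flow_id.
    # The "email" and "code" field names are assumptions.
    return _post("/v2/auth/user/verify-recovery-code", {"email": email, "code": code})


def reset_password(settings_flow_id: str, new_password: str) -> Request:
    # Step 3: complete the flow with the new password.
    # Both field names are assumptions.
    return _post(
        "/v2/auth/user/reset-password",
        {"settings_flow_id": settings_flow_id, "new_password": new_password},
    )
```

Each returned `Request` could be passed to `urllib.request.urlopen` to actually perform the call, once the real field names are confirmed against the API docs.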

B. Enhancement

Simplified password rules in kratos.yml:

  1. Users can choose short passwords (minimum 6 characters).
  2. Kratos will not check passwords against the public “Have I Been Pwned” breach database.
  3. Kratos will allow passwords similar to the username/email.
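
To make the effect of these three rules concrete, here is a small model of the relaxed policy. It is a sketch, not Kratos code: the field names are chosen to resemble the Ory Kratos password options and should be checked against the Kratos configuration reference, and the similarity test is a deliberately simplified substring check rather than whatever Kratos implements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PasswordPolicy:
    # Defaults reflect the relaxed rules described above.
    min_password_length: int = 6
    haveibeenpwned_enabled: bool = False  # breach lookup is off; not modelled here
    identifier_similarity_check_enabled: bool = False


def violations(password: str, identifier: str,
               policy: PasswordPolicy = PasswordPolicy()) -> list[str]:
    """Return the list of rules the password breaks under the given policy."""
    problems = []
    if len(password) < policy.min_password_length:
        problems.append(f"shorter than {policy.min_password_length} characters")
    if policy.identifier_similarity_check_enabled:
        # Simplified stand-in for a real similarity check.
        p, i = password.lower(), identifier.lower()
        if p in i or i in p:
            problems.append("too similar to identifier")
    return problems
```

Under the default (relaxed) policy, `violations("abc123", "user@example.com")` returns an empty list, while a three-character password is still rejected for length.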

v2.0.18

31 Oct 10:51

vachan-ai 25.10.16-beta

Text-to-Speech:

  • Added new model Indic Parler-TTS
    • Supported languages: Assamese, Bengali, Bodo, Dogri, English, Gujarati, Hindi, Kannada, Konkani, Maithili, Malayalam, Manipuri, Marathi, Nepali, Odia, Sanskrit, Santali, Sindhi, Tamil, Telugu, and Urdu.
    • The 'language' field is not required for the indic-parler-tts model.
    • Speaker voice: Generated audio uses the model's default speaker voices, which can be changed through the description. For more details, refer to the link.
    • New input field 'description': A detailed description of how the speech should sound, e.g., "Leela speaks in a high-pitched, fast-paced, and cheerful tone, full of energy and happiness. The recording is very high quality with no background noise." This field is relevant only for the indic-parler-tts model and can be left empty.
  • Added fine-tuned Chatterbox model
    • Supported languages: English, Hindi, Telugu, and Malayalam
    • Voice customization: Audio can be generated in a desired voice by passing a reference audio
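
As a rough illustration of how the 'language' and 'description' fields interact, the helper below assembles a request body. The field names `language`, `description`, and `model_name` appear in these notes; the `text` key, the rule that other models require a language, and the placeholder model name used in the usage note are assumptions.

```python
from typing import Optional

INDIC_PARLER = "indic-parler-tts"


def build_tts_payload(text: str, model_name: str,
                      language: Optional[str] = None,
                      description: str = "") -> dict:
    """Assemble a hypothetical TTS request body."""
    payload = {"model_name": model_name, "text": text}  # "text" key is an assumption
    if model_name == INDIC_PARLER:
        # 'description' is only relevant for this model and may be empty.
        payload["description"] = description
    elif language is None:
        # Assumption: models other than indic-parler-tts need a language.
        raise ValueError(f"'language' is required for model {model_name!r}")
    if language is not None:
        payload["language"] = language
    return payload
```

For example, `build_tts_payload("Hello", "indic-parler-tts", description="A calm, slow voice.")` yields a body without a `language` key, while calling it with a placeholder such as `"some-other-model"` and no language raises `ValueError`.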

Text translation:

  • The output of document translation can be downloaded in txt, docx, or html format
  • Fine-tuned NLLB models can now be used via the APIs
    • nllb-english-zeme (one-direction)
    • nllb-english-nagamese (bi-directional)
    • nllb-gujrathi-koli_kachchi (one-direction)
    • nllb-hindi-surjapuri (bi-directional)
    • nllb-gujrathi-kukna (bi-directional)
    • nllb-gujrathi-kutchi (bi-directional)

Languages:

  • Added a POST Language API
  • Added a new column, ISO_639_1_code, to the language table using an Alembic migration

Enhancements:

  • Model dependencies are now loaded from an external configuration file, allowing deployment servers to modify them without changing the codebase.
  • Upgraded MLflow
  • Enhanced GPU prediction process
  • Integrated Pyright type checks across the codebase and in the CI pipeline
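
The idea of keeping model dependencies in a file outside the codebase can be sketched as follows. Everything concrete here is hypothetical: the JSON format, the `MODEL_DEPS_CONFIG` environment variable, the default file name, and the mapping of model names to package lists are illustrative choices, not the project's actual layout.

```python
import json
import os
from pathlib import Path
from typing import Optional

DEFAULT_CONFIG_PATH = "model_dependencies.json"  # hypothetical file name


def load_model_dependencies(path: Optional[str] = None) -> dict[str, list[str]]:
    """Read a {model_name: [package, ...]} mapping from an external JSON file.

    The path comes from the argument, then from the (hypothetical)
    MODEL_DEPS_CONFIG environment variable, then from the default.
    """
    config_path = Path(path or os.environ.get("MODEL_DEPS_CONFIG", DEFAULT_CONFIG_PATH))
    with config_path.open(encoding="utf-8") as fh:
        data = json.load(fh)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object mapping model names to package lists")
    return {name: list(deps) for name, deps in data.items()}
```

Because the file is read at runtime, a deployment server can edit it and restart the app without touching the code.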

Docs:

  • Updated architecture diagram and documentation

Next release plan:

  • Speech-to-Text: Optionally generate SRT/VTT files of the uploaded audio
  • Speech-to-Speech: Cascaded pipeline using STT, MT and TTS APIs
  • Text-to-Speech: Chatterbox model with more language support
  • Python version update

v2.0.17

19 Aug 03:50

Vachan-AI

New features

  • TTS: Integrated Orpheus wrapper
  • STT: Punctuated transcripts using an LLM API
  • Voice clone: Integrated chatterbox wrapper

Enhancement

  • Added a serving mode for VACHAN_AI_ENV to suppress warning logs during app startup
  • Renamed the predict function parameter to model_input, following the MLflow warning that appears during app startup
  • Orpheus model works with or without reference audio
  • Package updates: SQLAlchemy, pydantic, uvicorn, pitz, librosa, python-multipart, fastapi, pytest, pytest-mock, pylint, redis, rq, numpy, pydantic-settings, skypilot, llama-index, docx2txt, PyMuPDF, orjson
  • Router modularization: split ai_apis into separate files (assets_apis.py, cloud_operations_apis.py, inference_apis.py, jobs_apis.py, languages_apis.py, model_apis.py)
  • Added type hints for the predict function in all wrappers to remove warnings at app startup
  • MT cloud fine-tuning: Included a training script and config file for MT cloud fine-tuning
  • Updated pylint version to v3.3.7
  • Changed the input parameter name in voice clone API to input_transcripts

Docs

  • Converted the input params JSON to a valid format in the models CSV
  • Modified architecture documentation
  • Updated app folder structure
  • Container and component diagrams: included Ray and modified the training pipeline
  • Moved auth csvs into a new folder within data
  • Updated model csv and model documentation (Added chatterbox details)

v2.0.16

06 May 06:48

vachan-ai 25.4.22-beta

New features:

  • API for audio enhancement using resemble-enhance
  • Model logging feature for resemble-enhance
  • New API for TTS
    • Accepts Reference audio
    • Accepts text as a document (txt file), splits the text into sentences, and generates audio for each sentence
  • Text-to-speech with ToucanTTS (pre-trained and fine-tuned models)
    • Audios can be generated based on a reference audio
  • Model logging feature for ToucanTTS
  • Priority queue: Jobs can be handled based on assigned priorities
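
Priority-based job handling can be illustrated with a minimal in-memory queue. This is a generic sketch built on Python's `heapq`, not the project's actual queueing code; the class name, the method names, and the convention that a lower number means higher priority are all assumptions.

```python
import heapq
import itertools


class JobQueue:
    """Minimal in-memory priority queue for job IDs."""

    def __init__(self):
        self._heap = []
        # The counter breaks ties so equal-priority jobs keep submission order.
        self._counter = itertools.count()

    def submit(self, job_id: str, priority: int = 5) -> None:
        # Assumption: a lower number means a higher priority.
        heapq.heappush(self._heap, (priority, next(self._counter), job_id))

    def next_job(self):
        """Pop and return the most urgent job ID, or None if the queue is empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]
```

Submitting jobs `a` (priority 5), `b` (priority 1), and `c` (priority 5) and then draining the queue returns `b`, `a`, `c`.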

Enhancements:

  • Audios can be generated in ogg format
  • Enhanced audio generation
    • Enhancements using resemble-enhance
    • Sampling rate: 48 kHz
    • Bit depth: PCM_24
  • Updated the languages CSV to include more languages (as the Toucan model supports 7000+ languages)
  • The 'model_name' field of all APIs is now a drop-down listing the available models
  • Detailed logs added
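
For readers unfamiliar with the output format mentioned above, the snippet below writes a short test tone as a 48 kHz, 24-bit PCM WAV file using only the standard library. It illustrates the format, not the project's audio pipeline; the function name and the tone parameters are made up for the example.

```python
import math
import wave

SAMPLE_RATE = 48_000      # 48 kHz
SAMPLE_WIDTH_BYTES = 3    # 24-bit PCM


def write_tone(path: str, freq_hz: float = 440.0, seconds: float = 0.1) -> int:
    """Write a mono sine tone as 48 kHz / 24-bit PCM WAV; return the frame count."""
    n_frames = int(SAMPLE_RATE * seconds)
    max_amp = 2 ** 23 - 1  # largest positive value of a signed 24-bit sample
    frames = bytearray()
    for i in range(n_frames):
        sample = int(0.5 * max_amp * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE))
        # WAV stores PCM samples as little-endian signed integers.
        frames += sample.to_bytes(SAMPLE_WIDTH_BYTES, "little", signed=True)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(SAMPLE_WIDTH_BYTES)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(bytes(frames))
    return n_frames
```

Reading the file back with `wave.open(path, "rb")` should report a frame rate of 48000 and a sample width of 3 bytes.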

Code refactoring:

  • Split the inference core (which held the core functions of different services) into small modules for the following features:
    • Noise removal
    • Forced alignment
    • Audio segmentation
    • Speech-to-text
  • Moved model-specific env variables into a data class
  • Included config file for the project
    • Moved frequently changing env variables into the config file
    • App can be configured with a config file that sits independent of the code base

v2.0.15

20 Jan 08:22

vachan-ai

  • Parallel processing for handling multiple jobs
  • Audios can be generated in mp3 format
  • Model logging: model_uri is now optional
  • Minor fixes
  • Calendar versioning for the app
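
Calendar versioning derives the version string from a date. The helper below is a guess at the scheme based only on tags that appear on this page, such as 25.4.22-beta and 25.10.16-beta, which look like YY.M.D with a suffix; the function name and the default suffix are assumptions.

```python
from datetime import date


def calver_tag(d: date, suffix: str = "beta") -> str:
    """Format a date as YY.M.D with an optional suffix (assumed scheme)."""
    base = f"{d.year % 100}.{d.month}.{d.day}"
    return f"{base}-{suffix}" if suffix else base
```

With this scheme, `calver_tag(date(2025, 4, 22))` gives `"25.4.22-beta"`.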

v2.0.14

06 Jan 12:23

vachan-ai

Enhancements

  • Output files of TTS, S2S, voice-clone, and noise removal now have the same name as the input file @NoelSudhish
  • Improved segmentation accuracy by post-processing the segment info @anjalyv
  • Package update @shaziya
  • Other fixes @NoelSudhish, @shaziya

v2.0.13

27 Nov 04:15

vachan-ai

Bug fixes:

  • Resolved issue with mp3 audio files @anjalyv
  • Errors caused by issues on the model-serving side are now raised @Jayasankar-kk
  • Updated model info record @shaziya
  • MT: Enhanced pdf, txt file processing @anjalyv
  • Minor fixes

v2.0.12

05 Nov 12:15

vachan-ai

  • Model logging: Model wrapper for finetuned mms-tts models
  • Jobs table: Included new columns User_id, Job_creation_time and Job_updation_time
  • API for retrieving job history based on user_id
  • Model serving using Ray Serve
  • MT: Text splitting utils using nltk
  • MT: Document translation
  • Audio segmentation API
  • Included model metadata along with the model while logging

v2.0.11

13 Sep 05:24

What's Changed

  • Updated the make file by @Joel-C-Johnson in #864
  • Modified AWS ENV variables by @NoelSudhish in #872
  • New release of vachan-ai [0.0.0-alpha.15] by @NoelSudhish in #874
    - API for background noise removal
    - Enhancement in logging models from s3 bucket
    - Model logging script for DeepFilterNet model (noise removal)
    - Fixed gitlab pipeline issue with Kratos by using docker profile

Full Changelog: v2.0.10...v2.0.11

v2.0.10

30 Aug 10:41

Full Changelog: v2.0.9...v2.0.10