DACVAE

A VAE version of the Descript Audio Codec (DAC) with a continuous latent space. DAC is a high-fidelity, general-purpose neural audio codec introduced in the paper "High-Fidelity Audio Compression with Improved RVQGAN". Most of the code is adopted from the open-source DAC repository.

Installation

$ pip install git+https://github.com/facebookresearch/dacvae

Usage

from dacvae import DACVAE
import torchaudio


model = DACVAE.load("facebook/dacvae-watermarked")
wav, sample_rate = torchaudio.load("<path to audio file>")
# Resample to expected sample rate
resampled = torchaudio.functional.resample(wav, sample_rate, model.sample_rate)
# Convert stereo to mono (if applicable)
resampled = resampled.mean(dim=0, keepdim=True)
# Expected shape is batch x 1 x samples
model_input = resampled.unsqueeze(0)
# Encode the waveform into the continuous latent representation
encoded = model.encode(model_input)
# Decode back to audio; `decoded` shape is `batch x 1 x samples`
decoded = model.decode(encoded)
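
The shape handling in the snippet above (downmix to mono, then add a batch dimension) can be wrapped in a small helper. This is a minimal sketch that uses only PyTorch; `to_model_input` is a hypothetical name and is not part of the dacvae package. It assumes the input is either a `(samples,)` or a `(channels, samples)` tensor that has already been resampled.

```python
import torch


def to_model_input(wav: torch.Tensor) -> torch.Tensor:
    """Convert a (samples,) or (channels, samples) waveform to (1, 1, samples).

    Hypothetical helper for illustration; not provided by dacvae.
    """
    if wav.ndim == 1:
        # Treat a flat waveform as a single channel
        wav = wav.unsqueeze(0)
    if wav.ndim != 2:
        raise ValueError(f"expected 1 or 2 dimensions, got {wav.ndim}")
    # Average the channels to get mono, keeping the channel dimension
    mono = wav.mean(dim=0, keepdim=True)
    # Add the leading batch dimension
    return mono.unsqueeze(0)


# Two channels of random noise standing in for a loaded stereo file
stereo = torch.randn(2, 16000)
model_input = to_model_input(stereo)
print(tuple(model_input.shape))
```

Averaging a single-channel input leaves it unchanged, so the same helper works for mono and stereo files.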

Watermarking

The DAC-VAE decoder is integrated with AudioSeal so that all generated audio carries a watermark that can be verified independently. We developed a new watermarking model, with an architecture adapted specifically to DAC-VAE, to preserve high-fidelity output. We also plan to release the detector API. Stay tuned!

Citation

If you find this repository useful, please cite the following paper for DAC-VAE:

@article{dacvae,
  title={Movie gen: A cast of media foundation models},
  author={Polyak, Adam and Zohar, Amit and Brown, Andrew and Tjandra, Andros and Sinha, Animesh and Lee, Ann and Vyas, Apoorv and Shi, Bowen and Ma, Chih-Yao and Chuang, Ching-Yao and others},
  journal={arXiv preprint arXiv:2410.13720},
  year={2024}
}

and the following paper for watermarking:

@article{audioseal,
 title={Proactive Detection of Voice Cloning with Localized Watermarking},
 author={San Roman, Robin and Fernandez, Pierre and Elsahar, Hady and D{\'e}fossez, Alexandre and Furon, Teddy and Tran, Tuan},
 journal={ICML},
 year={2024}
}

Contributing

See CONTRIBUTING.md and CODE_OF_CONDUCT.md for more information.

License

This project is licensed under Apache-2.0 - see the LICENSE file for details.
