Audio

Jukebox: A Generative Model for Music [code] (May 2020)
A hierarchical VQ-VAE compresses raw audio into a discrete space, and an autoregressive Sparse Transformer is then trained on the compressed codes.
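
The "compress audio into a discrete space" step can be illustrated with a minimal sketch of vector quantization, the core operation of a VQ-VAE. This is not Jukebox's code: the function names, the 2-dimensional latents, and the 3-entry codebook are hypothetical toy values chosen for readability.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(latents, codebook):
    """Replace each latent vector by the index of its nearest codebook entry."""
    return [
        min(range(len(codebook)), key=lambda k: squared_distance(v, codebook[k]))
        for v in latents
    ]


def dequantize(codes, codebook):
    """Look the codebook vectors back up from their integer codes."""
    return [codebook[k] for k in codes]


# Toy values (assumptions for illustration only).
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 0.5]]
latents = [[0.1, -0.2], [0.9, 1.2], [-0.8, 0.4]]

codes = quantize(latents, codebook)          # discrete token sequence
reconstruction = dequantize(codes, codebook)
```

The integer sequence `codes` is the kind of discrete representation an autoregressive model can be trained on; in a real VQ-VAE the encoder, decoder, and codebook are all learned.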

Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis (Jan 2019)
Three models trained separately: a speaker encoder (trained for speaker verification with the GE2E loss), a synthesizer that takes the speaker embedding and the phoneme sequence as input, and a vocoder (WaveNet) that produces the waveform.

Interpretable Convolutional Filters with SincNet [code] (Nov 2018)
First-layer convolutions on raw audio whose kernels are parametrized sinc band-pass filters, so only the low and high cutoff frequencies are learned instead of every kernel weight.
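
A minimal sketch of such a kernel, assuming the usual construction of a band-pass filter as the difference of two low-pass sinc filters, smoothed with a Hamming window. The function names, the cutoff values, and the kernel length below are illustrative assumptions, not the authors' implementation; in SincNet the two cutoffs would be trainable parameters.

```python
import cmath
import math


def sinc(x):
    """Unnormalized sinc: sin(x) / x, with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(x) / x


def sinc_bandpass(f1, f2, kernel_size, sample_rate):
    """Band-pass FIR kernel passing f1..f2 Hz (kernel_size should be odd)."""
    half = (kernel_size - 1) // 2
    low, high = f1 / sample_rate, f2 / sample_rate  # cutoffs in cycles per sample
    kernel = []
    for i in range(kernel_size):
        n = i - half  # time index centered on zero
        ideal = (2 * high * sinc(2 * math.pi * high * n)
                 - 2 * low * sinc(2 * math.pi * low * n))
        hamming = 0.54 - 0.46 * math.cos(2 * math.pi * i / (kernel_size - 1))
        kernel.append(ideal * hamming)
    return kernel


def gain(kernel, freq, sample_rate):
    """Magnitude of the kernel's frequency response at freq Hz."""
    w = 2 * math.pi * freq / sample_rate
    return abs(sum(h * cmath.exp(-1j * w * n) for n, h in enumerate(kernel)))


# Hypothetical example: a 300-3400 Hz band at 16 kHz with 101 taps.
kernel = sinc_bandpass(300.0, 3400.0, kernel_size=101, sample_rate=16000)
in_band = gain(kernel, 1000.0, 16000)   # frequency inside the pass band
out_band = gain(kernel, 6000.0, 16000)  # frequency outside the pass band
```

Whatever the kernel length, the filter is described by just two numbers (`f1` and `f2`), which is what makes the learned filters easy to interpret.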

How to add a paper / dataset:

Check in which category the paper fits
Check in which subcategory the paper fits (create a new one if needed)
Add the title, the link, the month and year of publication, a link to the code if it exists, and the contribution of the paper. Papers should be sorted with the most recent first in each category.

Examples:

Title of the paper [code] (Jun 2018)
A couple of lines describing the main contribution of the paper. Do not copy the abstract or write more than 2 lines in order to keep the wiki tidy.

Title of the paper (Jan 2018)
A couple of lines describing the main contribution of the paper. Do not copy the abstract or write more than 2 lines in order to keep the wiki tidy.
