MUSAN: A Music, Speech, and Noise Corpus

Abstract

This report introduces a new corpus of music, speech, and noise. The dataset is suitable for training models for voice activity detection (VAD) and music/speech discrimination, and is released under a flexible Creative Commons license. It consists of music from several genres, speech from twelve languages, and a wide assortment of technical and non-technical noises. We demonstrate use of the corpus for music/speech discrimination on broadcast news and for VAD in speaker identification.

Authors

David Snyder

ECRI Institute

Guoguo Chen

New England Biolabs (China)

Daniel Povey

Xiaomi (China)

Cite this study

Snyder, Chen, and Povey (2015). MUSAN: A Music, Speech, and Noise Corpus.

synapsesocial.com/papers/6a08fcfc944076d22073a909 — DOI: https://doi.org/10.48550/arxiv.1510.08484

Also consider

Synapse has enriched 5 closely related papers on similar topics. Consider them for comparative context:

The Million Song Dataset · 2011 · 831 citations
Supervised/Unsupervised Voice Activity Detectors for Text-dependent Speaker Recognition on the RSR2015 Corpus · 2014 · 33 citations
Music tonality features for speech/music discrimination · 2014 · 48 citations
Kaldi Speech Recognition Toolkit · 2024 · 4,899 citations
Time delay deep neural network-based universal background models for speaker recognition · 2015 · 136 citations
