April 1, 2018

Multi-Channel Deep Clustering: Discriminative Spectral and Spatial Embeddings for Speaker-Independent Speech Separation

Abstract

The recently proposed deep clustering algorithm represents a fundamental advance towards solving the cocktail party problem in the single-channel case. When multiple microphones are available, spatial information can be leveraged to differentiate signals from different directions. This study combines spectral and spatial features in a deep clustering framework so that the complementary spectral and spatial information can be simultaneously exploited to improve speech separation. We find that simply encoding inter-microphone phase patterns as additional input features during deep clustering provides a significant improvement in separation performance, even with random microphone array geometry. Experiments on a spatialized version of the wsj0-2mix dataset show the strong potential of the proposed algorithm for speech separation in reverberant environments.
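
As an illustration of what "encoding inter-microphone phase patterns as additional input features" could look like, the sketch below stacks the log-magnitude spectrum of a reference microphone with the cosine and sine of the inter-channel phase difference (IPD) for one microphone pair. The abstract does not specify the encoding, so the cos/sin form, the two-microphone setup, the STFT parameters, and the function names `stft` and `spectral_spatial_features` are all assumptions made for this example, not the authors' implementation.

```python
import numpy as np


def stft(x, n_fft=512, hop=128):
    """Minimal STFT: Hann-windowed frames -> one-sided FFT, shape (frames, bins)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack(
        [x[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)


def spectral_spatial_features(ref, other, n_fft=512, hop=128, eps=1e-8):
    """Concatenate spectral and spatial features for one microphone pair.

    Assumed (hypothetical) encoding: log magnitude of the reference channel,
    plus cos and sin of the phase difference between the two channels.
    """
    spec_ref = stft(ref, n_fft, hop)
    spec_other = stft(other, n_fft, hop)

    # Spectral feature: log-magnitude spectrum of the reference microphone.
    log_mag = np.log(np.abs(spec_ref) + eps)

    # Spatial feature: inter-channel phase difference per time-frequency bin.
    # Using cos/sin avoids the 2*pi wrap-around discontinuity of the raw angle.
    ipd = np.angle(spec_ref) - np.angle(spec_other)

    return np.concatenate([log_mag, np.cos(ipd), np.sin(ipd)], axis=1)


# Usage: one second of noise at 16 kHz, second channel delayed by 3 samples.
rng = np.random.default_rng(0)
mic1 = rng.standard_normal(16000)
mic2 = np.roll(mic1, 3)
features = spectral_spatial_features(mic1, mic2)
print(features.shape)  # (frames, 3 * (n_fft // 2 + 1))
```

In a deep clustering setup, a feature matrix of this kind would be fed to the embedding network in place of the single-channel spectral input; with more than two microphones, one plausible extension is to append the cos/sin IPD of further microphone pairs along the feature axis.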
