September 1, 2024

Investigating Confidence Estimation Measures for Speaker Diarization

Abstract

Speaker diarization systems segment a conversation recording based on the speakers' identity. Such systems can misclassify the speaker of a portion of audio due to a variety of factors, such as speech pattern variation, background noise, and overlapping speech. These errors propagate to, and can adversely affect, downstream systems that rely on the speaker's identity, such as speaker-adapted speech recognition. One of the ways to mitigate these errors is to provide segment-level diarization confidence scores to downstream systems. In this work, we investigate multiple methods for generating diarization confidence scores, including those derived from the original diarization system and those derived from an external model. Our experiments across multiple datasets and diarization systems demonstrate that the most competitive confidence score methods can isolate 30% of the diarization errors within segments with the lowest 10% of confidence scores.
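The headline figure (30% of errors isolated within the lowest-confidence 10% of segments) can be read as an error-capture rate. The sketch below is one plausible way to compute such a rate, not the authors' evaluation code: it assumes each segment carries a single confidence score and a binary error label, and it counts segments rather than weighting by duration. The function name `error_capture_rate` and its parameters are hypothetical.

```python
from typing import Sequence


def error_capture_rate(
    confidences: Sequence[float],
    is_error: Sequence[bool],
    fraction: float = 0.10,
) -> float:
    """Share of all erroneous segments found among the lowest-confidence
    `fraction` of segments (count-based; an assumption, not the paper's
    exact metric)."""
    if len(confidences) != len(is_error):
        raise ValueError("confidences and is_error must have the same length")
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")

    total_errors = sum(1 for e in is_error if e)
    if total_errors == 0:
        return 0.0  # nothing to capture

    # Number of segments in the low-confidence bucket (rounded down).
    n_selected = int(len(confidences) * fraction)

    # Segment indices ordered from least to most confident.
    order = sorted(range(len(confidences)), key=lambda i: confidences[i])
    lowest = order[:n_selected]

    captured = sum(1 for i in lowest if is_error[i])
    return captured / total_errors


# Toy data: 10 segments, 3 of which are diarization errors.
scores = [0.10, 0.90, 0.80, 0.70, 0.95, 0.60, 0.85, 0.50, 0.75, 0.99]
errors = [True, False, False, False, False, True, False, True, False, False]
rate = error_capture_rate(scores, errors, fraction=0.2)
```

With these toy values, the two least confident segments are both errors, so the bucket captures two of the three errors. A duration-weighted variant would sum error time instead of counting segments.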

Cite this study

Chowdhury et al. (2024) studied this question.

www.synapsesocial.com/papers/68e59e92b6db643587538d07 — DOI: https://doi.org/10.21437/interspeech.2024-1044

Authors

Anurag Chowdhury

Abhinav Misra

Mark C. Fuhs
