April 1, 2025 (Open Access)

Computer Audition: From Task-Specific Machine Learning to Foundation Models

Abstract

Foundation models (FMs) are increasingly driving advances on a variety of tasks that fall under the purview of computer audition, i.e., the use of machines to understand sounds. They offer several advantages over traditional pipelines, among them the ability to consolidate multiple tasks in a single model, the option to leverage knowledge from other modalities, and readily available interaction with human users. Naturally, these promises have created substantial excitement in the audio community and have led to a wave of early attempts to build new, general-purpose FMs for audio. In the present contribution, we give an overview of computational audio analysis as it transitions from traditional pipelines toward auditory FMs. Our work highlights the key operating principles that underpin those models and showcases how they can accommodate multiple tasks that the audio community previously tackled separately.

Authors

Andreas Triantafyllopoulos

Iosif Tsangko

Alexander Gebhard

Journals

Proceedings of the IEEE

Institutions

Technical University of Munich

Tampere University
