Key points are not available for this paper at this time.
The ability to identify sounds in complex audio environments is highly useful for multimedia retrieval, security, and many mobile robotic applications, but very little work has been done in this area. We present the SOLAR system, a system capable of finding sound objects, such as dog barks or car horns, in complex audio data extracted from movies. SOLAR avoids the need for segmentation by scanning over the audio data in fixed increments and classifying each short audio window separately. SOLAR employs boosted decision tree classifiers to select suitable features for modeling each sound object and to discriminate between the object of interest and all other sounds. We demonstrate the effectiveness of our approach with experiments on thirteen sound object classes trained using only tens of positive examples and tested on hours of audio data extracted from popular movies.
Hoiem et al. (Wed,) studied this question.