November 22, 2002

Speaker identification and video analysis for hierarchical video shot classification

Abstract

We present a new video shot classification and clustering technique to support content-based indexing, browsing and retrieval in video databases. The proposed method is based on the analysis of both the audio and visual data tracks. The visual stream is analyzed using a 3-D wavelet transform and segmented into shot units which are matched and clustered by visual content. Simultaneously, speaker changes are detected by tracking voiced phonemes in the audio signal. The clues obtained from the video and speech data are combined to classify and group the isolated video shots. This integrated approach also allows effective indexing of the audio-visual objects in multimedia databases.
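To make the shot segmentation step more concrete, here is a minimal sketch of one way a temporal wavelet response could flag shot boundaries. It is not the authors' method: it uses only the temporal high-pass (detail) branch of a one-level, undecimated Haar transform rather than a full 3-D wavelet transform, and the function names, the toy frames, and the threshold value are all assumptions made for illustration.

```python
import math


def temporal_haar_energy(frames):
    """Mean energy of the one-level Haar detail between consecutive frames.

    Each frame is a flat list of pixel intensities (hypothetical format).
    The detail coefficient for a pixel is (curr - prev) / sqrt(2).
    """
    energies = []
    for prev, curr in zip(frames, frames[1:]):
        detail = [(c - p) / math.sqrt(2) for p, c in zip(prev, curr)]
        energies.append(sum(d * d for d in detail) / len(detail))
    return energies


def detect_shot_boundaries(frames, threshold):
    """Return indices i where a new shot is assumed to start at frames[i].

    A boundary is declared when the temporal detail energy between
    frames[i - 1] and frames[i] exceeds the (assumed) threshold.
    """
    energies = temporal_haar_energy(frames)
    return [i + 1 for i, e in enumerate(energies) if e > threshold]


# Toy data: three dark frames followed by two bright frames.
dark = [10, 12, 11, 10]
bright = [200, 198, 201, 199]
frames = [dark, dark, dark, bright, bright]

boundaries = detect_shot_boundaries(frames, threshold=100.0)
print(boundaries)  # → [3]
```

In a fuller pipeline, the segments between detected boundaries would then be matched and clustered by visual content and combined with the speaker-change cues from the audio track, as the abstract describes; those stages are outside the scope of this sketch.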
