With more than 10,000 new videos posted online every day on social websites such as YouTube and Facebook, the internet is becoming an almost infinite source of information. One crucial challenge for the coming decade is to be able to harvest relevant information from this constant flow of multimodal data. This paper addresses the task of multimodal sentiment analysis and conducts proof-of-concept experiments demonstrating that a joint model integrating visual, audio, and textual features can be effectively used to identify sentiment in Web videos. This paper makes three important contributions. First, it addresses for the first time the task of tri-modal sentiment analysis, and shows that it is a feasible task that can benefit from the joint exploitation of visual, audio, and textual modalities. Second, it identifies a subset of audio-visual features relevant to sentiment analysis and presents guidelines on how to integrate these features. Finally, it introduces a new dataset consisting of real online data, which will be useful for future research in this area.
Louis‐Philippe Morency
Carnegie Mellon University
Rada Mihalcea
University of Michigan
Payal Doshi
SoftTeam Solutions (India)
University of Southern California
University of North Texas
Creative Technologies (United States)
Morency et al. studied this question.
synapsesocial.com/papers/6a0804820511025d3a379157 — DOI: https://doi.org/10.1145/2070481.2070509