Key points are not available for this paper at this time.
We propose an effective approach for spatio-temporal action localization in realistic videos. The approach first detects proposals at the frame-level and scores them with a combination of static and motion CNN features. It then tracks high-scoring proposals throughout the video using a tracking-by-detection approach. Our tracker relies simultaneously on instance-level and class-level detectors. The tracks are scored using a spatio-temporal motion histogram, a descriptor at the track level, in combination with the CNN features. Finally, we perform temporal localization of the action using a sliding-window approach at the track level. We present experimental results for spatio-temporal localization on the UCF-Sports, J-HMDB and UCF-101 action localization datasets, where our approach outperforms the state of the art with a margin of 15%, 7% and 12% respectively in mAP.
Building similarity graph...
Analyzing shared references across papers
Loading...
Philippe Weinzaepfel
École Normale Supérieure Paris-Saclay
Zaïd Harchaoui
University of Washington
Cordelia Schmid
Karlsruhe Institute of Technology
Building similarity graph...
Analyzing shared references across papers
Loading...
Weinzaepfel et al. (Fri,) studied this question.
synapsesocial.com/papers/6a1a941d37bfaf0f5945aa52 — DOI: https://doi.org/10.48550/arxiv.1506.01929
Synapse has enriched 5 closely related papers on similar clinical questions. Consider them for comparative context: