Activities in egocentric video are largely defined by the objects with which the camera wearer interacts, making representations that summarize the objects in view quite informative. Beyond simply recording how frequently each object occurs in a single histogram, spatio-temporal binning approaches can capture the objects' relative layout and ordering. However, existing methods use hand-crafted binning schemes (e.g., a uniformly spaced pyramid of partitions), which may fail to capture the relationships that best distinguish certain activities. We propose to learn the spatio-temporal partitions that are discriminative for a set of egocentric activity classes. We devise a boosting approach that automatically selects a small set of useful spatio-temporal pyramid histograms among a randomized pool of candidate partitions. In order to efficiently focus the candidate partitions, we further propose an "object-centric" cutting scheme that prefers sampling bin boundaries near those objects prominently involved in the egocentric activities. In this way, we specialize the randomized pool of partitions to the egocentric setting and improve the training efficiency for boosting. Our approach yields state-of-the-art accuracy for recognition of challenging activities of daily living.
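To make the idea of object-centric cuts concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes detections are given as hypothetical `(x, t, class_id)` triples with `x` and `t` normalized to [0, 1], uses a single level of cuts along one spatial axis and the time axis rather than a full pyramid, and omits the boosting step that would select among many such candidate partitions. The function names and the parameters `p_object` and `jitter` are illustrative assumptions.

```python
import random
from bisect import bisect_right


def sample_cuts(object_positions, n_cuts, p_object=0.8, jitter=0.05, rng=None):
    """Sample sorted cut positions in [0, 1] along one axis.

    With probability p_object a cut is placed near a randomly chosen
    object position (within +/- jitter); otherwise it is drawn uniformly.
    This bias is an assumed stand-in for the object-centric scheme.
    """
    rng = rng or random.Random(0)
    cuts = []
    for _ in range(n_cuts):
        if object_positions and rng.random() < p_object:
            cut = rng.choice(object_positions) + rng.uniform(-jitter, jitter)
        else:
            cut = rng.random()
        cuts.append(min(max(cut, 0.0), 1.0))  # clamp to the valid range
    return sorted(cuts)


def binned_histogram(detections, x_cuts, t_cuts, n_classes):
    """Count object classes per spatio-temporal bin.

    detections: iterable of (x, t, class_id) with x, t in [0, 1].
    Returns a flat list of length (len(x_cuts)+1) * (len(t_cuts)+1) * n_classes.
    """
    n_t = len(t_cuts) + 1
    hist = [0] * ((len(x_cuts) + 1) * n_t * n_classes)
    for x, t, class_id in detections:
        bx = bisect_right(x_cuts, x)  # spatial bin index
        bt = bisect_right(t_cuts, t)  # temporal bin index
        hist[(bx * n_t + bt) * n_classes + class_id] += 1
    return hist


def candidate_pool(detections, n_candidates, n_cuts, n_classes, seed=0):
    """Build a randomized pool of (x_cuts, t_cuts, histogram) candidates."""
    rng = random.Random(seed)
    xs = [d[0] for d in detections]
    ts = [d[1] for d in detections]
    pool = []
    for _ in range(n_candidates):
        x_cuts = sample_cuts(xs, n_cuts, rng=rng)
        t_cuts = sample_cuts(ts, n_cuts, rng=rng)
        pool.append((x_cuts, t_cuts,
                     binned_histogram(detections, x_cuts, t_cuts, n_classes)))
    return pool
```

In a fuller pipeline, each candidate's histogram would serve as the feature for one weak classifier, and a boosting procedure would keep only the few partitions that best separate the activity classes.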
McCandless et al. (Tue,) studied this question.