Context plays an increasingly important role in improving object detection performance. In this paper we propose an effective representation, Multi-Order Contextual co-Occurrence (MOCO), that implicitly models high-level context using only the detection responses of a baseline object detector. The 1st-order context feature is computed as a set of randomized binary comparisons on the response map of the baseline object detector. Statistics of these 1st-order binary context features are then computed to construct a high-order co-occurrence descriptor. By combining the MOCO feature with the original image feature, we can evolve the baseline object detector into a stronger, context-aware detector. With the updated detector, we can continue the evolution until the contextual improvements saturate. Using the successful deformable-part-model detector [13] as the baseline, we test the proposed MOCO evolution framework on the PASCAL VOC 2007 dataset [8] and the Caltech pedestrian dataset [7]. The proposed MOCO detector outperforms all known state-of-the-art approaches, contextually boosting deformable part models (ver. 5) [13] by 3.3% in mean average precision on the PASCAL 2007 dataset. On the Caltech pedestrian dataset, our method further reduces the log-average miss rate from 48% to 46% and the miss rate at 1 FPPI from 25% to 23%, compared with the best prior art [6].
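To make the two feature levels concrete, the following is a minimal sketch of randomized binary comparisons on a detector response map, followed by a simple pairwise co-occurrence statistic over the resulting bits. It is an illustration under stated assumptions, not the paper's implementation: the function names, the square sampling neighborhood, the border clamping, and the choice of a pairwise AND as the co-occurrence statistic are all hypothetical, since the abstract does not specify them.

```python
import random
from itertools import combinations


def sample_pairs(radius, n_pairs, seed=0):
    """Draw n_pairs random pairs of (dy, dx) offsets within a square
    neighborhood of the given radius (assumed sampling scheme)."""
    rng = random.Random(seed)

    def offset():
        return (rng.randint(-radius, radius), rng.randint(-radius, radius))

    return [(offset(), offset()) for _ in range(n_pairs)]


def response_at(resp, y, x):
    """Read the response map at (y, x), clamping to the map border."""
    h, w = len(resp), len(resp[0])
    y = min(max(y, 0), h - 1)
    x = min(max(x, 0), w - 1)
    return resp[y][x]


def first_order_context(resp, y, x, pairs):
    """1st-order context: one bit per offset pair, set when the response
    at the first offset is strictly larger than at the second."""
    return [
        1 if response_at(resp, y + ay, x + ax) > response_at(resp, y + by, x + bx) else 0
        for (ay, ax), (by, bx) in pairs
    ]


def cooccurrence(bits):
    """Higher-order statistic (assumed form): for every pair of bits,
    record whether both are set."""
    return [bits[i] & bits[j] for i, j in combinations(range(len(bits)), 2)]


if __name__ == "__main__":
    # Toy 5x5 response map standing in for baseline detector scores.
    resp = [[0.1 * (r * 5 + c) for c in range(5)] for r in range(5)]
    pairs = sample_pairs(radius=2, n_pairs=8, seed=1)
    bits = first_order_context(resp, 2, 2, pairs)
    print(bits)
    print(cooccurrence(bits))
```

In the framework described above, a descriptor like this would be concatenated with the original image feature to retrain the detector, and the process would repeat on the new detector's response map until the gains level off.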
Chen et al. studied this question.