In this paper we propose a novel deep neural network that jointly reasons about 3D detection, tracking and motion forecasting given data captured by a 3D sensor. By reasoning about these tasks jointly, our holistic approach is more robust to occlusion as well as to sparse data at range. Our approach performs 3D convolutions across space and time over a bird's-eye-view representation of the 3D world, which is very efficient in terms of both memory and computation. Our experiments on a new, very large-scale dataset captured in several North American cities show that we can outperform the state of the art by a large margin. Importantly, by sharing computation we can perform all tasks in as little as 30 ms.
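To make the bird's-eye-view idea concrete, here is a minimal sketch, not the authors' implementation. It assumes a binary occupancy grid that ignores height, and it stands in a fixed averaging kernel over time for the learned 3D convolutions the abstract describes. The names `bev_occupancy` and `temporal_conv`, the grid extent, and the cell size are all hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]  # (x, y, z) in metres
Grid = List[List[int]]              # grid[i][j] is 1 if cell (i, j) is occupied


def bev_occupancy(points: Sequence[Point],
                  x_range: Tuple[float, float] = (0.0, 4.0),
                  y_range: Tuple[float, float] = (0.0, 4.0),
                  cell: float = 1.0) -> Grid:
    """Project 3D points onto a binary top-down grid (assumption: z is dropped)."""
    nx = int((x_range[1] - x_range[0]) / cell)
    ny = int((y_range[1] - y_range[0]) / cell)
    grid = [[0] * ny for _ in range(nx)]
    for x, y, _z in points:
        # Points outside the region of interest are ignored.
        if x_range[0] <= x < x_range[1] and y_range[0] <= y < y_range[1]:
            i = int((x - x_range[0]) / cell)
            j = int((y - y_range[0]) / cell)
            grid[i][j] = 1
    return grid


def temporal_conv(frames: Sequence[Grid],
                  kernel: Sequence[float]) -> List[List[List[float]]]:
    """Slide a 1D kernel along the time axis of stacked grids (no padding).

    A learned network would also convolve over the two spatial axes; this
    sketch keeps only the temporal part to show how frames are fused.
    """
    k = len(kernel)
    nx, ny = len(frames[0]), len(frames[0][0])
    out = []
    for t in range(len(frames) - k + 1):
        out.append([[sum(kernel[d] * frames[t + d][i][j] for d in range(k))
                     for j in range(ny)]
                    for i in range(nx)])
    return out


# Three sweeps of a single point moving one cell per frame along x.
frames = [bev_occupancy([(0.5 + t, 1.5, 0.0)]) for t in range(3)]
fused = temporal_conv(frames, [0.5, 0.5])
```

Each fused frame carries evidence from two consecutive sweeps, so a cell occupied in only one of them keeps a nonzero response. This is one simple way to see why aggregating over time can help with sparse or occluded measurements.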
Wenjie Luo
Tongji University
Bin Yang
Shanghai Medical Information Center
Raquel Urtasun
Karlsruhe Institute of Technology
University of Toronto
Advanced Technologies Group (United States)
Luo et al. (Tue,) studied this question.
synapsesocial.com/papers/6a09ba43e5a55b25c0513bd1 — DOI: https://doi.org/10.48550/arxiv.2012.12395