We present APQ, a novel design methodology for efficient deep learning deployment. Unlike previous methods that separately optimize the neural network architecture, pruning policy, and quantization policy, we optimize them jointly. To handle the larger design space this creates, we train a quantization-aware accuracy predictor and feed it to an evolutionary search that selects the best-fitting model. Because directly training such a predictor requires time-consuming collection of quantization data, we propose a predictor-transfer technique to obtain the quantization-aware predictor: we first generate a large dataset of 〈NN architecture, ImageNet accuracy〉 pairs by sampling a pretrained unified once-for-all network and evaluating it directly; we then use these data to train an accuracy predictor without quantization and transfer its weights to train the quantization-aware predictor, which greatly reduces the quantization data collection time. Extensive experiments on ImageNet show the benefits of this joint design methodology: the model found by our method maintains the same level of accuracy as the ResNet34 8-bit model while saving 8× BitOps; we achieve 2×/1.3× latency/energy savings compared to MobileNetV2+HAQ [30, 36] at the same level of accuracy; and, for a new deployment scenario, joint optimization outperforms separate optimizations using ProxylessNAS+AMC+HAQ [5, 12, 36] by 2.3% accuracy, while its marginal search cost reduces GPU hours and CO₂ emissions by orders of magnitude with respect to the training cost.
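The predictor-transfer step described in the abstract can be illustrated with a small, self-contained sketch. This is not the authors' implementation: it assumes a toy linear predictor in place of a neural one, synthetic accuracy functions (`toy_fp_accuracy`, `toy_quant_accuracy`) in place of ImageNet evaluation, and hypothetical encoding sizes `N_ARCH` and `N_QUANT`. It only shows the order of operations: train on plentiful full-precision data, copy the weights into a predictor with extra quantization inputs, then fine-tune on a small quantized dataset.

```python
import random

N_ARCH, N_QUANT = 4, 2  # hypothetical sizes of the architecture / bitwidth encodings


def predict(weights, features):
    # Toy linear accuracy predictor; the last weight is the bias term.
    return sum(w * x for w, x in zip(weights, features)) + weights[-1]


def mse(weights, data):
    return sum((predict(weights, x) - y) ** 2 for x, y in data) / len(data)


def train(weights, data, steps, lr=0.05):
    # Full-batch gradient descent on the mean squared error.
    weights = list(weights)
    n = len(data)
    for _ in range(steps):
        grad = [0.0] * len(weights)
        for x, y in data:
            err = predict(weights, x) - y
            for i, xi in enumerate(x):
                grad[i] += 2 * err * xi / n
            grad[-1] += 2 * err / n
        weights = [w - lr * g for w, g in zip(weights, grad)]
    return weights


def toy_fp_accuracy(arch):
    # Synthetic stand-in for evaluating a sampled sub-network (assumption).
    return 0.60 + 0.05 * sum(arch)


def toy_quant_accuracy(arch, bits):
    # Synthetic stand-in: lower precision (smaller encoded value) costs accuracy.
    return toy_fp_accuracy(arch) - 0.04 * sum(1.0 - b for b in bits)


rng = random.Random(0)


def sample(n):
    return [rng.random() for _ in range(n)]


# Step 1: a large, cheap dataset of (architecture, accuracy) pairs.
fp_data = []
for _ in range(500):
    arch = sample(N_ARCH)
    fp_data.append((arch, toy_fp_accuracy(arch)))

# Step 2: train the predictor without quantization inputs.
fp_weights = train([0.0] * (N_ARCH + 1), fp_data, steps=300)

# Step 3: transfer. Architecture weights and bias are copied; the new
# quantization inputs start at zero (an initialization chosen for this sketch).
quant_init = fp_weights[:N_ARCH] + [0.0] * N_QUANT + [fp_weights[-1]]

# Step 4: fine-tune on a small, expensive quantized dataset.
quant_data = []
for _ in range(20):
    arch, bits = sample(N_ARCH), sample(N_QUANT)
    quant_data.append((arch + bits, toy_quant_accuracy(arch, bits)))

quant_weights = train(quant_init, quant_data, steps=50)

print("error before fine-tuning:", mse(quant_init, quant_data))
print("error after fine-tuning: ", mse(quant_weights, quant_data))
```

In this toy setting the transferred predictor already captures the architecture-dependent part of the accuracy, so the small quantized dataset only has to teach it the effect of the bitwidth inputs.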
Tianzhe Wang
Shanghai Jiao Tong University
Kuan Wang
Shandong Normal University
Han Cai
Guangdong University of Technology
Massachusetts Institute of Technology
Shanghai Jiao Tong University
Institute of Natural Science
Wang et al. studied this question.
synapsesocial.com/papers/6a0ef57a2eca052da647f5e1 — DOI: https://doi.org/10.1109/cvpr42600.2020.00215