June 1, 2020 · Open Access

APQ: Joint Search for Network Architecture, Pruning and Quantization Policy

Abstract

We present APQ, a novel design methodology for efficient deep learning deployment. Unlike previous methods that separately optimize the neural network architecture, pruning policy, and quantization policy, we optimize them jointly. To deal with the larger design space this brings, we train a quantization-aware accuracy predictor that is fed to an evolutionary search to select the best fit. Since directly training such a predictor requires time-consuming quantization data collection, we propose a predictor-transfer technique to obtain the quantization-aware predictor: we first generate a large dataset of 〈NN architecture, ImageNet accuracy〉 pairs by sampling a pretrained unified once-for-all network and evaluating the samples directly; we then use these data to train an accuracy predictor without quantization, and transfer its weights to train the quantization-aware predictor, which greatly reduces the quantization data collection time. Extensive experiments on ImageNet show the benefits of this joint design methodology: the model searched by our method maintains the same level of accuracy as the ResNet34 8-bit model while saving 8× BitOps; we achieve 2×/1.3× latency/energy savings compared to MobileNetV2+HAQ [30, 36] while obtaining the same level of accuracy; and, at the marginal search cost for a new deployment scenario, joint optimization outperforms separate optimizations using ProxylessNAS+AMC+HAQ [5, 12, 36] by 2.3% accuracy while reducing GPU hours and CO2 emissions by orders of magnitude with respect to the training cost.
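The predictor-transfer and search steps described in the abstract can be illustrated with a deliberately small sketch. It is not the authors' implementation: the linear predictor, the three-number candidate encoding (depth, width, bit-width), the toy accuracy functions `fp_accuracy` and `quant_accuracy`, and the `cost` BitOps proxy are all hypothetical stand-ins chosen so the example runs with the Python standard library alone. It shows only the shape of the idea: fit a predictor on plentiful full-precision data, reuse its weights to initialise a quantization-aware predictor that is fine-tuned on far fewer quantized samples, then let an evolutionary search rank candidates with that predictor under a resource budget.

```python
import random

BITS = [0.5, 0.75, 1.0]  # hypothetical bit-widths 4/6/8, normalised by 8

def fp_accuracy(depth, width):
    """Toy stand-in for evaluating a sampled sub-network without quantization."""
    return 0.60 + 0.10 * depth + 0.08 * width

def quant_accuracy(depth, width, bits):
    """Toy stand-in for the (expensive) quantized evaluation."""
    return fp_accuracy(depth, width) - 0.05 * (1.0 - bits)

def cost(c):
    """Hypothetical BitOps proxy: grows with depth, width and squared bit-width."""
    depth, width, bits = c
    return (1.0 + depth) * (1.0 + width) * bits * bits

def predict(w, x):
    """Linear predictor: weights w[:-1] dotted with x, plus bias w[-1]."""
    return sum(wi * xi for wi, xi in zip(w, x)) + w[-1]

def fit(xs, ys, w, lr=0.05, epochs=100):
    """Per-sample SGD on squared error, starting from the given weights."""
    w = list(w)
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            err = predict(w, x) - y
            for i, xi in enumerate(x):
                w[i] -= lr * err * xi
            w[-1] -= lr * err
    return w

def transfer(w_fp, n_new):
    """Keep architecture weights and bias; zero-initialise the new inputs."""
    return w_fp[:-1] + [0.0] * n_new + [w_fp[-1]]

def mutate(c, rng, p=0.3):
    """Perturb each gene with probability p, keeping values in range."""
    depth, width, bits = c
    if rng.random() < p:
        depth = min(1.0, max(0.0, depth + rng.uniform(-0.2, 0.2)))
    if rng.random() < p:
        width = min(1.0, max(0.0, width + rng.uniform(-0.2, 0.2)))
    if rng.random() < p:
        bits = rng.choice(BITS)
    return [depth, width, bits]

def evolutionary_search(score, budget, rng, pop=50, gens=20, topk=10):
    """Keep the top-k candidates by predicted accuracy and mutate them."""
    def sample():
        while True:  # rejection-sample until the budget is met
            c = [rng.random(), rng.random(), rng.choice(BITS)]
            if cost(c) <= budget:
                return c
    population = [sample() for _ in range(pop)]
    for _ in range(gens):
        parents = sorted(population, key=score, reverse=True)[:topk]
        children = [mutate(rng.choice(parents), rng) for _ in range(pop - topk)]
        population = parents + [c for c in children if cost(c) <= budget]
    return max(population, key=score)

rng = random.Random(0)

# Step 1: large, cheap dataset of (architecture, accuracy) pairs.
archs = [[rng.random(), rng.random()] for _ in range(200)]
w_fp = fit(archs, [fp_accuracy(*a) for a in archs], [0.0, 0.0, 0.0])

# Step 2: small quantized dataset; fine-tune from the transferred weights.
cands = [[rng.random(), rng.random(), rng.choice(BITS)] for _ in range(40)]
w_q = fit(cands, [quant_accuracy(*c) for c in cands], transfer(w_fp, 1), epochs=500)

# Step 3: search with the quantization-aware predictor under a cost budget.
best = evolutionary_search(lambda c: predict(w_q, c), budget=1.0, rng=rng)
print(best, predict(w_q, best), cost(best))
```

In this toy setting the transferred predictor starts close to the quantized target, so the second fit only has to learn the bit-width term. The search never runs a real evaluation; it relies entirely on the predictor, which is what keeps the marginal cost of a new budget low.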

Authors

Tianzhe Wang

Shanghai Jiao Tong University

Kuan Wang

Shandong Normal University

Han Cai

Guangdong University of Technology

Institutions

Massachusetts Institute of Technology

Shanghai Jiao Tong University

Institute of Natural Science
