Abstract
Designing resource-efficient neural architectures is an important problem. One solution is to adjust the number of channels in each layer and the number of blocks in each network stage. This paper presents a novel framework named network adjustment, which considers accuracy as a function of the computational resource (e.g., FLOPs or parameters), so that architecture design becomes an optimization problem that can be solved with gradient-based methods. The gradient is defined as the resource utilization ratio (RUR) of each adjustable module (layer or block) in a network and is accurate only in a small neighborhood of the current architecture. We therefore estimate it using Dropout, a probabilistic operation, and optimize the network architecture iteratively. The computational overhead of the entire process is comparable to that of re-training the final model from scratch. We investigate two versions of RUR, in which resource usage is measured by FLOPs and by latency, respectively. Experiments on standard image classification datasets with several base networks, including ResNet and EfficientNet, demonstrate the effectiveness of our approach, which consistently outperforms pruning-based counterparts.
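To give a rough sense of the procedure the abstract describes, the sketch below is a minimal, hypothetical toy: the accuracy() and flops() functions are stand-ins for the real measurements (validation accuracy under channel-wise Dropout and an analytical FLOPs counter), and the single greedy step only approximates the paper's iterative, Dropout-based update; it is not the authors' exact algorithm.

```python
# Toy sketch of RUR-guided channel adjustment (hypothetical stand-in
# for the paper's procedure; accuracy/FLOPs models are synthetic).
import math

# channels[i] = number of channels in layer i of a toy 4-layer network
channels = [32, 64, 128, 256]

def flops(ch):
    # Toy FLOPs model: each layer's cost scales with in-channels * out-channels.
    return sum(a * b for a, b in zip(ch, ch[1:] + [ch[-1]]))

def accuracy(ch):
    # Toy accuracy model with diminishing returns in channel count.
    return sum(math.log1p(c) for c in ch)

def rur(ch, i, frac=0.1):
    """Resource utilization ratio of layer i: accuracy lost per unit of
    FLOPs saved when `frac` of its channels are removed. In the paper
    this sensitivity is estimated with Dropout rather than re-training."""
    trial = list(ch)
    trial[i] = max(1, int(ch[i] * (1 - frac)))
    d_acc = accuracy(ch) - accuracy(trial)
    d_flops = flops(ch) - flops(trial)
    return d_acc / max(d_flops, 1e-9)

# One adjustment step: shrink the layer that uses resources least
# efficiently and grow the one that uses them most efficiently,
# keeping the total FLOPs roughly constant.
scores = [rur(channels, i) for i in range(len(channels))]
lo, hi = scores.index(min(scores)), scores.index(max(scores))
channels[lo] = max(1, int(channels[lo] * 0.9))
channels[hi] = int(channels[hi] * 1.1)
print(channels, flops(channels))
```

In the paper, this kind of step is repeated until the resource budget and accuracy converge; the same RUR definition applies whether the resource is FLOPs or measured latency.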
References
Cai, H., Zhu, L., & Han, S. (2018). Proxylessnas: Direct neural architecture search on target task and hardware. arXiv:1812.00332
Chen, K., Wang, J., Pang, J., Cao, Y., Xiong, Y., Li, X., Sun, S., Feng, W., Liu, Z., Xu, J., Zhang, Z., Cheng, D., Zhu, C., Cheng, T., Zhao, Q., Li, B., Lu, X., Zhu, R., Wu, Y., Dai, J., Wang, J., Shi, J., Ouyang, W., Loy, C. C., & Lin, D. (2019a). MMDetection: Open mmlab detection toolbox and benchmark. arXiv:1906.07155
Chen, Y., Yang, T., Zhang, X., Meng, G., Pan, C., & Sun, J. (2019b). Detnas: Neural architecture search on object detection. arXiv:1903.10979
Chen, Z., Niu, J., Xie, L., Liu, X., Wei, L., & Tian, Q. (2020). Network adjustment: Channel search guided by flops utilization ratio. In: Computer Vision and Pattern Recognition
Chu, X., Zhang, B., Xu, R., & Li, J. (2019). Fairnas: Rethinking evaluation fairness of weight sharing neural architecture search. arXiv:1907.01845
Dong, X., & Yang, Y. (2019). Network pruning via transformable architecture search. arXiv:1905.09717
Dong, X., Huang, J., Yang, Y., & Yan, S. (2017). More is less: A more complicated network with less inference complexity. In: Computer Vision and Pattern Recognition
Frankle, J., & Carbin, M. (2018). The lottery ticket hypothesis: Finding sparse, trainable neural networks. arXiv:1803.03635
Ghiasi, G., Lin, T. Y., & Le, Q. V. (2018). Dropblock: A regularization method for convolutional networks. arXiv:1810.12890
Girshick, R., Donahue, J., Darrell, T., & Malik, J. (2014). Rich feature hierarchies for accurate object detection and semantic segmentation. In: Computer Vision and Pattern Recognition
Gordon, A., Eban, E., Nachum, O., Chen, B., Wu, H., Yang, T. J., & Choi, E. (2018). Morphnet: Fast & simple resource-constrained structure learning of deep networks. In: Computer Vision and Pattern Recognition
Guo, Z., Zhang, X., Mu, H., Heng, W., Liu, Z., Wei, Y., & Sun, J. (2019). Single path one-shot neural architecture search with uniform sampling. arXiv:1904.00420
Han, D., Kim, J., & Kim, J. (2017). Deep pyramidal residual networks. In: Computer Vision and Pattern Recognition
Han, S., Mao, H., & Dally, W. J. (2015a). Deep compression: Compressing deep neural networks with pruning, trained quantization and huffman coding. arXiv:1510.00149
Han, S., Pool, J., Tran, J., & Dally, W. (2015b). Learning both weights and connections for efficient neural network. In: Advances in Neural Information Processing Systems
He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In: Computer Vision and Pattern Recognition
He, K., Gkioxari, G., Dollár, P., & Girshick, R. (2017). Mask r-cnn. In: International Conference on Computer Vision
He, Y., Kang, G., Dong, X., Fu, Y., & Yang, Y. (2018). Soft filter pruning for accelerating deep convolutional neural networks. arXiv:1808.06866
He, Y., Liu, P., Wang, Z., Hu, Z., & Yang, Y. (2019). Filter pruning via geometric median for deep convolutional neural networks acceleration. In: Computer Vision and Pattern Recognition
Hinton, G., Vinyals, O., & Dean, J. (2015). Distilling the knowledge in a neural network. arXiv:1503.02531
Howard, A., Sandler, M., Chu, G., Chen, L. C., Chen, B., Tan, M., Wang, W., Zhu, Y., Pang, R., Vasudevan, V., et al. (2019). Searching for mobilenetv3. arXiv:1905.02244
Huang, G., Sun, Y., Liu, Z., Sedra, D., & Weinberger, K. Q. (2016). Deep networks with stochastic depth. In: European Conference on Computer Vision
Huang, G., Liu, Z., van der Maaten, L., & Weinberger, K. Q. (2017). Densely connected convolutional networks. In: Computer Vision and Pattern Recognition
Huang, Z., & Wang, N. (2018). Data-driven sparse structure selection for deep neural networks. In: European Conference on Computer Vision
Ioffe, S., & Szegedy, C. (2015). Batch normalization: Accelerating deep network training by reducing internal covariate shift. arXiv:1502.03167
Krizhevsky, A., & Hinton, G. (2009). Learning multiple layers of features from tiny images. Technical report, University of Toronto.
Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). Imagenet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems
Larsson, G., Maire, M., & Shakhnarovich, G. (2016). Fractalnet: Ultra-deep neural networks without residuals. arXiv:1605.07648
LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436–444.
Li, H., Kadav, A., Durdanovic, I., Samet, H., & Graf, H. P. (2016). Pruning filters for efficient convnets. arXiv:1608.08710
Li, X., Chen, S., Hu, X., & Yang, J. (2019). Understanding the disharmony between dropout and batch normalization by variance shift. In: Computer Vision and Pattern Recognition
Liu, H., Simonyan, K., & Yang, Y. (2018a). Darts: Differentiable architecture search. arXiv:1806.09055
Liu, Z., Li, J., Shen, Z., Huang, G., Yan, S., & Zhang, C. (2017). Learning efficient convolutional networks through network slimming. In: International Conference on Computer Vision
Liu, Z., Sun, M., Zhou, T., Huang, G., & Darrell, T. (2018b). Rethinking the value of network pruning. arXiv:1810.05270
Long, J., Shelhamer, E., & Darrell, T. (2015). Fully convolutional networks for semantic segmentation. In: Computer Vision and Pattern Recognition
Lym, S., Choukse, E., Zangeneh, S., Wen, W., Erez, M., & Sanghavi, S. (2019). Prunetrain: Gradual structured pruning from scratch for faster neural network training. arXiv:1901.09290
Paszke, A., Gross, S., Chintala, S., Chanan, G., Yang, E., DeVito, Z., Lin, Z., Desmaison, A., Antiga, L., & Lerer, A. (2017). Automatic differentiation in pytorch. In: NIPS Autodiff Workshop
Pham, H., Guan, M. Y., Zoph, B., Le, Q. V., & Dean, J. (2018). Efficient neural architecture search via parameter sharing. arXiv:1802.03268
Qiao, S., Lin, Z., Zhang, J., & Yuille, A. L. (2019). Neural rejuvenation: Improving deep network training by enhancing computational resource utilization. In: Computer Vision and Pattern Recognition, pp. 61–71
Real, E., Aggarwal, A., Huang, Y., & Le, Q. V. (2019). Regularized evolution for image classifier architecture search. In: AAAI Conference on Artificial Intelligence, 33, 4780–4789
Ren, S., He, K., Girshick, R., & Sun, J. (2015). Faster r-cnn: Towards real-time object detection with region proposal networks. In: Advances in Neural Information Processing Systems
Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., et al. (2015). Imagenet large scale visual recognition challenge. International Journal of Computer Vision, 115(3), 211–252.
Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., & Chen, L. C. (2018). Mobilenetv2: Inverted residuals and linear bottlenecks. In: Computer Vision and Pattern Recognition
Simonyan, K., & Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556
Srivastava, N., Hinton, G. E., Krizhevsky, A., Sutskever, I., & Salakhutdinov, R. (2014). Dropout: A simple way to prevent neural networks from overfitting. Journal of Machine Learning Research, 15(1), 1929–1958.
Stamoulis, D., Ding, R., Wang, D., Lymberopoulos, D., Priyantha, B., Liu, J., & Marculescu, D. (2019). Single-path nas: Designing hardware-efficient convnets in less than 4 hours. arXiv:1904.02877
Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., & Rabinovich, A. (2015). Going deeper with convolutions. In: Computer Vision and Pattern Recognition
Szegedy, C., Ioffe, S., Vanhoucke, V., & Alemi, A. (2016a). Inception-v4, inception-resnet and the impact of residual connections on learning. arXiv:1602.07261
Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., & Wojna, Z. (2016b). Rethinking the inception architecture for computer vision. In: Computer Vision and Pattern Recognition
Tan, M., & Le, Q. V. (2019). Efficientnet: Rethinking model scaling for convolutional neural networks. arXiv:1905.11946
Tan, M., & Le, Q. V. (2021). Efficientnetv2: Smaller models and faster training. arXiv:2104.00298
Tan, M., Chen, B., Pang, R., Vasudevan, V., Sandler, M., Howard, A., & Le, Q. V. (2019). Mnasnet: Platform-aware neural architecture search for mobile. In: Computer Vision and Pattern Recognition, pp. 2820–2828
Tompson, J., Goroshin, R., Jain, A., LeCun, Y., & Bregler, C. (2015). Efficient object localization using convolutional networks. In: Computer Vision and Pattern Recognition
Veit, A., Wilber, M. J., & Belongie, S. (2016). Residual networks behave like ensembles of relatively shallow networks. In: Advances in Neural Information Processing Systems
Wan, A., Dai, X., Zhang, P., He, Z., Tian, Y., Xie, S., Wu, B., Yu, M., Xu, T., Chen, K., et al. (2020). Fbnetv2: Differentiable neural architecture search for spatial and channel dimensions. In: Computer Vision and Pattern Recognition, pp. 12965–12974
Wang, J., Bai, H., Wu, J., Shi, X., Huang, J., King, I., Lyu, M., & Cheng, J. (2020a). Revisiting parameter sharing for automatic neural channel number search. In: Advances in Neural Information Processing Systems, 33
Wang, X., Yu, F., Dou, Z. Y., Darrell, T., & Gonzalez, J. E. (2018). Skipnet: Learning dynamic routing in convolutional networks. In: European Conference on Computer Vision
Wang, Y., Zhang, X., Xie, L., Zhou, J., Su, H., Zhang, B., & Hu, X. (2020b). Pruning from scratch. In: AAAI Conference on Artificial Intelligence, pp. 12273–12280
Wen, W., Wu, C., Wang, Y., Chen, Y., & Li, H. (2016). Learning structured sparsity in deep neural networks. In: Advances in Neural Information Processing Systems
Wu, B., Dai, X., Zhang, P., Wang, Y., Sun, F., Wu, Y., Tian, Y., Vajda, P., Jia, Y., & Keutzer, K. (2019). Fbnet: Hardware-aware efficient convnet design via differentiable neural architecture search. In: Computer Vision and Pattern Recognition
Xie, S., Girshick, R., Dollár, P., Tu, Z., & He, K. (2016). Aggregated residual transformations for deep neural networks. arXiv:1611.05431
Xu, Y., Xie, L., Zhang, X., Chen, X., Shi, B., Tian, Q., & Xiong, H. (2020). Latency-aware differentiable neural architecture search. arXiv:2001.06392
Yu, J., & Huang, T. (2019). Network slimming by slimmable networks: Towards one-shot architecture search for channel numbers. arXiv:1903.11728
Zagoruyko, S., & Komodakis, N. (2016). Wide residual networks. arXiv:1605.07146
Zoph, B., Vasudevan, V., Shlens, J., & Le, Q. V. (2017). Learning transferable architectures for scalable image recognition. arXiv:1707.07012
Acknowledgements
This work was supported by the National Key R&D Program of China (2017YFB1301100), National Natural Science Foundation of China (61772060, U1536107, 61472024, 61572060, 61976012, 61602024), and the CERNET Innovation Project (NGII20160316). We would like to thank Dr. Xin Chen for the helpful discussions.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Communicated by Jifeng Dai.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Chen, Z., Xie, L., Niu, J. et al. Network Adjustment: Channel and Block Search Guided by Resource Utilization Ratio. Int J Comput Vis 130, 820–835 (2022). https://doi.org/10.1007/s11263-021-01566-5