Article
Free access

Runtime neural pruning

Published: 04 December 2017

Abstract

In this paper, we propose a Runtime Neural Pruning (RNP) framework that prunes a deep neural network dynamically at runtime. Unlike existing neural pruning methods, which produce a fixed pruned model for deployment, our method preserves the full capacity of the original network and prunes adaptively according to the input image and the current feature maps. Pruning proceeds in a bottom-up, layer-by-layer manner, which we model as a Markov decision process and train with reinforcement learning. The agent judges the importance of each convolutional kernel and performs channel-wise pruning conditioned on each sample, pruning the network more aggressively when the image is easier for the task. Since the full capacity of the network is preserved, the speed-accuracy balance point can easily be adjusted to the available resources. Our method can be applied to off-the-shelf network architectures and achieves a better tradeoff between speed and accuracy, especially at large pruning rates.
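To make the control flow concrete, below is a minimal PyTorch sketch of sample-adaptive, layer-by-layer channel pruning in the spirit of the abstract. The names DecisionAgent, RuntimePrunedConvNet, and NUM_ACTIONS are illustrative assumptions, not the authors' code; the paper's actual decision network and its Q-learning training loop (with a reward trading accuracy against computation) are simplified away. Zeroing channels here only emulates the decision: in deployment the pruned channels would simply not be computed, which is where the speedup comes from.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_ACTIONS = 4  # action k => keep the first (k+1)/NUM_ACTIONS of the channels


class DecisionAgent(nn.Module):
    """Maps the current feature map to Q-values over pruning ratios."""

    def __init__(self, in_channels, embed_dim=32):
        super().__init__()
        self.fc = nn.Linear(in_channels, embed_dim)
        self.q_head = nn.Linear(embed_dim, NUM_ACTIONS)

    def forward(self, feat):
        # Global average pooling gives a fixed-size state per sample.
        state = F.adaptive_avg_pool2d(feat, 1).flatten(1)
        return self.q_head(F.relu(self.fc(state)))  # shape (B, NUM_ACTIONS)


class RuntimePrunedConvNet(nn.Module):
    """Tiny backbone whose channels are pruned per sample at inference time."""

    def __init__(self, num_classes=10):
        super().__init__()
        self.convs = nn.ModuleList([
            nn.Conv2d(3, 64, 3, padding=1),
            nn.Conv2d(64, 128, 3, padding=1),
        ])
        # One agent per layer for simplicity; the paper shares a single
        # decision network across layers.
        self.agents = nn.ModuleList([DecisionAgent(64), DecisionAgent(128)])
        self.classifier = nn.Linear(128, num_classes)

    def forward(self, x, epsilon=0.0):
        for conv, agent in zip(self.convs, self.agents):
            x = F.relu(conv(x))
            q = agent(x)  # judge kernel importance for this sample
            action = q.argmax(dim=1)
            if epsilon > 0:  # epsilon-greedy exploration during RL training
                explore = torch.rand(x.size(0), device=x.device) < epsilon
                action = torch.where(
                    explore, torch.randint_like(action, NUM_ACTIONS), action)
            # Zero the pruned channels (emulating skipped computation):
            # keep the first (action+1)/NUM_ACTIONS fraction of channels.
            keep = ((action + 1).float() / NUM_ACTIONS * x.size(1)).long()
            channel_idx = torch.arange(x.size(1), device=x.device)
            mask = (channel_idx[None, :] < keep[:, None]).float()
            x = x * mask[:, :, None, None]
        x = F.adaptive_avg_pool2d(x, 1).flatten(1)
        return self.classifier(x)
```

A forward pass such as RuntimePrunedConvNet()(torch.randn(8, 3, 32, 32)) yields per-sample logits while each image receives its own pruning pattern; setting epsilon > 0 during training would provide the exploration a Q-learning agent needs.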




Information

Published In

NIPS'17: Proceedings of the 31st International Conference on Neural Information Processing Systems
December 2017
7104 pages

Publisher

Curran Associates Inc.

Red Hook, NY, United States

Publication History

Published: 04 December 2017

Qualifiers

  • Article


Bibliometrics

Article Metrics

  • Downloads (last 12 months): 265
  • Downloads (last 6 weeks): 27
Reflects downloads up to 15 Sep 2024

Cited By
  • (2024) Progressive Channel-Shrinking Network. IEEE Transactions on Multimedia, 26, 2016-2026. DOI: 10.1109/TMM.2023.3291197
  • (2023) Resource-Efficient Convolutional Networks: A Survey on Model-, Arithmetic-, and Implementation-Level Techniques. ACM Computing Surveys, 55(13s), 1-36. DOI: 10.1145/3587095
  • (2022) Arbitrary Bit-width Network: A Joint Layer-Wise Quantization and Adaptive Inference Approach. Proceedings of the 30th ACM International Conference on Multimedia, 2899-2908. DOI: 10.1145/3503161.3548001
  • (2022) Edge Intelligence: Concepts, Architectures, Applications, and Future Directions. ACM Transactions on Embedded Computing Systems, 21(5), 1-41. DOI: 10.1145/3486674
  • (2021) Adaptive Anomaly Detection for Internet of Things in Hierarchical Edge Computing: A Contextual-Bandit Approach. ACM Transactions on Internet of Things, 3(1), 1-23. DOI: 10.1145/3480172
  • (2021) Adaptive Inference through Early-Exit Networks. Proceedings of the 5th International Workshop on Embedded and Mobile Deep Learning, 1-6. DOI: 10.1145/3469116.3470012
  • (2021) An Energy-Efficient Inference Method in Convolutional Neural Networks Based on Dynamic Adjustment of the Pruning Level. ACM Transactions on Design Automation of Electronic Systems, 26(6), 1-20. DOI: 10.1145/3460972
  • (2021) Efficient Multi-Scale Feature Generation Adaptive Network. Proceedings of the 30th ACM International Conference on Information & Knowledge Management, 883-892. DOI: 10.1145/3459637.3482337
  • (2021) Towards Model Compression for Deep Learning Based Speech Enhancement. IEEE/ACM Transactions on Audio, Speech and Language Processing, 29, 1785-1794. DOI: 10.1109/TASLP.2021.3082282
  • (2020) Towards Objection Detection Under IoT Resource Constraints. Proceedings of the 2nd International Workshop on Challenges in Artificial Intelligence and Machine Learning for Internet of Things, 14-20. DOI: 10.1145/3417313.3429379
