NERVE: Real-Time Neural Video Recovery and Enhancement on Mobile Devices

Authors:

Kyoungjun Park,

Yuqing Yang

Proceedings of the ACM on Networking, Volume 2, Issue CoNEXT1

Article No.: 4, Pages 1 - 19

https://doi.org/10.1145/3649472

Published: 28 March 2024

Abstract

As mobile devices become increasingly popular for video streaming, it is crucial to optimize the streaming experience for these devices. Although deep learning-based video enhancement techniques are gaining attention, most of them cannot support real-time enhancement on mobile devices. Additionally, many of these techniques focus solely on super-resolution and cannot handle partial or complete loss or corruption of video frames, which is common on the Internet and in wireless networks.

To overcome these challenges, we present NERVE. NERVE consists of (i) a novel video frame recovery scheme, (ii) a new super-resolution algorithm, and (iii) an enhancement-aware video bit rate adaptation algorithm. We implement NERVE on an iPhone 12, where it supports 30 frames per second (FPS). We evaluate NERVE over 3G, 4G, 5G, and WiFi networks. Our evaluation shows that NERVE enables real-time video recovery and enhancement and yields a 24% to 83% increase in video Quality of Experience (QoE) in our video streaming system.

Published In

Proceedings of the ACM on Networking Volume 2, Issue CoNEXT1

PACMNET

March 2024

95 pages

EISSN:2834-5509

DOI:10.1145/3655593

Editors:
Marco Mellia, Politecnico di Torino, Italy
Peter Steenkiste, Carnegie Mellon University, United States

Copyright © 2024 Owner/Author.

This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States
