VMRF: View Matching Neural Radiance Fields

Authors:

Shijian Lu

MM '22: Proceedings of the 30th ACM International Conference on Multimedia

Pages 6579 - 6587

https://doi.org/10.1145/3503161.3548078

Published: 10 October 2022

Abstract

Neural Radiance Fields (NeRF) have demonstrated impressive performance in novel view synthesis by implicitly modelling 3D representations from multi-view 2D images. However, most existing studies train NeRF models with either reasonable camera pose initialization or manually crafted camera pose distributions, which are often unavailable or hard to acquire for real-world data. We design VMRF, a view matching NeRF that enables effective NeRF training without requiring prior knowledge of camera poses or camera pose distributions. VMRF introduces a view matching scheme that exploits unbalanced optimal transport to produce a feature transport plan for mapping a rendered image with a randomly initialized camera pose to the corresponding real image. With the feature transport plan as guidance, a novel pose calibration technique rectifies the initially randomized camera poses by predicting relative pose transformations between each pair of rendered and real images. Extensive experiments over a number of synthetic and real datasets show that the proposed VMRF outperforms the state-of-the-art qualitatively and quantitatively by large margins.
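
To make the idea of a feature transport plan concrete, the following is a minimal sketch of entropy-regularized unbalanced optimal transport between two small sets of feature vectors, solved with the usual scaling (Sinkhorn-like) iterations. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the feature extractor, the cost function, or the solver, so the cosine cost, the function names cosine_cost and unbalanced_sinkhorn, and the parameters eps (entropic regularization) and rho (marginal relaxation strength) are all hypothetical choices.

```python
import math


def cosine_cost(feats_a, feats_b):
    """Cost matrix C[i][j] = 1 - cosine similarity (an assumed cost)."""
    def unit(v):
        norm = math.sqrt(sum(x * x for x in v)) or 1.0  # guard zero vectors
        return [x / norm for x in v]

    ua = [unit(v) for v in feats_a]
    ub = [unit(v) for v in feats_b]
    return [[1.0 - sum(p * q for p, q in zip(a, b)) for b in ub] for a in ua]


def unbalanced_sinkhorn(cost, mass_a, mass_b, eps=0.1, rho=1.0, n_iters=200):
    """Entropic unbalanced OT with KL-relaxed marginals.

    Returns the transport plan P = diag(u) K diag(v), where
    K = exp(-cost / eps). A large rho approaches balanced transport;
    a small rho lets the plan discard or create mass.
    """
    n, m = len(mass_a), len(mass_b)
    kernel = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    power = rho / (rho + eps)  # exponent that softens the marginal constraints
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Row scaling update: depends only on the current v.
        for i in range(n):
            kv = sum(kernel[i][j] * v[j] for j in range(m))
            u[i] = (mass_a[i] / kv) ** power
        # Column scaling update: depends only on the freshly updated u.
        for j in range(m):
            ktu = sum(kernel[i][j] * u[i] for i in range(n))
            v[j] = (mass_b[j] / ktu) ** power
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


# Toy "rendered" and "real" features (hypothetical values).
rendered = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
real = [[0.0, 1.0], [1.0, 0.0]]
cost = cosine_cost(rendered, real)
plan = unbalanced_sinkhorn(cost, [1 / 3] * 3, [1 / 2] * 2)
for row in plan:
    print([round(x, 4) for x in row])
```

Each row of the plan says how much of one rendered feature is matched to each real feature. In this toy setup the first rendered feature sends most of its mass to the second real feature, which points in the same direction. Because the marginals are only softly enforced, the total transported mass need not equal the input mass, which is the property that distinguishes unbalanced from balanced transport.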

Supplementary Material

MP4 File (MM22-fp1374.mp4, 100.72 MB)

Presentation video for View Matching Neural Radiance Fields

Index Terms

VMRF: View Matching Neural Radiance Fields
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision

Published In

MM '22: Proceedings of the 30th ACM International Conference on Multimedia

October 2022

7537 pages

ISBN:9781450392037

DOI:10.1145/3503161

General Chairs:
João Magalhães, NOVA University of Lisbon, Portugal
Alberto del Bimbo, University of Florence, Italy
Shin'ichi Satoh, National Institute of Informatics, Japan
Nicu Sebe, University of Trento, Italy

Program Chairs:
Xavier Alameda-Pineda, Inria, Grenoble, France
Qin Jin, Renmin University of China, China
Vincent Oria, New Jersey Institute of Technology, USA
Laura Toni, University College London, UK

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

National Natural Science Foundation of China
Ministry of Education of Singapore

Conference

MM '22

Sponsor:

SIGMM

MM '22: The 30th ACM International Conference on Multimedia

October 10 - 14, 2022

Lisboa, Portugal

Acceptance Rates

Overall Acceptance Rate 995 of 4,171 submissions, 24%

Bibliometrics

Article Metrics

Total Citations: 13
Total Downloads: 268
Downloads (Last 12 months): 79
Downloads (Last 6 weeks): 2

Reflects downloads up to 14 Sep 2024

Cited By

Li D, Huang S, Lu Z, Duan X, Huang H. (2024). ST-4DGS: Spatial-Temporally Consistent 4D Gaussian Splatting for Efficient Dynamic Scene Rendering. ACM SIGGRAPH 2024 Conference Papers (1-11). Online publication date: 13-Jul-2024
https://dl.acm.org/doi/10.1145/3641519.3657520
Zhou K, Li W, Jiang N, Han X, Lu J. (2024). From NeRFLiX to NeRFLiX++: A General NeRF-Agnostic Restorer Paradigm. IEEE Transactions on Pattern Analysis and Machine Intelligence 46:5 (3422-3437). Online publication date: May-2024
https://doi.org/10.1109/TPAMI.2023.3343395
Xu B, Wang Y, Zheng Y, Le X. (2024). Self-Calibrated Neural Implicit 3D Reconstruction. 2024 IEEE 19th Conference on Industrial Electronics and Applications (ICIEA) (1-6). Online publication date: 5-Aug-2024
https://doi.org/10.1109/ICIEA61579.2024.10664952
Zhang J, Zhan F, Xu M, Lu S, Xing E. (2024). FreGS: 3D Gaussian Splatting with Progressive Frequency Regularization. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (21424-21433). Online publication date: 16-Jun-2024
https://doi.org/10.1109/CVPR52733.2024.02024
Engelhardt A, Raj A, Boss M, Zhang Y, Kar A, Li Y, Sun D, Brualla R, Barron J, Lensch H, Jampani V. (2024). SHINOBI: Shape and Illumination using Neural Object Decomposition via BRDF Optimization In-the-wild. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (19636-19646). Online publication date: 16-Jun-2024
https://doi.org/10.1109/CVPR52733.2024.01857
Levy A, Matthews M, Sela M, Wetzstein G, Lagun D. (2024). MELON: NeRF with Unposed Images in SO(3). 2024 International Conference on 3D Vision (3DV) (354-364). Online publication date: 18-Mar-2024
https://doi.org/10.1109/3DV62453.2024.00084
Li D, Huang S, Shen T, Huang H, El Saddik A, Mei T, Cucchiara R, Bertini M, Tobon Vallejo D, Atrey P, Hossain M. (2023). Dynamic View Synthesis with Spatio-Temporal Feature Warping from Sparse Views. Proceedings of the 31st ACM International Conference on Multimedia (1565-1576). Online publication date: 26-Oct-2023
https://dl.acm.org/doi/10.1145/3581783.3612419
Wang C, Sun J, Liu L, Wu C, Shen Z, Wu D, Dai Y, Zhang L, El Saddik A, Mei T, Cucchiara R, Bertini M, Tobon Vallejo D, Atrey P, Hossain M. (2023). Digging into Depth Priors for Outdoor Neural Radiance Fields. Proceedings of the 31st ACM International Conference on Multimedia (1221-1230). Online publication date: 26-Oct-2023
https://dl.acm.org/doi/10.1145/3581783.3612306
Maggio D, Abate M, Shi J, Mario C, Carlone L. (2023). Loc-NeRF: Monte Carlo Localization using Neural Radiance Fields. 2023 IEEE International Conference on Robotics and Automation (ICRA) (4018-4025). Online publication date: 29-May-2023
https://doi.org/10.1109/ICRA48891.2023.10160782
Cheng Z, Esteves C, Jampani V, Kar A, Maji S, Makadia A. (2023). LU-NeRF: Scene and Pose Estimation by Synchronizing Local Unposed NeRFs. 2023 IEEE/CVF International Conference on Computer Vision (ICCV) (18266-18275). Online publication date: 1-Oct-2023
https://doi.org/10.1109/ICCV51070.2023.01679
