DOI: 10.1145/3503161.3548078
Research article

VMRF: View Matching Neural Radiance Fields

Published: 10 October 2022

Abstract

Neural Radiance Fields (NeRF) have demonstrated impressive performance in novel view synthesis by implicitly modelling 3D representations from multi-view 2D images. However, most existing studies train NeRF models with either a reasonable camera pose initialization or manually crafted camera pose distributions, which are often unavailable or hard to acquire for real-world data. We design VMRF, a view matching NeRF that enables effective NeRF training without requiring prior knowledge of camera poses or camera pose distributions. VMRF introduces a view matching scheme that exploits unbalanced optimal transport to produce a feature transport plan mapping an image rendered from a randomly initialized camera pose to the corresponding real image. With the feature transport plan as guidance, a novel pose calibration technique rectifies the initially randomized camera poses by predicting the relative pose transformation between each pair of rendered and real images. Extensive experiments over a number of synthetic and real datasets show that the proposed VMRF outperforms the state-of-the-art qualitatively and quantitatively by large margins.
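
To make the idea of a feature transport plan concrete, the following is a minimal sketch of entropy-regularized unbalanced optimal transport between two sets of image features, using the standard scaling (Sinkhorn-like) iteration with KL-relaxed marginals. It is not the authors' implementation: the function names `unbalanced_sinkhorn` and `cosine_cost`, the cosine-distance cost, the uniform feature masses, and the parameters `eps` (entropic regularization) and `rho` (marginal relaxation) are all assumptions chosen for illustration.

```python
import numpy as np

def cosine_cost(feat_x, feat_y):
    """Pairwise cosine distance between two feature sets (rows are features)."""
    x = feat_x / np.linalg.norm(feat_x, axis=1, keepdims=True)
    y = feat_y / np.linalg.norm(feat_y, axis=1, keepdims=True)
    return 1.0 - x @ y.T

def unbalanced_sinkhorn(cost, a, b, eps=0.05, rho=1.0, n_iters=200):
    """Scaling iterations for entropic unbalanced OT (hypothetical helper).

    cost: (n, m) cost matrix; a: (n,) source masses; b: (m,) target masses.
    The marginal constraints are relaxed with a KL penalty of weight rho,
    so the plan's row and column sums only approximately match a and b.
    """
    K = np.exp(-cost / eps)          # Gibbs kernel
    u = np.ones_like(a)
    v = np.ones_like(b)
    exponent = rho / (rho + eps)     # damping caused by the KL relaxation
    for _ in range(n_iters):
        u = (a / (K @ v + 1e-16)) ** exponent
        v = (b / (K.T @ u + 1e-16)) ** exponent
    return u[:, None] * K * v[None, :]   # transport plan, shape (n, m)

# Toy stand-ins for features of a rendered image and a real image.
rng = np.random.default_rng(0)
rendered_feats = rng.normal(size=(6, 8))
real_feats = rng.normal(size=(5, 8))

plan = unbalanced_sinkhorn(
    cosine_cost(rendered_feats, real_feats),
    np.full(6, 1.0 / 6),
    np.full(5, 1.0 / 5),
)
```

Each entry `plan[i, j]` can be read as how much of rendered feature `i` is matched to real feature `j`. Because the marginals are relaxed, features without a good counterpart can carry less mass, which is the usual motivation for the unbalanced formulation when two views overlap only partially.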

Supplementary Material

MP4 File (MM22-fp1374.mp4)
Presentation video for View Matching Neural Radiance Fields


    Published In

    MM '22: Proceedings of the 30th ACM International Conference on Multimedia
    October 2022
    7537 pages
    ISBN:9781450392037
    DOI:10.1145/3503161
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. computer vision
    2. deep learning
    3. neural radiance field
    4. optimal transport
    5. pose calibration
    6. view matching

    Qualifiers

    • Research-article

    Conference

    MM '22

    Acceptance Rates

    Overall Acceptance Rate 995 of 4,171 submissions, 24%
