research-article

Data-driven visual similarity for cross-domain image matching

Authors: Abhinav Shrivastava, Tomasz Malisiewicz, Alexei A. Efros

ACM Transactions on Graphics (TOG), Volume 30, Issue 6

Pages 1-10

https://doi.org/10.1145/2070781.2024188

Published: 12 December 2011

Abstract

The goal of this work is to find visually similar images even if they appear quite different at the raw pixel level. This task is particularly important for matching images across visual domains, such as photos taken over different seasons or lighting conditions, paintings, hand-drawn sketches, etc. We propose a surprisingly simple method that estimates the relative importance of different features in a query image based on the notion of "data-driven uniqueness". We employ standard tools from discriminative object detection in a novel way, yielding a generic approach that does not depend on a particular image representation or a specific visual domain. Our approach shows good performance on a number of difficult cross-domain visual tasks, e.g., matching paintings or sketches to real photographs. The method also allows us to demonstrate novel applications such as Internet re-photography and painting2gps. While at present the technique is too computationally intensive to be practical for interactive image retrieval, we hope that some of the ideas will eventually become applicable to that domain as well.
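
The abstract does not spell out how "data-driven uniqueness" is computed, so the following is only a minimal sketch under stated assumptions: that per-feature importance can be approximated by training a linear SVM with the query as the single positive example and a pool of unrelated images as negatives, and that the learned weight vector is then used to score database images. The function names, the hyperparameters, and the toy 6-D feature vectors are all hypothetical; a real system would use image descriptors rather than hand-made vectors.

```python
import numpy as np


def learn_query_weights(query, negatives, lam=0.01, lr=0.05, n_iters=2000):
    """Fit a linear SVM (hinge loss, L2 penalty) with one positive: the query.

    Hypothetical helper, not the authors' code. Returns (w, b); a large |w[j]|
    suggests feature j helps separate the query from the negative pool.
    """
    query = np.asarray(query, dtype=float)
    negatives = np.asarray(negatives, dtype=float)
    w = np.zeros(query.shape[0])
    b = 0.0
    for t in range(n_iters):
        step = lr / (1.0 + 0.01 * t)  # slowly decaying step size
        # The hinge loss is active whenever an example is inside the margin.
        pos_active = float(w @ query + b < 1.0)
        neg_active = (negatives @ w + b > -1.0).astype(float)
        # Subgradient of: lam/2 * |w|^2 + hinge(query) + mean hinge(negatives)
        grad_w = lam * w - pos_active * query + negatives.T @ neg_active / len(negatives)
        grad_b = -pos_active + neg_active.mean()
        w -= step * grad_w
        b -= step * grad_b
    return w, b


def rank_by_learned_similarity(w, b, database):
    """Return (indices sorted from best to worst match, raw scores)."""
    scores = np.asarray(database, dtype=float) @ w + b
    return np.argsort(-scores), scores


# Toy "visual world" (assumption): features 0-2 vary a lot across images,
# features 3-5 are almost always near zero, so they are rare when present.
rng = np.random.default_rng(0)
negatives = np.hstack([
    rng.normal(0.0, 1.0, size=(200, 3)),
    rng.normal(0.0, 0.05, size=(200, 3)),
])
query = np.array([0.5, -0.3, 0.2, 1.0, 1.0, 1.0])

w, b = learn_query_weights(query, negatives)

database = np.array([
    [0.5, -0.3, 0.2, 0.0, 0.0, 0.0],   # same common features, rare ones missing
    [-1.0, 1.2, -0.8, 0.9, 1.1, 1.0],  # different common features, rare ones shared
])
order, scores = rank_by_learned_similarity(w, b, database)
print("learned weights:", np.round(w, 2))
print("ranking (best first):", order)
```

In this toy setup, plain Euclidean distance prefers the first database item, whereas the learned weights are expected to prefer the second, which shares the query's rare features. Solving one such classifier per query is also consistent with the abstract's remark that the technique is too expensive for interactive retrieval.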

Index Terms

Data-driven visual similarity for cross-domain image matching
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision representations
        Image representations
      2. Computer vision tasks
        Scene understanding

Published In

ACM Transactions on Graphics Volume 30, Issue 6

December 2011

678 pages

ISSN:0730-0301

EISSN:1557-7368

DOI:10.1145/2070781

Copyright © 2011 ACM.

Publisher

Association for Computing Machinery

New York, NY, United States

Funding Sources

Office of Naval Research
