research-article

Variational inference with graph regularization for image annotation

Published: 24 February 2011

Abstract

Image annotation is a typical area in which multiple types of attributes are associated with each individual image. To achieve better performance, it is important to build effective models that exploit prior knowledge. In this article, we extend graph regularization approaches to a more general case in which the regularization is imposed on the factorized variational distributions, instead of on the posterior distributions implicitly involved in EM-like algorithms. This makes the problem modeling more flexible: we can impose graph regularization on any factor in the problem domain wherever there are similarity constraints among the instances. We formulate the problem formally and show its geometrical background in manifold learning. We also design two practically effective algorithms and analyze their properties, such as convergence. Finally, we apply our approach to image annotation and show the performance improvement of our algorithm.
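
To make the idea of graph regularization on variational factors concrete, the following is a minimal sketch under stated assumptions, not the article's algorithm. It assumes each instance i has a discrete variational factor q_i (a probability vector over K hidden states) and that a symmetric, nonnegative similarity matrix W over instances is given. The names `laplacian_penalty`, `q`, and `W` are illustrative, and the squared Euclidean distance is one common choice; the article may use a different divergence. The sketch computes the standard graph Laplacian smoothness penalty 0.5 * sum_ij W_ij * ||q_i - q_j||^2, which is small when similar instances have similar factors.

```python
def laplacian_penalty(q, W):
    """Return 0.5 * sum_ij W[i][j] * ||q[i] - q[j]||^2.

    q: list of n probability vectors (hypothetical variational factors).
    W: n x n symmetric, nonnegative similarity matrix (assumed given).
    """
    n = len(q)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if W[i][j] == 0.0:
                continue  # unconnected pairs contribute nothing
            sq_dist = sum((a - b) ** 2 for a, b in zip(q[i], q[j]))
            total += W[i][j] * sq_dist
    return 0.5 * total


# Toy example: three images, two hidden topics.
# Images 0 and 1 are linked in the similarity graph; image 2 is isolated.
W = [[0.0, 1.0, 0.0],
     [1.0, 0.0, 0.0],
     [0.0, 0.0, 0.0]]

q_smooth = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]]  # linked images agree
q_rough = [[0.9, 0.1], [0.1, 0.9], [0.1, 0.9]]   # linked images disagree

penalty_smooth = laplacian_penalty(q_smooth, W)
penalty_rough = laplacian_penalty(q_rough, W)
```

In a regularized objective, a weighted term of this kind would typically be subtracted from the variational lower bound, so that maximizing the bound also favours factors that vary smoothly over the graph. The exact form used in the article may differ.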

Published In

ACM Transactions on Intelligent Systems and Technology, Volume 2, Issue 2
February 2011
175 pages
ISSN:2157-6904
EISSN:2157-6912
DOI:10.1145/1899412
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 24 February 2011
Accepted: 01 July 2010
Revised: 01 May 2010
Received: 01 February 2010
Published in TIST Volume 2, Issue 2

Author Tags

  1. Automatic image annotation
  2. Laplacian regularization
  3. graph regularization
  4. semantic indexing
  5. semi-supervised learning
  6. variational inference

Qualifiers

  • Research-article
  • Research
  • Refereed

Cited By

  • (2020) End-to-End Text-to-Image Synthesis with Spatial Constrains. ACM Transactions on Intelligent Systems and Technology 11:4 (1-19). DOI: 10.1145/3391709. Online publication date: 25-May-2020
  • (2015) Nonnegative Multiresolution Representation-Based Texture Image Classification. ACM Transactions on Intelligent Systems and Technology 7:1 (1-21). DOI: 10.1145/2738050. Online publication date: 7-Oct-2015
  • (2014) Differentiating Intended Sensory Outcome from Underlying Motor Actions in the Human Brain. The Journal of Neuroscience 34:46 (15446-15454). DOI: 10.1523/JNEUROSCI.5435-13.2014. Online publication date: 12-Nov-2014
  • (2012) Multiview Metric Learning with Global Consistency and Local Smoothness. ACM Transactions on Intelligent Systems and Technology 3:3 (1-22). DOI: 10.1145/2168752.2168767. Online publication date: 1-May-2012
