Abstract
Nonnegative matrix factorization and its graph regularized extensions have received significant attention in machine learning and data mining. However, existing approaches are sensitive to outliers and noise due to the utilization of the squared loss function in measuring the quality of graph regularization and data reconstruction. In this paper, we present a novel robust graph regularized NMF model (RGNMF) to approximate the data matrix for clustering. Our assumption is that there may exist some entries of the data corrupted arbitrarily, but the corruption is sparse. To address this problem, an error matrix is introduced to capture the sparse corruption. With this sparse outlier matrix, a robust factorization result could be obtained since a much cleaned data could be reconstructed. Moreover, the \(\ell _{1}\)-norm function is used to alleviate the influence of unreliable regularization which is incurred by unexpected graphs. That is, the sparse error matrix alleviates the impact of noise and outliers, and the \(\ell _{1}\)-norm function leads to a faithful regularization since the influence of the unreliable regularization errors can be reduced. Thus, RGNMF is robust to unreliable graphs and noisy data. In order to solve the optimization problem of our method, an iterative updating algorithm is proposed and its convergence is also guaranteed theoretically. Experimental results show that the proposed method consistently outperforms many state-of-the-art methods.
Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.References
Bach F, Jenatton R, Mairal J, Obozinski G (2012) Optimization with sparsity-inducing penalties. Found Trends Mach Learn 4(1):1–106
Belkin M, Niyogi P, Sindhwani V (2006) Manifold regularization: a geometric framework for learning from labeled and unlabeled examples. J Mach Learn Res 7:2399–2434
Blake C, Merz CJ (1998) UCI repository of machine learning databases. Department of Information and Computer Sciences, University of California. http://www.ics.uci.edu/~mlearn/MLRepository.html
Cai D, He X, Han J, Huang TS (2011) Graph regularized nonnegative matrix factorization for data representation. IEEE Trans Pattern Anal Mach Intell 33(8):1548–1560
Cai D, He X, Wu X, Han J (2008) Non-negative matrix factorization on manifold. yIn: IEEE International Conference on Data Mining, pp. 63–72
Candès EJ, Li X, Ma Y, Wright J (2011) Robust principal component analysis? J ACM 58(3):11
Chen J, Wu L, Audhkhasi K, Kingsbury B, Ramabhadrari B (2016) Efficient one-vs-one kernel ridge regression for speech recognition. yIn: IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 2454–2458
Ding C, Li T, Jordan MI (2010) Convex and semi-nonnegative matrix factorizations. IEEE Trans Pattern Anal Mach Intell 32(1):45–55
Ding C, Li T, Peng W, Park H (2006) Orthogonal nonnegative matrix t-factorizations for clustering. In: SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 126–135
Du L, Shen YD (2013) Towards robust co-clustering. In: International Joint Conference on Artificial Intelligence, pp. 1317–1322
Gu Q, Ding C, Han J (2011) On trivial solution and scale transfer problems in graph regularized nmf. In: International Joint Conference on Artificial Intelligence, vol. 22, p. 1288
Gu Q, Zhou J (2009) Local learning regularized nonnegative matrix factorization. In: International Joint Conference on Artificial Intelligence
Hampel FR, Ronchetti EM, Rousseeuw PJ, Stahel WA (2011) Robust statistics: the approach based on influence functions, vol 114. John Wiley and Sons
Hoyer PO (2004) Non-negative matrix factorization with sparseness constraints. J Mach Learn Res 5:1457–1469
Huang J, Nie F, Huang H, Ding C (2014) Robust manifold nonnegative matrix factorization. ACM Trans Knowl Discov Data 8(3):11
Huang S, Xu Z, Wang F (2017) Nonnegative matrix factorization with adaptive neighbors. In: IEEE International Joint Conference on Neural Networks, pp. 486–493
Karypis G (2002) CLUTO-a clustering toolkit. Technical Report 02-017. Department of Computer Science, University of Minnesota. http://www.cs.umn.edu/cluto
Kong D, Ding C, Huang H (2011) Robust nonnegative matrix factorization using l21-norm. In: ACM International Conference on Information and Knowledge Management, pp. 673–682
Le Q, Sarlós T, Smola A (2013) Fastfood-approximating kernel expansions in loglinear time. In: International Conference on Machine Learning, vol. 85
Lee DD, Seung HS (2001) Algorithms for non-negative matrix factorization. In: Proceedings of Advances in Neural Information Processing Systems, pp. 556–562
Liu H, Li X, Zheng X (2013) Solving non-negative matrix factorization by alternating least squares with a modified strategy. Data Min Knowl Disc 26(3):435–451
Liu Y, Jiao L, Shang F (2013) A fast tri-factorization method for low-rank matrix recovery and completion. Pattern Recogn 46(1):163–173
Long M, Wang J, Ding G, Shen D, Yang Q (2014) Transfer learning with graph co-regularization. IEEE Trans Knowl Data Eng 26(7):1805–1818
Pauca VP, Piper J, Plemmons RJ (2006) Nonnegative matrix factorization for spectral data analysis. Linear Algebra Appl 416(1):29–47
Pauca VP, Shahnaz F, Berry MW, Plemmons RJ (2004) Text mining using non-negative matrix factorizations. In: SIAM International Conference on Data Mining, pp. 452–456
Seung HS, Lee DD (2000) The manifold ways of perception. Science 290(5500):2268–2269
Shang F, Jiao L, Wang F (2012) Graph dual regularization non-negative matrix factorization for co-clustering. Pattern Recogn 45(6):2237–2250
Sun M, Van Hamme H (2012) Large scale graph regularized non-negative matrix factorization with \(\ell _1\) normalization based on kullback-leibler divergence. IEEE Trans Signal Process 60(7):3876–3880
Tenenbaum JB, De Silva V, Langford JC (2000) A global geometric framework for nonlinear dimensionality reduction. Science 290(5500):2319–2323
Wang F, Li T, Wang X, Zhu S, Ding C (2011) Community discovery using nonnegative matrix factorization. Data Min Knowl Disc 22(3):493–521
Wang F, Zhang C, Li T (2009) Clustering with local and global regularization. IEEE Trans Knowl Data Eng 21(12):1665–1678
Wang H, Nie F, Huang H, Makedon F (2011) Fast nonnegative matrix tri-factorization for large-scale data co-clustering. In: International Joint Conference on Artificial Intelligence, pp. 1553–1558
Wu L, Yen IE, Chen J, Yan R (2016) Revisiting random binning features: Fast convergence and strong parallelizability. In: SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1265–1274
Yoo J, Choi S (2010) Orthogonal nonnegative matrix tri-factorization for co-clustering: multiplicative updates on stiefel manifolds. Inf Process Manag 46(5):559–570
Zhang H, Zha ZJ, Yan S, Wang M, Chua TS (2012) Robust non-negative graph embedding: Towards noisy data, unreliable graphs, and noisy labels. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 2464–2471
Acknowledgements
This paper was in part supported by Grants from the Natural Science Foundation of China (No. 61572111), a 985 Project of UESTC (No.A1098531023601041), a Fundamental Research Fund for the Central Universities of China (No. ZYGX2016Z003) and the National High Technology Research and Development Program of China (863 Program) (No. 2015AA015408).
Author information
Authors and Affiliations
Corresponding author
Additional information
Responsible editor: Fei Wang.
Rights and permissions
About this article
Cite this article
Huang, S., Wang, H., Li, T. et al. Robust graph regularized nonnegative matrix factorization for clustering. Data Min Knowl Disc 32, 483–503 (2018). https://doi.org/10.1007/s10618-017-0543-9
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10618-017-0543-9