skip to main content
10.1145/2339530.2339723acmconferencesArticle/Chapter ViewAbstractPublication PageskddConference Proceedingsconference-collections
research-article

RolX: structural role extraction & mining in large graphs

Published: 12 August 2012 Publication History

Abstract

Given a network, intuitively two nodes belong to the same role if they have similar structural behavior. Roles should be automatically determined from the data, and could be, for example, "clique-members," "periphery-nodes," etc. Roles enable numerous novel and useful network-mining tasks, such as sense-making, searching for similar nodes, and node classification. This paper addresses the question: Given a graph, how can we automatically discover roles for nodes? We propose RolX (Role eXtraction), a scalable (linear in the number of edges), unsupervised learning approach for automatically extracting structural roles from general network data. We demonstrate the effectiveness of RolX on several network-mining tasks: from exploratory data analysis to network transfer learning. Moreover, we compare network role discovery with network community discovery. We highlight fundamental differences between the two (e.g., roles generalize across disconnected networks, communities do not); and show that the two approaches are complimentary in nature.

Supplementary Material

JPG File (307_w_talk_2.jpg)
MP4 File (307_w_talk_2.mp4)

References

[1]
H. Akaike. Information theory and an extension of the maximum likelihood principle. In Proc. of the 2nd Int'l Symp. on Information Theory, pages 267--281, 1972.
[2]
L. Akoglu, M. McGlohon, and C. Faloutsos. Oddball: Spotting anomalies in weighted graphs. In PAKDD, pages 410--421, 2010.
[3]
I. Bhattacharya, S. Godbole, S. Joshi, and A. Verma. Cross-guided clustering: Transfer of relevant supervision across domains for improved clustering. In ICDM, pages 41--50, 2009.
[4]
D. Cai, C. Zhang, and X. He. Unsupervised feature selection for multi-cluster data. In KDD, pages 333--342, 2010.
[5]
B. Cao, N. Liu, and Q. Yang. Transfer learning for collective link prediction in multiple heterogenous domains. In ICML, pages 159--166, 2010.
[6]
A. Clauset, M. Newman, and C. Moore. Finding community structure in very large networks. Phys. Rev. E, 70:066111, 2004.
[7]
W. Dai, O. Jin, G.-R. Xue, Q. Yang, and Y. Yu. Eigentransfer: a unified framework for transfer learning. In ICML, pages 193--200, 2009.
[8]
I. S. Dhillon and S. Sra. Generalized nonnegative matrix approximations with Bregman divergences. In NIPS, pages 283--290, 2005.
[9]
N. Eagle, A. Pentland, and D. Lazer. Inferring social network structure using mobile phone data. PNAS, 106(36):15274--15278, 2009.
[10]
J. Eggert and E. Korner. Sparse coding and NMF. In IJCNN, pages 2529--2533, 2004.
[11]
H. Fürstenau and M. Lapata. Semi-supervised semantic role labeling. In EACL, pages 220--228, 2009.
[12]
B. Gallagher and T. Eliassi-Rad. Leveraging label-independent features for classification in sparsely labeled networks: An empirical study. Lecture Notes in Computer Science, 5498:1--19, 2010.
[13]
B. Gallagher, H. Tong, T. Eliassi-Rad, and C. Faloutsos. Using ghost edges for classification in sparsely labeled networks. In KDD, pages 256--264, 2008.
[14]
K. Henderson, T. Eliassi-Rad, C. Faloutsos, L. Akoglu, L. Li, K. Maruhashi, B. A. Prakash, and H. Tong. Metric forensics: A multi-level approach for mining volatile graphs. In KDD, pages 163--172, 2010.
[15]
K. Henderson, B. Gallagher, L. Li, L. Akoglu, T. Eliassi-Rad, H. Tong, and C. Faloutsos. It's who you know: Graph mining using recursive structural features. In KDD, pages 663--671, 2011.
[16]
M. Herbster. Exploiting cluster-structure to predict the labeling of a graph. In ALT, pages 54--69, 2008.
[17]
D. Huffman. A method for the construction of minimum-redundancy codes. In Proc. of the IRE, pages 1098--1101, 1952.
[18]
D. D. Lee and H. S. Seung. Learning the parts of objects by non-negative matrix factorization. Nature, 401(6755):788--791, 1999.
[19]
R. N. Lichtenwalter, J. T. Lussier, and N. V. Chawla. New perspectives and methods in link prediction. In KDD, pages 243--252, 2010.
[20]
C.-J. Lin. Projected Gradient Methods for Nonnegative Matrix Factorization. Neural Computation, 19(10):2756--2779, 2007.
[21]
S. P. Lloyd. Least squares quantization in PCM. IEEE Trans. on Info. Theory, IT-28(2):129--137, 1982.
[22]
J. Max. Quantizing for minimum distortion. IRE Trans. on Information Theory, IT-6:7--12, 1960.
[23]
A. Mccallum, X. Wang, and A. Corrada-Emmanuel. Topic and role discovery in social networks with experiments on enron and academic email. JAIR, 30(1):249--272, 2007.
[24]
I. Molloy, N. Li, T. Li, Z. Mao, Q. Wang, and J. Lobo. Evaluating role mining algorithms. In SACMAT, pages 95--104, 2009.
[25]
M. E. J. Newman. Finding community structure in networks using the eigenvectors of matrices. Phys. Rev. E., 74:036104, 2006.
[26]
A. F. Patrick Doreian, Vladimir Batagelj. Generalized Blockmodeling. Cambridge University Press, 2005.
[27]
J. Rissanen. Modeling by shortest data description. Automatica, 14:465--471, 1978.
[28]
M. Somaiya, C. Jermaine, and S. Ranka. Mixture models for learning low-dimensional roles in high-dimensional data. In KDD, pages 909--918, 2010.
[29]
Z. Wang, Y. Song, and C. Zhang. Knowledge transfer on hybrid graph. In IJCAI, pages 1291--1296, 2009.
[30]
J. Weston, C. Leslie, E. Ie, D. Zhou, A. Elisseeff, and W. S. Noble. Semi-supervised protein classification using cluster kernels. Bioinformatics, 21:3241--3247, 2005.

Cited By

View all
  • (2024)Anomaly Detection in Blockchain Networks Using Unsupervised Learning: A SurveyAlgorithms10.3390/a1705020117:5(201)Online publication date: 9-May-2024
  • (2024)A novel hierarchical network-based approach to unveil the complexity of functional microbial genomeBMC Genomics10.1186/s12864-024-10692-625:1Online publication date: 14-Aug-2024
  • (2024)DP-GCN: Node Classification by Connectivity and Local Topology Structure on Real-World NetworkACM Transactions on Knowledge Discovery from Data10.1145/364946018:6(1-20)Online publication date: 12-Apr-2024
  • Show More Cited By

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Conferences
KDD '12: Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining
August 2012
1616 pages
ISBN:9781450314626
DOI:10.1145/2339530
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 12 August 2012

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. graph mining
  2. network classification
  3. sense-making
  4. similarity search
  5. structural role discovery

Qualifiers

  • Research-article

Conference

KDD '12
Sponsor:

Acceptance Rates

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)115
  • Downloads (Last 6 weeks)7
Reflects downloads up to 13 Sep 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Anomaly Detection in Blockchain Networks Using Unsupervised Learning: A SurveyAlgorithms10.3390/a1705020117:5(201)Online publication date: 9-May-2024
  • (2024)A novel hierarchical network-based approach to unveil the complexity of functional microbial genomeBMC Genomics10.1186/s12864-024-10692-625:1Online publication date: 14-Aug-2024
  • (2024)DP-GCN: Node Classification by Connectivity and Local Topology Structure on Real-World NetworkACM Transactions on Knowledge Discovery from Data10.1145/364946018:6(1-20)Online publication date: 12-Apr-2024
  • (2024)Unveiling Privacy Vulnerabilities: Investigating the Role of Structure in Graph DataProceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining10.1145/3637528.3672013(4059-4070)Online publication date: 25-Aug-2024
  • (2024)Graph Exploration With Embedding-Guided LayoutsIEEE Transactions on Visualization and Computer Graphics10.1109/TVCG.2023.323890930:7(3693-3708)Online publication date: Jul-2024
  • (2024)RD-GCN: A Role-Based Dynamic Graph Convolutional Network for Information Diffusion PredictionIEEE Transactions on Network Science and Engineering10.1109/TNSE.2024.340365211:5(4923-4937)Online publication date: Sep-2024
  • (2024)Representation Learning on Heterostructures via Heterogeneous Anonymous WalksIEEE Transactions on Neural Networks and Learning Systems10.1109/TNNLS.2023.323400535:7(9538-9552)Online publication date: Jul-2024
  • (2024)Anchor Link Prediction for Cross-Network Digital Forensics From Local and Global PerspectivesIEEE Transactions on Information Forensics and Security10.1109/TIFS.2024.336406619(3620-3635)Online publication date: 2024
  • (2024)Stability of Role Assignments in Modified Networks2024 4th Interdisciplinary Conference on Electrics and Computer (INTCEC)10.1109/INTCEC61833.2024.10603056(1-6)Online publication date: 11-Jun-2024
  • (2024)Dismantling Complex Networks with Graph Contrastive Learning and Multi-hop AggregationInformation Sciences10.1016/j.ins.2024.120780(120780)Online publication date: May-2024
  • Show More Cited By

View Options

Get Access

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media