skip to main content
research-article

Multi-relation Graph Summarization

Published: 09 March 2022 Publication History

Abstract

Graph summarization is beneficial in a wide range of applications, such as visualization, interactive and exploratory analysis, approximate query processing, reducing the on-disk storage footprint, and graph processing in modern hardware. However, the bulk of the literature on graph summarization surprisingly overlooks the possibility of having edges of different types. In this article, we study the novel problem of producing summaries of multi-relation networks, i.e., graphs where multiple edges of different types may exist between any pair of nodes. Multi-relation graphs are an expressive model of real-world activities, in which a relation can be a topic in social networks, an interaction type in genetic networks, or a snapshot in temporal graphs.
The first approach that we consider for multi-relation graph summarization is a two-step method based on summarizing each relation in isolation, and then aggregating the resulting summaries in some clever way to produce a final unique summary. In doing this, as a side contribution, we provide the first polynomial-time approximation algorithm based on the k-Median clustering for the classic problem of lossless single-relation graph summarization.
Then, we demonstrate the shortcomings of these two-step methods, and propose holistic approaches, both approximate and heuristic algorithms, to compute a summary directly for multi-relation graphs. In particular, we prove that the approximation bound of k-Median clustering for the single relation solution can be maintained in a multi-relation graph with proper aggregation operation over adjacency matrices corresponding to its multiple relations. Experimental results and case studies (on co-authorship networks and brain networks) validate the effectiveness and efficiency of the proposed algorithms.

References

[1]
K. J. Ahn, S. Guha, and A. McGregor. 2012. Analyzing graph structure via linear measurements. In Proceedings of the SODA.
[2]
K. J. Ahn, S. Guha, and A. McGregor. 2012. Graph sketches: Sparsification, spanners, and subgraphs. In Proceedings of the PODS.
[3]
S. Bandyopadhyay, M. Mehta, D. Kuo, M.-K. Sung, R. Chuang, E. J. Jaehnig, B. Bodenmiller, K. Licon, W. Copeland, M. Shales, D. Fiedler, J. Dutkowski, A. Guénolé, H. van Attikum, K. M. Shokat, R. D. Kolodner, W.-K. Huh, R. Aebersold, M.-C. Keogh, N. J. Krogan, and T. Ideker. 2010. Rewiring of genetic networks in response to DNA damage. Science 330, 6009 (2010), 1385-1389.
[4]
N. Bansal, A. Blum, and Chawla S.2004. Correlation clutering. Machine Learning 56 (2004), 89–113.
[5]
M. A. Beg, M. Ahmad, A. Zaman, and I. Khan. 2018. Scalable approximation algorithm for graph summarization. In Proceedings of the PAKDD.
[6]
A. A. Benczúr and D. R. Karger. 2015. Randomized approximation schemes for cuts and flows in capacitated graphs. SIAM Journal on Computing 44, 2 (2015), 290–319.
[7]
M. Besta and T. Hoefler. 2018. Survey and taxonomy of lossless graph compression and space-efficient graph representations. arXiv : 1806.01799 (2018). Retrieved from https://arxiv.org/abs/1806.01799
[8]
B. Boden, S. Günnemann, H. Hoffmann, and T. Seidl. 2012. Mining coherent subgraphs in multi-layer graphs with edge labels. In Proceedings of the KDD.
[9]
P. Boldi, M. Rosa, M. Santini, and S. Vigna. 2011. Layered label propagation: A multiresolution coordinate-free ordering for compressing social Networks. In Proceedings of the WWW.
[10]
P. Boldi and S. Vigna. 2004. The webgraph framework I: Compression techniques. In Proceedings of the WWW.
[11]
N. R. Brisaboa, S. Ladra, and G. Navarro. 2009. k2-trees for compact web graph representation. In Proceedings of the SPIRE.
[12]
A. Z. Broder, M. Charikar, A. M. Frieze, and M. Mitzenmacher. 2000. Min-wise independent permutations. Journal of Computer and System Sciences 60, 3 (2000), 630–659.
[13]
G. Buehrer and K. Chellapilla. 2008. A scalable pattern mining approach to web graph compression with communities. In Proceedings of the WSDM.
[14]
A. Cardillo, J. Gómez-Gardeñes, M. Zanin, M. Romance, D. Papo, F. del Pozo, and S. Boccaletti. 2012. Emergence of network features from multiplexity. Scientific Reports 3, 1 (2013), 1–6.
[15]
M. Charikar, V. Guruswami, and A. Wirth. 2005. Clustering with qualitative information. Journal of Computer and System Sciences 71, 3 (2005), 360–383.
[16]
C. Chen, C. X. Lin, M. Fredrikson, M. Christodorescu, X. Yan, and J. Han. 2009. Mining graph patterns efficiently via randomized summaries. PVLDB 2, 1 (2009), 742–753.
[17]
C. Chen, C. X. Lin, M. Fredrikson, M. Christodorescu, X. Yan, and J. Han. 2010. Mining large information networks by graph summarization. In Proceedings of the Link Mining: Models, Algorithms, and Applications. 475–501.
[18]
C. Chen, X. Yan, F. Zhu, J. Han, and P. S. Yu. 2008. Graph OLAP: Towards online analytical processing on graphs. In Proceedings of the ICDM.
[19]
F. Chierichetti, R. Kumar, S. Lattanzi, M. Mitzenmacher, A. Panconesi, and P. Raghavan. 2009. On compressing social networks. In Proceedings of the KDD.
[20]
Y. Choi and W. Szpankowski. 2012. Compression of graphical structures: Fundamental limits, algorithms, and experiments. IEEE Transactions on Information Theory 58, 2 (2012), 620–638.
[21]
D. J. Cook and L. B. Holder. 1994. Substructure discovery using minimum description length and background knowledge. Journal of Artificial Intelligence Research 1, 1 (1994), 231–255.
[22]
G. Cormode and S. Muthukrishnan. 2005. Space efficient mining of multigraph streams. In Proceedings of the PODS.
[23]
M. Coscia, G. Rossetti, D. Pennacchioli, D. Ceccarelli, and F. Giannotti. 2013. “You know because I Know”: A multidimensional network approach to human resources problem. In Proceedings of the ASONAM.
[24]
C. Craddock, Y. Benhajali, C. Chu, F. Chouinard, A. Evans, A. Jakab, B. S. Khundrakpam, J. D. Lewis, Q. Li, M. Miham, C. Yan, and P. Bellec. 2013. The neuro bureau preprocessing initiative: Open sharing of preprocessed neuroimaging data and derivatives. Frontiers in Neuroinformatics 41 (2013).
[25]
M. E. Dickison, M. Magnani, and L. Rossi. 2016. Multilayer Social Networks. Cambridge University Press.
[26]
W. Fan, J. Li, X. Wang, and Y. Wu. 2012. Query preserving graph compression. In Proceedings of the SIGMOD.
[27]
J. Feigenbaum, S. Kannan, A. McGregor, S. Suri, and J. Zhang. 2008. Graph distances in the data-stream model. SIAM Journal on Computing 38, 5 (2008), 1709–1727.
[28]
E. Galimberti, F. Bonchi, and F. Gullo. 2017. Core decomposition and densest subgraph in multilayer Networks. In Proceedings of the CIKM.
[29]
Edoardo Galimberti, Francesco Bonchi, Francesco Gullo, and Tommaso Lanciano. 2020. Core decomposition in multilayer networks: Theory, algorithms, and applications. ACM Transactions on Knowledge Discovery from Data 14, 1 (2020), 11:1–11:40.
[30]
A. Gionis, H. Mannila, and P. Tsaparas. 2007. Clustering aggregation. TKDD 1 (2007).
[31]
A. Gionis and C. E. Tsourakakis. 2015. Dense subgraph discovery: KDD 2015 tutorial. In Proceedings of the KDD.
[32]
X. Gou, L. Zou, C. Zhao, and T. Yang. 2019. Fast and accurate graph stream summarization. In Proceedings of the ICDE.
[33]
N. Hassanlou, M. Shoaran, and A. Thomo. 2013. Probabilistic graph summarization. In Proceedings of the WAIM(Lecture Notes in Computer Science, Vol. 7923). Springer.
[34]
P. Hu and W. C. Lau. 2013. A survey and taxonomy of graph sampling. arXiv: 1308.5865. Retrieved from https://arxiv.org/abs/1308.5865
[35]
N. Jayaram, A. Khan, C. Li, X. Yan, and R. Elmasri. 2015. Querying knowledge graphs by example entity tuples. IEEE IEEE Transactions on Knowledge and Data Engineering 27, 10 (2015), 2797–2811.
[36]
B. Jin, C. Gao, X. He, D. Jin, and Y. Li. 2020. Multi-behavior recommendation with graph convolutional networks. In Proceedings of the SIGIR.
[37]
D. Jin, R. A. Rossi, E. Koh, S. Kim, A. Rao, and D. Koutra. 2019. Latent network summarization: Bridging network embedding and summarization. In Proceedings of the KDD.
[38]
k. Lee, H. Jo, J. Ko, S. Lim, and K. Shin. 2020. SSumM: Sparse summarization of massive graphs. In Proceedings of the KDD.
[39]
U. Kang and C. Faloutsos. 2011. Beyond ’caveman communities’: Hubs and spokes for graph compression and mining. In Proceedings of the ICDM.
[40]
U. Kang, H. Tong, J. Sun, C.-Y. Lin, and C. Faloutsos. 2011. GBASE: A scalable and general graph management system. In Proceedings of the KDD.
[41]
G. Karypis and V. Kumar. 1995. Analysis of multilevel graph partitioning. In Proceedings of the Supercomputing.
[42]
X. Ke, A. Khan, and G. Cong. 2018. Finding seeds and relevant tags jointly: For targeted influence maximization in social networks. In Proceedings of the SIGMOD.
[43]
A. Khan and C. C. Aggarwal. 2017. Toward query-friendly compression of rapid graph streams. Social Network Analysis and Mining 7, 1 (2017), 23:1–23:19.
[44]
A. Khan, S. S. Bhowmick, and F. Bonchi. 2017. Summarizing static and dynamic big graphs. PVLDB 10, 12 (2017), 1981–1984.
[45]
K. U. Khan, W. Nawaz, and Y.-K. Lee. 2014. Set-based unified approach for attributed graph summarization. In Proceedings of the BDCLOUD.
[46]
D. Koutra, U. Kang, J. Vreeken, and C. Faloutsos. 2014. VOG: Summarizing and understanding large graphs. In Proceedings of the SDM.
[47]
D. Koutra, U. Kang, J. Vreeken, and C. Faloutsos. 2015. Summarizing and understanding large graphs. Statistical Analysis and Data Mining 8, 3 (2015), 183–202.
[48]
D. Koutra, J. Vreeken, and F. Bonchi. 2018. Summarizing graphs at multiple scales: New trends. In Proceedings of the ICDM.
[49]
K. A. Kumar and P. Efstathopoulos. 2018. Utility-driven graph summarization. PVLDB 12, 4 (2018), 335–347.
[50]
K. LeFevre and E. Terzi. 2010. GraSS: Graph structure summarization. In Proceedings of the SDM.
[51]
S.-D. Lin, M.-Y. Yeh, and C.-T. Li. 2013. Sampling and summarization for social networks. In Proceedings of the SDM.
[52]
X. Liu, Y. Tian, Q. He, W.-C. Lee, and J. McPherson. 2014. Distributed graph summarization. In Proceedings of the CIKM.
[53]
Y. Liu, T. Safavi, A. Dighe, and D. Koutra. 2018. Graph summarization methods and applications: A survey. ACM Computing Surveys 51, 3 (2018), 62:1–62:34.
[54]
M. Mao, J. Lu, G. Zhang, and J. Zhang. 2017. Multirelational social recommendations via multigraph ranking. IEEE Transactions on Cybernetics 47, 12 (2017), 4049–4061.
[55]
H. Maserrat and J. Pei. 2010. Neighbor query friendly compression of social networks. In Proceedings of the KDD.
[56]
H. Maserrat and J. Pei. 2012. Community preserving lossy compression of social networks. In Proceedings of the ICDM.
[57]
S. Navlakha, R. Rastogi, and N. Shrivastava. 2008. Graph summarization with bounded error. In Proceedings of the SIGMOD.
[58]
M. E. J. Newman and M. Girvan. 2004. Finding and evaluating community structure in networks. Physical Review E 69, 2 (2004), 026113.
[59]
D. L. Phuoc, H. N. M. Quoc, H. N. Quoc, T. T. Nhat, and M. Hauswirth. 2016. The graph of things: A step towards the live knowledge graph of connected things. Journal of Web Semantics 37–38 (2016), 25–35. https://www.sciencedirect.com/science/article/abs/pii/S1570826816000196.
[60]
M. Purohit, B. A. Prakash, C. Kang, Y. Zhang, and V. S. Subrahmanian. 2014. Fast influence-based coarsening for large networks. In Proceedings of the KDD.
[61]
Q. Qu, S. Liu, C. S. Jensen, F. Zhu, and C. Faloutsos. 2014. Interestingness-driven diffusion process summarization in dynamic networks. In Proceedings of the ECML PKDD.
[62]
S. Raghavan and H. Garcia-Molina. 2003. Representing web graphs. In Proceedings of the ICDE.
[63]
M. Riondato, D. García-Soriano, and F. Bonchi. 2014. Graph summarization with quality guarantees. In Proceedings of the ICDM.
[64]
Matteo Riondato, David García-Soriano, and Francesco Bonchi. 2017. Graph summarization with quality guarantees. Data Mining and Knowledge Discovery 31, 2 (2017), 314–349.
[65]
J. Rissanen. 1978. Modelling by the shortest data description. Automatica 14, 5 (1978), 465–471.
[66]
R. A. Rossi and R. Zhou. 2018. GraphZIP: A clique-based sparse graph compression method. Journal of Big Data 5, 1 (2018), 247–256.
[67]
B.-S. Seah, S. S. Bhowmick, and C. F. Dewey Jr. 2014. DiffNet: Automatic differential functional summarization of dE-MAP networks. Methods 63, 3 (2014), 1–14.
[68]
B.-S. Seah, S. S. Bhowmick, C. F. Dewey Jr, and H. Yu. 2012. FUSE: A profit maximization approach for functional summarization of biological networks. BMC Bioinformatics 13 (2012).
[69]
N. Shah, D. Koutra, T. Zou, B. Gallagher, and C. Faloutsos. 2015. TimeCrunch: Interpretable dynamic graph summarization. In Proceedings of the KDD.
[70]
L. Shi, S. Sun, Y. Xuan, Y. Su, H. Tong, S. Ma, and Y. Chen. 2016. TOPIC: Toward perfect influence graph summarization. In Proceedings of the ICDE.
[71]
K. Shin, A. Ghoting, M. Kim, and H. Raghavan. 2019. SWeG: Lossless and lossy summariation of web-scale graphs. In Proceedings of the WWW.
[72]
M. Stella, C. S. Andreazzi, S. Selakovic, A. Goudarzi, and A. Antonioni. 2017. Parasite spreading in spatial ecological multiplex networks. Journal of Complex Networks 5, 3 (2017), 486–511.
[73]
N. Tang, Q. Chen, and P. Mitra. 2016. Graph stream summarization: From big bang to big crunch. In Proceedings of the SIGMOD.
[74]
R. L. Thorndike. 1953. Who belongs in the family?Psychometrika 18 (1953), 267–276.
[75]
Y. Tian, R. A. Hankins, and J. M. Patel. 2008. Efficient aggregation for graph summarization. In Proceedings of the SIGMOD.
[76]
Y. Tian and J. M. Patel. 2010. Interactive graph summarization. In Proceedings of the Link Mining: Models, Algorithms, and Applications. 389–409.
[77]
H. Toivonen, F. Zhou, A. Hartikainen, and A. Hinkka. 2011. Compression of weighted graphs. In Proceedings of the KDD.
[78]
I. Tsalouchidou, G. De Francisci Morales, F. Bonchi, and R. Baeza-Yates. 2016. Scalable dynamic graph summarization. In Proceedings of the BigData.
[79]
S. White and P. Smyth. 2005. A spectral clustering approach to finding communities in graph. In Proceedings of the SDM.
[80]
Y. Wu, S. Yang, M. Srivatsa, A. Iyengar, and X. Yan. 2013. Summarizing answer graphs induced by keyword queries. PVLDB 6, 14 (2013), 1774–1785.
[81]
L. Xia, C. Huang, Y. Xu, P. Dai, X. Zhang, H. Yang, J. Pei, and L. Bo. 2021. Knowledge-enhanced hierarchical graph transformer network for multi-behavior recommendation. In Proceedings of the AAAI.
[82]
X. Yan, X. J. Zhou, and J. Han. 2005. Mining closed relational graphs with connectivity constraints. In Proceedings of the KDD.
[83]
J. Yang, J. You, and X. Wan. 2021. Graph embedding via graph summarization. IEEE Access 9 (2021), 45163–45174. https://ieeexplore.ieee.org/document/9382981.
[84]
J. Zhang, S. S. Bhowmick, H. H. Nguyen, B. Choi, and F. Zhu. 2015. DaVinci: Data-driven visual interface construction for subgraph search in graph databases. In Proceedings of the ICDE.
[85]
X. Zhang and T. Özsu. 2019. Correlation constraint shortest path over large multi-relation graphs. PVLDB 12, 5 (2019), 488–501.
[86]
P. Zhao, C. C. Aggarwal, and M. Wang. 2011. gSketch: On query estimation in graph streams. PVLDB 5, 3 (2011), 193–204.
[87]
F. Zhou, S. Mahler, and H. Toivonen. 2010. Network simplification with minimal loss of connectivity. In Proceedings of the ICDM.

Cited By

View all
  • (2024)Graph Summarization: Compactness Meets EfficiencyProceedings of the ACM on Management of Data10.1145/36549432:3(1-26)Online publication date: 30-May-2024
  • (2024)Ego-Network Segmentation via (Weighted) Jaccard MedianIEEE Transactions on Knowledge and Data Engineering10.1109/TKDE.2024.337371236:9(4646-4663)Online publication date: 1-Sep-2024
  • (2024)A survey of text summarization: Techniques, evaluation and challengesNatural Language Processing Journal10.1016/j.nlp.2024.1000707(100070)Online publication date: Jun-2024
  • Show More Cited By

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Transactions on Knowledge Discovery from Data
ACM Transactions on Knowledge Discovery from Data  Volume 16, Issue 5
October 2022
532 pages
ISSN:1556-4681
EISSN:1556-472X
DOI:10.1145/3514187
Issue’s Table of Contents

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 09 March 2022
Accepted: 01 October 2021
Revised: 01 July 2021
Received: 01 April 2021
Published in TKDD Volume 16, Issue 5

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. Graph summarization
  2. multi-relation graph
  3. approximation
  4. k-median

Qualifiers

  • Research-article
  • Refereed

Funding Sources

  • MOE Tier1 and Tier2

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)195
  • Downloads (Last 6 weeks)20
Reflects downloads up to 15 Sep 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Graph Summarization: Compactness Meets EfficiencyProceedings of the ACM on Management of Data10.1145/36549432:3(1-26)Online publication date: 30-May-2024
  • (2024)Ego-Network Segmentation via (Weighted) Jaccard MedianIEEE Transactions on Knowledge and Data Engineering10.1109/TKDE.2024.337371236:9(4646-4663)Online publication date: 1-Sep-2024
  • (2024)A survey of text summarization: Techniques, evaluation and challengesNatural Language Processing Journal10.1016/j.nlp.2024.1000707(100070)Online publication date: Jun-2024
  • (2023)A Hierarchical Parallel Graph Summarization Approach Based on Ranking NodesApplied Sciences10.3390/app1308466413:8(4664)Online publication date: 7-Apr-2023
  • (2023)Learning Entangled Interactions of Complex Causality via Self-Paced Contrastive LearningACM Transactions on Knowledge Discovery from Data10.1145/363240618:3(1-24)Online publication date: 9-Dec-2023
  • (2023)Learning the Explainable Semantic Relations via Unified Graph Topic-Disentangled Neural NetworksACM Transactions on Knowledge Discovery from Data10.1145/358996417:8(1-23)Online publication date: 12-May-2023
  • (2023)Drift Detection of Intelligent Sensor Networks Deployment Based on Graph StreamIEEE Transactions on Network Science and Engineering10.1109/TNSE.2022.322790910:2(1096-1106)Online publication date: 1-Mar-2023
  • (2023)Temporal Graph CubeIEEE Transactions on Knowledge and Data Engineering10.1109/TKDE.2023.327046035:12(13015-13030)Online publication date: 26-Apr-2023
  • (2022)Inter- and Intra-Graph Attention Aggregation Learning for Multi-relational GNN Spam DetectionProcedia Computer Science10.1016/j.procs.2022.11.339214:C(1522-1530)Online publication date: 1-Jan-2022

View Options

Get Access

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Full Text

View this article in Full Text.

Full Text

HTML Format

View this article in HTML Format.

HTML Format

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media