skip to main content
10.1145/3545008.3545058acmotherconferencesArticle/Chapter ViewAbstractPublication PagesicppConference Proceedingsconference-collections
research-article
Open access

On the Parallelization of MCMC for Community Detection

Published: 13 January 2023 Publication History

Abstract

The rapid growth in size of real-world graph datasets necessitates the design of parallel and scalable graph analytics algorithms for large graphs. Community detection is a graph analysis technique with use cases in many domains from bioinformatics to network security. Markov chain Monte Carlo (MCMC)-based methods for performing community detection, such as the stochastic block partitioning (SBP) algorithm, are robust to graphs with a complex structure, but have traditionally been difficult to parallelize due to the serial nature of the underlying MCMC algorithm. This paper presents hybrid SBP (H-SBP), a novel hybrid method to parallelize the inherently sequential computation within each MCMC chain, for SBP. H-SBP processes a fraction of the most influential graph vertices serially and the remaining majority of the vertices in parallel using asynchronous Gibbs. We empirically show that H-SBP speeds up the MCMC computations by up to 5.6  × on real-world graphs while maintaining accuracy.

References

[1]
William Aiello, Fan Chung, and Linyuan Lu. 2001. A random graph model for power law graphs. Experimental Mathematics 10, 1 (2001), 53–66. https://projecteuclid.org/euclid.em/999188420
[2]
Ginestra Bianconi, Paolo Pin, and Matteo Marsili. 2009. Assessing the relevance of node features for network structure. Proceedings of the National Academy of Sciences 106, 28 (7 2009), 11433–11438. https://doi.org/10.1073/PNAS.0811511106
[3]
Timothy A. Davis and Yifan Hu. 2011. The university of Florida sparse matrix collection. ACM Transactions on Mathematical Software (TOMS) 38, 1 (12 2011), 1–25. https://doi.org/10.1145/2049662.2049663
[4]
Christopher De Sa, Kunle Olukotun, and Christopher Ré. 2016. Ensuring Rapid Mixing and Low Bias for Asynchronous Gibbs Sampling. IJCAI International Joint Conference on Artificial Intelligence 0 (2 2016), 4811–4815. https://doi.org/10.48550/arxiv.1602.07415
[5]
Travis Desell, Lee A. Newberg, Malik Magdon-Ismail, Boleslaw K. Szymanski, and William Thompson. 2012. Finding Protein Binding Sites Using Volunteer Computing Grids. Advances in Intelligent and Soft Computing 144 AISC, VOL. 1 (2012), 385–393. https://doi.org/10.1007/978-3-642-28314-7_52
[6]
Santo Fortunato. 2010. Community detection in graphs. Physics Reports 486, 3-5 (2 2010), 75–174. https://doi.org/10.1016/J.PHYSREP.2009.11.002
[7]
Sayan Ghosh, Mahantesh Halappanavar, Antonino Tumeo, Ananth Kalyanaraman, and Assefaw H. Gebremedhin. 2018. Scalable Distributed Memory Community Detection Using Vite. In 2018 IEEE High Performance extreme Computing Conference (HPEC). IEEE, Waltham, MA, 1–7. https://doi.org/10.1109/HPEC.2018.8547534
[8]
W. K. Hastings. 1970. Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57, 1 (4 1970), 97–109. https://doi.org/10.1093/biomet/57.1.97
[9]
Edward Kao, Vijay Gadepally, Michael Hurley, Michael Jones, Jeremy Kepner, Sanjeev Mohindra, Paul Monticciolo, Albert Reuther, Siddharth Samsi, William Song, Diane Staheli, and Steven Smith. 2017. Streaming graph challenge: Stochastic block partition. In 2017 IEEE High Performance Extreme Computing Conference (HPEC). IEEE, Waltham, MA, 1–12. https://doi.org/10.1109/HPEC.2017.8091040
[10]
Edward K. Kao, Steven Thomas Smith, and Edoardo M. Airoldi. 2019. Hybrid Mixed-Membership Blockmodel for Inference on Realistic Network Interactions. IEEE Transactions on Network Science and Engineering 6, 3 (7 2019), 336–350. https://doi.org/10.1109/TNSE.2018.2823324
[11]
Brian Karrer and M. E. J. Newman. 2011. Stochastic blockmodels and community structure in networks. Physical Review E 83, 1 (1 2011), 016107. https://doi.org/10.1103/PhysRevE.83.016107
[12]
Glenn G. Ko, Yuji Chai, Rob A. Rutenbar, David Brooks, and Gu-Yeon Wei. 2019. FlexGibbs: Reconfigurable Parallel Gibbs Sampling Accelerator for Structured Graphs. In 2019 IEEE 27th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM). IEEE, San Diego, CA, 334–334. https://doi.org/10.1109/FCCM.2019.00075
[13]
Andrea Lancichinetti, Santo Fortunato, and János Kertész. 2009. Detecting the overlapping and hierarchical community structure in complex networks. New Journal of Physics 11, 3 (3 2009), 033015. https://doi.org/10.1088/1367-2630/11/3/033015
[14]
Xu Liu, Jesun Sahariar Firoz, Marcin Zalewski, Mahantesh Halappanavar, Kevin J. Barker, Andrew Lumsdaine, and Assefaw H. Gebremedhin. 2019. Distributed Direction-Optimizing Label Propagation for Community Detection. In 2019 IEEE High Performance Extreme Computing Conference (HPEC). IEEE, Waltham, MA, 1–6. https://doi.org/10.1109/HPEC.2019.8916215
[15]
Willie Neiswanger, Chong Wang, and Eric Xing. 2014. Asymptotically Exact, Embarrassingly Parallel MCMC. In Proceedings of the Thirtieth Conference on Uncertainty in Artificial Intelligence. AUAI Press, Quebec City, Quebec, Canada, 623–632. https://doi.org/10.5555/3020751.3020816
[16]
M. E. J. Newman. 2004. Analysis of weighted networks. Physical Review E - Statistical Physics, Plasmas, Fluids, and Related Interdisciplinary Topics 70, 5 (7 2004), 9. https://doi.org/10.1103/PhysRevE.70.056131
[17]
Symeon Papadopoulos, Yiannis Kompatsiaris, Athena Vakali, and Ploutarchos Spyridonos. 2012. Community detection in Social Media. Data Mining and Knowledge Discovery 2011 24:3 24, 3 (6 2012), 515–554. https://doi.org/10.1007/S10618-011-0224-Z
[18]
Tiago P. Peixoto. 2013. Parsimonious Module Inference in Large Networks. Physical Review Letters 110, 14 (4 2013), 148701. https://doi.org/10.1103/PhysRevLett.110.148701
[19]
Tiago P. Peixoto. 2014. Efficient Monte Carlo and greedy heuristic for the inference of stochastic block models. Physical Review E 89, 1 (1 2014), 012804. https://doi.org/10.1103/PhysRevE.89.012804
[20]
Tiago P. Peixoto. 2014. The graph-tool python library. figshare (2014). https://doi.org/10.6084/m9.figshare.1164194
[21]
Jose B. Pereira-Leal, Anton J. Enright, and Christos A. Ouzounis. 2003. Detection of functional modules from protein interaction networks. Proteins: Structure, Function, and Bioinformatics 54, 1 (9 2003), 49–57. https://doi.org/10.1002/prot.10505
[22]
Alexander Terenin, Dan Simpson, and David Draper. 2020. Asynchronous Gibbs Sampling. In International Conference on Artificial Intelligence and Statistics. PMLR, Palermo, 144–154. https://arxiv.org/pdf/1509.08999.pdf
[23]
Frank Wanye, Vitaliy Gleyzer, Edward Kao, and Wu-chun Feng. 2021. Topology-Guided Sampling for Fast and Accurate Community Detection. arXiv (8 2021). https://doi.org/10.48550/arxiv.2108.06651
[24]
Darren J Wilkinson. 2005. Parallel Bayesian Computation. In Handbook of Parallel Computing and Statistics (1st ed.), Erricos John Kontoghiorghes (Ed.). Marcel Dekker/CRC Press, Boca Raton, Florida; London, UK, Chapter 18, 481–512. http://www.mas.ncl.ac.uk/~ndjw1/docs/pbc.pdf

Cited By

View all
  • (2023)An Integrated Approach for Accelerating Stochastic Block Partitioning2023 IEEE High Performance Extreme Computing Conference (HPEC)10.1109/HPEC58863.2023.10363599(1-7)Online publication date: 25-Sep-2023
  • (2023)Decontentioned Stochastic Block Partition2023 IEEE High Performance Extreme Computing Conference (HPEC)10.1109/HPEC58863.2023.10363480(1-6)Online publication date: 25-Sep-2023
  • (2023)uSAP: An Ultra-Fast Stochastic Graph Partitioner2023 IEEE High Performance Extreme Computing Conference (HPEC)10.1109/HPEC58863.2023.10363426(1-7)Online publication date: 25-Sep-2023
  • Show More Cited By

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Other conferences
ICPP '22: Proceedings of the 51st International Conference on Parallel Processing
August 2022
976 pages
ISBN:9781450397339
DOI:10.1145/3545008
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 13 January 2023

Check for updates

Author Tags

  1. asynchronous gibbs
  2. bayesian inference
  3. community detection
  4. graph clustering
  5. stochastic blockmodels

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • NSF Center for Space, High-performance, and Resilient Computing (SHREC)

Conference

ICPP '22
ICPP '22: 51st International Conference on Parallel Processing
August 29 - September 1, 2022
Bordeaux, France

Acceptance Rates

Overall Acceptance Rate 91 of 313 submissions, 29%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)126
  • Downloads (Last 6 weeks)12
Reflects downloads up to 14 Sep 2024

Other Metrics

Citations

Cited By

View all
  • (2023)An Integrated Approach for Accelerating Stochastic Block Partitioning2023 IEEE High Performance Extreme Computing Conference (HPEC)10.1109/HPEC58863.2023.10363599(1-7)Online publication date: 25-Sep-2023
  • (2023)Decontentioned Stochastic Block Partition2023 IEEE High Performance Extreme Computing Conference (HPEC)10.1109/HPEC58863.2023.10363480(1-6)Online publication date: 25-Sep-2023
  • (2023)uSAP: An Ultra-Fast Stochastic Graph Partitioner2023 IEEE High Performance Extreme Computing Conference (HPEC)10.1109/HPEC58863.2023.10363426(1-7)Online publication date: 25-Sep-2023
  • (2023)On the Multi-Dimensional Acceleration of Stochastic Blockmodeling for Community Detection2023 IEEE International Conference on Cluster Computing Workshops (CLUSTER Workshops)10.1109/CLUSTERWorkshops61457.2023.00030(70-71)Online publication date: 31-Oct-2023
  • (2023)Exact Distributed Stochastic Block Partitioning2023 IEEE International Conference on Cluster Computing (CLUSTER)10.1109/CLUSTER52292.2023.00010(25-36)Online publication date: 31-Oct-2023

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

HTML Format

View this article in HTML Format.

HTML Format

Get Access

Login options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media