Foundations and Trends® in Communications and Information Theory > Vol 14 > Issue 1-2

Community Detection and Stochastic Block Models

By Emmanuel Abbe, Princeton University, USA, eabbe@princeton.edu

 
Suggested Citation
Emmanuel Abbe (2018), "Community Detection and Stochastic Block Models", Foundations and Trends® in Communications and Information Theory: Vol. 14: No. 1-2, pp 1-162. http://dx.doi.org/10.1561/0100000067

Publication Date: 07 Jun 2018
© 2018 E. Abbe
 
Subjects
Information theory and statistics,  Information theory and computer science,  Stochastic Networks,  Bayesian learning,  Clustering,  Spectral methods
 

In this article:
1. Introduction
2. The stochastic block model
3. Tackling the stochastic block model
4. Exact recovery for two communities
5. Weak recovery for two communities
6. Partial recovery for two communities
7. The general SBM
8. The information-computation gap
9. Other block models
10. Concluding remarks and open problems
Acknowledgements
References

Abstract

The stochastic block model (SBM) is a random graph model in which different groups of vertices connect differently. It is widely employed as a canonical model to study clustering and community detection, and provides fertile ground for studying the information-theoretic and computational tradeoffs that arise in combinatorial statistics and, more generally, data science. This monograph surveys the recent developments that establish the fundamental limits for community detection in the SBM, both with respect to information-theoretic and computational tradeoffs, and for various recovery requirements such as exact, partial and weak recovery. The main results discussed are the phase transition for exact recovery at the Chernoff-Hellinger threshold, the phase transition for weak recovery at the Kesten-Stigum threshold, the optimal SNR-mutual information tradeoff for partial recovery, and the gap between information-theoretic and computational thresholds.
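
As an illustration of the model and thresholds named in the abstract, the following is a minimal sketch and not code from the monograph. It assumes the symmetric two-community case with uniform random labels; the function names and the parameters `p_in`, `p_out`, `a` and `b` are hypothetical. The two checks use the standard special cases of the thresholds: in the sparse regime (edge probabilities a/n and b/n) the Kesten-Stigum condition reads (a - b)^2 > 2(a + b), and in the logarithmic-degree regime (edge probabilities a log(n)/n and b log(n)/n) the Chernoff-Hellinger condition reduces to |sqrt(a) - sqrt(b)| > sqrt(2). Boundary cases are not treated.

```python
import math
import random


def sample_two_community_sbm(n, p_in, p_out, seed=0):
    """Sample a symmetric two-community SBM on n vertices (illustrative helper).

    Each vertex gets a label in {0, 1} uniformly at random. Each pair of
    vertices is joined independently with probability p_in if the labels
    agree and p_out otherwise. Returns (labels, edges) with edges as a
    list of pairs (i, j), i < j.
    """
    rng = random.Random(seed)
    labels = [rng.randrange(2) for _ in range(n)]
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            prob = p_in if labels[i] == labels[j] else p_out
            if rng.random() < prob:
                edges.append((i, j))
    return labels, edges


def above_kesten_stigum(a, b):
    """Sparse regime, p_in = a/n and p_out = b/n: weak recovery condition."""
    return (a - b) ** 2 > 2 * (a + b)


def above_exact_recovery_threshold(a, b):
    """Log-degree regime, p_in = a*log(n)/n and p_out = b*log(n)/n."""
    return abs(math.sqrt(a) - math.sqrt(b)) > math.sqrt(2)


if __name__ == "__main__":
    n, a, b = 200, 5.0, 1.0
    labels, edges = sample_two_community_sbm(n, a / n, b / n, seed=1)
    print(len(edges), above_kesten_stigum(a, b))
```

The sampler loops over all vertex pairs, which is quadratic in n and only meant for small illustrative graphs.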

DOI: 10.1561/0100000067
ISBN: 978-1-68083-476-5 (paperback), 172 pp., $99.00
ISBN: 978-1-68083-477-2 (e-book, PDF), 172 pp., $280.00

Community Detection and Stochastic Block Models

The field of community detection has been expanding greatly since the 1980s, with a remarkable diversity of models and algorithms developed in different communities such as machine learning, computer science, network science, social science, and statistical physics. Various fundamental questions nonetheless remain unsettled. Are there really communities? Algorithms may output community structures, but are these meaningful or artefacts? Can we always extract the communities when they are present, whether fully or partially? What is a good benchmark for measuring the performance of algorithms, and how good are the current algorithms?

This monograph describes recent developments aimed at answering these questions in the context of block models. Addressing the issues from an information-theoretic viewpoint, the author gives a comprehensive description of the historical and recent work that has led to key new concepts in the various recovery requirements for community detection.

The monograph provides a compact introduction to community detection, enabling the reader to apply these techniques in areas such as understanding sociological behavior, protein-protein interactions, gene expression, recommendation systems, medical prognosis, DNA 3D folding, image segmentation, natural language processing, product-customer segmentation, webpage sorting, and many more.

 
CIT-067