By Anna Goldenberg, Center for Cellular and Biomolecular Research, University of Toronto, Canada, anna.goldenberg@utoronto.ca | Alice X. Zheng, Microsoft Research, USA, alicez@microsoft.com | Stephen E. Fienberg, Department of Statistics, Machine Learning Department, Cylab, and iLab, Carnegie Mellon University, USA, fienberg@stat.cmu.edu | Edoardo M. Airoldi, Department of Statistics & FAS Center for Systems Biology, Harvard University, USA, airoldi@fas.harvard.edu
Networks are ubiquitous in science and have become a focal point for discussion in everyday life. Formal statistical models for the analysis of network data have emerged as a major topic of interest in diverse areas of study, and most of these involve a form of graphical representation. Probability models on graphs date back to 1959. Along with empirical studies in social psychology and sociology from the 1960s, these early works generated an active "network community" and a substantial literature in the 1970s. This effort moved into the statistical literature in the late 1970s and 1980s, and the past decade has seen a burgeoning network literature in statistical physics and computer science. The growth of the World Wide Web and the emergence of online "networking communities" such as Facebook, MySpace, and LinkedIn, along with a host of more specialized professional network communities, have intensified interest in the study of networks and network data.
Our goal in this review is to provide the reader with an entry point to this burgeoning literature. We begin with an overview of the historical development of statistical network modeling and then we introduce a number of examples that have been studied in the network literature. Our subsequent discussion focuses on a number of prominent static and dynamic network models and their interconnections. We emphasize formal model descriptions, and pay special attention to the interpretation of parameters and their estimation. We end with a description of some open problems and challenges for machine learning and statistics.
Networks have found a prominent place in our everyday lives. In science, networks have been used to analyze interpersonal social relationships, communication, academic paper co-authorships and citations, protein interaction patterns, and much more. Popular books on networks and their analysis began to appear a decade ago, and online "networking communities" such as Facebook, MySpace, and LinkedIn now include millions of people from around the world. Formal statistical modeling for the analysis of network data has emerged as a major research topic in diverse areas of study. A Survey of Statistical Network Models aims to provide the reader with an entry point to the voluminous literature on statistical network modeling. It guides the reader through the development of key stochastic network models, touches upon a number of examples and commonalities across different parts of the network literature, and discusses major schools of thought in static and dynamic network modeling. In addition, it illuminates the interconnections between existing models. Despite the rich and extensive network modeling literature, many statistical questions remain unanswered. It is hoped that the concluding discussion of gaps and challenges will help the interested reader identify important future research directions.