By Ananth Kalyanaraman, Washington State University, USA, ananth@eecs.wsu.edu | Mahantesh Halappanavar, Pacific Northwest National Laboratory, USA, hala@pnnl.gov | Daniel Chavarría-Miranda, Pacific Northwest National Laboratory, USA, daniel.chavarria@pnnl.gov | Hao Lu, Washington State University, USA, hlu@eecs.wsu.edu | Karthi Duraisamy, Washington State University, USA, kduraisa@eecs.wsu.edu | Partha Pratim Pande, Washington State University, USA, pande@eecs.wsu.edu
Graph representations are pervasive in scientific and social computing. They serve as vital tools to model the interplay among different interacting entities. In this paper, we visit the problem of community detection, which is one of the most widely used graph operations toward scientific discovery. Community detection refers to the process of identifying tightlyknit subgroups of vertices in a large graph. These sub-groups (or communities) represent vertices that are tied together through common structure or function. Identification of communities could help in understanding the modular organization of complex networks. However, owing to large data sizes and high computational costs, performing community detection at scale has become increasingly challenging. Here, we present a detailed review and analysis of some of the leading computational methods and implementations developed for executing community detection on modern day multicore and manycore architectures. Our goals are to: a) define the problem of community detection and highlight its scientific significance; b) relate to challenges in parallelizing the operation on modern day architectures; c) provide a detailed report and logical organization of the approaches that have been designed for various architectures; and d) finally, provide insights into the strengths and suitability of different architectures for community detection, and a preview into the future trends of the area. It is our hope that this detailed treatment of community detection on parallel architectures can serve as an exemplar study for extending the application of modern day multicore and manycore architectures to other complex graph applications.
Graph representations are pervasive in scientific and social computing. They serve as vital tools to model the interplay between different interacting entities. This monograph delves into the problem of community detection, which is one of the most widely used graph operations toward scientific discovery. Community detection refers to the process of identifying tightly-knit subgroups of vertices in a large graph. These sub-groups (or communities) represent vertices that are tied together through common structure or function. Identification of communities could help in understanding the modular organization of complex networks. However, owing to large data sizes and high computational costs, performing community detection at scale has become increasingly challenging.
This monograph presents a detailed review and analysis of some of the leading computational methods and implementations developed for executing community detection on modern day multicore and manycore architectures. The intention is to: a) define the problem of community detection and highlight its scientific significance; b) relate to challenges in parallelizing the operation on modern day architectures; c) provide a detailed report and logical organization of the approaches that have been designed for various architectures; and d) provide insights into the strengths and suitability of different architectures for community detection, and a preview into the future trends of the area. While the focus is on community detection, the challenges, and techniques to overcome the challenges, transcend to several other graph problems that have applications in science and data analytics.