By Ji Liu, University of Rochester and Kuaishou Inc., USA, ji.liu.uwisc@gmail.com | Ce Zhang, ETH Zurich, Switzerland, ce.zhang@inf.ethz.ch
Scalable and efficient distributed learning is one of the main driving forces behind the recent rapid advancement of machine learning and artificial intelligence. One prominent feature of this topic is that recent progress has been made by researchers in two communities: (1) the systems community, including database, data management, and distributed systems, and (2) the machine learning and mathematical optimization community. The interaction and knowledge sharing between these two communities have led to the rapid development of new distributed learning systems and theory. In this monograph, we hope to provide a brief introduction to some recently developed distributed learning techniques, namely lossy communication compression (e.g., quantization and sparsification), asynchronous communication, and decentralized communication. A special focus of this monograph is making sure that it can be easily understood by researchers in both communities. On the system side, we rely on a simplified system model that hides many system details that are not necessary for the intuition behind the system speedups. On the theory side, we rely on minimal assumptions and significantly simplify the proofs of some recent work to achieve comparable results.
This monograph provides a brief introduction to three recently developed distributed learning techniques: lossy communication compression, asynchronous communication, and decentralized communication. These techniques have a significant impact on work in both the systems community and the machine learning and mathematical optimization community, but to fully realize their potential, it is essential that researchers in each community understand the whole picture. This monograph provides a bridge between the two communities. Its simplified introduction to the essential aspects of each community enables researchers to gain insights into the factors influencing both.
The monograph provides students and researchers with the groundwork for producing faster and better results in this dynamic area of research.