Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers

Stephen Boyd; Neal Parikh; Eric Chu; Borja Peleato; Jonathan Eckstein

doi:10.1561/2200000016

Foundations and Trends® in Machine Learning > Vol 3 > Issue 1

By Stephen Boyd, Electrical Engineering Department, Stanford University, USA, boyd@stanford.edu | Neal Parikh, Computer Science Department, Stanford University, USA, npparikh@cs.stanford.edu | Eric Chu, Electrical Engineering Department, Stanford University, USA, echu508@stanford.edu | Borja Peleato, Electrical Engineering Department, Stanford University, USA, peleato@stanford.edu | Jonathan Eckstein, Management Science and Information Systems Department and RUTCOR, Rutgers University, USA, jeckstei@rci.rutgers.edu

Suggested Citation

Stephen Boyd, Neal Parikh, Eric Chu, Borja Peleato and Jonathan Eckstein (2011), "Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers", Foundations and Trends® in Machine Learning: Vol. 3: No. 1, pp 1-122. http://dx.doi.org/10.1561/2200000016

Publication Date: 26 Jul 2011

Subjects

Optimization, Statistical learning theory

Abstract

Many problems of recent interest in statistics and machine learning can be posed in the framework of convex optimization. Due to the explosion in size and complexity of modern datasets, it is increasingly important to be able to solve problems with a very large number of features or training examples. As a result, both the decentralized collection or storage of these datasets as well as accompanying distributed solution methods are either necessary or at least highly desirable. In this review, we argue that the alternating direction method of multipliers is well suited to distributed convex optimization, and in particular to large-scale problems arising in statistics, machine learning, and related areas. The method was developed in the 1970s, with roots in the 1950s, and is equivalent or closely related to many other algorithms, such as dual decomposition, the method of multipliers, Douglas–Rachford splitting, Spingarn's method of partial inverses, Dykstra's alternating projections, Bregman iterative algorithms for ℓ₁ problems, proximal methods, and others. After briefly surveying the theory and history of the algorithm, we discuss applications to a wide variety of statistical and machine learning problems of recent interest, including the lasso, sparse logistic regression, basis pursuit, covariance selection, support vector machines, and many others. We also discuss general distributed optimization, extensions to the nonconvex setting, and efficient implementation, including some details on distributed MPI and Hadoop MapReduce implementations.

Book details

ISBN: 978-1-60198-461-6

128 pp. $100.00

Table of contents:

1: Introduction

2: Precursors

3: Alternating Direction Method of Multipliers

4: General Patterns

5: Constrained Convex Optimization

6: ℓ₁-Norm Problems

7: Consensus and Sharing

8: Distributed Model Fitting

9: Nonconvex Problems

10: Implementation

11: Numerical Examples

12: Conclusions

Acknowledgements

A: Convergence Proof

References

Many problems of recent interest in statistics and machine learning can be posed in the framework of convex optimization. Due to the explosion in size and complexity of modern datasets, it is increasingly important to be able to solve problems with a very large number of features or training examples. As a result, both the decentralized collection or storage of these datasets as well as accompanying distributed solution methods are either necessary or at least highly desirable. Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers argues that the alternating direction method of multipliers is well suited to distributed convex optimization, and in particular to large-scale problems arising in statistics, machine learning, and related areas. The method was developed in the 1970s, with roots in the 1950s, and is equivalent or closely related to many other algorithms, such as dual decomposition, the method of multipliers, Douglas–Rachford splitting, Spingarn's method of partial inverses, Dykstra's alternating projections, Bregman iterative algorithms for ℓ₁ problems, proximal methods, and others. After briefly surveying the theory and history of the algorithm, it discusses applications to a wide variety of statistical and machine learning problems of recent interest, including the lasso, sparse logistic regression, basis pursuit, covariance selection, support vector machines, and many others. It also discusses general distributed optimization, extensions to the nonconvex setting, and efficient implementation, including some details on distributed MPI and Hadoop MapReduce implementations.
