
AltGDmin: Alternating GD and Minimization for Partly-decoupled (Federated) Optimization

By Namrata Vaswani, Iowa State University, USA, namrata@iastate.edu

 
Suggested Citation
Namrata Vaswani (2025), "AltGDmin: Alternating GD and Minimization for Partly-decoupled (Federated) Optimization", Foundations and Trends® in Optimization: Vol. 8: No. 4, pp 333-414. http://dx.doi.org/10.1561/2400000051

Publication Date: 28 May 2025
© 2025 N. Vaswani
 
Subjects
Nonlinear signal processing, Statistical/Machine learning, Coding and compression, Reinforcement learning, Adaptive control and signal processing, Optimization
 
Keywords
Federated learning, Distributed optimization, Gradient descent, Optimization algorithms
 

In this article:
1. Introduction
2. Commonly Used Optimization Algorithms
3. AltGDmin for Partly Decoupled Optimization Problems
4. AltGDmin for Three LR Matrix Recovery Problems
5. General Proof Approach for any Problem
6. AltGDmin for LR Problems: Overall Proof Ideas
7. AltGDmin for LR Problems: Proof Details
8. Linear Algebra and Random Matrix Theory Preliminaries
9. Open Questions
References
Appendix

Abstract

This monograph describes a novel optimization solution framework, called alternating gradient descent (GD) and minimization (AltGDmin), that is useful for many problems for which alternating minimization (AltMin) is a popular solution. AltMin is a special case of the block coordinate descent algorithm that is useful for problems in which minimization w.r.t. one subset of variables, keeping the other fixed, has a closed-form solution or can otherwise be solved reliably. Denote the two blocks/subsets of the optimization variable Z by Zslow and Zfast, i.e., Z = {Zslow, Zfast}. AltGDmin is often faster than AltMin for any problem in which (i) the minimization over one set of variables, Zfast, is much quicker than that over the other set, Zslow; and (ii) the cost function is differentiable w.r.t. Zslow. Often, the reason one minimization is quicker is that the problem is "decoupled" in Zfast: it splits into many small problems, one per component of Zfast, each of which is quick to solve. This decoupling is also what makes AltGDmin communication-efficient in federated settings.
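To make the iteration structure concrete, here is a minimal sketch of the generic AltGDmin loop (illustrative only; the function and argument names are ours, and the step-size and iteration-count choices are placeholders rather than the monograph's prescriptions):

```python
def altgdmin(z_slow, min_fast, grad_slow, eta, n_iters=100):
    """Generic AltGDmin loop (illustrative sketch).

    min_fast(z_slow):          exact minimization over Z_fast with Z_slow held
                               fixed; assumed cheap, e.g., decoupled closed form.
    grad_slow(z_slow, z_fast): gradient of the cost w.r.t. Z_slow.
    """
    for _ in range(n_iters):
        z_fast = min_fast(z_slow)                          # exact min (fast block)
        z_slow = z_slow - eta * grad_slow(z_slow, z_fast)  # one GD step (slow block)
    return z_slow, z_fast
```

The min_fast step is where the decoupling enters: for the problems listed below, it amounts to many small, independent solves, one per component of Zfast.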

Important examples where this assumption holds include (a) low rank column-wise compressive sensing (LRCS) and low rank matrix completion (LRMC); (b) their outlier-corrupted extensions, such as robust PCA, robust LRCS, and robust LRMC; (c) phase retrieval and its sparse and low-rank model-based extensions; (d) tensor extensions of many of these problems, such as tensor LRCS and tensor completion; and (e) many partly discrete problems where GD does not apply, such as clustering, unlabeled sensing, and mixed linear regression. LRCS finds important applications in multi-task representation learning and few-shot learning, federated sketching, and accelerated dynamic MRI. LRMC and robust PCA find important applications in recommender systems, computer vision, and video analytics.
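As one concrete illustration, the following is a hedged sketch of AltGDmin for LRCS under the standard model y_k = A_k X[:, k] with X = UB (our notation; the initialization and step size below are simplified placeholders, not the monograph's exact spectral initialization or step-size rule):

```python
import numpy as np

def altgdmin_lrcs(y, A, r, eta, n_iters=50):
    """Sketch of AltGDmin for low rank column-wise compressive sensing.

    Model: y[k] = A[k] @ X[:, k], with X = U @ B, U (n x r) orthonormal,
    B (r x q). The min step over B decouples into q small r-dimensional
    least-squares problems; the GD step updates U and re-orthonormalizes.
    """
    m, n = A[0].shape
    q = len(y)
    # Crude initialization (placeholder for the paper's spectral init).
    X0 = np.column_stack([A[k].T @ y[k] / m for k in range(q)])
    U = np.linalg.qr(X0)[0][:, :r]
    for _ in range(n_iters):
        # Min step: decoupled, one small least squares per column (fast block).
        B = np.column_stack(
            [np.linalg.lstsq(A[k] @ U, y[k], rcond=None)[0] for k in range(q)])
        # GD step on U (slow block), then QR to restore orthonormal columns.
        grad = sum(np.outer(A[k].T @ (A[k] @ (U @ B[:, k]) - y[k]), B[:, k])
                   for k in range(q))
        U = np.linalg.qr(U - eta * grad)[0]
    return U, B
```

In a federated setting, a node holding a subset of the columns can compute its b_k's and its partial gradient locally, so only the n x r gradient and the current U need to be exchanged; this is the communication efficiency referred to above.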

DOI: 10.1561/2400000051
ISBN (paperback): 978-1-63828-580-9
ISBN (e-book): 978-1-63828-581-6
104 pp.

AltGDmin: Alternating GD and Minimization for Partly-decoupled (Federated) Optimization

Federated learning and analytics constitute a distributed approach for collaboratively learning models from decentralized data, motivated by and designed for privacy protection. The distributed learning process can be formulated as solving federated optimization problems, which emphasize communication efficiency, data heterogeneity, compatibility with privacy and system requirements, and other constraints that are not primary considerations in other problem settings.

This monograph describes a novel optimization solution framework, called alternating gradient descent (GD) and minimization (AltGDmin), that is useful for many problems for which alternating minimization (AltMin) is a popular solution. AltMin is a special case of the block coordinate descent algorithm that is used for classes of problems in which minimization with respect to one subset of variables, keeping the other fixed, has a closed-form solution or can otherwise be solved reliably; bilinear problems are one example.

This set of problems includes those for which one of the two minimization steps of an AltMin iteration is partly decoupled. An optimization problem is decoupled if it can be solved by solving smaller-dimensional, and hence quicker, problems over disjoint subsets of the optimization variable.
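For example, in the LRCS setting (our notation, as in the sketch above), with X = UB and B = [b_1, ..., b_q], the minimization over B with U held fixed decouples column-wise:

min over B of Σ_k ‖y_k − A_k U b_k‖² = Σ_k [ min over b_k of ‖y_k − A_k U b_k‖² ],

so each b_k = (A_k U)† y_k is a small r-dimensional least-squares solution, computed independently of all the other columns.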

 