Foundations and Trends® in Machine Learning > Vol 9 > Issue 1

Generalized Low Rank Models

By Madeleine Udell, Cornell University, USA, udell@cornell.edu | Corinne Horn, Stanford University, USA, cehorn@stanford.edu | Reza Zadeh, Stanford University, USA, rezab@stanford.edu | Stephen Boyd, Stanford University, USA, boyd@stanford.edu

 
Suggested Citation
Madeleine Udell, Corinne Horn, Reza Zadeh and Stephen Boyd (2016), "Generalized Low Rank Models", Foundations and Trends® in Machine Learning: Vol. 9: No. 1, pp 1-118. http://dx.doi.org/10.1561/2200000055

Publication Date: 23 Jun 2016
© 2016 M. Udell, C. Horn, R. Zadeh and S. Boyd
 
In this article:
1. Introduction
2. PCA and quadratically regularized PCA
3. Generalized regularization
4. Generalized loss functions
5. Loss functions for abstract data types
6. Multi-dimensional loss functions
7. Fitting low rank models
8. Choosing low rank models
9. Implementations
Acknowledgements
Appendices
References

Abstract

Principal components analysis (PCA) is a well-known technique for approximating a tabular data set by a low rank matrix. Here, we extend the idea of PCA to handle arbitrary data sets consisting of numerical, Boolean, categorical, ordinal, and other data types. This framework encompasses many well-known techniques in data analysis, such as nonnegative matrix factorization, matrix completion, sparse and robust PCA, k-means, k-SVD, and maximum margin matrix factorization. The method handles heterogeneous data sets, and leads to coherent schemes for compressing, denoising, and imputing missing entries across all data types simultaneously. It also admits a number of interesting interpretations of the low rank factors, which allow clustering of examples or of features. We propose several parallel algorithms for fitting generalized low rank models, and describe implementations and numerical results.
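The simplest instance of this framework is quadratically regularized PCA (Section 2): approximate a data matrix A by a product XY of two rank-k factors, penalizing the Frobenius norms of X and Y. Below is a minimal illustrative sketch of the alternating-minimization approach the abstract alludes to; the function name, parameters, and stopping rule are this sketch's own choices, not the authors' reference implementation. With one factor held fixed, each update is a ridge regression with a closed-form solution:

```python
import numpy as np

def quad_reg_pca(A, k, gamma=1.0, iters=100, seed=0):
    """Fit A ~ X @ Y with X (m x k) and Y (k x n) by alternating
    minimization of ||A - XY||_F^2 + gamma*(||X||_F^2 + ||Y||_F^2).
    Each half-step is a ridge regression solved in closed form."""
    m, n = A.shape
    rng = np.random.default_rng(seed)
    Y = rng.standard_normal((k, n))
    I = gamma * np.eye(k)
    for _ in range(iters):
        # Update X with Y fixed: one ridge regression per row of A.
        X = A @ Y.T @ np.linalg.inv(Y @ Y.T + I)
        # Update Y with X fixed: one ridge regression per column of A.
        Y = np.linalg.inv(X.T @ X + I) @ X.T @ A
    return X, Y
```

The generalized models in the monograph replace the quadratic loss and regularizer above with arbitrary convex (per-entry) losses and regularizers, but the same alternating structure carries over.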

DOI: 10.1561/2200000055
ISBN: 978-1-68083-140-5 (paperback), 140 pp., $90.00
ISBN: 978-1-68083-141-2 (e-book, .pdf), 140 pp., $130.00
