By Shao-Lun Huang, Tsinghua-Berkeley Shenzhen Institute, China, shaolun.huang@sz.tsinghua.edu.cn | Anuran Makur, Purdue University, USA, amakur@purdue.edu | Gregory W. Wornell, Massachusetts Institute of Technology, USA, gww@mit.edu | Lizhong Zheng, Massachusetts Institute of Technology, USA, lizhong@mit.edu
This monograph develops unifying perspectives on the problem of identifying universal low-dimensional features from high-dimensional data for inference tasks in settings involving learning. For such problems, natural notions of universality are introduced, and a local equivalence among them is established. The analysis is naturally expressed via information geometry, which provides both conceptual and computational insights. The development reveals the complementary roles of the singular value decomposition, Hirschfeld-Gebelein-Rényi maximal correlation, the canonical correlation and principal component analyses of Hotelling and Pearson, Tishby’s information bottleneck, the Wyner and Gács-Körner common information, Ky Fan k-norms, and Breiman and Friedman’s alternating conditional expectations algorithm. Among other uses, the framework facilitates understanding and optimizing aspects of learning systems, including multinomial logistic (softmax) regression and neural network architecture, matrix factorization methods for collaborative filtering and other applications, rank-constrained multivariate linear regression, and forms of semi-supervised learning.
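As a concrete illustration of one of these connections (a minimal sketch, not code from the monograph itself): for finite alphabets, the Hirschfeld-Gebelein-Rényi maximal correlation of a joint distribution equals the second-largest singular value of the matrix with entries P(x,y)/√(P(x)P(y)), whose leading singular vectors in turn yield the associated low-dimensional features. The function name and example distribution below are hypothetical, chosen only to demonstrate the computation.

```python
# Illustrative sketch: HGR maximal correlation via the SVD, for a finite-
# alphabet joint pmf. The matrix B has entries
#     B[x, y] = P(x, y) / sqrt(P(x) * P(y)),
# whose top singular value is always 1 (achieved by the sqrt-marginal
# vectors); the second singular value is the maximal correlation.
import numpy as np

def hgr_maximal_correlation(pxy: np.ndarray) -> float:
    """pxy: joint pmf of (X, Y), shape (|X|, |Y|), entries summing to 1."""
    px = pxy.sum(axis=1)  # marginal of X
    py = pxy.sum(axis=0)  # marginal of Y
    B = pxy / np.sqrt(np.outer(px, py))
    s = np.linalg.svd(B, compute_uv=False)  # singular values, descending
    return s[1]

# Hypothetical example: a joint pmf with mild dependence between X and Y.
pxy = np.array([[0.30, 0.10],
                [0.15, 0.45]])
print(hgr_maximal_correlation(pxy))  # strictly between 0 and 1 here
```

The alternating conditional expectations (ACE) algorithm mentioned above computes the same quantity iteratively, without forming B explicitly, which matters when the alphabets are large.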
In many contemporary and emerging applications of machine learning and statistical inference, the phenomena of interest are characterized by variables defined over large alphabets. The increasing size of the data, the growing number of inference tasks, and the limited availability of training data create a need to understand which inference tasks can be carried out most effectively and, in turn, which features of the data are most relevant to them.
In this monograph, the authors develop the idea of extracting “universally good” features, and establish that diverse notions of such universality lead to precisely the same features. Their information-theoretic approach yields a local information geometric analysis that facilitates the computation of these features in a host of applications.
The authors provide a comprehensive treatment that guides the reader from basic principles to advanced techniques, including many new results. They emphasize a development from first principles, with common, unifying terminology and notation, and provide pointers to the rich surrounding literature, both historical and contemporary.
Written for students and researchers, this monograph is a complete treatise on the information-theoretic treatment of a recognized and timely problem in machine learning and statistical inference.