A Tutorial on Meta-Reinforcement Learning

Jacob Beck; Risto Vuorio; Evan Zheran Liu; Zheng Xiong; Luisa Zintgraf; Chelsea Finn; Shimon Whiteson

doi:10.1561/2200000080

Foundations and Trends® in Machine Learning, Vol. 18, Issue 2-3

By Jacob Beck, University of Oxford, UK, jacob.beck@cs.ox.ac.uk | Risto Vuorio, University of Oxford, UK, risto.vuorio@cs.ox.ac.uk | Evan Zheran Liu, Stanford University, USA, evanliu@cs.stanford.edu | Zheng Xiong, University of Oxford, UK, zheng.xiong@cs.ox.ac.uk | Luisa Zintgraf, University of Oxford, UK, zintgraf@deepmind.com | Chelsea Finn, Stanford University, USA, cbfinn@cs.stanford.edu | Shimon Whiteson, University of Oxford, UK, shimon.whiteson@cs.ox.ac.uk

Suggested Citation

Jacob Beck, Risto Vuorio, Evan Zheran Liu, Zheng Xiong, Luisa Zintgraf, Chelsea Finn and Shimon Whiteson (2025), "A Tutorial on Meta-Reinforcement Learning", Foundations and Trends® in Machine Learning: Vol. 18: No. 2-3, pp 224-384. http://dx.doi.org/10.1561/2200000080

Publication Date: 03 Apr 2025

Subjects

Bayesian learning, Deep learning, Reinforcement learning, Variational inference, Robot control, Artificial intelligence in robotics

Abstract

While deep reinforcement learning (RL) has fueled multiple high-profile successes in machine learning, it is held back from more widespread adoption by its often poor data efficiency and the limited generality of the policies it produces. A promising approach for alleviating these limitations is to cast the development of better RL algorithms as a machine learning problem itself in a process called meta-RL. Meta-RL is most commonly studied in a problem setting where, given a distribution of tasks, the goal is to learn a policy that is capable of adapting to any new task from the task distribution with as little data as possible. In this survey, we describe the meta-RL problem setting in detail as well as its major variations. We discuss how, at a high level, meta-RL research can be clustered based on the presence of a task distribution and the learning budget available for each individual task. Using these clusters, we then survey meta-RL algorithms and applications. We conclude by presenting the open problems on the path to making meta-RL part of the standard toolbox for a deep RL practitioner.

Book details

Paperback: ISBN 978-1-63828-540-3, 176 pp., $99.00

E-book (PDF): ISBN 978-1-63828-541-0, 176 pp., $320.00

Table of contents:

1. Introduction

2. Background

3. Few-shot Meta-RL

4. Many-shot Meta-RL

5. Applications

6. Open Problems

7. Conclusion

Appendix

References

A Tutorial on Meta-Reinforcement Learning

While deep reinforcement learning (RL) has fueled multiple high-profile successes in machine learning, it is held back from more widespread adoption by its often poor data efficiency and the limited generality of the policies it produces. A promising approach for alleviating these limitations is to cast the development of better RL algorithms as a machine learning problem in its own right, a process called meta-RL. Meta-RL is a family of machine learning (ML) methods that learn to reinforcement learn: they use sample-inefficient ML to learn sample-efficient RL algorithms, or components thereof. Meta-RL is most commonly studied in a problem setting where, given a distribution of tasks, the goal is to learn a policy that can adapt to any new task from that distribution with as little data as possible.
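To make the problem setting concrete, here is a minimal toy sketch in Python. It is an illustration only and is not taken from the monograph. It assumes a task distribution over two-armed Bernoulli bandits, a hand-written inner loop (explore round-robin, then act greedily), and an outer loop that is just a grid search over one hyperparameter, `n_explore`. All names (`sample_task`, `inner_loop`, `meta_train`) and all numbers are hypothetical. The outer loop spends many episodes across many tasks so that the selected inner loop adapts to a new task within a short horizon.

```python
import random


def sample_task(rng):
    """Sample a task: a 2-armed Bernoulli bandit whose good arm is random."""
    good = rng.randrange(2)
    return [0.9 if arm == good else 0.1 for arm in range(2)]


def inner_loop(task, n_explore, horizon, rng):
    """Adapt to one task: pull arms round-robin for n_explore steps,
    then act greedily on the empirical means. Returns the total reward."""
    counts, sums, total = [0, 0], [0.0, 0.0], 0.0
    for t in range(horizon):
        if t < n_explore:
            arm = t % 2
        else:
            means = [sums[a] / counts[a] if counts[a] else 0.0 for a in range(2)]
            arm = 0 if means[0] >= means[1] else 1
        reward = 1.0 if rng.random() < task[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        total += reward
    return total


def meta_train(candidates, n_tasks, horizon, seed=0):
    """Outer loop (assumed here to be a plain grid search): score each
    candidate n_explore by its average return over sampled tasks."""
    rng = random.Random(seed)
    tasks = [sample_task(rng) for _ in range(n_tasks)]
    scores = {}
    for n in candidates:
        returns = [inner_loop(task, n, horizon, rng) for task in tasks]
        scores[n] = sum(returns) / n_tasks
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    best, scores = meta_train([0, 2, 4, 8], n_tasks=500, horizon=20)
    print("selected n_explore:", best)
    print("average return per candidate:", scores)
```

In this sketch the meta-learned object is a single number rather than a neural network policy, which keeps the example short; the separation between sampling tasks, adapting within a task, and optimizing across tasks is the part that mirrors the setting described above.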

This monograph describes the meta-RL problem setting and its major variations in detail. It discusses how, at a high level, meta-RL research can be clustered based on the presence of a task distribution and the learning budget available for each individual task. Using these clusters, it then surveys meta-RL algorithms and applications. The monograph concludes by presenting the open problems on the path to making meta-RL part of the standard toolbox for a deep RL practitioner.
