By Zhong-Ping Jiang, Tandon School of Engineering, New York University, USA, zjiang@nyu.edu | Tao Bian, Bank of America, USA, tbian@nyu.edu | Weinan Gao, College of Engineering and Science, Florida Institute of Technology, USA, wgao@fit.edu
This monograph presents a new framework for learning-based control synthesis of continuous-time dynamical systems with unknown dynamics. The design paradigm proposed here differs fundamentally from traditional control theory: in the classical paradigm, controllers are designed for a given class of dynamical control systems, i.e., the design is model-based. Under the learning-based control framework, controllers are instead learned online from real-time input–output data collected along the trajectories of the control system in question. An interplay of techniques from reinforcement learning and model-based control theory is advocated to find a sequence of suboptimal controllers that converge to the optimal solution as the number of learning steps increases. On the one hand, this learning-based design approach attempts to overcome the well-known “curse of dimensionality” and “curse of modeling” associated with Bellman’s Dynamic Programming. On the other hand, rigorous stability and robustness analysis can be derived for the closed-loop system under real-time learning-based controllers. The effectiveness of the proposed learning-based control framework is demonstrated through applications to theoretical optimal control problems for various important classes of continuous-time dynamical systems, as well as to practical problems arising from biological motor control and connected and autonomous vehicles.
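The classical, model-based ancestor of this sequence of suboptimal controllers is Kleinman's policy iteration for the continuous-time linear-quadratic regulator (LQR) problem, which the learning-based framework emulates from data. The following minimal sketch (our illustration in Python with NumPy/SciPy, not code from the monograph) shows the iteration: each step evaluates the current gain by solving a Lyapunov equation and then improves it, and the gains converge to the optimal LQR gain.

    import numpy as np
    from scipy.linalg import solve_continuous_lyapunov

    def kleinman_policy_iteration(A, B, Q, R, K0, n_iter=10):
        """Successive approximation for dx/dt = A x + B u with cost
        integral of (x'Qx + u'Ru) dt; the initial gain K0 must be stabilizing."""
        K = K0
        for _ in range(n_iter):
            Ak = A - B @ K               # closed-loop matrix under current policy
            Qk = Q + K.T @ R @ K         # stage cost under current policy
            # Policy evaluation: solve Ak' P + P Ak + Qk = 0 for P
            P = solve_continuous_lyapunov(Ak.T, -Qk)
            # Policy improvement: K <- R^{-1} B' P
            K = np.linalg.solve(R, B.T @ P)
        return K, P

For example, with A = [[0, 1], [-1, -2]] (Hurwitz, so K0 = 0 is stabilizing), B = [[0], [1]], and Q = R = I, a few iterations reproduce the gain obtained from the algebraic Riccati equation. The learning-based designs studied in the monograph replace the Lyapunov solve, which requires knowing (A, B), with computations on measured trajectories.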
The recent success of Reinforcement Learning and related methods can be attributed to several key factors. First, learning is driven by reward signals obtained through interaction with the environment. Second, it is closely related to human learning behavior. Third, it rests on a solid mathematical foundation. Nonetheless, conventional Reinforcement Learning theory exhibits shortcomings, particularly in continuous environments and in addressing the stability and robustness of the controlled process.
In this monograph, the authors build on Reinforcement Learning to present a learning-based approach for controlling dynamical systems from real-time data, and they review major developments in this relatively young field. In doing so, they develop a framework for learning-based control theory that shows how to learn suboptimal controllers directly from input-output data, as sketched below.
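As a concrete, hypothetical illustration (a toy second-order plant, forward-Euler simulation, sinusoidal exploration, and step sizes chosen for exposition; none of this is taken from the monograph), the policy iteration above admits an off-policy, data-driven form in which the plant matrices never enter the learning update. Along any trajectory of dx/dt = A x + B u driven by u = -K x + e with exploration signal e, the change of x'Px over an interval equals -∫ x'(Q + K'RK)x dτ + 2∫ e'(RK⁺)x dτ, which is linear in the unknowns P and RK⁺ and can therefore be estimated by least squares from input-state data:

    import numpy as np

    # Plant, used only to generate data; the learner never reads A or B.
    A = np.array([[0.0, 1.0], [-1.0, -2.0]])
    B = np.array([[0.0], [1.0]])
    Q, R = np.eye(2), np.eye(1)
    n, m = 2, 1
    dt, sub, N = 1e-3, 10, 400     # Euler step, steps per interval, intervals

    def rollout(K):
        """Run u = -K x + e; per interval, record the increment of
        kron(x, x) and the integrals of kron(x, x) and kron(e, x)."""
        x, t = np.array([1.0, -1.0]), 0.0
        Dxx, Ixx, Iex = [], [], []
        for _ in range(N):
            xx0 = np.kron(x, x)
            ixx, iex = np.zeros(n * n), np.zeros(m * n)
            for _ in range(sub):
                # sum-of-sinusoids exploration signal
                e = np.array([0.3 * sum(np.sin(w * t) for w in (1.0, 5.0, 9.0, 13.0))])
                ixx += np.kron(x, x) * dt
                iex += np.kron(e, x) * dt
                x = x + dt * (A @ x + B @ (-K @ x + e))   # forward Euler
                t += dt
            Dxx.append(np.kron(x, x) - xx0)
            Ixx.append(ixx)
            Iex.append(iex)
        return np.array(Dxx), np.array(Ixx), np.array(Iex)

    K = np.zeros((m, n))     # initial stabilizing gain (A is Hurwitz here)
    for _ in range(6):       # learning steps
        Dxx, Ixx, Iex = rollout(K)
        # Interval identity: P.Dxx - 2 (RK⁺).Iex = -(Q + K'RK).Ixx,
        # a linear least-squares problem in vec(P) and vec(RK⁺).
        Theta = np.hstack([Dxx, -2.0 * Iex])
        b = -Ixx @ (Q + K.T @ R @ K).reshape(-1)
        z, *_ = np.linalg.lstsq(Theta, b, rcond=None)
        P = z[: n * n].reshape(n, n)                       # learned value matrix
        K = np.linalg.solve(R, z[n * n:].reshape(m, n))    # improved gain

Under a persistency-of-excitation condition on e, the regression is solvable and the learned gains track the model-based iterates; rigorous conditions for convergence, stability, and robustness of such data-driven schemes are precisely the kind of results developed in the monograph.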
There are three main challenges in the development of learning-based control. First, there is a need to generalize existing recursive methods. Second, and in fundamental contrast with conventional Reinforcement Learning, stability and robustness are issues that must be addressed for safety-critical engineering systems such as self-driving cars. Third, the data efficiency of Reinforcement Learning algorithms needs to be addressed for such safety-critical systems.
This monograph provides the reader with an accessible primer on a new direction in control theory that is still in its infancy, namely Learning-Based Control Theory, which is closely tied to the literature on safe Reinforcement Learning and Adaptive Dynamic Programming.