By Brenna D. Argall, Ecole Polytechnique Fédérale de Lausanne (EPFL), Switzerland, brennadee.argall@epfl.ch | Eric L. Sauser, Ecole Polytechnique Fédérale de Lausanne (EPFL), Switzerland, eric.sauser@epfl.ch | Aude G. Billard, Ecole Polytechnique Fédérale de Lausanne (EPFL), Switzerland, aude.billard@epfl.ch
Demonstration learning is a powerful and practical technique to develop robot behaviors. Even so, development remains a challenge, and possible demonstration limitations, for example correspondence issues between the robot and demonstrator, can degrade policy performance. This work presents an approach for policy improvement through a tactile interface located on the body of the robot. We introduce the Tactile Policy Correction (TPC) algorithm, which employs tactile feedback for the refinement of a demonstrated policy, as well as its reuse for the development of other policies. The TPC algorithm is validated on a humanoid robot performing grasp positioning tasks. The performance of the demonstrated policy is found to improve with tactile corrections. Tactile guidance is also shown to enable the development of policies able to successfully execute novel, undemonstrated tasks. We further show that different modalities, namely teleoperation and tactile control, provide information about allowable variability in the target behavior in different areas of the state space.
The development of behaviors for robot motion control is fundamental for robot operation in physical environments, yet is challenged by many factors such as sensor noise and approximate actuation models. Techniques like demonstration learning, which seed a training dataset with examples of behavior execution by a task expert, are both powerful and practical for the development of motion control behaviors. Further endowing a robot with the ability to continue learning from experience after demonstration can improve robustness to poor demonstrators or demonstration interfaces, and can also enable behavior adaptation to changes in the environment or task requirements. Tactile Guidance for Policy Adaptation introduces an approach for continuing motion control learning after demonstration that capitalizes on the availability of multiple sensor modalities through which a human teacher may transfer domain knowledge. Of particular note is that motion control corrections are provided through tactile sensors located on the body of the robot. The approach is validated on a high degree-of-freedom robot system, for which both demonstration and correction are challenging. Tactile Guidance for Policy Adaptation should be of interest to those considering the use of demonstration and machine learning for the development of robot behaviors, in particular for high degree-of-freedom humanoids, as well as to those interested in the transfer of task knowledge through multiple sensor modalities.