SUBJECT: Ph.D. Dissertation Defense
BY: Visak Chadalavada Vijay Kumar
TIME: Tuesday, September 7, 2021, 1:00 p.m.
PLACE: name, Online
TITLE: Learning Control Policies for Fall Prevention and Safety in Bipedal Locomotion
COMMITTEE: Dr. Karen Liu, Chair (CS)
Dr. Sehoon Ha (CS)
Dr.Gregory Turk (CS)
Dr. Gregory Sawicki (ME)
Dr. Ye Zhao (ME)


The ability to recover from an unexpected external perturbation is a fundamental motor skill in bipedal locomotion. An effective response includes the ability to not just recover balance and maintain stability but also to fall in a safe manner when balance recovery is physically infeasible. For robots associated with bipedal locomotion, such as humanoid robots and assistive robotic devices that aid humans in walking, designing controllers which can provide this stability and safety can prevent damage to robots or prevent injury related medical costs. This is a challenging task because it involves generating highly dynamic motion for a high-dimensional, non-linear, and under-actuated system with contacts. Despite prior advancements in using model-based and optimization methods, challenges such as requirement of extensive domain knowledge, relatively large computational time, and limited robustness to changes in dynamics still make this an open problem. In this thesis, to address these issues we develop learning-based algorithms capable of synthesizing push recovery control policies for two different kinds of robots : Humanoid robots and assistive robotic devices that assist in bipedal locomotion. Our work can be branched into two closely related directions : 1) Learning safe falling and fall prevention strategies for humanoid robots and 2) Learning fall prevention strategies for humans using a robotic assistive device. To achieve this, we introduce a set of Deep Reinforcement Learning (DRL) algorithms to learn control policies that improve safety while using these robots. To enable efficient learning, we present techniques to incorporate abstract dynamical models, curriculum learning and a novel method of building a graph of policies into the learning framework. We also propose an approach to create virtual human walking agents which exhibit similar gait characteristics to real-world human subjects, using which, we learn an assistive device controller to help virtual human return to steady state walking after an external push is applied. Finally, we extend our work on assistive devices and address the challenge of transferring a push-recovery policy to different individuals. As walking and recovery characteristics differ significantly between individuals, exoskeleton policies must be fine-tuned for each person which is a tedious, time consuming and potentially unsafe process. We propose to solve this by posing it as a transfer learning problem, where a policy trained for one individual can adapt to another without fine tuning.