Landing Site Selection and Divert Maneuver Planning with Deep Reinforcement Learning

Figure 1. Concept of the Research

Autonomous hazard detection and avoidance (HD&A) is essential for future planetary landing missions. This research proposes a new integrated framework for identifying safe landing sites and planning in-flight divert maneuvers. Conventional landing site selection algorithms rely on computed local terrain features (e.g., slope and roughness) to find and prioritize candidate landing sites. However, they cannot select the target landing site that maximizes the expected probability of a successful landing while explicitly accounting for future divert maneuvers and observations. This study formulates the HD&A sequence as a Partially Observable Markov Decision Process (POMDP) and uses reinforcement learning to optimize the landing site selection strategy concurrently with the guidance and control policy, so as to maximize the expected probability of a successful landing.

The developed framework was applied to a 3-DOF lunar landing scenario with Lidar observations. The high-dimensional Lidar DEM data were compressed with a trained autoencoder, while stability was ensured by using the ZEM-ZEV (zero-effort-miss/zero-effort-velocity) feedback controller as the baseline control law. The target landing position and the control gain of the ZEM-ZEV controller were adjusted by a reinforcement learning agent, which was trained with both memory-based and memoryless algorithms. The results showed that the agent can adjust the control gain effectively to perform both long and short divert maneuvers.
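To make the role of the baseline controller concrete, the following is a minimal sketch of the classical ZEM-ZEV feedback law under a constant, uniform gravity assumption. It is not the implementation used in this research: the function name `zem_zev_accel`, the gain parameters `k_r` and `k_v`, and the constant `LUNAR_GRAVITY` are hypothetical names chosen for illustration. The default gains (6 and -2) are the classical energy-optimal values; how the learned gain is parameterized in the actual framework is not stated here.

```python
from typing import Tuple

Vec = Tuple[float, float, float]

# Assumption: constant, uniform lunar gravity along -z (m/s^2).
LUNAR_GRAVITY: Vec = (0.0, 0.0, -1.62)


def zem_zev_accel(
    r: Vec,
    v: Vec,
    r_target: Vec,
    v_target: Vec,
    t_go: float,
    k_r: float = 6.0,
    k_v: float = -2.0,
    g: Vec = LUNAR_GRAVITY,
) -> Vec:
    """Commanded acceleration from the ZEM-ZEV feedback law.

    ZEM is the position miss and ZEV the velocity miss at the final time
    if no further control were applied (gravity only).
    """
    if t_go <= 0.0:
        raise ValueError("t_go must be positive")

    # Zero-effort miss: target position minus the coasting position at t_go.
    zem = tuple(
        rf - (ri + vi * t_go + 0.5 * gi * t_go**2)
        for ri, vi, rf, gi in zip(r, v, r_target, g)
    )
    # Zero-effort velocity error: target velocity minus the coasting velocity.
    zev = tuple(vf - (vi + gi * t_go) for vi, vf, gi in zip(v, v_target, g))

    # a = (k_r / t_go^2) * ZEM + (k_v / t_go) * ZEV
    return tuple(
        (k_r / t_go**2) * m + (k_v / t_go) * e for m, e in zip(zem, zev)
    )
```

In a setup like the one described above, a learning agent could supply `r_target` and the gains at each decision step, and the feedback law would turn them into a thrust acceleration command. A divert then amounts to changing `r_target` mid-flight and letting the controller re-plan from the current state.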

This research is a collaborative project with the Space Systems Optimization Group at the Georgia Institute of Technology.

Figure 2. Overall Framework

Related Publication:

[C5] Iiyama, K., Tomita, K., Jagatia, B.A., Nakagawa, T., Ho, K., “Deep reinforcement learning for safe landing site selection with concurrent consideration of divert maneuvers,” AAS/AIAA Astrodynamics Specialist Conference, 2020.

Keidai Iiyama