Abstract
Future missions to the Moon and beyond are likely to involve low-thrust propulsion technologies due to their propellant efficiency. However, low-thrust transfers still present a difficult trajectory design problem. Lyapunov control laws can generate sub-optimal trajectories at minimal computational cost, making them suitable for feasibility studies and as initial guesses for optimisation methods. In this work, we combine Lyapunov control laws with state-dependent weights trained via reinforcement learning to design low-thrust transfers from GTO towards low-altitude Lunar orbits. The agent is able to explore third-body effects during training and learns to remain stable under perturbations during the different transfer phases. Three approaches are investigated: backwards propagation, backwards propagation with freed geometry, and forwards propagation including rendezvous capability with the Lunar sphere of influence (SOI). The last of these proves the most successful, coming within 6.6% of the optimal solution.