Abstract
Interest in using multiple spacecraft in a single mission has been increasing over the last decade. However, a number of problems still need to be addressed before a multiple-spacecraft system can be operated effectively. Automated planning and scheduling is one of them. An intelligent planning and scheduling system, referred to here as a “planner” for short, is not new to single-spacecraft missions: space agencies such as NASA and ESA have already applied this technology to some of their missions. The remaining question is how to plan and schedule a group of spacecraft effectively. Although some experimental solutions exist, a number of crucial problems remain: (i) multiple-spacecraft systems suffer from high-dimensional unpredictability and non-linearity, which traditional planning and scheduling technology cannot handle well; (ii) multiple decision makers need to be incorporated into planning and scheduling; (iii) exploring the huge state space consumes substantial computational resources. In this research, we introduce a novel way to design a planner for future multiple-spacecraft operations, for example a mission with multiple small rovers. We borrow the agent concept from artificial intelligence to abstract the complex spacecraft. As a result, a group of spacecraft can be treated as a multi-agent system without losing the system’s functional integrity, which lets us use an agent-based model that simplifies the planner design. We choose reinforcement learning as the core planning algorithm for each agent’s planner, which makes each agent adaptive enough for a stochastic environment. We have redesigned the reinforcement learning value function so that the proposed planner does not require communication among agents, and cooperative behaviour among agents can emerge without it. We have developed a multi-agent simulator to verify the proposed design.
The simulator is derived from the pursuit problem in the classic grid world domain and is modified to simulate a planetary mission scenario with multiple rovers. We have designed a number of mission scenarios for testing and validating the proposed algorithms. The experimental results show that our algorithm, which relies on function approximation, helps the planner generate suboptimal policies for the agents much more efficiently.