Abstract
This paper investigates resource scheduling for multiple dynamical systems subject to cyber attacks over wireless sensor networks, which are widely deployed in consumer electronics scenarios such as the industrial Internet of Things and collaborative control of smart homes. These scenarios commonly face constrained channel resources and vulnerability to disruptions and attacks. This research therefore focuses on optimal transmission scheduling for wireless sensors in a dynamic game between selective denial-of-service (DoS) attackers and the system's defense. The adversarial interaction between sensors and DoS attackers is modeled as a two-player zero-sum game. The Nash equilibrium of the game yields a stable combination of strategies and indirectly reflects the quality of the communication channel. Unlike metrics that focus on packet transmission performance, this paper works backward from the optimal transmission scheduling strategy to find the maximum off-duty duration of the sensor, which emphasizes the value of up-to-date estimation information and ensures that the trace of the estimation error covariance remains bounded. Next, a necessary and sufficient condition for system stability is derived in terms of the system matrix and the packet loss probability. In addition, because traditional theoretical optimization methods cannot be applied to large-scale systems, an optimal transmission scheduling algorithm based on deep reinforcement learning (DRL) is designed. Numerical results show that the algorithm significantly reduces system energy consumption compared with existing algorithms, such as minmax-Q-learning and approximate dynamic programming, and that the optimal transmission scheduling strategy is periodic.
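To make the notion of a maximum off-duty duration concrete, the following minimal sketch counts how many consecutive steps a sensor can stay silent before the trace of the estimation error covariance exceeds a given bound. It is an illustration under stated assumptions, not the paper's algorithm: it assumes the standard prediction-only recursion P <- A P A^T + Q while no packet arrives, and the names `max_off_duty_duration`, `P_bar`, `trace_bound`, and `horizon`, as well as all numerical values, are hypothetical.

```python
# Illustrative sketch only: assumes the covariance evolves as P <- A P A^T + Q
# whenever the sensor does not transmit (or the packet is lost).
# All names and numbers are hypothetical, not taken from the paper.

def mat_mul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*X)]

def mat_add(X, Y):
    """Add two matrices of equal shape."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def trace(X):
    """Sum of the diagonal entries of a square matrix."""
    return sum(X[i][i] for i in range(len(X)))

def open_loop_step(A, P, Q):
    """One prediction-only update: A P A^T + Q (no measurement received)."""
    return mat_add(mat_mul(mat_mul(A, P), transpose(A)), Q)

def max_off_duty_duration(A, Q, P_bar, trace_bound, horizon=1000):
    """Largest number of consecutive silent steps, starting from P_bar,
    for which trace(P) stays at or below trace_bound.
    The search stops at `horizon` steps, which matters for stable systems
    whose covariance never reaches the bound."""
    P = P_bar
    steps = 0
    while steps < horizon:
        P_next = open_loop_step(A, P, Q)
        if trace(P_next) > trace_bound:
            break
        P = P_next
        steps += 1
    return steps

if __name__ == "__main__":
    A = [[1.2, 0.0], [0.0, 0.5]]      # hypothetical system matrix (one unstable mode)
    Q = [[1.0, 0.0], [0.0, 1.0]]      # hypothetical process noise covariance
    P_bar = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical covariance right after a successful update
    print(max_off_duty_duration(A, Q, P_bar, trace_bound=20.0))
```

With these made-up values the sensor can remain off duty for five consecutive steps before the trace bound of 20 would be violated. A tighter bound or a more unstable system matrix shortens that window.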