RADIATION PROTECTION ›› 2025, Vol. 45 ›› Issue (5): 517-529.


Research and development of a dynamic optimization decision model for nuclear emergency evacuation based on deep reinforcement learning

LI Mingye1,2, YAO Rentai1,2, GUO Huan1,2, ZHANG Junfang1,2, LV Minghua1,2, XU Xiangjun1,2, NIU Yanjing1,2, JIA Bohui3   

  1. China Institute for Radiation Protection, Taiyuan 030006;
    2. CNNC Key Laboratory for Nuclear Environment Simulation & Evaluation Technology, Taiyuan 030006;
    3. Forlinx Embedded Technology Co., Hebei Baoding 071052
  • Received: 2025-01-04   Online: 2025-09-20   Published: 2026-01-14

Abstract: Timely and effective evacuation during a nuclear accident is critical to minimizing radiation exposure and ensuring public safety. Traditional path planning algorithms can quickly compute static shortest paths, but they adapt poorly to dynamically changing radiation fields. This paper proposes a dynamic optimization decision model for nuclear emergency evacuation based on deep reinforcement learning (the MD-DQN algorithm model). The evacuation problem is formulated as a Markov decision process (MDP) whose state space comprises dynamic radiation field information, road network information, and the real-time location. A multifactorial reward function that jointly considers path length, radiation exposure, and directional guidance is designed to drive the intelligent agent to learn the optimal dynamic evacuation decision-making strategy autonomously. The convergence and generalization performance of the algorithm are further improved by optimizing the network structure design and the instant reward mechanism. Simulation experiments show that, compared with the traditional Dijkstra’s algorithm and the A* algorithm, the MD-DQN algorithm avoids high-risk areas in time, significantly reduces the radiation dose received by personnel during evacuation, and offers better real-time path adjustment ability and environmental adaptability. The results can provide an efficient and intelligent decision support tool for practical nuclear emergency evacuation, and offer new research ideas for future intelligent decision-making driven by multi-source radiation, multi-agent systems, and real-time data.
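
The abstract names three reward components (path length, radiation exposure, directional guidance) but does not give their functional form. The following is a minimal illustrative sketch of how such a multifactorial step reward might be combined, not the MD-DQN reward itself. The function name `evacuation_reward`, the `RewardWeights` class, the weight values, and the linear combination are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardWeights:
    """Hypothetical weights; the paper's actual values are not given in the abstract."""
    path: float = 1.0       # penalty per unit of distance travelled
    dose: float = 5.0       # penalty per unit of dose rate at the new location
    direction: float = 2.0  # bonus per unit of progress toward the exit


def evacuation_reward(pos, next_pos, exit_pos, dose_rate, w=RewardWeights()):
    """Step reward for moving from pos to next_pos (assumed linear form).

    pos, next_pos, exit_pos: (x, y) coordinates in the same length unit.
    dose_rate: radiation dose rate at next_pos, read from the current
               (time-varying) radiation field.
    """
    step_length = math.dist(pos, next_pos)
    # Positive when the move brings the agent closer to the exit.
    progress = math.dist(pos, exit_pos) - math.dist(next_pos, exit_pos)
    return -w.path * step_length - w.dose * dose_rate + w.direction * progress


if __name__ == "__main__":
    exit_pos = (3.0, 0.0)
    # Same move toward the exit, evaluated in a clean and a contaminated cell.
    print(evacuation_reward((0.0, 0.0), (1.0, 0.0), exit_pos, dose_rate=0.0))
    print(evacuation_reward((0.0, 0.0), (1.0, 0.0), exit_pos, dose_rate=0.5))
```

Because `dose_rate` is sampled from the field at the time of the move, the same road segment can yield different rewards as the plume evolves, which is the property a static shortest-path cost lacks.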

Key words: deep reinforcement learning, nuclear emergency evacuation, dynamic evacuation decision, Markov decision process, MD-DQN

CLC Number: 

  • TL73