Frontiers of Data and Computing ›› 2026, Vol. 8 ›› Issue (1): 77-90.

CSTR: 32002.14.jfdc.CN10-1649/TP.2026.01.007

doi: 10.11871/jfdc.issn.2096-742X.2026.01.007

• Technology and Application •

A Dynamic Scheduling Method for Police Resources Based on Bayesian Networks and Reinforcement Learning

LIU Chunlong1(),MA Qiuping1,WANG Runsheng2,HU Jinming1,3,HU Xiaofeng1,3,*()   

  1. School of Information and Cyber Security, People’s Public Security University of China, Beijing 100038, China
    2. Senior Police Officer Academy, Ministry of Public Security, Beijing 100045, China
    3. Key Laboratory of Security Prevention Technology and Risk Assessment, Ministry of Public Security, People’s Public Security University of China, Beijing 102623, China
  • Received:2025-06-22 Online:2026-02-20 Published:2026-02-02
  • Contact: HU Xiaofeng E-mail: 2228154580@qq.com; huxiaofeng@ppsuc.edu.cn

Abstract:

[Objective] Traditional fixed police resource allocation models cannot respond promptly to dynamic changes in regional crime risk and lack coordinated, dynamic optimization across multiple types of police resources. [Methods] To address these issues, this paper proposes a dynamic police resource scheduling method based on Bayesian networks and reinforcement learning. The method first uses a Bayesian network to assess crime risk in each region, and then applies reinforcement learning to derive an optimal police resource allocation strategy. To verify the method's effectiveness, five resource allocation plans were designed using a district of a large northern city as a case study. [Results] Experimental results show that among the reinforcement learning models, the DQN algorithm achieved the best training performance (reward value of 1,755.82), and the reinforcement learning method reduced the expected risk value by 6.68% compared with traditional allocation methods. Nonlinear fitting of the resource-risk relationship indicates that resource input in the range of 1.1 to 1.2 times the baseline yields the best cost-benefit ratio. [Conclusions] The findings are applicable to the rational allocation of policing resources in urban public security.
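The two-stage pipeline the abstract describes (region-level risk assessment, then reinforcement learning over candidate allocation plans) can be illustrated with a deliberately simplified sketch. The sketch below is not the paper's DQN: it substitutes tabular Q-learning in a single-state (bandit) formulation, and the region risk scores, the five allocation plans, and the multiplicative risk-reduction response model are all hypothetical values invented for illustration.

```python
import random

random.seed(0)

# Hypothetical per-region crime risk scores; in the paper these would
# come from a Bayesian network, not from fixed constants.
REGION_RISK = [0.6, 0.3, 0.1]

# Five candidate allocation plans (patrol units assigned to each of the
# three regions); these plan vectors are assumptions for illustration.
ACTIONS = [(3, 1, 1), (2, 2, 1), (1, 2, 2), (2, 1, 2), (1, 1, 3)]

def expected_risk(alloc):
    # Assumed response model: each unit assigned to a region shrinks
    # that region's residual risk by a factor of 0.8.
    return sum(r * 0.8 ** n for r, n in zip(REGION_RISK, alloc))

# Tabular Q-learning, one Q-value per plan. Reward is the negative
# expected risk, so maximizing reward minimizes risk.
Q = [0.0] * len(ACTIONS)
alpha, eps = 0.1, 0.2
for step in range(2000):
    if random.random() < eps:
        a = random.randrange(len(ACTIONS))          # explore
    else:
        a = max(range(len(ACTIONS)), key=Q.__getitem__)  # exploit
    reward = -expected_risk(ACTIONS[a])
    Q[a] += alpha * (reward - Q[a])                 # incremental update

best = max(range(len(ACTIONS)), key=Q.__getitem__)
print("best plan:", ACTIONS[best],
      "expected risk:", round(expected_risk(ACTIONS[best]), 4))
```

Under this toy response model, the learned policy concentrates units on the highest-risk region, which matches the intuition behind risk-driven scheduling; the paper's DQN replaces the Q-table with a neural network so the state can encode time-varying, Bayesian-network-derived risk.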

Key words: Bayesian networks, reinforcement learning, police resource allocation, crime risk assessment, DQN algorithm, nonlinear regression