Frontiers of Data and Computing ›› 2026, Vol. 8 ›› Issue (3): 181-190.

doi: 10.11871/jfdc.issn.2096-742X.2026.03.015

• Technology and Application •

Intelligent Scoring Method for Police Training Action Based on Multi-Stream Graph-Temporal Fusion Network

ZHANG Peijing1, YAN Jiaxin2, WANG Xiaoxuan3, Li Junjie2,*, ZENG Yunfei1

  1 College of Informatics and Cyber Security, People’s Public Security University of China, Beijing 100038, China
  2 Xing-zhi (Beijing) Technology Research Institute Co., Ltd., Beijing 102629, China
  3 College of Police Law Enforcement Abilities Training, People’s Public Security University of China, Beijing 100038, China
  • Received: 2025-10-31   Online: 2026-06-20   Published: 2026-06-18
  • Contact: Li Junjie   E-mail: zhangpeijing@ppsuc.edu.cn; ljj12393@163.com

Abstract:

[Purpose] Police training assessments that rely on manual scoring suffer from subjective evaluation, low efficiency, insufficient standardization, and difficulty in quantifying action quality. To address these issues, an intelligent scoring method for police training actions based on a Multi-Stream Graph-Temporal Fusion Network (MS-GTFN) is proposed, providing a reliable quality evaluation reference for police officers who train independently. [Methods] First, four types of spatiotemporal feature streams (the joint stream, the bone stream, and their corresponding motion streams) are constructed to encode both the structural and the dynamic characteristics of movements. Second, Graph Convolutional Networks (GCNs) and Temporal Convolutional Networks (TCNs) are employed in parallel to extract fused spatiotemporal features of actions. Third, channel attention (CA) and spatial self-attention (SSA) modules are introduced to strengthen the model’s focus on key features. Finally, a multi-layer perceptron (MLP) predicts the action score. [Results] The proposed model is trained and validated on a self-built police training dataset. On the action score prediction task, it achieves an MAE of 0.4553, an MSE of 0.3507, and an R² of 0.9306. [Conclusions] The method shows clear advantages in fusing training action features and in scoring accuracy, providing more reliable support for intelligent scoring of police training action quality.
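
The four input streams and the three reported metrics can be illustrated with a minimal sketch. The abstract does not give the skeleton topology or the exact stream definitions, so everything below is an assumption: the 4-joint chain in BONE_PAIRS is hypothetical, the bone stream is taken as the child joint minus its parent joint, and the motion streams are taken as frame-to-frame differences (a convention common in multi-stream skeleton models, not confirmed as the authors' formulation). The helper names are illustrative only.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, ...]      # coordinates of one joint
Frame = List[Point]            # all joints in one frame
Stream = List[Frame]           # all frames in one clip

# Hypothetical 4-joint chain as (child, parent) pairs; not the paper's skeleton.
BONE_PAIRS: List[Tuple[int, int]] = [(1, 0), (2, 1), (3, 2)]


def _sub(a: Point, b: Point) -> Point:
    """Element-wise difference a - b."""
    return tuple(x - y for x, y in zip(a, b))


def bone_stream(joints: Stream, bone_pairs=BONE_PAIRS) -> Stream:
    """Assumed definition: bone vector = child joint - parent joint.
    Joints without a parent (the root) keep a zero vector."""
    out: Stream = []
    for frame in joints:
        zero = tuple(0.0 for _ in frame[0])
        bones = [zero] * len(frame)
        for child, parent in bone_pairs:
            bones[child] = _sub(frame[child], frame[parent])
        out.append(bones)
    return out


def motion_stream(stream: Stream) -> Stream:
    """Assumed definition: difference between consecutive frames.
    The last frame is zero-padded so the clip length is preserved."""
    motion = [[_sub(nxt, cur) for cur, nxt in zip(f0, f1)]
              for f0, f1 in zip(stream, stream[1:])]
    zero = tuple(0.0 for _ in stream[0][0])
    motion.append([zero] * len(stream[0]))
    return motion


def build_streams(joints: Stream) -> Dict[str, Stream]:
    """Return the joint, bone, joint-motion, and bone-motion streams."""
    bones = bone_stream(joints)
    return {
        "joint": joints,
        "bone": bones,
        "joint_motion": motion_stream(joints),
        "bone_motion": motion_stream(bones),
    }


def regression_metrics(y_true: Sequence[float], y_pred: Sequence[float]):
    """Standard MAE, MSE, and coefficient of determination (R^2)."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - sum(e * e for e in errors) / ss_tot
    return mae, mse, r2
```

In a full pipeline, each of the four streams would feed its own GCN/TCN branch before fusion; the sketch stops at stream construction and metric computation because the abstract gives no further architectural detail.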

Key words: graph convolutional networks, temporal convolutional networks, multi-stream fusion, attention module, police force training, scoring of action quality