Frontiers of Data and Computing ›› 2025, Vol. 7 ›› Issue (2): 109-119.

CSTR: 32002.14.jfdc.CN10-1649/TP.2025.02.011

doi: 10.11871/jfdc.issn.2096-742X.2025.02.011

• Technology and Application •

Video Action Recognition Model Based on Attention and Relative Average Discriminator

WANG Qijun1, LIU Tinglong2,*

  1. Dalian Port Communication Engineering Co., Ltd., Dalian, Liaoning 116001, China
  2. Center for Information Technology, Dalian Polytechnic University, Dalian, Liaoning 116034, China
  • Received: 2024-09-17 Online: 2025-04-20 Published: 2025-04-23
  • Contact: LIU Tinglong E-mail: qjwang@cmhk.com; liutl@dlpu.edu.cn

Abstract:

[Objective] In video action recognition, a key research question is how to extract richer action features. These features comprise both temporal and spatial components, and a model that cannot extract enough spatiotemporal features will recognize actions poorly. [Methods] Super-resolution technology converts low-resolution images into high-resolution ones, and some researchers have applied it to low-resolution video action recognition. However, these methods mainly enhance the resolution of video frames as a whole and pay insufficient attention to the features of the action itself. To address this issue, this paper proposes a super-resolution network model based on attention mechanisms and a relative average discriminator (ARADSP). [Results] The proposed method uses attention mechanisms to focus on the action features in a video that carry more spatiotemporal information, and uses the relative average discriminator to improve data quality and stability. Experiments are conducted on the HMDB51, UCF101, and Something-Something V1 & V2 datasets. Extensive experimental results demonstrate the effectiveness of the proposed method for video action recognition.

Key words: attention, super-resolution, generative adversarial network, relative average discriminator, low-resolution action recognition
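
The abstract does not state the discriminator loss. As a rough illustration only, the sketch below assumes that the "relative average discriminator" corresponds to the relativistic average formulation commonly used in GAN-based super-resolution, in which the discriminator scores each sample relative to the mean score of the opposite class. The function name `ra_losses` and its arguments are hypothetical and are not taken from the paper.

```python
import math


def _sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def _mean(values):
    return sum(values) / len(values)


def ra_losses(real_logits, fake_logits, eps=1e-12):
    """Relativistic average GAN losses (assumed formulation, not the paper's code).

    real_logits: raw discriminator outputs C(x_r) for real high-resolution frames
    fake_logits: raw discriminator outputs C(x_f) for generated frames
    Returns (discriminator_loss, generator_loss).
    """
    mean_real = _mean(real_logits)
    mean_fake = _mean(fake_logits)

    # D_Ra(x_r, x_f) = sigmoid(C(x_r) - E[C(x_f)]): how much more realistic
    # a real sample looks than the average generated sample.
    d_real = [_sigmoid(r - mean_fake) for r in real_logits]
    # D_Ra(x_f, x_r) = sigmoid(C(x_f) - E[C(x_r)]).
    d_fake = [_sigmoid(f - mean_real) for f in fake_logits]

    # Discriminator: push d_real toward 1 and d_fake toward 0.
    loss_d = (-_mean([math.log(p + eps) for p in d_real])
              - _mean([math.log(1.0 - p + eps) for p in d_fake]))
    # Generator: the symmetric objective with the targets swapped.
    loss_g = (-_mean([math.log(1.0 - p + eps) for p in d_real])
              - _mean([math.log(p + eps) for p in d_fake]))
    return loss_d, loss_g
```

Under this assumed formulation the generator receives gradient from both real and generated samples, which is the usual argument for its more stable training compared with a standard discriminator.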