Bidirectional Prediction BiP-GAN Pedestrian Video Anomaly Event Automatic Detection

ZHANG Jie, YANG Xue, GONG Zhilong, GUAN Qingfeng

Citation: ZHANG Jie, YANG Xue, GONG Zhilong, GUAN Qingfeng. Bidirectional Prediction BiP-GAN Pedestrian Video Anomaly Event Automatic Detection[J]. Geomatics and Information Science of Wuhan University. DOI: 10.13203/j.whugis20240259

Funding:

National Natural Science Foundation of China (42271449).

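Details
    About the author:

    ZHANG Jie, master's student; research focus: video and image processing. 18333981575@163.com

    Corresponding author:

    YANG Xue, associate professor and master's supervisor; research focus: crowdsourced sensing and services for traffic scenarios. yangxue@cug.edu.cn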

  • Abstract: Video surveillance systems play a vital role in security and supervision. Automatically and accurately identifying abnormal pedestrian behaviors or events that pose potential security threats, without human intervention and without the burden of manually reviewing large volumes of surveillance footage, is an active research topic in computer vision. The rapid development of artificial intelligence in recent years has greatly improved video anomaly detection, but distinguishing the subtle differences between abnormal and normal behavior in changing and diverse environments remains challenging. This paper constructs a new bidirectional prediction GAN (BiP-GAN) model for pedestrian anomaly detection in video. The model consists mainly of a CCA-UNet generator and a Global-Patch discriminator. Because optical flow models are good at capturing flow changes and the motion characteristics of image sequences, an optical flow model is used in computing the loss functions of the generator and discriminator. The CCA-UNet generator builds on the classic U-Net and adds a Criss-Cross Attention (CCA) module to strengthen the model's ability to recognize key features of behavior in video. The Global-Patch discriminator combines a global discriminator with a patch discriminator, exploiting their respective strengths in perceiving global and local features, which improves the robustness and accuracy of the model. The BiP-GAN pre-training strategy uses a bidirectional prediction mode, with forward prediction from the preceding 4 frames and backward prediction from the following 4 frames, so that the model makes better use of the context of the image sequence and generates higher-quality predicted frames. In addition, BiP-GAN adopts a learning rate decay method that combines warm-up with a cosine annealing function (CAF), which speeds up the search for the global optimum and thereby saves computational resources. BiP-GAN was validated and analyzed on the public datasets CUHK Avenue, UCSD Ped2, and ShanghaiTech, where its average AUC values were 87.3, 96.2, and 73.9, respectively, all higher than those of existing baseline models (e.g., Ada-GAN, Con-GAN, Mul-GAN). Ablation experiments demonstrate the effectiveness of the CCA-UNet generator, the Global-Patch discriminator, the bidirectional prediction strategy, and the learning rate decay method combining warm-up and CAF.
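The learning rate schedule mentioned in the abstract, warm-up followed by cosine annealing, can be illustrated with a minimal sketch. The abstract does not give the exact formula or hyperparameters, so everything here is an assumption for illustration: the function name `warmup_cosine_lr`, the linear shape of the warm-up, and the values of `total_steps`, `warmup_steps`, `base_lr`, and `min_lr` are hypothetical and are not taken from the paper.

```python
import math


def warmup_cosine_lr(step, total_steps, warmup_steps, base_lr, min_lr=0.0):
    """Return the learning rate for a 0-indexed training step.

    Assumed schedule (not confirmed by the paper): a linear warm-up over
    the first `warmup_steps` steps, then cosine annealing from `base_lr`
    down to `min_lr` by `total_steps`.
    """
    if step < warmup_steps:
        # Linear ramp: reaches base_lr on the last warm-up step.
        return base_lr * (step + 1) / warmup_steps
    # Fraction of the annealing phase completed, clamped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(progress, 1.0)
    # Cosine curve: factor 1 at progress = 0, factor 0 at progress = 1.
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_lr + (base_lr - min_lr) * cosine


# Hypothetical values chosen only to show the shape of the schedule.
schedule = [
    warmup_cosine_lr(s, total_steps=100, warmup_steps=10, base_lr=2e-4)
    for s in range(101)
]
```

The small initial rate keeps early updates of the generator and discriminator from being too aggressive, and the cosine phase then lowers the rate smoothly rather than in abrupt steps. In a real training loop the returned value would be assigned to the optimizer's learning rate at each step.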
Publication history
  • Received: 2025-01-25
