Citation: PAN Wenkang, SHAO Zhenfeng, LIAO Ming, LI Xianyi, SONG Yang. Ship Abnormal Event Detection Based on Deep Spatiotemporal Autoencoder Network and Multi-instance Learning[J]. Geomatics and Information Science of Wuhan University. DOI: 10.13203/j.whugis20220121

Ship Abnormal Event Detection Based on Deep Spatiotemporal Autoencoder Network and Multi-instance Learning

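  • Summary: Abnormal event detection is an important supporting technology for traffic safety prevention and control and has long been a research focus in information science. This paper proposes a ship abnormal event detection method based on a deep spatiotemporal autoencoder network and multi-instance learning. Because precise frame-level annotations are currently unavailable for model training, a multi-instance learning model is introduced: each video is treated as a bag and its video segments as the instances in that bag, and the network automatically learns a deep anomaly ranking model that predicts scores for anomalous video segments. For feature extraction, a deep spatiotemporal autoencoder network is proposed. In the spatial autoencoder, the upsampling layer in the decoder is replaced with a pixel shuffle layer to obtain more accurate RGB features. In the temporal autoencoder, a variance-based attention mechanism is introduced to highlight regions with large motion changes, so that fast-moving objects incur a larger motion loss, which helps detect abnormal events. A new large-scale ship video dataset is also constructed, comprising 100 surveillance videos of real scenes and 5 classes of real abnormal events: loitering at sea, docking at a non-port location, departing from a non-port location, overspeed, and boundary crossing. The dataset can be used for model training and testing. Experimental results show that, compared with the traditional two-stream network and image-reconstruction-based detection methods, the proposed method raises abnormal event detection accuracy from 71.7% to 82.4%, demonstrating its effectiveness for ship abnormal event detection.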
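The bag-and-instance formulation can be illustrated with a small sketch. The abstract does not state the exact ranking objective, so the code below assumes a common hinge-style ranking loss that compares the highest-scoring segment of an anomalous video with the highest-scoring segment of a normal video. The function name `mil_ranking_loss` and the `margin` parameter are hypothetical and chosen only for illustration.

```python
def mil_ranking_loss(anomalous_bag, normal_bag, margin=1.0):
    """Hinge-style ranking loss between two bags of segment scores.

    Each bag is a list of per-segment anomaly scores in [0, 1] for one video.
    Only video-level labels are needed: we know which bag comes from an
    anomalous video, but not which of its segments is anomalous.
    ASSUMPTION: the hinge form and the margin are illustrative, not taken
    from the paper.
    """
    if not anomalous_bag or not normal_bag:
        raise ValueError("each bag needs at least one segment score")
    # The top-scoring segment stands in for the whole bag.
    top_anomalous = max(anomalous_bag)
    top_normal = max(normal_bag)
    # Penalize the model unless the anomalous bag outranks the normal bag
    # by at least the margin.
    return max(0.0, margin - top_anomalous + top_normal)


# Example: one anomalous video and one normal video, three segments each.
loss = mil_ranking_loss([0.1, 0.9, 0.2], [0.1, 0.3, 0.2])
print(loss)
```

Taking the maximum over each bag is what lets training proceed without frame-level labels: the loss only requires that some segment of the anomalous video scores higher than every segment of the normal video.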

     

    Abstract: Objectives: RGB and motion features are very important for abnormal event detection in ship video. These features must be extracted more accurately and applied to detecting abnormal events in ship video. Because frame-level annotations are very costly, the model must also be trained with video-level annotations only. In addition, ship video abnormal event datasets are scarce. Method: We acquired a large number of surveillance videos of ships on the sea surface and, after processing, constructed a dataset of ship video abnormal events. We also propose a ship abnormal event detection model based on a deep spatiotemporal autoencoder network and multi-instance learning. It uses a deep multi-instance ranking framework that requires only video-level labels, not frame-level annotations. In the spatial autoencoder, the deconvolution layer in the decoder is replaced by a pixel shuffle layer to obtain more accurate RGB features. In the temporal autoencoder, a variance-based attention mechanism is introduced to highlight regions with large motion variation, so that fast-moving objects have a larger motion loss. Results: We compared the proposed method with two benchmark methods and a previous state-of-the-art method. The experimental results show that the proposed method has higher detection accuracy. We also observed that variance-based attention can significantly improve detection of events involving fast motion, such as unexpected stopping and overspeed. Conclusion: These results show that RGB and motion features play an important role in ship abnormal event detection and support the necessity of the multi-instance ranking model.
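The effect of variance-based attention on the motion loss can be sketched as follows. The abstract does not give the formula, so this is only one plausible reading: the variance of the motion values at each spatial location is normalized into a weight, and that weight scales the squared reconstruction error at the same location. The names `variance_attention` and `weighted_motion_loss`, and the sum-to-one normalization, are assumptions made for illustration.

```python
from statistics import pvariance


def variance_attention(features):
    """Turn per-location motion vectors into attention weights that sum to 1.

    `features` is a list of locations, each a list of motion values
    (for example, channel responses at that location).
    ASSUMPTION: weights are variances divided by their total; the paper may
    use a different normalization.
    """
    variances = [pvariance(vec) for vec in features]
    total = sum(variances)
    if total == 0.0:
        # No location stands out, so fall back to uniform weights.
        return [1.0 / len(features)] * len(features)
    return [v / total for v in variances]


def weighted_motion_loss(target, reconstruction):
    """Squared reconstruction error, weighted by variance-based attention."""
    weights = variance_attention(target)
    loss = 0.0
    for w, t_vec, r_vec in zip(weights, target, reconstruction):
        sq_err = sum((t - r) ** 2 for t, r in zip(t_vec, r_vec))
        loss += w * sq_err
    return loss


# Example: a static location and a location with strong motion variation.
target = [[0.0, 0.0], [0.0, 2.0]]
reconstruction = [[1.0, 1.0], [0.0, 0.0]]
print(variance_attention(target))
print(weighted_motion_loss(target, reconstruction))
```

In this toy case all of the weight falls on the second location, so its reconstruction error dominates the loss. This matches the stated intent that fast-moving objects contribute a larger motion loss.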

     
