An Object Tracking Method Based on Multi-stage Features and Asymmetric-Dilated Convolution

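  • Summary: Object tracking is an active research topic in computer vision. Tracking methods based on correlation filters perform well, but the hand-crafted feature descriptors they apply to images during feature extraction have certain limitations. To obtain a more robust feature representation of images, a convolutional neural network is used for feature extraction during tracking and combined with correlation filtering, yielding an object tracking method based on multi-stage features and asymmetric-dilated convolution. A ResNet50 network embedded with asymmetric-dilated convolution modules serves as the feature extraction network; feature maps are output from multiple stages of the network, and the target is detected and localized by correlation filtering. The proposed method is evaluated on the OTB100 video dataset: with a distance threshold of 20 pixels, the distance precision reaches 85.38%; with an overlap threshold of 50%, the overlap precision reaches 80.42%. The experimental results verify the accuracy of the proposed method and show that it is fairly robust to complex backgrounds, occlusion, and rotational deformation.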

     

    Abstract:
    Objectives: Object tracking is an active research topic in computer vision. Methods based on correlation filters perform well in object tracking, but the hand-crafted feature descriptors used during feature extraction have certain limitations. Convolutional neural networks (CNNs) have been widely used in computer vision, natural language processing, and other fields; they learn their network weights from training samples and can thereby extract deep features of images. To obtain a more robust feature representation of images, a CNN is used to extract image features in object tracking.
    Methods: Combining a CNN with correlation filters, we propose an object tracking method based on multi-stage features and asymmetric-dilated convolution. A ResNet50 network embedded with an asymmetric-dilated convolution block serves as the feature extraction network; it outputs feature maps from multiple stages of the network, which the correlation filters use for object detection and localization.
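As a rough illustration of how correlation filters can localize a target from multi-stage feature maps, the sketch below trains one multi-channel linear correlation filter per stage in the Fourier domain and sums the per-stage responses. It is not the authors' implementation: the closed-form ridge-regression filter, the Gaussian label, the regularizer `lam`, the fusion `weights`, and all function names are assumptions made for this example, and the feature maps are assumed to have already been resized to a common spatial size.

```python
import numpy as np

def gaussian_label(h, w, sigma=2.0):
    """Gaussian regression target with its peak wrapped to index (0, 0)."""
    ys = np.arange(h) - h // 2
    xs = np.arange(w) - w // 2
    g = np.exp(-(ys[:, None] ** 2 + xs[None, :] ** 2) / (2.0 * sigma ** 2))
    return np.fft.ifftshift(g)

def train_filter(feat, label, lam=1e-2):
    """Closed-form multi-channel filter for one stage; feat has shape (H, W, C)."""
    X = np.fft.fft2(feat, axes=(0, 1))
    Y = np.fft.fft2(label)
    energy = np.sum(X * np.conj(X), axis=2).real  # summed over channels
    return Y[..., None] * np.conj(X) / (energy[..., None] + lam)

def respond(filt, feat):
    """Response map of a trained filter on a new feature map of the same shape."""
    Z = np.fft.fft2(feat, axes=(0, 1))
    return np.real(np.fft.ifft2(np.sum(filt * Z, axis=2)))

def locate(filters, feats, weights):
    """Weighted sum of per-stage responses (hypothetical fusion rule), then argmax."""
    total = sum(w * respond(f, z) for f, z, w in zip(filters, feats, weights))
    dy, dx = np.unravel_index(np.argmax(total), total.shape)
    return int(dy), int(dx)

# Usage with random stand-ins for two stages of CNN features.
rng = np.random.default_rng(0)
stages = [rng.standard_normal((32, 32, c)) for c in (4, 8)]
label = gaussian_label(32, 32)
filters = [train_filter(s, label) for s in stages]
moved = [np.roll(s, (3, 5), axis=(0, 1)) for s in stages]  # simulated target motion
print(locate(filters, moved, weights=[1.0, 0.5]))
```

The peak position is a circular displacement in feature-map cells, so negative shifts wrap around to large indices, and a real tracker would still need to map it back to image coordinates.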
    Results: The proposed method is tested on the OTB100 video dataset. The distance precision reaches 85.38% at a distance threshold of 20 pixels, and the overlap precision reaches 80.42% at an overlap threshold of 50%.
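For readers unfamiliar with the two metrics, the following minimal sketch shows how distance precision and overlap precision are commonly computed from per-frame bounding boxes. The `(x, y, width, height)` box format, the function names, and the choice of an inclusive distance threshold with a strict overlap threshold are assumptions for illustration; the evaluation code used in the paper may differ in such details.

```python
from math import hypot

def center_error(pred, gt):
    """Euclidean distance in pixels between the centers of two (x, y, w, h) boxes."""
    px, py, pw, ph = pred
    gx, gy, gw, gh = gt
    return hypot((px + pw / 2.0) - (gx + gw / 2.0),
                 (py + ph / 2.0) - (gy + gh / 2.0))

def iou(pred, gt):
    """Intersection over union of two (x, y, w, h) boxes."""
    px, py, pw, ph = pred
    gx, gy, gw, gh = gt
    iw = max(0.0, min(px + pw, gx + gw) - max(px, gx))
    ih = max(0.0, min(py + ph, gy + gh) - max(py, gy))
    inter = iw * ih
    union = pw * ph + gw * gh - inter
    return inter / union if union > 0 else 0.0

def distance_precision(preds, gts, threshold=20.0):
    """Fraction of frames whose center error is within the distance threshold."""
    hits = sum(center_error(p, g) <= threshold for p, g in zip(preds, gts))
    return hits / len(preds)

def overlap_precision(preds, gts, threshold=0.5):
    """Fraction of frames whose IoU exceeds the overlap threshold."""
    hits = sum(iou(p, g) > threshold for p, g in zip(preds, gts))
    return hits / len(preds)
```

Both functions are evaluated over all frames of a sequence, so a tracker that drifts away from the target for half of a video scores about 0.5 on each metric for that sequence.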
    Conclusions: The experimental results verify the accuracy of the proposed method, which is relatively robust under conditions such as complex backgrounds, occlusion, and rotational deformation.

     
