WEI Haodong, YI Yaohua, YU Changhui, LIN Liyu. Text Super-resolution Method with Attentional Mechanism and Sequential Units[J]. Geomatics and Information Science of Wuhan University. DOI: 10.13203/j.whugis20220158

Text Super-resolution Method with Attentional Mechanism and Sequential Units

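  • Summary: Text in street view images is a key cue for perceiving and understanding a scene, and the lack of detail in the text regions of low-resolution street view images lowers text recognition accuracy. Text super-resolution improves recognition accuracy by enhancing the edge and texture details of text regions. This paper proposes a text super-resolution method for street view images that fuses attention with sequential units. First, a hybrid residual attention structure extracts spatial and channel information from the image text regions and fuses the features, while the sequential unit extracts sequential prior information among the text in the image through a bidirectional gated recurrent structure. Then, gradient prior knowledge is used as a constraint to reconstruct the text regions of the street view image. Comparative analysis is carried out on TextZoom real-scene images and synthetic text images. The experimental results show that the reconstructed text regions have clear edges and rich texture details, and that the method can improve text recognition accuracy on street view images.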

     

    Abstract: Objectives: Text in street view images is a key clue for perceiving and understanding scene information. Low-resolution street view images lack detail in the text regions, which lowers recognition accuracy. Super-resolution can be introduced as a pre-processing step to reconstruct the edge and texture details of the text regions. To improve text recognition accuracy, we propose a text super-resolution network that combines an attention mechanism with sequential units. Methods: A hybrid residual attention structure is proposed to extract spatial and channel information from the image text regions and to learn a multi-level feature representation. A sequential unit is proposed to extract sequential prior information among the text in the image through bidirectional gated recurrent units. With gradient prior knowledge as the constraint, a gradient prior loss is designed to sharpen character boundaries. Results and Conclusions: To verify the effectiveness of the proposed method, we carry out comparative experiments on real scene text images from TextZoom and on synthetic text images. The experimental results show that the proposed method reconstructs clear text edges and rich texture details, and improves text recognition accuracy on street view images.
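
The abstract does not give the exact form of the gradient prior loss. As a rough illustration only, the sketch below assumes one common formulation: the mean absolute difference between the gradient-magnitude maps of the super-resolved image and the high-resolution reference. The function names `gradient_magnitude` and `gradient_prior_loss`, the neighbour-difference gradient, and the border handling are all hypothetical choices, not the authors' implementation. It is written in dependency-free Python for readability; a training pipeline would normally compute the same quantity on tensors.

```python
import math


def gradient_magnitude(img):
    """Per-pixel gradient magnitude of a 2-D grayscale image (list of rows).

    The gradient is the difference between the right/left and lower/upper
    neighbours; border pixels reuse themselves as the missing neighbour.
    Both choices are illustrative assumptions.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Clamp neighbour indices so they never leave the image.
            dx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
            dy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
            out[y][x] = math.hypot(dx, dy)
    return out


def gradient_prior_loss(sr, hr):
    """Mean absolute difference between the gradient maps of sr and hr.

    Assumed form of the loss: it is zero when both images have identical
    edges and grows as the edges in sr become softer or misplaced.
    """
    g_sr, g_hr = gradient_magnitude(sr), gradient_magnitude(hr)
    h, w = len(hr), len(hr[0])
    total = sum(abs(g_sr[y][x] - g_hr[y][x]) for y in range(h) for x in range(w))
    return total / (h * w)


# A sharp vertical stroke edge versus a blurred version of the same edge.
sharp = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
blurred = [[0.0, 0.25, 0.75, 1.0] for _ in range(4)]

print(gradient_prior_loss(sharp, sharp))    # 0.0
print(gradient_prior_loss(blurred, sharp))  # 0.25
```

Under this assumed form, the blurred edge is penalised because its gradient is spread over more pixels than the sharp edge, which is the intuition behind using gradient information to sharpen character boundaries.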

     
