Abstract:
This paper proposes a novel cross-step word segmentation algorithm for processing real-time traffic information expressed in natural Chinese, addressing the urgent need for real-time travel information services based on dynamic traffic information. Taking into account the distribution of record lengths in the word libraries that describe real-time traffic information, the algorithm sets corresponding segmentation steps for the address, direction, and event libraries, and replaces the one-step advance of the string pointer in classical Chinese word segmentation with flexible multi-step advances, so that candidate Chinese words are aggregated efficiently. A case study shows that the proposed algorithm runs 10 times faster than an improved MM algorithm while maintaining similar accuracy and robustness. The authors argue that the algorithm greatly helps the automatic and intelligent processing of real-time traffic information and facilitates the development of travel information services.
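
The cross-step idea can be illustrated with a minimal sketch. It is one possible reading of the abstract, not the authors' implementation: the library contents, the names `LIBRARIES`, `step_lengths`, and `cross_step_segment`, the choice of using every distinct record length as a candidate step, and the longest-first matching order are all assumptions made for illustration.

```python
# Hypothetical word libraries; the real libraries are not given in the abstract.
LIBRARIES = {
    "address": {"中山路", "人民大道", "解放路"},
    "direction": {"由南向北", "由东向西", "北向南"},
    "event": {"交通事故", "拥堵", "施工"},
}


def step_lengths(library):
    """Return the distinct record lengths of a library, longest first.

    Assumption: these lengths stand in for the per-library segmentation
    steps that the abstract derives from the record length distribution.
    """
    return sorted({len(word) for word in library}, reverse=True)


def cross_step_segment(text, libraries):
    """Segment text by trying only the step lengths that occur in each library."""
    steps = {name: step_lengths(words) for name, words in libraries.items()}
    tokens = []
    i = 0
    while i < len(text):
        match = None
        for name, words in libraries.items():
            for step in steps[name]:
                candidate = text[i:i + step]
                # Skip truncated slices at the end of the text.
                if len(candidate) == step and candidate in words:
                    match = (candidate, name)
                    break
            if match:
                break
        if match:
            tokens.append(match)
            i += len(match[0])  # multi-step advance over the matched word
        else:
            tokens.append((text[i], "unknown"))
            i += 1  # fall back to the classical one-step advance
    return tokens


if __name__ == "__main__":
    for word, label in cross_step_segment("中山路由南向北发生交通事故", LIBRARIES):
        print(word, label)
```

In this sketch the pointer jumps by the full length of each matched word and tests only lengths that actually occur in a library, rather than growing a candidate one character at a time. Characters that match no library are emitted individually with the label `unknown`.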