A Multi-model Fusion Model of Individual Travel Location Prediction Using Markov and Machine Learning Methods
Abstract:
Objectives With urbanization, people's modes of travel have diversified. A deeper understanding of human behavior, together with the modeling and prediction of individual travel behavior, helps explain a number of complex socio-economic phenomena and is valuable for location-based services, transportation planning, and public safety. Predicting individual travel behavior rests on a thorough understanding of the characteristics of human activity, and in the mobile Internet era, online behavior in cyberspace is inseparable from travel behavior in physical space.
Methods Based on the characteristics of online behavior, this paper integrates individuals' mobile phone tracking data and Internet traffic data and constructs a multi-model fusion model for individual travel location prediction that combines a Markov model with several machine learning models. Taking the classification probabilities of the prediction results into account, it proposes an adaptive fusion strategy based on the frequency distribution graph (histogram). The strategy merges the results of the traditional Markov model and the machine learning multi-class classification models to obtain the final travel location prediction for each mobile phone user.
Results Individual travel location prediction experiments are carried out on multi-source data, including mobile phone data, Internet traffic data, point-of-interest data, and weather data. The experiments show that the correct location is the first predicted result in 74.59% of cases and is among the top three predicted results in 94.19% of cases. Both accuracies are higher than those of the base models and those obtained by fusing the base models with a voting strategy.
Conclusions The model predicts individual travel locations comparatively well at a prediction time granularity of 30 minutes.
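The two accuracy figures in the abstract (correct location ranked first, correct location among the first three results) correspond to what is commonly called top-1 and top-3 accuracy. The following is a minimal sketch of how such a metric can be computed; the function name `top_k_accuracy`, the dictionary-based input format, and the toy locations "A" to "D" are assumptions made for illustration and do not come from the paper.

```python
from typing import Dict, Hashable, Sequence


def top_k_accuracy(
    probabilities: Sequence[Dict[Hashable, float]],
    true_locations: Sequence[Hashable],
    k: int,
) -> float:
    """Share of samples whose true location is among the k most probable ones."""
    if len(probabilities) != len(true_locations):
        raise ValueError("inputs must have the same length")
    if not true_locations:
        raise ValueError("at least one sample is required")
    hits = 0
    for probs, truth in zip(probabilities, true_locations):
        # Rank candidate locations by predicted probability, highest first.
        ranked = sorted(probs, key=probs.get, reverse=True)
        if truth in ranked[:k]:
            hits += 1
    return hits / len(true_locations)


# Hypothetical toy data (not from the paper): three samples, four locations.
toy_probs = [
    {"A": 0.60, "B": 0.30, "C": 0.10, "D": 0.00},
    {"A": 0.15, "B": 0.50, "C": 0.25, "D": 0.10},
    {"A": 0.10, "B": 0.20, "C": 0.30, "D": 0.40},
]
toy_truth = ["A", "C", "A"]

top1 = top_k_accuracy(toy_probs, toy_truth, k=1)  # only the first sample is a hit
top3 = top_k_accuracy(toy_probs, toy_truth, k=3)  # the first two samples are hits
```

On the toy data, the second sample is missed at k=1 but recovered at k=3, which is the kind of gap that separates the top-1 and top-3 columns in the result tables.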
Table 1 Prediction Results of Multiple Models
Model     Class 1   Class 2   …   Class n   Prediction
Model 1   0.33      0.21      …   0         1
Model 2   0.05      0.90      …   0.02      2
Model 3   0.25      0.24      …   0         1
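The values in Table 1 show why the fusion rule matters: two of the three models pick class 1 with low confidence, while one model picks class 2 with high confidence. The sketch below contrasts plain majority voting with a simple probability-averaging rule on those values. Probability averaging is used here only as a generic illustration of taking classification probabilities into account; it is not the paper's adaptive frequency-distribution rule, whose details are not given in this text. The identifiers `TABLE_1`, `hard_vote`, and `mean_probability` are hypothetical, and the columns elided by "…" in the table are left out rather than guessed.

```python
from collections import Counter
from typing import Dict

# Probabilities copied from Table 1. Columns hidden behind "…" are omitted,
# so the rows intentionally do not sum to 1.
TABLE_1: Dict[str, Dict[str, float]] = {
    "model_1": {"class_1": 0.33, "class_2": 0.21, "class_n": 0.00},
    "model_2": {"class_1": 0.05, "class_2": 0.90, "class_n": 0.02},
    "model_3": {"class_1": 0.25, "class_2": 0.24, "class_n": 0.00},
}


def hard_vote(outputs: Dict[str, Dict[str, float]]) -> str:
    """Majority vote over each model's most probable class."""
    votes = Counter(max(probs, key=probs.get) for probs in outputs.values())
    return votes.most_common(1)[0][0]


def mean_probability(outputs: Dict[str, Dict[str, float]]) -> str:
    """Average the per-class probabilities across models, then take the argmax."""
    classes = next(iter(outputs.values())).keys()
    mean = {
        c: sum(probs[c] for probs in outputs.values()) / len(outputs)
        for c in classes
    }
    return max(mean, key=mean.get)


voted = hard_vote(TABLE_1)            # two of three models favor class_1
averaged = mean_probability(TABLE_1)  # the confident model_2 tips it to class_2
```

Voting ignores how confident each model is, so it returns class 1 here, whereas the probability-aware rule returns class 2.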
Table 2 Comparison of Prediction Accuracy of Different Prediction Algorithms/%
Base model                 Top-1 accuracy   Top-3 accuracy
CART                       70.56            94.01
RF                         69.82            87.69
kNN                        63.30            87.84
SVM                        57.52            86.54
First-order Markov model   56.84            91.49
GBDT                       72.80            92.77
Most Value model           51.29            55.63
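One of the base models in Table 2 is a first-order Markov model. As a rough illustration of the general technique, the sketch below estimates transition probabilities between consecutive locations by counting and ranks the candidate next locations. The class name `FirstOrderMarkov`, its methods, and the toy trajectories are assumptions for illustration; the paper's actual state definition, time discretization, and handling of unseen states are not described in this text.

```python
from collections import Counter, defaultdict
from typing import Dict, Hashable, List, Sequence


class FirstOrderMarkov:
    """Next-location predictor based on counted transitions between locations."""

    def __init__(self) -> None:
        # transitions[current][next] = number of observed moves current -> next
        self.transitions: Dict[Hashable, Counter] = defaultdict(Counter)

    def fit(self, trajectories: Sequence[Sequence[Hashable]]) -> "FirstOrderMarkov":
        for trajectory in trajectories:
            # Pair each location with the one that follows it.
            for current, nxt in zip(trajectory, trajectory[1:]):
                self.transitions[current][nxt] += 1
        return self

    def predict_proba(self, current: Hashable) -> Dict[Hashable, float]:
        """Estimated P(next | current); empty if the location was never seen."""
        counts = self.transitions.get(current)
        if not counts:
            return {}
        total = sum(counts.values())
        return {loc: n / total for loc, n in counts.items()}

    def predict_top_k(self, current: Hashable, k: int = 3) -> List[Hashable]:
        probs = self.predict_proba(current)
        return sorted(probs, key=probs.get, reverse=True)[:k]


# Hypothetical toy trajectories (not from the paper's data set).
toy_trajectories = [
    ["home", "work", "home", "shop", "home", "work"],
    ["home", "work", "gym", "home"],
]
model = FirstOrderMarkov().fit(toy_trajectories)
home_probs = model.predict_proba("home")  # three moves to work, one to shop
home_top = model.predict_top_k("home", k=3)
```

The per-location probabilities returned by `predict_proba` have the same shape as the per-class probabilities a classifier produces, which is what would allow the two kinds of model to be fused.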
Table 3 Comparison of Prediction Accuracy of Different Combined Prediction Algorithms/%
Combined model                Top-1 accuracy   Top-3 accuracy
Markov model, DT, SVM         74.14            93.65
Markov model, DT, kNN         73.97            92.92
Markov model, kNN, SVM        68.16            91.59
Markov model, SVM, RF         73.35            93.66
Markov model, kNN, SVM, RF    72.22            93.79
Markov model, DT, kNN, SVM    73.54            94.13
Proposed model                74.59            94.19
Table 4 Comparison of Prediction Accuracy Between Our Proposed Method and the Vote Strategy/%
Fusion method    Top-1 accuracy   Top-3 accuracy
Voting fusion    72.90            90.58
Proposed model   74.59            94.19
Table 5 Comparison of Prediction Accuracy Under Different Temporal Granularities/%
Time granularity/min   Top-1 accuracy   Top-3 accuracy
10                     69.80            92.84
20                     71.50            94.27
30                     74.59            94.19