Research and Prediction on Time-Sequence Characteristics of Group-User Access Behavior in Public Map Service
Abstract: Group-user access to public map services is social in nature and follows recognizable group access patterns. These patterns show strong aggregation and burstiness in access, and they determine how much cloud computing resource a public map service needs. Expressing and capturing how access aggregation and access intensity change over time, and forecasting the service load accurately from them, is therefore key to selecting and scheduling cloud computing resources on demand and to meeting the challenge of serving massive numbers of concurrent users. Using massive user access logs from a public map service and a time-series clustering method, this paper first builds a time-sequence distribution model of group-user access arrival behavior. To keep the complexity of load forecasting low, it then exploits the multi-peak, variable-intensity, and periodic character of access intensity to partition the access arrival rate within one period into temporal pattern intervals, yielding an optimal time-series clustering of access intensity. Because the probability density of access arrivals differs from one pattern interval to another, the paper proposes a service load forecasting method that smooths the time series using the cumulative probability distribution; the method has low algorithmic complexity and needs little prior data. Experiments show that the proposed time-sequence-based optimal partition of the group-user access arrival rate, together with the forecasting method, predicts service load with relatively high accuracy. While meeting the challenge of massive concurrent access, the approach can improve the utilization of cloud computing resources and addresses the trade-off between service quality and service cost in public map services.
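The two steps summarized in the abstract, partitioning one period of arrival rates into pattern intervals and forecasting load from a cumulative distribution, can be illustrated with a minimal sketch. The abstract does not give the paper's actual algorithms, so everything here is an assumption: `optimal_partition` uses a standard dynamic program that minimizes within-segment squared deviation, and `forecast_next` scales a historical per-slot profile by its cumulative share. The function names, the cost function, and the profile-based forecast are hypothetical stand-ins, not the authors' method.

```python
from itertools import accumulate


def optimal_partition(rates, k):
    """Split `rates` into k contiguous segments minimizing the total
    within-segment sum of squared deviations (dynamic programming).
    Assumed cost function; the paper's own criterion may differ."""
    n = len(rates)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(rates)")
    # Prefix sums let us compute any segment's cost in O(1).
    s1 = [0.0] + list(accumulate(rates))
    s2 = [0.0] + list(accumulate(r * r for r in rates))

    def cost(i, j):
        """Sum of squared deviations from the mean for rates[i:j]."""
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # best[m][j]: minimal cost of splitting rates[:j] into m segments.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for m in range(1, k + 1):
        for j in range(m, n + 1):
            for i in range(m - 1, j):
                c = best[m - 1][i] + cost(i, j)
                if c < best[m][j]:
                    best[m][j], cut[m][j] = c, i
    # Walk the stored cut points backwards to recover the boundaries.
    bounds, j = [], n
    for m in range(k, 0, -1):
        i = cut[m][j]
        bounds.append((i, j))
        j = i
    return bounds[::-1]


def forecast_next(profile, observed):
    """Forecast the next slot's load from the cumulative share of a
    historical per-slot profile and the counts observed so far.
    Hypothetical illustration of a CDF-based forecast."""
    t = len(observed)
    if not 0 < t < len(profile):
        raise ValueError("need at least one observed slot and one slot left")
    total = sum(profile)
    cdf = [c / total for c in accumulate(profile)]
    if cdf[t - 1] == 0:
        raise ValueError("historical profile has no mass in observed slots")
    # Estimate the period total from the share of the period seen so far.
    est_total = sum(observed) / cdf[t - 1]
    # The next slot's load is its share of the estimated period total.
    return est_total * (cdf[t] - cdf[t - 1])
```

For example, `optimal_partition([1, 1, 1, 10, 10, 10, 2, 2], 3)` returns the half-open index ranges `[(0, 3), (3, 6), (6, 8)]`, one per constant-rate interval. The dynamic program costs O(k n²) time, which is acceptable when a period is divided into a modest number of time slots.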