Abstract
To address frequent path mining in fiber-to-the-x (FTTx) network planning, this paper proposes FSM+, an improved frequent sequence mining algorithm. FSM+ builds on the classical algorithms FP-Growth and SPADE and draws on lattice theory: it uses an extended enumeration tree of frequent itemsets as its search space and introduces bitmaps to simplify extension operations and support counting. The paper presents the relevant properties and underlying theory of the algorithm, and describes its basic idea and pseudocode in detail. FSM+ was compared with SPADE and FP-Growth for performance and accuracy under VC++ 6.0 on a single machine, using user installation datasets of different sizes and different minimum support thresholds. The experiments show that FSM+ has no obvious performance advantage on small datasets, but on large datasets its computational performance is more than 5 times that of SPADE and more than 7 times that of FP-Growth, while its mining results are identical to theirs. In practical network planning, frequent patterns with high confidence can therefore be computed quickly and combined with manual, experience-based intervention to further ensure that predicted paths are accurate and effective.
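As a rough illustration of the bitmap idea mentioned in the abstract, the sketch below counts support by intersecting per-item bitmaps while walking an enumeration tree depth first. It is not the paper's FSM+ algorithm: it mines unordered itemsets rather than sequences, and the function names, node labels, and sample data are all hypothetical.

```python
from typing import Dict, List, Tuple


def build_bitmaps(transactions: List[List[str]]) -> Dict[str, int]:
    """Map each item to an int whose bit i is set when transaction i contains it."""
    bitmaps: Dict[str, int] = {}
    for tid, items in enumerate(transactions):
        for item in set(items):
            bitmaps[item] = bitmaps.get(item, 0) | (1 << tid)
    return bitmaps


def popcount(bitmap: int) -> int:
    """Number of set bits, i.e. the support of the pattern the bitmap represents."""
    return bin(bitmap).count("1")


def mine_frequent_itemsets(
    transactions: List[List[str]], min_support: int
) -> Dict[Tuple[str, ...], int]:
    """Depth-first walk of the enumeration tree with bitmap-based support counting."""
    bitmaps = build_bitmaps(transactions)
    # Only items that are frequent on their own can appear in a frequent itemset.
    frequent_items = sorted(
        item for item, bm in bitmaps.items() if popcount(bm) >= min_support
    )
    result: Dict[Tuple[str, ...], int] = {}

    def extend(prefix: Tuple[str, ...], prefix_bitmap: int, candidates: List[str]) -> None:
        for idx, item in enumerate(candidates):
            # Extending a pattern by one item is a single bitwise AND.
            bitmap = prefix_bitmap & bitmaps[item]
            support = popcount(bitmap)
            if support >= min_support:
                itemset = prefix + (item,)
                result[itemset] = support
                # Children only use later items, so each itemset is visited once.
                extend(itemset, bitmap, candidates[idx + 1:])

    all_transactions = (1 << len(transactions)) - 1  # empty prefix matches everything
    extend((), all_transactions, frequent_items)
    return result


# Hypothetical data: each row lists the nodes that appear on one installation path.
paths = [["A", "B", "C"], ["A", "B"], ["A", "C"], ["B", "C"]]
frequent = mine_frequent_itemsets(paths, min_support=2)
```

In this sketch, extending a pattern costs one bitwise AND plus a bit count instead of a rescan of the dataset, which is the kind of saving the abstract attributes to bitmaps.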
Source
《重庆邮电大学学报(自然科学版)》
CSCD
Peking University Core Journal
2014, No. 2, pp. 280-284 (5 pages)
Journal of Chongqing University of Posts and Telecommunications (Natural Science Edition)
Funding
Nantong Science and Technology Innovation Program (K2012032)
Keywords
frequent sequence
network planning
pattern mining