Research of Stochastic Fractal Dimension Calculation Algorithm in Data Stream

Abstract: Fractal dimension can effectively describe a data set and reflect the regularity hidden in complex data, and data mining algorithms based on fractal theory usually involve computing it. However, existing methods for calculating fractal dimension have high time and space complexity, which greatly reduces their efficiency and makes them difficult to apply to high-speed, massive data streams. This paper summarizes and analyzes several existing fractal dimension calculation methods and proposes a stochastic method that uses a fixed amount of memory to quickly estimate the correlation dimension of a data stream. Comparative experiments against existing algorithms demonstrate the effectiveness of the stochastic algorithm.

Authors: Ni Zhiwei (倪志伟), Gong Weifeng (公维峰), Zhou Zhiqiang (周之强), Tang Liyang (唐李洋)

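Affiliation: School of Management, Hefei University of Technology; Key Laboratory of Process Optimization and Intelligent Decision-making, Ministry of Education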

Source: Computer Science (《计算机科学》), CSCD, Peking University Core Journal, 2011, No. 4, pp. 209-212, 229 (5 pages)

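Funding: National Natural Science Foundation of China (70871033, 70801025); National High Technology Research and Development Program of China (863 Program) (2007AA04Z116); Hefei University of Technology Scientific Research Development Fund (2009HGXJ0040)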

Keywords: Fractal; Fractal dimension; Data stream

Classification: TP301.6 [Automation and Computer Technology: Computer System Architecture]
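
The abstract describes estimating the correlation dimension of a data stream in a fixed amount of memory, but this record does not include the algorithm itself. The following is therefore only a generic sketch of that idea, not the authors' method. It assumes a box-counting formulation, in which the number of point pairs sharing a grid cell of side r scales roughly as r^D2, and it swaps in the well-known "tug-of-war" randomized sketch to approximate the sum of squared cell counts. All names (`F2Sketch`, `StreamingCorrelationDimension`, `cell_key`) and all parameter values are hypothetical.

```python
import math
import random

PRIME = (1 << 61) - 1  # modulus for the polynomial hash (a Mersenne prime)


def cell_key(point, r):
    """Fold the grid-cell indices of a point (cell side r) into one integer id."""
    key = 0
    for x in point:
        key = (key * 1_000_003 + math.floor(x / r) + 1) % PRIME
    return key


class F2Sketch:
    """Tug-of-war sketch: estimates the sum of squared cell counts."""

    def __init__(self, num_counters, rng):
        # One random degree-3 polynomial per counter (4-wise independent signs).
        self.coeffs = [tuple(rng.randrange(PRIME) for _ in range(4))
                       for _ in range(num_counters)]
        self.counters = [0] * num_counters

    def add(self, key):
        for i, (a, b, c, d) in enumerate(self.coeffs):
            h = (((a * key + b) * key + c) * key + d) % PRIME
            self.counters[i] += 1 if h & 1 else -1

    def estimate(self):
        # The mean of the squared counters is an unbiased estimate.
        return sum(z * z for z in self.counters) / len(self.counters)


def fit_slope(xs, ys):
    """Least-squares slope of ys against xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


class StreamingCorrelationDimension:
    """Fixed-memory estimate of the correlation dimension D2 of a stream."""

    def __init__(self, levels=4, num_counters=128, seed=0):
        rng = random.Random(seed)
        # Assumed grid sides 1/2, 1/4, ...; suited to data in the unit cube.
        self.radii = [2.0 ** -(k + 1) for k in range(levels)]
        self.sketches = [F2Sketch(num_counters, rng) for _ in self.radii]
        self.n = 0

    def update(self, point):
        self.n += 1
        for r, sketch in zip(self.radii, self.sketches):
            sketch.add(cell_key(point, r))

    def estimate(self):
        xs, ys = [], []
        for r, sketch in zip(self.radii, self.sketches):
            # Subtract n to drop self-pairs: sum n_i^2 - n = sum n_i (n_i - 1).
            pairs = sketch.estimate() - self.n
            if pairs > 0:
                xs.append(math.log(r))
                ys.append(math.log(pairs))
        if len(xs) < 2:
            raise ValueError("not enough usable scales to fit a slope")
        return fit_slope(xs, ys)
```

Under these assumptions, memory use is `levels * num_counters` counters plus the hash coefficients, independent of the stream length, and the estimation error shrinks roughly as one over the square root of `num_counters`. A stream is processed by calling `update(point)` for each arriving point and `estimate()` whenever a current value of D2 is needed.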
