An Algorithm for Concept Hierarchy Mining Based on Sampling (基于抽样的概念层次挖掘算法)

Cited by: 1


Abstract: This paper analyzes several traditional attribute-oriented induction (AOI) algorithms in data mining and identifies two shortcomings: (1) they cannot handle unbalanced concept hierarchies; (2) they do not take into account how the actual data distribution affects the final generalized rules. To address these, the paper proposes a sampling-based algorithm for concept hierarchy mining. The algorithm first samples the dataset and makes a preliminary adjustment to the concept hierarchy, then scans the whole data file and uses the information gathered during the scan to adjust the hierarchy again, and finally obtains the generalized rules by computing statistics over the leaves of the adjusted hierarchy. The algorithm not only overcomes the shortcomings of the traditional algorithms but also has optimal time complexity O(h) and space complexity O(c).
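The three steps named in the abstract (sample and adjust, scan and adjust again, read the rules off the leaves) can be illustrated with a small sketch. This is not the authors' algorithm: the abstract does not give the adjustment rule, so the hierarchy, the `max_share` splitting criterion, the systematic sampling stride, and all function names below are assumptions chosen only to make the flow concrete.

```python
from collections import Counter

# Hypothetical, deliberately unbalanced concept hierarchy (parent -> children):
# "Shanghai" is a leaf directly under the root; the other cities sit one level deeper.
CHILDREN = {
    "ANY": ["Shanghai", "Zhejiang", "Jiangsu"],
    "Zhejiang": ["Hangzhou", "Ningbo"],
    "Jiangsu": ["Nanjing", "Suzhou"],
}
PARENT = {child: parent for parent, kids in CHILDREN.items() for child in kids}


def count_by_cut(values, cut):
    """Count rows per node of the cut by walking each raw value up to its covering node."""
    counts = Counter()
    for value in values:
        node = value
        while node not in cut:
            node = PARENT[node]
        counts[node] += 1
    return counts


def refine(values, cut, max_share):
    """Split every non-leaf cut node that covers more than max_share of the rows.

    The splitting criterion is an assumption made for this sketch, not the paper's rule.
    """
    cut = set(cut)
    changed = True
    while changed:
        changed = False
        counts = count_by_cut(values, cut)
        for node in sorted(cut):  # iterate over a snapshot so the set can be updated
            if node in CHILDREN and counts[node] / len(values) > max_share:
                cut.remove(node)
                cut.update(CHILDREN[node])
                changed = True
    return cut


# Toy data: 100 rows (Shanghai 50, Hangzhou 20, Ningbo 10, Nanjing 10, Suzhou 10).
pattern = ["Shanghai", "Hangzhou", "Hangzhou", "Shanghai", "Ningbo",
           "Shanghai", "Shanghai", "Nanjing", "Suzhou", "Shanghai"]
rows = pattern * 10

# Step 1: preliminary adjustment on a sample (every third row; assumed scheme).
sample_cut = refine(rows[::3], {"ANY"}, max_share=0.28)
# Step 2: full scan, adjusting the hierarchy again with the complete counts.
final_cut = refine(rows, sample_cut, max_share=0.28)
# Step 3: the leaf statistics of the adjusted hierarchy give the generalized rules.
rules = count_by_cut(rows, final_cut)

for concept in sorted(final_cut):
    print(f"{concept}: {rules[concept] / len(rows):.0%} of rows")
```

In this toy run the sample leaves "Zhejiang" unsplit, and the full scan then splits it into "Hangzhou" and "Ningbo", while the sparse "Jiangsu" branch stays generalized. The sketch makes no claim about the O(h) time and O(c) space bounds stated in the abstract.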

Authors: Hu Jiangtao (胡江滔), Wang Wei (汪卫), Zhou Aoying (周傲英)

Affiliation: Department of Computer Science, Fudan University

Source: Computer Applications and Software (《计算机应用与软件》), indexed in CSCD and the Peking University Core Journals list, 2001, No. 3, pp. 57-63 (7 pages)

Keywords: data mining; attribute-oriented induction; concept hierarchy; database

Classification: TP311.13 [Automation and Computer Technology: Computer Software and Theory]





