Research on Improved SLIQ Algorithm Based on Changes of Class Label (cited by: 2)


Abstract: In decision tree classification for data mining, the SLIQ classifier searches for the best splitting attribute during tree construction by computing the Gini index at the midpoint between every pair of adjacent values of each numeric attribute, which makes the algorithm slow. To address this problem, an improved SLIQ algorithm is proposed. The algorithm examines class label changes in the pre-sorted attribute list of each numeric attribute to select suitable split positions, which reduces the number of candidate best split points. In the experiments, data sets from the UCI Machine Learning Repository are used for classification tests. Compared with the original SLIQ algorithm, the number of split points whose Gini index must be computed falls by 36.32% on average, without lowering classification accuracy or enlarging the decision tree. Finally, the improved algorithm is applied to customer analysis in e-commerce, where the classification results help merchants make sound decisions.
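The pruning idea described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`gini`, `weighted_gini`, `best_split`) and the `prune` flag are hypothetical. The handling of repeated attribute values is also an assumption: a midpoint is skipped only when the two neighbouring value groups each contain a single class and that class is the same. The sketch counts how many candidate split points need a Gini computation with and without pruning.

```python
from collections import Counter
from itertools import groupby


def gini(counts):
    """Gini index of a class-count histogram: 1 - sum(p_i ** 2)."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


def weighted_gini(left, right):
    """Size-weighted Gini index of a binary split."""
    n_left, n_right = sum(left.values()), sum(right.values())
    return (n_left * gini(left) + n_right * gini(right)) / (n_left + n_right)


def best_split(values, labels, prune=True):
    """Return (threshold, gini, n_evaluated) for one numeric attribute.

    With prune=True, a midpoint between two adjacent distinct values is
    skipped when the class label does not change across it (assumption:
    both value groups are pure and share the same class).
    """
    # Pre-sorted attribute list: (value, class label) pairs ordered by value.
    pairs = sorted(zip(values, labels), key=lambda p: p[0])
    # One class histogram per distinct attribute value.
    groups = [
        (value, Counter(label for _, label in group))
        for value, group in groupby(pairs, key=lambda p: p[0])
    ]
    left = Counter()
    right = Counter(label for _, label in pairs)
    best_threshold, best_score, evaluated = None, float("inf"), 0

    for (value, counts), (next_value, next_counts) in zip(groups, groups[1:]):
        left += counts    # records with attribute <= value
        right -= counts   # records with attribute > value
        no_label_change = (
            len(counts) == 1
            and len(next_counts) == 1
            and counts.keys() == next_counts.keys()
        )
        if prune and no_label_change:
            continue  # no Gini computation at this midpoint
        evaluated += 1
        score = weighted_gini(left, right)
        if score < best_score:
            best_threshold, best_score = (value + next_value) / 2, score
    return best_threshold, best_score, evaluated


if __name__ == "__main__":
    values = [1, 2, 3, 4, 5, 6, 7, 8]
    labels = ["A", "A", "A", "B", "B", "B", "A", "A"]
    for prune in (False, True):
        threshold, _, evaluated = best_split(values, labels, prune=prune)
        print(f"prune={prune}: threshold={threshold}, evaluated={evaluated}")
```

On this toy data both variants choose the same threshold (3.5), while the pruned search evaluates 2 midpoints instead of 7. The reduction on real data depends on how often adjacent records share a class; the 36.32% figure in the abstract is the authors' measured average and is not reproduced by this sketch.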

Authors: Zhuwang Xiaojia (朱王晓嘉), Yu Jiankun (余建坤)

Institution: Yunnan University of Finance and Economics

Source: Microcomputer Applications (《微型电脑应用》), 2015, No. 10, pp. 27-31, 4-5 (5 pages in total)

Funding: Business Intelligence Science and Technology Innovation Team of Yunnan Provincial Universities (42212217010)

Keywords: Data Mining; Decision Tree; Classification; SLIQ Algorithm; Splitting Point

Classification code: TP301.6 [Automation and Computer Technology: Computer System Architecture]


