Anomaly detection based on multi-phase clustering and naive Bayes

Cited by: 1

Abstract: Anomaly detection identifies the objects in a data set that differ markedly from the rest of the data. This paper proposes a multi-phase clustering approach that addresses two problems, the introduction of noisy data and the handling of samples with missing attributes, and that replaces the passive learning of the traditional Bayesian classifier with active learning in order to build a better-performing classifier. In the preprocessing stage, density-based clustering filters out noise points, and its output serves as the input to the subsequent K-means clustering, which improves the classification accuracy of K-means. K-means is responsible for handling the samples with missing attributes. In the classification stage, the AdaBoost learning algorithm is used to optimize the naive Bayes classifier so that it achieves better classification performance.
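The three stages named in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say which density-based algorithm is used, so a simple core-point test (at least `min_pts` neighbours within `eps`) stands in for it; incomplete rows are assumed to be set aside before filtering and filled from the nearest K-means centroid; the naive Bayes likelihood is assumed Gaussian; and labels are assumed binary in {-1, +1}. All function names, parameters, and the toy data are hypothetical.

```python
import math

def filter_noise(X, y, eps, min_pts):
    """Keep rows with at least min_pts neighbours (self included) within eps."""
    keep = [i for i, p in enumerate(X)
            if sum(math.dist(p, q) <= eps for q in X) >= min_pts]
    return [X[i] for i in keep], [y[i] for i in keep]

def kmeans(X, k, iters=20):
    """Plain Lloyd's K-means with farthest-point initialisation (deterministic)."""
    centroids = [X[0]]
    while len(centroids) < k:
        centroids.append(max(X, key=lambda p: min(math.dist(p, c) for c in centroids)))
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in X:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            groups[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        centroids = [[sum(col) / len(g) for col in zip(*g)] if g else centroids[j]
                     for j, g in enumerate(groups)]
    return centroids

def impute(row, centroids):
    """Fill None entries from the centroid nearest on the observed features."""
    seen = [i for i, v in enumerate(row) if v is not None]
    best = min(centroids, key=lambda c: sum((row[i] - c[i]) ** 2 for i in seen))
    return [best[i] if v is None else v for i, v in enumerate(row)]

def fit_nb(X, y, w):
    """Gaussian naive Bayes fitted with per-sample weights w (summing to 1)."""
    model = {}
    for label in set(y):
        idx = [i for i, t in enumerate(y) if t == label]
        total = sum(w[i] for i in idx)
        stats = []
        for f in range(len(X[0])):
            mean = sum(w[i] * X[i][f] for i in idx) / total
            var = sum(w[i] * (X[i][f] - mean) ** 2 for i in idx) / total + 1e-6
            stats.append((mean, var))
        model[label] = (math.log(total), stats)  # log prior, per-feature (mean, var)
    return model

def predict_nb(model, x):
    def log_posterior(label):
        log_prior, stats = model[label]
        return log_prior + sum(-0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
                               for xi, (m, v) in zip(x, stats))
    return max(model, key=log_posterior)

def adaboost_nb(X, y, rounds=10):
    """Discrete AdaBoost for labels in {-1, +1} with naive Bayes as weak learner."""
    w = [1.0 / len(X)] * len(X)
    ensemble = []
    for _ in range(rounds):
        model = fit_nb(X, y, w)
        pred = [predict_nb(model, x) for x in X]
        err = sum(wi for wi, p, t in zip(w, pred, y) if p != t)
        if err >= 0.5:          # no better than chance: stop boosting
            break
        alpha = 0.5 * math.log((1 - err + 1e-10) / (err + 1e-10))
        ensemble.append((alpha, model))
        if err == 0:            # training set already separated
            break
        # Raise the weight of misclassified rows, then renormalise.
        w = [wi * math.exp(-alpha * t * p) for wi, p, t in zip(w, pred, y)]
        z = sum(w)
        w = [wi / z for wi in w]
    return ensemble

def predict(ensemble, x):
    return 1 if sum(a * predict_nb(m, x) for a, m in ensemble) >= 0 else -1

# Toy data: two tight groups plus one isolated noise point.
normal = [[0.2 * (i % 5), 0.2 * (i // 5)] for i in range(20)]
anomalous = [[5 + v for v in p] for p in normal]
X = normal + anomalous + [[20.0, -20.0]]
y = [-1] * 20 + [1] * 20 + [-1]

X_clean, y_clean = filter_noise(X, y, eps=0.5, min_pts=4)   # stage 1: drop noise
centroids = kmeans(X_clean, k=2)                            # stage 2: cluster
filled = impute([None, 5.1], centroids)                     # fill a missing attribute
ensemble = adaboost_nb(X_clean + [filled], y_clean + [1])   # stage 3: boosted NB
print(len(X_clean), filled, predict(ensemble, [0.3, 0.3]), predict(ensemble, [5.2, 5.2]))
```

On this toy data the isolated point is dropped before K-means runs, so it cannot pull a centroid away from the two dense groups. That is the benefit the abstract attributes to chaining density clustering into K-means.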

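Authors: 姜立标, 马乐, 余建伟, 刘永花

Affiliations: School of Automotive Engineering, Harbin Institute of Technology (Weihai); School of Computer Science and Technology, Harbin Institute of Technology (Weihai)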

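Source: Journal of Chongqing University (Natural Science Edition) (《重庆大学学报（自然科学版）》), 2009, No. 8, pp. 983-986 (4 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.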

Funding: Natural Science Foundation of Shandong Province (Y2007G19); Research Fund of Harbin Institute of Technology (Weihai) (HIT(WH)ZB200813)

Keywords: clustering; naive Bayes; active learning; K-means algorithm

Classification code: TP301 [Automation and Computer Technology: Computer System Architecture]
