Abstract
To eliminate the zero-probability and overfitting problems in naive Bayesian classification, this paper analyzes various probability smoothing methods and presents two multi-relational classifiers: MRNBC-M (Multi-Relational Naive Bayesian Classifier based on M-estimation) and MRNBC-L (Multi-Relational Naive Bayesian Classifier based on Laplace estimation). It examines how M smoothing and Laplace smoothing affect multi-relational classification. To further optimize classification, the method filters attributes using an extended mutual information criterion. Experiments on standard multi-relational datasets show that MRNBC-M can effectively improve classification performance.
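As a rough illustration of the two smoothing estimates named in the abstract, the sketch below applies the textbook Laplace estimate and M-estimate to a single attribute within one class. It assumes a single flat table rather than the paper's multi-relational setting, and the function names, toy data, uniform prior, and value of `m` are all hypothetical choices made for this example, not taken from the paper.

```python
from collections import Counter


def laplace_estimate(count, total, num_values):
    """Laplace (add-one) estimate of P(value | class)."""
    return (count + 1) / (total + num_values)


def m_estimate(count, total, prior, m):
    """M-estimate: observed frequency blended with a prior, weighted by m."""
    return (count + m * prior) / (total + m)


# Hypothetical toy data: attribute values observed for one class.
values = ["red", "red", "blue"]
domain = ["red", "blue", "green"]  # "green" never occurs with this class
counts = Counter(values)
total = len(values)

# Unsmoothed relative frequencies: "green" gets probability zero,
# which would zero out the whole naive Bayes product.
raw = {v: counts[v] / total for v in domain}

# Smoothed estimates: every value in the domain gets a nonzero probability.
laplace = {v: laplace_estimate(counts[v], total, len(domain)) for v in domain}
m_est = {
    v: m_estimate(counts[v], total, prior=1 / len(domain), m=2.0)
    for v in domain
}

print(raw)
print(laplace)
print(m_est)
```

With a uniform prior and `m` set to the number of attribute values, the M-estimate reduces to the Laplace estimate; other choices of `m` and the prior control how strongly sparse counts are pulled toward the prior.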
Authors
XU Guangmei (徐光美)
LIU Hongzhe (刘宏哲)
ZHANG Jingzun (张敬尊)
WANG Jinhua (王金华)
Affiliations: College of Information Technology, Beijing Union University, Beijing 100101, China; Beijing Key Laboratory of Information Service Engineering, Beijing Union University, Beijing 100101, China
Source
Computer Engineering and Applications (《计算机工程与应用》)
CSCD
Peking University Core Journal (北大核心)
2017, No. 5, pp. 69-72 (4 pages)
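Funding
National Natural Science Foundation of China (No. 61372148, No. 61202245)
Beijing "Great Wall Scholars" Program (No. CIT&TCD20130320)
Beijing Excellent Talents Training Project (No. 2010D005022000011)
Natural Science Project of Beijing Union University (No. zk20201403)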
Keywords
Multi-Relational Data Mining (MRDM)
Naive Bayes
parameter smoothing
mutual information