Improving the multi-relational Naive Bayesian classifier using smoothing methods

Cited by: 9
Abstract: To eliminate the zero-probability and overfitting problems in naive Bayesian classification, this paper analyzes several probability smoothing methods and presents two classifiers: MRNBC-M (Multi-Relational Naive Bayesian Classifier based on M-estimation) and MRNBC-L (Multi-Relational Naive Bayesian Classifier based on Laplace estimation). It examines how M-estimate smoothing and Laplace smoothing affect multi-relational classification. To further optimize classification, attributes are filtered using an extended mutual information criterion. Experiments on standard multi-relational datasets show that MRNBC-M can effectively improve classification performance.
Authors: XU Guangmei (徐光美); LIU Hongzhe (刘宏哲); ZHANG Jingzun (张敬尊); WANG Jinhua (王金华). Affiliations: College of Information Technology, Beijing Union University, Beijing 100101, China; Beijing Key Laboratory of Information Service Engineering, Beijing Union University, Beijing 100101, China.
Source: Computer Engineering and Applications (《计算机工程与应用》), indexed in CSCD and the Peking University Core Journals list, 2017, No. 5, pp. 69-72 (4 pages).
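Funding: National Natural Science Foundation of China (No. 61372148, No. 61202245); Beijing "Great Wall Scholar" Program (No. CIT&TCD20130320); Beijing Outstanding Talent Training Project (No. 2010D005022000011); Beijing Union University Natural Science Project (No. zk20201403).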
Keywords: multi-relational data mining (MRDM); Naive Bayes; parameter smoothing (smoothing methods); mutual information
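
The abstract names two smoothing estimators but gives no formulas. As a rough illustration only, the sketch below uses the common textbook forms of the Laplace estimate and the m-estimate for a single conditional probability P(x | c). The function names, parameter names, and the single-table setting are assumptions made for this example; how the paper applies these estimates across multiple relations is not stated in the abstract.

```python
def laplace_estimate(count_xc: int, count_c: int, num_values: int) -> float:
    """Laplace (add-one) estimate of P(x | c).

    count_xc   : training examples of class c with attribute value x
    count_c    : training examples of class c
    num_values : number of distinct values the attribute can take
    """
    return (count_xc + 1) / (count_c + num_values)


def m_estimate(count_xc: int, count_c: int, prior: float, m: float) -> float:
    """m-estimate of P(x | c).

    prior : assumed prior probability of value x (for example 1 / num_values)
    m     : equivalent sample size, i.e. how strongly the prior is weighted
    """
    return (count_xc + m * prior) / (count_c + m)


# A value never seen with class c gets a small nonzero probability
# instead of zero, so one unseen value no longer zeroes the whole product.
unseen_laplace = laplace_estimate(count_xc=0, count_c=10, num_values=3)
unseen_m = m_estimate(count_xc=0, count_c=10, prior=1 / 3, m=3)
```

With a uniform prior of 1 / num_values and m equal to num_values, the m-estimate coincides with the Laplace estimate; with m = 0 it reduces to the unsmoothed relative frequency, which is the case that produces zero probabilities.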