
Uyghur Text Classification Model Based on Improved Weighted Bayes (cited by: 10)
Abstract: To improve the classification performance of the naive Bayes classifier, this paper proposes a Bayes classification algorithm based on feature-selection weights, which accounts for the fact that conditional attributes differ in their importance to the classification decision. The importance of each feature term is first expressed as a value that combines its chi-square (CHI) statistic with its document frequency (DF). This value is then processed to obtain a weight for each feature term, and a weighted Bayes classifier is built from those weights. Drawing on a study of the characteristics of the Uyghur language, the algorithm is used to construct a Uyghur text classification model. Experiments on a Uyghur corpus collected from the internet show that the algorithm achieves better classification performance than the naive Bayes classifier.
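The pipeline described in the abstract (score each feature term by chi-square and document frequency, turn the score into a weight, then classify with a weighted Bayes rule) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the abstract does not give the exact formula for combining CHI and DF or for processing the combined value, so the product `chi2 * log(1 + df)`, the normalisation to a mean weight of 1, and the use of weights as exponents on the class-conditional likelihoods are all assumptions. The function and class names are hypothetical, and the English toy tokens stand in for segmented Uyghur words.

```python
import math
from collections import Counter, defaultdict


def chi_square(term, label, docs):
    """Chi-square statistic of `term` vs. `label` from a 2x2 contingency table."""
    a = b = c = d = 0
    for tokens, y in docs:
        has_term = term in tokens
        if has_term and y == label:
            a += 1          # term present, document in class
        elif has_term:
            b += 1          # term present, document not in class
        elif y == label:
            c += 1          # term absent, document in class
        else:
            d += 1          # term absent, document not in class
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0


def feature_weights(docs):
    """ASSUMED weighting: max-over-classes chi-square times log(1 + DF),
    rescaled so that the average weight is 1."""
    labels = {y for _, y in docs}
    vocab = {t for tokens, _ in docs for t in tokens}
    df = Counter(t for tokens, _ in docs for t in set(tokens))
    raw = {
        t: max(chi_square(t, y, docs) for y in labels) * math.log(1 + df[t])
        for t in vocab
    }
    mean = sum(raw.values()) / len(raw)
    return {t: (v / mean if mean else 1.0) for t, v in raw.items()}


class WeightedNaiveBayes:
    """Multinomial naive Bayes whose log-likelihood terms are scaled by feature weights."""

    def fit(self, docs):
        self.weights = feature_weights(docs)
        self.vocab = set(self.weights)
        self.class_counts = Counter(y for _, y in docs)
        self.term_counts = defaultdict(Counter)
        for tokens, y in docs:
            self.term_counts[y].update(tokens)
        self.n_docs = len(docs)
        return self

    def predict(self, tokens):
        best_label, best_score = None, -math.inf
        for y, n_y in self.class_counts.items():
            total = sum(self.term_counts[y].values())
            score = math.log(n_y / self.n_docs)  # log prior
            for t in tokens:
                if t not in self.vocab:
                    continue  # ignore terms never seen in training
                # Laplace-smoothed class-conditional probability
                p = (self.term_counts[y][t] + 1) / (total + len(self.vocab))
                score += self.weights[t] * math.log(p)  # weight acts as an exponent
            if score > best_score:
                best_label, best_score = y, score
        return best_label


# Toy usage with made-up data (not from the paper's corpus).
toy_docs = [
    (["goal", "match", "team"], "sport"),
    (["team", "win", "match"], "sport"),
    (["bank", "stock", "market"], "finance"),
    (["market", "price", "bank"], "finance"),
]
model = WeightedNaiveBayes().fit(toy_docs)
print(model.predict(["match", "team"]))
```

Setting every weight to 1 reduces this sketch to ordinary multinomial naive Bayes, which makes it easy to compare the weighted and unweighted variants on the same data.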
Source: Computer Engineering and Design (《计算机工程与设计》), indexed in CSCD and the Peking University Core Journals list, 2012, No. 12, pp. 4726-4730 (5 pages)
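Funding: Xinjiang Uygur Autonomous Region High-Technology Research and Development Fund Project (201012112); Xinjiang Uygur Autonomous Region Electronic Development Special Fund Project (XJDZZXZJ20109)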
Keywords: text classification; Bayes; chi-square (CHI); weighting; document frequency (DF); feature selection
