
Chinese Sentential Semantic Type Recognition Based on C4.5 Decision Tree and SVM Algorithm (cited by: 1)
Abstract Fifty lexical and syntactic features were selected and subjected to extensive feature-selection experiments. Based on the resulting feature combination, a Chinese sentential semantic type recognition method that fuses C4.5 (decision tree) and SVM is proposed. The method exploits the high precision of C4.5 on multiple sentential semantics and the high precision of SVM on simple and complex sentential semantics: the results produced separately by C4.5 and SVM are fused to give the final sentential semantic type. The experiments use 4,500 sentences selected from the Beijing Forest Studio-Chinese Tag Corpus (BFS-CTC); under ten-fold cross-validation, the method recognizes the sentential semantic type with an accuracy of 92.1%.
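The abstract does not state the exact fusion rule, so the following is only a minimal sketch of one plausible reading: trust C4.5 whenever it predicts the multiple type (where it is reported to be most precise) and fall back to the SVM label otherwise. The label names, the `fuse` rule, and the striding fold split are all assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Sequence, Tuple

# Hypothetical label names for the three sentential semantic types
# mentioned in the abstract (simple, complex, multiple).
SIMPLE, COMPLEX, MULTIPLE = "simple", "complex", "multiple"


def fuse(c45_label: str, svm_label: str) -> str:
    """Assumed fusion rule: keep C4.5's answer only when it says MULTIPLE,
    otherwise defer to the SVM label."""
    if c45_label == MULTIPLE:
        return MULTIPLE
    return svm_label


def accuracy(predicted: Sequence[str], gold: Sequence[str]) -> float:
    """Fraction of positions where the predicted label equals the gold label."""
    if len(predicted) != len(gold) or not gold:
        raise ValueError("predicted and gold must be non-empty and equal in length")
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


def k_fold_indices(n: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Split indices 0..n-1 into k (train, test) pairs by striding,
    so that every index appears in exactly one test fold."""
    folds = []
    for i in range(k):
        test = [j for j in range(n) if j % k == i]
        train = [j for j in range(n) if j % k != i]
        folds.append((train, test))
    return folds


# Toy example with made-up classifier outputs (not data from the paper).
c45_out = [MULTIPLE, SIMPLE, COMPLEX, MULTIPLE]
svm_out = [COMPLEX, SIMPLE, COMPLEX, SIMPLE]
gold = [MULTIPLE, SIMPLE, COMPLEX, SIMPLE]
fused = [fuse(c, s) for c, s in zip(c45_out, svm_out)]
```

With 4,500 sentences and `k=10`, `k_fold_indices` yields ten test folds of 450 sentences each. In a real experiment the two label lists would come from classifiers trained on each fold's training indices.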
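Source Transactions of Beijing Institute of Technology (indexed in EI, CAS, CSCD, PKU Core), 2012, No. 10, pp. 1036-1041 (6 pages)
Funding National "242" Program project (2005C48); Beijing Institute of Technology Basic Research Fund (20060142014); Beijing Institute of Technology Graduate Innovation Fund (GC200802); Beijing Institute of Technology Science and Technology Innovation Program, Major Project Cultivation Fund (2011CX01015)
Keywords natural language processing; semantic parsing; sentential semantic structure; sentential semantic type recognition (SSTR)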

