
Text genre classification oriented to Chinese GaoKao reading comprehension

Cited by: 2
Abstract: To address automatic genre classification of modern-text reading comprehension passages in a GaoKao question-answering system, statistical analysis is used to examine how literary works and sci-tech articles differ in part-of-speech (POS) distribution, punctuation, and vocabulary usage. Based on these differences, a feature extraction method for genre classification is proposed that uses POS, symbol (character), and lexical features, together with a feature evaluation method based on within-class and between-class deviation. A classifier is then trained on the resulting features with a support vector machine (SVM). Experimental results show that the feature extraction, feature selection, and classification methods are feasible and effective: the classifier reaches an average precision of 96% on GaoKao test sets and handles genre classification of GaoKao Chinese modern-text reading materials well.
Authors: SU Xue-feng (苏雪峰) [1], LI Ru (李茹) [2,3], ZHANG Hu (张虎) [2]
[1] Department of Electronic Business, Business College of Shanxi University, Taiyuan 030031, China
[2] School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China
[3] Key Laboratory of Ministry of Education for Computation Intelligence and Chinese Information Processing, Shanxi University, Taiyuan 030006, China
Source: Computer Engineering and Design (《计算机工程与设计》), PKU Core Journal, 2018, No. 6, pp. 1755-1760, 1794 (7 pages)
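Funding: National High Technology Research and Development Program of China (863 Program) (2015AA015407); Natural Science Foundation of Shanxi Province (201601D102030)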
Keywords: genre classification; part-of-speech features; character features; lexical features; support vector machine
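
The abstract names two steps that a small sketch may make more concrete: extracting punctuation-based features and ranking features by between-class versus within-class deviation. The abstract does not give the exact definitions, so everything below is an assumption made for illustration: the punctuation inventory `PUNCTUATION`, the per-character normalization, the ratio form of `deviation_score`, and the helper names are all hypothetical and are not taken from the paper.

```python
from statistics import mean, pvariance

# Hypothetical punctuation inventory; the paper's actual symbol set is not stated here.
PUNCTUATION = ["，", "。", "！", "？", "；", "：", "、", "“", "”"]


def punctuation_features(text):
    """Relative frequency of each punctuation mark, per character of text."""
    n = max(len(text), 1)  # guard against empty input
    return [text.count(p) / n for p in PUNCTUATION]


def deviation_score(values_a, values_b):
    """Assumed criterion: between-class deviation over within-class deviation.

    values_a / values_b hold one feature's values for the documents of each
    genre (e.g. literary works vs. sci-tech articles).
    """
    mean_a, mean_b = mean(values_a), mean(values_b)
    overall = mean(values_a + values_b)
    between = (len(values_a) * (mean_a - overall) ** 2
               + len(values_b) * (mean_b - overall) ** 2)
    within = (len(values_a) * pvariance(values_a)
              + len(values_b) * pvariance(values_b))
    return between / (within + 1e-12)  # small constant avoids division by zero


def select_features(rows_a, rows_b, k):
    """Return the indices of the k features with the highest deviation score."""
    n_features = len(rows_a[0])
    scores = [
        deviation_score([row[j] for row in rows_a], [row[j] for row in rows_b])
        for j in range(n_features)
    ]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:k]
```

Under these assumptions, a feature scores highly when the two genres have well-separated means and each genre's values are tightly clustered. The selected columns could then be passed to any SVM implementation, for example `sklearn.svm.SVC`, to train the classifier the abstract describes.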