
A Study on the Automatic Text Readability Assessment of Reading Texts in Hanyu Shuiping Kaoshi (HSK)

Original title: 汉语水平考试(HSK)阅读文本可读性自动评估研究. Cited by: 5
Abstract: Drawing on a set of features for assessing the readability of Chinese as a second language (CSL) texts, this paper compares six machine learning models and applies feature selection algorithms to automatically assess the readability of Hanyu Shuiping Kaoshi (HSK) reading texts. The experiments show that the support vector machine performed best of the six models on this task. The full-feature model, covering character, lexical, syntactic, and discourse features, reached a prediction accuracy of 0.876. Predictive power differed across linguistic levels, with lexical-level features performing best. After redundant features were removed, 18 features from the lexical and character levels entered the optimal model; no syntactic or discourse features were retained. The study offers a reference for selecting and adapting HSK reading texts and for assessing the readability of other text types.
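As a rough illustration of what character-level surface features for a Chinese text might look like, the sketch below computes three simple measures. The feature names, the punctuation set, and the function `surface_features` are assumptions made for this example; the abstract does not list the paper's actual features, and this is not the authors' implementation.

```python
import re

# Sentence-final punctuation used to split the text (an assumed, minimal set).
SENTENCE_END = re.compile(r"[。！？!?]+")
# One Chinese character from the basic CJK Unified Ideographs block.
HANZI = re.compile(r"[\u4e00-\u9fff]")


def surface_features(text: str) -> dict:
    """Compute three illustrative surface features for a Chinese text.

    These features are hypothetical examples, not the feature set
    used in the paper.
    """
    chars = HANZI.findall(text)
    # Keep only segments that contain at least one Chinese character.
    sentences = [s for s in SENTENCE_END.split(text) if HANZI.search(s)]
    n_chars = len(chars)
    return {
        "char_count": n_chars,
        "distinct_char_ratio": len(set(chars)) / n_chars if n_chars else 0.0,
        "mean_sentence_length": n_chars / len(sentences) if sentences else 0.0,
    }


if __name__ == "__main__":
    sample = "我喜欢学习汉语。汉语很有意思！"
    print(surface_features(sample))
```

Feature vectors of this kind could then be passed to a classifier such as a support vector machine; that step is omitted here to keep the sketch free of third-party dependencies.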
Authors: DU Yueming (杜月明); WANG Yamin (王亚敏); WANG Lei (王蕾)
Source: Applied Linguistics (《语言文字应用》), indexed in CSSCI and the Peking University Core Journals list, 2022, No. 3, pp. 73-86 (14 pages)
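Funding: Supported by the Major Program of the National Social Science Fund of China, "Research on Teaching Innovation for Overview of China Courses and the Development of Digital Courses for Confucius Institutes Worldwide" (18ZDA339).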
Keywords: text readability; HSK reading text; linguistic features; machine learning; support vector machine