
Data Enhancement Approach of Chinese Spelling Error Correction
Abstract: Labeled data for training deep learning models in Chinese text error correction is limited. To address this, five data noise replacement schemes are proposed. Experiments show that two of them, phonetically similar (sound-alike) replacement and visually similar (shape-alike) replacement, effectively improve data quality in this field. A comparative experiment on these two schemes then yields a more effective hybrid replacement scheme. The core idea is to increase the size and diversity of existing datasets through noise replacement, and thereby improve the performance of Chinese spelling correction models.
Authors: Li Jianyi; Bai Xueli; Wang Hongjun; Wang Jianan (School of Computer Science & Engineering, North China Institute of Aerospace Engineering, Langfang 065000, China; TRS Information Technology Co., Ltd., Beijing 100000, China)
Source: Journal of North China Institute of Aerospace Engineering (CAS), 2021, No. 6, pp. 1-4, 44 (5 pages in total)
Funding: Natural Science Foundation of Hebei Province (F2019409056).
Keywords: Chinese spelling correction; deep learning; labeled data; noise substitution; data enhancement
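
The noise replacement idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the confusion sets `PHONETIC` and `VISUAL`, the function `add_noise`, and the parameters `rate` and `phonetic_ratio` are all hypothetical, and the abstract does not state which replacement rate or mixing ratio the paper uses. The sketch only shows how a correct sentence might be turned into a (noisy, correct) training pair by swapping characters for sound-alike or shape-alike ones.

```python
import random

# Hypothetical toy confusion sets (assumption); a real setup would load
# far larger sound-alike and shape-alike character tables.
PHONETIC = {"在": ["再"], "做": ["作"], "的": ["得", "地"]}
VISUAL = {"己": ["已"], "末": ["未"], "人": ["入"]}


def add_noise(sentence, rate=0.15, phonetic_ratio=0.5, rng=None):
    """Return a (noisy, original) pair for training a correction model.

    rate: probability of trying to corrupt each character (assumed value).
    phonetic_ratio: probability of trying the sound-alike table first;
    the other table is used as a fallback when the first has no entry.
    """
    rng = rng or random.Random()
    chars = list(sentence)
    for i, ch in enumerate(chars):
        if rng.random() >= rate:
            continue  # leave this character untouched
        if rng.random() < phonetic_ratio:
            first, second = PHONETIC, VISUAL
        else:
            first, second = VISUAL, PHONETIC
        candidates = first.get(ch) or second.get(ch)
        if candidates:
            # One-for-one substitution keeps the sentence length fixed,
            # so source and target stay aligned character by character.
            chars[i] = rng.choice(candidates)
    return "".join(chars), sentence


if __name__ == "__main__":
    noisy, clean = add_noise("我在做自己的事", rate=0.5, rng=random.Random(0))
    print(noisy, "->", clean)
```

Setting `phonetic_ratio` to 1.0 or 0.0 makes the sketch prefer one table exclusively, while values in between give a mixed scheme in the spirit of the hybrid replacement the abstract mentions.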
