
Construction of English Corpus Word Segmentation Feature Extraction Model Based on Improved K-SVD (cited by: 2)
Abstract: To improve the accuracy of word segmentation on English corpora, this paper takes an English corpus as its research object and builds a word segmentation feature extraction model using an improved K-SVD algorithm. Through the two steps of sparse coding and dictionary updating, the initial data are replaced with a higher-level feature representation, which is then used as the input to the K-SVD algorithm to obtain the optimal dictionary. On a model development platform, the model is assembled from four modules: text preprocessing, text network construction, feature extraction, and feature weighting. News material from the past ten years is selected as the English corpus and forms the training set. The word segmentation feature extraction results and the extraction quality metrics verify that the model is effective at semantic discrimination and text restoration, and that it has a significant advantage in accuracy and recall.
Author: ZHOU Yong-ying (周永英), Xingzhi College of Xi'an University of Finance and Economics, Xi'an 710038, China
Source: Techniques of Automation and Applications (《自动化技术与应用》), 2021, No. 11, pp. 127-130, 135 (5 pages)
Keywords: K-SVD algorithm; English corpus; word segmentation feature extraction; sparse coding
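
The abstract names the two alternating steps of K-SVD, sparse coding and dictionary updating, but does not describe the paper's improvement. The sketch below therefore shows only the standard, textbook K-SVD loop in NumPy, not the author's improved variant. It assumes orthogonal matching pursuit as the sparse coder (the abstract does not say which coder is used), and the names `omp` and `ksvd` and all parameters are hypothetical, chosen for illustration.

```python
import numpy as np


def omp(D, y, n_nonzero):
    """Greedy orthogonal matching pursuit: code y with at most n_nonzero atoms of D."""
    residual = y.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(n_nonzero):
        # Pick the atom most correlated with the current residual.
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k in support:
            break
        support.append(k)
        # Refit all selected atoms to y by least squares.
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
        if np.linalg.norm(residual) < 1e-10:
            break
    x = np.zeros(D.shape[1])
    x[support] = coef
    return x


def ksvd(Y, n_atoms, n_nonzero, n_iter=10, seed=0):
    """Standard K-SVD: factor Y (features x samples) as D @ X with sparse columns in X."""
    rng = np.random.default_rng(seed)
    # Assumption: random unit-norm atoms as the initial dictionary.
    D = rng.standard_normal((Y.shape[0], n_atoms))
    D /= np.linalg.norm(D, axis=0)
    X = np.zeros((n_atoms, Y.shape[1]))
    for _ in range(n_iter):
        # Step 1, sparse coding: fix D and code every sample.
        X = np.column_stack(
            [omp(D, Y[:, i], n_nonzero) for i in range(Y.shape[1])]
        )
        # Step 2, dictionary update: refine one atom at a time.
        for k in range(n_atoms):
            users = np.nonzero(X[k])[0]  # samples whose code uses atom k
            if users.size == 0:
                continue
            # Reconstruction error of those samples with atom k removed.
            X[k, users] = 0.0
            E = Y[:, users] - D @ X[:, users]
            # Best rank-1 fit of E gives the new atom and its coefficients.
            U, s, Vt = np.linalg.svd(E, full_matrices=False)
            D[:, k] = U[:, 0]
            X[k, users] = s[0] * Vt[0]
    return D, X
```

A call such as `D, X = ksvd(Y, n_atoms=12, n_nonzero=3)` returns a dictionary with unit-norm columns and a code matrix with at most three nonzero entries per column. In the setting the abstract describes, `Y` would hold the higher-level feature representation of the corpus rather than the raw data; how that representation is built is not stated there.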