Chinese named entity classification based on word2vec and conditional random fields (cited by: 8)
Abstract To address Chinese named entity recognition and classification, a method based on word vector clustering and conditional random fields is proposed. The linguistic characteristics of the corpus are analyzed and statistical features are selected to construct a feature template, which is used to recognize named entities in the test corpus. Because word vectors carry rich semantic information, the entity word vectors in the training set are grouped into clusters. The named entities recognized in the test set are then classified by comparing the similarity distance between each cluster and each recognized entity. Experimental results show that, across the eight categories defined by the method, the F1 score for named entity classification reaches a maximum of 93.04% and an average of 83.82%.
Authors MA Meng-cheng; YANG Qing-wen; Askar·Hamdulla; Turdi·Tohti (College of Information Science and Engineering, Xinjiang University, Urumqi 830046, China)
Source Computer Engineering and Design (《计算机工程与设计》), Peking University Core Journal, 2020, No. 9, pp. 2515-2522 (8 pages)
Funding National Natural Science Foundation of China (61562083); National Key Research and Development Program of China (2017YFC0820603).
Keywords named entity recognition; conditional random fields; word embedding; clustering; named entity classification
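
The classification step described in the abstract (comparing each recognized entity with clusters of training entity vectors) might look roughly like the sketch below. It is not the authors' implementation. The abstract does not name the clustering algorithm or the similarity measure, so the sketch assumes one cluster per category, represented by the mean of its vectors, and uses cosine similarity. The category names, the 3-dimensional toy vectors, and the helper functions `build_centroids` and `classify` are all hypothetical, and the CRF recognition step is not shown.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def build_centroids(train_vectors):
    """Map each category to the mean of its training entity vectors.

    Assumption: one cluster per category, summarized by its centroid.
    """
    return {
        label: np.mean(np.asarray(vecs, dtype=float), axis=0)
        for label, vecs in train_vectors.items()
    }


def classify(entity_vector, centroids):
    """Return the category whose centroid is most similar to the entity vector."""
    v = np.asarray(entity_vector, dtype=float)
    return max(centroids, key=lambda label: cosine_similarity(v, centroids[label]))


# Toy 3-dimensional vectors standing in for real word embeddings (hypothetical data).
train_vectors = {
    "PERSON": [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]],
    "LOCATION": [[0.1, 0.9, 0.0], [0.0, 0.8, 0.2]],
    "ORGANIZATION": [[0.1, 0.1, 0.9], [0.2, 0.0, 0.8]],
}

centroids = build_centroids(train_vectors)

# Vector of an entity already recognized in the test set (hypothetical).
predicted = classify([0.05, 0.85, 0.1], centroids)
print(predicted)
```

In a real setting the vectors would come from a trained embedding model, and a category could be covered by several clusters rather than a single centroid.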
