Dictionary Enhancement and Radical Perception for Sheep Disease Named Entity Recognition
Abstract The rapid development of information technology has produced a large volume of potentially valuable information on sheep diseases. However, little research has addressed named entity recognition (NER) for sheep disease texts, general-purpose models struggle to represent sheep disease semantics, and, compared with other domains, sheep disease NER involves more out-of-vocabulary words. To address this, a sheep disease entity recognition model with dictionary enhancement and radical perception is proposed. The method builds a sheep disease dictionary and integrates BERT's lower-layer character vectors with a similarity weight matrix over the vectors of the dictionary words they match, embedding dictionary information deep in the lower layers and remedying the general BERT model's difficulty in representing sheep disease information. In addition, a convolutional neural network extracts the distinctive pictographic radical features of sheep disease entities: characters are recursively decomposed into their radicals and components, and the resulting radical features are concatenated with the BERT output feature sequence and mapped to the input layer of the downstream BiLSTM-CRF model, improving the perception of sheep disease entity boundaries. Experiments show that the model adapts better to named entity recognition on sheep disease texts.
Authors YANG Peng (杨朋); WANG Tianyi (王天一) (College of Big Data and Information Engineering, Guizhou University, Guiyang 550025)
Source Computer & Digital Engineering (《计算机与数字工程》), 2024, No. 2, pp. 443-450 (8 pages)
Funding Supported by the Guizhou Provincial Science and Technology Program (Grant No. 黔科合支撑[2021]一般176号).
Keywords sheep disease; named entity recognition (NER); radical feature; bidirectional long short-term memory network (BiLSTM)
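
The abstract names three mechanisms without giving formulas: matching characters against a domain dictionary, weighting the matched words by similarity to the character vector, and recursively decomposing characters into radicals. The following is a minimal Python sketch of those ideas, not the authors' implementation. The toy `LEXICON`, the `DECOMP` table, the function names, the `max_len` limit, and the softmax-over-dot-products form of the similarity weighting are all assumptions made for illustration; the paper's actual dictionary, radical inventory, and weighting scheme are not stated in the abstract.

```python
import math

# Hypothetical toy data: the real sheep disease dictionary and radical
# decomposition table are not given in the abstract.
LEXICON = {"羊痘", "痢疾", "羔羊痢疾"}
DECOMP = {
    "痢": ["疒", "利"],
    "利": ["禾", "刂"],
    "疾": ["疒", "矢"],
    "痘": ["疒", "豆"],
}


def match_words(sentence, lexicon, max_len=4):
    """For each character position, collect the lexicon words that cover it."""
    matches = [[] for _ in sentence]
    for start in range(len(sentence)):
        last_end = min(len(sentence), start + max_len)
        for end in range(start + 1, last_end + 1):
            word = sentence[start:end]
            if word in lexicon:
                for i in range(start, end):
                    matches[i].append(word)
    return matches


def similarity_weights(char_vec, word_vecs):
    """Softmax over dot products between one character vector and the
    vectors of its matched words (an assumed form of the weighting)."""
    scores = [sum(c * w for c, w in zip(char_vec, vec)) for vec in word_vecs]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # shift for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def decompose(char, table):
    """Recursively split a character until no part has a table entry."""
    parts = table.get(char)
    if not parts:
        return [char]
    leaves = []
    for part in parts:
        leaves.extend(decompose(part, table))
    return leaves


if __name__ == "__main__":
    print(match_words("羔羊痢疾", LEXICON))
    print(decompose("痢", DECOMP))
    print(similarity_weights([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]))
```

In a full model along the lines the abstract describes, the weighted word vectors would be fused into BERT's lower-layer character representations, and the radical sequences would be embedded and passed through a CNN before being concatenated with the BERT outputs as input to the BiLSTM-CRF layer. Those components are omitted here because their details are not specified in the abstract.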