Named entity recognition method based on video resources and WoBERT-AT-BiLSTM-CRF
Abstract To address the lack of named entity recognition (NER) datasets in the education domain, this paper proposes building subject-specific datasets from video resources. Because traditional speech recognition models suffer from high word error rates and handle long sequences poorly, the paper proposes using the end-to-end speech recognition model Whisper instead. To address error accumulation and entity diversity in entity recognition, it proposes a word-level NER method, WoBERT-AT-BiLSTM-CRF. The WoBERT pre-trained model learns word vectors that carry contextual semantic information from the dataset; adversarial training (AT) generates adversarial samples to improve the model's robustness; a BiLSTM then produces a comprehensive text representation; and finally a CRF exploits the dependencies between sequence labels to further refine the recognition results. Experiments show that WoBERT-AT-BiLSTM-CRF outperforms the other comparison models, with precision, recall, and F1 score of 94.21%, 94.39%, and 94.30%, respectively. These results indicate that the method is feasible and offer a new approach to constructing named entities in the education domain.
Authors LIU Yang; TANG Hai; ZHU Menghan; XU Hongsheng (School of Electrical and Information Engineering, Hubei University of Automotive Technology, Shiyan 442002, Hubei, China)
Source Intelligent Computer and Applications, 2024, No. 10, pp. 63-69 (7 pages)
Funding Key Project of the Hubei Provincial Education Science Plan, 2022 (2022GA049).
Keywords named entity recognition; Whisper; WoBERT; adversarial training; bidirectional long short-term memory network (BiLSTM); conditional random field (CRF)
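
The abstract says the CRF layer uses the dependencies between sequence labels to refine the BiLSTM output. The sketch below illustrates that idea with plain Viterbi decoding over per-token emission scores and label-to-label transition scores. It is an illustration only, not the authors' implementation: the BIO tag set, the entity type name `TERM`, and all score values are assumptions made up for this example, and a trained CRF would learn its transition scores from data.

```python
from typing import Dict, List


def viterbi_decode(emissions: List[Dict[str, float]],
                   transitions: Dict[str, Dict[str, float]],
                   labels: List[str]) -> List[str]:
    """Return the label path with the highest emission + transition score."""
    # best[t][y] is the best score of any path that ends in label y at step t.
    best = [{y: emissions[0][y] for y in labels}]
    back = []  # back[t-1][y] is the best previous label for y at step t
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for y in labels:
            prev = max(labels, key=lambda p: best[-1][p] + transitions[p][y])
            scores[y] = best[-1][prev] + transitions[prev][y] + emissions[t][y]
            pointers[y] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final label.
    path = [max(labels, key=lambda y: best[-1][y])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Hypothetical BIO labels; "TERM" is an invented entity type.
labels = ["O", "B-TERM", "I-TERM"]

# Hypothetical per-token scores, standing in for BiLSTM outputs.
emissions = [
    {"O": 2.0, "B-TERM": 0.5, "I-TERM": 0.1},
    {"O": 0.2, "B-TERM": 1.0, "I-TERM": 1.2},
    {"O": 0.1, "B-TERM": 0.3, "I-TERM": 1.5},
]

# Hypothetical transition scores: only O -> I-TERM is penalised,
# because an entity cannot continue without having begun.
transitions = {a: {b: 0.0 for b in labels} for a in labels}
transitions["O"]["I-TERM"] = -10.0

# Per-token argmax ignores label dependencies.
greedy = [max(labels, key=lambda y: e[y]) for e in emissions]
decoded = viterbi_decode(emissions, transitions, labels)

print(greedy)   # ['O', 'I-TERM', 'I-TERM']
print(decoded)  # ['O', 'B-TERM', 'I-TERM']
```

Picking the top label per token yields an `I-TERM` directly after `O`, which is not a valid BIO sequence. Decoding with the transition penalty returns a well-formed path instead, which is the kind of correction a CRF layer contributes on top of the BiLSTM scores.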