Abstract
To improve the accuracy of organizational entity recognition and meet the needs of downstream tasks such as relation extraction, a staged approach to fine-grained named entity recognition is proposed. A BERT-BiLSTM (bidirectional long short-term memory)-CRF (conditional random field) model first identifies organizational entities at a coarse-grained level. Each organizational entity is then treated as a short text, and a BERT-CNN (convolutional neural network) classifier trained on a constructed dictionary of organizational entities assigns its fine-grained label. Experimental results show that the proposed staged method reaches a best F1 of 0.8117 on fine-grained organizational entity recognition, far exceeding the dictionary-matching method.
Authors
李磊
王路路
吐尔根·依布拉音
姜丽婷
艾山·吾买尔
LI Lei; WANG Lu-lu; Turgun Yibulayin; JIANG Li-ting; Aishan Wumaier (School of Information Science and Engineering, Xinjiang University, Urumqi 830046, China)
Source
《计算机工程与设计》
Peking University Core Journal (北大核心)
2022, Issue 1, pp. 245-251 (7 pages)
Computer Engineering and Design
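Funding
Sub-project of the National Key Research and Development Program of China (2017YFB1002103)
National Natural Science Foundation of China (61762084)
Open Project Fund of the Key Laboratory of Xinjiang Uygur Autonomous Region (2018D04019)
State Language Commission of China project (ZDI135-54)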
Keywords
coarse-grained
named entity recognition
fine-grained
organizational entity recognition
classifier