Abstract
This paper addresses the problem of extracting entities and relations from unstructured Chinese text, focusing on the challenges of Named Entity Recognition (NER) and Relation Extraction (RE). To strengthen both recognition and extraction, we design a pipeline model that handles NER and RE as separate stages and integrates external dictionary information together with Chinese semantic information. We also introduce a new NER model that combines Chinese pinyin, character, and word features. In addition, we use information such as entity distance, sentence length, and part of speech to improve relation extraction performance. We study how data, models, and inference algorithms interact in order to improve learning efficiency on this task. In experimental comparisons with several existing methods, our model achieves notable results.
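As an illustration of the relation extraction signals named in the abstract (entity distance, sentence length, and part of speech), the following is a minimal sketch of how such features might be computed for one entity pair. It is not the authors' implementation: the `Span` type, the `relation_features` function, the token-offset definition of distance, and the example tags are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Span:
    """Hypothetical entity span over token indices."""
    start: int  # index of the first token (inclusive)
    end: int    # index one past the last token (exclusive)


def relation_features(tokens: List[str], pos_tags: List[str],
                      head: Span, tail: Span) -> Dict[str, object]:
    """Collect simple surface features for one candidate entity pair.

    Assumption: distance is the number of tokens strictly between the
    two entities; the paper may define it differently.
    """
    if len(tokens) != len(pos_tags):
        raise ValueError("tokens and pos_tags must have the same length")
    # Order the two spans by position so the gap can be measured.
    first, second = sorted((head, tail), key=lambda s: s.start)
    return {
        "entity_distance": max(0, second.start - first.end),
        "sentence_length": len(tokens),
        "head_pos": pos_tags[head.start:head.end],
        "tail_pos": pos_tags[tail.start:tail.end],
        "between_pos": pos_tags[first.end:second.start],
    }


# Usage on a pre-segmented, pre-tagged sentence (tags are illustrative).
tokens = ["马云", "创立", "了", "阿里巴巴"]
pos_tags = ["nr", "v", "u", "nt"]
features = relation_features(tokens, pos_tags, Span(0, 1), Span(3, 4))
```

In a pipeline setting, the spans would come from the NER stage, and a feature dictionary like this would typically be embedded or concatenated with the sentence representation before relation classification.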
Source
《人工智能与机器人研究》
2024, Issue 2, pp. 425-440 (16 pages)
Artificial Intelligence and Robotics Research