Abstract
NOTAM (Notice to Airmen) information is essential intelligence for ensuring flight safety. Because NOTAM text is difficult to process in a unified format, this paper segments it into words through tokenization and extracts named entities from NOTAMs using a word embedding method. Because no labeled NOTAM dataset exists, which prevents the use of machine learning, relations are extracted with an improved KMP algorithm combined with the distance between entities. Experimental results show that this method extracts both entities and relations from NOTAM information and yields formatted data, addressing the lack of labeled datasets in the NOTAM domain.
Authors
Pan Zhengxiao (潘正宵)
Luo Yinhui (罗银辉)
Li Rongzhi (李荣枝)
School of Computer Science, Civil Aviation Flight University of China, Sichuan 618300
Source
Modern Computer (《现代计算机》)
2022, No. 2, pp. 82-87 (6 pages)
Keywords
information extraction
NOTAM information
word embedding
entity recognition
relation extraction