Abstract
Commodity news event extraction analyzes unstructured sentences in news items to extract the events they describe and the related information, providing a basis for supply and demand forecasting, price prediction, and question-answering systems. Existing work generally makes weak use of the correlation between candidate trigger words and entity vectors and extracts parameter (argument) roles with insufficient accuracy. Building on that work, this paper proposes an extraction model, SAT-GCN-DPT, based on a self-attention mechanism, an average-pooling graph convolutional network, and a dependency parse tree. The model consists of three modules: a ComBERT pre-training module, a trigger classification module based on self-attention, and a parameter role classification module that uses the average-pooling graph convolutional network and the dependency parse tree. Self-attention is applied to the input data to strengthen the association between candidate trigger words and entity vectors, and an average pooling function aggregates the graph convolution output to better recover the associations between events and improve classification accuracy. Experiments on the CON dataset show that the proposed model improves both accuracy and F1 score on trigger classification and parameter role classification.
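The two aggregation ideas named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the SAT-GCN-DPT implementation: the function names are hypothetical, the self-attention uses identity projections instead of learned weights, and the graph convolution assumes undirected dependency arcs with self-loops and row normalization, none of which the abstract specifies.

```python
import math

def self_attention(vectors):
    """Scaled dot-product self-attention with identity projections (assumption)."""
    d = len(vectors[0])
    out = []
    for q in vectors:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in vectors]
        m = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        out.append([sum(w * v[j] for w, v in zip(weights, vectors)) for j in range(d)])
    return out

def normalized_adjacency(n_tokens, dep_edges):
    """Row-normalized adjacency with self-loops built from (head, dependent) pairs."""
    adj = [[0.0] * n_tokens for _ in range(n_tokens)]
    for i in range(n_tokens):
        adj[i][i] = 1.0  # self-loop so each token keeps its own features
    for head, dep in dep_edges:
        adj[head][dep] = 1.0  # arcs treated as undirected (assumption)
        adj[dep][head] = 1.0
    return [[v / sum(row) for v in row] for row in adj]

def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def gcn_layer(adj_norm, features, weight):
    """One graph-convolution step: ReLU(A_norm @ H @ W)."""
    hidden = matmul(matmul(adj_norm, features), weight)
    return [[max(0.0, v) for v in row] for row in hidden]

def average_pool(node_vectors):
    """Average all node vectors into a single fixed-size vector."""
    n = len(node_vectors)
    return [sum(col) / n for col in zip(*node_vectors)]

# Toy sentence "prices rose sharply": "rose" (index 1) heads the other two tokens.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # made-up token features
adj_norm = normalized_adjacency(3, [(1, 0), (1, 2)])
identity = [[1.0, 0.0], [0.0, 1.0]]              # stand-in for a learned weight matrix
pooled = average_pool(gcn_layer(adj_norm, features, identity))
attended = self_attention(features)
```

Here `pooled` is one vector summarizing the whole dependency graph, which a classifier could consume for role prediction, while `attended` re-expresses each token as a weighted mix of all tokens in the sentence.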
Authors
罗茜雅
李红军
王子怡
甘晨灼
胡正浩
LUO Xiya; LI Hongjun; WANG Ziyi; GAN Chenzhuo; HU Zhenghao (College of Computer Science and Cyber Security (Pilot Software College), Chengdu University of Technology, Chengdu 610059, China)
Source
《成都理工大学学报(自然科学版)》
CAS
CSCD
Peking University Core Journal
2024, Issue 3, pp. 500-512 (13 pages)
Journal of Chengdu University of Technology: Science & Technology Edition
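Funding
National Natural Science Foundation of China (42050104)
Open Fund of the Key Laboratory of Deep-time Geography and Environment Reconstruction and Applications, Ministry of Natural Resources (DGERA20221102)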
Keywords
commodity news event extraction
self-attention mechanism
average pooling function
graph convolutional network
dependency parse tree