Abstract
[Objective] This paper aims to improve the model's perception of text structural features and of the correlation between structural features and text features, so as to fully mine the internal semantics of the text and guide the extraction task at a deeper level. [Methods] First, we extract features from the text, its syntax, and its parts of speech. Then, we fuse these features to obtain complete text structural features. Next, we design a multi-layer interactive attention mechanism that focuses on the deep correlation between text structural features and text features, and adopt a bilinear fusion strategy to preserve information integrity. Finally, we extract attributes with a common classifier. [Results] On public datasets, the attribute extraction accuracy of the proposed model is at least 1.2 percentage points higher than that of existing models. [Limitations] The model is insensitive to implicit attribute words; when a sentence contains more than three implicit attribute words, its performance drops sharply. [Conclusions] For the extraction of explicit commodity attribute words, modeling the correlation between text structural features and text features effectively improves attribute extraction accuracy.
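The steps listed under [Methods] can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the fusion operator, the attention scoring function, or the layer count, so this sketch assumes per-token concatenation for the structural features, dot-product attention with a residual sum, and two layers. All function names are hypothetical, and the feature encoders (BERT, BiGRU) and the final classifier are omitted.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def fuse_structure(syntax, pos):
    """Assumed fusion: concatenate syntax and part-of-speech vectors per token."""
    return [a + b for a, b in zip(syntax, pos)]

def attend(queries, keys):
    """Dot-product attention: each query becomes a weighted average of the keys."""
    dim = len(keys[0])
    attended = []
    for q in queries:
        weights = softmax([dot(q, k) for k in keys])
        attended.append([sum(w * k[d] for w, k in zip(weights, keys))
                         for d in range(dim)])
    return attended

def add(seq_a, seq_b):
    """Element-wise sum of two equally shaped vector sequences."""
    return [[a + b for a, b in zip(u, v)] for u, v in zip(seq_a, seq_b)]

def interactive_attention(text, struct, layers=2):
    """Each sequence attends to the other in every layer (residual sum assumed).

    Both sequences must use the same vector dimension.
    """
    for _ in range(layers):
        text_ctx = attend(text, struct)    # text tokens look at structure
        struct_ctx = attend(struct, text)  # structure looks at text tokens
        text, struct = add(text, text_ctx), add(struct, struct_ctx)
    return text, struct

def bilinear_fuse(t, s, W):
    """Bilinear fusion of two vectors: out[k] = sum_ij t[i] * W[k][i][j] * s[j]."""
    return [sum(t[i] * W_k[i][j] * s[j]
                for i in range(len(t)) for j in range(len(s)))
            for W_k in W]

# Toy two-token example with made-up feature values.
text_feats = [[0.1, 0.2], [0.3, 0.4]]
struct_feats = fuse_structure([[0.5], [0.7]], [[0.6], [0.8]])
text_out, struct_out = interactive_attention(text_feats, struct_feats)
identity = [[[1.0, 0.0], [0.0, 1.0]]]  # a single bilinear slice
fused = [bilinear_fuse(t, s, identity) for t, s in zip(text_out, struct_out)]
```

In a trained model the bilinear tensor `W` would be learned and each fused vector would feed a token-level classifier; here `W` is a fixed identity slice so the arithmetic is easy to follow.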
Authors
Su Mingxing
Wu Houyue
Li Jian
Huang Ju
Zhang Shunxiang
School of Computer Science and Engineering, Anhui University of Science & Technology, Huainan 232001, China
Source
Data Analysis and Knowledge Discovery (《数据分析与知识发现》)
CSCD
PKU Core Journal (北大核心)
2023, Issue 2, pp. 108-118 (11 pages)
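Funding
This work is one of the research outcomes of the National Natural Science Foundation of China (Grant No. 62076006)
and the University Synergy Innovation Program of Anhui Province (Grant No. GXXT-2021-008).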
Keywords
Attribute Extraction
Interactive Attention Mechanism
Dependency Relationship
BiGRU
BERT