Abstract
Traditional feature-based methods struggle to capture the structural information in Chinese sentences. To address this, a syntactic element recognition model based on a deep neural network, Bi-LSTM-Attention-CRF, is proposed. A Bi-LSTM network automatically extracts structural and semantic information from the raw input sentences; an attention mechanism computes classification weights for the abstract semantic features; and a CRF layer constrains the output labels to produce the optimal label sequence. Comparative experiments show that the model effectively identifies syntactic elements in sentences, reaching an F1 score of 84.85% on the annotated data set.
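The role of the CRF layer, constraining output labels and selecting the best-scoring label sequence, can be illustrated with Viterbi decoding. The following is a minimal sketch, not the authors' implementation: the `B`/`I`/`O` label set, the hand-set scores, and the function name `viterbi_decode` are all assumptions for illustration. In a trained model the per-token scores would come from the Bi-LSTM and attention layers, and the transition scores would be learned rather than fixed.

```python
from typing import Dict, List, Tuple

NEG_INF = float("-inf")


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[Tuple[str, str], float],
    labels: List[str],
) -> List[str]:
    """Return the highest-scoring label sequence.

    emissions[t][y]   : score of label y at position t (hypothetical values here)
    transitions[(p,c)]: score of moving from label p to label c;
                        a missing pair is treated as forbidden (-inf)
    """
    if not emissions:
        return []

    # best[y] = best score of any path that ends in label y at the current position
    best = {y: emissions[0][y] for y in labels}
    backpointers: List[Dict[str, str]] = []

    for t in range(1, len(emissions)):
        new_best: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for cur in labels:
            # Pick the previous label that maximizes the path score into `cur`
            prev = max(
                labels,
                key=lambda p: best[p] + transitions.get((p, cur), NEG_INF),
            )
            new_best[cur] = (
                best[prev] + transitions.get((prev, cur), NEG_INF) + emissions[t][cur]
            )
            pointers[cur] = prev
        best = new_best
        backpointers.append(pointers)

    # Trace back from the best final label
    path = [max(labels, key=lambda y: best[y])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical example: three tokens, BIO-style labels.
labels = ["B", "I", "O"]
# All transitions score 0.0 except O -> I, which is omitted and therefore forbidden.
transitions = {(p, c): 0.0 for p in labels for c in labels if (p, c) != ("O", "I")}
emissions = [
    {"B": 1.0, "I": 0.0, "O": 1.2},
    {"B": 0.1, "I": 2.0, "O": 0.5},
    {"B": 0.0, "I": 0.3, "O": 1.5},
]
print(viterbi_decode(emissions, transitions, labels))  # ['B', 'I', 'O']
```

Choosing the top-scoring label independently at each position would give `O, I, O` here, which violates the `O -> I` constraint. Decoding over the whole sequence instead returns `B, I, O`, which is the kind of label-level consistency the abstract attributes to the CRF layer.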
Authors
CHEN Yanping (陈艳平)
FENG Li (冯丽)
QIN Yongbin (秦永彬)
HUANG Ruizhang (黄瑞章)
Affiliations: School of Computer Science and Technology, Guizhou University, Guiyang 550025, Guizhou, China; Data Fusion and Analysis Laboratory (Guizhou University), Guiyang 550025, Guizhou, China; Guizhou Intelligent Human-Computer Interaction Engineering Technology Research Center, Guiyang 550025, Guizhou, China
Source
Journal of Shandong University (Engineering Science) (《山东大学学报(工学版)》)
CAS
CSCD
Peking University Core Journals (北大核心)
2020, No. 2, pp. 44-49 (6 pages)
Funding
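Key Program of the Joint Funds of the National Natural Science Foundation of China (U1836205)
Major Research Plan of the National Natural Science Foundation of China (91746116)
Major Applied Basic Research Program of Guizhou Province (黔科合JZ字[2014]2001)
Science and Technology Major Special Program of Guizhou Province (黔科合重大专项字[2017]3002)
Natural Science Foundation of Guizhou Province (黔科合基础[2018]1035)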
Keywords
syntactic elements
information extraction
deep neural network