Abstract
This paper combines attention mechanisms with deep reinforcement learning and uses label information to study how to autonomously learn effective structured representations of Korean text. Two structured representation models are proposed: Information Distilled Attention (ID-Attention, IDA) and Hierarchically Structured Attention (HS-Attention, HSA). IDA selects the important words relevant to the task, while HSA discovers phrase structure within a sentence. Structure discovery in both representation models is a sequential decision problem and is implemented with Policy Gradient (PG) from reinforcement learning. The experimental results show that IDA can identify important Korean words, that HSA can extract sentence structure well, and that the models perform well on the text classification task. In addition, the output of the two models is a useful aid for corpus annotation.
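The abstract frames structure discovery as a sequential decision problem solved with Policy Gradient. The sketch below is only a minimal illustration of that idea for IDA-style word selection, not the authors' implementation. It assumes a per-word keep/drop policy, a hypothetical toy reward standing in for the signal a text classifier would provide, and a REINFORCE update with a moving-average baseline. All function names and constants (`toy_reward`, `reinforce_step`, `length_penalty`, the learning rate) are assumptions made for the example.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sample_actions(theta, sentence, rng):
    """Sample one keep (1) / drop (0) action per word, left to right."""
    actions = []
    for word in sentence:
        p_keep = sigmoid(theta[word])
        actions.append(1 if rng.random() < p_keep else 0)
    return actions


def toy_reward(sentence, actions, signal_word, length_penalty=0.1):
    """Hypothetical reward: +1 if the signal word is kept, minus a cost per kept word.

    A real system would derive the reward from a classifier; this is a stand-in.
    """
    kept = [w for w, a in zip(sentence, actions) if a == 1]
    bonus = 1.0 if signal_word in kept else 0.0
    return bonus - length_penalty * len(kept)


def reinforce_step(theta, sentence, signal_word, rng, lr=0.5, baseline=0.0):
    """One REINFORCE update on the per-word logits in theta; returns the reward."""
    actions = sample_actions(theta, sentence, rng)
    reward = toy_reward(sentence, actions, signal_word)
    advantage = reward - baseline
    for word, action in zip(sentence, actions):
        p_keep = sigmoid(theta[word])
        # Gradient of log Bernoulli(action; sigmoid(theta)) w.r.t. theta is (action - p_keep).
        theta[word] += lr * advantage * (action - p_keep)
    return reward


def train(num_steps=2000, vocab_size=6, seed=0):
    """Train on toy sentences; word 0 is the only task-relevant word (an assumption)."""
    rng = random.Random(seed)
    theta = [0.0] * vocab_size
    baseline = 0.0
    for _ in range(num_steps):
        # Each sentence holds the signal word plus three distinct filler words.
        sentence = [0] + rng.sample(range(1, vocab_size), 3)
        rng.shuffle(sentence)
        reward = reinforce_step(theta, sentence, 0, rng, baseline=baseline)
        # The moving-average baseline reduces the variance of the gradient estimate.
        baseline = 0.9 * baseline + 0.1 * reward
    return [sigmoid(t) for t in theta]


if __name__ == "__main__":
    keep_probs = train()
    for word, p in enumerate(keep_probs):
        print(f"word {word}: keep probability {p:.3f}")
```

In this toy setting the policy should learn to keep the signal word and drop the fillers. The phrase-boundary decisions of HSA would need a different action space (for example, inside-phrase versus end-of-phrase), which is not shown here.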
Authors
ZHAO Ya-hui (赵亚慧)
YANG Fei-yang (杨飞扬)
ZHANG Zhen-guo (张振国)
CUI Rong-yi (崔荣一)
Department of Computer Science & Technology, Yanbian University, Yanji 133002, China
Source
Journal of Jilin University: Engineering and Technology Edition (《吉林大学学报(工学版)》)
EI
CAS
CSCD
PKU Core Journals (北大核心)
2021, No. 4, pp. 1387-1395 (9 pages)
Funding
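Scientific Research Project of the State Language Commission of China (YB135-76)
Yanbian University First-Class Discipline Construction Project in Foreign Languages and Literature (18YLPY13)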
Keywords
artificial intelligence
deep reinforcement learning
attention mechanism
text structure discovery
Korean natural language processing