Abstract
Word vectors produced by traditional word-embedding models struggle to represent polysemy and to capture dependencies between words. To address this, this paper proposes a user-profiling (user portrait) method based on the self-attention mechanism. First, self-attention encodes information from all words into each word's representation, learning the semantics of the words in a query sentence and handling both polysemy (one word, several meanings) and synonymy (several words, one meaning). Next, multi-head attention strengthens the model so that it captures the complex semantic relations between the words in the query more fully. Finally, a support vector machine (SVM) classifier predicts the users' basic attributes, and these classification results are used to build the user profile. Experimental results show that the model achieves higher classification accuracy than methods based on word-vector models and LDA models.
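The abstract gives no implementation details, so the following is only a minimal, dependency-free sketch of the general technique it names, not the authors' model. It shows scaled dot-product self-attention mixing every word vector into every other, and a multi-head variant that attends over slices of the embedding. Several pieces are assumptions made for illustration: the query/key/value projections are the identity (a trained model would learn them), the embeddings in `query_embeddings` are made-up values, and `mean_pool` is a hypothetical way to obtain one fixed-length feature vector per query.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Scaled dot-product self-attention with identity Q/K/V projections.

    Assumption: a trained model would use learned projection matrices;
    they are omitted here so the mixing step itself is easy to see.
    """
    scale = math.sqrt(len(vectors[0]))
    outputs = []
    for query in vectors:
        # Similarity of this word to every word in the sentence.
        scores = [sum(q * k for q, k in zip(query, key)) / scale
                  for key in vectors]
        weights = softmax(scores)
        # Each output is a weighted average of all word vectors.
        mixed = [sum(w * v[d] for w, v in zip(weights, vectors))
                 for d in range(len(query))]
        outputs.append(mixed)
    return outputs


def multi_head_attention(vectors, num_heads):
    """Split each vector into num_heads slices, attend per slice, concatenate."""
    dim = len(vectors[0])
    if dim % num_heads != 0:
        raise ValueError("embedding size must be divisible by num_heads")
    head_dim = dim // num_heads
    heads = []
    for h in range(num_heads):
        lo, hi = h * head_dim, (h + 1) * head_dim
        heads.append(self_attention([v[lo:hi] for v in vectors]))
    # Concatenate the per-head outputs back into full-width vectors.
    return [sum((head[i] for head in heads), []) for i in range(len(vectors))]


def mean_pool(vectors):
    """Average word vectors into one fixed-length query representation."""
    n = len(vectors)
    return [sum(v[d] for v in vectors) / n for d in range(len(vectors[0]))]


# Hypothetical 4-dimensional embeddings for a three-word query.
query_embeddings = [
    [1.0, 0.0, 0.5, 0.2],
    [0.0, 1.0, 0.3, 0.8],
    [0.9, 0.1, 0.4, 0.1],
]
contextual = multi_head_attention(query_embeddings, num_heads=2)
features = mean_pool(contextual)
```

In a pipeline like the one the abstract outlines, a vector such as `features` would then be passed to an SVM classifier to predict a user attribute; that step is left out here because the abstract does not specify the kernel, features, or labels.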
Authors
ZHANG Wei (张维); CHEN Zeyu (陈泽宇)
School of Electrical and Electronic Engineering, Shanghai University of Engineering and Technology, Shanghai 201620, China
Source
Intelligent Computer and Applications (《智能计算机与应用》)
2020, No. 12, pp. 49-53 (5 pages)