Abstract
To address the limitation of a single interaction channel, this paper proposes a multi-channel semantic information fusion interaction method. Different network structures extract semantic features from raw speech, image, and behavioral information; a hidden Markov model strengthens the interaction between the different features; and an attention mechanism fuses the semantic information to capture deep semantic features. The effectiveness of the proposed method is verified on the IEMOCAP dataset.
Authors
王出航
陈丹
WANG Chuhang; CHEN Dan (College of Computer Science and Technology, Changchun Normal University, Changchun 130026, China; Asset and Laboratory Management Department, Changchun Normal University, Changchun 130026, China)
Source
《长春工业大学学报》
CAS
2024, No. 2, pp. 160-163 (4 pages)
Journal of Changchun University of Technology
Funding
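Science and Technology Research Project of the Education Department of Jilin Province (JJKH20220839KJ)
Key Research and Development Project of the Science and Technology Department of Jilin Province (20210203161SF)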
Keywords
multi-channel
semantic features
fusion interaction
attention mechanism