Abstract
Social media gives people a simple way to express their feelings: they use these platforms to show what they like or dislike and how they feel about a situation. Emotion recognition and classification is a common research task in natural language processing, in which a model detects such emotions. The task is quite challenging for Indian languages because data are scarce and because, in a multilingual society, people often write code-mixed text on social media. To analyze such data, this paper builds a dataset of 12,000 Hindi-English code-mixed texts collected from different sources and annotates them with three emotions: happiness, sadness, and anger. A pre-trained bilingual model generates the feature vectors, and deep neural networks serve as the classification models. CNN-BiLSTM achieves a classification accuracy of 83.21%, outperforming the other models tested.
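The pipeline named in the abstract (feature vectors, then a one-dimensional convolution, then a bidirectional LSTM, then a three-way emotion classifier) can be illustrated with a minimal forward-pass sketch. This is not the paper's implementation: the layer sizes, the random untrained weights, the omission of bias terms, the random stand-in for the pre-trained bilingual embeddings, and every function name below are assumptions made purely for illustration.

```python
import math
import random

random.seed(0)  # fixed seed so the illustrative weights are reproducible


def rand_matrix(rows, cols):
    """Small random weights; a real model would learn these by training."""
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def conv1d_relu(seq, kernels, width):
    """Valid 1D convolution over time: each filter sees `width` consecutive vectors."""
    out = []
    for t in range(len(seq) - width + 1):
        window = [x for vec in seq[t:t + width] for x in vec]  # flatten the window
        out.append([max(0.0, y) for y in matvec(kernels, window)])
    return out


def lstm_last_state(seq, W, hidden):
    """Run one LSTM direction; W maps [input; h] to the four gate pre-activations."""
    h = [0.0] * hidden
    c = [0.0] * hidden
    for x in seq:
        z = matvec(W, x + h)
        i = [sigmoid(v) for v in z[0:hidden]]               # input gate
        f = [sigmoid(v) for v in z[hidden:2 * hidden]]      # forget gate
        o = [sigmoid(v) for v in z[2 * hidden:3 * hidden]]  # output gate
        g = [math.tanh(v) for v in z[3 * hidden:4 * hidden]]  # candidate cell
        c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h


def softmax(v):
    m = max(v)
    e = [math.exp(x - m) for x in v]
    s = sum(e)
    return [x / s for x in e]


LABELS = ["happy", "sad", "angry"]          # the three emotions in the abstract
EMB, FILTERS, WIDTH, HIDDEN = 8, 6, 3, 4    # hypothetical sizes

conv_w = rand_matrix(FILTERS, EMB * WIDTH)
fwd_w = rand_matrix(4 * HIDDEN, FILTERS + HIDDEN)
bwd_w = rand_matrix(4 * HIDDEN, FILTERS + HIDDEN)
out_w = rand_matrix(len(LABELS), 2 * HIDDEN)


def classify(token_vectors):
    """CNN features -> forward and backward LSTM passes -> softmax over emotions."""
    feats = conv1d_relu(token_vectors, conv_w, WIDTH)
    h_fwd = lstm_last_state(feats, fwd_w, HIDDEN)
    h_bwd = lstm_last_state(feats[::-1], bwd_w, HIDDEN)
    return softmax(matvec(out_w, h_fwd + h_bwd))


# Stand-in for ten token embeddings from a pre-trained bilingual model.
sentence = [[random.uniform(-1, 1) for _ in range(EMB)] for _ in range(10)]
probs = classify(sentence)
print(dict(zip(LABELS, probs)))
```

Because the weights are untrained, the printed probabilities carry no meaning; the sketch only shows how the convolutional features feed both LSTM directions before classification.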
Source
《科技创新与应用》
2022, Issue 12, pp. 28-30, 34 (4 pages in total)
Technology Innovation and Application
Keywords
code-mixed English text
emotion detection
one-dimensional convolutional neural network
long short-term memory