The global growth of the Internet and the rapid expansion of social networks such as Facebook make multilingual sentiment analysis of social media content very necessary. This paper performs the first sentiment analys...The global growth of the Internet and the rapid expansion of social networks such as Facebook make multilingual sentiment analysis of social media content very necessary. This paper performs the first sentiment analysis on code-mixed Bambara-French Facebook comments. We develop four Long Short-term Memory(LSTM)-based models and two Convolutional Neural Network(CNN)-based models, and use these six models, Na?ve Bayes, and Support Vector Machines(SVM) to conduct experiments on a constituted dataset. Social media text written in Bambara is scarce. To mitigate this weakness, this paper uses dictionaries of character and word indexes to produce character and word embedding in place of pre-trained word vectors. We investigate the effect of comment length on the models and perform a comparison among them. The best performing model is a one-layer CNN deep learning model with an accuracy of 83.23 %.展开更多
基金Supported by the National Natural Science Foundation of China(61272451,61572380,61772383 and 61702379)the Major State Basic Research Development Program of China(2014CB340600)
文摘The global growth of the Internet and the rapid expansion of social networks such as Facebook make multilingual sentiment analysis of social media content very necessary. This paper performs the first sentiment analysis on code-mixed Bambara-French Facebook comments. We develop four Long Short-term Memory(LSTM)-based models and two Convolutional Neural Network(CNN)-based models, and use these six models, Na?ve Bayes, and Support Vector Machines(SVM) to conduct experiments on a constituted dataset. Social media text written in Bambara is scarce. To mitigate this weakness, this paper uses dictionaries of character and word indexes to produce character and word embedding in place of pre-trained word vectors. We investigate the effect of comment length on the models and perform a comparison among them. The best performing model is a one-layer CNN deep learning model with an accuracy of 83.23 %.