
A Stacking-Based Ensemble Approach with Embeddings from Language Models for Depression Detection from Social Media Text

Abstract Depression is a major public health problem worldwide and contributes significantly to poor health and poverty. The rate at which people are affected is far higher than the rate at which the disease is medically treated, so it often remains untreated and the suffering continues. Machine learning has been widely used to detect depressed individuals from the content they post on online social networks. A review of related work shows, however, that stacking has rarely been applied to diagnosing depression. This study implements a stacking ensemble based on Extra Trees, Extreme Gradient Boosting, Light Gradient Boosting, and a multi-layer perceptron, and compares its performance with state-of-the-art bagging and boosting ensemble learners. To better evaluate the effectiveness of the proposed stacking approach, three pretrained word embedding techniques, Word2vec, Global Vectors (GloVe), and Embeddings from Language Models (ELMo), were employed with two datasets. In addition, a corrected resampled paired t-test was applied to test the significance of the stacked model's accuracy against the baseline accuracy. The experimental results show that the stacking approach yields favourable results, with a best accuracy of 99.54%.
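The abstract does not spell out the corrected resampled paired t-test. The sketch below assumes the commonly used form of the statistic, in which the variance of the per-run accuracy differences is inflated by the test-to-train size ratio to account for overlapping resampled training sets. The function name, the accuracy lists, and the split sizes are all hypothetical illustrations and are not taken from the paper.

```python
import math
from statistics import mean, variance


def corrected_resampled_t(stacked_acc, baseline_acc, n_train, n_test):
    """Return (t statistic, degrees of freedom) for paired per-run accuracies.

    Assumed form: t = mean(d) / sqrt((1/k + n_test/n_train) * var(d)),
    where d are the k per-run accuracy differences and var is the
    sample variance (k - 1 in the denominator).
    """
    if len(stacked_acc) != len(baseline_acc):
        raise ValueError("both models need one accuracy per resampling run")
    diffs = [s - b for s, b in zip(stacked_acc, baseline_acc)]
    k = len(diffs)
    if k < 2:
        raise ValueError("at least two resampling runs are required")
    var = variance(diffs)
    if var == 0:
        raise ValueError("zero variance in differences; t is undefined")
    # The n_test / n_train term is the correction; dropping it gives the
    # ordinary paired t statistic, which tends to overstate significance.
    t_stat = mean(diffs) / math.sqrt((1.0 / k + n_test / n_train) * var)
    return t_stat, k - 1


# Hypothetical accuracies from five resampling runs (not the paper's data).
stacked = [0.95, 0.96, 0.94, 0.97, 0.95]
baseline = [0.93, 0.93, 0.93, 0.94, 0.92]
t_stat, dof = corrected_resampled_t(stacked, baseline, n_train=900, n_test=100)
print(f"t = {t_stat:.3f} with {dof} degrees of freedom")
```

The statistic is compared against a Student t distribution with k - 1 degrees of freedom to obtain a p-value; that step is left out here to keep the sketch dependency-free.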
Authors Akwa Gaius; Ronald Waweru Mwangi; Antony Ngunyi (Department of Mathematics, Pan African University Institute for Basic Sciences, Technology and Innovation (PAUSTI), Nairobi, Kenya; Department of Computing, Jomo Kenyatta University of Agriculture and Technology, Nairobi, Kenya; Department of Statistics and Actuarial Sciences, Dedan Kimathi University of Technology, Nairobi, Kenya)
Source Journal of Data Analysis and Information Processing (数据分析和信息处理, English-language), 2023, Issue 4, pp. 420-453 (34 pages)
Keywords Machine Learning; Natural Language Processing; Depression