A Stacking-Based Ensemble Approach with Embeddings from Language Models for Depression Detection from Social Media Text

Abstract: Depression is a major public health problem worldwide and contributes significantly to poor health and poverty. The number of people affected is very high relative to the number who receive medical treatment, so the disease often remains untreated and suffering continues. Machine learning has been widely used to detect depressed individuals from the content they post on online social networks. A review of the related work shows that stacking has rarely been applied to diagnosing depression. This study implements stacking based on Extra Tree, Extreme Gradient Boosting, Light Gradient Boosting and a Multi-layer Perceptron, and compares its performance with state-of-the-art bagging and boosting ensemble learners. To better evaluate the proposed stacking approach, three pretrained word-embedding techniques, Word2vec, Global Vectors (GloVe) and Embeddings from Language Models (ELMo), were employed with two datasets. In addition, a corrected resampled paired t-test was applied to test the significance of the stacked accuracy against the baseline accuracy. The experimental results show that the stacking approach yields favourable results, with a best accuracy of 99.54%.
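The abstract names a corrected resampled paired t-test but does not give its formula. The sketch below assumes the commonly used form of that correction (usually attributed to Nadeau and Bengio), in which the variance of the per-resample accuracy differences is inflated by the test-to-train size ratio. The function name, argument names and the sample numbers are illustrative assumptions, not values from the study.

```python
import math
from statistics import mean, variance


def corrected_resampled_t(diffs, n_train, n_test):
    """Return (t statistic, degrees of freedom) for paired accuracy differences.

    diffs   -- per-resample differences, e.g. stacked accuracy minus baseline
               accuracy (hypothetical inputs)
    n_train -- number of training instances in each resample
    n_test  -- number of test instances in each resample
    """
    k = len(diffs)
    if k < 2:
        raise ValueError("need at least two resamples")
    s2 = variance(diffs)  # sample variance, divisor k - 1
    if s2 == 0:
        raise ValueError("differences have zero variance")
    # The plain paired t-test uses 1/k; the correction adds n_test/n_train
    # to account for the overlap between resampled training sets.
    scale = 1.0 / k + n_test / n_train
    t_stat = mean(diffs) / math.sqrt(scale * s2)
    return t_stat, k - 1


# Illustrative numbers only: five resamples with a 900/100 train/test split.
example_diffs = [0.02, 0.01, 0.03, 0.02, 0.02]
t_value, dof = corrected_resampled_t(example_diffs, n_train=900, n_test=100)
```

The returned statistic would be compared against a Student t distribution with `dof` degrees of freedom. Because of the extra `n_test / n_train` term, it is always smaller in magnitude than the uncorrected paired t statistic, which makes the test more conservative.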

Authors: Akwa Gaius; Ronald Waweru Mwangi; Antony Ngunyi

Affiliations: Department of Mathematics, Pan African University Institute for Basic Sciences, Technology and Innovation (PAUSTI), Nairobi, Kenya; Department of Computing, Jomo Kenyatta University of Agriculture and Technology, Nairobi, Kenya; Department of Statistics and Actuarial Sciences, Dedan Kimathi University of Technology, Nairobi, Kenya

Source: Journal of Data Analysis and Information Processing, 2023, Issue 4, pp. 420-453 (34 pages)

Keywords: Machine Learning; Natural Language Processing; Depression

Classification code: O15 [Science: Basic Mathematics]
