
Multi-label News Text Classification Based on the ALBERT-TextCNN Model
Abstract: For the multi-label news text classification task faced by managers of intelligent information push services, a solution based on the ALBERT-CNN model is proposed. It combines the ALBERT pre-trained model with a TextCNN convolutional neural network for semantic understanding and feature extraction. ALBERT performs semantic filtering to capture the content and topic of each news text, and its output is passed to TextCNN for classification and label prediction. A sigmoid function outputs a probability for each label, which enables multi-label classification. In experiments on 382,688 items from the Toutiao (今日头条) client, the ALBERT-CNN model reaches an F1-score of 92.05%, a recall of 96.8%, and a precision of 90%. Compared with the plain ALBERT and ALBERT-Dense models, it improves F1-score and recall, while its precision is slightly lower than that of ALBERT-Dense. The study offers a new approach to improving information push efficiency and reducing the spread of misleading information.
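The sigmoid output step and the reported metrics can be illustrated with a minimal sketch. It is not the authors' implementation: the logits, the four labels, and the 0.5 decision threshold are made-up assumptions, and micro-averaging is used only for illustration, since the abstract does not state which averaging scheme produced the reported figures.

```python
import math

def sigmoid(z: float) -> float:
    """Logistic function: maps a raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_labels(logits, threshold=0.5):
    """Decide each label independently: 1 if its probability reaches the threshold.

    The 0.5 threshold is an assumption; the abstract does not give one.
    """
    return [1 if sigmoid(z) >= threshold else 0 for z in logits]

def micro_prf(y_true, y_pred):
    """Micro-averaged precision, recall and F1 over all (sample, label) pairs."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Hypothetical classifier logits for three news items over four labels.
logits = [
    [2.0, -1.0, 0.3, -3.0],
    [-0.5, 1.5, -2.0, 0.1],
    [0.2, 0.4, -1.0, -0.7],
]
# Hypothetical ground-truth label sets for the same items.
y_true = [
    [1, 0, 0, 0],
    [0, 1, 0, 1],
    [1, 0, 0, 0],
]

y_pred = [predict_labels(row) for row in logits]
precision, recall, f1 = micro_prf(y_true, y_pred)
print(y_pred)
print(round(precision, 4), round(recall, 4), round(f1, 4))
```

In this toy data every true label is recovered but two extra labels are predicted, so recall is higher than precision. That is the same direction as the trade-off the abstract reports, although the numbers here carry no relation to the paper's results.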
Authors: MAI Yongxin (麦咏欣); LIN Zhihao (林志豪); XI Juanxia (葸娟霞). School of Information Management and Engineering, Neusoft Institute Guangdong, Foshan 528225, China
Source: Modern Information Technology (《现代信息科技》), 2024, No. 20, pp. 31-36 (6 pages)
Funding: Guangdong Provincial College Students' Innovation and Entrepreneurship Training Program (S202312574015).
Keywords: multi-label classification; ALBERT; TextCNN; natural language processing (NLP)
