Abstract
To predict sentiment in online public opinion on we-media (self-media), this paper proposes SPM-AL (A Sentiment Prediction Method for Self-Media Online Public Opinion based on Active Learning), a prediction method based on active learning. In the data preprocessing stage, SPM-AL extracts keywords with the TextRank algorithm and produces word embeddings with the FastText model. In the model construction stage, it selects representative texts from the unlabeled we-media texts for expert labeling and builds the model with a logistic regression classifier. Results on a real-world data set show that SPM-AL needs to label only 14.21% of the data for the model's F1 value to reach 86.23, outperforming the model built with all of the training data.
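The model construction stage can be illustrated with a minimal sketch of a pool-based active learning loop around logistic regression. It rests on several assumptions that are not taken from the paper: the abstract does not say how "representative" texts are chosen, so uncertainty sampling (picking the item whose predicted probability is closest to 0.5) is used here as a stand-in; random 2-D vectors stand in for FastText embeddings; the `oracle` callback stands in for the expert annotator; and all function names are hypothetical.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logreg(X, y, epochs=300, lr=0.1):
    """Fit logistic regression by batch gradient descent; returns (weights, bias)."""
    dim, n = len(X[0]), len(X)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(dim):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(model, x):
    """Probability of the positive class under the fitted model."""
    w, b = model
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def fit_on(pool, labeled):
    """Train on the currently labeled subset of the pool."""
    idx = sorted(labeled)
    return train_logreg([pool[i] for i in idx], [labeled[i] for i in idx])


def active_learning(pool, oracle, seed_idx, budget):
    """Query `budget` extra labels, one per round, by uncertainty sampling
    (an assumed criterion, not necessarily the one used in SPM-AL)."""
    labeled = {i: oracle(i) for i in seed_idx}
    model = fit_on(pool, labeled)
    for _ in range(budget):
        unlabeled = [i for i in range(len(pool)) if i not in labeled]
        if not unlabeled:
            break
        # Most uncertain item: predicted probability closest to 0.5.
        pick = min(unlabeled, key=lambda i: abs(predict_proba(model, pool[i]) - 0.5))
        labeled[pick] = oracle(pick)  # the "expert" supplies the label
        model = fit_on(pool, labeled)
    return model, labeled


def f1_score(y_true, y_pred):
    """F1 for the positive class (label 1); 0.0 when there are no true positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def make_toy_pool(n=200, seed=0):
    """Synthetic 2-D vectors standing in for text embeddings (not real data)."""
    rng = random.Random(seed)
    labels = [i % 2 for i in range(n)]
    vectors = [[rng.gauss(2.0 if c else -2.0, 1.0) for _ in range(2)] for c in labels]
    return vectors, labels


if __name__ == "__main__":
    pool, truth = make_toy_pool()
    model, labeled = active_learning(pool, lambda i: truth[i], seed_idx=[0, 1], budget=10)
    preds = [1 if predict_proba(model, x) >= 0.5 else 0 for x in pool]
    print(f"labeled {len(labeled)} of {len(pool)} items, F1 = {f1_score(truth, preds):.3f}")
```

The toy run labels 12 of 200 synthetic items, which only mirrors the idea of training on a small labeled fraction; it does not reproduce the 14.21% labeling rate or the F1 value reported for the real data set.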
Authors
YANG Fan (杨帆)
LI Fang (李芳)
WU Xinhua (吴新华)
Jiangsu College of Engineering and Technology, Nantong 226006, China
Source
Journal of Jiangsu College of Engineering and Technology (《江苏工程职业技术学院学报》)
2022, No. 3, pp. 19-22 (4 pages)
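Funding
Jiangsu College of Engineering and Technology Research Project (No. GYKY/2019/9)
Jiangsu College of Engineering and Technology Teaching Reform Research Project (No. GYJY202021)
Jiangsu Provincial University Philosophy and Social Science Research Project (No. 2020SJB0836)
Jiangsu Provincial University Philosophy and Social Science Research Special Project on Ideological and Political Education (No. 2021SJB0874)
Jiangsu Provincial Modern Educational Technology Research Project (No. 2022-R-105790)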
Keywords
we-media
online public opinions
sentiment prediction
active learning