Abstract
The rapid development of educational informatization has driven the spread of online and blended learning, and research on learning risk early warning has attracted growing attention. However, traditional early warning algorithms are designed for balanced datasets, whereas educational datasets are highly imbalanced, which makes it harder for these algorithms to identify at-risk students. To address this problem, a blended learning risk early warning framework (VRFRisk) based on a variational autoencoder and random forest is proposed. The framework uses the variational autoencoder to balance the imbalanced data and the random forest algorithm for classification. Experimental results on a given dataset show that VRFRisk handles imbalanced educational datasets effectively and provides better and more stable predictions than the baseline methods in terms of recall and F-score.
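The abstract reports results in terms of recall and F-score. A minimal sketch of why these metrics matter on imbalanced data follows. The label counts (95 not-at-risk, 5 at-risk) and both sets of predictions are hypothetical illustrations, not data or results from the paper, and the helper names are assumptions introduced here.

```python
def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall, and F1 for the chosen positive (at-risk) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical imbalanced class: 95 students not at risk (0), 5 at risk (1).
y_true = [0] * 95 + [1] * 5

# Hypothetical predictor A: never flags anyone as at risk.
pred_never_flags = [0] * 100

# Hypothetical predictor B: 6 false alarms, then catches 4 of the 5 at-risk students.
pred_flags_some = [1] * 6 + [0] * 89 + [1] * 4 + [0] * 1

acc_a = accuracy(y_true, pred_never_flags)
_, recall_a, f1_a = precision_recall_f1(y_true, pred_never_flags)

acc_b = accuracy(y_true, pred_flags_some)
precision_b, recall_b, f1_b = precision_recall_f1(y_true, pred_flags_some)
```

In this toy case predictor A has the higher accuracy (0.95 against 0.93) even though it finds no at-risk students (recall 0.0), while predictor B reaches recall 0.8 and an F1 of about 0.53. This is why accuracy alone can mislead on a highly imbalanced dataset.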
Authors
于海霞
王家骐
YU Hai-xia; WANG Jia-qi (School of Information Engineering and Media, Hefei Technology College, Hefei, Anhui 230000, China; School of Computer Science and Technology, Anhui University, Hefei, Anhui 230000, China; Department of Computer Information Engineering, Anhui Vocational and Technical College of Industry and Trade, Huainan, Anhui 232001, China)
Source
《河北北方学院学报(自然科学版)》
2022, No. 11, pp. 1-6 (6 pages)
Journal of Hebei North University:Natural Science Edition
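Funding
Key Project of the 2020 Anhui Provincial University Natural Science Foundation: "Research and Design of an Online Learning Early Warning System Based on Educational Big Data" (KJ2020A0990)
2020 Anhui Province Domestic Visiting Scholar Program for Outstanding Top Talents (gxgnfx2020157)
2020 Anhui Provincial University Quality Engineering Project: "C Language Programming" offline course (2020kfkc479)
2021 Hefei Technology College Talent Project: "Application of Data Mining in Student Performance Analysis" (2021KYQDZ011)
2021 Hefei Technology College Quality Engineering Project: "Construction of a Whole-Process Blended Teaching Evaluation System" (2021JYXM05).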
Keywords
variational autoencoder
blended learning
risk early warning
random forest
imbalanced dataset