Abstract
University students have considerable freedom in planning their courses, which makes grade records irregular and makes it difficult for researchers to analyze and use students' grades in prerequisite courses effectively. Existing grade prediction methods generally ignore the fact that prerequisite course grades are often incomplete, which leads to poor prediction accuracy. This paper proposes a missing-data imputation method based on locally optimal reconstruction from k-nearest neighbors, which effectively suppresses the impact of missing prerequisite course grades on the accuracy of the prediction model. Experiments show that the proposed method completes missing grades better than existing approaches such as mean imputation and GMM imputation. Combined with a random forest model, it achieves effective grade prediction and provides an objective reference for student grade management and early warning of employability.
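The abstract names the technique but does not give its details, so the following is only a minimal sketch of one plausible reading of k-nearest-neighbor local reconstruction. It assumes that neighbors are chosen among students with complete records by Euclidean distance over the courses the target student has taken, and that reconstruction weights are found by a sum-to-one least-squares fit with a small ridge term. The names `solve` and `impute_row`, the parameters `k` and `reg`, and the sample grades are all hypothetical and do not come from the paper.

```python
import math


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def impute_row(row, complete_rows, k=3, reg=1e-3):
    """Fill the None entries of `row` from its k nearest complete rows.

    Assumed scheme (not taken from the paper): weights minimize the
    reconstruction error on the observed courses subject to summing to 1.
    """
    obs = [i for i, v in enumerate(row) if v is not None]
    mis = [i for i, v in enumerate(row) if v is None]
    if not mis:
        return list(row)

    # Step 1: nearest neighbors, measured only on the observed courses.
    def dist(other):
        return math.sqrt(sum((row[i] - other[i]) ** 2 for i in obs))

    neighbors = sorted(complete_rows, key=dist)[:k]
    m = len(neighbors)

    # Step 2: local Gram matrix of neighbor-minus-target differences.
    diffs = [[nb[i] - row[i] for i in obs] for nb in neighbors]
    G = [[sum(a * b for a, b in zip(di, dj)) for dj in diffs] for di in diffs]

    # A ridge term keeps G invertible when neighbors are collinear.
    trace = sum(G[i][i] for i in range(m))
    ridge = reg * trace if trace > 0 else reg
    for i in range(m):
        G[i][i] += ridge

    # Solve G w = 1, then rescale so the weights sum to one.
    w = solve(G, [1.0] * m)
    total = sum(w)
    w = [wi / total for wi in w]

    # Step 3: rebuild each missing grade as the weighted neighbor average.
    filled = list(row)
    for i in mis:
        filled[i] = sum(wj * nb[i] for wj, nb in zip(w, neighbors))
    return filled


# Hypothetical grades: three students with complete records, one with a gap.
complete = [[80, 70, 90], [60, 65, 70], [90, 85, 95]]
print(impute_row([80, 70, None], complete, k=2))
```

In this sketch the completed rows would then be passed to a downstream predictor such as a random forest, which is outside the scope of the example. Because the weights sum to one, an imputed grade is an affine combination of the neighbors' grades, though individual weights may be negative, so a production version might clip results to the valid grade range.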
Source
Modern Electronics Technique (《现代电子技术》)
Peking University Core Journal (北大核心)
2018, No. 6, pp. 145-149 (5 pages)
Funding
National Natural Science Foundation of China (61403301)
National Natural Science Foundation of China (61773310)
Keywords
score prediction
missing data
data imputation
data mining
machine learning
random forest model