Abstract
To address the difficulty of managing campus student behavior data, which is large in volume, complex in structure, and drawn from many sources, a model for building a knowledge base on a big data platform is proposed to analyze massive student data. A Hadoop cluster is set up to perform data extraction, data fusion, data warehousing analysis, and data updating on student all-in-one campus card data, forming a student behavior knowledge base. Automatic reasoning and anomaly detection over the knowledge base are implemented with an improved TextRank algorithm and a random walk technique. Experimental results show that, compared with a knowledge base constructed using the sym-KL algorithm, the proposed knowledge base markedly improves the efficiency of knowledge classification, relation linking, and anomaly detection, and adds intelligent analysis capability to the school's information platform.
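For readers unfamiliar with TextRank, the following is a minimal sketch of the standard algorithm, which scores graph nodes by a damped random walk in the same way as PageRank. It is not the improved variant proposed in the paper, whose details are not given in this abstract. The function names, the co-occurrence window, the damping factor of 0.85, and the sample location sequence are all assumptions made for illustration.

```python
from collections import defaultdict


def build_cooccurrence_graph(tokens, window=1):
    """Link items that appear within `window` positions of each other.

    Returns an undirected weighted graph as a dict of dicts.
    """
    graph = defaultdict(lambda: defaultdict(float))
    for i, u in enumerate(tokens):
        for v in tokens[i + 1 : i + 1 + window]:
            if u != v:
                graph[u][v] += 1.0
                graph[v][u] += 1.0
    return graph


def textrank(graph, damping=0.85, iterations=100, tol=1e-8):
    """Standard TextRank: power iteration of a damped random walk."""
    nodes = list(graph)
    if not nodes:
        return {}
    n = len(nodes)
    score = {node: 1.0 / n for node in nodes}
    # Total edge weight leaving each node (always > 0 for nodes in the graph).
    out_weight = {node: sum(graph[node].values()) for node in nodes}
    for _ in range(iterations):
        new_score = {}
        for node in nodes:
            # Mass arriving from each neighbour, split by edge weight.
            incoming = sum(
                score[nbr] * w / out_weight[nbr]
                for nbr, w in graph[node].items()
            )
            new_score[node] = (1.0 - damping) / n + damping * incoming
        delta = max(abs(new_score[node] - score[node]) for node in nodes)
        score = new_score
        if delta < tol:
            break
    return score


# Hypothetical sequence of card-swipe locations for one student.
events = ["canteen", "library", "canteen", "dorm", "canteen", "library", "gym"]
scores = textrank(build_cooccurrence_graph(events, window=1))
top_location = max(scores, key=scores.get)
```

In this toy sequence the most heavily connected location receives the highest score. Any anomaly rule built on such scores, for example flagging nodes whose score falls below a threshold, would be a further assumption and is not described in the abstract.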
Authors
LIU Jianhua (刘建华)
CHANG Facai (常发财)
Information Center, Xi'an University of Posts and Telecommunications, Xi'an 710121, China; School of Computer Science, Xi'an University of Posts and Telecommunications, Xi'an 710121, China
Source
Journal of Xi'an University of Posts and Telecommunications (《西安邮电大学学报》)
2021, No. 3, pp. 98-104 (7 pages)