
Topic Discovery and Clustering Research for Online Courses Based on Text Mining (cited by: 7)
Abstract: Mining and clustering course topics through effective data mining is one of the pressing problems facing online education platforms. This study analyzes the topic distribution and categories of 1 472 courses on an online education platform using their text descriptions. The descriptions of the 1 472 courses were collected, segmented into words with a custom dictionary and a stop-word list, and weighted by TF-IDF. An LDA topic model was then used to identify the topic distribution of the courses; it found 230 topics and yielded both the document-topic distribution of each course over these 230 topics and the topic-word distributions. The courses were further clustered hierarchically using a distribution similarity function, which showed that courses are interrelated through topics at different levels of abstraction. Finally, 16 topics were visualized; in terms of both content and quantity, they reflect the topical features of the courses and their aggregate distribution.
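The clustering step described in the abstract can be illustrated with a minimal sketch. The abstract does not name the distribution similarity function or the linkage rule, so the Jensen-Shannon divergence and average linkage used here are assumptions, and the four document-topic vectors are hypothetical toy data rather than results from the paper.

```python
import math
from itertools import combinations


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions.

    Assumption: the paper's "distribution similarity function" is not named
    in the abstract; JS divergence is used here only as a common choice.
    """
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with zero probability in x contribute nothing to the sum.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def hierarchical_clusters(dists, n_clusters):
    """Agglomerative clustering with average linkage (an assumed linkage rule).

    Starts with one cluster per course and repeatedly merges the two
    clusters with the smallest mean pairwise JS divergence.
    """
    clusters = [[i] for i in range(len(dists))]

    def linkage(a, b):
        total = sum(js_divergence(dists[i], dists[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        x, y = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]  # y > x, so index x is unaffected
    return [sorted(c) for c in clusters]


# Hypothetical document-topic distributions: four courses over three topics.
doc_topic = [
    [0.80, 0.10, 0.10],
    [0.75, 0.15, 0.10],
    [0.10, 0.10, 0.80],
    [0.05, 0.15, 0.80],
]

clusters = hierarchical_clusters(doc_topic, 2)
print(clusters)  # [[0, 1], [2, 3]]
```

In this toy case the two courses dominated by the first topic merge into one cluster and the two dominated by the third topic into another. In the study itself, each course would instead be represented by its distribution over the 230 LDA topics.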
Authors: LI Mengjie; LIU Jianguo; GUO Qiang; LI Rende; TANG Xiaolei (Research Center of Complex Systems Science, University of Shanghai for Science and Technology, Shanghai 200093, China; Laboratory Center, Shanghai University of Finance and Economics, Shanghai 200433, China; Hujiang Education & Technology Co., Ltd., Shanghai 201203, China)
Source: Journal of University of Shanghai for Science and Technology (《上海理工大学学报》), CAS, PKU Core, 2018, No. 3, pp. 259-266 (8 pages)
Funding: National Natural Science Foundation of China (61773248, 71771152)
Keywords: topic discovery; hierarchical clustering; online education; text mining