Abstract
To study how clustering algorithms are applied to college students' MicroBlog posts, and to address the imprecise selection of cluster centers in the K-means and hierarchical clustering algorithms, the use of clustering algorithms in MicroBlog text mining is improved against the background of MicroBlog use by college students. The accuracy and efficiency of clustering for MicroBlog hot topic detection are verified through text vector representation, text similarity calculation, and implementation of the clustering algorithm, and several targeted measures are proposed based on the experimental data.
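The three steps named in the abstract (text vector representation, text similarity calculation, clustering) can be illustrated with a minimal sketch. It is not the paper's method: it assumes posts are already tokenized, uses raw term-frequency vectors with cosine similarity, and runs a plain K-means loop whose initial centers are simply the first k posts, which is exactly the kind of naive center selection the paper sets out to improve. All function names and the sample posts are hypothetical.

```python
import math
from collections import Counter


def tf_vector(tokens, vocab):
    """Term-frequency vector of one tokenized post over a fixed vocabulary."""
    counts = Counter(tokens)
    return [float(counts[w]) for w in vocab]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def kmeans_cosine(vectors, k, iters=10):
    """Plain K-means with cosine similarity.

    Assumption: initial centers are the first k vectors (naive choice,
    not the improved center selection described in the paper).
    """
    centers = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assignment step: each post goes to its most similar center.
        labels = [max(range(k), key=lambda c: cosine(v, centers[c])) for v in vectors]
        # Update step: each center becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical, pre-tokenized sample posts.
posts = [
    ["exam", "library", "exam"],
    ["concert", "campus", "music"],
    ["exam", "study", "library"],
    ["music", "concert", "ticket"],
]
vocab = sorted({w for p in posts for w in p})
vectors = [tf_vector(p, vocab) for p in posts]
labels = kmeans_cosine(vectors, k=2)
```

On this toy input the two exam-related posts share one label and the two concert-related posts share the other. Real MicroBlog text would first need word segmentation and stop-word removal, which this sketch leaves out.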
Authors
代明竹
高嵩峰
DAI Mingzhu; GAO Songfeng (School of Mechanical-Electronic and Vehicle Engineering, Beijing University of Civil Engineering and Architecture, Beijing 100044, China)
Source
《现代电子技术》
Peking University Core Journal
2019, No. 7, pp. 177-180 (4 pages)
Modern Electronics Technique
Keywords
clustering algorithm
hot topic
MicroBlog
college
text
algorithm improvement