Abstract
Purpose: When data mining is used to discover hot topics on a BBS, the importance of post titles is often overlooked. This paper aims to demonstrate and highlight the role of titles in BBS hot topic mining, and to distinguish the role of titles from that of post content. Method: The titles of the daily top 10 hot-topic posts on the Lily BBS (小百合BBS) of Nanjing University were taken as the data sample and clustered using agglomerative hierarchical clustering. Results: The 270 sample titles were clustered into a single class, and the first five representative groups were selected for discussion. Conclusion: Titles alone are sufficient to effectively mine the hot topics on a BBS over a given period, which demonstrates the importance of titles in BBS hot topic mining.
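The method described in the abstract, agglomerative hierarchical clustering applied to post titles only, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract does not state the feature representation, distance measure, or linkage criterion, so the character-bigram features, Jaccard distance, average linkage, and the `threshold` value are all assumptions, and the sample titles are invented for the example.

```python
from itertools import combinations


def bigrams(title):
    """Character bigrams of a title with whitespace removed (assumed feature set)."""
    s = "".join(title.lower().split())
    return {s[i:i + 2] for i in range(len(s) - 1)} or {s}


def jaccard_distance(a, b):
    """1 minus the Jaccard similarity of two sets (assumed distance measure)."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def agglomerate(titles, threshold=0.7):
    """Average-linkage agglomerative clustering of titles.

    Starts with one cluster per title and repeatedly merges the closest
    pair, stopping once the closest pair is farther apart than `threshold`.
    """
    feats = [bigrams(t) for t in titles]
    clusters = [[i] for i in range(len(titles))]

    def linkage(c1, c2):
        # Average pairwise distance between the members of two clusters.
        total = sum(jaccard_distance(feats[i], feats[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        (a, b), d = min(
            (((a, b), linkage(clusters[a], clusters[b]))
             for a, b in combinations(range(len(clusters)), 2)),
            key=lambda pair: pair[1],
        )
        if d > threshold:
            break
        clusters[a] = clusters[a] + clusters[b]  # a < b, so index a stays valid
        del clusters[b]
    return [[titles[i] for i in c] for c in clusters]


# Hypothetical titles, not taken from the paper's data set.
titles = [
    "library opening hours changed",
    "new library opening hours",
    "campus marathon registration",
    "marathon registration opens today",
]
groups = agglomerate(titles)
```

Setting `threshold=1.0` lets the merging continue until every title sits in one cluster, which corresponds to clustering the whole sample into a single class; a lower threshold cuts the hierarchy earlier and leaves separate topic groups.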
Source
《现代情报》
CSSCI
2013, Issue 1, pp. 162-165 (4 pages)
Journal of Modern Information
Keywords
BBS
hot topic mining
data mining
agglomerative hierarchical clustering