Abstract
Topic detection techniques can classify and organize blog posts by the topics they express, allowing users to draw on blog information more effectively and accurately. This paper studies word frequency statistics, term weight calculation, and similarity calculation within the topic detection model, combines a simple clustering algorithm with the ISODATA algorithm, and applies the combined method to a hot topic detection system for Chinese blogs. Experimental results show that the combined method further improves text classification performance.
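The abstract names the pipeline stages but not their formulas, so the following is only a minimal sketch under stated assumptions: term weights are taken to be smoothed TF-IDF, similarity is taken to be cosine similarity, and the "simple clustering algorithm" is taken to be a threshold-based single-pass clusterer. The function names, the threshold value, and the assumption that posts arrive already segmented into words are all hypothetical and not taken from the paper. The ISODATA stage, which would presumably refine these initial clusters by splitting and merging them, is not shown.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Turn pre-segmented documents into sparse TF-IDF dicts (term -> weight).

    Assumption: smoothed IDF, log((1 + n) / (1 + df)) + 1; the paper's actual
    weighting formula is not given in the abstract.
    """
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        if not doc:
            vectors.append({})
            continue
        tf = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * (math.log((1 + n) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def single_pass_cluster(vectors, threshold=0.2):
    """Assign each vector to its most similar cluster, or open a new one.

    Each cluster is represented by the sum of its member vectors; cosine
    similarity ignores scale, so the sum behaves like the mean centroid.
    """
    centroids = []  # one summed vector per cluster
    clusters = []   # one list of document indices per cluster
    for i, vec in enumerate(vectors):
        sims = [cosine(vec, c) for c in centroids]
        best = max(range(len(sims)), key=sims.__getitem__, default=None)
        if best is not None and sims[best] >= threshold:
            clusters[best].append(i)
            for term, weight in vec.items():
                centroids[best][term] = centroids[best].get(term, 0.0) + weight
        else:
            clusters.append([i])
            centroids.append(dict(vec))
    return clusters


if __name__ == "__main__":
    # Toy posts, already segmented into words (hypothetical data).
    posts = [
        ["football", "match", "goal"],
        ["football", "goal", "team"],
        ["stock", "market", "price"],
        ["stock", "price", "fund"],
    ]
    print(single_pass_cluster(tf_idf_vectors(posts)))  # [[0, 1], [2, 3]]
```

A single pass is order-dependent and sensitive to the threshold, which is one plausible reason to follow it with an iterative method such as ISODATA that can split overly broad clusters and merge near-duplicates.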
Source
《软件导刊》
2011, No. 9, pp. 6-9 (4 pages)
Software Guide