Abstract
Some 373 million Internet users in China currently have a blog, and blog sites already hold a vast amount of information. Mining these blog resources can yield valuable information. Blog resource mining is a specific application of Web data mining. This paper surveys the existing results, methods, and techniques that researchers in China and abroad have applied to blog data mining, covering blog page identification, blog propagation characteristics, semantic blog systems, links and interaction between blogs, mining of blog author information, blog topic mining, and blog classification and clustering algorithms. Hot topic mining is a specific form of blog data mining, and the methods and techniques for mining hot topics in blogs are also introduced.
Source
《电脑知识与技术》
2013, Issue 4X, pp. 2771-2773 (3 pages)
Computer Knowledge and Technology
Funding
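Humanities and Social Sciences Research Project of the Anhui Provincial Department of Education, "Research on Blog-Oriented Mining of Hot Topics in Professional Knowledge" (No. SK2013B443)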