
Research on Event Extraction Based on Personal Microblog Features (cited by: 3)

ON EVENTS EXTRACTION BASED ON MICROBLOGGING CHARACTERISTICS
Abstract: Event extraction from personal microblogs mostly relies on text similarity alone to produce clusters and does not take full account of microblog-specific features. Drawing on features such as hashtags, URLs, and posting time, this paper proposes an event extraction algorithm based on microblog characteristics. The algorithm adapts TF-IDF to these characteristics, adds hashtag similarity and URL similarity to compute a comprehensive similarity, and then obtains the extracted events with an improved K-means clustering method that first segments posts by time and then merges the segments. Experimental results show that the algorithm clearly improves the accuracy of both keyword extraction and event extraction on microblogs.
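The abstract names the ingredients of the comprehensive similarity (text, hashtag, and URL similarity) but not the formulas, so the following is only a minimal sketch under stated assumptions. It assumes plain smoothed TF-IDF with cosine similarity for text (not the paper's improved weighting), Jaccard overlap for hashtags and URLs, and a weighted sum with illustrative weights. The names `tfidf_vectors`, `combined_similarity`, `similarity_matrix`, the weight values, and the sample posts are all hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Plain smoothed TF-IDF over tokenized posts (assumption: the
    paper's improved TF-IDF weighting is not reproduced here)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        if not doc:
            vectors.append({})
            continue
        tf = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * (math.log((1 + n) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * \
        math.sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0


def jaccard(a, b):
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def combined_similarity(text_sim, post_a, post_b, weights=(0.6, 0.25, 0.15)):
    """Weighted sum of text, hashtag, and URL similarity.
    The weights are illustrative assumptions, not values from the paper."""
    w_text, w_tag, w_url = weights
    return (w_text * text_sim
            + w_tag * jaccard(post_a["tags"], post_b["tags"])
            + w_url * jaccard(post_a["urls"], post_b["urls"]))


def similarity_matrix(posts):
    """Pairwise comprehensive similarity for a list of posts."""
    vectors = tfidf_vectors([p["tokens"] for p in posts])
    return [
        [combined_similarity(cosine(vectors[i], vectors[j]), posts[i], posts[j])
         for j in range(len(posts))]
        for i in range(len(posts))
    ]


# Hypothetical sample posts, already tokenized.
posts = [
    {"tokens": ["earthquake", "rescue", "team"],
     "tags": {"earthquake"}, "urls": {"http://example.com/a"}},
    {"tokens": ["earthquake", "rescue", "news"],
     "tags": {"earthquake"}, "urls": {"http://example.com/a"}},
    {"tokens": ["concert", "tickets", "sale"],
     "tags": {"music"}, "urls": set()},
]

sims = similarity_matrix(posts)
print(sims[0][1] > sims[0][2])
```

In this sketch the two posts that share a hashtag and a URL score higher than the unrelated pair even though their text overlaps only partly, which is the effect the abstract attributes to adding microblog features. A matrix like this could then feed the time-segmented clustering step, which is not shown here.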
Source: Computer Applications and Software (《计算机应用与软件》), CSCD, 2016, No. 7, pp. 47-51 (5 pages)
Funding: National Natural Science Foundation of China (61163025)
Keywords: microblog characteristics; event extraction; comprehensive similarity