Burst-event evolution process expression based on short-text (基于短文本的突发事件发展过程表示方法)
Abstract: Current short-text-based analysis of burst events cannot describe how an event evolves in a simple and accurate way. To address this problem, a new short-text-based method for representing the evolution process of burst events is proposed. First, an event status value is introduced to describe the state of the event at each point in time, so that users can analyze how the event develops. Second, based on the structured information in short texts, the event status value is considered from two aspects: text information and user information. Third, taking the impact factors of text information into account, formulas are constructed to calculate the text information weight. Fourth, taking the impact factors of user information into account, a modified PageRank algorithm and a user-layering scheme are proposed, and formulas are constructed to calculate the user information weight. Finally, the event status value is calculated from the text information weight and the user information weight. Experimental results show that each successive refinement (considering user information, using the modified PageRank algorithm, and applying user layering) corrects 1 to 2 description points and improves the accuracy with which the event's evolution is represented.
Source: Journal of Computer Applications (《计算机应用》), indexed in CSCD and the Peking University Core Journals list, 2016, No. 6, pp. 1605-1612 (8 pages)
Funding: Scientific Research Innovation Project of the Shanghai Municipal Education Commission (B.10-0108-14-202)
Keywords: event analysis; PageRank; layering; short-text; status value
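The abstract outlines a pipeline (text weight, user weight from PageRank plus layering, then a per-time-point status value) without giving the formulas. The sketch below only illustrates that general shape and is not the authors' method: it uses plain PageRank rather than the paper's modified version, and the text-weight formula, the layer factors, the multiplicative combination, and every name (`pagerank`, `layer_weights`, `event_status`, the post fields) are assumptions made for illustration.

```python
from collections import defaultdict


def pagerank(follows, damping=0.85, iterations=50):
    """Plain power-iteration PageRank (NOT the paper's modified variant).

    follows maps each user to the list of users they follow, so an edge
    points from a follower to the followed account.
    """
    nodes = set(follows)
    for targets in follows.values():
        nodes.update(targets)
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iterations):
        new = {u: (1.0 - damping) / n for u in nodes}
        for u in nodes:
            targets = follows.get(u, [])
            if targets:
                share = damping * rank[u] / len(targets)
                for v in targets:
                    new[v] += share
            else:
                # Dangling user: spread their rank evenly over all users.
                share = damping * rank[u] / n
                for v in nodes:
                    new[v] += share
        rank = new
    return rank


def layer_weights(rank, layer_factors=(3.0, 2.0, 1.0)):
    """Assumed layering: split users into equal-sized layers by rank."""
    ordered = sorted(rank, key=lambda u: (-rank[u], u))
    size = -(-len(ordered) // len(layer_factors))  # ceiling division
    return {u: layer_factors[i // size] for i, u in enumerate(ordered)}


def event_status(posts, user_weight, window):
    """Assumed status value: sum of text weight * user weight per window."""
    status = defaultdict(float)
    for post in posts:
        # Hypothetical text weight: the post itself plus its interactions.
        text_weight = 1.0 + post["reposts"] + post["comments"]
        weight = user_weight.get(post["user"], 1.0)
        status[post["time"] // window] += text_weight * weight
    return dict(sorted(status.items()))


follows = {"a": ["c"], "b": ["c"], "c": ["a"]}
posts = [
    {"user": "c", "time": 0, "reposts": 2, "comments": 1},
    {"user": "b", "time": 30, "reposts": 0, "comments": 0},
    {"user": "a", "time": 70, "reposts": 1, "comments": 0},
]
weights = layer_weights(pagerank(follows))
print(event_status(posts, weights, window=60))
```

Plotting the returned values against the window index would give one curve of event status over time, which is the kind of description the abstract refers to; the choice of window length and layer factors here is arbitrary.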