Burstiness Measurement of Hot Terms Based on an Improved χ² Test Combined with TF (cited by: 1)
Abstract: When the original χ² test formula is used to measure burstiness, it is biased toward low-frequency words. This paper proposes an improved χ² test combined with term frequency (TF) that overcomes this bias. The method introduces the cumulative term frequency into the original χ² formula as an impact factor β on the document counts, which removes the low-frequency bias and improves the precision of the burstiness measure for hot terms. A dynamic bursty hot-term lexicon is built from the burstiness values given by the improved formula, and the lexicon is then used in a dynamic bursty vector space model to detect and track bursty hot topics online. Experiments show that topic detection and tracking with this method achieves higher precision, recall, and F-measure.
Source: Computer & Digital Engineering (《计算机与数字工程》), 2013, No. 11, pp. 1788-1790 (3 pages)
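Funding: Supported by the State Language Commission "12th Five-Year Plan" Research Program (No. YB125-49), the Key Project of Science and Technology Research of the Ministry of Education (No. 212167), the Fundamental Research Funds for the Central Universities, Science and Technology Innovation Project (No. SWJTU12CX096), and the National Undergraduate Innovation and Entrepreneurship Training Program (No. 201210694017).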
Keywords: bursty hot term; χ² test; term frequency; dynamic bursty lexicon
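
The abstract describes the measure only in words, so the following is a minimal sketch and not the paper's formula. `chi_square_2x2` is the standard χ² statistic on a 2×2 contingency table of document counts (a: documents in the current time window containing the term, b: documents outside the window containing it, c and d: the corresponding documents without it). `tf_weighted_burstiness` is a hypothetical reading of "term frequency sum as an impact factor β on the document count": it assumes β is the term's mean number of occurrences per containing document in the window and scales `a` by it. The function names, the parameter `tf_sum_window`, and this definition of β are all assumptions made for illustration.

```python
def chi_square_2x2(a, b, c, d):
    """Standard chi-square statistic for a 2x2 contingency table.

    a: documents inside the window that contain the term
    b: documents outside the window that contain the term
    c: documents inside the window that do not contain the term
    d: documents outside the window that do not contain the term
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # A row or column of the table is empty; no association can be measured.
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def tf_weighted_burstiness(a, b, c, d, tf_sum_window):
    """Hypothetical TF-weighted variant (an assumption, not the paper's formula).

    beta is taken to be the mean number of occurrences of the term per
    containing document inside the window, and it scales the count a.
    """
    beta = tf_sum_window / a if a > 0 else 1.0
    return chi_square_2x2(beta * a, b, c, d)


if __name__ == "__main__":
    # Two terms with identical document counts but different in-window term frequency.
    counts = (5, 5, 95, 895)
    print(chi_square_2x2(*counts))
    print(tf_weighted_burstiness(*counts, tf_sum_window=5))
    print(tf_weighted_burstiness(*counts, tf_sum_window=25))
```

With document counts alone, the two terms in the usage example are indistinguishable. Under the assumed β, the term that occurs more often within its documents receives the larger score, which is the kind of separation the abstract attributes to combining the χ² test with TF.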
