
Research on mining spatiotemporal information from Web earthquake news based on event analysis (Cited by: 1)

Web based extraction of spatiotemporal information of earthquake event by semantic technology
Abstract: To meet the need for mining earthquake news on the Web, news texts are collected with a Web crawler as the research corpus. An improved TF-IDF (Term Frequency-Inverse Document Frequency) algorithm is trained on the corpus, and the feature words with the highest weights are selected to make a preliminary identification of earthquake-related documents. Four elements, each made up of feature words, are used to describe an earthquake event and to build a knowledge framework for earthquake events. Candidate event sentences are obtained from the earthquake documents by matching the feature words of the framework elements, and these sentences are then parsed syntactically. From the forms and patterns in which the earthquake elements appear in the sentences, extraction rules are constructed and an extraction algorithm is implemented. Earthquake event identification and extraction experiments are carried out, and the extraction accuracy is analyzed and evaluated. The results indicate that the method identifies and extracts earthquake events with high accuracy and is a promising approach to mining thematic events on the Web.
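The first steps of the pipeline described in the abstract (weighting terms, picking high-weight feature words, and matching them against sentences) can be sketched roughly as follows. This is a minimal illustration using textbook TF-IDF, not the improved variant the paper refers to, whose details are not given here. The function names, the whitespace tokenization, and the choice to rank terms by summed weight are all assumptions made for the example; Chinese text would first need a word segmenter.

```python
import math
from collections import Counter


def tfidf_scores(docs):
    """Return one {term: tf-idf weight} dict per tokenized document.

    Uses plain tf * log(N / df); this is the textbook formula, assumed
    here only for illustration.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        scores.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


def top_feature_words(docs, k):
    """Rank terms by summed tf-idf weight over the corpus (an assumed criterion)."""
    totals = Counter()
    for doc_scores in tfidf_scores(docs):
        totals.update(doc_scores)  # adds the float weights per term
    return [term for term, _ in totals.most_common(k)]


def candidate_sentences(sentences, feature_words):
    """Keep sentences that contain at least one feature word."""
    words = set(feature_words)
    return [s for s in sentences if words & set(s.split())]
```

A term that occurs in every document gets weight zero under this formula, so the corpus needs to include non-earthquake texts for earthquake vocabulary to stand out.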
Source: Engineering Journal of Wuhan University (《武汉大学学报(工学版)》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2018, No. 2, pp. 183-188 (6 pages)
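Funding: National Natural Science Foundation of China (Grant No. 41471323); special research fund of the State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing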
Keywords: Web earthquake news; information mining; event framework; syntactic analysis

