
A Method of Accurate Extraction and Analysis of Web Data on Historical Disaster Earthquakes (cited by: 4)
Abstract Based on the catalog of disaster earthquakes in mainland China, we select Internet information on disaster earthquakes from 2010 to 2019 and propose an information acquisition technique built on the Baidu search engine, with a set of URL generation rules that use "time, place name, magnitude" as keywords. Running Baidu searches with this technique, we collect the main body text of the first 100 sites returned, build a basic corpus of earthquake information, and so form a method for acquiring online disaster information on disaster earthquakes. An existing stop-word list is used to remove useless information as a preliminary cleaning of the crawled text. The cleaned text is then mined further for implicit information in order to explore correlations among disasters, laying a foundation for the rapid acquisition of Internet disaster information after earthquakes.
Authors Wen Xintao (文鑫涛); Zheng Tongyan (郑通彦); Wang Zhonghao (王钟浩); Li Huayue (李华玥); Li Chenxi (李晨曦); Lü Wenchao (吕文超) (China Earthquake Networks Center, Beijing 100045, China; Institute of Disaster Prevention, Sanhe 065201, Hebei, China)
Source Earthquake Research in China (《中国地震》), PKU Core Journal, 2021, No. 4, pp. 819-828 (10 pages)
Funding Supported by the project "Research on Rapid Visualization Technology for Earthquake Emergency Information" (2018YFC1504506).
Keywords Disaster earthquake; Web information extraction; Disaster information acquisition; Data analysis
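The abstract describes two mechanical steps: generating search URLs from the keywords "time, place name, magnitude" so that the first 100 results can be retrieved, and filtering crawled text against a stop-word list. The record does not give the paper's actual rules, so the following is only a minimal sketch under stated assumptions. The query format in `build_query`, the function names, and the sample event and stop words are all hypothetical. The `wd` (query) and `pn` (result offset) parameters follow Baidu's public search URL pattern, and ten results per page is assumed.

```python
from urllib.parse import urlencode

BAIDU_SEARCH = "https://www.baidu.com/s"  # public Baidu search endpoint


def build_query(date: str, place: str, magnitude: float) -> str:
    """Join time, place name and magnitude into one query string.

    The exact keyword format is an assumption, not the paper's rule.
    """
    return f"{date} {place} {magnitude:.1f}级地震"


def build_search_urls(date: str, place: str, magnitude: float,
                      total: int = 100, per_page: int = 10) -> list[str]:
    """Return one URL per result page, covering the first `total` results."""
    query = build_query(date, place, magnitude)
    return [
        f"{BAIDU_SEARCH}?{urlencode({'wd': query, 'pn': offset})}"
        for offset in range(0, total, per_page)
    ]


def remove_stopwords(tokens: list[str], stopwords: set[str]) -> list[str]:
    """Drop empty tokens and tokens that appear in the stop-word list."""
    return [t for t in tokens if t.strip() and t not in stopwords]


if __name__ == "__main__":
    # Illustrative event only; any catalog entry could be substituted.
    urls = build_search_urls("2013年4月20日", "四川芦山", 7.0)
    print(len(urls), urls[0])

    # Tokens would normally come from a Chinese word segmenter.
    tokens = ["芦山", "的", "地震", " ", "造成", "了", "房屋", "倒塌"]
    print(remove_stopwords(tokens, {"的", "了"}))
```

Paging by offset keeps each request small and makes it easy to stop once 100 results have been collected. Fetching the pages and extracting each site's main body text are separate steps that this sketch does not cover.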