网络竞争情报主题采集技术研究 (Research on Topic-Focused Acquisition Techniques for Network Competitive Intelligence). Cited by: 5

Focused Crawler Based Network Competitive Intelligence Acquisition
Abstract: This paper designs and implements a topic-focused acquisition system for network competitive intelligence, built on a focused crawler. To predict the topic of a web page, the system uses an improved Naive Bayes algorithm, which raises the accuracy of topic classification. To predict the topic of a link, it combines rules with the topic similarity of the anchor text, which avoids the problems caused by short and noisy URL anchor text. Experiments show that the method has a clear advantage over breadth-first acquisition.
Author: 田雪筠
Source: Library & Information (《图书与情报》), CSSCI, Peking University Core Journal, 2014, No. 5, pp. 132-137 (6 pages)
Keywords: competitive intelligence; focused crawler; URL filtering; topic filtering
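
The two prediction steps named in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the abstract does not describe how Naive Bayes was improved or which rules were used, so the sketch assumes a plain multinomial Naive Bayes with Laplace smoothing, regular-expression URL rules, and cosine similarity between anchor text and a list of topic terms. All names (`NaiveBayesTopicFilter`, `link_score`, `Frontier`, `rule_weight`) and the example URLs are hypothetical.

```python
import heapq
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase ASCII word tokens; Chinese text would need a word segmenter instead."""
    return re.findall(r"[a-z0-9]+", text.lower())


class NaiveBayesTopicFilter:
    """Plain multinomial Naive Bayes (assumption: NOT the paper's improved variant)."""

    def __init__(self):
        self.word_counts = {True: Counter(), False: Counter()}
        self.doc_counts = {True: 0, False: 0}

    def train(self, text, on_topic):
        self.word_counts[on_topic].update(tokenize(text))
        self.doc_counts[on_topic] += 1

    def log_score(self, text, label):
        # Requires at least one training document for each label.
        vocab = set(self.word_counts[True]) | set(self.word_counts[False])
        total = sum(self.word_counts[label].values())
        score = math.log(self.doc_counts[label] / sum(self.doc_counts.values()))
        for word in tokenize(text):
            if word in vocab:  # words never seen in training are skipped
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total + len(vocab)))  # Laplace smoothing
        return score

    def is_on_topic(self, text):
        return self.log_score(text, True) > self.log_score(text, False)


def cosine_similarity(tokens_a, tokens_b):
    a, b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def link_score(url, anchor_text, topic_terms, url_rules, rule_weight=0.5):
    """Blend a URL rule match with anchor-text similarity, so a link with short
    or noisy anchor text can still score well through the rule component."""
    rule_hit = 1.0 if any(re.search(pattern, url) for pattern in url_rules) else 0.0
    similarity = cosine_similarity(tokenize(anchor_text), topic_terms)
    return rule_weight * rule_hit + (1.0 - rule_weight) * similarity


class Frontier:
    """Best-first URL queue; a breadth-first crawler would use a FIFO queue instead."""

    def __init__(self):
        self._heap = []
        self._seen = set()
        self._order = 0  # tie-breaker that keeps insertion order for equal scores

    def push(self, url, score):
        if url in self._seen:
            return
        self._seen.add(url)
        heapq.heappush(self._heap, (-score, self._order, url))
        self._order += 1

    def pop(self):
        return heapq.heappop(self._heap)[2]


if __name__ == "__main__":
    topic_terms = ["competitive", "intelligence", "market"]
    url_rules = [r"/news/", r"/report"]
    frontier = Frontier()
    for url, anchor in [
        ("http://example.com/contact", "contact us"),
        ("http://example.com/about", "competitive intelligence"),
        ("http://example.com/news/1", "more"),
    ]:
        frontier.push(url, link_score(url, anchor, topic_terms, url_rules))
    print(frontier.pop())  # the rule-matched link comes out first despite its short anchor
```

In this sketch the link with the anchor text "more" is crawled first because its URL matches a rule, which is the kind of case where anchor-text similarity alone would fail. The equal weighting of the two components is an arbitrary choice for illustration.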
