
A Multithreaded Newsgroup Crawling Method with Resumable Transfer, and Its Storage Structure (cited by: 2)

Usenet-snatcher Based on Multithread and Mass-data Storage Supporting Breakpoint Transmission
Abstract: A crawling workflow for newsgroup posts is designed around the massive scale of Usenet newsgroups and their encoding characteristics. Multithreading is used to speed up both the downloading and the MIME parsing of posts, and a data storage structure is designed that makes resumable (breakpoint) transfer of mass data convenient. Collection experiments show that the method meets the data-gathering requirements of information monitoring, and that posts are downloaded and parsed significantly faster than with an ordinary single-threaded approach.
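Source: New Technology of Library and Information Service (《现代图书情报技术》), indexed in CSSCI and the Peking University Core Journals list, 2011, No. 2, pp. 29-33 (5 pages).
Funding: One of the research outputs of "Monitoring and Evaluation of Network Science and Technology Information" (Project No. 2006BAH03B05), a sub-project of the National Science and Technology Support Program of the 11th Five-Year Plan.
Keywords: newsgroups (Usenet); multithreading; mass data; Network News Transfer Protocol (NNTP)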
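The abstract's combination of multithreaded fetching and a resumable store can be illustrated with a minimal sketch. This is not the paper's implementation, and the record does not describe the actual storage schema. The sketch assumes a per-group checkpoint file holding the last article number stored. The names `crawl_group`, `load_checkpoint`, and `save_checkpoint` are hypothetical, as are the caller-supplied `fetch` and `store` callbacks, which stand in for the NNTP download/MIME parsing step and the database write.

```python
import json
import os
from concurrent.futures import ThreadPoolExecutor


def load_checkpoint(path):
    """Return {group: last article number stored}, or {} on a first run."""
    if not os.path.exists(path):
        return {}
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def save_checkpoint(path, state):
    """Write to a temp file, then rename, so a crash cannot leave a half-written file."""
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as fh:
        json.dump(state, fh)
    os.replace(tmp, path)


def crawl_group(group, first, last, fetch, store, checkpoint_path,
                workers=4, batch=8):
    """Fetch articles first..last of one group, resuming after the checkpoint.

    fetch(group, number) -> body and store(group, number, body) are
    hypothetical callbacks. store should be idempotent, because a batch
    that was interrupted part-way is fetched again on the next run.
    """
    state = load_checkpoint(checkpoint_path)
    start = max(first, state.get(group, first - 1) + 1)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for lo in range(start, last + 1, batch):
            numbers = range(lo, min(lo + batch, last + 1))
            # pool.map runs fetches concurrently but yields results in input order.
            bodies = pool.map(lambda n: fetch(group, n), numbers)
            for number, body in zip(numbers, bodies):
                store(group, number, body)
            # Advance the checkpoint only once the whole batch has been stored.
            state[group] = numbers[-1]
            save_checkpoint(checkpoint_path, state)
    return state.get(group, first - 1)
```

If a fetch raises part-way through a batch, the checkpoint still points at the end of the previous batch, so rerunning `crawl_group` with the same arguments continues from there instead of starting over. Checkpointing per batch rather than per article is a design choice made for this sketch: it trades some repeated work after a failure for fewer checkpoint writes.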

