The DIPRE Algorithm in Vertical Search Engine Crawler Systems and Its Improvement

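Abstract: To address the problem of precisely extracting specific fields from web pages in a vertical search engine, this paper studies and improves the DIPRE algorithm. It describes the important role DIPRE plays in vertical search engines, examines its shortcomings when extracting from web pages with complex structure, and proposes improvements: a revised seed-location method, and an extension of single-pattern matching to multi-pattern matching with a location index. The improved algorithm is then validated experimentally using existing techniques. The results show that the improved algorithm meets expectations in both precision and efficiency.
Author: Zhao Jun (赵君)
Source: Software Guide (《软件导刊》), 2016, No. 8, pp. 30-32 (3 pages)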
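
As a rough illustration of the pattern-based field extraction the abstract refers to, the following is a minimal sketch of one round of the basic DIPRE idea: locate a seed pair in a page, record the text around and between its two fields as a pattern, then apply that pattern to extract further pairs. It is not the paper's improved algorithm. The abstract gives no detail on the revised seed location, the multi-pattern matching, or the location index, so none of these is reproduced here. The names `find_patterns`, `apply_patterns`, `PAGE`, the `ctx` parameter, and the sample page are all assumptions made for illustration.

```python
import re
from typing import Set, Tuple

Pair = Tuple[str, str]
Pattern = Tuple[str, str, str]  # (prefix, middle, suffix)

# Hypothetical sample page; the real crawler input is not given in the source.
PAGE = (
    "<ul>"
    "<li><b>Dune</b> by <i>Frank Herbert</i></li>"
    "<li><b>Emma</b> by <i>Jane Austen</i></li>"
    "<li><b>Ulysses</b> by <i>James Joyce</i></li>"
    "</ul>"
)


def find_patterns(page: str, seeds: Set[Pair], ctx: int = 4) -> Set[Pattern]:
    """Derive (prefix, middle, suffix) patterns from seed occurrences.

    ctx is an assumed fixed context width, in characters, taken on each side.
    """
    patterns: Set[Pattern] = set()
    for first, second in seeds:
        # A seed occurrence is the first field, a short gap, then the second field.
        seed_re = re.escape(first) + r"(.{1,40}?)" + re.escape(second)
        for m in re.finditer(seed_re, page, flags=re.S):
            prefix = page[max(0, m.start() - ctx):m.start()]
            suffix = page[m.end():m.end() + ctx]
            if prefix and suffix:
                patterns.add((prefix, m.group(1), suffix))
    return patterns


def apply_patterns(page: str, patterns: Set[Pattern]) -> Set[Pair]:
    """Extract every pair in the page that matches any of the patterns."""
    found: Set[Pair] = set()
    for prefix, middle, suffix in patterns:
        regex = re.compile(
            re.escape(prefix) + r"(.+?)" + re.escape(middle)
            + r"(.+?)" + re.escape(suffix),
            re.S,
        )
        found.update(regex.findall(page))
    return found


if __name__ == "__main__":
    pats = find_patterns(PAGE, {("Dune", "Frank Herbert")})
    for pair in sorted(apply_patterns(PAGE, pats)):
        print(pair)
```

In full DIPRE the newly extracted pairs would be fed back in as seeds for the next round; this sketch stops after a single round to stay short.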