
Research and design of search engine for digital works based on Lucene

Cited by: 10
Abstract: Building on the Lucene full-text retrieval toolkit, the paper analyzes the mainstream Chinese word segmentation algorithms and the Lucene relevance sorting algorithm, and proposes an improved segmentation algorithm and an improved relevance sorting algorithm. Using inverted indexing, retrieval techniques, distributed storage, and parallel computing, it also analyzes and designs a search engine for massive digital works information, giving users fast and accurate search over that information. The experiments compare segmentation speed and segmentation quality, as well as the response time, hit count, precision, and recall of keyword searches. The results show that the system substantially improves search speed while preserving the accuracy of the search results.
Source: Computer Engineering & Science (《计算机工程与科学》), CSCD, Peking University Core Journal, 2013, No. 5, pp. 166-172 (7 pages)
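Funding: National Science and Technology Support Program project of the Ministry of Science and Technology of China (2012BAH04f03); Scientific Research Base - Scientific Research Innovation Platform funded project (PXM2013_014212_000011)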
Keywords: Lucene; segmentation algorithm; index; relevance sorting algorithm; distributed
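
To illustrate two ideas named in the abstract, inverted indexing and precision/recall evaluation, here is a minimal Python sketch. It is not the paper's implementation and does not use Lucene. The whitespace tokenizer, the toy documents, the relevance judgments, and the helper names `build_inverted_index`, `search`, and `precision_recall` are all hypothetical.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Assumption: whitespace tokenization stands in for a real
        # Chinese word segmenter, which this sketch does not model.
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every query term (boolean AND)."""
    terms = query.lower().split()
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)


def precision_recall(retrieved, relevant):
    """Precision = hits / retrieved; recall = hits / relevant."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical toy corpus and relevance judgments, for illustration only.
docs = {
    1: "digital music album",
    2: "digital film archive",
    3: "printed music score",
}
index = build_inverted_index(docs)
retrieved = search(index, "digital music")
relevant = {1, 3}
precision, recall = precision_recall(retrieved, relevant)
```

In this toy setup the query `digital music` retrieves only document 1, so against the assumed relevant set {1, 3} the precision is 1.0 and the recall is 0.5.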


