Strategies to Improve Lucene Aiming at the Chinese Search

Cited by: 10
Abstract: To improve the retrieval accuracy and efficiency of Lucene-based Chinese retrieval systems, we analyze the structure of Lucene and add a Chinese word segmentation module and an index-document preprocessing module to the system. The paper gives the specific experimental method and procedure and analyzes both the principle behind the improvements and the experimental data. The analysis shows that adding the Chinese word segmentation module, and having the index preprocessing module replace each document with a specific number of extracted feature words, effectively improves the efficiency and precision of the Lucene retrieval system and strengthens its performance on Chinese text.
Authors: 索红光, 孙鑫
Source: Computer Applications and Software (《计算机应用与软件》), CSCD, 2009, No. 6, pp. 175-177 (3 pages)
Keywords: Lucene; index; Chinese word segmentation; document preprocessing
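
The abstract describes an index preprocessing step that replaces each document with a specific number of extracted feature words before indexing. The following is only a minimal sketch of that idea, not the authors' implementation: the abstract does not say how feature words are scored, so plain TF-IDF is assumed here, and the function name `top_feature_words` and the parameter `k` are hypothetical. The input is assumed to be already segmented into words by some Chinese word segmenter, which is outside the scope of the sketch.

```python
import math
from collections import Counter


def top_feature_words(docs, k):
    """Return, for each tokenized document, its k highest-scoring terms.

    docs: list of token lists (assumed output of a word segmenter).
    k: number of feature words kept per document (hypothetical parameter).
    Scoring uses plain TF-IDF, which is an assumption of this sketch.
    """
    n_docs = len(docs)

    # Document frequency: number of documents that contain each term.
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))

    reduced = []
    for tokens in docs:
        tf = Counter(tokens)
        scores = {
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        }
        # Highest score first; ties are broken alphabetically so the
        # result is deterministic.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        reduced.append(ranked[:k])
    return reduced
```

Under these assumptions, the shortened term lists would be indexed in place of the full documents, which is where the smaller index and faster search described in the abstract would come from. How many feature words to keep is a tuning decision that the abstract leaves to the paper's experiments.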