摘要
Lucene是一个高性能、易扩展的基于Java技术的全文信息检索工具包,它能非常方便地为各种应用程序加入全文索引和搜索功能。该文探讨了Lucene中使用的向量空间模型,分析了Lucene索引文件的结构以及搜索排序算法,讨论了Lucene的压缩算法并且通过实验验证了Lucene的建立索引的过程。
As an information retrieval library written in Java, Lucene, with high performance and easy to scale, can easily add searching and indexing capabilities to applications. This paper discusses the vector space model used in Lucene, analyzes the structure of index files and ranking algorithm, and describes the compressing algorithm in Lucene, An experiment is done to test the indexing process of Lucene.
出处
《计算机工程》
CAS
CSCD
北大核心
2007年第18期95-96,118,共3页
Computer Engineering