may incur significant bandwidth for executing more com- plicated search queries such as multiple-attribute queries. In order to reduce query overhead, KSS (keyword-set search) by Gnawali partitions the index by a set ...may incur significant bandwidth for executing more com- plicated search queries such as multiple-attribute queries. In order to reduce query overhead, KSS (keyword-set search) by Gnawali partitions the index by a set of keywords. However, a KSS index is considerably larger than a standard inverted index, since there are more word sets than there are individual words. And the insert overhead and storage overhead are obviously un- acceptable for full-text search on a collection of documents even if KSS uses the distance window technology. In this paper, we extract the relationship information between query keywords from websites’ queries logs to improve performance of KSS system. Experiments results clearly demonstrated that the improved keyword-set search system based on keywords relationship (KRBKSS) is more efficient than KSS index in insert overhead and storage overhead, and a standard inverted index in terms of communication costs for query.展开更多
基金Project supported by the National Natural Science Foundation of China (No. 60221120145) and Science & Technology Committee of Shanghai Municipality Key Project (No. 02DJ14045), China
文摘may incur significant bandwidth for executing more com- plicated search queries such as multiple-attribute queries. In order to reduce query overhead, KSS (keyword-set search) by Gnawali partitions the index by a set of keywords. However, a KSS index is considerably larger than a standard inverted index, since there are more word sets than there are individual words. And the insert overhead and storage overhead are obviously un- acceptable for full-text search on a collection of documents even if KSS uses the distance window technology. In this paper, we extract the relationship information between query keywords from websites’ queries logs to improve performance of KSS system. Experiments results clearly demonstrated that the improved keyword-set search system based on keywords relationship (KRBKSS) is more efficient than KSS index in insert overhead and storage overhead, and a standard inverted index in terms of communication costs for query.