Funding: This work is supported by the National Natural Science Foundation of China under Grant Nos. 60473069 and 60496325.
Abstract: Keyword Search Over Relational Databases (KSORD) enables casual or Web users to access databases easily through free-form keyword queries. Improving the performance of KSORD systems is a critical issue in this area. In this paper, a new approach, CLASCN (Classification, Learning And Selection of Candidate Network), is developed to efficiently perform top-k keyword queries in schema-graph-based online KSORD systems. In this approach, the Candidate Networks (CNs) from trained keyword queries or executed user queries are classified and stored in the databases, and the top-k results from the CNs are learned in order to construct CN Language Models (CNLMs). The CNLMs are used to compute similarity scores between a new user query and the CNs generated from that query. The CNs with relatively large similarity scores, which are the most promising ones to produce top-k results, are selected and executed. Currently, CLASCN is applicable only to past queries and New All-keyword-Used (NAU) queries, which are frequently submitted. Extensive experiments show the efficiency and effectiveness of the CLASCN approach.
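The selection step described in the abstract can be illustrated with a small Python sketch. The abstract does not give the CNLM formulation, so everything here is an assumption made for illustration: each CN is modelled as a unigram language model built from the text of its learned results, the similarity score is a smoothed query log-likelihood, and the CN names, the sample texts, the function names (`build_cnlm`, `score`, `select_cns`) and the mixing weight `lam` are all hypothetical.

```python
import math
from collections import Counter


def build_cnlm(result_texts):
    """Count the terms in the learned result texts of one CN (assumed unigram model)."""
    counts = Counter()
    for text in result_texts:
        counts.update(text.lower().split())
    return counts


def score(query, cnlm, background, lam=0.8):
    """Smoothed log-likelihood of the query under one CN model.

    Assumption: CN probabilities are linearly mixed with an add-one-smoothed
    background model so that unseen terms never yield log(0).
    """
    cn_total = sum(cnlm.values())
    bg_total = sum(background.values())
    log_p = 0.0
    for term in query.lower().split():
        p_cn = cnlm[term] / cn_total if cn_total else 0.0
        p_bg = (background[term] + 1) / (bg_total + len(background) + 1)
        log_p += math.log(lam * p_cn + (1 - lam) * p_bg)
    return log_p


def select_cns(query, cnlms, n=2):
    """Return the n CN identifiers with the highest similarity to the query."""
    background = Counter()
    for model in cnlms.values():
        background.update(model)
    ranked = sorted(
        cnlms,
        key=lambda cn: score(query, cnlms[cn], background),
        reverse=True,
    )
    return ranked[:n]


# Hypothetical CNs and result texts, invented purely for illustration.
cnlms = {
    "author-writes-paper": build_cnlm(
        ["jim gray transaction processing", "jim gray data cube"]
    ),
    "paper-cites-paper": build_cnlm(
        ["transaction recovery survey", "query optimization survey"]
    ),
    "conference-paper": build_cnlm(["vldb keyword search", "sigmod data cube"]),
}

selected = select_cns("gray transaction", cnlms, n=2)
print(selected)
```

In this sketch only the selected CNs would then be executed against the database, which is where the efficiency gain claimed in the abstract would come from; how CLASCN actually classifies CNs and sets the selection threshold is not stated in the abstract.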