Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 61272436, 61003232 and 61272404, and the Natural Science Foundation of Guangdong Province of China under Grant No. 10351806001000000.
Abstract: Existing solutions to keyword search in the cloud fall into two categories: searching on exact keywords and searching on error-tolerant keywords. An error-tolerant keyword search scheme allows searches on encrypted data using only an approximation of a keyword. Such a scheme suits the case where a user's search input might not exactly match the pre-set keywords. In this paper, we first present a general framework for searching on error-tolerant keywords. We then propose a concrete scheme, based on a fuzzy extractor, which is proved secure against an adaptive adversary under a well-defined security definition. The scheme is suitable for all similarity metrics, including Hamming distance, edit distance, and set difference. It does not require the user to construct or store anything in advance other than the key used to calculate keyword trapdoors and the key used to encrypt data documents, which greatly eases the user's burden. Moreover, the scheme transforms the server's search for error-tolerant keywords on ciphertexts into a search for exact keywords on plaintexts, so the server can use any existing exact keyword search approach to search plaintexts in an index table.
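The idea of turning an error-tolerant match into an exact match can be illustrated with a toy sketch. This is not the paper's construction: it assumes keywords are already encoded as bit lists, covers only Hamming distance, uses a simple repetition code in a code-offset style, and assumes a public helper string is kept next to the index. The names `gen`, `rep`, `trapdoor`, and the repetition factor `R` are hypothetical.

```python
import hashlib
import hmac
import secrets

R = 5  # assumed repetition factor: each 5-bit block tolerates up to 2 flipped bits


def encode(msg_bits):
    """Repetition code: repeat every message bit R times."""
    return [b for b in msg_bits for _ in range(R)]


def decode(code_bits):
    """Majority vote inside each block of R bits."""
    return [1 if sum(code_bits[i:i + R]) > R // 2 else 0
            for i in range(0, len(code_bits), R)]


def xor(a, b):
    return [x ^ y for x, y in zip(a, b)]


def gen(w):
    """Enrolment: mask the keyword bits w with a random codeword.

    len(w) must be a multiple of R. Returns the public helper string.
    """
    msg = [secrets.randbelow(2) for _ in range(len(w) // R)]
    return xor(w, encode(msg))


def rep(w_noisy, helper):
    """Reproduction: recover the enrolled w from a close enough w_noisy."""
    shifted = xor(w_noisy, helper)       # random codeword XOR error pattern
    codeword = encode(decode(shifted))   # error removed by majority decoding
    return xor(codeword, helper)


def trapdoor(key, w):
    """Deterministic keyed tag, so the server only does an exact lookup."""
    return hmac.new(key, bytes(w), hashlib.sha256).hexdigest()


# Usage: a query with a few flipped bits yields the same trapdoor as the
# enrolled keyword, so the index lookup is an ordinary dictionary hit.
key = b"k" * 32
w = [1, 0, 1, 1, 0, 0, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 1, 1]
helper = gen(w)
index = {trapdoor(key, w): ["doc1"]}

noisy = list(w)
for i in (0, 7, 8):  # one error in block 0, two errors in block 1
    noisy[i] ^= 1
result = index.get(trapdoor(key, rep(noisy, helper)))
```

In this sketch the error tolerance lives entirely in `rep`; once the keyword is recovered, the server-side lookup is a plain exact match on the trapdoor.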
Funding: Yu-Rong Cheng and Guo-Ren Wang are supported by the National Natural Science Foundation of China (NSFC) under Grant Nos. 61332006, 61332014, 61328202 and U1401256. Ye Yuan is supported by the NSFC under Grant No. 61572119 and the Fundamental Research Funds for the Central Universities of China under Grant Nos. N150402005 and N130504006. Lei Chen is supported by the NSFC under Grant No. 61328202.
Abstract: As more and more knowledge becomes available on the WWW, querying and mining knowledge bases has attracted much research attention. Among all queries over knowledge bases, which are usually modelled as graphs, the keyword query is the most widely used. Although keyword queries over graphs have been studied in depth for years, knowledge bases are special error-tolerant graphs, and traditionally defined keyword queries return results on them that do not satisfy users. Thus, in this paper, we define a new keyword query specific to knowledge bases, called the confident r-clique, based on the r-clique definition for keyword queries on general graphs, which has been proved to be the best one. However, as we prove in the paper, finding confident r-cliques is #P-hard. We propose a filtering-and-verification framework to improve search efficiency. In the filtering phase, we develop the tightest upper bound of the confident r-clique and design an index, together with its search algorithm, that suits the large scale of knowledge bases. In the verification phase, we develop an efficient sampling method to verify the final answers among the candidates that remain after filtering. Extensive experiments demonstrate that the results derived from our new definition satisfy users' requirements better than those of the traditional r-clique definition, and that our algorithms are efficient.
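The filtering-and-verification pattern can be sketched on a much simpler problem. The abstract gives neither the confident r-clique definition, the paper's upper bound, nor its index, so the sketch below substitutes a stand-in task: estimating whether two nodes are connected in a graph whose edges exist independently with given probabilities. The functions `upper_bound`, `estimate`, and `filter_and_verify`, the candidate layout, and the threshold are all hypothetical.

```python
import random


def upper_bound(edges, s):
    """Cheap filter: s can reach another node only if some edge at s exists."""
    miss = 1.0
    for (u, v), p in edges.items():
        if s in (u, v):
            miss *= 1.0 - p
    return 1.0 - miss


def connected(present, s, t):
    """Depth-first search over the edges present in one sampled world."""
    adj = {}
    for u, v in present:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    stack, seen = [s], {s}
    while stack:
        n = stack.pop()
        if n == t:
            return True
        for m in adj.get(n, []):
            if m not in seen:
                seen.add(m)
                stack.append(m)
    return False


def estimate(edges, s, t, samples=2000, rng=None):
    """Monte Carlo estimate of the probability that s and t are connected."""
    rng = rng or random.Random(0)  # fixed seed keeps the sketch repeatable
    hits = 0
    for _ in range(samples):
        present = [e for e, p in edges.items() if rng.random() < p]
        hits += connected(present, s, t)
    return hits / samples


def filter_and_verify(candidates, threshold, samples=2000):
    """Prune by the cheap bound first, then verify survivors by sampling."""
    answers = []
    for name, (edges, s, t) in candidates.items():
        if upper_bound(edges, s) < threshold:
            continue  # filtered out without any sampling
        if estimate(edges, s, t, samples) >= threshold:
            answers.append(name)
    return answers


# Usage: B is pruned by the bound, C passes the bound but fails verification.
candidates = {
    "A": ({("a", "b"): 0.95, ("b", "c"): 0.95}, "a", "c"),
    "B": ({("a", "b"): 0.2, ("b", "c"): 0.9}, "a", "c"),
    "C": ({("a", "b"): 0.9, ("b", "c"): 0.1}, "a", "c"),
}
answers = filter_and_verify(candidates, threshold=0.5)
```

The design point is that the bound is cheap and never underestimates the true probability, so pruning on it cannot discard a real answer; the expensive sampling step runs only on the candidates that survive.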