Abstract: In this paper, we study the problem of privacy-preserving top-k keyword similarity search over outsourced cloud data. Taking edit distance as the measure of similarity, we first build the similarity keyword sets for all keywords in the data collection. We then calculate the relevance scores of the elements in the similarity keyword sets using the widely used tf-idf rule. Leveraging both the similarity keyword sets and the relevance scores, we present a new secure and efficient tree-based index structure for privacy-preserving top-k keyword similarity search. To prevent potential statistical attacks, we also introduce a two-server model that separates the association between the index structure and the data collection in the cloud servers. A thorough analysis is given of the correctness of the search functionality, and formal security proofs are presented for the privacy guarantee of our solution. Experimental results on real-world data sets further demonstrate the availability and efficiency of our solution.
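The first two steps named in the abstract (similarity keyword sets built from edit distance, then tf-idf relevance scores) can be illustrated with a minimal plaintext sketch. It is not the paper's construction: it omits the encryption, the tree-based index, and the two-server model, and the function names, the distance threshold `max_dist`, and the particular tf-idf variant (term frequency times `log(N / df)`) are assumptions made for illustration only.

```python
import math
from collections import Counter


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling dynamic-programming row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def similarity_sets(vocabulary, max_dist=1):
    """Map each keyword to the keywords within max_dist edits (assumed threshold)."""
    return {w: {v for v in vocabulary if edit_distance(w, v) <= max_dist}
            for w in vocabulary}


def tf_idf(docs):
    """Score every (keyword, document) pair; docs maps doc id -> list of keywords."""
    n = len(docs)
    df = Counter(w for words in docs.values() for w in set(words))
    scores = {}
    for doc_id, words in docs.items():
        for w, count in Counter(words).items():
            scores[(w, doc_id)] = (count / len(words)) * math.log(n / df[w])
    return scores


def top_k(query, docs, k=2, max_dist=1):
    """Rank documents by their best-scoring keyword similar to the query."""
    vocabulary = {w for words in docs.values() for w in words}
    candidates = {v for v in vocabulary if edit_distance(query, v) <= max_dist}
    best = {}
    for (w, doc_id), score in tf_idf(docs).items():
        if w in candidates:
            best[doc_id] = max(best.get(doc_id, 0.0), score)
    # Highest score first; ties broken by document id for determinism.
    return sorted(best, key=lambda d: (-best[d], d))[:k]


# Hypothetical toy collection used only to exercise the functions.
docs = {
    "d1": ["cloud", "search", "cloud"],
    "d2": ["clouds", "index"],
    "d3": ["privacy", "search"],
}
print(top_k("cloud", docs, k=2))
```

In this sketch a query matches a document through any keyword in its similarity set, which is why a document containing only "clouds" can still be returned for the query "cloud". In a privacy-preserving setting these sets and scores would be computed by the data owner before outsourcing and stored in protected form.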
Funding: This work was supported in part by the National Natural Science Foundation (No. 61170274), the Innovative Research Groups of the National Natural Science Foundation (No. 61121061), the National Key Basic Research Program of China (No. 2011CB302506), and the Youth Scientific Research and Innovation Plan of Beijing University of Posts and Telecommunications (No. 2013RC1101).