Abstract: The fundamental problem of similarity studies in data mining is to detect similar items across very large collections of articles, papers, and books. In this paper, we are interested in the probabilistic, statistical, and algorithmic aspects of text similarity. We use the approach of k-shinglings, where a k-shingling is a sequence of k consecutive characters extracted from a text (k ≥ 1). The main challenge in this field is to find algorithms that compute similarity both accurately and quickly. We achieve this with approximation methods. The first is statistical and is based on the Glivenko-Cantelli theorem. The second is the banding technique. The third is a modification of the algorithm proposed by Rajaraman et al. ([1]), denoted here as (RUM). Similarity is measured throughout by the Jaccard index. Finally, we illustrate the results on the four Gospels. The results are very conclusive.
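The two definitions the abstract relies on, k-shinglings and the Jaccard index, can be illustrated with a minimal sketch. The function names `shingles` and `jaccard` are hypothetical and not taken from the paper. The code computes the exact index, not any of the three approximation methods the paper studies, and treating two empty sets as fully similar is an assumption made here.

```python
def shingles(text: str, k: int) -> set[str]:
    """Return the set of k-shingles (runs of k consecutive characters) of text."""
    if k < 1:
        raise ValueError("k must be at least 1")
    # A text shorter than k has no k-shingles, so the result is the empty set.
    return {text[i:i + k] for i in range(len(text) - k + 1)}


def jaccard(a: set[str], b: set[str]) -> float:
    """Exact Jaccard index |A ∩ B| / |A ∪ B| of two shingle sets."""
    union = a | b
    if not union:
        # Assumption for this sketch: two empty sets count as identical.
        return 1.0
    return len(a & b) / len(union)


if __name__ == "__main__":
    s1 = shingles("abcd", 2)   # {"ab", "bc", "cd"}
    s2 = shingles("bcde", 2)   # {"bc", "cd", "de"}
    print(jaccard(s1, s2))     # 2 shared shingles out of 4 distinct ones
```

The exact computation needs every shingle set in memory and a set intersection per pair of texts, which is what motivates approximation on large corpora.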
Funding: Supported by the National Key Research and Development Program of China (2021YFB3101300, 2021YFB3101303); the Natural Science Foundation of China (U22B2030, 62302374); the Shaanxi Provincial Key Research and Development Program (2023-ZDLGY-35); the China Postdoctoral Science Foundation (2022M722498); the Natural Science Basic Research Plan in Shaanxi Province of China (2023-JC-QN-0699); the Qin Chuangyuan Cited High-level Innovative and Entrepreneurial Talents Project (QCYRCXM-2022-244); and the Science and Technology on Communication Networks Laboratory (HHX23641X003).
Abstract: With the rapid development of location-based services and online social networks, POI recommendation services that consider geographic and social factors have received extensive attention. Meanwhile, the vigorous development of cloud computing has prompted service providers to outsource data to the cloud to provide POI recommendation services. However, service providers do not fully trust the cloud, so they encrypt their data before outsourcing it in order to protect their digital assets. Encryption in turn reduces data availability, making it more challenging to provide POI recommendation services in outsourcing scenarios. Some privacy-preserving schemes for geo-social-based POI recommendation have been presented, but they are limited in supporting group queries, in considering both geographic and social factors, and in query accuracy, which makes them impractical. To solve this issue, we propose two practical and privacy-preserving geo-social-based POI recommendation schemes, GSPR-S for a single user and GSPR-G for group users. Specifically, we first use a quad tree to organize geographic data and the MinHash method to index social data. Then, we apply BGV fully homomorphic encryption to design several private algorithms, including a private max/min operation algorithm, a private rectangular set operation algorithm, and a private rectangular overlapping detection algorithm. We then use these algorithms as building blocks in our schemes to improve efficiency. Security analysis proves that our schemes are secure against honest-but-curious cloud servers, and experimental results show that our schemes have good performance.
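The MinHash method mentioned in the abstract can be sketched in plaintext as follows. This is a generic illustration of how MinHash signatures approximate the Jaccard similarity of two sets. It is not the GSPR-S or GSPR-G construction, which operates on BGV-encrypted data. The helper names, the use of salted BLAKE2b as the hash family, and the user identifiers are all assumptions made for this example.

```python
import hashlib


def _hash(seed: int, item: str) -> int:
    """One member of a seeded hash family (assumption: salted BLAKE2b, 64-bit output)."""
    digest = hashlib.blake2b(
        item.encode("utf-8"),
        digest_size=8,
        salt=seed.to_bytes(8, "little"),
    ).digest()
    return int.from_bytes(digest, "little")


def minhash_signature(items: set[str], num_hashes: int = 128) -> list[int]:
    """Signature whose i-th entry is the minimum of the i-th hash over all items."""
    if not items:
        raise ValueError("items must be non-empty")
    return [min(_hash(seed, x) for x in items) for seed in range(num_hashes)]


def estimate_jaccard(sig_a: list[int], sig_b: list[int]) -> float:
    """Fraction of signature positions that agree; this estimates the Jaccard index."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


if __name__ == "__main__":
    # Hypothetical friend lists of two users.
    alice = {"u1", "u2", "u3", "u4"}
    bob = {"u3", "u4", "u5", "u6"}
    est = estimate_jaccard(minhash_signature(alice), minhash_signature(bob))
    print(f"estimated Jaccard: {est:.2f} (exact value is 2/6)")
```

A fixed-length signature lets social sets of any size be compared in time proportional to `num_hashes`, at the cost of an estimate whose error shrinks as the number of hash functions grows.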