Funding: This work was supported by the National Key R&D Program of China (2018YFB1004401), the National Natural Science Foundation of China (NSFC) (Grant Nos. 61772537, 61772536, 62072460, and 62076245), and the Beijing Natural Science Foundation (4212022).
Abstract: Local differential privacy (LDP), a technique that relies on unbiased statistical estimates instead of real data, is commonly adopted in data collection because it protects each user's privacy and prevents the leakage of sensitive information. The segment pairs method (SPM), the multiple-channel method (MCM), and the prefix extending method (PEM) are three known LDP protocols for heavy hitter identification as well as for the frequency oracle (FO) problem with large domains. However, the low scalability of these three LDP algorithms often limits their application. Specifically, communication and computation costs strongly affect their efficiency. Moreover, excessive grouping or sharing of privacy budgets makes the results inaccurate. To address these problems, this study proposes independent channel (IC) and mixed independent channel (MIC), which are efficient LDP protocols for FO with large domains. We design a flexible method for splitting a large domain to reduce the number of sub-domains. Further, we employ the false positive rate with interaction to obtain an accurate estimation. Numerical experiments demonstrate that IC outperforms all existing solutions under the same privacy guarantee, while MIC performs well under a small privacy budget with the lowest communication cost.
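As a minimal illustration of what a frequency oracle under LDP computes, the Python sketch below uses generalized randomized response (GRR), a standard baseline for small domains. It is not the IC or MIC protocol proposed in this work, and the function names `grr_perturb` and `grr_estimate` are hypothetical. Each user reports a perturbed value, and the collector corrects the observed counts to obtain unbiased frequency estimates.

```python
import math
import random
from collections import Counter


def grr_perturb(value, domain, epsilon, rng):
    """Client side: keep the true value with probability p,
    otherwise report a uniformly chosen different value."""
    k = len(domain)
    e = math.exp(epsilon)
    p = e / (e + k - 1)
    if rng.random() < p:
        return value
    others = [v for v in domain if v != value]
    return rng.choice(others)


def grr_estimate(reports, domain, epsilon):
    """Collector side: unbiased frequency estimate for every value.

    A report equals v with probability p if the user holds v and
    with probability q otherwise, so (observed - q) / (p - q) has
    the true frequency as its expectation."""
    k = len(domain)
    n = len(reports)
    e = math.exp(epsilon)
    p = e / (e + k - 1)
    q = 1.0 / (e + k - 1)
    counts = Counter(reports)
    return {v: (counts[v] / n - q) / (p - q) for v in domain}


if __name__ == "__main__":
    rng = random.Random(0)
    domain = list(range(4))
    # Assumed toy population: value 0 is held by 70% of users.
    truth = [0] * 14000 + [1] * 3000 + [2] * 2000 + [3] * 1000
    reports = [grr_perturb(v, domain, 2.0, rng) for v in truth]
    for v, f in grr_estimate(reports, domain, 2.0).items():
        print(v, round(f, 3))
```

The probability p of keeping the true value shrinks as the domain size k grows, which is one reason protocols for large domains split the domain or encode values differently rather than applying GRR directly.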