In recent years,the research field of data collection under local differential privacy(LDP)has expanded its focus fromelementary data types to includemore complex structural data,such as set-value and graph data.Howev...In recent years,the research field of data collection under local differential privacy(LDP)has expanded its focus fromelementary data types to includemore complex structural data,such as set-value and graph data.However,our comprehensive review of existing literature reveals that there needs to be more studies that engage with key-value data collection.Such studies would simultaneously collect the frequencies of keys and the mean of values associated with each key.Additionally,the allocation of the privacy budget between the frequencies of keys and the means of values for each key does not yield an optimal utility tradeoff.Recognizing the importance of obtaining accurate key frequencies and mean estimations for key-value data collection,this paper presents a novel framework:the Key-Strategy Framework forKey-ValueDataCollection under LDP.Initially,theKey-StrategyUnary Encoding(KS-UE)strategy is proposed within non-interactive frameworks for the purpose of privacy budget allocation to achieve precise key frequencies;subsequently,the Key-Strategy Generalized Randomized Response(KS-GRR)strategy is introduced for interactive frameworks to enhance the efficiency of collecting frequent keys through group-anditeration methods.Both strategies are adapted for scenarios in which users possess either a single or multiple key-value pairs.Theoretically,we demonstrate that the variance of KS-UE is lower than that of existing methods.These claims are substantiated through extensive experimental evaluation on real-world datasets,confirming the effectiveness and efficiency of the KS-UE and KS-GRR strategies.展开更多
基金supported by a grant fromthe National Key R&DProgram of China.
文摘In recent years,the research field of data collection under local differential privacy(LDP)has expanded its focus fromelementary data types to includemore complex structural data,such as set-value and graph data.However,our comprehensive review of existing literature reveals that there needs to be more studies that engage with key-value data collection.Such studies would simultaneously collect the frequencies of keys and the mean of values associated with each key.Additionally,the allocation of the privacy budget between the frequencies of keys and the means of values for each key does not yield an optimal utility tradeoff.Recognizing the importance of obtaining accurate key frequencies and mean estimations for key-value data collection,this paper presents a novel framework:the Key-Strategy Framework forKey-ValueDataCollection under LDP.Initially,theKey-StrategyUnary Encoding(KS-UE)strategy is proposed within non-interactive frameworks for the purpose of privacy budget allocation to achieve precise key frequencies;subsequently,the Key-Strategy Generalized Randomized Response(KS-GRR)strategy is introduced for interactive frameworks to enhance the efficiency of collecting frequent keys through group-anditeration methods.Both strategies are adapted for scenarios in which users possess either a single or multiple key-value pairs.Theoretically,we demonstrate that the variance of KS-UE is lower than that of existing methods.These claims are substantiated through extensive experimental evaluation on real-world datasets,confirming the effectiveness and efficiency of the KS-UE and KS-GRR strategies.