With the rapid development of the Internet of Things (IoT) and the proliferation of embedded devices, large volumes of personal data are being collected; however, these data may carry substantial private information about attributes that users do not want to share. Many privacy-preserving methods have been proposed to prevent privacy leakage by perturbing raw data or extracting task-oriented features on local devices. Unfortunately, because they are designed and optimized for predefined tasks, these methods suffer from significant privacy leakage and accuracy drops when applied to other tasks. In this paper, we propose a novel task-free privacy-preserving data collection method based on adversarial representation learning, called TF-ARL, which protects private attributes specified by users while maintaining data utility for unknown downstream tasks. To this end, we first propose a privacy adversarial learning mechanism (PAL) that protects private attributes by optimizing the feature extractor to maximize the adversary's prediction uncertainty on those attributes, and then design a conditional decoding mechanism (ConDec) that maintains data utility for downstream tasks by minimizing the conditional reconstruction error from the sanitized features. Through the joint learning of PAL and ConDec, we obtain a privacy-aware feature extractor whose sanitized features retain the discriminative information except for the private attributes. Extensive experimental results on real-world datasets demonstrate the effectiveness of TF-ARL.
Funding: supported by the National Key R&D Program of China (Grant No. 2021ZD0112803) and the National Natural Science Foundation of China (Grant Nos. 62122066, U20A20182, and 61872274).
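The abstract describes two coupled objectives: PAL, which optimizes the feature extractor so that an adversary becomes maximally uncertain about the private attribute, and ConDec, which preserves utility by reconstructing the input from the sanitized features conditioned on the private attribute. The sketch below shows one plausible way to express that joint objective in PyTorch; the module architectures, names (Encoder, Adversary, CondDecoder, encoder_loss), dimensions, and the loss weight lam are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal illustrative sketch of a PAL + ConDec style joint objective.
# All names, architectures, and hyperparameters here are assumptions for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):            # privacy-aware feature extractor
    def __init__(self, in_dim=784, feat_dim=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                                 nn.Linear(256, feat_dim))
    def forward(self, x):
        return self.net(x)

class Adversary(nn.Module):          # tries to predict the private attribute from features
    def __init__(self, feat_dim=64, n_private=2):
        super().__init__()
        self.net = nn.Linear(feat_dim, n_private)
    def forward(self, z):
        return self.net(z)

class CondDecoder(nn.Module):        # reconstructs x from features + private-attribute label
    def __init__(self, feat_dim=64, n_private=2, out_dim=784):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(feat_dim + n_private, 256), nn.ReLU(),
                                 nn.Linear(256, out_dim))
    def forward(self, z, s_onehot):
        return self.net(torch.cat([z, s_onehot], dim=1))

def encoder_loss(enc, adv, dec, x, s, n_private=2, lam=1.0):
    """PAL term: push the adversary's output toward maximum uncertainty (uniform
    distribution) on the private attribute s; ConDec term: keep the sanitized
    features useful by reconstructing x conditioned on s."""
    z = enc(x)
    log_probs = F.log_softmax(adv(z), dim=1)
    uniform = torch.full_like(log_probs, 1.0 / n_private)
    pal = F.kl_div(log_probs, uniform, reduction="batchmean")   # prediction-uncertainty maximization
    s_onehot = F.one_hot(s, n_private).float()
    recon = F.mse_loss(dec(z, s_onehot), x)                     # conditional reconstruction error
    return pal + lam * recon

# In an alternating step, the adversary itself would be trained to predict s from
# the (detached) features, e.g. with F.cross_entropy(adv(z.detach()), s).
```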