Abstract
Existing privacy-protection algorithms do not fully consider the semantic information of locations, which seriously undermines personal privacy. To address this problem, a TPSS algorithm based on stay points and location semantics is proposed to protect users' real location data and cut off privacy leakage at its source. First, the algorithm filters out abnormal location points and extracts representative stay points, which effectively reduces the volume of data to be processed and relieves the performance bottleneck of the LBS server. Second, it pre-trains a Word2Vec word-vector model on geographic information and the Chinese Wikipedia corpus to compute the semantic similarity between locations. Then, it uses a multi-attribute decision model to evaluate each location in terms of geographic distance, location semantics, and service-request probability, and generates secure anonymity sets from that evaluation. Finally, noise based on the exponential distribution, which conforms to differential privacy, is added to the stay-point trajectory to further obfuscate the real data. Experimental results show that the algorithm makes more effective use of location semantics and is competitive in terms of location entropy, semantics, and trajectory similarity.
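The first step described in the abstract, extracting representative stay points from a raw trajectory, can be illustrated with a minimal sketch. The abstract does not give the authors' exact procedure, so the code below assumes the commonly used distance-and-time-threshold approach. The function names `haversine_m` and `extract_stay_points` and the default thresholds (200 m, 20 minutes) are hypothetical choices for illustration and are not taken from the paper.

```python
import math

def haversine_m(p, q):
    """Great-circle distance in metres between two points.
    Each point starts with (lat, lon) in degrees; extra fields are ignored."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000.0 * math.asin(math.sqrt(a))

def extract_stay_points(track, dist_m=200.0, min_stay_s=1200.0):
    """Threshold-based stay-point extraction (assumed variant, not the paper's).

    track: list of (lat, lon, t_seconds), sorted by time.
    Returns a list of (mean_lat, mean_lon, arrive_t, leave_t).
    """
    stays = []
    i, n = 0, len(track)
    while i < n:
        # Grow the window while points stay within dist_m of the anchor point i.
        j = i + 1
        while j < n and haversine_m(track[i], track[j]) <= dist_m:
            j += 1
        # Points i..j-1 form a candidate cluster; keep it if the user lingered.
        if track[j - 1][2] - track[i][2] >= min_stay_s:
            cluster = track[i:j]
            lat = sum(p[0] for p in cluster) / len(cluster)
            lon = sum(p[1] for p in cluster) / len(cluster)
            stays.append((lat, lon, track[i][2], track[j - 1][2]))
            i = j          # continue after the cluster
        else:
            i += 1         # anchor was a pass-through point; slide forward
    return stays
```

Under these assumptions, a trajectory of many raw GPS fixes collapses into a handful of stay points, which is the kind of data reduction the abstract credits with easing the server-side load. The later steps (semantic similarity, anonymity-set construction, noise addition) would then operate on these stay points rather than on every raw fix.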
Authors
陆佳瑜
张琳
雷诚
王汝传
LU Jiayu; ZHANG Lin; LEI Cheng; WANG Ruchuan (College of Computer, Nanjing University of Posts and Telecommunications, Nanjing 210003, China; Jiangsu High Technology Research Key Laboratory for Wireless Sensor Networks, Nanjing 210003, China)
Source
《小型微型计算机系统》
CSCD
PKU Core (Peking University Core Journals)
2024, No. 10, pp. 2500-2507 (8 pages)
Journal of Chinese Computer Systems
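Funding
Supported by the National Natural Science Foundation of China (61872196, 61872194)
Supported by the Jiangsu Provincial Science and Technology Support Program (BE2017166)
Supported by the Natural Science Foundation of Nanjing University of Posts and Telecommunications (NY222142).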