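Abstract: To fill in incomplete bus arrival time information effectively, a filling method for incomplete arrival times based on an improved k*-means algorithm is proposed. The similarity between stops is measured as a weighted combination of four features: passenger flow at arrival, the time period of arrival, the distance between stops, and the running time between stops. The existing k-means algorithm is improved on this basis to build a complete information table of running times between bus stops. The reliability of the method is verified on ground bus operation data from Beijing, and comparative experiments are run against other filling methods, including linear fitting, nearest-neighbor interpolation, and the k-means algorithm. The results show that the method fills more than 97% of incomplete arrival times and that its mean filling error on known arrival times does not exceed 100 s.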
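The weighted similarity and cluster-based filling described in this abstract can be illustrated with a minimal sketch. The feature order, the weights, and the helper names `weighted_distance` and `fill_missing` are assumptions made for illustration; the abstract does not give the actual weights or the details of the improved k*-means procedure, so the sketch only shows the general idea of assigning a record to its nearest centroid and copying the missing values from that centroid.

```python
import math

# Hypothetical feature order: passenger flow, time-of-day slot,
# stop distance (km), running time (s). Weights are illustrative only.
WEIGHTS = [0.2, 0.2, 0.3, 0.3]


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance over the features present in both records."""
    total = 0.0
    for x, y, w in zip(a, b, weights):
        if x is None or y is None:
            continue  # skip features that are missing in either record
        total += w * (x - y) ** 2
    return math.sqrt(total)


def fill_missing(record, centroids, weights=WEIGHTS):
    """Fill None entries of `record` from the nearest centroid."""
    nearest = min(centroids, key=lambda c: weighted_distance(record, c, weights))
    return [c if v is None else v for v, c in zip(record, nearest)]


# Two made-up cluster centroids and one record with an unknown running time.
centroids = [[10, 1, 0.5, 90], [40, 3, 1.2, 210]]
record = [38, 3, 1.1, None]
filled = fill_missing(record, centroids)
```

In practice the features would need to be scaled to comparable ranges before weighting; otherwise the feature with the largest numeric range dominates the distance.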
Funding: National Natural Science Foundation of China (62202118); Scientific and Technological Research Projects from Guizhou Education Department ([2023]003); Guizhou Provincial Department of Science and Technology Hundred Levels of Innovative Talents Project (GCC[2023]018); Top Technology Talent Project from Guizhou Education Department ([2022]073).
Abstract: Technologies such as big data and blockchain have brought convenience to daily life, but privacy and security issues are becoming increasingly prominent. The K-anonymity algorithm is an effective privacy-preserving algorithm with low computational complexity that safeguards users' privacy by anonymizing big data. However, the algorithm currently focuses only on improving user privacy and ignores data availability. It also ignores the impact of quasi-identifier attributes on sensitive attributes, which reduces the usability of the processed data for statistical analysis. We therefore propose a new K-anonymity algorithm that addresses privacy and security in the context of big data while improving data usability. Specifically, we construct a new information loss function based on information quantity theory. Because different quasi-identifier attributes have different impacts on sensitive attributes, we assign a weight to each quasi-identifier attribute when designing the information loss function. To reduce information loss, we improve K-anonymity in two ways. First, we use two common artificial intelligence algorithms, the greedy algorithm and the 2-means clustering algorithm, to reduce the information loss relative to the original table while guaranteeing privacy. Second, we improve the 2-means clustering algorithm by designing a mean-center method for selecting the initial centroids. We then build the K-anonymity algorithm of this scheme on the constructed information loss function, the improved 2-means clustering algorithm, and the greedy algorithm, which reduces information loss. Finally, we experimentally demonstrate that the algorithm improves the 2-means clustering result and reduces information loss.
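As a rough illustration of the two ideas in this abstract, K-anonymity and a per-attribute weighted information loss, the sketch below checks whether a table is K-anonymous and scores a group of records with a weighted, normalized range penalty. The abstract does not state the paper's loss function, which is based on information quantity theory, so the range penalty here is a stand-in and not the authors' formula. The function names, weights, and data are hypothetical.

```python
from collections import Counter


def is_k_anonymous(rows, k):
    """True if every quasi-identifier combination occurs at least k times.

    `rows` holds only the (already generalized) quasi-identifier values.
    """
    counts = Counter(tuple(r) for r in rows)
    return all(c >= k for c in counts.values())


def weighted_range_loss(group, domains, weights):
    """Weighted loss of generalizing each numeric attribute to its [min, max].

    Stand-in measure: for attribute j, the width of the group's range divided
    by the width of the attribute's domain, scaled by weights[j], summed over
    attributes and multiplied by the group size.
    """
    loss = 0.0
    for j, (low, high) in enumerate(domains):
        values = [row[j] for row in group]
        loss += weights[j] * (max(values) - min(values)) / (high - low)
    return loss * len(group)


# Illustrative data: (age, income) pairs; age is weighted more heavily,
# as if it had the larger impact on the sensitive attribute.
group = [(25, 50000), (35, 60000)]
domains = [(20, 60), (10000, 90000)]
weights = [0.7, 0.3]
loss = weighted_range_loss(group, domains, weights)

generalized = [("2*", "M"), ("2*", "M"), ("3*", "F")]
ok = is_k_anonymous(generalized, 2)
```

A greedy or clustering-based partitioner would call a loss of this kind repeatedly and prefer the groupings with the smallest value, subject to each group holding at least K records.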
Funding: Supported by the National Natural Science Foundation of China (Grant No. 40475031).
Abstract: Many studies have shown evidence of significant changes in surface climate in different regions of the world and during different seasons over the past 100 years. Based on daily temperature and precipitation data from 720 climate stations in China, cluster analysis was used to identify regions in China that experienced similar changes in the seasonal cycle of temperature and precipitation during the 1971-2000 climate normal period. Differences in 11-day averages of daily mean temperature and total precipitation between the first (1971-1985) and second (1986-2000) halves of the record were analyzed using the Mann-Whitney U test and the global k-means clustering algorithm. Results show that most parts of China experienced significant increases in temperature between the two periods, especially in winter, although some of this warming may be attributable to the urban heat island effect in large cities. Most of western China experienced more precipitation in 1986-2000, while precipitation decreased in the Yellow River valley. Changes in the summer monsoon were also evident, with decreases in precipitation during the onset and decay phases and increases during the wettest period.
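The comparison of the two half-periods rests on the Mann-Whitney U statistic, which can be sketched in a few lines. This is a minimal pairwise-counting version for illustration, with made-up values rather than station data; it returns only the U statistic and no p-value. For real analyses, `scipy.stats.mannwhitneyu` provides the statistic together with a p-value.

```python
def mann_whitney_u(first, second):
    """U statistic for `second` relative to `first`.

    Counts the pairs (x, y), x from `first` and y from `second`, in which
    y exceeds x; tied pairs count as 0.5. U ranges from 0 to
    len(first) * len(second).
    """
    u = 0.0
    for x in first:
        for y in second:
            if y > x:
                u += 1.0
            elif y == x:
                u += 0.5
    return u


# Illustrative 11-day mean temperatures (degrees C) for one station and one
# window of the year; these numbers are invented for the example.
period_1971_1985 = [1.0, 2.0, 3.0]
period_1986_2000 = [2.0, 4.0, 5.0]
u = mann_whitney_u(period_1971_1985, period_1986_2000)
```

A value of U close to its maximum suggests that the later period tends to be warmer, while a value near half the maximum suggests no systematic shift.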