Abstract
To address the problem of selecting the optimal number of clusters for user electricity consumption in a big data environment, a clustering optimization strategy for user electricity consumption behavior is proposed. Given the complexity of consumption behavior and the need for effective feature selection, users are clustered with a method based on information quantity. First, an evaluation index is introduced and a reasonable clustering optimization method is proposed. Then, for the selection of electricity consumption features, a feature selection algorithm based on mutual information is proposed. In traditional clustering algorithms, the number of clusters is given randomly, and an unreasonable value can trap the clustering in a local optimum. Therefore, following the principle of "maximizing the similarity within a class and minimizing the similarity between classes", a distance evaluation function is proposed and used as the criterion for judging the optimal number of clusters, and several methods are analyzed together to obtain the optimal number. Finally, computer simulation on specific user electricity consumption data verifies the rationality of the clustering optimization strategy. A comparison with the adaptive distributed clustering algorithm further verifies the effectiveness of the proposed algorithm.
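The idea of judging candidate cluster counts by within-class compactness and between-class separation can be illustrated with a minimal Python sketch. It is not the paper's distance evaluation function, which the abstract does not define. As a stand-in, it scores each candidate k by the mean point-to-centroid distance divided by the smallest centroid-to-centroid distance (lower is better) on top of a plain k-means. The function names, the farthest-point initialization, and the toy two-feature load profiles are all assumptions made for illustration.

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean_point(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def init_centroids(points, k):
    """Deterministic farthest-point initialization (an assumption, not from the paper)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(dist(p, c) for c in centroids)))
    return centroids

def kmeans(points, k, iters=50):
    """Plain k-means; returns (labels, centroids)."""
    centroids = init_centroids(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: dist(p, centroids[j])) for p in points]
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # Keep the old centroid if a cluster happens to be empty.
            updated.append(mean_point(members) if members else centroids[j])
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids

def compactness_separation_ratio(points, labels, centroids):
    """Mean intra-cluster distance divided by the minimum inter-centroid distance."""
    intra = sum(dist(p, centroids[lab]) for p, lab in zip(points, labels)) / len(points)
    k = len(centroids)
    inter = min(dist(centroids[i], centroids[j]) for i in range(k) for j in range(i + 1, k))
    return intra / inter if inter > 0 else math.inf

def choose_k(points, k_values):
    """Score every candidate k (each k >= 2) and return the best k with all scores."""
    scores = {}
    for k in k_values:
        labels, centroids = kmeans(points, k)
        scores[k] = compactness_separation_ratio(points, labels, centroids)
    return min(scores, key=scores.get), scores

# Hypothetical (peak load, valley load) features for twelve users in three groups.
profiles = [
    (1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (1.1, 1.2),
    (5.0, 5.0), (5.2, 4.9), (4.9, 5.1), (5.1, 5.2),
    (9.0, 1.0), (9.2, 0.9), (8.9, 1.1), (9.1, 1.2),
]
best_k, scores = choose_k(profiles, range(2, 6))
print(best_k, scores)
```

A single ratio is only one possible criterion. The abstract states that several methods are analyzed together, so a fuller treatment would compare more than one index before fixing the cluster count.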
Authors
ZHANG Yue (张悦)
SONG Yunzhong (宋运忠)
School of Electrical Engineering and Automation, Henan Polytechnic University, Jiaozuo 454000, China
Source
Engineering Journal of Wuhan University (《武汉大学学报(工学版)》)
CAS
CSCD
PKU Core (北大核心)
2022, No. 5, pp. 493-502 (10 pages)
Funding
National Natural Science Foundation of China (Grant Nos. 61340041, 61374079)
Natural Science Foundation of Henan Province (Grant No. 182300410112)
Keywords
intelligentization
data mining
clustering algorithm
user electricity consumption behavior analysis
effectiveness