Abstract
Current streaming strategies cope poorly with the redundancy, complexity, and dynamic variation of high-dimensional data. To improve the accuracy and efficiency of streaming high-dimensional data in cloud computing, a balanced streaming method based on data partitioning is proposed, taking high-dimensional data in the cloud computing environment as the research object. The distribution characteristics of the high-dimensional data are analyzed to determine the partition dimensions, the number of partitions, and the partition boundaries, and the data are then partitioned along the scan line corresponding to any point in the valley between peaks. A constructed feature extraction strategy extracts the features of the high-dimensional data, and the data clustering centers are updated iteratively to achieve balanced streaming. In simulation experiments, the standard deviation of the data flow and the standard deviation of the data flow proportion are used as indexes to verify streaming quality and the load-balancing effect. The results show that the proposed method yields a small standard deviation of the data flow proportion and achieves high streaming accuracy and efficiency.
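The partitioning and evaluation steps summarized in the abstract can be illustrated with a minimal one-dimensional sketch. It is not the authors' implementation. It assumes that peaks are local maxima of a fixed-width histogram, that the cut is placed at the center of the lowest bin between two adjacent peaks (the abstract allows any point in the valley), and that the balance index is the population standard deviation of each partition's share of the records. The function names `valley_cuts`, `partition`, and `proportion_std` are hypothetical.

```python
from bisect import bisect_right
from statistics import pstdev


def valley_cuts(values, bins=10):
    """Return cut points placed in the valleys between histogram peaks."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # avoid zero width when all values are equal
    counts = [0] * bins
    for v in values:
        counts[min(int((v - lo) / width), bins - 1)] += 1
    # Assumption: a peak is a non-empty bin higher than its left neighbour
    # and not lower than its right neighbour.
    peaks = [
        i for i in range(bins)
        if counts[i] > 0
        and (i == 0 or counts[i] > counts[i - 1])
        and (i == bins - 1 or counts[i] >= counts[i + 1])
    ]
    cuts = []
    for a, b in zip(peaks, peaks[1:]):
        # Lowest bin strictly between two adjacent peaks; cut at its center.
        valley = min(range(a + 1, b), key=lambda i: counts[i])
        cuts.append(lo + (valley + 0.5) * width)
    return cuts


def partition(values, cuts):
    """Assign each value to the partition delimited by the sorted cut points."""
    parts = [[] for _ in range(len(cuts) + 1)]
    for v in values:
        parts[bisect_right(cuts, v)].append(v)
    return parts


def proportion_std(loads):
    """Population standard deviation of each partition's share of the load."""
    total = sum(loads)
    return pstdev([load / total for load in loads])


data = [1, 1, 2, 2, 2, 3, 8, 9, 9, 9, 10]
cuts = valley_cuts(data, bins=9)
parts = partition(data, cuts)
balance = proportion_std([len(p) for p in parts])
```

A smaller `balance` value indicates that records are spread more evenly across partitions, which is the sense in which the abstract reports a small standard deviation of the data flow proportion. The feature extraction and cluster-center update steps are not sketched, since the abstract does not specify them.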
Authors
ZHANG Lu (张露)
SHANG Yanling (尚艳玲)
College of Computer and Information Engineering, Henan Normal University, Xinxiang 453007, Henan, China; Work Department of Radio and Television University, Anyang Vocational and Technical College, Anyang 455000, Henan, China; School of Automation, Nanjing Institute of Technology, Nanjing 211167, Jiangsu, China
Source
Journal of University of Jinan (Science and Technology) (《济南大学学报(自然科学版)》)
CAS
Peking University Core Journals (北大核心)
2022, No. 1, pp. 74-79 (6 pages)
Funding
National Natural Science Foundation of China (61873120).
Keywords
data partition
cloud computing
high-dimensional data
balanced streaming
feature extraction
clustering center