Research on Similarity Measurement Method of Incomplete High-Dimensional Big Data
Abstract To improve the mining and retrieval of incomplete high-dimensional big data, this paper proposes a similarity measurement method based on information fusion and fuzzy clustering. A statistical sequence model of the incomplete high-dimensional big data is constructed, and similarity is measured by reorganizing the regional structure of the big data space. Descriptive statistical features of the similarity are extracted and, combined with quantitative regression analysis, the extracted associated feature set is classified and fused to build an information fusion model based on fuzzy C-means clustering. A piecewise test method is used to control the search for optimal cluster centers, which completes the similarity measurement and modeling. Simulation results show that the method measures similarity accurately, has strong feature matching ability, and improves the accuracy and completeness of big data mining.
Author QI Shiqian (漆世钱) (China Coast Guard Academy, Ningbo 315801, China)
Affiliation China Coast Guard Academy
Source Journal of Information Engineering University (《信息工程大学学报》), 2019, No. 4, pp. 487-491 (5 pages)
Funding Teaching Reform Project of the China Coast Guard Academy (KG201812); Teaching Reform Project of the Department of Higher Education, Ministry of Education (201802087033).
Keywords incomplete high-dimensional big data; similarity measurement; feature extraction; mining; fuzzy clustering
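
The abstract names fuzzy C-means clustering over incomplete data but this record does not give the algorithm itself. The following is a minimal sketch, not the author's method: it assumes missing values are marked with `None` and handles them with a partial distance (squared Euclidean distance over the dimensions observed in both vectors, rescaled to the full dimension count), a common way to run fuzzy C-means on incomplete vectors. The function names `partial_distance`, `similarity`, and `fuzzy_c_means`, the distance-to-similarity mapping, and the parameters `m` and `iters` are all illustrative assumptions.

```python
import math


def partial_distance(x, y):
    """Squared Euclidean distance over dimensions observed in both vectors,
    rescaled by (total dims / shared dims). None marks a missing value."""
    shared = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    if not shared:
        return math.inf  # no dimension in common: treat as maximally distant
    total = sum((a - b) ** 2 for a, b in shared)
    return total * len(x) / len(shared)


def similarity(x, y):
    """Map the partial distance to a similarity in [0, 1] (assumed mapping)."""
    return 1.0 / (1.0 + math.sqrt(partial_distance(x, y)))


def _memberships(data, centers, m):
    """Fuzzy memberships: u_ik is proportional to D_ik ** (-1 / (m - 1)),
    where D_ik is the squared partial distance from row i to center k."""
    rows = []
    for x in data:
        d = [max(partial_distance(x, c), 1e-12) for c in centers]
        inv = [dk ** (-1.0 / (m - 1.0)) for dk in d]
        s = sum(inv)
        rows.append([v / s for v in inv])
    return rows


def fuzzy_c_means(data, centers, m=2.0, iters=50):
    """Fuzzy C-means with partial distances.

    Assumes every data row has at least one observed value and that the
    initial centers are complete vectors (no None entries).
    """
    dims = len(centers[0])
    for _ in range(iters):
        u = _memberships(data, centers, m)
        new_centers = []
        for k, c in enumerate(centers):
            new_c = []
            for j in range(dims):
                num = den = 0.0
                # Weighted mean over rows where dimension j is observed.
                for x, row in zip(data, u):
                    if x[j] is not None:
                        w = row[k] ** m
                        num += w * x[j]
                        den += w
                # Keep the old coordinate if no row observes dimension j.
                new_c.append(num / den if den > 0 else c[j])
            new_centers.append(new_c)
        centers = new_centers
    # Recompute memberships against the final centers.
    return centers, _memberships(data, centers, m)
```

Calling `fuzzy_c_means(data, initial_centers)` with rows such as `[1.0, 1.1, None]` returns the updated centers and one membership row per data vector. The rescaling in `partial_distance` keeps vectors with few observed dimensions comparable to complete ones, at the cost of assuming the missing dimensions behave like the observed ones.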