Research on an Improved Method of Hierarchical Data Mining in Distributed Networks (分布式网络多层次数据挖掘改进方法研究) Cited by: 5
Abstract Traditional data mining methods require a high-speed communication network; they also lengthen system response time and pose a threat to data security. Taking a distributed environment as the setting, this paper proposes an improved clustering-based mining method built on the idea of entropy, in order to mine multi-level network data. First, clustering parameters for the multi-level network data are set and a new cluster count is computed; this value serves as the upper bound kmax of the cluster-number search range. A suitable validity index, the Silhouette index, is then selected and combined with cluster centers chosen by the maximum-minimum distance method to obtain the optimal number of clusters. Second, entropy theory and dynamic programming are combined to form the improved clustering method: entropy theory determines the weight of each data attribute and captures the weight relationship between each multi-level data object and its neighboring data, with Euclidean distance used as the similarity measure; dynamic programming is then used to compute the k largest data objects, which fix the cluster centers for multi-level data mining. Experiments show that the improved method can effectively mine valuable information from multi-level network data.
Author Sun Yan (孙艳)
Affiliation Xi'an Fanyi University (西安翻译学院)
Source Bulletin of Science and Technology (《科技通报》), 2018, No. 5, pp. 208-211 (4 pages)
Funding 2017 Xi'an Social Science Planning Fund Project (Project No. 17Z61)
Keywords distributed network; data mining; multilevel data; valuable information
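
The abstract names several building blocks without giving formulas: entropy-based attribute weights, a Euclidean similarity measure, maximum-minimum distance selection of cluster centers, and Silhouette-based choice of the cluster number up to kmax. The following is a minimal Python sketch of how these pieces might fit together; it is not the paper's implementation. All function names, the choice of the first record as the initial center, the weighted form of the Euclidean distance, and the toy data are assumptions made for illustration. The dynamic programming step is not specified in the abstract and is omitted.

```python
import math

def entropy_weights(rows):
    """Entropy weight method: attributes with more dispersed values get larger weights."""
    n, m = len(rows), len(rows[0])
    weights = []
    for j in range(m):
        col = [r[j] for r in rows]
        lo, hi = min(col), max(col)
        # Min-max normalise the column, then convert to proportions p_i.
        norm = [(v - lo) / (hi - lo) if hi > lo else 1.0 for v in col]
        total = sum(norm)
        p = [v / total for v in norm]
        # Normalised Shannon entropy in [0, 1]; 1 means the attribute is uninformative.
        e = -sum(pi * math.log(pi) for pi in p if pi > 0) / math.log(n)
        weights.append(1.0 - e)
    s = sum(weights)
    return [w / s for w in weights] if s > 0 else [1.0 / m] * m

def weighted_distance(a, b, w):
    """Euclidean distance with per-attribute weights (assumed form)."""
    return math.sqrt(sum(wk * (x - y) ** 2 for x, y, wk in zip(a, b, w)))

def max_min_centers(rows, w, k):
    """Max-min distance rule: add the point farthest from its nearest chosen center."""
    centers = [0]  # assumption: start from the first record
    while len(centers) < k:
        candidates = (i for i in range(len(rows)) if i not in centers)
        centers.append(max(
            candidates,
            key=lambda i: min(weighted_distance(rows[i], rows[c], w) for c in centers),
        ))
    return centers

def silhouette(rows, labels, w):
    """Mean Silhouette coefficient; points in singleton clusters score 0."""
    n = len(rows)
    scores = []
    for i in range(n):
        by_cluster = {}
        for j in range(n):
            if j != i:
                d = weighted_distance(rows[i], rows[j], w)
                by_cluster.setdefault(labels[j], []).append(d)
        own = by_cluster.pop(labels[i], None)
        if not own or not by_cluster:
            scores.append(0.0)
            continue
        a = sum(own) / len(own)                                 # mean intra-cluster distance
        b = min(sum(d) / len(d) for d in by_cluster.values())   # nearest other cluster
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / n

def best_k(rows, k_max):
    """Try k = 2..k_max and keep the k with the highest mean Silhouette."""
    w = entropy_weights(rows)
    results = {}
    for k in range(2, k_max + 1):
        centers = max_min_centers(rows, w, k)
        # Label each record by the index of its nearest center.
        labels = [min(centers, key=lambda c: weighted_distance(r, rows[c], w)) for r in rows]
        results[k] = silhouette(rows, labels, w)
    return max(results, key=results.get), w

# Toy data (hypothetical): two well-separated groups in two attributes.
rows = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
k, w = best_k(rows, k_max=3)
print(k, w)
```

In this sketch the cluster labels come from a single nearest-center assignment rather than an iterated clustering loop, which keeps the example short; a fuller version would refine the centers before scoring each candidate k.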