
Research on Multi-attribute Data Completion Method Considering Data Distribution Characteristics
Cited by: 2
Abstract: Among data completion methods, maximum likelihood estimation is suited to large samples; the K-nearest neighbor algorithm considers only the linear relationship between values of the same attribute across different records; and the BP neural network algorithm captures the nonlinear relationships among data attributes, but its completion performance depends strongly on the sample distribution. This paper uses the DBSCAN density clustering method to classify the sample data, analyzes their distribution characteristics, removes noisy data, and selects training samples; it then uses a BP neural network to fit the nonlinear relationships among data attributes and predict the missing values. Analysis of an example dataset shows that the BP neural network algorithm that takes the data distribution characteristics into account achieves the best data completion accuracy.
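The pipeline described in the abstract (density clustering to discard noise, then a BP network fitted on the remaining samples to predict a missing attribute) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names, the toy data, the DBSCAN parameters `eps` and `min_pts`, and the network settings (one hidden layer of four tanh units, linear output, plain stochastic gradient descent). The abstract does not state the authors' actual architecture or parameters.

```python
import math
import random


def dbscan_labels(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    n = len(points)
    # Neighborhoods include the point itself.
    neighbors = [[j for j in range(n) if math.dist(points[i], points[j]) <= eps]
                 for i in range(n)]
    labels = [None] * n                  # None means "not visited yet"
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_pts:
            labels[i] = -1               # provisional noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster      # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_pts:
                queue.extend(neighbors[j])   # j is a core point: keep expanding
    return labels


def train_bp(xs, ys, hidden=4, epochs=2000, lr=0.05, seed=0):
    """Fit a one-hidden-layer network by backpropagation; return a predictor."""
    rng = random.Random(seed)
    d = len(xs[0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(d)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0

    def hidden_out(x):
        return [math.tanh(sum(w * v for w, v in zip(w1[k], x)) + b1[k])
                for k in range(hidden)]

    for _ in range(epochs):
        for x, y in zip(xs, ys):
            h = hidden_out(x)
            err = sum(w * a for w, a in zip(w2, h)) + b2 - y   # d(loss)/d(output)
            for k in range(hidden):
                # Gradient through tanh, computed before w2[k] is updated.
                grad_h = err * w2[k] * (1.0 - h[k] ** 2)
                w2[k] -= lr * err * h[k]
                b1[k] -= lr * grad_h
                for i in range(d):
                    w1[k][i] -= lr * grad_h * x[i]
            b2 -= lr * err

    def predict(x):
        return sum(w * a for w, a in zip(w2, hidden_out(x))) + b2

    return predict


def complete_missing(complete_rows, incomplete_row, target, eps, min_pts):
    """Predict attribute `target` of incomplete_row from its other attributes."""
    labels = dbscan_labels(complete_rows, eps, min_pts)
    kept = [r for r, lab in zip(complete_rows, labels) if lab != -1]  # drop noise
    xs = [[v for i, v in enumerate(r) if i != target] for r in kept]
    ys = [r[target] for r in kept]
    predict = train_bp(xs, ys)
    x_new = [v for i, v in enumerate(incomplete_row) if i != target]
    return predict(x_new), labels


# Toy data (hypothetical): samples of y = x^2 plus one far-away outlier.
rows = [(x / 10, (x / 10) ** 2) for x in range(11)] + [(5.0, -3.0)]
value, labels = complete_missing(rows, (0.55, None), target=1, eps=0.3, min_pts=3)
print("noise rows:", [r for r, lab in zip(rows, labels) if lab == -1])
print("predicted missing value:", round(value, 3))
```

In this sketch the outlier never reaches the training set, so it cannot distort the fitted relationship. A practical implementation would more likely rely on an established library for both steps and would scale the attributes before clustering and training.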
Authors: Wang Yong (汪勇); Li Hao (李好); Wang Jing (王静) (Evergrande School of Management, Wuhan University of Science and Technology, Wuhan 430081, China)
Source: Statistics & Decision (《统计与决策》), CSSCI, Peking University Core Journal, 2020, No. 24, pp. 15-19 (5 pages)
Keywords: data completion; density clustering; sample classification; BP neural network; machine learning
