Abstract
To address the long training time and low efficiency of neural networks whose sample data contain many attributes unrelated to the target data, a data mining model based on an improved fuzzy k-means (FKM) algorithm and a back propagation (BP) neural network is proposed. The improved FKM clustering algorithm clusters the attributes of the input data; attributes that are redundant or only weakly correlated with the target attribute are discarded, and strongly correlated attributes are retained. This reduces the amount of training sample data and improves the training efficiency of the network. Predictions of hemoglobin content in children show that the proposed model is practical and reliable.
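The attribute-screening step can be illustrated with a minimal sketch. The abstract does not specify the improved FKM variant, so the code below uses plain fuzzy k-means, and it assumes that each attribute is scored by its absolute Pearson correlation with the target before clustering. The function names, the relevance score, the 0.5 membership threshold, and the synthetic data are all hypothetical choices made for illustration, not the authors' method.

```python
import numpy as np

def fuzzy_k_means_1d(values, k=2, m=2.0, iters=100, tol=1e-9):
    """Plain fuzzy k-means on scalars; returns (centers, memberships)."""
    values = np.asarray(values, dtype=float)
    # Deterministic start: centers spread evenly over the data range.
    centers = np.linspace(values.min(), values.max(), k)
    for _ in range(iters):
        # Distance of every value to every center, floored to avoid division by zero.
        dist = np.maximum(np.abs(values[:, None] - centers[None, :]), 1e-12)
        inv = dist ** (-2.0 / (m - 1.0))
        u = inv / inv.sum(axis=1, keepdims=True)  # each row sums to 1
        w = u ** m
        new_centers = (w * values[:, None]).sum(axis=0) / w.sum(axis=0)
        shift = np.abs(new_centers - centers).max()
        centers = new_centers
        if shift < tol:
            break
    return centers, u

def select_attributes(X, y, threshold=0.5):
    """Return indices of columns of X assigned to the high-relevance cluster."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Assumed relevance score: absolute Pearson correlation with the target.
    relevance = np.array(
        [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])]
    )
    centers, u = fuzzy_k_means_1d(relevance, k=2)
    strong = int(np.argmax(centers))  # cluster with the larger center
    keep = np.flatnonzero(u[:, strong] >= threshold)
    return keep, relevance

# Synthetic data (hypothetical): columns 0 and 2 depend on y, columns 1 and 3 do not.
rng = np.random.default_rng(0)
n = 200
y = rng.normal(size=n)
X = np.column_stack([
    y + 0.1 * rng.normal(size=n),
    rng.normal(size=n),
    -2.0 * y + 0.1 * rng.normal(size=n),
    rng.normal(size=n),
])
keep, relevance = select_attributes(X, y)
X_reduced = X[:, keep]  # reduced input that a BP network would then be trained on
```

Only the retained columns in `X_reduced` would be passed to the BP network, which is where the reduction in training data comes from.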
Source
《大连海事大学学报》
EI
CAS
CSCD
PKU Core (北大核心)
2008, Issue 4, pp. 37-40, 44 (5 pages)
Journal of Dalian Maritime University
Keywords
fuzzy k-means algorithm
back propagation (BP) neural network
data mining