Abstract
To improve the quality of missing-data imputation, this paper proposes a missing-value filling algorithm for mixed multivariate information big data. First, a similarity distance measure is introduced to convert discrete attributes and to compute the similarity between the preprocessed data and the incomplete records. Second, an autoencoder structure is designed to automatically encode the fill data. Finally, the attributes of the missing data are extracted according to data category, and missing data of each category are filled separately. Experimental results show that the algorithm performs well in practice: it improves the degree of match between the actual data and the filled values, supporting high-precision restoration and filling of missing values.
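The abstract does not give the paper's similarity formula or its autoencoder design, so the following is only a minimal sketch of the first and third steps under stated assumptions. It swaps in a Gower-style distance (range-normalized difference for numeric attributes, simple mismatch for discrete ones) and fills each gap from the nearest records of the same class. The names `mixed_distance`, `fill_missing`, `numeric_ranges`, and the parameter `k` are hypothetical, and the autoencoder step is omitted entirely.

```python
from collections import Counter

MISSING = None  # assumed marker for a missing entry


def mixed_distance(a, b, numeric_ranges):
    """Gower-style distance over attributes observed in both records.

    numeric_ranges maps a column index to its (min, max); columns not
    listed are treated as discrete and compared by simple mismatch.
    This is an illustrative stand-in, not the paper's own measure.
    """
    total, used = 0.0, 0
    for j, (x, y) in enumerate(zip(a, b)):
        if x is MISSING or y is MISSING:
            continue  # skip attributes that either record lacks
        if j in numeric_ranges:
            lo, hi = numeric_ranges[j]
            span = hi - lo
            total += abs(x - y) / span if span else 0.0
        else:
            total += 0.0 if x == y else 1.0
        used += 1
    return total / used if used else float("inf")


def fill_missing(records, labels, numeric_ranges, k=2):
    """Fill each missing entry from the k nearest same-class donors."""
    filled = [list(r) for r in records]
    for i, row in enumerate(records):
        for j, value in enumerate(row):
            if value is not MISSING:
                continue
            # Donors share the record's class and have column j observed.
            donors = [
                (mixed_distance(row, other, numeric_ranges), other[j])
                for m, other in enumerate(records)
                if m != i and labels[m] == labels[i] and other[j] is not MISSING
            ]
            donors.sort(key=lambda d: d[0])
            nearest = [v for _, v in donors[:k]]
            if not nearest:
                continue  # no donor in this class; leave the gap
            if j in numeric_ranges:
                filled[i][j] = sum(nearest) / len(nearest)  # mean of donors
            else:
                filled[i][j] = Counter(nearest).most_common(1)[0][0]  # mode
    return filled


# Toy data: column 0 is numeric, column 1 is discrete.
records = [
    [1.0, "a"],
    [2.0, "a"],
    [None, "a"],
    [10.0, "b"],
    [12.0, None],
]
labels = [0, 0, 0, 1, 1]
ranges = {0: (1.0, 12.0)}
result = fill_missing(records, labels, ranges, k=2)
```

On this toy data the numeric gap in the third record is filled with the mean of its two same-class donors, and the discrete gap in the last record takes the most common value among its donors. Restricting donors to the same class mirrors the abstract's category-wise filling; a fuller version would pass the records through a learned encoding before measuring similarity.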
Author
ZHANG Hongsheng (张洪升), School of Information Engineering, Zhengzhou University of Industrial Technology, Xinzheng, Henan 451100, China
Source
Information & Computer (《信息与电脑》), 2023, No. 15, pp. 113-115 (3 pages)
Keywords
hybrid
automatic coding
filling algorithm
missing value
multivariate information