Abstract
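To address the data loss anomalies that frequently affect IoT sensing data in complex agricultural environments, this paper proposes a K-nearest-neighbor data reconstruction method based on a regularization penalty (K nearest neighbor-regularization penalty, KNN-RP). Ridge regression is used to regularize the least-squares factor in the nearest-neighbor method, and the choice of norm for the penalty term is discussed. The definitions of the temporal and spatial constraint matrices are determined by analyzing the spatiotemporal stability and correlation of agricultural IoT sensing data. The algorithm was cross-validated on greenhouse data samples. The results show that, under the point-loss model, KNN-RP outperforms KNN, inverse-distance-weighted KNN, and the DT algorithm; under the block-loss model, it outperforms KNN and inverse-distance-weighted KNN and performs slightly below the DT algorithm. The method improves the quality of sensing data in agricultural IoT. The study can serve as a reference for agricultural production decision-making based on IoT data.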
Internet of things (IoT) technology has been widely applied in agricultural production monitoring, where accurate decision-making and environmental regulation depend on the monitoring results. However, data loss is common in agricultural wireless sensor networks (WSNs) because of noise, collisions, unreliable links, and unexpected damage. This greatly reduces the quality of the acquired data and in turn affects the results of decision analysis. To solve this problem, this paper proposes a data reconstruction method based on K nearest neighbor with regularization penalty constraints (KNN-RP). Firstly, ridge regression was used to regularize the least-squares factor. Secondly, when the data matrix does not have full column rank, a unique solution is difficult to obtain and the algorithm becomes error-prone; introducing a penalty term into the method mitigates this. Combining the 1-norm and the 2-norm ensures the sparsity of the matrix and prevents the loss function from over-fitting, which makes the method suitable for reconstructing high-dimensional, high-noise agricultural WSN data. Furthermore, the definitions of the time and space constraint matrices were determined according to the temporal and spatial stability of sensing data in agricultural IoT. Finally, the K value was determined by model training to achieve better reconstruction performance. A cross-validation experiment on greenhouse data samples was carried out to evaluate the algorithm. KNN (K nearest neighbor), KNN-inverse, and DT (Delaunay triangulation) algorithms were chosen for performance comparison. In the element random loss case, the overall reconstruction error rate of the 4 algorithms increased as the data loss rate increased. KNN and KNN-inverse had higher error rates than the other 2 algorithms when the data loss rate was above 60%. In addition, KNN-RP outperformed the DT algorithm at both high and low data loss rates. In the block loss case, the reconstruction error rates of the 4 algorithms were close to those in the element random loss case, but they grew faster as the data loss rate increased. In the block loss case, the overall performance of KNN-RP was better than that of KNN and KNN-inverse, but lower than that of the DT algorithm when the data loss rate was above 60%. The K value had a significant influence on the performance of KNN-RP: the reconstruction error first decreased and then increased as K increased. For a stable parameter such as temperature, the reconstruction error rate was less affected by the K value. In contrast, the reconstruction error rates of humidity and light intensity data were more affected by the K value, probably because humidity and light intensity changed faster than temperature. Considering all 3 parameters (temperature, humidity, and light intensity), the optimal K value was between 6 and 8. In summary, the KNN-RP algorithm could effectively reconstruct missing data in agricultural IoT, especially in the element random loss case. The proposed algorithm improves the quality of sensing data in agricultural IoT monitoring and may provide a reference for agricultural production decision-making.
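The core idea in the abstract, replacing the plain least-squares fit over the K nearest neighbors with a ridge-penalized one, can be illustrated with a minimal sketch. This is not the paper's KNN-RP implementation: it uses only a 2-norm (ridge) penalty, omits the 1-norm term and the time and space constraint matrices, and the function name `knn_ridge_reconstruct`, the parameters `k` and `lam`, and the distance measure are all assumptions made for illustration.

```python
import numpy as np

def knn_ridge_reconstruct(X, observed, k=6, lam=1.0):
    """Fill missing entries of X (n_nodes x n_times) from the k nearest nodes.

    observed is a boolean array, True where a reading exists.
    Hypothetical sketch: ridge (2-norm) penalty only.
    """
    X = np.asarray(X, dtype=float)
    observed = np.asarray(observed, dtype=bool)
    out = X.copy()
    n_nodes = X.shape[0]
    for i, t in zip(*np.nonzero(~observed)):
        # Candidate neighbours: other nodes that have a reading at time t.
        cand = [j for j in range(n_nodes) if j != i and observed[j, t]]
        if not cand:
            out[i, t] = np.nan  # nothing to reconstruct from
            continue
        # Distance: mean squared difference over jointly observed times.
        dist = []
        for j in cand:
            both = observed[i] & observed[j]
            dist.append(np.mean((X[i, both] - X[j, both]) ** 2) if both.any() else np.inf)
        nbrs = np.array(cand)[np.argsort(dist)[:k]]
        # Training rows: times where the target and all chosen neighbours are observed.
        train = observed[i] & observed[nbrs].all(axis=0)
        if train.sum() < 2:
            out[i, t] = X[nbrs, t].mean()  # fallback: plain KNN average
            continue
        A = X[nbrs][:, train].T  # shape (n_train, n_neighbours)
        y = X[i, train]
        a_mean, y_mean = A.mean(axis=0), y.mean()
        Ac, yc = A - a_mean, y - y_mean
        # Ridge solution of (A^T A + lam I) w = A^T y; with lam > 0 the system
        # stays solvable even when A does not have full column rank.
        w = np.linalg.solve(Ac.T @ Ac + lam * np.eye(len(nbrs)), Ac.T @ yc)
        out[i, t] = y_mean + (X[nbrs, t] - a_mean) @ w
    return out
```

In this sketch `k` plays the role of the K value discussed in the abstract and would be chosen by cross-validation on held-out readings; `lam` is a hypothetical penalty weight that would need tuning in the same way.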
Authors
吴华瑞
李庆学
缪祎晟
宋玉玲
Wu Huarui; Li Qingxue; Miao Yisheng; Song Yuling (National Engineering Research Center for Information Technology in Agriculture, Beijing 100097, China; Beijing Research Center for Information Technology in Agriculture, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China; Key Laboratory of Agricultural Internet of Things, Ministry of Agriculture and Rural Affairs, Yangling 712100, China)
Source
《农业工程学报》
EI
CAS
CSCD
PKU Core (北大核心)
2019, Issue 14, pp. 183-189 (7 pages)
Transactions of the Chinese Society of Agricultural Engineering
Funding
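National Natural Science Foundation of China (61871041, 61571051)
Beijing Natural Science Foundation (4172024, 4172026)
Open Project of the Key Laboratory of Agricultural Internet of Things, Ministry of Agriculture and Rural Affairs (2018AIOT-06)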
Keywords
algorithms
models
agricultural internet of things
data reconfiguration
cluster regression