Abstract
To process large volumes of farm environment data held in distributed storage, and to provide a reference on abnormal environmental conditions that supports preventive strategies for improving crop yield, this paper implements Dirichlet Process Mixture Model (DPMM) clustering of farm environment data on the Hadoop platform, taking the characteristics of such data into account, and proposes a clustering-based method for detecting anomalies in the farm environment. Under the MapReduce framework, the Map stage assigns sample points to models, and the Reduce stage updates the models and the number of clusters. Experiments verify the performance of the distributed Dirichlet clustering. Anomalies are detected by comparing the clustering results with the index of suitable environmental conditions for tomato. The analysis shows that the method can be applied to anomaly detection on large volumes of farm environment data.
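The Map/Reduce split described in the abstract can be illustrated with a minimal single-process sketch. It is not the authors' Hadoop implementation: it assumes isotropic Gaussian components with a fixed `sigma`, starting means supplied by the caller, and expected mixture weights in place of a draw from a Dirichlet distribution. All function names, the `suitable` ranges, and the parameters are hypothetical.

```python
import math
import random
from collections import defaultdict


def log_gauss(x, mean, sigma):
    """Log-density of an isotropic Gaussian, up to a constant shared by all models."""
    return -sum((a - b) ** 2 for a, b in zip(x, mean)) / (2.0 * sigma ** 2)


def map_assign(x, models, sigma, rng):
    """Map step: sample the index of the model that sample x is assigned to."""
    logs = [math.log(m["weight"]) + log_gauss(x, m["mean"], sigma) for m in models]
    top = max(logs)
    probs = [math.exp(v - top) for v in logs]  # stabilised, unnormalised weights
    k = rng.choices(range(len(models)), weights=probs)[0]
    return k, x


def reduce_update(pairs, n_models, alpha):
    """Reduce step: refit each non-empty model; empty models are dropped,
    so the number of clusters can shrink from one iteration to the next."""
    groups = defaultdict(list)
    for k, x in pairs:
        groups[k].append(x)
    n = sum(len(g) for g in groups.values())
    models = []
    for k in sorted(groups):
        pts = groups[k]
        mean = [sum(col) / len(pts) for col in zip(*pts)]
        # Assumption: expected weight, left unnormalised after empty models are dropped.
        weight = (len(pts) + alpha / n_models) / (n + alpha)
        models.append({"mean": mean, "weight": weight, "count": len(pts)})
    return models


def dirichlet_cluster(points, init_means, alpha=1.0, sigma=1.0, iterations=5, seed=0):
    """Alternate the Map and Reduce steps for a fixed number of iterations."""
    rng = random.Random(seed)
    models = [{"mean": list(m), "weight": 1.0 / len(init_means), "count": 0}
              for m in init_means]
    for _ in range(iterations):
        pairs = [map_assign(x, models, sigma, rng) for x in points]  # Map
        models = reduce_update(pairs, len(models), alpha)            # Reduce
    return models


def flag_anomalous(models, suitable):
    """Return clusters whose mean leaves the (low, high) range of any feature.
    The ranges are supplied by the caller; none are taken from the paper."""
    return [m for m in models
            if any(not (lo <= v <= hi) for v, (lo, hi) in zip(m["mean"], suitable))]
```

In this sketch a candidate model that attracts no samples disappears in the Reduce step, which is one simple way for the cluster count to be updated. `flag_anomalous` mirrors the idea of comparing cluster centres against a table of suitable ranges, with the ranges left to the caller.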
Authors
Deng Li (邓丽)
Pang Honglin (庞洪霖)
Wang Ling (王灵)
Fei Minrui (费敏锐)
School of Mechatronics Engineering and Automation, Shanghai University, Shanghai 200072, China; Shanghai Key Laboratory of Power Station Automation Technology, Shanghai 200072, China
Source
Journal of System Simulation (《系统仿真学报》)
Indexed in: CAS, CSCD, Peking University Core Journals (北大核心)
2017, No. 12, pp. 3035-3041 (7 pages)
Funding
Key Project of the Science and Technology Commission of Shanghai Municipality (14DZ1206302)