Abstract
To address the problems of identifying and segmenting time periods and of handling anomalous values in monitoring data used for crop modeling, this study investigated automatic and effective data preprocessing methods that provide reliable samples for subsequent analysis and modeling. An improved cluster analysis was used to classify the multidimensional time series obtained from monitoring, yielding continuous, complete, and homogeneous time periods. On that basis, an anomaly detection criterion suited to the size of the sample was selected to detect and handle outliers. Applying cluster analysis and anomaly detection to monitoring data for black bean, tomato, pumpkin, and cucumber efficiently selected sample sets suitable for analysis and modeling. The results indicate that preprocessing that combines cluster analysis with anomaly detection effectively avoids the errors and disturbances present in crop monitoring experiments and improves the quality and efficiency of data analysis and modeling. The method shows potential for the study and application of crop models, especially small-sample crop models.
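The idea of choosing an anomaly detection criterion according to sample size can be illustrated with a minimal sketch. The abstract does not name the criteria the authors used, so the two rules below (the 3-sigma rule for larger samples and Chauvenet's criterion for smaller ones) are stand-ins chosen for illustration, and the function name `detect_outliers` and the cutoff `large_sample=50` are hypothetical.

```python
import math
import statistics


def detect_outliers(values, large_sample=50):
    """Return indices of suspected outliers in one time period.

    Assumption: the criteria and the size cutoff are illustrative only;
    the source does not specify which rules or thresholds were used.
    """
    n = len(values)
    if n < 3:
        return []  # too few points to judge
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation
    if sd == 0:
        return []  # all values identical, nothing to flag
    if n >= large_sample:
        # 3-sigma rule: flag points more than three standard deviations away.
        return [i for i, v in enumerate(values) if abs(v - mean) > 3 * sd]
    # Chauvenet's criterion: flag a point when the expected number of
    # observations at least this far from the mean is below 0.5.
    flagged = []
    for i, v in enumerate(values):
        z = abs(v - mean) / sd
        two_sided_p = math.erfc(z / math.sqrt(2))  # P(|Z| >= z), normal model
        if n * two_sided_p < 0.5:
            flagged.append(i)
    return flagged


readings = [10.1, 9.9, 10.0, 10.2, 9.8, 10.1, 9.9, 10.0, 15.0, 10.0]
print(detect_outliers(readings))
```

In a pipeline like the one described, such a function would be applied separately to each time period produced by the clustering step, so that the sample size of each period determines which rule is used.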
Source
《浙江农业学报》 (Acta Agriculturae Zhejiangensis)
CSCD
Peking University Core Journals (北大核心)
2016, No. 5, pp. 885-892 (8 pages)
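Funding
Anhui Provincial Natural Science Foundation (1508085MF110)
Anhui Provincial Key Science and Technology Project (1501031102)
Program for Introducing International Advanced Agricultural Science and Technology (948 Program, 2015-Z44)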
Keywords
crop model
monitoring experiment
data preprocessing
cluster analysis
anomaly detection