Coffee plays a key role in the generation of rural employment in Colombia.More than 785,000 workers are directly employed in this activity,which represents the 26%of all jobs in the agricultural sector.Colombian coffe...Coffee plays a key role in the generation of rural employment in Colombia.More than 785,000 workers are directly employed in this activity,which represents the 26%of all jobs in the agricultural sector.Colombian coffee growers estimate the production of cherry coffee with the main aim of planning the required activities,and resources(number of workers,required infrastructures),anticipating negotiations,estimating,price,and foreseeing losses of coffee production in a specific territory.These important processes can be affected by several factors that are not easy to predict(e.g.,weather variability,diseases,or plagues.).In this paper,we propose a non-destructive time series model,based on weather and crop management information,that estimate coffee production allowing coffee growers to improve their management of agricultural activities such as flowering calendars,harvesting seasons,definition of irrigation methods,nutrition calendars,and programming the times of concentration of production to define the amount of personnel needed for harvesting.The combination of time series and machine learning algorithms based on regression trees(XGBOOST,TR and RF)provides very positive results for the test dataset collected in real conditions for more than a year.The best results were obtained by the XGBOOST model(MAE=0.03;RMSE=0.01),and a difference of approximately 0.57%absolute to the main harvest of 2018.展开更多
随着工业物联网(industrial Internet of things,IIoT)的不断发展,越来越多的设备和传感器开始连接到网络中,产生了大量的时间序列数据(简称“时序数据”),时序数据爆炸式的增长给数据库管理系统带来了新的挑战:持续高吞吐量数据摄取、...随着工业物联网(industrial Internet of things,IIoT)的不断发展,越来越多的设备和传感器开始连接到网络中,产生了大量的时间序列数据(简称“时序数据”),时序数据爆炸式的增长给数据库管理系统带来了新的挑战:持续高吞吐量数据摄取、低延迟多维度数据查询、高性能时间序列索引以及低成本数据存储.近年来时序数据库技术已经成为一个研究热点,一些学者对时序数据库技术进行了深入的研究,同时出现了一些专门用于管理时序数据的时序数据库,并且已经被应用在多个领域,成为工业物联网中不可缺少的关键组成.现有的时序数据库相关综述侧重于时序数据库的功能和性能比较,以及在特定领域中对时序数据库的选择建议,缺少对时序数据库持久化存储、查询、计算和索引等关键技术的研究,同时这些综述工作出现的时间较早,缺少对现代时序数据库关键技术的研究.对学术界时序数据存储研究和工业界时序数据库进行了全面的调查和研究,凝练了时序数据库的4类关键技术:1)时间序列索引优化技术;2)内存数据组织技术;3)高吞吐量数据摄取和低延迟数据查询技术;4)海量历史数据低成本存储技术.同时分析总结了时序数据库评测基准.最后,展望了时序数据库关键技术在未来的发展方向.展开更多
基金We thank to the Telematics Engineering Group(GIT)of the University of Cauca and Tecnicaféfor the technical support.In addition,we are grateful to COLCIENCIAS for PhD scholarship granted to PhD.David Camilo Corrales.This work has been also supported by Innovacción-Cauca(SGR-Colombia)under project“Alternativas Innovadoras de Agricultura Inteligente para sistemas productivos agrícolas del departamento del Cauca soportado en entornos de IoT ID 4633-Convocatoria 04C-2018 Banco de Proyectos Conjuntos UEES-Sostenibilidad”.
文摘Coffee plays a key role in the generation of rural employment in Colombia.More than 785,000 workers are directly employed in this activity,which represents the 26%of all jobs in the agricultural sector.Colombian coffee growers estimate the production of cherry coffee with the main aim of planning the required activities,and resources(number of workers,required infrastructures),anticipating negotiations,estimating,price,and foreseeing losses of coffee production in a specific territory.These important processes can be affected by several factors that are not easy to predict(e.g.,weather variability,diseases,or plagues.).In this paper,we propose a non-destructive time series model,based on weather and crop management information,that estimate coffee production allowing coffee growers to improve their management of agricultural activities such as flowering calendars,harvesting seasons,definition of irrigation methods,nutrition calendars,and programming the times of concentration of production to define the amount of personnel needed for harvesting.The combination of time series and machine learning algorithms based on regression trees(XGBOOST,TR and RF)provides very positive results for the test dataset collected in real conditions for more than a year.The best results were obtained by the XGBOOST model(MAE=0.03;RMSE=0.01),and a difference of approximately 0.57%absolute to the main harvest of 2018.
文摘随着工业物联网(industrial Internet of things,IIoT)的不断发展,越来越多的设备和传感器开始连接到网络中,产生了大量的时间序列数据(简称“时序数据”),时序数据爆炸式的增长给数据库管理系统带来了新的挑战:持续高吞吐量数据摄取、低延迟多维度数据查询、高性能时间序列索引以及低成本数据存储.近年来时序数据库技术已经成为一个研究热点,一些学者对时序数据库技术进行了深入的研究,同时出现了一些专门用于管理时序数据的时序数据库,并且已经被应用在多个领域,成为工业物联网中不可缺少的关键组成.现有的时序数据库相关综述侧重于时序数据库的功能和性能比较,以及在特定领域中对时序数据库的选择建议,缺少对时序数据库持久化存储、查询、计算和索引等关键技术的研究,同时这些综述工作出现的时间较早,缺少对现代时序数据库关键技术的研究.对学术界时序数据存储研究和工业界时序数据库进行了全面的调查和研究,凝练了时序数据库的4类关键技术:1)时间序列索引优化技术;2)内存数据组织技术;3)高吞吐量数据摄取和低延迟数据查询技术;4)海量历史数据低成本存储技术.同时分析总结了时序数据库评测基准.最后,展望了时序数据库关键技术在未来的发展方向.