Abstract
Data processing, analysis, and prediction are a new growth area for the computer industry and a source of network technology support for continued economic and social progress. To better mine implicit features and implicit relationships from data, and to further improve the hit rate and accuracy of data prediction, this paper proposes a supervised-learning-based method for constructing data prediction services on the authors' scientific research big data service platform. A mathematical model for data prediction is established through three steps: sample collection and feature extraction, feature preprocessing, and selection of a modeling technique. A data prediction service is then built on the platform. The platform's co-construction and sharing model and its ease of operation improve the practicality and reusability of the resulting services. Taking news delay prediction as the experimental case, a prediction service is constructed on the platform using forward stepwise linear regression and three-dimensional point cloud modeling, and its performance is measured by 10-fold cross-validation. The experimental results show that the method is highly reusable and that the constructed service predicts the data effectively, supporting users in making accurate decisions.
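The modeling and evaluation steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which variant of forward stepwise regression was used, so the greedy stagewise form (nudging one weight per round by a fixed step) is an assumption, as are the step size `eps`, the synthetic data, and every function name below. Only the Python standard library is used.

```python
import random


def column_stats(X):
    """Per-feature mean and (population) standard deviation."""
    n, d = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(d)]
    stds = []
    for j in range(d):
        var = sum((row[j] - means[j]) ** 2 for row in X) / n
        stds.append(var ** 0.5 or 1.0)  # fall back to 1.0 for constant columns
    return means, stds


def standardize(X, means, stds):
    """Feature preprocessing: zero mean, unit variance per column."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in X]


def rss(X, y, w):
    """Residual sum of squares of the linear model with weights w."""
    return sum((yi - sum(wj * xj for wj, xj in zip(w, row))) ** 2
               for row, yi in zip(X, y))


def forward_stagewise(X, y, eps=0.05, n_iter=300):
    """Greedy forward fitting: each round moves one weight by +/-eps,
    choosing the move that lowers the RSS most; stops when none helps."""
    w = [0.0] * len(X[0])
    for _ in range(n_iter):
        best_err, best_w = rss(X, y, w), w
        for j in range(len(w)):
            for sign in (-1.0, 1.0):
                trial = list(w)
                trial[j] += sign * eps
                err = rss(X, y, trial)
                if err < best_err:
                    best_err, best_w = err, trial
        if best_w is w:  # no single step improved the fit
            break
        w = best_w
    return w


def k_fold_rmse(X, y, k=10, seed=0):
    """k-fold cross-validation; returns the held-out RMSE of each fold."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for fold in folds:
        held = set(fold)
        train = [i for i in idx if i not in held]
        X_tr, y_tr = [X[i] for i in train], [y[i] for i in train]
        # Statistics come from the training folds only, to avoid leakage.
        means, stds = column_stats(X_tr)
        y_mean = sum(y_tr) / len(y_tr)
        w = forward_stagewise(standardize(X_tr, means, stds),
                              [v - y_mean for v in y_tr])
        X_te = standardize([X[i] for i in fold], means, stds)
        preds = [y_mean + sum(wj * xj for wj, xj in zip(w, row)) for row in X_te]
        mse = sum((p - y[i]) ** 2 for p, i in zip(preds, fold)) / len(fold)
        scores.append(mse ** 0.5)
    return scores


def make_synthetic(n=100, seed=1):
    """Hypothetical data: 4 features, only the first two drive the target."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(n)]
    y = [3.0 * r[0] - 2.0 * r[1] + rng.gauss(0, 0.1) for r in X]
    return X, y


if __name__ == "__main__":
    X, y = make_synthetic()
    scores = k_fold_rmse(X, y, k=10)
    print("mean 10-fold RMSE:", sum(scores) / len(scores))
```

Fitting the standardization statistics inside each fold keeps the held-out samples from influencing preprocessing, which is one reasonable reading of the sample collection, preprocessing, modeling, and validation order described above.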
Authors
LI Zhao (李昭)
SONG Yi (宋壹)
CHEN Peng (陈鹏)
School of Computer and Information Technology, China Three Gorges University, Yichang 443002, China
Source
Computer Technology and Development (《计算机技术与发展》)
2019, No. 9, pp. 188-194 (7 pages)
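Funding
National Key Research and Development Program of China (2016YFC0802500)
National Natural Science Foundation of China (61272236)
Natural Science Foundation of Hubei Province (2018CFC852)
Ministry of Education Humanities and Social Sciences Planning Fund (20171304)
Open Fund of the Key Laboratory of Geological Hazards on Three Gorges Reservoir Area, Ministry of Education (2015KDZ05)
China Three Gorges University Talent Special Fund (8000303)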
Keywords
supervised learning
data prediction
service building
forward stepwise regression
three-dimensional point cloud
cross-validation
news delay