Abstract
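Geographically weighted regression (GWR) often yields low accuracy when only a small sample is available. Semi-supervised learning is a machine learning approach that lets unlabeled samples take part in training, and it can effectively improve learning performance when few labeled samples are available. On this basis, this paper proposes a geographically weighted regression method based on semi-supervised learning (SSLGWR). Its core idea is to build a regression model from the labeled samples, use it to estimate values for the unlabeled samples, add the high-confidence results to the labeled set, and repeat the training to improve regression performance. Experiments were carried out on simulated and real data, using the percentage improvement in mean squared error as the performance metric, and SSLGWR was compared with GWR and COREG. In the simulated-data experiments, SSLGWR improved performance by 39.66%, 11.92% and 0.94% under three different configurations. In the real-data experiments, SSLGWR improved performance by 8.94%, 3.36% and 5.87% under three different configurations. In all cases the SSLGWR results were significantly better than those of GWR and COGWR. The experiments show that semi-supervised learning can exploit unlabeled data to improve the performance of the GWR model, and the effect is especially pronounced when few labeled samples are available.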
The accuracy of geographically weighted regression (GWR) depends on the quantity of labeled data. In applications, however, labeled data are difficult to obtain while unlabeled data are easy to obtain. It is therefore important to find a way to use unlabeled data to improve regression results. Semi-supervised learning is a class of machine learning tasks and techniques that also make use of unlabeled data for training, typically a small amount of labeled data together with a large amount of unlabeled data. This article therefore develops a semi-supervised-learning geographically weighted regression (SSLGWR). First, a GWR model is built from the labeled data. Second, the GWR model is used to estimate values for the unlabeled data, which are then treated as new labeled data. Third, both the original and the new labeled data are used to rebuild the GWR model and improve its precision. The experiments use both simulated and real data to compare GWR, COGWR and SSLGWR, with mean squared error chosen as the criterion for evaluating the models. Experiments on simulated data show that the proposed model improves performance by 39.66%, 11.92% and 0.94% with 10%, 30% and 50% labeled data, respectively. Experiments on real data show that it improves performance by 8.94%, 3.36% and 5.87%. The results demonstrate that SSLGWR offers substantial benefits in improving GWR.
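The three steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the Gaussian kernel, the fixed bandwidth, and in particular the confidence proxy (agreement between two GWR fits with different bandwidths) are assumptions made only for illustration, and all function and parameter names are hypothetical. The improvement formula is likewise an assumed reading of "percentage improvement in mean squared error".

```python
import numpy as np

def gwr_predict(coords, X, y, target_coords, target_X, bandwidth):
    """Predict at target locations with a Gaussian-kernel GWR fit on (coords, X, y)."""
    Xd = np.column_stack([np.ones(len(X)), X])                # design matrix with intercept
    Td = np.column_stack([np.ones(len(target_X)), target_X])
    preds = np.empty(len(Td))
    for k, (c, t) in enumerate(zip(target_coords, Td)):
        d = np.linalg.norm(coords - c, axis=1)                # distances to the target point
        w = np.exp(-0.5 * (d / bandwidth) ** 2)               # Gaussian spatial weights (assumed kernel)
        XtW = Xd.T * w                                        # X^T W via broadcasting
        A = XtW @ Xd + 1e-8 * np.eye(Xd.shape[1])             # tiny ridge term for numerical stability
        beta = np.linalg.solve(A, XtW @ y)                    # local coefficients at this location
        preds[k] = t @ beta
    return preds

def self_training_gwr(coords_l, X_l, y_l, coords_u, X_u,
                      bandwidth=0.3, rounds=3, per_round=5):
    """Grow the labeled set with pseudo-labeled points (hypothetical confidence rule)."""
    coords_l, X_l, y_l = coords_l.copy(), X_l.copy(), y_l.copy()
    coords_u, X_u = coords_u.copy(), X_u.copy()
    for _ in range(rounds):
        if len(X_u) == 0:
            break
        # Step 1-2: fit GWR on labeled data and estimate values for unlabeled data.
        p_a = gwr_predict(coords_l, X_l, y_l, coords_u, X_u, bandwidth)
        p_b = gwr_predict(coords_l, X_l, y_l, coords_u, X_u, 2.0 * bandwidth)
        # Assumed confidence proxy: the two fits agree closely on confident points.
        pick = np.argsort(np.abs(p_a - p_b))[:per_round]
        # Step 3: add the selected pseudo-labeled points and refit in the next round.
        coords_l = np.vstack([coords_l, coords_u[pick]])
        X_l = np.vstack([X_l, X_u[pick]])
        y_l = np.concatenate([y_l, p_a[pick]])
        keep = np.ones(len(X_u), dtype=bool)
        keep[pick] = False
        coords_u, X_u = coords_u[keep], X_u[keep]
    return coords_l, X_l, y_l

def mse_improvement_pct(mse_baseline, mse_new):
    """Percentage reduction in MSE relative to a baseline (assumed definition)."""
    return 100.0 * (mse_baseline - mse_new) / mse_baseline
```

The enlarged labeled set returned by `self_training_gwr` can be passed back to `gwr_predict` to score held-out points, and `mse_improvement_pct` then compares that error with the error of a GWR fit on the original labeled data alone. Whether the pseudo-labels help depends on the data and on the confidence rule, so the sketch makes no claim about reproducing the reported figures.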
Authors
ZHAO Yangyang (赵阳阳)
LIU Jiping (刘纪平)
XU Shenghua (徐胜华)
ZHANG Fuhao (张福浩)
YANG Yi (杨毅)
School of Mapping and Geographical Science, Liaoning Technical University, Fuxin 123000, China; Chinese Academy of Surveying and Mapping, Beijing 100830, China
Source
《测绘学报》
EI
CSCD
Peking University Core Journals (北大核心)
2017, Issue 1, pp. 123-129 (7 pages)
Acta Geodaetica et Cartographica Sinica
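Funding
Special Scientific Research Fund for Public Welfare in Surveying, Mapping and Geoinformation (201512032)
National Key Research and Development Program of China (2016YFC0803101)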
Keywords
geographically weighted regression
semi-supervised learning
SSLGWR
population distribution