Abstract: We propose a subsampling method for robust estimation of regression models that is built on classical methods such as least squares. It exploits the non-robust nature of the underlying classical method to find a good sample in regression data contaminated with outliers, and then applies the classical method to that good sample to produce robust estimates of the regression model parameters. The subsampling method is a computational method rooted in the bootstrap methodology, which trades analytical treatment for intensive computation; it finds the good sample by repeatedly fitting the regression model to many random subsamples of the contaminated data rather than through an analytical treatment of the outliers. The method can be applied to any regression model for which a non-robust classical method is available. In the present paper, we focus on the basic formulation and robustness properties of the subsampling method that are valid for all regression models. We also discuss variations of the method and apply it to three examples involving three different regression models.
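The idea of repeatedly fitting a classical estimator to random subsamples can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not say how the good sample is identified, so the selection rule used here (keep the subsample whose least squares fit has the smallest mean squared residual) is an assumption, as are the names `ols_line`, `mean_sq_residual`, and `subsample_fit` and the parameters `size` and `n_draws`.

```python
import random


def ols_line(points):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    b = sxy / sxx
    return my - b * mx, b


def mean_sq_residual(points, a, b):
    """Average squared residual of the line y = a + b*x on the given points."""
    return sum((y - a - b * x) ** 2 for x, y in points) / len(points)


def subsample_fit(data, size, n_draws=500, seed=0):
    """Fit OLS to many random subsamples and return the (a, b) of the
    subsample that fits itself best.

    Assumed selection rule: because OLS is not robust, a subsample that
    contains outliers tends to leave large residuals, while an
    outlier-free subsample fits tightly.
    """
    rng = random.Random(seed)
    best = None
    for _ in range(n_draws):
        sub = rng.sample(data, size)  # drawn without replacement
        a, b = ols_line(sub)
        score = mean_sq_residual(sub, a, b)
        if best is None or score < best[0]:
            best = (score, a, b)
    return best[1], best[2]


# Toy data: points on y = 1 + 2x with three gross outliers.
data = [(float(x), 1.0 + 2.0 * x) for x in range(20)]
for i in (3, 10, 17):
    data[i] = (data[i][0], 100.0)

a_all, b_all = ols_line(data)              # classical fit, pulled by outliers
a_rob, b_rob = subsample_fit(data, size=8)  # fit from the best subsample
```

On this toy data the full-sample fit is distorted by the three outliers, whereas the selected subsample recovers an intercept and slope close to 1 and 2. The subsample size and number of draws control the chance that at least one outlier-free subsample is drawn.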
Funding: Project (No. 02DZ15001) supported by the Shanghai Science and Technology Development Funds, China
Abstract: Rectification of airborne linear images is an indispensable preprocessing step. This paper presents a two-step rectification algorithm in detail. The first step establishes a direct georeferencing model from the data provided by the Positioning and Orientation System (POS) and derives the mathematical relationships between image points and ground reference points. The second step applies a polynomial distortion model and bilinear interpolation to produce the final, precisely rectified images; this step requires a reference image and a set of selected ground control points (GCPs). Experiments showed that the final rectified images are satisfactory and that the two-step rectification algorithm is effective.
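The second step (polynomial mapping followed by bilinear resampling) can be sketched in Python as follows. This is an illustration under stated assumptions, not the paper's implementation: the polynomial is taken to be first order, its coefficients are assumed to be already known (in practice they would be estimated from the GCPs), and the names `bilinear_sample`, `rectify`, `coeffs_x`, and `coeffs_y` are hypothetical.

```python
def bilinear_sample(image, x, y):
    """Bilinear interpolation of a grayscale image (list of rows) at the
    fractional position (x = column, y = row). Coordinates outside the
    image are clamped to the border."""
    h, w = len(image), len(image[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = image[y0][x0] * (1 - fx) + image[y0][x1] * fx
    bottom = image[y1][x0] * (1 - fx) + image[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def rectify(image, coeffs_x, coeffs_y, out_w, out_h):
    """Resample `image` onto an out_w x out_h grid.

    Assumed first-order polynomial mapping from output pixel (u, v) to
    source position (x, y):
        x = a0 + a1*u + a2*v,   y = b0 + b1*u + b2*v
    with coeffs_x = (a0, a1, a2) and coeffs_y = (b0, b1, b2).
    """
    a0, a1, a2 = coeffs_x
    b0, b1, b2 = coeffs_y
    out = []
    for v in range(out_h):
        row = []
        for u in range(out_w):
            x = a0 + a1 * u + a2 * v
            y = b0 + b1 * u + b2 * v
            row.append(bilinear_sample(image, x, y))
        out.append(row)
    return out


img = [[0.0, 10.0],
       [20.0, 30.0]]
center = bilinear_sample(img, 0.5, 0.5)                   # average of the 4 pixels
same = rectify(img, (0.0, 1.0, 0.0), (0.0, 0.0, 1.0), 2, 2)     # identity mapping
shifted = rectify(img, (0.5, 1.0, 0.0), (0.0, 0.0, 1.0), 2, 2)  # half-pixel shift in x
```

Mapping from output pixels back to source positions (inverse mapping) ensures that every output pixel receives a value; higher-order polynomial terms could be added in the same way.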
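Abstract: The isolation forest algorithm, which is based on random partitioning, does not take into account the probability that a subsample contains outliers. To address this problem, a range-based isolation forest algorithm is proposed. During random subsampling, the range is used to screen the sample subsets so that the selected subsets are more likely to contain many outliers. In addition, during isolation tree construction, the growth shape of each tree is controlled through the ratio of the sample size of a child node to that of its direct parent node, so as to avoid generating poorly performing isolation trees. Comparisons with 8 outlier detection algorithms on 7 public datasets from the Outlier Detection DataSets (ODDS) library and on the KDD CUP 99 dataset show that the accuracy of the r-iForest algorithm is 2% to 40% higher than that of the other algorithms, and that its time consumption is about 15% lower than that of the iForest algorithm.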
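Range-based screening of subsamples can be sketched in Python as below. The abstract does not give the exact screening rule, so this sketch assumes a simple one: draw several candidate subsamples and keep the one with the largest total range (sum over features of maximum minus minimum), since a subsample containing far-away points tends to have a larger range. The names `total_range`, `range_screened_subsample`, and `n_candidates` are hypothetical, and tree construction is not shown.

```python
import random


def total_range(rows):
    """Sum over features of (max - min) for a list of equal-length tuples."""
    return sum(max(col) - min(col) for col in zip(*rows))


def range_screened_subsample(data, size, n_candidates=10, seed=0):
    """Draw `n_candidates` random subsamples of `size` rows and return the
    one with the largest total range (assumed screening rule)."""
    rng = random.Random(seed)
    candidates = [rng.sample(data, size) for _ in range(n_candidates)]
    return max(candidates, key=total_range)


# Toy data: a tight cluster plus one distant point.
cluster = [((i % 5) * 0.1, (i % 7) * 0.1) for i in range(50)]
outlier = (100.0, 100.0)
data = cluster + [outlier]

sub = range_screened_subsample(data, size=10, n_candidates=200)
```

In this toy case the screened subsample almost certainly contains the distant point, because any candidate that includes it has a far larger range than one drawn only from the cluster.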