Abstract: Some distributions have no closed-form density yet are easy to sample from. For such distributions, simulated minimum Hellinger distance (SMHD) estimation appears to be useful. Because the method is distance-based, it is naturally robust. This paper follows up on an earlier paper in which the SMHD estimators were shown only to be consistent; here their asymptotic normality is established. For any parametric family of distributions for which all positive integer moments exist, the asymptotic results indicate that the variance of the SMHD estimators attains the lower bound for simulation-based estimators, namely the inverse of the Fisher information matrix adjusted by a constant that reflects the loss of efficiency due to simulation. These features suggest that the SMHD method is applicable in fields such as finance and actuarial science, where distributions without a closed-form density are common.
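For orientation, the quantities this abstract refers to can be written out as below. The Hellinger distance definition is standard (some authors add a factor of 1/2). Everything else is notation assumed here for illustration: $\hat f_n$ is a density estimate from the $n$ observations, $\tilde f_{\theta,U}$ a density estimate from $U$ draws simulated under $\theta$, and $\tau = U/n$. The factor $(1 + 1/\tau)$ is the form the simulation adjustment usually takes for simulation-based estimators; the abstract itself does not state the constant.

```latex
H^2(f, g) = \int \left( \sqrt{f(x)} - \sqrt{g(x)} \right)^2 \, dx ,
\qquad
\hat\theta_{\mathrm{SMHD}} = \arg\min_{\theta} \; H^2\!\left( \hat f_n , \tilde f_{\theta,U} \right) ,

\sqrt{n}\,\bigl( \hat\theta_{\mathrm{SMHD}} - \theta_0 \bigr)
\;\xrightarrow{d}\;
N\!\left( 0 , \; \left( 1 + \tfrac{1}{\tau} \right) I(\theta_0)^{-1} \right)
\quad \text{(assumed form of the bound).}
```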
Abstract: Minimum Hellinger distance (MHD) estimation is extended to a simulated version in which the model density function is replaced by a density estimate based on a random sample drawn from the model distribution. The method does not require a closed-form expression for the density and therefore appears suitable for models that lack one, for which likelihood methods can be difficult to implement. Although only consistency is shown in this paper and the asymptotic distribution remains an open question, our simulation study suggests that the method can produce simulated minimum Hellinger distance (SMHD) estimators with high efficiency. The method can serve as an alternative to methods based on moments, methods based on empirical characteristic functions, or the expectation-maximization (EM) algorithm.
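The idea in this abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the Gaussian kernel density estimates, the evaluation grid, the use of common random numbers, the 1/2 convention in the squared distance, and the toy location model `theta + Z` are all assumptions chosen to keep the example short. The function names `hellinger_sq` and `smhd_location` are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import gaussian_kde


def hellinger_sq(f, g, dx):
    """Squared Hellinger distance 0.5 * int (sqrt f - sqrt g)^2 dx via a Riemann sum."""
    return 0.5 * np.sum((np.sqrt(f) - np.sqrt(g)) ** 2) * dx


def smhd_location(data, n_sim=2000, seed=0):
    """Estimate the location theta of the toy model X = theta + Z, Z ~ N(0, 1).

    The model density is never evaluated; it is replaced by a kernel density
    estimate built from draws simulated under each candidate theta.
    """
    data = np.asarray(data, dtype=float)
    grid = np.linspace(data.min() - 3.0, data.max() + 3.0, 400)
    dx = grid[1] - grid[0]
    f_data = gaussian_kde(data)(grid)

    # Common random numbers: the same base draws are reused for every theta,
    # so the objective varies smoothly with theta.
    base = np.random.default_rng(seed).standard_normal(n_sim)

    def objective(theta):
        f_sim = gaussian_kde(theta + base)(grid)
        return hellinger_sq(f_data, f_sim, dx)

    result = minimize_scalar(
        objective, bounds=(data.min(), data.max()), method="bounded"
    )
    return result.x
```

For a model without a closed-form density, only the line that generates `theta + base` would change: it would be replaced by whatever sampler the model provides, driven by the same fixed random inputs.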
Funding: Project supported by the National Natural Science Foundation of China (Nos. 61802085 and 61563012), the Guangxi Provincial Natural Science Foundation, China (Nos. 2021GXNSFAA220074 and 2020GXNSFAA159038), the Guangxi Key Laboratory of Embedded Technology and Intelligent System Foundation, China (No. 2018A-04), and the Guangxi Key Laboratory of Trusted Software Foundation, China (No. kx202011).
Abstract: Traditional machine learning methods are sensitive to skewed distributions and do not account for the characteristics of multiclass imbalance problems, so skewed multiclass data pose a major challenge to machine learning algorithms. To tackle this issue, we propose a new decision tree splitting criterion based on the one-against-all-based Hellinger distance (OAHD). OAHD has two key elements. First, the one-against-all scheme is integrated into the computation of the Hellinger distance, thereby extending the Hellinger distance decision tree to the multiclass imbalance problem. Second, for the multiclass imbalance problem, the distribution and the number of distinct classes are taken into account, and a modified Gini index is designed. We also give theoretical proofs of the properties of OAHD, including skew insensitivity and the ability to seek a purer node in the decision tree. Finally, we collect 20 public real-world imbalanced data sets from the Knowledge Extraction based on Evolutionary Learning (KEEL) repository and the University of California, Irvine (UCI) repository. Experimental results show that OAHD significantly improves performance over five other well-known decision trees in terms of Precision, F-measure, and multiclass area under the receiver operating characteristic curve (MAUC). The Friedman and Nemenyi tests are used to confirm the statistical advantage of OAHD over the five other decision trees.
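To make the splitting criterion concrete, the sketch below computes the usual binary Hellinger distance of a candidate split and applies it class by class in a one-against-all fashion. It is an illustration only: the abstract does not give the OAHD formula, so averaging the per-class distances is an assumption, the modified Gini index is omitted, and the function names are hypothetical.

```python
import numpy as np


def binary_hellinger(pos_mask, branch_ids):
    """Hellinger distance between the branch distributions of positives and negatives.

    pos_mask   : boolean array, True for samples of the positive class
    branch_ids : array giving the branch each sample falls into after the split
    """
    pos_total = pos_mask.sum()
    neg_total = (~pos_mask).sum()
    if pos_total == 0 or neg_total == 0:
        return 0.0
    total = 0.0
    for branch in np.unique(branch_ids):
        in_branch = branch_ids == branch
        p = (pos_mask & in_branch).sum() / pos_total   # share of positives in branch
        q = (~pos_mask & in_branch).sum() / neg_total  # share of negatives in branch
        total += (np.sqrt(p) - np.sqrt(q)) ** 2
    return float(np.sqrt(total))


def one_vs_all_hellinger(labels, branch_ids):
    """Average the binary distance over classes, each taken as positive in turn.

    Averaging is an assumption made for this sketch, not the paper's rule.
    """
    labels = np.asarray(labels)
    branch_ids = np.asarray(branch_ids)
    scores = [binary_hellinger(labels == c, branch_ids) for c in np.unique(labels)]
    return float(np.mean(scores))
```

Because each class contributes only through within-class proportions, the score of a split does not change when one class is made much larger than the others, which is the skew insensitivity the abstract mentions for the Hellinger distance.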
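Abstract: This paper proposes a pipeline leak detection method based on the Hellinger distance. The method first takes pressure data from normal pipeline operation, builds a time series with a sliding-window approach, and constructs a leak detection statistic based on the Hellinger distance. To monitor pipeline leaks more accurately, control limits are established with a Shewhart control chart. Simulation results show that the method is simple and effective and has good real-time performance.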
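The monitoring scheme described above can be sketched as follows. The abstract does not specify how the window distributions are estimated, so this example assumes each window is approximately Gaussian and uses the closed-form Hellinger distance between two normal distributions. The window width, the mean-plus-three-sigma limit, and the synthetic pressure values are likewise assumptions for illustration.

```python
import numpy as np


def gaussian_hellinger(m1, s1, m2, s2):
    """Hellinger distance between N(m1, s1^2) and N(m2, s2^2) in closed form."""
    var_sum = s1 ** 2 + s2 ** 2
    bc = np.sqrt(2.0 * s1 * s2 / var_sum) * np.exp(-((m1 - m2) ** 2) / (4.0 * var_sum))
    return float(np.sqrt(max(0.0, 1.0 - bc)))


def window_stats(x, width, ref_mean, ref_std):
    """Hellinger distance of every sliding window from the reference distribution."""
    stats = []
    for start in range(len(x) - width + 1):
        w = x[start:start + width]
        stats.append(gaussian_hellinger(w.mean(), w.std(ddof=1), ref_mean, ref_std))
    return np.array(stats)


def shewhart_limit(stats, k=3.0):
    """Upper control limit: mean plus k standard deviations of the statistic."""
    return stats.mean() + k * stats.std(ddof=1)


# Synthetic pressure readings (hypothetical units): normal operation, then a drop.
rng = np.random.default_rng(1)
normal = rng.normal(5.0, 0.05, 2000)
ref_mean, ref_std = normal.mean(), normal.std(ddof=1)
limit = shewhart_limit(window_stats(normal, 50, ref_mean, ref_std))

leak = rng.normal(4.8, 0.05, 200)
alarms = window_stats(leak, 50, ref_mean, ref_std) > limit
```

A window raises an alarm when its distance from the normal-operation reference exceeds `limit`. A histogram or kernel estimate could replace the Gaussian approximation if the pressure distribution is clearly non-normal.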