Abstract: In real-life freeway transportation systems, only a few incident observations (very rare events) are available, whereas normal-condition data are plentiful. Most research on freeway incident detection has treated incident detection as a classification problem. However, because incident events are scarce, most previous studies have used simulated incident events to develop freeway incident detection models. To overcome this drawback, this paper proposes a wavelet-based Hotelling T² control chart for freeway incident detection, which integrates a wavelet transform into an anomaly detection method. First, the wavelet transform extracts useful features from noisy original traffic observations, which reduces the dimensionality of the input vectors. Then, a Hotelling T² control chart describes a decision boundary using only normal traffic observations, represented by the selected features in the wavelet domain. Unlike existing incident detection algorithms, which require many incident observations to construct incident detection models, the proposed approach can determine a decision boundary given only normal training observations. The proposed method is evaluated against the California algorithm, the Minnesota algorithm, and conventional neural networks. The experimental results indicate that the proposed algorithm is a promising alternative for automatic freeway incident detection.
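The two steps described in this abstract (wavelet feature extraction, then a T² boundary learned from normal data only) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and not taken from the paper: the Haar wavelet, two decomposition levels, the synthetic speed-like data, and an empirical 99% quantile of the training scores standing in for the usual F-distribution control limit.

```python
import random

def haar_approx(signal, levels=1):
    """Haar approximation coefficients after `levels` decomposition steps."""
    coeffs = list(signal)
    for _ in range(levels):
        # Low-pass half of one Haar step: scaled sums of adjacent pairs.
        coeffs = [(coeffs[i] + coeffs[i + 1]) / 2 ** 0.5
                  for i in range(0, len(coeffs) - 1, 2)]
    return coeffs

def mean_and_cov(rows):
    """Sample mean vector and unbiased sample covariance matrix."""
    n, p = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(p)]
    cov = [[sum((r[i] - mean[i]) * (r[j] - mean[j]) for r in rows) / (n - 1)
            for j in range(p)] for i in range(p)]
    return mean, cov

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def t2_score(x, mean, cov):
    """Hotelling T^2 distance of one observation: d' S^{-1} d."""
    d = [xi - mi for xi, mi in zip(x, mean)]
    return sum(di * zi for di, zi in zip(d, solve(cov, d)))

def fit_chart(normal_windows, levels=2, quantile=0.99):
    """Learn mean, covariance and an empirical limit from normal data only."""
    feats = [haar_approx(w, levels) for w in normal_windows]
    mean, cov = mean_and_cov(feats)
    scores = sorted(t2_score(f, mean, cov) for f in feats)
    limit = scores[int(quantile * (len(scores) - 1))]  # assumed empirical limit
    return mean, cov, limit

# Synthetic data (hypothetical): normal traffic near 50, an incident near 20.
random.seed(0)
normal = [[random.gauss(50.0, 3.0) for _ in range(8)] for _ in range(200)]
mean, cov, limit = fit_chart(normal)
incident = [random.gauss(20.0, 3.0) for _ in range(8)]
score = t2_score(haar_approx(incident, 2), mean, cov)
print("incident flagged:", score > limit)
```

No incident observations are used to fit the chart; the incident window only appears at scoring time, which is the property the abstract emphasizes.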
Abstract: The past two decades have witnessed the active development of a rich probability theory of Studentized statistics, or self-normalized processes, typified by Student’s t-statistic as introduced by W. S. Gosset more than a century ago, and their applications to statistical problems in high dimensions, including feature selection and ranking, large-scale multiple testing, and sparse, high-dimensional signal detection. Many of these applications rely on the robustness of Studentization/self-normalization against heavy-tailed sampling distributions. This paper gives an overview of the salient progress in self-normalized limit theory, from Student’s t-statistic to more general Studentized nonlinear statistics. Prototypical examples include Studentized one- and two-sample U-statistics. Furthermore, we go beyond independence and glimpse some very recent advances in self-normalized moderate deviations under dependence.
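As a reminder of why Student's t-statistic counts as a self-normalized process, the standard algebraic identity linking it to the self-normalized sum can be written out. The notation ($S_n$, $V_n$, $s_n$) is chosen here for illustration and is not taken from the paper; $s_n^2$ is the sample variance with divisor $n-1$.

```latex
S_n = \sum_{i=1}^{n} X_i, \qquad
V_n^2 = \sum_{i=1}^{n} X_i^2, \qquad
T_n = \frac{\sqrt{n}\,\bar{X}_n}{s_n}
    = \frac{S_n / V_n}{\sqrt{\bigl(n - (S_n/V_n)^2\bigr)/(n-1)}}
```

Because $T_n$ is a monotone function of $S_n / V_n$, tail results for the self-normalized sum carry over to the t-statistic.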
Funding: Supported by the National Natural Science Foundation of China (No. 60472058, No. 60975017) and the Jiangsu Provincial Natural Science Foundation (No. BK2008291).
Abstract: This letter proposes a sound source localization method for digital hearing aids that uses wavelet-based multivariate statistics with the Generalized Cross Correlation (GCC) algorithm. The Haar wavelet is used to decompose GCC sequences and extract four wavelet characteristics. Hotelling's T² statistic is then used to fuse the four wavelet characteristics. The resulting statistic is used to determine the number of sound sources and to obtain the corresponding time delay estimates, which are used to localize the sound sources. The experimental results show that the proposed method is more robust in environments with severe noise and reverberation. Meanwhile, the complexity of the algorithm is moderate, making it suitable for sound source localization in hearing aids.
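The time delay estimation that this method builds on can be sketched as follows. This is only the unweighted form of GCC (plain time-domain cross-correlation) on synthetic two-microphone signals; the Haar decomposition of the GCC sequence and the T² fusion described in the abstract are not reproduced, and all names and parameters are hypothetical.

```python
import random

def cross_correlation(x, y, max_lag):
    """Return {lag: sum_t x[t] * y[t + lag]} for lags in [-max_lag, max_lag]."""
    n = len(x)
    corr = {}
    for lag in range(-max_lag, max_lag + 1):
        total = 0.0
        for t in range(n):
            if 0 <= t + lag < n:
                total += x[t] * y[t + lag]
        corr[lag] = total
    return corr

def estimate_delay(x, y, max_lag):
    """Lag (in samples) at which y best matches a delayed copy of x."""
    corr = cross_correlation(x, y, max_lag)
    return max(corr, key=corr.get)

# Synthetic example (assumed): the second microphone hears the same
# white-noise source 5 samples later, plus a little sensor noise.
random.seed(1)
true_delay = 5
source = [random.gauss(0.0, 1.0) for _ in range(400)]
mic_a = source
mic_b = [(source[t - true_delay] if t >= true_delay else 0.0)
         + random.gauss(0.0, 0.3) for t in range(len(source))]

estimated = estimate_delay(mic_a, mic_b, max_lag=10)
print("estimated delay in samples:", estimated)
```

With a known microphone spacing and sampling rate, the estimated delay would then be converted to a direction of arrival; that step is omitted here.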
Abstract: The distribution of children's motor performance scores is predicted to deviate from a normal curve, because their physical fitness shows a decline and a gap between good and poor performers. The aim of the present study is to classify the distribution patterns of physique and motor performance (PMP) in preschoolers, and to investigate gender differences and seasonal and age-related changes in the distribution of PMP. Seven hundred and seven preschoolers were measured on 21 PMP items, yielding mixed-longitudinal data from 6 terms (2 seasons, spring and autumn, over 3 years). Histograms with 10 bins were constructed for each gender, term, and test item over the range mean ± 3 standard deviations. By cluster analysis and multiple correspondence analysis, "physique & jumping" showed a nearly normal curve, whereas "manipulation (MP)" was skewed toward poor scores and "running & prompt (RP)" was skewed toward good scores. By Hotelling's T² test and Mahalanobis distance, the gender difference was that boys' RP was skewed toward good scores, while their "throwing" and "weight & flexibility (WF)" were skewed toward poor scores. As for season, in spring WF was skewed toward poor scores and RP toward good scores. As for annual change in the distribution pattern, the skew toward poor scores in MP and toward good scores in "running straight & prompt" shifted to a normal curve with age.
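Abstract: Testing the mean of high-dimensional data is an important part of statistical hypothesis testing. In the high-dimensional setting, the sample covariance matrix is often singular, so traditional mean tests fail. To address this problem, the high-dimensional data are partitioned into segments such that the dimension of each segment is smaller than the sample size, and Hotelling's T² test is then applied to each segment in turn. A new method for controlling the probability of a type I error is also given, which keeps the test level under the null hypothesis close to the pre-specified significance level. Simulations show that this segment-by-segment testing method controls the type I error probability better than existing methods.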
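For reference, the classical one-sample result that makes each segment testable can be stated as follows, assuming multivariate normal data. The notation (segment index $k$, segment dimension $p_k$, sample size $n$, hypothesized mean $\mu_{0,k}$) is introduced here for illustration; the abstract does not describe how the paper combines the segment-level tests to control the overall type I error, so that step is not shown.

```latex
% Segment k has dimension p_k < n, so its sample covariance S_k is
% invertible with probability one.
T_k^2 = n\,(\bar{x}_k - \mu_{0,k})^{\top} S_k^{-1} (\bar{x}_k - \mu_{0,k}),
\qquad
\frac{n - p_k}{p_k\,(n - 1)}\, T_k^2 \sim F_{p_k,\; n - p_k}
\quad \text{under } H_0 .
```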
Funding: Supported by the National University of Singapore Academic Research Grant (Grant No. R-155-000-085-112).
Abstract: For several decades, much attention has been paid to the two-sample Behrens-Fisher (BF) problem, which tests the equality of the means or mean vectors of two normal populations with unequal variance/covariance structures. Little work, however, has been done on the k-sample BF problem for high-dimensional data, which tests the equality of the mean vectors of several high-dimensional normal populations with unequal covariance structures. In this paper we study this challenging problem by extending the famous Scheffe’s transformation method, which reduces the k-sample BF problem to a one-sample problem. The induced one-sample problem can be easily tested by the classical Hotelling’s T² test when the size of the resulting sample is very large relative to its dimensionality. For high-dimensional data, however, the dimensionality of the resulting sample is often very large, and even much larger than its sample size, which makes the classical Hotelling’s T² test lose power or even become undefined. To overcome this difficulty, we propose and study an L²-norm based test. The asymptotic powers of the proposed L²-norm based test and Hotelling’s T² test are derived and theoretically compared. Methods for implementing the L²-norm based test are described. Simulation studies are conducted to compare the L²-norm based test and Hotelling’s T² test when the latter is well defined, and otherwise to compare the proposed implementation methods for the L²-norm based test. The methodologies are motivated and illustrated by a real data example.
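The contrast between the two tests can be made concrete with a small sketch. It assumes the L²-norm statistic takes the common one-sample form n‖x̄‖²; the paper's exact statistic and the approximation of its null distribution are not given in the abstract, so no critical value is computed. Unlike Hotelling's T², this statistic needs no inverse of the sample covariance matrix, whose rank is at most n − 1 and which is therefore singular whenever the dimension exceeds that.

```python
import random

def l2_norm_statistic(rows):
    """n * ||sample mean||^2, an assumed form of a one-sample L2-norm statistic."""
    n, p = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(p)]
    return n * sum(m * m for m in mean)

# Hypothetical high-dimensional setting: 10 observations in 50 dimensions,
# where Hotelling's T^2 is not defined because the covariance is singular.
random.seed(2)
n, p = 10, 50
null_sample = [[random.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
shifted_sample = [[random.gauss(1.0, 1.0) for _ in range(p)] for _ in range(n)]

stat_null = l2_norm_statistic(null_sample)
stat_shifted = l2_norm_statistic(shifted_sample)
print("statistic under zero mean:  ", round(stat_null, 1))
print("statistic under shifted mean:", round(stat_shifted, 1))
```

The statistic grows with the distance of the mean vector from zero; turning it into a test requires the null-distribution approximations that the paper develops.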
Funding: Supported by the National Natural Science Foundation of China (11971433), the First Class Discipline of Zhejiang-A (Zhejiang Gongshang University-Statistics), and the Hunan Soft Science Research Project (2012ZK3064).
Abstract: Detecting differential expression of genes is common in genomic research (e.g., on 2019-nCoV). Because of cost, only small samples are used to estimate a large number of variances (or their inverses) simultaneously. However, the commonly used approaches perform unreliably. By borrowing information across different variables or using prior information about the variables, shrinkage estimation approaches have been proposed, and some optimal shrinkage estimators have been obtained in the asymptotic sense. In this paper, we focus on the small-sample setting and give a likelihood-unbiased estimator for powers of variances under the assumption that the variances follow a chi-squared distribution. Simulation results show that the likelihood-unbiased estimators for variances and their inverses perform very well. In addition, application comparisons and real data analysis indicate that the proposed estimator also works well.