Abstract: Modeling non-coding background sequences appropriately is important for detecting regulatory elements in DNA sequences. Based on the chi-square test, this paper explains why a higher-order Markov chain model should be chosen and how to select the proper order automatically. The chi-square test is first run on synthetic data sets to show that it can efficiently find the proper order of a Markov chain. Using the chi-square test, distinct higher-order context dependences are then found in ten sets of yeast (S. cerevisiae) sequences taken from the literature. A higher-order Markov chain is therefore more suitable for modeling non-coding background sequences than an independence model.
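The order-selection idea can be illustrated with a minimal sketch. The abstract does not give the exact statistic, so the code below assumes the textbook Pearson chi-square test of independence on dinucleotide counts, which compares an order-0 (independence) model with an order-1 Markov chain. The function names and the 0.05 critical value are illustrative choices, not the paper's.

```python
from collections import Counter

ALPHABET = "ACGT"
# 0.95 quantile of the chi-square distribution with (4-1)*(4-1) = 9 degrees
# of freedom; valid when all four bases occur as first and second symbol.
CHI2_CRIT_DF9 = 16.919

def order0_vs_order1_chi_square(seq):
    """Pearson statistic for H0: the next base is independent of the current one."""
    pairs = Counter(zip(seq, seq[1:]))   # observed dinucleotide counts
    n = len(seq) - 1                     # number of adjacent pairs
    row, col = Counter(), Counter()      # marginal counts of first / second base
    for (a, b), count in pairs.items():
        row[a] += count
        col[b] += count
    stat = 0.0
    for a in ALPHABET:
        for b in ALPHABET:
            expected = row[a] * col[b] / n
            if expected > 0:             # skip cells that cannot occur
                stat += (pairs[(a, b)] - expected) ** 2 / expected
    return stat

def rejects_independence(seq):
    """True if the order-0 model is rejected at the 0.05 level."""
    return order0_vs_order1_chi_square(seq) > CHI2_CRIT_DF9

# A strictly periodic sequence has strong first-order dependence.
periodic = "ACGT" * 500
print(rejects_independence(periodic))
```

Testing order k against order k+1 follows the same pattern, with each k-mer context in place of the single preceding base.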
Abstract: This paper presents the Singular Value Decomposition (SVD) formula and a modified chi-square solution, and combines the modified chi-square method with an FT-IR instrument to control a biochemical reaction process. Using the modified chi-square technique, the unknown concentrations of reactants and products in test samples withdrawn from the process are determined. The technique does not require the spectral data to conform to Beer's Law, and the best spectral range is determined automatically.
Abstract: Applying frequency distribution statistics to data provides an objective means to assess the nature of the data distribution and the viability of the numerical models used to visualize and interpret the data. Two commonly used tools are kernel density estimation and the reduced chi-squared statistic used in combination with a weighted mean. Because these tools are widely applicable, we present a Java-based computer application called KDX to facilitate the visualization of data and the use of these numerical tools.
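As a rough illustration of the second tool, the sketch below computes an inverse-variance weighted mean and the reduced chi-squared statistic about it. It is written in Python for brevity and is not taken from KDX; the function name and return layout are assumptions.

```python
def weighted_mean_and_reduced_chi2(values, sigmas):
    """Inverse-variance weighted mean, its standard error, and reduced chi-squared.

    values: measurements; sigmas: their 1-sigma uncertainties (all > 0).
    """
    weights = [1.0 / s ** 2 for s in sigmas]
    total_weight = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total_weight
    mean_sigma = (1.0 / total_weight) ** 0.5
    # Sum of squared standardized deviations from the weighted mean.
    chi2 = sum(((v - mean) / s) ** 2 for v, s in zip(values, sigmas))
    dof = len(values) - 1                # one parameter (the mean) was fitted
    return mean, mean_sigma, chi2 / dof

mean, mean_sigma, red_chi2 = weighted_mean_and_reduced_chi2(
    [9.0, 11.0], [1.0, 1.0]
)
print(mean, mean_sigma, red_chi2)
```

A reduced chi-squared near 1 suggests the scatter is consistent with the stated uncertainties; values well above 1 suggest excess scatter or underestimated uncertainties.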
Funding: National Natural Science Foundation of China (10571139)
Abstract: We study the asymptotics of the chi-square statistic under type II error. By the contraction principle, large deviations and moderate deviations are obtained, and the rate function of the moderate deviations can be calculated explicitly and is a quadratic function.
Abstract: A new six-parameter continuous distribution called the Generalized Kumaraswamy Generalized Power Gompertz (GKGPG) distribution is proposed in this study, and a graphical illustration of its probability density function and cumulative distribution function is presented. The statistical features of the GKGPG distribution are systematically derived and studied. The model parameters are estimated by maximum likelihood, both in the absence of censoring and under right censoring. The test statistic for right-censored data, the criteria test for the GKGPG distribution, the estimated matrices Ŵ, Ĉ, and Ĝ, and the criteria test $Y_n^2$ are derived, along with the quadratic form of the test statistic. Mean simulated values of the maximum likelihood estimates and their corresponding mean squared errors are presented and agree closely with the true parameter values. Simulated levels of significance for the $Y_n^2(\gamma)$ test for the GKGPG model are recorded against their theoretical values. We conclude that the null hypothesis that the simulated samples follow the GKGPG distribution is widely validated for the different levels of significance considered. From the summary of results for a data set on the strength of a specific type of braided cord, the proposed GKGPG model fits the data at a significance level ε = 0.05.
Abstract: In large-sample studies where distributions may be skewed and not readily transformed to symmetry, it may be of greater interest to compare distributions in terms of percentiles rather than means. For example, it may be more informative to compare two or more populations with respect to their within-population distributions by testing the hypothesis that their respective 10th, 50th, and 90th percentiles are equal. As a generalization of the median test, the proposed test statistic is asymptotically distributed as chi-square, with degrees of freedom that depend on the number of percentiles tested and the constraints of the null hypothesis. Simulation studies are used to validate the nominal 0.05 significance level under the null hypothesis and to show asymptotic power properties suitable for testing equality of percentile profiles against selected profile discrepancies for a variety of underlying distributions. A practical example illustrates the comparison of percentile profiles for four body mass index distributions.
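One way to picture such a test is the sketch below. The abstract does not state the statistic, so this assumes the natural extension of Mood's median test: cut the pooled sample at its 10th, 50th, and 90th percentiles, count each group's observations per bin, and apply Pearson's chi-square to the resulting table. The nearest-rank percentile rule and all names are illustrative.

```python
import bisect
import math

def percentile_profile_chi_square(groups, probs=(0.10, 0.50, 0.90)):
    """Pearson chi-square on group-by-percentile-bin counts.

    Returns (statistic, degrees_of_freedom). Assumes few ties at the cut
    points; heavy ties can empty a bin and overstate the degrees of freedom.
    """
    pooled = sorted(x for g in groups for x in g)
    n = len(pooled)
    # Nearest-rank pooled percentiles used as common cut points.
    cuts = [pooled[min(n - 1, max(0, math.ceil(p * n) - 1))] for p in probs]
    # table[i][j]: observations of group i falling in bin j, with bins
    # (-inf, c1], (c1, c2], ..., (ck, +inf).
    table = []
    for g in groups:
        counts = [0] * (len(cuts) + 1)
        for x in g:
            counts[bisect.bisect_left(cuts, x)] += 1
        table.append(counts)
    row_totals = [sum(r) for r in table]
    col_totals = [sum(c) for c in zip(*table)]
    stat = 0.0
    for i, r in enumerate(table):
        for j, observed in enumerate(r):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
    df = (len(groups) - 1) * len(cuts)
    return stat, df

stat, df = percentile_profile_chi_square(
    [list(range(100)), list(range(1000, 1100))]
)
print(stat, df)
```

With three percentiles and two groups the reference distribution has 3 degrees of freedom, whose 0.95 quantile is 7.815.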
Abstract: We describe two new derivations of the chi-square distribution. The first uses induction and requires only a single integral to calculate. The second uses the Laplace transform and requires minimal assumptions. The new derivations are compared with established ones, such as those by convolution, moment generating function, and Bayesian inference. Chi-square testing has seen many applications in physics and other fields. We describe a unique version of the chi-square test in which both the variance and the location are tested, and apply it to environmental data. The chi-square test is used to judge whether a laboratory method is capable of detecting gross alpha and beta radioactivity in drinking water for regulatory monitoring to protect public health. A case in which the chi-square test fails, and its remedy, are described. The chi-square test is compared with and supplemented by the t-test.
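The difference between a variance-only test and one that is also sensitive to location can be sketched as follows. The abstract does not give the authors' statistic, so this assumes the textbook forms for normal data with hypothesized mean `mu0` and standard deviation `sigma0` (both hypothetical parameter names): the sum of squares about the sample mean, referred to chi-square with n-1 degrees of freedom, and the sum of squares about `mu0`, referred to chi-square with n degrees of freedom.

```python
def chi_square_statistics(xs, mu0, sigma0):
    """Return (variance_only, joint) chi-square statistics.

    variance_only = sum((x - mean)^2) / sigma0^2   ~ chi2(n - 1) under H0
    joint         = sum((x - mu0)^2)  / sigma0^2   ~ chi2(n)     under H0
    The two differ by n * (mean - mu0)^2 / sigma0^2, the location term.
    """
    n = len(xs)
    mean = sum(xs) / n
    variance_only = sum((x - mean) ** 2 for x in xs) / sigma0 ** 2
    joint = sum((x - mu0) ** 2 for x in xs) / sigma0 ** 2
    return variance_only, joint

# Same spread, but the second hypothesis puts the mean in the wrong place:
# only the joint statistic grows.
print(chi_square_statistics([1.0, 2.0, 3.0], mu0=2.0, sigma0=1.0))
print(chi_square_statistics([1.0, 2.0, 3.0], mu0=0.0, sigma0=1.0))
```

A biased measurement with the right scatter therefore passes the variance-only test but can fail the joint one.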
Abstract: Zero-inflated distributions are common in statistical problems where there is interest in testing the homogeneity of two or more independent groups. Often, the underlying distribution with an inflated number of zero-valued observations is asymmetric, and its functional form may not be known or easily characterized. In this case, comparing the groups in terms of their respective percentiles may be appropriate, as these estimates are nonparametric and more robust to outliers and other irregularities. The median test is often used to compare distributions with similar but asymmetric shapes, but it may be uninformative when there are excess zeros or dissimilar shapes. For zero-inflated distributions, it is useful to compare the distributions with respect to their proportion of zeros, coupled with a comparison of percentile profiles for the observed non-zero values. A simple chi-square test for simultaneously testing these two components is proposed, applicable to both continuous and discrete data. Results of simulation studies are reported to summarize empirical power under several scenarios. We give recommendations for the minimum sample size necessary to achieve suitable test performance in specific examples.
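Abstract: To address the low efficiency of traditional fault detection algorithms in detecting slowly varying faults in integrated navigation systems, a fault detection algorithm based on the orthogonality principle is proposed. In the absence of a fault, adjacent Kalman filter residuals are orthogonal, and the residual orthogonality value is a zero-mean white noise sequence; when a fault is present, adjacent residuals are highly correlated, and the residual orthogonality value no longer satisfies the zero-mean condition. Accordingly, a chi-square test is constructed on the residual orthogonality value to detect faults. The special structure of the residual orthogonality value makes it sensitive to faults. Experimental results show that the algorithm detects slowly varying faults better than the residual chi-square test and the fading sequential probability ratio test (SPRT), improving the estimation accuracy and reliability of the integrated navigation system.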
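The baseline that the abstract compares against, the residual chi-square test, can be sketched for a scalar measurement as below. This is not the proposed algorithm: the abstract does not define the residual orthogonality value, so the product of adjacent residuals is used here only as an assumed stand-in to show why a slow drift can escape the per-step test while adjacent residuals stay correlated. All names and the 0.99 threshold are illustrative.

```python
# 0.99 quantile of the chi-square distribution with 1 degree of freedom.
CHI2_1DOF_99 = 6.635

def residual_chi_square(residual, innovation_variance):
    """Scalar residual chi-square statistic r^2 / S, chi2(1) when fault-free."""
    return residual ** 2 / innovation_variance

def detect_faults(residuals, innovation_variance, threshold=CHI2_1DOF_99):
    """Per-step fault flags from the residual chi-square test."""
    return [
        residual_chi_square(r, innovation_variance) > threshold
        for r in residuals
    ]

def adjacent_products(residuals):
    """Assumed stand-in for the orthogonality value: r_k * r_(k-1)."""
    return [a * b for a, b in zip(residuals, residuals[1:])]

# A slow ramp in the residual (gradual fault) with unit innovation variance.
ramp = [0.2 * k for k in range(11)]
flags = detect_faults(ramp, innovation_variance=1.0)
products = adjacent_products(ramp)
print(any(flags))                      # per-step test on the ramp
print(sum(products) / len(products))   # mean of adjacent products
```

For fault-free white residuals the adjacent products average to zero, whereas under the ramp they are all non-negative, which is the kind of departure from zero mean the abstract's test targets.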