We study the asymptotics of the chi-square statistic under type II error. Using the contraction principle, we obtain the large deviations and moderate deviations; the rate function of the moderate deviations can be calculated explicitly and is a quadratic function.
In large sample studies where distributions may be skewed and not readily transformed to symmetry, it may be of greater interest to compare different distributions in terms of percentiles rather than means. For example, it may be more informative to compare two or more populations with respect to their within-population distributions by testing the hypothesis that their corresponding 10th, 50th, and 90th percentiles are equal. As a generalization of the median test, the proposed test statistic is asymptotically distributed as chi-square with degrees of freedom dependent upon the number of percentiles tested and constraints of the null hypothesis. Results from simulation studies are used to validate the nominal 0.05 significance level under the null hypothesis, and asymptotic power properties that are suitable for testing equality of percentile profiles against selected profile discrepancies for a variety of underlying distributions. A pragmatic example is provided to illustrate the comparison of the percentile profiles for four body mass index distributions.
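To make the idea of a percentile-profile comparison concrete, here is a minimal sketch of one plausible construction, not necessarily the statistic proposed in the abstract above. It assumes the cut points are nearest-rank percentiles of the pooled sample and that each group's observations are cross-classified into the resulting bins, after which an ordinary Pearson chi-square is computed on the groups-by-bins table. The function names and the degrees-of-freedom formula (groups minus one, times the number of percentiles) are assumptions of this sketch.

```python
import math
from bisect import bisect_left


def pooled_percentile(sorted_vals, p):
    """Nearest-rank percentile (p between 0 and 100) of an already sorted list."""
    rank = max(1, math.ceil(len(sorted_vals) * p / 100))
    return sorted_vals[rank - 1]


def percentile_profile_chi2(groups, percentiles=(10, 50, 90)):
    """Pearson chi-square on a (groups x percentile-bins) table.

    Assumption of this sketch: bins are cut at pooled-sample percentiles,
    and a value equal to a cut point falls in the lower bin.
    """
    pooled = sorted(x for g in groups for x in g)
    cuts = [pooled_percentile(pooled, p) for p in percentiles]
    n_bins = len(cuts) + 1

    # counts[i][j] = number of observations of group i falling in bin j
    counts = [[0] * n_bins for _ in groups]
    for i, g in enumerate(groups):
        for x in g:
            counts[i][bisect_left(cuts, x)] += 1

    n = len(pooled)
    row_tot = [sum(row) for row in counts]
    col_tot = [sum(counts[i][j] for i in range(len(groups))) for j in range(n_bins)]

    stat = 0.0
    for i in range(len(groups)):
        for j in range(n_bins):
            expected = row_tot[i] * col_tot[j] / n
            if expected > 0:  # skip empty bins (possible with heavy ties)
                stat += (counts[i][j] - expected) ** 2 / expected
    df = (len(groups) - 1) * len(percentiles)
    return stat, df


# Two groups with no overlap: every bin is occupied by one group only.
stat, df = percentile_profile_chi2([list(range(1, 21)), list(range(21, 41))])
print(stat, df)
```

With a single percentile (the 50th) this reduces to the familiar median test on a groups-by-2 table, which is the sense in which the profile test generalizes it.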
Zero-inflated distributions are common in statistical problems where there is interest in testing homogeneity of two or more independent groups. Often, the underlying distribution that has an inflated number of zero-valued observations is asymmetric, and its functional form may not be known or easily characterized. In this case, comparisons of the groups in terms of their respective percentiles may be appropriate as these estimates are nonparametric and more robust to outliers and other irregularities. The median test is often used to compare distributions with similar but asymmetric shapes but may be uninformative when there are excess zeros or dissimilar shapes. For zero-inflated distributions, it is useful to compare the distributions with respect to their proportion of zeros, coupled with the comparison of percentile profiles for the observed non-zero values. A simple chi-square test for simultaneous testing of these two components is proposed, applicable to both continuous and discrete data. Results of simulation studies are reported to summarize empirical power under several scenarios. We give recommendations for the minimum sample size necessary to achieve suitable test performance in specific examples.
The question of how to choose a copula model that best fits a given dataset is a predominant limitation of the copula approach, and the present study aims to investigate the techniques of goodness-of-fit tests for multi-dimensional copulas. A goodness-of-fit test based on Rosenblatt's transformation was mathematically extended from two dimensions to three dimensions, and procedures for a bootstrap version of the test were provided. Through stochastic copula simulation, an empirical application to historical drought data at the Lintong Gauge Station shows that the goodness-of-fit tests perform well, revealing that both trivariate Gaussian and Student t copulas are acceptable for modeling the dependence structures of the observed drought duration, severity, and peak. The goodness-of-fit tests for multi-dimensional copulas can provide further support for applying a wider range of copulas to describe the associations of correlated hydrological variables. However, applying copulas with more than three dimensions requires greater computational effort as well as exploration and parameterization of the corresponding copulas.
The logistic regression model has become commonly used to study the association between a binary response variable and a set of explanatory variables; its widespread application rests on its ease of application and interpretation. The assessment of goodness-of-fit in the logistic regression model has attracted the attention of many scientists and researchers. Goodness-of-fit tests are methods to determine the suitability of the fitted model. Many methods have been proposed and discussed for assessing goodness-of-fit in the logistic regression model; however, the asymptotic distributions of goodness-of-fit statistics have been less examined and need more investigation. This work focuses on assessing the behavior of the asymptotic distributions of goodness-of-fit tests, compares global goodness-of-fit tests, and evaluates them by simulation.
We describe two new derivations of the chi-square distribution. The first derivation uses the induction method, which requires only a single integral to calculate. The second derivation uses the Laplace transform and requires minimal assumptions. The new derivations are compared with the established derivations, such as by convolution, moment generating function, and Bayesian inference. Chi-square testing has seen many applications in physics and other fields. We describe a unique version of the chi-square test where both the variance and location are tested, which is then applied to environmental data. The chi-square test is used to judge whether a laboratory method is capable of detecting gross alpha and beta radioactivity in drinking water for regulatory monitoring to protect the health of the population. A case of a failure of the chi-square test and its amelioration are described. The chi-square test is compared to and supplemented by the t-test.
“Human-elephant conflict (HEC)”, an alarming issue at present, has attracted the attention of environmentalists and policy makers. The rising conflict between human beings and wild elephants is common in Buxa Tiger Reserve (BTR) and its adjoining area in West Bengal State, India, making the area volatile. People’s attitudes towards elephant conservation activity are crucial to eliminating HEC, because people’s proximity to wild elephants’ habitat can trigger the occurrence of HEC. The aim of this study is to conduct an in-depth investigation of the association of people’s attitudes towards HEC with their locational, demographic, and socio-economic characteristics in BTR and its adjoining area by using Pearson’s bivariate chi-square test and binary logistic regression analysis. BTR is one of the constituent parts of Eastern Doors Elephant Reserve (EDER). We interviewed 500 respondents to understand their perceptions of HEC and investigated their locational, demographic, and socio-economic characteristics, including location of village, gender, age, ethnicity, religion, caste, poverty level, education level, primary occupation, secondary occupation, household type, and source of firewood. The results indicate that respondents who live in enclave forest villages (EFVs), peripheral forest villages (PFVs), corridor villages (CVs), or forest and corridor villages (FCVs), and who are mainly male, aged 18–48 years, engaged in agriculture, and living in kancha and mixed houses, are more likely to witness HEC. In addition, respondents who are illiterate or have only primary education are more likely to regard the elephant as a main problematic animal around their villages and to refuse to participate in elephant conservation activity. For the sake of a sustainable environment for both human beings and wildlife, people’s attitudes towards elephants must become friendlier and more prudent, so that the two communities can live in harmony.
In system fault detection algorithms, the false alarm rate and missed detection rate generated by the residual chi-square test can affect the stability of filters. The paper proposes a fault detection algorithm based on a sequential residual chi-square test and applies it to fault detection in an integrated navigation system. The simulation results show that the algorithm can accurately detect the fault information of the global positioning system (GPS), eliminate the influence of false alarms and missed detections on the filter, and enhance the fault tolerance of integrated navigation systems.
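As background for the abstract above, the basic (non-sequential) residual chi-square test can be sketched as follows. This is a minimal illustration under stated assumptions, not the sequential algorithm of the paper: the filter residual is taken to be two-dimensional with a diagonal covariance, so the normalized squared residual has a chi-square distribution with 2 degrees of freedom under the no-fault hypothesis. All function names here are hypothetical.

```python
import math


def chi2_threshold_2dof(false_alarm_rate):
    """Detection threshold for a chi-square variable with 2 degrees of freedom.

    For 2 degrees of freedom the survival function is exp(-x / 2),
    so the threshold has the closed form -2 * ln(alpha).
    """
    return -2.0 * math.log(false_alarm_rate)


def residual_chi2(residual, variances):
    """Normalized squared residual, assuming a diagonal residual covariance."""
    return sum(r * r / v for r, v in zip(residual, variances))


def is_fault(residual, variances, false_alarm_rate=0.05):
    """Flag a fault when the statistic exceeds the 2-dof threshold."""
    if len(residual) != 2 or len(variances) != 2:
        raise ValueError("this sketch handles two-dimensional residuals only")
    return residual_chi2(residual, variances) > chi2_threshold_2dof(false_alarm_rate)


# A residual of one standard deviation per axis passes; three per axis is flagged.
print(is_fault([1.0, 1.0], [1.0, 1.0]))
print(is_fault([3.0, 3.0], [1.0, 1.0]))
```

Lowering `false_alarm_rate` raises the threshold, which reduces false alarms at the cost of more missed detections; that trade-off is the one the abstract says affects filter stability.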
In goodness-of-fit tests, Pearson's chi-squared test is one of the most widely used tools of formal statistical analysis. However, Pearson's chi-squared test depends on the partition of the sample space, and different constructions of the partition may lead to different conclusions. Based on an equiprobable partition of the sample space, a modified chi-squared test is proposed, together with a method for constructing it. As an application, the proposed test is used to test whether vectorial data come from a uniform distribution defined on the hypersphere. Simulation studies show that the modified chi-squared test is robust against different alternatives.
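The role of an equiprobable partition can be illustrated in the simplest, univariate setting. The sketch below is an assumption-laden illustration rather than the hyperspherical construction of the abstract above: it maps each observation through the hypothesized CDF, splits the unit interval into k cells of equal null probability, and computes Pearson's statistic with expected count n/k in every cell. The function name is hypothetical.

```python
import math


def equiprobable_chi2(sample, cdf, k):
    """Pearson chi-squared statistic on k cells of equal null probability.

    Each observation x is sent to u = cdf(x) in [0, 1]; cell j collects
    the values with j/k <= u < (j+1)/k, so every cell has expected count n/k.
    """
    n = len(sample)
    counts = [0] * k
    for x in sample:
        j = min(int(cdf(x) * k), k - 1)  # clamp u == 1.0 into the last cell
        counts[j] += 1
    expected = n / k
    stat = sum((c - expected) ** 2 for c in counts) / expected
    return stat, k - 1  # degrees of freedom for a fully specified null


# Evenly spread points on [0, 1) fit the uniform null perfectly.
grid = [(i + 0.5) / 100 for i in range(100)]
print(equiprobable_chi2(grid, lambda x: x, 10))

# The same helper works for any continuous null, e.g. a unit exponential.
print(equiprobable_chi2([0.1, 0.5, 0.9, 1.5, 2.5, 4.0], lambda x: 1 - math.exp(-x), 3))
```

Because every cell has the same expected count, the only remaining choice is the number of cells k, which removes the dependence on where the cell boundaries are drawn.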
In this paper, we generalize the proof that the Cochran statistic asymptotically follows a chi-square distribution to the case of a two-way ANOVA structure. While the construction of homogeneity test statistics usually resorts to determining the covariance matrix and its Moore-Penrose inverse, our approach avoids this step. We also show that the Cochran statistic in two-way ANOVA is equivalent to the conventional homogeneity test statistic. In particular, we show that it satisfies the invariance property. Finally, we conduct an empirical verification from a meta-analysis that confirms our theoretical results.
The principal component analysis (PCA) based chi-square test is more sensitive to subtle gross errors and has greater power to correctly detect gross errors than the classical chi-square test. However, the classical principal component test (PCT) is non-robust and can be very sensitive to one or more outliers. In this paper, a Huber-function-like robust weight factor is added to the collective chi-square test to eliminate the influence of gross errors on the PCT. In addition, the robust chi-square test is applied to the modified simultaneous estimation of gross error (MSEGE) strategy to detect and identify multiple gross errors. Simulation results show that the proposed robust test can effectively reduce the possibility of type II errors. Although adding the robust chi-square test to MSEGE does not obviously improve the power of multiple gross error identification, the proposed approach accounts for the influence of outliers on the hypothesis test statistic and is more reasonable.
Knowledge of an individual’s HIV/AIDS status provides a tool to reduce or avoid HIV transmission, spread, and mortality due to HIV-related illness. However, most people still do not know their HIV status because they are not willing to test for HIV/AIDS for various reasons. Hence the aim of this paper is to investigate the effects of various risk factors that are likely to influence the decision to ever test for HIV/AIDS. The data used in this paper were obtained from the Ghana Demographic and Health Survey (n = 1828 observations and 32 risk factors). We applied the chi-square test statistic and the logistic regression model to the data in order to study the effects of these risk factors on one’s decision to ever test for HIV. STATA version 14.1 and R version 3.5.2 were used to carry out the statistical analyses. Generally, the results show that education, especially higher education, significantly (OR = 0.53, 95% CI = 0.230, 0.837) increases the likelihood of ever testing for HIV. Also, the younger the age group, the higher the effect and significance in the likelihood of ever testing for HIV. We found that HIV-TB co-infection (OR = 0.53, 95% CI = 0.165, 0.893), use of a condom whenever one has sex (OR = 0.31, 95% CI = 0.054, 0.573), wealth index (OR = 0.46, 95% CI = 0.137, 0.791), awareness of HIV transmission during child delivery, and number of partners significantly affect HIV testing. Those with many partners are less likely (OR = -0.26, 95% CI = -0.504, -0.007) to ever test for HIV, and those who know that a healthy person may have HIV are more likely (OR = 0.41, 95% CI = 0.137, 0.679) to ever test for HIV. Age is the common significant risk factor for ever having tested for HIV across the 10 regions in Ghana. Resources should be allocated for more education on these significant risk factors in order to help in the fight against HIV-related health issues.
Genetic association studies usually apply the simple chi-square (χ²) test for testing association between a single-nucleotide polymorphism (SNP) and a particular phenotype, assuming the genotypes and phenotypes are independent. The conventional χ² test therefore does not consider the increased risk of an individual carrying an increasing number of the disease-responsible allele (a particular genotype). However, association tests should be performed with consideration of this disease risk according to the mode of inheritance (additive, dominant, recessive). The purpose of this paper is a practical demonstration of the two possible methods for considering such order or trends in contingency tables of genetic association studies using SNP genotype data. One method pools the genotypes, and the other scores the individual genotypes, based on the disease risk according to the inheritance pattern. The results show that the p-values obtained from both methods are similar for the dominant and recessive models. Other important features of the methods were also extracted using the SNP genotype data for different inheritance patterns.
Traditional distribution network planning relies on the professional knowledge of planners, especially when analyzing the correlations between the problems existing in the network and the crucial influencing factors. The inherent laws reflected by the historical data of the distribution network are ignored, which affects the objectivity of the planning scheme. In this study, to improve the efficiency and accuracy of distribution network planning, the characteristics of distribution network data were extracted using a data-mining technique, and correlation knowledge of existing problems in the network was obtained. A data-mining model based on correlation rules was established. The inputs of the model were the electrical characteristic indices screened using the gray correlation method. The Apriori algorithm was used to extract correlation knowledge from the operational data of the distribution network and obtain strong correlation rules. Lift (degree of promotion) and chi-square tests were used to verify the rationality of the strong correlation rules output by the model. In this study, the correlation between heavy load or overload problems of distribution network feeders in different regions and related characteristic indices was determined, and the confidence of the correlation rules was obtained. These results can provide an effective basis for the formulation of a distribution network planning scheme.
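The rule-verification step mentioned above can be illustrated with a small worked sketch. It assumes that "degree of promotion" refers to the standard lift measure of association-rule mining and that the chi-square check is the usual Pearson statistic on the 2x2 table of antecedent versus consequent; the function name and the example counts are hypothetical and do not come from the study.

```python
def rule_metrics(n, n_a, n_b, n_ab):
    """Support, confidence, lift and 2x2 Pearson chi-square for a rule A -> B.

    n    : total number of records
    n_a  : records containing the antecedent A
    n_b  : records containing the consequent B
    n_ab : records containing both A and B
    """
    support = n_ab / n
    confidence = n_ab / n_a
    lift = n_ab * n / (n_a * n_b)  # > 1 means A raises the chance of B

    # Cells of the 2x2 contingency table
    a = n_ab               # A and B
    b = n_a - n_ab         # A and not B
    c = n_b - n_ab         # not A and B
    d = n - n_a - n_b + n_ab  # neither
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return support, confidence, lift, chi2


# Hypothetical counts: 100 feeders, 40 with index A, 50 overloaded, 30 with both.
support, confidence, lift, chi2 = rule_metrics(100, 40, 50, 30)
CRITICAL_1DOF_05 = 3.841  # chi-square critical value, 1 degree of freedom, alpha = 0.05
print(support, confidence, lift, chi2, chi2 > CRITICAL_1DOF_05)
```

A rule with high confidence can still have lift near 1 when the consequent is common on its own, which is why lift and the chi-square statistic are useful as a second check on rules that Apriori reports as strong.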
It is well known that smart thermostats (STs) have become key devices in the implementation of smart homes; thus, they are considered primary elements for the control of electrical energy consumption in households. Moreover, energy consumption is drastically affected when end users select unsuitable STs or do not use the STs correctly. Furthermore, in the future, Mexico will face serious electrical energy challenges that can be largely resolved if end users operate the STs correctly. Hence, it is important to carry out an in-depth study and analysis of thermostats, focusing on the social aspects that influence their technological use and performance. This paper proposes the use of signal detection theory (SDT), fuzzy detection theory (FDT), and the chi-square (CS) test in order to understand the perceptions and beliefs of end users about the use of STs in Mexico. It reports these perceptions and beliefs for the selected thermostats and presents an in-depth discussion of the cognitive perceptions and beliefs of end users. Moreover, it shows why the expectations of end users about STs are not met. It also promotes the technological and social development of STs so that they are more readily accepted in complex electrical grids such as smart grids.
The classical chi-squared goodness-of-fit test assumes that the number of classes is fixed, in which case the test statistic has a limiting chi-square distribution under the null hypothesis. The case where the number of classes varies with the sample size has attracted increasing attention. However, in this situation there are no theoretical results on the asymptotic properties of the chi-squared test statistic. This paper proves the consistency of the chi-squared test with a varying number of classes under some conditions. The authors also give a convergence rate for the Kolmogorov-Smirnov distance between the test statistic and the corresponding chi-square distributed random variable. In addition, a real example and simulation results validate the theoretical result and the superiority of the chi-squared test with a varying number of classes.
Multiple dominant gear meshing frequencies are present in the vibration signals collected from gearboxes, and the conventional spiky features that represent initial gear fault conditions are usually difficult to detect. To solve this problem, we propose a new gearbox deterioration detection technique based on autoregressive modeling and hypothesis testing. A stationary autoregressive model was built using a normal vibration signal from each shaft. The established autoregressive model was then applied to process fault signals from each shaft of a two-stage gearbox. This paper investigates a combined technique that unites a time-varying autoregressive model and a two-sample Kolmogorov-Smirnov goodness-of-fit test to detect the deterioration of a gearing system with simultaneously variable shaft speed and variable load. The time-varying autoregressive model residuals representing both healthy and faulty gear conditions were compared with the original healthy time-synchronous average signals. Compared with the traditional kurtosis statistic, this technique for gearbox deterioration detection has shown significant advantages in highlighting the presence of incipient gear faults in all the different speed shafts involved in the meshing motion under variable conditions.
This study explored and reviewed the logistic regression (LR) model, a multivariable method for modeling the relationship between multiple independent variables and a categorical dependent variable, with emphasis on medical research. Thirty-seven research articles published between 2000 and 2018 that employed logistic regression as the main statistical tool, as well as six textbooks on logistic regression, were reviewed. Logistic regression concepts such as odds, odds ratio, logit transformation, logistic curve, assumptions, selection of dependent and independent variables, model fitting, reporting, and interpretation were presented. Upon perusing the literature, considerable deficiencies were found in both the use and reporting of LR. For many studies, the ratio of the number of outcome events to predictor variables (events per variable) was sufficiently small to call into question the accuracy of the regression model. Also, most studies did not report validation analysis, regression diagnostics, or goodness-of-fit measures, which authenticate the robustness of the LR model. Here, we demonstrate a good example of the application of the LR model using data obtained on a cohort of pregnant women and the factors that influence their decision to opt for caesarean delivery or vaginal birth. It is recommended that researchers be more rigorous and pay greater attention to guidelines concerning the use and reporting of LR models.
Funding (asymptotics of the chi-square statistic under type II error): National Natural Science Foundation of China (10571139).
Funding (goodness-of-fit tests for multi-dimensional copulas): supported by the Program of Introducing Talents of Disciplines to Universities of the Ministry of Education and State Administration of the Foreign Experts Affairs of China (the 111 Project, Grant No. B08048) and the Special Basic Research Fund for Methodology in Hydrology of the Ministry of Sciences and Technology of China (Grant No. 2011IM011000).
Funding (sequential residual chi-square fault detection): supported by the National Natural Science Foundation of China (60634030, 60702066), the Aerospace Science Foundation (20090853013), the Fundamental Research Foundation of NWPU (JC201015), and the Soaring Star of NWPU.
Abstract: In system fault detection algorithms, the false alarm rate and the missed detection rate produced by the residual chi-square test can affect the stability of filters. The paper proposes a fault detection algorithm based on a sequential residual chi-square test and applies it to fault detection in an integrated navigation system. The simulation results show that the algorithm can accurately detect fault information of the global positioning system (GPS), eliminate the influence of false alarms and missed detections on the filter, and enhance the fault tolerance of integrated navigation systems.
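As a rough illustration of the residual chi-square test that the abstract builds on, the sketch below normalizes a filter residual by its variance and compares the result with a chi-square threshold. It is not the paper's sequential algorithm, which the abstract does not spell out. The function names, the assumption of a diagonal residual covariance, and the example numbers are all hypothetical.

```python
# Minimal sketch of a residual chi-square fault check.
# Assumption: the residual (innovation) covariance is diagonal, so the
# statistic r^T S^-1 r reduces to a sum of squared, variance-scaled residuals.

CHI2_95_DF2 = 5.991  # 95% quantile of the chi-square distribution with 2 df


def residual_chi2(residual, variances):
    """Return sum(r_i^2 / s_i), which is chi-square distributed with
    len(residual) degrees of freedom when no fault is present."""
    return sum(r * r / v for r, v in zip(residual, variances))


def is_fault(residual, variances, threshold=CHI2_95_DF2):
    """Flag a fault when the statistic exceeds the chosen threshold."""
    return residual_chi2(residual, variances) > threshold


# Hypothetical two-dimensional residuals with variances 1.0 and 0.25.
normal_stat = residual_chi2([1.0, -0.5], [1.0, 0.25])   # 1.0 + 1.0
faulty_stat = residual_chi2([3.0, 1.0], [1.0, 0.25])    # 9.0 + 4.0
print(normal_stat, is_fault([1.0, -0.5], [1.0, 0.25]))
print(faulty_stat, is_fault([3.0, 1.0], [1.0, 0.25]))
```

The threshold trades false alarms against missed detections: raising it lowers the false alarm rate but lets smaller faults through, which is the tension the abstract describes.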
Funding: Foundation item: the Natural Science Foundation of Beijing (No. 1062001); Academic Human Resources Development in Institutions of Higher Learning Under the Jurisdiction of Beijing Municipality (No. 05006011200702). Acknowledgements: The authors cordially thank the Associate Editor and Reviewers for their constructive comments, which led to improvement of the manuscript. They are also very grateful to Prof. Adelaide Figueiredo for his help.
Abstract: In goodness-of-fit testing, Pearson's chi-squared test is one of the most widely used tools of formal statistical analysis. However, Pearson's chi-squared test depends on the partition of the sample space, and different partitions may lead to different conclusions. Based on an equiprobable partition of the sample space, a modified chi-squared test is proposed, together with a method for constructing it. As an application, the proposed test is used to test whether vectorial data come from a uniform distribution defined on the hypersphere. Simulation studies show that the modified chi-squared test is robust against different alternatives.
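The idea of an equiprobable partition can be sketched in one dimension: map each observation through the null CDF and split the unit interval into k cells of probability 1/k, so that every cell has the same expected count n/k. This is only an illustrative sketch, not the paper's construction on the hypersphere. The function names and the exponential example are assumptions.

```python
import math


def equiprobable_chi2(sample, cdf, k):
    """Pearson statistic over k cells that each have probability 1/k
    under the null distribution with the given CDF."""
    n = len(sample)
    counts = [0] * k
    for x in sample:
        u = cdf(x)                           # value in [0, 1] under the null
        idx = min(int(u * k), k - 1)         # index of the equiprobable cell
        counts[idx] += 1
    expected = n / k                         # same expected count in every cell
    stat = sum((c - expected) ** 2 / expected for c in counts)
    return stat, counts


def exp_cdf(x, rate=1.0):
    """CDF of the exponential distribution (hypothetical null model)."""
    return 1.0 - math.exp(-rate * x)


# Deterministic sample: exact quantiles of Exp(1), so the cells fill evenly.
n = 200
sample = [-math.log(1.0 - (i + 0.5) / n) for i in range(n)]
stat, counts = equiprobable_chi2(sample, exp_cdf, k=10)
print(stat, counts)
```

Under the null hypothesis the statistic is compared with a chi-square distribution with k - 1 degrees of freedom (fewer if parameters are estimated from the data).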
Abstract: In this paper, we generalize the proof that the Cochran statistic asymptotically follows a chi-square distribution to the case of a two-way ANOVA structure. While the construction of homogeneity test statistics usually resorts to determining the covariance matrix and its inverse, the Moore-Penrose matrix, our approach avoids this step. We also show that the Cochran statistic in two-way ANOVA is equivalent to conventional homogeneity test statistics. In particular, we show that it satisfies the invariance property. Finally, we conduct an empirical verification using a meta-analysis that confirms our theoretical results.
Funding: The National Natural Science Foundation of China (No. 60504033)
Abstract: The principal component analysis (PCA) based chi-square test is more sensitive to subtle gross errors and has greater power to correctly detect gross errors than the classical chi-square test. However, the classical principal component test (PCT) is non-robust and can be very sensitive to one or more outliers. In this paper, a robust weight factor similar to the Huber function was added to the collective chi-square test to eliminate the influence of gross errors on the PCT. Meanwhile, the robust chi-square test was applied to the modified simultaneous estimation of gross error (MSEGE) strategy to detect and identify multiple gross errors. Simulation results show that the proposed robust test can effectively reduce the possibility of type Ⅱ errors. Adding the robust chi-square test to MSEGE does not obviously improve the power of multiple gross error identification; however, the proposed approach considers the influence of outliers on the hypothesis test statistic and is more reasonable.
Abstract: Knowledge of an individual's HIV/AIDS status provides a tool to reduce or avoid HIV transmission, spread, and mortality due to HIV-related illness. However, most people still do not know their HIV status because they are unwilling to test for HIV/AIDS for various reasons. Hence, the aim of this paper is to investigate the effects of various risk factors that are likely to influence the decision to ever test for HIV/AIDS. The data used in this paper were obtained from the Ghana Demographic and Health Survey (n = 1828 observations and 32 risk factors). We applied the chi-square test statistic and the logistic regression model to the data in order to study the effects of these risk factors on one's decision to ever test for HIV. STATA version 14.1 and R version 3.5.2 were used to carry out the statistical analyses. Generally, the results show that education, especially higher education, significantly (OR = 0.53, 95% CI = 0.230, 0.837) increases the likelihood of ever testing for HIV. Also, the younger the age group, the higher the effect and significance in the likelihood of ever testing for HIV. We found that HIV-TB co-infection (OR = 0.53, 95% CI = 0.165, 0.893), use of a condom whenever one has sex (OR = 0.31, 95% CI = 0.054, 0.573), wealth index (OR = 0.46, 95% CI = 0.137, 0.791), awareness of HIV transmission during child delivery, and number of partners significantly affect HIV testing. Those with many partners are less likely (OR = -0.26, 95% CI = -0.504, -0.007) to ever test for HIV, and those who know that a healthy person may have HIV are more likely (OR = 0.41, 95% CI = 0.137, 0.679) to ever test for HIV. Age is the common significant risk factor for ever having tested for HIV across the 10 regions in Ghana. Resources should be allocated for more education on these significant risk factors in order to help in the fight against HIV and related health issues.
Abstract: Genetic association studies usually apply the simple chi-square (χ²) test for testing association between a single-nucleotide polymorphism (SNP) and a particular phenotype, assuming the genotypes and phenotypes are independent. The conventional χ² test therefore does not consider the increased risk of an individual carrying an increasing number of disease-responsible alleles (a particular genotype). However, association tests should be performed with consideration of this disease risk according to the mode of inheritance (additive, dominant, recessive). The purpose of this paper is a practical demonstration of two possible methods for considering such order or trends in contingency tables of genetic association studies using SNP genotype data. One method pools the genotypes, and the other scores the individual genotypes, based on the disease risk according to the inheritance pattern. The results show that the p-values obtained from both methods are similar for the dominant and recessive models. Other important features of the methods were also extracted using the SNP genotype data for different inheritance patterns.
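The two approaches named in the abstract can be sketched side by side. Pooling collapses the 2 × 3 case-control table into a 2 × 2 table before an ordinary Pearson test, while scoring assigns a numeric score to each genotype and applies a trend test. The sketch below uses the Cochran-Armitage trend statistic for the scoring method, which is an assumption, since the abstract does not name the test. The genotype counts and function names are likewise hypothetical.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def pooled_chi2(cases, controls, model):
    """Pool genotype counts ordered (AA, Aa, aa), with 'a' the risk allele,
    into exposed vs. unexposed, then run the 2 x 2 test."""
    exposed = {"dominant": (1, 2), "recessive": (2,)}[model]
    a = sum(cases[i] for i in exposed)
    c = sum(controls[i] for i in exposed)
    return chi2_2x2(a, sum(cases) - a, c, sum(controls) - c)


def trend_chi2(cases, controls, scores):
    """Cochran-Armitage trend statistic (1 df) with per-genotype scores."""
    totals = [r + s for r, s in zip(cases, controls)]
    n, r = sum(totals), sum(cases)
    sr = sum(w * x for w, x in zip(scores, cases))
    sn = sum(w * t for w, t in zip(scores, totals))
    s2n = sum(w * w * t for w, t in zip(scores, totals))
    return n * (n * sr - r * sn) ** 2 / (r * (n - r) * (n * s2n - sn ** 2))


def p_value_1df(stat):
    """Upper-tail probability of a chi-square variable with 1 df."""
    return math.erfc(math.sqrt(stat / 2.0))


# Hypothetical genotype counts (AA, Aa, aa) for illustration only.
cases, controls = [30, 50, 20], [50, 40, 10]
pooled = pooled_chi2(cases, controls, "dominant")
scored = trend_chi2(cases, controls, scores=(0, 1, 1))   # dominant scoring
additive = trend_chi2(cases, controls, scores=(0, 1, 2))  # additive scoring
print(pooled, scored, p_value_1df(pooled), additive)
```

With binary scores such as (0, 1, 1) or (0, 0, 1), this form of the trend statistic reduces algebraically to the pooled 2 × 2 Pearson statistic, which is consistent with the similar p-values the abstract reports for the dominant and recessive models. Only the additive scoring (0, 1, 2) has no pooled counterpart.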
Funding: Supported by the Science and Technology Project of China Southern Power Grid (GZHKJXM20210043-080041KK52210002).
Abstract: Traditional distribution network planning relies on the professional knowledge of planners, especially when analyzing the correlations between the problems existing in the network and the crucial influencing factors. The inherent laws reflected by the historical data of the distribution network are ignored, which affects the objectivity of the planning scheme. In this study, to improve the efficiency and accuracy of distribution network planning, the characteristics of distribution network data were extracted using a data-mining technique, and correlation knowledge of existing problems in the network was obtained. A data-mining model based on correlation rules was established. The inputs of the model were the electrical characteristic indices screened using the gray correlation method. The Apriori algorithm was used to extract correlation knowledge from the operational data of the distribution network and obtain strong correlation rules. Lift (degree of promotion) and chi-square tests were used to verify the rationality of the strong correlation rules output by the model. The correlation between heavy-load or overload problems of distribution network feeders in different regions and the related characteristic indices was determined, and the confidence of the correlation rules was obtained. These results can provide an effective basis for the formulation of a distribution network planning scheme.
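To make the rule-verification step concrete, the sketch below computes support, confidence, lift, and a Pearson chi-square statistic for a single rule A → B from transaction counts. It does not reproduce the paper's model or data. The function name and all counts are hypothetical.

```python
def rule_metrics(n, n_a, n_b, n_ab):
    """Support, confidence, lift, and chi-square for the rule A -> B.

    n    : total number of records
    n_a  : records containing the antecedent A
    n_b  : records containing the consequent B
    n_ab : records containing both A and B
    """
    support = n_ab / n
    confidence = n_ab / n_a
    lift = confidence / (n_b / n)          # > 1 suggests positive association
    # 2 x 2 contingency table: rows A / not A, columns B / not B
    a, b = n_ab, n_a - n_ab
    c, d = n_b - n_ab, n - n_a - n_b + n_ab
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return support, confidence, lift, chi2


# Hypothetical counts: A = "high load index", B = "feeder overload".
support, confidence, lift, chi2 = rule_metrics(n=1000, n_a=200, n_b=250, n_ab=100)
print(support, confidence, lift, chi2)
```

A rule with high confidence can still be uninformative if the consequent is common anyway. Lift corrects for that, and the chi-square statistic (1 degree of freedom for a 2 × 2 table) indicates whether the departure from independence is larger than chance would explain.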
Abstract: It is well known that smart thermostats (STs) have become key devices in the implementation of smart homes; thus, they are considered primary elements for the control of electrical energy consumption in households. Moreover, energy consumption is drastically affected when end users select unsuitable STs or do not use the STs correctly. Furthermore, Mexico will in future face serious electrical energy challenges that could be considerably eased if end users operate STs correctly. Hence, it is important to carry out an in-depth study and analysis of thermostats, focusing on the social aspects that influence their technological use and performance. This paper proposes the use of signal detection theory (SDT), fuzzy detection theory (FDT), and the chi-square (CS) test in order to understand the perceptions and beliefs of end users about the use of STs in Mexico. This paper extensively shows the perceptions and beliefs about the selected thermostats in Mexico. Besides, it presents an in-depth discussion of the cognitive perceptions and beliefs of end users. Moreover, it shows why the expectations of end users about STs are not met. It also promotes the technological and social development of STs so that they are more readily accepted in complex electrical grids such as smart grids.
Funding: Supported by the Natural Science Foundation of China under Grant Nos. 11071022, 11028103, 11231010, 11471223, BCMIIS; the Beijing Municipal Educational Commission Foundation under Grant Nos. KZ201410028030, KM201210028005; Jishou University Subject in 2014 (No. 14JD035)
Abstract: The classical chi-squared goodness-of-fit test assumes that the number of classes is fixed, in which case the test statistic has a limiting chi-square distribution under the null hypothesis. Tests in which the number of classes varies with the sample size have attracted more and more attention. However, in this situation, there are no theoretical results on the asymptotic properties of such a chi-squared test statistic. This paper proves the consistency of the chi-squared test with a varying number of classes under some conditions. The authors also give a convergence rate for the Kolmogorov-Smirnov distance between the test statistic and the corresponding chi-square distributed random variable. In addition, a real example and simulation results validate the theoretical result and show the superiority of the chi-squared test with a varying number of classes.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 50675232), the Key Project of the Ministry of Education of China, and the Chongqing Municipal Natural Science Key Foundation of China (Grant No. 2007BA6021)
Abstract: Multiple dominant gear meshing frequencies are present in the vibration signals collected from gearboxes, and the conventional spiky features that represent initial gear fault conditions are usually difficult to detect. To solve this problem, we propose a new gearbox deterioration detection technique based on autoregressive modeling and hypothesis testing. A stationary autoregressive model was built using a normal vibration signal from each shaft. The established autoregressive model was then applied to process fault signals from each shaft of a two-stage gearbox. This paper investigates a combined technique that unites a time-varying autoregressive model and a two-sample Kolmogorov-Smirnov goodness-of-fit test to detect the deterioration of a gearing system under simultaneously variable shaft speed and variable load. The time-varying autoregressive model residuals representing both healthy and faulty gear conditions were compared with the original healthy time-synchronous average signals. Compared with the traditional kurtosis statistic, this technique for gearbox deterioration detection has shown significant advantages in highlighting the presence of incipient gear faults in all of the different-speed shafts involved in the meshing motion under variable conditions.
Abstract: This study explored and reviewed the logistic regression (LR) model, a multivariable method for modeling the relationship between multiple independent variables and a categorical dependent variable, with emphasis on medical research. Thirty-seven research articles published between 2000 and 2018 that employed logistic regression as the main statistical tool, as well as six textbooks on logistic regression, were reviewed. Logistic regression concepts such as odds, odds ratio, logit transformation, logistic curve, assumptions, selection of dependent and independent variables, model fitting, reporting, and interpretation were presented. Upon perusing the literature, considerable deficiencies were found in both the use and reporting of LR. For many studies, the ratio of the number of outcome events to predictor variables (events per variable) was sufficiently small to call into question the accuracy of the regression model. Also, most studies did not report validation analysis, regression diagnostics, or goodness-of-fit measures, which authenticate the robustness of the LR model. Here, we demonstrate a good example of the application of the LR model using data obtained on a cohort of pregnant women and the factors that influence their decision to opt for caesarean delivery or vaginal birth. It is recommended that researchers be more rigorous and pay greater attention to guidelines concerning the use and reporting of LR models.