Imagery assessment is an efficient method for detecting craniofacial anomalies. A cephalometric landmark matching approach may help in orthodontic diagnosis, craniofacial growth assessment, and treatment planning. Automatic landmark matching and anomaly detection help overcome the limitations of manual labelling and optimize preoperative planning of maxillofacial surgery. The aim of this study was to develop an accurate cephalometric landmark matching method as well as an automatic system for classifying anatomical anomalies. First, the Active Appearance Model (AAM) was used for the matching process. This process was carried out by the Ant Colony Optimization (ACO) algorithm enriched with proximity information. Then, the maxillofacial anomalies were classified using the Support Vector Machine (SVM). The experiments were conducted on X-ray cephalograms of 400 patients, where the ground truth was produced by two experts. The framework achieved a landmark matching error (LE) of 0.50 ± 1.04 and successful landmark matching of 89.47% in the 2 mm and 3 mm ranges and of 100% in the 4 mm range. The classification of anomalies achieved an accuracy of 98.75%. Compared to previous work, the proposed approach is simpler and has a comparable range of acceptable matching cost and anomaly classification. Results have also shown that it outperformed the K-nearest neighbors (KNN) classifier.
The Internet service provider (ISP) is the heart of any country’s Internet infrastructure and plays an important role in connecting to the World Wide Web. Internet exchange point (IXP) allows the interconnection of two or more separate network infrastructures. All Internet traffic entering a country should pass through its IXP. Thus, it is an ideal location for performing malicious traffic analysis. Distributed denial of service (DDoS) attacks are becoming a more serious daily threat. Malicious actors in DDoS attacks control numerous infected machines known as botnets. Botnets are used to send numerous fake requests to overwhelm the resources of victims and make them unavailable for some periods. To date, such attacks present a major devastating security threat on the Internet. This paper proposes an effective and efficient machine learning (ML)-based DDoS detection approach for the early warning and protection of the Saudi Arabia Internet exchange point (SAIXP) platform. The effectiveness and efficiency of the proposed approach are verified by selecting an accurate ML method with a small number of input features. A chi-square method is used for feature selection because it is easier to compute than other methods, and it does not require any assumption about feature distribution values. Several ML methods are assessed using holdout and 10-fold tests on a public large-size dataset. The experiments showed that the performance of the decision tree (DT) classifier achieved a high accuracy result (99.98%) with a small number of features (10 features). The experimental results confirm the applicability of using DT and chi-square for DDoS detection and early warning in SAIXP.
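The chi-square feature selection step mentioned in this abstract can be illustrated roughly as follows. This is a minimal sketch, not the authors' pipeline: it assumes the features have already been discretized, and the feature names (`syn_flag`, `port_parity`), the toy data, and the helper names `chi_square_score` and `select_top_k` are hypothetical. Each feature is scored by the Pearson chi-square statistic between its values and the class label, and the highest-scoring features are kept.

```python
from collections import Counter


def chi_square_score(feature_values, labels):
    """Pearson chi-square statistic between one categorical feature and the class label."""
    n = len(labels)
    joint = Counter(zip(feature_values, labels))
    feature_counts = Counter(feature_values)
    label_counts = Counter(labels)
    score = 0.0
    for f, f_count in feature_counts.items():
        for c, c_count in label_counts.items():
            # Expected count if the feature and the label were independent.
            expected = f_count * c_count / n
            score += (joint[(f, c)] - expected) ** 2 / expected
    return score


def select_top_k(columns, labels, k):
    """Return the names of the k features with the largest chi-square scores."""
    scores = {name: chi_square_score(values, labels) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical discretized traffic features; not the dataset used in the paper.
columns = {
    "syn_flag": [1, 1, 1, 1, 0, 0, 0, 0],
    "port_parity": [0, 1, 0, 1, 0, 1, 0, 1],
}
labels = ["attack"] * 4 + ["benign"] * 4
selected = select_top_k(columns, labels, k=1)
```

In this toy data `syn_flag` separates the classes perfectly while `port_parity` is independent of the label, so only the former is selected. The retained features would then be passed to a classifier such as a decision tree.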
Abstract: This study introduces a new classifier designed to address the limitations of conventional classifiers such as K-nearest neighbor (KNN), random forest (RF), decision tree (DT), and support vector machine (SVM) for arrhythmia detection. The proposed classifier uses the Chi-square distance as its primary metric, providing a specialized and original approach for precise arrhythmia detection. To optimize feature selection and refine the classifier’s performance, particle swarm optimization (PSO) is integrated with the Chi-square distance as a fitness function. This integration substantially improves accuracy for arrhythmia detection. Experimental results demonstrate the efficacy of the proposed method, which achieves an accuracy of 98% with PSO, compared with 89% without optimization. The classifier outperforms machine learning (ML) and deep learning (DL) techniques, underscoring its reliability in arrhythmia classification. The promising results make it an effective method to support both the academic and medical communities, offering an advanced and precise solution for arrhythmia detection in electrocardiogram (ECG) data.
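To make the idea of a Chi-square-distance classifier concrete, here is a minimal sketch under stated assumptions. The abstract does not give the exact distance formula or decision rule, so this uses one common symmetric form of the Chi-square distance for non-negative vectors and a plain nearest-neighbor rule. The function names and toy data are hypothetical, and the PSO feature-selection stage is not shown.

```python
def chi_square_distance(x, y, eps=1e-12):
    """Chi-square distance between two non-negative feature vectors.

    Uses the common form 0.5 * sum((x_i - y_i)^2 / (x_i + y_i)); eps avoids
    division by zero when both components are zero.
    """
    return 0.5 * sum((a - b) ** 2 / (a + b + eps) for a, b in zip(x, y))


def predict_nearest(train_x, train_y, query):
    """Return the label of the training vector closest to `query`."""
    distances = [chi_square_distance(row, query) for row in train_x]
    best = min(range(len(train_x)), key=distances.__getitem__)
    return train_y[best]


# Hypothetical toy features (e.g. normalized histogram bins), not ECG data.
train_x = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]]
train_y = ["normal", "arrhythmia"]
label = predict_nearest(train_x, train_y, [0.6, 0.3, 0.1])
```

A PSO-based selector, as described in the abstract, could search over binary feature masks and evaluate each mask with a fitness built on this distance; that part is left out here because the abstract gives no details.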
Funding: Supported by the Florida Center for Advanced Analytics and Data Science, funded by Ernesto.Net (under the Algorithms for Good Grant).
Abstract: Diabetes mellitus is a metabolic disease that is ranked among the top 10 causes of death by the World Health Organization. During the last few years, an alarming increase has been observed worldwide, with a 70% rise in the disease since 2000 and an 80% rise in male deaths. If untreated, it results in complications of many vital organs of the human body, which may lead to fatality. Early detection of diabetes is important for starting timely treatment. This study introduces a methodology for the classification of diabetic and normal people using an ensemble machine learning model and feature fusion of Chi-square and principal component analysis. An ensemble model, the logistic tree classifier (LTC), is proposed, which combines logistic regression and an extra tree classifier through a soft voting mechanism. Experiments are also performed using several well-known machine learning algorithms to analyze their performance, including logistic regression, extra tree classifier, AdaBoost, Gaussian naive Bayes, decision tree, random forest, and k-nearest neighbor. In addition, several experiments are carried out using principal component analysis (PCA) and Chi-square (Chi-2) features to analyze the influence of feature selection on the performance of machine learning classifiers. Results indicate that Chi-2 features perform better than both PCA features and the original features. However, the highest accuracy is obtained when the proposed ensemble model LTC is used with the proposed feature fusion framework, which achieves an accuracy score of 0.85, the highest among the available approaches for diabetes prediction. In addition, a statistical t-test indicates the statistical significance of the proposed approach over other approaches.
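The soft voting mechanism named in this abstract can be sketched as follows. This is an illustration of soft voting in general, not the authors' LTC implementation: the probability values are made up, and `soft_vote` is a hypothetical helper. Each model contributes a class-probability vector, the vectors are averaged (optionally with weights), and the class with the largest averaged probability is predicted.

```python
def soft_vote(probabilities_per_model, weights=None):
    """Average class-probability vectors from several models and pick the argmax."""
    n_models = len(probabilities_per_model)
    weights = weights or [1.0] * n_models
    total = sum(weights)
    n_classes = len(probabilities_per_model[0])
    averaged = [
        sum(w * p[c] for w, p in zip(weights, probabilities_per_model)) / total
        for c in range(n_classes)
    ]
    predicted = max(range(n_classes), key=averaged.__getitem__)
    return predicted, averaged


# Hypothetical outputs for one patient: [P(normal), P(diabetic)].
logistic_regression_proba = [0.60, 0.40]
extra_trees_proba = [0.20, 0.80]
predicted_class, averaged = soft_vote([logistic_regression_proba, extra_trees_proba])
```

Here the two models disagree, and the more confident extra-trees output decides the vote. That is the main difference from hard (majority) voting, which would see a tie.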
Abstract: In this paper, the singular value decomposition (SVD) formula and a modified Chi-square solution are provided, and the modified Chi-square technique is combined with an FT-IR instrument to control a biochemical reaction process. Using the modified Chi-square technique, the unknown concentrations of reactants and products in test samples withdrawn from the process are determined. The technique avoids the need for the spectral data to conform to Beer’s Law, and the best spectral range is determined automatically.
Abstract: The application of frequency distribution statistics to data provides objective means to assess the nature of the data distribution and the viability of numerical models that are used to visualize and interpret data. Two commonly used tools are the kernel density estimation and the reduced chi-squared statistic used in combination with a weighted mean. Due to the wide applicability of these tools, we present a Java-based computer application called KDX to facilitate the visualization of data and the utilization of these numerical tools.
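The pairing of a weighted mean with the reduced chi-squared statistic can be sketched as follows. This shows the standard inverse-variance formulas and is not code from KDX. It assumes independent measurements with known 1-sigma uncertainties, and the function name and sample values are hypothetical.

```python
import math


def weighted_mean_and_reduced_chi2(values, sigmas):
    """Inverse-variance weighted mean, its 1-sigma uncertainty, and reduced chi-squared."""
    weights = [1.0 / s ** 2 for s in sigmas]
    total_weight = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total_weight
    mean_sigma = math.sqrt(1.0 / total_weight)
    chi2 = sum(w * (v - mean) ** 2 for w, v in zip(weights, values))
    dof = len(values) - 1  # one parameter (the mean) is estimated from the data
    return mean, mean_sigma, chi2 / dof


# Hypothetical measurements with equal 1-sigma uncertainties.
mean, mean_sigma, reduced_chi2 = weighted_mean_and_reduced_chi2(
    [10.0, 12.0, 14.0], [2.0, 2.0, 2.0]
)
```

A reduced chi-squared close to 1 suggests that the scatter of the data is consistent with the stated uncertainties. Values well above 1 point to excess scatter, and values well below 1 to overestimated uncertainties.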
Funding: The National Natural Science Foundation of China (10571139).
Abstract: We study the asymptotics of the chi-square statistic under type II error. By the contraction principle, the large deviations and moderate deviations are obtained, and the rate function of the moderate deviations, which is a quadratic function, can be calculated explicitly.
Abstract: We describe two new derivations of the chi-square distribution. The first derivation uses the induction method, which requires only a single integral to calculate. The second derivation uses the Laplace transform and requires minimal assumptions. The new derivations are compared with established derivations, such as those by convolution, moment generating function, and Bayesian inference. Chi-square testing has seen many applications in physics and other fields. We describe a unique version of the chi-square test in which both the variance and the location are tested, and which is then applied to environmental data. The chi-square test is used to judge whether a laboratory method is capable of detecting gross alpha and beta radioactivity in drinking water for regulatory monitoring to protect the health of the population. A case of a failure of the chi-square test and its amelioration are described. The chi-square test is compared to and supplemented by the t-test.
Abstract: In large sample studies where distributions may be skewed and not readily transformed to symmetry, it may be of greater interest to compare different distributions in terms of percentiles rather than means. For example, it may be more informative to compare two or more populations with respect to their within-population distributions by testing the hypothesis that their corresponding 10th, 50th, and 90th percentiles are equal. As a generalization of the median test, the proposed test statistic is asymptotically distributed as Chi-square with degrees of freedom dependent upon the number of percentiles tested and the constraints of the null hypothesis. Results from simulation studies are used to validate the nominal 0.05 significance level under the null hypothesis and to assess asymptotic power properties that are suitable for testing equality of percentile profiles against selected profile discrepancies for a variety of underlying distributions. A pragmatic example is provided to illustrate the comparison of the percentile profiles for four body mass index distributions.
Abstract: A new six-parameter continuous distribution called the Generalized Kumaraswamy Generalized Power Gompertz (GKGPG) distribution is proposed in this study, and a graphical illustration of its probability density function and cumulative distribution function is presented. The statistical features of the GKGPG distribution are systematically derived and studied. The model parameters are estimated by the method of maximum likelihood, both in the absence of censoring and under right censoring. The test statistic for right-censored data, the criteria test for the GKGPG distribution, the estimated matrices Ŵ, Ĉ, and Ĝ, and the criteria test $Y_n^2$, alongside the quadratic form of the test statistic, are derived. Mean simulated values of the maximum likelihood estimates and their corresponding mean squared errors are presented and confirmed to agree closely with the true parameter values. Simulated levels of significance for the $Y_n^2(\gamma)$ test for the GKGPG model were recorded against their theoretical values. We conclude that the null hypothesis that the simulated samples are fitted by the GKGPG distribution is widely validated for the different levels of significance considered. From the summary of the results for a dataset on the strength of a specific type of braided cord, it is observed that the proposed GKGPG model fits the data set at a significance level ε = 0.05.
Abstract: Zero-inflated distributions are common in statistical problems where there is interest in testing homogeneity of two or more independent groups. Often, the underlying distribution that has an inflated number of zero-valued observations is asymmetric, and its functional form may not be known or easily characterized. In this case, comparisons of the groups in terms of their respective percentiles may be appropriate, as these estimates are nonparametric and more robust to outliers and other irregularities. The median test is often used to compare distributions with similar but asymmetric shapes but may be uninformative when there are excess zeros or dissimilar shapes. For zero-inflated distributions, it is useful to compare the distributions with respect to their proportion of zeros, coupled with the comparison of percentile profiles for the observed non-zero values. A simple chi-square test for simultaneous testing of these two components is proposed, applicable to both continuous and discrete data. Results of simulation studies are reported to summarize empirical power under several scenarios. We give recommendations for the minimum sample size that is necessary to achieve suitable test performance in specific examples.
Abstract: Human-elephant conflict (HEC) is an alarming issue that has attracted the attention of environmentalists and policy makers. The rising conflict between human beings and wild elephants is common in Buxa Tiger Reserve (BTR) and its adjoining area in West Bengal State, India, making the area volatile. People’s attitudes towards elephant conservation activity are crucial for reducing HEC, because people’s proximity to wild elephants’ habitat can trigger the occurrence of HEC. The aim of this study is to conduct an in-depth investigation of the association of people’s attitudes towards HEC with their locational, demographic, and socio-economic characteristics in BTR and its adjoining area by using Pearson’s bivariate chi-square test and binary logistic regression analysis. BTR is one of the constituent parts of Eastern Doors Elephant Reserve (EDER). We interviewed 500 respondents to understand their perceptions of HEC and investigated their locational, demographic, and socio-economic characteristics, including location of village, gender, age, ethnicity, religion, caste, poverty level, education level, primary occupation, secondary occupation, household type, and source of firewood. The results indicate that respondents who live in enclave forest villages (EFVs), peripheral forest villages (PFVs), corridor villages (CVs), or forest and corridor villages (FCVs), and who are mainly male, aged 18–48 years, engaged in agriculture, and living in kancha and mixed houses, are more likely to witness HEC. In addition, respondents who are illiterate or have only primary education are more likely to regard the elephant as a main problematic animal around their villages and to refuse to participate in elephant conservation activity. For the sake of a sustainable environment for both human beings and wildlife, people’s attitudes towards elephants must be friendly in a more prudent way, so that the two communities can live in harmony.
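Pearson's chi-square test of association, as used in this study, can be sketched on a contingency table as follows. The counts below are invented for illustration and are not the survey data; the function names are hypothetical. No continuity correction is applied, and the p-value helper covers only the one-degree-of-freedom case (a 2 x 2 table), where the chi-square tail probability equals erfc(sqrt(x / 2)).

```python
import math


def pearson_chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, dof


def p_value_one_dof(statistic):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))


# Hypothetical counts: rows = education level (primary or less, above primary),
# columns = willing / unwilling to join conservation activity.
table = [[30, 70], [60, 40]]
statistic, dof = pearson_chi_square(table)
p_value = p_value_one_dof(statistic)
```

A small p-value indicates that the row and column variables are unlikely to be independent. It says nothing about the direction or strength of the association, which is why the study pairs the test with logistic regression.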
Funding: Supported by the Science and Technology Project of China Southern Power Grid (GZHKJXM20210043-080041KK52210002).
Abstract: Traditional distribution network planning relies on the professional knowledge of planners, especially when analyzing the correlations between the problems existing in the network and the crucial influencing factors. The inherent laws reflected by the historical data of the distribution network are ignored, which affects the objectivity of the planning scheme. In this study, to improve the efficiency and accuracy of distribution network planning, the characteristics of distribution network data were extracted using a data-mining technique, and correlation knowledge of existing problems in the network was obtained. A data-mining model based on correlation rules was established. The inputs of the model were the electrical characteristic indices screened using the gray correlation method. The Apriori algorithm was used to extract correlation knowledge from the operational data of the distribution network and obtain strong correlation rules. Lift (degree of promotion) and chi-square tests were used to verify the rationality of the strong correlation rules output by the model. In this study, the correlation between heavy load or overload problems of distribution network feeders in different regions and related characteristic indices was determined, and the confidence of the correlation rules was obtained. These results can provide an effective basis for the formulation of a distribution network planning scheme.
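The rule checks mentioned in this abstract can be sketched for a single candidate rule as follows. This is an illustration only: it assumes that "degree of promotion" corresponds to the usual lift measure, it does not implement Apriori itself, and the item names, the transactions, and the `rule_metrics` helper are hypothetical. It also assumes that both the antecedent and the consequent occur in some but not all transactions, so that no expected count is zero.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence, lift, and 2x2 chi-square for the rule antecedent -> consequent."""
    n = len(transactions)
    a = sum(1 for t in transactions if antecedent <= t)
    b = sum(1 for t in transactions if consequent <= t)
    ab = sum(1 for t in transactions if (antecedent | consequent) <= t)
    support = ab / n
    confidence = ab / a
    lift = confidence / (b / n)
    # 2x2 contingency counts: antecedent present/absent vs. consequent present/absent.
    observed = [[ab, a - ab], [b - ab, n - a - b + ab]]
    rows = [a, n - a]
    cols = [b, n - b]
    chi2 = sum(
        (observed[i][j] - rows[i] * cols[j] / n) ** 2 / (rows[i] * cols[j] / n)
        for i in range(2)
        for j in range(2)
    )
    return support, confidence, lift, chi2


# Hypothetical feeder records, each a set of discretized conditions.
transactions = [
    {"high_load_rate", "overload"},
    {"high_load_rate", "overload"},
    {"high_load_rate", "overload"},
    {"high_load_rate"},
    {"overload"},
    {"normal"},
    {"normal"},
    {"normal"},
]
support, confidence, lift, chi2 = rule_metrics(
    transactions, {"high_load_rate"}, {"overload"}
)
```

A lift above 1 means the consequent is more frequent when the antecedent holds than it is overall. The chi-square statistic then gives a rough check on whether that difference could be due to chance.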
Funding: The authors extend their appreciation to the Deputyship for Research & Innovation, Ministry of Education in Saudi Arabia, for funding this research work through Project Number WE-44-0033.
Abstract: Polycystic ovarian syndrome (PCOS) is one of the significant health issues affecting women; it impairs fertility and leads to serious health concerns. Timely screening for PCOS can therefore help in the process of recovery. Because the condition is difficult to detect, a method that aids doctors in screening is needed. This research aimed to determine whether the detection of PCOS can be optimized using deep learning algorithms and methodologies. In addition, feature selection methods that produce the most important subset of features can speed up computation and enhance the effectiveness of classifiers. In this research, a tri-stage wrapper method is used because it reduces computation time. The proposed pipeline for the automatic diagnosis of PCOS comprises preprocessing, data normalization, feature selection, and classification. A dataset with 39 characteristics, including metabolism, neuroimaging, hormones, and biochemical information for 541 subjects, was employed. First, the data were pre-processed. Next, for feature selection, a tri-stage wrapper method using Mutual Information, ReliefF, Chi-Square, and Xvariance was applied. Then, various classification methods were trained and tested. Deep learning techniques including the convolutional neural network (CNN), multi-layer perceptron (MLP), recurrent neural network (RNN), and bidirectional long short-term memory (Bi-LSTM) were used for classification. The experimental findings demonstrate that, with effective feature selection, the tri-stage wrapper method combined with a CNN delivers the highest precision (97%), high accuracy (98.67%), and recall (89%) compared with other machine learning algorithms.
Funding: Supported by Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2022R196), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
Abstract: Imagery assessment is an efficient method for detecting craniofacial anomalies. A cephalometric landmark matching approach may help in orthodontic diagnosis, craniofacial growth assessment, and treatment planning. Automatic landmark matching and anomaly detection help overcome the limitations of manual labelling and optimize the preoperative planning of maxillofacial surgery. The aim of this study was to develop an accurate cephalometric landmark matching method as well as an automatic system for classifying anatomical anomalies. First, the Active Appearance Model (AAM) was used for the matching process. This process was carried out with the Ant Colony Optimization (ACO) algorithm enriched with proximity information. Then, the maxillofacial anomalies were classified using the Support Vector Machine (SVM). The experiments were conducted on X-ray cephalograms of 400 patients, for which the ground truth was produced by two experts. The framework achieved a landmark matching error (LE) of 0.50±1.04 and a successful landmark matching rate of 89.47% in the 2 mm and 3 mm ranges and of 100% in the 4 mm range. The classification of anomalies achieved an accuracy of 98.75%. Compared to previous work, the proposed approach is simpler and has a comparable range of acceptable matching cost and anomaly classification. Results have also shown that it outperformed the K-nearest neighbors (KNN) classifier.
Abstract: The Internet service provider (ISP) is the heart of any country’s Internet infrastructure and plays an important role in connecting to the World Wide Web. An Internet exchange point (IXP) allows the interconnection of two or more separate network infrastructures. All Internet traffic entering a country should pass through its IXP, which makes it an ideal location for malicious traffic analysis. Distributed denial of service (DDoS) attacks are becoming a more serious daily threat. Malicious actors in DDoS attacks control numerous infected machines known as botnets. Botnets are used to send numerous fake requests that overwhelm the resources of victims and make them unavailable for some period. To date, such attacks present a major and devastating security threat on the Internet. This paper proposes an effective and efficient machine learning (ML)-based DDoS detection approach for the early warning and protection of the Saudi Arabia Internet exchange point (SAIXP) platform. The effectiveness and efficiency of the proposed approach are verified by selecting an accurate ML method with a small number of input features. A chi-square method is used for feature selection because it is easier to compute than other methods and does not require any assumption about the distribution of feature values. Several ML methods are assessed using holdout and 10-fold tests on a large public dataset. The experiments showed that the decision tree (DT) classifier achieved high accuracy (99.98%) with a small number of features (10 features). The experimental results confirm the applicability of DT and chi-square for DDoS detection and early warning in SAIXP.
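Chi-square feature selection of the kind this abstract describes can be sketched as below. This is an illustrative sketch only, not the paper's pipeline: the traffic flows, the feature names, and the helpers `chi_square_score` and `select_top_k` are hypothetical, the features are assumed to be categorical (here binary), and the decision tree that would be trained on the selected features is not shown.

```python
from collections import Counter


def chi_square_score(feature, labels):
    """Pearson chi-square statistic between one categorical feature and the class label."""
    n = len(labels)
    joint = Counter(zip(feature, labels))  # observed count for each (value, class) cell
    value_counts = Counter(feature)
    class_counts = Counter(labels)

    score = 0.0
    for value, n_value in value_counts.items():
        for cls, n_cls in class_counts.items():
            expected = n_value * n_cls / n  # count expected if feature and class were independent
            score += (joint[(value, cls)] - expected) ** 2 / expected
    return score


def select_top_k(rows, labels, names, k):
    """Rank features by chi-square score and keep the k highest-scoring ones."""
    columns = list(zip(*rows))
    scores = {name: chi_square_score(col, labels) for name, col in zip(names, columns)}
    ranked = sorted(names, key=lambda name: scores[name], reverse=True)
    return ranked[:k], scores


# Hypothetical flows: one row per flow, binary indicators per feature.
feature_names = ["syn_flag_high", "small_packets", "port_80"]
flows = [
    (1, 1, 1),
    (1, 1, 0),
    (1, 0, 1),
    (1, 1, 0),
    (0, 0, 1),
    (0, 1, 0),
    (0, 0, 1),
    (0, 0, 0),
]
is_attack = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = DDoS flow, 0 = benign flow

selected, scores = select_top_k(flows, is_attack, feature_names, k=2)
print(selected, scores)
```

A feature that is independent of the class scores zero, so it drops to the bottom of the ranking; continuous features would need to be binned before this statistic applies.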