Funding: This research was funded by the Deputyship for Research and Innovation, "Ministry of Education" in Saudi Arabia (IFKSUOR3-014-3).
Abstract: In this study, we address the problem of gene selection by proposing a hybrid bio-inspired evolutionary algorithm that combines Grey Wolf Optimization (GWO) with Harris Hawks Optimization (HHO) for feature selection. The motivation for using GWO and HHO stems from their bio-inspired nature and their demonstrated success in optimization problems. We aim to leverage the strengths of these algorithms to enhance the effectiveness of feature selection in microarray-based cancer classification. We use leave-one-out cross-validation (LOOCV) to evaluate the performance of two widely used classifiers, k-nearest neighbors (KNN) and support vector machine (SVM), on high-dimensional cancer microarray data. The proposed method is tested on six publicly available cancer microarray datasets and compared with recently published methods. The hybrid algorithm improves classification performance, surpassing alternative approaches in terms of precision. The outcomes confirm that the method can substantially improve both the precision and the efficiency of cancer classification, thereby advancing the development of more efficient treatment strategies. The proposed hybrid method offers a promising solution to the gene selection problem in microarray-based cancer classification, and its superior performance compared to other methods highlights its potential applicability in real-world cancer classification tasks. By harnessing the complementary search mechanisms of GWO and HHO, we identify informative genes relevant to cancer diagnosis and treatment.
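The evaluation protocol described above (LOOCV accuracy of a KNN classifier on a selected gene subset) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the names `knn_predict` and `loocv_accuracy`, the Euclidean distance, and the 0/1 mask encoding of a gene subset are assumptions, and the GWO-HHO search that would propose the masks is not shown.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k training samples closest to x (Euclidean distance)."""
    ranked = sorted((math.dist(row, x), label) for row, label in zip(train_X, train_y))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

def loocv_accuracy(X, y, mask, k=3):
    """Leave-one-out accuracy of KNN using only the features where mask is truthy.

    In a wrapper-style search, a value like this could serve as the fitness
    of a candidate gene subset (hypothetical usage).
    """
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty subset cannot classify anything
    Xs = [[row[j] for j in cols] for row in X]
    correct = 0
    for i in range(len(Xs)):
        # Hold out sample i, train on all the others.
        train_X = Xs[:i] + Xs[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if knn_predict(train_X, train_y, Xs[i], k) == y[i]:
            correct += 1
    return correct / len(Xs)
```

With microarray data the number of samples is small, so holding out one sample at a time keeps almost all data available for training in every fold, which is one common reason to prefer LOOCV in this setting.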
Abstract: Compositional data, which carry relative information, are important in machine learning and related fields. Such data are typically recorded as closed data, that is, as parts that sum to a constant such as 100%. The linear regression model is among the most widely used statistical techniques for identifying relationships between variables of interest. When estimating linear regression parameters, which are useful for tasks such as future prediction and partial-effects analysis of independent variables, maximum likelihood estimation (MLE) is the method of choice. However, data quality is a significant challenge in machine learning, and many datasets contain missing observations, whose recovery can be costly and time-consuming. To address this issue, the expectation-maximization (EM) algorithm has been suggested for situations involving missing data. The EM algorithm iteratively finds maximum likelihood or maximum a posteriori (MAP) estimates of parameters in statistical models that depend on unobserved variables or data. The expectation (E) step uses the current parameter estimate to construct the expected log-likelihood function; the maximization (M) step finds the parameters that maximize that expected log-likelihood. This study examined how well the EM algorithm performed on a simulated compositional dataset with missing observations, using both ordinary least squares and robust least squares regression. The efficacy of the EM algorithm was compared with two alternative imputation techniques, k-nearest neighbor (k-NN) and mean imputation, in terms of Aitchison distances and covariance.
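As a small illustration of the comparison metric named above, the Aitchison distance between two compositions is the Euclidean distance between their centered log-ratio (clr) transforms. The sketch below assumes strictly positive parts; the function names `closure`, `clr`, and `aitchison_distance` are chosen here for illustration and do not come from the study.

```python
import math

def closure(parts, total=1.0):
    """Rescale positive parts so that they sum to `total` (e.g. 1 or 100)."""
    s = sum(parts)
    return [total * p / s for p in parts]

def clr(parts):
    """Centered log-ratio transform: log of each part minus the mean log."""
    logs = [math.log(p) for p in parts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

def aitchison_distance(x, y):
    """Euclidean distance between clr-transformed compositions."""
    return math.dist(clr(x), clr(y))
```

Because the clr transform removes the overall scale, two vectors that differ only by a constant factor (the same composition before and after closure) are at distance zero, which is why this distance is a natural way to compare an imputed composition with the true one.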
Funding: the National Natural Science Foundation of China under projects 61772150 and 61862012; the Guangxi Key R&D Program under project AB17195025; the Guangxi Natural Science Foundation under grants 2018GXNSFDA281054 and 2018GXNSFAA281232; the National Cryptography Development Fund of China under project MMJJ20170217; the Guangxi Science and Technology Base and Special Talents Program AD18281044; the Innovation Project of GUET Graduate Education under project 2017YJCX46; the Guangxi Young Teachers' Basic Ability Improvement Program under grant 2018KY0194; and the open program of Guangxi Key Laboratory of Cryptography and Information Security under projects GCIS201621 and GCIS201702.
Abstract: Existing interference protection systems lack automatic evaluation methods that provide scientific, objective, and accurate assessment results. To address this issue, this paper develops a layout scheme by geometrically modeling the actual scene, so that a hand-held full-band spectrum analyzer can collect signal field strength values in complex indoor scenes. An improved prediction algorithm based on K-nearest neighbor non-parametric kernel regression is proposed to predict the signal field strength over the whole plane before and after shielding. The most accurate set of data can then be selected by comparison. The experimental results show that the improved prediction algorithm can scientifically and objectively predict the signal strength in complex indoor scenes and evaluate the interference protection with high accuracy.
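The abstract does not specify the improved algorithm, so the sketch below shows only a generic K-nearest-neighbor kernel regression of the kind it builds on: the field strength at an unmeasured location is a kernel-weighted average of the k closest measurements. The Gaussian kernel, the bandwidth rule (distance to the k-th neighbor), and the name `knn_kernel_regress` are all assumptions made for illustration.

```python
import math

def knn_kernel_regress(points, values, query, k=3):
    """Predict a value at `query` from the k nearest measured points.

    points: list of coordinate tuples where the field strength was measured
    values: measured field strengths at those points
    """
    # Pair every measurement with its distance to the query and keep the k closest.
    ranked = sorted(zip((math.dist(p, query) for p in points), values))[:k]
    bandwidth = ranked[-1][0]  # assumption: adaptive bandwidth = k-th neighbor distance
    if bandwidth == 0:
        # The query coincides with the retained measurements; average them.
        return sum(v for _, v in ranked) / len(ranked)
    # Gaussian kernel weights: closer measurements count more.
    weights = [math.exp(-0.5 * (d / bandwidth) ** 2) for d, _ in ranked]
    return sum(w * v for w, (_, v) in zip(weights, ranked)) / sum(weights)
```

Evaluating this function on a grid of query locations would give a predicted field-strength map for the plane, once for the unshielded and once for the shielded measurements.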
Funding: Supported by the National Science Fund for Distinguished Young Scholars of China (61525304) and the National Natural Science Foundation of China (61873328).
Abstract: In this paper, a memetic algorithm with competition (MAC) is proposed to solve the capacitated green vehicle routing problem (CGVRP). Firstly, a permutation array called the traveling salesman problem (TSP) route is used to encode the solution, and an effective decoding method to construct the CGVRP route is presented accordingly. Secondly, a k-nearest neighbor (kNN) based initialization is presented to make use of the location information of the customers. Thirdly, according to the characteristics of the CGVRP, the search operators in the variable neighborhood search (VNS) framework and the simulated annealing (SA) strategy are executed on the TSP route for all solutions. Moreover, the customer adjustment operator and the alternative fuel station (AFS) adjustment operator on the CGVRP route are executed for the elite solutions after competition. In addition, the crossover operator is employed to share information among different solutions. The effect of parameter setting is investigated using the Taguchi method of design-of-experiment to suggest suitable values. Numerical tests demonstrate the effectiveness of both the competitive search and the decoding method. Moreover, extensive comparative results show that the proposed algorithm is more effective and efficient than the existing methods in solving the CGVRP.
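One plausible reading of a kNN-based initialization on a permutation (TSP route) encoding is sketched below: the route is grown by repeatedly moving to a customer drawn at random from the k nearest unvisited ones. The abstract gives no details of the actual procedure, so this rule, the name `knn_initial_route`, and the choice of starting customer are assumptions; capacity and fuel constraints, which the decoding step would handle, are not modeled.

```python
import math
import random

def knn_initial_route(coords, k=3, start=0, rng=None):
    """Build a customer permutation using location information.

    coords: list of (x, y) customer locations
    k:      size of the candidate set of nearest unvisited customers
    """
    rng = rng or random.Random(0)  # fixed seed so the sketch is reproducible
    route = [start]
    unvisited = set(range(len(coords))) - {start}
    while unvisited:
        here = coords[route[-1]]
        # Candidate set: the k unvisited customers closest to the current one.
        nearest = sorted(unvisited, key=lambda j: math.dist(here, coords[j]))[:k]
        nxt = rng.choice(nearest)
        route.append(nxt)
        unvisited.remove(nxt)
    return route
```

With `k=1` this reduces to the deterministic nearest-neighbor heuristic; larger `k` trades some route quality for diversity, which matters when an entire initial population is needed.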
Funding: Funded by the German Federal Ministry for Economic Affairs and Energy (BMWi) (01MD15009F).
Abstract: Liquid leakage from pipelines is a critical issue in large-scale process plants. Damage in pipelines affects the normal operation of the plant and increases maintenance costs. Furthermore, it causes unsafe and hazardous situations for operators. Therefore, the detection and localization of leakages is a crucial task for maintenance and condition monitoring. Recently, the use of infrared (IR) cameras was found to be a promising approach for leakage detection in large-scale plants. IR cameras can capture leaking liquid if it has a higher (or lower) temperature than its surroundings. In this paper, a method based on IR video data and machine vision techniques is proposed to detect and localize liquid leakages in a chemical process plant. Since the proposed method is vision-based and does not consider the physical properties of the leaking liquid, it is applicable to any type of liquid leakage (e.g., water or oil). In this method, subsequent frames are subtracted and divided into blocks. Then, principal component analysis is performed in each block to extract features from the blocks. All subtracted frames within the blocks are individually transferred to feature vectors, which are used as a basis for classifying the blocks. The k-nearest neighbor algorithm is used to classify the blocks as normal (without leakage) or anomalous (with leakage). Finally, the positions of the leakages are determined in each anomalous block. To evaluate the approach, two datasets with two different formats, consisting of video footage of a laboratory demonstrator plant captured by an IR camera, are considered. The results show that the proposed method is a promising approach to detect and localize leakages from pipelines using IR videos. The proposed method has high accuracy and a reasonable detection time for leakage detection. The possibility of extending the proposed method to a real industrial plant and the limitations of this method are discussed at the end.
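The first stage of the pipeline described above, subtracting subsequent frames and dividing the result into blocks, can be sketched as follows. Frames are represented here as nested lists of pixel intensities; the use of the absolute difference, the square block size, and the names `frame_difference` and `split_into_blocks` are assumptions, and the later PCA and k-nearest neighbor stages are not shown.

```python
def frame_difference(prev, curr):
    """Pixel-wise absolute difference between two equally sized frames."""
    return [[abs(c - p) for p, c in zip(row_p, row_c)]
            for row_p, row_c in zip(prev, curr)]

def split_into_blocks(frame, size):
    """Cut a frame into size x size blocks keyed by (block_row, block_col).

    Blocks at the right and bottom edges are smaller when the frame
    dimensions are not multiples of `size`.
    """
    blocks = {}
    for r0 in range(0, len(frame), size):
        for c0 in range(0, len(frame[0]), size):
            blocks[(r0 // size, c0 // size)] = [
                row[c0:c0 + size] for row in frame[r0:r0 + size]
            ]
    return blocks
```

Working per block is what makes localization possible: once a block is classified as anomalous, its key already gives the approximate image region of the leak.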
Funding: This work was supported by the National Natural Science Foundation of China (Grant No. 2017YFC0403605 and No. 11601419).
Abstract: The whale optimization algorithm (WOA) is a new population-based meta-heuristic algorithm. WOA uses a shrinking encircling mechanism, spiral rise, and random learning strategies to update the whales' positions. WOA has the merits of simple calculation and high computational accuracy, but its convergence is slow and it easily falls into local optima. To overcome these shortcomings, this paper integrates adaptive neighborhood and hybrid mutation strategies into the whale optimization algorithm: the average distance from a whale to the other whales is used as an adaptive neighborhood radius, and the whale learns from the best solution in its neighborhood instead of using the random learning strategy. The hybrid mutation strategy is used to enhance the ability of the algorithm to escape local optima. The resulting whale optimization algorithm (HMNWOA) inherits the global search capability of the original algorithm, enhances its exploitation ability, improves the quality of the population, and thus improves the convergence speed. A feature selection algorithm based on binary HMNWOA is also proposed. Twelve standard datasets from the UCI repository are used to test the validity of the proposed algorithm for feature selection. The experimental results show that HMNWOA is very competitive with six other popular feature selection methods in improving classification accuracy and reducing the number of features, and that it has strong search ability in the feature space.
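The adaptive neighborhood rule described above can be sketched directly: the radius for a whale is its average distance to all other whales, and the whale learns from the best solution inside that radius. The sketch assumes a minimization problem and counts the whale itself as a candidate; the name `neighborhood_best` and the data layout are hypothetical, and the position update and hybrid mutation steps are not shown.

```python
import math

def neighborhood_best(positions, fitness, i):
    """Return (index of the best whale near whale i, adaptive radius).

    positions: list of coordinate tuples, one per whale
    fitness:   objective values, lower is better (assumption)
    """
    others = [j for j in range(len(positions)) if j != i]
    dists = {j: math.dist(positions[i], positions[j]) for j in others}
    # Adaptive radius: average distance from whale i to all other whales.
    radius = sum(dists.values()) / len(others)
    # At least one whale is always within the average distance.
    neighbors = [j for j in others if dists[j] <= radius]
    best = min(neighbors + [i], key=lambda j: fitness[j])
    return best, radius
```

Because the radius shrinks as the population clusters, the same rule gives broad learning early in the run and increasingly local learning later, without an extra tuning parameter.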
Abstract: The EM algorithm is a very popular maximum likelihood estimation method. It is an iterative algorithm for computing the maximum likelihood estimator when the observed data are incomplete, and it is also very effective for estimating the parameters of finite mixture models. However, the EM algorithm is not guaranteed to find the global optimum and easily falls into a local optimum, so it is sensitive to the choice of initial values. The traditional EM algorithm selects the initial values at random; we propose an improved method for selecting them. First, we use the k-nearest-neighbor method to delete outliers. Second, we use k-means to initialize the EM algorithm. Numerical experiments comparing this method with the original random initialization show that parameter estimation with the proposed initialization is significantly better than with the original EM algorithm.
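The two-stage initialization described above can be sketched for a one-dimensional Gaussian mixture. The abstract does not give the outlier rule or the k-means seeding, so the choices below are assumptions: a point is dropped when its mean distance to its k nearest neighbors exceeds the average score by `z` standard deviations, and k-means starts from evenly spaced order statistics. All function names are illustrative, and the sketch assumes well-separated, non-empty clusters.

```python
import math
import statistics

def remove_outliers_knn(data, k=3, z=2.0):
    """Drop points whose mean distance to their k nearest neighbors is unusually large."""
    scores = []
    for i, x in enumerate(data):
        dists = sorted(abs(x - y) for j, y in enumerate(data) if j != i)
        scores.append(sum(dists[:k]) / k)
    cutoff = statistics.mean(scores) + z * statistics.pstdev(scores)
    return [x for x, s in zip(data, scores) if s <= cutoff]

def kmeans_1d(data, n_clusters, iters=50):
    """Plain k-means on scalars with a deterministic start."""
    s = sorted(data)
    centers = [s[(2 * c + 1) * len(s) // (2 * n_clusters)] for c in range(n_clusters)]
    for _ in range(iters):
        groups = [[] for _ in centers]
        for x in data:
            groups[min(range(n_clusters), key=lambda c: abs(x - centers[c]))].append(x)
        new = [statistics.mean(g) if g else centers[c] for c, g in enumerate(groups)]
        if new == centers:
            break
        centers = new
    return centers, groups

def normal_pdf(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_gmm_1d(data, n_components=2, em_iters=50):
    """kNN outlier removal, k-means initialization, then EM iterations."""
    clean = remove_outliers_knn(data)
    means, groups = kmeans_1d(clean, n_components)
    weights = [len(g) / len(clean) for g in groups]
    variances = [max(statistics.pvariance(g), 1e-6) if g else 1.0 for g in groups]
    for _ in range(em_iters):
        # E step: responsibility of each component for each point.
        resp = []
        for x in clean:
            p = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(p)
            resp.append([pc / total for pc in p])
        # M step: re-estimate weights, means, and variances from the responsibilities.
        for c in range(n_components):
            nc = sum(r[c] for r in resp)
            means[c] = sum(r[c] * x for r, x in zip(resp, clean)) / nc
            variances[c] = max(
                sum(r[c] * (x - means[c]) ** 2 for r, x in zip(resp, clean)) / nc, 1e-6)
            weights[c] = nc / len(clean)
    return weights, means, variances
```

Starting EM from k-means centers computed on outlier-free data places each component near a real cluster, which reduces the chance of converging to a poor local optimum compared with a random start.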