Funding: This research was funded by the Deputyship for Research and Innovation, “Ministry of Education” in Saudi Arabia (IFKSUOR3-014-3).
Abstract: This study addresses the gene selection problem by proposing a hybrid bio-inspired evolutionary algorithm that combines Grey Wolf Optimization (GWO) with Harris Hawks Optimization (HHO) for feature selection. The motivation for using GWO and HHO stems from their bio-inspired nature and their demonstrated success on optimization problems. We aim to leverage the strengths of both algorithms to make feature selection more effective in microarray-based cancer classification. We use leave-one-out cross-validation (LOOCV) to evaluate the performance of two widely used classifiers, k-nearest neighbors (KNN) and support vector machine (SVM), on high-dimensional cancer microarray data. The proposed method is tested on six publicly available cancer microarray datasets and compared comprehensively with recently published methods. The hybrid algorithm improves classification performance and surpasses the alternative approaches in terms of precision. The outcomes confirm that the method can substantially improve both the precision and the efficiency of cancer classification, thereby supporting the development of more efficient treatment strategies. The proposed hybrid method offers a promising solution to the gene selection problem in microarray-based cancer classification, and its superior performance compared with other methods highlights its potential applicability to real-world cancer classification tasks. By harnessing the complementary search mechanisms of GWO and HHO, it identifies informative genes relevant to cancer diagnosis and treatment.
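The evaluation step described above can be illustrated with a minimal sketch of LOOCV accuracy for a KNN classifier restricted to a candidate gene subset. This is not the authors' implementation: the function names, the toy data, the Euclidean distance, and the use of plain accuracy as the score are all assumptions made for illustration, and the GWO/HHO search that would propose the subsets is omitted.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, x, k=1):
    """Predict the label of x by majority vote among its k nearest training samples."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

def loocv_accuracy(X, y, selected, k=1):
    """Leave-one-out accuracy of KNN using only the feature (gene) indices in `selected`."""
    Xs = [[row[j] for j in selected] for row in X]
    correct = 0
    for i in range(len(Xs)):
        # Hold out sample i and train on every other sample.
        train_X = Xs[:i] + Xs[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if knn_predict(train_X, train_y, Xs[i], k) == y[i]:
            correct += 1
    return correct / len(Xs)

# Hypothetical toy data: feature 0 separates the classes, feature 1 does not.
X = [[0.1, 5.0], [0.2, 1.0], [0.3, 9.0], [0.9, 5.1], [1.0, 1.1], [1.1, 9.1]]
y = ["normal", "normal", "normal", "tumor", "tumor", "tumor"]
informative = loocv_accuracy(X, y, selected=[0])
uninformative = loocv_accuracy(X, y, selected=[1])
```

A wrapper-style feature selector would call `loocv_accuracy` once per candidate subset and prefer subsets that score higher with fewer genes.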
Abstract: Compositional data, which carry relative information, are important in machine learning and related fields. They are typically recorded as closed data, that is, parts that sum to a constant such as 100%. The linear regression model is one of the most widely used statistical techniques for identifying relationships between variables of interest. Data quality is a significant challenge in machine learning, however, especially when observations are missing. When estimating linear regression parameters, which are useful for tasks such as prediction and partial-effects analysis of independent variables, maximum likelihood estimation (MLE) is the method of choice. Many datasets nevertheless contain missing observations, and recovering them can be costly and time-consuming. The expectation-maximization (EM) algorithm has been suggested as a solution in such situations. The EM algorithm iteratively finds maximum likelihood or maximum a posteriori (MAP) estimates of parameters in statistical models that depend on unobserved variables or data. Using the current parameter estimate as input, the expectation (E) step constructs the expected log-likelihood function. The maximization (M) step then finds the parameters that maximize the expected log-likelihood determined in the E step. This study examined how well the EM algorithm performed on a synthetic compositional dataset with missing observations, using both ordinary least squares regression and its robust least squares variant. The efficacy of the EM algorithm was compared with two alternative imputation techniques, k-Nearest Neighbor (k-NN) and mean imputation, in terms of Aitchison distances and covariance.
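For readers unfamiliar with the comparison metric, the Aitchison distance between two compositions is the Euclidean distance between their centered log-ratio (clr) transforms. The sketch below assumes strictly positive parts; the sample rows are hypothetical and are not taken from the study.

```python
import math

def clr(x):
    """Centered log-ratio transform of a composition with strictly positive parts."""
    logs = [math.log(v) for v in x]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]

def aitchison_distance(x, y):
    """Euclidean distance between the clr transforms of two compositions."""
    return math.dist(clr(x), clr(y))

# Hypothetical example: a true row and an imputed row of a 3-part composition.
true_row = [0.2, 0.3, 0.5]
imputed_row = [0.25, 0.25, 0.5]
d_rescaled = aitchison_distance(true_row, [20, 30, 50])  # same composition in percent
d_error = aitchison_distance(true_row, imputed_row)
```

Because the clr transform removes the overall scale, a composition expressed as proportions and the same composition expressed in percent are at distance zero (up to rounding), which is why this metric suits closed data.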
Funding: Supported by the National Natural Science Foundation of China under projects 61772150 and 61862012; the Guangxi Key R&D Program under project AB17195025; the Guangxi Natural Science Foundation under grants 2018GXNSFDA281054 and 2018GXNSFAA281232; the National Cryptography Development Fund of China under project MMJJ20170217; the Guangxi Science and Technology Base and Special Talents Program AD18281044; the Innovation Project of GUET Graduate Education under project 2017YJCX46; the Guangxi Young Teachers’ Basic Ability Improvement Program under grant 2018KY0194; and the open program of Guangxi Key Laboratory of Cryptography and Information Security under projects GCIS201621 and GCIS201702.
Abstract: Existing interference protection systems lack automatic evaluation methods that provide scientific, objective, and accurate assessment results. To address this issue, this paper develops a measurement layout scheme by geometrically modeling the actual scene, so that a hand-held full-band spectrum analyzer can collect signal field strength values in complex indoor scenes. An improved prediction algorithm based on K-nearest neighbor non-parametric kernel regression is proposed to predict the signal field strength over the whole plane before and after shielding. The data set with the highest accuracy can then be selected by comparison. The experimental results show that the improved algorithm predicts signal strength in complex indoor scenes scientifically and objectively and evaluates the interference protection with high accuracy.
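As a rough illustration of K-nearest neighbor kernel regression (the baseline technique, not the paper's improved variant), the sketch below estimates the field strength at an unmeasured location as a kernel-weighted average of the k nearest measurements. The Gaussian kernel, the choice of the k-th neighbor distance as bandwidth, and the sample coordinates and dBm values are all assumptions.

```python
import math

def knn_kernel_predict(points, values, query, k=3):
    """Nadaraya-Watson estimate at `query` using only the k nearest samples.

    The Gaussian kernel bandwidth is the distance to the k-th neighbour,
    so the smoothing adapts to the local sampling density.
    """
    nearest = sorted((math.dist(p, query), v) for p, v in zip(points, values))[:k]
    bandwidth = nearest[-1][0]
    if bandwidth == 0.0:
        # All k neighbours coincide with the query point: plain average.
        return sum(v for _, v in nearest) / len(nearest)
    weights = [math.exp(-0.5 * (d / bandwidth) ** 2) for d, _ in nearest]
    return sum(w * v for w, (_, v) in zip(weights, nearest)) / sum(weights)

# Hypothetical measurements: (x, y) positions in metres, field strength in dBm.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
values = [-40.0, -50.0, -50.0, -60.0]
center_estimate = knn_kernel_predict(points, values, (0.5, 0.5), k=4)
```

Running the same predictor on the measurements taken before and after shielding would give two field maps whose difference indicates the protection effect.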
Funding: Supported by the National Science Fund for Distinguished Young Scholars of China (61525304) and the National Natural Science Foundation of China (61873328).
Abstract: In this paper, a memetic algorithm with competition (MAC) is proposed to solve the capacitated green vehicle routing problem (CGVRP). Firstly, a permutation array called the traveling salesman problem (TSP) route is used to encode the solution, and an effective decoding method is presented to construct the CGVRP route from it. Secondly, a k-nearest neighbor (kNN) based initialization is presented to make use of the location information of the customers. Thirdly, according to the characteristics of the CGVRP, search operators in the variable neighborhood search (VNS) framework and the simulated annealing (SA) strategy are executed on the TSP route for all solutions. Moreover, the customer adjustment operator and the alternative fuel station (AFS) adjustment operator are executed on the CGVRP route for the elite solutions after competition. In addition, a crossover operator is employed to share information among different solutions. The effect of parameter settings is investigated using the Taguchi method of design-of-experiment to suggest suitable values. Numerical tests demonstrate the effectiveness of both the competitive search and the decoding method. Moreover, extensive comparative results show that the proposed algorithm is more effective and efficient than existing methods in solving the CGVRP.
Funding: This work was supported by the National Natural Science Foundation of China (Grant No. 2017YFC0403605 and No. 11601419).
Abstract: The whale optimization algorithm (WOA) is a new population-based meta-heuristic algorithm. WOA uses a shrinking encircling mechanism, spiral rise, and random learning strategies to update the whales’ positions. WOA is simple to compute and achieves high computational accuracy, but it converges slowly and easily falls into local optima. To overcome these shortcomings, this paper integrates adaptive neighborhood and hybrid mutation strategies into WOA: the average distance from a whale to the other whales is used as an adaptive neighborhood radius, and each whale learns from the best solution in its neighborhood instead of following a random learning strategy. The hybrid mutation strategy enhances the algorithm’s ability to escape local optima. The resulting algorithm is named HMNWOA. It inherits the global search capability of the original algorithm, enhances the exploitation ability, improves the quality of the population, and thus improves the convergence speed. A feature selection algorithm based on binary HMNWOA is also proposed. Twelve standard datasets from the UCI repository are used to test the validity of the proposed algorithm for feature selection. The experimental results show that HMNWOA is very competitive with six other popular feature selection methods in improving classification accuracy and reducing the number of features, and confirm that HMNWOA has strong search ability in the feature space.
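The adaptive neighborhood idea can be sketched as follows: the radius for a whale is its average distance to the other whales, and the whale then learns from the best solution inside that radius. The abstract does not give the exact update rule, so this sketch only selects the neighborhood leader; Euclidean distance, minimization of fitness, and counting the whale as a member of its own neighborhood are assumptions.

```python
import math

def neighborhood_best(positions, fitness, i):
    """Return the index of the best whale inside whale i's adaptive neighbourhood.

    The radius is the mean distance from whale i to all other whales;
    lower fitness is treated as better.
    """
    dists = [math.dist(positions[i], p) for p in positions]
    # dists[i] is 0, so dividing by (n - 1) averages over the other whales only.
    radius = sum(dists) / (len(positions) - 1)
    members = [j for j, d in enumerate(dists) if d <= radius]  # includes i itself
    return min(members, key=lambda j: fitness[j])

# Hypothetical population: three whales close together and one far away.
positions = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
fitness = [3.0, 2.0, 5.0, 0.5]
leader_for_0 = neighborhood_best(positions, fitness, 0)
leader_for_3 = neighborhood_best(positions, fitness, 3)
```

In this toy population, whale 0 follows whale 1 rather than the distant global best, which shows how a local leader differs from the global leader used in standard WOA.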
Abstract: The EM algorithm is a very popular iterative method for maximum likelihood estimation when the observed data are incomplete, and it is also very effective for estimating the parameters of finite mixture models. However, the EM algorithm is not guaranteed to find the global optimum and often falls into a local optimum, so it is sensitive to the choice of initial values. The traditional EM algorithm selects its initial values at random; we propose an improved method for selecting them. First, we use the k-nearest-neighbor method to delete outliers. Second, we use k-means to initialize the EM algorithm. Numerical experiments comparing this method with the original random initialization show that the parameter estimates obtained with the proposed initialization are significantly better than those of the original EM algorithm.
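The two-stage initialization can be sketched for a one-dimensional Gaussian mixture as below. This is an illustrative reading of the abstract, not the authors' procedure: the outlier rule (mean distance to the k nearest neighbors exceeding `factor` times the median score), the evenly spaced k-means starting centres, and all function names are assumptions, and the EM iterations themselves are left out.

```python
def remove_outliers_knn(data, k=2, factor=3.0):
    """Drop points whose mean distance to their k nearest neighbours is unusually large."""
    scores = []
    for i, x in enumerate(data):
        dists = sorted(abs(x - y) for j, y in enumerate(data) if j != i)
        scores.append(sum(dists[:k]) / k)
    median = sorted(scores)[len(scores) // 2]
    return [x for x, s in zip(data, scores) if s <= factor * median]

def kmeans_1d(data, n_clusters, n_iter=20):
    """Lloyd's k-means in one dimension with evenly spaced starting centres."""
    lo, hi = min(data), max(data)
    centres = [lo + (hi - lo) * (c + 0.5) / n_clusters for c in range(n_clusters)]
    for _ in range(n_iter):
        groups = [[] for _ in centres]
        for x in data:
            nearest = min(range(n_clusters), key=lambda c: abs(x - centres[c]))
            groups[nearest].append(x)
        # Keep the old centre if a cluster received no points.
        centres = [sum(g) / len(g) if g else centres[c] for c, g in enumerate(groups)]
    return centres, groups

def em_initial_values(data, n_clusters):
    """Return (weights, means, variances) to seed EM for a 1-D Gaussian mixture."""
    clean = remove_outliers_knn(data)
    means, groups = kmeans_1d(clean, n_clusters)
    weights = [len(g) / len(clean) for g in groups]
    variances = [
        sum((x - m) ** 2 for x in g) / len(g) if g else 1.0
        for g, m in zip(groups, means)
    ]
    return weights, means, variances

# Hypothetical sample: two clusters near 1 and 5, plus one gross outlier at 50.
sample = [1.0, 1.2, 0.8, 1.1, 0.9, 5.0, 5.2, 4.8, 5.1, 4.9, 50.0]
weights, means, variances = em_initial_values(sample, n_clusters=2)
```

With the outlier removed first, the k-means centres land on the two real clusters instead of being pulled toward the extreme value, which is the kind of starting point the abstract argues EM benefits from.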
Abstract: Most machinery in small- and large-scale industry has rotating elements supported by bearings for rigid support and accurate movement. Condition monitoring of the bearings is therefore very important for the proper functioning of the machinery. In the present study, the sound signal is used to continuously monitor bearing health, because the sound signals of rotating machinery carry dynamic information about its components. Numerous studies in the literature report the superiority of the vibration signal for bearing fault diagnosis, but very few studies use the sound signal. The cost of condition monitoring with a sound signal (microphone) is lower than the cost of the transducer used to acquire a vibration signal (accelerometer). This paper employs the sound signal for condition monitoring of roller bearings using the K-star classifier and the k-nearest neighborhood classifier. Statistical features are extracted from the acquired sound signals. Two-layer feature selection is then performed using the J48 decision tree algorithm and the random tree algorithm. The selected features are classified using the K-star classifier and the k-nearest neighborhood classifier, and parametric optimization is performed to achieve the maximum classification accuracy. The classification results of the two classifiers for condition monitoring of roller bearings using sound signals are compared.
Abstract: Deep learning has achieved many successes in video processing. Video has become an increasingly important part of our daily digital interactions. Higher-resolution content and large data volumes pose serious challenges to receiving, distributing, compressing, and displaying high-quality video content. In this paper we propose a novel, effective, and efficient video compression method using a deep learning framework based on the flask, which combines Convolutional Neural Networks (CNN) and Generative Adversarial Networks (GAN). In the compression method, the layers are divided into different groups for data processing: a CNN removes duplicate frames, a single image is repeated in place of the duplicate images, and minute changes are recognized and detected using a GAN and recorded with Long Short-Term Memory (LSTM). Instead of the complete image, only the small changes generated by the GAN are substituted, which helps with frame-level compression. Pixel-wise comparison is performed over the frame using K-nearest Neighbours (KNN), the pixels are clustered with K-means, and Singular Value Decomposition (SVD) is applied to every frame in the video for all three colour channels [Red, Green, Blue] to reduce the dimension of the utility matrix [R, G, B] by extracting its latent factors. The video frames are packed with their parameters with the aid of a codec and converted to video format, and the results are compared with the original video. Repeated experiments on several videos with different sizes, durations, frames per second (FPS), and quality demonstrated a significant resampling rate. On average, the output showed about a 10% deviation in quality and a size reduction of more than half compared with the original video.
Funding: Supported by the National Key R&D Program of China (Grant No. 2021YFE0199000); the National Natural Science Foundation of China (Grant No. 62133015); the National Research Foundation China/South Africa Research Cooperation Programme (Grant No. 148762); and the Royal Academy of Engineering Transforming Systems through Partnership grant scheme (reference No. TSP2021\100016).
Abstract: In developing countries like South Africa, users experienced more than 1030 hours of load shedding outages in just the first half of 2023 due to inadequate power supply from the national grid. Households that cannot afford to take action to mitigate load shedding are severely inconvenienced, as they have to reschedule their demand involuntarily. This study presents optimal strategies to guide households in determining suitable scheduling and sizing solutions for solar home systems, in order to mitigate the inconvenience that load shedding causes residents. To start with, we predict the load shedding stages that serve as input to the optimal strategies using the K-Nearest Neighbour (KNN) algorithm. Based on an accurate forecast of future load shedding patterns, we formulate the residents’ inconvenience and the loss of power supply probability during load shedding as the objective functions. In solving the multi-objective optimisation problem, four different strategies against load shedding are identified, namely (1) optimal home appliance scheduling (HAS) under load shedding; (2) optimal HAS supported by solar panels; (3) optimal HAS supported by batteries; and (4) optimal HAS supported by a solar home system with both solar panels and batteries. Among these strategies, appliance scheduling with an optimally sized 9.6 kWh battery and a 2.74 kWp panel array of five 550 Wp panels eliminates the loss of power supply probability and reduces the inconvenience by 92% when tested under the South African load shedding cases of 2023.
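Abstract: Missing data are common across research fields, and missing values can seriously affect both the performance and the results of computation. To improve the accuracy of missing-data imputation, a nearest-neighbor imputation algorithm based on cluster analysis is proposed. After clustering the data, the algorithm assigns weights according to cluster membership and, building on the MGNN (Mahalanobis-Gray and Nearest Neighbor) algorithm, improves both the computation method and the way the imputed values are calculated. Experimental results show that the proposed method imputes more accurately than the traditional KNN and MGNN algorithms.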