In classification problems, datasets often contain a large number of features, but not all of them are relevant for accurate classification. In fact, irrelevant features may even hinder classification accuracy. Feature selection aims to alleviate this issue by minimizing the number of features in the subset while simultaneously minimizing the classification error rate. Single-objective optimization approaches employ an evaluation function designed as an aggregate function with a parameter, but the results obtained depend on the value of the parameter. To eliminate this parameter's influence, the problem can be reformulated as a multi-objective optimization problem. The Whale Optimization Algorithm (WOA) is widely used in optimization problems because of its simplicity and ease of implementation. In this paper, we propose a multi-strategy assisted multi-objective WOA (MSMOWOA) to address feature selection. To enhance the algorithm's search ability, we integrate multiple strategies such as Levy flight, Grey Wolf Optimizer, and adaptive mutation into it. Additionally, we use an external repository to store non-dominated solution sets, and a grid technique is used to maintain diversity. Results on fourteen University of California Irvine (UCI) datasets demonstrate that our proposed method effectively removes redundant features and improves classification performance. The source code can be accessed at https://github.com/zc0315/MSMOWOA.
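The two-objective formulation and the external repository described above can be illustrated with a small sketch. It is not taken from the MSMOWOA source code: the names `dominates` and `update_archive` are hypothetical, each solution is reduced to an assumed pair (number of selected features, classification error), and the grid-based diversity mechanism is omitted.

```python
from typing import List, Tuple

# Assumed representation: (number of selected features, classification error).
# Both objectives are minimized.
Objectives = Tuple[float, float]


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if a is no worse than b in every objective and strictly better in one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def update_archive(archive: List[Objectives], candidate: Objectives) -> List[Objectives]:
    """Return a new archive that contains only mutually non-dominated points."""
    # Reject the candidate if an archived point dominates or duplicates it.
    if any(dominates(kept, candidate) or kept == candidate for kept in archive):
        return archive
    # Otherwise drop every archived point the candidate dominates, then add it.
    return [kept for kept in archive if not dominates(candidate, kept)] + [candidate]
```

Feeding candidate solutions through `update_archive` one at a time leaves only the Pareto-optimal trade-offs between subset size and error, which is why no aggregation parameter is needed.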
In vehicle edge computing (VEC), asynchronous federated learning (AFL) is used, where the edge receives a local model and updates the global model, effectively reducing the global aggregation latency. Because vehicles differ in their amount of local data, computing capability, and location, updating the global model with the same weight for every vehicle is inappropriate. These factors affect the local computation time and the upload time of the local model, and a vehicle may also be affected by Byzantine attacks, leading to the deterioration of the vehicle data. However, based on deep reinforcement learning (DRL), we can consider these factors comprehensively to eliminate vehicles with poor performance as much as possible and exclude vehicles that have suffered Byzantine attacks before AFL. At the same time, when aggregating in AFL, we can focus on vehicles with better performance to improve the accuracy and safety of the system. In this paper, we propose a vehicle selection scheme based on DRL in VEC. In this scheme, vehicle mobility, time-varying channel conditions, time-varying computational resources, different data amounts, the transmission channel status of vehicles, and Byzantine attacks are taken into account. Simulation results show that the proposed scheme effectively improves the safety and accuracy of the global model.
Laser-induced breakdown spectroscopy (LIBS) has become a widely used atomic spectroscopic technique for rapid coal analysis. However, the vast amount of spectral information in LIBS contains signal uncertainty, which can affect its quantification performance. In this work, we propose a hybrid variable selection method to improve the performance of LIBS quantification. Important variables are first identified using Pearson's correlation coefficient, mutual information, the least absolute shrinkage and selection operator (LASSO), and random forest, and then filtered and combined with empirical variables related to fingerprint elements of coal ash content. Subsequently, these variables are fed into a partial least squares regression (PLSR). Additionally, in some models, certain variables unrelated to ash content are removed manually to study the impact of variable deselection on model performance. The proposed hybrid strategy was tested on three LIBS datasets for quantitative analysis of coal ash content and compared with the corresponding data-driven baseline method. It is significantly better than variable selection based only on empirical knowledge and in most cases outperforms the baseline method. On all three datasets, the hybrid strategy combining empirical knowledge and data-driven algorithms achieved the lowest root mean square error of prediction (RMSEP) values, 1.605, 3.478, and 1.647, respectively, which were significantly lower than the values of 1.959, 3.718, and 2.181 obtained from multiple linear regression using only 12 empirical variables. The LASSO-PLSR model with empirical support and 20 selected variables exhibited significantly improved performance after variable deselection, with RMSEP values dropping from 1.635, 3.962, and 1.647 to 1.483, 3.086, and 1.567, respectively. These results demonstrate that using empirical knowledge as a support for data-driven variable selection can be a viable approach to improve the accuracy and reliability of LIBS quantification.
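For readers unfamiliar with the metric, RMSEP is the square root of the mean squared difference between reference and predicted values on a prediction set. The sketch below uses that standard definition; the function name `rmsep` and the sample numbers are illustrative assumptions and are not data from the study.

```python
import math
from typing import Sequence


def rmsep(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean square error of prediction over a held-out prediction set."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical ash-content values (percent), for illustration only.
reference = [10.0, 20.0, 30.0]
predicted = [12.0, 18.0, 30.0]
error = rmsep(reference, predicted)  # sqrt((4 + 4 + 0) / 3)
```

Because RMSEP is expressed in the same units as the predicted quantity, the values reported above can be read directly as typical prediction errors in ash content.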
In this study, our aim is to address the problem of gene selection by proposing a hybrid bio-inspired evolutionary algorithm that combines Grey Wolf Optimization (GWO) with Harris Hawks Optimization (HHO) for feature selection. The motivation for using GWO and HHO stems from their bio-inspired nature and their demonstrated success in optimization problems. We aim to leverage the strengths of these algorithms to enhance the effectiveness of feature selection in microarray-based cancer classification. We selected leave-one-out cross-validation (LOOCV) to evaluate the performance of two widely used classifiers, k-nearest neighbors (KNN) and support vector machine (SVM), on high-dimensional cancer microarray data. The proposed method is extensively tested on six publicly available cancer microarray datasets, and a comprehensive comparison with recently published methods is conducted. Our hybrid algorithm demonstrates its effectiveness in improving classification performance, surpassing alternative approaches in terms of precision. The outcomes confirm the capability of our method to substantially improve both the precision and efficiency of cancer classification, thereby advancing the development of more efficient treatment strategies. The proposed hybrid method offers a promising solution to the gene selection problem in microarray-based cancer classification. It improves the accuracy and efficiency of cancer diagnosis and treatment, and its superior performance compared to other methods highlights its potential applicability in real-world cancer classification tasks. By harnessing the complementary search mechanisms of GWO and HHO, we leverage their bio-inspired behavior to identify informative genes relevant to cancer diagnosis and treatment.
The pattern matching method is one of the classic categories of existing online portfolio selection strategies. This article studies the key aspects of this method, namely the measurement of similarity and the selection of similarity sets, and proposes a Portfolio Selection Method based on Pattern Matching with Dual Information of Direction and Distance (PMDI). By studying different combinations of indicators such as Euclidean distance, Chebyshev distance, and correlation coefficient, important information such as direction and distance is extracted from historical stock prices and used to filter out the similarity set required by pattern-matching-based portfolio selection algorithms. A large number of experiments conducted on two datasets from real stock markets show that PMDI outperforms other algorithms in balancing income and risk. Therefore, it is suitable for real-world financial environments.
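The idea of combining direction and distance when building a similarity set can be sketched as follows. This is not PMDI's actual selection rule, which the abstract does not specify: the function `similarity_set`, the single-asset price-relative series, and both thresholds are assumptions chosen for illustration, with correlation standing in for direction and Euclidean distance for distance.

```python
import math
from typing import List, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length windows."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 when either window is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def similarity_set(series: Sequence[float], window: int,
                   min_corr: float, max_dist: float) -> List[int]:
    """Indices t whose preceding window resembles the most recent window.

    A historical window series[t - window:t] is kept only if it moves in the
    same direction as the latest window (correlation >= min_corr) and lies
    close to it (Euclidean distance <= max_dist).
    """
    latest = series[-window:]
    matches = []
    for t in range(window, len(series)):
        past = series[t - window:t]
        if pearson(past, latest) >= min_corr and euclidean(past, latest) <= max_dist:
            matches.append(t)
    return matches
```

Each returned index marks the period that followed a matching pattern; a pattern-matching strategy would then optimize the portfolio over the returns observed in those periods.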
Radiomics is a non-invasive method for extracting quantitative, higher-dimensional features from medical images for diagnosis. It has received great attention in recent years due to its broad application prospects. Existing radiomics feature selection methods generally select about ten features. In this paper, a heuristic feature selection method based on frequency iteration and a multiple supervised training mode is proposed. Based on combinations of features, it decomposes all features layer by layer to select the optimal features for each layer, then fuses the optimal features layer by layer to form a locally optimal group, and finally iterates to the globally optimal combination. Compared with the current method with the best prediction performance on the three datasets, the proposed method reduces the number of features from about ten to about three without losing classification accuracy, and in some cases significantly improves it. The proposed method has better interpretability and generalization ability, which gives it great potential in radiomics feature selection.
Manganese superoxide dismutase (MnSOD) is an antioxidant that exists in mitochondria and can effectively remove superoxide anions there. In the dark, high-pressure, low-temperature deep-sea environment, MnSOD is essential for the survival of sea cucumbers. Six MnSODs were identified from the transcriptomes of deep-sea and shallow-sea sea cucumbers. To explore their environmental adaptation mechanism, we conducted an environmental selection pressure analysis using the branch-site model of the PAML software. We obtained nine positive selection sites, two of which were significant (97F→H, 134K→V): 97F→H is located in a highly conserved characteristic sequence, and its polarity change might have a great impact on the function of MnSOD; 134K→V involves a change in piezophilic ability, which might help MnSOD adapt to the high hydrostatic pressure of the deep sea. To further study the effect of these two positive selection sites on MnSOD, we predicted the effect of the point mutations F97H and K134V on shallow-sea sea cucumber MnSOD using MAESTROweb and PyMOL. Results show that 97F→H and 134K→V might improve MnSOD's efficiency in scavenging superoxide anions and its ability to resist high hydrostatic pressure by moderately reducing its stability. These results indicate that the MnSODs of deep-sea sea cucumbers adapted to deep-sea environments through amino acid changes affecting polarity, piezophilic behavior, and local stability. This study reveals the correlation between MnSOD and extreme environments and will help improve our understanding of organisms' adaptation mechanisms in the deep sea.
Background: The heterogeneity of prognosis and treatment benefits among patients with gliomas is due to tumor microenvironment characteristics. However, biomarkers that reflect microenvironmental characteristics and predict the prognosis of gliomas are limited. Therefore, we aimed to develop a model that can effectively predict prognosis, differentiate microenvironment signatures, and optimize drug selection for patients with glioma. Materials and Methods: The CIBERSORT algorithm, bulk sequencing analysis, and single-cell RNA (scRNA) analysis were employed to identify significant cross-talk genes between M2 macrophages and cancer cells in glioma tissues. A predictive model was constructed based on cross-talk gene expression, and its effect on prognosis, recurrence prediction, and microenvironment characteristics was validated in multiple cohorts. The effect of the predictive model on drug selection was evaluated using the OncoPredict algorithm and relevant cellular biology experiments. Results: A high abundance of M2 macrophages in glioma tissues indicates poor prognosis, and cross-talk between macrophages and cancer cells plays a crucial role in shaping the tumor microenvironment. Eight genes involved in the cross-talk between macrophages and cancer cells were identified. Among them, periostin (POSTN), chitinase 3 like 1 (CHI3L1), serum amyloid A1 (SAA1), and matrix metallopeptidase 9 (MMP9) were selected to construct a predictive model. The developed model demonstrated significant efficacy in distinguishing patient prognosis, recurrent cases, and characteristics of high inflammation, hypoxia, and immunosuppression. Furthermore, this model can serve as a valuable tool for guiding the use of trametinib. Conclusions: In summary, this study provides a comprehensive understanding of the interplay between M2 macrophages and cancer cells in glioma; uses a cross-talk gene signature to develop a model that can distinguish patient prognosis, recurrence, and microenvironment characteristics; and aids in optimizing the application of trametinib in glioma patients.
With its rapid development and application, energy harvesting technology has become a prominent research area due to its significant benefits in terms of environmental friendliness, convenience, safety, and efficiency. However, uneven energy collection and consumption among IoT devices at varying distances may lead to resource imbalance within energy harvesting networks, resulting in low energy transmission efficiency. To enhance the energy transmission efficiency of IoT devices in energy harvesting, this paper focuses on the use of collaborative communication, along with pricing-based incentive mechanisms and auction strategies. We propose a dynamic relay selection scheme, including a ladder pricing mechanism based on energy level and a Kuhn-Munkres algorithm based on auction theory with a negotiation mechanism, to encourage more IoT devices to participate in the collaboration process. Simulation results demonstrate that the proposed algorithm outperforms traditional algorithms in improving the energy efficiency of the system.
Genomic selection (GS) has been widely used in livestock and has greatly accelerated the genetic progress of complex traits. Population size is one of the significant factors affecting prediction accuracy, but it is limited within a purebred population. Compared to directly combining two uncorrelated purebred populations to extend the reference population size, it might be more meaningful to incorporate correlated crossbreds into the reference population for genomic prediction. In this study, we simulated purebred offspring (PAS and PBS) and crossbred offspring (CAB) based on real genotype data of two base purebred populations (PA and PB) to evaluate the performance of genomic selection on purebreds while incorporating crossbred information. The results showed that selecting key crossbred individuals by maximizing the expected genetic relationship (REL) was better than the other methods (individuals closest or farthest to the purebred population, CP/FP) in terms of prediction accuracy. Furthermore, the prediction accuracy of reference populations combining PA and CAB was significantly better than that based on PA only, and was similar to that of combining PA and PAS. Moreover, the rank correlation between the multiple of the increased relationship (MIR) and reliability improvement was 0.60-0.70. However, for individuals with low correlation (Cor(Pi, PA or B)), the reliability improvement was significantly lower than for other individuals. Our findings suggest that incorporating crossbreds into the purebred population could improve the performance of genetic prediction compared with using the purebred population only. The genetic relationship between the purebred and crossbred populations is a key factor determining the increase in reliability when incorporating a crossbred population into genomic prediction for purebred individuals.
Variable selection for high-dimensional nonparametric nonlinear systems aims to select the contributing variables or to eliminate the redundant ones. For a high-dimensional nonparametric nonlinear system, however, identifying whether a variable contributes is not easy. Therefore, based on the Fourier spectrum of the density-weighted derivative, a novel variable selection approach is developed, which does not suffer from the curse of dimensionality and improves identification accuracy. Furthermore, a necessary and sufficient condition for testing whether a variable contributes is provided. The proposed approach does not require strong assumptions on the distribution, such as an elliptical distribution. A simulation study verifies the effectiveness of the novel variable selection algorithm.
Feature Selection (FS) is a key pre-processing step in pattern recognition and data mining tasks, which can effectively avoid the impact of irrelevant and redundant features on the performance of classification models. In recent years, meta-heuristic algorithms have been widely used in FS problems, so a Hybrid Binary Chaotic Salp Swarm Dung Beetle Optimization (HBCSSDBO) algorithm is proposed in this paper to improve the effect of FS. In this hybrid algorithm, the original continuous optimization algorithm is converted into binary form by an S-type transfer function and applied to the FS problem. Using the K nearest neighbor (KNN) classifier, comparative FS experiments are carried out between the proposed method and four advanced meta-heuristic algorithms on 16 UCI (University of California, Irvine) datasets. Seven evaluation metrics, such as average adaptation, average prediction accuracy, and average running time, are chosen to judge and compare the algorithms. The selected datasets are also discussed by grouping them into three categories: high-, medium-, and low-dimensional. Experimental results show that the HBCSSDBO feature selection method can obtain a good subset of features while maintaining high classification accuracy, and shows better optimization performance. In addition, the results of statistical tests confirm the significant validity of the method.
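The conversion from a continuous position to a binary feature mask via an S-type (sigmoid) transfer function can be sketched as below. The abstract does not give HBCSSDBO's exact transfer function, so the plain logistic sigmoid and the names `s_transfer` and `binarize` are assumptions for illustration.

```python
import math
import random
from typing import List, Sequence


def s_transfer(x: float) -> float:
    """Logistic sigmoid, written to avoid overflow for large negative x."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def binarize(position: Sequence[float], rng: random.Random) -> List[int]:
    """Map a continuous position vector to a 0/1 feature mask.

    Feature j is selected (1) with probability s_transfer(position[j]).
    """
    return [1 if rng.random() < s_transfer(x) else 0 for x in position]


# Hypothetical 4-feature position vector; the seed only fixes the random draws.
mask = binarize([2.0, -1.5, 0.0, 4.0], random.Random(42))
```

Large positive coordinates are therefore almost always selected and large negative ones almost never, while coordinates near zero are chosen about half the time, which preserves exploration in the binary search space.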
Federated learning is an important distributed model training technique in the Internet of Things (IoT), in which participant selection is a key component that improves training efficiency and model accuracy. This module enables a central server to select a subset of participants to perform model training based on data and device information. By doing so, selected participants are rewarded and actively perform model training, while participants that are detrimental to training efficiency and model accuracy are excluded. However, in practice, participants may suspect that the central server has miscalculated and thus not made the selection honestly. This lack of trustworthiness, which can demotivate participants, has received little attention. Another problem that has received little attention is the leakage of participants' private information during the selection process. We therefore propose a federated learning framework with auditable participant selection. It supports smart contracts in selecting a set of suitable participants based on their training loss without compromising privacy. Considering the possibility of malicious campaigning and impersonation of participants, the framework employs commitment schemes and zero-knowledge proofs to counteract these malicious behaviors. Finally, we analyze the security of the framework and conduct a series of experiments to demonstrate that the framework can effectively improve the efficiency of federated learning.
The grain protein content (GPC) is the key parameter for wheat grain nutritional quality. This study conducted a resampling GWAS analysis using 406 wheat accessions across eight environments, and identified four previously reported GPC QTLs. An analysis of 87 landraces and 259 modern cultivars revealed the loss of superior GPC haplotypes, especially in Chinese cultivars. These haplotypes were preferentially adopted in different agroecological zones and had broad effects on wheat yield and agronomic traits. Most GPC QTLs did not significantly reduce yield, suggesting that high GPC can be achieved without a yield penalty. The results of this study provide a reference for future GPC breeding in wheat using the four identified QTLs.
Speech emotion recognition (SER) uses acoustic analysis to find features for emotion recognition and examines variations in voice that are caused by emotions. The number of features acquired with acoustic analysis is extremely high, so we introduce a hybrid filter-wrapper feature selection algorithm based on an improved equilibrium optimizer for constructing an emotion recognition system. The proposed algorithm implements multi-objective emotion recognition with the minimum number of selected features and maximum accuracy. First, we use the information gain and Fisher Score to sort the features extracted from signals. Then, we employ a multi-objective ranking method to evaluate these features and assign different importance to them. Features with high rankings have a large probability of being selected. Finally, we propose a repair strategy to address the problem of duplicate solutions in multi-objective feature selection, which can improve the diversity of solutions and avoid falling into local traps. Using random forest and K-nearest neighbor classifiers, four English speech emotion datasets are employed to test the proposed algorithm (MBEO) as well as other multi-objective emotion identification techniques. The results illustrate that it performs well in inverted generational distance, hypervolume, Pareto solutions, and execution time, and that MBEO is appropriate for high-dimensional English SER.
Millimeter-wave transmission combined with Orbital Angular Momentum (OAM) has the advantage of reducing the loss of beam power and increasing the system capacity. However, to fulfill this advantage, the antennas at the transmitter and receiver must be parallel and coaxial; otherwise, the accuracy of mode detection at the receiver can be seriously influenced. In this paper, we design an OAM millimeter-wave communication system for overcoming the above limitation. Specifically, the first contribution is that the power distribution between different OAM modes and the capacity of the system with different mode sets are analytically derived for performance analysis. The second contribution is a novel mode selection scheme proposed to reduce the total interference between different modes. Numerical results show that system performance is less affected by the offset when the mode set with smaller modes or larger intervals is selected.
In Cloud Computing (CC), a Cloud Datacenter (DC) is a collection of physical servers whose performance can be hindered by bottlenecks as CC services proliferate. The Cloud Service Broker (CSB), which orchestrates DC selection, is central to CC performance. If it fails to route user requests to suitable DCs, the CSB itself becomes a bottleneck and endangers service quality. To tackle this, deploying an efficient CSB policy becomes imperative, optimizing DC selection to meet stringent Quality-of-Service (QoS) demands. Although numerous CSB policies exist, their implementation faces challenges such as cost and availability. This article presents a holistic review of diverse CSB policies while surveying the problems confronting current policies. The foremost objective is to pinpoint research gaps and remedies to guide future policy development. Additionally, it extensively clarifies the various DC selection methodologies employed in CC, benefiting practitioners and researchers alike. Employing synthetic analysis, the article systematically assesses and compares a wide range of DC selection techniques. These analytical insights equip decision-makers with a pragmatic framework for identifying the technique that fits their needs. In summary, this article underscores the importance of efficient CSB policies in DC selection and in optimizing CC performance. By emphasizing the significance of these policies and their modeling implications, the article contributes to both the general modeling discourse and its practical applications in the CC domain.
In recent years, deep learning-based signal recognition technology has gained attention and emerged as an important approach for safeguarding the electromagnetic environment. However, training deep learning-based classifiers on large signal datasets with redundant samples requires significant memory and high costs. This paper proposes a support data-based core-set selection method (SD) for signal recognition, aiming to screen a representative subset that approximates the large signal dataset. Specifically, this subset can be identified by employing the labeled information during the early stages of model training, as some training samples are frequently labeled as supporting data. This support data is crucial for model training and can be found using a border sample selector. Simulation results demonstrate that the SD method minimizes the impact on model recognition performance while reducing the dataset size, and outperforms five other state-of-the-art core-set selection methods when the fraction of training samples kept is less than or equal to 0.3 on the RML2016.04C dataset or 0.5 on the RML22 dataset. The SD method is particularly helpful for signal recognition tasks with limited memory and computing resources.
Federated learning enables data owners in the Internet of Things (IoT) to collaborate in training models without sharing private data, creating new business opportunities for building a data market. However, in practical operation, there are still some problems with federated learning applications. Blockchain has the characteristics of decentralization, distribution, and security. Blockchain-enabled federated learning further improves the security and performance of model training, while also expanding the application scope of federated learning. Blockchain has natural financial attributes that help establish a federated learning data market. However, the data of federated learning tasks may be distributed across a large number of resource-constrained IoT devices, which have different computing, communication, and storage resources, and the data quality of each device may also vary. Therefore, how to effectively select the clients with the data required for a federated learning task is a research hotspot. In this paper, a two-stage client selection scheme for blockchain-enabled federated learning is proposed. It first selects clients that satisfy the federated learning task through attribute-based encryption, protecting the attribute privacy of clients. Blockchain nodes then select some clients for local model aggregation using the proximal policy optimization algorithm. Experiments show that the model performance of our two-stage client selection scheme is higher than that of other client selection algorithms when some clients are offline and the data quality is poor.
This article addresses the issue of computing the constant required to implement a specific nonparametric subset selection procedure based on ranks of data arising in a statistical randomized block experimental design. A model of three populations and two blocks is used to compute the probability distribution of the relevant statistic, the maximum of the population rank sums minus the rank sum of the “best” population. Calculations are done for populations following a normal distribution, and for populations following a bi-uniform distribution. The least favorable configuration in these cases is shown to arise when all three populations follow identical distributions. The bi-uniform distribution leads to an asymptotic counterexample to the conjecture that the least favorable configuration, i.e., that configuration minimizing the probability of a correct selection, occurs when all populations are identically distributed. These results are consistent with other large-scale simulation studies. All relevant computational R-codes are provided in appendices.
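The statistic named above, the maximum of the population rank sums minus the rank sum of the "best" population, can be computed as in the following sketch. It is written in Python rather than the article's R and is not the authors' code: the function names, the sample observations, and the assumption of no ties within a block are illustrative only.

```python
from typing import List, Sequence


def rank_sums(blocks: Sequence[Sequence[float]]) -> List[int]:
    """Sum, over blocks, of each population's within-block rank.

    blocks[b][j] is the observation from population j in block b.
    Ranks run from 1 (smallest) to k (largest); ties are assumed absent.
    """
    k = len(blocks[0])
    sums = [0] * k
    for row in blocks:
        order = sorted(range(k), key=lambda j: row[j])
        for rank, j in enumerate(order, start=1):
            sums[j] += rank
    return sums


def selection_statistic(blocks: Sequence[Sequence[float]], best: int) -> int:
    """Maximum population rank sum minus the rank sum of population `best`."""
    sums = rank_sums(blocks)
    return max(sums) - sums[best]


# Three populations, two blocks, matching the model size in the abstract.
data = [[0.3, 1.2, 2.5],
        [1.1, 0.4, 3.0]]
value = selection_statistic(data, best=2)
```

The statistic is zero exactly when the best population attains the largest rank sum, so the procedure's constant controls how far below the maximum a population's rank sum may fall while it is still retained in the selected subset.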
Funding: Supported in part by the Natural Science Youth Foundation of Hebei Province under Grant F2019403207; in part by the PhD Research Startup Foundation of Hebei GEO University under Grant BQ2019055; in part by the Open Research Project of the Hubei Key Laboratory of Intelligent Geo-Information Processing under Grant KLIGIP-2021A06; in part by the Fundamental Research Funds for the Universities in Hebei Province under Grant QN202220; in part by the Science and Technology Research Project for Universities of Hebei under Grant ZD2020344; and in part by the Guangxi Natural Science Fund General Project under Grant 2021GXNSFAA075029.
Abstract: In classification problems, datasets often contain a large number of features, but not all of them are relevant for accurate classification. In fact, irrelevant features may even hinder classification accuracy. Feature selection aims to alleviate this issue by minimizing the number of features in the subset while simultaneously minimizing the classification error rate. Single-objective optimization approaches employ an evaluation function designed as an aggregate function with a parameter, but the results obtained depend on the value of that parameter. To eliminate this parameter's influence, the problem can be reformulated as a multi-objective optimization problem. The Whale Optimization Algorithm (WOA) is widely used in optimization problems because of its simplicity and ease of implementation. In this paper, we propose a multi-strategy assisted multi-objective WOA (MSMOWOA) to address feature selection. To enhance the algorithm's search ability, we integrate multiple strategies such as Levy flight, Grey Wolf Optimizer, and adaptive mutation into it. Additionally, we use an external repository to store non-dominated solution sets, and a grid technique to maintain diversity. Results on fourteen University of California Irvine (UCI) datasets demonstrate that the proposed method effectively removes redundant features and improves classification performance. The source code can be accessed at https://github.com/zc0315/MSMOWOA.
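The two-objective formulation above can be illustrated with a small Pareto-dominance filter. This is only a sketch of the general idea of keeping non-dominated solutions in a repository, not the MSMOWOA implementation; the names `Solution`, `dominates`, `non_dominated` and the candidate values are hypothetical and chosen for illustration.

```python
from typing import List, Tuple

# A candidate feature subset is summarised by its two objectives (both minimised):
# (number of selected features, classification error rate). Hypothetical type alias.
Solution = Tuple[int, float]


def dominates(a: Solution, b: Solution) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a[0] <= b[0] and a[1] <= b[1]
    strictly_better = a[0] < b[0] or a[1] < b[1]
    return no_worse and strictly_better


def non_dominated(solutions: List[Solution]) -> List[Solution]:
    """Keep the solutions that no other solution dominates (input order preserved)."""
    return [s for s in solutions if not any(dominates(t, s) for t in solutions)]


# Illustrative values only, not results from the paper.
candidates = [(5, 0.10), (3, 0.12), (8, 0.10), (3, 0.15), (10, 0.08)]
front = non_dominated(candidates)
```

Here `(8, 0.10)` is dropped because `(5, 0.10)` reaches the same error with fewer features, and `(3, 0.15)` is dropped because `(3, 0.12)` has a lower error with the same subset size. No weighting parameter between the two objectives is needed, which is the motivation the abstract gives for the multi-objective reformulation.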
Funding: Supported in part by the National Natural Science Foundation of China (No. 61701197); in part by the National Key Research and Development Program of China (No. 2021YFA1000500(4)); and in part by the 111 Project (No. B23008).
Abstract: In vehicle edge computing (VEC), asynchronous federated learning (AFL) is used, where the edge receives a local model and updates the global model, effectively reducing the global aggregation latency. Because vehicles differ in their amounts of local data, computing capabilities, and locations, updating the global model with the same weight for every vehicle is inappropriate. These factors affect the local computation time and the upload time of the local model, and a vehicle may also be subject to Byzantine attacks, leading to the deterioration of the vehicle data. However, based on deep reinforcement learning (DRL), we can consider these factors comprehensively to eliminate vehicles with poor performance as much as possible and to exclude vehicles that have suffered Byzantine attacks before AFL. At the same time, when aggregating in AFL, we can focus on vehicles with better performance to improve the accuracy and safety of the system. In this paper, we propose a vehicle selection scheme based on DRL in VEC. The scheme takes into account vehicle mobility, time-varying channel conditions, time-varying computational resources, different data amounts, the transmission channel status of vehicles, and Byzantine attacks. Simulation results show that the proposed scheme effectively improves the safety and accuracy of the global model.
Funding: Financial support from the National Natural Science Foundation of China (No. 62205172), the Huaneng Group Science and Technology Research Project (No. HNKJ22-H105), the Tsinghua University Initiative Scientific Research Program, and the International Joint Mission on Climate Change and Carbon Neutrality.
Abstract: Laser-induced breakdown spectroscopy (LIBS) has become a widely used atomic spectroscopic technique for rapid coal analysis. However, the vast amount of spectral information in LIBS contains signal uncertainty, which can affect its quantification performance. In this work, we propose a hybrid variable selection method to improve the performance of LIBS quantification. Important variables are first identified using Pearson's correlation coefficient, mutual information, the least absolute shrinkage and selection operator (LASSO), and random forest, and then filtered and combined with empirical variables related to fingerprint elements of coal ash content. Subsequently, these variables are fed into a partial least squares regression (PLSR). Additionally, in some models, certain variables unrelated to ash content are removed manually to study the impact of variable deselection on model performance. The proposed hybrid strategy was tested on three LIBS datasets for quantitative analysis of coal ash content and compared with the corresponding data-driven baseline method. It is significantly better than the variable selection method based only on empirical knowledge and in most cases outperforms the baseline method. On all three datasets, the hybrid strategy for variable selection combining empirical knowledge and data-driven algorithms achieved the lowest root mean square error of prediction (RMSEP) values of 1.605, 3.478 and 1.647, respectively, which were significantly lower than those obtained from multiple linear regression using only 12 empirical variables, which are 1.959, 3.718 and 2.181, respectively. The LASSO-PLSR model with empirical support and 20 selected variables exhibited a significantly improved performance after variable deselection, with RMSEP values dropping from 1.635, 3.962 and 1.647 to 1.483, 3.086 and 1.567, respectively. Such results demonstrate that using empirical knowledge as a support for data-driven variable selection can be a viable approach to improve the accuracy and reliability of LIBS quantification.
Funding: Deputyship for Research and Innovation, Ministry of Education, Saudi Arabia (IFKSUOR3-014-3).
Abstract: In this study, our aim is to address the problem of gene selection by proposing a hybrid bio-inspired evolutionary algorithm that combines Grey Wolf Optimization (GWO) with Harris Hawks Optimization (HHO) for feature selection. The motivation for utilizing GWO and HHO stems from their bio-inspired nature and their demonstrated success in optimization problems. We aim to leverage the strengths of these algorithms to enhance the effectiveness of feature selection in microarray-based cancer classification. We selected leave-one-out cross-validation (LOOCV) to evaluate the performance of two widely used classifiers, k-nearest neighbors (KNN) and support vector machine (SVM), on high-dimensional cancer microarray data. The proposed method is extensively tested on six publicly available cancer microarray datasets, and a comprehensive comparison with recently published methods is conducted. Our hybrid algorithm demonstrates its effectiveness in improving classification performance, surpassing alternative approaches in terms of precision. The outcomes confirm the capability of our method to substantially improve both the precision and efficiency of cancer classification, thereby advancing the development of more efficient treatment strategies. The proposed hybrid method offers a promising solution to the gene selection problem in microarray-based cancer classification. It improves the accuracy and efficiency of cancer diagnosis and treatment, and its superior performance compared to other methods highlights its potential applicability in real-world cancer classification tasks. By harnessing the complementary search mechanisms of GWO and HHO, we leverage their bio-inspired behavior to identify informative genes relevant to cancer diagnosis and treatment.
Abstract: Pattern matching is one of the classic categories of online portfolio selection strategies. This article studies the key aspects of this method, namely the measurement of similarity and the selection of similarity sets, and proposes a Portfolio Selection Method based on Pattern Matching with Dual Information of Direction and Distance (PMDI). By studying different combinations of indicators such as Euclidean distance, Chebyshev distance, and correlation coefficient, important information such as direction and distance is extracted from historical stock prices, and the similarity set required by pattern-matching-based portfolio selection algorithms is filtered out. A large number of experiments conducted on two datasets from real stock markets show that PMDI outperforms other algorithms in balancing income and risk. Therefore, it is suitable for real-world financial environments.
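To make the idea of a similarity set concrete, the sketch below filters historical windows by both direction (Pearson correlation) and distance (Euclidean or Chebyshev). It is a minimal illustration under stated assumptions, not the PMDI algorithm: the function names, the window length, the thresholds `min_corr` and `max_dist`, and the price-relative series are all hypothetical.

```python
import math
from typing import Callable, List, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def chebyshev(a: Sequence[float], b: Sequence[float]) -> float:
    return max(abs(x - y) for x, y in zip(a, b))


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 when either window is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def similarity_set(series: Sequence[float], window: int, min_corr: float,
                   max_dist: float,
                   dist: Callable[[Sequence[float], Sequence[float]], float] = euclidean
                   ) -> List[int]:
    """Indices t such that the window series[t-window:t] resembles the latest window
    in both direction (correlation >= min_corr) and distance (dist <= max_dist).
    Each returned t is the period that followed the matched historical window."""
    latest = series[-window:]
    matches = []
    for end in range(window, len(series)):
        past = series[end - window:end]
        if pearson(past, latest) >= min_corr and dist(past, latest) <= max_dist:
            matches.append(end)
    return matches


# Hypothetical price relatives of one asset (illustrative values only).
relatives = [1.00, 1.02, 0.99, 1.01, 1.00, 1.02, 0.99, 1.03, 1.00, 1.02, 0.99]
matched = similarity_set(relatives, window=3, min_corr=0.9, max_dist=0.02)
```

Passing `dist=chebyshev` swaps the distance measure without changing the rest of the filter, which mirrors the abstract's point that different indicator combinations can be compared.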
Funding: Major Project for New Generation of AI (Grant No. 2018AAA0100400); the Scientific Research Fund of Hunan Provincial Education Department, China (Grant Nos. 21A0350, 21C0439, 22A0408, 22A0414, 2022JJ30231, 22B0559); and the National Natural Science Foundation of Hunan Province, China (Grant No. 2022JJ50051).
Abstract: Radiomics is a non-invasive method for extracting quantitative and higher-dimensional features from medical images for diagnosis. It has received great attention in recent years due to its broad application prospects. Existing radiomics feature selection methods typically select about ten features. In this paper, a heuristic feature selection method based on frequency iteration and a multiple supervised training mode is proposed. Based on the combination between features, it decomposes all features layer by layer to select the optimal features for each layer, then fuses the optimal features to form a locally optimal group layer by layer, and finally iterates to the globally optimal combination. Compared with the current method with the best prediction performance on the three datasets, the proposed method can reduce the number of features from about ten to about three without losing classification accuracy, and can even significantly improve it. The proposed method has better interpretability and generalization ability, which gives it great potential in radiomics feature selection.
Funding: Supported by the Guangdong Province Basic and Applied Basic Research Fund Project (No. 2020A1515110826), the National Natural Science Foundation of China (No. 42006115), and the Major Scientific and Technological Projects of Hainan Province (No. ZDKJ2021036).
Abstract: Manganese superoxide dismutase (MnSOD) is an antioxidant that exists in mitochondria and can effectively remove superoxide anions there. In a dark, high-pressure, and low-temperature deep-sea environment, MnSOD is essential for the survival of sea cucumbers. Six MnSODs were identified from the transcriptomes of deep-sea and shallow-sea sea cucumbers. To explore their environmental adaptation mechanism, we conducted environmental selection pressure analysis using the branch-site model of the PAML software. We obtained night positive selection sites, and two of them were significant (97F→H, 134K→V): 97F→H is located in a highly conserved characteristic sequence, and its polarity change might have a great impact on the function of MnSOD; 134K→V involves a change in piezophilic ability, which might help MnSOD adapt to the high hydrostatic pressure of the deep sea. To further study the effect of these two positive selection sites on MnSOD, we predicted the effects of the point mutations F97H and K134V in the shallow-sea sea cucumber by using MAESTROweb and PyMOL. Results show that 97F→H and 134K→V might improve MnSOD's efficiency in scavenging superoxide anions and its ability to resist high hydrostatic pressure by moderately reducing its stability. These results indicate that the MnSODs of deep-sea sea cucumbers adapted to deep-sea environments through amino acid changes in polarity, piezophilic behavior, and local stability. This study reveals the correlation between MnSOD and extreme environments, and will help improve our understanding of organisms' adaptation mechanisms in the deep sea.
Funding: Funded by the Scientific Research Project of the Higher Education Department of Guizhou Province [Qianjiaoji 2022(187)]; the Department of Education of Guizhou Province [Guizhou Teaching and Technology (2023)015]; the Guizhou Medical University National Natural Science Foundation Cultivation Project (22NSFCP45); and the China Postdoctoral Science Foundation Project (General Program No. 2022M720929).
Abstract: Background: The heterogeneity of prognosis and treatment benefits among patients with gliomas is due to tumor microenvironment characteristics. However, biomarkers that reflect microenvironmental characteristics and predict the prognosis of gliomas are limited. Therefore, we aimed to develop a model that can effectively predict prognosis, differentiate microenvironment signatures, and optimize drug selection for patients with glioma. Materials and Methods: The CIBERSORT algorithm, bulk sequencing analysis, and single-cell RNA (scRNA) analysis were employed to identify significant cross-talk genes between M2 macrophages and cancer cells in glioma tissues. A predictive model was constructed based on cross-talk gene expression, and its effect on prognosis, recurrence prediction, and microenvironment characteristics was validated in multiple cohorts. The effect of the predictive model on drug selection was evaluated using the OncoPredict algorithm and relevant cellular biology experiments. Results: A high abundance of M2 macrophages in glioma tissues indicates poor prognosis, and cross-talk between macrophages and cancer cells plays a crucial role in shaping the tumor microenvironment. Eight genes involved in the cross-talk between macrophages and cancer cells were identified. Among them, periostin (POSTN), chitinase 3 like 1 (CHI3L1), serum amyloid A1 (SAA1), and matrix metallopeptidase 9 (MMP9) were selected to construct a predictive model. The developed model demonstrated significant efficacy in distinguishing patient prognosis, recurrent cases, and characteristics of high inflammation, hypoxia, and immunosuppression. Furthermore, this model can serve as a valuable tool for guiding the use of trametinib. Conclusions: In summary, this study provides a comprehensive understanding of the interplay between M2 macrophages and cancer cells in glioma; uses a cross-talk gene signature to develop a predictive model that can differentiate patient prognosis, recurrence instances, and microenvironment characteristics; and aids in optimizing the application of trametinib in glioma patients.
Funding: Funded by the Researchers Supporting Project Number RSPD2024R681, King Saud University, Riyadh, Saudi Arabia.
Abstract: With its rapid development and application, energy harvesting technology has become a prominent research area due to its significant benefits in terms of environmental protection, convenience, and high safety and efficiency. However, uneven energy collection and consumption among IoT devices at varying distances may lead to resource imbalance within energy harvesting networks, resulting in low energy transmission efficiency. To enhance the energy transmission efficiency of IoT devices in energy harvesting, this paper focuses on collaborative communication, along with pricing-based incentive mechanisms and auction strategies. We propose a dynamic relay selection scheme, including a ladder pricing mechanism based on energy level and a Kuhn-Munkres algorithm based on auction theory with a negotiation mechanism, to encourage more IoT devices to participate in the collaboration process. Simulation results demonstrate that the proposed algorithm outperforms traditional algorithms in terms of improving the energy efficiency of the system.
Funding: Supported by the earmarked fund for the China Agriculture Research System (CARS-35) and the National Natural Science Foundation of China (32022078); supported by the National Supercomputer Centre in Guangzhou.
Abstract: Genomic selection (GS) has been widely used in livestock and has greatly accelerated the genetic progress of complex traits. Population size is one of the significant factors affecting prediction accuracy, but it is limited in purebred populations. Compared to directly combining two uncorrelated purebred populations to extend the reference population size, it might be more meaningful to incorporate correlated crossbreds into the reference population for genomic prediction. In this study, we simulated purebred offspring (PAS and PBS) and crossbred offspring (CAB) based on real genotype data of two base purebred populations (PA and PB), to evaluate the performance of genomic selection on purebreds while incorporating crossbred information. The results showed that selecting key crossbred individuals by maximizing the expected genetic relationship (REL) was better than the other methods (individuals closest to or farthest from the purebred population, CP/FP) in terms of prediction accuracy. Furthermore, the prediction accuracy of reference populations combining PA and CAB was significantly better than that based on PA only, and was similar to that of combining PA and PAS. Moreover, the rank correlation between the multiple of the increased relationship (MIR) and reliability improvement was 0.60-0.70. But for individuals with low correlation (Cor(Pi, PA or B)), the reliability improvement was significantly lower than for other individuals. Our findings suggest that incorporating crossbreds into the purebred population could improve the performance of genetic prediction compared with using the purebred population only. The genetic relationship between the purebred and crossbred populations is a key factor determining the increase in reliability when incorporating a crossbred population in genomic prediction for purebred individuals.
Funding: Project supported by the National Key Research and Development Program of China (No. 2021YFB3400700) and the National Natural Science Foundation of China (Nos. 12422201, 12072188, 12121002, and 12372017).
Abstract: Variable selection for high-dimensional nonparametric nonlinear systems aims to select the contributing variables or to eliminate the redundant ones. For a high-dimensional nonparametric nonlinear system, however, identifying whether a variable contributes is not easy. Therefore, based on the Fourier spectrum of the density-weighted derivative, a novel variable selection approach is developed, which does not suffer from the curse of dimensionality and improves the identification accuracy. Furthermore, a necessary and sufficient condition for testing whether a variable contributes is provided. The proposed approach does not require strong assumptions on the distribution, such as an elliptical distribution. A simulation study verifies the effectiveness of the novel variable selection algorithm.
Funding: This research was funded by the project "Short-Term Electrical Load Forecasting Based on Feature Selection and Optimized LSTM with DBO", a Fundamental Scientific Research Project of the Liaoning Provincial Department of Education (JYTMS20230189), and by the project "Application of Hybrid Grey Wolf Algorithm in Job Shop Scheduling Problem" of the Research Support Plan for Introducing High-Level Talents to Shenyang Ligong University (No. 1010147001131).
Abstract: Feature Selection (FS) is a key pre-processing step in pattern recognition and data mining tasks, which can effectively avoid the impact of irrelevant and redundant features on the performance of classification models. In recent years, meta-heuristic algorithms have been widely used in FS problems, so a Hybrid Binary Chaotic Salp Swarm Dung Beetle Optimization (HBCSSDBO) algorithm is proposed in this paper to improve the effect of FS. In this hybrid algorithm, the original continuous optimization algorithm is converted into binary form by an S-type transfer function and applied to the FS problem. Using the K-nearest neighbor (KNN) classifier, comparative FS experiments are carried out between the proposed method and four advanced meta-heuristic algorithms on 16 UCI (University of California, Irvine) datasets. Seven evaluation metrics, such as average adaptation, average prediction accuracy, and average running time, are chosen to judge and compare the algorithms. The selected datasets are also discussed by categorizing them into three groups: high, medium, and low dimensions. Experimental results show that the HBCSSDBO feature selection method obtains a good subset of features while maintaining high classification accuracy, and shows better optimization performance. In addition, the results of statistical tests confirm the significant validity of the method.
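The conversion from a continuous position to a binary feature mask can be sketched with the sigmoid, a commonly used S-shaped transfer function. The abstract does not state which S-type function HBCSSDBO uses, so the sigmoid, the clamping bound, and the names `s_transfer` and `binarize` are assumptions made purely for illustration.

```python
import math
import random
from typing import List, Sequence


def s_transfer(x: float) -> float:
    """Sigmoid S-shaped transfer function mapping a real value to a probability.
    The input is clamped to [-60, 60] (an arbitrary bound) to avoid overflow in exp."""
    x = max(-60.0, min(60.0, x))
    return 1.0 / (1.0 + math.exp(-x))


def binarize(position: Sequence[float], rng: random.Random) -> List[int]:
    """Turn a continuous position into a 0/1 feature mask: feature j is selected
    when a uniform random draw falls below s_transfer(position[j])."""
    return [1 if rng.random() < s_transfer(x) else 0 for x in position]


rng = random.Random(0)           # fixed seed so the sketch is reproducible
position = [50.0, -50.0, 0.0]    # hypothetical continuous position of one search agent
mask = binarize(position, rng)   # large positive -> selected, large negative -> not selected
```

Strongly positive coordinates are almost always selected and strongly negative ones almost never, while a coordinate at zero is selected with probability 0.5, so the continuous optimizer's position acts as a per-feature selection tendency.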
Funding: Supported by the Key-Area Research and Development Program of Guangdong Province under Grant No. 2020B0101090004; the National Natural Science Foundation of China under Grant No. 62072215; the Guangzhou Basic Research Plan City-School Joint Funding Project under Grant No. 2024A03J0405; the Guangzhou Basic and Applied Basic Research Foundation under Grant No. 2024A04J3458; and the State Archives Administration Science and Technology Program Plan of China under Grant 2023-X-028.
Abstract: Federated learning is an important distributed model training technique in the Internet of Things (IoT), in which participant selection is a key component that helps improve training efficiency and model accuracy. This module enables a central server to select a subset of participants to perform model training based on data and device information. By doing so, selected participants are rewarded and actively perform model training, while participants that are detrimental to training efficiency and model accuracy are excluded. However, in practice, participants may suspect that the central server has miscalculated and thus has not made the selection honestly. This lack of trustworthiness, which can demotivate participants, has received little attention. Another problem that has received little attention is the leakage of participants' private information during the selection process. We therefore propose a federated learning framework with auditable participant selection. It supports smart contracts in selecting a set of suitable participants based on their training loss without compromising privacy. Considering the possibility of malicious campaigning and impersonation of participants, the framework employs commitment schemes and zero-knowledge proofs to counteract these malicious behaviors. Finally, we analyze the security of the framework and conduct a series of experiments to demonstrate that the framework can effectively improve the efficiency of federated learning.
Funding: Supported by the "Integration of Two Chains" Key Research and Development Projects of Shaanxi Province "Wheat Seed Industry Innovation Project", China, and the Key R&D of Yangling Seed Industry Innovation Center, China (Ylzy-xm-01).
Abstract: The grain protein content (GPC) is the key parameter for wheat grain nutritional quality. This study conducted a resampling GWAS analysis using 406 wheat accessions across eight environments, and identified four previously reported GPC QTLs. An analysis of 87 landraces and 259 modern cultivars revealed the loss of superior GPC haplotypes, especially in Chinese cultivars. These haplotypes were preferentially adopted in different agroecological zones and had broad effects on wheat yield and agronomic traits. Most GPC QTLs did not significantly reduce yield, suggesting that high GPC can be achieved without a yield penalty. The results of this study provide a reference for future GPC breeding in wheat using the four identified QTLs.
Abstract: Speech emotion recognition (SER) uses acoustic analysis to find features for emotion recognition and examines variations in voice that are caused by emotions. The number of features acquired with acoustic analysis is extremely high, so we introduce a hybrid filter-wrapper feature selection algorithm based on an improved equilibrium optimizer for constructing an emotion recognition system. The proposed algorithm implements multi-objective emotion recognition with the minimum number of selected features and maximum accuracy. First, we use the information gain and Fisher Score to sort the features extracted from signals. Then, we employ a multi-objective ranking method to evaluate these features and assign different importance to them. Features with high rankings have a large probability of being selected. Finally, we propose a repair strategy to address the problem of duplicate solutions in multi-objective feature selection, which can improve the diversity of solutions and avoid falling into local traps. Using random forest and K-nearest neighbor classifiers, four English speech emotion datasets are employed to test the proposed algorithm (MBEO) as well as other multi-objective emotion identification techniques. The results illustrate that it performs well in inverted generational distance, hypervolume, Pareto solutions, and execution time, and that MBEO is appropriate for high-dimensional English SER.
Funding: Supported in part by the National Natural Science Foundation of China (62071255, 62171232, 61771257); the Major Projects of the Natural Science Foundation of the Jiangsu Higher Education Institutions (20KJA510009); the Open Research Fund of the Key Lab of Broadband Wireless Communication and Sensor Network Technology (Nanjing University of Posts and Telecommunications), Ministry of Education (JZNY201914); the Open Research Fund of the National and Local Joint Engineering Laboratory of RF Integration and Micro-Assembly Technology, Nanjing University of Posts and Telecommunications (KFJJ20170305); the Research Fund of Nanjing University of Posts and Telecommunications (NY218012); and the Henan Province Science and Technology Research Projects, High and New Technology (No. 182102210106).
Abstract: Millimeter-wave transmission combined with Orbital Angular Momentum (OAM) has the advantage of reducing the loss of beam power and increasing the system capacity. However, to realize this advantage, the antennas at the transmitter and receiver must be parallel and coaxial; otherwise, the accuracy of mode detection at the receiver can be seriously affected. In this paper, we design an OAM millimeter-wave communication system to overcome this limitation. Specifically, the first contribution is that the power distribution between different OAM modes and the capacity of the system with different mode sets are analytically derived for performance analysis. The second contribution is a novel mode selection scheme that reduces the total interference between different modes. Numerical results show that system performance is less affected by the offset when a mode set with smaller modes or larger intervals is selected.
Abstract: In Cloud Computing (CC), a Cloud Datacenter (DC) is a collection of physical servers whose performance can be hindered by bottlenecks as CC services proliferate. The Cloud Service Broker (CSB), which orchestrates DC selection, is central to CC performance. Failure to route user requests to suitable DCs turns the CSB into a bottleneck, endangering service quality. To tackle this, deploying an efficient CSB policy becomes imperative, optimizing DC selection to meet stringent Quality-of-Service (QoS) demands. Although numerous CSB policies exist, their implementation faces challenges such as cost and availability. This article presents a holistic review of diverse CSB policies and surveys the difficulties confronted by current policies. The foremost objective is to pinpoint research gaps and remedies to stimulate future policy development. Additionally, it extensively clarifies the various DC selection methodologies employed in CC, for the benefit of practitioners and researchers alike. Employing synthetic analysis, the article systematically assesses and compares numerous DC selection techniques. These analytical insights equip decision-makers with a pragmatic framework for choosing the technique that fits their needs. In summary, this article underscores the importance of efficient CSB policies in DC selection and their role in optimizing CC performance. By emphasizing the significance of these policies and their modeling implications, the article contributes to both the general modeling discourse and its practical applications in the CC domain.
Funding: Supported by the National Natural Science Foundation of China (62371098); the Natural Science Foundation of Sichuan Province (2023NSFSC1422); the National Key Research and Development Program of China (2021YFB2900404); and the Central Universities of Southwest Minzu University (ZYN2022032).
Abstract: In recent years, deep learning-based signal recognition technology has gained attention and emerged as an important approach for safeguarding the electromagnetic environment. However, training deep learning-based classifiers on large signal datasets with redundant samples requires significant memory and high costs. This paper proposes a support data-based core-set selection method (SD) for signal recognition, aiming to screen a representative subset that approximates the large signal dataset. Specifically, this subset can be identified by employing the labeled information during the early stages of model training, as some training samples are frequently labeled as supporting data. This support data is crucial for model training and can be found using a border sample selector. Simulation results demonstrate that the SD method minimizes the impact on model recognition performance while reducing the dataset size, and outperforms five other state-of-the-art core-set selection methods when the fraction of training samples kept is less than or equal to 0.3 on the RML2016.04C dataset or 0.5 on the RML22 dataset. The SD method is particularly helpful for signal recognition tasks with limited memory and computing resources.
Abstract: Federated learning enables data owners in the Internet of Things (IoT) to collaborate in training models without sharing private data, creating new business opportunities for building a data market. However, in practical operation, there are still some problems with federated learning applications. Blockchain has the characteristics of decentralization, distribution, and security. Blockchain-enabled federated learning further improves the security and performance of model training, while also expanding the application scope of federated learning. Blockchain has natural financial attributes that help establish a federated learning data market. However, the data of federated learning tasks may be distributed across a large number of resource-constrained IoT devices, which have different computing, communication, and storage resources, and the data quality of each device may also vary. Therefore, how to effectively select the clients with the data required for a federated learning task is a research hotspot. In this paper, a two-stage client selection scheme for blockchain-enabled federated learning is proposed. It first selects clients that satisfy the federated learning task through attribute-based encryption, protecting the attribute privacy of clients. Then blockchain nodes select some clients for local model aggregation using the proximal policy optimization algorithm. Experiments show that the model performance of our two-stage client selection scheme is higher than that of other client selection algorithms when some clients are offline and the data quality is poor.
Abstract: This article addresses the issue of computing the constant required to implement a specific nonparametric subset selection procedure based on ranks of data arising in a statistical randomized block experimental design. A model of three populations and two blocks is used to compute the probability distribution of the relevant statistic, the maximum of the population rank sums minus the rank sum of the “best” population. Calculations are done for populations following a normal distribution, and for populations following a bi-uniform distribution. The least favorable configuration in these cases is shown to arise when all three populations follow identical distributions. The bi-uniform distribution leads to an asymptotic counterexample to the conjecture that the least favorable configuration, i.e., that configuration minimizing the probability of a correct selection, occurs when all populations are identically distributed. These results are consistent with other large-scale simulation studies. All relevant computational R code is provided in appendices.
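The statistic described above can be computed directly from within-block ranks. The following is a small sketch in Python rather than the article's R code, assuming continuous data with no ties and rank 1 for the smallest observation in a block; the function names and the sample observations are hypothetical.

```python
from typing import List, Sequence


def block_ranks(block: Sequence[float]) -> List[int]:
    """Rank the observations within one block (1 = smallest); assumes no ties."""
    order = sorted(range(len(block)), key=lambda j: block[j])
    ranks = [0] * len(block)
    for rank, j in enumerate(order, start=1):
        ranks[j] = rank
    return ranks


def rank_sums(data: Sequence[Sequence[float]]) -> List[int]:
    """data[b][i] is the observation from population i in block b.
    Returns the sum of within-block ranks for each population."""
    sums = [0] * len(data[0])
    for block in data:
        for i, rank in enumerate(block_ranks(block)):
            sums[i] += rank
    return sums


def selection_statistic(data: Sequence[Sequence[float]], best: int) -> int:
    """Maximum of the population rank sums minus the rank sum of the 'best' population."""
    sums = rank_sums(data)
    return max(sums) - sums[best]


# Three populations, two blocks, matching the model size in the abstract
# (the observations themselves are made up for illustration).
data = [[0.3, 1.2, 0.9],
        [0.5, 0.1, 2.0]]
sums = rank_sums(data)                       # within-block ranks are [1, 3, 2] and [2, 1, 3]
stat_if_best_is_last = selection_statistic(data, best=2)
```

With these observations the rank sums are 3, 4, and 5, so the statistic is 0 when the third population is the "best" one and 2 when the first is. Repeating this over many simulated datasets gives an empirical distribution of the statistic, from which a selection constant could be read off.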