Datasets with an imbalanced class distribution are difficult to handle with standard classification algorithms. In supervised learning, class imbalance is still considered a challenging research problem. Many machine learning techniques are designed to operate on balanced datasets; therefore, various under-sampling, over-sampling and hybrid strategies have been proposed to deal with imbalanced datasets, but highly skewed datasets still pose problems of generalization and noise generation during resampling. To overcome these problems, this paper proposes a majority clustering model for the classification of imbalanced datasets, known as MCBC-SMOTE (Majority Clustering for Balanced Classification-SMOTE). The model provides a method to convert a binary classification problem into a multi-class problem. In the proposed algorithm, the number of clusters for the majority class is calculated using the elbow method, and the minority class is over-sampled as an average of the clustered majority classes to generate a symmetrical class distribution. The proposed technique is cost-effective, reduces noise generation, and eliminates the imbalances present between and within classes. Evaluations on diverse real datasets show better classification results than existing state-of-the-art methodologies on several performance metrics.
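The over-sampling step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, the target size is assumed to be the mean size of the majority clusters (one reading of "an average of the clustered majority classes"), and synthetic points are interpolated toward the single nearest minority neighbour, whereas standard SMOTE picks among k neighbours.

```python
import math
import random

def target_minority_size(majority_cluster_sizes):
    """Mean size of the majority clusters (assumed reading of the abstract)."""
    return round(sum(majority_cluster_sizes) / len(majority_cluster_sizes))

def smote_oversample(minority, target_size, seed=0):
    """Grow `minority` to `target_size` with SMOTE-style interpolation.

    Needs at least two distinct minority samples.
    """
    rng = random.Random(seed)
    samples = [tuple(p) for p in minority]
    synthetic = []
    while len(samples) + len(synthetic) < target_size:
        base = rng.choice(samples)
        # Simplification: use only the single nearest minority neighbour.
        neighbour = min((p for p in samples if p != base),
                        key=lambda p: math.dist(p, base))
        gap = rng.random()  # position on the segment from base to neighbour
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return samples + synthetic
```

Every synthetic point lies on a segment between two real minority samples, so the over-sampled class stays inside the region the minority data already occupies.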
This paper offers a symbiosis-based hybrid modified DNA-ABC optimization algorithm, which combines modified DNA concepts and the artificial bee colony (ABC) algorithm to aid hierarchical fuzzy classification. According to the literature, the ABC algorithm is traditionally applied to constrained and unconstrained problems; in the present research it is combined with modified DNA concepts and implemented for fuzzy classification. Moreover, to the best of our knowledge, previous research on the ABC algorithm has not combined it with DNA computing for hierarchical fuzzy classification to explore the merits of cooperative coevolution. Therefore, this paper is the first to apply the mechanism of symbiosis to create a hybrid modified DNA-ABC algorithm for hierarchical fuzzy classification applications. In this study, the partition number and the shape of the membership function are extracted by the symbiosis-based hybrid modified DNA-ABC optimization algorithm, which provides both sufficient global exploration and adequate local exploitation for hierarchical fuzzy classification. The proposed optimization algorithm is applied to five benchmark University of California, Irvine (UCI) data sets, and the results demonstrate the efficiency of the algorithm.
Purpose: A text-generation-based multidisciplinary problem identification method is proposed, which does not rely on a large amount of data annotation. Design/methodology/approach: The proposed method first identifies the research objective types and disciplinary labels of papers using a text classification technique; second, it generates abstractive titles for each paper based on its abstract and research objective types using a generative pre-trained language model; third, it extracts problem phrases from the generated titles according to regular expression rules; fourth, it creates problem relation networks and identifies identical problems by exploiting a weighted community detection algorithm; finally, it identifies multidisciplinary problems based on the disciplinary labels of papers. Findings: Experiments in the "Carbon Peaking and Carbon Neutrality" field show that the proposed method can effectively identify multidisciplinary research problems. The disciplinary distribution of the identified problems is consistent with our understanding of multidisciplinary collaboration in the field. Research limitations: The proposed method needs to be applied in other multidisciplinary fields to validate its effectiveness. Practical implications: Multidisciplinary problem identification helps governments gather multidisciplinary forces to solve complex real-world problems, helps research management authorities fund valuable multidisciplinary problems, and helps researchers borrow ideas from other disciplines. Originality/value: This approach proposes a novel multidisciplinary problem identification method based on text generation, which identifies multidisciplinary problems from generated abstractive titles of papers without the data annotation required by standard sequence labeling techniques.
High-dimensional datasets present significant challenges for classification tasks. Dimensionality reduction, a crucial aspect of data preprocessing, has gained substantial attention due to its ability to improve classification performance. However, identifying the optimal features within high-dimensional datasets remains a computationally demanding task, necessitating the use of efficient algorithms. This paper introduces the Arithmetic Optimization Algorithm (AOA), a novel approach for finding the optimal feature subset. AOA is specifically modified to address feature selection problems based on a transfer function. Additionally, two enhancements are incorporated into the AOA algorithm to overcome limitations such as limited precision, slow convergence, and susceptibility to local optima. The first enhancement proposes a new method for selecting solutions to be improved during the search process. This method effectively improves the original algorithm's accuracy and convergence speed. The second enhancement introduces a local search with neighborhood strategies (AOA_NBH) during the AOA exploitation phase. AOA_NBH explores the vast search space, aiding the algorithm in escaping local optima. Our results demonstrate that incorporating neighborhood methods enhances the output and achieves significant improvement over state-of-the-art methods.
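The abstract does not say which transfer function is used to adapt AOA to feature selection. As an illustration only, the sketch below assumes a common S-shaped (sigmoid) transfer function that turns a continuous search position into a binary feature mask; `s_transfer` and `binarize` are hypothetical names, not taken from the paper.

```python
import math
import random

def s_transfer(x):
    """Sigmoid transfer function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def binarize(position, rng):
    """Map a continuous position to a feature mask (1 = feature selected)."""
    # Each dimension is selected with probability s_transfer(value).
    return [1 if rng.random() < s_transfer(v) else 0 for v in position]

mask = binarize([2.5, -3.0, 0.1], random.Random(42))  # one mask bit per feature
```

Strongly positive coordinates are almost always selected and strongly negative ones almost never, which lets a continuous optimizer search over feature subsets.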
Hierarchical Support Vector Machine (H-SVM) is faster in training and classification than other common multi-class SVMs such as "1-V-R" and "1-V-1". In this paper, a new multi-class fault diagnosis algorithm based on H-SVM is proposed and applied to an aero-engine. Before SVM training, the training data are first clustered according to their class-center Euclidean distances in some feature spaces. Samples that are close to each other are assigned to the same sub-classes for training, which gives the H-SVM a reasonable hierarchical structure and good generalization performance. Instead of the common C-SVM, the v-SVM is selected as the binary classifier, in which the parameter v varies only from 0 to 1 and can be determined more easily. The simulation results show that the designed H-SVMs can quickly diagnose multi-class single faults and combination faults for the gas path components of an aero-engine. The fault classifiers have good diagnostic accuracy and remain robust even when the measurement inputs are disturbed by noise.
Multi-source multi-class classification methods based on multi-class Support Vector Machines and data fusion strategies are proposed in this paper. Centralized and distributed fusion schemes are applied to combine information from several data sources. In the centralized scheme, all information from the data sources is combined to construct a single input space, and a multi-class Support Vector Machine classifier is then trained on it. In the distributed schemes, the individual data sources are processed separately and modelled using the multi-class Support Vector Machine. New data fusion strategies are then proposed to combine the information from the individual multi-class Support Vector Machine models. Our proposed fusion strategies take into account that a Support Vector Machine (SVM) classifier achieves classification by finding the optimal classification hyperplane with maximal margin. The proposed methods are applied to fault diagnosis of a diesel engine. The experimental results show that almost all the proposed approaches can largely improve the diagnostic accuracy. The robustness of diagnosis is also improved by the data fusion strategies. The proposed methods can also be applied in other fields.
In this paper, a multi-point boundary value problem for a third-order nonlinear differential equation is considered. With the help of the coincidence theorem due to Mawhin, an existence theorem is obtained.
A technique is developed for finding a closed-form expression for the cumulative distribution function of the maximum value of the objective function in a stochastic linear programming problem, where either the objective function coefficients or the right hand side coefficients are continuous random vectors with known probability distributions. This is the "wait and see" problem of stochastic linear programming. Explicit results for the distribution problem are extremely difficult to obtain; indeed, previous results are known only if the right hand side coefficients have an exponential distribution [1]. To date, no explicit results have been obtained for stochastic c, and no new results of any form have appeared since the 1970s. In this paper, we obtain the first results for stochastic c, and new explicit results if b and c are stochastic vectors with an exponential, gamma, uniform, or triangle distribution. A transformation is utilized that greatly reduces computational time.
Funding: Supported by the National Natural Science Foundation of China (61703131, 61703129, 61701148, 61703128)
Abstract: The basic idea of multi-class classification is decomposition: a multi-class classification task is decomposed into several binary classification tasks. In order to improve the accuracy of multi-class classification when samples are insufficient, this paper proposes a multi-class classification method combining K-means and multi-task relationship learning (MTRL). The method first uses the One vs. Rest split to decompose the multi-class classification task into binary classification tasks. K-means is then used to down-sample the dataset of each task, which can prevent over-fitting of the model while reducing training costs. Finally, the sampled datasets are fed to MTRL, and multiple binary classifiers are trained together. With the help of MTRL, the method can exploit inter-task associations during training and thereby improve the classification accuracy of each binary classifier. The effectiveness of the proposed approach is demonstrated by experimental results on the Iris dataset, Wine dataset, Multiple Features dataset, Wireless Indoor Localization dataset and Avila dataset.
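The first two steps of this abstract (One vs. Rest decomposition and K-means down-sampling) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's procedure: all function names are hypothetical, the centroids are seeded with the first k points, and down-sampling keeps the real sample nearest each centroid. The abstract does not specify these details, and the MTRL training step is not shown.

```python
import math

def one_vs_rest_labels(labels):
    """One binary task per class: 1 for that class, 0 for the rest."""
    classes = sorted(set(labels))
    return {c: [1 if y == c else 0 for y in labels] for c in classes}

def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means; the first k points seed the centroids (assumption)."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centroids[j]))
            clusters[nearest].append(p)
        for j, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids

def downsample(points, k):
    """Keep, for each centroid, the real sample closest to it."""
    kept = []
    for c in kmeans(points, k):
        nearest = min(points, key=lambda p: math.dist(p, c))
        if nearest not in kept:
            kept.append(nearest)
    return kept
```

For example, `one_vs_rest_labels(["a", "b", "a"])` yields one label vector per class, and `downsample(points, k)` returns at most k representative samples for a task.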
Abstract: The accurate identification and classification of various power quality disturbances are key to ensuring high-quality electrical energy. In this study, statistical characteristics of the wavelet transform coefficients of the disturbance signal, together with the wavelet transform energy distribution, constitute the feature vectors. These vectors are then used to train and test SVM multi-class algorithms. Experimental results demonstrate that the SVM multi-class algorithms, which use the Gaussian radial basis function, exponential radial basis function, and hyperbolic tangent function as basis functions, are suitable methods for power quality disturbance classification.
Abstract: Inverse problems for the motion of dynamic systems described by systems of ordinary differential equations are examined. A classification of this type of inverse problem is given. It is shown that inverse problems can be divided into two types: synthesis inverse problems and inverse problems of measurement (recognition). Each type requires a separate approach to problem statements and solution methods. A regularization method for obtaining stable solutions of inverse problems is suggested. In some cases, instead of recognizing the solution of an inverse problem, an estimate of the solution can be used. Within the framework of this approach, two practical inverse problems of measurement are considered.
Abstract: Big data is a term that refers to a set of data that, due to its size or complexity, cannot be stored or processed with the usual tools or applications for data management, and it has become a prominent term in recent years owing to the massive development of technology. Almost immediately thereafter, the term "big data mining" emerged, i.e., mining from big data, as an emerging and interconnected field of research. Classification is an important stage in data mining since it helps people make better decisions in a variety of situations, including scientific endeavors, biomedical research, and industrial applications. The probabilistic neural network (PNN) is a commonly used and successful method for handling classification and pattern recognition problems. In this study, the authors propose combining the probabilistic neural network (PNN), which is one of the data mining techniques, with the vibrating particles system (VPS), which is one of the metaheuristic algorithms, in a method named "VPS-PNN" to solve classification problems more effectively. The suggested method was tested on eleven common benchmark medical datasets from the machine-learning library. The suggested VPS-PNN mechanism outperforms the PNN, biogeography-based optimization, the enhanced water cycle algorithm (E-WCA) and the firefly algorithm (FA) in terms of convergence speed and classification accuracy.
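For readers unfamiliar with the PNN mentioned in this abstract, a textbook Parzen-window PNN with Gaussian kernels can be sketched as below. This is only the plain classifier: the VPS-based tuning described in the abstract is not shown, and `pnn_predict` and the fixed smoothing parameter `sigma` are assumptions made for illustration.

```python
import math

def pnn_predict(train_x, train_y, x, sigma=1.0):
    """Classify x with a basic probabilistic neural network."""
    scores = {}
    counts = {}
    for xi, yi in zip(train_x, train_y):
        # Pattern layer: Gaussian kernel response of one training sample.
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, xi))
        scores[yi] = scores.get(yi, 0.0) + math.exp(-sq_dist / (2.0 * sigma ** 2))
        counts[yi] = counts.get(yi, 0) + 1
    # Summation layer: average response per class; decision layer: argmax.
    return max(scores, key=lambda c: scores[c] / counts[c])
```

The only free parameter in this form is `sigma`, which is the kind of quantity a metaheuristic could be used to tune.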
Abstract: The establishment of a unified land use classification system is the basis for realizing the unified management of land and sea, urban and rural areas, and aboveground and underground space. In November 2020, the Ministry of Natural Resources of the People's Republic of China issued the Classification Guide for Land and Space Survey, Planning and Use Control of Land and Sea (for Trial Implementation), which aims to establish a national unified land and sea use classification system, lay an important foundation for scientific planning and unified management of natural resources, promote the rational use and protection of natural resources, and speed up the construction of a new pattern of land and space development and protection. However, there are still some obvious shortcomings in the Classification Guide. This paper analyzes problems in this classification standard from three aspects (logicality, rigorousness and comprehensiveness) and puts forward suggestions for further improvement. This has important practical significance for better guiding the practice of land use and land resources management, and thus for achieving the goal of unified management of natural resources.
Abstract: A holistic analysis of problem and incident tickets in a real production cloud service environment is presented in this paper. By extracting different bags of words, we use principal component analysis (PCA) to examine the clustering characteristics of these tickets. Then K-means and latent Dirichlet allocation (LDA) are applied to show the potential clusters within this cloud environment. The second part of our study uses a pre-trained bidirectional encoder representations from transformers (BERT) model to classify the tickets, with the goal of predicting the optimal dispatching department for a given ticket. Experimental results show that, due to the unique characteristics of ticket descriptions, pre-processing with domain knowledge turns out to be critical in both clustering and classification. Our classification model yields 86% accuracy when predicting the target dispatching department.
Funding: This paper is partially supported by the National Natural Foundation of China (Grant No. 61772561), the Key Research & Development Plan of Hunan Province (Grant No. 2018NK2012), the Postgraduate Research and Innovative Project of Central South University of Forestry and Technology (Grant No. 20183012), the Graduate Education and Teaching Reform Project of Central South University of Forestry and Technology (Grant No. 2018JG005), and the Teaching Reform Project of Central South University of Forestry and Technology (Grant No. 20180682).
Abstract: With the development of deep learning and Convolutional Neural Networks (CNNs), the accuracy of automatic food recognition based on visual data has significantly improved. Some research studies have shown that the deeper the model is, the higher the accuracy is. However, very deep neural networks are affected by the overfitting problem and also consume huge computing resources. In this paper, a new classification scheme is proposed for automatic food-ingredient recognition based on deep learning. We construct an up-to-date combinational convolutional neural network (CBNet) with a subnet merging technique. Firstly, two different neural networks are utilized for learning features of interest. Then, a well-designed feature fusion component aggregates the features from the subnetworks, further extracting richer and more precise features for image classification. In order to learn more complementary features, corresponding fusion strategies are also proposed, including auxiliary classifiers and hyperparameter settings. Finally, CBNet based on the well-known VGGNet, ResNet and DenseNet is evaluated on a dataset including 41 major categories of food ingredients and 100 images for each category. Theoretical analysis and experimental results demonstrate that CBNet achieves promising accuracy for multi-class classification and improves the performance of convolutional neural networks.
Funding: Sponsored by the National Natural Science Foundation of China (Grant No. 61201310), the Fundamental Research Funds for the Central Universities (Grant No. HIT.NSRIF.201160), and the China Postdoctoral Science Foundation (Grant No. 20110491067)
Abstract: Based on the framework of support vector machines (SVM) using the one-against-one (OAO) strategy, a new multi-class kernel method based on the directed acyclic graph (DAG) and probabilistic distance is proposed to raise multi-class classification accuracy. The topology of the DAG is constructed by rearranging the sequence of nodes in the graph. Evaluating the DAG is equivalent to operating SVMs on a list in a guided manner, and the classification performance depends on the sequence of nodes in the graph. The Jeffries-Matusita distance (JMD) is introduced to estimate the separability of each class, and the implementation list is initialized with all classes organized in a certain sequence. To verify the effectiveness of the proposed method, numerical analysis is conducted on UCI data and hyperspectral data. Comparative studies using the standard OAO and DAG classification methods are also conducted, and the results illustrate the better performance and higher accuracy of the proposed JMD-DAG method.
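Two ideas in this abstract lend themselves to a short sketch: the Jeffries-Matusita distance as a separability measure, and DAG evaluation as elimination on a list. The code below is illustrative only. It assumes univariate Gaussian class models and the `2 * (1 - exp(-B))` form of the JM distance, where `B` is the Bhattacharyya distance; the `prefer` callback is a hypothetical stand-in for a trained pairwise SVM. How the paper orders the list from the JMD values is not stated in the abstract and is not reproduced here.

```python
import math

def bhattacharyya_1d(m1, s1, m2, s2):
    """Bhattacharyya distance between two univariate Gaussians (mean, std)."""
    v1, v2 = s1 ** 2, s2 ** 2
    return (0.25 * (m1 - m2) ** 2 / (v1 + v2)
            + 0.5 * math.log((v1 + v2) / (2.0 * s1 * s2)))

def jm_distance(m1, s1, m2, s2):
    """Jeffries-Matusita distance: 0 for identical classes, approaching 2."""
    return 2.0 * (1.0 - math.exp(-bhattacharyya_1d(m1, s1, m2, s2)))

def dag_predict(class_list, prefer):
    """Evaluate a decision DAG as elimination on an ordered class list.

    prefer(a, b) returns whichever of the two classes the pairwise
    classifier votes for; the losing class is removed from the list.
    """
    remaining = list(class_list)
    while len(remaining) > 1:
        first, last = remaining[0], remaining[-1]
        if prefer(first, last) == first:
            remaining.pop()       # the last class is eliminated
        else:
            remaining.pop(0)      # the first class is eliminated
    return remaining[0]
```

With K classes the list evaluation calls only K - 1 pairwise classifiers per sample, which is why the initial ordering of the list can affect the result.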
Funding: The National Defence Basic Research Program in China (No. S0500A001), the National High Technology Research and Development Program of China (863 Program) (No. 2002AA411030), and the Scientific and Technological Innovation Foundation of Jiangsu Province in China.
Abstract: A modified multisurface "proximal support vector machine classifier via generalized eigenvalues" (GEPSVM for short) was proposed. By defining a new principle, we designed a new classification approach via GEPSVM, namely the maximum or minimum plane distance GEPSVM (MPDGEPSVM). Unlike GEPSVM, our approach obtains two planes by solving two simple eigenvalue problems, so it avoids singularity problems. Compared with GEPSVM, our approach has better classification performance. Moreover, MPDGEPSVM is over one order of magnitude faster than GEPSVM, and almost two orders of magnitude faster than SVM. Computational results on public datasets from the UCI database illustrate the efficiency of MPDGEPSVM.
Funding: Supported by the National Defense Pre-Research Foundation of China (Grant No. 9140A05070107BQ0204).
Abstract: In this paper, a new multiclass classification algorithm is proposed based on the idea of Locally Linear Embedding (LLE), to avoid a defect of traditional manifold learning algorithms, which cannot deal with new sample points. The algorithm defines an error as a criterion by computing a sample's reconstruction weight using LLE. Furthermore, the existence and characteristics of a low-dimensional manifold in range-profile time-frequency information are explored using a manifold learning algorithm, addressing the problem of target recognition for high range resolution MilliMeter-Wave (MMW) radar. The new algorithm is applied to radar target recognition. The experimental results show that the algorithm is efficient. Compared with other classification algorithms, our method improves the recognition precision, and the result is not sensitive to input parameters.
Funding: This research was supported by Taif University Researchers Supporting Project number (TURSP-2020/254), Taif University, Taif, Saudi Arabia.
Abstract: Datasets with an imbalanced class distribution are difficult to handle with standard classification algorithms. In supervised learning, dealing with class imbalance is still considered a challenging research problem. Various machine learning techniques are designed to operate on balanced datasets; therefore, in the state of the art, different undersampling, oversampling and hybrid strategies have been proposed to deal with imbalanced datasets, but highly skewed datasets still pose the problem of generalization and noise generation during resampling. To overcome these problems, this paper proposes a majority clustering model for the classification of imbalanced datasets known as MCBC-SMOTE (Majority Clustering for Balanced Classification-SMOTE). The model provides a method to convert a binary classification problem into a multi-class problem. In the proposed algorithm, the number of clusters for the majority class is calculated using the elbow method, and the minority class is oversampled as an average of the clustered majority classes to generate a symmetrical class distribution. The proposed technique is cost-effective, reduces the problem of noise generation and successfully eliminates the imbalances present between and within classes. Evaluations on diverse real datasets show that it gives better classification results than existing state-of-the-art methods on several performance metrics.
Abstract: This paper offers a symbiosis-based hybrid modified DNA-ABC optimization algorithm, which combines modified DNA concepts and the artificial bee colony (ABC) algorithm to aid hierarchical fuzzy classification. According to the literature, the ABC algorithm is traditionally applied to constrained and unconstrained problems, but in the present research it is combined with modified DNA concepts and implemented for fuzzy classification. Moreover, to the best of our knowledge, previous research on the ABC algorithm has not combined it with DNA computing for hierarchical fuzzy classification to explore the merits of cooperative coevolution. Therefore, this paper is the first to apply the mechanism of symbiosis to create a hybrid modified DNA-ABC algorithm for hierarchical fuzzy classification applications. In this study, the partition number and the shape of the membership function are extracted by the symbiosis-based hybrid modified DNA-ABC optimization algorithm, which provides both sufficient global exploration and adequate local exploitation for hierarchical fuzzy classification. The proposed optimization algorithm is applied to five benchmark University of California, Irvine (UCI) data sets, and the results demonstrate the efficiency of the algorithm.
Funding: Supported by the General Projects of ISTIC Innovation Foundation "Problem innovation solution mining based on text generation model" (MS2024-03).
Abstract: Purpose: A text generation based multidisciplinary problem identification method is proposed, which does not rely on a large amount of data annotation. Design/methodology/approach: The proposed method first identifies the research objective types and disciplinary labels of papers using a text classification technique; second, it generates abstractive titles for each paper based on the abstract and research objective types using a generative pre-trained language model; third, it extracts problem phrases from the generated titles according to regular expression rules; fourth, it creates problem relation networks and identifies identical problems by exploiting a weighted community detection algorithm; finally, it identifies multidisciplinary problems based on the disciplinary labels of papers. Findings: Experiments in the "Carbon Peaking and Carbon Neutrality" field show that the proposed method can effectively identify multidisciplinary research problems. The disciplinary distribution of the identified problems is consistent with our understanding of multidisciplinary collaboration in the field. Research limitations: It is necessary to use the proposed method in other multidisciplinary fields to validate its effectiveness. Practical implications: Multidisciplinary problem identification helps governments to gather multidisciplinary forces to solve complex real-world problems, research management authorities to fund valuable multidisciplinary problems, and researchers to borrow ideas from other disciplines. Originality/value: This approach proposes a novel multidisciplinary problem identification method based on text generation, which identifies multidisciplinary problems from generated abstractive titles of papers without the data annotation required by standard sequence labeling techniques.
Abstract: High-dimensional datasets present significant challenges for classification tasks. Dimensionality reduction, a crucial aspect of data preprocessing, has gained substantial attention due to its ability to improve classification performance. However, identifying the optimal features within high-dimensional datasets remains a computationally demanding task, necessitating the use of efficient algorithms. This paper introduces the Arithmetic Optimization Algorithm (AOA), a novel approach for finding the optimal feature subset. AOA is specifically modified to address feature selection problems based on a transfer function. Additionally, two enhancements are incorporated into the AOA algorithm to overcome limitations such as limited precision, slow convergence, and susceptibility to local optima. The first enhancement proposes a new method for selecting solutions to be improved during the search process. This method effectively improves the original algorithm's accuracy and convergence speed. The second enhancement introduces a local search with neighborhood strategies (AOA_NBH) during the AOA exploitation phase. AOA_NBH explores the vast search space, aiding the algorithm in escaping local optima. Our results demonstrate that incorporating neighborhood methods enhances the output and achieves significant improvement over state-of-the-art methods.
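How a transfer function turns a continuous optimizer into a feature selector can be sketched as follows. The abstract does not say which transfer function is used, so this example assumes a common S-shaped (sigmoid) one: each continuous coordinate is mapped to a probability, and the feature is selected when a uniform random draw falls below it. The names `sigmoid_transfer` and `binarize` are hypothetical and are not taken from the paper.

```python
import math
import random
from typing import List, Sequence


def sigmoid_transfer(value: float) -> float:
    """S-shaped transfer function mapping a real value into [0, 1].

    Written in a numerically stable form so large negative inputs
    do not overflow math.exp.
    """
    if value >= 0:
        return 1.0 / (1.0 + math.exp(-value))
    e = math.exp(value)
    return e / (1.0 + e)


def binarize(position: Sequence[float], rng: random.Random) -> List[int]:
    """Turn a continuous candidate solution into a 0/1 feature mask.

    Feature d is selected (1) with probability sigmoid_transfer(position[d]).
    """
    return [1 if rng.random() < sigmoid_transfer(v) else 0 for v in position]


rng = random.Random(0)                      # fixed seed for repeatability
mask = binarize([-8.0, 0.3, 8.0], rng)      # one entry per candidate feature
selected = [d for d, bit in enumerate(mask) if bit == 1]
print(mask, selected)
```

A wrapper method would then score each mask, for instance by the validation accuracy of a classifier trained on the selected features, and feed that score back to the optimizer.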
Funding: University Science Foundation of Jiangsu Province (04KJD510018).
Abstract: The Hierarchical Support Vector Machine (H-SVM) is faster in training and classification than other common multi-class SVMs such as "1-V-R" and "1-V-1". In this paper, a new multi-class fault diagnosis algorithm based on H-SVM is proposed and applied to an aero-engine. Before SVM training, the training data are first clustered according to the Euclidean distances between their class centers in some feature spaces. Samples that are close to each other are placed in the same sub-classes for training, which gives the H-SVM a reasonable hierarchical construction and good generalization performance. Instead of the common C-SVM, the ν-SVM is selected as the binary classifier, in which the parameter ν varies only from 0 to 1 and can be determined more easily. The simulation results show that the designed H-SVMs can quickly diagnose multi-class single faults and combination faults for the gas path components of an aero-engine. The fault classifiers have good diagnostic accuracy and remain robust even when the measurement inputs are disturbed by noise.
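The clustering step that shapes the hierarchy can be sketched under stated assumptions. The abstract does not give the clustering rule, so this example assumes a simple one: compute each class center, take the two centers farthest apart as seeds, and assign every class to the nearer seed. The function names and the toy data are hypothetical, and the binary SVM that would be trained at each split is omitted.

```python
import math
from typing import Dict, List, Sequence, Tuple


def class_centers(samples: Dict[str, Sequence[Sequence[float]]]) -> Dict[str, List[float]]:
    """Mean feature vector of each class."""
    centers = {}
    for label, vectors in samples.items():
        dim = len(vectors[0])
        centers[label] = [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]
    return centers


def split_by_center_distance(centers: Dict[str, List[float]]) -> Tuple[List[str], List[str]]:
    """Split the classes into two groups around the two most distant centers."""
    labels = list(centers)
    if len(labels) < 2:
        raise ValueError("need at least two classes to split")
    pairs = [(a, b) for i, a in enumerate(labels) for b in labels[i + 1:]]
    # Seeds: the pair of classes whose centers are farthest apart.
    seed_a, seed_b = max(pairs, key=lambda p: math.dist(centers[p[0]], centers[p[1]]))
    left, right = [], []
    for label in labels:
        to_a = math.dist(centers[label], centers[seed_a])
        to_b = math.dist(centers[label], centers[seed_b])
        (left if to_a <= to_b else right).append(label)
    return left, right


# Toy data: three fault classes with two features each (made-up values).
samples = {
    "f1": [[0.0, 0.0], [0.0, 2.0]],
    "f2": [[1.0, 1.0], [1.0, 3.0]],
    "f3": [[10.0, 10.0], [12.0, 10.0]],
}
left, right = split_by_center_distance(class_centers(samples))
print(left, right)
```

Applying the split recursively to each group until every group holds one class yields a binary tree, and one binary classifier per internal node separates its two groups.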
Abstract: Multi-source multi-class classification methods based on multi-class Support Vector Machines and data fusion strategies are proposed in this paper. Centralized and distributed fusion schemes are applied to combine information from several data sources. In the centralized scheme, all information from the data sources is combined to construct a single input space, and a multi-class Support Vector Machine classifier is then trained. In the distributed schemes, the individual data sources are processed separately and modelled using the multi-class Support Vector Machine. New data fusion strategies are then proposed to combine the information from the individual multi-class Support Vector Machine models. Our proposed fusion strategies take into account that a Support Vector Machine (SVM) classifier achieves classification by finding the optimal classification hyperplane with maximal margin. The proposed methods are applied to fault diagnosis of a diesel engine. The experimental results show that almost all the proposed approaches can largely improve the diagnostic accuracy. The robustness of diagnosis is also improved by the data fusion strategies. The proposed methods can also be applied in other fields.
Funding: Supported by Nature Science Foundation of Education Department of Henan Province (2010A110023).
Abstract: In this paper, a multi-point boundary value problem for a third-order nonlinear differential equation is considered. With the help of the coincidence theorem due to Mawhin, an existence theorem is obtained.
Abstract: A technique is developed for finding a closed-form expression for the cumulative distribution function of the maximum value of the objective function in a stochastic linear programming problem, where either the objective function coefficients or the right-hand side coefficients are continuous random vectors with known probability distributions. This is the "wait and see" problem of stochastic linear programming. Explicit results for the distribution problem are extremely difficult to obtain; indeed, previous results are known only if the right-hand side coefficients have an exponential distribution [1]. To date, no explicit results have been obtained for stochastic c, and no new results of any form have appeared since the 1970s. In this paper, we obtain the first results for stochastic c, and new explicit results if b and c are stochastic vectors with an exponential, gamma, uniform, or triangle distribution. A transformation is utilized that greatly reduces computational time.