Funding: Supported by the National Key Research and Development Program of China (2018YFB1003700), the Scientific and Technological Support Project (Society) of Jiangsu Province (BE2016776), the "333" Project of Jiangsu Province (BRA2017228, BRA2017401), and the Talent Project in Six Fields of Jiangsu Province (2015-JNHB-012).
Abstract: For imbalanced datasets, the focus of classification is to identify samples of the minority class. Current data mining algorithms do not perform well enough on imbalanced datasets. The synthetic minority over-sampling technique (SMOTE) is specifically designed for learning from imbalanced datasets; it generates synthetic minority class examples by interpolating between nearby minority class examples. However, SMOTE suffers from the overgeneralization problem. The density-based spatial clustering of applications with noise (DBSCAN) algorithm is not rigorous when dealing with samples near the borderline. We optimize the DBSCAN algorithm for this problem to make clustering more reasonable. This paper integrates the optimized DBSCAN and SMOTE, and proposes a density-based synthetic minority over-sampling technique (DSMOTE). First, the optimized DBSCAN is used to divide the samples of the minority class into three groups: core samples, borderline samples, and noise samples. The noise samples of the minority class are then removed so that more effective samples can be synthesized. To make full use of the information in core samples and borderline samples, different strategies are used to over-sample each group. Experiments show that DSMOTE achieves better results than SMOTE and Borderline-SMOTE in terms of precision, recall, and F-value.
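The pipeline described in this abstract (partition the minority class by density, drop the noise, then interpolate) can be illustrated with a minimal sketch. It rests on several assumptions: the partition uses the standard DBSCAN definitions of core, border, and noise points rather than the paper's optimized variant, which the abstract does not specify; a single plain SMOTE-style interpolation is applied to all kept samples, whereas the paper uses different, unspecified strategies for core and borderline samples; and the function names, the toy data, and the `eps`, `min_pts`, and `k` values are all hypothetical.

```python
import math
import random


def label_points(points, eps, min_pts):
    """Label each point 'core', 'border', or 'noise' (standard DBSCAN definitions)."""
    # Neighborhood of each point within radius eps; a point counts itself.
    neighbors = [
        [j for j, q in enumerate(points) if math.dist(p, q) <= eps]
        for p in points
    ]
    core = {i for i, nb in enumerate(neighbors) if len(nb) >= min_pts}
    labels = []
    for i, nb in enumerate(neighbors):
        if i in core:
            labels.append("core")
        elif any(j in core for j in nb):
            labels.append("border")  # not dense itself, but next to a core point
        else:
            labels.append("noise")
    return labels


def smote_like(points, n_new, k, rng):
    """Plain SMOTE-style interpolation between a sample and one of its k nearest neighbors."""
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(points))
        base = points[i]
        others = sorted(
            (p for j, p in enumerate(points) if j != i),
            key=lambda p: math.dist(base, p),
        )
        neighbor = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbor
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbor)))
    return synthetic


# Hypothetical toy minority class: a dense cluster, one fringe point, one outlier.
minority = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.2, 0.15), (3.0, 3.0)]
labels = label_points(minority, eps=0.15, min_pts=3)

# Remove noise before synthesizing, as the abstract describes.
kept = [p for p, lab in zip(minority, labels) if lab != "noise"]
new_samples = smote_like(kept, n_new=4, k=2, rng=random.Random(0))
```

Because the outlier at (3.0, 3.0) is discarded before interpolation, no synthetic sample can be placed on a segment leading toward it, which is the intuition behind removing noise to limit overgeneralization.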
Abstract: Developments in biomedical science and signal processing technologies have led to electroencephalography (EEG) signals being widely used in the diagnosis of brain disease and in the field of brain-computer interfaces (BCI). The collected EEG signals are processed using machine learning algorithms (Random Forest and Naive Bayes) and deep learning algorithms (Recurrent Neural Network (RNN), Neural Network (NN), and Long Short-Term Memory (LSTM)) to determine a person's current mood. These algorithms are applied to the data set to find out what the person is feeling at a particular moment. This thesis aims to identify one of the following moods (happy, surprised, disgust, fear, anger, and sadness) at a given instant, with the least possible time delay, since mood changes over time. The accuracy of the output varies with the algorithm used and the time taken to process the data, which makes it possible to compare the reliability and dependability of one algorithm against another prior to practical implementation. The data sets used had imbalanced classes, and thus overfitting occurred. This problem was handled by generating artificial data with the SMOTE oversampling technique.
Funding: Supported by the Fundamental Research Funds for the Central Universities, Grant/Award Number 30923011008.
Abstract: Effective fault diagnosis has a crucial impact on the safety and cost of complex manufacturing systems. However, the complex structure of the collected multisource data and the scarcity of fault samples make it difficult to accurately identify multiple fault conditions. To address this challenge, this paper proposes a novel deep-learning model for multisource data augmentation and small-sample fault diagnosis. The raw multisource data are first converted into two-dimensional images using the Gramian Angular Field, and a generator is built to transform random noise into images through transposed convolution operations. Then, two discriminators are constructed to evaluate the authenticity of input images and the fault diagnosis ability. A Vision Transformer network is built to diagnose faults and obtain the classification error for the discriminator. Furthermore, a global optimization strategy is designed to update the parameters of the model. The discriminators and generator compete with each other until Nash equilibrium is achieved. A real-world multistep forging machine is used to compare and validate the performance of different methods. The experimental results indicate that the proposed method is capable of multisource data augmentation and minority-sample fault diagnosis. Compared with other state-of-the-art models, the proposed approach achieves better fault diagnosis accuracy in various scenarios.
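The first step of this method, turning a one-dimensional signal into a two-dimensional image with the Gramian Angular Field, can be sketched as follows. The abstract does not say whether the summation or the difference variant is used, so this sketch assumes the summation form: the series is rescaled to [-1, 1], each value is mapped to an angle via arccos, and entry (i, j) of the image is cos(phi_i + phi_j). The function name and the three-point example series are hypothetical.

```python
import math


def gramian_angular_summation_field(series):
    """Encode a 1-D series as a 2-D image with entries cos(phi_i + phi_j)."""
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("series must contain at least two distinct values")
    # Min-max rescale to [-1, 1] so that arccos is defined for every value.
    scaled = [(2 * x - hi - lo) / (hi - lo) for x in series]
    # Clamp to guard against floating-point overshoot at the interval ends.
    scaled = [max(-1.0, min(1.0, v)) for v in scaled]
    angles = [math.acos(v) for v in scaled]
    return [[math.cos(a + b) for b in angles] for a in angles]


# Hypothetical three-sample signal; the result is a symmetric 3x3 image.
image = gramian_angular_summation_field([0.0, 1.0, 2.0])
```

A series of length n yields an n x n symmetric image, so each sensor channel of a multisource recording could be encoded separately and stacked. That stacking is an assumption here, not something the abstract states.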
Funding: Supported by the National Science and Technology Major Project for Major Drug Development (No. 2013ZX09508104), the Traditional Chinese Medicine Industry Research Special Project (No. 201307002), and the National Science & Technology Major Project Key New Drug Creation and Manufacturing Program (No. 2011ZX09307002-03) of the People's Republic of China.
Abstract: A sample enrichment method focusing on minor target components was established to enable their successful separation by pH-zone-refining CCC. Seven minor indole alkaloids in Uncaria rhynchophylla (Miq.) Miq. ex Havil (UR) were chosen to demonstrate the advantage of this method. The sample enrichment and separation were