Funding: Funded by the National Science Foundation of China (62006068), the Hebei Natural Science Foundation (A2021402008), the Natural Science Foundation of Scientific Research Project of Higher Education in Hebei Province (ZD2020185, QN2020188), and the 333 Talent Supported Project of Hebei Province (C20221026).
Abstract: Imbalanced datasets are common in practical applications, and oversampling methods using fuzzy rules have been shown to enhance the classification performance of imbalanced data by taking into account the relationship between data attributes. However, the creation of fuzzy rules typically depends on expert knowledge, which may be subjective and may not fully leverage the label information in the training data. To address this issue, a novel fuzzy rule oversampling approach is developed based on the learning vector quantization (LVQ) algorithm. In this method, the label information of the training data is used to determine the antecedent part of If-Then fuzzy rules by dynamically dividing attribute intervals using LVQ. Fuzzy rules are then generated and adjusted to calculate rule weights. Next, the number of new samples to be synthesized for each rule is computed, and minority-class samples are synthesized based on the newly generated fuzzy rules. To evaluate the effectiveness of this method, comparative experiments against five other sampling techniques, each combined with the support function machine, are conducted on 12 publicly available imbalanced datasets. The experimental results demonstrate that the proposed method significantly improves classification across seven performance indicators, including gains of 2.15% to 12.34% in Accuracy, 6.11% to 27.06% in G-mean, and 4.69% to 18.78% in AUC. These results show that the proposed method improves the classification performance of imbalanced data more efficiently.
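The abstract above reports G-mean alongside Accuracy. As a minimal sketch of why both are reported for imbalanced data, the code below uses the standard definition of G-mean as the square root of sensitivity times specificity. The function names and the toy labels are illustrative assumptions, not taken from the paper.

```python
import math

def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return tp, fn, tn, fp

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)

def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity (minority recall) and specificity."""
    tp, fn, tn, fp = confusion_counts(y_true, y_pred, positive)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return math.sqrt(sensitivity * specificity)

# Toy data (hypothetical): 2 minority samples, 8 majority samples.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
always_majority = [0] * 10  # a classifier that ignores the minority class
```

On this toy data a classifier that always predicts the majority class keeps a high accuracy while its G-mean collapses to zero, which is the behavior that makes G-mean informative when classes are imbalanced.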
Funding: Supported by the National Natural Science Foundation of China (60573047), the Natural Science Foundation of the Science and Technology Committee of Chongqing (8503), and the Applying Basic Research of the Education Committee of Chongqing (KJ060804).
Abstract: A new intrusion detection method based on learning vector quantization (LVQ) with low overhead and high efficiency is presented. The computer vision system employs LVQ neural networks as the classifier to recognize intrusions. The recognition process includes three stages: (1) feature selection and data normalization; (2) learning from the training data selected from the feature data set; (3) identifying the intrusion and generating the result report of machine condition classification. Experimental results show that the proposed method is promising for intrusion detection in terms of detection accuracy, computational expense, and ease of implementation.
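The three stages listed in the abstract above can be sketched with the basic LVQ1 update rule. This is a minimal illustration under stated assumptions, not the authors' implementation: min-max scaling stands in for the unspecified normalization, one prototype per class is used, and all function names, hyperparameters, and the toy data are hypothetical.

```python
import random

def min_max_normalize(rows):
    """Stage 1: scale every feature column to the range [0, 1]."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_index(prototypes, x):
    """Index of the prototype closest to x."""
    return min(range(len(prototypes)),
               key=lambda i: squared_distance(prototypes[i][0], x))

def train_lvq1(samples, labels, epochs=30, learning_rate=0.3, seed=0):
    """Stage 2: LVQ1 training with one prototype per class (an assumption)."""
    rng = random.Random(seed)
    prototypes = []
    for label in sorted(set(labels)):
        # Initialise each prototype at the first sample of its class.
        first = next(s for s, t in zip(samples, labels) if t == label)
        prototypes.append((list(first), label))
    order = list(range(len(samples)))
    for epoch in range(epochs):
        rate = learning_rate * (1.0 - epoch / epochs)  # linearly decaying rate
        rng.shuffle(order)
        for i in order:
            weights, label = prototypes[nearest_index(prototypes, samples[i])]
            # Attract the winner if its label matches, repel it otherwise.
            sign = 1.0 if label == labels[i] else -1.0
            for j, value in enumerate(samples[i]):
                weights[j] += sign * rate * (value - weights[j])
    return prototypes

def classify(prototypes, x):
    """Stage 3: report the label of the nearest prototype."""
    return prototypes[nearest_index(prototypes, x)][1]

# Toy connection records with two features (hypothetical values).
raw = [[1, 2], [2, 1], [1.5, 1.5], [8, 9], [9, 8], [8.5, 8.5]]
labels = ["normal"] * 3 + ["intrusion"] * 3
features = min_max_normalize(raw)
prototypes = train_lvq1(features, labels)
```

In practice, unseen records would need to be scaled with the minimum and maximum learned from the training set before calling `classify`.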
Funding: Supported by the National Natural Science Foundation of China (Grant No. 10973020), the Funding Project for Academic Human Resources Development in Institutions of Higher Learning under the Jurisdiction of Beijing Municipality (Grant No. PHR200906210), the Funding Project for Base Construction of Scientific Research of Beijing Municipal Commission of Education (Grant No. WYJD200902), and the Beijing Philosophy and Social Science Planning Project (Grant No. 09BaJG258).
Abstract: In this paper, a method combining unsupervised clustering and learning vector quantization (LVQ) is presented to forecast the occurrence of solar flares. Three magnetic parameters, namely the maximum horizontal gradient, the length of the neutral line, and the number of singular points, are extracted from SOHO/MDI longitudinal magnetograms as measures. Based on these parameters, the sliding-window method is used to form sequential data by adding three days of evolutionary information. To address the class imbalance in the dataset, K-means clustering, an unsupervised clustering algorithm, is used to convert the imbalanced data into balanced data. Finally, LVQ is employed to predict the flare level within 48 hours. Experimental results indicate that the proposed flare forecasting model performs better with sequential data.
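The sliding-window step in the abstract above can be illustrated as follows. This is only a sketch: it assumes one feature vector per day and reads "three days of evolutionary information" as a window of three consecutive days, which the abstract does not spell out. The function name and the numbers are hypothetical.

```python
def sliding_windows(daily_features, window=3):
    """Concatenate `window` consecutive daily feature vectors into one sample."""
    samples = []
    for end in range(window, len(daily_features) + 1):
        chunk = daily_features[end - window:end]
        # Flatten the days in chronological order into one sequential sample.
        samples.append([value for day in chunk for value in day])
    return samples

# Four days of (max horizontal gradient, neutral line length, singular points);
# the values are made up for illustration.
days = [
    [0.10, 12.0, 1],
    [0.15, 14.0, 2],
    [0.30, 20.0, 4],
    [0.28, 19.0, 3],
]
sequences = sliding_windows(days, window=3)
```

Each resulting sample carries the three magnetic parameters for every day in the window, so a classifier sees how the active region evolved rather than a single snapshot.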
Abstract: Word sense disambiguation has been a trending research topic in natural language processing and machine learning. Mining core features and performing text classification remain challenging tasks. Context features, such as neighboring words like adjectives, provide the evidence for classification using a machine learning approach. This paper presents text document classification, which has wide applications in information retrieval, using movie review datasets. Related tasks include document indexing based on controlled vocabulary, adjectives, word sense disambiguation, hierarchical categorization of web pages, spam detection, topic labeling, web search, and document summarization. A kernel support vector machine classifies the text, and feature extraction is performed by cuckoo search optimization. Positive and negative reviews from the movie dataset are used to obtain better classification accuracy. The experiments focus on context mining, feature analysis, and classification. Compared with previous work, the proposed work is designed to achieve efficient results. The overall design is implemented in MATLAB 2020a.
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 61672195 and 61872372, the Open Foundation of State Key Laboratory of Cryptology (No. MMKFKT201617), and the National University of Defense Technology (Grant No. ZK19-38).
Abstract: The Internet of Things (IoT) is now widely deployed and brings great opportunities to change people's daily lives. To realize more effective human-computer interaction in IoT applications, the Question Answering (QA) systems embedded in IoT services need a better ability to understand natural language. The distributed representation of words, which carries rich semantic and syntactic information, therefore plays an increasingly important role in QA systems. However, learning high-quality distributed word vectors requires large amounts of storage and computing resources, so it cannot be done on resource-constrained IoT devices. Outsourcing the data and computation to cloud servers is a good choice, but directly uploading private data to an untrusted cloud poses privacy risks. Learning word vectors on untrusted cloud servers without privacy leakage is therefore an urgent and challenging task. In this paper, we present a novel and efficient word vector learning scheme over encrypted data. We first design a series of arithmetic computation protocols. We then use two non-colluding cloud servers to learn high-quality word vectors over encrypted data. The proposed scheme allows word vectors to be trained on remote cloud servers while protecting privacy. Security analysis and experiments on real data sets demonstrate that our scheme is more secure and efficient than existing privacy-preserving word vector learning schemes.
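The abstract above does not say which cryptographic primitive underlies its arithmetic protocols. One common building block for computing with two non-colluding servers is additive secret sharing, sketched below purely as an illustration of the general idea. It is not the authors' scheme, and the modulus and function names are assumptions.

```python
import secrets

MODULUS = 2**61 - 1  # illustrative modulus, chosen here as an assumption

def share(value):
    """Split an integer into two shares; each one alone looks uniformly random."""
    share_a = secrets.randbelow(MODULUS)
    share_b = (value - share_a) % MODULUS
    return share_a, share_b

def add_shares(left, right):
    """Run locally by one server on its own shares; no communication needed."""
    return (left + right) % MODULUS

def reconstruct(share_a, share_b):
    """Only a party holding both shares can recover the value."""
    return (share_a + share_b) % MODULUS

# The data owner splits two private values between server A and server B.
x_a, x_b = share(12)
y_a, y_b = share(30)
# Each server adds the shares it holds without seeing the plaintext.
sum_a = add_shares(x_a, y_a)
sum_b = add_shares(x_b, y_b)
total = reconstruct(sum_a, sum_b)
```

Addition comes for free in this setting; multiplication and the nonlinear functions used in word vector training need interactive protocols between the two servers, which is the kind of gap that dedicated arithmetic protocols have to fill.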
Abstract: The security of cryptographic systems is a major concern for cryptosystem designers, even though cryptographic algorithms have been improved. Side-channel attacks aim to gain secret information by taking advantage of physical vulnerabilities of cryptosystems. Several approaches have been proposed to analyze side-channel information, among which machine learning is known as a promising method. Machine learning in the form of neural networks learns the signature (power consumption and electromagnetic emission) of an instruction, and then recognizes it automatically. In this paper, a novel experimental investigation was conducted on a field-programmable gate array (FPGA) implementation of elliptic curve cryptography (ECC), to explore the efficiency of side-channel information characterization based on a learning vector quantization (LVQ) neural network. The main characteristics of LVQ as a multi-class classifier are that it can learn complex non-linear input-output relationships, use sequential training procedures, and adapt to the data. Experimental results show that multi-class classification based on LVQ is a powerful and promising approach to side-channel data characterization.
Funding: Supported by the National Natural Science Foundation of China (No. 62075011) and the Graduate Technological Innovation Project of Beijing Institute of Technology (No. 2019CX20026).
Abstract: As traditional Chinese medicines, Fritillaria from different origins are very similar and difficult to distinguish. In this study, laser-induced breakdown spectroscopy combined with learning vector quantization (LIBS-LVQ) was proposed to distinguish powdered samples of Fritillaria cirrhosa from non-Fritillaria cirrhosa. We also studied the performance of linear discriminant analysis and support vector machine on the same data set. Among these three classifiers, LVQ had the highest correct classification rate, 99.17%. The experimental results demonstrated that the LIBS-LVQ model could be used to differentiate powdered samples of Fritillaria cirrhosa and non-Fritillaria cirrhosa.
Abstract: Van der Pauw's function is often used in the measurement of a semiconductor's resistivity. However, it is difficult to obtain its value from voltage measurements because it has an implicit form. If it can be expressed as a polynomial, a semiconductor's resistivity can be obtained directly from such measurements. Normally, five orders of the abscissa provide sufficient precision when expressing a non-linear function. Therefore, the key is to determine the coefficients of the polynomial. By taking five coefficients as weights to construct a neural network, neurocomputing has been used to solve this problem. Finally, the polynomial expression for van der Pauw's function is obtained.
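To make the "implicit form" in the abstract above concrete, the sketch below evaluates van der Pauw's correction factor f numerically. It relies on the standard textbook relation cosh(((Q - 1)/(Q + 1)) · ln 2 / f) = exp(ln 2 / f) / 2, where Q is the ratio of the two measured resistances. That equation is not quoted in the abstract and is brought in here as an assumption, as are the function name and the bisection bounds. Values computed this way could serve as targets when fitting the polynomial the abstract describes.

```python
import math

LN2 = math.log(2.0)

def van_der_pauw_f(q, iterations=60):
    """Solve the implicit van der Pauw relation for f by bisection.

    `q` is the resistance ratio R1/R2 (must be positive). The lower
    bracket of 0.01 is an assumption that covers moderate ratios.
    """
    r = abs(q - 1.0) / (q + 1.0)

    def residual(f):
        t = LN2 / f
        return math.exp(t) / 2.0 - math.cosh(r * t)

    low, high = 0.01, 1.0  # residual is positive near 0 and at most 0 at f = 1
    for _ in range(iterations):
        mid = (low + high) / 2.0
        if residual(mid) > 0.0:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0

# f is 1 for a symmetric sample and falls as the ratio grows.
f_symmetric = van_der_pauw_f(1.0)
f_ratio_10 = van_der_pauw_f(10.0)
```

Every evaluation needs an iterative root search, which is the inconvenience a closed-form polynomial approximation removes.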
Funding: Supported by the China Postdoctoral Science Foundation (No. 20110491510), the Program for Liaoning Excellent Talents in University (No. LJQ2011027), the Anshan Science and Technology Project (No. 2011MS11), and the Special Research Foundation of University of Science and Technology of Liaoning (No. 2011zx10).
Abstract: Based on the texture features of pulverized coal combustion flame images in the rotary-kiln oxide pellet sintering process, a combustion working condition recognition method using the generalized learning vector quantization (GLVQ) neural network is proposed. First, the numerical flame image is analyzed to extract texture features such as energy, entropy, and inertia from the grey-level co-occurrence matrix (GLCM), which provide qualitative information on changes in the visual appearance of the flame. Then, kernel principal component analysis (KPCA) is adopted to reduce the dimensionality of the high-dimensional input vector, which greatly reduces the GLVQ target dimension and network scale. Finally, the GLVQ neural network is trained on the normalized texture feature data. The test results show that the proposed KPCA-GLVQ classifier performs very well in training speed and correct recognition rate, and that it meets the requirement for real-time combustion working condition recognition in the rotary kiln process.
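The three texture features named in the abstract above can be computed from a grey-level co-occurrence matrix as sketched below. The definitions are the common Haralick-style ones (energy as the sum of squared entries, inertia as the contrast term), which may differ in detail from the paper's. The single horizontal offset, the function names, and the tiny test images are illustrative assumptions.

```python
import math

def glcm(image, levels, dx=1, dy=0):
    """Normalised co-occurrence matrix for one pixel offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    if total == 0:
        raise ValueError("offset leaves no pixel pairs inside the image")
    return [[n / total for n in row] for row in counts]

def texture_features(p):
    """Energy, entropy and inertia of a normalised GLCM."""
    cells = [(i, j, v) for i, row in enumerate(p) for j, v in enumerate(row)]
    energy = sum(v * v for _, _, v in cells)
    entropy = -sum(v * math.log(v) for _, _, v in cells if v > 0.0)
    inertia = sum((i - j) ** 2 * v for i, j, v in cells)
    return {"energy": energy, "entropy": entropy, "inertia": inertia}

# Two 2x2 images with two grey levels: horizontal stripes vs vertical stripes.
smooth = texture_features(glcm([[0, 0], [1, 1]], levels=2))
striped = texture_features(glcm([[0, 1], [0, 1]], levels=2))
```

With a horizontal offset, the horizontally uniform image has zero inertia, while the vertically striped image has the maximum inertia for two grey levels, showing how the features respond to changes in visual texture.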
Funding: Supported by the National Science Foundation of USA (No. IIS 0916250) and the University of Georgia Franklin College of Arts & Sciences research fund.
Abstract: Dynamic regulation and packaging of genetic information is achieved by the organization of DNA into chromatin. Nucleosomal core histones, which form the basic repeating unit of chromatin, are subject to various post-translational modifications such as acetylation, methylation, phosphorylation, and ubiquitinylation. These modifications affect chromatin structure and, along with DNA methylation, regulate gene transcription. The goal of this study was to determine whether patterns in modifications were related to different categories of genomic features and, if so, whether the patterns had predictive value. We used publicly available ChIP-chip data for different types of histone modifications (methylation and acetylation) and for DNA methylation in Arabidopsis thaliana, and then applied a machine learning approach (a support vector machine) to demonstrate that patterns of these modifications differ markedly among genomic feature categories (protein, RNA, pseudogene, and transposon elements). These patterns can be used to distinguish the types of genomic features. DNA methylation and H3K4me3 methylation emerged as the features with the most discriminative power. From our analysis of Arabidopsis, we were able to predict 33 novel genomic features, whose existence was also supported by analysis of RNA-seq experiments. In summary, we present a novel approach that can be used to discriminate and detect different categories of genomic features based on their patterns of chromatin modification and DNA methylation.