Funding: National Natural Science Foundation of China under Grant Nos. 61170269, 61170272, 61202082, and 61003285, and the Fundamental Research Funds for the Central Universities under Grant Nos. BUPT2013RC0308 and BUPT2013RC0311.
Abstract: Group distance coding is suitable for secret communication hidden in printed documents. However, there has been no effective method against it. This study found that the hiding method makes the group distances of text lines converge on specified values and makes the variances of group distances among N-Window text lines small. Building on this finding, the study proposes a steganalysis algorithm based on a Support Vector Machine (SVM). To avoid interference from large differences in word length within the same line, only samples whose occurrence frequencies lie within ±10 dB of the maximum frequency are retained. The results show that the accuracy of the SVM classifier is higher than 90%.
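The ±10 dB retention rule in the abstract above can be illustrated with a minimal sketch. It assumes the occurrence frequencies are available as a mapping from a sample value to its count, and that decibels are taken as a power ratio (10·log10); neither detail is given in the abstract, and the function name and toy counts are hypothetical.

```python
import math

def keep_within_db_of_max(frequencies, window_db=10.0):
    """Keep entries whose count is within `window_db` dB of the largest count.

    Assumption: dB is computed as 10 * log10(count / max_count).
    Because no count exceeds the maximum, only the lower bound can exclude.
    """
    max_count = max(frequencies.values())
    kept = {}
    for value, count in frequencies.items():
        if count <= 0:
            continue  # log10 is undefined for zero; such entries are dropped
        level_db = 10.0 * math.log10(count / max_count)
        if level_db >= -window_db:
            kept[value] = count
    return kept

# Toy counts, invented purely for illustration
toy_counts = {"a": 100, "b": 20, "c": 5, "d": 0}
retained = keep_within_db_of_max(toy_counts)
```

With a 20·log10 (amplitude) convention the same window would admit lower counts, so the convention should be checked against the full paper before reuse.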
Funding: Supported by the Chinese National Science Foundation (Grant No. 50775068), a China Postdoctoral Science Foundation funded project (No. 20080430154), and the High-Tech Research and Development Program of China (No. 2009AA04Z414).
Abstract: To address the non-stationary features of roller bearing fault vibration signals, a roller bearing fault diagnosis method based on improved Local Mean Decomposition (LMD) and a Support Vector Machine (SVM) is proposed. First, wavelet analysis is used to decompose and reconstruct the signal. Second, the LMD method decomposes the reconstructed signal obtained from the wavelet analysis into a number of Product Functions (PFs) that contain the main fault characteristics, so that the initial feature vector matrices can be formed automatically. Third, Singular Value Decomposition (SVD) is applied to the initial feature vector matrices to obtain their singular values, which serve as the fault feature vectors of the roller bearing and as the input vectors of the SVM classifier. Finally, the recognition results are obtained from the SVM output. The analysis results show that the proposed method can be applied effectively to roller bearing fault diagnosis.
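The SVD step described in the abstract above can be sketched as follows. This is only an illustration of turning a matrix of components into a singular-value feature vector with NumPy; the wavelet and LMD stages are not implemented, and the two synthetic sinusoids are hypothetical stand-ins for Product Functions, not data from the paper.

```python
import numpy as np

def singular_value_features(component_matrix):
    """Return the singular values of a (components x samples) matrix.

    NumPy returns them sorted in descending order; they can then be
    passed to a classifier as a fixed-length feature vector.
    """
    matrix = np.asarray(component_matrix, dtype=float)
    return np.linalg.svd(matrix, compute_uv=False)

# Hypothetical stand-ins for PFs: two sinusoids sampled over one second
t = np.linspace(0.0, 1.0, 256, endpoint=False)
pf_stand_ins = np.vstack([
    np.sin(2 * np.pi * 50 * t),
    0.5 * np.sin(2 * np.pi * 120 * t),
])
features = singular_value_features(pf_stand_ins)
```

Using singular values rather than the raw components gives one number per component regardless of signal length, which is presumably why they suit a fixed-size SVM input.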
Abstract: The increased production and price of rare earth elements (REEs) are indicative of their importance and of growing global attention. More accurate and practical exploration procedures are needed for REEs and for other geochemical resources. One such procedure is a multivariate approach. In this study, five classifiers, namely multilayer perceptron (MLP), Bayesian, k-Nearest Neighbors (KNN), Parzen, and support vector machine (SVM), were applied to supervised pattern classification of bulk geochemical samples based on REEs, P, and Fe in the Kiruna-type magnetite-apatite deposit of Se-Chahun, Central Iran. This deposit is composed of four rock types: (1) High anomaly (phosphorus iron ore), (2) Low anomaly (metasomatized tuff), (3) Low anomaly (iron ore), and (4) Background (iron ore and others). The proposed methods help to predict the proper classes for new samples from the study area without the need for costly and time-consuming additional studies. In addition, this paper provides a performance comparison of the five models. Results show that all five classifiers perform acceptably. Therefore, pattern classification can be used to evaluate REE distribution. However, the MLP and KNN classifiers show the same results and have the highest CCRs in comparison to the Bayesian, Parzen, and SVM classifiers. MLP is more generalizable than KNN and seems to be an applicable approach for classification and prediction of the classes. We hope the predictability of the proposed methods will encourage geochemists to expand the use of numerical models in future work.
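As a rough illustration of one of the five classifiers named in the abstract above, the following sketch implements plain k-Nearest Neighbors with Euclidean distance and majority voting, plus a fraction-correct score. Interpreting CCR as the fraction of correctly classified samples is an assumption, and the two-feature toy points and class labels are invented for illustration; they are not geochemical data from the study.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Predict a label by majority vote among the k nearest training points."""
    ranked = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

def fraction_correct(true_labels, predicted_labels):
    """Share of samples whose predicted label matches the true label."""
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)

# Toy two-feature samples (hypothetical values)
train_points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3),
                (5.0, 5.1), (5.2, 4.9), (4.8, 5.0)]
train_labels = ["background"] * 3 + ["high_anomaly"] * 3

test_points = [(0.1, 0.1), (5.0, 5.0)]
test_labels = ["background", "high_anomaly"]
predictions = [knn_predict(train_points, train_labels, p) for p in test_points]
score = fraction_correct(test_labels, predictions)
```

In practice the features would need scaling before distance computation, since element concentrations can differ by orders of magnitude.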
Funding: National Natural Science Foundation of China (No. 60705004).
Abstract: Using the hyperbolic tangent function, we constructed a novel hyperbolic tangent loss function to reduce the influence of outliers on the support vector machine (SVM) classification problem. The new loss function not only limits the maximal loss value of outliers but is also smooth. A hyperbolic tangent SVM (HTSVM) is then proposed based on the new loss function. The experimental results show that HTSVM reduces the effects of outliers and gives better generalization performance than the classical SVM on both artificial data and UCI data sets. Therefore, the proposed hyperbolic tangent loss function and HTSVM are both effective.
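The abstract above does not give the loss formula, so the following is only a sketch of the general idea: a loss that is bounded and smooth, compared with the unbounded hinge loss of the classical SVM. The specific form tanh(scale · max(0, 1 − margin)²) and the `scale` parameter are assumptions chosen for illustration, not the function defined in the paper.

```python
import math

def hinge_loss(margin):
    """Classical SVM hinge loss; grows without bound for large violations."""
    return max(0.0, 1.0 - margin)

def tanh_bounded_loss(margin, scale=1.0):
    """Illustrative bounded loss (assumed form, not the paper's formula).

    Squaring the violation makes the loss differentiable at margin = 1,
    and tanh caps the value at 1 however far an outlier lies.
    """
    violation = max(0.0, 1.0 - margin)
    return math.tanh(scale * violation * violation)

# margin = y * f(x); a large negative margin mimics a badly misclassified outlier
outlier_margin = -100.0
hinge_at_outlier = hinge_loss(outlier_margin)
bounded_at_outlier = tanh_bounded_loss(outlier_margin)
```

Because the bounded loss saturates, a single outlier cannot dominate the training objective the way it can under the hinge loss, at the cost of making the objective non-convex.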
Abstract: The proliferation of forums and blogs creates both challenges and opportunities for processing large amounts of information. The information shared on various topics often contains opinionated words, which are qualitative in nature. These qualitative words need statistical computation to convert them into useful quantitative data. This data should be processed properly since it expresses opinions. Each of these opinion-bearing words differs in the significance of the meaning it conveys. To process the linguistic meaning of words into data and to enhance opinion mining analysis, we propose a novel weighting scheme, referred to as inferred word weighting (IWW). IWW is computed from the significance of the word in the document (SWD) and the significance of the word in the expression (SWE) to enhance their performance. The proposed weighting methods give an analytic view and assign more appropriate weights to words than existing methods. In addition to the new weighting methods, the performance of text classification is also checked with stop-words included. Generally, stop-words are removed in text processing. When stop-words are included with both the proposed and existing weighting methods, two facts are observed: (1) classification performance is enhanced; (2) the difference in outcome between including and excluding stop-words is smaller in the proposed methods and larger in existing methods. The inferences drawn from these observations are discussed. Experimental results on benchmark data sets show the potential improvement in classification accuracy.