Funding: Projects (60903082, 60975042) supported by the National Natural Science Foundation of China; Project (20070217043) supported by the Research Fund for the Doctoral Program of Higher Education of China.
Abstract: Many classical clustering algorithms perform well under their own prerequisites but do not scale well to very large data sets (VLDS). In this work, a novel division and partition clustering method (DP) is proposed to solve this problem. DP cuts the source data set into data blocks and extracts the eigenvector of each data block to form a local feature set. The local feature set is then used in a second round of feature aggregation over the source data to find the global eigenvector. Finally, the data set is assigned according to the global eigenvector by the minimum-distance criterion. The experimental results show that DP is more robust than conventional clustering methods. Its insensitivity to data dimensionality, data distribution, and the number of natural clusters gives it a wide range of applications in clustering VLDS.
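The two-round structure described in this abstract can be illustrated with a minimal sketch. The abstract does not say how the per-block features are extracted, so this sketch swaps in plain k-means centroids as the local features and runs k-means again over them to obtain the global centers; `kmeans`, `dp_cluster`, `block_size`, and the farthest-point initialisation are all assumptions made for illustration, not the authors' method.

```python
import math


def mean(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def kmeans(points, k, iters=20):
    """Plain k-means with deterministic farthest-point initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far.
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
            groups[nearest].append(p)
        # Keep the old center when a group ends up empty.
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return centers


def dp_cluster(data, k, block_size):
    """Two-round clustering: local features per block, then global centers."""
    # Round 1: cut the data into blocks and summarise each block (assumed: k-means).
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    local_features = []
    for block in blocks:
        local_features.extend(kmeans(block, min(k, len(block))))
    # Round 2: aggregate the local features into k global centers.
    global_centers = kmeans(local_features, k)
    # Final step: assign every point to its nearest global center.
    labels = [min(range(k), key=lambda i: math.dist(p, global_centers[i])) for p in data]
    return global_centers, labels


data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (1, 1), (11, 11)]
centers, labels = dp_cluster(data, k=2, block_size=4)
```

Only the local features, not the full data set, take part in the second round, which is what keeps the memory footprint of the global step small.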
Abstract: This paper presents a novel approach to feature subset selection using genetic algorithms. The approach can accommodate multiple criteria, such as the accuracy and cost of classification, in the feature selection process, and it finds an effective feature subset for texture classification. Based on the selected feature subset, a method is described for extracting objects that are higher than their surroundings, such as trees or forest, from color aerial images. The methodology is illustrated by its application to the problem of tree extraction from aerial images.
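A minimal sketch of genetic feature subset selection with a multi-criteria fitness is given here. The abstract does not specify the encoding, the operators, or how accuracy and cost are combined, so everything in the sketch is an assumption: bit-mask chromosomes, tournament selection, one-point crossover, and a fitness of the form accuracy minus weighted cost. The toy `make_fitness` stands in for training and scoring a real classifier.

```python
import random


def make_fitness(gain, cost, cost_weight):
    """Toy fitness: summed per-feature 'accuracy gain' minus weighted cost.

    In real use the accuracy term would come from evaluating a classifier
    on the features selected by the mask (assumption for illustration).
    """
    def fitness(mask):
        accuracy = sum(g for g, bit in zip(gain, mask) if bit)
        expense = sum(c for c, bit in zip(cost, mask) if bit)
        return accuracy - cost_weight * expense
    return fitness


def ga_select(n_features, fitness, pop_size=20, generations=40, p_mut=0.05, seed=0):
    """Return the best feature mask found; needs n_features >= 2, pop_size >= 3."""
    rng = random.Random(seed)
    pop = [tuple(rng.randint(0, 1) for _ in range(n_features)) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        next_pop = pop[:2]  # elitism: carry over the two best masks unchanged
        while len(next_pop) < pop_size:
            # Tournament selection of two parents.
            a, b = (max(rng.sample(pop, 3), key=fitness) for _ in range(2))
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation with probability p_mut per gene.
            child = tuple(bit ^ (rng.random() < p_mut) for bit in child)
            next_pop.append(child)
        pop = next_pop
    return max(pop, key=fitness)


fitness = make_fitness(gain=[0.30, 0.02, 0.25, 0.01], cost=[1.0, 1.0, 1.0, 1.0], cost_weight=0.05)
best_mask = ga_select(4, fitness)
```

Folding cost into the fitness lets the search discard features whose contribution to accuracy does not justify their measurement cost, which is the multi-criteria idea the abstract describes.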
Abstract: In this work, a system for recognizing newspaper text printed in Gurumukhi script is presented. Four feature extraction techniques, namely zoning features, diagonal features, parabola curve fitting based features, and power curve fitting based features, are considered for extracting the statistical properties of the printed characters. Different combinations of these features are also applied to improve recognition accuracy. For recognition, four classification techniques are used: k-NN, linear-SVM, decision tree, and random forest. The experimental database was collected from three major Gurumukhi script newspapers: Ajit, Jagbani, and Punjabi Tribune. Using 5-fold cross validation and a random forest classifier, a recognition accuracy of 96.19% is reported with a combination of zoning features, diagonal features, and parabola curve fitting based features. With a partitioning strategy of 70% training data and 30% testing data, a recognition accuracy of 95.21% is achieved.
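The two evaluation protocols named in this abstract, a 70%/30% partition and 5-fold cross validation, can be sketched as index splits. This is a generic illustration only; the function names, the shuffling, and the seed are assumptions, and the abstract does not say how the authors drew their partitions.

```python
import random


def train_test_split(n_samples, train_fraction=0.7, seed=0):
    """Shuffle sample indices and split them into train and test parts."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    cut = int(round(n_samples * train_fraction))
    return idx[:cut], idx[cut:]


def k_fold(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint, near-equal folds
    for i in range(k):
        test = folds[i]
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, test


train_idx, test_idx = train_test_split(100)
for train, test in k_fold(100, k=5):
    pass  # fit a classifier on `train`, score it on `test`, then average the scores
```

In cross validation every sample is used for testing exactly once, whereas the single 70/30 partition tests on one fixed 30% of the data.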
Abstract: A new method for image feature extraction, the Windows Minimum/Maximum Module Learning Subspace Algorithm (WMMLSA), is presented. The WMMLSM is insensitive to the order of the training samples and can effectively regulate the radical vectors of an image feature subspace by selecting the study samples for the iterative subspace learning algorithm, so it can improve the robustness and generalization capacity of a pattern subspace and enhance the recognition rate of a classifier. The pattern subspace itself is built by the PCA method. The classifier based on WMMLSM is successfully applied to recognize pressed characters in gray-scale images. The results indicate that the correct recognition rate of WMMLSM is higher than that of the Average Learning Subspace Method, and that both the training speed and the classification speed are improved. The new method is more applicable and efficient.
Abstract: To solve the problems of the AMR-WB+ (Extended Adaptive Multi-Rate-WideBand) semi-open-loop coding mode selection algorithm, features for classifying between ACELP (Algebraic Code Excited Linear Prediction) and TCX (Transform Coded eXcitation) are investigated. Eleven classification features from the AMR-WB+ codec are selected, and two novel classification features, EFM (Energy Flatness Measurement) and stdEFM (standard deviation of EFM), are proposed. A novel semi-open-loop mode selection algorithm based on EFM and the selected AMR-WB+ features is then proposed. The results of the classification test and the listening test show that the novel algorithm performs much better than the AMR-WB+ semi-open-loop coding mode selection algorithm.
Funding: Supported by the Shanghai Education Commission Foundation for Excellent Young High Education Teacher (No. sdj08001).
Abstract: In the study of brain-computer interfaces, a method of feature extraction and classification for two kinds of imaginations is proposed. It takes the Euclidean distance between the mean traces recorded from the channels under the two kinds of imaginations as a feature, and determines the imagination class using a threshold value. The experimental background and theoretical foundation are analyzed with reference to the BCI 2003 data sets, and the classification precision is compared with the best result of the competition. The result shows that the method has high precision and is well suited to practical systems.
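One plausible reading of the distance-and-threshold scheme in this abstract is sketched here. The abstract does not give the exact feature definition, so the sketch assumes that each class has a mean trace averaged over training trials, that the feature of a new trial is the difference between its Euclidean distances to the two mean traces, and that a threshold on this feature decides the class. The function names, the class labels "A" and "B", and the default threshold of 0 are hypothetical.

```python
import math


def mean_trace(trials):
    """Sample-wise average over a list of equal-length trials."""
    n = len(trials)
    return [sum(trial[i] for trial in trials) / n for i in range(len(trials[0]))]


def distance_feature(trial, mean_a, mean_b):
    """Distance to class A's mean trace minus distance to class B's (assumed feature)."""
    return math.dist(trial, mean_a) - math.dist(trial, mean_b)


def classify(trial, mean_a, mean_b, threshold=0.0):
    """Label the trial 'A' when the feature falls below the threshold, else 'B'."""
    return "A" if distance_feature(trial, mean_a, mean_b) < threshold else "B"


# Toy single-channel training trials for the two imagination classes.
class_a = [[1.0, 1.0, 1.0], [1.2, 0.8, 1.0]]
class_b = [[-1.0, -1.0, -1.0], [-0.8, -1.2, -1.0]]
mean_a, mean_b = mean_trace(class_a), mean_trace(class_b)

label = classify([0.9, 1.0, 1.1], mean_a, mean_b)
```

With several channels, the same feature could be computed per channel and combined, and the threshold would be tuned on training data rather than fixed at zero.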