Funding: the National Natural Science Foundation of China (50675167); the Foundation for the Author of National Excellent Doctoral Dissertation of China (200535).
Abstract: The support vector classifier (SVC) is well suited to small-sample learning problems in high dimensions and generalizes especially well. However, the high-dimensional features of the original samples are partly redundant, so extracting the main features first can improve the performance of the SVC. Principal component analysis (PCA) is employed to reduce the feature dimension of the original samples and to pre-select the main features efficiently, and an SVC is then constructed in the selected feature space to improve its learning speed and identification rate. Furthermore, a heuristic genetic-algorithm-based automatic model selection is proposed to determine the hyperparameters of the SVC and to evaluate the performance of the learning machines. Experiments on the Heart and Adult benchmark data sets demonstrate that the proposed PCA-based SVC not only reduces the test time drastically but also improves the identification rate.
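The PCA step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a plain covariance-based PCA, finds the leading components by power iteration with deflation, and uses only the Python standard library. The function names (`center`, `covariance`, `top_components`, `project`) are hypothetical, and the SVC training and genetic-algorithm model selection are not shown; the projected features would simply be passed to whichever classifier is used.

```python
import math
import random


def center(rows):
    """Subtract the per-feature mean from every sample."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows], means


def covariance(centered):
    """Sample covariance matrix (d x d) of already-centered data."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def top_components(cov, k, iters=500, seed=0):
    """Leading k (eigenvalue, eigenvector) pairs of a symmetric matrix,
    found by power iteration with deflation."""
    rng = random.Random(seed)
    d = len(cov)
    mat = [row[:] for row in cov]  # work on a copy; deflation modifies it
    comps = []
    for _ in range(k):
        v = [rng.random() + 0.1 for _ in range(d)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iters):
            w = [sum(mat[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:  # no variance left in the deflated matrix
                break
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue for the unit vector v.
        lam = sum(v[i] * sum(mat[i][j] * v[j] for j in range(d))
                  for i in range(d))
        comps.append((lam, v))
        # Deflate: remove the component just found.
        for i in range(d):
            for j in range(d):
                mat[i][j] -= lam * v[i] * v[j]
    return comps


def project(centered, comps):
    """Coordinates of each centered sample along the chosen components."""
    return [[sum(r[j] * v[j] for j in range(len(r))) for _, v in comps]
            for r in centered]


# Toy data lying on the line y = 2x, so one component carries all the variance.
samples = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
centered, _ = center(samples)
components = top_components(covariance(centered), k=1)
reduced = project(centered, components)  # 4 samples, 1 feature each
```

Power iteration is chosen here only to keep the sketch dependency-free; a numerical library's eigendecomposition or SVD would normally be preferred for real data.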
Abstract: The Projection Pursuit (PP) Principal Component Analysis (PCA) method is introduced and applied to the field of meteorology for the first time. Some problems relevant to meteorological application are discussed in detail, and comparisons with the EOF method are made with emphasis on robustness.
Funding: This project was supported by the Shanghai Shu Guang Project.
Abstract: The support vector machine (SVM), a novel approach in pattern recognition, has been applied successfully to face detection and face recognition. In this paper, a face recognition approach that combines the SVM classifier with the nearest neighbor classifier (NNC) is proposed. Principal component analysis (PCA) is used to reduce the dimension and extract features. A one-against-all strategy is then used to train the SVM classifiers. At the testing stage, we propose an al-
Funding: the Heilongjiang Province Science Foundation for Youths (Grant No. QC2016018); the National Natural Science Foundation of China (Grant No. 31600508); the Fundamental Research Funds for the Central University (2572017CA21); the Application Technology Research and Development Projects of Heilongjiang Province (Grant No. WB13B104); the Science and Technology Project of Heilongjiang Farms & Land Reclamation Administration (Grant No. HNK135-01-056).
Abstract: Intercropping of mulberry (Morus alba L.) and alfalfa (Medicago sativa L.) is a new forestry-grass compound model in China that can provide high forage yields with high protein. Nitrogen application is one of the important factors determining the production and quality of this system. To elucidate the advantages of intercropping and nitrogen application, we analyzed the changes in physicochemical properties, enzyme activities, and microbial communities in the rhizosphere soil. We used principal components analysis (PCA) and redundancy discriminators analysis (RDA) to clarify the relationships among treatments and between treatments and environmental factors, respectively. The results showed that nitrogen application significantly increased pH value, available nitrogen content, soil water content (SWC), and urea (URE) activity in the rhizosphere soil of monoculture mulberry. In contrast, intercropping and intercropping+N significantly decreased pH and SWC in mulberry treatments. Nitrogen, intercropping, and intercropping+N sharply reduced soil organic matter content and SWC in alfalfa treatments. Nitrogen, intercropping, and intercropping+N increased the values of McIntosh diversity (U), Simpson diversity (D), and Shannon-Weaver diversity (H') in mulberry treatments. However, PCA scatter plots showed clustering of monoculture mulberry with nitrogen (MNE) and intercropping mulberry without nitrogen (M0). Intercropping reduced both H' and D, but nitrogen application showed no effect on the diversity of microbial communities in alfalfa. There were obvious differences between mulberry and alfalfa treatments in the use of the six types of carbon sources. Nitrogen and intercropping increased the number of sole carbon substrates whose relative use rate exceeded 4% in mulberry treatments, whereas the number declined in alfalfa with nitrogen and intercropping. RDA indicated that URE was positive when intercropping mulberry was treated with nitrogen but negative in monoculture alfalfa treated with nitrogen. Soil pH and SWC were positive with mulberry treatments but negative with alfalfa treatments. Intercropping with alfalfa benefited mulberry in the absence of nitrogen application. Intercropping with alfalfa combined with nitrogen application could improve microbial community function and diversity in the rhizosphere soil of mulberry. The microbial communities in the rhizosphere soil of mulberry and alfalfa are strategically complementary in their use of carbon sources.
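The three diversity indices named in this abstract can be sketched as follows. The abstract does not say which formulations the authors used, so this assumes common textbook forms: Shannon-Weaver H' = -Σ p_i ln p_i, Simpson D in its 1 - Σ p_i² form, and McIntosh U = sqrt(Σ n_i²). The function names and the sample values are illustrative only.

```python
import math


def shannon_weaver(values):
    """H' = -sum(p_i * ln p_i), with p_i the share of each category."""
    total = sum(values)
    return -sum((v / total) * math.log(v / total) for v in values if v > 0)


def simpson(values):
    """D = 1 - sum(p_i^2); other Simpson variants exist (assumed form)."""
    total = sum(values)
    return 1.0 - sum((v / total) ** 2 for v in values)


def mcintosh_u(values):
    """U = sqrt(sum(n_i^2)), computed on the raw (unnormalized) values."""
    return math.sqrt(sum(v * v for v in values))


# Hypothetical example: four categories used equally.
even = [1.0, 1.0, 1.0, 1.0]
h_even = shannon_weaver(even)   # equals ln(4) for four equal shares
d_even = simpson(even)          # equals 0.75 for four equal shares
u_even = mcintosh_u(even)       # equals 2.0
```

With equal shares, H' and D reach their maximum for the given number of categories, which is why an even profile is a convenient sanity check.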
Abstract: A face recognition scheme is proposed in which a face image is preprocessed by pixel averaging and energy normalization to reduce the data dimension and the effect of brightness variation, followed by a Fourier transform to estimate the spectrum of the preprocessed image. Principal component analysis is conducted on the spectra of the face images to obtain eigen features. Combining the eigen features with a Parzen classifier, experiments are conducted on the ORL face database.
Abstract: Today, mammography is the best method for early detection of breast cancer. Radiologists fail to detect evident cancerous signs in approximately 20% of false negative mammograms. False negatives have been attributed to the inability of the radiologist to detect abnormalities for several reasons, such as poor image quality, image noise, or eye fatigue. This paper presents a framework for a computer-aided detection system that integrates Principal Component Analysis (PCA), Fisher Linear Discriminant (FLD), and Nearest Neighbor Classifier (KNN) algorithms for the detection of abnormalities in mammograms. Using normal and abnormal mammograms from the MIAS database, the integrated algorithm achieved 93.06% classification accuracy. We also present an analysis of the integrated algorithm's parameters and suggest selection criteria.
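The final classification stage mentioned in this abstract can be illustrated with a minimal nearest-neighbor sketch. It assumes Euclidean distance and a simple majority vote over the k closest training samples; the PCA and FLD feature-extraction stages are not shown, and the feature vectors and labels below are made up for illustration rather than taken from the MIAS database.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=1):
    """Label of `query` by majority vote among its k nearest training
    samples under Euclidean distance."""
    neighbors = sorted(
        ((math.dist(x, query), label) for x, label in zip(train_x, train_y)),
        key=lambda pair: pair[0],
    )
    votes = Counter(label for _, label in neighbors[:k])
    return votes.most_common(1)[0][0]


# Hypothetical 2-D feature vectors (e.g. after dimensionality reduction).
features = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]]
labels = ["normal", "normal", "abnormal", "abnormal"]

pred_a = knn_predict(features, labels, [5.0, 6.0], k=3)
pred_b = knn_predict(features, labels, [0.2, 0.1], k=1)
```

With k = 1 this reduces to the plain nearest neighbor rule; larger k trades sensitivity to single noisy samples for coarser decision boundaries.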
Funding: Supported by the National Key Research and Development Program of China (2020YFE0205700, 2022YFC2303900); the major projects of the National Natural Science Foundation of China (22193064); and the Science Foundation (2022SKLID303) of the State Key Laboratory of Infectious Disease Prevention and Control, China.
Abstract: Introduction: Salmonella is a key intestinal pathogen of foodborne disease, and the plasmids in Salmonella are related to many biological characteristics, including virulence and drug resistance. A large number of plasmid contigs have been sequenced in bacterial draft genomes; however, these are often difficult to distinguish from chromosomal contigs. Methods: In this study, three different customized Kraken databases were used to build three different Kraken classifiers. Complete genome benchmark datasets and simulated draft genome benchmark datasets were constructed. Five-fold cross-validation was used to evaluate the performance of the three Kraken classifiers on the two benchmark datasets. Results: The classifier based on all National Center for Biotechnology Information plasmids and Salmonella complete genomes gave the best predictive performance. This optimal Kraken classifier was applied to Salmonella isolated in China. The plasmid carrying rate of Salmonella in China is 91.01%, and the Kraken classifier found more plasmid contigs and antibiotic resistance genes (ARGs) than a plasmid replicon-based method (PlasmidFinder). Moreover, in the strains carrying ARGs, plasmids carried more ARGs (three; 95% confidence interval (CI): 1–14) than chromosomes (one; 95% CI: 1–7). Discussion: We found that building a high-quality customized database for a Kraken classifier is ideal for predicting Salmonella plasmid sequences from bacterial draft genomes. In the future, the Kraken classifier established in this study will play a significant role in ARG monitoring.
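The five-fold cross-validation used in the Methods can be sketched as an index-splitting routine. This is a generic illustration, not the authors' pipeline: it assumes samples are shuffled once and dealt into five nearly equal folds, each serving in turn as the held-out set. The function name `k_fold_indices` and the sample count are hypothetical, and the Kraken database construction and classification steps are not shown.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds whose sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


# Hypothetical benchmark of 23 genomes split five ways.
splits = list(k_fold_indices(23, k=5))
for train, test in splits:
    # A classifier would be built on `train` and scored on `test` here.
    pass
```

Each sample appears in exactly one held-out fold, so averaging the per-fold scores uses every sample for evaluation once.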
Abstract: Deep learning is a powerful technique that is widely applied to image recognition and natural language processing, among many other tasks. In this work, we propose an efficient technique that uses pre-trained Convolutional Neural Network (CNN) architectures to extract powerful features from images for object recognition. We build on the existing concept of extending the learning from pre-trained CNNs to new databases through activations by proposing to consider multiple deep layers. We exploit the progressive learning that happens at the various intermediate layers of the CNNs to construct Deep Multi-Layer (DM-L) feature extraction vectors that achieve excellent object recognition performance. Two popular pre-trained CNN models, VGG_16 and VGG_19, are used to extract feature sets from three deep fully connected layers inside the models, namely "fc6", "fc7", and "fc8". Using Principal Component Analysis (PCA), the dimensionality of the DM-L feature vectors is reduced to form powerful feature vectors, which are fed to an external classifier ensemble for classification instead of the Softmax-based classification layers of the two original pre-trained CNN models. The proposed DM-L technique is applied to the benchmark Caltech-101 object recognition database. Conventional wisdom may suggest that features extracted from the deepest layer, "fc8", will give better recognition performance than those from "fc6", but our results show otherwise for the two models considered. Our experiments reveal that, for these two models, the "fc6"-based feature vectors achieve the best recognition performance. State-of-the-art recognition performances of 91.17% and 91.35% are achieved with the "fc6"-based feature vectors for the VGG_16 and VGG_19 models, respectively. This recognition performance was achieved with 30 sample images per class, and the proposed system is capable of improved performance when all sample images per class are considered. Our research shows that, for feature extraction based on CNNs, multiple layers should be considered and the layer that maximizes recognition performance should then be selected.
Abstract: Abundant fossil records show that the Fagaceae has remained a dominant component in the Northern Hemisphere since the Cenozoic. However, due to the large number of living species, it is not easy to identify leaves to a particular species. Consequently, the identification of fossil leaves belonging to the Fagaceae is problematic.
Abstract: Over the past years, with increasing high school enrollment, vocational schools have faced great challenges to their existence and development because of the low proficiency of their students and the large gaps among them. The traditional English teaching mode, which employs the same teaching content, teaching methods, and teaching aims for everyone, cannot satisfy students with different English levels. Therefore, in order to change the present situation, this paper proposes a new English teaching mode: classified English teaching. In the new mode, different students are taught with different materials, different methods, and different aims. It can stimulate students' enthusiasm for learning English and help every student develop appropriately.
Abstract: The stress vector-based constitutive model for cohesionless soil, proposed by SHI Hong-yan et al., was applied to analyze the deformation behavior of materials subjected to various stress paths. The analysis shows that the constitutive model captures well the main deformation behavior of cohesionless soil, such as stress-strain nonlinearity, hardening, dilatancy, stress path dependency, non-coaxiality between the directions of the principal stress and the principal strain increment, and the coupling of mean effective and deviatoric stress with deformation. In addition, the model can simultaneously take into account the rotation of the principal stress axes and the influence of the intermediate principal stress on the deformation and strength of soil. The excellent agreement between the predicted and measured behavior indicates the comprehensive applicability of the model.