Based on features extracted from the generalized autoregressive (GAR) model parameters of the received waveform, and using a multilayer perceptron (MLP) neural network classifier, a new digital modulation recognition method is proposed in this paper. Because of the better noise suppression ability of the GAR model and the powerful pattern classification capacity of the MLP neural network classifier, the new method can significantly improve recognition performance at lower SNR, with better robustness. Computer simulations are also performed to assess the performance of the new method.
In recent years, the accuracy of speech recognition (SR) has been one of the most active areas of research. Although SR systems work reasonably well in quiet conditions, they still suffer severe performance degradation in noisy conditions or over distorted channels. It is necessary to search for more robust feature extraction methods to gain better performance in adverse conditions. This paper investigates the performance of conventional and new hybrid speech feature extraction algorithms, namely Mel Frequency Cepstrum Coefficient (MFCC), Linear Prediction Coding Coefficient (LPCC), perceptual linear prediction (PLP), and RASTA-PLP, in noisy conditions using a multivariate Hidden Markov Model (HMM) classifier. The behavior of the proposed system is evaluated using the TIDIGIT human voice corpus, recorded from 208 different adult speakers, in both the training and testing processes. The theoretical basis for the speech processing and classifier procedures is presented, and the recognition results are reported in terms of word recognition rate.
Congenital heart defects, accounting for about 30% of congenital defects, are the most common type. Data show that congenital heart defects have seriously affected the birth rate of healthy newborns. In fetal and neonatal cardiology, medical imaging technology (2D ultrasound, MRI) has proved helpful in detecting congenital defects of the fetal heart and assists sonographers in prenatal diagnosis. Manually recognizing the 2D fetal heart ultrasonic standard plane (FHUSP) is a highly complex task. Compared with manual identification, automatic identification through artificial intelligence can save a lot of time, ensure the efficiency of diagnosis, and improve its accuracy. In this study, a feature extraction method based on texture features (Local Binary Pattern, LBP, and Histogram of Oriented Gradient, HOG) combined with the Bag of Words (BOW) model is carried out, and feature fusion is then performed. Finally, a Support Vector Machine (SVM) is adopted to realize automatic recognition and classification of FHUSP. The data include 788 standard plane data sets and 448 normal and abnormal plane data sets. Compared with some other methods and with single-method models, the classification accuracy of our model is clearly improved, with the highest accuracy reaching 87.35%. We also verify the performance of the model on normal and abnormal planes, where the average classification accuracy is 84.92%. The experimental results show that this method can effectively classify and predict different FHUSP and can provide assistance for sonographers in diagnosing fetal congenital heart disease.
A novel method to extract conic blending features in reverse engineering is presented. Unlike methods that recover constant and variable radius blends from unorganized points, it contains not only novel segmentation and feature recognition techniques, but also a bias-correction technique to capture a more reliable distribution of feature parameters along the spine curve. The segmentation, which depends on point classification, separates the points in the conic blend region from the input point cloud. The available feature parameters of the cross-sectional curves are extracted by slicing the point cloud with planes, conic curve fitting, and parameter estimation and compensation. The extracted parameters and their distribution laws are refined according to statistical theory such as regression analysis and hypothesis testing. The proposed method can accurately capture the original design intentions and conveniently guide the reverse modeling process. Application examples are presented to verify the high precision and stability of the proposed method.
Human action recognition in complex environments is a challenging task. Recently, sparse representation has achieved excellent results on the human action recognition problem under different conditions. The main idea of sparse representation classification is to construct a general classification scheme in which the training samples of each class serve as the dictionary used to express the query sample, and the minimal reconstruction error indicates its corresponding class. However, how to learn a discriminative dictionary is still a difficult problem. In this work, we make two contributions. First, we build a new and robust human action recognition framework by combining a modified sparse classification model with deep convolutional neural network (CNN) features. Second, we construct a novel classification model that consists of a representation-constrained term and a coefficients incoherence term. Experimental results on benchmark datasets show that our modified model obtains competitive results in comparison with other state-of-the-art models.
The geometrical and topological information of 3D computer aided design (CAD) models should be represented as a neutral format file to exchange data between different CAD systems. Exchange of 3D CAD model data implies that companies must exchange complete information about their products, all the way from design and manufacturing to inspection and shipping. This information should be available to each relevant partner over the entire life cycle of the product. This led to the development of an International Organization for Standardization (ISO) neutral file format named the standard for the exchange of product model data (STEP). The literature shows that the feature recognition systems developed so far identify planar, cylindrical, conical and, to some extent, spherical and toroidal surfaces. Advanced surface features such as B-spline surfaces and their subtypes are not identified. Therefore, in this work, a STEP-based feature recognition system is developed to recognize B-spline surface features and their subtypes from a 3D CAD model represented in the AP203 neutral file format. The developed feature recognition system is implemented in the Java programming language, and the product model data represented in STEP AP203 format is interpreted through the Java standard data access interface (JSDAI). The developed system can recognize B-spline surface features such as B-spline surface with knots, quasi-uniform surface, uniform surface, rational surface and Bezier surface. The application of the extracted B-spline surface feature information is discussed with reference to toolpath generation for STEP-NC (STEP AP238).
A unified feature definition is proposed. A feature is form-concentrated and can be used to model product functionalities, assembly relations, and part geometries. The feature model is given and a feature classification is introduced, including functional, assembly, structural, and manufacturing features. A prototype modeling system is developed in Pro/ENGINEER that can define assembly features and user-defined form features.
This paper describes a novel target recognition scheme using High Range Resolution (HRR) radar signatures. The autoregressive (AR) method is used to extract features from HRR radar echoes based on the scattering center model of the target. An optimal linear transformation based on the Euclidean distribution distance criterion is applied to the AR model parameter vectors to further reduce the dimension of the feature vectors and improve their class discrimination capability. The optimization algorithm is designed using the quadratic property of the criterion function and a Gaussian-kernel-based Parzen window density function estimator. The concept of Stochastic Information Gradient (SIG) is incorporated into the gradient of the cost function to decrease the computational complexity of the algorithm. Simulation results using data from three real airplanes show the effectiveness of the proposed method.
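The AR feature extraction step mentioned above can be illustrated with a small sketch. This is not the paper's implementation: it assumes a real-valued, non-zero sample sequence, estimates the coefficients from biased autocorrelations with the Levinson-Durbin recursion, and the names `autocorr` and `ar_features` are hypothetical.

```python
def autocorr(x, max_lag):
    """Biased autocorrelation estimates r[0..max_lag] of a real sequence."""
    n = len(x)
    return [sum(x[t] * x[t + k] for t in range(n - k)) / n
            for k in range(max_lag + 1)]


def ar_features(x, order):
    """Estimate AR coefficients a[1..order] in x[t] ~ sum_k a[k] * x[t-k].

    Solves the Yule-Walker equations with the Levinson-Durbin recursion.
    Assumes x is not identically zero (otherwise r[0] == 0).
    """
    r = autocorr(x, order)
    a = [0.0] * (order + 1)   # a[0] is unused
    err = r[0]                # prediction error power of the order-0 model
    for m in range(1, order + 1):
        # Reflection coefficient for model order m.
        k_m = (r[m] - sum(a[k] * r[m - k] for k in range(1, m))) / err
        new_a = a[:]
        new_a[m] = k_m
        for k in range(1, m):
            new_a[k] = a[k] - k_m * a[m - k]
        a = new_a
        err *= 1.0 - k_m * k_m
    return a[1:]


# Toy signal that decays as 0.5**t, so the first AR coefficient is about 0.5.
signal = [0.5 ** t for t in range(30)]
features = ar_features(signal, 2)
```

The resulting coefficient vector would play the role of the feature vector that is later reduced in dimension; the transformation and Parzen-window steps of the paper are not sketched here.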
Aiming at the axiom of design for manufacture (DFM), this paper describes a recognition method for abstracting compound features from a part model, discloses the basic mechanism of compounding, and builds the corresponding 2D simulation model. The inner association between feature neighboring and feature compounding is discussed in depth and, based on the essential transformation rule of two neighboring features, the corresponding feature adjacency matrix (FAM) of multi-feature entities is generated. For manufacturing features converted from pure design features, an innovative concept, homogeneous compounding, is presented to clarify the architecture of the machining domain. Then, the FAM recurrence elimination algorithm is developed to determine all the compound features and, according to the machining sequence, output a group of machining domains.
A new method for iris recognition using a multi-matching system based on a simplified deformable model of the human iris is proposed. The method defines iris feature points and forms the feature space based on a wavelet transform. In the matching stage, it first performs a crude match. Driven by a simplified deformable iris model, the crude matching is then refined. By means of this multi-matching system, the task of iris recognition is accomplished. This process can preserve the elastic deformation between an input iris image and a template and improve the precision of iris recognition. The experimental results indicate the validity of this method.
It is well known that automatic speech recognition (ASR) is a resource-consuming task. It takes a sufficient amount of data to train a state-of-the-art deep neural network acoustic model. For some low-resource languages where scripted speech is difficult to obtain, data sparsity is the main problem that limits the performance of speech recognition systems. In this paper, several knowledge transfer methods are investigated to overcome the data sparsity problem with the help of high-resource languages. The first is a pre-training and fine-tuning (PT/FT) method, in which the parameters of the hidden layers are initialized with a well-trained neural network. Second, progressive neural networks (Prognets) are investigated. With the help of lateral connections in the network architecture, Prognets are immune to the forgetting effect and superior in knowledge transfer. Finally, bottleneck features (BNF) are extracted using cross-lingual deep neural networks and serve as an enhanced feature to improve the performance of the ASR system. Experiments are conducted on a low-resource Vietnamese dataset. The results show that all three methods yield significant gains over the baseline system, and the Prognets acoustic model performs the best. Further improvements can be obtained by combining the Prognets model and bottleneck features.
Data mining in the educational field can be used to optimize the teaching and learning performance of students. Recently developed machine learning (ML) and deep learning (DL) approaches can be used to mine the data effectively. This study proposes an Improved Sailfish Optimizer-based Feature Selection with Optimal Stacked Sparse Autoencoder (ISOFS-OSSAE) for data mining and pattern recognition in the educational sector. The proposed ISOFS-OSSAE model aims to mine educational data and derive decisions based on the feature selection and classification process. The ISOFS-OSSAE model involves the design of the ISOFS technique to choose an optimal subset of features. In addition, swallow swarm optimization (SSO) with the SSAE model is used to perform the classification process. To showcase the enhanced outcomes of the ISOFS-OSSAE model, a wide range of experiments were conducted on a benchmark dataset from the University of California Irvine (UCI) Machine Learning Repository. The simulation results show the improved classification performance of the ISOFS-OSSAE model over recent state-of-the-art approaches in terms of different performance measures.
Along with the development of motion capture techniques, more and more 3D motion databases have become available. In this paper, a novel approach is presented for motion recognition and retrieval based on ensemble HMM (hidden Markov model) learning. Due to the high dimensionality of motion features, Isomap nonlinear dimension reduction is applied to the training data for ensemble HMM learning. To handle new motion data, Isomap is generalized based on the estimation of the underlying eigenfunctions. Each action class is then learned with one HMM. Since ensemble learning can effectively enhance supervised learning, ensembles of weak HMM learners are built. Experimental results show that the approaches are effective for motion data recognition and retrieval.
Traditional named entity recognition methods need professional domain knowledge and a large amount of human participation to extract features, while Chinese named entity recognition methods based on a neural network model suffer from vector representations that are too singular in the process of character vector representation. To solve these problems, we propose a Chinese named entity recognition method based on the BERT-BiLSTM-ATT-CRF model. First, we use the bidirectional encoder representations from transformers (BERT) pre-trained language model to obtain the semantic vector of each word according to its context information. Second, the word vectors trained by BERT are input into a bidirectional long short-term memory network with an embedded attention mechanism (BiLSTM-ATT) to capture the most important semantic information in the sentence. Finally, a conditional random field (CRF) is used to learn the dependence between adjacent tags to obtain the globally optimal sentence-level tag sequence. The experimental results show that the proposed model achieves state-of-the-art performance on both the Microsoft Research Asia (MSRA) corpus and the People's Daily corpus, with F1 values of 94.77% and 95.97%, respectively.
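The final CRF step, choosing the globally best tag sequence given per-position scores and tag-to-tag transition scores, is commonly done with Viterbi decoding. The following is a minimal sketch under that assumption, not the paper's code: the function name `viterbi`, the toy scores, and the O/B/I tag set are illustrative.

```python
def viterbi(emissions, transitions):
    """Return the highest-scoring tag index sequence.

    emissions[t][j]   : score of tag j at position t (e.g. from a BiLSTM)
    transitions[i][j] : score of tag i being followed by tag j
    """
    n_tags = len(transitions)
    score = list(emissions[0])
    backptrs = []
    for t in range(1, len(emissions)):
        new_score, ptr = [], []
        for j in range(n_tags):
            # Best previous tag for ending in tag j at position t.
            best_i = max(range(n_tags),
                         key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j]
                             + emissions[t][j])
            ptr.append(best_i)
        score = new_score
        backptrs.append(ptr)
    # Trace back from the best final tag.
    path = [max(range(n_tags), key=lambda j: score[j])]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    path.reverse()
    return path


# Tags: 0 = O, 1 = B, 2 = I. The transition O -> I is heavily penalised.
emissions = [[1.0, 0.8, 0.0],
             [0.0, 0.0, 2.0]]
transitions = [[0.0, 0.0, -100.0],
               [0.0, 0.0, 0.0],
               [0.0, 0.0, 0.0]]
best_path = viterbi(emissions, transitions)
```

In this toy case a per-position argmax would pick O then I, which the transition scores forbid; decoding over the whole sequence selects B then I instead, which is the kind of adjacent-tag dependence the abstract refers to.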
In order to improve Mandarin vowel pronunciation quality assessment, a novel formant feature is proposed and applied to formant classification for Chinese Mandarin vowel pronunciation quality evaluation. The formant candidates of each frame are plotted on the time-frequency plane to form a bitmap, and its Gabor feature is extracted to represent the formant trajectory. The feature is then classified using a GMM, and the classification posterior probability is mapped to a pronunciation quality grade. Experiments comparing the Gabor-transform-based formant trajectory feature with several traditionally used features show that this method achieves a human-machine scoring correlation coefficient (CC) of 0.842, which is better than the 0.832 obtained by traditional speech recognition techniques. Considering that the long-term information of formant classification and the short-term information of speech recognition techniques are complementary, combining their results with linear or nonlinear methods is also investigated to further improve evaluation performance. Experiments on PSK show that the best CC of 0.913, which is very close to the inter-human rating correlation of 0.94, is obtained using a neural network.
The Wake-Up-Word Speech Recognition (WUW-SR) task is computationally very demanding, particularly the feature extraction stage, whose output is decoded with the corresponding Hidden Markov Models (HMMs) in the back-end stage of the WUW-SR. The state-of-the-art WUW-SR system is based on three different sets of features: Mel-Frequency Cepstral Coefficients (MFCC), Linear Predictive Coding Coefficients (LPC), and Enhanced Mel-Frequency Cepstral Coefficients (ENH_MFCC). In "Front-end of Wake-Up-Word Speech Recognition System Design on FPGA" [1], we presented an experimental FPGA design and implementation of a novel architecture for a real-time spectrogram extraction processor that generates MFCC, LPC, and ENH_MFCC spectrograms simultaneously. In this paper, the details of converting the three sets of spectrograms (MFCC, LPC, and ENH_MFCC) to their equivalent features are presented. In the WUW-SR system, the recognizer's front-end is located at the terminal, which is typically connected over a data network to a remote back-end recognizer (e.g., a server). The WUW-SR is shown in Figure 1. The three sets of speech features are extracted at the front-end. These extracted features are then compressed and transmitted to the server via a dedicated channel, where they are subsequently decoded.
In expression recognition, feature representation is critical for successful recognition since it contains the distinctive information of expressions. In this paper, a new approach for representing facial expression features is proposed, with the objective of describing features effectively and efficiently in order to improve recognition performance. The method combines the facial action coding system (FACS) and "uniform" local binary patterns (LBP) to represent facial expression features from coarse to fine. The facial feature regions are extracted by active shape models (ASM) based on FACS to obtain the gray-level texture. Then, LBP is used to represent expression features and enhance their discriminative power. A facial expression recognition system is developed based on this feature extraction method, using a K-nearest neighbor (K-NN) classifier to recognize facial expressions. Finally, experiments are carried out to evaluate this feature extraction method. The results indicate the significance of removing unrelated facial regions and enhancing the discrimination ability of expression features in the recognition process, in addition to the convenience of the method.
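As background for the "uniform" LBP descriptor named above, the sketch below computes the basic 8-neighbor LBP code of a 3×3 gray-level patch and tests whether a code is uniform (at most two 0/1 transitions around the circular bit pattern). It is a generic illustration rather than the paper's pipeline; the function names and the clockwise bit ordering are assumptions.

```python
def lbp_code(patch):
    """8-bit LBP code of a 3x3 gray-level patch (list of three rows).

    Neighbours are visited clockwise from the top-left pixel; a neighbour
    contributes a 1 bit when it is >= the centre value.
    """
    center = patch[1][1]
    ring = [patch[0][0], patch[0][1], patch[0][2], patch[1][2],
            patch[2][2], patch[2][1], patch[2][0], patch[1][0]]
    return sum((1 if v >= center else 0) << i for i, v in enumerate(ring))


def is_uniform(code):
    """True when the circular 8-bit pattern has at most two bit transitions."""
    bits = [(code >> i) & 1 for i in range(8)]
    return sum(bits[i] != bits[(i + 1) % 8] for i in range(8)) <= 2


edge_patch = [[9, 9, 9],
              [1, 5, 9],
              [1, 1, 1]]
checker_patch = [[9, 1, 9],
                 [1, 5, 1],
                 [9, 1, 9]]
edge_code = lbp_code(edge_patch)        # bright upper-right half
checker_code = lbp_code(checker_patch)  # alternating neighbours
n_uniform = sum(1 for c in range(256) if is_uniform(c))
```

With 8 neighbors there are 58 uniform codes, so a uniform-LBP histogram typically uses 58 bins plus one shared bin for all non-uniform codes, which is what makes the descriptor compact.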
This work describes an improved feature extractor algorithm that extracts the peripheral features of a point x(ti,fj) using a nonlinear algorithm to compute the nonlinear time spectrum (NL-TS) pattern. The algorithm observes n×n neighborhoods of the point in all directions, and then incorporates the peripheral features, with different feature combinations, as replacements for the dynamic features in the Mel frequency cepstrum components (MFCCs)-based feature extractor of the Tsinghua electronic engineering speech processing (THEESP) Mandarin automatic speech recognition (MASR) system. In this algorithm, the orthogonal bases are extracted directly from the speech data using the discrete cosine transform (DCT) with 3×3 blocks on an NL-TS pattern as the peripheral features. The new primal bases are then selected and simplified in the form of the dp/dt operator in the time direction and the dp/df operator in the frequency direction. The algorithm achieves a 23.29% relative error rate improvement over the standard MFCC feature set with dynamic features in tests using THEESP with the duration distribution-based hidden Markov model (DDBHMM) based MASR system.
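To make the "DCT with 3×3 blocks" step concrete, the sketch below applies a 2D DCT to a single 3×3 block of a time-frequency pattern. It assumes the orthonormal DCT-II, which the abstract does not specify, and the names `dct_matrix` and `dct2` are hypothetical.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis: row k is the k-th cosine basis vector."""
    return [[(math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n))
             * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
             for i in range(n)]
            for k in range(n)]


def dct2(block):
    """2D DCT of a square block, computed as C * X * C^T."""
    n = len(block)
    c = dct_matrix(n)
    # Transform along the first axis: tmp = C * X.
    tmp = [[sum(c[k][i] * block[i][j] for i in range(n)) for j in range(n)]
           for k in range(n)]
    # Transform along the second axis: result = tmp * C^T.
    return [[sum(tmp[k][j] * c[l][j] for j in range(n)) for l in range(n)]
            for k in range(n)]


# A constant 3x3 block: all energy should land in the DC coefficient.
flat_block = [[2.0, 2.0, 2.0] for _ in range(3)]
coeffs = dct2(flat_block)
```

For a flat block only the top-left (DC) coefficient is non-zero; the remaining coefficients respond to variation along the two axes of the block, which is the kind of local time and frequency change the peripheral features are meant to capture.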
In order to solve the problem of indoor place recognition for indoor service robots, a novel algorithm, clustering of features and images (CFI), is proposed in this work. Unlike traditional indoor place recognition methods, which are based on kernels or bags of features with a large-margin classifier, CFI is based on feature matching, image similarity, and clustering of features and images. It establishes independent local feature clusters by feature cloud registration to represent each room, and defines an image distance to describe the similarity between images or feature clusters, which determines the label of query images. In addition, it improves recognition speed by image scaling, and uses state inertia and a hidden Markov model to constrain state transitions and eliminate implausible recognitions, achieving remarkable precision and speed. A series of experiments on standard databases shows that the algorithm achieves a recognition rate of up to 97% at a speed of over 30 fps, which is far superior to traditional methods. Its precision and speed demonstrate its discriminative power in complicated environments.
Funding (fetal heart ultrasonic standard plane study): Supported by the Fujian Provincial Science and Technology Major Project (No. 2020HZ02014), by grants from the National Natural Science Foundation of Fujian (2021J01133, 2021J011404), and by the Quanzhou Scientific and Technological Planning Projects (Nos. 2018C113R, 2019C028R, 2019C029R, 2019C076R and 2019C099R).
Funding (conic blending feature extraction study): This project is supported by General Electric Company and the National Advanced Technology Project of China (No. 863-511-942-018).
Funding (human action recognition study): This research was funded by the National Natural Science Foundation of China (21878124, 31771680 and 61773182).
文摘Human action recognition under complex environment is a challenging work.Recently,sparse representation has achieved excellent results of dealing with human action recognition problem under different conditions.The main idea of sparse representation classification is to construct a general classification scheme where the training samples of each class can be considered as the dictionary to express the query class,and the minimal reconstruction error indicates its corresponding class.However,how to learn a discriminative dictionary is still a difficult work.In this work,we make two contributions.First,we build a new and robust human action recognition framework by combining one modified sparse classification model and deep convolutional neural network(CNN)features.Secondly,we construct a novel classification model which consists of the representation-constrained term and the coefficients incoherence term.Experimental results on benchmark datasets show that our modified model can obtain competitive results in comparison to other state-of-the-art models.
文摘The geometrical and topological information of 3D computer aided design (CAD) models should be represented as a neut- ral format file to exchange the data between different CAD systems. Exchange of 3D CAD model data implies that the companies must exchange complete information about their products, all the way from design, manufacturing to inspection and shipping. This informa- tion should be available to each relevant partner over the entire life cycle of the product. This led to the development of an international standard organization (ISO) neutral format file named as standard for the exchange of product model data (STEP). It has been ob- served from the literature, the feature recognition systems developed were identified as planar, cylindrical, conical and to some extent spherical and toroidal surfaces. The advanced surface features such as B-spline and its subtypes are not identified. Therefore, in this work, a STEP-based feature recognition system is developed to recognize t--spline surface features and its sub-types from the 3D CAD model represented in AP203 neutral file format. The developed feature recognition system is implemented in Java programming language and the product model data represented in STEP AP203 format is interpreted through Java standard data access interface (JSDAI). The developed system could recognize B-spline surface features such as B-Spline surface with knots, quasi uniform surface, uniform surface, rational surface and Bezier surface. The application of extracted B-spline surface features information is discussed with reference to the toolpath generation for STEP-NC (STEP AP238).
Abstract: A unified feature definition is proposed. A feature is form-concentrated and can be used to model product functionalities, assembly relations, and part geometries. The feature model is given, and a feature classification is introduced that includes functional, assembly, structural, and manufacturing features. A prototype modeling system is developed in Pro/ENGINEER that can define assembly features and user-defined form features.
Funding: Supported by the Basic Research Foundation of Tsinghua National Laboratory for Information Science and Technology (TNList) and the Major Program of the National Natural Science Foundation of China (No. 60496311)
Abstract: This paper describes a novel target recognition scheme using High Range Resolution (HRR) radar signatures. The AutoRegressive (AR) method is used to extract features from HRR radar echoes based on the scattering center model of the target. An optimal linear transformation based on the Euclidean distribution distance criterion is applied to the AR model parameter vectors to further reduce the dimension of the feature vectors and improve their class discrimination capability. The optimization algorithm is designed using the quadratic property of the criterion function and a Gaussian kernel based Parzen window density function estimator. The concept of Stochastic Information Gradient (SIG) is incorporated into the gradient of the cost function to decrease the computational complexity of the algorithm. Simulation results using data from three real airplanes show the effectiveness of the proposed method.
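To make the AR feature extraction step concrete, the sketch below estimates AR coefficients from a real-valued sequence by solving the Yule-Walker equations with the Levinson-Durbin recursion. The abstract does not say which AR estimator the authors use, so this choice is an assumption, as are the names `autocorrelation`, `levinson_durbin`, and `ar_features` and the use of a real-valued toy signal in place of HRR radar echoes.

```python
def autocorrelation(signal, max_lag):
    """Biased (unnormalized) autocorrelation r[0..max_lag] of a real sequence."""
    n = len(signal)
    return [sum(signal[i] * signal[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the Yule-Walker equations for x[n] ~ sum_k a[k] * x[n-k].

    Returns (coefficients a[1..order], final prediction error power).
    """
    a = [0.0] * (order + 1)  # a[0] is unused
    err = r[0]
    for m in range(1, order + 1):
        acc = r[m] - sum(a[k] * r[m - k] for k in range(1, m))
        reflection = acc / err
        updated = a[:]
        updated[m] = reflection
        for k in range(1, m):
            updated[k] = a[k] - reflection * a[m - k]
        a = updated
        err *= 1.0 - reflection * reflection
    return a[1:], err


def ar_features(signal, order):
    """AR coefficient vector used as a feature vector (hypothetical helper)."""
    coeffs, _ = levinson_durbin(autocorrelation(signal, order), order)
    return coeffs


# Toy check: a decaying exponential follows x[n] = 0.5 * x[n-1].
toy_signal = [0.5 ** n for n in range(64)]
print(ar_features(toy_signal, 2))
```

For the toy signal the first coefficient comes out close to 0.5 and the second close to 0, which matches the first-order recurrence that generated it. The dimension reduction by optimal linear transformation described in the abstract is not shown.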
Abstract: Guided by the axiom of design for manufacture (DFM), this paper describes a recognition method for abstracting compound features from a part model, discloses the basic mechanism of compounding, and builds the corresponding 2D simulation model. The inner association between feature neighboring and feature compounding is discussed in depth and, based on the essential transforming rule of two neighboring features, the corresponding feature adjacency matrix (FAM) of multi-feature entities is generated. For manufacturing features converted from pure design features, an innovative concept, homogeneous compounding, is presented to clarify the architecture of the machining domain. Then, the FAM recurrence elimination algorithm is developed to determine all the compound features and, according to the machining sequence, output a group of machining domains.
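The abstract does not spell out the FAM recurrence elimination algorithm, so the sketch below does not reproduce it. It only illustrates the underlying idea of reading compound-feature candidates off a feature adjacency matrix, substituting a plain connected-components search. The function name `compound_groups` and the 0/1 matrix encoding (entry `fam[i][j]` is 1 when features `i` and `j` are neighbors) are assumptions made for the example.

```python
def compound_groups(fam):
    """Group mutually adjacent features using a symmetric 0/1 adjacency matrix.

    Stand-in for the paper's FAM recurrence elimination: a depth-first
    connected-components search. Isolated features are not reported.
    """
    n = len(fam)
    seen = [False] * n
    groups = []
    for start in range(n):
        if seen[start]:
            continue
        seen[start] = True
        stack, group = [start], []
        while stack:
            i = stack.pop()
            group.append(i)
            for j in range(n):
                if fam[i][j] and not seen[j]:
                    seen[j] = True
                    stack.append(j)
        if len(group) > 1:
            groups.append(sorted(group))
    return groups


# Toy part (hypothetical): features 0-1-2 form a chain, 3 and 4 stand alone.
fam = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
print(compound_groups(fam))
```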
Abstract: A new method for iris recognition is proposed that uses a multi-matching system based on a simplified deformable model of the human iris. The method defines iris feature points and forms the feature space based on a wavelet transform. The matching stage first works in a crude manner; the crude matching is then refined, driven by a simplified deformable iris model. By means of this multi-matching system, the task of iris recognition is accomplished. The process preserves the elastic deformation between an input iris image and a template and improves the precision of iris recognition. The experimental results indicate the validity of this method.
Funding: Partially supported by the National Natural Science Foundation of China (11590770-4, U1536117), the National Key Research and Development Program of China (2016YFB0801203, 2016YFB0801200), the Key Science and Technology Project of the Xinjiang Uygur Autonomous Region (2016A03007-1), and the Pre-research Project for Equipment of General Information System (JZX2017-0994/Y306)
Abstract: It is well known that automatic speech recognition (ASR) is a resource-consuming task. Training a state-of-the-art deep neural network acoustic model takes a sufficient amount of data. For some low-resource languages where scripted speech is difficult to obtain, data sparsity is the main problem that limits the performance of speech recognition systems. In this paper, several knowledge transfer methods are investigated to overcome the data sparsity problem with the help of high-resource languages. The first is a pre-training and fine-tuning (PT/FT) method, in which the parameters of the hidden layers are initialized with a well-trained neural network. Second, progressive neural networks (Prognets) are investigated. With the help of lateral connections in the network architecture, Prognets are immune to the forgetting effect and superior in knowledge transfer. Finally, bottleneck features (BNF) are extracted using cross-lingual deep neural networks and serve as enhanced features to improve the performance of the ASR system. Experiments are conducted on a low-resource Vietnamese dataset. The results show that all three methods yield significant gains over the baseline system, and the Prognets acoustic model performs best. Further improvements can be obtained by combining the Prognets model and bottleneck features.
Abstract: Data mining in the educational field can be used to optimize teaching and learning performance among students. Recently developed machine learning (ML) and deep learning (DL) approaches can be used to mine the data effectively. This study proposes an Improved Sailfish Optimizer-based Feature Selection with Optimal Stacked Sparse Autoencoder (ISOFS-OSSAE) for data mining and pattern recognition in the educational sector. The proposed ISOFS-OSSAE model aims to mine educational data and derive decisions based on the feature selection and classification process. The ISOFS-OSSAE model involves the design of the ISOFS technique to choose an optimal subset of features. In addition, swallow swarm optimization (SSO) with the SSAE model is used to perform the classification process. To showcase the enhanced outcomes of the ISOFS-OSSAE model, a wide range of experiments were carried out on a benchmark dataset from the University of California Irvine (UCI) Machine Learning Repository. The simulation results showed improved classification performance of the ISOFS-OSSAE model over recent state-of-the-art approaches in terms of different performance measures.
Funding: Project supported by the National Natural Science Foundation of China (Nos. 60533090 and 60525108), the National Basic Research Program (973) of China (No. 2002CB312101), and the Science and Technology Project of Zhejiang Province (Nos. 2005C13032 and 2005C11001-05), China
Abstract: With the development of motion capture techniques, more and more 3D motion databases are becoming available. In this paper, a novel approach is presented for motion recognition and retrieval based on ensemble HMM (hidden Markov model) learning. Because motion features are high-dimensional, Isomap nonlinear dimension reduction is applied to the training data for ensemble HMM learning. To handle new motion data, Isomap is generalized based on the estimation of underlying eigenfunctions. Each action class is then learned with one HMM. Since ensemble learning can effectively enhance supervised learning, ensembles of weak HMM learners are built. Experimental results show that the approach is effective for motion data recognition and retrieval.
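The "one HMM per action class" step can be illustrated with a small sketch: score an observation sequence under each class model with the forward algorithm and pick the most likely class. Several simplifications are assumptions of this example and not taken from the paper: the observations are discrete symbols (the paper works with Isomap-reduced motion features), there is a single model per class and no ensemble, and the names `forward_likelihood`, `classify`, and the model parameters are hypothetical. No scaling is applied, so the sketch is only suitable for short sequences.

```python
def forward_likelihood(obs, start, trans, emit):
    """P(obs | model) for a discrete HMM, computed with the forward algorithm.

    start[s]: initial probability of state s
    trans[p][s]: probability of moving from state p to state s
    emit[s][o]: probability that state s emits symbol o
    """
    n_states = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    for symbol in obs[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][symbol]
            for s in range(n_states)
        ]
    return sum(alpha)


def classify(obs, models):
    """Pick the class whose HMM assigns the highest likelihood to `obs`."""
    return max(models, key=lambda name: forward_likelihood(obs, *models[name]))


# Hypothetical two-state models over two quantized motion symbols (0 and 1).
sticky = [[0.9, 0.1], [0.1, 0.9]]
models = {
    "walk": ([0.5, 0.5], sticky, [[0.8, 0.2], [0.6, 0.4]]),
    "jump": ([0.5, 0.5], sticky, [[0.2, 0.8], [0.4, 0.6]]),
}
print(classify([0, 0, 1, 0, 0], models))
```

An ensemble version in the spirit of the abstract would train several weak HMMs per class and combine their scores, for example by voting.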
Abstract: Traditional named entity recognition methods need professional domain knowledge and a large amount of human participation to extract features, while Chinese named entity recognition methods based on neural network models suffer from character vector representations that are too singular. To solve these problems, we propose a Chinese named entity recognition method based on the BERT-BiLSTM-ATT-CRF model. First, we use the bidirectional encoder representations from transformers (BERT) pre-trained language model to obtain the semantic vector of each word according to its context. Second, the word vectors trained by BERT are fed into a bidirectional long short-term memory network with an embedded attention mechanism (BiLSTM-ATT) to capture the most important semantic information in the sentence. Finally, a conditional random field (CRF) is used to learn the dependence between adjacent tags and obtain the globally optimal sentence-level tag sequence. The experimental results show that the proposed model achieves state-of-the-art performance on both the Microsoft Research Asia (MSRA) corpus and the People's Daily corpus, with F1 values of 94.77% and 95.97%, respectively.
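The final CRF step, which picks the globally best tag sequence instead of the best tag at each position, is usually decoded with the Viterbi algorithm. The sketch below shows that decoding on hand-written scores. The function name `viterbi`, the three-tag set, and all score values are hypothetical; in the model described above the emission scores would come from the BiLSTM-ATT layer and the transition scores would be learned.

```python
def viterbi(emissions, transitions):
    """Return the highest-scoring tag index sequence.

    emissions[t][j]: score of tag j at position t
    transitions[i][j]: score of moving from tag i to tag j
    """
    n_tags = len(transitions)
    score = list(emissions[0])
    backpointers = []
    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for j in range(n_tags):
            best_prev = max(range(n_tags), key=lambda i: score[i] + transitions[i][j])
            pointers.append(best_prev)
            new_score.append(score[best_prev] + transitions[best_prev][j] + emissions[t][j])
        score = new_score
        backpointers.append(pointers)
    path = [max(range(n_tags), key=lambda j: score[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    return path[::-1]


tags = ["O", "B-PER", "I-PER"]
# Hypothetical scores: "O" followed by "I-PER" is heavily penalized.
transitions = [
    [0.0, 0.0, -10.0],  # from O
    [0.0, 0.0, 1.0],    # from B-PER
    [0.0, 0.0, 0.5],    # from I-PER
]
emissions = [
    [2.0, 1.0, 0.0],
    [0.5, 1.0, 2.0],
    [2.0, 0.0, 0.0],
]
print([tags[i] for i in viterbi(emissions, transitions)])
```

Taking the best tag independently at each position would give O, I-PER, O, which contains the penalized O to I-PER transition. The joint decoding avoids it by starting the entity with B-PER.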
Funding: Project (61062011) supported by the National Natural Science Foundation of China; Project (2010GXNSFA013128) supported by the Natural Science Foundation of Guangxi Province, China
Abstract: To improve Mandarin vowel pronunciation quality assessment, a novel formant feature is proposed and applied to formant classification for Chinese Mandarin vowel pronunciation quality evaluation. Formant candidates of each frame are plotted on the time-frequency plane to form a bitmap, and its Gabor feature is extracted to represent the formant trajectory. The feature is then classified using a Gaussian mixture model (GMM), and the classification posterior probability is mapped to a pronunciation quality grade. Experiments comparing the Gabor transformation based formant trajectory feature with several traditionally used features show that this method achieves a human-machine scoring correlation coefficient (CC) of 0.842, which is better than the 0.832 obtained by traditional speech recognition techniques. In addition, because the long-term information of formant classification and the short-term information of speech recognition techniques are complementary, combining their results with linear or nonlinear methods is investigated to further improve the evaluation performance. Experiments on PSK show that the best CC of 0.913, which is very close to the inter-human rating correlation of 0.94, is obtained using a neural network.
Abstract: The Wake-Up-Word Speech Recognition (WUW-SR) task is computationally very demanding, particularly the feature extraction stage, whose output is decoded with the corresponding Hidden Markov Models (HMMs) in the back-end stage of the WUW-SR. The state-of-the-art WUW-SR system is based on three different sets of features: Mel-Frequency Cepstral Coefficients (MFCC), Linear Predictive Coding Coefficients (LPC), and Enhanced Mel-Frequency Cepstral Coefficients (ENH_MFCC). In "Front-End of Wake-Up-Word Speech Recognition System Design on FPGA" [1], we presented an experimental FPGA design and implementation of a novel architecture for a real-time spectrogram extraction processor that generates MFCC, LPC, and ENH_MFCC spectrograms simultaneously. This paper presents the details of converting the three sets of spectrograms, 1) MFCC, 2) LPC, and 3) ENH_MFCC, to their equivalent features. In the WUW-SR system, the recognizer's front-end is located at the terminal, which is typically connected over a data network to a remote back-end recognizer (e.g., a server). The WUW-SR is shown in Figure 1. The three sets of speech features are extracted at the front-end. The extracted features are then compressed and transmitted to the server via a dedicated channel, where they are subsequently decoded.
Funding: Supported by the National Natural Science Foundation of China (No. 61273339)
Abstract: In expression recognition, feature representation is critical for successful recognition since it contains the distinctive information of expressions. In this paper, a new approach for representing facial expression features is proposed, with the objective of describing features effectively and efficiently in order to improve recognition performance. The method combines the facial action coding system (FACS) and "uniform" local binary patterns (LBP) to represent facial expression features from coarse to fine. The facial feature regions are extracted by active shape models (ASM) based on FACS to obtain the gray-level texture. Then, LBP is used to represent the expression features and enhance their discriminative power. A facial expression recognition system is developed based on this feature extraction method, using a K-nearest neighbor (K-NN) classifier to recognize facial expressions. Finally, experiments are carried out to evaluate this feature extraction method. In addition to the method's convenience, the results indicate the significance of removing unrelated facial regions and enhancing the discrimination ability of expression features in the recognition process.
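As an illustration of the "uniform" LBP descriptor mentioned above, the sketch below computes 8-neighbor LBP codes on a gray-level array and accumulates them into a 59-bin histogram (58 uniform patterns plus one shared bin for all non-uniform patterns). It assumes the basic 3×3 neighborhood, and the function names and toy patches are hypothetical. The ASM and FACS region extraction and the K-NN classifier from the abstract are not shown.

```python
def lbp_bits(patch):
    """Threshold the 8 neighbors of a 3x3 patch against its center pixel."""
    center = patch[1][1]
    # Neighbors visited clockwise, starting at the top-left corner.
    ring = [patch[0][0], patch[0][1], patch[0][2], patch[1][2],
            patch[2][2], patch[2][1], patch[2][0], patch[1][0]]
    return [1 if value >= center else 0 for value in ring]


def lbp_code(bits):
    """Pack the bit list into an integer in [0, 255]."""
    return sum(bit << i for i, bit in enumerate(bits))


def is_uniform(bits):
    """A pattern is uniform if its circular bit string has at most 2 transitions."""
    transitions = sum(bits[i] != bits[(i + 1) % len(bits)] for i in range(len(bits)))
    return transitions <= 2


def uniform_lbp_histogram(image):
    """Histogram with one bin per uniform code and one bin for everything else."""
    uniform_codes = [code for code in range(256)
                     if is_uniform([(code >> i) & 1 for i in range(8)])]
    bin_of = {code: index for index, code in enumerate(uniform_codes)}
    hist = [0] * (len(uniform_codes) + 1)
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            patch = [row[c - 1:c + 2] for row in image[r - 1:r + 2]]
            code = lbp_code(lbp_bits(patch))
            hist[bin_of.get(code, len(uniform_codes))] += 1
    return hist


edge = [[9, 9, 9], [1, 5, 1], [1, 1, 1]]     # bright top row: uniform pattern
checker = [[9, 1, 9], [1, 5, 1], [9, 1, 9]]  # alternating: non-uniform pattern
print(is_uniform(lbp_bits(edge)), is_uniform(lbp_bits(checker)))
```

Histograms computed per facial region can then be concatenated into one feature vector and passed to a nearest-neighbor classifier.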
Funding: Supported by the National High-Tech Research and Development (863) Program of China (No. 200/AA/14)
Abstract: This work describes an improved feature extraction algorithm that extracts the peripheral features of a point x(ti, fj) using a nonlinear algorithm to compute the nonlinear time spectrum (NL-TS) pattern. The algorithm observes n×n neighborhoods of the point in all directions and then incorporates the peripheral features into the Mel frequency cepstrum component (MFCC)-based feature extractor of the Tsinghua electronic engineering speech processing (THEESP) Mandarin automatic speech recognition (MASR) system, as replacements for the dynamic features, with different feature combinations. In this algorithm, the orthogonal bases are extracted directly from the speech data using the discrete cosine transform (DCT) with 3×3 blocks on an NL-TS pattern as the peripheral features. The new primal bases are then selected and simplified in the form of a derivative operator on p in the time direction (t) and a derivative operator on p in the frequency direction (f). The algorithm achieves a 23.29% relative improvement in error rate in comparison with the standard MFCC feature set and the dynamic features in tests using THEESP with the duration distribution-based hidden Markov model (DDBHMM)-based MASR system.
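To illustrate the 3×3 block DCT step, the sketch below applies an orthonormal 2D DCT-II to a 3×3 block. It also shows that the first-order 3-point DCT basis vector is proportional to (1, 0, -1), which is a simple difference operator. This is offered only as one plausible reading of how the bases might simplify to derivative-type operators, not as the authors' derivation. The function names, the DCT normalization, and the convention that rows index frequency and columns index time are assumptions.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis: row k is the k-th cosine basis vector."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def dct2(block):
    """2D DCT-II of a square block: coeff[u][v] = sum_ij c[u][i] block[i][j] c[v][j].

    Assumed layout: i (rows) is the frequency axis, j (columns) the time axis,
    so u indexes the frequency-direction basis and v the time-direction basis.
    """
    n = len(block)
    c = dct_matrix(n)
    return [[sum(c[u][i] * block[i][j] * c[v][j]
                 for i in range(n) for j in range(n))
             for v in range(n)] for u in range(n)]


# The first-order 3-point basis vector acts like a difference operator.
print(dct_matrix(3)[1])

# A block that only changes along time has energy at (u=0, v=1) but not (u=1, v=0).
ramp_in_time = [[0.0, 1.0, 2.0]] * 3
coeff = dct2(ramp_in_time)
print(coeff[0][1], coeff[1][0])
```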
Funding: Supported by the National Natural Science Foundation of China (Nos. 61305103 and 61473103), the Natural Science Foundation of Heilongjiang Province (No. QC2014C072), the Postdoctoral Science Foundation of Heilongjiang (No. LBH-Z14108), and the Self-Planned Task of the State Key Laboratory of Robotics and System (HIT) (No. SKLRS201609B)
Abstract: To solve the problem of indoor place recognition for indoor service robots, a novel algorithm, clustering of features and images (CFI), is proposed in this work. Unlike traditional indoor place recognition methods, which are based on kernels or bags of features with a large margin classifier, CFI is based on feature matching, image similarity, and clustering of features and images. It establishes independent local feature clusters by feature cloud registration to represent each room, and defines an image distance to describe the similarity between images or feature clusters, which determines the label of query images. In addition, it improves recognition speed by image scaling, and uses state inertia and a hidden Markov model to constrain state transitions and eliminate implausible misrecognitions, achieving remarkable precision and speed. A series of experiments on standard databases shows that the algorithm achieves a recognition rate of up to 97% at a speed of over 30 fps, which is much superior to traditional methods. Its precision and speed demonstrate great discriminative power in complicated environments.