Funding: Project (60763001) supported by the National Natural Science Foundation of China; Projects (2009GZS0027, 2010GZS0072) supported by the Natural Science Foundation of Jiangxi Province, China
Abstract: To overcome the defects of the classical hidden Markov model (HMM), a new statistical model, the Markov family model (MFM), was proposed and applied to speech recognition and natural language processing. Speaker-independent continuous speech recognition experiments and part-of-speech tagging experiments show that the Markov family model outperforms the hidden Markov model. Precision increases from 94.642% to 96.214% in the part-of-speech tagging experiments, and the word error rate is reduced by 11.9% in the speech recognition experiments relative to the HMM baseline system.
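To put the tagging figures above on the same footing as an error-rate reduction, the gain can be re-expressed as a relative reduction in error. The sketch below is only illustrative arithmetic: it assumes the reported precision can be read as per-token accuracy, so that error is 100 minus precision, and the helper name `relative_error_reduction` is hypothetical, not taken from the paper.

```python
def relative_error_reduction(acc_before: float, acc_after: float) -> float:
    """Relative error reduction in percent, given two accuracies in percent.

    Assumption: error rate = 100 - accuracy for both systems.
    """
    err_before = 100.0 - acc_before
    err_after = 100.0 - acc_after
    return 100.0 * (err_before - err_after) / err_before


# Figures quoted in the abstract: HMM baseline vs. Markov family model.
reduction = relative_error_reduction(94.642, 96.214)
print(f"{reduction:.2f}% relative error reduction")
```

Under that assumption, the tagging error falls from 5.358% to 3.786%, which is roughly a 29% relative reduction.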
Abstract: In recent years, improving the accuracy of speech recognition (SR) has been one of the most active areas of research. Although SR systems work reasonably well in quiet conditions, they still suffer severe performance degradation in noisy conditions or over distorted channels. More robust feature extraction methods are needed to gain better performance in adverse conditions. This paper investigates the performance of conventional and new hybrid speech feature extraction algorithms based on Mel Frequency Cepstrum Coefficient (MFCC), Linear Prediction Coding Coefficient (LPCC), perceptual linear prediction (PLP), and RASTA-PLP in noisy conditions, using a multivariate Hidden Markov Model (HMM) classifier. The behavior of the proposed system is evaluated on the TIDIGIT human voice corpus, recorded from 208 different adult speakers, in both the training and testing processes. The theoretical basis for the speech processing and classifier procedures is presented, and the recognition results are reported as word recognition rate.
Funding: Supported by the National Natural Science Foundation of China
Abstract: This paper presents a method of tone recognition for Mandarin speech that combines the wavelet transform with hidden Markov modeling. A pitch detector based on singularity detection and multi-resolution wavelet analysis is employed to estimate pitch periods, and hidden Markov modeling with partitioned Gaussian mixture probability density functions is used for tone recognition. The algorithm achieves recognition accuracies of 97.22% and 94.47% for speaker-dependent and speaker-independent tone recognition, respectively.
Abstract: Because the performance parameters of gears degrade over time, a method is proposed to recognize and analyze gear faults using the hidden Markov model (HMM). First, the delayed correlation-envelope method is used to extract features from vibration signals. Then, separate HMMs are trained on data from the normal condition, the gear root crack condition, and the gear root breaking condition. The trained HMMs are then used for pattern recognition and model assessment. Finally, the results from the standard HMM and the proposed method are compared, which shows that the proposed methodology is feasible and effective.
Abstract: Heart murmur recognition and classification play an important role in auscultative diagnosis. A method based on the hidden Markov model (HMM) is presented to recognize heart murmurs. The murmur is isolated using wavelet analysis, taking into account the time-frequency characteristics of the heart murmur. The method uses Mel frequency cepstral coefficients (MFCC) to extract representative features and develops a hidden Markov model (HMM) for signal classification. The results show that this method recognizes murmurs efficiently and is superior to a BP neural network (94.2% vs 82.8%). The findings suggest that the method has the potential to assist doctors in reaching a more objective diagnosis.
Funding: Supported by Grant-in-Aid for Young Scientists (A) (Grant No. 26700021), Japan Society for the Promotion of Science, and the Strategic Information and Communications R&D Promotion Programme (Grant No. 142103011), Ministry of Internal Affairs and Communications
Abstract: Gesture recognition is used in many practical applications such as human-robot interaction, medical rehabilitation and sign language. With the development of motion sensors, multiple data sources have become available, which has led to the rise of multi-modal gesture recognition. Because our previous approach to gesture recognition depends on a unimodal system, it has difficulty classifying similar motion patterns. To solve this problem, a novel approach that integrates motion, audio and video models is proposed, using a dataset captured by Kinect. The proposed system recognizes observed gestures using the three models; their recognition results are integrated by the proposed framework, and the integrated output is the final result. The motion and audio models are learned with Hidden Markov Models, and the video model is learned with a Random Forest classifier. In experiments testing the performance of the proposed system, the motion and audio models most suitable for gesture recognition are chosen by varying the feature vectors and learning methods. Additionally, the unimodal and multi-modal models are compared with respect to recognition accuracy. All experiments are conducted on the dataset provided by the organizer of MMGRC, a workshop for the Multi-Modal Gesture Recognition Challenge. The comparison shows that the multi-modal model composed of the three models achieves the highest recognition rate. This improvement indicates that the complementary relationship among the three models improves the accuracy of gesture recognition. The proposed system provides application technology for understanding human actions in daily life more precisely.
Abstract: This research presents a novel way of labelling human activities from the skeleton output computed from RGB-D data by vision-based motion capture systems. The activities are labelled by means of a Compound Hidden Markov Model, which is formed by linking several Linear Hidden Markov Models to common states. Each separate Linear Hidden Markov Model holds the motion information of one human activity. The most likely sequence of states, given a sequence of observations, indicates which activities a person performs in an interval of time. The purpose of this research is to provide a service robot with the capability of human activity awareness, which can be used for action planning with implicit and indirect Human-Robot Interaction. The proposed Compound Hidden Markov Model, made of one Linear Hidden Markov Model per activity, labels activities from unknown subjects with an average accuracy of 59.37%, which is higher than the average labelling accuracy for unknown subjects of an Ergodic Hidden Markov Model (6.25%) and of a Compound Hidden Markov Model with activities modelled by a single state (18.75%).
Funding: Project (Nos. 50775096 and 51075176) supported by the National Natural Science Foundation of China
Abstract: We propose a model structure with a double-layer hidden Markov model (HMM) to recognise driving intention and predict driving behaviour. The upper-layer multi-dimensional discrete HMM (MDHMM) in the double-layer HMM represents driving intention in a combined working case, constructed according to the driving behaviours in certain single working cases in the lower-layer multi-dimensional Gaussian HMM (MGHMM). The driving behaviours are recognised from the driver's manoeuvring signals and vehicle state information, and the recognised results are sent to the upper-layer HMM to recognise driving intentions. Driving behaviours in the near future are also predicted using the likelihood-maximum method. A real-time driving simulator test on the combined working cases showed that the double-layer HMM can recognise driving intention and predict driving behaviour accurately and efficiently. The model thus provides a basis for danger pre-warning and intervention and for improving comfort performance.
Funding: Supported by the National Natural Science Foundation of China (No. 61403353)
Abstract: Unconstrained offline handwriting recognition is a challenging task in the areas of document analysis and pattern recognition. In recent years, to sufficiently exploit the supervisory information hidden in document images, much effort has been made to integrate multi-layer perceptrons (MLPs) in either a hybrid or a tandem fashion into hidden Markov models (HMMs). However, due to the weak learnability of MLPs, the learnt features are not necessarily optimal for subsequent recognition tasks. In this paper, we propose a deep architecture-based tandem approach for unconstrained offline handwriting recognition. In the proposed model, deep belief networks are adopted to learn compact representations of sequential data, while HMMs are applied for (sub-)word recognition. We evaluate the proposed model on two publicly available datasets, RIMES and IFN/ENIT, which are based on Latin and Arabic languages respectively, and on one dataset collected by ourselves, called Devanagari (an Indian script). Extensive experiments show the advantage of the proposed model, especially over the MLP-HMM tandem approaches.
Funding: Project (No. 60772050) supported by the National Natural Science Foundation of China
Abstract: We present a novel model for recognizing long-term complex activities involving multiple persons. The proposed model, named 'decomposed hidden Markov model' (DHMM), combines spatial decomposition and hierarchical abstraction to capture the multi-modal, long-term dependent and multi-scale characteristics of activities. Decomposition in space and time offers the conceptual advantages of compaction and clarity, and greatly reduces the size of the state space as well as the number of parameters. DHMMs are efficient even when the number of persons is variable. We also introduce an efficient approximation algorithm for inference and parameter estimation. Experiments on multi-person activities and multi-modal individual activities demonstrate that DHMMs are more efficient and reliable than familiar models such as coupled HMMs, hierarchical HMMs, and multi-observation HMMs.
Abstract: In this paper the authors examine the three basic problems of Hidden Markov Models (HMMs): the evaluation problem, the decoding problem and the learning problem. The authors explore an approach to increase the effectiveness of HMMs in the speech recognition field. Although hidden Markov modeling has significantly improved the performance of current speech recognition systems, the general problem of completely fluent speaker-independent speech recognition is still far from solved. For example, no system can reliably recognize unconstrained conversational speech, and there is no good way to statistically infer language structure from a limited corpus of spoken sentences. The authors therefore provide an overview of the theory of HMMs, discuss the role of statistical methods, and point out a range of theoretical and practical issues that deserve attention and must be understood in order to further advance research in speech recognition.
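For readers unfamiliar with the problems named in this abstract, here is a minimal sketch of the first two, evaluation (the forward algorithm) and decoding (the Viterbi algorithm), for a discrete HMM. It is not taken from the paper: the two-state toy model, the parameter names `pi`, `A`, `B`, and the observation sequence are all illustrative assumptions. The learning problem, typically addressed with Baum-Welch re-estimation, is left out for brevity.

```python
def forward(pi, A, B, obs):
    """Evaluation problem: return P(obs | model) via the forward algorithm."""
    n = len(pi)
    # alpha[i] = P(observations so far, current state = i)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][o]
                 for j in range(n)]
    return sum(alpha)


def viterbi(pi, A, B, obs):
    """Decoding problem: return (most likely state path, its joint probability)."""
    n = len(pi)
    # delta[i] = probability of the best path that ends in state i
    delta = [pi[i] * B[i][obs[0]] for i in range(n)]
    backpointers = []
    for o in obs[1:]:
        prev = delta
        # Best predecessor state for each current state j
        ptr = [max(range(n), key=lambda i: prev[i] * A[i][j]) for j in range(n)]
        delta = [prev[ptr[j]] * A[ptr[j]][j] * B[j][o] for j in range(n)]
        backpointers.append(ptr)
    state = max(range(n), key=lambda i: delta[i])
    best_prob = delta[state]
    path = [state]
    for ptr in reversed(backpointers):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path, best_prob


# Hypothetical toy model: 2 hidden states, 3 observation symbols.
pi = [0.6, 0.4]                          # initial state probabilities
A = [[0.7, 0.3], [0.4, 0.6]]             # A[i][j] = P(next state j | state i)
B = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]   # B[i][k] = P(symbol k | state i)
obs = [0, 1, 2]

print(forward(pi, A, B, obs))   # likelihood of the whole sequence
print(viterbi(pi, A, B, obs))   # best state path and its probability
```

Plain probabilities are used here for readability; for long sequences, a practical implementation would work in log space or rescale `alpha` at each step to avoid underflow.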
Abstract: The authors propose a two-stage method for recognizing driving situations on the basis of driving signals, for application to a safe human interface for in-vehicle information systems. In the first stage, an unknown driving situation is classified as stopping behavior or non-stopping behavior. In the second stage, a Hidden Markov Model (HMM)-based pattern recognition method is used to model and recognize six non-stopping driving situations. The authors attempt to find the optimal HMM configuration to improve the performance of driving situation recognition. The Center for Integrated Acoustic Information Research (CIAIR) in-vehicle corpus is used to evaluate the HMM-based recognition method. Driving situation categories are recognized using five driving signals. The proposed method achieves a relative error reduction rate of 30.9% compared with a conventional one-stage HMM-based method.
Funding: Supported by the National Natural Science Foundation of China (No. 60172048)
Abstract: As a statistical method, the Hidden Markov Model (HMM) is widely used for speech recognition. To train the HMM more effectively with much less data, the Subspace Distribution Clustering Hidden Markov Model (SDCHMM), derived from the Continuous Density Hidden Markov Model (CDHMM), is introduced. A new method for training SDCHMMs with parameter tying is described. Compared with the conventional training method, an SDCHMM recognizer trained with the new method achieves higher accuracy and speed. Experimental results show that the SDCHMM recognizer outperforms the CDHMM recognizer on speech recognition of Chinese digits.