A fast method for phrase structure grammar analysis based on conditional random fields (CRF) is proposed. The method trains several CRF classifiers to recognize phrase nodes at different levels, and connects the recognized phrase nodes bottom-up to construct the syntactic tree. Using the Beijing forest studio Chinese tagged corpus, two experiments are designed to select the training parameters and verify the validity of the method. The results show that the method takes 78.98 ms to train on and 4.63 ms to test a Chinese sentence of 17.9 words on average. The method offers a new way to parse phrase structure grammar for Chinese, and has good generalization ability and fast speed.
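The bottom-up construction described above can be illustrated with a minimal sketch. It assumes each level's recognizer is a function that returns labeled spans over the current node sequence; the abstract uses trained CRF classifiers for this, whereas the `pattern_recognizer` below is a hypothetical rule-based stand-in so the example runs without a CRF library. The names `build_tree`, `pattern_recognizer`, and the English toy sentence are illustrative and not taken from the paper.

```python
from typing import Any, Callable, List, Tuple

Node = Tuple[str, Any]                      # (label, word) leaf or (label, [children])
Span = Tuple[int, int, str]                 # (start, end, phrase label), end exclusive
Recognizer = Callable[[List[str]], List[Span]]


def pattern_recognizer(pattern: List[str], label: str) -> Recognizer:
    """Hypothetical stand-in for one level's CRF classifier: matches a fixed label pattern."""
    def recognize(labels: List[str]) -> List[Span]:
        spans, i, n = [], 0, len(pattern)
        while i + n <= len(labels):
            if labels[i:i + n] == pattern:
                spans.append((i, i + n, label))
                i += n
            else:
                i += 1
        return spans
    return recognize


def build_tree(tagged: List[Tuple[str, str]], levels: List[Recognizer]) -> List[Node]:
    """Apply the level recognizers in order, merging recognized spans into phrase nodes."""
    nodes: List[Node] = [(pos, word) for word, pos in tagged]
    for recognize in levels:
        labels = [node[0] for node in nodes]
        merged: List[Node] = []
        i = 0
        for start, end, label in sorted(recognize(labels)):
            if start < i:                   # skip spans that overlap an earlier one
                continue
            merged.extend(nodes[i:start])   # keep unmatched nodes as they are
            merged.append((label, nodes[start:end]))
            i = end
        merged.extend(nodes[i:])
        nodes = merged
    return nodes


tagged = [("the", "DT"), ("dog", "NN"), ("chased", "VB"), ("the", "DT"), ("cat", "NN")]
levels = [
    pattern_recognizer(["DT", "NN"], "NP"),   # level 1: noun phrases
    pattern_recognizer(["VB", "NP"], "VP"),   # level 2: verb phrases
    pattern_recognizer(["NP", "VP"], "S"),    # level 3: sentence
]
tree = build_tree(tagged, levels)
```

Swapping each stand-in for a trained sequence labeler would keep the same merging loop; only the span source changes.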
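To further improve the diagnostic accuracy for greenhouse cucumber downy mildew, an image-processing-based diagnosis system was built. For disease images collected on site in greenhouse cucumber cultivation, an image segmentation method based on conditional random fields (CRF) is used to segment the lesion images, and a decision tree model is used to extend the unary potential function and improve segmentation accuracy. The segmented lesion images are converted to the HSV color space, and 25 features covering color, texture, and shape are extracted; the rough set method is then used for feature selection and optimization. An SVM classifier with a radial basis function kernel is built to recognize and diagnose greenhouse cucumber downy mildew. System validation experiments show that the lesion segmentation method overcomes the influence of complex backgrounds and lighting conditions and extracts lesion images accurately; that the rough set method selects classification features effectively, reducing the 25 initial features to 12 and improving runtime efficiency; and that the recognition accuracy for cucumber downy mildew reaches 90%, which can meet the needs of leaf disease diagnosis for protected (facility-grown) vegetables.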
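Two of the steps named above, conversion to HSV and the radial basis function kernel, can be sketched with the Python standard library. This is only an illustration under stated assumptions: `mean_hsv` computes three simple color features (the abstract's 25 features are not listed, so these are hypothetical examples), and `rbf_kernel` shows the kernel value an RBF SVM would compute between two feature vectors, with `gamma` as an assumed parameter.

```python
import colorsys
import math
from typing import Sequence, Tuple


def mean_hsv(pixels: Sequence[Tuple[int, int, int]]) -> Tuple[float, float, float]:
    """Mean hue, saturation, and value of 8-bit RGB pixels, each in [0, 1].

    Hue is an angle, so it is averaged on the unit circle rather than linearly.
    """
    hsv = [colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0) for r, g, b in pixels]
    n = len(hsv)
    sin_sum = sum(math.sin(2 * math.pi * h) for h, _, _ in hsv)
    cos_sum = sum(math.cos(2 * math.pi * h) for h, _, _ in hsv)
    mean_h = (math.atan2(sin_sum / n, cos_sum / n) / (2 * math.pi)) % 1.0
    mean_s = sum(s for _, s, _ in hsv) / n
    mean_v = sum(v for _, _, v in hsv) / n
    return mean_h, mean_s, mean_v


def rbf_kernel(x: Sequence[float], y: Sequence[float], gamma: float = 1.0) -> float:
    """K(x, y) = exp(-gamma * ||x - y||^2) for two equal-length feature vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Toy patches standing in for segmented lesion and healthy leaf regions.
lesion_features = mean_hsv([(200, 180, 40), (210, 190, 60)])
leaf_features = mean_hsv([(30, 140, 40), (40, 150, 50)])
similarity = rbf_kernel(lesion_features, leaf_features, gamma=1.0)
```

The kernel value is 1 for identical feature vectors and decays toward 0 as they move apart, which is what lets the classifier separate lesion-like from leaf-like color profiles.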
Latent Semantic Analysis involves natural language processing techniques for analyzing relationships between a set of documents and the terms they contain, by producing a set of concepts (related to the documents and terms) called semantic topics. These semantic topics assist search engine users by providing leads to more relevant documents. We develop a novel algorithm called Latent Semantic Manifold (LSM) that can identify the semantic topics in high-dimensional web data. The LSM algorithm is built on concepts from topology and probability. A search tool is also developed using the LSM algorithm. This search tool was deployed for two years at two sites in Taiwan: 1) Taipei Medical University Library, Taipei, and 2) Biomedical Engineering Laboratory, Institute of Biomedical Engineering, National Taiwan University, Taipei. We evaluate the effectiveness and efficiency of the LSM algorithm by comparing it with other contemporary algorithms. The results show that the LSM algorithm outperforms the others. This algorithm can be used to enhance the functionality of currently available search engines.
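To make the idea of a "semantic topic" concrete, the sketch below extracts the single strongest topic from a small term-document count matrix using power iteration, which amounts to a rank-1 truncated SVD as used in plain Latent Semantic Analysis. It is not the LSM algorithm from the abstract, whose details are not given here; the function name `top_topic`, the toy matrix, and the term names are assumptions for illustration.

```python
import math
from typing import List, Tuple


def top_topic(A: List[List[float]], iters: int = 100) -> Tuple[float, List[float], List[float]]:
    """Leading singular triplet of a nonzero term-document matrix A (terms x documents).

    Returns (sigma, term_weights, doc_weights): the strength of the dominant topic
    and how strongly each term and each document loads on it.
    """
    n_terms, n_docs = len(A), len(A[0])
    v = [1.0 / math.sqrt(n_docs)] * n_docs          # start from a uniform document vector
    for _ in range(iters):
        u = [sum(A[i][j] * v[j] for j in range(n_docs)) for i in range(n_terms)]   # u = A v
        w = [sum(A[i][j] * u[i] for i in range(n_terms)) for j in range(n_docs)]   # w = A^T u
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    u = [sum(A[i][j] * v[j] for j in range(n_docs)) for i in range(n_terms)]
    sigma = math.sqrt(sum(x * x for x in u))
    return sigma, [x / sigma for x in u], v


# Rows: counts of the terms "crf", "parsing", "svm"; columns: three documents.
counts = [
    [2.0, 2.0, 0.0],
    [1.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
sigma, term_weights, doc_weights = top_topic(counts)
```

Here the first two documents share vocabulary, so the dominant topic loads on them and on the terms "crf" and "parsing", while the third document and "svm" receive weights near zero.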
Automatic prosodic break detection and annotation are important for both speech understanding and natural speech synthesis. In this paper, we discuss automatic prosodic break detection and feature analysis. The contributions of the paper are twofold. First, we use a classifier combination method to detect Mandarin and English prosodic breaks using acoustic, lexical, and syntactic evidence. Our proposed method achieves better performance than the baseline system and other researchers' experimental results on both the Mandarin prosodic annotation corpus (Annotated Speech Corpus of Chinese Discourse) and the English prosodic annotation corpus (Boston University Radio News Corpus). Second, we analyze features for prosodic break detection. The roles of different features, such as duration, pitch, energy, and intensity, are analyzed and compared for Mandarin and English prosodic break detection. Based on the feature analysis, we also verify some linguistic conclusions.
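One simple way to combine classifiers is a weighted average of their break probabilities, sketched below. The abstract does not say which combination rule the authors use, so the averaging rule, the weights, the 0.5 threshold, and the function names are all assumptions made for illustration.

```python
from typing import Dict


def combine_break_scores(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of per-classifier break probabilities for one word boundary."""
    total_weight = sum(weights[name] for name in scores)
    return sum(weights[name] * p for name, p in scores.items()) / total_weight


def is_break(scores: Dict[str, float], weights: Dict[str, float], threshold: float = 0.5) -> bool:
    """Predict a prosodic break when the combined probability reaches the threshold."""
    return combine_break_scores(scores, weights) >= threshold


# Hypothetical outputs of three classifiers, one per evidence type, at one boundary.
scores = {"acoustic": 0.9, "lexical": 0.4, "syntactic": 0.2}
weights = {"acoustic": 2.0, "lexical": 1.0, "syntactic": 1.0}
combined = combine_break_scores(scores, weights)
decision = is_break(scores, weights)
```

With these toy numbers the acoustic classifier carries twice the weight of the others, so its confident vote outweighs the two weaker ones.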
Funding (CRF phrase structure parsing paper): Supported by the Science and Technology Innovation Plan of Beijing Institute of Technology (2013).
Funding (prosodic break detection paper): Supported by the National Natural Science Foundation of China under Grant Nos. 90820303 and 90820011, and the Natural Science Foundation of Shandong Province of China under Grant No. ZR2011FQ024.