The objective of Universal Digital Library (UDL) is to capture all books in digital format. A text to speech (TTS) interface for UDL portal would enable access to the digital content in voice mode, and also provide ac...The objective of Universal Digital Library (UDL) is to capture all books in digital format. A text to speech (TTS) interface for UDL portal would enable access to the digital content in voice mode, and also provide access to the digital content for illiterate and vision-impaired people. Our work focuses on design and implementation of text to speech interface for UDL portal primarily for Indian languages. This paper is aimed at identifying the issues involved in integrating text to speech system into UDL portal and describes the development process of Hindi, Telugu and Tamil voices under Festvox framework using unit selection techniques. We demonstrate the quality of the Tamil and Telugu voices and lay out the plan for integrating the TTS into the UDL portal.展开更多
A kind of Web voice browser based on improved synchronous linear predictive coding (ISLPC) and Text-toSpeech (TTS) algorithm and Internet application was proposed. The paper analyzes the features of TTS system wit...A kind of Web voice browser based on improved synchronous linear predictive coding (ISLPC) and Text-toSpeech (TTS) algorithm and Internet application was proposed. The paper analyzes the features of TTS system with ISLPC speech synthesis and discusses the design and implementation of ISLPC TTS-based Web voice browser. The browser integrates Web technology, Chinese information processing, artificial intelligence and the key technology of Chinese ISLPC speech synthesis. It's a visual and audible web browser that can improve information precision for network users. The evaluation results show that ISLPC-based TTS model has a better performance than other browsers in voice quality and capability of identifying Chinese characters.展开更多
Speech or Natural language contents are major tools of communication. This research paper presents a natural language processing based automated system for understanding speech language text. A new rule based model ha...Speech or Natural language contents are major tools of communication. This research paper presents a natural language processing based automated system for understanding speech language text. A new rule based model has been presented for analyzing the natural languages and extracting the relative meanings from the given text. User writes the natural language text in simple English in a few paragraphs and the designed system has a sound ability of analyzing the given script by the user. After composite analysis and extraction of associated information, the designed system gives particular meanings to an assortment of speech language text on the basis of its context. The designed system uses standard speech language rules that are clearly defined for all speech languages as English, Urdu, Chinese, Arabic, French, etc. The designed system provides a quick and reliable way to comprehend speech language context and generate respective meanings.展开更多
According to Reiss’s Text Type theory,a key part of the functionalist approach in translation studies,the source text can be assigned to a text type and to a genre.In making this assignment,the translator can decide ...According to Reiss’s Text Type theory,a key part of the functionalist approach in translation studies,the source text can be assigned to a text type and to a genre.In making this assignment,the translator can decide on the hierarchy of postulates which has to be observed during target-text production(Mona,2005).This essay intends to conduct a linguistic and stylistic analysis of the Chinese translation of Obama’s speech to explore the general approach of the translator(if there is one),by comparing the respective results of the two analyses from the perspective of Katharina Reiss’s Text Type theory.In doing so,critical judgments will accordingly be made as to whether such an approach is justifiable or not.展开更多
Tourism texts (shortened as TT) have become a principal means of promoting tourism in China. Cultural elements are abundant in those texts. Therefore, the proper understanding and rendering of such messages is the k...Tourism texts (shortened as TT) have become a principal means of promoting tourism in China. Cultural elements are abundant in those texts. Therefore, the proper understanding and rendering of such messages is the key to the quality of TT translation. Based on Skopos theory the paper focuses on the translating strategies of TT cultural elements.展开更多
Speech plays an extremely important role in social activities.Many individuals suffer from a“speech barrier,”which limits their communication with others.In this study,an improved speech recognitionmethod is propose...Speech plays an extremely important role in social activities.Many individuals suffer from a“speech barrier,”which limits their communication with others.In this study,an improved speech recognitionmethod is proposed that addresses the needs of speech-impaired and deaf individuals.A basic improved connectionist temporal classification convolutional neural network(CTC-CNN)architecture acoustic model was constructed by combining a speech database with a deep neural network.Acoustic sensors were used to convert the collected voice signals into text or corresponding voice signals to improve communication.The method can be extended to modern artificial intelligence techniques,with multiple applications such as meeting minutes,medical reports,and verbatim records for cars,sales,etc.For experiments,a modified CTC-CNN was used to train an acoustic model,which showed better performance than the earlier common algorithms.Thus a CTC-CNN baseline acoustic model was constructed and optimized,which reduced the error rate to about 18%and improved the accuracy rate.展开更多
In the speech recognition system,the acoustic model is an important underlying model,and its accuracy directly affects the performance of the entire system.This paper introduces the construction and training process o...In the speech recognition system,the acoustic model is an important underlying model,and its accuracy directly affects the performance of the entire system.This paper introduces the construction and training process of the acoustic model in detail and studies the Connectionist temporal classification(CTC)algorithm,which plays an important role in the end-to-end framework,established a convolutional neural network(CNN)combined with an acoustic model of Connectionist temporal classification to improve the accuracy of speech recognition.This study uses a sound sensor,ReSpeakerMic Array v2.0.1,to convert the collected speech signals into text or corresponding speech signals to improve communication and reduce noise and hardware interference.The baseline acousticmodel in this study faces challenges such as long training time,high error rate,and a certain degree of overfitting.The model is trained through continuous design and improvement of the relevant parameters of the acousticmodel,and finally the performance is selected according to the evaluation index.Excellentmodel,which reduces the error rate to about 18%,thus improving the accuracy rate.Finally,comparative verificationwas carried out from the selection of acoustic feature parameters,the selection of modeling units,and the speaker’s speech rate,which further verified the excellent performance of the CTCCNN_5+BN+Residual model structure.In terms of experiments,to train and verify the CTC-CNN baseline acoustic model,this study uses THCHS-30 and ST-CMDS speech data sets as training data sets,and after 54 epochs of training,the word error rate of the acoustic model training set is 31%,the word error rate of the test set is stable at about 43%.This experiment also considers the surrounding environmental noise.Under the noise level of 80∼90 dB,the accuracy rate is 88.18%,which is the worst performance among all levels.In contrast,at 40–60 dB,the accuracy was as high as 97.33%due to less noise pollution.展开更多
Huangdi's Internal Classics(Neijin) is one of the most important ancient medical classics, which plays far-reaching influence in medical field. More and more domestic and overseas scholars published their translat...Huangdi's Internal Classics(Neijin) is one of the most important ancient medical classics, which plays far-reaching influence in medical field. More and more domestic and overseas scholars published their translated texts on Neijing. Due to the diversity of editions and different understanding, the translating styles and contents are widely different. This study will focus on the different translating styles on culture-specific lexicon、figure of speech and four-Chinese-character structures in Neijin.展开更多
The Arabic language comes under the category of Semitic languages with an entirely different sentence structure in terms of Natural Language Processing. In such languages, two different words may have identical spelli...The Arabic language comes under the category of Semitic languages with an entirely different sentence structure in terms of Natural Language Processing. In such languages, two different words may have identical spelling whereas their pronunciations and meanings are totally different. To remove this ambiguity, special marks are put above or below? the spelling characters to determine the correct pronunciation. These marks are called diacritics and the language that uses them is called a diacritized language. This paper presents a system for Arabic language diacritization using Hid- den Markov Models (HMMs). The system employs the renowned HMM Tool Kit? (HTK). Each single diacritic is represented as a separate model. The concatenation of output models is coupled with the input? character sequence to form the fully diacritized text. The performance of the proposed system is assessed using a data corpus that includes more than 24000 sentences.展开更多
汉语多音字的情况为中文文语转换TTS(Text To Speech)系统的建立带来了很大的困难。针对中文文语转换系统中的多音字问题,通过构建多音字词库和非多音字词库,将多音字以词汇的形式区分,并且通过对多音字词库和非多音字词库的结构形式的...汉语多音字的情况为中文文语转换TTS(Text To Speech)系统的建立带来了很大的困难。针对中文文语转换系统中的多音字问题,通过构建多音字词库和非多音字词库,将多音字以词汇的形式区分,并且通过对多音字词库和非多音字词库的结构形式的构造,减少了词库的冗余信息,提高了词汇语音的查找速率。实验证明该方案可以解决中文TTS中的多音字问题。展开更多
文摘The objective of Universal Digital Library (UDL) is to capture all books in digital format. A text to speech (TTS) interface for UDL portal would enable access to the digital content in voice mode, and also provide access to the digital content for illiterate and vision-impaired people. Our work focuses on design and implementation of text to speech interface for UDL portal primarily for Indian languages. This paper is aimed at identifying the issues involved in integrating text to speech system into UDL portal and describes the development process of Hindi, Telugu and Tamil voices under Festvox framework using unit selection techniques. We demonstrate the quality of the Tamil and Telugu voices and lay out the plan for integrating the TTS into the UDL portal.
基金Supported by the National High-Technology Re-search and Development Program(2005AA122210) the National Out-standing Youth Foundation (60325104)
文摘A kind of Web voice browser based on improved synchronous linear predictive coding (ISLPC) and Text-toSpeech (TTS) algorithm and Internet application was proposed. The paper analyzes the features of TTS system with ISLPC speech synthesis and discusses the design and implementation of ISLPC TTS-based Web voice browser. The browser integrates Web technology, Chinese information processing, artificial intelligence and the key technology of Chinese ISLPC speech synthesis. It's a visual and audible web browser that can improve information precision for network users. The evaluation results show that ISLPC-based TTS model has a better performance than other browsers in voice quality and capability of identifying Chinese characters.
文摘Speech or Natural language contents are major tools of communication. This research paper presents a natural language processing based automated system for understanding speech language text. A new rule based model has been presented for analyzing the natural languages and extracting the relative meanings from the given text. User writes the natural language text in simple English in a few paragraphs and the designed system has a sound ability of analyzing the given script by the user. After composite analysis and extraction of associated information, the designed system gives particular meanings to an assortment of speech language text on the basis of its context. The designed system uses standard speech language rules that are clearly defined for all speech languages as English, Urdu, Chinese, Arabic, French, etc. The designed system provides a quick and reliable way to comprehend speech language context and generate respective meanings.
文摘According to Reiss’s Text Type theory,a key part of the functionalist approach in translation studies,the source text can be assigned to a text type and to a genre.In making this assignment,the translator can decide on the hierarchy of postulates which has to be observed during target-text production(Mona,2005).This essay intends to conduct a linguistic and stylistic analysis of the Chinese translation of Obama’s speech to explore the general approach of the translator(if there is one),by comparing the respective results of the two analyses from the perspective of Katharina Reiss’s Text Type theory.In doing so,critical judgments will accordingly be made as to whether such an approach is justifiable or not.
文摘Tourism texts (shortened as TT) have become a principal means of promoting tourism in China. Cultural elements are abundant in those texts. Therefore, the proper understanding and rendering of such messages is the key to the quality of TT translation. Based on Skopos theory the paper focuses on the translating strategies of TT cultural elements.
基金This research was supported by the Department of Electrical Engineering at National Chin-Yi University of Technology.The authors would like to thank the National Chin-Yi University of Technology,TakmingUniversity of Science and Technology,Taiwan,for supporting this research.
文摘Speech plays an extremely important role in social activities.Many individuals suffer from a“speech barrier,”which limits their communication with others.In this study,an improved speech recognitionmethod is proposed that addresses the needs of speech-impaired and deaf individuals.A basic improved connectionist temporal classification convolutional neural network(CTC-CNN)architecture acoustic model was constructed by combining a speech database with a deep neural network.Acoustic sensors were used to convert the collected voice signals into text or corresponding voice signals to improve communication.The method can be extended to modern artificial intelligence techniques,with multiple applications such as meeting minutes,medical reports,and verbatim records for cars,sales,etc.For experiments,a modified CTC-CNN was used to train an acoustic model,which showed better performance than the earlier common algorithms.Thus a CTC-CNN baseline acoustic model was constructed and optimized,which reduced the error rate to about 18%and improved the accuracy rate.
基金Supported by the Department of Electrical Engineering at National Chin-Yi University of TechnologyNational Chin-Yi University of Technology,TakmingUniversity of Science and Technology,Taiwan,for supporting this research。
文摘In the speech recognition system,the acoustic model is an important underlying model,and its accuracy directly affects the performance of the entire system.This paper introduces the construction and training process of the acoustic model in detail and studies the Connectionist temporal classification(CTC)algorithm,which plays an important role in the end-to-end framework,established a convolutional neural network(CNN)combined with an acoustic model of Connectionist temporal classification to improve the accuracy of speech recognition.This study uses a sound sensor,ReSpeakerMic Array v2.0.1,to convert the collected speech signals into text or corresponding speech signals to improve communication and reduce noise and hardware interference.The baseline acousticmodel in this study faces challenges such as long training time,high error rate,and a certain degree of overfitting.The model is trained through continuous design and improvement of the relevant parameters of the acousticmodel,and finally the performance is selected according to the evaluation index.Excellentmodel,which reduces the error rate to about 18%,thus improving the accuracy rate.Finally,comparative verificationwas carried out from the selection of acoustic feature parameters,the selection of modeling units,and the speaker’s speech rate,which further verified the excellent performance of the CTCCNN_5+BN+Residual model structure.In terms of experiments,to train and verify the CTC-CNN baseline acoustic model,this study uses THCHS-30 and ST-CMDS speech data sets as training data sets,and after 54 epochs of training,the word error rate of the acoustic model training set is 31%,the word error rate of the test set is stable at about 43%.This experiment also considers the surrounding environmental noise.Under the noise level of 80∼90 dB,the accuracy rate is 88.18%,which is the worst performance among all levels.In contrast,at 40–60 dB,the accuracy was as high as 97.33%due to less noise pollution.
文摘Huangdi's Internal Classics(Neijin) is one of the most important ancient medical classics, which plays far-reaching influence in medical field. More and more domestic and overseas scholars published their translated texts on Neijing. Due to the diversity of editions and different understanding, the translating styles and contents are widely different. This study will focus on the different translating styles on culture-specific lexicon、figure of speech and four-Chinese-character structures in Neijin.
文摘The Arabic language comes under the category of Semitic languages with an entirely different sentence structure in terms of Natural Language Processing. In such languages, two different words may have identical spelling whereas their pronunciations and meanings are totally different. To remove this ambiguity, special marks are put above or below? the spelling characters to determine the correct pronunciation. These marks are called diacritics and the language that uses them is called a diacritized language. This paper presents a system for Arabic language diacritization using Hid- den Markov Models (HMMs). The system employs the renowned HMM Tool Kit? (HTK). Each single diacritic is represented as a separate model. The concatenation of output models is coupled with the input? character sequence to form the fully diacritized text. The performance of the proposed system is assessed using a data corpus that includes more than 24000 sentences.
文摘汉语多音字的情况为中文文语转换TTS(Text To Speech)系统的建立带来了很大的困难。针对中文文语转换系统中的多音字问题,通过构建多音字词库和非多音字词库,将多音字以词汇的形式区分,并且通过对多音字词库和非多音字词库的结构形式的构造,减少了词库的冗余信息,提高了词汇语音的查找速率。实验证明该方案可以解决中文TTS中的多音字问题。