534 articles found
A text to speech interface for Universal Digital Library (Cited by: 3)
1
Authors: PRAHALLAD Kishore, BLACK Alan 《Journal of Zhejiang University-Science A(Applied Physics & Engineering)》 SCIE EI CAS CSCD 2005, No. 11, pp. 1229-1234 (6 pages)
The objective of Universal Digital Library (UDL) is to capture all books in digital format. A text to speech (TTS) interface for UDL portal would enable access to the digital content in voice mode, and also provide access to the digital content for illiterate and vision-impaired people. Our work focuses on design and implementation of text to speech interface for UDL portal primarily for Indian languages. This paper is aimed at identifying the issues involved in integrating text to speech system into UDL portal and describes the development process of Hindi, Telugu and Tamil voices under Festvox framework using unit selection techniques. We demonstrate the quality of the Tamil and Telugu voices and lay out the plan for integrating the TTS into the UDL portal.
Keywords: text to speech; TTS; Indian language; Universal Digital Library (UDL)
Application of TTS in On-Board Passenger Information Systems
2
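Author: 汤俊芹 《电声技术》 2024, No. 1, pp. 25-28 (4 pages)
With the development of text-to-speech (TTS) technology, synthesized speech can now match the quality of a human announcer. On this basis, the paper proposes applying TTS to on-board passenger information systems, replacing the traditional approach of announcing stops from pre-recorded audio files and greatly improving the flexibility and maintainability of voice announcements.
Keywords: text-to-speech (TTS); passenger information system; speech quality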
Web Voice Browser Based on an ISLPC Text-to-Speech Algorithm
3
Authors: LIAO Rikun, JI Yuefeng, LI Hui 《Wuhan University Journal of Natural Sciences》 CAS 2006, No. 5, pp. 1157-1160 (4 pages)
A Web voice browser based on an improved synchronous linear predictive coding (ISLPC) Text-to-Speech (TTS) algorithm and Internet applications is proposed. The paper analyzes the features of a TTS system with ISLPC speech synthesis and discusses the design and implementation of an ISLPC TTS-based Web voice browser. The browser integrates Web technology, Chinese information processing, artificial intelligence and the key technology of Chinese ISLPC speech synthesis. It is a visual and audible web browser that can improve information precision for network users. The evaluation results show that the ISLPC-based TTS model performs better than other browsers in voice quality and in its capability of identifying Chinese characters.
Keywords: improved synchronous linear predictive coding (ISLPC); text-to-speech; TTS; Web voice browser; voice quality
An Intelligent Automatic English Translation System Based on TTS Technology
4
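Author: 王渭刚 《信息技术》 2023, No. 3, pp. 117-121, 127 (6 pages)
This paper presents the design of an intelligent automatic English translation system based on TTS technology. A text-to-speech converter and a speech processor are selected and configured. On this basis, TTS technology (text analysis, prosody control, and speech synthesis) is introduced and, according to the requirements of English translation, the system's software modules are designed: a module for automatic segmentation and labeling of continuous speech, a speech prosody control module, a speech synthesis module, and a speech corpus pruning module. Together, these hardware units and software modules realize the intelligent automatic English translation system. Experimental data show that, compared with the baseline systems, the designed system yields smaller deviations in the prosody control parameters and larger speech naturalness factor values, indicating that its English translation speech is more accurate.
Keywords: text analysis; English translation; automatic speech segmentation and labeling; speech corpus pruning; speech prosody control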
A Rule Based System for Speech Language Context Understanding
5
Authors: Imran Sarwar Bajwa, Muhammad Abbas Choudhary 《Journal of Donghua University(English Edition)》 EI CAS 2006, No. 6, pp. 39-42 (4 pages)
Speech or natural language content is a major tool of communication. This research paper presents an automated system, based on natural language processing, for understanding speech language text. A new rule-based model is presented for analyzing natural languages and extracting the relevant meanings from the given text. The user writes natural language text in simple English in a few paragraphs, and the designed system is able to analyze the script supplied by the user. After composite analysis and extraction of associated information, the designed system assigns particular meanings to an assortment of speech language text on the basis of its context. The designed system uses standard speech language rules that are clearly defined for all speech languages such as English, Urdu, Chinese, Arabic, French, etc. The designed system provides a quick and reliable way to comprehend speech language context and generate the respective meanings.
Keywords: automatic text understanding; speech language processing; information extraction; language engineering
A Linguistic and Stylistic Analysis of the Chinese Translation of Obama's Speech at the First Meeting of the Strategic and Economic Dialogue between the United States and China: Through the Lens of Reiss's Text Type Theory
6
Author: 付端凌 《英语广场(学术研究)》 2014, No. 4, pp. 38-42 (5 pages)
According to Reiss's Text Type theory, a key part of the functionalist approach in translation studies, the source text can be assigned to a text type and to a genre. In making this assignment, the translator can decide on the hierarchy of postulates which has to be observed during target-text production (Mona, 2005). This essay conducts a linguistic and stylistic analysis of the Chinese translation of Obama's speech to explore the general approach of the translator (if there is one), by comparing the respective results of the two analyses from the perspective of Katharina Reiss's Text Type theory. In doing so, critical judgments will accordingly be made as to whether such an approach is justifiable or not.
Keywords: Text Type theory; Obama's speeches; speech analysis
Skopos theory and translating strategies of cultural elements in tourism texts
7
Authors: MA Jing, REN Su-zhen 《Sino-US English Teaching》 2008, No. 9, pp. 34-37 (4 pages)
Tourism texts (shortened as TT) have become a principal means of promoting tourism in China. Cultural elements are abundant in those texts. Therefore, the proper understanding and rendering of such messages is the key to the quality of TT translation. Based on Skopos theory, the paper focuses on the translating strategies of TT cultural elements.
Keywords: tourism texts (TT); cultural elements; Skopos theory; translating strategies
Improving Speech Enhancement Framework via Deep Learning
8
Authors: Sung-Jung Hsiao, Wen-Tsai Sung 《Computers, Materials & Continua》 SCIE EI 2023, No. 5, pp. 3817-3832 (16 pages)
Speech plays an extremely important role in social activities. Many individuals suffer from a “speech barrier,” which limits their communication with others. In this study, an improved speech recognition method is proposed that addresses the needs of speech-impaired and deaf individuals. A basic improved connectionist temporal classification convolutional neural network (CTC-CNN) architecture acoustic model was constructed by combining a speech database with a deep neural network. Acoustic sensors were used to convert the collected voice signals into text or corresponding voice signals to improve communication. The method can be extended to modern artificial intelligence techniques, with multiple applications such as meeting minutes, medical reports, and verbatim records for cars, sales, etc. For experiments, a modified CTC-CNN was used to train an acoustic model, which showed better performance than the earlier common algorithms. Thus a CTC-CNN baseline acoustic model was constructed and optimized, which reduced the error rate to about 18% and improved the accuracy rate.
Keywords: artificial intelligence; speech recognition; speech to text; CTC-CNN
Speech Recognition via CTC-CNN Model
9
Authors: Wen-Tsai Sung, Hao-Wei Kang, Sung-Jung Hsiao 《Computers, Materials & Continua》 SCIE EI 2023, No. 9, pp. 3833-3858 (26 pages)
In a speech recognition system, the acoustic model is an important underlying model, and its accuracy directly affects the performance of the entire system. This paper introduces the construction and training process of the acoustic model in detail and studies the connectionist temporal classification (CTC) algorithm, which plays an important role in the end-to-end framework. A convolutional neural network (CNN) is combined with a CTC-based acoustic model to improve the accuracy of speech recognition. This study uses a sound sensor, the ReSpeaker Mic Array v2.0.1, to convert the collected speech signals into text or corresponding speech signals to improve communication and reduce noise and hardware interference. The baseline acoustic model in this study faces challenges such as long training time, high error rate, and a certain degree of overfitting. The model is trained through continuous design and improvement of its relevant parameters, and the best-performing model is finally selected according to the evaluation index; it reduces the error rate to about 18%, thus improving the accuracy rate. Finally, comparative verification was carried out on the selection of acoustic feature parameters, the selection of modeling units, and the speaker's speech rate, which further verified the excellent performance of the CTC-CNN_5+BN+Residual model structure. In the experiments, to train and verify the CTC-CNN baseline acoustic model, this study uses the THCHS-30 and ST-CMDS speech datasets as training data; after 54 epochs of training, the word error rate of the acoustic model is 31% on the training set and stable at about 43% on the test set. The experiment also considers surrounding environmental noise. At a noise level of 80-90 dB, the accuracy rate is 88.18%, the worst performance among all levels. In contrast, at 40-60 dB, the accuracy is as high as 97.33% owing to less noise pollution.
Keywords: artificial intelligence; speech recognition; speech to text; convolutional neural network; automatic speech recognition
The Comparative Study of Two Translated Texts on Huangdi's Internal Classics: Simple Analysis on the Different Translating Styles on Culture-Specific Lexicon, Figure of Speech and Four-Chinese-Character Structures
10
Author: 杨峥 《海外英语》 2017, No. 17, pp. 5-6 (2 pages)
Huangdi's Internal Classics (Neijing) is one of the most important ancient medical classics and has a far-reaching influence in the medical field. More and more domestic and overseas scholars have published their translations of Neijing. Owing to the diversity of editions and differences in understanding, the translating styles and contents differ widely. This study focuses on the different translating styles for culture-specific lexicon, figures of speech and four-Chinese-character structures in Neijing.
Keywords: Huangdi's Internal Classics; the comparative study of translated texts; culture-specific lexicon; figure of speech; four-Chinese-character structures
A HMM-Based System To Diacritize Arabic Text
11
Author: M. S. Khorsheed 《Journal of Software Engineering and Applications》 2012, No. 12, pp. 124-127 (4 pages)
The Arabic language belongs to the category of Semitic languages, with an entirely different sentence structure in terms of Natural Language Processing. In such languages, two different words may have identical spelling whereas their pronunciations and meanings are totally different. To remove this ambiguity, special marks are put above or below the spelling characters to determine the correct pronunciation. These marks are called diacritics and the language that uses them is called a diacritized language. This paper presents a system for Arabic language diacritization using Hidden Markov Models (HMMs). The system employs the renowned HMM Tool Kit (HTK). Each single diacritic is represented as a separate model. The concatenation of output models is coupled with the input character sequence to form the fully diacritized text. The performance of the proposed system is assessed using a data corpus that includes more than 24000 sentences.
Keywords: Arabic; Hidden Markov Models; text-to-speech; diacritization
Speech Synthesis Based on Hierarchical Conformer
12
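Authors: 吴克伟, 韩超, 孙永宣, 彭梦昊, 谢昭 《计算机科学》 CSCD PKU Core 2024, No. 2, pp. 161-171 (11 pages)
Speech synthesis must convert the text of an input sentence into a speech signal comprising phonemes, words, and the sentence as a whole. Existing methods treat the sentence as a single unit and therefore struggle to synthesize speech signals of different lengths accurately. By analyzing the hierarchical relationships contained in speech signals, a Conformer-based hierarchical text encoder and a Conformer-based hierarchical speech encoder are designed, and a speech synthesis model based on a hierarchical text-speech Conformer is proposed. First, the model builds a hierarchical text encoder according to the length of the input text signal, with phoneme-level, word-level, and sentence-level text encoders. The encoder at each level describes text information of a different length, and the Conformer attention mechanism is used to learn the relationships among features at different time steps within a signal of that length. The hierarchical text encoder identifies the information that needs emphasis at different lengths within a sentence, extracts text features of different lengths effectively, and alleviates the uncertainty in the duration of the synthesized speech signal. Second, the hierarchical speech encoder likewise has phoneme-level, word-level, and sentence-level speech encoders. The encoder at each level takes the text features as the Conformer query vectors and the speech features as its key and value vectors, in order to extract the matching relationship between text features and speech features. The hierarchical speech encoder and this text-speech matching relationship alleviate inaccurate synthesis of speech signals of different lengths. The hierarchical text-speech encoder of the proposed model can be embedded flexibly into a variety of existing decoders and, through the complementarity of text and speech, provides more reliable synthesis results. Experiments on the LJSpeech and LibriTTS datasets show that the proposed method achieves lower mel-cepstral distortion than existing speech synthesis methods.
Keywords: speech synthesis; text encoder; speech encoder; hierarchical model; Conformer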
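The query/key/value arrangement described in the abstract above (text features as queries, speech features as keys and values) can be illustrated with plain scaled dot-product cross-attention. This is only a minimal sketch of that single step: it omits the learned projections, multiple heads, and convolution modules of a real Conformer, and the function names and toy feature vectors are assumptions for illustration, not the paper's implementation.

```python
import math

def softmax(scores):
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys, values):
    """Each query (a text feature) attends over all key/value pairs (speech frames)."""
    scale = math.sqrt(len(keys[0]))
    width = len(values[0])
    outputs = []
    for q in queries:
        # Scaled dot product between this query and every key.
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        outputs.append([sum(w * v[j] for w, v in zip(weights, values))
                        for j in range(width)])
    return outputs

# Hypothetical toy features: 2 text positions, 3 speech frames, dimension 2.
text_feats = [[1.0, 0.0], [0.0, 1.0]]
speech_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
matched = cross_attention(text_feats, speech_feats, speech_feats)
```

Each row of `matched` is a speech-side summary aligned to one text position, which is the kind of text-speech matching signal the abstract refers to.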
Application of TTS in an Intelligent Bus Stop Announcement System (Cited by: 10)
13
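Authors: 黄华, 仲元昌 《自动化仪表》 CAS PKU Core 2012, No. 8, pp. 24-26, 30 (4 pages)
In traditional bus stop announcement systems, the "record-store-playback" approach requires a large memory capacity. To address this, a new bus stop announcement system combining GPS and TTS technology is designed. The system outputs speech via TTS, so the memory stores the Chinese-character text of the announcements rather than waveform parameter information of the speech signal. Tests show that this approach uses only 17.1% of the storage space required by the traditional approach.
Keywords: text-to-speech (TTS); intelligent stop announcement; speech synthesis; GPS; memory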
Large-Scale Speech Corpora and Several Issues in Their Application to TTS (Cited by: 12)
14
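Authors: 章森, 刘磊, 刁麓弘 《计算机学报》 EI CSCD PKU Core 2010, No. 4, pp. 687-696 (10 pages)
The paper first reviews the state of research on large-scale speech corpora and on text-to-speech technology based on them, then describes the structure and content of Slib, an example of a large-scale continuous Chinese speech corpus. On this basis, it discusses indexing techniques for large-scale speech corpora, formulates set operations and the minimum containment problem in corpus retrieval, proves that the minimum containment problem is NP-complete, and gives a greedy algorithm for it together with the algorithm's approximation ratio. Finally, it discusses applying set-operation-based retrieval over large-scale speech corpora in text-to-speech systems. In particular, it implements a minimum-containment-based optimization method for selecting instances of basic linguistic units, which is of practical value for improving the naturalness of text-to-speech systems.
Keywords: speech corpus; set operations; text-to-speech; minimum containment; information retrieval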
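The abstract above does not spell out the minimum containment problem or the paper's greedy algorithm. Assuming the problem resembles classic set cover (choose the fewest corpus items whose contents together include all required units), the textbook greedy heuristic looks like the sketch below. The function name, utterance ids, and syllable sets are hypothetical, and this is the standard heuristic rather than the authors' method.

```python
def greedy_cover(required, candidates):
    """Pick candidate ids until every required unit is covered.

    required: set of units to cover; candidates: dict mapping id -> set of units.
    """
    uncovered = set(required)
    chosen = []
    while uncovered:
        # Take the candidate covering the most still-uncovered units;
        # iterating in sorted order makes tie-breaking deterministic.
        best = max(sorted(candidates), key=lambda k: len(candidates[k] & uncovered))
        gain = candidates[best] & uncovered
        if not gain:
            raise ValueError(f"units not covered by any candidate: {sorted(uncovered)}")
        chosen.append(best)
        uncovered -= gain
    return chosen

# Hypothetical corpus: each utterance id maps to the syllables it contains.
utterances = {
    "u1": {"ma", "shi", "de"},
    "u2": {"shi", "guo"},
    "u3": {"guo", "zhong", "ren"},
    "u4": {"ma", "ren"},
}
needed = {"ma", "shi", "de", "guo", "zhong", "ren"}
selection = greedy_cover(needed, utterances)
```

For classic set cover this heuristic is known to stay within a logarithmic factor of the optimal number of sets; the bound proved in the paper itself is not stated in the abstract.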
A Chinese Text-to-Speech (TTS) System (Cited by: 8)
15
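Authors: 谌卫军, 李建民, 林福宗, 张钹 《计算机工程与应用》 CSCD PKU Core 2000, No. 9, pp. 1-3 (3 pages)
This paper discusses the implementation of a typical Chinese text-to-speech system. It first introduces the overall framework of the system and each of its functional modules, then analyzes the system's characteristics and existing problems, and finally discusses concrete ideas for improving the system from two angles: a simple yet effective pitch period extraction algorithm is proposed, and the role of context in improving the naturalness of synthesized speech is verified.
Keywords: Chinese text-to-speech system; speech naturalness; phonetic annotation processing; single-pronunciation characters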
Thematic Apperception Test for Suicide Risk Identification: A Machine Learning Study Based on Speech and Text Features
16
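Authors: 杨劲寅, 吴雯, 李世佳, 张亚 《心理科学》 CSCD PKU Core 2024, No. 2, pp. 485-493 (9 pages)
Suicide risk identification is a key part of suicide prevention, but traditional screening with self-report scales suffers from high false-positive and false-negative rates. Through two consecutive experiments, the Thematic Apperception Test (TAT) was adapted into a self-administered, mini-program-based procedure. Audio and text data were collected for machine learning modeling, and a suicide risk identification model targeting suicidal ideation was built. The results show that, with a shorter test duration, the model achieved better overall performance than those reported in previous studies. Word frequency analysis and keyword co-occurrence network analysis found that participants in the high-suicide-risk group mentioned more words and themes related to suicide and self-harm in their narratives and used more exclusion words. The adapted TAT mini-program offers a standardized and convenient administration procedure. More high-quality samples can be collected in future to build a model with better generalization for use in the auxiliary assessment of suicide risk.
Keywords: suicide risk identification; Thematic Apperception Test; machine learning; speech recognition; text analysis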
A Solution for Polyphonic Characters in Chinese TTS Systems (Cited by: 3)
17
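Authors: 张力, 薛惠锋, 吴晓军, 李慜 《计算机应用与软件》 CSCD PKU Core 2008, No. 2, pp. 143-145 (3 pages)
Polyphonic characters in Chinese make building a Chinese text-to-speech (TTS) system very difficult. To address the polyphone problem, a polyphone word lexicon and a non-polyphone word lexicon are constructed so that polyphonic characters are disambiguated at the word level. The structure of the two lexicons is designed to reduce redundant information and speed up the lookup of word pronunciations. Experiments show that the scheme can solve the polyphone problem in Chinese TTS.
Keywords: text-to-speech; polyphonic characters; speech synthesis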
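The word-level disambiguation idea in the abstract above can be sketched as a longest-match lookup: try a lexicon of words containing polyphonic characters first, then fall back to a default per-character reading. The lexicon contents, the fallback table, and the function name below are hypothetical toy data; the paper's actual lexicon structure is not given in the abstract.

```python
# Hypothetical toy lexicon of words that contain polyphonic characters.
POLYPHONE_WORDS = {
    "银行": ["yin2", "hang2"],
    "行走": ["xing2", "zou3"],
    "音乐": ["yin1", "yue4"],
    "快乐": ["kuai4", "le4"],
}
# Hypothetical default reading for each single character.
DEFAULT_READINGS = {
    "银": "yin2", "行": "xing2", "走": "zou3", "音": "yin1",
    "乐": "le4", "快": "kuai4", "去": "qu4",
}

def to_pinyin(text, max_word_len=4):
    """Convert text to pinyin, preferring the longest matching lexicon word."""
    result = []
    i = 0
    while i < len(text):
        # Try the longest candidate word first, down to two characters.
        for n in range(min(max_word_len, len(text) - i), 1, -1):
            word = text[i:i + n]
            if word in POLYPHONE_WORDS:
                result.extend(POLYPHONE_WORDS[word])
                i += n
                break
        else:
            # No lexicon word starts here: use the default reading,
            # or "?" for a character missing from the toy table.
            result.append(DEFAULT_READINGS.get(text[i], "?"))
            i += 1
    return result
```

With this toy data, `to_pinyin("去银行")` reads 行 as `hang2` because the word 银行 matches, while `to_pinyin("行走")` reads it as `xing2`.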
Phoneme-Fused Text Error Correction for Burmese Speech Recognition
18
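Authors: 陈璐, 董凌, 王文君, 王剑, 余正涛, 高盛祥 《计算机工程与科学》 CSCD PKU Core 2024, No. 6, pp. 1121-1127 (7 pages)
Burmese speech recognition transcripts contain many homophone and spacing errors, and correcting erroneous characters with generic textual semantic information locates and corrects these errors inaccurately. Since Burmese is a tonal language and its phonemes carry tone information, a phoneme-fused text error correction method for Burmese speech recognition is proposed. The transcribed text and its phonemes are modeled jointly through a parameter-sharing strategy, and phoneme information is used to help detect and correct Burmese homophone and spacing errors. Experimental results show that, compared with the baseline method ConvSeq2Seq, the proposed method improves the F1 score on the Burmese speech recognition error correction task by 85.97%, reaching 79.15%.
Keywords: Burmese; speech recognition text error correction; phonemes; shared parameters; BERT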
Few-Shot Speech Synthesis Based on Meta-Learning Adaptation
19
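Authors: 吴郅昊, 迟子秋, 肖婷, 王喆 《计算机应用》 CSCD PKU Core 2024, No. 5, pp. 1629-1635 (7 pages)
Few-shot text-to-speech (TTS) must synthesize speech resembling the original speaker from only a few samples. Existing few-shot TTS faces the problem of how to adapt quickly to a new speaker while improving the similarity between the generated speech and the speaker without sacrificing speech quality. Existing models rarely consider how model features change across different adaptation stages when adapting to a new speaker, so the generated speech cannot gain similarity quickly while preserving quality. To solve these problems, a method is proposed that uses meta-learning to guide the model's adaptation to new speakers. A meta-feature module guides the adaptation process, improving speech similarity while preserving the quality of the generated speech, and a step encoder distinguishes different adaptation stages to speed up adaptation. On the Libri-TTS and VCTK datasets, the method was compared with existing fast speaker adaptation methods at different numbers of adaptation steps using subjective and objective metrics. Experimental results show that the proposed method achieves a dynamic-time-warping mel-cepstral distortion (DTW-MCD) of 7.4502 and 6.5243, respectively, outperforms other meta-learning methods in the similarity of synthesized speech, and adapts to new speakers faster.
Keywords: few-shot generation; speech synthesis; meta-learning; speaker adaptation; feature extraction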
Implementation of a TTS-Based Self-Directed Laboratory Learning System (Cited by: 1)
20
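Authors: 祝诗平, 黄华, 唐超, 林跃跃 《西南师范大学学报(自然科学版)》 CAS CSCD PKU Core 2012, No. 1, pp. 88-91 (4 pages)
In most higher education institutions, electronics laboratory courses have few class hours but extensive and complex content, so students find them hard to learn and teachers find them hard to teach. A self-directed laboratory learning system is developed using TTS and microcontroller technology: the material a teacher would present in class (such as experimental methods and procedures) is stored as text, the system converts the text to speech output, and the spoken content is displayed synchronously on an LCD. With this system, students can carry out experiments in their spare time without a teacher's guidance. The system has already been applied in analog electronics laboratory courses. Experiments show that the system performs well in use and is practical and readily extensible.
Keywords: laboratory teaching; text-to-speech; self-directed learning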