Mongolian AI Composite Anchor Based on Deep Learning (基于深度学习的蒙古语AI合成主播)

Cited by: 1
Abstract: Chinese-language AI composite anchors are already used in the production and dissemination of news content, opening up a new development path for the traditional news media industry. Research on Mongolian AI composite anchors, however, is still in its infancy. To develop a Mongolian AI composite anchor system, we use deep learning to propose a Mongolian AI composite anchor model based on mouth shape classification. We first build a baseline system with the ObamaNet model. Because the baseline system has a large time cost, we then propose the model based on mouth shape classification: it uses 9 mouth shape labels to represent all mouth states, synchronizes the extracted speech features to the different mouth shapes, and selects candidate frames according to the resulting mouth shapes to produce the AI anchor composite video. We also construct a Mongolian AI composite anchor video corpus and carry out comparative experiments on it. The results show that the proposed model can generate Mongolian AI composite anchor videos with good naturalness.
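The frame-selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract only states that 9 mouth shape labels are used and that candidate frames are chosen according to the predicted mouth shape. The function names `build_candidate_index` and `select_frames`, the integer encoding of labels as 0 to 8, and the tie-breaking rule (prefer the candidate closest to the previously selected frame) are all assumptions made for illustration. The speech-to-label prediction itself is left out, and the predicted labels are taken as given.

```python
from collections import defaultdict

# The abstract states that 9 labels represent all mouth states.
# Encoding them as the integers 0..8 is an assumption of this sketch.
NUM_MOUTH_SHAPES = 9


def build_candidate_index(frame_labels):
    """Group corpus frame indices by mouth shape label (hypothetical helper)."""
    index = defaultdict(list)
    for frame_idx, label in enumerate(frame_labels):
        if not 0 <= label < NUM_MOUTH_SHAPES:
            raise ValueError(f"label {label} is outside 0..{NUM_MOUTH_SHAPES - 1}")
        index[label].append(frame_idx)
    return index


def select_frames(predicted_labels, candidate_index):
    """Pick one corpus frame for each predicted mouth shape label.

    Assumed heuristic: among the candidates for a label, take the one
    closest in corpus position to the previously selected frame.
    """
    selected = []
    previous = None
    for label in predicted_labels:
        candidates = candidate_index.get(label)
        if not candidates:
            raise KeyError(f"no candidate frame for mouth shape {label}")
        if previous is None:
            choice = candidates[0]
        else:
            choice = min(candidates, key=lambda idx: abs(idx - previous))
        selected.append(choice)
        previous = choice
    return selected


# Toy corpus: the mouth shape label of each of six video frames.
corpus_labels = [0, 1, 1, 2, 0, 2]
candidate_index = build_candidate_index(corpus_labels)

# Labels assumed to come from a speech-to-mouth-shape classifier.
predicted = [0, 1, 2, 0]
frames = select_frames(predicted, candidate_index)
```

Selecting existing frames by label avoids generating a new image for every output frame. That is one plausible reading of why a classification-based approach would reduce the time cost the abstract attributes to the baseline.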
Authors: BAOYINDUGULENG (宝音都古楞); FEI LONG (飞龙); WANG Weihua (王炜华); ZHANG Hui (张晖); DONG Linkun (董林坤)
Affiliations: College of Computer Science, Inner Mongolia University, Huhhot 010021, China; National & Local Joint Engineering Research Center of Intelligent Information Processing Technology for Mongolian, Huhhot 010021, China; Inner Mongolia Key Laboratory of Mongolian Information Processing Technology, Huhhot 010021, China
Source: Journal of Minzu University of China (Natural Sciences Edition) (《中央民族大学学报(自然科学版)》), 2023, No. 2, pp. 31-40 (10 pages)
Funding: Science and Technology Program of the Inner Mongolia Autonomous Region (2021GG0158).
Keywords: AI composite anchor; Mongolian; multimodal learning; lip sync; face reconstruction