Journal Articles
4 articles found
MusicFace: Music-driven expressive singing face synthesis
1
Authors: Pengfei Liu, Wenjin Deng, Hengda Li, Jintai Wang, Yinglin Zheng, Yiwei Ding, Xiaohu Guo, Ming Zeng. Computational Visual Media, SCIE EI CSCD, 2024, No. 1, pp. 119-136 (18 pages)
It remains an interesting and challenging problem to synthesize a vivid and realistic singing face driven by music. In this paper, we present a method for this task with natural motions for the lips, facial expression, head pose, and eyes. Due to the coupling of mixed information for the human voice and backing music in common music audio signals, we design a decouple-and-fuse strategy to tackle the challenge. We first decompose the input music audio into a human voice stream and a backing music stream. Due to the implicit and complicated correlation between the two-stream input signals and the dynamics of the facial expressions, head motions, and eye states, we model their relationship with an attention scheme, where the effects of the two streams are fused seamlessly. Furthermore, to improve the expressiveness of the generated results, we decompose head movement generation in terms of speed and direction, and decompose eye state generation into short-term blinking and long-term eye closing, modeling them separately. We have also built a novel dataset, SingingFace, to support training and evaluation of models for this task, including future work on this topic. Extensive experiments and a user study show that our proposed method is capable of synthesizing vivid singing faces, qualitatively and quantitatively better than the prior state-of-the-art.
Keywords: face synthesis, singing, music, generative adversarial network
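The decouple-and-fuse strategy in the abstract above, where the voice stream and backing-music stream are combined with an attention scheme, can be sketched roughly as follows. This is an illustrative reconstruction, not the authors' code: the feature shapes, the scaled dot-product form of the attention, and the concatenation-based fusion are all assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def fuse_streams(voice_feats, music_feats):
    """Fuse per-frame voice and backing-music features: each voice frame
    attends over all music frames (scaled dot-product attention), and the
    attended music context is concatenated onto the voice stream."""
    d = voice_feats.shape[-1]
    scores = voice_feats @ music_feats.T / np.sqrt(d)  # (Tv, Tm)
    weights = softmax(scores, axis=-1)                 # attention over music frames
    context = weights @ music_feats                    # (Tv, d)
    return np.concatenate([voice_feats, context], axis=-1)

# toy example: 4 voice frames and 6 music frames, 8-dim features each
rng = np.random.default_rng(0)
fused = fuse_streams(rng.normal(size=(4, 8)), rng.normal(size=(6, 8)))
print(fused.shape)  # (4, 16)
```

In a trained model the two streams would come from learned audio encoders; here random features merely exercise the shapes.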
Attention-Enhanced Voice Portrait Model Using Generative Adversarial Network
2
Authors: Jingyi Mao, Yuchen Zhou, Yifan Wang, Junyu Li, Ziqing Liu, Fanliang Bu. Computers, Materials & Continua, SCIE EI, 2024, No. 4, pp. 837-855 (19 pages)
Voice portrait technology has explored and established the relationship between speakers' voices and their facial features, aiming to generate corresponding facial characteristics from the voice of an unknown speaker. Owing to their powerful advantages in image generation, Generative Adversarial Networks (GANs) have been widely applied across various fields. Existing Voice2Face methods for voice portraits are primarily based on GANs trained on voice-face paired datasets. However, voice portrait models built solely on GANs face limitations in image generation quality and struggle to maintain facial similarity. Additionally, the training process is relatively unstable, which affects the overall generative performance of the model. To overcome these challenges, we propose a novel deep Generative Adversarial Network model for audio-visual synthesis, named AVP-GAN (Attention-enhanced Voice Portrait model using a Generative Adversarial Network). This model is based on a convolutional attention mechanism and is capable of generating corresponding facial images from the voice of an unknown speaker. Firstly, to address the issue of training instability, we integrate convolutional neural networks with deep GANs. In the network architecture, we apply spectral normalization to constrain the variation of the discriminator, preventing issues such as mode collapse. Secondly, to enhance the model's ability to extract relevant features across the two modalities, we propose a voice portrait model based on convolutional attention, which learns the mapping relationship between voice and facial features in a common space along both the channel and spatial dimensions independently. Thirdly, to enhance the quality of the generated faces, we incorporate a degradation removal module and utilize pretrained facial GANs as facial priors to repair and enhance the clarity of the generated facial images. Experimental results demonstrate that our AVP-GAN achieved a cosine similarity of 0.511, outperforming the comparison model, and effectively generated high-quality facial images corresponding to a speaker's voice.
Keywords: cross-modal generation, GANs, voice portrait technology, face synthesis
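The spectral normalization mentioned in the abstract above, used to constrain the discriminator and prevent mode collapse, is commonly implemented by estimating a weight matrix's largest singular value with power iteration and dividing the weights by it. A minimal NumPy sketch of that idea, not the AVP-GAN code; the iteration count and matrix size are arbitrary choices:

```python
import numpy as np

def spectral_normalize(W, n_iter=50):
    """Approximate the largest singular value (spectral norm) of W via
    power iteration, then return W scaled so its spectral norm is ~1,
    the constraint typically applied to each discriminator weight matrix."""
    u = np.random.default_rng(1).normal(size=W.shape[0])
    for _ in range(n_iter):
        v = W.T @ u
        v /= np.linalg.norm(v) + 1e-12
        u = W @ v
        u /= np.linalg.norm(u) + 1e-12
    sigma = u @ W @ v  # estimated top singular value
    return W / sigma

W = np.random.default_rng(2).normal(size=(16, 32))
W_sn = spectral_normalize(W)
print(np.linalg.norm(W_sn, 2))  # spectral norm after normalization
```

Frameworks apply this per layer at every forward pass (e.g. as a weight hook), re-using the power-iteration vectors between steps so the overhead stays small.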
Application of Unit-Type Advance Supports in Roadway Support of a Fully Mechanized Caving Face
3
Authors: 解洪梨, 孔政, 孔飞. 《现代工业经济和信息化》 (Modern Industrial Economy and Informationization), 2024, No. 2, pp. 130-132 (3 pages)
To address the low support strength and high labor intensity of single hydraulic props combined with articulated roof beams, and the roof damage caused by the repeated loading of walking-type advance supports, Tangyang Coal Mine used unit-type advance supports for the first time to provide advance support for the roadway of the 431 fully mechanized caving face. The installation of the unit-type advance supports, the anti-toppling and anti-slipping measures, the advance support quality standards, and the safety measures taken during use are described in detail. After adopting the advance supports, the mine not only reduced workers' labor intensity but also achieved safe and efficient production.
Keywords: fully mechanized caving face, advance support, roadway support
Target Detection Algorithm in Crime Recognition Using Artificial Intelligence
4
Author: Abdulsamad A. AL-Marghilani. Computers, Materials & Continua, SCIE EI, 2022, No. 4, pp. 809-824 (16 pages)
Presently, suspect prediction for crime scenes can be considered a classification task, which predicts suspects based on the time, space, and type of crime. Performing a digital forensic investigation in a big data environment poses several challenges to the investigating officer. Besides, facial sketches are widely employed by law enforcement agencies to assist in identifying the suspects involved in crime scenes. The sketches utilized in forensic investigations are either drawn by forensic artists or generated by a computer program (composite sketches) based on the verbal description given by an eyewitness or victim. Since this suspect identification process is slow and difficult, a technique for quick and automated facial sketch generation is required. Machine learning (ML) and deep learning (DL) models are useful for automatically supporting the decisions of forensics experts. The challenge is incorporating domain expert knowledge with DL models to develop efficient techniques that make better decisions. In this view, this study develops a new artificial intelligence (AI) based DL model with face sketch synthesis (FSS) for suspect identification (DLFSS-SI) in a big data environment. The proposed method performs preprocessing at the primary stage to improve image quality. In addition, the proposed model uses a DL-based MobileNet (MN) model as the feature extractor, and the hyperparameters of the MobileNet are tuned by the quasi-oppositional firefly optimization (QOFFO) algorithm. The proposed model automatically draws sketches of the input facial images. Moreover, a qualitative similarity assessment takes place against the sketch drawn by a professional artist from the eyewitness's description. If there is a high resemblance between the two sketches, the suspect is determined. To validate the effective performance of the DLFSS-SI method, a detailed qualitative and quantitative examination takes place. The experimental outcomes state that the DLFSS-SI model outperformed the compared methods in terms of mean square error (MSE), peak signal-to-noise ratio (PSNR), average accuracy, and average computation time.
Keywords: artificial intelligence, big data, deep learning, suspect identification, face sketch synthesis
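The hyperparameter tuning step in the abstract above uses quasi-oppositional firefly optimization (QOFFO). The sketch below implements only the plain firefly algorithm, without the quasi-oppositional initialization, on a toy two-dimensional loss standing in for a validation objective; `firefly_minimize`, its parameter values, and the toy loss are all illustrative assumptions, not the paper's method.

```python
import numpy as np

def firefly_minimize(f, bounds, n_fireflies=15, n_iter=60,
                     beta0=1.0, gamma=1.0, alpha=0.2, seed=3):
    """Plain firefly algorithm: each firefly moves toward brighter
    (lower-loss) fireflies with distance-damped attraction, plus a
    shrinking random walk. Brightness is recomputed once per generation."""
    rng = np.random.default_rng(seed)
    lo, hi = np.array(bounds, dtype=float).T
    X = rng.uniform(lo, hi, size=(n_fireflies, len(lo)))
    for t in range(n_iter):
        fit = np.array([f(x) for x in X])
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if fit[j] < fit[i]:  # j is brighter: move i toward j
                    r2 = np.sum((X[i] - X[j]) ** 2)
                    beta = beta0 * np.exp(-gamma * r2)      # attraction
                    step = alpha * (1 - t / n_iter) * rng.uniform(-0.5, 0.5, len(lo))
                    X[i] = np.clip(X[i] + beta * (X[j] - X[i]) + step * (hi - lo), lo, hi)
    fit = np.array([f(x) for x in X])
    return X[np.argmin(fit)], fit.min()

# toy "validation loss" with its optimum at (0.3, 0.7), e.g. two
# normalized hyperparameters of a feature extractor
loss = lambda x: (x[0] - 0.3) ** 2 + (x[1] - 0.7) ** 2
best, val = firefly_minimize(loss, bounds=[(0, 1), (0, 1)])
print(best, val)
```

In the tuning setting described by the abstract, `f` would train or fine-tune the MobileNet feature extractor with a candidate hyperparameter vector and return its validation error, which is far more expensive than this toy quadratic.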