Funding: The Double First-Class Innovation Research Project for People's Public Security University of China (No. 2023SYL08).
Abstract: Voice portrait technology explores and establishes the relationship between speakers' voices and their facial features, aiming to generate the corresponding facial characteristics from the voice of an unknown speaker. Owing to their strengths in image generation, Generative Adversarial Networks (GANs) have been widely applied across various fields. Existing Voice2Face methods for voice portraits are primarily based on GANs trained on paired voice-face datasets. However, voice portrait models built solely on GANs are limited in image generation quality and struggle to maintain facial similarity. In addition, their training is relatively unstable, which degrades the overall generative performance of the model. To overcome these challenges, we propose a novel deep Generative Adversarial Network model for audio-visual synthesis, named AVP-GAN (Attention-enhanced Voice Portrait Model using Generative Adversarial Network). The model is based on a convolutional attention mechanism and can generate the corresponding facial image from the voice of an unknown speaker. First, to address training instability, we integrate convolutional neural networks with deep GANs. In the network architecture, we apply spectral normalization to constrain the variation of the discriminator, preventing issues such as mode collapse. Second, to enhance the model's ability to extract relevant features between the two modalities, we propose a voice portrait model based on convolutional attention. This model learns the mapping between voice and facial features in a common space along the channel and spatial dimensions independently. Third, to enhance the quality of the generated faces, we incorporate a degradation removal module and use pretrained facial GANs as facial priors to repair the generated facial images and improve their clarity. Experimental results demonstrate that AVP-GAN achieves a cosine similarity of 0.511, outperforming the comparison model, and effectively generates high-quality facial images corresponding to a speaker's voice.
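The cosine similarity reported above can be illustrated with a minimal sketch. The abstract does not state which vectors are compared or how they are extracted, so the example assumes the metric is computed between fixed-length embeddings of a generated face and a reference face; the function name and the toy vectors are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical 4-dimensional embeddings (assumption: real face embeddings
# would come from a face-recognition network and have far more dimensions).
generated = [0.2, 0.1, 0.7, 0.4]
reference = [0.1, 0.3, 0.6, 0.5]
score = cosine_similarity(generated, reference)
print(f"cosine similarity: {score:.3f}")
```

A score of 1 means the two embeddings point in the same direction and 0 means they are orthogonal, so a higher value indicates a closer match between the generated and reference faces under this assumed setup.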
Funding: Supported by the National Key R&D Program of China (Grant No. 2018YFC2001003) and the Strategic Priority Research Program of the Chinese Academy of Sciences (Category B, Grant No. XDB38020100).
Abstract: Background: Chronic diseases are becoming a critical challenge to the aging Chinese population. Biobanks with extensive genomic and environmental data offer opportunities to elucidate the complex gene-environment interactions underlying their aetiology. Genome-wide genotyping arrays remain an efficient approach for large-scale genomic data collection. However, most commercial arrays have reduced performance for biobanking in the Chinese population. Materials and methods: Deep whole-genome sequencing data from 2641 Chinese individuals were used as a reference to develop the CAS Array, a custom-designed genotyping array for precision medicine. The array was evaluated by comparing data from 384 individuals assayed by both the array and whole-genome sequencing. Its capacity to estimate mitochondrial copy number was validated by examining the association of the estimates with established covariates among 10162 elderly Chinese individuals. Results: The CAS Array adopts the proven Axiom technology and is restricted to 652429 single-nucleotide polymorphism (SNP) markers. Its call rate of 99.79% and concordance rate of 99.89% are both higher than those of commercial arrays. Its imputation-based genome coverage reached 98.3% for common SNPs and 63.0% for low-frequency SNPs, both comparable to commercial arrays with larger SNP capacity. After validating its mitochondrial copy number estimates, we developed a publicly available software tool to facilitate use of the array. Conclusion: Based on recent advances in genomic science, we designed and implemented a high-throughput and low-cost genotyping array. It is more cost-effective than commercial arrays for large-scale Chinese biobanking.
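The call rate and concordance rate quoted above can be illustrated with a small sketch. The abstract does not define either metric, so the example assumes the common definitions: call rate is the fraction of genotypes that received a call, and concordance is the fraction of matching genotypes among sites called by both the array and whole-genome sequencing. The function names, the 0/1/2 allele-count encoding, and the toy data are hypothetical.

```python
def call_rate(calls):
    """Fraction of genotypes with a call (None marks a missing call)."""
    if not calls:
        raise ValueError("no genotypes supplied")
    called = sum(1 for g in calls if g is not None)
    return called / len(calls)


def concordance_rate(array_calls, wgs_calls):
    """Fraction of identical genotypes among sites called by both methods."""
    if len(array_calls) != len(wgs_calls):
        raise ValueError("both call lists must cover the same sites")
    pairs = [
        (a, w)
        for a, w in zip(array_calls, wgs_calls)
        if a is not None and w is not None
    ]
    if not pairs:
        raise ValueError("no sites were called by both methods")
    matches = sum(1 for a, w in pairs if a == w)
    return matches / len(pairs)


# Toy genotypes for one sample at eight sites, encoded as alternate-allele
# counts (0, 1, 2); None is a no-call. These values are made up.
array_calls = [0, 1, 2, None, 1, 0, 2, 1]
wgs_calls = [0, 1, 2, 1, 1, 0, 1, 1]

print(f"call rate:   {call_rate(array_calls):.4f}")
print(f"concordance: {concordance_rate(array_calls, wgs_calls):.4f}")
```

In this toy data one of eight sites is a no-call on the array, and one of the seven jointly called sites disagrees with sequencing, which is how the two metrics can differ for the same sample.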