Latent discriminative representation learning for speaker recognition


Abstract: Extracting discriminative speaker-specific representations from speech signals and transforming them into fixed-length vectors are key steps in speaker identification and verification systems. In this study, we propose a latent discriminative representation learning method for speaker recognition. The learned representations are not only discriminative but also relevant. Specifically, we introduce an additional speaker embedding lookup table to explore the relevance between different utterances from the same speaker. Moreover, a reconstruction constraint intended to learn a linear mapping matrix is introduced to make the representations discriminative. Experimental results demonstrate that the proposed method outperforms state-of-the-art methods on the Apollo dataset used in the Fearless Steps Challenge at INTERSPEECH 2019 and on the TIMIT dataset.

Authors: Duolin HUANG, Qirong MAO, Zhongchen MA, Zhishen ZHENG, Sidheswar ROUTRYAR, Elias-Nii-Noi OCQUAYE

Affiliations: School of Computer Science and Communication Engineering; Jiangsu Key Laboratory of Security-Technology for Industrial Cyberspace

Source: Frontiers of Information Technology & Electronic Engineering (信息与电子工程前沿（英文版）), 2021, No. 5, pp. 697-708 (12 pages). Indexed in SCIE, EI, CSCD.

Funding: Project supported by the National Natural Science Foundation of China (Nos. U1836220 and 61672267), the Qing Lan Talent Program of Jiangsu Province, China, and the Jiangsu Province Key Research and Development Plan (Industry Foresight and Key Core Technology) (No. BE2020036).

Keywords: Speaker recognition; Latent discriminative representation learning; Speaker embedding lookup table; Linear mapping matrix

Classification number: TP391.4 [Automation and Computer Technology - Computer Application Technology]


