Abstract: Forensic speaker recognition is experiencing a remarkable paradigm shift in its evaluation framework and in the presentation of voice evidence. This paper proposes a new method of forensic automatic speaker recognition that uses the likelihood ratio framework to quantify the strength of voice evidence. The proposed method uses a reference database to calculate within- and between-speaker variability. Some acoustic-phonetic features are extracted automatically using the software VoiceSauce. The effectiveness of the approach was tested on two Mandarin databases: a mobile telephone database and a landline database. The experimental results indicate that these acoustic-phonetic features have some discriminating potential and are worth using in speaker discrimination. The automatically extracted acoustic-phonetic features show acceptable discriminative performance and can provide more reliable results in evidence analysis when fused with other kinds of voice features.
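The likelihood ratio idea in the abstract above can be illustrated with a minimal sketch. This is not the paper's method: it assumes a single scalar feature, Gaussian distributions, and a pooled within-speaker variance, and the names `likelihood_ratio`, `reference_db`, and `suspect_samples` are hypothetical. The numerator scores the questioned value under the same-speaker hypothesis; the denominator scores it under the different-speaker hypothesis, using variability estimated from the reference database.

```python
import math
from statistics import mean, pvariance


def gaussian_pdf(x, mu, var):
    """Density of a normal distribution with mean mu and variance var."""
    return math.exp(-((x - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def likelihood_ratio(questioned, suspect_samples, reference_db):
    """Illustrative univariate Gaussian LR (an assumption, not the paper's model).

    questioned      -- feature value measured on the questioned recording
    suspect_samples -- feature values measured on the suspect's recordings
    reference_db    -- dict mapping speaker id to a list of feature values
    """
    # Within-speaker variability: average of per-speaker variances.
    within_var = mean(pvariance(values) for values in reference_db.values())

    # Between-speaker variability: variance of the per-speaker means.
    speaker_means = [mean(values) for values in reference_db.values()]
    population_mean = mean(speaker_means)
    between_var = pvariance(speaker_means)

    # Same-speaker hypothesis: questioned value comes from the suspect.
    numerator = gaussian_pdf(questioned, mean(suspect_samples), within_var)

    # Different-speaker hypothesis: questioned value comes from the population.
    denominator = gaussian_pdf(questioned, population_mean, between_var + within_var)

    return numerator / denominator
```

A ratio above 1 supports the same-speaker hypothesis and a ratio below 1 supports the different-speaker hypothesis. Real systems typically use multivariate features and calibrated scores, which this sketch leaves out.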
Funding: This paper is one of the outcomes of the "13th Five-Year Plan" Philosophy and Social Science Research Program (GD16CWW02), the Study of Identification of We-Media Language in Big Data Era, which is directed by Guan Xin and was approved by the Guangdong Planning Office of Philosophy and Social Science in 2016.
Abstract: So far, phonetic features have been the main type of forensic speaker recognition feature studied and used in practice. One problem with phonetic features is that they are strongly affected by real-world conditions, which produces within-speaker variation and consequently reduces the reliability of forensic speaker recognition results. In this context, supported by Sapir's description of the structure of speech behavior and by discourse information theory, natural conversations are adopted as experimental material to explore nonphonetic features that are expected to be less affected by real-world conditions. The experimental results show, first, that nonphonetic features exist alongside phonetic features and, second, that the nonphonetic features are less affected by real-world conditions, as expected.