Most existing loudness models are based on the diotic listening hypothesis, though human beings always hear in dichotic listening conditions. In this situation, the arithmetic mean of the loudness at both ears is usually taken as the approximate value of overall perceived loudness, unaffected by the interaural level difference (ILD). The present work investigated the overall perceived loudness for pure tones in dichotic listening conditions through a subjective experiment. Two experimental procedures and systematic errors were investigated to verify the accuracy of the subjective test. The results showed that fluctuation was insignificant in the low frequency range, while apparent fluctuation of overall loudness could be observed at a high frequency. The overall loudness deviated from the arithmetic mean value as the ILD became larger. A revised model for overall loudness in dichotic listening conditions was proposed. The proposed model is consistent with the experimental results.
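The arithmetic-mean baseline mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the revised model, so only the baseline is shown. The phon-to-sone conversion (1 sone at 40 phon, doubling every 10 phon) is the standard textbook rule and is an assumption here, not something taken from the paper; the function names are hypothetical.

```python
def phon_to_sone(level_phon: float) -> float:
    """Textbook rule (assumption, not from the paper): 40 phon = 1 sone,
    and loudness doubles for every 10 phon above that."""
    return 2.0 ** ((level_phon - 40.0) / 10.0)


def mean_binaural_loudness(left_phon: float, right_phon: float) -> float:
    """Baseline only: arithmetic mean of the loudness (in sones) at the two ears."""
    return 0.5 * (phon_to_sone(left_phon) + phon_to_sone(right_phon))


if __name__ == "__main__":
    # Split an ILD symmetrically around a 60 phon centre level and print
    # the baseline estimate for each ILD.
    for ild_db in (0.0, 6.0, 12.0):
        left, right = 60.0 + ild_db / 2.0, 60.0 - ild_db / 2.0
        print(ild_db, mean_binaural_loudness(left, right))
```

According to the abstract, measured overall loudness departs from this baseline as the ILD grows, which is what the revised model is meant to capture.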
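Sound source localization based on microphone arrays has received increasing attention. In video conferencing, hearing aids, and hands-free telephone systems, sound source localization is used to detect the speaker's position in order to steer a camera automatically or to form a beam. Among the various localization methods, the two-step algorithm based on time delay of arrival (TDOA) estimation is a widely adopted and effective approach. Starting from an energy perspective, Birchfield proposed a two-step localization algorithm based on the interaural level difference (ILD), which determines the source position from the ratios of the signal energy received by multiple microphone pairs. However, all of these methods need at least three microphones to determine the coordinates of a source in a two-dimensional plane. To address this problem, this paper proposes a two-dimensional localization algorithm that uses only two microphones. Similar to the principle of human binaural localization, it localizes the source by simultaneously estimating the energy ratio and the time delay of the source signal at the two microphones, and the closed-form solution derived from these estimates can be used to track a moving source in real time. Simulation results show that the algorithm performs well under typical reverberation conditions while reducing the array size, which is highly attractive for communication devices with limited volume.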
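The idea of combining an energy ratio with a time delay at two microphones can be sketched as follows. This is not the paper's closed-form solution, which the abstract does not give. It is a minimal free-field illustration that assumes inverse-square energy decay, no reverberation, microphones at (-a, 0) and (a, 0), and a source in the half-plane y >= 0. All names and parameters are hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def locate_two_mics(energy_ratio: float, tdoa: float, half_spacing: float):
    """Estimate (x, y) of a source from two microphones at (-a, 0) and (a, 0).

    energy_ratio: E1 / E2, energy received at mic 1 over mic 2.
    tdoa:         (d2 - d1) / c in seconds, positive when mic 2 is farther.
    half_spacing: a, half the distance between the microphones, in metres.
    Assumes free-field inverse-square decay, so E1 / E2 = (d2 / d1) ** 2.
    """
    k = math.sqrt(energy_ratio)  # distance ratio d2 / d1
    if math.isclose(k, 1.0):
        # Equal energies: the source lies on the bisector and the range
        # cannot be recovered from these two cues alone.
        raise ValueError("source is equidistant from both microphones")
    d1 = SPEED_OF_SOUND * tdoa / (k - 1.0)  # from d2 - d1 = c * tdoa and d2 = k * d1
    d2 = k * d1
    # Subtracting the two circle equations gives d1^2 - d2^2 = 4 * a * x.
    x = (d1 ** 2 - d2 ** 2) / (4.0 * half_spacing)
    y_squared = d1 ** 2 - (x + half_spacing) ** 2
    y = math.sqrt(max(y_squared, 0.0))  # clamp small negative rounding errors
    return x, y


if __name__ == "__main__":
    # Synthesize noiseless cues for a known source position and recover it.
    a, true_x, true_y = 0.1, 0.3, 1.0
    d1 = math.hypot(true_x + a, true_y)
    d2 = math.hypot(true_x - a, true_y)
    print(locate_two_mics((d2 / d1) ** 2, (d2 - d1) / SPEED_OF_SOUND, a))
```

With only two microphones the mirror-image position (x, -y) fits the cues equally well, so the sketch resolves that ambiguity by assuming the source is in front of the array.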
Most existing algorithms for the underdetermined blind source separation (UBSS) problem are two-stage algorithms, i.e., mixing parameter estimation followed by source estimation. In the mixing parameter estimation stage, previously proposed clustering algorithms are sensitive to the initialization of the mixing parameters. To reduce this sensitivity, we propose a new algorithm for the UBSS problem with anechoic speech mixtures that employs visual information, in the form of the interaural time difference (ITD) and the interaural level difference (ILD) derived from video, as the initialization of the mixing parameters. In our algorithm, the video signals are used to estimate the distances between the microphones and the sources, from which estimates of the ITD and ILD are obtained. Under the sparsity assumption in the time-frequency domain, the Gaussian potential function algorithm is used to estimate the mixing parameters, with the ITDs and ILDs as the initialization. Time-frequency masking is then used to recover the sources by evaluating the various ITDs and ILDs. Experimental results demonstrate the competitive performance of the proposed algorithm compared with the baseline algorithms.
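The step from source-to-microphone distances to ITD and ILD initializations, and the use of those cues for masking, can be sketched as below. This is a simplified illustration, not the authors' method: it assumes an anechoic model with 1/d amplitude decay, skips the Gaussian potential function refinement entirely, and replaces the masking stage with a nearest-cue assignment. The function names and the scale parameters are hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed


def cues_from_distances(d1: float, d2: float):
    """ITD (seconds) and ILD (dB) for a source at distances d1, d2 from mics 1, 2.

    Assumes anechoic propagation with amplitude proportional to 1 / d, so the
    level at mic 1 relative to mic 2 is 20 * log10(d2 / d1).
    """
    itd = (d2 - d1) / SPEED_OF_SOUND
    ild_db = 20.0 * math.log10(d2 / d1)
    return itd, ild_db


def assign_source(obs_itd, obs_ild_db, source_cues, itd_scale=1e-4, ild_scale=1.0):
    """Index of the source whose (ITD, ILD) is closest to one observed
    time-frequency point. The scales (hypothetical values) only put seconds
    and decibels on comparable footing."""
    def cost(cue):
        return (((obs_itd - cue[0]) / itd_scale) ** 2
                + ((obs_ild_db - cue[1]) / ild_scale) ** 2)
    return min(range(len(source_cues)), key=lambda i: cost(source_cues[i]))


if __name__ == "__main__":
    # Two sources whose distances are assumed to come from a video-based estimate.
    cues = [cues_from_distances(1.0, 1.2), cues_from_distances(1.5, 1.1)]
    # A time-frequency point with slightly perturbed cues of the first source.
    noisy_itd, noisy_ild = cues[0][0] + 2e-5, cues[0][1] - 0.2
    print(assign_source(noisy_itd, noisy_ild, cues))
```

A binary time-frequency mask for source i would keep exactly the points for which `assign_source` returns i.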
Funding for the dichotic-listening loudness study: supported by the National Natural Science Foundation of China (Grant No. 10674104).
Funding for the UBSS study: supported by the National Natural Science Foundation of China (Grant Nos. 61162014, 61210306074), the Natural Science Foundation of Jiangxi Province of China (Grant No. 20122BAB201025), and the Foundation for Young Scientists of Jiangxi Province (Jinggang Star) (Grant No. 20122BCB23002).