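Head-related transfer functions (HRTFs) play a key role in spatial audio rendering and can markedly improve a listener's experience. For the best result, however, the HRTF must match the listener's anatomy. To this end, an anthropometry-based method is used to estimate individualized HRTFs; in particular, the set of pinna measurements is extended to account for the often-neglected effect of the pinna cavities on the HRTF. Active Shape Models (ASM) automatically extract the pinna measurements from landmarks on a given pinna. A Light Gradient Boosting Machine (LightGBM) model then predicts the individual's HRTF magnitude in the median plane from the extracted pinna measurements together with head measurements. The evaluation shows that the extracted pinna features significantly improve the objective performance metrics of HRTF personalization.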
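Sound source localization cues include the interaural time difference, the interaural level difference, and spectral cues. After introducing these cues, the article presents the definition and properties of the head-related transfer function (HRTF). Using the HRTF database released by the MIT Media Lab (USA), two-channel virtual 3D audio was synthesized on a computer, and a localization test was run on the virtual sources. Twelve listeners with normal hearing took part in the subjective listening experiment, in which 10 audio signals carrying direction information served as stimuli; listeners recorded the direction they perceived, and this was compared with the preset direction. The article closes with a summary and analysis of the experimental results.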
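Two-channel virtual audio of this kind is typically produced by convolving a mono signal with the left-ear and right-ear head-related impulse responses (HRIRs) for the target direction. The sketch below illustrates that step only. The function name `render_binaural` is hypothetical, and the HRIRs are toy stand-ins (a pure delay and attenuation at the far ear), not measurements from the MIT Media Lab database.

```python
import numpy as np

def render_binaural(mono, hrir_left, hrir_right):
    """Convolve a mono signal with an HRIR pair; returns an (n, 2) stereo array.

    Both HRIRs are assumed to have the same length.
    """
    left = np.convolve(mono, hrir_left)
    right = np.convolve(mono, hrir_right)
    return np.stack([left, right], axis=1)

# Toy HRIRs (assumption, NOT measured data): the source is on the left, so the
# right ear receives the sound 20 samples later and at half the amplitude.
# This mimics an interaural time difference and an interaural level difference.
itd_samples = 20
hrir_left = np.zeros(64)
hrir_left[0] = 1.0
hrir_right = np.zeros(64)
hrir_right[itd_samples] = 0.5

rng = np.random.default_rng(0)
mono = rng.standard_normal(1024)          # placeholder stimulus
stereo = render_binaural(mono, hrir_left, hrir_right)
```

With measured HRIRs in place of the toy pair, the same convolution also imprints the direction-dependent spectral cues.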
To approximate head-related transfer functions (HRTFs), this paper employs and compares three kinds of one-input neural network models, namely multilayer perceptron (MLP) networks, radial basis function (RBF) networks, and wavelet neural networks (WNN), in order to select the best network model for further HRTF approximation. Experimental results demonstrate that wavelet neural networks are more efficient and useful.
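This paper designs and implements a deep neural network model that reconstructs personalized head-related transfer functions (HRTFs) from anthropometric parameters and angle information; a single training run yields predicted HRTFs for all directions. The model consists of a deep neural network that takes the anthropometric parameters as input, an expansion layer that takes the angle information as input, and a deep neural network that takes the outputs of the first two as input. Finally, the overall performance of the proposed method is evaluated objectively.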
This paper proposes a personalized head-related transfer function (HRTF) prediction method based on LightGBM using anthropometric data. Considering the overfitting problems of current training-based prediction methods, we use LightGBM and a specific network structure to prevent overfitting and enhance prediction performance. By decomposing and combining the data to be predicted, we set up 90 LightGBM models to separately predict the 90 instants of the HRTF in the log domain. At the same time, 10-fold cross-validation is used to score the accuracy of each model. For models with scores below 80 points, Bayesian optimization is used to adjust the model hyperparameters to obtain a better model structure. The results obtained by LightGBM are evaluated with spectral distortion (SD), which shows the fitting error between the prediction and the original data. The mean SD values of the two ears over the whole test set are 2.32 dB and 2.28 dB, respectively. Compared with the non-linear regression method and the latest method, the SD of the LightGBM-based method decreases by 83.8% and 48.5% in relative terms.
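The abstract does not spell out its SD formula. As a minimal sketch, the function below uses the definition commonly found in the HRTF literature, the root-mean-square of the dB difference between reference and predicted magnitudes across frequency bins. That choice of formula and the function name `spectral_distortion` are assumptions, not taken from the paper.

```python
import numpy as np

def spectral_distortion(h_true, h_pred):
    """RMS log-magnitude error in dB across frequency bins (assumed definition)."""
    ratio_db = 20.0 * np.log10(np.abs(h_true) / np.abs(h_pred))
    return float(np.sqrt(np.mean(ratio_db ** 2)))

# Placeholder magnitude spectrum with 90 bins; the values are arbitrary.
h_ref = np.linspace(0.5, 2.0, 90)

sd_same = spectral_distortion(h_ref, h_ref)            # identical spectra
sd_scaled = spectral_distortion(h_ref, 10.0 * h_ref)   # uniform 20 dB offset
```

Under this definition, a prediction that is off by a constant factor of 10 at every bin has an SD of 20 dB, and a perfect prediction has an SD of 0 dB.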
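This paper proposes an N-mode SVD based method for approximating personalized HRTFs. The head-related transfer function (HRTF) describes how sound waves are transmitted from the source position to the entrance of the ear canal, and reflects how body structures such as the head, torso, and outer ear filter sound arriving from different directions. The multilinear approximation method presented here is based on a tensor extension of conventional independent component analysis [1]. With this method, measuring only some of an individual's anthropometric parameters is enough to obtain that individual's personalized HRTF.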
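For readers unfamiliar with N-mode SVD, the following sketch shows the generic decomposition (also known as higher-order SVD) on a small random tensor. The subjects × directions × frequencies layout, the tensor sizes, and all function names are illustrative assumptions; the step that maps anthropometric parameters to a new individual's HRTF is not shown, since the abstract does not describe it.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n unfolding: rows index the chosen mode, columns everything else."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def mode_product(tensor, matrix, mode):
    """Multiply `tensor` by `matrix` along axis `mode`."""
    moved = np.moveaxis(tensor, mode, 0)
    out = np.tensordot(matrix, moved, axes=(1, 0))
    return np.moveaxis(out, 0, mode)

def n_mode_svd(tensor):
    """Return the core tensor and one orthonormal factor matrix per mode."""
    factors = [
        np.linalg.svd(unfold(tensor, m), full_matrices=False)[0]
        for m in range(tensor.ndim)
    ]
    core = tensor
    for m, u in enumerate(factors):
        core = mode_product(core, u.T, m)
    return core, factors

def reconstruct(core, factors):
    """Invert the decomposition by applying each factor along its mode."""
    out = core
    for m, u in enumerate(factors):
        out = mode_product(out, u, m)
    return out

# Hypothetical data layout: 5 subjects x 6 directions x 8 frequency bins.
rng = np.random.default_rng(0)
hrtf_tensor = rng.standard_normal((5, 6, 8))
core, factors = n_mode_svd(hrtf_tensor)
rebuilt = reconstruct(core, factors)
```

Keeping only the leading columns of each factor matrix gives a low-rank approximation, which is the usual reason for applying this decomposition to HRTF data.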
Funding (listed with the LightGBM-based HRTF prediction entry): Supported by the cooperation between BIT and Ericsson, and partially supported by the National Natural Science Foundation of China under Grant No. 62071039.