Funding: Supported by the National Natural Science Foundation of China (12174118).
Abstract: Far-field head-related transfer functions (HRTFs) vary with source direction, frequency, and individual. Consequently, the dimensionality of a full set of HRTFs is huge, and measuring or calculating individualized HRTFs with high directional resolution can be difficult. This paper proposes a method to construct HRTFs with high directional resolution from a small set of directional measurements or calculations. Based on tensor decomposition, far-field HRTFs are decomposed into a combination of direction-related modes, frequency-related modes, and a few individual-related modes. The individual-independent matrices of direction- and frequency-related modes are derived by statistical analysis of a known baseline HRTF database. For an arbitrary new individual, the individual mode can be estimated from HRTFs measured or calculated at a small set of directions, and the HRTFs with high directional resolution are then reconstructed. Calculations on two baseline HRTF databases indicate that 11 individual-related modes account for more than 98% of the individual-related energy variation of HRTFs, and that HRTFs with high directional resolution can be reconstructed from about 30 directional measurements or calculations. A psychoacoustic experiment validates the analysis. The method can simplify individualized HRTF measurement or calculation.
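The estimation step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes real-valued HRTF data, and it assumes the direction- and frequency-related modes have already been folded into a set of hypothetical basis arrays `basis[k][d][f]`, one per individual-related mode. The individual weights are then fitted by ordinary least squares on the few measured directions. All function names are illustrative.

```python
def solve_linear(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def estimate_weights(basis, measured, dirs):
    """Least-squares fit of the individual weights from a few directions.

    basis[k][d][f]: assumed individual-independent pattern of mode k.
    measured[d][f]: the new individual's HRTF data at direction d (d in dirs).
    """
    n_modes = len(basis)
    n_freq = len(basis[0][0])
    rows = [[basis[k][d][f] for k in range(n_modes)]
            for d in dirs for f in range(n_freq)]
    target = [measured[d][f] for d in dirs for f in range(n_freq)]
    # Normal equations: (A^T A) w = A^T h
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(n_modes)]
           for i in range(n_modes)]
    atb = [sum(r[i] * t for r, t in zip(rows, target)) for i in range(n_modes)]
    return solve_linear(ata, atb)


def reconstruct(basis, weights):
    """Rebuild data at every direction as a weighted sum of the mode patterns."""
    n_dir, n_freq = len(basis[0]), len(basis[0][0])
    return [[sum(w * b[d][f] for w, b in zip(weights, basis))
             for f in range(n_freq)] for d in range(n_dir)]
```

In use, `measured` would hold data for only the sparse directions (about 30 in the abstract), while `reconstruct` returns values at every direction covered by the basis. The fit needs at least as many measured values as there are modes, and the measured directions must make the modes distinguishable.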
Funding: Supported by the National Natural Science Foundation of China (11174087, 10774049) and the State Key Laboratory of Subtropical Building Science, South China University of Technology.
Abstract: This paper reports recent work and progress on a PC- and C++-based virtual auditory environment (VAE) system platform. By tracking the current position and orientation of the listener's head and dynamically simulating the acoustic propagation from the sound source to the two ears, the system can recreate free-field virtual sources at various directions and distances, as well as auditory perception in a reflective environment, via headphone presentation. Schemes for improving VAE performance are proposed, including near-field virtual source synthesis based on principal component analysis (PCA) and simulation of six degrees of freedom of head movement. In particular, the PCA-based scheme greatly reduces the computational cost of synthesizing multiple virtual sources. Tests demonstrate that the system performs better than some existing systems. It can simultaneously render up to 280 virtual sources using the conventional scheme and 4500 virtual sources using the PCA-based scheme. A set of psychoacoustic experiments also validates the performance of the system and provides some preliminary results for binaural hearing research. The functions of the VAE system are being extended, and the system serves as a flexible and powerful platform for future binaural hearing research and virtual reality applications.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 10374031).
Abstract: Based on measurements from 52 Chinese subjects (26 males and 26 females), a high-spatial-resolution head-related transfer function (HRTF) database with corresponding anthropometric parameters is established. Using the database, cues for sound source localization are analyzed, including interaural time difference (ITD), interaural level difference (ILD), and spectral features introduced by the pinna. Moreover, the statistical relationship between ITD and anthropometric parameters is estimated. It is shown that the mean values of maximum ITD for males and females are significantly different, as are those for Chinese and Western subjects. The difference in ITD is due to the difference in individual anthropometric parameters. It is further shown that the spectral features introduced by the pinna depend strongly on the individual, and that at high frequencies (f ≥ 5.5 kHz) HRTFs are left-right asymmetric. This work is instructive and helpful for research on binaural hearing and for future virtual auditory applications.
Funding: Supported by the National Natural Science Foundation of China (11674105) and the State Key Lab of Subtropical Building Science, South China University of Technology.
Abstract: By considering the contribution of the dynamic cue to auditory vertical localization, a method for virtual reproduction of surround sound in frontal space using four actual loudspeakers is proposed. The four loudspeakers are arranged in the left-front and right-front directions in the horizontal plane and in the left-front-up and right-front-up directions in a higher elevation plane. Transaural signal processing converts the multichannel sound signals to the signals for the four loudspeakers. Virtual reproduction of 9.1-channel sound is taken as an example. Analysis of the binaural pressures and corresponding localization cues using head-related transfer functions indicates that the method recreates the correct interaural time difference and its dynamic variation with head turning, and thus creates an appropriate binaural cue for lateral localization and a dynamic cue for vertical localization. Results of a psychoacoustic experiment indicate that the method recreates stable horizontal and vertical virtual sources within the frontal-hemispherical directions. Therefore, combined with transaural processing, the four-loudspeaker arrangement is sufficient to reproduce vertical localization information in frontal space and to realize the down-mixing and simplification of multichannel spatial surround sound.
Funding: Supported by the National Natural Science Foundation of China (50938003, 10774049) and the State Key Lab of Subtropical Building Science, South China University of Technology.
Abstract: A head-related transfer function (HRTF) model for fast, real-time synthesis of multiple virtual sound sources is proposed. A head-related impulse response (HRIR, the time-domain version of the HRTF) is first decomposed by a two-level wavelet packet and then represented by a model composed of subband filters and reconstruction filters. The coefficients of the subband filters are the zero-interpolated wavelet coefficients of the HRIR. The coefficients of the reconstruction filters can be calculated from the wavelet function. The model is simplified by applying a threshold method to reduce the number of wavelet coefficients. The calculated results indicate that for a model with 30 wavelet coefficients, the error of the reconstructed HRIR is about 1%. The result of a psychoacoustic test shows that a model with 35 wavelet coefficients is perceptually indistinguishable from the original HRIR. When multiple virtual sound sources are synthesized simultaneously, the computational cost of the proposed model is much lower than that of traditional HRTF filters.
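The decomposition and thresholding steps in this abstract can be sketched as follows. The abstract does not name the wavelet, so the Haar wavelet is assumed here purely for brevity, and the subband and reconstruction filter structure of the actual model is not reproduced. The sketch only shows a two-level wavelet packet split into four subbands, retention of the largest coefficients, and the resulting relative energy error. All function names are illustrative.

```python
import math

S = math.sqrt(0.5)  # orthonormal Haar scaling factor


def haar_split(x):
    """One Haar analysis step: (low band, high band), each half the length."""
    low = [(x[i] + x[i + 1]) * S for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) * S for i in range(0, len(x), 2)]
    return low, high


def haar_merge(low, high):
    """Inverse of haar_split."""
    out = []
    for a, d in zip(low, high):
        out += [(a + d) * S, (a - d) * S]
    return out


def wavelet_packet_2(x):
    """Two-level wavelet packet: both bands are split again (4 subbands).

    len(x) must be a multiple of 4.
    """
    low, high = haar_split(x)
    return [*haar_split(low), *haar_split(high)]


def inverse_wavelet_packet_2(bands):
    ll, lh, hl, hh = bands
    return haar_merge(haar_merge(ll, lh), haar_merge(hl, hh))


def keep_largest(bands, n_keep):
    """Zero every coefficient except the n_keep largest in magnitude."""
    ranked = sorted(((abs(c), b, i) for b, band in enumerate(bands)
                     for i, c in enumerate(band)), reverse=True)
    kept = {(b, i) for _, b, i in ranked[:n_keep]}
    return [[c if (b, i) in kept else 0.0 for i, c in enumerate(band)]
            for b, band in enumerate(bands)]


def relative_error(x, y):
    """Energy of the reconstruction error relative to the energy of x."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / sum(a * a for a in x)
```

Because the Haar transform is orthonormal, the relative error equals the fraction of energy carried by the discarded coefficients, so it can only decrease as more coefficients are kept. The figures quoted in the abstract (30 and 35 coefficients) come from the authors' wavelet and data, not from this sketch.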
Funding: Supported by the National Natural Science Foundation of China (No. 10774049).
Abstract: A method to correct measured head-related transfer functions (HRTFs) at low frequency is proposed. Analysis of HRTFs from the spherical head model shows that below 400 Hz the magnitude of the HRTF is nearly constant and the phase is a linear function of frequency, for both the far field and the near field. Therefore, if the HRTFs above 400 Hz are accurately measured by experiment, the HRTFs at low frequency can be corrected using the theoretical model. The results of calculation and a subjective experiment show the feasibility of the proposed method.
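The two properties stated in this abstract (nearly constant magnitude and linear phase below 400 Hz) suggest a simple correction, sketched below under stated assumptions. This is not the authors' procedure: the sketch assumes the first bin at or above the cutoff is trustworthy, that its phase is already unwrapped (true for a short delay), and that the phase line passes through zero at 0 Hz. The function name and arguments are illustrative.

```python
import cmath


def correct_low_frequency(freqs, hrtf, f_cut=400.0):
    """Replace bins below f_cut with constant magnitude and linear phase.

    freqs: ascending bin frequencies in Hz.
    hrtf:  complex HRTF values at those frequencies.
    The reference is the first bin at or above f_cut (assumed accurate).
    """
    k = next(i for i, f in enumerate(freqs) if f >= f_cut)
    magnitude = abs(hrtf[k])
    # Slope in rad/Hz of the line through (0 Hz, 0 rad) and the reference bin.
    slope = cmath.phase(hrtf[k]) / freqs[k]
    corrected = list(hrtf)
    for i in range(k):
        corrected[i] = cmath.rect(magnitude, slope * freqs[i])
    return corrected
```

For a measured response with a long bulk delay, the phase at the reference bin would need unwrapping before the slope is computed. That step is omitted here.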
Funding: Supported by the National Natural Science Foundation of China (11174087).
Abstract: A binaural-loudness-model-based method for evaluating the spatial discrimination threshold of head-related transfer function (HRTF) magnitudes is proposed. As the input to the binaural loudness model, the HRTF magnitude variations caused by changes in spatial position were first calculated from a high-resolution HRTF dataset. Then, three perceptually relevant parameters, namely interaural loudness level difference, binaural loudness level spectra, and total binaural loudness level, were derived from the binaural loudness model. Finally, the spatial discrimination thresholds of HRTF magnitude were evaluated according to the just-noticeable differences of these perceptually relevant parameters. A series of psychoacoustic experiments was also conducted to obtain the spatial discrimination threshold of HRTF magnitudes. Results indicate that the threshold derived from the proposed method is consistent with that obtained from the traditional psychoacoustic experiment, validating the effectiveness of the proposed method.
Funding: Supported by the National Natural Science Foundation of China (11174087).
Abstract: A scheme for analyzing timbre in spatial sound with a binaural auditory model is proposed, and Ambisonics is taken as an example for analysis. Ambisonics is a spatial sound system based on physical sound field reconstruction. The errors and timbre colorations in the final reconstructed sound field depend on the spatial aliasing errors in both the recording and reproduction stages of Ambisonics. The binaural loudness level spectra in Ambisonics reconstruction are calculated using Moore's revised loudness model and then compared with those of a real sound source, so as to evaluate the timbre coloration in Ambisonics quantitatively. The results indicate that, in the case of ideal independent signals, the high-frequency limit and the radius of the region without perceived timbre coloration increase with the order of Ambisonics. On the other hand, in the case of recording by a microphone array, once the high-frequency limit of the microphone array exceeds that of sound field reconstruction, array recording has little influence on the binaural loudness level spectra, and thus on the timbre of the final reconstruction, up to the high-frequency limit of reproduction. Based on the binaural auditory model analysis, a scheme for optimizing the design of Ambisonics recording and reproduction is also suggested. The subjective experiment yields results consistent with those of the binaural model, which verifies the effectiveness of the model analysis.
Funding: Supported by the National Natural Science Foundation of China (11104082, 11574090) and the Fundamental Research Funds for the Central Universities of South China University of Technology (2015ZZ135).
Abstract: Near-field head-related transfer functions (HRTFs) are essential to scientific research on binaural hearing and to practical applications of virtual auditory display. High efficiency, accuracy, and repeatability are required in near-field HRTF measurement. So far, however, no published work has aimed at solving the difficulties of measuring near-field HRTFs for human subjects. In the present work, an efficient computer-controlled near-field HRTF measurement system is designed and implemented, and a fast calibration method for the system is proposed, addressing for the first time the measurement of near-field HRTFs for human subjects. The efficiency of measurement is enhanced by a comprehensive design of the acoustic, electronic, and mechanical parts of the system. The accuracy and repeatability of the measurement are greatly improved by carefully calibrating the positions of the sound source, the subject, and the binaural microphones. The system is suitable for near-field HRTF measurement at various source distances within 1.0 m, for both human subjects and artificial heads. The time cost of HRTF measurement at a single source distance over all directions has been reduced to less than 20 minutes. The measurement results indicate that the accuracy of the system satisfies practical requirements. The system is applicable to scientific research and can be used to establish an individualized near-field HRTF database for human subjects.
Funding: Supported by the National Natural Science Foundation of China (Grant 10774049) and the State Key Lab of Subtropical Building Science, South China University of Technology.
Abstract: The relationship between the cross-correlation coefficients of the feeding signals and the auditory spatial impression (ASI) created by the left, right, left surround, and right surround loudspeakers in a 5.1-channel surround sound system is investigated by psychoacoustic experiments. The results show that for reproduction by the front left-right pair or the left-right surround pair of loudspeakers, the auditory source width (ASW) can be broadened to some extent by controlling the cross-correlation coefficients of the feeding signals. The quantitative relationship between ASW and the cross-correlation coefficients is frequency dependent. For reproduction by a pair of lateral loudspeakers, however, ASW cannot be changed by controlling the cross-correlation coefficients of the feeding signals. For simultaneous reproduction by the front and surround loudspeaker pairs, and for pink noise and octave-band noise with center frequencies no higher than 1 kHz, a strong sense of listener envelopment (LEV) can be obtained by properly controlling the cross-correlation coefficients of the feeding signals. For octave-band noise with center frequencies of 2 kHz and 4 kHz, however, LEV cannot be obtained by controlling the cross-correlation coefficients of the feeding signals. Further theoretical calculations and measurements show that there is no unique relationship between the interaural cross-correlation (IACC) and the ASW in 5.1-channel surround sound reproduction, which may be due to the algorithms used for IACC calculation. Further experimental verification is needed to investigate the applicability of IACC for evaluating ASI. The present results will be helpful for practical surround sound program recording and evaluation.
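The abstract does not say how the feeding signals were generated or how the coefficient was defined. The following is a minimal sketch of one common way to control the correlation between two loudspeaker feeds, assuming a zero-lag normalized coefficient and two source signals that are zero-mean, mutually uncorrelated, and of equal energy. Both function names are illustrative.

```python
import math


def correlation_coefficient(x, y):
    """Zero-lag normalized cross-correlation of two equal-length signals.

    Assumes zero-mean signals, so no mean is subtracted.
    """
    num = sum(a * b for a, b in zip(x, y))
    den = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return num / den


def mix_to_target(x, n, rho):
    """Build a second feed y = rho*x + sqrt(1 - rho^2)*n.

    If x and n are uncorrelated with equal energy, the pair (x, y)
    has correlation coefficient rho, with -1 <= rho <= 1.
    """
    g = math.sqrt(1.0 - rho * rho)
    return [rho * a + g * b for a, b in zip(x, n)]
```

With independent noise sequences of finite length, the achieved coefficient only approximates `rho`, because the sample correlation between `x` and `n` is not exactly zero.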
Funding: This work was supported by the National Natural Science Foundation of China (No. 10374031).
Abstract: From the viewpoint of spatial sampling, spatial interpolation of head-related transfer functions (HRTFs) and signal mixing for multichannel (surround) sound are analyzed. First, it is proved that the two are mathematically equivalent: different methods of HRTF interpolation correspond to different signal mixing methods for multichannel sound. Then, a stricter derivation of signal mixing for multichannel sound and of the law of sines for stereophonic sound is given. It is pointed out that reconstructing lateral HRTFs by adjacent linear interpolation is incorrect, and the conventional equation for adjacent linear interpolation of HRTFs is revised for accurate sound image localization. Finally, it is pointed out that some methods used in the analysis of HRTFs and of multichannel sound can inform each other.
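For orientation, the two conventional formulas that this abstract relates can be sketched as follows. This shows only the textbook forms, adjacent linear interpolation between two measured directions and the stereophonic law of sines, sin θ_I = (g_L − g_R)/(g_L + g_R) · sin θ_0. The revised interpolation equation mentioned in the abstract is not given there and is not reproduced. Function names and sign conventions are assumptions.

```python
import math


def linear_interp_weights(theta, theta_a, theta_b):
    """Weights (w_a, w_b) for adjacent linear interpolation at angle theta,
    where theta_a <= theta <= theta_b are the two measured directions."""
    w_b = (theta - theta_a) / (theta_b - theta_a)
    return 1.0 - w_b, w_b


def interpolate_hrtf(h_a, h_b, w_a, w_b):
    """Weighted sum of two HRTFs, bin by bin (same form as mixing one signal
    into two adjacent loudspeakers with gains w_a and w_b)."""
    return [w_a * a + w_b * b for a, b in zip(h_a, h_b)]


def sine_law_angle(g_left, g_right, theta0_deg):
    """Virtual source azimuth in degrees (positive toward the left) predicted
    by the law of sines for loudspeakers at +/- theta0_deg."""
    ratio = (g_left - g_right) / (g_left + g_right)
    return math.degrees(math.asin(ratio * math.sin(math.radians(theta0_deg))))
```

The parallel is visible in the code: the interpolation weights play the same role as loudspeaker gains, which is the equivalence the abstract states.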