Objective evaluations of two-stage binaural speech enhancement with Wiener filter for speech enhancement and sound localization

Authors

  • Junfeng Li, School of Information Science, Japan Advanced Institute of Science and Technology, Japan
  • Shuichi Sakamoto, Research Institute of Electrical Communication, Tohoku University, Japan
  • Satoshi Hongo, School of Information Science, Miyagi National College of Technology, Japan
  • Masato Akagi, School of Information Science, Japan Advanced Institute of Science and Technology, Japan
  • Yôiti Suzuki, Research Institute of Electrical Communication, Tohoku University, Japan

Abstract

For high-quality speech communication, we previously proposed a two-stage binaural speech enhancement with Wiener filter (TS-BASE/WF) approach, inspired by equalization-cancellation (EC) theory, that suppresses interfering signals while preserving the impression of the acoustic scene. In TS-BASE/WF, the interfering signal is first estimated by equalizing and cancelling the target signal through two equalizers; a time-variant Wiener filter is then applied to the noisy mixture signals to enhance the target signal. This paper focuses on a comprehensive experimental evaluation of the approach's speech-enhancement performance and its ability to preserve binaural benefits in a variety of acoustic conditions. Experimental results show that, in all tested spatial scenarios, TS-BASE/WF suppresses multiple non-stationary interfering signals and enhances the target signal, which is expected to improve the quality of speech communication, and that it preserves the binaural cues, which is expected to maintain the perceptual impression of the auditory scene.
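The two stages described above can be illustrated for a single time-frequency bin. The following is a minimal sketch under stated assumptions, not the authors' formulation: the function names, the default unit equalizers (a target arriving identically at both ears), the halving of the residue power (which assumes interference that is uncorrelated between the ears), and the gain floor are all hypothetical choices made for illustration. It shows the general idea that cancelling the equalized target leaves an interference estimate, and that applying one real-valued gain to both channels leaves the interaural ratio of the bin unchanged.

```python
def ec_wiener_gain(x_left, x_right, eq_left=1 + 0j, eq_right=1 + 0j, floor=0.1):
    """Illustrative per-bin gain; all parameters are hypothetical."""
    # Stage 1: equalize the target in both channels and subtract to cancel it.
    # The residue is taken as a crude estimate of the interference.
    residue = eq_left * x_left - eq_right * x_right
    # Assumption: interference is uncorrelated across ears, so the residue
    # carries roughly twice the per-channel interference power.
    noise_power = abs(residue) ** 2 / 2
    mix_power = (abs(x_left) ** 2 + abs(x_right) ** 2) / 2
    if mix_power == 0:
        return floor
    # Stage 2: Wiener-type gain, floored to limit over-suppression.
    return max(1.0 - noise_power / mix_power, floor)


def enhance_bin(x_left, x_right, **kwargs):
    """Apply the same real gain to both ears so interaural cues are kept."""
    gain = ec_wiener_gain(x_left, x_right, **kwargs)
    return gain * x_left, gain * x_right


if __name__ == "__main__":
    # Target only (identical at both ears): nothing is cancelled, gain is 1.
    print(ec_wiener_gain(1 + 1j, 1 + 1j))
    # Mixed bin: the left/right ratio is the same before and after.
    y_left, y_right = enhance_bin(2 + 1j, 1 - 1j)
    print(y_left / y_right, (2 + 1j) / (1 - 1j))
```

Because the gain is real and shared by both channels, the interaural level and phase differences of each bin are unchanged by the filtering step in this sketch; how the published method derives its equalizers and time-variant filter is described in the paper itself.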

References

Aichner R., Buchner H., Zourub M., and Kellermann W. (2007). “Multi-channel source separation preserving spatial information,” Proc. ICASSP, pp. I.5-8.

Culling, J. F., and Summerfield, Q. (1995). “Perceptual segregation of concurrent speech sounds: absence of across-frequency grouping by common interaural delay,” J. Acoust. Soc. Am. 98, 785-797.

Dorbecker M., and Ernst S. (1996). “Combination of two-channel spectral subtraction and adaptive Wiener post-filtering for noise reduction and dereverberation,” Proc. EUSIPCO, pp. 995-998.

Durlach, N. I. (1963). “Equalization and cancellation theory of binaural masking level differences,” J. Acoust. Soc. Am. 35, 1206-1218.

Jeffress, L. A. (1948). “A place theory of sound localization,” J. Comparative and Physiological Psychology 41, 35-39.

Klasen T. J., Van den Bogaert, T., Moonen, M., and Wouters, J. (2007). “Binaural noise reduction algorithms for hearing aids that preserve interaural time delay cues processing,” IEEE Trans. on Signal Processing, 55, 1579-1585.

Kollmeier B., Peissig J., and Hohmann V. (1993). “Binaural noise-reduction hearing aid scheme with real-time processing in the frequency domain,” Scand. Audiol. Suppl. 38, 28-38.

Li, J., Sakamoto, S., Hongo, S., Akagi M., and Suzuki Y. (2009). “Two-stage binaural speech enhancement with Wiener filter based on equalization-cancellation model,” Proc. IEEE Workshop on Application of Signal Processing to Audio and Acoustics (New Paltz, NY, USA), pp. 133-136.

Lotter T., Sauert B., and Vary P. (2005). “A stereo input-output superdirective beamformer for dual channel noise reduction,” Proc. Eurospeech, pp. 2285-2288.

Nakashima H., Chisaki Y., Usagawa T., and Ebata M. (2003). “Frequency domain binaural model based on interaural phase and level differences,” Acoust. Sci. and Tech. 24, 172-178.

Scalart P., and Filho J. V. (1996). “Speech enhancement based on a priori signal to noise estimation,” Proc. ICASSP, vol. 2, pp. 629-632.

Published

2009-12-15

Citation/Export

Li, J., Sakamoto, S., Hongo, S., Akagi, M., & Suzuki, Y. (2009). Objective evaluations of two-stage binaural speech enhancement with Wiener filter for speech enhancement and sound localization. Proceedings of the International Symposium on Auditory and Audiological Research, 2, 343–352. Retrieved from https://proceedings.isaar.eu/index.php/isaarproc/article/view/2009-35

Section

2009/3. Speech processing and perception under adverse conditions