Improving robustness of adaptive beamforming for hearing devices
Fixed beamforming for hearing aids is suboptimal because the sound field assumed at design time rarely matches the sound field encountered in real-world situations. Adaptive beamforming can potentially perform better, but it may instead degrade performance if the signal characteristics required by the design procedure are estimated inaccurately. This paper proposes a straightforward but sufficiently rich model of the sound field that can be used to increase the robustness of adaptive beamformer design. A method for estimating the model parameters is also presented. In reverberant acoustic conditions, the proposed method improves performance by more than 1 dB even at a signal-to-noise ratio (SNR) of −16 dB, the lowest SNR tested. Furthermore, it is shown to be robust both in a variety of acoustic conditions that do not conform to the sound field model and to inaccurate steering of the array.
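To make the trade-off between fixed and adaptive designs concrete, the following is a minimal sketch of a textbook MVDR (Capon) beamformer with diagonal loading. It is not the sound field model or estimation method proposed in this paper; diagonal loading is only a generic, well-known way of trading adaptivity for robustness. The function name `mvdr_weights`, the 4-microphone half-wavelength line array, the interferer direction, and all power levels are assumptions chosen purely for illustration.

```python
import numpy as np


def mvdr_weights(R, d, loading=0.0):
    """Return MVDR weights w = R^-1 d / (d^H R^-1 d).

    R       : (M, M) Hermitian noise (or noisy-signal) covariance estimate
    d       : (M,) steering vector towards the target
    loading : relative diagonal loading (hypothetical tuning parameter),
              scaled by the mean per-microphone power in R
    """
    M = R.shape[0]
    # Diagonal loading: larger values pull the design towards a
    # data-independent (fixed) beamformer.
    R_loaded = R + loading * (np.trace(R).real / M) * np.eye(M)
    Rinv_d = np.linalg.solve(R_loaded, d)
    return Rinv_d / (d.conj() @ Rinv_d)


# Illustrative scenario (all values assumed): 4 microphones,
# half-wavelength spacing, target at broadside, interferer at 40 degrees.
M = 4
d = np.ones(M, dtype=complex)
a = np.exp(-1j * np.pi * np.arange(M) * np.sin(np.deg2rad(40.0)))
R = 10.0 * np.outer(a, a.conj()) + 0.1 * np.eye(M)

w = mvdr_weights(R, d, loading=0.01)

# Response magnitudes |w^H x| towards the target and the interferer.
target_gain = abs(np.vdot(w, d))
interferer_gain = abs(np.vdot(w, a))
```

In this sketch the distortionless constraint keeps the target response at unity while the adaptive weights attenuate the interferer more than uniform delay-and-sum weights `d / M` would. As `loading` grows very large, the weights tend towards `d / M`, so the parameter interpolates between a fully adaptive design and a fixed one.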