Role of temporal envelope and fine structure cues in speech perception: A review

Authors

  • Christian Lorenzi LPP CNRS, Université René Descartes Paris 5, Paris, France; GRAEC, CNRS, France; Dept d’Etudes Cognitives, Ecole Normale Supérieure, 29 rue d’Ulm, 75005 Paris, France
  • Brian C. J. Moore Department of Experimental Psychology, University of Cambridge, Downing Street, Cambridge CB2 3EB, UK

Abstract

Over the last few decades, a variety of evidence has been presented to support the idea that, for normal-hearing listeners, both temporal envelope (E) and temporal fine structure (TFS) cues play a role in speech identification. E cues in a few frequency bands seem to be sufficient for good speech identification in quiet, but TFS cues appear to play an important role when background sounds are present, especially for “glimpsing” speech in the temporal minima of fluctuating background sounds. There is also evidence that cochlear damage associated with mild to moderate hearing loss may severely degrade the ability to use TFS cues while preserving the ability to use E cues in speech stimuli. This is consistent with the relatively preserved ability of hearing-impaired listeners to identify speech in quiet when audibility is controlled for, and the substantial deficits observed for these listeners when speech is masked by fluctuating background noise.
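The distinction between E and TFS cues can be illustrated with a minimal sketch. It assumes the common convention in which, within a single frequency band, E is the magnitude of the analytic signal and TFS is the cosine of its phase. The abstract itself does not specify a decomposition method, so this convention, the function names (`am_tone`, `analytic_signal`, `envelope_and_tfs`), and the test-signal parameters are all illustrative assumptions, not the authors' procedure.

```python
import cmath
import math


def am_tone(n, carrier_bin, mod_bin, depth):
    """Amplitude-modulated tone, periodic over n samples (hypothetical test signal)."""
    return [
        (1.0 + depth * math.cos(2 * math.pi * mod_bin * k / n))
        * math.cos(2 * math.pi * carrier_bin * k / n)
        for k in range(n)
    ]


def analytic_signal(x):
    """Analytic signal of a real sequence, computed with a direct O(N^2) DFT."""
    n = len(x)
    spectrum = [
        sum(x[k] * cmath.exp(-2j * math.pi * f * k / n) for k in range(n))
        for f in range(n)
    ]
    for f in range(n):
        if f == 0 or (n % 2 == 0 and f == n // 2):
            gain = 1.0  # keep DC and Nyquist components unchanged
        elif f < (n + 1) // 2:
            gain = 2.0  # double the positive frequencies
        else:
            gain = 0.0  # discard the negative frequencies
        spectrum[f] *= gain
    return [
        sum(spectrum[f] * cmath.exp(2j * math.pi * f * k / n) for f in range(n)) / n
        for k in range(n)
    ]


def envelope_and_tfs(x):
    """Split one band signal into envelope (E) and fine structure (TFS)."""
    z = analytic_signal(x)
    envelope = [abs(v) for v in z]               # slow amplitude fluctuations
    tfs = [math.cos(cmath.phase(v)) for v in z]  # unit-amplitude carrier
    return envelope, tfs


if __name__ == "__main__":
    signal = am_tone(256, carrier_bin=32, mod_bin=4, depth=0.5)
    env, tfs = envelope_and_tfs(signal)
    # The product of E and TFS should reproduce the band signal.
    worst = max(abs(e * c - s) for e, c, s in zip(env, tfs, signal))
    print("largest reconstruction error:", worst)
```

In this sketch the slow 4-cycle modulation ends up in E and the 32-cycle carrier in TFS. A study of speech would first split the signal into several frequency bands and apply the decomposition to each band separately.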

Published

2007-12-15

How to Cite

Lorenzi, C., & Moore, B. C. J. (2007). Role of temporal envelope and fine structure cues in speech perception: A review. Proceedings of the International Symposium on Auditory and Audiological Research, 1, 263–272. Retrieved from https://proceedings.isaar.eu/index.php/isaarproc/article/view/2007-25

Section

2007/3. Perceptual correlates of hearing loss and auditory processing disorders