Fishing for meaningful units in connected speech
Abstract
In many branches of spoken language analysis, including Automatic Speech Recognition (ASR), the set of smallest meaningful units of speech is taken to coincide with the set of phones or phonemes. However, fishing for phones is difficult, error-prone, and computationally expensive. We present an experiment, based on machine learning, with an alternative approach. Instead of stipulating a basic set of target units, the determination of the set is considered to be part of the learning task. Given 18 recordings of Danish talkers performing a simple lab task, our algorithm produced a set of acoustically well-defined units sufficient for identifying all the major semantic elements relevant to the task (be they parts of words, single words, or several words). As the sound encoding used was very simple (fundamental frequency (F0), Harmonicity-to-Noise-Ratio (HNR), and Intensity samples only), the computational complexity involved was far lower than for phonemic recognition. Our findings show that it is possible to automatically characterize a linguistic message without detailed spectral information or presumptions about the target units. Further, fishing for simple meaningful cues and enhancing these selectively would potentially be a more effective way of achieving intelligibility transfer, which is the end goal for speech transducing technologies.
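To make the three-value sound encoding concrete, here is a minimal sketch of how one frame of audio might be reduced to an (F0, HNR, Intensity) triple. It is not the authors' pipeline: the function name `frame_features`, the frame length, the F0 search range, and the autocorrelation-based estimates are all assumptions chosen for illustration, and intensity is reported in dB relative to a mean-square value of 1.

```python
import math

def frame_features(frame, sample_rate, f0_min=75.0, f0_max=500.0):
    """Return (f0_hz, hnr_db, intensity_db) for one frame (hypothetical helper).

    F0 comes from the best normalized correlation between the frame and a
    lagged copy of itself; HNR is derived from that correlation value r as
    10*log10(r / (1 - r)); intensity is the mean-square level in dB.
    """
    n = len(frame)
    mean = sum(frame) / n
    x = [s - mean for s in frame]              # remove DC offset
    power = sum(s * s for s in x) / n
    if power == 0.0:                           # silent frame: nothing to measure
        return 0.0, float("-inf"), float("-inf")
    intensity_db = 10.0 * math.log10(power)

    lag_min = max(1, int(sample_rate / f0_max))
    lag_max = min(int(sample_rate / f0_min), n - 2)
    best_lag, best_r = 0, 0.0
    for lag in range(lag_min, lag_max + 1):
        a, b = x[:n - lag], x[lag:]
        norm = math.sqrt(sum(s * s for s in a) * sum(s * s for s in b))
        if norm == 0.0:
            continue
        r = sum(p * q for p, q in zip(a, b)) / norm
        # The small margin keeps the shortest period when multiples tie.
        if r > best_r + 1e-6:
            best_lag, best_r = lag, r
    if best_lag == 0:                          # no periodicity found
        return 0.0, float("-inf"), intensity_db

    r = min(best_r, 1.0 - 1e-6)                # avoid division by zero
    hnr_db = 10.0 * math.log10(r / (1.0 - r))
    return sample_rate / best_lag, hnr_db, intensity_db

# Usage on a synthetic 200 Hz tone (40 ms at 8 kHz), standing in for speech.
sr = 8000
tone = [0.5 * math.sin(2 * math.pi * 200 * i / sr) for i in range(320)]
f0, hnr, intensity = frame_features(tone, sr)
```

Each frame thus yields three numbers rather than a full spectrum, which is the kind of reduction that keeps the computational cost low compared with phonemic recognition.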
License
Authors who publish with this journal agree to the following terms:
a. Authors retain copyright* and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
b. Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
c. Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).
*From the 2017 issue onward. The Danavox Jubilee Foundation owns the copyright of all articles published in the 1969-2015 issues. However, authors are still allowed to share the work with an acknowledgement of the work's authorship and initial publication in this journal.