Sine Wave Speech



Sample Sine Wave Speech

F1:

F2:

F3:

Combined:

Original:
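
The demo above replaces each formant with a single sinusoid and then sums them. A minimal sketch of that idea follows, assuming formant frequency tracks are already available. The function name, the frame and sample rates, the constant per-formant amplitudes, and the example track values are all hypothetical; real sine wave speech also follows each formant's amplitude over time.

```python
import numpy as np

def sine_wave_speech(formant_tracks_hz, amplitudes,
                     frame_rate_hz=100, sample_rate_hz=16000):
    """Sum one frequency-modulated sinusoid per formant track."""
    tracks = np.asarray(formant_tracks_hz, dtype=float)  # (n_formants, n_frames)
    amps = np.asarray(amplitudes, dtype=float)           # (n_formants,)
    n_frames = tracks.shape[1]
    n_samples = n_frames * (sample_rate_hz // frame_rate_hz)
    frame_times = np.arange(n_frames) / frame_rate_hz
    sample_times = np.arange(n_samples) / sample_rate_hz
    signal = np.zeros(n_samples)
    for track, amp in zip(tracks, amps):
        # Interpolate the frame-level track up to the audio sample rate
        # (samples after the last frame time hold the last frame's value).
        freq = np.interp(sample_times, frame_times, track)
        # Integrate frequency to get phase, so the sinusoid glides smoothly.
        phase = 2 * np.pi * np.cumsum(freq) / sample_rate_hz
        signal += amp * np.sin(phase)
    # Scale so the summed signal stays within [-1, 1].
    return signal / np.sum(np.abs(amps))

# Hypothetical F1, F2, F3 tracks (Hz) over five 10 ms frames.
demo_tracks_hz = [
    [500, 480, 450, 420, 400],
    [1800, 1900, 2000, 2100, 2200],
    [2500, 2550, 2600, 2650, 2700],
]
demo_signal = sine_wave_speech(demo_tracks_hz, amplitudes=[1.0, 0.5, 0.25])
```

Passing a single track (for example only the first row) gives the F1-only version; passing all three gives the combined version.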


Different American English vowels, as spoken by a male speaker


Date:

Debt:

?:

??:

???:


Let’s do an experiment!


The first and last sounds have formants like the typical English /eɪ/ and /ɛ/ vowels










Information Conveyed by Vowels


1957!


They had to paint what they wanted on glass

Then feed it into an analog sound synthesizer

The results weren’t too pretty

Stimulus #4:

Stimulus #5:

Stimulus #6:


Different contexts led to different perception!


Ladefoged and Broadbent: Conclusions

“The linguistic information conveyed by a vowel is largely dependent on the relations between the frequencies of its formants and the formants of other vowels occurring in the same auditory context”


So, uh, how’s that work going?


We’ve got two main theories!


Speaker-intrinsic vowel space normalization


Speaker-extrinsic vowel space normalization



We don’t know which is more accurate!


What do we know about normalization?


What else do we know about normalization?


These finches are a major problem.





This suggests that normalization may be a more general cognitive process

“OK, OK, we get it. Nothing’s real. Everybody varies. Speech study is impossible. Let’s change to syntax.”


How do we cope as researchers?


Mathematical Normalization

“Various algorithms have already been proposed for this purpose. The criterion for their degree of success might be that they should maximally reduce the variance within each group of vowels presumed to represent the same target when spoken by different speakers, while maintaining the separation between such groups of vowels presumed to represent different targets.” (Disner 1980)
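
One rough way to put a number on this criterion is a ratio of within-vowel variance to between-vowel variance, where lower is better. The sketch below is an illustration under that assumption, not Disner's actual evaluation procedure; the function name and the formant values are hypothetical.

```python
import numpy as np

def within_between_ratio(formants, vowel_labels):
    """Within-vowel variance divided by between-vowel variance (lower is better)."""
    points = np.asarray(formants, dtype=float)  # (n_tokens, n_formants)
    labels = np.asarray(vowel_labels)
    categories = np.unique(labels)
    # Spread of tokens around their own vowel's centroid, averaged over vowels.
    within = np.mean([points[labels == v].var(axis=0).sum() for v in categories])
    # Spread of the vowel centroids themselves.
    centroids = np.array([points[labels == v].mean(axis=0) for v in categories])
    between = centroids.var(axis=0).sum()
    return within / between

# Hypothetical (F1, F2) values in Hz: speaker B's formants are 1.2x speaker A's.
speaker_a = np.array([[300.0, 2300.0], [750.0, 1200.0], [350.0, 800.0]])
speaker_b = 1.2 * speaker_a
labels = ["i", "a", "u", "i", "a", "u"]

raw_ratio = within_between_ratio(np.vstack([speaker_a, speaker_b]), labels)
# Stand-in for a normalization step: undo the known scale factor for speaker B.
rescaled_ratio = within_between_ratio(np.vstack([speaker_a, speaker_b / 1.2]), labels)
```

Because the ratio is scale-free, it can be compared between raw Hz values and normalized values on a different scale.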


Lobanov (1971) Normalization
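
Lobanov normalization is commonly described as a per-speaker z-score: each formant value has that speaker's mean for the formant subtracted and is then divided by that speaker's standard deviation for the formant. A minimal sketch follows, assuming the population standard deviation and tokens stored as rows; the function name and the formant values are hypothetical.

```python
import numpy as np

def lobanov_normalize(formants_hz, speaker_ids):
    """Z-score each formant within each speaker."""
    formants = np.asarray(formants_hz, dtype=float)  # (n_tokens, n_formants)
    speakers = np.asarray(speaker_ids)
    normalized = np.empty_like(formants)
    for speaker in np.unique(speakers):
        rows = speakers == speaker
        mean = formants[rows].mean(axis=0)
        std = formants[rows].std(axis=0)  # population SD (assumption)
        normalized[rows] = (formants[rows] - mean) / std
    return normalized

# Hypothetical (F1, F2) values in Hz for three vowels from two speakers.
formants_hz = np.array([
    [300.0, 2300.0], [750.0, 1200.0], [350.0, 800.0],   # speaker A
    [360.0, 2760.0], [900.0, 1440.0], [420.0, 960.0],   # speaker B (1.2x A)
])
speaker_ids = ["A", "A", "A", "B", "B", "B"]
z_scores = lobanov_normalize(formants_hz, speaker_ids)
```

The means and standard deviations are computed over whatever vowels each speaker happens to contribute, so the result depends on every speaker supplying a comparable set of vowels.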


Danger!!


Vowel Normalization is imperfect


Wrapping up



Thank you!

http://savethevowels.org/talks/vowelperception_advanced.html


References

Baru, A. V. (1975). Discrimination of synthesized vowels /a/ and /i/ with varying parameters (f0, intensity, duration, # of formants) in dog. In G. Fant & M. A. A. Tatham (Eds.), Auditory Analysis and Perception of Speech. New York: Academic Press.

Charlton, B. D., Ellis, W. A. H., Brumm, J., Nilsson, K., & Fitch, W. T. (2012). Female koalas prefer bellows in which lower formants indicate larger males. Animal Behaviour, 84(6), 1565-1571.

Ciocca, V., Wong, N. K. Y., Leung, W. H. Y., & Chu, P. C. Y. (2006). Extrinsic context affects perceptual normalization of lexical tone. The Journal of the Acoustical Society of America, 119(3), 1712-1726.

Disner, S. F. (1980). Evaluation of vowel normalization procedures. The Journal of the Acoustical Society of America, 67(1), 253-261.

Joos, M. (1948). Acoustic Phonetics - Supplement to Language. Baltimore: Linguistic Society of America.

Ladefoged, P., & Broadbent, D. E. (1957). Information Conveyed by Vowels. The Journal of the Acoustical Society of America, 29(1), 98-104.

Lobanov, B. (1971). Classification of Russian Vowels Spoken by Different Speakers. The Journal of the Acoustical Society of America, 49(2B), 606-608.

Ohms et al. (2009). Zebra finches exhibit speaker-independent phonetic perception of human speech. Proceedings of the Royal Society B: Biological Sciences.

Rositzke, H. A. (1939). Vowel-Length in General American Speech. Language, 15(2), 99-109.

Verbrugge, R. R., Strange, W., Shankweiler, D. P., & Edman, T. R. (1976). What information enables a listener to map a talker’s vowel space? The Journal of the Acoustical Society of America, 60(1), 198-212.

Whalen, D. H., & Sheffert, S. M. (1997). Normalization of Vowels by Breath Sounds. In K. Johnson & J. W. Mullennix (Eds.), Talker Variability in Speech Processing (pp. 133-143). San Diego, CA: Academic Press Ltd.