Vowel Perception is magic

LING 3100 - Will Styler


So, vowel perception is kind of my thing



Why study vowel perception?


(Yeah, I said it. Take that, Consonants)


What kind of vowels are we talking about?




Review: What is a vowel?




A vowel is voicing passing through (and resonating in) an unobstructed vocal tract!

If we change the position of the tongue, we change the resonances



What do vowels sound like?




Vowel formants


Formants alone can be enough for some perception!


F1


F2


F3


All together now!


What’s being said?


Here’s the original


Listen again!



So, vowels are basically formant patterns
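One way to hear a bare formant pattern is to replace each formant with a single pure tone and add the tones together. The sketch below does that with the standard library only. The function name `sinewave_vowel` and the formant values are assumptions for illustration (roughly /i/-like values for an adult male), and this is not necessarily how the audio in this talk was made.

```python
import math

def sinewave_vowel(formants_hz, duration_s=0.3, sample_rate=16000):
    """Sum one sinusoid per formant frequency; returns floats in [-1, 1]."""
    n_samples = round(duration_s * sample_rate)
    samples = []
    for i in range(n_samples):
        t = i / sample_rate
        value = sum(math.sin(2 * math.pi * f * t) for f in formants_hz)
        # Divide by the number of tones so the sum cannot exceed [-1, 1].
        samples.append(value / len(formants_hz))
    return samples

# Assumed, illustrative F1/F2/F3 values in Hz (roughly /i/-like).
I_LIKE = (270, 2290, 3010)
tone = sinewave_vowel(I_LIKE)
```

Passing a one-element tuple such as `(270,)` gives the "F1 only" version, so the same function covers each formant alone and all three together.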


Different American English vowels, as spoken by a male speaker


… and vowel formants map to articulation!



The IPA chart is acoustic!





So…


Language is crazy


Why is vowel perception hard?


Perceptual Gradience




Date vs. Debt


Date


Debt


?


??


???


Let’s do an experiment!


????


The first and last sounds have formants like the typical English /eɪ/ and /ɛ/ vowels


… but in the middle, we’re not really sure what’s going on
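A continuum like this can be sketched by stepping the formants evenly from one endpoint to the other. The endpoint values and the name `formant_continuum` below are assumptions for illustration, not the stimuli used in this talk, and a real "date" to "debt" continuum would also need to handle vowel length and the /eɪ/ offglide.

```python
def formant_continuum(start, end, steps):
    """Return `steps` formant tuples evenly spaced from start to end, inclusive."""
    if steps < 2:
        raise ValueError("a continuum needs at least two steps")
    continuum = []
    for k in range(steps):
        weight = k / (steps - 1)  # 0.0 at the first step, 1.0 at the last
        continuum.append(
            tuple(round(a + weight * (b - a)) for a, b in zip(start, end))
        )
    return continuum

# Assumed, illustrative (F1, F2) endpoints in Hz.
EY_LIKE = (480, 2090)
EH_LIKE = (530, 1840)
steps = formant_continuum(EY_LIKE, EH_LIKE, 6)
```

The first and last items are the clear endpoints. The items in between are the ambiguous ones, and where listeners switch from one word to the other is the interesting part.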


Language plays a major role in categorization!


Language as a perceptual factor


Spanish


English


Swedish


Speaker Variation!


Speaker Vowel Space Variation




Different speakers produce different resonances, even for the “same” vowels
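One source of this variation is vocal tract length. As a rough sketch, a neutral vocal tract can be idealized as a uniform tube closed at one end, whose resonances fall at odd multiples of c / 4L. The function name, the speed-of-sound constant, and the two tract lengths below are assumptions for illustration. Real vowels depart from this idealization as soon as the tongue moves.

```python
SPEED_OF_SOUND_CM_S = 35000  # approximate speed of sound in warm, moist air

def tube_resonances(length_cm, count=3):
    """Resonances (Hz) of a uniform tube closed at one end: (2n - 1) * c / (4 * L)."""
    return [
        (2 * n - 1) * SPEED_OF_SOUND_CM_S / (4 * length_cm)
        for n in range(1, count + 1)
    ]

longer_tract = tube_resonances(17.5)   # hypothetical longer vocal tract
shorter_tract = tube_resonances(14.0)  # hypothetical shorter vocal tract
```

Here the shorter tube has higher resonances across the board, which is one reason the "same" vowel can sit at different absolute frequencies for different speakers.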




Moment-to-moment Vowel Variation






Every person you’ve ever talked with has had different vowel formant patterns


See, I told you: Magic


How do we accomplish this magic?


Some people try to put the issue aside




… but how do we manage perceptually?

### Dealing with vowel variability!
* We stack the deck in our favor using the phonology of the language
* We use non-formant-related cues such as vowel length
* We attend to context
* We adjust to individual speakers (or vocal tracts) through Speaker Normalization
* Then, if all else fails, we pretend that we understood, and hope for the best

Dirty Phonological Tricks


Vowel Inventories are designed for perceptibility


Spanish


English


Swedish


Vowel Inventories are designed for perceptibility

Vowels are spread through the mouth


Vowel Length helps too!

Data from Rositzke 1939


Context helps!


The Role of Context


Speaker Normalization



### History of Normalization
* Differences in absolute vowel qualities were noted very early on
* Two Competing Theories in the 1940s and 1950s:
    * Peterson: We identify vowels based on their absolute formant frequencies
    * Joos: We identify vowels based on their relative formant structures
* If Joos is right, then prior context aids in normalization
* Ladefoged and Broadbent set out to test that idea in “Information conveyed by vowels” in 1957
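To make the "absolute vs. relative" contrast concrete, here is a minimal sketch using a z-score normalization in the style of Lobanov. This is a swapped-in technique chosen for illustration, not Joos's own proposal or the method Ladefoged and Broadbent tested. The function name and both speakers' formant values are hypothetical, with speaker B built as speaker A scaled up by 20%.

```python
from statistics import mean, pstdev

def zscore_normalize(tokens):
    """Re-express one speaker's formants relative to that speaker's own mean and spread.

    tokens: list of (vowel_label, f1_hz, f2_hz), all from the same speaker.
    """
    f1_values = [f1 for _, f1, _ in tokens]
    f2_values = [f2 for _, _, f2 in tokens]
    f1_mean, f1_sd = mean(f1_values), pstdev(f1_values)
    f2_mean, f2_sd = mean(f2_values), pstdev(f2_values)
    return [
        (vowel, (f1 - f1_mean) / f1_sd, (f2 - f2_mean) / f2_sd)
        for vowel, f1, f2 in tokens
    ]

# Hypothetical speakers: B has the same vowel pattern as A, shifted 20% higher.
speaker_a = [("i", 270, 2290), ("a", 730, 1090), ("u", 300, 870)]
speaker_b = [(v, f1 * 1.2, f2 * 1.2) for v, f1, f2 in speaker_a]

normalized_a = zscore_normalize(speaker_a)
normalized_b = zscore_normalize(speaker_b)
```

The absolute frequencies differ for every vowel, but the normalized values match (up to floating-point error), because the relations among each speaker's vowels are the same. Note that this only works once the listener has heard enough of the speaker to estimate the mean and spread, which is where prior context comes in.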

Information Conveyed by Vowels


1957!


They had to paint what they wanted on glass

Then feed it into an analog sound synthesizer

The results weren’t too pretty


Stimulus #4

Stimulus #5


Stimulus #6


… but it worked!


Different contexts led to different perception!


Ladefoged and Broadbent: Conclusions

“The linguistic information conveyed by a vowel is largely dependent on the relations between the frequencies of its formants and the formants of other vowels occurring in the same auditory context”


So, uh, how’s that work going?


We’ve got two main theories!


Speaker-intrinsic vowel space normalization


Speaker-extrinsic vowel space normalization



We don’t know which is more accurate!


What do we know about normalization?


What else do we know about normalization?


These finches are a major problem.





Wrapping up



Thank you!

http://savethevowels.org/talks/vowelperception.html


References

Baru, A. V. (1975). Discrimination of synthesized vowels /a/ and /i/ with varying parameters (f0, intensity, duration, # of formants) in dog. In G. Fant, & M. A. A. Tatham (Eds.), Auditory Analysis and perception of speech. New York: Academic Press.

Ciocca, V., Wong, N. K. Y., Leung, W. H. Y., & Chu, P. C. Y. (2006). Extrinsic context affects perceptual normalization of lexical tone. The Journal of the Acoustical Society of America, Vol. 119, No. 3, 1712-1726.

Joos, M. (1948). Acoustic Phonetics - Supplement to Language. Baltimore: Linguistic Society of America.

Ladefoged, P., & Broadbent, D. E. (1957). Information Conveyed by Vowels. The Journal of the Acoustical Society of America, Vol. 29, No. 1, 98-104.

Ohms, V. R., et al. (2009). Zebra finches exhibit speaker-independent phonetic perception of human speech. Proceedings of the Royal Society B: Biological Sciences.

Rositzke, H. A. (1939). Vowel-Length in General American Speech. Language, Vol. 15, No. 2, 99-109.

Verbrugge, R. R., Strange, W., Shankweiler, D. P., & Edman, T. R. (1976). What information enables a listener to map a talker's vowel space? Journal of the Acoustical Society of America, Vol. 60, No. 1, 198-212.

Whalen, D. H., & Sheffert, S. M. (1997). Normalization of Vowels by Breath Sounds. In K. Johnson, & J. W. Mullennix (Eds.), Talker Variability in Speech Processing (pp. 133-143). San Diego, CA: Academic Press Ltd.