My Advisor, Rebecca
My Committee
Luciana Marques, Georgia Zellou, and Story Kiser
The rest of the CU Linguistic Community
My family
Jessica
Vowels
Opening the Velopharyngeal Port during vowel production to allow nasal airflow
| ‘Pats’ [pæts] | ‘Pants’ [pæ̃nts] |
| --- | --- |
| ‘seed’ [su] | ‘braid’ [sũ] |
| --- | --- |
### Humans are OK with vowel nasality
* … Yet it’s complicated for Linguists…
“…To do my experiment I will need to find the point where nasality starts in a vowel, and I am struggling with that a bit.”
Our current methods just aren’t that accurate
### A1-P0: The Reigning Champion
* “Nasality makes the vowel formants drop in power, and introduces a nasal resonance. Compare the two.”
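As a toy illustration of the logic behind the measure (cf. Chen 1997), something like the sketch below. This is a simplification, not the actual Praat script used for measurement:

```r
# Toy sketch of A1-P0: inputs are harmonic frequencies (Hz) and amplitudes (dB)
# from a vowel spectrum, plus the measured F1 frequency.
a1_p0 <- function(harm_freq, harm_amp, f1) {
  a1 <- harm_amp[which.min(abs(harm_freq - f1))]  # amplitude of the harmonic nearest F1
  p0 <- max(harm_amp[1:2])                        # low nasal peak: the stronger of H1/H2
  a1 - p0                                         # lower A1-P0 = more nasal-looking spectrum
}
```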
“CVN words should have increasing nasality through the vowel”
(A1-P0 should drop)
Going from known-oral to known-nasal parts of vowels, nasality should always go up.
Listeners clearly can make judgements about nasality in individual vowels*, but linguists can’t.
(cf. Lahiri and Marslen-Wilson 1991, Beddor and Krakow 1999, Beddor 2013, Kingston and Macmillan 1995, Macmillan et al. 1999)
Collect Data and measure possible features
Experiment 1 - What features are statistically linked to nasality?
Experiment 2 - What features are useful for identifying nasal vowels by machine?
Experiment 3 - What features are humans using to perceive nasality?
Experiment 4 - Do computers show a similar perceptual pattern?
I recorded 12 English and 8 French speakers making words with oral and nasal(ized) vowels
For English, we recorded CVC/CVN/NVC/NVN words
For French, we recorded nasal/oral vowel minimal pairs
Find things that could encode nasality, and measure them!
All measurement was done automatically by a Praat script
Measurements happened at two timepoints per vowel
### Vowel Formant Bandwidth
A file containing data for 29 different features, measured twice per vowel per word
Annotated with information about language, speaker, phonological structure, etc.
All of this was read into the R Statistics Suite
“If a feature doesn’t meaningfully change between oral and nasal vowels, humans won’t use it.”
Let’s test which features are different in oral and nasal vowels!
lmer(Amp_F1 ~ nasality + repetition + vowel + Timepoint + (1 + nasality|speaker) + (1|Word), data = eng)
This compares the “nasal” vowels to the “oral” vowels
Random slopes for speaker allow by-speaker variation in amount of change
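A sketch of how such a model might be fit and inspected in R, assuming a data frame `eng` with the columns named in the formula (lmerTest is one common way to get p-values, not necessarily the one used here):

```r
library(lme4)
library(lmerTest)  # adds p-values to lmer summaries

model <- lmer(Amp_F1 ~ nasality + repetition + vowel + Timepoint +
                (1 + nasality|speaker) + (1|Word), data = eng)
summary(model)  # the `nasality` coefficient estimates the oral-to-nasal change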
Not all features showed a significant statistical link with nasality!
A1-P0 performed well in both languages
Duration showed major changes in both languages
Spectral Tilt was really strong in French, less so in English
Formant Bandwidth was really strong in both languages
There was some Formant Frequency effect too
P0’s Prominence is looking pretty good too.
Humans probably don’t use the features that didn’t show significant oral-to-nasal ∆Feature
We now know which features are linked with nasality across the entire dataset
… and which ones show the largest and most meaningful ∆Feature values
These tests show overall trends across several thousand words
What about vowel-by-vowel identification? How can we test that?
Uh… do a perception experiment?
Perceptual testing using humans is inefficient and expensive.
“I’m rooting for the pats in the Super Bowl”
“One should likely wear pants to a thesis defense”
“Uh, it just sounded like ‘pants’, bro.”
“Well, that one sounded more nasally.”
“Can I get my extra credit already?”
Humans hear a signal, find acoustical features, and then make judgements.
Machines can be given features, and then make judgements too.
Better accuracy with a feature means the feature is more useful.
Their decisions are easier to quantify.
They’ll tell you how they made the decision they did.
They live in my apartment!
They have no idea what “pants” are, and don’t watch football.
“Is this datapoint likely in class A, or class B?”
“Is the car driving normally, or crashing?”
“Is this language English or Chinese?”
“Is this handwritten symbol ‘1’? ‘2’? ‘3’? (…)”
“Is this word a noun, or a verb, or an adjective, or…?”
Before we discuss RandomForests, we need to talk about…
Let’s pretend to be classifiers!
“I’m looking at a bird. What kind of bird is it?”
One Approach:
By asking enough questions while looking at a training set, you’d end up with a Decision Tree.
Classification is just “following the tree”
Ask a question, then ask a different question based on the first one, then ask another…
3-500) Do that 498 more times
Let’s make a RandomForest!
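A minimal version in R might look like the sketch below (randomForest package; `duration` is a placeholder feature name, and this is not the dissertation's exact code):

```r
# Assumes `eng` with a two-level factor `nasality` and numeric feature columns.
library(randomForest)

set.seed(1)
forest <- randomForest(nasality ~ Amp_F1 + duration, data = eng, ntree = 500)
print(forest)               # out-of-bag error estimate and confusion matrix
predict(forest, head(eng))  # classify new points by majority vote of 500 trees
```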
They work well with small and large datasets
They’re transparent!
… but they’re not the most accurate algorithms out there
… and they’re a bit… odd sometimes.
So we should also use a model which is more accurate
Back to the waterfowl!
You are now receiving texts with bill-length and body-length measurements for birds
The question is “Swan, or Duck?”
Look at all the data in an n-dimensional space
Try to find a hyperplane with the best separation
This hyperplane is delineated by the “support vectors”
Classification is just seeing where the new data is relative to that line
(My approach was slightly more complex, using “kernels”, but you’ll have to read the paper for more info!)
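The equivalent SVM sketch (e1071 package; the radial kernel here is illustrative, not necessarily the kernel used in the dissertation):

```r
# Assumes the same `eng` frame; `duration` is again a placeholder feature name.
library(e1071)

fit <- svm(nasality ~ Amp_F1 + duration, data = eng, kernel = "radial")
predict(fit, head(eng))  # classification = which side of the hyperplane you land on
```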
SVMs are really accurate
They act exemplar-ish, when used as I used them.
They’re a “gold standard” for machine learning
RandomForests for transparency
SVMs for accuracy
Are any features good enough on their own to allow nasal perception?
Using both SVMs and RandomForests.
Using 10-fold cross-validation
116 models, one per feature per algorithm per language
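For any one of those models, the cross-validation loop might look roughly like this (a sketch, with column names assumed):

```r
# Hypothetical 10-fold CV for one single-feature RandomForest model.
set.seed(1)
folds <- sample(rep(1:10, length.out = nrow(eng)))

fold_acc <- sapply(1:10, function(k) {
  train <- eng[folds != k, ]
  test  <- eng[folds == k, ]
  m <- randomForest::randomForest(nasality ~ Amp_F1, data = train)
  mean(predict(m, test) == test$nasality)
})
mean(fold_acc)  # that feature's cross-validated accuracy
```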
So, none of the features are good enough on their own.
RandomForests can calculate which features were most useful for classification!
Reclassify, but shuffle the data for one feature per run
This is awesome!
Run an all-features-included RandomForest
Compare the Importance Values for each feature
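In the randomForest package, that permutation-based importance is built in. A sketch, where `eng_features` stands for a frame of just the feature columns plus `nasality`:

```r
# importance = TRUE makes each tree re-predict its out-of-bag data with one
# feature's values shuffled; the resulting accuracy drop is that feature's importance.
full <- randomForest::randomForest(nasality ~ ., data = eng_features,
                                   ntree = 500, importance = TRUE)
randomForest::importance(full, type = 1)  # MeanDecreaseAccuracy, per feature
```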
… I wonder if all these important features would perform well as a group…?
Pick six a priori feature groupings
Test them with SVMs and RandomForests
Compare accuracy in light of the number of features
Remember, we did this for both English and French
What happens if you train a model on English, then test it on French?
If they’re fundamentally similar, it won’t matter!
P0’s Prominence was not very useful at all.
Formant Bandwidth was the best feature for English, strong in French
Spectral Tilt was the most useful feature in French, less so in English
A1-P0 performed well in both languages
Duration was really useful in both languages
There was some Formant Frequency effect too
So… uh… what about humans?
“English listeners can use vowel nasality to identify ambiguous words. Let’s see which of these features is helpful!”
Create nasal vowels where each nasal feature is reduced
Create oral vowels where each nasal feature is added
Put them in contexts where part of the word is missing - ba(d) or ba(n)
If the listeners are confused more often or take more time to choose, the feature’s important!
| bad | ban |
| --- | --- |

| bomb | bob |
| --- | --- |
Reduction stimuli, where nasal features are reduced in nasal vowels
Addition stimuli, where nasal features are added to oral vowels
Linear Mixed Effects Regressions for accuracy and RT by Condition*Control
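Those regressions might look roughly like this in lme4 (the data frame `perc` and its column names are assumptions, not the dissertation's actual variable names):

```r
# Binary accuracy via a logistic mixed model; reaction time via a linear mixed model.
library(lme4)

acc_model <- glmer(correct ~ condition * control + (1 | subject) + (1 | word),
                   data = perc, family = binomial)
rt_model  <- lmer(RT ~ condition * control + (1 | subject) + (1 | word),
                  data = perc)
summary(acc_model)
summary(rt_model)
```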
Modifying all features at once, or just the formants, resulted in more confusion
Modifying all features at once, or just the formants, resulted in slower responses
Only modifying formant frequency and bandwidth had an effect on perception
None of the experimental modifications affected confusion
Modifying all features at once, or just the formants, resulted in slower responses
Only modifying formant frequency and bandwidth had an effect on perception
Only formant modification had a significant effect on perception
Formant modification caused listeners to respond more slowly
Formant modification made listeners call some oral vowels “nasal”
Formant modification wasn’t enough to make nasal vowels “oral”
(We’ll talk more about that asymmetry at the end!)
So, computers predicted F1’s bandwidth as the most useful feature…
“Let’s give the computer the same experimental task as the humans, using the same altered stimuli, and see how they compare!”
NoNVN - Trained on CVCs, CVNs, and NVCs
EnAll - Trained on all the English data
EnFrAll - Trained on all data, English and French
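A sketch of how those training sets might be assembled (the `context` column and the `fra` frame are assumed names):

```r
# Three training sets for the machine version of the perception task.
noNVN   <- subset(eng, context %in% c("CVC", "CVN", "NVC"))  # NoNVN
enAll   <- eng                                               # EnAll
enFrAll <- rbind(eng, fra)                                   # EnFrAll (assumes matching columns)
# Each is then used to train an SVM as before, which classifies the features
# measured from the same altered stimuli the human listeners heard.
```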
Humans and machines did show similar patterns
Putting aside duration, the EnAll SVM mirrored the humans very well
Perceptual testing with machine learning isn’t crazy
Humans still win.
It’s probably F1’s bandwidth
It worked best in ML, had the best statistical link, and it makes sense acoustically
Hawkins and Stevens (1985) also point in that direction
… but it’s probably not the only cue for vowel nasality
There was still something “nasal” about the vowels
We only took away universal formant changes
Vowel-specific formant changes in nasal vowels were not affected
This is in line with prior work
This makes phonological sense
If nasal vowels are orally different too, then of course removing just the nasal acoustics wouldn’t confuse listeners
Our current measurements of nasality aren’t bad!
Machines can accurately classify nasality
The acoustics of nasality per se are clearly useful
… but other aspects of the vowel articulation are important too!
Most importantly…
Carignan, C. (2014). An acoustic and articulatory examination of the oral in nasal: The oral articulations of French nasal vowels are not arbitrary. Journal of Phonetics, 46:23–33.
Carignan, C., Shosted, R., Shih, C., and Rong, P. (2011). Compensatory articulation in American English nasalized vowels. Journal of Phonetics, 39(4):668–682.
Carignan, C., Shosted, R. K., Fu, M., Liang, Z.-P., and Sutton, B. P. (2015). A real-time MRI investigation of the role of lingual and pharyngeal articulation in the production of the nasal vowel system of French. Journal of Phonetics, 50:34–51.
Chen, M. Y. (1997). Acoustic correlates of English and French nasalized vowels. The Journal of the Acoustical Society of America, 102(4):2350–2370.
Hawkins, S. and Stevens, K. N. (1985). Acoustic and perceptual correlates of the non-nasal–nasal distinction for vowels. The Journal of the Acoustical Society of America, 77(4):1560–1575.
Shosted, R., Carignan, C., and Rong, P. (2012). Managing the distinctiveness of phonemic nasal vowels: Articulatory evidence from Hindi. The Journal of the Acoustical Society of America, 131(1):455–465.