This presentation is available online at:

http://savethevowels.org/talks/ucsd_talk.html



N-Gram Language Models

Will Styler


The Plan


N-grams


What is an N-gram?


How do we find N-Gram counts?


Let’s try it



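#!/usr/bin/env python

# Tokenize the Enron sentence corpus and build n-grams for N = 1 through 5.
# word_tokenize needs NLTK's Punkt tokenizer data to be installed.
from nltk import word_tokenize
from nltk.util import ngrams

# Read the whole corpus; the with-block closes the file when done.
with open('enronsent_all.txt', 'r') as es:
    text = es.read()
token = word_tokenize(text)

# ngrams() returns a lazy generator, so each of these can be iterated only once.
unigrams = ngrams(token, 1)
bigrams = ngrams(token, 2)
trigrams = ngrams(token, 3)
fourgrams = ngrams(token, 4)
fivegrams = ngrams(token, 5)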
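
The script above builds the n-gram sequences but stops short of counting them. A minimal sketch of the counting step, using only the standard library and a toy sentence in place of the Enron corpus (the sentence and the helper name `ngrams_of` are assumptions for illustration, not part of the talk):

```python
from collections import Counter

def ngrams_of(tokens, n):
    """Return every run of n adjacent tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# Toy sentence standing in for the corpus (an assumption for illustration).
tokens = "the cat sat on the mat and the cat slept".split()

# Counter tallies how often each distinct bigram occurs.
bigram_counts = Counter(ngrams_of(tokens, 2))
top_bigrams = bigram_counts.most_common(1)
```

With a real corpus, the same `Counter` could be fed the generators returned by `nltk.util.ngrams` instead of the hypothetical helper.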


Unigrams


Bigrams


Trigrams


Four-Grams


Five-Grams


Note that the frequencies of occurrence dropped as N rose
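
The drop in frequency as N rises can be seen even in a tiny example. The sketch below is illustrative only: it uses a toy sentence rather than the Enron data, and records the count of the most frequent n-gram for N = 1, 2, and 3.

```python
from collections import Counter

# Toy sentence standing in for the corpus (an assumption for illustration).
tokens = "the cat sat on the mat and the cat slept".split()

def top_count(tokens, n):
    """Count of the single most frequent n-gram of length n."""
    # zip over n staggered copies of the token list yields each n-gram once.
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(grams).most_common(1)[0][1]

top_counts = {n: top_count(tokens, n) for n in (1, 2, 3)}
```

Longer sequences are more specific, so each one recurs less often; on a full corpus the same pattern shows up at a much larger scale.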


OK, Great.


N-Grams give us more than just counts


N-Grams can give us a language model
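
One way to turn counts into a language model is the maximum-likelihood bigram estimate, P(word | prev) = count(prev, word) / count(prev). The sketch below assumes a toy sentence and a hypothetical function name `bigram_prob`; it is not taken from the talk itself.

```python
from collections import Counter

# Toy sentence standing in for the corpus (an assumption for illustration).
tokens = "the cat sat on the mat and the cat slept".split()

unigram_counts = Counter(tokens)
bigram_counts = Counter(zip(tokens, tokens[1:]))

def bigram_prob(prev, word):
    """Maximum-likelihood estimate of P(word | prev) from raw counts."""
    if unigram_counts[prev] == 0:
        return 0.0  # prev was never seen, so there is nothing to estimate from
    return bigram_counts[(prev, word)] / unigram_counts[prev]

p_cat_given_the = bigram_prob("the", "cat")
p_dog_given_the = bigram_prob("the", "dog")
```

Here "the" occurs three times and is followed by "cat" twice, so the estimate for "cat" after "the" is 2/3, while any unseen continuation gets probability zero.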


These probabilities tell us about Grammar


These probabilities tell us about the world


N-Gram models are really useful


N-Gram uses in the real world


Sociolinguistic n-gramming


… and all of this comes from counting words


N-Gram Modeling Strengths


N-Gram Modeling is relatively simple


N-Gram Modeling is easily scalable


N-Gram Modeling Weaknesses


They only work with strict juxtaposition


Very poor at handling uncommon or unattested N-Grams
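
A raw count-based model assigns zero probability to any n-gram it has never seen. One common remedy, which may or may not be the one discussed in the talk, is add-one (Laplace) smoothing: add 1 to every bigram count and add the vocabulary size to the denominator. A minimal sketch on a toy sentence (all names here are assumptions for illustration):

```python
from collections import Counter

# Toy sentence standing in for the corpus (an assumption for illustration).
tokens = "the cat sat on the mat and the cat slept".split()

vocab_size = len(set(tokens))
unigram_counts = Counter(tokens)
bigram_counts = Counter(zip(tokens, tokens[1:]))

def smoothed_bigram_prob(prev, word):
    """Add-one (Laplace) estimate of P(word | prev)."""
    return (bigram_counts[(prev, word)] + 1) / (unigram_counts[prev] + vocab_size)

p_seen = smoothed_bigram_prob("the", "cat")      # attested bigram
p_unseen = smoothed_bigram_prob("the", "slept")  # never occurs in the data
```

The unattested bigram now gets a small non-zero probability, at the cost of taking some probability mass away from the attested ones.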


N-Gram models are missing information


Conclusion


N-Grams aren’t the solution to every problem


N-Gram Models are a powerful tool for NLP


N-Grams are not the only tool we need to model language


Questions?



Vowel Formants, the Source, and the Filter

Will Styler

The Source-Filter Dichotomy is a ‘threshold concept’ in Acoustic Phonetics


The Plan


Three Ways to Visualize Sound


Waveforms


Spectral Slice (FFT)
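
A spectral slice shows how much energy a short stretch of sound has at each frequency. The sketch below computes one with a naive discrete Fourier transform, which gives the same result an FFT would, only more slowly. The sample rate, window length, and two-component test signal are assumptions chosen to keep the arithmetic simple.

```python
import cmath
import math

SAMPLE_RATE = 1000  # samples per second (assumed for this toy signal)
N = 100             # window length, so each DFT bin is 10 Hz wide

# Toy signal: a 100 Hz component plus a weaker 300 Hz component.
signal = [
    math.sin(2 * math.pi * 100 * n / SAMPLE_RATE)
    + 0.5 * math.sin(2 * math.pi * 300 * n / SAMPLE_RATE)
    for n in range(N)
]

def dft_magnitudes(samples):
    """Naive O(N^2) DFT; returns magnitudes for bins 0 up to the Nyquist bin."""
    n_samples = len(samples)
    mags = []
    for k in range(n_samples // 2 + 1):
        total = sum(
            x * cmath.exp(-2j * cmath.pi * k * n / n_samples)
            for n, x in enumerate(samples)
        )
        mags.append(abs(total))
    return mags

magnitudes = dft_magnitudes(signal)
# The bin with the largest magnitude, converted back to Hz.
peak_bin = max(range(len(magnitudes)), key=magnitudes.__getitem__)
peak_hz = peak_bin * SAMPLE_RATE / N
```

Plotting `magnitudes` against frequency would show two spikes, the taller one at 100 Hz and the shorter one at 300 Hz.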


Spectrograms


Vowel Formants


We talk about vowel formants a great deal in acoustic phonetics


We see formants in spectrograms


We label them as F1, F2, and F3


The frequencies of vowel formants are the main cue for perceiving vowels in English


But what are they, really?


The Source-Filter Model of Speech Production


Let’s talk about the vocal tract

‘Source’ and ‘Filter’


Source (The Vocal Folds)


This source signal is not so pretty


This signal carries pitch information, but not much else


Filter (The Vocal Tract)


Resonance


We all understand resonance


Resonant Cavities act like filters


The vocal tract filters the source
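
The filtering idea can be sketched in a few lines. The example below is a rough illustration, not a model of a real voice: the source is a bare impulse train, the "vocal tract" is a single two-pole resonator, and the sample rate, pitch, formant frequency, and bandwidth are all assumed values.

```python
import cmath
import math

FS = 8000        # sample rate in Hz (assumed)
F0 = 100         # pitch of the source in Hz (assumed)
FORMANT = 500    # resonance centre frequency in Hz (assumed)
BANDWIDTH = 100  # resonance bandwidth in Hz (assumed)

# Source: one impulse per cycle, a crude stand-in for vocal fold pulses.
period = FS // F0
source = [1.0 if n % period == 0 else 0.0 for n in range(10 * period)]

# Filter: two-pole resonator, y[n] = x[n] + a1*y[n-1] + a2*y[n-2].
r = math.exp(-math.pi * BANDWIDTH / FS)
a1 = 2 * r * math.cos(2 * math.pi * FORMANT / FS)
a2 = -r * r

output = []
y1 = y2 = 0.0
for x in source:
    y = x + a1 * y1 + a2 * y2
    output.append(y)
    y2, y1 = y1, y

def level_at(samples, freq_hz):
    """Magnitude of a single DFT component at freq_hz."""
    return abs(sum(
        s * cmath.exp(-2j * cmath.pi * freq_hz * n / FS)
        for n, s in enumerate(samples)
    ))

# Compare three harmonics before and after filtering.
source_levels = {f: level_at(source, f) for f in (100, 500, 900)}
output_levels = {f: level_at(output, f) for f in (100, 500, 900)}
```

In the source, all three harmonics are equally strong. After the resonator, the harmonic at 500 Hz stands well above the ones at 100 Hz and 900 Hz, which is the boost that shows up as a formant.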


We take something boring

(The source signal)


… and filter it into something beautiful



Different vowels are just different cavity shapes


Each cavity shape produces different resonances


Changes in tongue position mean changes in formant structure


A (creepy) demonstration


So, we have a source, and a filter


Perceiving vowels using formants


Measuring vowels using formants


But where are they?!


Harmonics are not formants!


Formants are obvious when you’re looking at sounds from a distance



A more grounded example



Formants are the ranges, not the mountains!


Formants are the areas of the spectrum where harmonics resonate


One final, crucial point…


Source and Filter are Independent


The Filter will filter any source signal


Changing Pitch doesn’t change the resonances


Changing resonances doesn’t change pitch
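
Both directions of this independence can be checked with a toy frequency-domain sketch. It assumes every source harmonic starts out equally strong and uses a made-up single-peak resonance curve; the function names and numbers are illustrative, not measurements.

```python
def harmonics(f0, max_hz=4000):
    """Source harmonics: integer multiples of the pitch f0, up to max_hz."""
    return [f0 * k for k in range(1, max_hz // f0 + 1)]

def resonance_gain(freq_hz, formant_hz, bandwidth_hz=100):
    """Toy resonance curve: gain 1.0 at the formant, falling off on either side."""
    return 1.0 / (1.0 + ((freq_hz - formant_hz) / (bandwidth_hz / 2)) ** 2) ** 0.5

def strongest_harmonic(f0, formant_hz):
    """Harmonic that comes out loudest, assuming equally strong source harmonics."""
    return max(harmonics(f0), key=lambda f: resonance_gain(f, formant_hz))

# Change the pitch, keep the resonance: the boosted region stays at 500 Hz.
low_pitch = strongest_harmonic(100, 500)
high_pitch = strongest_harmonic(125, 500)

# Change the resonance, keep the pitch: the harmonics stay put, the peak moves.
moved_formant = strongest_harmonic(100, 1000)
```

`harmonics()` never looks at the formant and `resonance_gain()` never looks at the pitch, which is the independence the slides describe.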


Voice pitch is unrelated to resonance.


In fact, there are lots of sources possible


Electrolaryngeal Speech


Esophageal Speech


The Source and the Filter are Independent


Final conclusions


Take-home points


… And vowel acoustics are really, really cool!


Questions?



Thank you!