# Linguistic Problems with Statistical Solutions

Will Styler
--- ### Today's Plan - What is Linguistics, and why? - The state of statistics in linguistics - Coarticulation - Complexity from complex data types - Complexity from complex questions - Why is this a problem for our field? - Why should statisticians and linguists team up more often? --- # What is Linguistics, and why? --- ### Linguistics is the study of Language - What is this thing I'm doing right now, flapping bits of meat around in my head, with you then understanding my thoughts? - How can we describe what languages are doing? - How can we understand the differences and similarities among them? - What does language tell us about cognition and culture? --- ### Linguists study languages to understand Language - Many linguists speak lots of languages, but some don't! - We're interested in the whole enterprise, and study it scientifically --- ### We break Linguistics into subfields - "How do talking and understanding speech work?" - Phonetics - "How do units of sound or gesture change when we combine them?" - Phonology - "How do we build words?" - Morphology - "How do we combine words into sentences?" - Syntax - "What does it all mean?" - Semantics and Pragmatics - "How does this less-well-known language work?" - Lg. Documentation - ... and many more! --- ### Linguistics is an increasingly experimental discipline - Some folks still work in armchairs - ... or in the homes and worlds of language experts - Theory is now often supported by recourse to quantitative data - Especially where the patterns are small, variable, or difficult to ferret out --- ### Almost every type of linguistic research has data to analyze - Text data (e.g. large corpora) - Survey data (e.g. responses, free text) - Experimental data (e.g. eye tracking, reaction time, accuracy) - Neural data (e.g. EEG, fMRI, PET, MEG) - Imaging data (e.g. video, ultrasound) - Spatial data (e.g. 
GIS info, 3D spatial movement tracking) --- ### I'm a phonetician - My focus is on understanding exactly what's happening in the mouth when we talk - As well as on how we reconstruct those gestures using the acoustic signal we can hear - ... and we're going to focus on some phonetic questions today --- # Statistics in Linguistics --- ## The State of Linguistic Statistics --- ### Most linguists take some basic statistics classes - "Statistics for Psychology Students" - Increasingly, more sophisticated classes are available - "Bayesian Methods for Linguists" - "Generalized Additive Models" --- ### There are dedicated resources for statistics for Linguists > [Baayen, R. H. (2008). Analyzing Linguistic Data: A practical introduction to statistics using R. Cambridge University Press.](https://www.cambridge.org/us/academic/subjects/languages-linguistics/grammar-and-syntax/analyzing-linguistic-data-practical-introduction-statistics-using-r) > [Winter, Bodo (2020). Statistics for Linguists: An Introduction Using R. Routledge.](https://www.routledge.com/Statistics-for-Linguists-An-Introduction-Using-R/Winter/p/book/9781138056091) --- ### We're still pretty basic - There are absolutely complex analyses being run in the field - Some specializations (e.g. neurolinguistics) require advanced models to function - Some linguists have secondary passions in statistics - Some statisticians moonlight in linguistics (to varying degrees of success) - The vast majority of linguistic work in the core subfields is still supported by more basic methods - T-tests and chi-square (decreasingly), with ANOVA and lm/glm on the rise --- ### The most recent statistical 'trend' in the field is towards Linear Mixed Effects Regression - Most experiments have some decidedly random random factors - Speaker language background differences - Differences in vocal tract sizes - Individual word differences - Usually implemented using ``lmer`` in R - ... 
but mixed models are right at the edge of many linguists' understanding - This has led to a saying... --- > "Giving Linear Mixed Models to Linguists is like giving shotguns to toddlers" --- ### ...but linguists need more and more statistical complexity - Larger and larger text corpora are allowing (and forcing) *massive* analyses - Interdisciplinary work often inherits the toolchains of related methods - New experimental methods require new technology to process their output - More nuanced questions require more nuanced examinations --- ### We're going to look at those last two - Complex data requiring complex analysis - Nuanced questions requiring nuanced analysis - We're going to examine both in the context of linguistic phonetics --- # Coarticulation in Phonetics --- ### Phonetics is the study of speech and speech perception - "How are you moving structures inside your body to produce this word?" - "How are listeners able to understand that you've produced this word?" --- ### Studying gestures - Speech can be defined as a sequence of gestures of the tongue, lips, larynx, and other speech articulators - Gestures of the tongue and mouth are the smallest units of spoken language - Gestures are likely the object of human speech perception - *Both of the claims above could cause a fistfight at a conference, but I said them* --- ### Gestures aren't cleanly separable - We write letters one after the other - ... but the lines between gestures tend to blur - Speech sounds are **not** beads on a string - We often begin moving our articulators towards the next gesture before we've finished the current one - ... 
and the last sound can often have an influence on the current one - A nice example: 'car key' --- ### This gestural overlap is called 'coarticulation' - In "car key", we change the articulation of one sound to better 'match' the next - We will often start to articulate the /l/ in words like 'bulk' before we've finished the vowel - Air starts flowing out the nose in words like 'bend' before we actually make the /n/ sound where it's supposed to - This provides useful data for perception! - **Coarticulation makes talking easier, and is useful for understanding speech** - So we want to learn more about the gestures we're making --- ### Phonetics has a big problem - We want to see exactly which gestures are happening inside your head - Your head is opaque --- ### Phonetics has done this acoustically for a long time - First by ear training, then frequency-domain analysis ('spectrographs'), now digital signal processing of audio - We've focused on finding quantifiable measures which covary with the articulatory properties under study - "This measure represents the height of the tongue in the mouth" - Articulatory measures are useful - "What exactly are the speech articulators doing inside the head?" - Imaging of tongue motion and position is ideal! --- ### ... but when we look inside the head, we find... --- # Complexity from Complex Data --- ### Ultrasound Imaging - Pulse high-frequency sound waves into the body - Measure the patterns in which they return to image internal structure - The resulting data are black-and-white image frames showing areas of high and low reflection --- ### Ultrasound Data Acquisition
--- ### Sample Speech Ultrasound file
--- ### Ultrasound in Speech - Captures the motions of the tongue in (generally) two dimensions - Ideal for tracking the *contour* of the tongue --- ### Ultrasound 'Splining' - The machine outputs a series of images (or grayscale matrices) at a fixed sampling rate - We transform images into lists of ordered points representing the tongue shape and location - This is done using undergrads or [machine learning models](https://arxiv.org/abs/1907.10210) ---
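### Splining, sketched

As a toy illustration of the image-to-points step (a synthetic frame, and a naive "brightest pixel per column" rule that real ultrasound noise would defeat; actual pipelines use annotators or trained models):

```python
import numpy as np

def extract_contour(frame):
    """Toy 'splining': for each column of a grayscale frame, take the
    brightest pixel as the tongue surface, returning an ordered array
    of (x, y) points."""
    ys = frame.argmax(axis=0)               # brightest row per column
    xs = np.arange(frame.shape[1])
    return np.column_stack([xs, ys])

# Synthetic frame: one bright, parabolic 'tongue' arch on black
frame = np.zeros((100, 50))
xs = np.arange(50)
arch = (60 - 0.05 * (xs - 25) ** 2).astype(int)
frame[arch, xs] = 1.0

contour = extract_contour(frame)
print(contour.shape)                        # one (x, y) point per column
```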
---
--- ### Technical Notes - There are some approaches which use PCA on whole-frame images to isolate meaningful components and skip this process (c.f. [Faytak et al. 2020](https://www.journal-labphon.org/article/id/6281/)) - There are many problems with normalizing position and orientation between speakers and words which are Fun --- ### This splined data gives us details about articulation - What is the average/min/max height of the tongue? - "Is the vowel in 'beet' generally higher than the vowel in 'bit'?" - What's the front-back distribution of the tongue? - "Is the vowel in 'boot' really as far back as in 'boat' for Californians?" - How do tongue contours differ between sounds? - "Do we shape the tongue differently for 'buck' and 'bulk'?" - How do tongue contours change during sounds? - "At what point does the tongue start moving towards the /l/ gesture in 'bulk'?" --- ### Getting front-back-high-low distribution is relatively easy
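For instance, with a single (synthetic, hypothetical) 50-point spline, the positional summaries fall straight out:

```python
import numpy as np

# A made-up 50-point tongue spline: x is front-to-back position,
# y is tongue height (arbitrary units)
x = np.linspace(0, 49, 50)
y = 10 + 5 * np.sin(x / 49 * np.pi)     # a simple domed tongue shape

tongue_height = y.max()                 # "how high does the tongue get?"
highest_point = x[y.argmax()]           # "where is the peak, front or back?"
mean_height = y.mean()
print(tongue_height, highest_point, mean_height)
```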
--- ### Does the tongue shape differ for 'buck' vs. 'bulk'?
--- ### Comparing Contours is difficult (for us) - Usually done using Smoothing Spline ANOVA in Linguistics - Occasionally mixed models with B-Splines - Some work with Generalized Additive Models (GAM)
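As a far simpler stand-in for those methods (simulated 'buck'/'bulk' contours and a pointwise bootstrap rather than SSANOVA; a sketch, not the field's actual practice):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: 20 'buck' and 20 'bulk' contours of 50 points each,
# with 'bulk' raised near the back of the tongue (points 35-49)
n, k = 20, 50
buck = rng.normal(10, 1, (n, k))
bulk = rng.normal(10, 1, (n, k))
bulk[:, 35:] += 3

# Pointwise bootstrap CI for the mean difference between the groups
boot = np.empty((2000, k))
for b in range(2000):
    boot[b] = (bulk[rng.integers(0, n, n)].mean(0)
               - buck[rng.integers(0, n, n)].mean(0))
lo, hi = np.percentile(boot, [2.5, 97.5], axis=0)
differs = (lo > 0) | (hi < 0)           # points where the CI excludes zero
print(differs.sum())                    # how many of the 50 points differ
```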
--- ### At what point does the tongue start moving towards the /l/ gesture in 'bulk'? - This is a place where speakers vary - We can look at the time course of the vowel+l portion of the word --- ### Some people show some change later on
--- ### Some people have massive change early on
--- ### Some people don't show change at all
--- ### Measuring these changes is very difficult (for us) - Quantifying the degree of change in a 50-point spline which changes contour and position over time - Variably, across speakers - Identifying the *onset* of the contour change in time - Identifying specific types of contour change which are most relevant - Finding 'targeted' vs. 'untargeted' change - **There isn't a well-established statistical method for doing this in our field!** --- ### "Wait... hold on..." - "People differ in the amount and timing of change...?" ---
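### One naive sketch of change detection

There's no established method here, but as a toy version of the onset problem (synthetic contours, a made-up /l/-like gesture, and an arbitrary noise-based threshold):

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated vowel+/l/: 40 frames of a 50-point contour; an /l/-like
# gesture (raising points 0-9) ramps up from frame 25 onward
frames, k, true_onset = 40, 50, 25
contours = rng.normal(10, 0.1, (frames, k))
for t in range(true_onset, frames):
    contours[t, :10] += 0.5 * (t - true_onset)

# Distance of each frame's contour from the initial contour
dist = np.sqrt(((contours - contours[0]) ** 2).mean(axis=1))

# Onset estimate: first of two consecutive frames past a threshold
# set from early frames (the pair rule guards against noise spikes)
thresh = dist[1:20].mean() + 3 * dist[1:20].std()
above = dist > thresh
onset_est = int(np.argmax(above[:-1] & above[1:]))
print(onset_est)                        # close to the true onset of 25
```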
---
--- ### "Why do people differ in their patterns of coarticulation?" --- # Complexity from Complex Questions --- ### Background: Nasal Coarticulation - /n/ is a 'nasal' sound, with airflow from the nose - This is accomplished by lowering the 'velum'
---
bend
/bɛnd/
- **...but there's more to it than the symbols show us!** - In the word "bend", we start nasal airflow before the nasal /n/, *during the vowel*
- **We use coarticulation to tell what the upcoming word will be more quickly!** --- ### We can measure nasal coarticulation by measuring airflow from the mouth and nose - This is called 'pneumotachography'
--- ### Airflow measurement gives us curves - Oral and nasal flow in mL/sec - Sampled (here) at 50 points through the vowel --- ### The word 'bed' has no nasal airflow
--- ### The word 'bend' is more complicated
--- ### The /b/ has no nasal flow
--- ### The /n/ has lots of nasal flow and little oral flow
--- ### The vowel in the middle shows coarticulation
--- ### Looking at airflow we can see coarticulation directly - Both the *amount* of flow and the *timing* of the flow --- ### Some speakers show only a bit of coarticulation
--- ### Some speakers show only a bit of coarticulation
--- ### Some speakers show moderate coarticulation
--- ### Some speakers show massive coarticulation
--- ### Some speakers show massive coarticulation
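---
### Quantifying those curves: a sketch

A minimal, synthetic illustration of the two quantities we care about — how much nasal flow, and how early it starts (made-up curves in mL/sec, and an arbitrary 10% criterion; real analyses model the whole curve):

```python
import numpy as np

# Hypothetical airflow through a vowel, 50 samples (mL/sec):
# nasal flow ramps up in the second half (anticipatory coarticulation)
t = np.linspace(0, 1, 50)                  # proportion of the vowel
oral = np.full(50, 300.0)
nasal = np.where(t < 0.5, 0.0, 400.0 * (t - 0.5))

# Degree: proportion of total flow that is nasal, averaged over the vowel
ratio = nasal / (nasal + oral)
degree = ratio.mean()

# Timing: first sample where nasal flow passes 10% of total flow
onset = t[np.argmax(ratio > 0.1)]
print(round(degree, 3), round(onset, 2))
```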
--- ### Speakers differ greatly in their *production* of coarticulation - Ranging from 'practically none' to 'it's all nasal' - Inference can be done using splined mixed models, GAMs, and more - Functional data analysis isn't common in Linguistics, but it does happen! --- ### If speakers vary in their production of coarticulation - Do they differ in their *perception* of coarticulation as well? --- ### Measuring the Perception of Coarticulation - Often done using eyetracking - "When does the participant look at the correct image on the screen?" - "Does this person use vowel nasality to choose 'send' over 'said' more quickly?" --- ### Visual World Eyetracking
--- ### Eye Tracking Data - For each trial, 1000 binary points over the course of a second, 'Are they looking at the nasal word?' - 0000000000000001111111111... - Occasionally 00000000000000011111111110000000... - Many, many trials are averaged out to create response curves - "Generally speaking, does this person make a choice earlier in this condition than that one?" --- ### Conditions - "Early Nasalization": Coarticulation begins very early in the vowel - "Late Nasalization": Coarticulation begins later in the vowel - *How early is information about the word made available to listeners?* --- ### Listeners can be compared on the basis of their use of nasality - People who use coarticulation strongly in perception will decide earlier for 'early' nasalization tokens - People who don't use coarticulation in perception will show little distinction between the conditions --- ### Listeners who use coarticulation
--- ### Listeners who use coarticulation
--- ### Listeners who largely ignore coarticulation
--- ### Listeners who largely ignore coarticulation
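---
### From binary looks to response curves: a sketch

A toy version of that trial-averaging step (simulated trials with made-up latencies; real analyses model the full time series, not just a 50% crossing):

```python
import numpy as np

rng = np.random.default_rng(2)

def look_curve(mean_latency, n_trials=100, n_samples=1000):
    """Simulate binary 'looking at the nasal word' series that flip
    from 0 to 1 at a per-trial latency, then average across trials
    into a proportion-of-looks curve."""
    looks = np.zeros((n_trials, n_samples))
    for trial, lat in enumerate(
            rng.normal(mean_latency, 50, n_trials).astype(int)):
        looks[trial, max(lat, 0):] = 1
    return looks.mean(axis=0)

early = look_curve(400)     # hypothetical early-nasalization condition
late = look_curve(550)      # hypothetical late-nasalization condition

# Compare conditions by when each curve first passes 50% looks
cross_early = int(np.argmax(early > 0.5))
cross_late = int(np.argmax(late > 0.5))
print(cross_early, cross_late)
```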
--- ### So, now we can measure perception of coarticulation - ... and production - This allowed us to ask one very large question... --- ### Is a listener's production of coarticulation related to their perception of coarticulation? - Put differently, do people who coarticulate early, listen for it early? - *Do people who talk unusually expect others to talk the same way?* - This was tested in [Beddor et al. 2018](https://muse.jhu.edu/article/712563) --- ### This is a surprisingly useful question - It gets at the heart of the gesture vs acoustics debate in speech perception - It tells us about the role of our own productions in guiding our learning of a language - It has massive implications for how languages change over time --- ### But it's really, really unpleasant to test - Correlating a functional airflow curve (with massive variation in values) with the overall trend across a large set of logistic time series from eye tracking trials - Some truly random factors we want to get rid of (e.g. variation in frequency and 'lookability' across words) - Some speaker factors we want to get rid of (e.g. variation in pre-look processing time, absolute differences in airflow volume), but some we want to study (e.g. 
variation in time-to-look by condition, variation in flow slope and time onset) - We're interested in speaker variation, but the experiment was so complex that we could only collect 42 participants - **Yikes** --- ### We needed help - Help came in the form of [Kerby Shedden](https://sph.umich.edu/faculty-profiles/shedden-kerby.html), University of Michigan Department of Statistics --- ### We ended up collapsing the airflow data using PCA - This gave us a single quantity representing the timing and degree of coarticulation which we could insert into a model of perception - The perception model was run using ``MCMCglmm`` in R, with B-Splines to model temporal variation --- ### Turns out that people who produce early coarticulation generally listen for early coarticulation
(Adapted from Beddor et al. 2018) --- ### Work is ongoing to continue investigating these issues - The production/perception link is very interesting, and uniformly hard to analyze - ... and there are a million other domains to test it in --- ### These cases illustrate the sorts of complexity which we've found ourselves wandering into - ... and analogous issues exist in *every* subfield of linguistics --- # Why is this a problem for our field? --- ### Increasingly complex data has pulled us into complex territories - We've moved from single-variable correlations into functional data - ... and in many cases, functional data which is itself captured as a time series - New methods are arriving, but our questions are generally different enough that existing statistical toolchains don't cleanly apply - Our data keep getting richer and bigger - The burden of 'proof' is rising as the data available to test it grows --- ### Increasingly complex questions require increasingly nuanced analyses - We're now increasingly studying the kinds of variability which conventional models attempt to factor out - Potentially explanatory data is seldom low-dimensional --- ### Our statistical needs have surpassed our statistical abilities - Grad-level Psych Stats has very little to say about comparing 3D meshes of tongue motion by condition - This poses a massive teaching problem! - Reviewers are generally chosen for knowledge of specific domains (e.g. coarticulation or French nasality), and have vastly variable statistical backgrounds - "Why not just use an ANOVA here?" - Keeping up with the statistical state-of-the-art is a full-time job, and it's very easy to miss things - ... 
so those of us who try to learn more about complex analyses often remain toddlers with shotguns --- ### That's why I'm here today - (That and Shuheng's gracious invitation) --- # Linguists and Statisticians have massive collaborative potential --- ### Language is uniquely rewarding as an area of research - You are quite literally always using language - Problems are often interpretable in terms of linguistic experience - It offers a diversity of data types, often in the same experiments - Text data, behavioral experiments, sensor output, imaging data, GIS, and more - Linguistic knowledge is helpful for breaking into Natural Language Processing and other language-focused data science - Everything I've talked about today has straightforward applications in speech recognition and text-to-speech --- ### Linguists are often held back by lack of knowledge of techniques - It's very possible that 'straightforward' techniques in statistics could be revolutionary in our field - Many of us feel limited by our tools more than our questions - Collaborations can be mutually rewarding, and *extremely* beneficial in our world - Cross-specialization is important --- ### Our field is just realizing this need - There is increasing discussion of hiring statisticians in departments and divisions for consulting and collaboration - ... and already, statistical savvy is a common desired trait for new hires - Statisticians who know even basic elements of language will be increasingly valued --- ### Teamwork can make the dream work - Linguistic work is often held back by relatively basic inference approaches - Increased complexity of data, and increased complexity of questions, both leave ample room for collaboration - New methods in statistics likely have testable uses in language - New questions in linguistics may require new methods in statistics - And people collaborating in this world have a very real chance to make a difference in our fields --- ### Let's talk! 
- Next time you're looking to branch out, remember that we linguists are here - ... that we've got amazing data - ... and that, at the very least, you can use your knowledge to help disarm a toddler ---
Thank you!
Questions?