Associate Teaching Professor of Linguistics at UC San Diego
Director of UCSD's Computational Social Science Program
A Phonetician’s Software Toolkit
I’ve been thinking a lot about the tools that allow me to do what I do, and I’m often asked by curious colleagues about what software I recommend for X, Y, or Z.
So, today, I’m going to discuss the software I use to do my work as a linguist and phonetician. All of these are tools which I use regularly, which fill a niche, and which I would be very sad without. I’m not saying that each choice is the best choice for phonetic use, but instead, that each choice is the best for me.
Below are my phonetic programs of choice, organized alphabetically by function. Programs that cost money are listed with their rough prices as of March 2015. All of them run on macOS Ventura, and this list was updated in 2023.
On Free Software in Academia
I will freely admit (sorry) that I’m biased towards free and open-source software for academic research. Using a widely available and free program to do something makes my work a) less expensive, b) less likely to end up abandoned and obsoleted by some company, and c) much more easily shared with and reproduced by other researchers.
I also refuse to teach my students to use software that they themselves can’t afford, use, or buy. A $50 “student license” for a mega-program is great, but if I’ve given students skills that aren’t useful unless they can buy a $1000+ “private license” every few years once they leave school, I’ve given my students little of use.
So, although there’s a role for non-free software, and I do pay for many great programs for general computing (without begrudging the authors), I tend to favor free, and you’ll see that with only two exceptions (MATLAB and ExperimentBuilder), every application I recommend and use in my academic life is free.
Audio Conversion - XLD and iTunes - Free
XLD supports weird formats (.wv, .flac, .shn), and is really great at working with lossless file formats that Praat doesn’t read. And iTunes supports other formats, particularly things like .mp3 and .m4a, and allows you (by tweaking the “import settings”) to convert to .wav or .aiff straightforwardly.
Between these two programs, Praat, and Miro Video Converter (for video), I can convert nearly anything into nearly anything else.
Audio Recording - Audacity - Free
Praat can record. But it’s somewhat limited in its ability to record long sound files, it’s finicky in recording from multiple inputs, and it makes it shockingly easy to delete what you just recorded. So, when I’m recording data in bigger chunks, I use Audacity.
There are some other really nice bits of software for recording. Apple’s GarageBand is decent for recording as well. Adobe Audition ($150) is well respected, as is Logic Pro ($200), but both are overkill for phonetic recording, and do not come anywhere near justifying their price tags.
Bibliography and Article Organization - BibDesk - Free
This program is incredible. It allows you to keep a library of all of your references in the open (and common) BibTeX format. It allows you to tag these references with keywords, and group by those keywords. It allows you to attach PDF copies of articles, and then organizes those PDFs by author on your drive. And, most magical, it allows you to select a few references, and then with the click of a button, email them to a colleague.
It integrates (via Dropbox) with PocketBib for iOS (which is getting dated, but still good), so you have all your papers with you on the go.
If you’re using LaTeX, this is the absolute best solution, as LaTeX plugs right in. But even if you’re not, seriously consider using BibDesk to sort your bibliography, books, and articles.
2023 Update: I’ve switched to Zotero. It is not strictly better, but it is a bit nicer, and more importantly, it’s cross-platform, allowing
Editing Code - NeoVim, Emacs, or TextMate 2 - Free
I’ve used every major editor. I’ve spent time with Vim, Emacs, BBEdit, and SublimeText. I’ve landed on NeoVim for my life, but TextMate 2 is a great option if you like a bit more GUI. But the choice of a text editor is deeply personal, almost religious. Walk your own path.
A worthy alternative - VSCodium - Free
This is the Free and Open Distribution of Microsoft’s VSCode. Very powerful software, with a strong developer base.
Experiment Design and Running - Experiment Builder - $1000+
SR Research’s ‘Experiment Builder’ software is, without a doubt, the best experimental design software I’ve used. I started using it because it integrates seamlessly with the EyeLink eye-tracking systems, but I’ve stuck around because it’s incredible even for non-eye-tracking experiments, and has an accessible Python underbelly. If only it were less expensive (or free!), it’d be a great gateway to their EyeLink system, but as is, even when not eye-tracking, I’m still using EB.
A Free Alternative - PsychoPy - Free
PsychoPy is a free and open-source experimental design suite. It has a user interface for building experiments, and lets you write the experiment as Python code behind the scenes if you’d like to get fancier. It has all the features I’ve found that I need, and isn’t that complicated, particularly for easy experiments.
Paid alternatives like E-Prime ($1000) exist, and do offer some increased power (and certainly better tech support!), but ultimately, $1000 will buy a lot of tutoring in PsychoPy and Python, and will pay a lot of subjects with the cash left over.
Forced Alignment - CharSiu Aligner
This is a new Neural Aligner by Jian Zhu, a former student of mine who’s now faculty at the University of British Columbia, and it’s truly amazing. It works quite well, even with data where a direct transcript isn’t available, and is, from the perspective of somebody who comes from the early 2000s, Basically Magic. This should be your first choice.
An Older Alternative - P2FA - Free
The Penn Forced Aligner is a great tool for aligning text to recordings of American English speech. I talk a lot about it in this post. For French, I’ve used EasyAlign, which gets reasonable results, and a newer port of P2FA called “SPLAligner” by Peter Milne, which gets really great results.
It’s also worth mentioning the Montreal Forced Aligner, which I haven’t had the chance to use yet, but which is getting rave reviews.
IPA Fonts and Keyboarding - This - Free
I’ve maintained (since 2007) a post on installing IPA fonts on the Mac. So, obviously, I recommend what I recommend there. Check it out!
Machine Learning - R - Free
I’m increasingly of the mind that phoneticians are going to want to use machine learning to study speech and speech perception. Although I’m slowly shifting to Python, particularly for machine learning, R has been the tool I’ve used most for ML and statistics.
I’ll talk about R for statistical uses below, but the very same R has some capable libraries for machine learning. In my dissertation, I used machine learning (specifically SVMs and RandomForests) to model the perception of acoustical cues in humans, and to test features quickly and cheaply. To do this, I used a few libraries, or extensions to R:
- e1071 - For SVM model training, testing, tuning, and creation
- randomForest - For creating RandomForests.
- tree - For vanilla decision trees
Those packages made it easy to do machine learning using the same data I used for all my other analyses, and to output my graphs and tables all at once. 10/10, will use again.
This is, mind you, not an argument that R is better than Python for ML, but just highlighting that R has great ML packages, and if you’re already there, it’s going to be easier by default.
A worthy alternative - Scikit-Learn - Free
If you already speak Python, or want more power and flexibility, Scikit-Learn is a great option. It has lots of algorithms, lots of libraries, and good documentation. The only reasons I didn’t use this package are that I already know and love R, and that it was easier to work with my data in just one place.
PDF Reading - Skim - Free
OS X includes Preview, which is great, but Skim is just a bit nicer. It shows you a table of contents for files with that data. It lets you jump to a page by entering the page number. And it plays very nicely with LaTeX, highlighting recent changes. If you’re happy with Preview, stick with it, but if you’re not, use Skim.
Presentation Software - Reveal.js - Free
This is a very nerdy pick. Basically, it allows you to make presentations which are also websites. You can have transitions, a presenter’s display, you can advance the slides with a remote, you can build items in progressively, and you can include images, audio, and video.
The beauty is that all of your presentations are actually HTML files (with bits of Markdown, if you’d like), and that writing them is as easy as making an outline of a paper. You don’t need to worry about adjusting spacing, font size, etc., because that’s all done for you. This, particularly with Markdown, allows you to tap out the next day’s slides in an email to yourself on your phone, if you’d like.
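To give a feel for how little markup is involved, here is a minimal sketch in Python that turns an outline into a single reveal.js Markdown section. It assumes your presentation loads reveal.js's Markdown plugin and uses the default `---` slide separator. The function name `outline_to_reveal` and the sample outline are hypothetical, made up for illustration.

```python
import html


def outline_to_reveal(slides):
    """Turn [(title, [bullet, ...]), ...] into one reveal.js Markdown section."""
    chunks = []
    for title, bullets in slides:
        # Each slide is a heading followed by a plain bullet list.
        lines = [f"## {title}", ""]
        lines += [f"- {bullet}" for bullet in bullets]
        chunks.append("\n".join(lines))
    # A line containing only '---' separates slides (reveal.js default, assumed).
    markdown = "\n\n---\n\n".join(chunks)
    # Escape &, < and > so the text survives inside the <textarea>.
    return (
        "<section data-markdown>\n"
        "<textarea data-template>\n"
        f"{html.escape(markdown, quote=False)}\n"
        "</textarea>\n"
        "</section>"
    )


# Hypothetical two-slide outline.
section = outline_to_reveal([
    ("Vowels", ["Height", "Backness"]),
    ("Consonants", ["Place", "Manner"]),
])
print(section)
```

The printed block would then be pasted inside the `<div class="slides">` element of a reveal.js page. Writing the Markdown by hand works just as well; the point is only that a slide deck is a heading, some bullets, and a separator.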
You can also do fancy tricks, like posting your slides online for students, embedding YouTube videos, and styling your presentations using CSS. Students particularly loved being able to go through the slides, complete with sound and video, at home, and even on their smartphones.
The problem with it is that your slides are all websites. Unless you’re content to edit bits of HTML, CSS, and the occasional taste of JavaScript, this may not be for you. It’s also tougher to do fancy composite images (“I’m going to make a Koala pop up on top of the vowel chart, then slide off to the right!”).
It’s not for everyone, but it’s really powerful. Now that I’ve started using reveal and used it to run a full 27-lecture course, I can’t go back.
Another Great Option - Keynote - $20
For many years, I used Keynote, Apple’s PowerPoint-killer. It’s great, and it’s what I recommend to everyday folks who don’t want to mess around with code.
Speech and Signal Analysis - Praat - Free
Given that I’ve written one of the more popular free textbooks on using Praat, and a large repository of Praat scripts, my affinity for the program should shock roughly nobody.
But the fact is, it’s incredible. For easy speech manipulation, measurement, and visualization, Praat’s the best tool out there. If you’re doing phonetics, you should be using Praat, or at least be familiar enough to teach your students.
An expensive alternative - MATLAB - $500
MATLAB, a proprietary programming language, can be extended to do much of what Praat does, and MATLAB is more powerful for strict signal processing. Unfortunately, it costs $500 (no, that’s not a typo) even for non-student educational use, and even more if you’re outside of academia. This means that students won’t be able to use it after graduation, that colleagues won’t reliably have access to it, and that you will always be just a bit poorer than you otherwise would’ve been.
I’m hoping that, much like R (see below) has replaced expensive and proprietary options like SAS and SPSS for many academics, Octave or Python with specific libraries will reach feature parity for signal processing, and thus, a more powerful tool will come online for widespread use. But until that happens, there are software toolchains designed to be used only with MATLAB, so I’m stuck using MATLAB.
Statistics - R - Free
R is spectacular. It’s great for statistics, for data manipulation, for graphing, for generating tables, and even for machine learning.
In addition, because it’s more or less a programming language, the learning curve is steeper, but you can conduct an analysis in such a way that anybody who has your data and your code can reproduce it exactly in a few keystrokes.
At this point, it has surpassed (in most relevant ways) its non-free competition, and if you’re planning to do statistics (or planning to learn it), you should be using R.
Because R is a programming language, it also makes use of libraries, which add functionality. A few of these merit special mention, and all are downloaded through R:
- e1071 - This is a package for doing many kinds of machine learning tasks in R, and works really well for SVMs.
- ggplot2 - This is the package for graphing in R. It’s got a learning curve, but allows for true beauty.
- lme4 - This is my favorite package for running linear mixed-effects models (and here is a great tutorial for using them).
- praatr - PraatR is an interface to Praat within R, which allows you to use Praat commands within R for analysis. I haven’t used it much, as I think in Praat scripting, but the author and the concept are both brilliant.
- stargazer - Allows easy export of tables in R to HTML, LaTeX, plaintext. Nearly every table in my dissertation was generated directly from the data or analysis using Stargazer.
- vowels - This is strictly for phonetic data. Discussed more below.
Video Conversion - Miro Video Converter - Free
I’m often given a video file, whether from YouTube, field recordings, or otherwise, and asked to do some analysis. When that happens, I use Miro to turn it into a sane format (usually mp4), or to extract the audio (using the “Format” setting).
Vowel Plotting - The ‘vowels’ package for R - Free
Although you have to reformat the data into a very specific column ordering, then import to R, the ‘vowels’ package is great, and produces some really beautiful vowel plots. It’s better than any other approach I’ve found.
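The reformatting step can be scripted. Below is a minimal sketch in Python (standard library only) that reorders a measurements file into one fixed column order before importing it to R. The target order used here (speaker, vowel, context, F1, F2, F3, then three glide columns) is an assumption based on the package documentation, so check it against your installed version. The input column names and the function names are hypothetical.

```python
import csv
import io

# Column order assumed for the 'vowels' package; verify against its documentation.
VOWELS_COLUMNS = [
    "speaker", "vowel", "context",
    "F1", "F2", "F3",
    "F1_glide", "F2_glide", "F3_glide",
]


def reorder_for_vowels(rows, column_map):
    """Return each row as a list in VOWELS_COLUMNS order.

    rows: iterable of dicts (e.g. from csv.DictReader).
    column_map: maps a target column to the name used in your own data;
    target columns missing from the map are filled with "NA".
    """
    return [
        [row[column_map[target]] if target in column_map else "NA"
         for target in VOWELS_COLUMNS]
        for row in rows
    ]


def write_vowels_csv(rows, column_map, out_file):
    """Write a header line plus the reordered rows as CSV."""
    writer = csv.writer(out_file)
    writer.writerow(VOWELS_COLUMNS)
    writer.writerows(reorder_for_vowels(rows, column_map))


# Hypothetical measurements, e.g. exported from a Praat script.
raw = io.StringIO("f2,f1,f3,word,speaker,vowel\n1850,310,2900,beat,S01,i\n")
column_map = {
    "speaker": "speaker", "vowel": "vowel", "context": "word",
    "F1": "f1", "F2": "f2", "F3": "f3",
}
out = io.StringIO()
write_vowels_csv(csv.DictReader(raw), column_map, out)
print(out.getvalue())
```

With real data, you would swap the `io.StringIO` objects for files opened with `open(..., newline="")`, then read the result into R with `read.csv`.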
YouTube Video Downloading - yt-dlp (https://github.com/yt-dlp/yt-dlp) - Free
This is a free and easy command line utility for downloading videos from YouTube. If you wanted to download a video of Ken Stevens being irradiated for phonetics, you would just install yt-dlp and type the below at a terminal:
yt-dlp "https://www.youtube.com/watch?v=DcNMCB-Gsn8"
You can then use Miro (see above) to convert to sound, and next thing you know, you’re good to analyze. This is a fork of my previous choice, youtube-dl, but is now strictly better.
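If all you need is the audio, yt-dlp can also do the extraction itself. Here is a minimal sketch in Python that wraps the command, assuming both yt-dlp and ffmpeg are installed and on your PATH. The function names `build_audio_command` and `download_audio` are hypothetical, not part of yt-dlp.

```python
import subprocess


def build_audio_command(url, audio_format="wav"):
    """Build a yt-dlp command that downloads a video and keeps only its audio."""
    return [
        "yt-dlp",
        "--extract-audio",               # keep audio only (yt-dlp uses ffmpeg for this)
        "--audio-format", audio_format,  # convert the extracted audio to this format
        url,
    ]


def download_audio(url, audio_format="wav"):
    """Run the command; raises CalledProcessError if yt-dlp exits with an error."""
    subprocess.run(build_audio_command(url, audio_format), check=True)
```

Calling `download_audio("https://www.youtube.com/watch?v=DcNMCB-Gsn8")` should leave an audio file in the current directory, ready to open in Praat.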
Word Processing and Writing - XeLaTeX or Markdown with Pandoc - Free
I describe my complicated writing workflow in the last post, but I love LaTeX, and XeLaTeX makes it even better, allowing full Unicode support (so, effortless IPA, and more!). There’s a reason that I’ve taught LaTeX for Linguists several times. I’m passionate about it.
If I’m writing something more casual, or using my crazy workflow above, I’ll write using Markdown. Markdown is a simple way to mark formatting in text, which can then be transformed into other formats using tools like Pandoc (Free) or Marked ($14). It’s a nice way to write plaintext, and let formatting just get out of my way. If I’m not using Emacs, iA Writer Pro ($20) is nice for putting Markdown text on a page in a pleasant environment, but TextMate 2 (Free) is 80% as good for free.
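For the Pandoc step, here is a minimal sketch in Python of what the conversion can look like, assuming pandoc is installed and, for PDF output, a working XeLaTeX. The function names and the file name `paper.md` are hypothetical.

```python
import subprocess
from pathlib import Path


def build_pandoc_command(markdown_path, output_suffix=".pdf"):
    """Build a pandoc command converting a Markdown file to another format."""
    source = Path(markdown_path)
    target = source.with_suffix(output_suffix)  # e.g. paper.md -> paper.pdf
    command = ["pandoc", str(source), "-o", str(target)]
    if output_suffix == ".pdf":
        # Typeset the PDF with XeLaTeX so Unicode (and IPA) text works.
        command.append("--pdf-engine=xelatex")
    return command


def convert(markdown_path, output_suffix=".pdf"):
    """Run pandoc; raises CalledProcessError if the conversion fails."""
    subprocess.run(build_pandoc_command(markdown_path, output_suffix), check=True)
```

So `convert("paper.md")` runs `pandoc paper.md -o paper.pdf --pdf-engine=xelatex`, and `convert("paper.md", ".docx")` produces a Word file for a collaborator instead.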
However, both of these solutions are really geeky. LaTeX has a scary learning curve, and Markdown is kind of finicky, given that you need a second program to print it. Both are unquestionably worth the time to learn, but if you haven’t the time, patience, or geek-tolerance, there’s always…
An expensive alternative - Microsoft Word - $80
If I’m not in Markdown or TeX, or if I’m collaborating with somebody who’s scared of TeX and doesn’t want to use Overleaf (formerly WriteLaTeX) (Free), I’ll use Word. But I won’t be super happy about it.
So, I hope you’ve enjoyed this list of phonetic tools and software, and that somewhere, somebody out there finds something new and wonderful.