Associate Teaching Professor of Linguistics at UC San Diego
Director of UCSD's Computational Social Science Program
How to write a dissertation in LaTeX using Markdown
This was originally posted on my blog, Notes from a Linguistic Mystic, in 2013.
My particular form of procrastination is optimization. You can tell I don’t want to cut two bags of potatoes when I’m sharpening the kitchen knives. You can tell I’m uninterested in laundry when I’m cleaning the dryer barrel. And when I didn’t quite know where to go with my dissertation prospectus, well, I decided that I needed to develop a more graceful way to write it.
For the last few years, I’ve written all my large papers in XeLaTeX (chosen for its Unicode support, which makes IPA much easier). I love LaTeX, love BibTeX, and love not worrying about formatting. But writing long sections of text in LaTeX kind of sucks, because it’s rather clunky and there are no good editors for LaTeX on mobile devices.
In LaTeX, making text bold means wrapping the word or phrase in \textbf{...}, which is nine characters of markup. Section headings are ugly, and need commands of their own. Every %, & or _ must be escaped. LaTeX is powerful for doing complex things, but while writing prose, it just gets in the way.
Why Markdown?
I decided that I’d rather write in Markdown. Markdown is an easy syntax for writing, where you can define section headings as easily as:
# This is a section heading
## Subsection Heading
### Subsubsection heading
Bold, italic, and bold-italic are as easy as:
**bold**, *italic*, ***bold italic***
Most importantly, it’s designed to be quick to use and type using available symbols. So, in short, writing Markdown doesn’t suck, but I wanted to still use the best of LaTeX, for things like dynamic numbering, BibTeX automatic bibliographies, and easy creation of nice tables.
So, I hacked together a solution using Pandoc, the same software I use to generate this site from Markdown.
Turning Markdown into LaTeX
First, I created two documents: one holds the LaTeX preamble (everything up until the first section heading), and the other holds the footer (the bibliography).
Then, I created a Markdown file for the meat of the paper, which later gets converted into LaTeX and stuck between the header and footer. I keep this file in my Dropbox folder and edit it to write the paper, whether on a Mac (using TextMate or MacVim), or on an iPad or iPhone (using Editorial). You can make individual chapter files and concatenate them, if you’d prefer, but I stuck to one mega-file.
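To make the header, body, footer idea concrete, here is a minimal Python sketch of the assembly step. It is not my actual build script; the function names and the numbered chapter filenames (`01-intro.md` and so on) are hypothetical, chosen only for illustration.

```python
from pathlib import Path


def concatenate_chapters(chapter_dir: Path) -> str:
    """Concatenate chapter files in filename order (e.g. 01-intro.md, 02-methods.md)."""
    chapters = sorted(chapter_dir.glob("*.md"))
    # A blank line between chapters keeps the last paragraph of one file
    # from running into the first heading of the next.
    texts = [p.read_text(encoding="utf-8").strip("\n") for p in chapters]
    return "\n\n".join(texts) + "\n"


def assemble_document(header: str, body: str, footer: str) -> str:
    """Join the preamble, the converted body, and the footer into one LaTeX source."""
    # Trim stray newlines at the seams, then separate the parts with one blank line.
    parts = [header.strip("\n"), body.strip("\n"), footer.strip("\n")]
    return "\n\n".join(parts) + "\n"
```

With one mega-file you can skip `concatenate_chapters` entirely; `assemble_document` is the only part the header-and-footer approach really needs.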
The beautiful thing about this approach is that I can write Markdown, which is readable and pleasant, 95% of the time, and then switch into LaTeX in the same file to add something fancy, such as a LaTeX citation, a reference to a labeled section, or a footnote.
I can also include LaTeX tables, throw in commands to read other tables in, and use \vspace where needed. There’s no penalty to going back and forth, and I have the power of LaTeX when needed, and the easy-pretty of Markdown when I’m just writing.
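Here is a rough sketch of what that mixing looks like, plus the Pandoc call that converts it. The sample text, the citekey, the section label, and the table path are all made up for illustration, and `\citep` assumes a natbib-style setup in the preamble. The function name `pandoc_command` is hypothetical too. The sketch leans on Pandoc's default behavior of passing raw LaTeX in Markdown through to LaTeX output.

```python
# A hypothetical piece of the body file: mostly Markdown, with raw LaTeX
# dropped in wherever Markdown has no equivalent.
SAMPLE_BODY = r"""
## Previous work

This has been studied **extensively** \citep{hypothetical2013key},
as discussed in Section \ref{sec:background}.

\vspace{1em}
\input{tables/model1.tex}
"""


def pandoc_command(source: str, target: str) -> list[str]:
    """Build the Pandoc call that turns a Markdown file into a LaTeX fragment."""
    # Without --standalone, Pandoc writes only the document body, which is
    # what you want when a custom header and footer get wrapped around it.
    return ["pandoc", "--from", "markdown", "--to", "latex",
            "--output", target, source]
```

The list returned by `pandoc_command("body.md", "body.tex")` can be handed straight to `subprocess.run(..., check=True)` on a machine where Pandoc is installed.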
This also allows me to use Stargazer, a package for the R Statistics Suite which allows you to directly output data as pretty LaTeX tables. I just have Stargazer output to a .tex file, then input that .tex file. It’s both wonderful and reproducible, because all of my figures, tables, and models are generated directly by R, so no “copy-paste” errors are possible.
How?!
Well, the joy is in the script that builds the document. When I’d like to see a final version, I run it in the terminal (or hit Cmd+Option+Control+Shift+PageDown, triggering it through HammerSpoon).
Although you’ll want to look at the script itself, which is extensively commented, basically, it does the following:
- It copies all of the text from Markdown files, and all of the analysis scripts, into a single place.
- It turns the Markdown into a LaTeX file using Pandoc.
- It cleans up the output a bit.
- It tacks a custom header and footer onto the output, which contains all my style information.
- It builds the document and bibliography in LaTeX.
- It opens the PDF copy in a PDF reader, and copies the latest PDF version to my dissertation folder.
- It builds a .tar.gz archive containing the complete text and analysis scripts, and saves it to a “backups” folder by date.
- This way, if I mess something up, I can always go back to the last version(s), and I’ve got a way to compare changes if I need to.
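The build and backup steps in the list above can be sketched in a few lines of Python. Again, this is an illustration rather than the script itself: the function names, the `dissertation-YYYY-MM-DD.tar.gz` naming, and the choice to store files under their bare names are assumptions. The LaTeX sequence shown is the usual XeLaTeX and BibTeX round trip.

```python
import tarfile
from datetime import date
from pathlib import Path


def build_commands(jobname: str) -> list[list[str]]:
    """Return the usual XeLaTeX + BibTeX sequence for a document with citations."""
    latex = ["xelatex", "-interaction=nonstopmode", f"{jobname}.tex"]
    # One LaTeX pass records the citations, BibTeX resolves them, and two
    # more passes settle the bibliography and cross-references.
    return [latex, ["bibtex", jobname], latex, latex]


def backup(sources: list[Path], backup_dir: Path, today: date) -> Path:
    """Archive the given files as <backup_dir>/dissertation-YYYY-MM-DD.tar.gz."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    archive = backup_dir / f"dissertation-{today.isoformat()}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        for path in sources:
            # arcname stores only the file name, so no absolute paths
            # end up inside the archive.
            tar.add(path, arcname=path.name)
    return archive
```

Each command from `build_commands` can be run in order with `subprocess.run(cmd, check=True)`. Naming the archive by date means a second build on the same day overwrites the first, which is fine for a daily snapshot but worth changing if you want every build kept.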
It combines the best parts of simple plaintext writing with the best parts of LaTeX, and allows me to be as productive on my phone or iPad as I can be at home (with the exception of rendering a new PDF, and using PocketBib for reading and finding citekeys). In short, it allowed me to write 72,000+ words of dissertation, and not hate my life. I’ve since moved my guide to using Praat to a similar workflow, so I can write it using Markdown too!
Most importantly, though, I’ve found a way to make writing a dissertation geekier than it already was. And that, my friends, is my real accomplishment.