Associate Teaching Professor of Linguistics at UC San Diego
Director of UCSD's Computational Social Science Program
Building a Website using Pandoc, Markdown, and Static HTML
This was originally posted on my blog, Notes from a Linguistic Mystic in 2015. See all posts
I’m kind of a nerd about websites. I’m not content to use Dreamweaver, or just write some code. I always want my websites to be a lightweight and optimal as they can be.
When it comes to web publishing, I’ve always been a bit of a minimalist. Over time, I moved my blog (formerly at http://linguisticmystic.com) from a hosted solution, to a Wordpress install, and then eventually, to Jekyll.
I started off creating my personal site using Jekyll. This was rather a waste, considering that Jekyll’s made for blogs, and that site is really just styled text at its core, with nothing temporal, and no need of fancy tags or pagination. But I still wanted to be able to write in Markdown and style it with CSS afterwards.
So, I got the idea to write everything in markdown, style with CSS, use Pandoc to convert it to HTML, then just drop it onto the server.
Oh, simple!
Actually implementing this was one of those things that’s easy to do once working, but takes forever to get exactly right. The core of it is a single shell script (viewable here, slightly de-identified) which converts the markdown to HTML, then uploads it.
The hardest part of setting all this up was figuring out the syntax of the below command, applied to each folder:
find . -name \*.md -type f -exec pandoc -B includes/spcvhead -A includes/spcvfoot -o {}.html {} \;
In English, it finds any file which ends with “.md” (a markdown file), then executes the pandoc command, including spcvhead (containing the header info, overarching style info, etc) (B)efore the file , then spcvfoot (A)fter. Then it outputs the rest as .html files. If you want different headers/footers in different parts of the site, just run the command on each folder with a different set of includes.
This gives you folders full of .md.html files, due to a quirk of how Pandoc operates. It then goes through and changes those back to .html files with the below command:
find . -depth -name '*.md.html' -execdir bash -c 'mv -i "$1" "${1//md.html/html}"' bash {} \;
Then, it uploads the contents of the site folder (html, css, etc.) to the server using rsync, and goes through and removes the newly generated .html files (to keep the local folder tidy).
This allows me to write pages, posts, and essays using mostly markdown, with occasional dashes of HTML/CSS to style particular elements (page titles, lists, images).
It works great, and is the closest a CMS has ever come to simply getting out of my way. Hopefully this description (and the shell script that makes it work) will prove useful to others.