Automatic Syntactic Parsing

Will Styler - LIGN 6

We’ve talked about trees in terms of constituency

These trees come from Phrase Structure Grammars

Linguists use constituency trees for linguistic analysis

Linguistic Syntax is done using constituency

… but there is another way

Syntax helps tell us which words ‘belong’ to other words

Syntax also marks who did what to whom

What if we could just capture which words are ‘connected’ to other words

Dependency Parsing

Dependency Trees

The Angry Squirrel stole the Mixed Nuts from Walmart

Dependency Trees mark the connections between words

Dependency Trees often mark the types of connections

These ‘relations’ have meaning that doesn’t change with order
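One way to picture this: a dependency tree can be stored as a plain list of labeled head-to-dependent arcs. The sketch below hand-annotates the squirrel sentence. The arcs, the `Arc` type, and the `children` helper are illustrative assumptions rather than the output of any parser, and the relation names follow the Stanford/Universal Dependencies style used elsewhere in these slides.

```python
from typing import NamedTuple

class Arc(NamedTuple):
    head: int   # position of the head word (0 is the artificial ROOT)
    rel: str    # relation label
    dep: int    # position of the dependent word

WORDS = ["ROOT", "The", "Angry", "Squirrel", "stole",
         "the", "Mixed", "Nuts", "from", "Walmart"]

# Hand-annotated arcs (an assumption for illustration), ordered by dependent.
ARCS = [
    Arc(3, "det", 1),    # The     <- Squirrel
    Arc(3, "amod", 2),   # Angry   <- Squirrel
    Arc(4, "nsubj", 3),  # Squirrel <- stole
    Arc(0, "root", 4),   # stole   <- ROOT
    Arc(7, "det", 5),    # the     <- Nuts
    Arc(7, "amod", 6),   # Mixed   <- Nuts
    Arc(4, "dobj", 7),   # Nuts    <- stole
    Arc(9, "case", 8),   # from    <- Walmart
    Arc(4, "nmod", 9),   # Walmart <- stole
]

def children(head):
    """Return (relation, word) pairs for every word attached to `head`."""
    return [(a.rel, WORDS[a.dep]) for a in ARCS if a.head == head]

print(children(4))  # the words that depend directly on "stole"
```

Every word has exactly one head, which is what makes the structure a tree.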

From Jurafsky and Martin ‘Speech and Language Processing’ 2018

Here’s a link to the full guidelines

Any sentence that can be constituency parsed can be dependency parsed

Let’s try it: “Will spotted Bigfoot outside the Medical Library”

Dependency Parsing can handle huge complexity

So, Dependency Parsing is a different beast

How do we do dependency parsing automatically?

We use any number of different algorithms

We’re going to focus on why it’s done

Why Dependency Parsing in NLP?

Dependency Parsing is useful

We can still see some elements of constituency

Ambiguity is very easy to resolve with dependency parses

The Angry Squirrel stole [the Mixed Nuts] [from Walmart]

“I saw the man with the telescope”

“I saw the Queen of England’s Hat”
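In a dependency parse, the two readings of the telescope sentence differ by a single arc: which word "telescope" attaches to. The sketch below is hand-annotated, and the `nmod` and `case` labels are an assumption about the label set in use.

```python
WORDS = ["ROOT", "I", "saw", "the", "man", "with", "the", "telescope"]

# dependent position -> (head position, relation); shared by both readings
SHARED = {
    1: (2, "nsubj"),  # I    <- saw
    2: (0, "root"),   # saw  <- ROOT
    3: (4, "det"),    # the  <- man
    4: (2, "dobj"),   # man  <- saw
    5: (7, "case"),   # with <- telescope
    6: (7, "det"),    # the  <- telescope
}

# Reading 1: the seeing was done with the telescope.
INSTRUMENT = {**SHARED, 7: (2, "nmod")}
# Reading 2: the man is the one holding the telescope.
POSSESSION = {**SHARED, 7: (4, "nmod")}

def differing_heads(a, b):
    """Map each word whose attachment differs to its head in each parse."""
    return {WORDS[d]: (WORDS[a[d][0]], WORDS[b[d][0]])
            for d in a if a[d] != b[d]}

print(differing_heads(INSTRUMENT, POSSESSION))
```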

Dependency Parsing works better in some other languages

In Russian, word order can be used for emphasis

Constituency-based approaches struggle here

Dependency parsing works just fine
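A small sketch of why reordering is harmless: if a parse is treated as a set of (head, relation, dependent) triples, two word orders with the same relations compare as equal. The transliterated Russian sentence ("belka ukrala orekhi", roughly "the squirrel stole the nuts") and its hand-assigned relations are illustrative assumptions, not parser output.

```python
def relations(tokens, heads):
    """Build a set of (head word, relation, dependent word) triples.

    tokens: the words in surface order
    heads:  one (head position, relation) pair per word; positions are
            1-based and 0 stands for the artificial ROOT
    """
    padded = ["ROOT"] + tokens
    return {(padded[h], rel, word) for word, (h, rel) in zip(tokens, heads)}

# belka = squirrel (nominative), ukrala = stole, orekhi = nuts (accusative)
svo = relations(["belka", "ukrala", "orekhi"],
                [(2, "nsubj"), (0, "root"), (2, "dobj")])
ovs = relations(["orekhi", "ukrala", "belka"],
                [(2, "dobj"), (0, "root"), (2, "nsubj")])

print(svo == ovs)  # the surface order changed, the relations did not
```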

Dependency Parsing is a bit simpler

Dependency Parsing can be done quickly

Dependency Parsing hands us semantic information for free!

Dependency Parsing gives us ‘arguments’

These arguments are directly useful in natural language understanding!
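As a rough illustration, the arguments of a verb can be read straight off the arcs that attach to the root. The parse of "Will spotted Bigfoot outside the Medical Library" below is hand-annotated, and the `arguments` helper is a hypothetical name introduced for this sketch.

```python
# (word, relation, head word); hand-annotated for illustration only
PARSE = [
    ("Will", "nsubj", "spotted"),
    ("spotted", "root", None),
    ("Bigfoot", "dobj", "spotted"),
    ("outside", "case", "Library"),
    ("the", "det", "Library"),
    ("Medical", "amod", "Library"),
    ("Library", "nmod", "spotted"),
]

ARGUMENT_RELATIONS = ("nsubj", "dobj", "iobj", "nmod")

def arguments(parse):
    """Collect the root predicate and the words attached to it as arguments."""
    root = next(word for word, rel, head in parse if rel == "root")
    frame = {"predicate": root}
    for word, rel, head in parse:
        if head == root and rel in ARGUMENT_RELATIONS:
            frame[rel] = word
    return frame

print(arguments(PARSE))
```

The resulting frame answers who (`nsubj`) did what (`predicate`) to whom (`dobj`), and where (`nmod`).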

How is parsing actually useful for NLP engineering?

Dependency Parsing hands us information

“Hey Siri, send my wife a romantic text”

If $ROOT == "send":
    If $DOBJ == ("text" | "message" | "sms"):
        $NUMBER = get.phonenumber($IOBJ)
        If $AMOD.DOBJ == "romantic":
            $MESSAGE = "omg ilu soooo much sweetie uwu 😍"
        If $AMOD.DOBJ == "mean":
            $MESSAGE = "I ate your leftover enchiladas and they were gooooood"
        SEND.SMS $MESSAGE TO $NUMBER
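The same rule can be sketched as runnable Python. Everything here beyond the slide's logic is an assumption: the hand-annotated parse, the `CONTACTS` lookup standing in for `get.phonenumber`, the placeholder phone number, and the `plan_sms` helper, which returns what would be sent instead of sending anything.

```python
# (word, relation, head word) for "send my wife a romantic text";
# hand-annotated, not parser output
PARSE = [
    ("send", "root", None),
    ("my", "nmod:poss", "wife"),
    ("wife", "iobj", "send"),
    ("a", "det", "text"),
    ("romantic", "amod", "text"),
    ("text", "dobj", "send"),
]

CONTACTS = {"wife": "555-0100"}  # hypothetical address book

MESSAGES = {
    "romantic": "omg ilu soooo much sweetie uwu 😍",
    "mean": "I ate your leftover enchiladas and they were gooooood",
}

def find(parse, rel, head):
    """Return the first word bearing relation `rel` to `head`, or None."""
    return next((w for w, r, h in parse if r == rel and h == head), None)

def plan_sms(parse, contacts=CONTACTS):
    """Return (number, message) if the parse is a 'send a text' command."""
    root = find(parse, "root", None)
    if root != "send":
        return None
    dobj = find(parse, "dobj", root)
    if dobj not in ("text", "message", "sms"):
        return None
    number = contacts.get(find(parse, "iobj", root))
    message = MESSAGES.get(find(parse, "amod", dobj))
    if number is None or message is None:
        return None
    return number, message

print(plan_sms(PARSE))
```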

When parsing fails: A Case Study

“Alexa, turn on the office lights and bedroom lights”

Alexa knows how to turn on lights

If $ROOT == "turn on":
    If $DOBJ == "lights":
        ACTIVATE lights location $NMOD.DOBJ

“Turn on the office lights”

“Turn on the office lights and bedroom lights” (actual)

“I didn’t find a device named ‘office lights bedroom lights’”

Alexa didn’t parse the syntax, and failed
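A sketch of what handling the coordination could look like: once the parse marks the second "lights" as a conjunct (`conj`) of the first, a rule can follow that arc and collect both locations. The parse is hand-annotated, the `nmod` label for "office" and "bedroom" follows the slide's pseudocode, and none of this reflects how Alexa is actually implemented.

```python
TOKENS = ["ROOT", "turn", "on", "the", "office", "lights",
          "and", "bedroom", "lights"]

# dependent position -> (head position, relation); hand-annotated
HEADS = {
    1: (0, "root"),  # turn    <- ROOT
    2: (1, "prt"),   # on      <- turn
    3: (5, "det"),   # the     <- lights (first)
    4: (5, "nmod"),  # office  <- lights (first)
    5: (1, "dobj"),  # lights  <- turn
    6: (5, "cc"),    # and     <- lights (first)
    7: (8, "nmod"),  # bedroom <- lights (second)
    8: (5, "conj"),  # lights (second) <- lights (first)
}

def children(head, rel):
    """Positions of the words attached to `head` with relation `rel`."""
    return [d for d, (h, r) in sorted(HEADS.items()) if h == head and r == rel]

def light_locations(follow_conj=True):
    """Return the location modifier of every 'lights' the command targets."""
    root = children(0, "root")[0]
    particles = [TOKENS[p] for p in children(root, "prt")]
    if TOKENS[root] != "turn" or "on" not in particles:
        return []
    targets = []
    for dobj in children(root, "dobj"):
        targets.append(dobj)
        if follow_conj:
            targets.extend(children(dobj, "conj"))
    return [TOKENS[m] for t in targets if TOKENS[t] == "lights"
            for m in children(t, "nmod")]

print(light_locations())                   # both rooms
print(light_locations(follow_conj=False))  # only the direct object's room
```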

So, we need to be able to parse syntax to do NLP

