CSE411

CSE411: Introduction to Compilers

Parsing #4 (Top-Down Parsing)

Jooyong Yi (UNIST)

2023 Spring
CSE411

Parsing

                     |--------|           
a stream of tokens → | Parser | → parse tree
                     |--------|         
  • If the parsing algorithm succeeds to construct a parse tree, the obtained parse tree is used in the later phases of compilation.
  • If the parsing algorithm fails to construct a parse tree, a syntax error is reported to the user.
2023 Spring
CSE411

How to Build a Parse Tree

  1. Top-down
  2. Bottom-up
2023 Spring
CSE411

Top-Down Approach

  • Parse
center center
2023 Spring
CSE411

Top-Down Approach

  • Parse
center center
2023 Spring
CSE411

Top-Down Approach

  • Parse
center center
2023 Spring
CSE411

Top-Down Approach

  • Parse
center center
2023 Spring
CSE411

Top-Down Approach

  • Typically performed using the left-most derivation.
  • A parser using the left-most derivation is called a recursive-descent parser.
2023 Spring
CSE411

Issues of Top-Down Approach (Left-Most Derivation)

  1. ...
  2. ...
  3. ...
2023 Spring
CSE411

Issues of Top-Down Approach

  1. Left Recursion should not appear in the grammar.
2023 Spring
CSE411

Issues of Top-Down Approach

  1. Left Recursion should not appear in the grammar.
  2. The grammar should not be ambiguous.
2023 Spring
CSE411

Issues of Top-Down Approach

Ambiguous Grammar Unambiguous Grammar Non-Left-Recursive Grammar
center center center
2023 Spring
CSE411

Additional Issue of Top-Down Approach

  • Parse
Grammar Step Parse Tree Lexing Position () Remark
center 1 center
2 center
2023 Spring
CSE411

Backtracking

  • Parse
Grammar Step Parse Tree Lexing Position () Remark
center 1 center
2 center
3 center Backtracked
4 center Done
2023 Spring
CSE411

Predictive Top-Down Parsing

How to perform top-down parsing without backtracking?

2023 Spring
CSE411

Predictive Top-Down Parsing with a Lookahead

  • First idea: Use a lookahead token.
  • Assume that we can "look ahead" two more tokens.
Grammar Step Lookahead Tokens Parse Tree Lexing Position () Remark
center 1 center
2 center Done
2023 Spring
CSE411

Predictive Top-Down Parsing with a Lookahead

  • How many tokens should we use?
  • Consider parsing the following grammar

2023 Spring
CSE411

LL(1) Grammar

  • A restrictive grammar that can be parsed
    • using the left-most derivation
    • without using backtracking
    • while using one lookahead token.
2023 Spring
CSE411

LL(1) Grammar

  • The first L: scanning from left to right.
  • The second L: using the left-most derivation.
  • 1: using one lookahead token.
2023 Spring
CSE411

How LL(1) parsing works

  • LL(1) parsing uses a PDA.
Input
□□□□□□□□
  ↓
|---------|    Stack
| Control | ⇔  |__|
|  Unit   |    |__|
|---------|    |__|
2023 Spring
CSE411

How LL(1) parsing works

  • LL(1) parsing uses a PDA.
Input
□□□□□□□□
  ↓
|----------|    Stack
| Decision | ⇔  |__|
|  Table   |    |__|
|---------=|    |__|
2023 Spring
CSE411

Example LL(1) Grammar

2023 Spring
CSE411

Decision Table (LL(1) Parsing Table)

int * + ( ) $
center E T X T X
X + E
T int Y (E)
Y * T
  • The second column: non-terminals
  • The top-most row: lookahead tokens
  • Cell: production
  • $: the special symbol indicating the end of input.
2023 Spring
CSE411

How to Read the Table

int * + ( ) $
E T X T X
X + E
T int Y (E)
Y * T
  • Given a nonterminal and a lookahead token , use the production in cell .
  • If is empty, a syntax error is reported.
2023 Spring
CSE411

LL(1) Parsing Example (Step 1)

  • Parse int * int $
  • Push E$ to the stack (stack top: E, bottom: $).
    • E: the starting nonterminal.
  • The lookahead token is boldfaced.
int * + ( ) $ Stack Input Action
E T X T X E$ int * int $ T X
X + E
T int Y (E)
Y * T
2023 Spring
CSE411

LL(1) Parsing Example (Step 2)

int * + ( ) $ Stack Input Action
E T X T X E$ int * int $ T X
X + E T X$ int * int $ int Y
T int Y (E)
Y * T
2023 Spring
CSE411

LL(1) Parsing Example (Step 3)

int * + ( ) $ Stack Input Action
E T X T X E$ int * int $ T X
X + E T X$ int * int $ int Y
T int Y (E) int Y X$ int * int $ match int
Y * T
2023 Spring
CSE411

LL(1) Parsing Example (Step 4)

int * + ( ) $ Stack Input Action
E T X T X E$ int * int $ T X
X + E T X$ int * int $ int Y
T int Y (E) int Y X$ int * int $ match int
Y * T Y X$ int * int $ * T
2023 Spring
CSE411

LL(1) Parsing Example (Step 5)

int * + ( ) $ Stack Input Action
E T X T X E$ int * int $ T X
X + E T X$ int * int $ int Y
T int Y (E) int Y X$ int * int $ match int
Y * T Y X$ int * int $ * T
* T X$ int * int $ match *
2023 Spring
CSE411

LL(1) Parsing Example (Step 6)

int * + ( ) $ Stack Input Action
E T X T X E$ int * int $ T X
X + E T X$ int * int $ int Y
T int Y (E) int Y X$ int * int $ match int
Y * T Y X$ int * int $ * T
* T X$ int * int $ match *
T X$ int * int $ int Y
2023 Spring
CSE411

LL(1) Parsing Example (Step 7)

int * + ( ) $ Stack Input Action
E T X T X E$ int * int $ T X
X + E T X$ int * int $ int Y
T int Y (E) int Y X$ int * int $ match int
Y * T Y X$ int * int $ * T
* T X$ int * int $ match *
T X$ int * int $ int Y
int Y X$ int * int $ match int
2023 Spring
CSE411

LL(1) Parsing Example (Step 8)

int * + ( ) $ Stack Input Action
E T X T X E$ int * int $ T X
X + E T X$ int * int $ int Y
T int Y (E) int Y X$ int * int $ match int
Y * T Y X$ int * int $ * T
* T X$ int * int $ match *
T X$ int * int $ int Y
int Y X$ int * int $ match int
Y X$ int * int $
2023 Spring
CSE411

LL(1) Parsing Example (Step 9)

int * + ( ) $ Stack Input Action
E T X T X E$ int * int $ T X
X + E T X$ int * int $ int Y
T int Y (E) int Y X$ int * int $ match int
Y * T Y X$ int * int $ * T
* T X$ int * int $ match *
T X$ int * int $ int Y
int Y X$ int * int $ match int
Y X$ int * int $
X$ int * int $
2023 Spring
CSE411

LL(1) Parsing Example (Step 10)

int * + ( ) $ Stack Input Action
E T X T X E$ int * int $ T X
X + E T X$ int * int $ int Y
T int Y (E) int Y X$ int * int $ match int
Y * T Y X$ int * int $ * T
* T X$ int * int $ match *
T X$ int * int $ int Y
int Y X$ int * int $ match int
Y X$ int * int $
X$ int * int $
$ int * int $ Accept
2023 Spring
CSE411

How to Construct a Parsing Table?

2023 Spring
CSE411

Predictive Parsing

  • Is the above grammar LL(1)?
2023 Spring
CSE411

Predictive Parsing

  • Is the above grammar LL(1)?
  • Consider the input strings:
    1. and
      • The symbol indicates the location of the lookahead (i.e., the lookahead is ).
2023 Spring
CSE411

Left Factoring

2023 Spring
CSE411

Left Factoring

2023 Spring
CSE411

Left Factoring

2023 Spring
CSE411

Left Factoring

Compare the obtained grammar with the previous example.

2023 Spring
CSE411

Left Factoring

  • The conversion is done by factoring out a common string into a fresh non-terminal.
2023 Spring
CSE411

Left Factoring

    • : a set of nonterminals
    • : a set of terminals

2023 Spring
CSE411

Constructing a Parsing Table

  • Given a left-factored CFG, how to construct a parsing table?
2023 Spring
CSE411

Constructing a Parsing Table

  • Let's figure out how to build the following parsing table.
int * + ( ) $
center E T X T X
X + E
T int Y (E)
Y * T
2023 Spring
CSE411

Constructing a Parsing Table

  • When occurs, what lookahead terminals do we expect?
int * + ( ) $
center E T X T X
X + E
T int Y (E)
Y * T
2023 Spring
CSE411

Constructing a Parsing Table

  • When occurs, what lookahead terminals do we expect?
    • int and (
  • Hence, Table[E][int] = T X and Table[E][(] = T X
int * + ( ) $
center E T X T X
X + E
T int Y (E)
Y * T
2023 Spring
CSE411

Set

  • Given a string , is the set of all terminal symbols that begin strings derived from .
    • e.g.,
int * + ( ) $
center E T X T X
X + E
T int Y (E)
Y * T
2023 Spring
CSE411

Building a Parsing Table Using Sets

  • Assume an LL(1) Grammar ; We want to build a parsing table .
  • For each production in do:
    • for each terminal do:
2023 Spring
CSE411

Building a Parsing Table Using Sets

  • Assume an LL(1) Grammar ; We want to build a parsing table .
  • For each production in do:
    • for each terminal do:
  • e.g., For ,
int * + ( ) $
center E T X T X
X + E
T int Y (E)
Y * T
2023 Spring
CSE411

Building a Parsing Table Using Sets (1/2)

  • For ,
  • For ,
  • For ,
int * + ( ) $
center E T X T X
X + E
T int Y (E)
Y * T
2023 Spring
CSE411

Building a Parsing Table Using Sets (2/2)

  • For ,
  • For ,
int * + ( ) $
center E T X T X
X + E
T int Y (E)
Y * T
2023 Spring
CSE411

Using sets is not enough

  • Consider parsing (int) whose partial derivation steps are shown as follows:
    • What is the lookahead?
int * + ( ) $
center E T X T X
X + E
T int Y (E)
Y * T
2023 Spring
CSE411

Using sets is not enough

  • Consider parsing (int) whose partial derivation steps are shown as follows:
    • To match the lookahead, i.e., ), we should use .
      • Hence, Table[Y][)] = ε
int * + ( ) $
center E T X T X
X + E
T int Y (E)
Y * T
2023 Spring
CSE411

Set

  • Consider parsing (int) whose partial derivation steps are shown as follows:
    • To match the lookahead, i.e., ), we should use .
      • Hence, Table[Y][)] = ε
    • is followed by ).
    • )
int * + ( ) $
center E T X T X
X + E
T int Y (E)
Y * T
2023 Spring
CSE411

Set

  • for nonterminal is the set of terminals that can immediately follow .
  • e.g.,
int * + ( ) $
center E T X T X
X + E
T int Y (E)
Y * T
2023 Spring
CSE411

Building a LL(1) Parsing Table

  • Use both and sets.
  • For each production in do:
    • for each terminal do:
    • If , for each
2023 Spring
CSE411

Building a LL(1) Parsing Table

  • Use both and sets.
  • For each production in do:
    • for each terminal do:
    • If , for each
2023 Spring
CSE411

Building a LL(1) Parsing Table

int * + ( ) $
center E T X T X
X + E ε ε
T int Y (E)
Y * T ε ε ε
2023 Spring
CSE411

Computing

for is defined as follows:

  • If (i.e., is a terminal), then
  • If is a production of the given grammar , then
  • Suppose (i.e., is a nonterminal) and production (where ) exists in the given grammar .
    • If , then
2023 Spring
CSE411

Computing

FIRST FOLLOW
center E
T
X
Y
2023 Spring
CSE411

Computing

FIRST FOLLOW
center E {(, int}
T {(, int}
X {+, }
Y {*, }
2023 Spring
CSE411

Computing

for is defined as follows:

  • If is the start nonterminal, then .
  • If there exists a production , then
  • If there exists a production , then
  • If there exists a production and , then