Apply operator precedence parsing algorithm for the statement $Id

21views

written 7.8 years ago by

• modified 7.8 years ago

Bottom-up parsers for a large class of context-free grammars can be easily developed using operator grammars.
Operator grammars have the property that no production right side is empty or has two adjacent nonterminal.
This property enables the implementation of efficient operator-precedence parsers.
These parser rely on the following three precedence relations:

Relation	Meaning
a <· b	a yields precedence to b
a =·b	a has the same precedence as b
a·>b	a takes precedence over b

These operator precedence relations allow to delimit the handles in the right sentential forms: $\lt \cdot $ marks the left end, $=\cdot$ appears in the interior of the handle, and $\cdot\gt$ marks the right end.
Let assume that between the symbols $ai$ and $ai+1$ there is exactly one precedence relation. Suppose that \$ is the end of the string.
Then for all terminals we can write: $ \lt \cdot b \ and \ b \cdot \gt $. If we remove all nonterminal and place the correct precedence relation: $\lt \cdot, =\cdot, \cdot\gt$ between the remaining terminals, there remain strings that can be analyzed by easily developed parser.
For example, the following operator precedence relations can be introduced for simple expressions:

	id	+	*	\$
id		$ \cdot \gt$	$ \cdot \gt$	$ \cdot \gt$
+	$ \lt \cdot $	$ \cdot \gt$	$ \lt \cdot $	$ \cdot \gt$
*	$ \lt \cdot $	$ \cdot \gt$	$ \cdot \gt$	$ \cdot \gt$
\$	$ \lt \cdot $	$ \lt \cdot $	$ \lt \cdot $	$ \cdot \gt$

Stack	Input	a	b	Action
\$	id + id \$	\$	id	Shift id
\$id	+id \$	id	+	Reduce E -> id
\$ E	+id \$	\$	+	Shift +
\$E+	id\$	+	id	Shift id
\$ E+id	\$	id	\$	Reduce E -> id
\$ E + E	\$	+	\$	Reduce E -> E+E
\$E	\$	\$	\$	Accept

Example: The input string:

$\text{id1 + id2 * id3} $

After inserting precedence relations becomes

$\text{\lt $\cdot$ id1 $\cdot$ \gt + \lt $\cdot$ id2 $\cdot$\gt * \lt $\cdot$ id3 $\cdot$ \gt}$

Having precedence relations allows to identify handles as follows:
- Scan the string from left until seeing •>
- Scan backwards the string from right to left until seeing <•
- Everything between the two relations <• and •> forms the handle.

Note that not the entire sentential form is scanned to find the handle.

Operator Precedence Parsing Algorithm

Initialize: Set ip to point to the first symbol of w \$ Repeat: Let X be the top stack symbol, and a the symbol pointed to by ip

if \$ is on the top of the stack and ip points to \$  then return
            else
      /* Let a be the top terminal on the stack, and b the symbol pointed to by ip*/
                If a< $\cdot$ b  or  a =• b  then
                    push b  onto the stack
                    advance ip  to the next input symbol
                else if  a •>bthen
                    repeat
                        pop the stack
                    until the top stack terminal is related by <•
                        to the terminal most recently popped
                else error()
            end

Making Operator Precedence Relations

The operator precedence parsers usually do not store the precedence table with the relations, rather they are implemented in a special way.
Operator precedence parsers use precedence functions that map terminal symbols to integers, and so the precedence relations between the symbols are implemented by numerical comparison.
Not every table of precedence relations has precedence functions but in practice for most grammars such functions can be designed.

Algorithm for Constructing Precedence Functions

Create functions $f_a$ for each grammar terminal a and for the end of string symbol;
Partition the symbols in groups so that $f_a$ and $g_b$ are in the same group if a =• b (there can be symbols in the same group even if they are not connected by this relation);
Create a directed graph whose nodes are in the groups, next for each symbols a and b do: place an edge from the group of $g_b$ to the group of fa if a <• b, otherwise if a •>b place an edge from the group of $f_a$ to that of $g_b$;
If the constructed graph has a cycle then no precedence functions exist. When there are no cycles collect the length of the longest paths from the groups of fa and $g_b$ respectively.

Example: Consider the above table

	id	+	*
id		$ \cdot \gt$	$ \cdot \gt$
+	$ \lt \cdot $	$ \cdot \gt$	$ \lt \cdot $
*	$ \lt \cdot $	$ \cdot \gt$	$ \cdot \gt$
\$	$ \lt \cdot $	$ \lt \cdot $	$ \lt \cdot $

Using the algorithm leads to the following graph:

From which we extract the following precedence functions:

	id	+	*	\$
f	4	2	4	0
g	5	1	3	0

Error Recovery in operator-precedence parsing:

There are two points in the parsing process at which an operator-precedence parser can discover syntactic error:

If no precedence relation holds between the terminal on top of the stack and the current input.
If a handle has been found, but there is no production with this handle as a right side.

The error checker does the following errors:

Missing operand
Missing operator
No expression between parenthesis
Illegal b on line (line containing b)
Missing d on line (line containing c)
Missing E on line (line containing b) These error diagnostic issued at handling of errors during reduction.

During handling of shift/reduce errors; the diagnostic’s issues are:

Missing operand
Unbalanced right parenthesis
Missing right parenthesis
Missing operator

Advantages:

Simplicity, easy to construct by hand

Disadvantages:

It is hard to handle tokens like the unary operators
Since the relationship between a grammar for the language being parsed and the operator precedence parser itself is tenuous, one cannot always be sure the parser accepts exactly the desired language.
Only a small class of grammars can be parsed using operator-precedence techniques.

ADD COMMENT EDIT

For a given grammar below, construct an operator precedence relation matrix, assuming *, + are binary operators and id as terminal Symbol and E as non-terminal.

E → E + E $ \ \ \ $ E → E * E $ \ \ \ $ E → id

Apply operator precedence parsing algorithm for the statement

Id - id * id