Comdes

Shift-Reduce Parsing

A class of bottom-up parsing algorithms that constructs the parse tree by "shifting" input tokens onto a stack and "reducing" them based on production rules.

Core Concepts

Shift-Reduce parsing is a form of bottom-up parsing. Instead of starting with the initial symbol and deriving the target string (like LL top-down parsing), a shift-reduce parser begins with the input string and attempts to progressively condense it backward to the Start Symbol.

The parser operates using a Stack to hold grammar symbols in progress and an Input Buffer for the unread tokens.

The Four Parser Actions

At any point during the parsing sequence, the parser decides between four possible operations:

  • Shift

    Moves the next unread token from the input buffer to the top of the stack. This is effectively saying, "I haven't found a complete production match yet, let me see more tokens."

  • Reduce

    When the sequence of symbols securely situated at the top of the stack matches the Right Hand Side (RHS) of a grammar rule, the parser removes them and substitutes the corresponding Left Hand Side (LHS) non-terminal.

  • Accept

    If the stack contains only the start symbol and the input sequence is completely exhausted (reaching the $ end-marker), the sentence is syntactically correct and accepted.

  • Error

    If neither a valid Shift nor a valid Reduce applies, the parser discovers a syntax error.

Action Trace Example

Using a simplified grammar for arithmetic expressions: E → E + E | id.

StackInputAction
$id + id $Shift
$ id+ id $Reduce (E → id)
$ E+ id $Shift
$ E +id $Shift
$ E + id$Reduce (E → id)
$ E + E$Reduce (E → E + E)
$ E$Accept
Watch the LR parser state machine execute this exact trace automatically!
Shift-Reduce Conflicts

The inherent challenge in shift-reduce parsing occurs when multiple actions are possible, known as conflicts:

  • Shift/Reduce Conflict: The parser doesn't know whether to shift a new token or reduce the current top-of-stack. This often happens in rules with varying associativity (like "if-else" dangling modifiers).
  • Reduce/Reduce Conflict: The parser encounters a situation where the stack's top sequence matches two distinct rule RHSs, so it doesn't know which LHS to reduce to.

Algorithms like LR(0), SLR(1), CLR(1), and LALR(1) are sophisticated methods constructed implicitly to handle or avoid these conflicts using lookahead state machines.