Skip to main content

Push-Down Automaton

Push-Down Automaton (PDA) is an automaton that can test a context-free language. Just like how NFA can test a regular language. PDA is the device we later will use for LRLR grammar parsing.

Components of PDA

PDA is an e-NFA with an extra stack. To formulate it more precisely, it contains,

  • State set Q={qs,q1,q2,,qn}\mathcal{Q} = \{qs, q_1, q_2, \cdots, q_n\}
  • A vocabulary set V\mathcal{V}
  • A stack vocabulary set S\mathcal{S}
  • A start state {qs}\{ qs \}
  • A stack start character z0Sz_0 \in \mathcal{S}
  • A transition table T(q,v,z)={(qi,Zi),},ZiS\mathcal{T}(q, v, z) = \{(q'_i, Z'_i), \ldots\}, Z'_i \in \mathcal{S}^*

The T\mathcal{T} takes a state, an input vocabulary, and an stack symbol, and gives out a set of possible transitions, including he next state and a stack sentence. PDA will pop the top element, based on the transition table, the input and the current state, transit to a possible state and push every new element into the new stack elements.

We use the empty-stack acceptance. That is, when the stack becomes empty, we accept the input. Some other books may use accept state set like we did in NFA. They have the same level of expressiveness.

The graph to describe PDA is similar to that of e-NFA, just that on each edge, besides the input character, we also note z/Zz/Z' as the top element and the sentence to push after transition.

For example, we know that 0n1n0^n1^n is not a regular language, but we can build a PDA that accepts it.

The xx in the stack acts as count for number zero.

This is equivalent to the CFG,

S0X1X0X101λS \rightarrow 0X1 \\ X \rightarrow 0X1 | 01 | \lambda

Instant Description

For convenience, we use instant description sometimes instead of writing, T\mathcal{T}.

For FA, we use,

(q,aX)(q,X)(q, aX) \vdash (q', X')

To represent,

T(q,aX)=T(q,X)\mathcal{T}(q, aX) = \mathcal{T}(q', X)

For,

qT(q,a)q' \in \mathcal{T}(q, a)

Or for DFA,

q=T(q,a)q' = \mathcal{T}(q, a)

Where aa is a terminal symbol, XX is the sentence to test.

For PDA, we use,

(q,aX,Z)(q,X,Z)(q, aX, Z) \vdash (q', X', Z')

Where ZZ represents the elements in the stack. The left side is the stack top, the right side is the stack bottom.

Again, consider this PDA,

Instead of drawing graphs like we used to, we can use the following instant description. Assume we want it to test 00110011.

(qs,0011,z0)(qs,011,xz0)(qs,11,xxz0)(q1,1,xz0)(q1,1,z0)(qf,λ,z0)(q_s, 0011, z_0) \\ \vdash (q_s, 011, xz_0) \\ \vdash (q_s, 11, xxz_0) \\ \vdash (q_1, 1, xz_0) \\ \vdash (q_1, 1, z_0) \\ \vdash (q_f, \lambda, z_0)

CFL is the Language of PDA

Consider a CNF CFG, that every CFG can be converted into. Every grammar should be either,

ABCDdA \rightarrow BC \\ D \rightarrow d

First we define,

T(qs,λ,z0)={(qf,S)}\mathcal{T}(q_s, \lambda, z_0) = \{(q_f, S)\}

Then, for each rule like,

AB0C0BiCiA \rightarrow B_0C_0 | B_iC_i

Convert this into,

T(q1,λ,A)={(qf,B0C0),(qf,BiCi)}\mathcal{T}(q_1, \lambda, A) = \{(q_f, B_0C_0), (q_f, B_iC_i)\}

And lastly, for,

DdD \rightarrow d

We add,

T(qf,d,D)={(qf,λ)}\mathcal{T}(q_f, d, D) = \{(q_f, \lambda)\}

So that, from the start state, the PDA can perform arbitrary operation based on the rules (using leftmost derivation since stack is last in first out). Then, the PDA pops every symbol from the stack, and convert them to the corresponding, terminals- and during the step, consuming all the string.

Let's consider this rule,

S0X1X0X101λS \rightarrow 0X1 \\ X \rightarrow 0X1 | 01 | \lambda

We can convert it into CNF. Firstly, the above grammar equlas,

S0S1λS \rightarrow 0S1 | \lambda \\

We can convert into CNF with the following steps

SAS1λA0S \rightarrow AS1 | \lambda \\ A \rightarrow 0 SABλA0BS1S \rightarrow AB | \lambda \\ A \rightarrow 0 \\ B \rightarrow S1 SABλBSCC1A0S \rightarrow AB | \lambda \\ B \rightarrow SC \\ C \rightarrow 1 \\ A \rightarrow 0 \\

It's safe to ignore the empty generation, SλS \rightarrow \lambda, since it only adds empty sentence to the language, thus, the CNF of the original CFG is,

SABBSCC1A0S \rightarrow AB \\ B \rightarrow SC \\ C \rightarrow 1 \\ A \rightarrow 0 \\

Now we can construct the T\mathcal{T}.

We first have,

T(qf,λ,S)={(qf,S)}\mathcal{T}(q_f, \lambda, S) = \{(q_f, S)\}

And for the two rules,

SABBSCS \rightarrow AB \\ B \rightarrow SC \\

We have,

T(qf,λ,S)={(qf,AB)}T(qf,λ,B)={(qf,SC)}\mathcal{T}(q_f, \lambda, S) = \{(q_f, AB)\} \\ \mathcal{T}(q_f, \lambda, B) = \{(q_f, SC)\}

For,

C1A0C \rightarrow 1 \\ A \rightarrow 0 \\

We have,

T(qf,1,C)={(qf,λ)}T(qf,0,A)={(qf,λ)}\mathcal{T}(q_f, 1, C) = \{(q_f, \lambda)\} \\ \mathcal{T}(q_f, 0, A) = \{(q_f, \lambda)\}

We can draw this PDA,

For example, to test 00110011, we have the process,

(qf,0011,z0)(qf,0011,S)(qf,0011,AB)(qf,011,B)(qf,011,SC)(qf,011,BCC)(qf,11,CC)(qf,1,C)(qf,λ,λ)(q_f,0011,z_0) \\ \vdash (q_f, 0011, S) \\ \vdash (q_f, 0011, AB) \\ \vdash (q_f, 011, B) \\ \vdash (q_f, 011, SC) \\ \vdash (q_f, 011, BCC) \\ \vdash (q_f, 11, CC) \\ \vdash (q_f, 1, C) \\ \vdash (q_f, \lambda, \lambda)
tip

We don't actually have to go through the whole CNF process. We can convert the CFG into,

AWBbA \rightarrow W^* \\ B \rightarrow b

Where WW^* is a string of non-terminals. This conversion is obviously valid.

Then, we define,

T(qs,λ,A)={(qs,W)}T(qs,b,B)={(qs,λ)}\mathcal{T}(q_s, \lambda, A) = \{(q_s, W^*)\} \\ \mathcal{T}(q_s, b, B) = \{(q_s, \lambda)\}

PDA tests CFL

To convert PDA into CFG, we construct variables look likes this,

A=[qAp]A = [qAp]

This is a non-terminal. This variable refers to an edge in the PDA, that state qq with stack top AA can reach pp with an empty stack.

I'll continue by showing how to convert a PDA to a CFG, using simple notations.

Converting PDA to CFG

For a PDA, we can construct variables in our CFG of the form:

[qiAqj][q_i A q_j]

Where this variable represents "starting in state qiq_i with AA on top of stack, we can reach state qjq_j with empty stack.

The key idea is that we can construct grammar rules based on PDA transitions-

If a transition is in form of,

(q,Z)T(q,a,Z)(q', Z') \in \mathcal{T}(q, a, Z)

That is, when a transition (q,Z1Z2Zk)T(q,a,Z)(q', Z_1 Z_2 \dots Z_k) \in \mathcal{T}(q, a, Z) exists (where Z=Z1Z2ZkZ' = Z_1 Z_2 \dots Z_k is the sequence pushed onto the stack), we generate productions in the CFG to model the effect of this transition on the stack and states. Specifically:

  1. For transitions pushing k1k \geq 1 symbols:
    For every possible sequence of intermediate states q1,q2,,qk1q_1, q_2, \dots, q_{k-1}, add the rule:
    [qZq]a[qZ1q1][q1Z2q2][qk1Zkq][q Z q''] \rightarrow a \, [q' Z_1 q_1] [q_1 Z_2 q_2] \cdots [q_{k-1} Z_k q'']
    Here, aa is the input symbol consumed (or omitted if the transition is on ε\varepsilon). This rule encodes that after popping ZZ and pushing Z1Z2ZkZ_1 Z_2 \dots Z_k, the PDA processes each ZiZ_i sequentially, transitioning through states q1,,qk1q_1, \dots, q_{k-1} until the stack empties at qq''.

  2. For transitions pushing nothing (k=0k=0):
    Add the production:
    [qZq]a[q Z q'] \rightarrow a
    This directly represents popping ZZ and moving to qq' while consuming aa, leaving the stack empty.

  3. Start symbol:
    The CFG’s start symbol SS derives all variables corresponding to the PDA’s initial configuration:
    S[q0Z0q]for all qQ,S \rightarrow [q_0 Z_0 q] \quad \text{for all } q \in Q,
    where q0q_0 is the PDA’s initial state and Z0Z_0 is the initial stack symbol. This ensures all valid computations starting from q0q_0 with Z0Z_0 are captured. The qq here is the possible state that the PDA is in when the stack is empty. If there exists multiple such states, use a symbol to generate all of them at first.

For example, considering our first PDA,

We have,

T(qs,0,x)={(qs,xx)}T(qs,0,z0)={(qs,xz0)}T(qs,1,x)={(q1,λ)}T(q1,1,x)={(q1,λ)}T(q1,z0,z0)={(qf,λ)}\mathcal{T}(q_s, 0, x) = \{(q_s, xx)\} \\ \mathcal{T}(q_s, 0, z_0) = \{(q_s, xz_0)\} \\ \mathcal{T}(q_s, 1, x) = \{(q_1, \lambda)\} \\ \mathcal{T}(q_1, 1, x) = \{(q_1, \lambda)\} \\ \mathcal{T}(q_1, z_0, z_0) = \{(q_f, \lambda)\}

We should construct the grammar with the following step,

The start variable is,

[qsz0qf][q_sz_0q_f]

This represents computations starting in qsq_s with z0z_0 on the stack and ending in qfq_f with an empty stack.

We derive productions from each PDA transition:

(a) Transition (qs,0,x)(qs,xx)(q_s, 0, x) \rightarrow (q_s, xx)

Pushes two xx symbols. For all states q1,q{qs,q1,qf}q_1, q'' \in \{q_s, q_1, q_f\}:

[qsxq]0[qsxq1][q1xq][q_s x q''] \rightarrow 0 \, [q_s x q_1] [q_1 x q'']

This encodes that after reading 0, the first xx is processed from qsq1q_s \rightarrow q_1, and the second xx from q1qq_1 \rightarrow q''.

(b) Transition (qs,0,z0)(qs,xz0)(q_s, 0, z_0) \rightarrow (q_s, x z_0)

Pushes xx followed by z0z_0. For all q1,q{qs,q1,qf}q_1, q'' \in \{q_s, q_1, q_f\}:

[qsz0q]0[qsxq1][q1z0q][q_s z_0 q''] \rightarrow 0 \, [q_s x q_1] [q_1 z_0 q'']

This encodes processing xx from qsq1q_s \rightarrow q_1, then z0z_0 from q1qq_1 \rightarrow q''.

(c) Transition (qs,1,x)(q1,λ)(q_s, 1, x) \rightarrow (q_1, \lambda)

Pops xx. Directly generates 11:

[qsxq1]1[q_s x q_1] \rightarrow 1

(d) Transition (q1,1,x)(q1,λ)(q_1, 1, x) \rightarrow (q_1, \lambda)

Pops xx. Directly generates 11:

[q1xq1]1[q_1 x q_1] \rightarrow 1

(e) Transition (q1,ϵ,z0)(qf,λ)(q_1, \epsilon, z_0) \rightarrow (q_f, \lambda)

Pops z0z_0 without consuming input. Generates λ\lambda:

[q1z0qf]λ[q_1 z_0 q_f] \rightarrow \lambda

For the string 00110 0 1 1:

  1. Start with S[qsz0qf]S \rightarrow [q_s z_0 q_f].
  2. Use production (b):
    [qsz0qf]0[qsxq1][q1z0qf].[q_s z_0 q_f] \rightarrow 0 \, [q_s x q_1] [q_1 z_0 q_f].
  3. Expand [qsxq1][q_s x q_1] using production (a):
    [qsxq1]0[qsxq1][q1xq1].[q_s x q_1] \rightarrow 0 \, [q_s x q_1] [q_1 x q_1].
  4. Apply [qsxq1]1[q_s x q_1] \rightarrow 1 (production c) and [q1xq1]1[q_1 x q_1] \rightarrow 1 (production d):
    011011
  5. Combine with [q1z0qf]λ[q_1 z_0 q_f] \rightarrow \lambda:
    00110011

The complete CFG is:

S[qsz0qf][qsxq]0[qsxq1][q1xq]q1,q{qs,q1,qf}[qsz0q]0[qsxq1][q1z0q]q1,q{qs,q1,qf}[qsxq1]1[q1xq1]1[q1z0qf]λ\begin{align*} S &\rightarrow [q_s z_0 q_f] \\ [q_s x q''] &\rightarrow 0 \, [q_s x q_1] [q_1 x q''] \quad \forall q_1, q'' \in \{q_s, q_1, q_f\} \\ [q_s z_0 q''] &\rightarrow 0 \, [q_s x q_1] [q_1 z_0 q''] \quad \forall q_1, q'' \in \{q_s, q_1, q_f\} \\ [q_s x q_1] &\rightarrow 1 \\ [q_1 x q_1] &\rightarrow 1 \\ [q_1 z_0 q_f] &\rightarrow \lambda \\ \end{align*}