Skip to main content

Deterministic Finite Automaton

Deterministic Finite automaton (DFA) is a machine that can recognize a regular language based on its grammar.

Components of DFA

Components

DFA has the following components,

  • State set Q={qs,q1,q2,,qn}\mathcal{Q} = \{qs, q_1, q_2, \cdots, q_n\}
  • A vocabulary set V\mathcal{V}
  • A transition table T(q,v)=q\mathcal{T}(q, v) = q'
  • A start state qsqs
  • An accept state set F={qf,qf1,qf2,,qfn}\mathcal{F} = \{q_f, q_{f1}, q_{f2}, \cdots, q_{fn}\}

An automaton can take a symbol and transit to another state based on the transition function. That is, if w=aww=aw', then we define,

T(q,w)=T(T(q,a),w)\mathcal{T}(q, w) = \mathcal{T}(\mathcal{T}(q, a), w')

We also define,

T(q,λ)=q\mathcal{T}(q, \lambda) = q

If a sentence ww satisfies,

T(qs,w)F\mathcal{T}(qs, w) \in \mathcal{F}

Then we say that ww is accepted or recognized by the automaton.

We often illustrate with a graph, where each node represents a state, and an edge represents a transition, the label of the edge is the symbol that triggers the transition.

For example,

We add a * to denote that the state is an accept state.

If we want this automaton to test 001001, we would have the following process (the highted node is the current state), here is the example,

Example by Graph

Step 1: Start at q0

The automatonton begins in the start state q0, and the first symbol of the string is 0.

Step 2: Still in q0, next symbol is 0

The automatonton is still in q0, and it reads another 0, so it remains in q0.

Step 3: Now in q0, next symbol is 1

The automatonton now reads 1, so it transitions from q0 to q1 (accept state).

Final Step: End of input string, and state q1 is accepting

The DFA has finished reading all input symbols (001), and it has ended in q1, which is an accept state. Therefore, the string is accepted.

Example by Table

We can also just look after the table. Based on the same automaton, we have,

01
qsqsq1
q1*q2q1
q2q2q2

Each element in the table is T(row,col)\mathcal{T}(row, col).

If we test 001001,

  • Start at qs
  • Read 0, go to qs
  • Read 0, go to qs
  • Read 1, go to q1
  • String exhausted, and q1 is an accept state, so the string passes.

Minimizing a DFA

Sometimes, we may design a DFA that has more states than necessary, for example,

This automaton has the exact same function as the automaton above. For such simple cases, we can tell by eye. However, for more complex cases, we need a tool to minimize the automaton.

We define a pair of states is an equivalent pair if and only if they satisfy all the following conditions,

  • They are both accept states or both not accept states.
  • When accepting any symbol, they will transit to a pair of equivalent states.

If two states are equivalent, they can be merged into one.

We always use the most optimistic choice with circular decency. That is, if whether AA and BB are equivalent depends on wether CC and DD are equivalent, while CC and DD are dependent on AA and BB. Then we will always assume both of them are equivalent. This is also true for more circular dependencies.

We usually use a table called the equivalence table for help. We will demonstrate how to do it,

Let's consider,

Let's draw the table,

qsBCDE*
qs
B
C
D
E*

We ignore the half above the diagonal and the diagonal. Because the upper half is the exactly same as the lower half, and a state is always in pair with itself.

If we write in the element, then it means that the row and column are equivalent. If we write in the element, then it means that the row and column are not equivalent. If we write FGFG in the element, then it means that if the FGFG are equivalent, then the row and column are equivalent.

Let's construct the equivalence table. Because EE is the only end state, no other state is equivalent to EE. Therefore, we have,

qsBCDE*
qs
B
C
D
E*
tip

We usually firstly fill as many obvious element as possible so that it makes our life a bit easier.

Then, we choose the ones that leads to the determined pairs (no matter whether they are equivalent or not), so that we can have most of the table filled with or .

Lastly, we iterate through all the remaining pairs.

Now we know EE is in no pair with any other state. Therefore, it'd be easier for us to consider the pairs in which an element points to EE.

For example, DD points to EE after bb, but for all other states, they don't go to EE after bb, and thus,

qsBCDE*
qs
B
C
D
E*

There are no obvious ones, so let's check the only three pairs remaining.

qsqs and BB: after aa, they both go to BB. After bb, qsqs goes to CC, and BB goes to DD. But BB and DD are not equivalent, so qsqs and BB are not equivalent.

CC and qsqs: after bb, they both go to CC. After aa, they both go to BB. Thus, CC and qsqs are equivalent.

CC and BB: after bb, CC goes to CC but BB goes to DD. DD and CC are not equivalent, so CC and BB are not equivalent.

In all,

qsBCDE*
qs
B
C
D
E*

We can simply redirect all lines pointing to CC to qsqs, then remove the CC node. In the transition table, we just replace all CC with qsqs.

This is a simple case.

We may have this type of table,

qsFGH
qs
F*
G*FH
H*GHFG

There exists circular dependency. We use the most optimistic choice- as we said above, to get the table,

qsFGH
qs
F*
G*
H*

This algorithm is an application of Hopcroft–Karp algorithm.

The Grammar of DFA

For a DFA, we can create a regular grammar that is equivalent to the DFA.

Suppose we have state set,

Q={qs,q1,q2,,qn}\mathcal{Q} = \{qs, q_1, q_2, \cdots, q_n\}

Then, we can create a regular grammar,

qiaT(qi,a)q_i \rightarrow a \mathcal{T}(q_i, a)

If T(qi,a)F\mathcal{T}(q_i, a) \in \mathcal{F}, then,

qiaq_i \rightarrow a

So that using the regular grammar, we can generate the language of a DFA. Thus DFA generates regular language.

Please note that, by this conversion,

AaBA \rightarrow aB

Can only exists one such rule. That is to say, AaBaCA \rightarrow aB | aC is not allowed. So the grammar of DFA is actually a more restricted version of regular grammar.

NFA is actually directly corresponding to the regular grammar.