Theory of DFA
DFA (Deterministic Finite Automaton) sometimes called DFSM (Deterministic Finite State Machine) is a a finite state machine which given sequence of symbols accepts them or not.
Why do we need DFA?
Lets say you are given a task to write a program which accepts strings with following conditions:
- It must contain only
:
,)
,(
, and_
characters. -
:
has to be followed by)
or(
. - Has to have at least one smiley
:)
or sad:(
face. - If string has more than one face, then all of them have to be separated by at least one
_
character. - Each string can start and end with zero or more
_
characters.
Of course, this task can be easily accomplished using regular expressions. But for the sake of argument, let's say we can't use them. In fact, behind the scenes all regular expressions are transformed into automaton anyway.
One way to accomplish this task is to create a method with for
loop and deeply nested
if
statements in it. Loop
will iterate through each character in the string and based on previous characters will check if it is acceptable or not.
If there are lots of cases to consider, then this approach is error prone.
Another way of thinking about it, is to imagine that a program (DFA) has finite set of states. For example, a state when it begins reading characters, a state after last character, and intermediate states which it visits between start and finish. A transition between states are based on two things:
- Current state.
- Read in character.
There can be a situation when different characters lead to the same state. For example, after reading
:)
there are only one
acceptable character _
, all other characters
:
, )
, (
will lead to not
acceptable state. Also, some characters will lead to the same (current) state. For example, if the first character
was _
, then no matter how many of _
program will read in, all states _
, __
,
_______
will not impact transition to the next state,
thus current state will stay the same. If the DFA after reading last character will end up in state which
is acceptable, then a string is considered accepted.
Formal definition of DFA
While we could create DFA using the reasoning presented earlier. There are some questions which begs an answer: What exactly are start and accept states? Can it have zero accept states? Can it have more than one start state? and so on. All these questions can be answered by looking at the formal definition of DFA.
A deterministic finite automaton (DFA) $$M$$ is a 5-tuple $$(Q, Σ, δ, q_0, F )$$, where
- $$Q$$ is a finite set called the states,
- $$Σ$$ is a finite set called the alphabet,
- $$δ : Q × Σ → Q$$ is the transition function,
- $$q_0 ∈ Q$$ is the start state, and
- $$F ⊆ Q$$ is the set of accept states.
Notice that DFA is defined as 5-tuple. Meaning, it has five elements and order of the elements matter, thus $$(Q, Σ, δ, q_0, F) \neq (Σ, Q, δ, q_0, F)$$. First element $$Q$$ represents all the states in hte DFA. In general, when we are talking about finite state machine, term finite state is analogous to $$Q$$.
Second element is an alphabet $$Σ$$ (read as sigma). It represents all the symbols which are allowed to go into machine. For example, if $$Σ = \{0, 1, 2\}$$, then only characters $$1$$, $$2$$, and $$3$$ are allowed. But what if we have more symbols? In that case we have to modify DFA to support them. Remember, we are dealing with deterministic finite automaton, meaning that it's next state is determined by input symbol, thus unrecognised symbols are not allowed. Also, in alphabet symbols do not need to be one letter characters. For example, $$Σ = \{health, stamina, shield\}$$ is perfectly valid alphabet for game AI.
Third element is a function $$δ$$ (read as delta) where $$δ : Q × Σ → Q$$. It represents a transition from one state to another. $$Q × Σ$$ is a domain of the function. From the set theory, we know that if we have two sets $$Q$$ and $$Σ$$, then their cartesian product $$Q × Σ$$ is a set of all the pairs $$(a, b)$$ where $$a ∈ Q$$ and $$b ∈ Σ$$. Codomain of $$δ$$ is $$Q$$. In plain english function $$δ : Q × Σ → Q$$ says: give me any sate and any symbol and I will return a state. Notice, it does not say that a returned state has to be different, thus it can be the same state as given.
Fourth element $$q_0$$ represents a start state which has to be one of the states in $$Q$$. Also, by definition of the start state, we can answer a question: can it have more than one state? No, it can't. Every DFA has to have only one start state.
Lastly, fifth element $$F$$ represents set of accept states which is a subset of $$Q$$. Notice, the difference between $$∈$$ and $$⊆$$. $$∈$$ means member of, thus start state $$q_0$$ is a member of $$Q$$. While $$⊆$$ means subset of, thus $$F$$ is a subset of $$Q$$. We know that every set has empty set $$∅$$ as it's subset. Therefore, $$F$$ can be $$∅$$, meaning that DFAs with no accept states does exist.
State (transition) diagram
While formal definition is a nice way of defining DFA, it does not help us to visualise it. On the other hand, state diagrams can be used to define and to visually represent finite automaton. The only downside of them is that state diagrams tend to be very messy if DFA's have many states. Nevertheless, for simple ones is a perfect choice.
A state diagram for DFA $$M = (Q, Σ, δ, q_0, F)$$ is graph defined as follows:
- For each state $$q$$ in $$Q$$ there is a node.
- For each state $$q$$ in $$Q$$ and each input symbol $$a$$ in $$Σ$$, let $$δ(q,a) = p$$. Then the state diagram has an arc (arrow) from node $$q$$ to node $$p$$, labeled $$a$$. If there are several input symbols that cause transitions from $$q$$ to $$p$$, then the state diagram can have one arc, labeled by the list of these symbols.
- There is an arrow into the start state $$q_0$$ which does not originate at any node.
- Nodes corresponding to accepting states (those in $$F$$) are marked by double circle. States not in $$F$$ have a single circle.
For example, Image 1.0.0 represents DFA for Task 1 .
We see nodes representing states and arcs - transitions from one state to another. Transitions are based on
current state and input symbol. For example, if automaton's current state is $$q_0$$ and the input symbol is :
,
then it goes to $$q_1$$. If the input symbol is _
, then it transitions to its current state ($$q_0$$). Notice that one
arrow pointing to $$q_0$$ does not originate from any node, thus by
Definition 1.0.1
of $$q_0$$ is a start state. And nodes with double circle $$q_2$$ and $$q_3$$ represent accept states.
One special type of non accepting states are those which has
only arrows pointing to it but has no arrows pointing to other states. Thus, if any transition
leads to that kind of state, then it is dead end because it is impossible to reach an accepting state,
these states are called dead states. For example, $$q_4$$.
Extended transition function
At this point we only talked about transitions as a separate functions. That is, given current state and an input symbol DFA transitions to next state or current. Then, that state becomes an input for the next transition function, and so on. This cycle repeats until there are no more input symbols, producing a sequence of state transitions. This sequence can be expressed by extended transition function denoted by $$δ^*$$. This function is very similar to $$δ$$, they both take state as an input and return state as an output. But contrary to $$δ$$, function $$δ^*$$ takes not a symbol from the alphabet $$Σ$$ but a string, consisting of symbols from the alphabet $$Σ$$. So, if the input state is a start state $$q_0$$, then given a string, $$δ^*$$ will return a state which we would get after sequence of state transitions. Maybe formal definition will be more understandable.
Let $$M = (Q, Σ, δ, q_0, F)$$ be a deterministic finite automaton (DFA). We define the extended transition function
$$$δ^* : Q × Σ^* → Q$$$as follows:
- For every $$q ∈ Q$$, $$δ^∗(q, ε) = q$$
- For every $$q ∈ Q$$, every $$y ∈ Σ^*$$, and every $$a ∈ Σ$$,
$$$δ^∗(q, ya ) = δ(δ^∗(q, y), a )$$$
Or maybe not :). $$δ^*$$ is defined recursively. We have a base case (1) and then we define it recursively (2). The main difference between $$ δ^* : Q × Σ^* → Q $$ and $$ δ : Q × Σ → Q $$ is a domain. Contrary to $$δ$$'s domain $$Q × Σ$$, extended transition function's domain $$Q × Σ^*$$ is a cartesian product of set of states $$Q$$ and Kleene Star (Wikipedia) $$*$$ over alphabet $$Σ$$ denoted by $$Σ^*$$. Lets call it a Kleene closure, which is a set of all strings over alphabet $$Σ$$, including the empty string $$ε$$. For example,
If $$Σ = \{aa, b\}$$, then $$Σ^* = \{ε, aa, b, aab, baa, aaaa, bb, aaaab, \dots \}$$
You may ask: why do wee need it? Remember, automaton accepts strings where each symbol is in alphabet $$Σ$$. So Kleene closure is a nice way of representing all those strings.
After defining $$δ^∗$$ signature, we need to define how it functions. We start with the base case which says that for every $$q ∈ Q$$, $$δ^∗(q, ε) = q$$. In plain english, function $$δ^∗$$ says: give me any state and empty string and I will give you back the same state. It may sound useless but in fact, it is very important case because, as we will see, it terminates a function. Empty string $$ε$$ indicates the end of the string, so it makes sense to return input state. Just like $$δ$$ transition would return the state after last input symbol.
Recursive case defines function for those cases when a string is not empty. $$ya$$ in $$δ^∗(q, ya)$$ means, concatenation between a string $$y$$ from $$Σ^*$$ and one symbol $$a$$ from the alphabet $$Σ$$. This is just another representation of original string. Advantage of $$ya$$ is that we can use different parts of it to define $$δ^∗(q, ya)$$.
$$$δ^∗(q, ya ) = δ(δ^∗(q, y), a )$$$We see that it is equal to the original transition function $$δ$$ from the definition of $$M$$. First argument of it is extended transition function $$δ^∗$$ with two arguments: a state and a string which is one symbol shorter than, $$ya$$. And the last argument of $$δ$$ is symbol $$a$$ from $$ya$$ string. For example, given following state diagram,
and the string $$abc$$, we can trace $$δ^∗(q_0, abc)$$ function,
$$$ \begin{align} δ^∗(q_0, abc) &= δ(δ^∗(q_0, ab), c) \\ &= δ(δ(δ^∗(q_0, a), b), c) \\ &= δ(δ(δ(δ^∗(q_0, ε), a), b), c) \\ &= δ(δ(δ(q_0, a), b), c) \\ &= δ(δ(q_1, b), c) \\ &= δ(q_2, c) \\ &= q_3 \end{align} $$$Regular languages
Using Definition 1.0.2 of extended transition function $$δ^∗$$ we can precisely define what it means for DFA to accept or to reject a string.
Let $$M = (Q, Σ, δ, q_0, F)$$ be a deterministic finite automaton (DFA), and let $$a ∈ Σ^*$$. The string $$a$$ is accepted by $$M$$ if
$$$δ^*(q_0, a) ∈ F$$$and is rejected by $$M$$ otherwise.
Basically string $$a$$ is accepted by automaton $$M$$ if given start state $$q_0$$ and $$a$$ extended transition function $$δ^*$$ returns an accept state. Which makes sense. Then we can ask ourselves: how would we call all the strings which automaton accepts? Well, we can think of strings as words, then the set of all words over alphabet is a language. Thus, we can say that a set of the strings which automaton accepts is a language which automaton recognises. Some may say that automaton accepts a language, but that is ambiguous with automaton accepting a string. So, to make it clear, we will say that automaton accepts a string, but recognises a language. More formally,
Let $$M$$ be a finite automaton. The language recognised by $$M$$ is the set
$$$L(M) = \{a ∈ Σ^∗\ |\ a\ is\ accepted\ by\ M\}$$$
Notice that is not necessary for automaton to accept all the strings over alphabet $$Σ$$ to recognise a language. It can accept one, two or even zero strings. In later case $$L(M) = ∅$$. Notation of $$L(M)$$ can be misleading because it looks like a function $$f(x)$$. Remember, $$L(M)$$ is a set, not a function. We call a language regular if there is a finite automaton which recognises it. More formally,
A language $$L ⊆ Σ^*$$ is regular if there is a finite automaton $$M$$ such that $$L = L(M)$$.
So, a language $$L$$ is a subset of Kleene's closure $$Σ^*$$. And, if all the strings in $$L$$ is also in a language which automaton recognises $$L(M)$$ then we say that a language $$L$$ is regular. For example,
$$$L_1 = \{ xaaay\;|\;x, y ∈ \{a, b\}^∗\} ⊆ \{a, b\}^*$$$is a regular language.
Another example of regular language is presented in Task 1 and Image 1.0.0 shows an automaton which recognises it. Below are some strings which are in the language and some which are not,
-
Is:
:)
,:(
,:(_:)
,:)___:)__:(__
,_________:)__
. -
Is not:
____
,::)
,:((
,:(:()
,:)_):)
,(:
.
So, if a string _:)__:(
is in the language which DFA recognises, then it should accept it. Let's see how it does that.
Initially, DFA is in the $$q_0$$ state, then it receives
_
character, the arrow with it points to the same state $$q_0$$. Next character is :
which points to $$q_1$$ state,
then following )
DFA goes to $$q_2$$ state. From there, we have _
to $$q_3$$, then one more _
from $$q_3$$ to itself.
After that, :
character points to $$q_1$$ and following (
DFA ends up in $$q_2$$ state.
The last state is an accept state, thus automaton from
Image 1.0.0
accepts _:)__:(
.
Transition table
One way of representing transitions between states is a state diagram.
Another way - using transition table. It looks like two dimensional array where rows represents
states and columns symbols from the alphabet. For example,
Table 1.0.0
shows all possible transitions of
Image 1.0.0
Notice how intersection of row and column describes next state, given current state (row) and a symbol (column). For
example, if I want to know next state from $$q_2$$ given )
, I just need to find
$$q_2$$ state in a row, then )
in the column, and check which state is at their
intersection ($$q_4$$).
$$:$$ | $$)$$ | $$($$ | $$\_$$ | |
---|---|---|---|---|
$$q_0$$ | $$q_1$$ | $$q_4$$ | $$q_4$$ | $$q_0$$ |
$$q_1$$ | $$q_4$$ | $$q_2$$ | $$q_2$$ | $$q_4$$ |
$$q_2$$ | $$q_4$$ | $$q_4$$ | $$q_4$$ | $$q_3$$ |
$$q_3$$ | $$q_1$$ | $$q_4$$ | $$q_4$$ | $$q_3$$ |
$$q_4$$ | $$q_4$$ | $$q_4$$ | $$q_4$$ | $$q_4$$ |
Next post DFA in Java will show how to implement DFA shown in Image 1.0.0 in Java language.