Regular languages and finite state machines. Construction of NFA using right-linear grammar

Settings

According to Kleene's theorem, any regular expression you can associate a finite machine, which is a formal model of the algorithm for recognizing lexemes denoted by a given regular expression. In the most general terms, a state machine-the recognizer is determined by a finite set of characteristic states of the input stream and transitions between them. A state change occurs when symbols of the input stream are received from a given alphabet in accordance with the transition function, which determines possible subsequent states based on the input symbol and the current state. Among the possible states, the initial one stands out(initial) and final(allowing) states in which the finite automaton recognizer can be, respectively, at the beginning and completion of processing tokens of the input stream. If the input sequence of symbols can generate a sequence of transitions that can transfer the finite state machine from the initial state to one of the final states, then it is considered admitting and belongs to the regular set recognized by it.

(00|11)*((01|10)(00|11)*(01|10)(00|11)*)*

Table 1

	0	1
Q1	Q4	Q2
Q2	Q3	Q1
Q3	Q2	Q4
Q4	Q1	Q3

The columns of the transition table indicate the characters of the input alphabet, and the rows correspond to the current states of the DFA. The elements of each line indicate the states of the DFA, to which it must transition from the current state upon receiving the corresponding characters of the input alphabet. In particular, from the first line of this transition table it follows that receiving the symbols 0 and 1 in the initial state Q1 translates the DFA to states Q4 and Q2, respectively.

When recognizing an input sequence using the transition table, it is easy to trace changes in the state of the DFA in order to determine whether one of the allowing states is achieved or not. In particular, for the binary vector 01001000 with an even number of zeros and ones, the considered DFA generates the following sequence of transitions, where each transition is labeled by the symbol of the input alphabet that causes it:

Q1 0 Q4 1 Q3 0 Q2 0 Q3 1 Q4 1 Q1 0 Q4 0 Q1

This sequence of transitions ends with the admitting state Q1, therefore, the binary vector 01001000 belongs to the regular set recognized by the considered DFA and satisfies the above regular expression.

In conclusion, it should be noted that the considered informal method of constructing

by the total number of characters of the alphabet of symbols and operation signs and parentheses in the entry r.

Basis. Automata for expressions of length 1: and are shown in the following figure.

Rice. 5.1.

Note that for each of these three automata, the set of final states consists of a single state.

Induction step. Now suppose that for each regular expression of length = 1, the word w can be divided into k subwords: w=w 1 w 2 ... w k and that's it. For each i= 1,... ,k the word w i translates q 0 1 into q f 1 . Then for the word w in the diagram M there is a path

Hence, . Conversely, if some word translates q 0 into q f , then either it exists or it is carried by a path that, having passed from q 0 to q 0 1 and then passing several times along the path from q 0 1 to q f 1 and returning from q f 1 to q 0 1 by -transition, ultimately from q f 1 by -transition ends in q f . Therefore such a word.

From Theorems 4.2 and 5.1 we directly obtain

Corollary 5.1. For each regular expression, one can effectively build a deterministic finite state machine that recognizes the language represented by that expression.

This statement is one of the examples of synthesis theorems: based on the description of a task (language as a regular expression), a program (DFA) that performs it is effectively constructed. The converse is also true - a theorem of analysis.

Theorem 5.2. For each deterministic (or non-deterministic) finite state machine, it is possible to construct a regular expression that represents the language recognized by that machine.

The proof of this theorem is quite technical and beyond the scope of our course.

Thus, we can conclude that the class of finitely automata languages coincides with the class of regular languages. From now on we will simply call it the class of automata languages.

The automaton M r, which is constructed in the proof of Theorem 5.1

For further study of the properties of finite automata and, in particular, for solving the synthesis problem, the following theorem is important.

Theorem 7.7 (determinization theorem). For any finite automaton, an equivalent deterministic finite automaton can be constructed.

In order to prove the theorem, it is necessary, firstly, to describe the algorithm for constructing a deterministic finite automaton from the original one; second, to justify this algorithm by rigorously proving that it does indeed produce a state machine that is deterministic and equivalent to the original one. Here we present only the algorithm for constructing a deterministic automaton.

The transformation of an arbitrary finite automaton to an equivalent deterministic one is carried out in two stages: first, arcs with the label \lambda are removed, then the determinization itself is carried out.

1. Removing λ-transitions (arcs labeled \lambda).

To go from the original finite automaton M=(V,Q,q_0,F,\delta) to the equivalent finite automaton M"=(V,Q",q_0,F",\delta") without λ-transitions, it is enough in the original make the following transformations in graph M.

A. All states, except the initial one, into which only arcs with the \lambda label enter, are deleted; thereby defining the set Q" of the finite automaton M". It is clear that Q"\subseteq Q. At the same time, we assume that the initial state remains the same.

b. The set of arcs of the finite automaton M" and their labels (thereby the transition function M" ) is defined as follows: for any two states p,r\in Q",~ p\to_(a)r occurs if and only if a \in V , and in the graph M one of two things holds: either there is an arc from p to r whose label contains the symbol a, or there is a state q such that p\Rightarrow_(\lambda)^(+)q and q\ to_(a)r In this case, the vertex q, generally speaking, may not belong to the set Q", i.e. it may disappear when passing to the automaton M" (Fig. 7.11). If q\in Q" , then, naturally, the arc (q,r) will be preserved in M" and the symbol a will be one of the symbols belonging to the label of this arc (Fig. 7.12).

Thus, in M" all arcs M are stored whose labels are different from \lambda and which connect a pair (vertices) of states from the set Q" (not deleted according to part a). In addition, for any triple of states p,q,r (not necessarily different!), such that p,r\in Q" and there is a path of non-zero length from p to q, the label of which is equal to \lambda (i.e. the path by λ-transitions), and an arc leads from q to r, the label of which contains the symbol a of the input alphabet, in M" an arc is constructed from p to r, the label of which contains the symbol a (see Fig. 7.11).

V. The set of final states F" of the finite automaton M" contains all states q\in Q", i.e. states of the finite automaton M that are not deleted according to paragraph a, for which q\Rightarrow_(\lambda)^(\ast) holds q_f for some q_f\in F (i.e. either state q itself is the final state of the finite automaton M, or a path of non-zero length leads from it along arcs labeled \lambda to one of the final states of the finite automaton M) (Fig. 7.13) .

2. Determination itself.

Let M=(Q,V,q_0,F,\delta) be a finite automaton without λ-transitions. Let us construct a deterministic finite automaton M_1 equivalent to M.

This finite automaton is defined in such a way that its state set is the set of all subsets of the state set of the finite automaton M. This means that each individual state of the finite automaton M_1 is defined as a certain subset of the set of states of the finite automaton M. In this case, the initial state of the new finite state machine (i.e. M_1) is a singleton subset containing the initial state of the old finite state machine (i.e. M), and the final states of the new finite state machine are all such subsets Q that contain at least one final the vertex of the original finite automaton M.

Henceforth, allowing some freedom of speech, we will sometimes call the states of the finite automaton M_1 states-sets. It is important, however, to clearly understand that each such state-set is a separate state of a new finite automaton, but not a set of its states. At the same time, for the original ("old") finite automaton M this is precisely the set of its states. Figuratively speaking, each subset of states of the old finite automaton is “collapsed” into one state of the new finite automaton*.

*Formally, we should define the set Q_1 as a set that is in one-to-one correspondence with the set 2^Q, but it is still more convenient for us to assume that Q_1 coincides with 2^Q - after all, the set of states of a finite automaton can be any non-empty finite set.

The transition function of the new finite automaton is defined so that from the state-set S by the input symbol a the finite automaton M_1 goes to the state-set, which is the union of all sets of states of the old finite automaton into which this old finite automaton goes by the symbol a from each state sets S. Thus, the finite automaton M_1 is deterministic by construction.

The above verbal description can be translated into formulas as follows: we build a finite machine M_1 so that

M_1=(Q_1,V,\(q_0\),F_1,\delta_1) , where

\begin(cases)Q_1=2^Q,\quad F_1=\(T\colon\, T\cap F\ne\varnothing,~T\in2^Q\),\\ (\forall S\subseteq Q) (\forall a\in V)\Bigl(\delta_1(S,a)= \bigcup\limits_(q\in S)\delta(q,a)\Bigr). \end(cases)

Let us pay attention to the fact that among the states of the new finite automaton there is a state \varnothing , and, according to (7.8), \delta_1(\varnothing,a)=\varnothing for any input symbol a . This means that, once in such a state, the state machine M_1 will not leave it. In general, any state q of a finite automaton such that for any input symbol a we have \delta(q,a)=q is called an absorbing state of the finite automaton. Thus, the state \varnothing of the deterministic finite state machine M_1 is absorbing. It is also useful to note that \delta_1(S,a)=\varnothing if and only if for each q\in S (states of the old finite state machine from the set of states S ) \delta(q,a)=\varnothing , i.e. e. in the graph M, no arc exits from each such state q, marked with the symbol a.

It can be proven that the finite automaton obtained using such an algorithm is equivalent to the original one.

Example 7.9. Let us determine the finite automaton shown in Fig. 7.14.

An equivalent finite automaton without λ-transitions is shown in Fig. 7.15. Note that vertex q_2 disappears, since only “empty” arcs enter it.

To determine the resulting automaton, it is not at all necessary to write down all of its 2^3=8 states, many of which may not be reachable from the initial state \(q_0\) . To obtain states that are reachable from \(q_0\), and only them, we will use the so-called pulling method.

This method can generally be described as follows.

In the initial finite automaton (without empty arcs) we define all sets of states that are reachable from the initial one, i.e. for each input symbol a we find the set \delta(q_0,a) . Each such set in the new automaton is a state directly accessible from the initial one.

For each of the defined state-sets S and each input symbol a, we find the set \textstyle(\mathop(\bigcup\limits_(q\in S) \delta(q,a))\limits^(\phantom(A)^( .))) . All states obtained at this step will be states of a new (deterministic) automaton, reachable from the initial vertex along a path of length 2. We repeat the described procedure until no new set states (including the empty one!) appear. It can be shown that this produces all states of the finite automaton M_1 that are reachable from the initial state \(q_0\) .

For the finite state machine in Fig. 7.15 we have:

\begin(aligned)& \delta_1(\(q_0\),a)=\(q_1\);\qquad \delta_1(\(q_0\),b)=\(q_1,q_3\);\\ & \ delta_1(\(q_1\),a)=\(q_1\);\qquad \delta_1(\(q_1\),b)=\(q_1\);\\ & \delta_1(\(q_1,q_3\) ,a)= \delta(q_1,a)\cup \delta(q_3,a)= \(q_1\)\cup\(q_1\)=\(q_1\);\\ & \delta_1(\(q_1, q_3\),b)= \delta(q_1,b)\cup \delta(q_3,b)= \(q_1\)\cup\(q_1\)=\(q_1\). \end(aligned)

Since no new states-sets have appeared, the “pulling” procedure ends here, and we get the graph shown in Fig. 7.16.

Regular language addition

One of the important theoretical consequences of the determinization theorem is the following theorem.

Theorem 7.8. The complement of a regular language is a regular language.

Let L be a regular language in the alphabet V. Then the complement of the language L (as a set of words) is the language \overline(L)=V^(\ast)\setminus L .

According to Theorem 7.7, for a regular language L, a deterministic finite automaton M can be constructed that admits L. Since in a deterministic automaton from each vertex for each input symbol a transition to exactly one vertex is defined, then whatever the chain x is in the alphabet V, there is a unique path for it in M, starting in the initial state on which the chain x is read. It is clear that the chain x is allowed by the automaton M , that is, x\in L(M) , if and only if the last state of the specified path is final. It follows that the chain x\notin L(M) if and only if the last state of the specified path is not final. But we just need a finite automaton M" that admits a chain x if and only if it is not allowed by the original finite automaton M. Consequently, turning each final state of M into a non-final state and vice versa, we obtain a deterministic automaton that admits the addition of the language L .

The proven theorem allows us to build a finite automaton that does not admit a certain set of chains, using the following method: we first build an automaton that admits a given set of chains, then we determinize it and go to the automaton for addition as indicated in the proof of Theorem 7.8.

Example 7.10. A. Let's build a finite automaton that accepts all chains in the alphabet \(0;1\) except chain 101.

First, let's build a finite automaton that allows a single chain 101. This automaton is shown in Fig. 7.17.

This automaton is quasi-deterministic, but not deterministic, since it is not fully defined. Let us determine it and obtain a deterministic equivalent finite automaton shown in Fig. 7.18.

And finally, moving to the addition (and renaming the states), we get the automaton shown in Fig. 7.19.

Note that in the resulting automaton all vertices, except vertex s_3, are final.

Note also that the transition to complement, which is discussed in the proof of Theorem 7.8, can only be carried out in a deterministic automaton. If we were to swap the roles of final and non-final vertices in the automaton shown in Fig. 7.17, then we would get an automaton that admits the language \(\lambda,1,10\) , which is not - as one can easily imagine - the set of all chains other than chain 101.

Note also that the finite state machine in Fig. 7.19 allows all strings that contain an occurrence of string 101 but do not match that string itself. Here, for example, is the path carrying chain 1011: s_0,s_1,s_2,s_3,t.

b. Let's build a finite automaton that accepts all chains in the alphabet \(0;1\) except those that contain an occurrence of the chain 101. Consider a language L, each chain of which contains an occurrence of the chain 101. It can be defined as follows:

L=(0+1)^(\ast)101(0+1)^(\ast).

We need to build an automaton to complement the language L.

In this case, using the regular expression directly, it is easy to construct a finite automaton that accepts the language L (Fig. 7.20).

Then we will carry out determinization using the “pull” method. The result of determination is presented in Fig. 7.21.

To completely solve the problem, all that remains is in Fig. 7.21 swap the roles of final and non-final vertices (Fig. 7.22).

V. Let's discuss the idea of constructing a finite automaton that allows those and only those chains in the alphabet \(0;1\) that do not begin with the chain 01 and do not end with the chain 11 (i.e., chains of the form 01x and chains of the type y11 are not allowed, no matter what there were chains x,y\in\(0;1\) ).

In this case, the complement of the language for which it is necessary to construct a finite automaton is the set of all such chains of zeros and ones that begin with the chain 01 or end with the chain 11. An automaton that allows this set of chains is constructed as an automaton for combining 01(0+1)^(\ ast)+(0+1)^(\ast)11 in the same way as described in the proof of Kleene’s theorem (see Theorem 7.6).

From the property that the class of regular languages is closed with respect to complement (see Theorem 7.8) it immediately follows that this class is closed with respect to intersection, set-theoretic and symmetric difference.

Corollary 7.3. For any two regular languages L_1 and L_2 the following statements are true:

1) the intersection of L_1\cap L_2 is regular;
2) the difference L_1\setminus L_2 is regular;
3) the symmetric difference L_1\vartriangle L_2 is regular.

The validity of the statements follows from the identities:

\begin(aligned) &(\scriptstyle(\mathsf(1))))\quad L_1\cap L_2= \overline(\overline(L_1) \cup\overline(L_2))\,;\\ &(\scriptstyle (\mathsf(2))))\quad L_1\setminus L_2= L_1\cap \overline(L_2)\,;\\ &(\scriptstyle(\mathsf(3))))\quad L_1\,\triangle\ ,L_2 = (L_1\cup L_2)\setminus (L_1\cap L_2).\end(aligned)

Firstly, the results obtained allow us to assert that the class of regular languages with respect to the operations of union, intersection and addition is a Boolean algebra, in which the unit is the universal language, and the zero is the empty language. Secondly, these algebraic properties of the family of regular languages allow us to solve the important problem of recognizing the equivalence of two arbitrary finite automata.

According to Definition 7.10, finite state machines are equivalent if the languages they accept are the same. Therefore, to verify the equivalence of the automata M_1 and M_2, it is enough to prove that the symmetric difference of the languages L(M_1) and L(M_2) is empty. To do this, in turn, it is enough to construct an automaton that admits this difference and make sure that the language it admits is empty. In general, the problem of recognizing that a state machine's language is empty is called the state machine's emptiness problem. To solve this problem, it is enough to find the set of final states of the automaton that are reachable from the initial state. Since a finite state machine is a directed graph, this problem can be solved, for example, using breadth-first search. A language allowed by a finite state machine is empty if and only if the set of final states reachable from the initial state is empty. In practice, it is preferable to recognize the equivalence of finite automata using a minimization algorithm, but now it is important for us to emphasize that the fundamental possibility of solving the equivalence problem follows from Theorem 7.7 and its algebraic consequences.

DKA is a special case of NKA. In him:

there is no state with ε-transitions;

for each state S and input symbol a, there is at most one arc emanating from S and labeled a.

DFA has only a maximum of one transition for any input symbol from each state. If a table is used to represent the DFA transition function, then each record will contain only one state. Thus, it is easy to check whether a given DFA admits a certain line, since there is only one path from the starting state, which is marked with this line.

Figure 3 shows the transition graph of a DFA that accepts the same language (a|b) * a(a|b)(a|b) as the NFA in Figure 1.

Figure 3. DFA allowing the string (a|b) * a(a|b)(a|b).

A deterministic finite automaton M that accepts the given language:

M = ((1, 2, 3, 4, 5, 6, 7, 8), (a, b), D, 1, (3, 5, 6, 8))

The transition function D is defined as follows:

Constructing nka using a regular expression

1. For ε the NKA has the following form (0 – initial state, 1 – final):

2. For a included in a given NKA language:

3. Let N(s) and N(t) be NFAs for regular expressions s and t.

For the regular expression s|t, the composite NFA has the following form:

b. For the regular expression st NKA:

With. For the expression s* the NFA has the form:

d. For the expression in brackets (s), the NFA N(s) is used as in point a.

Each new state receives an individual name. The construction of the NFA N(r) has the following properties:

N(r) has a number of states that does not exceed the number of symbols by more than 2 times.

N(r) has exactly one initial and one final state. The final state has no outgoing transitions.

Each state N(r) has either 1 transition for a symbol from the alphabet (), or no more than 2 outgoing ε-transitions.

Converting nka to dka.

The NFA in Figure 1 has 2 transitions from state 0 for symbol a: states 0 and 1. Such a transition is ambiguous, like the transition in ε. Modeling such satellites using a computer program is much more difficult. The definition of feasibility states that there must be some path from the initial state to the final state, but when there are multiple paths for the same input string, they must all be considered to find a path to the final state or find out that there is no such path.

In the NKA transition table, each entry corresponds to many states, but in the DKA transition table, there is only one. The essence of the transformation is that each DFA state corresponds to a set of NFA states. The DFA uses its states to track all possible states that the NFA may be in after reading the next input symbol. That is, after reading the input stream, the DFA is in a state that represents a certain set of states of the NFA, reachable from the initial one along the path corresponding to the input string. The number of such DFA states can significantly exceed the number of NFA states (exponential dependence), but in practice this is extremely rare, and sometimes there are even fewer states in DFA than in NFA.

Let's consider such a transformation using a specific example. Figure 4 shows another NFA that allows the language (a|b) * a(a|b)(a|b) (as in Figures 1 and 3).

Figure 4. NFA allowing language (a|b) * a(a|b)(a|b)

The transition from state 13 to state 14 shown in the figure can be represented similarly to the transition from the 8th to the 13th state.

Let's construct a DFA for this language. The starting state of the equivalent DFA is the state A = (0, 1, 2, 4, 7), that is, those states that can be reached from 0 by ε.

The input symbol alphabet is (a, b). From the initial state A, one can calculate the state reachable by a. Let's call this state B = (1, 2, 3, 4, 6, 7, 8, 9, 11, 13, 14).

Among the states in A, only state 4 has a transition along b to state 5, so DKA has a transition along b from A to state C = (1, 2, 4, 5, 6, 7).

If we continue this process with states B and C, all sets of states of the NFA will be marked. Thus we will have sets of states:

A = (0, 1, 2, 4, 7)

B = (1, 2, 3, 4, 6, 7, 8, 9, 11, 13, 14)

C = (1, 2, 4, 5, 6, 7)

D = (10, 12, 13, 14)

State A is the initial state, and states B, D, E are final.

The full transition table is shown below.

Below in Figure 5 is the DFA itself for this language.

Figure 5. DFA accepting language (a|b) * a(a|b)(a|b)

List of used literature:

Pentus A. E., Pentus M. R. – Theory of formal languages

A. Aho, R. Sethi, D. Ullman - Compilers: principles, technologies, tools.

Regular expressions (RE) are a very convenient form of writing so-called regular or automata languages. Therefore, RTs are used as an input language in many systems that process chains. Let's look at examples of such systems:

The Unix grep command or similar commands search for strings found in Web browsers or text formatting systems. In such systems, RFs are used to describe the patterns that the user is looking for in a file. Various search engines convert the search engine into either a deterministic finite state machine (DFA) or a non-deterministic finite state machine (NFA) and apply this machine to the file being searched.
Generators of lexical analyzers. Lexical analyzers are a component of the compiler; they break the source program into logical units (tokens), which can consist of one or more characters and have a specific meaning. The lexical analyzer generator receives formal descriptions of lexemes, which are essentially RVs, and creates a DFA that recognizes which of the lexemes appears in its input.
RV in programming languages.

In this article, we will first get acquainted with finite state machines and their types (DFA and NFA), and then consider an example of constructing a minimal DFA using a regular expression.

Finite state machines A finite state machine (FA) is a converter that allows you to associate an input with a corresponding output, and this output can depend not only on the current input, but also on what happened before, on the background of the operation of the finite state machine. Even human behavior, and not just artificial systems, can be described using spacecraft. For example, your reaction to the fact that your neighbor listens to loud music at night will be one after the first such incident and completely different after several such incidents. There can be an infinite number of such prehistories; the question arises: what kind of memory should a spacecraft have in order to behave differently for each prehistory? It is clear that it is impossible to store an infinite number of backstories. Therefore, the automaton sort of breaks down all possible prehistories into equivalence classes. Two histories are equivalent if they have the same influence on the behavior of the machine in the future. The equivalence class to which the automaton has assigned its current history is also called the internal state of the automaton.

Let's consider an example of the operation of a primitive spacecraft:

This spacecraft consists of:

tape represented by the input chain.
reading device.
a control block that contains a list of transition rules.

The reader can move in one direction, usually from left to right, thereby reading the characters of the input string. For each such movement, it can count one symbol. Next, the read character is transmitted to the control unit. The control unit changes the state of the machine based on transition rules. If the list of transition rules does not contain a rule for the read symbol, then the machine “dies”.

Now let's look at the ways in which the CA can be specified. They can be specified in the form of graphs or in the form of control tables. In the form of a graph, the CA is specified in the following way:

the vertices of the graph correspond to the states of the spacecraft.
directed edges correspond to transition functions (next to each such edge the symbol along which the transition is performed is indicated).
a vertex with an edge entering it that does not leave more than one state corresponds to the initial state.
the final states of the spacecraft are marked with a thick outline.

In the form of a control table, like this:

the states of the spacecraft are located in the rows of the table.
characters of the recognized language are in columns.
at the intersection, a state is indicated that can be reached from a given state using a given symbol.

An example of a CA in the form of a graph and in the form of a control table will be presented below.

DKA and NKA

The main difference between the DKA and the NKA is that the DKA can only be in one state during operation, while the NKA can be in several states at the same time. As an example of the work of the NCA, one can cite the idea of the American physicist Hugh Everett that any event splits the world into several worlds, in each of which this event ended in its own way. For example, in one world, Hitler won the Second World War, in another, Newton went into business instead of physics, and the discovery of the laws of classical mechanics had to be postponed for 50 years. In order to draw any conclusions from the operation of the automaton, all “worlds” should be studied. After the entire input chain has been read, we assume that the NFA accepts this chain if it has completed operation in an accepting state in at least one of the many “worlds”. Accordingly, the automaton rejects the chain if it terminates in a non-admitting state in each “world”. The DKA accepts the chain; this is obvious if, after reading the entire input chain, it finds itself in an accepting state.

In most cases, it is much easier to build an NKV than a DKA. But despite this, using NKA for modeling is not a good idea. Fortunately, for each NFA it is possible to construct a DFA that accepts the same input language. In this article we will not present an algorithm for constructing a DFA using an NFA, but will consider this algorithm based on the illustrative example below.

Constructing a minimal DFA using a regular expression

To begin with, here is a list of RT operations used in this article, in order of priority:

iteration (Kleene closure), using the symbol "*"
concatenation is specified using a space or an empty string (for example: ab)
joining using the "|" symbol

Let's look at an example where a regular expression is given:

Xy* (x | y*) | ab (x | y*) | (x | a*) (x | y*)

It is necessary to construct a minimal DFA using a regular expression and demonstrate the recognition of correct and incorrect chains.

To begin with, let’s simplify this RV, using the right-hand distributive law of concatenation with respect to union, we obtain the following RV:

(xy* | ab | (x | a*)) (x | y*)

Now we build an automaton based on this RV:

According to the rule for converting concatenation (we will not give the rules for converting PB into KA, since they are quite obvious), we obtain the following automaton:

According to the union transformation rule:

According to the concatenation transformation rule:

And at the end we apply the closure transformation rule and get εNKA. Here it is necessary to make a reservation that εNKA is an NKA that contains ε-transitions. In turn, an ε-transition is a transition in which the machine does not take into account the input chain or, in other words, a transition along an empty symbol.

We get rid of ε-transitions (“the asterisk” denotes the final states):

In this NFA, the states s3 and s5 are equivalent, since δ(s3, x) = δ(s5, x) = s1 and δ(s3, y) = δ(s5, y) = s5, s7. Rename the states s6 -> s5 and s7 -> s6:

We build a DKA according to the NKA:

In this DFA, states p1 and p5 are equivalent, since
δ(p1, x) = δ(p5, x) = p4 and δ(p1, y) = δ(p5, y) = p5. Rename the states p6 -> p5 and p7 -> p6:

This machine is a minimal DKA.

Let δ be a transition function, then we denote the extended transition function constructed from δ by δ’, and ω is the input chain.

Suppose the input is a chain ω = aaax, we expect that the machine will be in one of the accepting states.

δ’(p0, ε) = p0
δ’(p0, a) = δ(δ’(p0, ε), a) = δ(p0, a) = p3
δ’(p0, aa) = δ(δ’(p0, a), a) = δ(p3, a) = p5
δ’(p0, aaa) = δ(δ’(p0, aa), a) = δ(p5, a) = p5
δ’(p0, aaax) = δ(δ’(p0, aaa), x) = δ(p5, x) = p4

P4 is a valid final state, so the chain aaax is correct for this machine.

Now let's assume that ω = xyyb:

δ’(p0, ε) = p0
δ’(p0, x) = δ(δ’(p0, ε), x) = δ(p0, x) = p1
δ’(p0, xy) = δ(δ’(p0, x), y) = δ(p1, y) = p1
δ’(p0, xyy) = δ(δ’(p0, xy), y) = δ(p1, y) = p1
δ’(p0, xyyb) = δ(δ’(p0, xyy), b) = δ(p1, b) = ∅

Here we see that if we give the symbol b to the input of the automaton when it is in state p1, then this automaton will die, therefore the chain xyyb is incorrect.

P. S. This article discussed the algorithm for constructing a DFA using RT, but there are more convenient algorithms, in particular for programming, but this is a topic for another article...

Regular languages ​​and finite state machines. Construction of NFA using right-linear grammar

Regular languages and finite state machines. Construction of NFA using right-linear grammar