From Turing Machines to the Halting Problem

Tue Apr 01 2025

Computability Theory
cover

Why Turing Machines?

If you know a bit about Turing machines, you might wonder: in an era where computing devices have gone through countless iterations, the Turing machine seems like an awkward and inefficient model. Any modern device appears far more elegant and fast. So, is it still necessary to understand the Turing machine? What can it possibly offer us today?

For the first question, the answer is: it depends. If you’re not interested in computability or complexity theory and your focus is purely on applications, then you might not need to study Turing machines at all.

For the second question, the Turing machine offers a formal mathematical model for rigorous proofs — and it’s surprisingly powerful:

  • The Church–Turing thesis tells us that any computable problem can be solved by a Turing machine. This means anything your computer can do—or anything you can program in Python, Haskell, or C—can also be done by a Turing machine. In terms of computational power, they are equivalent.

  • If an algorithm runs in polynomial time on a Turing machine, it will also run in polynomial time on a RAM model—that is, on real-world devices like smartphones and computers.

Any function that can be computed algorithmically can also be computed by a Turing machine.
— Church–Turing Thesis

Therefore, if we prove that a problem cannot be solved by a Turing machine, then no programming language, RAM model, or even any other computing model can solve it.

If we prove an algorithm’s complexity is polynomial on the Turing machine model, it will belong to the same complexity class on any programming language or RAM-based device (though the actual runtime may differ). The same applies in reverse.

One of the most famous problems that a Turing machine cannot solve is the Halting Problem. By the end of this blog, we’ll see why.

Definition of a Turing Machine

The idea of a Turing machine was first introduced by Alan Turing in 1936. A computable problem is written on an input tape, which extends infinitely to the right. The machine starts in an initial state with its read/write head pointing to the first symbol of the input. Based on a set of rules, the current symbol under the head, and its state, the Turing machine can:

  • Update the symbol at the current position
  • Change its state
  • Move the head left or right

When the computation finishes, the result remains on the tape.

TM1 Arrow TM2

Formally, a Turing machine is defined as a 5-tuple Σ,Q,q0,H,δ\langle \Sigma, Q, q_0, H, \delta \rangle:

  • Σ\Sigma is a finite alphabet of symbols, including the blank symbol \sqcup.
  • QQ is a finite set of states.
  • q0Qq_0 \in Q is the initial state.
  • HQH \subseteq Q is the set of halting states.
  • δ\delta is the transition function.

Σ\Sigma is used to represent the input on the tape—it can be as simple as {0,1,}\{0, 1, \sqcup\}, like a binary computer, or any alphabet that is easy to work with.
QQ defines the machine’s states, q0q_0 tells it where to start, and HH defines the states where the machine should stop immediately.
δ\delta encodes the rules—the algorithm for solving the problem—and is usually written as:

δ:(QH)×ΣQ×Σ×{,}\delta : (Q \setminus H) \times \Sigma \rightarrow Q \times \Sigma \times \{\rightarrow, \leftarrow\}

Here, (QH)(Q \setminus H) is the current state, and Σ\Sigma is the current symbol under the head. Based on these, δ\delta determines the next state, the new symbol to write, and the direction to move the head.

Note: Besides the 5-tuple definition, some conventions of Turing machines may vary—for example, whether the tape is infinite on both sides, or whether the head starts on the first input symbol or just before it. These conventions are ultimately reflected in the definitions of QQ and δ\delta.

Example

Let’s define a Turing machine that inverts all the 0s and 1s in the input. For example, given 11001, the output will be 00110. Its 5-tuple is defined as:

Σ={0,1,},Q={q0,qH},q0=q0,H={qH}\Sigma = \{0, 1, \sqcup\}, \quad Q = \{q_0, q_H\}, \quad q_0 = q_0, \quad H = \{q_H\}

The transition function is defined as:

δ(q0,0)=(q0,1,)δ(q0,1)=(q0,0,)δ(q0,)=(qH,,)\begin{aligned} \delta(q_0, 0) &= (q_0, 1, \rightarrow) \\ \delta(q_0, 1) &= (q_0, 0, \rightarrow) \\ \delta(q_0, \sqcup) &= (q_H, \sqcup, \rightarrow) \end{aligned}

For the input 10011, the Turing machine runs on the tape as follows:

EX1 Arrow EX2 Arrow EX3 Arrow EX4 Arrow EX5 Arrow EX6 Arrow EX7

You may have noticed that for linear problems, a Turing machine can solve them quickly. But for nonlinear problems, the solutions are often less intuitive. Here, we only need to convince ourselves that any problem a computer can solve can also be solved by a specific Turing machine.

Decision Problem

A decision problem is a type of problem with a YES or NO answer. For example, given a positive integer, determine whether it is even; or given a Turing machine and an input, determine whether the machine halts on that input. For such problems, the Turing machine does not need to write the answer on the tape. Instead, it uses two special halting states, qacceptq_{accept} and qrejectq_{reject}, to represent acceptance and rejection. (This can easily be converted into writing the answer on the tape if needed.)

DTM

To formally define a decision problem, we consider that for any input string, there are two cases:

  • The string is a YES instance.
  • The string is a NO instance.

We define the set of all YES-instance strings as the language LL (languages are just subsets of Σ\Sigma^*). A Turing machine that solves this decision problem acts as a decision function: when given input xLx \in L, it halts in state qacceptq_{accept}; otherwise, it halts in state qrejectq_{reject}.

For example, consider the problem of determining whether a number is even, and the Turing machine M M that solves it.

The alphabet is {0,1,,9,} \{0, 1, \dots, 9, \sqcup\}, and the language L={strings representing even numbers} L = \{\text{strings representing even numbers}\}. Then, for input x=12L x = 12 \in L, M(x) M(x) halts in state qaccept q_{accept}. For input x=7L x = 7 \notin L, M(x) M(x) halts in state qreject q_{reject}.

Encoding

At this point, we have a basic understanding of how a Turing machine works. However, some parts are still lacking a formal definition—for example, the input xx. For the same problem and data, the encoding is not unique.

Take the parity-checking problem for positive integers as an example. If the alphabet is {0,1,,9,}\{0, 1, \dots, 9, \sqcup\}, the input can be a string of decimal digits. Alternatively, we could represent the number in binary using only {0,1,}\{0, 1, \sqcup\}. Even with the alphabet fixed to {0,1,}\{0, 1, \sqcup\}, encoding is still not unique—for example, we could encode a number by repeating the symbol 11: 22 becomes 11, and 55 becomes 11111.

Therefore, we need to define what makes an encoding valid or invalid. For a problem and data instance α\alpha, a valid encoding system should satisfy:

  1. If αβ\alpha \ne \beta, then code(α)code(β)\text{code}(\alpha) \ne \text{code}(\beta).
  2. We should be able to determine whether a string xΣx \in \Sigma^* is a valid encoding of some α\alpha.
  3. It must be possible to recover the original α\alpha from code(α)\text{code}(\alpha).

For point 2, recognizability helps us check whether the input is a valid instance of the problem. For example, in the positive integer parity-checking problem over the alphabet {0,1,,9,}\{0, 1, \dots, 9, \sqcup\}, the string “0” is not a valid encoding, since 0 is not a positive integer. Therefore, it is not the encoding of any valid data α\alpha. Other examples of invalid encodings might include 00340034 or 353\sqcup 5.

In decision problems, inputs that are invalid or do not belong to the problem domain—along with the NO instances—together form the set of inputs that the Turing machine should reject.

Decidability and Recognizability

Before we can decide whether a problem can be solved by a Turing machine, we must first define what it means for a problem to be solved by a Turing machine.

For decision problems encoding as language LL, we require that the Turing machine MM produces the correct answer in a finite number of steps. That is, if the data α\alpha is a YES-instance, the input x=code(α)Lx=\text{code}(\alpha)\in L, then M(x)M(x) halts in qacceptq_{accept} within a finite number of steps. If α\alpha is a NO-instance, the input x=code(α)Lx=\text{code}(\alpha)\notin L, or xx not a valid encoding, then M(x)M(x) halts in qrejectq_{reject} within a finite number of steps. At this point, we say that MM decides LL.

If there exists a Turing machine MM that decides a language LL, then the language and its corresponding decision problem are called decidable.

If no Turing machine MM exists that decides a language LL, then the language and its corresponding decision problem are called undecidable or unsolvable.

Among these, some problems are still recognizable—meaning there exists a Turing machine that halts on all YES-instances, but dose not halt on NO-instances.

Clearly, the Halting Problem is a recognizable problem. We can simply run the given Turing machine on the given input (see following section): if the machine halts, we also halts. If it does not halt, we make no decision.

Universal Turing Machine

So far, the Turing machines we’ve introduced are designed to solve specific tasks. This is quite different from modern computers, which allow us to run task-specific programs directly, rather than building a separate “machine” for each task. This leads to the concept of the Universal Turing Machine (UTM).

A UTM takes as input a encoding of a Turing machine MM and an input xx, then simulates the behavior of M(x)M(x) and produces the same result.

A UTM does exist, and can be constructed in various ways. One possible structure is as follows:

  • The input to the UTM is code(M)code(x)\text{code}(M)\text{code}(x), where code(M)\text{code}(M) includes all of MM‘s information and its transition function.
  • The UTM checks whether code(M)\text{code}(M) and code(x)\text{code}(x) are valid. If not, it enters an infinite loop.
  • The UTM then simulates M(x)M(x): if M(x)M(x) halts and produces an output zz, the UTM does the same; if M(x)M(x) loops forever, so does the UTM.

The Undecidability of the Halting Problem

The Halting Problem is a decision problem: given a Turing machine MM and an input xx, determine whether MM halts on xx. The corresponding language can be defined as:

HALT={y,xΣ×Σy=code(M) and M halts on x}\text{HALT} = \{ \langle y, x \rangle \in \Sigma^* \times \Sigma^* \mid y = \text{code}(M) \text{ and } M \text{ halts on } x \}

We already know that HALT\text{HALT} is recognizable. Now, we will prove that it is undecidable.

We use proof by contradiction: assume there exists a Turing machine MHM_H that decides HALT\text{HALT}.

Now we construct a Turing machine MM' that takes an input zΣz \in \Sigma^*, then runs MHM_H on input z,z\langle z, z \rangle.

  • If MHM_H accepts z,z\langle z, z \rangle—meaning that running the decoded Turing machine decode(z)\text{decode}(z) on input zz halts—then MM' enters an infinite loop.
  • If MHM_H rejects z,z\langle z, z \rangle—meaning that decode(z)\text{decode}(z) on input zz does not halt or the encoding is invalid—then MM' accepts.
DTM

Now consider feeding code(M)\text{code}(M') as input to MM'. The execution flow of MM' is as follows:

  1. MM' runs MHM_H on input code(M),code(M)\langle \text{code}(M'), \text{code}(M') \rangle.
  2. Then MHM_H decides whether MM' halts on input code(M)\text{code}(M'). If MM' halts on input code(M)\text{code}(M'), then MHM_H accepts; otherwise, it rejects.
  3. If MHM_H accepts code(M),code(M)\langle \text{code}(M'), \text{code}(M') \rangle, then MM' enters an infinite loop;
    if MHM_H rejects code(M),code(M)\langle \text{code}(M'), \text{code}(M') \rangle, then MM' accepts code(M)\text{code}(M').

However:

  • MHM_H accepting code(M),code(M)\langle \text{code}(M'), \text{code}(M') \rangle implies that MM' halts on input code(M)\text{code}(M'),
    but in fact MM' enters an infinite loop on that case.
  • MHM_H rejecting code(M),code(M)\langle \text{code}(M'), \text{code}(M') \rangle implies that MM' does not halt on input code(M)\text{code}(M'),
    but in fact MM' halts.

This contradiction shows that no Turing machine MHM_H can decide the language HALT\text{HALT}.

The Unrecognizability of the Complement of HALT

The language HALT is recognizable. As a direct consequence, its complement, HALT\text{HALT}^-, is not recognizable.

HALT={y,xΣ×Σycode(M) for any M or y=code(M) and M does not halt on x}\text{HALT}^- = \{ \langle y, x \rangle \in \Sigma^* \times \Sigma^* \mid y \ne \text{code}(M) \text{ for any } M \text{ or } y = \text{code}(M) \text{ and } M \text{ does not halt on } x \}

We use proof by contradiction: suppose there exists a Turing machine MHCM_{HC} that recognizes HALT\text{HALT}^-. Together with a machine MHM_H that recognizes HALT\text{HALT}, we can construct the following machine MM decides HALT\text{HALT}:

HTM

The input xx is given to both MHM_H and MHCM_{HC}, which run in parallel. The simplest method is to let each machine execute one step at a time, alternating between them. Since MHM_H and MHCM_{HC} recognize HALT\text{HALT} and HALT\text{HALT}^- respectively, for any y,xHALT\langle y, x \rangle \in \text{HALT}, MHM_H will halt and accept; otherwise, MHCM_{HC} will halt and accept.

This proof idea applies to any recognizable language—in fact, the complement of any recognizable language is not recognizable.

Conclusion

We have now proven that the Halting Problem is undecidable, and that no known real-world computational model can solve it. But how can we deal with it—or at least work around it?

The answer is to place restrictions on the Halting Problem. For example: does Turing machine MM halt on input xx within 100 steps? This version is clearly decidable.


Acknowledgments
This blog references material from the slides of COMP0017 Computability and Complexity Theory at University College London.