Welcome to

Understanding of MIP* = RE

Graphics of circles

Class P
Class BPP
Class IP
Class MIP
Class MIP*

Class P: Sorting

One of the more straightforward ways to understand the notion of computational complexity is via the traditional exercises on sorting.

How long does it take you to sort the library below from shortest to tallest book?

TRY AGAIN

MORE BOOKS

EXPLAIN

Let's formalize the library problem. There is a number of books, \(N\), which we need to reorder to produce a sorted list. If we are interested in the efficiency of the algorithm, we look at its asymptotic runtime - how much slower does our algorithm get as the list to be sorted grows. The runtime is described by big-O notation, which describes the worst-case behaviour of the alrgorithm.

In most cases, the speed of the algorithm depends on how large \(N\) is, and we are most interested in how much the algorithm slows down as \(N\) increases. For example, if the algorithm always takes \(N\) steps to finish, we say it runs in linear time, \(O(N)\) (it might be just checking if the list is sorted, for which it just goes through once and checks for each element that it's smaller than the next).

However, when sorting a list, algorithms need to go back and forth, reordering elements, which cannot be done in one fell swoop. So, many algorithms take multiple times the length of the list to sort it. Some of the most common algorithms for sorting can take \(O(N\log N)\) or even \(O(N^2)\) time. In the last case, as the list doubles in size, the time taken to sort it quadruples.

You can see how these two algorithms fare on the plot on the right. As they both take at most polynomial time in the size of the input to finish sorting, we say that sorting the books belongs to the \(P\) class of problems, our first complexity class.

Types of computational problems

On the previous page, we dealt with sorting. This is known as a function problem: The input is a list, and the output is the sorted list. Another common type of problem is an optimization problem: Ask what is the best possible output given some input and evaluation criteria. However, for us the focus are decision problems: reframing such that the output is a binary YES or NO.

We can often transform function and optimization problems into decision problems, although the complexity introduced on the previous page does not have to be kept in that transformation. For example, a sorting problem might be rephrased as "is this list sorted?". Obviously, this solves a slightly different problem, and is also way easier, just compare each pair of elements in turn and check their order.

There is a lot of theory already for all sorts of problems, but decision problems are the most examined. Technically, "class P" on the previous page refers only to the decision problem of "is this list sorted?" and since we can do this in linear time checking if a list is sorted would belong to an even smaller class.

Similarly, the rest of this website deals with other decision problems, to which the answers are either YES or NO, and complexity of these problems explores how long it takes to make this decision, in relation to the input size.

We can also define space complexity, which constrains the memory the algorithm needs to decide a problem instead of time.

FUNCTION PROBLEMS

The goal is to transform an input into an output, for example "Find a factor of X" or "Find all factors of X".

OPTIMIZATION PROBLEMS

The goal is to find the "best" output given an input and constraints, like "Find the shortest path through graph G".

our focus: DECISION PROBLEMS

The goal is to decide a YES/NO question, for example "Are these graphs G1, G2 isomorphic to each other?" or "Is P prime?"

Class BPP: Polynomial Identity Testing

Next up, we introduce another toy problem. Can you tell whether the following two polynomials are equal? How much harder does it get as you increase the highest degree, or number of variables (guessing doesn't count!)?

$$k^2 - l^2 \stackrel{?}{=} (k+l)(k-l)$$
What steps does the computer take?

PREVIOUS

TRUE

FALSE

NEXT

EXPLAIN

The algorithmic approach to find out whether two polynomials are equivalent is to expand all terms and then compare their coefficients. We can find some time savings by removing like terms as soon as they appear, without doing all addition, but that's rather coincidental. Overall, this brute-force approach is very inefficient, and scales as \({n+d \choose d}\) for number of variables \(n\) and degree \(d\).

However, with some leeway we can make solving this problem way more efficient. The first step is to transform it: Instead of asking whether two polynomials are equal, \(q_1\stackrel{?}=q_2\), we ask whether their difference is equal to the zero polynomial, \(q_1-q_2\stackrel{?}=0\). This is the same as asking whether a polynomial \(p=q_1-q_2\) is equal to \(0\).

One way to find out is to evaluate it at any point/vector \(x\): \(p(x)\). If the result is anything but \(0\), clearly the polynomial is non-zero. If the result is \(0\), the polynomial could very well be zero. We can try another point to verify... and as we increase the number of points, if all evaluate to zero, we become very confident that \(p = 0\), or by extension that the two polynomials \(q_1, q_2\) are equal.

With some smart choice of points to evaluate at and a surprisingly reasonable number of repetitions (in hundreds), we can achieve very high probability of a YES answer being a true YES. This makes the algorithm highly practical - for example Horner's method for polynomial evaluation runs in \(O(n^2)\), which you can see plotted on the graph.

The algorithm we have defined always returns a NO if the true answer is NO, and almost always returns a YES if the true answer is YES. It can do this in polynomial time. We call this class of problems \(RP\), for Randomized Polynomial time. If the roles of YES and NO were reversed, the class would be \(coRP\). If we allow either answer to be probabilistic with some constant confidence, the class putting that together is \(BPP\), bounded-error probabilistic polynomial time.

Solvers and verifiers

Up until now, the machines we were dealing with were solvers. Given an input, they worked through it and showed the process by which to determine the YES/NO answer. Another typical machine used in the theoretical models of decision problems is a verifier. Instead of solving the problem itself, a verifier receives a proof that the problem is solvable, and checks it in some given timeframe.

Take for example a partially filled sudoku grid. Is it solvable? To determine the answer, the computer can try all possible fills for each possible cell, and check if all the conditions are satisfied, but this will take a lot of time. For a 9x9 grid, with each cell having 9 options, the runtime will be \(O(9^{9^2})\). However, given a valid fill for the sudoku as proof, a verifier can check that it's valid in linear time. The proof is called a certificate.

Another similar situation is in the sorting example - a machine can take a long time to sort a list, but checking that it's sorted is a strictly linear operation - just check that all neighbouring pairs are in the right order.

The question remains how to generate such certificates. Since it's useful to work with them, the theory of computation has an all-powerful machine that can generate such a certificate instantly, called an oracle. The set-up is such that once the verifier receives the problem, it sends it over to the oracle, who responds with a certificate.

However, if it responded with no certificate for problems without a solution, it wouldn't be a very useful theoretical construct (the verifier wouldn't have to do any work, just trust the oracle). Instead the oracle is malicious, and always sends a certificate. The verifier must work through it to determine whether it's a valid proof.

To solidify the ideas a little, we impose two constraints on our system. If the true answer is YES, the verifier should accept the oracle's proof in a certain percentage of cases (e.g. 95%), and similarly for NO, the verifier should reject in a given percentage of cases (e.g. 95%). Similarly to the previous problem, we can amend this probability by making this a multi-round game, with the verifier asking more questions.

SOLVER

A machine that accepts a problem, determines whether it's answer is YES or NO in its runtime, and returns the answer.

VERIFIER

A machine that accepts a problem and a certificate, and in its runtime verifies whether the certificate provides a true answer to the problem.

ORACLE/PROVER

An all-knowing machine that solves any problem, and returns the YES/NO answer or a certificate for the answer.

Class IP: Graph Non-Isomorphism

Let's call our verifier Arthur and our oracle Merlin. You, as Arthur, are trying to determine whether two graphs G1 and G2 are not isomorphic. You know Merlin knows but he'll try to trick you. Your strategy is to secretly toss a coin to choose one of G1 and G2, shuffle the chosen graph around and send it to Merlin. If he can tell which graph it came from, they are (probably) not isomorphic. If he cannot tell, you're certain they are isomorphic!

G1
G2
Hey Merlin, is this a permutation of G1 or G2?

FLIP A COIN

PERMUTE GRAPH

SEND TO MERLIN

TRY AGAIN

EXPLAIN

How long can Merlin keep fooling you? Well, our two graphs are isomorphic. You are welcome to drag their nodes around and check this for yourself. If they are isomorphic, without knowing the result of your cointoss, Merlin can never figure out which graph was used to generate the permutation he received. He can only take a 50/50 guess, and that's why you never believe him straight away. But if he keeps guessing right, you eventually budge and admit that he cannot keep being lucky, and your graphs are indeed non-isomorphic.

On ther other hand, as soon as Merlin once tells you the wrong answer, you can be certain that he's trying to fool you! If the graphs were truly non-isomorphic, he must be able to tell which one you started your permutation from.

Verifying Merlin's answer takes constant time, but Arthur nonetheless spends time making the permutation in the first place. For the same reason, the complementary problem, deciding whether two graphs ARE isomorphic, is in \(NP\). Given a certificate, a description of the permutation, the verifier can check if its valid in polynomial time.

However, our problem of whether graphs AREN'T isomorphic, eludes such simple process, and that's why we have to solve it with Merlin and Arthur. Problems solved with Arthur starting by sending a question to Merlin, and then deciding based on his answer, are in class \(AM[k]\), with \(k\) describing how many "rounds" there have been in their conversation.

A more general class of problems, but using the same principle, is called Interactive Proofs, \(IP\).

Nonlocal games

As we move on, we will need to expand the interactive proof idea from our Arthur-Merlin argument. We remain in the verifier seat, but this time question two provers at once. This setup is called a multiprover interactive proof. The provers can setup a common strategy beforehand, but cannot communicate once the protocol starts. The notion is that this allows us to cross-check their answers and be more certain in our analysis.

We will only have 2 provers and a single round of interaction (1 question to each prover and their answer). This class is called \(MIP(2,1)\) (it also turns out that any larger number of provers or rounds can be reduced down to 2 provers and 1 round, but that's beyond our current discussion). This system can also be reformulated as a so-called Nonlocal game.

In this game, our two provers are players, and the verifier is a referee. The referee sends each player a question, and based on their answers decides whether they win. The players know the full rules of the game and can setup a strategy, but cannot communicate once the game starts. However, since the referee's questions are probabilistic (drawn from a known distribution), the players generally can't be sure that they always win.

Instead, their goal is to setup the best strategy possible that would win them the highest percentage of games. Their best strategy success rate is called the value of the game.

Finding the value of the game is an optimization problem; to turn it into a decision problem, we instead ask whether the game's value is for example \(\geq\frac{2}{3}\). This way, we can use our decision problem complexity classes to categorize a game. Let's see an example on the next page.

PROVER ⇿ PLAYER

An all-powerful machine, like Merlin, that provides information to the verifier.

VERIFIER ⇿ REFEREE

Our machine, like Arthur, whose complexity we are measuring. The referee questions the players, and based on their answers, "plays," determines whether they won.

NONLOCAL GAME VALUE

The probability that the players can win the game given their best strategy among all possible strategies.

Class MIP: CHSH game

Our two provers-players are Alice and Bob. Before playing, they can select a strategy to follow in the game. You are the verifier-referee. As the game starts, you send a random bit \(x\) to Alice and a random bit \(y\) to Bob. They respond according to their strategy. Alice and Bob win the game if \(x \wedge y = a \oplus b\). In words, they win if the sum of their answers modulo 2 equals the product of the inputs.

Alice's strategies

\(x\)

\(a\)

\(0\)

\(\rightarrow\)

\(0\)

\(1\)

\(\rightarrow\)

\(0\)

\(x\)

\(a\)

\(0\)

\(\rightarrow\)

\(0\)

\(1\)

\(\rightarrow\)

\(1\)

\(x\)

\(a\)

\(0\)

\(\rightarrow\)

\(1\)

\(1\)

\(\rightarrow\)

\(0\)

\(x\)

\(a\)

\(0\)

\(\rightarrow\)

\(1\)

\(1\)

\(\rightarrow\)

\(1\)

Referee

\(x\)

\(y\)

\(a\)

\(b\)

\(x \wedge y\)

\(a \oplus b\)

Bob's strategies

\(y\)

\(b\)

\(0\)

\(\rightarrow\)

\(0\)

\(1\)

\(\rightarrow\)

\(0\)

\(y\)

\(b\)

\(0\)

\(\rightarrow\)

\(0\)

\(1\)

\(\rightarrow\)

\(1\)

\(y\)

\(b\)

\(0\)

\(\rightarrow\)

\(1\)

\(1\)

\(\rightarrow\)

\(0\)

\(y\)

\(b\)

\(0\)

\(\rightarrow\)

\(1\)

\(1\)

\(\rightarrow\)

\(1\)

PLAY ONCE

KEEP PLAYING

EXPLAIN

You should see a set of simulations ran for a few selected strategies appear on the chart on the right. With a large number of games played, we can see statistical results as their win percentages stabilize. Some strategies lead to a very poor performance, winning just around a quarter of the games.

The best of the strategies shown, A4B4 is Alice and Bob completely ignoring the question and just answering \(1\) every time. Its win probability is 75%, and in fact this is the highest winning probability that Bob and Alice can achieve. You are welcome to try and find other strategies with the same win percentage, but you won't be able to beat it. You can even enumerate all 16 strategies (or accompany it by the "both-Alice-and-Bob-answer-randomly" strategy!), and evaluate their chances against the 4 question pairs.

Since 75% is the best Alice and Bob can do, we say that \(0.75\) is the classical value of the CHSH game.

Obviously, the example is fairly simplistic, but it illustrates the idea of multi-prover interactive proofs. In more complicated setups, the verifier is cross-checking both provers' answers. For example, think through how having two Merlins in the previous problem of graph non-isomorphism would allow you to draw your conclusion as soon as the two Merlins disagree!

Quantum Entanglement

We're finally arriving at the crux of our exploration. We now have two provers whose answers need to be coordinated, but who aren't allowed to communicate. One way they can improve their correlation is via quantum entaglement. There's limited space to explain here, so let's layout the basics and feel free to consult the linked resources for further explanations!

Until now, we've worked in the classical world, where bits are either \(0\) or \(1\). A qubit, on the other hand, can be in a superposition of \(\ket{0}\) and \(\ket{1}\), with a certain probability of us observing either of the two when we measure it. For example, a qubit could be in the state \(\ket{\psi} = \frac{\ket{0}}{\sqrt{2}} + \frac{\ket{1}}{\sqrt{2}}\). When measured, it has a 50% chance of showing a \(\ket{0}\). Refer to the linked sources to see how this is calculated from the state description!

If we have two qubits, we can describe their state at once with a so-called tensor product, e.g. \(\ket{\psi}_A \otimes \ket{\phi}_B\). For example, if \(\ket{\psi} = \frac{\ket{0} + \ket{1}}{\sqrt{2}}\) and \(\ket{\phi} = \ket{0}\), then \(\ket{\psi} \otimes \ket{\phi} = \frac{\ket{0} + \ket{1}}{\sqrt{2}} \otimes \ket{0} = \frac{\ket{00} + \ket{10}}{\sqrt{2}}\).

However, some states cannot be split into a tensor product. Take for example \(\ket{\psi} = \frac{\ket{00} + \ket{11}}{\sqrt{2}}\). No matter how hard you try, you won't find a tensor product to generate it. This is called an entangled state. If we calculate the measurement probabilities, we find out that there is a 50% chance of measuring \(\ket{00}\) and 50% chance of measuring \(\ket{11}\). If we give the two qubits to Alice and Bob while entangled, and Alice measures a \(\ket{0}\) state on her qubit, Bob's MUST be also \(\ket{0}\), because from the formula the probability of a \(\ket{01}\) state is 0%!

Finally, it's possible to measure in different ways than just to figure out zeros and ones. Depending on the axis of measurement, you can determine other directions of the qubit's spin. We measured along the \(\ket{0}, \ket{1}\) axis, but you could for example measure along the \(\ket{+} = \frac{\ket{0}}{\sqrt{2}} + \frac{\ket{1}}{\sqrt{2}}, \ket{-} = \frac{\ket{0}}{\sqrt{2}} - \frac{\ket{1}}{\sqrt{2}}\) axis. The only important thing in choosing the axis is to ensure that its two states are perpendicular (which you can check using their vector representations, or by reading the linked resources).

BIT

A unit of classical information, can be either \(0\) or \(1\).

QUBIT

A unit of quantum information, can be in some superposition of \(\ket{0}\) and \(\ket{1}\)

2-QUBIT STATE

A system of two qubits: \(\ket{\psi} = c_1\ket{00} + c_2\ket{01} + c_3\ket{10} + c_4\ket{11}\)

ENTANGLED STATE

A 2-qubit state that cannot be factored into a tensor product, like \(\ket{\psi} = \frac{\ket{00} + \ket{11}}{\sqrt{2}}\)

Class MIP*: CHSH game again

This time, Alice and Bob have prepared a special strategy for the same game: beforehand, they entangled a qubit pair \(\ket{\psi} = \frac{\ket{00} + \ket{11}}{\sqrt{2}}\) and each of them took one of the qubits. They also prepared their measurement axis in a specific way, which allows them to win as often as possible - play and watch how they do!

Alice's qubit

\(\ket{0_0}\)
\(\ket{1_1}\)
\(\ket{+_0}\)
\(\ket{-_1}\)

If I receive \(x=0\), I'll measure along the \(\ket{0}, \ket{1}\) axis, otherwise along the \(\ket{+}, \ket{-}\) axis.

Referee

\(a\)

\(b\)

\(x \wedge y\)

\(a \oplus b\)

Bob's qubit

\(\ket{a_0}\)
\(\ket{b_1}\)
\(\ket{A_0}\)
\(\ket{B_1}\)

If I receive \(x=0\), I'll measure along the \(\ket{a}, \ket{b}\) axis, otherwise along the \(\ket{A}, \ket{B}\) axis.

PLAY ONCE

KEEP PLAYING

EXPLAIN

Alice is using simple axes to measure her qubit. If she receives \(x=0\), she'll measure along the \(\ket{0}, \ket{1}\) axis. On the other hand if she receives \(x=1\), she'll use the \(\ket{+} = \frac{\ket{0}}{\sqrt{2}} + \frac{\ket{1}}{\sqrt{2}}, \ket{-} = \frac{\ket{0}}{\sqrt{2}} - \frac{\ket{1}}{\sqrt{2}}\) axis. In either case, she'll respond with a \(0\) or \(1\), respectively, depending on the outcome of her measurement. Bob is using more complicaed axes pairs. For a question \(y=0\), he'll use the \(\ket{a} = \cos{\frac{\pi}{8}}\ket{0} + \sin{\frac{\pi}{8}}\ket{1}, \ket{b} = -\sin{\frac{\pi}{8}}\ket{0} + \cos{\frac{\pi}{8}}\ket{1}\) axis, responding with \(0\) if he measures \(\ket{a}\) and \(1\) if he measures \(\ket{b}\). Similarly, if he is asked \(y=1\), he will measure along \(\ket{A} = \cos{\frac{\pi}{8}}\ket{0} - \sin{\frac{\pi}{8}}\ket{1}, \ket{B} = \sin{\frac{\pi}{8}}\ket{0} + \cos{\frac{\pi}{8}}\ket{1}\) axis, again adequately responding \(0\) or \(1\).

Look at the experimental results chart and compare the quantum strategy to some of the classical strategies Alice and Bob employed previously. They are doing much better now! With a single quantum-entangled qubit and some smart measuring axis, they upped the value of the game to 85%.

Let's translate the nonlocal games language back into a complexity class now. We say that a language belongs to the complexity class MIP* if you can translate it into a nonlocal game G, with quantum value over 2/3 for all strings belonging to the language and below 1/3 for all strings not belonging to the language.

We know that MIP* contains MIP, because you can always choose the "quantum" strategy of ignoring your qubits and going with the best classical strategy. We have also seen an example of a quantum strategy doing better than a classical strategy, suggesting that there are problems for which a quantum strategy is enough where a classical one cannot do. (In reality both of these statements come with more details, also including the lower bound).

Determining whether a string belongs to a langauge is then a question of finding the quantum value of game G, which is done by approximation, hopefully to a convergent value.

Continue to the last page to see the relationship between MIP* and RE!

MIP* = RE

The previous exercises established the MIP* class, but it's also important to talk about the RE class. RE stands for recursively enumerable, the class of languages whose 'yes' instances a Turing machine can list one by one. In a less cryptic way, it can also be defined as language to which a Turing machine can verify the 'yes' problems, but might run indefinitely for 'no' problems.

A famous example of a language belonging to RE is the Halting Problem, asking whether a given program will stop running; you can run it, and say 'yes' if it stops, but it might just keep running forever if the true answer is no. Because of an ensuing paradox, it is undecidable whether a program will halt, and a reduction to this problem is often used to show other problems are undecidable. A reduction is a statement of the type "if you could solve this problem, you could also solve the halting problem, and that's impossible."

This is exactly the type of structure Ji et al. used to show that it's also undecidable whether a string belongs to MIP*. Their idea translates the question of whether a program halts into a quantum nonlocal game - if the game has quantum value 1, the machine halts, and if it has value less than 1/2, it doesn't halt.

If there was a way to calculate (or at least approximate) the quantum value of a game, to figure out if a string belongs to MIP*, Ji et al.'s procedure would allow us to use this algorithm and figure out the Halting problem - which is impossible! So there cannot be a solver for MIP*, which must thus be undecidable and equal to RE.

Class P

Solvable by a Turing machine in polynomial time

Class BPP

Solvable with an error bound by a Turing machine in polynomial time

Class (M)IP

Verifiable with an error bound by a Turing machine in a (multiprover) interactive proof protocol

Class MIP*

Verifiable with an error bound by a Turing machine against multiple provers sharing quantum entanglement