Welcome to

Understanding of MIP* = RE

Graphics of circles

Class P
Class BPP
Class IP
Class MIP
Class MIP*

Class P: Sorting

One of the more straightforward ways to understand the notion of computational complexity is via the traditional exercises on sorting.

How long does it take you to sort the library below from shortest to tallest book?

  • Pedro Páramo
  • The Masnavi
  • The Man Without Qualities
  • Beloved
  • Sentimental Education
  • TRY AGAIN

    00:00:00

    MORE BOOKS

    02468100246810
    number of books
    time to sort (seconds)

    EXPLAIN

    Let's formalize the library problem. There is a number of books, NN, which we need to reorder to produce a sorted list. If we are interested in the efficiency of the algorithm, we look at its asymptotic runtime - how much slower does our algorithm get as the list to be sorted grows. The runtime is described by big-O notation, which describes the worst-case behaviour of the alrgorithm.

    In most cases, the speed of the algorithm depends on how large NN is, and we are most interested in how much the algorithm slows down as NN increases. For example, if the algorithm always takes NN steps to finish, we say it runs in linear time, O(N)O(N) (it might be just checking if the list is sorted, for which it just goes through once and checks for each element that it's smaller than the next).

    However, when sorting a list, algorithms need to go back and forth, reordering elements, which cannot be done in one fell swoop. So, many algorithms take multiple times the length of the list to sort it. Some of the most common algorithms for sorting can take O(NlogN)O(N\log N) or even O(N2)O(N^2) time. In the last case, as the list doubles in size, the time taken to sort it quadruples.

    You can see how these two algorithms fare on the plot on the right. As they both take at most polynomial time in the size of the input to finish sorting, we say that sorting the books belongs to the PP class of problems, our first complexity class.

    Types of computational problems

    On the previous page, we dealt with sorting. This is known as a function problem: The input is a list, and the output is the sorted list. Another common type of problem is an optimization problem: Ask what is the best possible output given some input and evaluation criteria. However, for us the focus are decision problems: reframing such that the output is a binary YES or NO.

    We can often transform function and optimization problems into decision problems, although the complexity introduced on the previous page does not have to be kept in that transformation. For example, a sorting problem might be rephrased as "is this list sorted?". Obviously, this solves a slightly different problem, and is also way easier, just compare each pair of elements in turn and check their order.

    There is a lot of theory already for all sorts of problems, but decision problems are the most examined. Technically, "class P" on the previous page refers only to the decision problem of "is this list sorted?" and since we can do this in linear time checking if a list is sorted would belong to an even smaller class.

    Similarly, the rest of this website deals with other decision problems, to which the answers are either YES or NO, and complexity of these problems explores how long it takes to make this decision, in relation to the input size.

    We can also define space complexity, which constrains the memory the algorithm needs to decide a problem instead of time.

    FUNCTION PROBLEMS

    The goal is to transform an input into an output, for example "Find a factor of X" or "Find all factors of X".

    OPTIMIZATION PROBLEMS

    The goal is to find the "best" output given an input and constraints, like "Find the shortest path through graph G".

    our focus: DECISION PROBLEMS

    The goal is to decide a YES/NO question, for example "Are these graphs G1, G2 isomorphic to each other?" or "Is P prime?"

    Class BPP: Polynomial Identity Testing

    Next up, we introduce another toy problem. Can you tell whether the following two polynomials are equal? How much harder does it get as you increase the highest degree, or number of variables (guessing doesn't count!)?

    k2l2=?(k+l)(kl)k^2 - l^2 \stackrel{?}{=} (k+l)(k-l)
    What steps does the computer take?

    PREVIOUS

    TRUE

    FALSE

    NEXT

    024681005201015
    ~ highest degree + n. variables
    ~ steps to decide (FLOPS)

    EXPLAIN

    The algorithmic approach to find out whether two polynomials are equivalent is to expand all terms and then compare their coefficients. We can find some time savings by removing like terms as soon as they appear, without doing all addition, but that's rather coincidental. Overall, this brute-force approach is very inefficient, and scales as (n+dd){n+d \choose d} for number of variables nn and degree dd.

    However, with some leeway we can make solving this problem way more efficient. The first step is to transform it: Instead of asking whether two polynomials are equal, q1=?q2q_1\stackrel{?}=q_2, we ask whether their difference is equal to the zero polynomial, q1q2=?0q_1-q_2\stackrel{?}=0. This is the same as asking whether a polynomial p=q1q2p=q_1-q_2 is equal to 00.

    One way to find out is to evaluate it at any point/vector xx: p(x)p(x). If the result is anything but 00, clearly the polynomial is non-zero. If the result is 00, the polynomial could very well be zero. We can try another point to verify... and as we increase the number of points, if all evaluate to zero, we become very confident that p=0p = 0, or by extension that the two polynomials q1,q2q_1, q_2 are equal.

    With some smart choice of points to evaluate at and a surprisingly reasonable number of repetitions (in hundreds), we can achieve very high probability of a YES answer being a true YES. This makes the algorithm highly practical - for example Horner's method for polynomial evaluation runs in O(n2)O(n^2), which you can see plotted on the graph.

    The algorithm we have defined always returns a NO if the true answer is NO, and almost always returns a YES if the true answer is YES. It can do this in polynomial time. We call this class of problems RPRP, for Randomized Polynomial time. If the roles of YES and NO were reversed, the class would be coRPcoRP. If we allow either answer to be probabilistic with some constant confidence, the class putting that together is BPPBPP, bounded-error probabilistic polynomial time.

    Solvers and verifiers

    Up until now, the machines we were dealing with were solvers. Given an input, they worked through it and showed the process by which to determine the YES/NO answer. Another typical machine used in the theoretical models of decision problems is a verifier. Instead of solving the problem itself, a verifier receives a proof that the problem is solvable, and checks it in some given timeframe.

    Take for example a partially filled sudoku grid. Is it solvable? To determine the answer, the computer can try all possible fills for each possible cell, and check if all the conditions are satisfied, but this will take a lot of time. For a 9x9 grid, with each cell having 9 options, the runtime will be O(992)O(9^{9^2}). However, given a valid fill for the sudoku as proof, a verifier can check that it's valid in linear time. The proof is called a certificate.

    Another similar situation is in the sorting example - a machine can take a long time to sort a list, but checking that it's sorted is a strictly linear operation - just check that all neighbouring pairs are in the right order.

    The question remains how to generate such certificates. Since it's useful to work with them, the theory of computation has an all-powerful machine that can generate such a certificate instantly, called an oracle. The set-up is such that once the verifier receives the problem, it sends it over to the oracle, who responds with a certificate.

    However, if it responded with no certificate for problems without a solution, it wouldn't be a very useful theoretical construct (the verifier wouldn't have to do any work, just trust the oracle). Instead the oracle is malicious, and always sends a certificate. The verifier must work through it to determine whether it's a valid proof.

    To solidify the ideas a little, we impose two constraints on our system. If the true answer is YES, the verifier should accept the oracle's proof in a certain percentage of cases (e.g. 95%), and similarly for NO, the verifier should reject in a given percentage of cases (e.g. 95%). Similarly to the previous problem, we can amend this probability by making this a multi-round game, with the verifier asking more questions.

    SOLVER

    A machine that accepts a problem, determines whether it's answer is YES or NO in its runtime, and returns the answer.

    VERIFIER

    A machine that accepts a problem and a certificate, and in its runtime verifies whether the certificate provides a true answer to the problem.

    ORACLE/PROVER

    An all-knowing machine that solves any problem, and returns the YES/NO answer or a certificate for the answer.

    Class IP: Graph Non-Isomorphism

    Let's call our verifier Arthur and our oracle Merlin. You, as Arthur, are trying to determine whether two graphs G1 and G2 are not isomorphic. You know Merlin knows but he'll try to trick you. Your strategy is to secretly toss a coin to choose one of G1 and G2, shuffle the chosen graph around and send it to Merlin. If he can tell which graph it came from, they are (probably) not isomorphic. If he cannot tell, you're certain they are isomorphic!

    G1
    G2
    Hey Merlin, is this a permutation of G1 or G2?

    FLIP A COIN

    PERMUTE GRAPH

    SEND TO MERLIN

    TRY AGAIN

    0512340.01.00.20.40.60.8
    number of rounds
    your certainty of non-isomorphism

    EXPLAIN

    How long can Merlin keep fooling you? Well, our two graphs are isomorphic. You are welcome to drag their nodes around and check this for yourself. If they are isomorphic, without knowing the result of your cointoss, Merlin can never figure out which graph was used to generate the permutation he received. He can only take a 50/50 guess, and that's why you never believe him straight away. But if he keeps guessing right, you eventually budge and admit that he cannot keep being lucky, and your graphs are indeed non-isomorphic.

    On ther other hand, as soon as Merlin once tells you the wrong answer, you can be certain that he's trying to fool you! If the graphs were truly non-isomorphic, he must be able to tell which one you started your permutation from.

    Verifying Merlin's answer takes constant time, but Arthur nonetheless spends time making the permutation in the first place. For the same reason, the complementary problem, deciding whether two graphs ARE isomorphic, is in NPNP. Given a certificate, a description of the permutation, the verifier can check if its valid in polynomial time.

    However, our problem of whether graphs AREN'T isomorphic, eludes such simple process, and that's why we have to solve it with Merlin and Arthur. Problems solved with Arthur starting by sending a question to Merlin, and then deciding based on his answer, are in class AM[k]AM[k], with kk describing how many "rounds" there have been in their conversation.

    A more general class of problems, but using the same principle, is called Interactive Proofs, IPIP.

    Nonlocal games

    As we move on, we will need to expand the interactive proof idea from our Arthur-Merlin argument. We remain in the verifier seat, but this time question two provers at once. This setup is called a multiprover interactive proof. The provers can setup a common strategy beforehand, but cannot communicate once the protocol starts. The notion is that this allows us to cross-check their answers and be more certain in our analysis.

    We will only have 2 provers and a single round of interaction (1 question to each prover and their answer). This class is called MIP(2,1)MIP(2,1) (it also turns out that any larger number of provers or rounds can be reduced down to 2 provers and 1 round, but that's beyond our current discussion). This system can also be reformulated as a so-called Nonlocal game.

    In this game, our two provers are players, and the verifier is a referee. The referee sends each player a question, and based on their answers decides whether they win. The players know the full rules of the game and can setup a strategy, but cannot communicate once the game starts. However, since the referee's questions are probabilistic (drawn from a known distribution), the players generally can't be sure that they always win.

    Instead, their goal is to setup the best strategy possible that would win them the highest percentage of games. Their best strategy success rate is called the value of the game.

    Finding the value of the game is an optimization problem; to turn it into a decision problem, we instead ask whether the game's value is for example 23\geq\frac{2}{3}. This way, we can use our decision problem complexity classes to categorize a game. Let's see an example on the next page.

    PROVER ⇿ PLAYER

    An all-powerful machine, like Merlin, that provides information to the verifier.

    VERIFIER ⇿ REFEREE

    Our machine, like Arthur, whose complexity we are measuring. The referee questions the players, and based on their answers, "plays," determines whether they won.

    NONLOCAL GAME VALUE

    The probability that the players can win the game given their best strategy among all possible strategies.

    Class MIP: CHSH game

    Our two provers-players are Alice and Bob. Before playing, they can select a strategy to follow in the game. You are the verifier-referee. As the game starts, you send a random bit xx to Alice and a random bit yy to Bob. They respond according to their strategy. Alice and Bob win the game if xy=abx \wedge y = a \oplus b. In words, they win if the sum of their answers modulo 2 equals the product of the inputs.

    Alice's strategies

    xx

    aa

    00

    \rightarrow

    00

    11

    \rightarrow

    00

    xx

    aa

    00

    \rightarrow

    00

    11

    \rightarrow

    11

    xx

    aa

    00

    \rightarrow

    11

    11

    \rightarrow

    00

    xx

    aa

    00

    \rightarrow

    11

    11

    \rightarrow

    11

    Referee

    xx

    yy

    aa

    bb

    xyx \wedge y

    aba \oplus b

    Bob's strategies

    yy

    bb

    00

    \rightarrow

    00

    11

    \rightarrow

    00

    yy

    bb

    00

    \rightarrow

    00

    11

    \rightarrow

    11

    yy

    bb

    00

    \rightarrow

    11

    11

    \rightarrow

    00

    yy

    bb

    00

    \rightarrow

    11

    11

    \rightarrow

    11

    PLAY ONCE

    KEEP PLAYING

    02468100.01.00.20.40.60.8
    number of games
    percentage of games won

    EXPLAIN

    You should see a set of simulations ran for a few selected strategies appear on the chart on the right. With a large number of games played, we can see statistical results as their win percentages stabilize. Some strategies lead to a very poor performance, winning just around a quarter of the games.

    The best of the strategies shown, A4B4 is Alice and Bob completely ignoring the question and just answering 11 every time. Its win probability is 75%, and in fact this is the highest winning probability that Bob and Alice can achieve. You are welcome to try and find other strategies with the same win percentage, but you won't be able to beat it. You can even enumerate all 16 strategies (or accompany it by the "both-Alice-and-Bob-answer-randomly" strategy!), and evaluate their chances against the 4 question pairs.

    Since 75% is the best Alice and Bob can do, we say that 0.750.75 is the classical value of the CHSH game.

    Obviously, the example is fairly simplistic, but it illustrates the idea of multi-prover interactive proofs. In more complicated setups, the verifier is cross-checking both provers' answers. For example, think through how having two Merlins in the previous problem of graph non-isomorphism would allow you to draw your conclusion as soon as the two Merlins disagree!

    Quantum Entanglement

    We're finally arriving at the crux of our exploration. We now have two provers whose answers need to be coordinated, but who aren't allowed to communicate. One way they can improve their correlation is via quantum entaglement. There's limited space to explain here, so let's layout the basics and feel free to consult the linked resources for further explanations!

    Until now, we've worked in the classical world, where bits are either 00 or 11. A qubit, on the other hand, can be in a superposition of 0\ket{0} and 1\ket{1}, with a certain probability of us observing either of the two when we measure it. For example, a qubit could be in the state ψ=02+12\ket{\psi} = \frac{\ket{0}}{\sqrt{2}} + \frac{\ket{1}}{\sqrt{2}}. When measured, it has a 50% chance of showing a 0\ket{0}. Refer to the linked sources to see how this is calculated from the state description!

    If we have two qubits, we can describe their state at once with a so-called tensor product, e.g. ψAϕB\ket{\psi}_A \otimes \ket{\phi}_B. For example, if ψ=0+12\ket{\psi} = \frac{\ket{0} + \ket{1}}{\sqrt{2}} and ϕ=0\ket{\phi} = \ket{0}, then ψϕ=0+120=00+102\ket{\psi} \otimes \ket{\phi} = \frac{\ket{0} + \ket{1}}{\sqrt{2}} \otimes \ket{0} = \frac{\ket{00} + \ket{10}}{\sqrt{2}}.

    However, some states cannot be split into a tensor product. Take for example ψ=00+112\ket{\psi} = \frac{\ket{00} + \ket{11}}{\sqrt{2}}. No matter how hard you try, you won't find a tensor product to generate it. This is called an entangled state. If we calculate the measurement probabilities, we find out that there is a 50% chance of measuring 00\ket{00} and 50% chance of measuring 11\ket{11}. If we give the two qubits to Alice and Bob while entangled, and Alice measures a 0\ket{0} state on her qubit, Bob's MUST be also 0\ket{0}, because from the formula the probability of a 01\ket{01} state is 0%!

    Finally, it's possible to measure in different ways than just to figure out zeros and ones. Depending on the axis of measurement, you can determine other directions of the qubit's spin. We measured along the 0,1\ket{0}, \ket{1} axis, but you could for example measure along the +=02+12,=0212\ket{+} = \frac{\ket{0}}{\sqrt{2}} + \frac{\ket{1}}{\sqrt{2}}, \ket{-} = \frac{\ket{0}}{\sqrt{2}} - \frac{\ket{1}}{\sqrt{2}} axis. The only important thing in choosing the axis is to ensure that its two states are perpendicular (which you can check using their vector representations, or by reading the linked resources).

    BIT

    A unit of classical information, can be either 00 or 11.

    QUBIT

    A unit of quantum information, can be in some superposition of 0\ket{0} and 1\ket{1}

    2-QUBIT STATE

    A system of two qubits: ψ=c100+c201+c310+c411\ket{\psi} = c_1\ket{00} + c_2\ket{01} + c_3\ket{10} + c_4\ket{11}

    ENTANGLED STATE

    A 2-qubit state that cannot be factored into a tensor product, like ψ=00+112\ket{\psi} = \frac{\ket{00} + \ket{11}}{\sqrt{2}}

    Class MIP*: CHSH game again

    This time, Alice and Bob have prepared a special strategy for the same game: beforehand, they entangled a qubit pair ψ=00+112\ket{\psi} = \frac{\ket{00} + \ket{11}}{\sqrt{2}} and each of them took one of the qubits. They also prepared their measurement axis in a specific way, which allows them to win as often as possible - play and watch how they do!

    Alice's qubit

    00\ket{0_0}
    11\ket{1_1}
    +0\ket{+_0}
    1\ket{-_1}

    If I receive x=0x=0, I'll measure along the 0,1\ket{0}, \ket{1} axis, otherwise along the +,\ket{+}, \ket{-} axis.

    Referee

    aa

    bb

    xyx \wedge y

    aba \oplus b

    Bob's qubit

    a0\ket{a_0}
    b1\ket{b_1}
    A0\ket{A_0}
    B1\ket{B_1}

    If I receive x=0x=0, I'll measure along the a,b\ket{a}, \ket{b} axis, otherwise along the A,B\ket{A}, \ket{B} axis.

    PLAY ONCE

    KEEP PLAYING

    02468100.01.00.20.40.60.8
    Quantum Strategy
    number of games
    percentage of games won

    EXPLAIN

    Alice is using simple axes to measure her qubit. If she receives x=0x=0, she'll measure along the 0,1\ket{0}, \ket{1} axis. On the other hand if she receives x=1x=1, she'll use the +=02+12,=0212\ket{+} = \frac{\ket{0}}{\sqrt{2}} + \frac{\ket{1}}{\sqrt{2}}, \ket{-} = \frac{\ket{0}}{\sqrt{2}} - \frac{\ket{1}}{\sqrt{2}} axis. In either case, she'll respond with a 00 or 11, respectively, depending on the outcome of her measurement. Bob is using more complicaed axes pairs. For a question y=0y=0, he'll use the a=cosπ80+sinπ81,b=sinπ80+cosπ81\ket{a} = \cos{\frac{\pi}{8}}\ket{0} + \sin{\frac{\pi}{8}}\ket{1}, \ket{b} = -\sin{\frac{\pi}{8}}\ket{0} + \cos{\frac{\pi}{8}}\ket{1} axis, responding with 00 if he measures a\ket{a} and 11 if he measures b\ket{b}. Similarly, if he is asked y=1y=1, he will measure along A=cosπ80sinπ81,B=sinπ80+cosπ81\ket{A} = \cos{\frac{\pi}{8}}\ket{0} - \sin{\frac{\pi}{8}}\ket{1}, \ket{B} = \sin{\frac{\pi}{8}}\ket{0} + \cos{\frac{\pi}{8}}\ket{1} axis, again adequately responding 00 or 11.

    Look at the experimental results chart and compare the quantum strategy to some of the classical strategies Alice and Bob employed previously. They are doing much better now! With a single quantum-entangled qubit and some smart measuring axis, they upped the value of the game to 85%.

    Let's translate the nonlocal games language back into a complexity class now. We say that a language belongs to the complexity class MIP* if you can translate it into a nonlocal game G, with quantum value over 2/3 for all strings belonging to the language and below 1/3 for all strings not belonging to the language.

    We know that MIP* contains MIP, because you can always choose the "quantum" strategy of ignoring your qubits and going with the best classical strategy. We have also seen an example of a quantum strategy doing better than a classical strategy, suggesting that there are problems for which a quantum strategy is enough where a classical one cannot do. (In reality both of these statements come with more details, also including the lower bound).

    Determining whether a string belongs to a langauge is then a question of finding the quantum value of game G, which is done by approximation, hopefully to a convergent value.

    Continue to the last page to see the relationship between MIP* and RE!

    MIP* = RE

    The previous exercises established the MIP* class, but it's also important to talk about the RE class. RE stands for recursively enumerable, the class of languages whose 'yes' instances a Turing machine can list one by one. In a less cryptic way, it can also be defined as language to which a Turing machine can verify the 'yes' problems, but might run indefinitely for 'no' problems.

    A famous example of a language belonging to RE is the Halting Problem, asking whether a given program will stop running; you can run it, and say 'yes' if it stops, but it might just keep running forever if the true answer is no. Because of an ensuing paradox, it is undecidable whether a program will halt, and a reduction to this problem is often used to show other problems are undecidable. A reduction is a statement of the type "if you could solve this problem, you could also solve the halting problem, and that's impossible."

    This is exactly the type of structure Ji et al. used to show that it's also undecidable whether a string belongs to MIP*. Their idea translates the question of whether a program halts into a quantum nonlocal game - if the game has quantum value 1, the machine halts, and if it has value less than 1/2, it doesn't halt.

    If there was a way to calculate (or at least approximate) the quantum value of a game, to figure out if a string belongs to MIP*, Ji et al.'s procedure would allow us to use this algorithm and figure out the Halting problem - which is impossible! So there cannot be a solver for MIP*, which must thus be undecidable and equal to RE.

    Class P

    Solvable by a Turing machine in polynomial time

    Class BPP

    Solvable with an error bound by a Turing machine in polynomial time

    Class (M)IP

    Verifiable with an error bound by a Turing machine in a (multiprover) interactive proof protocol

    Class MIP*

    Verifiable with an error bound by a Turing machine against multiple provers sharing quantum entanglement