Introduction to quantum computing with Q# – Part 15, Deutsch-Jozsa algorithm

Last time, we discussed a problem originally stated by David Deutsch, focusing on determining whether a function is constant or balanced. We found out that for that specific problem, quantum computing provides a much better query complexity than classical computing – as it can solve the task in a single blackbox function evaluation, while classical computing requires two function evaluations to provide the same answer.

Today, we shall look at the generalization of that simple problem.

History

In 1992, David Deutsch and Richard Jozsa published a paper entitled Rapid solution of problems by quantum computation in which they provided a generalization of the original 1985 Deutsch’s problem – the one we discussed in the last post. If you recall from last time, Deutsch’s problem dealt with an unknown (blackbox) function that took a single bit as an input, and produced a single bit as an output $f : \{0,1\} \rightarrow \{0,1\}$. We determined that classically we need two function evaluations to determine if the function is constant or balanced, while on a quantum computer a single oracle query suffices.

The generalization of Deutsch’s problem allows solving this not just for single bit input functions, but rather for any $n$ input function that produces either a 0 or a 1 as output for each input value. This can be written as $f : \{0,1\}^n \rightarrow \{0,1\}$.

To be perfectly clear, as it was briefly mentioned in the last post, the basic variant of the Deutsch’s problem we dissected the last time, does not really come from Deutsch’s original 1985 paper. Instead, that particular formulation is, technically speaking, the simplest case of the 1992 Deutsch-Jozsa paper too (the 1985 version was probabilistic not deterministic). However, for clarity reasons, and to improve digestibility of this material, it makes sense to tackle both variants separately, which is what we are doing in this series.

Deutsch and Jozsa emphasized in their 1992 paper that classical computation relies on problem solving based on function evaluations.

When a classical deterministic (Turing) computer solves a problem, it always does so by evaluating a function. For example, a factorization program will always find the same factor of a given input. Which factor it finds could be specified by an additional constraint, narrowing the task to a function evaluation. Therefore when solving problems a classical computer cannot help performing a harder computational task than the one it was set.

Specifically in the case of the generalized version of the Deutsch’s problem, often called the Deutsch-Jozsa algorithm, the classical computation theory tells us that $2^{n-1} + 1$ function evaluations are need to obtain the answer to the problem with certainty. Deutsch and Jozsa realized that on a quantum computer the same can be done much more efficiently, due to the ability to work on the problem without needing to evaluate the function – leading to, as we will see, problem resolution in a single oracle query.

Such a [quantum] computation also need not necessarily evaluate functions when it is solving problems, because the state of its output might be a coherent superposition of states corresponding to different answers, each of which solves the problem. This allows quantum computers to solve problems by methods which are not available to any classical device.

As we will see soon, the quantum computational speed up to solve this problem is quite dramatic.

Mathematical background

Before we can fully appreciate the solution to Deutsch-Jozsa algorithm, let’s recall that the Hadamard gate is represented by the following matrix:

$$H=\frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 &
-1 \end{bmatrix}$$

And that it creates a uniform superposition when applied to a qubit in a definite computational basis state $\ket{0}$ or $\ket{1}$:

$$H\ket{0} = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 &
-1 \end{bmatrix}\begin{bmatrix} 1 \\ 0 \end{bmatrix} =\frac{1}{\sqrt{2}}\ket{0} + \frac{1}{\sqrt{2}}\ket{1}$$

$$H\ket{1} = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 &
-1 \end{bmatrix}\begin{bmatrix} 0 \\ 1 \end{bmatrix} = \frac{1}{\sqrt{2}}\ket{0} – \frac{1}{\sqrt{2}}\ket{1}$$

This was extensively discussed already in part 2 of this series, and if things are not clear, I recommend you have a look at that post again. Let’s now consider what happens when we send two qubits, making up a two qubit quantum system, through an $H$ gate each. Since a multi qubit system is represented by a tensor product of the individual qubits, we can write this as follows, assuming we start with two qubits in state $\ket{0}$ each.

$$\displaylines{H\ket{0} \otimes H\ket{0} \\ = (\frac{1}{\sqrt{2}}\ket{0} + \frac{1}{\sqrt{2}}\ket{1}) \otimes (\frac{1}{\sqrt{2}}\ket{0} + \frac{1}{\sqrt{2}}\ket{1}) = \\ \frac{1}{2}(\ket{00} + \ket{01} + \ket{10} + \ket{11})}$$

We can extend this further and perform a similar calculation for the other three possible combinations of the computational basis states, where the difference will be visible in the sign, or rather, what we call in quantum computing, the phase.

$$\displaylines{H\ket{0} \otimes H\ket{1} \\ = (\frac{1}{\sqrt{2}}\ket{0} + \frac{1}{\sqrt{2}}\ket{1}) \otimes (\frac{1}{\sqrt{2}}\ket{0} – \frac{1}{\sqrt{2}}\ket{1}) = \\ \frac{1}{2}(\ket{00} – \ket{01} + \ket{10} – \ket{11})}$$

$$\displaylines{H\ket{1} \otimes H\ket{0} \\ = (\frac{1}{\sqrt{2}}\ket{0} – \frac{1}{\sqrt{2}}\ket{1}) \otimes (\frac{1}{\sqrt{2}}\ket{0} + \frac{1}{\sqrt{2}}\ket{1}) = \\ \frac{1}{2}(\ket{00} + \ket{01} – \ket{10} – \ket{11})}$$

$$\displaylines{H\ket{1} \otimes H\ket{1} \\ = (\frac{1}{\sqrt{2}}\ket{0} – \frac{1}{\sqrt{2}}\ket{1}) \otimes (\frac{1}{\sqrt{2}}\ket{0} – \frac{1}{\sqrt{2}}\ket{1}) = \\ \frac{1}{2}(\ket{00} – \ket{01} – \ket{10} + \ket{11})}$$

Equipped in that observation, we are now in a position to introduce the concept of the Kronecker product of matrices, which is a generalization of the outer product and produces the matrix of the tensor product. It turns out that we can describe the transformation above in a form of matrix of matrices, namely a new $H^{\otimes 2}$ matrix, made up of $H$ matrices.

$$H^{\otimes 2} = \frac{1}{\sqrt{2}}\begin{bmatrix} H & H \\ H & -H\end{bmatrix}$$

And we can generalize this even further into a $H^{\otimes n}$ gate that recursively acts upon an abitrary number of $n$ qubits.

$$H^{\otimes n} = \frac{1}{\sqrt{2}}\begin{bmatrix} H^{\otimes (n-1)} & H^{\otimes (n-1)} \\ H^{\otimes (n-1)} & -H^{\otimes (n-1)}\end{bmatrix}$$

What we have obtained here is often referred to as the so-called Walsh (or in some lieterature, Walsh-Hadamard) transformation $W$. Now suppose we want to create a uniform superposition over $n$ qubits, starting with all of them being in state $\ket{0}$. This is equivalent to $(H \otimes H \otimes … \otimes H)\ket{00 … 0}$ and we can write it in a shorthanded syntax as:

$$W\ket{0} = \frac{1}{\sqrt{2^n}}\sum_{x=0}^{2^n-1}\ket{x}
$$

While this provides a nice generalization for the case of using $n$ inputs, they are all still resticted to being in $\ket{0}$ initial state only. To take this a step further, let’s rewrite the formula to cover any arbitrary initial qubit state instead. This will come in handy in the Deutsch-Jozsa problem where we will need to apply $W$ (or, written differently, $H^{\otimes n}$) transformation first on a set of qubits all in $\ket{0}$ state, but then on the same set being already in the superposition.

To achieve this, we need to introduce a new elegant way of expressing the Hadamard gate acting on a single qubit, namely:

$$H\ket{x} = \frac{1}{\sqrt{2}}(\ket{0} + (-1)^x\ket{1})
$$

where $x \in {0,1}$. This captures the phase change that happens depending on whether we apply $H$ to $\ket{0}$ or $\ket{1}$. We can further rewrite this into a format relevant for any arbitrary computational basis input $\ket{x}$ an even more compact form:

$$H\ket{x} = \frac{1}{\sqrt{2}}\sum_{z \in \{0,1\}}(-1)^{xz}\ket{z}
$$

If we now consider that $\ket{x}$ and $\ket{z}$ have a length of $n$, it leads us to the final generalization of the $W$ transformation (or, again, written differently, $H^{\otimes n}$). The formula is as follows:

$$W\ket{x} = \frac{1}{\sqrt{2^n}}\sum_{z=0}^{2^n-1}(-1)^{x \cdot z}\ket{z}
$$

where $x \cdot z$ is the bitwise inner product of $x$ and $z$, modulo 2 (binary scalar product):

$$x \cdot z = x_0z_0 \oplus x_1z_1 \oplus … \oplus x_{n-1}z_{n-1}$$

Solving Deutsch-Jozsa problem

In order to best understand the Deutsch-Jozsa problem, we can write it in pure matrix transformations chain as:

$$(W \otimes I)U_f(W \otimes H)\ket{0^{\otimes n}}\ket{1}$$

In plain text, this means that we shall start with $n$ qubits in state $\ket{0}$ and one ancilla qubit in state $\ket{1}$. We shall then apply the $W$ operator to the qubits in state $\ket{0}$ and $H$ tranformation to the qubit in state $\ket{1}$. Then, we will evaluate $U_f$. At that point, we can ignore the ancilla qubit and apply again the $W$ transformation to the top $n$ qubits, which ultimately will get jointly measured.

In the previous part of this series, when looking at the basic variant of Deutsch’s problem, we approached it by initially ignoring the $\ket{x}$ and focusing on what happens when the ancilla qubit $\ket{1}$ goes through the initial transformation $H$ and then through the oracle $U_f$. This allowed us to arrive at the following state:

$$\ket{\Psi_2} = (-1)^{f(x)}\ket{x}\frac{1}{\sqrt{2}}(\ket{0} – \ket{1})
$$

We then injected $H\ket{x}$ in place of $\ket{x}$ and were able to complete the algorithm from there. Since the role of the ancilla qubit in the Deutsch-Jozsa generalization is the same as it was in the special simple case discussed last time, and there is still only one of them, the exact same approach will work for us here.

In this case, however, instead of swapping $\ket{x}$ with $H\ket{x}$ we need to remember that now $\ket{x}$ is actually consisting of $n$ qubit so we should use the above discussed $W$ transformation acting on an aribtrary number of $n$ qubits (instead of a single one). Additionally, we can observe that the usage of $\ket{-}$ syntax can help us save a little space in our equations too:

$$
\ket{-} = \frac{1}{\sqrt{2}}(\ket{0} – \ket{1})
$$

Taking all of that into account, we can rewrite $\ket{\Psi_2}$ as:

$$\ket{\Psi_2} = \frac{1}{\sqrt{2^n}}\sum_{x \in \{0,1\}^n}(-1)^{f(x)}\ket{x}\ket{-}
$$

Next, we can apply the final $W \otimes I$ transformation, using the formula we mentioned in the previous section. This results in the state $\ket{\Psi_3}$:

$$\ket{\Psi_3} = \frac{1}{\sqrt{2^n}}\sum_{x \in \{0,1\}^n}(-1)^{f(x)}\left[\frac{1}{\sqrt{2^n}}\sum_{y \in \{0,1\}^n}(-1)^{x \cdot y}\ket{y}\right]\ket{-}
$$

We can rewrite it further and omit the trailing ancilla $\ket{-}$ as it is unentangled from the rest and will not need to be measured anyway:

$$\ket{\Psi_3} = \sum_{y \in \{0,1\}^n}\left[\frac{1}{2^n}\sum_{x \in \{0,1\}^n}(-1)^{x \cdot y + {f(x)}}\right]\ket{y}
$$

At this point, we are about to make a profound discovery. The probability amplitude that we are interested in, the one for the state $\ket{0}^{\otimes n}$, is represented by the following coefficient:

$$
a_z = \frac{1}{2^n}\sum_{x \in \{0,1\}^n}(-1)^{x \cdot y + {f(x)}}
$$

If we now consider the case where $y = 0$, we arrive at:

$$
a_0 = \frac{1}{2^n}\sum_{x \in \{0,1\}^n}(-1)^{f(x)}
$$

If the functio $f(x)$ ($U_f$ in our circuit) is constant, we have two possibilities:

$$
a_0 =
\begin{cases}
1 & \text{when f(x) = 0} \\
-1 & \text{when f(x) = 1} \\
\end{cases}
$$

This tells us that when the function is constant, the probability amplitude $a_0$ for our initial set of qubits $\ket{0}^{\otimes n}$ has been reduced to +1 or -1, depending on the constant output of $U_f$. This means that for the joint measurement of an entire register, when the function is constant, all of the qubits must yield 0s as the result.

Conversely, when the function is balanced, the amplitude $a_0$ will yield a 0, because number of positive and negative terms +1 and -1 will cancel each other out. This results in a classical probability equal to 0 for receiving a measurement result of all zeros for our initial set of qubits $\ket{0^{\otimes n}}$. In other words, at least one qubit will produce a measurement result 1.

This is a remarkable conclusion – we have managed to reduce the entire problem to a simple binary outcome. When the joint measurement of the qubit register yields all zeros we have a constant function, and if it doesn’t, the function is balanced.

The mathematical formalism in this section follows in several parts the elegant and clean approach of Imre and Balazs.

Q# implementation

The Q# implementation for the Deutsch-Jozsa generalization will be in many aspects almost identical to the code used in the last post. I shall therefore avoid discussing the parts that are the same, and focus only on the important content.

Just like last time, we shall prepare four oracle functions, two of them being constant and two of them being balanced. The constant functions are the same as earlier:

The balanced functions will perform an XOR logic by applying the CNOT gate, which is also similar to the last time. The difference is, naturally, previously the CNOT spanned two qubits only (input and ancilla qubit), while here it will entangle each input qubit with the ancilla qubit.

The orchestrator operation that we will use to invoke the different oracles will also be based on the code from the last post, with the notable difference is that we need to perform a joint measurement over the entire input register, and check if that produces a zero, or in other words a $\ket{00..0}$ result. This can be achieved easily in Q# using the $MeasureAllZ (register : Qubit[])$ operation from the $Microsoft.Quantum.Measurement$ namespace, which measures the entire register in the $Z \otimes Z \otimes .. \otimes Z$ basis. The usage of the orchestrator lets us reuse the code when testing out different oracle implementations, which is very welcome from the purely programming standpoint.

This chunk of code allocates $n$ qubits (a qubit register) that represent the input into our oracles, and the ancilla qubit to ensure reversibility of the computation. The usage of $ApplyToEachA$ operation allows us to elegantly apply the $H$ gate to the entire qubit register. The invocation of $MeasureAllZ$ provides us the relevant result which is then returned as the output of the operation, following the usual resets of the qubits.

Finally, running the entry point to our program will invoke the orchestrator four times – once for each oracle function.

This concludes our Q# implementation of the Deutsch-Jozsa algorithm, with four sample oracle functions. Invoking this program produces the following output:

This is exactly what we expected based on the math discussed in the section above – once again Q# bestows us with a superb capability to test and verify quantum computing concepts with minimal effort.

Summary

To conclude, we need to say that we we can be really proud of ourselves. The implementation of Deutsch-Jozsa algorithm provides exponential speed up compared to its classical counterpart – single evaluation vs $2^{n-1} + 1$ evaluations of the oracle. Now, in order to not get too carried away, we need to emphasize, that it is also a problem that does not really have any real world applications. On top of that, we can easily find probabilistic classical solutions that perform a lot better than the $2^{n-1} + 1$ function evaluations we mentioned.

Regardless, the algorithm is extremely useful as a learning tool and an illustration of the potential that can be unlocked with the clever algorithmic approaches to phase changes and superpositions.

In the next post we will continue our work with some of the most important quantum algorithms.