Peter Shor on Quantum Error Correction

[Guest post by Annie Wei who scribed Peter Shor’s lecture in our physics and computation seminar. See here for all the posts of this seminar. –Boaz]

On October 19, we were lucky enough to have Professor Peter Shor give a guest lecture about quantum error correcting codes. In this blog post, I (Annie Wei) will present a summary of this guest lecture, which builds up quantum error correcting codes starting from classical coding theory. We will start by reviewing an example from classical error correction to motivate the similarities and differences when compared against the quantum case, before moving into quantum error correction and quantum channels. Note that we do assume a very basic familiarity with quantum mechanics, such as that which might be found here or here.

1. Motivation
We are interested in quantum error correction, ultimately, because any real-world computing device needs to be able to tolerate noise. Theoretical work on quantum algorithms has shown that quantum computers have the potential to offer speedups for a variety of problems, but in practice we’d also like to be able to eventually build and operate real quantum computers. We need to be able to protect against any decoherence that occurs when a quantum computer interacts with the environment, and we need to be able to protect against the accumulation of small gate errors since quantum gates need to be unitary operators.

In error correction the idea is to protect against noise by encoding information in a way that is resistant to noise, usually by adding some redundancy to the message. The redundancy then ensures that enough information remains, even after noise corruption, so that decoding will allow us to recover our original message. This is what is done in classical error correction schemes.

Unfortunately, it’s not obvious that quantum error correction is possible. One obstacle is that errors are continuous, since a continuum of operations can be applied to a qubit, so a priori it might seem like identifying and correcting an error would require infinite resources. In a later section we show how this problem, that of identifying quantum errors, can be overcome. Another obstacle is the fact that, as we’ve stated, classical error correction works by adding redundancy to a message. This might seem impossible to perform in a quantum setting due to the No Cloning Theorem, which states the following:

Theorem (No Cloning Theorem): Performing the mapping

|\psi\rangle|0\rangle\mapsto|\psi\rangle|\psi\rangle

is not a permissible quantum operation.

Proof: We will use unitarity, which says that a quantum operation specified by a unitary matrix U must satisfy

\langle\phi|U^{\dagger}U|\psi\rangle = \langle\phi|\psi\rangle.

(This ensures that the normalization of the state |\psi\rangle is always preserved, i.e. that |\langle\psi|\psi\rangle|^2=1, which is equivalent to the conservation of probability.)

Now suppose that we can perform the operation

U(|\psi\rangle|0\rangle)=|\psi\rangle|\psi\rangle

for every state |\psi\rangle. Then, letting

(\langle\phi|\langle 0|)(|\psi\rangle|0\rangle)=\langle\phi|\psi\rangle=\alpha,

we note that by unitarity

(\langle\phi|\langle 0|)U^{\dagger}U(|\psi\rangle|0\rangle)=(\langle\phi|\langle 0|)(|\psi\rangle |0\rangle)=\alpha.

On the other hand, since U clones both |\phi\rangle and |\psi\rangle,

(\langle\phi|\langle 0|)U^{\dagger}U(|\psi\rangle|0\rangle)=(\langle\phi|\langle\phi|)(|\psi\rangle|\psi\rangle)=\alpha^2,

and in general \alpha\neq\alpha^2, so we have a contradiction.

How do we get around this apparent contradiction? To do so, note that the no-cloning theorem only prohibits the copying of non-orthogonal quantum states. With orthogonal quantum states, either \alpha=0 or \alpha=1, so we don’t run into a contradiction. This also explains why it is possible to copy classical information, which we can think of as orthogonal quantum states.
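To make the distinction between orthogonal and non-orthogonal states concrete, here is a minimal sketch in plain Python. The helper names are illustrative and not from the lecture. It uses a CNOT gate as the candidate copying operation: CNOT copies the orthogonal basis states |0\rangle and |1\rangle exactly, but applied to |+\rangle|0\rangle with |+\rangle=(|0\rangle+|1\rangle)/\sqrt{2} it produces an entangled state rather than |+\rangle|+\rangle.

```python
import math

def kron(a, b):
    """Tensor product of two state vectors given as lists of amplitudes."""
    return [x * y for x in a for y in b]

def cnot(state):
    """Apply CNOT (first qubit controls the second) to a 2-qubit state.

    Amplitudes are ordered |00>, |01>, |10>, |11>.
    """
    a00, a01, a10, a11 = state
    return [a00, a01, a11, a10]

def inner(a, b):
    """Inner product <a|b> for states with real amplitudes."""
    return sum(x * y for x, y in zip(a, b))

zero, one = [1.0, 0.0], [0.0, 1.0]
plus = [1 / math.sqrt(2), 1 / math.sqrt(2)]

# CNOT copies the two orthogonal basis states exactly.
copies_zero = cnot(kron(zero, zero)) == kron(zero, zero)
copies_one = cnot(kron(one, zero)) == kron(one, one)

# It does not copy |+>: the output is (|00> + |11>)/sqrt(2), whose
# overlap with |+>|+> is 1/sqrt(2) rather than 1.
overlap_plus = inner(cnot(kron(plus, zero)), kron(plus, plus))
```

The overlap of 1/\sqrt{2} is the same mismatch between \alpha and \alpha^2 that drives the proof: here \langle 0|+\rangle=1/\sqrt{2} while its square is 1/2.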

So how do we actually protect quantum information from noise? In the next section we first review classical error correction, as ideas from the classical setting re-appear in the quantum setting, and then we move into quantum error correction.

2. Review of Classical Error Correction
First we start by reviewing classical error correction. In classical error correction we generally have a message that we encode, send through a noisy channel, and then decode, in the following schematic process:

message → encode → noisy channel → decode → message

In an effective error correction scheme, the decoding process should allow us to identify any errors that occurred when our message passed through the noisy channel, which then tells us how to correct the errors. The formalism that allows us to do so is the following: we first define an r\times n encoding matrix G that takes a message m of length r and converts it to a codeword c=mG of length n, where the codewords make up the span of the rows of G. An example of such a matrix is


corresponding to the 7-bit Hamming code, which encodes a 4-bit message as a 7-bit codeword. Note that this code has distance 3, since any two distinct codewords differ in at least 3 positions, which means that it can correct 1 error (a code of distance d can correct up to \lfloor (d-1)/2\rfloor errors).

We also define the parity check matrix H to be the matrix that satisfies

GH^T=0.

For example, to define H corresponding to the G we defined for the 7-bit Hamming code, we could take


Then we may decode x, a 7-bit string, in the following manner. Say that

x=c+e,

where c is a codeword and e is the 1-bit error we wish to correct. Then

xH^T=(c+e)H^T=mGH^T+eH^T=eH^T.

Here eH^T uniquely identifies the error and is known as the error syndrome. Having it tells us how to correct the error. Thus our error correction scheme consists of the following steps:

  1. Encode an r-bit message m by multiplying by G to obtain codeword mG=c.
  2. Send the message through channel generating error e, resulting in the string x=c+e.
  3. Multiply by H^T to obtain the error syndrome eH^T.
  4. Correct error e to obtain c.
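The four steps above can be sketched in plain Python for the 7-bit Hamming code. The matrices G and H below are one standard choice and are an assumption of this sketch; the matrices used in the lecture may differ. With this H, the syndrome read as a binary number gives the position of the flipped bit.

```python
# Assumed generator and parity-check matrices for the [7,4] Hamming code
# (one standard choice; the lecture's exact matrices may differ).
H = [
    [0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
]
G = H + [[1, 1, 1, 1, 1, 1, 1]]

def encode(m):
    """Return the codeword c = mG over GF(2) for a 4-bit message m."""
    return [sum(m[i] * G[i][j] for i in range(4)) % 2 for j in range(7)]

def syndrome(x):
    """Return the syndrome xH^T over GF(2) as a list of 3 bits."""
    return [sum(x[j] * row[j] for j in range(7)) % 2 for row in H]

def correct(x):
    """Correct up to one flipped bit in the 7-bit string x."""
    s = syndrome(x)
    # With this H, the syndrome read as a binary number is the
    # 1-indexed position of the flipped bit (0 means no error).
    position = 4 * s[0] + 2 * s[1] + s[2]
    fixed = list(x)
    if position:
        fixed[position - 1] ^= 1
    return fixed

message = [1, 0, 1, 1]
codeword = encode(message)
received = list(codeword)
received[4] ^= 1  # the channel flips the fifth bit
recovered = correct(received)
```

Because every codeword has zero syndrome, the syndrome of the received string depends only on the error, which is why the decoder never needs to know the message.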

Having concluded our quick review of classical error correction, we now look at the theory of quantum error correction.

3. Quantum Error Correction
In this section we introduce quantum error correction by directly constructing the 9-qubit code and the 7-qubit code. Then we introduce the more general formalism of CSS codes, which encompasses both the 9-qubit and 7-qubit codes, before introducing the stabilizer formalism, which tells us how we might construct a CSS code.

3.1. Preliminaries
First we introduce some tools that we will need in this section.

3.1.1. Pauli Matrices
The Pauli matrices are a set of 2-by-2 matrices that, together with the identity, form an orthogonal basis for the 2-by-2 Hermitian matrices, where a Hermitian matrix H satisfies H^{\dagger}=H. Note that we can form larger Hilbert spaces by taking the tensor product of smaller Hilbert spaces, so in particular taking the k-fold tensor products of the Pauli matrices and the identity gives us a basis for the 2^k-by-2^k Hermitian matrices. Note also that generally, in quantum mechanics, we are interested in Hermitian matrices because they can be used to represent measurements, and because unitary matrices, which can be used to represent probability-preserving quantum operations, can be obtained by exponentiating Hermitian matrices (that is, every unitary matrix U can be written in the form U=e^{iH} for H a Hermitian matrix).

The Pauli matrices are given by

\sigma_x\equiv X\equiv\left(\begin{array}{cc}0&1\\1&0\end{array}\right)

\sigma_y\equiv Y\equiv\left(\begin{array}{cc}0&-i\\i&0\end{array}\right)

\sigma_z\equiv Z\equiv\left(\begin{array}{cc}1&0\\0&-1\end{array}\right).

By direct computation we can show that they satisfy the relations

X^2=Y^2=Z^2=I

XY=-YX=iZ

YZ=-ZY=iX

ZX=-XZ=iY.

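The products of the Pauli matrices defined above are easy to check by machine. The following is a small sketch in plain Python, with matrices stored as nested lists; the helper names are illustrative. It verifies that each Pauli matrix squares to the identity, that the products are cyclic up to a factor of i, and that distinct Pauli matrices anticommute.

```python
def matmul(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def scale(c, a):
    """Multiply every entry of the matrix a by the scalar c."""
    return [[c * x for x in row] for row in a]

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]

# Each Pauli matrix squares to the identity.
squares_are_identity = all(matmul(P, P) == I2 for P in (X, Y, Z))

# XY = iZ, YZ = iX, ZX = iY.
cyclic_products = (
    matmul(X, Y) == scale(1j, Z)
    and matmul(Y, Z) == scale(1j, X)
    and matmul(Z, X) == scale(1j, Y)
)

# Distinct Pauli matrices anticommute: PQ = -QP.
anticommute = all(
    matmul(P, Q) == scale(-1, matmul(Q, P))
    for P, Q in ((X, Y), (Y, Z), (Z, X))
)
```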
3.1.2. Von Neumann Measurements
We will also need the concept of a Von Neumann measurement. A Von Neumann measurement is given by a set of projection matrices \{E_1, E_2, ..., E_k\} satisfying

\sum_{i=1}^k E_i=I.

That is, the projectors partition a Hilbert space {\cal H} into k mutually orthogonal subspaces. Then, given any state |\psi\rangle\in{\cal H}, when we perform a measurement using these projectors we obtain the measurement result corresponding to E_i, with corresponding post-measurement state

\frac{E_i|\psi\rangle}{\sqrt{\langle\psi|E_i|\psi\rangle}},

with probability \langle\psi|E_i|\psi\rangle.

3.2. First Attempt at a Quantum Code
Now we make a first attempt at coming up with a quantum code, noting that our efforts and adjustments will ultimately culminate in the 9-qubit code. Starting with the simplest possible idea, we take inspiration from the classical repetition code, which maps

0\mapsto 000

1\mapsto 111

and decodes by taking the majority of the 3 bits. We consider the quantum analog of this, which maps

|0\rangle\mapsto|000\rangle

|1\rangle\mapsto|111\rangle.

We will take our quantum errors to be the Pauli matrices X, Y, and Z. Then the encoding process, where our message is a quantum state \alpha|0\rangle+\beta|1\rangle, looks like the following:

\alpha|0\rangle+\beta|1\rangle\mapsto\alpha|000\rangle+\beta|111\rangle.

We claim that this code can correct bit errors but not phase errors, which makes it equivalent to the original classical repetition code for error correction. To see this, note that applying an X_1 error results in the mapping

\alpha|000\rangle+\beta|111\rangle\mapsto\alpha|100\rangle+\beta|011\rangle.

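This can be detected by the von Neumann measurement which projects onto the subspaces

\text{span}\{|000\rangle, |111\rangle\}

\text{span}\{|100\rangle, |011\rangle\}

\text{span}\{|010\rangle, |101\rangle\}

\text{span}\{|001\rangle, |110\rangle\}.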

We could then apply \sigma_x to the first qubit to correct the error. To see that this doesn’t work for phase errors, note that applying a Z_2 error results in the mapping

\alpha|000\rangle+\beta|111\rangle\mapsto\alpha|000\rangle-\beta|111\rangle.

This is a valid encoding of the state \alpha|0\rangle-\beta|1\rangle, so the error is undetectable.
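The behavior of this first code under bit and phase flips can be sketched in plain Python. States are stored as dictionaries from bit triples to amplitudes, and the syndrome is read off as the two parities b_0\oplus b_1 and b_1\oplus b_2, which is an assumed but standard way to realize the subspace measurement. All function names are illustrative.

```python
def encode(alpha, beta):
    """Encode alpha|0> + beta|1> as alpha|000> + beta|111>."""
    return {(0, 0, 0): alpha, (1, 1, 1): beta}

def apply_x(state, qubit):
    """Apply a bit flip X to the given qubit (0-indexed)."""
    flipped = {}
    for bits, amp in state.items():
        new_bits = list(bits)
        new_bits[qubit] ^= 1
        flipped[tuple(new_bits)] = amp
    return flipped

def apply_z(state, qubit):
    """Apply a phase flip Z to the given qubit (0-indexed)."""
    return {bits: (-amp if bits[qubit] else amp) for bits, amp in state.items()}

def syndrome(state):
    """Return the parities (b0 xor b1, b1 xor b2) shared by all basis states.

    Each of the four syndrome subspaces has a definite pair of parities,
    so reading them identifies the subspace without revealing alpha or beta.
    """
    parities = {(b[0] ^ b[1], b[1] ^ b[2]) for b in state}
    assert len(parities) == 1, "state is not inside a single subspace"
    return parities.pop()

# Which qubit to flip back for each syndrome (None means no bit error).
CORRECTION = {(0, 0): None, (1, 0): 0, (1, 1): 1, (0, 1): 2}

def correct(state):
    """Undo the single bit flip indicated by the syndrome, if any."""
    qubit = CORRECTION[syndrome(state)]
    return state if qubit is None else apply_x(state, qubit)

logical = encode(0.6, 0.8)
after_x = correct(apply_x(logical, 0))  # bit flip on the first qubit
after_z = correct(apply_z(logical, 1))  # phase flip on the second qubit
```

Here `after_x` equals the original encoded state, while `after_z` is the encoding of 0.6|0\rangle-0.8|1\rangle: the phase flip has zero syndrome and passes through uncorrected.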

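What adjustments can we make so that we’re able to also correct Z errors? For this we will introduce the Hadamard matrix, defined as

H\equiv\frac{1}{\sqrt{2}}\left(\begin{array}{cc}1&1\\1&-1\end{array}\right)

and satisfying

H^2=I,\quad HX=ZH,\quad HZ=XH.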

Note in particular that, because HX=ZH, the Hadamard matrix turns bit errors into phase errors, and vice versa. This allows us to come up with a code that corrects phase errors by mapping

H|0\rangle\mapsto H^{\otimes 3}|000\rangle

H|1\rangle\mapsto H^{\otimes 3}|111\rangle

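or equivalently,

|0\rangle\mapsto\frac{1}{2}(|000\rangle+|011\rangle+|101\rangle+|110\rangle)

|1\rangle\mapsto\frac{1}{2}(|111\rangle+|100\rangle+|010\rangle+|001\rangle).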

Now we can concatenate our bit flip code with our phase flip code to take care of both errors. This gives us the 9-qubit code, also known as the Shor code.

3.3. 9-Qubit Code
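In the previous section, we went through the process of constructing the 9-qubit Shor code by considering how to correct both bit flip errors and phase flip errors. Explicitly, replacing each qubit of the phase flip code by its 3-qubit bit flip encoding, the 9-qubit Shor code is given by the following mapping:

|0\rangle_L\equiv\frac{1}{2}(|000000000\rangle+|000111111\rangle+|111000111\rangle+|111111000\rangle)

|1\rangle_L\equiv\frac{1}{2}(|111111111\rangle+|111000000\rangle+|000111000\rangle+|000000111\rangle).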

Here |0\rangle_L and |1\rangle_L are known as logical qubits; note that our 9-qubit code essentially represents 1 logical qubit with 9 physical qubits.

Note that by construction this code can correct \sigma_x, \sigma_y, and \sigma_z errors on any one qubit (we’ve already shown by construction that it can correct \sigma_x and \sigma_z errors, and \sigma_y can be obtained as a product of the two). This is also equivalent to the statement that the states \sigma_x^{(i)}|0\rangle_L, \sigma_y^{(i)}|0\rangle_L, \sigma_z^{(i)}|0\rangle_L, \sigma_x^{(i)}|1\rangle_L, \sigma_y^{(i)}|1\rangle_L, and \sigma_z^{(i)}|1\rangle_L are all orthogonal.

Now we have a 1-error quantum code. We claim that such a code can in fact correct any error operation, and that this is a property of all 1-error quantum codes:

Theorem: Given any possible unitary, measurement, or quantum operation acting on a single qubit of a one-error quantum code, the code can correct it.

Proof: \{I, \sigma_x, \sigma_y, \sigma_z\}^{\otimes t} forms a basis for the 2^t\times 2^t matrices, so any error operator acting on t qubits can be written as a linear combination of tensor products of Pauli matrices. The code can therefore correct arbitrary errors on t qubits if it can individually correct the errors \sigma_{w_i}^{(i)} for w_i\in\{x,y,z\}, i\in\{1,...,t\}: measuring the error syndrome projects the corrupted state onto one of these Pauli errors, which can then be undone.

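Example: Phase Error Next we’ll do an example where we consider how we might correct an arbitrary phase error applied to one qubit of the |0\rangle_L state. Since quantum states are equivalent up to phases, the error operator is given by

\left(\begin{array}{cc}1&0\\0&e^{i\theta}\end{array}\right)\equiv\left(\begin{array}{cc}e^{-i\theta/2}&0\\0&e^{i\theta/2}\end{array}\right).

Note that this can be rewritten in the \{I, \sigma_x, \sigma_y, \sigma_z\}^{\otimes t} basis as

\cos\frac{\theta}{2}I-i\sin\frac{\theta}{2}\sigma_z.

Now, applying this error to |0\rangle_L, we get

\cos\frac{\theta}{2}|0\rangle_L-i\sin\frac{\theta}{2}\sigma_z|0\rangle_L.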

After performing a projective measurement, we get state |0\rangle_L with probability \cos^2\frac{\theta}{2}, in which case we do not need to perform any error correction, and we get \sigma_z|0\rangle_L with probability \sin^2\frac{\theta}{2}, in which case we would know to correct the \sigma_z error.
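The two probabilities can be checked numerically. This sketch assumes the phase error is written in the symmetric form diag(e^{-i\theta/2}, e^{i\theta/2}), solves for its coefficients on I and \sigma_z, and squares their magnitudes. The function name and the value of \theta are illustrative.

```python
import cmath
import math

def phase_error_components(theta):
    """Return (a, b) with diag(e^{-i theta/2}, e^{i theta/2}) = a*I + b*Z."""
    d0 = cmath.exp(-1j * theta / 2)  # diagonal entry acting on |0>
    d1 = cmath.exp(1j * theta / 2)   # diagonal entry acting on |1>
    a = (d0 + d1) / 2  # coefficient of the identity
    b = (d0 - d1) / 2  # coefficient of sigma_z
    return a, b

theta = 0.7
a, b = phase_error_components(theta)
p_no_error = abs(a) ** 2  # probability of projecting onto |0>_L
p_z_error = abs(b) ** 2   # probability of projecting onto sigma_z |0>_L
```

The coefficients come out as \cos(\theta/2) and -i\sin(\theta/2), so the two probabilities sum to 1 for every \theta: a continuous error is turned by the measurement into either no error or a discrete \sigma_z error.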

3.4. 7-Qubit Code
Now that we’ve constructed the 9-qubit code and shown that quantum error correction is possible, we might wonder whether it’s possible to do better. For example, we’d like a code that requires fewer qubits. We’ll construct a 7-qubit code that corrects 1 error, defining a mapping to |0\rangle_L and |1\rangle_L by taking inspiration from a classical code, as we did for the 9-qubit case.

For this we will need to go back to the example we used to illustrate classical error correction. Recall that in classical error correction, we have an encoding matrix G and a parity check matrix H satisfying GH^T=0, with \text{rank}(G)+\text{rank}(H)=n. We encode a message m to obtain codeword mG=c. After error e is applied, this becomes c+e, from which we can extract the error syndrome (c+e)H^T=eH^T. We can then apply the appropriate correction to extract c from c+e.

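Now we will use the encoding matrix from our classical error correction example, and we will divide our 16 codewords into two sets of 8, C_1 and C_1', given by

C_1\equiv\{\text{codewords of even Hamming weight}\}

C_1'\equiv\{\text{codewords of odd Hamming weight}\}=C_1+1111111.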

Similar to how we approached the 9-qubit case, we will start by defining our code as follows:

|0\rangle_L\equiv\frac{1}{\sqrt{8}}\sum_{v\in C_1}|v\rangle

|1\rangle_L\equiv\frac{1}{\sqrt{8}}\sum_{w\in C_1'}|w\rangle.

Note that this corrects bit flip errors by construction. How can we ensure that we are also able to correct phase errors? For this we again turn to the Hadamard matrix, which allows us to toggle between bit and phase errors. We claim that

H^{\otimes 7}|0\rangle_L=\frac{1}{\sqrt{2}}(|0\rangle_L+|1\rangle_L)

H^{\otimes 7}|1\rangle_L=\frac{1}{\sqrt{2}}(|0\rangle_L-|1\rangle_L).

Proof: We will show that

H^{\otimes 7}|0\rangle_L=\frac{1}{\sqrt{2}}(|0\rangle_L+|1\rangle_L),

noting that the argument for |1\rangle_L is similar. First we will need the fact that

H^{\otimes 7}|v\rangle=\frac{1}{2^{7/2}}\sum_{w\in\{0,1\}^7}(-1)^{w\cdot v}|w\rangle.

To see that this fact is true, note that

H=\frac{1}{\sqrt{2}}(|0\rangle\langle 0|+|0\rangle\langle 1|+|1\rangle\langle 0|-|1\rangle\langle 1|)

and that w\cdot v is equal to the number of bits in which w and v are both 1. Now we can start by directly calculating

H^{\otimes 7}|0\rangle_L=\frac{1}{\sqrt{8}}\frac{1}{\sqrt{128}}\sum_{v\in C_1}\sum_{w\in\{0,1\}^7}(-1)^{v\cdot w}|w\rangle.

Note that for x and y two codewords in C_1, assuming that w\cdot y=1, we must have that x\cdot w=0 iff (x+y)\cdot w=1. Thus, whenever some y\in C_1 has w\cdot y=1, we can break C_1 up into an equal number of codewords x satisfying x\cdot w=0 and x\cdot w=1. This means that for each fixed w we must have \sum_{v\in C_1}(-1)^{w\cdot v}=0 unless w\perp C_1, in which case the sum equals |C_1|=8. But those w that satisfy w\perp C_1 are exactly all the codewords by definition, each appearing with amplitude 8/(\sqrt{8}\sqrt{128})=1/4, so we must have that

H^{\otimes 7}|0\rangle_L=\frac{1}{\sqrt{2}}|0\rangle_L+\frac{1}{\sqrt{2}}|1\rangle_L

as the sum in |0\rangle_L+|1\rangle_L runs equally over all codewords.
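The claim can also be verified by brute force on the 128 amplitudes. The sketch below, in plain Python, assumes that C_1 is the row space of a standard parity check matrix of the Hamming code (an assumption, since the lecture's matrix may be arranged differently) and takes C_1' to be C_1 shifted by the all-ones word. The Hadamard on all 7 qubits is applied with a fast Walsh-Hadamard transform.

```python
from itertools import product
import math

# Assumed parity-check matrix of the [7,4] Hamming code; its row space is
# taken to be the 8-element set C_1 of even-weight codewords.
H_ROWS = [
    (0, 0, 0, 1, 1, 1, 1),
    (0, 1, 1, 0, 0, 1, 1),
    (1, 0, 1, 0, 1, 0, 1),
]

def span(rows):
    """All GF(2) linear combinations of the given 7-bit rows."""
    words = set()
    for coeffs in product((0, 1), repeat=len(rows)):
        word = tuple(
            sum(c * r[j] for c, r in zip(coeffs, rows)) % 2 for j in range(7)
        )
        words.add(word)
    return words

def index(word):
    """Position of the basis state |word> in a length-128 amplitude list."""
    return int("".join(map(str, word)), 2)

def uniform_state(words):
    """Equal superposition of the given basis states."""
    state = [0.0] * 128
    for w in words:
        state[index(w)] = 1 / math.sqrt(len(words))
    return state

def hadamard_all(state):
    """Apply H to each of the 7 qubits (fast Walsh-Hadamard transform)."""
    out = list(state)
    step = 1
    while step < len(out):
        for start in range(0, len(out), 2 * step):
            for k in range(start, start + step):
                a, b = out[k], out[k + step]
                out[k], out[k + step] = a + b, a - b
        step *= 2
    return [x / math.sqrt(len(out)) for x in out]

C1 = span(H_ROWS)
C1_prime = {tuple(b ^ 1 for b in w) for w in C1}  # C_1 + 1111111
zero_L = uniform_state(C1)
one_L = uniform_state(C1_prime)

expected = [(a + b) / math.sqrt(2) for a, b in zip(zero_L, one_L)]
max_deviation = max(abs(x - y) for x, y in zip(hadamard_all(zero_L), expected))
```

`max_deviation` is zero up to floating-point rounding, matching the claimed identity for |0\rangle_L; the same comparison with a minus sign checks the identity for |1\rangle_L.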

Thus we have constructed a 7-qubit quantum code that corrects 1 error, and moreover we see that for both the 9-qubit and 7-qubit codes, both of which are 1-error quantum codes, the fact that they can correct 1 error comes directly from the fact that the original classical codes we used to construct them can themselves correct 1 error. This suggests that we should be able to come up with a more general procedure for constructing quantum codes from classical codes.

3.5. CSS Codes
CSS (Calderbank-Shor-Steane) codes generalize the process by which we constructed the 9-qubit and 7-qubit codes, and they give us a general framework for constructing quantum codes from classical codes. In a CSS code, we require groups C_1, C_2 satisfying

C_1\subseteq C_2

C_2^{\perp}\subseteq C_1^{\perp}

Then we can define codewords to correspond to cosets of C_1 in C_2, so that the number of codewords is equal to 2^{\text{dim}(C_2)-\text{dim}(C_1)}. Thus by this definition we can say that codewords w_1, w_2\in C_2 are in the same coset if w_1-w_2\in C_1. Explicitly, the codeword for coset w is given by the state

\frac{1}{|C_1|^{1/2}}\sum_{x\in C_1}|x+w\rangle,

and under the Hadamard transformation applied to each qubit this state is in turn mapped to the state

\frac{1}{|C_1^{\perp}|^{1/2}}\sum_{x\in C_1^{\perp}}(-1)^{x\cdot w}|x\rangle.

That is to say, the Hadamard “dualizes” our original code, toggling bit errors to phase errors and vice versa. (This can be seen by direct calculation, as in the case of the 7-qubit code, where we used the fact that \sum_{v\in C_1}(-1)^{v\cdot w}=0 for w\not\in C_1^{\perp}.)

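Note also that an undetected bit error must map one coset to a different one, so the distance of this code against bit errors is the minimum weight of \{v\in C_2-C_1\}; if this minimum weight is d, the code can correct \lfloor (d-1)/2\rfloor bit errors.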

With the CSS construction we have thus reduced the problem of finding a quantum error correcting code to the problem of finding appropriate C_1, C_2. Note that the special case of C_2^{\perp}=C_1=C corresponds to weakly self-dual codes, which are well studied classically. Doubly even, weakly self-dual codes additionally have the requirement that all codewords have Hamming weights that are multiples of 4; they satisfy the requirement

1^n\subseteq C^{\perp}\subseteq C\subseteq\mathbb{Z}_2^n

and are also well studied classically.
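As a small sanity check of these conditions, the following plain-Python sketch takes an assumed [7,3] code C_1 (the even-weight Hamming codewords, generated by a standard parity check matrix), computes C_2=C_1^{\perp} by brute force, and tests containment, the doubly even property, and the number of cosets. The generator rows and names are assumptions made for illustration.

```python
from itertools import product

# Assumed generators of a [7,3] code C_1 (the even-weight Hamming codewords).
GENERATORS = [
    (0, 0, 0, 1, 1, 1, 1),
    (0, 1, 1, 0, 0, 1, 1),
    (1, 0, 1, 0, 1, 0, 1),
]
N = 7

def dot(u, v):
    """Inner product over GF(2)."""
    return sum(a * b for a, b in zip(u, v)) % 2

def span(rows):
    """All GF(2) linear combinations of the rows."""
    return {
        tuple(sum(c * r[j] for c, r in zip(coeffs, rows)) % 2 for j in range(N))
        for coeffs in product((0, 1), repeat=len(rows))
    }

def dual(code):
    """All length-N words orthogonal to every word of the code."""
    return {
        w for w in product((0, 1), repeat=N)
        if all(dot(w, c) == 0 for c in code)
    }

C1 = span(GENERATORS)
C2 = dual(C1)  # the special case C_2 = C_1^perp

weakly_self_dual = C1 <= C2                     # C_1 is contained in its dual
doubly_even = all(sum(c) % 4 == 0 for c in C1)  # weights are multiples of 4
num_codewords = len(C2) // len(C1)              # 2^(dim C_2 - dim C_1)
```

For this choice the quotient has 2 cosets, corresponding to the single logical qubit of the 7-qubit code.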

3.6. Gilbert-Varshamov Bound
In the previous section we introduced CSS codes and demonstrated that the problem of constructing a quantum code could be reduced to the problem of finding two groups C_1, C_2 satisfying

C_1\subseteq C_2

C_2^{\perp}\subseteq C_1^{\perp}.

The next natural question is to ask whether such groups can in fact be found.

The Gilbert-Varshamov bound answers this question in the affirmative, ensuring that there do exist good CSS codes (the bound applies to both quantum and classical codes). It can be stated in the following way:

Theorem (Gilbert-Varshamov Bound): There exist CSS codes with rate R=(number of encoded bits)/(length of code) given by

R\geq 1-2H_2(d/n),

where d is the minimum distance of the code, d/2 is the number of errors it can correct, and H_2(x) is the Shannon entropy, defined as

H_2(x)\equiv -x\log_2 x-(1-x)\log_2(1-x).

Proof: Note that we can always take a code, apply a random linear transformation to it, and get another code. Thus each vector is equally likely to appear in a random code. Given this fact, we can estimate the probability that a code of dimension k contains a word of weight \leq d using the union bound:

P(code of dimension k has word of weight \leq d)\leq(number of words)\times P(word has weight \leq d)=2^k\cdot\frac{\sum_{i=0}^d \binom{n}{i}}{2^n}\approx \frac{2^k\cdot 2^{nH(d/n)}}{2^n}

For a code with no nonzero word of weight \leq d to exist, it suffices that this probability be less than 1, which holds asymptotically when

(k/n)+H(d/n)< 1.

We can calculate rate by noting that for a CSS code, given by C_1\subseteq C_2, C_2^{\perp}\subseteq C_1^{\perp}, with \text{dim}(C_1)=n-k, \text{dim}(C_2)=k, the expression for rate is given by

R=\frac{\text{dim}(C_2)-\text{dim}(C_1)}{n}=\frac{k-(n-k)}{n}=\frac{2k}{n}-1.

Combining this with the bound we obtained by considering probabilities, we get that

R\geq 1-2H(d/n).

Thus there exist good CSS codes.
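To get a feel for the numbers, the bound can be evaluated directly. This is a minimal sketch in plain Python; the function names and the example parameters are illustrative.

```python
import math

def binary_entropy(x):
    """Shannon entropy H_2(x) in bits of a coin with bias x."""
    if x in (0, 1):
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def css_rate_lower_bound(d, n):
    """Gilbert-Varshamov guarantee R >= 1 - 2*H_2(d/n) for CSS codes."""
    return 1 - 2 * binary_entropy(d / n)

# Example: relative distance d/n = 0.05.
rate = css_rate_lower_bound(50, 1000)
```

At relative distance 0.05 the guaranteed rate is a little under 0.43. The bound only says something useful while H_2(d/n)<1/2, which corresponds to d/n below roughly 0.11.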

3.7. Stabilizer Codes
Having discussed and constructed some examples of CSS codes, we will now discuss the stabilizer formalism. Note that this formalism allows us to construct codes without having to work directly with the states representing |0\rangle_L and |1\rangle_L, as this can quickly get unwieldy. Instead, we will work with stabilizers, operators that leave these states invariant.

To see how working directly with states can get unwieldy, we can consider the 5-qubit code. We can define it the way we defined the 9-qubit and 7-qubit codes, by directly defining the basis vectors |0\rangle_L and |1\rangle_L,

|0\rangle_L\equiv\frac{1}{4}(|00000\rangle-|01100\rangle+|00101\rangle+|01010\rangle-|01111\rangle+(symmetric under cyclic permutations)),

with |1\rangle_L defined similarly. But we can also define this code more succinctly using the stabilizer formalism. To do so, we start by choosing a commutative subgroup of the Pauli group, with generators g_i satisfying

g_i^2=I

g_ig_j=g_jg_i.

For example, for the 5-qubit code, the particular choice of generators we would need is given by

g_1\equiv IXZZX

g_2\equiv XIXZZ

g_3\equiv ZXIXZ

g_4\equiv ZZXIX.

Now we consider states \{|\psi\rangle\} that are stabilized by the \{g_i\}. That is, they satisfy

g_i|\psi\rangle=|\psi\rangle\text{ for all }i.

Note that the eigenvalues of \sigma_x, \sigma_y, and \sigma_z are \pm 1, so in the case of the 5-qubit code, there exists a 2^5/2=16-dimensional space of \{|\psi\rangle\} satisfying g_1|\psi\rangle=|\psi\rangle. Recalling that two commuting matrices are simultaneously diagonalizable, there exists a 16/2=8-dimensional space of \{|\psi\rangle\} satisfying g_1|\psi\rangle=g_2|\psi\rangle=|\psi\rangle, and so on, where we cut the dimension of the subspace in half each time we add a generator. Finally, there exists a 2^5/2^4=2-dimensional space of \{|\psi\rangle\} satisfying g_i|\psi\rangle=|\psi\rangle for all i=1,...,4. This 2-dimensional space is exactly the subspace spanned by |0\rangle_L and |1\rangle_L. Thus fixing the stabilizers is enough to give us our code.

Next we will consider all elements in the Pauli group that commute with all elements in our stabilizer group G=\langle g_1,...,g_4\rangle. As we shall see, this will give us our logical operators, where a logical operator performs an operation on a logical qubit (for example, the logical X operator, X_L, would act on the logical qubit |0\rangle_L by mapping X_L|0\rangle_L=|1\rangle_L, and so on). In the 5-qubit case we end up with a 6-dimensional nonabelian group \tilde{G}=\langle g_1,...,g_4, h_1, h_2\rangle by adding the following two elements to those elements that are in G:

h_1\equiv XXXXX

h_2\equiv ZZZZZ.

These will be our logical operators

X_L\equiv h_1

Z_L\equiv h_2

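so that

X_L|0\rangle_L=|1\rangle_L

X_L|1\rangle_L=|0\rangle_L

Z_L|0\rangle_L=|0\rangle_L

Z_L|1\rangle_L=-|1\rangle_L.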

Note that this code has distance 3 and corrects 1 error because 3 is the minimum Hamming weight of an element of \tilde{G} that is not in G. (To see that weight 3 is attained, note that, up to an overall phase, XXXXX\cdot IXZZX=XIYYI has Hamming weight 3.)
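The commutation relations and the distance can be checked by enumeration. The sketch below, in plain Python, represents Pauli strings as text and multiplies them while ignoring overall phases. It takes h_1=XXXXX and h_2=ZZZZZ as the two extra generators, which is an assumption of this sketch, and all function names are illustrative.

```python
from itertools import product

# Pauli letters as (x, z) bit pairs; multiplication up to phase is XOR.
BITS = {"I": (0, 0), "X": (1, 0), "Y": (1, 1), "Z": (0, 1)}
LETTER = {bits: letter for letter, bits in BITS.items()}

def multiply(p, q):
    """Product of two Pauli strings, ignoring the overall phase."""
    return "".join(
        LETTER[(BITS[a][0] ^ BITS[b][0], BITS[a][1] ^ BITS[b][1])]
        for a, b in zip(p, q)
    )

def commute(p, q):
    """True if the Pauli strings commute (they anticommute otherwise)."""
    clashes = sum(a != "I" and b != "I" and a != b for a, b in zip(p, q))
    return clashes % 2 == 0

def generate(generators):
    """All products of the generators, ignoring phases."""
    n = len(generators[0])
    group = set()
    for picks in product((0, 1), repeat=len(generators)):
        element = "I" * n
        for pick, g in zip(picks, generators):
            if pick:
                element = multiply(element, g)
        group.add(element)
    return group

def weight(p):
    """Number of qubits on which the Pauli string acts nontrivially."""
    return sum(letter != "I" for letter in p)

stabilizers = ["IXZZX", "XIXZZ", "ZXIXZ", "ZZXIX"]
logicals = ["XXXXX", "ZZZZZ"]  # assumed h_1 and h_2

G = generate(stabilizers)
G_tilde = generate(stabilizers + logicals)

stabilizers_commute = all(commute(p, q) for p in stabilizers for q in stabilizers)
logicals_commute_with_G = all(commute(h, g) for h in logicals for g in stabilizers)
distance = min(weight(p) for p in G_tilde - G)
```

Ignoring phases, G has 16 elements and \tilde{G} has 64, and the lightest element of \tilde{G} outside G has weight 3. The two logical operators anticommute with each other, which is what makes \tilde{G} nonabelian.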

Why is Hamming weight 2 not enough to correct one error? If we had, for example, XZIII\in\tilde{G}, then we would have

X_1|\psi_1\rangle=Z_2|\psi_2\rangle

for |\psi_1\rangle, |\psi_2\rangle both in the code, which means that we wouldn’t be able to distinguish an X_1 error from a Z_2 error.

Note that, in general, when x\in\tilde{G}, x|\psi\rangle will be in the code, so elements of \tilde{G} map codewords to codewords. We can prove this fact by noting that, for every generator g_i,

g_ix|\psi\rangle=xg_i|\psi\rangle=x|\psi\rangle.

Note also that in the examples we’ve been dealing with so far, where we have a commuting subgroup of the Pauli group, our codes correspond to classical, additive, weakly self-dual codes over GF(4). Here GF(4)=\{0,1,\omega,\bar{\omega}\} (with group elements \{\omega, \bar{\omega},1\} corresponding to the third roots of unity) is the finite field on 4 elements, and multiplying Pauli matrices corresponds to group addition. Specifically,

X\equiv 1

Y\equiv \omega

Z\equiv \bar{\omega}

I\equiv 0




We have now concluded our discussion of quantum error-correcting codes. In the next section we will shift gears and look at quantum channels and channel capacities.

4. Quantum Channels
In this final section we will look at quantum channels and channel capacities.

4.1. Definition and Examples

4.1.1. Definition
We know that we want to define a quantum channel to take a quantum state as input. What should the output be? As a first attempt we might imagine having the output be a probability distribution \{p_i\} over states \{|\psi_i\rangle\}. It turns out that for a more succinct description, we can have both the input and output be a density matrix.

Recall that a density matrix takes the form

\rho=\sum_i p_i|\psi_i\rangle\langle\psi_i|

representing a probability distribution over pure states |\psi_i\rangle. \rho must also be Hermitian, and it must satisfy \text{Tr}(\rho)=1 (equivalently, we must have \sum_i p_i=1).

Now we may define a quantum channel as the map \eta that takes

\eta:\rho\mapsto\sum_i E_i\rho E_i^{\dagger},

where the operators E_i satisfy

\sum_i E_i^{\dagger}E_i=I.

To see that the output is in fact a density matrix, note that the output expression is clearly Hermitian and can be shown to have unit trace using the cyclical property of traces. Note also that the decomposition into \{E_i\} need not be unique.

4.1.2. Example Quantum Channels
Next we give a few examples of quantum channels. The dephasing channel is given by the map

\rho\mapsto(1-p)\rho+p\sigma_z\rho\sigma_z.

It maps

\left(\begin{array}{cc}\alpha&\beta\\\gamma&\delta\end{array}\right)\mapsto \left(\begin{array}{cc}\alpha&(1-2p)\beta\\(1-2p)\gamma&\delta\end{array}\right),

so it multiplies off-diagonal elements by the factor 1-2p, which has magnitude at most 1. Note that when p=1/2, it maps

\alpha|0\rangle+\beta|1\rangle\mapsto|\alpha|^2|0\rangle\langle 0|+|\beta|^2|1\rangle\langle 1|,

which means that it turns superpositions into classical mixtures (hence the name “dephasing”).

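Another example is the amplitude damping channel, which models an excited state decaying to a ground state. It is given by

\rho\mapsto E_0\rho E_0^{\dagger}+E_1\rho E_1^{\dagger},

E_0=\left(\begin{array}{cc}1&0\\0&\sqrt{1-p}\end{array}\right),\quad E_1=\left(\begin{array}{cc}0&\sqrt{p}\\0&0\end{array}\right).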

Here we let the vector |0\rangle=(1, 0) denote the ground state, and we let the vector |1\rangle=(0, 1) denote the excited state. Thus we can see that the channel maps the ground state to itself, |0\rangle\mapsto|0\rangle, while the excited state |1\rangle gets mapped to |0\rangle with probability p and stays at |1\rangle with probability 1-p.
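Both example channels can be applied to small density matrices in plain Python. The operator decompositions used below (\sqrt{1-p}\,I and \sqrt{p}\,\sigma_z for dephasing, and the usual two-operator form for amplitude damping) are assumed choices, since the decomposition of a channel is not unique. All names are illustrative.

```python
import math

def matmul(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def apply_channel(operators, rho):
    """Return sum_i E_i rho E_i^dagger."""
    out = [[0j, 0j], [0j, 0j]]
    for e in operators:
        term = matmul(matmul(e, rho), dagger(e))
        out = [[out[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return out

def dephasing(p):
    """Operators sqrt(1-p) I and sqrt(p) Z (an assumed decomposition)."""
    a, b = math.sqrt(1 - p), math.sqrt(p)
    return [[[a, 0], [0, a]], [[b, 0], [0, -b]]]

def amplitude_damping(p):
    """Operators for decay |1> -> |0> with probability p (assumed form)."""
    return [[[1, 0], [0, math.sqrt(1 - p)]], [[0, math.sqrt(p)], [0, 0]]]

plus = [[0.5, 0.5], [0.5, 0.5]]  # density matrix of (|0>+|1>)/sqrt(2)
excited = [[0, 0], [0, 1]]       # density matrix of |1>

dephased = apply_channel(dephasing(0.5), plus)
decayed = apply_channel(amplitude_damping(0.25), excited)
```

With p=1/2 the dephased state has vanishing off-diagonal entries, and with p=0.25 the excited state ends up in |0\rangle with probability 0.25 and in |1\rangle with probability 0.75. In both cases the trace stays equal to 1.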

4.2. Quantum Channel Capacities
Now we consider the capacity of quantum channels, where the capacity quantifies how much information can make it through the channel. We consider classical channels, classical information sent over quantum channels, and quantum information sent over quantum channels. First we start off with the example of the quantum erasure channel to demonstrate that quantum channels behave differently from classical channels, and then we give the actual expressions for the channel capacities before revisiting the example of the quantum erasure channel.

4.2.1. Example: Quantum Erasure Channel
First we start with the example of the quantum erasure channel, which given a state |\psi\rangle replaces it by an orthogonal state |E\rangle with probability p and returns the same state |\psi\rangle with probability 1-p. We claim that the erasure channel can’t transmit quantum information when p\geq 0.5, behavior that is markedly different from that of classical information. That is to say, for p\geq 0.5, there is no way to encode quantum information to send it through the channel and then decode it so the receiver gets a state close to the state that was sent.

To see why this is the case, assume the contrary, that there do exist encoding and decoding protocols that send quantum information through quantum erasure channels with erasure rate p\geq 0.5. We will show that this violates the no-cloning theorem. Suppose that A does the following: for each qubit in the encoded state, she tosses a fair coin. If the coin lands heads, she sends C the state |E\rangle and sends B the channel input with probability 2(1-p) and the erasure state |E\rangle otherwise. If the coin lands tails, she sends B the state |E\rangle and sends C the channel input with probability 2(1-p) and the erasure state otherwise. (The condition p\geq 0.5 is exactly what makes 2(1-p) a valid probability.) Each receiver then obtains the channel input with total probability \frac{1}{2}\cdot 2(1-p)=1-p, so this implements an erasure channel with erasure probability p to both receivers B and C, which means that A can use this channel to transmit an encoding of |\psi\rangle to both receivers, which in turn means that both receivers will be able to decode |\psi\rangle. But this means that A has just used this channel to clone the quantum state |\psi\rangle, resulting in a contradiction. Thus no quantum information can be transmitted through a channel with p\geq 0.5. Note, however, that we can send classical information over this channel, so the behavior of quantum and classical information is markedly different.
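The bookkeeping in this argument can be checked with exact arithmetic. The sketch below solves for the forwarding probability q that A must use so that each receiver sees an erasure channel with erasure probability p, and then recomputes the erasure probability seen by one receiver. The function names are illustrative.

```python
from fractions import Fraction

def forwarding_probability(p):
    """Probability q with which A forwards the input to the favoured receiver.

    A favours each receiver with probability 1/2, so a receiver obtains the
    input with probability q/2. Matching an erasure channel with erasure
    probability p requires q/2 = 1 - p.
    """
    q = 2 * (1 - p)
    if not 0 <= q <= 1:
        raise ValueError("the protocol needs 1/2 <= p <= 1")
    return q

def receiver_erasure_probability(q):
    """Erasure probability seen by B (and, by symmetry, by C)."""
    half = Fraction(1, 2)
    # Heads: C is erased for sure, B is erased with probability 1 - q.
    # Tails: B is erased for sure.
    return half * (1 - q) + half * 1

p = Fraction(3, 5)
q = forwarding_probability(p)
seen_by_each = receiver_erasure_probability(q)
```

For p=3/5 this gives q=4/5 and an erasure probability of 3/5 at each receiver. For p<1/2 the required q would exceed 1, which is where the cloning argument stops applying.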

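It turns out that the rate of quantum information sent over the erasure channel, as a function of p, is given by the following graph:

[Figure: rate of quantum information versus erasure probability p, a straight line falling from 1 at p=0 to 0 at p=0.5, and equal to 0 for p\geq 0.5]

while the rate of classical information sent over the erasure channel, as a function of p, is given by the following graph:

[Figure: rate of classical information versus erasure probability p, a straight line falling from 1 at p=0 to 0 at p=1]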


Next we will formally state the definition of channel capacity, and then we will return to the quantum erasure channel example and derive the curve that plots rate against p.

4.2.2. Definition of Channel Capacities
Channel capacity is defined as the maximum rate at which information can be communicated over many independent uses of a channel from sender to receiver. Here we list the expressions for channel capacity for classical channels, classical information over a quantum channel, and quantum information over a quantum channel.

Classical Channel Capacity For a classical channel this expression is just the maximum mutual information between input and output, taken over all input distributions,

\max_X H(\eta(X))-H(\eta(X)|X),

where X is the input information and \eta(X) is the output information after having gone through the channel \eta.

Classical Information Over a Quantum Channel The capacity for classical information sent over a quantum channel is given by

\max_{\{p_i,\rho_i\}} H(\eta(\rho))-\sum_i p_iH(\eta(\rho_i))

up to regularization, where \rho=\sum_i p_i\rho_i is the average input state, and \eta is the channel.

Note that we would regularize this by using n copies of the channel (that is to say, we want the output of \eta^{\otimes n}, where the input states \rho_i may now be entangled across the n uses) and then dividing by n, to get an expression like the following for the regularized capacity of classical information over a quantum channel:

\lim_{n\rightarrow\infty}\max_{\{p_i,\rho_i\}} [H(\eta^{\otimes n}(\rho))-\sum_i p_i H(\eta^{\otimes n}(\rho_i))]/n.

Quantum Information The capacity for quantum information is given by the expression

\max_\rho H(\eta(\rho))-H((\eta\otimes I)\Phi_\rho),

also up to regularization. Here \eta(\rho) is the output when channel \eta acts on input state \rho, while \Phi_\rho is the purification of \rho (that is, it is a pure state containing \rho that we can obtain by enlarging the Hilbert space). The regularized capacity for quantum information looks like the following:

\lim_{n\rightarrow\infty}\max_\rho [H(\eta^{\otimes n}(\rho))-H((\eta^{\otimes n}\otimes I)\Phi_\rho)]/n.

Now that we have the exact expression that allows us to calculate the quantum channel capacity, we will revisit our example of the quantum erasure channel and reproduce the plot of channel rate vs erasure probability.

4.2.3. Example Revisited: Quantum Erasure Channel
Recall that, up to regularization, the capacity of a quantum channel is given by

\max_\rho H(\eta(\rho))-H((\eta\otimes I)\Phi_\rho).

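We will directly calculate this expression for the example of the quantum erasure channel. Let the input \rho be given by the density matrix for the completely mixed state,

\rho=\frac{1}{2}\left(\begin{array}{cc}1&0\\0&1\end{array}\right),

so that the purification of \rho is given by the state

\Phi_\rho=\frac{1}{\sqrt{2}}(|00\rangle+|11\rangle).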

Recall that the erasure channel replaces our state with |E\rangle with probability p, while with probability 1-p it leaves the input state unchanged. Then, in the basis \{|0\rangle, |1\rangle, |E\rangle\}, the matrix corresponding to \eta(\rho) is given by

\eta(\rho)=\left(\begin{array}{ccc}\frac{1-p}{2}&0&0\\0&\frac{1-p}{2}&0\\0&0&p\end{array}\right),

while in the basis \{|00\rangle, |01\rangle, |10\rangle, |11\rangle, |0E\rangle, |1E\rangle\}, the matrix corresponding to (\eta\otimes I)\Phi_\rho is given by

(\eta\otimes I)\Phi_\rho=\left(\begin{array}{cccccc}\frac{1-p}{2}&0&0&\frac{1-p}{2}&0&0\\0&0&0&0&0&0\\0&0&0&0&0&0\\\frac{1-p}{2}&0&0&\frac{1-p}{2}&0&0\\0&0&0&0&\frac{p}{2}&0\\0&0&0&0&0&\frac{p}{2}\end{array}\right)

We can directly calculate that

H(\eta(\rho))=H_2(p)+(1-p)

H((\eta\otimes I)\Phi_\rho)=H_2(p)+p.

Then, subtracting the two entropies, we can calculate the rate to be

H(\eta(\rho))-H((\eta\otimes I)\Phi_\rho)=1-2p,

which corresponds exactly to the line we saw on the diagram that plotted rate as a function of p for the quantum erasure channel.
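The two entropies can be recomputed from the eigenvalues of the matrices in plain Python. The eigenvalues are read off by hand in this sketch: \eta(\rho) is diagonal, and (\eta\otimes I)\Phi_\rho has one 2-by-2 block with eigenvalues 1-p and 0 plus two diagonal entries p/2. The function names are illustrative.

```python
import math

def entropy(eigenvalues):
    """Von Neumann entropy in bits from a list of eigenvalues."""
    return -sum(x * math.log2(x) for x in eigenvalues if x > 0)

def erasure_coherent_information(p):
    """H(eta(rho)) - H((eta x I) Phi_rho) for the completely mixed input."""
    # eta(rho) is diagonal with entries (1-p)/2, (1-p)/2, p.
    output_entropy = entropy([(1 - p) / 2, (1 - p) / 2, p])
    # (eta x I) Phi_rho has eigenvalue 1-p (the entangled block) and
    # eigenvalue p/2 twice (the erased block).
    joint_entropy = entropy([1 - p, p / 2, p / 2])
    return output_entropy - joint_entropy

rates = {p: erasure_coherent_information(p) for p in (0.0, 0.1, 0.25, 0.5)}
```

Each computed value agrees with 1-2p up to rounding, reaching 0 at p=0.5.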


