WILLIAMS COLLEGE LIBRARIES
Your unpublished thesis, submitted for a degree at Williams College and administered by the Williams College Libraries, will be made available for research use. You may, through this form, provide instructions regarding copyright, access, dissemination and reproduction of your thesis.
_ The faculty advisor to the student writing the thesis wishes to claim joint authorship in this work.
In each section, please check the ONE statement that reflects your wishes.
I. PUBLICATION AND QUOTATION: LITERARY PROPERTY RIGHTS A student author automatically owns the copyright to his/her work, whether or not a copyright symbol and date are placed on the piece. The duration of U.S. copyright on a manuscript (and Williams theses are considered manuscripts) is the life of the author plus 70 years.
_ I/we do not choose to retain literary property rights to the thesis, and I wish to assign them immediately to Williams College.
_ I/we wish to retain literary property rights to the thesis for a period of three years, at which time the literary property rights shall be assigned to Williams College.
_ I/we wish to retain literary property rights to the thesis for a period of __ years, or until my death, whichever is the later, at which time the literary property rights shall be assigned to Williams College.
II. ACCESS The Williams College Libraries are investigating the posting of theses online, as well as their retention in hardcopy.
_ Williams College is granted permission to maintain and provide access to my thesis in hardcopy and via the Web both on and off campus.
_ Williams College is granted permission to maintain and provide access to my thesis in hardcopy and via the Web for on-campus use only.
_ The thesis is to be maintained and made available in hardcopy form only.
III. COPYING AND DISSEMINATION Because theses are listed on FRANCIS, the Libraries receive numerous requests every year for copies of works. If/when a hardcopy thesis is duplicated for a researcher, a copy of the release form always accompanies the copy. Any digital version of your thesis will include the release form.
_ Copies of the thesis may be provided to any researcher.
_ Copying of the thesis is restricted for _ years, at which time copies may be provided to any researcher.
_ Copying of the thesis or portions thereof, except as needed to maintain an adequate number of research copies available in the Williams College Libraries, is expressly prohibited. The electronic version of the thesis will be protected against duplication.
Signed (student author): Signature Removed
Signature Removed
Accepted for the Libraries. Date accepted:
Extracting Classical Information from Quantum
Systems with Limited Entangled Resources
Daniel O. King
Professor William K. Wootters, Advisor
A thesis submitted in partial fulfillment
of the requirements for the
Degree of Bachelor of Arts with Honors
in Physics
Williams College
Williamstown, Massachusetts
May 20, 2009
Abstract
Nonlocality in quantum states, called entanglement, has received much attention throughout the history of the theory of quantum mechanics. However, other subtle types of nonlocality are also present in the theory. For example, there exist collections of unentangled bipartite quantum states for which the optimal measurement for discriminating among the states is nonlocal [1, 2, 3]. That is, an ensemble of quantum states can have nonlocal properties even though all of the individual states are unentangled.
To study this type of nonlocality, we wish to ask the following question: How much classical information can Alice and Bob extract from a bipartite quantum system whose initial state is described by an ensemble of quantum states E, given that they possess a limited amount of entangled resources? Essentially, we wish to know how well Alice and Bob can discriminate among the collection of states E if they are allowed a limited (but nonzero) amount of quantum communication. To quantify this, we define the "entanglement-information function" of E, which gives the minimum amount of entanglement that Alice and Bob must initially possess in order to obtain a given amount of information. We then investigate the entanglement-information functions for two specific choices of E by looking for upper and lower bounds. To find upper bounds, we look for particular procedures that Alice and Bob can use to distinguish among the possible initial states, and to find lower bounds, we show that, given an amount of information I, a contradiction would occur if they could obtain I by sharing less than a certain amount of entanglement. In one case, we are able to explicitly find the entanglement-information function by finding upper and lower bounds which agree.
Acknowledgments
I would like to thank Professor Wootters for introducing me to quantum information theory, as well as for his incredible patience, generosity, and inspiring mentorship. Also, I would like to thank Professor Strauch for giving my draft a very close reading and offering so much helpful commentary.
In addition, I would like to thank the entire Williams College Physics Department for fostering such a warm and caring environment as well as offering many challenges over the past four years, each of which was a joy to tackle. I began my Williams career just taking physics as an elective, but I was hooked before very long.
Finally, I would like to thank my family and friends for all of the support they have given me throughout this project. Thanks to my parents, who always come to support whatever I'm doing; to Teng Jian, Scott, and Rahul for many interesting and enlightening (if not always completely serious) discussions about physics, mathematics, music, and life; to Travis, Duane, Greg, and Luc, my "other family;" and to Shyla, for always being there for me, even on the hardest days.
Contents
1 Introduction 1
2 Background Information and Definitions 5
2.1 Quantum Mechanics . . 5
2.1.1 Quantum States . 5
2.1.2 Measurements . 9
2.2 Probability and Classical Information Theory 11
2.2.1 Bayes' Law . 11
2.2.2 Shannon Entropy . 11
2.2.3 Mutual Information 12
2.3 Quantum Information Theory 13
2.3.1 The Partial Trace ... 13
2.3.2 Local Operations and Classical Communication 13
2.3.3 Von Neumann Entropy ... 14
2.3.4 Entropy of Entanglement .. 15
2.3.5 Entanglement of Formation 16
2.3.6 Teleportation 17
2.4 Related Work 19
3 Statement of Our Problem 21
3.1 Setup . 21
3.2 The Entanglement-Information Function 22
3.3 Bounding the Entanglement-Information Function 25
4 The Lopsided Bell Ensemble 28
4.1 Introduction.............. 28
4.2 A Unitary Transformation Procedure 30
4.3 A "Bad Teleportation" Procedure 33
4.4 The Entanglement-Information Function for a = 1/√2 38
5 The Double Trine Ensemble 44
5.1 Introduction.................... 44
5.2 Two Measurements . 45
5.2.1 A Non-Local, Unentangled Measurement 45
5.2.2 A Local Measurement . 48
5.3 A Unitary Transformation Procedure . 51
5.3.1 Realizing E and G as Orthogonal Measurements 51
5.3.2 Finding the Appropriate Unitary Transformation 53
5.3.3 The Cost of W(a) 54
5.3.4 Calculating the Procedural Information. 56
6 Conclusion 60
6.1 Summary of Results . 60
6.2 Ideas for future research . 60
6.2.1 Finding Lower Bounds on E(I) for other ensembles 60
6.2.2 Analyzing additional procedures and ensembles 61
6.3 Final Remarks 62
A Performing the Transformations U(γ) and W(a) 63
A.1 Performing U(γ) 63
A.2 Performing W(a) 66
B Proof of Lemma 4.2 69
Chapter 1
Introduction
One of the most significant ways in which quantum mechanics differs from classical physics is that it is a nonlocal theory. That is, according to quantum mechanics, not all of the information about a particle is contained at the location of the particle itself. For example, consider two spin-1/2 particles in the singlet state (1/√2)(|↑↓⟩ − |↓↑⟩). This state is entangled, which means that it cannot be written as a product of a state of the first particle with a state of the second particle. If two particles are in an entangled state, neither particle has a definite state of its own, and results of measurements on the two particles are correlated with one another. For the singlet state, if the z-component of the spin is measured for both particles, the two measurements are certain to return opposite outcomes, even though a measurement on just one of the two particles has a 50-50 chance of returning |↑⟩ or |↓⟩. In fact, if we measure the components of the spins of the two particles along any axis, we will always find that the two particles return opposite outcomes. In this way, entanglement violates our intuitions from classical physics, because the result of the measurement on one of the particles determines the result of a subsequent measurement on the other particle, even if the particles are (in principle) across the universe from each other. In that case, the first particle would not be able to "send a message" to the second particle telling it what the outcome of the first measurement was, or even what the first measurement was, because the message would have to travel at no more than the speed of light and would not reach the second particle before the second measurement was carried out. It was this strange property that motivated the famous paper by Einstein, Podolsky, and Rosen which attempted to argue that quantum mechanics is
an incomplete theory [4].
Given the apparent nonlocality of the theory of quantum mechanics, one might wonder whether or not it is possible to construct a local theory which mimics the predictions of quantum mechanics. In 1964, J.S. Bell famously proved that no such "local hidden variable theory" could be consistent with all of the predictions of quantum mechanics [5]. In particular, if the spin of one particle of a singlet-state pair is measured along some axis u and the spin of the other particle is measured along an axis rotated 45° with respect to u, the results of the two measurements are too strongly correlated to be predicted by any local theory. Apparently, then, the nonlocality of quantum mechanics is a fundamental property of the theory and not simply a superficial aspect which can be explained away by looking for a more complete extension of the theory.
Much work has been done to characterize and quantify the entanglement of states like the singlet [6], as well as more general states [7, 8, 9]. Over the past couple of decades, many authors have also begun studying nonlocality arising in measurements. For example, suppose that Alice and Bob each hold one part of a quantum system (we'll label these parts A and B, respectively). It is well known that there are some measurements on A and B which cannot be done locally. For example, the measurement on two spin-1/2 particles with outcomes (1/√2)(|↑↓⟩ − |↓↑⟩), (1/√2)(|↑↓⟩ + |↓↑⟩), (1/√2)(|↑↑⟩ − |↓↓⟩), and (1/√2)(|↑↑⟩ + |↓↓⟩) cannot be done locally. That is, this measurement cannot be done if Alice and Bob are each only able to perform measurements on their own part of the system, unless they share some additional entanglement. Thus, it is natural to ask how much entanglement Alice and Bob need to share to perform particular measurements. Several studies have been done on this topic (e.g. [10, 11, 1]). Interestingly, in Ref. [1] it is shown that there exists a measurement on two qutrits (three-dimensional quantum systems) whose outcomes are all mutually orthogonal and unentangled but which cannot be done locally. Apparently, the nonlocal properties of quantum mechanics extend beyond just entanglement, because this measurement has nonlocal properties despite the fact that its outcomes are unentangled. On the other hand, several studies have been done on the topic of how well Alice and Bob can distinguish among certain collections of possible initial states of A and B given that they share no additional entanglement (e.g. [12, 13]). In this thesis, we wish to ask a question related to both of these: How well can Alice and Bob distinguish among certain collections of possible initial states of A and B given that they share a limited (but nonzero) amount of entanglement?
Suppose that Alice and Bob possess parts A and B, respectively, of a quantum system, and suppose that they know that the joint state of A and B is one of the states {|ψ₁⟩, |ψ₂⟩, ..., |ψₙ⟩}. We can imagine that each of the states |ψᵢ⟩ represents one letter of an n-letter alphabet and that the initial joint state of A and B encodes one letter of a message that Alice and Bob are attempting to read. In this setup, investigating how well Alice and Bob can distinguish among the possibilities |ψᵢ⟩ given a limited amount of entanglement tells us how much classical information Alice and Bob can extract from A and B if they use up some entanglement, a quantum information-theoretic quantity. This, in turn, gives us a deeper understanding of the general relationship between classical and quantum information.
To motivate our question more concretely, consider the following scheme, which was inspired by Ref. [14]. Alice and Bob work in an office, and their boss, Carol, needs them to manipulate some classical information (for one reason or another). However, this information is highly sensitive, so rather than encoding the information classically, Carol elects to encode the classical information into some bipartite quantum systems and then give half of each system to Alice and half to Bob. If Carol chooses the states of these systems appropriately (for example, as is described in [14]), Alice and Bob will not be able to read the information very effectively without sharing some entanglement. Presumably, Carol somehow prevents them from sharing entanglement, which keeps them from being able to read the sensitive information. However, if Alice and Bob were somehow able to sneak some entanglement into the office, they would be able to read the encoded information with more accuracy. As one can imagine such "quantum data hiding" [15] schemes actually being used someday, knowing how much classical information Alice and Bob can extract from a quantum system using a limited amount of entanglement will essentially tell us how bad it would be if they were able to sneak some entanglement into the office for the purpose of reading the sensitive information.
Outline of the Thesis
After reviewing some relevant theory and prior work in Chapter 2, we state our problem more formally in Chapter 3. In Chapters 4 and 5, we consider two specific collections of possible initial joint states of A and B, the Lopsided Bell Ensemble and the Double Trine Ensemble. For each of these two cases, we analyze a procedure that Alice and Bob might use in an attempt to distinguish among the possible initial states of A and B, in which Alice and Bob perform a nonlocal unitary transformation (which requires some entanglement) in order to convert the possible initial states into a collection of states which can be distinguished optimally by a local measurement. For the Lopsided Bell Ensemble, we consider an additional procedure in which Alice attempts to teleport part A to Bob using less than one full unit of entanglement, the result of which is that Bob receives a distorted version of the initial state of A. He can then make any measurement on the two particles locally. We prove that for the Bell Ensemble (a special case of the Lopsided Bell Ensemble), this "bad teleportation" procedure is optimal, in the sense that it requires the minimum amount of entanglement on average in order to extract at least a certain amount of classical information.
Chapter 2
Background Information and Definitions
2.1 Quantum Mechanics
2.1.1 Quantum States Pure States
The most basic type of quantum state is the pure state, which is represented by a normalized vector in a complex Hilbert space, called the state space of the system. For a system whose state space is two-dimensional, such as the polarization state of a photon or the spin state of a spin-1/2 particle, we use the generic basis vectors

|0⟩ ≡ (1, 0)ᵀ and |1⟩ ≡ (0, 1)ᵀ.  (2.1)

Such a system is called a qubit (short for "quantum bit"). Similarly, for a system with a three-dimensional state space (a qutrit), we use the basis vectors

|0⟩ ≡ (1, 0, 0)ᵀ, |1⟩ ≡ (0, 1, 0)ᵀ, |2⟩ ≡ (0, 0, 1)ᵀ,  (2.2)
and so on for systems with higher dimensional state spaces. When we represent a pure state of a closed quantum system by a complex vector, we are free to choose the overall phase of the vector however we like, because the overall phase of the vector actually has no physical significance. For example, the vectors |0⟩ and e^{iφ}|0⟩ for all intents and purposes represent the same physical quantum state, as long as the system is closed. Since we shall only be concerned with closed quantum systems, we will use such vectors interchangeably. However, we may not freely choose relative phase factors. For example, the vectors (1/√2)(|0⟩ + |1⟩) and (1/√2)(|0⟩ + e^{iφ}|1⟩) represent different physical states if φ ≠ 0 mod 2π.

Notice that, due to our freedom in choosing overall phase factors, we may write any pure state |ψ⟩ of a qubit in the following form:

|ψ⟩ = cos(θ/2)|0⟩ + e^{iφ} sin(θ/2)|1⟩,  0 ≤ θ ≤ π, 0 ≤ φ < 2π.  (2.3)

Therefore, we may associate the state |ψ⟩ with the point (θ, φ) on the unit sphere in real three-space. This gives rise to an especially convenient representation of qubit pure states, called the Bloch Sphere, on which the north and south poles represent the basis states |0⟩ and |1⟩, respectively, as is shown in Figure 2.1. Note that orthogonal states correspond to antipodal points on the Bloch Sphere.
Figure 2.1: Representation of the vector |ψ⟩ on the Bloch Sphere.
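As a small numerical sketch (assuming the standard Bloch parametrization of Eq. (2.3)), the map from angles (θ, φ) to a qubit state, and the fact that antipodal points give orthogonal states, can be checked directly:

```python
import numpy as np

def bloch_state(theta, phi):
    """Qubit pure state at Bloch-sphere angles (theta, phi), per Eq. (2.3)."""
    return np.array([np.cos(theta / 2), np.exp(1j * phi) * np.sin(theta / 2)])

# North and south poles give the basis states |0> and |1>.
print(bloch_state(0, 0))      # [1, 0]
print(bloch_state(np.pi, 0))  # approximately [0, 1]

# Antipodal points (theta, phi) and (pi - theta, phi + pi) are orthogonal.
psi = bloch_state(0.7, 1.2)
psi_anti = bloch_state(np.pi - 0.7, 1.2 + np.pi)
print(abs(np.vdot(psi, psi_anti)))  # ~ 0
```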
Density Matrices and Mixed States
A more general way of representing a quantum state is the density matrix. The density matrix of a quantum system is a Hermitian matrix whose eigenvalues are nonnegative (that is, it is positive semidefinite) and whose trace is equal to 1. If a pure state is represented by a vector |ψ⟩, the state can equivalently be represented by the density matrix ρ = |ψ⟩⟨ψ|. However, density matrices can also be used to represent more general quantum states, called mixed states, which cannot be represented by vectors. Roughly speaking, a mixed state is a state in which the system might be found in several different pure states, each with some probability. More precisely, a mixed state is represented by a density matrix of the form ρ = Σᵢ aᵢ|ψᵢ⟩⟨ψᵢ| where the |ψᵢ⟩'s are pure state vectors, each aᵢ ≥ 0, and Σᵢ aᵢ = 1. Here, aᵢ represents the probability that the system is in the state |ψᵢ⟩.
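The defining properties of a density matrix are easy to verify numerically. Here is a sketch using a hypothetical mixture (|0⟩ with probability 3/4 and |+⟩ = (1/√2)(|0⟩ + |1⟩) with probability 1/4, chosen only for illustration):

```python
import numpy as np

# A mixed state: |0> with probability 3/4 and |+> with probability 1/4.
ket0 = np.array([1.0, 0.0])
plus = np.array([1.0, 1.0]) / np.sqrt(2)
rho = 0.75 * np.outer(ket0, ket0) + 0.25 * np.outer(plus, plus)

print(np.trace(rho))                   # 1.0: unit trace
print(np.linalg.eigvalsh(rho))         # both eigenvalues nonnegative
print(np.allclose(rho, rho.conj().T))  # True: Hermitian
```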
Unitary Transformations
Let U be an operator on a complex vector space with the property that U† = U⁻¹, that is, UU† = U†U = I,  (2.4)
where I is the identity operator. Then U is called a unitary operator (or unitary transformation). Mathematically speaking, unitary operators are the generalization to complex vector spaces of rotations in Euclidean space. As such, U preserves the lengths of vectors, as well as the inner products between vectors.
Physically, unitary transformations represent the class of reversible transformations that can be performed on a quantum system. If a system is in the state |ψ⟩ and a unitary transformation U is performed on it, the new state of the system is U|ψ⟩, and performing the transformation U† on the system will return it to its original state, because U†(U|ψ⟩) = I|ψ⟩ = |ψ⟩.
If the state of a system is instead described by a density matrix ρ, applying a unitary transformation U to the system yields the state UρU†.
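These properties can be sketched with a standard one-qubit unitary (the Hadamard gate, chosen here only as an example; it does not appear in the text above):

```python
import numpy as np

# Hadamard gate, a standard one-qubit unitary.
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
print(np.allclose(H.conj().T @ H, np.eye(2)))  # True: H is unitary

psi = np.array([0.6, 0.8j])
# Applying H and then its adjoint returns the original state.
print(np.allclose(H.conj().T @ (H @ psi), psi))  # True

# For a density matrix, U acts as rho -> U rho U^dagger; the trace is preserved.
rho = np.outer(psi, psi.conj())
rho_out = H @ rho @ H.conj().T
print(np.isclose(np.trace(rho_out).real, 1))  # True
```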
The Tensor Product
We will often wish to consider the joint quantum state of several systems. The tensor product formalism allows us to combine multiple systems into a single, larger system. If one system is in the state |ψ⟩ and another is in the state |φ⟩, then the joint state of the two systems is the tensor product of the two individual states, denoted |ψ⟩ ⊗ |φ⟩. Also, if Alice performs a unitary operation A on her system and Bob performs a unitary operation B on his system, the resulting global unitary operation performed on the joint state of the two systems is A ⊗ B.
To illustrate the action of the tensor product on matrices and vectors, we will use a couple of examples. Let

|ψ⟩ = (a₁, a₂)ᵀ, |φ⟩ = (b₁, b₂, b₃)ᵀ, A = [[w, x], [y, z]], and B = [[k, l], [m, n]].  (2.5)

Then

|ψ⟩ ⊗ |φ⟩ = (a₁b₁, a₁b₂, a₁b₃, a₂b₁, a₂b₂, a₂b₃)ᵀ  (2.6)

and

A ⊗ B = [[wk, wl, xk, xl], [wm, wn, xm, xn], [yk, yl, zk, zl], [ym, yn, zm, zn]].  (2.7)

In general, if u and v are vectors or matrices, u ⊗ v is equal to u with each entry replaced by an entire copy of v multiplied by that entry.
The tensor product is associative and distributive, but it is not commutative. Also, the tensor product behaves nicely with the inner product,

(⟨ψ₁| ⊗ ⟨φ₁|)(|ψ₂⟩ ⊗ |φ₂⟩) = ⟨ψ₁|ψ₂⟩⟨φ₁|φ₂⟩,  (2.8)

and with the outer product,

(|ψ₁⟩ ⊗ |φ₁⟩)(⟨ψ₂| ⊗ ⟨φ₂|) = |ψ₁⟩⟨ψ₂| ⊗ |φ₁⟩⟨φ₂|.  (2.9)
To simplify our notation, we will sometimes omit the ⊗ symbol if no confusion could arise. For example, we may write |ψ⟩|φ⟩ instead of |ψ⟩ ⊗ |φ⟩. Also, we will write the tensor product of two basis states as a single ket, such as, for example, |0⟩ ⊗ |1⟩ ≡ |01⟩.
It is worthwhile to explicitly note that composite systems are allowed to be in states which cannot be written as tensor products. For example, the singlet state, (1/√2)(|01⟩ − |10⟩), cannot be written as a tensor product of a state of the first particle with a state of the second particle.
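A sketch of these points with numpy, whose `kron` function implements the tensor product of Eqs. (2.6) and (2.7) (the matrices A and B below are arbitrary illustrative choices):

```python
import numpy as np

ket0 = np.array([1.0, 0.0])
ket1 = np.array([0.0, 1.0])

# np.kron implements the tensor product.
print(np.kron(ket0, ket1))  # |01> = [0, 1, 0, 0]

# (A ⊗ B)(|psi> ⊗ |phi>) = (A|psi>) ⊗ (B|phi>)
A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 5], [6, 7]])
psi, phi = np.array([0.6, 0.8]), np.array([1.0, 0.0])
lhs = np.kron(A, B) @ np.kron(psi, phi)
print(np.allclose(lhs, np.kron(A @ psi, B @ phi)))  # True

# The singlet (|01> - |10>)/sqrt(2) is NOT a product state: viewed as a 2x2
# matrix of amplitudes it has rank 2, while any product state has rank 1.
singlet = (np.kron(ket0, ket1) - np.kron(ket1, ket0)) / np.sqrt(2)
print(np.linalg.matrix_rank(singlet.reshape(2, 2)))  # 2
```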
2.1.2 Measurements Orthogonal Measurements and POVM's
The simplest type of measurement that can be made on a quantum system is a complete orthogonal measurement. Formally, a complete orthogonal measurement consists of an orthonormal basis {|mᵢ⟩} for the state space of the system being measured, where the vectors |mᵢ⟩ are the outcomes (eigenstates) of the measurement. If the orthogonal measurement¹ {|mᵢ⟩} is performed on a system in the state |ψ⟩, then the probability of obtaining the outcome |mᵢ⟩ is given by²

p(mᵢ) = |⟨mᵢ|ψ⟩|².  (2.10)
Notice that orthogonal measurements are restricted to having a number of outcomes equal to the dimension of the state space of the system being measured. A more general type of measurement, the positive operator valued measure (or POVM), does not have this restriction. Mathematically, any set of positive semidefinite operators {Eᵢ} with Σᵢ Eᵢ = I constitutes a valid POVM. Here, each Eᵢ corresponds to one outcome of the measurement.
If the POVM {Eᵢ} is performed on a system with a density matrix ρ, then the probability of obtaining the outcome Eᵢ is given by

p(Eᵢ) = Tr(Eᵢρ).  (2.11)

In particular, if ρ is a pure state density matrix, ρ = |ψ⟩⟨ψ|, then this becomes

p(Eᵢ) = ⟨ψ|Eᵢ|ψ⟩.  (2.12)
Any orthogonal measurement {|mᵢ⟩} can equivalently be represented by the POVM {|mᵢ⟩⟨mᵢ|}. However, using the POVM formalism, we can construct other types of measurements which have more or fewer outcomes than the dimension of the target system's state space. For example, let

|ψ₁⟩ = (1/√2)(|0⟩ + |1⟩),
|ψ₂⟩ = (1/√2)(e^{2πi/3}|0⟩ + e^{−2πi/3}|1⟩),
|ψ₃⟩ = (1/√2)(e^{−2πi/3}|0⟩ + e^{2πi/3}|1⟩).  (2.13)
1 From here on, we shall refer to complete orthogonal measurements simply as "orthogonal measurements."
²For notational cleanliness, we will omit the ket brackets inside the argument of the probability function. That is, we will write p(mᵢ) instead of p(|mᵢ⟩).
Then the collection {(2/3)|ψ₁⟩⟨ψ₁|, (2/3)|ψ₂⟩⟨ψ₂|, (2/3)|ψ₃⟩⟨ψ₃|} is a valid POVM on one qubit. Notice that this measurement has three outcomes, while the dimension of the state space is two. However, notice also that the outcomes are not orthogonal. The factor of 2/3 in front of each of the measurement operators ensures that the operators sum to the identity operator, but it also means that, for example, if the qubit is in the state |ψ₁⟩ before the measurement occurs, the measurement is not certain to return the outcome |ψ₁⟩ (that is, the outcome (2/3)|ψ₁⟩⟨ψ₁|) as would be the case with an orthogonal measurement, because

p(ψ₁) = (2/3)⟨ψ₁|ψ₁⟩⟨ψ₁|ψ₁⟩ = 2/3 ≠ 1.  (2.14)
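A sketch of this POVM numerically (the signs of the phases in Eq. (2.13) are reconstructed from a garbled original, so this is an assumption; the completeness check below confirms the reconstruction gives a valid POVM):

```python
import numpy as np

w = np.exp(2j * np.pi / 3)
# The three "trine" states of Eq. (2.13), with reconstructed phase signs.
psis = [np.array([1, 1]) / np.sqrt(2),
        np.array([w, w.conjugate()]) / np.sqrt(2),
        np.array([w.conjugate(), w]) / np.sqrt(2)]

E = [(2 / 3) * np.outer(p, p.conj()) for p in psis]
print(np.allclose(sum(E), np.eye(2)))  # True: the operators sum to I

# Even if the qubit starts in |psi_1>, outcome E_1 is not certain: Eq. (2.14).
p1 = np.real(psis[0].conj() @ E[0] @ psis[0])
print(p1)  # 2/3
```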
Measurements On Composite Systems
Suppose that a composite quantum system consisting of two parts A and B is in a global state described by the vector |Ψ⟩_AB (when confusion could arise, we'll use subscripts to indicate which parts of a composite system a state vector, density matrix, or operator refers to), and suppose that Alice performs a POVM {Eᵢ} on part A, while Bob leaves part B alone. We wish to know what the final state of part B will be if Alice obtains a particular outcome of her measurement. Throughout this thesis, we shall mainly consider POVM's whose outcomes are all proportional to pure state density matrices, so suppose that Eᵢ = aᵢ|ψᵢ⟩⟨ψᵢ| for all i. Then if Alice obtains the outcome Eⱼ, the final state |ψ_final⟩_B of part B satisfies

|ψ_final⟩_B ∝ (⟨ψⱼ| ⊗ I_B)|Ψ⟩_AB ≡ (⟨ψⱼ|Ψ⟩)_B,  (2.15)

where I_B is the identity operator on part B. The quantity (⟨ψⱼ|Ψ⟩)_B is called the partial inner product of |ψⱼ⟩ with |Ψ⟩, because it essentially denotes taking the inner product of |ψⱼ⟩ with the A part of |Ψ⟩ while leaving the B part alone. Notice that (⟨ψⱼ|Ψ⟩)_B is a state vector of part B (hence the subscript), not a number like the standard inner product. In general, the vector (⟨ψⱼ|Ψ⟩)_B will have length less than 1, so to find the final state of part B, we would need to normalize it. The probability that Alice obtains the outcome Eⱼ is given by

p(Eⱼ) = aⱼ |(⟨ψⱼ|Ψ⟩)_B|².  (2.16)
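The partial inner product can be sketched concretely for the singlet state (the outcome vector psi_j and the weight a_j = 1 below are illustrative choices, not taken from the text):

```python
import numpy as np

ket0, ket1 = np.eye(2)
# Joint state |Psi>_AB = singlet, written as a 2x2 array of amplitudes
# indexed (A index, B index).
Psi = (np.outer(ket0, ket1) - np.outer(ket1, ket0)) / np.sqrt(2)

# Alice's POVM outcome E_j = a_j |psi_j><psi_j|, with illustrative a_j = 1.
psi_j = np.array([0.6, 0.8])

# Partial inner product <psi_j|Psi>_B: contract psi_j against the A index.
unnormalized_B = psi_j.conj() @ Psi
prob = np.linalg.norm(unnormalized_B) ** 2          # Eq. (2.16) with a_j = 1
final_B = unnormalized_B / np.linalg.norm(unnormalized_B)

print(prob)                          # 0.5 for the singlet, whatever psi_j is
print(abs(np.vdot(psi_j, final_B)))  # ~ 0: Bob ends up orthogonal to psi_j
```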
2.2 Probability and Classical Information Theory
2.2.1 Bayes' Law
Suppose that we wish to compute the probability that some event x occurs, given that we know an event y occurred, which we denote p(x|y) (read "the probability of x given y"). According to Bayes' Law, this probability is given by

p(x|y) = p(y|x)p(x) / p(y),  (2.17)

where p(x) is the a priori probability of x occurring, and similarly for p(y). In order for this formula to be useful, there needs to be some outside input which specifies the probabilities on the right-hand side. When we apply Bayes' Law in this thesis, the event x will be Alice and Bob's systems being in a particular state and the event y will be Alice and Bob obtaining a particular outcome of some measurement. p(x) will be fixed by assumption, and we'll be able to calculate p(y|x) and p(y) using the laws of quantum mechanics.
2.2.2 Shannon Entropy
In this thesis, we quantify classical information using a standard measure of uncertainty, the Shannon Entropy. Let X be a discrete random variable which takes values x₁, x₂, ..., xₙ with probabilities p₁, p₂, ..., pₙ respectively. The Shannon Entropy of X is given by

h(X) = −Σᵢ₌₁ⁿ pᵢ log pᵢ.  (2.18)
We will adopt the conventions that log ≡ log₂ and that 0 log 0 ≡ 0. With these conventions, the Shannon Entropy measures the uncertainty in the value of a random variable in units of bits. For example, for a fair coin toss, the probability distribution is (1/2, 1/2), that is, the coin has a 1/2 probability of coming up heads and a 1/2 probability of coming up tails. The Shannon Entropy of this distribution is³ h(1/2, 1/2) = 1 bit, that is, we lack 1 bit of information about the outcome of the coin flip. If the coin were guaranteed to come up heads, the probability distribution would be (1, 0), and the Shannon Entropy is h(1, 0) = 0, that is, we lack no information about the outcome. Finally, if the coin had unequal probabilities of coming up heads and tails, say, (2/3, 1/3), the Shannon Entropy of the probability distribution would be h(2/3, 1/3) ≈ 0.92 bits. That is, the outcome is still quite uncertain, but we know that the outcome is slightly more likely to be heads than tails, so the entropy is not an entire bit.

³We shall sometimes write h(p₁, p₂, ..., pₙ) instead of h(X) when the random variable X is clear from context or if we wish to emphasize the importance of the probabilities pᵢ.
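The coin-toss examples above can be reproduced with a few lines of code:

```python
import numpy as np

def shannon_entropy(*probs):
    """h(p1,...,pn) = -sum_i p_i log2 p_i, with the convention 0 log 0 = 0."""
    return -sum(p * np.log2(p) for p in probs if p > 0)

print(shannon_entropy(1/2, 1/2))  # 1.0 bit: a fair coin
print(shannon_entropy(1, 0))      # 0.0 bits: no uncertainty
print(shannon_entropy(2/3, 1/3))  # ~0.918 bits: a biased coin
```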
Suppose that we know that a random variable Y has the value y. The remaining uncertainty in the value of X is given by

h(X|y) = −Σᵢ p(xᵢ|y) log p(xᵢ|y).  (2.19)

The left-hand side of this equation is read "the entropy of X given y." Notice that if X and Y are independent, p(xᵢ|y) = p(xᵢ), so h(X|y) = h(X). On the other hand, if X and Y are not independent, h(X|y) may be greater than or less than h(X).
When viewed as a function of one variable p, the Shannon Entropy h(p, 1−p) of a binary probability distribution is concave in p, which is to say that

h(Σᵢ αᵢpᵢ, 1 − Σᵢ αᵢpᵢ) ≥ Σᵢ αᵢ h(pᵢ, 1−pᵢ)  (2.20)

if each αᵢ ≥ 0 and Σᵢ αᵢ = 1. Furthermore, this inequality is strict if at least two αᵢ's are nonzero. That is, the entropy of the average of several values of p is strictly greater than the average entropy of those values. This property generalizes to higher-dimensional probability distributions as well. Let pᵢ = (p_{i,1}, p_{i,2}, ..., p_{i,n}) be a collection of probability vectors indexed by i. Then⁴

h(Σᵢ αᵢpᵢ) ≥ Σᵢ αᵢ h(pᵢ)  (2.21)

if each αᵢ > 0 and Σᵢ αᵢ = 1, and the inequality is strict if at least two αᵢ's are nonzero.
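A quick numerical check of the concavity inequality of Eq. (2.20) (the particular weights and probabilities below are arbitrary illustrative values):

```python
import numpy as np

def h(p):
    """Binary Shannon entropy h(p, 1-p) in bits, with 0 log 0 = 0."""
    if p <= 0 or p >= 1:
        return 0.0
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

# Entropy of the average exceeds the average entropy (strictly, since
# both alphas are nonzero).
alphas = [0.3, 0.7]
ps = [0.1, 0.9]
avg_p = sum(a * p for a, p in zip(alphas, ps))
print(h(avg_p))                                   # entropy of the average
print(sum(a * h(p) for a, p in zip(alphas, ps)))  # average entropy: smaller
```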
2.2.3 Mutual Information
In classical information theory, the mutual information of two random variables X and Y is defined as⁵

I(X : Y) ≡ h(X) − ⟨h(X|Y)⟩,  (2.22)

⁴If p = (p₁, p₂, ..., pₙ) is a probability vector and we write h(p), we mean h(p₁, p₂, ..., pₙ).

⁵The mutual information is often defined in a slightly different (but completely equivalent) way which makes its symmetry properties more obvious, but this form is the most convenient for our current purposes. Also, the angle brackets in ⟨h(X|Y)⟩ are not standard notation, but we shall use them to emphasize the fact that we are averaging over all of the values of Y.
that is, I(X : Y) is the entropy in the value of X minus the entropy in the value of X given a value for Y, averaged over all the possible values for Y. The mutual information can be interpreted as the average amount of information we gain about the value of X by learning the value of Y. Note that I(X : Y) = 0 if and only if X and Y are independent, because h(X) = ⟨h(X|Y)⟩ in this case and only in this case.
One of the most crucial facts about the mutual information is that it is symmetric in its arguments, that is
I(X : Y) = I(Y : X). (2.23)
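Both the definition of Eq. (2.22) and the symmetry of Eq. (2.23) can be sketched with a small hypothetical joint distribution (the numbers below are illustrative, not from the text):

```python
import numpy as np

def H(probs):
    """Shannon entropy in bits, with 0 log 0 = 0."""
    return -sum(p * np.log2(p) for p in probs if p > 0)

# A hypothetical joint distribution p(x, y) of two correlated binary variables.
pxy = np.array([[0.4, 0.1],
                [0.1, 0.4]])
px, py = pxy.sum(axis=1), pxy.sum(axis=0)

def mutual_information(pxy, px, py):
    """I(X:Y) = h(X) - <h(X|Y)>, averaging h(X|y) over the values of Y."""
    avg_h_X_given_Y = sum(py[j] * H(pxy[:, j] / py[j]) for j in range(len(py)))
    return H(px) - avg_h_X_given_Y

I_xy = mutual_information(pxy, px, py)
I_yx = mutual_information(pxy.T, py, px)
print(I_xy)                    # ~0.278 bits
print(np.isclose(I_xy, I_yx))  # True: symmetric in its arguments
```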
2.3 Quantum Information Theory
2.3.1 The Partial Trace
Suppose that Alice and Bob hold systems A and B, respectively, and that the joint state of A and B is given by the density matrix ρ^{AB} = Σ_{i,j,k,l} α_{ijkl} |ij⟩⟨kl|. We define the partial trace of ρ^{AB} over B by [16]

tr_B(ρ^{AB}) = Σ_{i,j,k,l} α_{ijkl} |i⟩⟨k| ⟨l|j⟩ = Σ_{i,j,k} α_{ijkj} |i⟩⟨k|.  (2.24)

The partial trace over A is defined similarly. Notice that tr_B(ρ^{AB}) is a density matrix for system A. In fact, it is the density matrix of system A when A and B are in the joint state ρ^{AB}. That is, the density matrix tr_B(ρ^{AB}) is the most complete possible description of the state of A alone when A and B are in the joint state ρ^{AB}. If A and B are initially in a product state, that is, ρ^{AB} = ρ^A ⊗ ρ^B, then tr_B(ρ^{AB}) = ρ^A, as we would expect. However, the partial trace allows us to describe the state of A or B alone even if neither system has a definite pure state of its own, as would be the case if A and B were initially in the singlet state.
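The singlet example can be sketched directly: tracing out B from the singlet leaves Alice with the maximally mixed state I/2 (a minimal implementation of the index contraction in Eq. (2.24)):

```python
import numpy as np

ket0, ket1 = np.eye(2)
singlet = (np.kron(ket0, ket1) - np.kron(ket1, ket0)) / np.sqrt(2)
rho_AB = np.outer(singlet, singlet.conj())

def partial_trace_B(rho, dA=2, dB=2):
    """tr_B of a (dA*dB) x (dA*dB) density matrix, per Eq. (2.24):
    reshape to indices (i, j, k, l) and contract j with l."""
    return rho.reshape(dA, dB, dA, dB).trace(axis1=1, axis2=3)

rho_A = partial_trace_B(rho_AB)
print(rho_A)  # I/2: Alice's particle alone is maximally mixed
```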
2.3.2 Local Operations and Classical Communication
Suppose that Alice holds a quantum system A, Bob holds a system B, and that Alice and Bob are spatially separated. We use the term local operations and classical communication (LOCC) to describe a particular set of resources that we allow Alice and Bob to use. Under LOCC, we allow Alice and Bob to each perform any unitary operation or POVM on their own system, and we allow them each to prepare auxiliary quantum systems locally. Also, we allow them an unlimited amount of classical communication (so, for example, Alice could call Bob on the telephone and tell him the result of a measurement she made). Finally, we allow them to each "throw away" any part of their own system, which we represent by taking the partial trace over that part. However, under LOCC, we do not allow Alice and Bob to physically transport quantum particles back and forth.
We define a procedure to be a specific sequence of LOCC. We allow each step of the sequence to be conditioned on measurement results obtained in previous steps. Finally, we require that a procedure terminate after a finite number of steps. For example, a valid procedure would be, "Alice makes a measurement {|m_i⟩} on a system A. If she obtains the outcome |m₁⟩, Bob performs a unitary transformation U_B on a system B. Otherwise, he does nothing."
2.3.3 Von Neumann Entropy
To quantify the uncertainty in the state of a quantum system, we use a quantity called the von Neumann Entropy, which is analogous in many ways to the classical Shannon Entropy. John von Neumann originally introduced this quantity in order to extend the concept of thermodynamic entropy to quantum systems, but it has since seen a number of other useful applications, such as in constructing measures of entanglement.
The von Neumann Entropy of a density matrix p is given by
S(ρ) = −Tr(ρ log ρ),   (2.25)
or equivalently, if {λ_i} are the eigenvalues of ρ,

S(ρ) = −Σ_i λ_i log λ_i.   (2.26)
Similar to the Shannon Entropy in classical information theory, the von Neumann Entropy can be interpreted as the amount of information we lack about the state of the system. If the system is in a pure state |ψ⟩, then its density matrix |ψ⟩⟨ψ| can be diagonalized to have one 1 and the rest 0's along the main diagonal by transforming into a basis in which |ψ⟩ is a basis vector. Hence, a pure state density matrix always has one eigenvalue equal to 1 and the rest equal to 0, and thus
S(|ψ⟩⟨ψ|) = −1 log 1 − 0 log 0 − 0 log 0 − ⋯ = 0.   (2.27)
That is, pure states always have zero von Neumann entropy. Intuitively, the reason for this is that there exists an orthogonal measurement which gives the state of the
particle with certainty. On the other hand, mixed state density matrices always have nonzero von Neumann entropy. This makes sense intuitively, because for any POVM whose outcomes are proportional to pure states, there is always uncertainty in what the outcome will be.
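These two cases can be checked directly. The sketch below (Python/NumPy; the helper name is our own) computes S(ρ) from the eigenvalues of ρ, using the convention 0 log 0 = 0:

```python
import numpy as np

def von_neumann_entropy(rho):
    """S(rho) = -Tr(rho log2 rho), computed from the eigenvalues of rho."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]      # drop zero eigenvalues (0 log 0 = 0)
    return -np.sum(evals * np.log2(evals))

# Pure state: zero entropy
psi = np.array([1, 1]) / np.sqrt(2)
assert np.isclose(von_neumann_entropy(np.outer(psi, psi)), 0.0)

# Maximally mixed qubit: one full bit of entropy
assert np.isclose(von_neumann_entropy(np.eye(2) / 2), 1.0)
```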
2.3.4 Entropy of Entanglement
In order to gain some insight into the relationship between quantum information and classical information, we need a method of quantifying entanglement. Intuitively, we want our measure of entanglement to be such that the singlet state, (1/√2)(|01⟩ − |10⟩), has as much entanglement as two qubits can have, product states such as |00⟩ have no entanglement, and a "lopsided" singlet state like a|01⟩ − b|10⟩ has some entanglement, but not as much as the nonlopsided one. To make this more precise, we'll employ a measure of entanglement called the entropy of entanglement which makes use of the von Neumann Entropy. With our convention that all logarithms are to have base 2, the units of the entropy of entanglement are called "ebits."
The entropy of entanglement of a pure state |ψ⟩ of A and B is defined by [6]

E(|ψ⟩) ≡ S(tr_B(|ψ⟩⟨ψ|)) = S(tr_A(|ψ⟩⟨ψ|)).   (2.28)
That is, the entanglement of |ψ⟩ is the von Neumann Entropy of the density matrix of either A or B. This definition is well defined for pure states, because the density matrices of A and B will have the same eigenvalues.
To illustrate this definition, we now calculate the entropy of entanglement of the state |ψ⟩_AB = c|01⟩ − d|10⟩. The density matrix of A and B is then

ρ^AB = |ψ⟩⟨ψ| = (c|01⟩ − d|10⟩)(c*⟨01| − d*⟨10|)
     = |c|²|01⟩⟨01| − cd*|01⟩⟨10| − c*d|10⟩⟨01| + |d|²|10⟩⟨10|.   (2.29)
Tracing over B, we obtain
ρ^A = tr_B(ρ^AB)
    = |c|²|0⟩⟨0|⟨1|1⟩ − cd*|0⟩⟨1|⟨0|1⟩ − c*d|1⟩⟨0|⟨1|0⟩ + |d|²|1⟩⟨1|⟨0|0⟩
    = |c|²|0⟩⟨0| + |d|²|1⟩⟨1|.   (2.30)
We can see that the eigenvalues of ρ^A are |c|² and |d|², so the entropy of entanglement of |ψ⟩ is

E(|ψ⟩) = −|c|² log |c|² − |d|² log |d|² = h(|c|², |d|²).   (2.31)
If |c| = |d|, this quantity is equal to 1, and if |c| ≠ |d|, it is less than 1, so, as we hoped, the entanglement of the "lopsided" singlet state is less than that of the nonlopsided one. In fact, the maximum value of the entropy of entanglement for two qubits is 1, so the singlet state has the maximum possible amount of entanglement that two qubits can have.
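The calculation above is straightforward to reproduce numerically. In the following sketch (Python/NumPy; illustrative only), the reduced density matrix of A is obtained from the state vector, and its entropy gives E(|ψ⟩) = h(|c|², |d|²):

```python
import numpy as np

def entropy_of_entanglement(psi_AB, dim_A, dim_B):
    """E(|psi>) = S(tr_B |psi><psi|) for a pure bipartite state vector."""
    M = psi_AB.reshape(dim_A, dim_B)
    rho_A = M @ M.conj().T            # reduced density matrix of A
    evals = np.linalg.eigvalsh(rho_A)
    evals = evals[evals > 1e-12]
    return -np.sum(evals * np.log2(evals))

# c|01> + d|10> with c = d = 1/sqrt(2): the singlet-like state has 1 ebit
c, d = 1 / np.sqrt(2), 1 / np.sqrt(2)
assert np.isclose(entropy_of_entanglement(np.array([0, c, d, 0]), 2, 2), 1.0)

# A "lopsided" state has strictly less than 1 ebit of entanglement
c, d = np.sqrt(0.9), np.sqrt(0.1)
lop = entropy_of_entanglement(np.array([0, c, d, 0]), 2, 2)
assert 0 < lop < 1
```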
One of the most significant features of the entropy of entanglement is that it cannot increase on average under LOCC. More precisely, if Alice and Bob have systems A and B respectively and are only allowed LOCC, the entanglement⁶ between A and B can never increase on average. Depending on what operations they perform, it may remain the same, or it may decrease on average. If the final state of A and B is a mixed state, the entropy of entanglement may not be well defined, so we need a more general measure of entanglement.
2.3.5 Entanglement of Formation
We shall occasionally wish to consider the entanglement of mixed states. However, the entropy of entanglement is not well defined in general for mixed state density matrices. The reason for this is that, if pAB is a mixed state density matrix of systems A and B, it is not always true that the eigenvalues of trA(pAB) are the same as the eigenvalues of trB(pAB), as would be the case if pAB were a pure state density matrix. To describe the entanglement of mixed states, we use a more general measure of entanglement, introduced in Refs. [7, 8], called the entanglement of formation. As with the entropy of entanglement, one of the most crucial features of the entanglement of formation is that it does not increase, on average, under LOCC.
The entanglement of formation of a density matrix ρ is found as follows: First, write ρ as a linear combination of pure state density matrices, such as ρ = Σ_i α_i |ψ_i⟩⟨ψ_i|. Next, find the average entropy of entanglement of the pure states {|ψ_i⟩}. Finally, minimize this quantity over all possible ways of writing ρ as a linear combination of pure state density matrices.
For a pure state density matrix ρ_pure = |ψ⟩⟨ψ|, there is only one way of writing ρ_pure as a linear combination of pure state density matrices, namely by writing it as |ψ⟩⟨ψ|. Therefore, for pure states, the entanglement of formation is equivalent to the entropy of entanglement. However, there are always multiple ways of writing
⁶From now on, we will simply refer to the entropy of entanglement as "the entanglement." Although we shall also refer to the entanglement of formation in this way, there shouldn't be any cause for confusion, because the entropy of entanglement and the entanglement of formation are equivalent for pure states, and for mixed states, we shall only be concerned with the entanglement of formation. Thus, when it matters, it should be clear from context which one we mean.
a mixed state density matrix Pmixed as a linear combination of pure state density matrices. To calculate the entanglement of formation of Pmixed, we would first need to find the pure state decomposition with the minimum average entanglement, and then we would need to prove that it had the minimum average entanglement, both of which are extremely difficult tasks in general. Therefore, it can be very difficult to work directly with the definition of the entanglement of formation.
Fortunately, an explicit formula is known for the entanglement of formation of a density matrix of two qubits [9]. Let ρ^AB be the density matrix of two qubits A and B, and let

ρ̃^AB = (σ_y ⊗ σ_y)(ρ^AB)*(σ_y ⊗ σ_y),   (2.32)

where (ρ^AB)* is equal to ρ^AB with each element replaced by its complex conjugate in the standard basis {|00⟩, |01⟩, |10⟩, |11⟩}, and

σ_y = ( 0  −i
        i   0 )   (2.33)

is the Pauli y matrix. Let

C = max(0, λ₁ − λ₂ − λ₃ − λ₄),   (2.34)

where the λ_i's are the square roots of the eigenvalues of the matrix ρ^AB ρ̃^AB, arranged in decreasing order. Finally, let

ℰ(C) = −[(1 + √(1 − C²))/2] log[(1 + √(1 − C²))/2] − [(1 − √(1 − C²))/2] log[(1 − √(1 − C²))/2].   (2.35)

The entanglement of formation of ρ^AB is given by

E_F(ρ^AB) = ℰ(C).   (2.36)
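The formula in equations (2.32)–(2.36) can be implemented directly. The sketch below (Python/NumPy; the function name is ours) computes the concurrence C and then the entanglement of formation of a two-qubit density matrix; it yields 1 ebit for the singlet state and 0 for a product state such as |00⟩⟨00|:

```python
import numpy as np

def entanglement_of_formation(rho_AB):
    """Wootters' formula for the entanglement of formation of two qubits."""
    sy = np.array([[0, -1j], [1j, 0]])
    spin_flip = np.kron(sy, sy)
    rho_tilde = spin_flip @ rho_AB.conj() @ spin_flip
    # lambda_i: square roots of the eigenvalues of rho*rho_tilde, decreasing
    lam = np.sort(np.sqrt(np.abs(np.linalg.eigvals(rho_AB @ rho_tilde))))[::-1]
    C = max(0.0, lam[0] - lam[1] - lam[2] - lam[3])   # the concurrence
    if C == 0.0:
        return 0.0
    x = (1 + np.sqrt(max(0.0, 1 - C**2))) / 2         # larger binary-entropy weight
    if x >= 1.0:
        return 0.0
    return -x * np.log2(x) - (1 - x) * np.log2(1 - x)

# Singlet state: maximally entangled, 1 ebit
singlet = np.array([0, 1, -1, 0]) / np.sqrt(2)
assert np.isclose(entanglement_of_formation(np.outer(singlet, singlet)), 1.0)

# Product state |00><00|: no entanglement
rho_00 = np.zeros((4, 4)); rho_00[0, 0] = 1.0
assert np.isclose(entanglement_of_formation(rho_00), 0.0)
```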
2.3.6 Teleportation
Suppose that Alice possesses a qubit A which she wishes to send to Bob. In addition to this particle A, Alice holds a qubit C and Bob holds a qubit D, which are in the joint state |φ⟩_CD = (1/√2)(|00⟩ + |11⟩), a maximally entangled state. However, Alice and Bob do not have any channel for sending quantum particles back and forth through the intervening space. In this situation, one of the most straightforward ways that Alice can send the state of particle A to Bob is by teleportation [17].
To illustrate how teleportation works, we'll use an example presented in Ref. [16]. Suppose that particle A is in the state |ψ⟩_A = α|0⟩ + β|1⟩. Then the initial joint state of A, C, and D is

|Ψ⟩_ACD = (α|0⟩ + β|1⟩)_A ⊗ (1/√2)(|00⟩ + |11⟩)_CD.   (2.37)
We can rewrite this state as

|Ψ⟩_ACD = (1/2)[ |ξ₁⟩_AC (α|0⟩ + β|1⟩)_D
              + |ξ₂⟩_AC (α|0⟩ − β|1⟩)_D
              + |ξ₃⟩_AC (β|0⟩ + α|1⟩)_D
              + |ξ₄⟩_AC (α|1⟩ − β|0⟩)_D ],   (2.38)
where

|ξ₁⟩ = (1/√2)(|00⟩ + |11⟩)
|ξ₂⟩ = (1/√2)(|00⟩ − |11⟩)
|ξ₃⟩ = (1/√2)(|01⟩ + |10⟩)
|ξ₄⟩ = (1/√2)(|01⟩ − |10⟩)   (2.39)
are the Bell states, which form an orthonormal basis for the state space of two qubits. First, Alice makes the Bell measurement {|ξ₁⟩, |ξ₂⟩, |ξ₃⟩, |ξ₄⟩} on the particles A and C (which is allowed under LOCC because Alice possesses both particles). She then communicates the result of her measurement to Bob classically (such as by calling him on the telephone). With this information, Bob knows what unitary transformation to perform on D to put it into the state |ψ⟩ = α|0⟩ + β|1⟩, the initial state of particle A. For example, if Alice obtains the outcome |ξ₂⟩ on her Bell measurement, the final state of particle D is (α|0⟩ − β|1⟩)_D, and Bob can then perform the unitary operation

( 1   0
  0  −1 )

on particle D to put it into the desired state. Notice that this transformation does not depend on α and β, so this procedure will work even if Alice and Bob don't know the values of α and β.
After this procedure is done, particles C and D are no longer entangled with one another. For this reason, it makes sense to say that the entanglement of C and D was "used up" in the process. In the above example, C and D began in a maximally entangled state. If they were to begin in a partially entangled state like c|00⟩ + d|11⟩, Alice and Bob could still perform the teleportation using the same procedure, but the components of the final state of D would be distorted by factors involving c and d.
It is worthwhile to note explicitly that this same procedure can be used to teleport the state of particle A to Bob even if particle A is initially entangled with another particle. For example, suppose that particle A is initially entangled with particle B, held by Bob, such that their joint state is |ψ⟩_AB = (1/√2)(|01⟩ − |10⟩). If Alice and Bob perform the exact same procedure described above, the resulting state of particles D and B will be (1/√2)(|01⟩ − |10⟩)_DB, exactly the same as the initial state of particles A and B.
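The bookkeeping in equation (2.38) can be verified by direct simulation. The sketch below (Python/NumPy; illustrative, using the Bell-state ordering of equation (2.39)) projects A and C onto each Bell state and applies the corresponding correction to D; every outcome leaves D in the original state of A, up to a global phase:

```python
import numpy as np

s = 1 / np.sqrt(2)
# Bell basis, ordered as in (2.39): (|00>±|11>)/√2, (|01>±|10>)/√2
bell = np.array([[s, 0, 0, s], [s, 0, 0, -s], [0, s, s, 0], [0, s, -s, 0]])

I2 = np.eye(2)
Z = np.array([[1, 0], [0, -1]])
X = np.array([[0, 1], [1, 0]])
corrections = [I2, Z, X, Z @ X]       # Bob's fix-up for each Bell outcome

def teleport(psi_A):
    """Teleport a qubit through the resource state (|00>+|11>)/√2 on C, D."""
    resource_CD = np.array([s, 0, 0, s])
    state_ACD = np.kron(psi_A, resource_CD)        # joint state, ordering A, C, D
    results = []
    for k in range(4):
        bra = np.kron(bell[k].reshape(1, 4), I2)   # <xi_k|_AC ⊗ I_D
        d = bra @ state_ACD                        # unnormalized state of D
        d = d / np.linalg.norm(d)                  # each outcome has probability 1/4
        results.append(corrections[k] @ d)
    return results

psi = np.array([0.6, 0.8])
for final_D in teleport(psi):
    # Bob's corrected state matches |psi> up to a global phase
    assert np.isclose(abs(np.vdot(psi, final_D)), 1.0)
```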
2.4 Related Work
As we mentioned in Chapter 1, several studies have been done to investigate how well Alice and Bob can distinguish among some collection of initial states of A and B given that they share no entanglement. For example, it is known that if A and B are in one of two orthogonal states, then Alice and Bob can distinguish between these states perfectly using only LOCC, even if the states are entangled [12]. On the other hand, it is also known that if A and B are initially in one of the four Bell states, it is not possible for Alice and Bob to determine their state using LOCC [13], unless they are given two copies of A and B, in which case it is possible [12]. Curiously, there exist collections of initial states of A and B which are all unentangled but for which the optimal discrimination cannot be achieved by LOCC [1, 2, 3]. Equivalently, there exist measurements whose outcomes are all unentangled but which cannot be performed locally. This indicates that the nonlocal properties of the theory of quantum mechanics go deeper than just entanglement, for entanglement is a measure of the nonlocality of an individual state, whereas these examples display nonlocality in measurements and collections of unentangled states.
Also, D. Berry has designed a procedure by which Alice and Bob can perform a certain class of nonlocal unitary transformations by using some entanglement [18]. This class of transformations maps partially entangled states to unentangled states. If Alice and Bob perform a transformation of this form on their initial states, mapping them from partially entangled states to unentangled states, they can then distinguish among the initial states with a local measurement.
Finally, a few studies have been done to investigate how much entanglement Alice and Bob need to share in order to perform certain measurements. For example, in Ref. [10], Shelby Kimmel finds upper and lower bounds on the cost of several measurements and finds the exact cost of a particular measurement on two qubits. Ref. [11] extends this result, finding the exact cost of a class of measurements on two d-dimensional quantum systems.
Each of the studies mentioned here approaches from a different angle the same general question of how one can extract classical information from a quantum system. Our goal is to approach this general question from yet another angle by asking how much information Alice and Bob can extract from a quantum system if they share a limited but nonzero amount of entanglement.
Chapter 3
Statement of Our Problem
In general, we wish to investigate the amount of classical information one can extract from a quantum system whose parts are spatially separated, given a limited amount of entangled resources. We now establish a framework in which we can formulate this problem more precisely.
3.1 Setup
An ensemble is a collection of quantum states, each with an associated probability. For example, the collection {(|ψ₁⟩, p₁), (|ψ₂⟩, p₂), …, (|ψ_n⟩, p_n)} where Σ_i p_i = 1 is an ensemble. We imagine that Alice and Bob, who are spatially separated, each possess a quantum system, which we'll label A and B respectively, and that the initial joint state of A and B is described by an ensemble
E = {(|ψ_i⟩, 1/n)}_{i=1}^n.   (3.1)

By this we mean that Alice and Bob know that A and B are initially in one of the joint states |ψ_i⟩ but that, as far as Alice and Bob know, each of the possible initial states is equally likely. Alice and Bob's goal is to determine which of the possible states A and B were initially in. Concretely, we may suppose that each state |ψ_i⟩ represents a letter in an alphabet and that the initial state of A and B encodes one letter of a message that Alice and Bob are attempting to decode.
We only allow Alice and Bob to use LOCC in pursuit of their goal of determining the state of A and B. We allow them to initially possess shared pairs of entangled systems, but since they cannot physically transport quantum systems back and forth under LOCC, they cannot create additional shared entangled resources. For example,
although Alice is allowed to prepare an auxiliary singlet state (1/√2)(|01⟩ − |10⟩)_CD locally, she cannot physically transport one of the particles C and D to Bob under LOCC, so this state cannot be used as an entangled resource. In effect, by restricting Alice and Bob to LOCC, we only allow them to use entangled resources that they possess initially.
3.2 The Entanglement-Information Function

Before introducing the central concept of interest, the entanglement-information function, we need a few preliminary definitions.
Definition 3.1. Let the initial state of A and B be described by an ensemble E, let P be a procedure that Alice and Bob perform on A and B, and let O be an outcome of P. The specific information of O given E, denoted 𝒥(E, O), is given by

𝒥(E, O) ≡ h(state) − h(state|O),   (3.2)

where "state" is a random variable whose values are the possible initial states of A and B.
Note that h(state) can be interpreted as Alice and Bob's initial uncertainty in the state of A and B, while h(state|O) is Alice and Bob's uncertainty in the state of A and B after they have performed the procedure P and obtained the outcome O. Thus, 𝒥(E, O) is the amount by which Alice and Bob's uncertainty was reduced when they obtained the outcome O, and it can therefore be interpreted as the amount of information they gain about the state of A and B if they perform the procedure P and obtain the outcome O.
Definition 3.2. Let the initial state of A and B be described by an ensemble E, let P be a procedure that Alice and Bob perform on A and B, and let O(P) be the set of possible outcomes of P. The procedural information of P given E, denoted ℐ(E, P), is given by

ℐ(E, P) ≡ min_{O∈O(P)} 𝒥(E, O).   (3.3)
We will use the procedural information as our measure of the procedure P's success in discriminating among the possible initial states of A and B. It is worthwhile to note that this is a rather strict measure of success. If Alice and Bob perform a procedure P which gives 0 specific information with probability 10⁻¹⁰⁰ and 1 bit of specific information otherwise, the procedural information of P is 0. One could well
imagine taking the average specific information of the outcomes of P as an alternative measure of success. If we suppose that the initial state of A and B encodes one letter of a long message that Alice and Bob are attempting to decode, it might indeed be more appropriate to take the average information as our measure of success, because in that case, it will not matter if Alice and Bob fail to decode a few letters of the message as long as they do well on average. However, if we instead imagine that Alice and Bob are attempting to decode a message of only a few letters (for example, a message consisting only of Y or N), they may completely misinterpret the message if they fail to decode one letter. In this sort of scenario, it seems reasonable to take our more strict measure of success.
Although Alice and Bob can usually obtain some information about the initial state of A and B by performing local measurements, it is often the case that they can obtain more information if they share some entanglement.
Definition 3.3. Let P be a procedure that Alice and Bob perform on A and B. The procedural cost of P, denoted 𝒞(P), is defined as the average amount of entanglement of formation that Alice and Bob must share to perform P.
For example, suppose that systems A and B are qubits and that they are initially in one of the four Bell states |ξ_i⟩. Let P_L be the procedure in which Alice and Bob each make the orthogonal measurement {|0⟩, |1⟩} on their own particle, and let P_E be the procedure in which Alice teleports her particle to Bob and then Bob performs the orthogonal measurement {|ξ₁⟩, |ξ₂⟩, |ξ₃⟩, |ξ₄⟩} on both particles. It is straightforward to calculate that¹ ℐ(P_L) = 1 and ℐ(P_E) = 2. However, teleporting a qubit requires that Alice and Bob share 1 ebit of entanglement, so 𝒞(P_E) = 1, while 𝒞(P_L) = 0.
As another example, consider a procedure P_F in which Alice and Bob first flip a weighted coin with a 1/4 probability of landing heads and a 3/4 probability of landing tails. If the coin comes up heads, Alice and Bob teleport A to Bob, which requires 1 ebit of entanglement, while if the coin comes up tails, they do nothing. For this procedure, they have a 1/4 probability of needing 1 ebit of entanglement and a 3/4 probability of needing no entanglement, so 𝒞(P_F) = 0.25·(1 ebit) + 0.75·0 = 0.25 ebits.
Although we use the entanglement of formation as our measure of entanglement in the definition of the procedural cost, the specific procedures that we consider will only use pure entangled resource states. Therefore, since entanglement of formation reduces to entropy of entanglement for pure states, we only need to consider the resource states' entropy of entanglement, as we did in the above examples.
Finally, we introduce the central object of interest in this thesis.
¹When the ensemble E is clear from context, we will sometimes write ℐ(P) instead of ℐ(E, P).
Definition 3.4. Suppose that A and B are described by an ensemble E. Let 𝒫_I(E) be the collection of all procedures P for which ℐ(E, P) = I. The entanglement-information function of E is given by

𝒢(E, I) ≡ inf_{P∈𝒫_I(E)} 𝒞(P).   (3.4)
This function essentially gives the minimum amount of entanglement that Alice and Bob must share on average in order to gain at least an amount of information I about the initial state of A and B when A and B are described by the ensemble E.
Proposition 3.5. For a given ensemble E, the entanglement-information function 𝒢(E, I) is monotonically nondecreasing in I.

Proof. Let I₂ > I₁ ≥ 0, and let δ > 0. By the definition of the entanglement-information function, there exists a procedure P_δ such that ℐ(E, P_δ) = I₂ and

𝒞(P_δ) < 𝒢(E, I₂) + δ.   (3.5)

Let P_δ′ be a procedure identical to P_δ except that Alice and Bob throw away an amount I₂ − I₁ of the information they gain at the end. Then ℐ(E, P_δ′) = I₁, and

𝒢(E, I₁) ≤ 𝒞(P_δ′) < 𝒢(E, I₂) + δ.   (3.6)

Taking the infimum over all δ > 0, we obtain

𝒢(E, I₁) ≤ 𝒢(E, I₂),   (3.7)

and therefore, 𝒢(E, I) is monotonically nondecreasing in I. □
Given an ensemble E, we may plot the entanglement-information function with respect to I in the entanglement/information plane, as is shown hypothetically in figure 3.1. Although I is technically the independent variable, notice that we have put it on the vertical axis in this plot. The reason for this is that we may also interpret the plot of 𝒢(I) as giving the maximum amount of information that Alice and Bob are certain to be able to obtain about the state of A and B if they use a particular amount of entanglement on average, and this alternative interpretation is sometimes more convenient. Notice that any procedure P can be associated to the point (𝒞(P), ℐ(P)) in the entanglement/information plane in figure 3.1. The plot of the entanglement-information function is therefore the dividing curve between the attainable and unattainable regions in the plane. That is, for any procedure P, the point (𝒞(P), ℐ(P)) must lie to the right of the entanglement-information function.
²As before, if the ensemble is clear from context, we will write 𝒢(I) instead of 𝒢(E, I).
0"(E,1)
Entanglement
Figure 3.1: A hypothetical plot of the entanglementinformation function. Notice that the independent variable I is on the vertical axis.
3.3 Bounding the Entanglement-Information Function
Our goal in this thesis will be to investigate the entanglement-information functions for two different ensembles, the Lopsided Bell Ensemble and the Double Trine Ensemble. Ideally, we would like to be able to find the entanglement-information function exactly. However, calculating the entanglement-information function directly for a given ensemble is, to say the least, very difficult in general. Therefore, our main method of investigating it will be to look for upper and lower bounds.
One way that we may find good upper bounds on 𝒢(I) is to actually look for procedures which yield a lot of information about the state of A and B and which do not cost very much entanglement. In fact, we shall often look for one-parameter classes of procedures P(λ), where the procedural information and the procedural cost both increase with λ. Making a parametric plot of the points (𝒞(P(λ)), ℐ(P(λ))) will then yield an upper bound on 𝒢(I), because the entire curve must lie to the right of the plot of 𝒢(I).
We can also find lower bounds on 𝒢(I) by taking advantage of the fact that, as we mentioned in the previous chapter, the total entanglement of formation between Alice's system and Bob's system cannot increase on average under LOCC. Given a certain amount of information I, we look for a lower bound on 𝒢(I) by attempting to construct a situation in which the entanglement of formation between Alice's system and Bob's system would increase on average if there were a procedure P with ℐ(P) = I and 𝒞(P) < ε for some ε > 0. If we can find such a situation, ε is then a lower bound on 𝒢(I).
Let A and B be described by an ensemble E consisting of the N_S states {|ψ_i⟩}_{i=1}^{N_S}, equally weighted. Let P be a procedure with outcomes {O_i}_{i=1}^{N_O}. Notice that the specific information of the outcome O_i is given by

𝒥(E, O_i) = h(state) − h(state|O_i) = log N_S + Σ_{j=1}^{N_S} p(ψ_j|O_i) log p(ψ_j|O_i).   (3.8)
The theory of quantum mechanics naturally gives us the probabilities of outcomes given states, but this expression requires that we calculate the probabilities of states given outcomes. Using Bayes' Law (equation (2.17)) to rewrite this expression, we get

𝒥(E, O_i) = log N_S + Σ_{j=1}^{N_S} [p(O_i|ψ_j)/(N_S p(O_i))] log[p(O_i|ψ_j)/(N_S p(O_i))],   (3.9)
which allows us to calculate the specific information of O_i in terms of probabilities of outcomes given states. However, this expression is still fairly complicated, and in order to calculate the procedural information of P, we would need to minimize this quantity over i. Thus, we would like to further simplify the calculation of the procedural information in the case that certain assumptions hold.
Suppose that each outcome of P is equally likely. Then p(O_i) = 1/N_O for all i. In this case, notice that the specific information of O_i depends only on the probabilities {p(O_i|ψ_j)}_{j=1}^{N_S}. Therefore, if this collection of probabilities is the same (up to ordering) for every pair of outcomes O_i and O_j, we have that each outcome of P gives the same specific information, and hence, the procedural information ℐ(P) is equal to the average specific information of the outcomes, that is,
ℐ(P) = min_i [h(state) − h(state|O_i)] = h(state) − ⟨h(state|outcome)⟩,   (3.10)

where "outcome" is a random variable whose values are the possible outcomes of P. The rightmost expression in this equality is the average information Alice and Bob
3.3. BOUNDING THE ENTANGLEMENTINFORMATION FUNCTION
gain about the state by obtaining an outcome of the procedure P, which is just the mutual information of the state and the outcome,
ℐ(P) = I(state : outcome).   (3.11)
However, recall that the mutual information is symmetric in its arguments, so we have that
ℐ(P) = I(outcome : state) = h(outcome) − ⟨h(outcome|state)⟩.   (3.12)
This quantity is typically much more straightforward to calculate than expression (3.9), so we shall use it whenever it applies, namely whenever the outcomes of P all give the same specific information.
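As a concrete illustration of expression (3.12) (a Python/NumPy sketch; the example setup is our own), the procedural information can be computed from the matrix of probabilities p(O_i|ψ_j) whenever the symmetry assumptions above hold:

```python
import numpy as np

def entropy(p):
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def procedural_information(p_outcome_given_state):
    """
    I(P) = h(outcome) - <h(outcome|state)>, valid when every outcome of P
    yields the same specific information.  Rows are outcomes, columns are
    states; the states are assumed equally likely.
    """
    n_states = p_outcome_given_state.shape[1]
    p_outcome = p_outcome_given_state.mean(axis=1)   # a priori outcome probs
    avg_cond = np.mean([entropy(p_outcome_given_state[:, j])
                        for j in range(n_states)])
    return entropy(p_outcome) - avg_cond

# Example: four equally likely states; each outcome rules out two of them and
# leaves the other two equally likely (cf. the local measurement on the Bell
# states discussed above), giving exactly 1 bit of procedural information.
p = np.array([[0.5, 0.5, 0.0, 0.0],
              [0.5, 0.5, 0.0, 0.0],
              [0.0, 0.0, 0.5, 0.5],
              [0.0, 0.0, 0.5, 0.5]])
assert np.isclose(procedural_information(p), 1.0)
```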
Chapter 4
The Lopsided Bell Ensemble
4.1 Introduction
In section 2.3.6, we defined the Bell states,
|ξ₁⟩ = (1/√2)(|00⟩ + |11⟩)
|ξ₂⟩ = (1/√2)(|00⟩ − |11⟩)
|ξ₃⟩ = (1/√2)(|01⟩ + |10⟩)
|ξ₄⟩ = (1/√2)(|01⟩ − |10⟩),   (4.1)
and we remarked that they form an orthonormal basis for the state space of two qubits. We now present a generalization of these states, the Lopsided Bell states, which are given by

|ξ₁^a⟩ = a|00⟩ + b|11⟩
|ξ₂^a⟩ = b|00⟩ − a|11⟩
|ξ₃^a⟩ = a|01⟩ + b|10⟩
|ξ₄^a⟩ = b|01⟩ − a|10⟩,   (4.2)
where a and b are real, 0 ≤ a ≤ 1, and a² + b² = 1. In the case where a = b = 1/√2, these states reduce to the original Bell states. Notice that, even when a ≠ b, these states still form an orthonormal basis. However, in this case, the entanglement of
each state is h(a², b²) < 1; that is, the states are not maximally entangled if a ≠ b. For the sake of simplicity, we assume that a ≥ b. For the purposes of the calculations we perform in this chapter, this does not entail a loss of generality, because our calculations can easily be modified to handle the case where b > a, and they would then give the same results anyway.
Our goal in this chapter is to investigate the entanglement-information function of the Lopsided Bell Ensemble¹ L_a = {|ξ_i^a⟩}_{i=1}^4. We first note that since there are four equally likely initial states, h(state) = log 4 = 2, and thus the maximum specific information for any outcome of any procedure is 2 bits. Moreover, we know that this maximum is achievable, because Alice could teleport system A to Bob, who could then perform the Lopsided Bell measurement {|ξ_i^a⟩}_{i=1}^4 on A and B locally. This measurement would discriminate among the Lopsided Bell states perfectly, and therefore, the procedural information of this procedure would be 2 bits, the maximum conceivable amount. Thus, the entanglement-information function 𝒢(I) is defined if and only if 0 ≤ I ≤ 2.
Next, we note that Alice and Bob can gain some information locally by each making the canonical measurement {|0⟩, |1⟩} on their own system. This corresponds to making the joint measurement {|00⟩, |01⟩, |10⟩, |11⟩} on both systems. If they obtain one of the outcomes |00⟩ or |11⟩ (that is, if their measurements on their own systems return the same outcomes), they know that A and B were initially in one of the states |ξ₁^a⟩ or |ξ₂^a⟩, because these outcomes are orthogonal to |ξ₃^a⟩ and |ξ₄^a⟩. Similarly, if they obtain one of the outcomes |01⟩ or |10⟩ (that is, if their individual measurements return different outcomes), they know that A and B were initially in one of the states |ξ₃^a⟩ or |ξ₄^a⟩. In either case, however, Alice and Bob have no way of distinguishing perfectly between the remaining two possibilities if a ≠ 1. In fact, the final Shannon entropy of the state of A and B is h(a², b²) in either case, so the procedural information is 2 − h(a², b²).
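We can confirm the value 2 − h(a², b²) numerically. The sketch below (Python/NumPy; illustrative, using our sign conventions for the Lopsided Bell states) computes the information of the canonical measurement via the mutual-information formula of section 3.3:

```python
import numpy as np

def entropy(v):
    v = np.asarray(v, dtype=float)
    v = v[v > 0]
    return -np.sum(v * np.log2(v))

def h2(p):
    """Binary Shannon entropy h(p, 1-p), base 2."""
    return entropy([p, 1 - p])

a, b = np.sqrt(0.8), np.sqrt(0.2)
# Lopsided Bell states written in the basis {|00>, |01>, |10>, |11>}
states = np.array([[a, 0, 0, b],
                   [b, 0, 0, -a],
                   [0, a, b, 0],
                   [0, b, -a, 0]])

# p(outcome | state) for the canonical measurement; rows: outcomes, cols: states
p = (states ** 2).T
p_outcome = p.mean(axis=1)            # each outcome is equally likely a priori

info = entropy(p_outcome) - np.mean([entropy(p[:, j]) for j in range(4)])
assert np.isclose(info, 2 - h2(a ** 2))
```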
To find upper bounds on 𝒢(I) for 0 ≤ I ≤ 2, we first construct a procedure P_U in which Alice and Bob perform a nonlocal unitary transformation on A and B which increases the amount of information Alice and Bob gain from performing the canonical measurement {|00⟩, |01⟩, |10⟩, |11⟩}. Next, we consider a "bad teleportation" procedure P_T in which Alice attempts to teleport A to Bob using a resource pair with entanglement less than 1 ebit and then Bob performs the Lopsided Bell measurement on A and B. Finally, we show that in the case a = b = 1/√2, the upper
¹More precisely, this is a one-parameter family of ensembles. Since we always assume that the initial states of A and B are equally likely, from this point on, we'll just specify the states when defining an ensemble, not the probabilities.
bound curve given by the bad teleportation method is 𝒢(I) by finding a lower bound and showing that it coincides with the upper bound.
4.2 A Unitary Transformation Procedure
Let
U_a = (  a    0    0    b
         0    a    b    0
         0   −b    a    0
        −b    0    0    a ),   (4.3)
and notice that
U_a|ξ₁^a⟩ = |00⟩
U_a|ξ₂^a⟩ = −|11⟩
U_a|ξ₃^a⟩ = |01⟩
U_a|ξ₄^a⟩ = −|10⟩,   (4.4)
that is, U_a maps the Lopsided Bell states to the canonical basis states (up to overall phases). Therefore, if Alice and Bob are able to perform the transformation U_a on A and B, they could simply perform the canonical measurement {|00⟩, |01⟩, |10⟩, |11⟩} in order to distinguish perfectly among the possible initial states. Unfortunately, U_a is a nonlocal transformation (that is, it cannot be written in the form U_A ⊗ U_B, where U_A is a unitary transformation acting on Alice's system and U_B is a unitary transformation acting on Bob's system), and therefore, it requires that Alice and Bob share some entanglement.
Intuitively, our approach will be to find a "weaker version" of the transformation Ua which will map the Lopsided Bell states "part of the way" to the canonical basis states and will therefore increase the amount of information that Alice and Bob gain from performing the canonical measurement. Let
U(γ) = e^{iγ σ_y ⊗ σ_x} = (  cos γ    0       0     sin γ
                              0      cos γ   sin γ    0
                              0     −sin γ   cos γ    0
                            −sin γ    0       0     cos γ ).   (4.5)
Notice that when γ = cos⁻¹ a, we have U(γ) = U_a. If γ < cos⁻¹ a, U(γ) will rotate the Lopsided Bell states part of the way to the canonical basis states.
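As a sanity check (a Python/NumPy sketch assuming the exponential form U(γ) = e^{iγ σ_y ⊗ σ_x}, whose matrix happens to be real), U(γ) is orthogonal for every γ, and at γ = cos⁻¹ a it sends each Lopsided Bell state to a canonical basis state up to phase:

```python
import numpy as np

def U(gamma):
    """U(gamma) = exp(i*gamma * sigma_y ⊗ sigma_x) as a 4x4 real matrix."""
    c, s = np.cos(gamma), np.sin(gamma)
    return np.array([[ c, 0, 0, s],
                     [ 0, c, s, 0],
                     [ 0, -s, c, 0],
                     [-s, 0, 0, c]])

a, b = np.sqrt(0.7), np.sqrt(0.3)
# Lopsided Bell states (our sign conventions) in the basis {|00>,|01>,|10>,|11>}
lopsided = np.array([[a, 0, 0, b],
                     [b, 0, 0, -a],
                     [0, a, b, 0],
                     [0, b, -a, 0]])

# U(gamma) is unitary (here real, hence orthogonal) for any gamma
assert np.allclose(U(0.3) @ U(0.3).T, np.eye(4))

# At gamma = arccos(a), each Lopsided Bell state lands on one canonical basis
# state (up to phase), so the canonical measurement distinguishes all four
Ua = U(np.arccos(a))
for state in lopsided:
    probs = (Ua @ state) ** 2
    assert np.isclose(probs.max(), 1.0)
```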
As with U_a, the transformation U(γ) cannot be written as a tensor product of a transformation on Alice's system with a transformation on Bob's system, so Alice and Bob need to share some entanglement to perform it. In Ref. [18], D. Berry gives an explicit method that Alice and Bob can use to perform a certain class of transformations which contains U(γ). Berry also shows how to numerically calculate the average entanglement that Alice and Bob must share in order to make use of this method. Utilizing Berry's result, we may numerically calculate the amount of entanglement that Alice and Bob must share on average in order to perform, using Berry's method, the transformation U(γ) as a function of γ. A plot of this function is shown in Figure 4.1. See Appendix A for a more detailed description of Berry's
method.

Figure 4.1: A plot of 𝒞(γ), the average amount of entanglement Alice and Bob must use in order to perform Berry's method.
We can now state our procedure P_U(γ) in full detail. First, Alice and Bob perform the transformation U(γ) on A and B using Berry's method. Then, they perform the canonical measurement on A and B. The procedural cost² 𝒞(γ) is given by the plot in Figure 4.1.
²When we deal with a one-parameter family of procedures P(λ), we will sometimes write 𝒞(λ) and ℐ(λ) instead of 𝒞(P(λ)) and ℐ(P(λ)).
To calculate the procedural information ℐ(γ), we first note that once Alice and Bob have performed U(γ), the four outcomes of their canonical measurement are equally likely a priori. To see this, let |ξ_i'(γ)⟩ = U(γ)|ξ_i⟩. Then

|ξ_1'(γ)⟩ = (a cos γ + b sin γ)|00⟩ + (b cos γ − a sin γ)|11⟩
|ξ_2'(γ)⟩ = (b cos γ − a sin γ)|00⟩ − (a cos γ + b sin γ)|11⟩
|ξ_3'(γ)⟩ = (a cos γ + b sin γ)|01⟩ + (b cos γ − a sin γ)|10⟩
|ξ_4'(γ)⟩ = (b cos γ − a sin γ)|01⟩ − (a cos γ + b sin γ)|10⟩.   (4.6)
To find the a priori probability of one of the outcomes of Alice and Bob's canonical measurement, say |00⟩, we compute the probability of obtaining the outcome |00⟩ in each state and then find the average of these probabilities, weighted by the probability that A and B were in each state (in this case, each of the states is equally likely, so each probability is given a weight of 1/4). Given equation (4.6), it is straightforward to verify that the a priori probability of each of Alice and Bob's outcomes is 1/4. Let |e_1⟩ = |00⟩, |e_2⟩ = |01⟩, |e_3⟩ = |10⟩, and |e_4⟩ = |11⟩. If we can show that the collection {p(e_{j_1}|ξ_i'(γ))}_{i=1}^4 is the same (up to reordering) as the collection {p(e_{j_2}|ξ_i'(γ))}_{i=1}^4 for any 1 ≤ j_1, j_2 ≤ 4, then by our discussion at the end of the previous chapter, the specific information is the same for each outcome of P_U(γ), and hence the procedural information ℐ(γ) is equal to the mutual information of the state and the outcome.
Notice that for any j, the collection {p(e_j|ξ_i'(γ))}_{i=1}^4 is completely determined by the collection {⟨e_j|ξ_i'(γ)⟩}_{i=1}^4. Given j_1 and j_2, let j_3 and j_4 be the remaining two indices, and let V be a unitary transformation defined by

V|e_{j_1}⟩ = |e_{j_2}⟩
V|e_{j_2}⟩ = |e_{j_1}⟩
V|e_{j_3}⟩ = |e_{j_4}⟩
V|e_{j_4}⟩ = |e_{j_3}⟩   (4.7)

if (j_1, j_2) = (1, 2) or (j_1, j_2) = (3, 4), and

V|e_{j_1}⟩ = |e_{j_2}⟩
V|e_{j_2}⟩ = −|e_{j_1}⟩
V|e_{j_3}⟩ = |e_{j_4}⟩
V|e_{j_4}⟩ = −|e_{j_3}⟩   (4.8)

otherwise. Note that, up to overall phase, V permutes the vectors |ξ_i'(γ)⟩. Also, up to overall phase, V maps |e_{j_1}⟩ to |e_{j_2}⟩. Since unitary transformations preserve inner products, we therefore have that the collection {⟨e_{j_1}|ξ_i'(γ)⟩}_{i=1}^4 is equal (up to reordering and phases) to {⟨e_{j_2}|ξ_i'(γ)⟩}_{i=1}^4. Hence, {p(e_{j_1}|ξ_i'(γ))}_{i=1}^4 = {p(e_{j_2}|ξ_i'(γ))}_{i=1}^4, and therefore the procedural information of P_U(γ) is equal to the mutual information of the state and the outcome.
To calculate the procedural information, we note that the set of probabilities of the four outcomes given some initial state will be the same for all four possible initial states, except that the probabilities will be permuted among the outcomes. Since the Shannon entropy function only depends on the collection of probabilities and not on which outcome each probability is associated to, we have that ⟨h(outcome|state)⟩ = h(outcome|ξ_1). Therefore, the procedural information of P_U(γ) is

ℐ(γ) = I(state : outcome) = h(outcome) − ⟨h(outcome|state)⟩
     = 2 − h(outcome|ξ_1)
     = 2 + Σ_{i=1}^4 p(e_i|ξ_1'(γ)) log p(e_i|ξ_1'(γ)),   (4.9)

where p(e_i|ξ_1'(γ)) = |⟨e_i|ξ_1'(γ)⟩|². Parametrically plotting (𝒞(γ), ℐ(γ)) for several values of a, we obtain the plots in Figure 4.2. Each of these curves is an upper bound (that is, a right bound) on the entanglement-information function for the corresponding value of a.
Regardless of a, we know that Alice and Bob can obtain 2 bits of information using 1 ebit of entanglement, because Berry's procedure guarantees that they need no more than 1 ebit of entanglement to perform the transformation U(γ). According to these plots, when a = b = 1/√2, Alice and Bob are only guaranteed about 1.35 bits of information if they use anything less than 1 ebit of entanglement. This sharp discontinuity suggests that the procedure P_U(γ) is not very efficient with entanglement when it is used to discriminate among the non-lopsided Bell states. On the other hand, when a = 0.957, Alice and Bob can obtain 2 bits of information using less than 1 whole ebit of entanglement. Thus, we have to compare this procedure to the "bad teleportation" procedure discussed in the next section in order to judge whether or not P_U(γ) is efficient at discriminating among very lopsided Bell states.
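For the state |ξ_1⟩, only two outcome probabilities in (4.9) are nonzero, so ℐ(γ) = 2 − h(α², β²) with α = a cos γ + b sin γ and β = b cos γ − a sin γ. A short numerical sketch (a = 0.957 is one of the values plotted above; the function names are ours) confirms the two endpoint behaviors:

```python
import math

def h(ps):
    # Shannon entropy in bits, ignoring zero-probability outcomes
    return -sum(p * math.log2(p) for p in ps if p > 0)

def info(a, gamma):
    # Procedural information of P_U(gamma), following eq. (4.9)
    b = math.sqrt(1 - a ** 2)
    alpha = a * math.cos(gamma) + b * math.sin(gamma)
    beta = b * math.cos(gamma) - a * math.sin(gamma)
    return 2 - h([alpha ** 2, beta ** 2])

a = 0.957
print(info(a, 0.0))           # gamma = 0: the bare canonical measurement
print(info(a, math.acos(a)))  # gamma = arccos(a): perfect discrimination, 2 bits
```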
4.3 A "Bad Teleportation" Procedure
Next, we consider a procedure P_T in which Alice attempts to teleport particle A to Bob using a resource state that is not maximally entangled. Specifically, suppose
Figure 4.2: Parametric plots of (𝒞(γ), ℐ(γ)) for several values of a. From bottom to top, the curves represent a = 1/√2, 0.7571, 0.8071, 0.8571, 0.9071, 0.9571.
that Alice possesses an auxiliary particle C and Bob possesses an auxiliary particle D, and C and D are initially in the state |φ⟩_CD = c|00⟩ + d|11⟩. If the initial state of A and B is |ψ⟩, the initial joint state of A, B, C, and D is |ψ⟩_AB ⊗ |φ⟩_CD. Alice makes the (non-lopsided) Bell measurement on her particles A and C. Using the partial inner product formalism, we can find the final state of particles D and B given each of the possible initial states of A and B and each of the possible outcomes of Alice's measurement. The results of this calculation are given in Table 4.1. Once Alice communicates the result of her measurement to Bob, he then knows the four possible states of particles D and B. For example, if Alice obtains the outcome (1/√2)(|00⟩ + |11⟩)_AC, Bob then knows that D and B are in one of the four states ac|00⟩ + bd|11⟩, bc|00⟩ − ad|11⟩, ac|01⟩ + bd|10⟩, and bc|01⟩ − ad|10⟩ (up to normalization).
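The partial inner product calculation behind Table 4.1 can be sketched numerically. The following illustration (with arbitrary values a = 0.8, c = 0.7; the dictionaries and function names are ours, not the thesis's) projects Alice's half onto each Bell state and checks that the outcome probabilities sum to 1:

```python
import math

def teleport(a, b, c, d):
    # Joint state (a|00> + b|11>)_AB (c|00> + d|11>)_CD, amplitudes keyed by (A, B, C, D)
    amp = {(0, 0, 0, 0): a * c, (0, 0, 1, 1): a * d,
           (1, 1, 0, 0): b * c, (1, 1, 1, 1): b * d}
    # Alice's Bell states on A and C, as lists of ((A, C), sign), each term weighted 1/sqrt(2)
    bells = {'Phi+': [((0, 0), 1), ((1, 1), 1)], 'Phi-': [((0, 0), 1), ((1, 1), -1)],
             'Psi+': [((0, 1), 1), ((1, 0), 1)], 'Psi-': [((0, 1), 1), ((1, 0), -1)]}
    out = {}
    for name, terms in bells.items():
        db = {}  # subnormalized state of D and B after the partial inner product
        for (A, B, C, D), v in amp.items():
            for (x, y), sign in terms:
                if (A, C) == (x, y):
                    db[(D, B)] = db.get((D, B), 0.0) + sign * v / math.sqrt(2)
        out[name] = db
    return out

a, c = 0.8, 0.7
b, d = math.sqrt(1 - a * a), math.sqrt(1 - c * c)
states = teleport(a, b, c, d)
total = sum(v * v for db in states.values() for v in db.values())
print(total)  # norm-squares of all subnormalized states sum to 1
```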
Let

E_1 = |00⟩⟨00| + |11⟩⟨11|,   E_2 = |01⟩⟨01| + |10⟩⟨10|.   (4.10)

Notice that E_1 and E_2 have nonnegative eigenvalues and E_1 + E_2 = I, where I is the identity matrix. Therefore, {E_1, E_2} is a valid POVM. Notice that no matter
Result of Alice's              Initial state of A and B
measurement                    (a|00⟩ + b|11⟩)_AB             (b|00⟩ − a|11⟩)_AB
(1/√2)(|00⟩ + |11⟩)_AC         (1/√2)(ac|00⟩ + bd|11⟩)_DB     (1/√2)(bc|00⟩ − ad|11⟩)_DB
(1/√2)(|00⟩ − |11⟩)_AC         (1/√2)(ac|00⟩ − bd|11⟩)_DB     (1/√2)(bc|00⟩ + ad|11⟩)_DB
(1/√2)(|01⟩ + |10⟩)_AC         (1/√2)(ad|10⟩ + bc|01⟩)_DB     (1/√2)(bd|10⟩ − ac|01⟩)_DB
(1/√2)(|01⟩ − |10⟩)_AC         (1/√2)(ad|10⟩ − bc|01⟩)_DB     (1/√2)(bd|10⟩ + ac|01⟩)_DB

Result of Alice's              Initial state of A and B
measurement                    (a|01⟩ + b|10⟩)_AB             (b|01⟩ − a|10⟩)_AB
(1/√2)(|00⟩ + |11⟩)_AC         (1/√2)(ac|01⟩ + bd|10⟩)_DB     (1/√2)(bc|01⟩ − ad|10⟩)_DB
(1/√2)(|00⟩ − |11⟩)_AC         (1/√2)(ac|01⟩ − bd|10⟩)_DB     (1/√2)(bc|01⟩ + ad|10⟩)_DB
(1/√2)(|01⟩ + |10⟩)_AC         (1/√2)(ad|11⟩ + bc|00⟩)_DB     (1/√2)(bd|11⟩ − ac|00⟩)_DB
(1/√2)(|01⟩ − |10⟩)_AC         (1/√2)(ad|11⟩ − bc|00⟩)_DB     (1/√2)(bd|11⟩ + ac|00⟩)_DB

Table 4.1: The subnormalized final states of particles D and B given each possible initial state of A and B and each possible outcome of Alice's measurement on A and C. According to the partial inner product formalism, the norm-squared of these subnormalized states represents the probabilities that Alice obtains the various outcomes of her measurement.
which outcome Alice obtained, two of the possible final states of D and B lie in the subspace spanned by |00⟩ and |11⟩, and the other two possible final states lie in the orthogonal subspace spanned by |01⟩ and |10⟩. As a first step toward distinguishing among the four possibilities, Bob can perform the POVM {E_1, E_2} on particles D and B. This measurement returns the outcome E_1 with certainty if and only if the state of D and B lies in the {|00⟩, |11⟩} subspace and returns E_2 with certainty if and only if the state of D and B lies in the {|01⟩, |10⟩} subspace. Thus, performing the POVM {E_1, E_2} on particles D and B rules out two of the possible states and leaves the remaining two possibilities equally likely. Notice that Bob can then perform a local unitary transformation on D and B to map the two remaining possibilities onto the states |f_0⟩ = ac|00⟩ + bd|11⟩ and |f_1⟩ = bc|00⟩ − ad|11⟩. For example, if Alice obtains the outcome (1/√2)(|01⟩ + |10⟩) on her Bell measurement and Bob obtains the outcome E_2 on his POVM, Bob knows that the two remaining possible states of D and B are ad|10⟩ + bc|01⟩ and bd|10⟩ − ac|01⟩. He can then perform the unitary transformation
(4.11)
to map these two states onto |f_0⟩ and |f_1⟩. Let

T(θ) = ⎛ cos θ   0   0   −sin θ ⎞
       ⎜   0     1   0      0   ⎟
       ⎜   0     0   1      0   ⎟   (4.12)
       ⎝ sin θ   0   0    cos θ ⎠

and let

|B_0(θ)⟩ = T(θ)|00⟩
|B_1(θ)⟩ = T(θ)|11⟩.   (4.13)
The last step in our procedure P_T is that Bob will attempt to distinguish between the nonorthogonal states |f_0⟩ and |f_1⟩ by making the orthogonal measurement

{|B_0(θ)⟩, |B_1(θ)⟩}   (4.14)

for some θ. Let θ_0 be the value of θ at which the vectors |B_0(θ_0)⟩ and |B_1(θ_0)⟩ symmetrically straddle the vectors |f_0⟩ and |f_1⟩, as is shown in Figure 4.3; that is, ⟨B_0(θ_0)|f_0⟩ = ⟨B_1(θ_0)|f_1⟩. By symmetry, we know that

p(B_0(θ_0)|f_0) = p(B_1(θ_0)|f_1)   and   p(B_1(θ_0)|f_0) = p(B_0(θ_0)|f_1),   (4.15)

so

{p(B_0(θ_0)|f_0), p(B_1(θ_0)|f_0)} = {p(B_0(θ_0)|f_1), p(B_1(θ_0)|f_1)}.   (4.16)
We note that the procedure P_T technically has 16 different outcomes (4 outcomes of Alice's Bell measurement times 2 outcomes of Bob's POVM times 2 outcomes of Bob's final measurement). However, the outcomes of Alice's Bell measurement are all equally likely a priori, as are the outcomes of Bob's POVM {E_1, E_2}. Therefore, equation (4.16) implies that {p(O_i|ξ_k)}_{k=1}^4 = {p(O_j|ξ_k)}_{k=1}^4 for any two outcomes O_i and O_j of P_T. Hence, if Bob chooses θ = θ_0, the procedural information ℐ(P_T) is equal to the mutual information of the state of A and B and the outcome of P_T.
Figure 4.3: Two nonorthogonal vectors |f_0⟩ and |f_1⟩, and the orthogonal vectors |B_0(θ_0)⟩ and |B_1(θ_0)⟩ which symmetrically straddle them.
It is known that the optimal measurement for distinguishing between two equally likely nonorthogonal states is the orthogonal measurement whose outcomes symmetrically straddle the states, if the measure of success is the mutual information of the state and the outcome [19]. Therefore, if we maximize the mutual information of the state of A and B and the outcome of P_T over θ, we know that the maximum will occur at θ = θ_0, and therefore at this maximum, the procedural information is equal to the mutual information.
Each of the 16 outcomes of P_T is equally likely a priori, so h(outcome) = log 16 = 4. Also, the entropy of the outcome given a particular state does not depend on which state it is, so ⟨h(outcome|state)⟩ = h(outcome|ξ_1). Hence, the mutual information is I(outcome : state) = 4 − h(outcome|ξ_1). Maximizing this quantity over θ for several values of c and a, we obtain the plots in Figure 4.4.
Notice that when a = 1/√2, the "bad teleportation" procedure is definitely a better choice than the unitary transformation procedure. On the other hand, the unitary transformation method is better than bad teleportation when a is close to 1. Finally, at about a = 0.9, neither method is significantly better than the other.
Also, notice that for any given a, the curves for the bad teleportation procedure and the unitary transformation procedure intersect the information axis at the same point. The reason for this is that in the limit where Alice and Bob's resource pair is unentangled, both procedures are in fact equivalent to Alice and Bob each making the measurement {IO), II)} on their own system.
Figure 4.4: Parametric plots of (𝒞(c), ℐ(c)) for several values of a. For comparison, the points plotted in Figure 4.2 are shown here as small dots. As before, from bottom to top, these curves represent a = 1/√2, 0.7571, 0.8071, 0.8571, 0.9071, 0.9571.
4.4 The Entanglement-Information Function for a = 1/√2
In this section, we explicitly find the entanglement-information function ℰ(I) for the case where a = 1/√2 (that is, for the non-lopsided Bell ensemble). We have the following:

Theorem 4.1. For a = 1/√2, the entanglement-information function is given parametrically by (h(1/2 + √(p(1−p)), 1/2 − √(p(1−p))), 2 − h(p, 1−p)) with 1/2 ≤ p ≤ 1, and the "bad teleportation" procedure P_T is the optimal procedure.
To prove this, we first show that the parametric curve

(h(1/2 + √(p(1−p)), 1/2 − √(p(1−p))), 2 − h(p, 1−p))   (4.17)

with 1/2 ≤ p ≤ 1 is a lower bound on the entanglement-information function by using an extension of a technique used in Ref. [10] to find lower bounds on the entanglement costs of measurements, and then we show that the upper bound curve given by P_T is equal to the lower bound.
Let the initial state of particles A and B be one of the four Bell states, with each state equally likely. Also, let particles C and D initially be in the same state as A and B. We represent the joint state of A, B, C, and D by the density matrix

ρ_initial^{ABCD} = (1/4) Σ_{i=1}^4 |ξ_i⟩⟨ξ_i|_AB ⊗ |ξ_i⟩⟨ξ_i|_CD.   (4.18)
Suppose that Alice and Bob perform a procedure with procedural information I on A and B in an attempt to discover their state. Let {p_1, p_2, p_3, p_4} be the probabilities that they assign to the initial states of A and B after they have obtained an outcome of this procedure. Then the final state of C and D is

ρ_final^{CD} = Σ_{i=1}^4 p_i |ξ_i⟩⟨ξ_i|   (4.19)

and the probabilities {p_i} satisfy the inequality

h(p_1, p_2, p_3, p_4) ≤ 2 − I,   (4.20)

or equivalently,

I ≤ 2 − h(p_1, p_2, p_3, p_4).   (4.21)

We will assume without loss of generality that p_1 is the largest of the p_i's. Let
the states |ψ^{ijk}_{p_1,p_2,p_3,p_4}⟩, for i, j, k = +, −, be defined by equation (4.22). Then we can write ρ_final^{CD} in the form

ρ_final^{CD} = Σ_{i,j,k=+,−} |ψ^{ijk}_{p_1,p_2,p_3,p_4}⟩⟨ψ^{ijk}_{p_1,p_2,p_3,p_4}|.   (4.23)
In Ref. [7], it is proven that the states |ψ^{ijk}_{p_1,p_2,p_3,p_4}⟩ have the least average entanglement of any pure state decomposition of ρ_final^{CD}. The entanglement of each of these states is

h(1/2 + √(p_1(1−p_1)), 1/2 − √(p_1(1−p_1))),   (4.24)

and this is therefore the entanglement of formation of ρ_final^{CD}. Let
|W_1⟩_ABCD = (1/2)(|0000⟩ + |0101⟩ + |1010⟩ + |1111⟩)
|W_2⟩_ABCD = (1/2)(|0000⟩ − |0101⟩ − |1010⟩ + |1111⟩)
|W_3⟩_ABCD = (1/2)(|0011⟩ + |0110⟩ + |1001⟩ + |1100⟩)
|W_4⟩_ABCD = (1/2)(|0011⟩ − |0110⟩ − |1001⟩ + |1100⟩).   (4.25)
Then we can write ρ_initial^{ABCD} in the form

ρ_initial^{ABCD} = (1/4) Σ_{i=1}^4 |W_i⟩⟨W_i|_ABCD.   (4.26)
However, if we interchange particles B and C in our expressions for the |W_i⟩'s, we can see that they can each be written as a tensor product of a state of AC with a state of BD. For example,

|W_1⟩_ACBD = (1/2)(|0000⟩ + |0011⟩ + |1100⟩ + |1111⟩)
           = (1/√2)(|00⟩ + |11⟩)_AC ⊗ (1/√2)(|00⟩ + |11⟩)_BD.   (4.27)

Therefore, the states |W_i⟩ have 0 entanglement across the AC|BD cut, and thus, the entanglement of formation of ρ_initial^{ABCD} across the AC|BD cut is 0.
To find a lower bound on the entanglement-information function ℰ(I), we find the minimum c of the entanglement of formation E(ρ_final^{CD}) over all choices of the probability vector p = (p_1, p_2, p_3, p_4) such that h(p) ≤ 2 − I. Since the initial density matrix ρ_initial^{ABCD} is unentangled between AC and BD and entanglement of formation cannot increase on average under LOCC, we may then conclude that Alice and Bob must have shared at least an amount c of entanglement initially. Thus, the procedural cost of any procedure P for which ℐ(P) ≥ I must be at least c, and hence ℰ(I) is bounded below by c.

Since Alice and Bob obtain 1 bit of information locally by performing the canonical measurement {|00⟩, |01⟩, |10⟩, |11⟩}, we know that ℰ(I) = 0 for 0 ≤ I ≤ 1. Thus, it suffices to consider 1 ≤ I ≤ 2. We need the following lemma, which we'll prove in Appendix B.
Lemma 4.2. Let 1 ≤ I ≤ 2. Up to reordering of the second, third, and fourth components, the probability vector p = (p_1, p_2, p_3, p_4) with p_1 ≥ p_2, p_3, p_4 and h(p) ≤ 2 − I which minimizes the entanglement of formation (4.24) of ρ_final^{CD} is p = (p, 1−p, 0, 0), where h(p, 1−p) = 2 − I.
With this lemma, we can calculate the lower bound on ℰ(I) for any given I by finding the unique value of p ≥ 1/2 such that h(p, 1−p) = 2 − I and then plugging this value for p into the formula (4.24) for the entanglement of formation of ρ_final^{CD}. The lower bound curve is therefore given parametrically by

(h(1/2 + √(p(1−p)), 1/2 − √(p(1−p))), 2 − h(p, 1−p))   (4.28)

with 1/2 ≤ p ≤ 1, as we claimed.
Now, we find a parametric representation of the upper bound curve given by the "bad teleportation" procedure P_T and show that it is, in fact, a different parameterization of the same curve. Since a = b = 1/√2, in the last stage of the procedure, Bob needs to distinguish between the states |f_0⟩ = c|00⟩ + d|11⟩ and |f_1⟩ = c|00⟩ − d|11⟩, which he accomplishes by making an orthogonal measurement whose outcomes symmetrically straddle |f_0⟩ and |f_1⟩. In this case, the outcomes of this measurement will be (1/√2)(|00⟩ + |11⟩) and (1/√2)(|00⟩ − |11⟩). Given either possible state, the probabilities of the outcomes of this measurement will be (1/2)(c − d)² = 1/2 − cd and (1/2)(c + d)² = 1/2 + cd. Thus, the mutual information (and hence the procedural information) is given by
ℐ(c) = h(outcome) − ⟨h(outcome|state)⟩ = h(outcome) − h(outcome|ξ_1)
     = log 16 − h((1/8)(1 − 2cd), …, (1/8)(1 − 2cd), (1/8)(1 + 2cd), …, (1/8)(1 + 2cd), 0, …, 0),   (4.29)

where (1/8)(1 − 2cd) and (1/8)(1 + 2cd) each occur four times and 0 occurs eight times.
Notice that the probabilities of half of the outcomes are 0 given the initial state |ξ_1⟩ (specifically, since Bob's POVM {E_1, E_2} is certain to return a particular outcome given a particular state, every outcome of the procedure which involves Bob getting the "wrong" outcome of this POVM has probability 0). Using the fact that
h(x, y, z, w) = h(x+y, z+w) + (x+y) h(x/(x+y), y/(x+y)) + (z+w) h(z/(z+w), w/(z+w)),   (4.30)
we have

ℐ(c) = 4 − [h(1/2 − cd, 1/2 + cd) + (1/2 − cd) h(1/4, 1/4, 1/4, 1/4) + (1/2 + cd) h(1/4, 1/4, 1/4, 1/4)]
     = 2 − h(1/2 − cd, 1/2 + cd).   (4.31)
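The simplification from (4.29) to (4.31) via the grouping identity (4.30) can be checked numerically (c = 0.85 is an arbitrary test value; the names are ours):

```python
import math

def H(ps):
    # Shannon entropy in bits, skipping zero probabilities
    return -sum(p * math.log2(p) for p in ps if p > 0)

c = 0.85
d = math.sqrt(1 - c * c)
# The 16 outcome probabilities of P_T: (1-2cd)/8 four times, (1+2cd)/8 four times, 0 eight times
probs = [(1 - 2 * c * d) / 8] * 4 + [(1 + 2 * c * d) / 8] * 4 + [0.0] * 8
lhs = math.log2(16) - H(probs)            # eq. (4.29)
rhs = 2 - H([0.5 - c * d, 0.5 + c * d])   # eq. (4.31)
print(lhs, rhs)
```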
Meanwhile, the procedural cost is given by 𝒞(c) = h(c², d²). Thus, the upper bound curve obtained from the "bad teleportation" procedure is given by

(h(c², d²), 2 − h(1/2 − cd, 1/2 + cd))   (4.32)

for 1/√2 ≤ c ≤ 1 (note that we may assume without loss of generality that c ≥ d). Let p = 1/2 + cd. Note that the y-components of the parameterizations (4.28) and
(4.32) are then equal. Also, using the fact that c² + d² = 1, we have

1 = (c² + d²)² = c⁴ + d⁴ + 2c²d²  ⟹  1 − 4c²d² = c⁴ + d⁴ − 2c²d² = (c² − d²)².   (4.33)

Taking the positive square root of both sides of this equation, we get

√(1 − 4c²d²) = c² − d²,   (4.34)

and this implies that 1/2 + √(p(1−p)) = c² and 1/2 − √(p(1−p)) = d². Therefore, the x-coordinates of (4.28) and (4.32) are equal, and hence, (4.28) and (4.32) are just different parameterizations of
the same curve. Since one is an upper bound on ℰ(I) and the other is a lower bound, they must both be equal to ℰ(I). This completes the proof of Theorem 4.1. Plotting the entanglement-information function ℰ(I) for the non-lopsided Bell ensemble, we obtain the curve in Figure 4.5.
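That the two parameterizations trace the same curve can also be confirmed numerically; with an arbitrary c = 0.9 and p = 1/2 + cd, the two points coincide:

```python
import math

def h2(p):
    # binary Shannon entropy h(p, 1-p) in bits
    return -sum(x * math.log2(x) for x in (p, 1 - p) if x > 0)

c = 0.9
d = math.sqrt(1 - c * c)
p = 0.5 + c * d

upper = (h2(c * c), 2 - h2(0.5 - c * d))               # eq. (4.32)
lower = (h2(0.5 + math.sqrt(p * (1 - p))), 2 - h2(p))  # eq. (4.28)
print(upper, lower)
```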
Figure 4.5: The entanglement-information function for a = b = 1/√2.
Chapter 5
The Double Trine Ensemble
5.1 Introduction
Next, we consider an example in which Alice's and Bob's particles are described by an ensemble of quantum states which are not mutually orthogonal, unlike the lopsided Bell ensemble. Consider the three single-qubit trine states [2, 3],

|ψ_j⟩ = cos(2πj/3)|0⟩ + sin(2πj/3)|1⟩,   j = 0, 1, 2.   (5.1)

These three states lie in the real plane of the Bloch sphere, with each vector making an angle of 2π/3 with the other two, as shown in Figure 5.1. From these, we can construct the three double trine states, an equal mixture of which forms the double trine ensemble:

|t_j⟩ = |ψ_j⟩ ⊗ |ψ_j⟩,   j = 0, 1, 2.   (5.2)
We imagine that Alice's and Bob's particles A and B are in one of the three states |t_0⟩, |t_1⟩, and |t_2⟩ with equal probabilities, and they wish to determine which of these states their particles are in by using some entanglement to perform a procedure. Note that, since there are three equally likely states, the information that Alice and Bob gain from any outcome of their procedure must be less than or equal to log 3 ≈ 1.585 bits, which is the information they would gain if they were able to distinguish among these states perfectly. However, because the double trine states are not mutually orthogonal, there is no procedure which can distinguish them with 100% accuracy. The double trine ensemble has been used as a simple example in studies of the relationship between globally and locally accessible information because, despite the fact
that each of the double trine states is a product state, the best known measurement for discriminating among the double trine states is a nonlocal measurement.
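Under the convention assumed in (5.1), the pairwise overlaps ⟨ψ_j|ψ_{j+1}⟩ = −1/2 (and hence double trine overlaps of magnitude 1/4) are easy to verify numerically:

```python
import math

def trine(j):
    # |psi_j> = cos(2*pi*j/3)|0> + sin(2*pi*j/3)|1>  (assumed convention for eq. (5.1))
    t = 2 * math.pi * j / 3
    return (math.cos(t), math.sin(t))

def inner(u, v):
    return sum(x * y for x, y in zip(u, v))

overlaps = [inner(trine(j), trine((j + 1) % 3)) for j in range(3)]
double = [o * o for o in overlaps]  # overlaps of |t_j> = |psi_j> (x) |psi_j>
print(overlaps, double)
```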
Figure 5.1: A cross section of the Bloch sphere showing the three trine states.
5.2 Two Measurements
5.2.1 A Non-Local, Unentangled Measurement
We now present a specific nonlocal measurement 𝒢, introduced in Ref. [3], for which the specific information of each outcome is ≈ 1.369 bits. Only one other known measurement gives this much information (this other measurement is discussed in Refs. [2, 3]), and there is no known measurement which gives more. Let R be the unitary matrix given by
R = ⎛ cos(π/8)   −sin(π/8) ⎞
    ⎝ sin(π/8)    cos(π/8) ⎠   (5.3)
and let

|φ_j^+⟩ = R|ψ_j⟩,   |φ_j^−⟩ = R⁻¹|ψ_j⟩,   j = 0, 1, 2.   (5.4)

These vectors are shown in Figure 5.2. Since the entries of R are all real, the φ states lie in the real plane of the Bloch sphere. Next, we define the six two-qubit states

|B_j⟩ = |φ_j^+⟩ ⊗ |φ_j^−⟩,   |C_j⟩ = |φ_j^−⟩ ⊗ |φ_j^+⟩,   j = 0, 1, 2.   (5.5)
Finally, we define the operators

E_j = (2/3)|B_j⟩⟨B_j|,   F_j = (2/3)|C_j⟩⟨C_j|,   j = 0, 1, 2.   (5.6)
We know that these operators are positive semidefinite, because they are all proportional to pure state density matrices, and it is straightforward to check that

Σ_{j=0}^2 (E_j + F_j) = I,   (5.7)

so the collection 𝒢 = {E_0, E_1, E_2, F_0, F_1, F_2} is a valid measurement.
Figure 5.2: The trine states and the six φ states from which the measurement 𝒢 is constructed.
By symmetry, each outcome of this measurement gives the same specific information, so the procedural information of any procedure P carrying out the measurement 𝒢 is the same as the mutual information of the state and the outcome. To calculate the mutual information, first notice that the six outcomes of 𝒢 are initially equally likely to occur, so h(outcome) = log 6. Next, we note that the probability of obtaining the outcome E_0 given the state |t_0⟩ is

p(E_0|t_0) = ⟨t_0|E_0|t_0⟩
           = (2/3)⟨t_0|(|B_0⟩⟨B_0|)|t_0⟩
           = (2/3)(⟨ψ_0| ⊗ ⟨ψ_0|)(|φ_0^+⟩ ⊗ |φ_0^−⟩)(⟨φ_0^+| ⊗ ⟨φ_0^−|)(|ψ_0⟩ ⊗ |ψ_0⟩)
           = (2/3)⟨ψ_0|φ_0^+⟩⟨ψ_0|φ_0^−⟩⟨φ_0^+|ψ_0⟩⟨φ_0^−|ψ_0⟩
           = (2/3)|⟨ψ_0|φ_0^+⟩|² |⟨ψ_0|φ_0^−⟩|²,   (5.8)
and since the vectors |φ_0^+⟩ and |φ_0^−⟩ each make angles of π/4 with |ψ_0⟩ on the Bloch sphere, we have that

|⟨ψ_0|φ_0^+⟩|² = |⟨ψ_0|φ_0^−⟩|² = cos²(π/8),   (5.9)

so

p(E_0|t_0) = (2/3) cos⁴(π/8) ≈ 0.486.   (5.10)
By symmetry, p(F_0|t_0) must have the same value, and furthermore, the four remaining outcomes must all be equally likely, so the remaining ≈ 0.028 of the probability must be split equally among them. Therefore, the probabilities of the six outcomes given the state |t_0⟩ are

p(E_0|t_0) = p(F_0|t_0) = (2/3) cos⁴(π/8)
p(E_1|t_0) = p(E_2|t_0) = p(F_1|t_0) = p(F_2|t_0) = 1/4 − (1/3) cos⁴(π/8).   (5.11)
Finally, once again by symmetry, notice that the probabilities of the six outcomes given |t_1⟩ or |t_2⟩ will be exactly the same as these, except that the probabilities will be permuted among the outcomes. Therefore,

⟨h(outcome|state)⟩ = h(outcome|t_0)
  = −2((2/3) cos⁴(π/8)) log((2/3) cos⁴(π/8)) − 4(1/4 − (1/3) cos⁴(π/8)) log(1/4 − (1/3) cos⁴(π/8))
  ≈ 1.216,   (5.12)
and hence the mutual information is

I(state : outcome) = log 6 − 1.216 ≈ 1.369.   (5.13)
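The probabilities (5.11) and the resulting mutual information (5.13) can be reproduced with a short numerical sketch (variable names are ours):

```python
import math

def h(ps):
    # Shannon entropy in bits, ignoring zero probabilities
    return -sum(p * math.log2(p) for p in ps if p > 0)

q = (2 / 3) * math.cos(math.pi / 8) ** 4           # p(E0|t0) = p(F0|t0), eq. (5.10)
r = 1 / 4 - (1 / 3) * math.cos(math.pi / 8) ** 4   # each of the four remaining outcomes
probs = [q, q, r, r, r, r]
mutual = math.log2(6) - h(probs)                   # eq. (5.13)
print(round(h(probs), 3), round(mutual, 3))        # 1.216 and 1.369
```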
Intuitively, the reason why the measurement 𝒢 distinguishes among the double trine states so well is that each of the outcomes renders one of the three states very likely, therefore making the other two states very unlikely. Although none of the outcomes definitively rules out any of the states, Alice and Bob can be very confident that A and B are in the state |t_k⟩ if they obtain either E_k or F_k as the outcome of performing 𝒢.
5.2.2 A Local Measurement
We next describe a local POVM ℒ which discriminates among the double trine states with high accuracy and whose outcomes are very close to the outcomes of 𝒢, a fact which we shall take advantage of later. Let

S = −⎛  cos(π/3)   sin(π/3) ⎞
     ⎝ −sin(π/3)   cos(π/3) ⎠   (5.14)
Note that S is a rotation by 2π/3 around the y-axis of the Bloch sphere with an overall minus sign. Let

|ψ_0^⊥⟩ = |1⟩,   |κ_0⟩ = (1/√2)(|0⟩ − |1⟩),   |λ_0⟩ = (1/√2)(|0⟩ + |1⟩),   (5.15)

and define

|κ_j⟩ = S^j|κ_0⟩,   |λ_j⟩ = S^j|λ_0⟩,   |ψ_j^⊥⟩ = S^j|ψ_0^⊥⟩,   j = 0, 1, 2.   (5.16)

These states are shown in Figure 5.3. The overall minus sign in S was chosen for the sake of consistency, because it ensures that, as is the case with the ψ, φ^+, and φ^− vectors, ⟨·_j|·_{j+1}⟩ = −1/2, where the · represents ψ^⊥, κ, or λ and the addition in the subscript is taken modulo 3.
Figure 5.3: The three ψ^⊥ states and the six κ and λ states.

Next, define the six two-qubit states
|K_j⟩ = |κ_j⟩ ⊗ |ψ_j^⊥⟩,   |Λ_j⟩ = |λ_j⟩ ⊗ |ψ_j^⊥⟩,   j = 0, 1, 2.   (5.17)
Finally, we define our measurement operators to be

X_j = (2/3)|K_j⟩⟨K_j|,   Y_j = (2/3)|Λ_j⟩⟨Λ_j|,   j = 0, 1, 2.   (5.18)

As before, the X_j's and Y_j's are positive semidefinite operators because they are proportional to pure state density matrices, and

Σ_{j=0}^2 (X_j + Y_j) = I,   (5.19)

so the collection ℒ = {X_0, X_1, X_2, Y_0, Y_1, Y_2} is a valid POVM.
Once again by symmetry, each outcome of this measurement gives the same specific information, so the procedural information of any procedure carrying out ℒ is the same as the mutual information of the state and the outcome. The six outcomes of ℒ are equally likely a priori, so h(outcome) = log 6. Given that the initial state is |t_0⟩ = |ψ_0⟩ ⊗ |ψ_0⟩, the probability of obtaining the outcome X_0 is

p(X_0|t_0) = ⟨t_0|X_0|t_0⟩
           = (2/3)(⟨ψ_0| ⊗ ⟨ψ_0|)(|κ_0⟩ ⊗ |ψ_0^⊥⟩)(⟨κ_0| ⊗ ⟨ψ_0^⊥|)(|ψ_0⟩ ⊗ |ψ_0⟩)
           = (2/3)⟨ψ_0|κ_0⟩⟨ψ_0|ψ_0^⊥⟩⟨κ_0|ψ_0⟩⟨ψ_0^⊥|ψ_0⟩ = 0   (5.20)
because |ψ_0⟩ and |ψ_0^⊥⟩ are orthogonal. Similarly, p(Y_0|t_0) = 0. The probability of obtaining X_1 given |t_0⟩ is

p(X_1|t_0) = ⟨t_0|X_1|t_0⟩
           = (2/3)(⟨ψ_0| ⊗ ⟨ψ_0|)(|κ_1⟩ ⊗ |ψ_1^⊥⟩)(⟨κ_1| ⊗ ⟨ψ_1^⊥|)(|ψ_0⟩ ⊗ |ψ_0⟩)
           = (2/3)|⟨ψ_0|κ_1⟩|² |⟨ψ_0|ψ_1^⊥⟩|²
           = (2/3) cos²(5π/12) cos²(π/6),   (5.21)
and by symmetry,

p(Y_2|t_0) = (2/3) cos²(5π/12) cos²(π/6)   (5.22)
also. The remaining probability must be split evenly between the remaining two outcomes, so the probabilities of the six outcomes given |t_0⟩ are

p(X_0|t_0) = p(Y_0|t_0) = 0
p(X_1|t_0) = p(Y_2|t_0) = (2/3) cos²(5π/12) cos²(π/6)
p(Y_1|t_0) = p(X_2|t_0) = 1/2 − (2/3) cos²(5π/12) cos²(π/6).   (5.23)
As before, by symmetry, the probabilities of the six outcomes given |t_1⟩ or |t_2⟩ will be exactly the same as these, except with the probabilities permuted among the outcomes. Therefore,

⟨h(outcome|state)⟩ = h(outcome|t_0) ≈ 1.355,   (5.24)

and thus,

I(outcome : state) = log 6 − 1.355 ≈ 1.23.   (5.25)

Notice that this is slightly less than the information given by 𝒢.
Let us now turn to the question of how Alice and Bob can perform this measurement ℒ locally. First, Bob performs a nonorthogonal measurement on B with the outcomes {|ψ_0^⊥⟩, |ψ_1^⊥⟩, |ψ_2^⊥⟩}. Technically, this amounts to performing the POVM

{(2/3)|ψ_0^⊥⟩⟨ψ_0^⊥|, (2/3)|ψ_1^⊥⟩⟨ψ_1^⊥|, (2/3)|ψ_2^⊥⟩⟨ψ_2^⊥|}   (5.26)
on B. This has the effect of ruling out one of the possible states of A and B and leaving the other two equally likely, because if he obtains the outcome |ψ_k^⊥⟩, his particle could not have been in the state |ψ_k⟩ initially, and therefore the two particles could not have been in the state |t_k⟩. It is then Alice's task to distinguish between the two remaining possibilities |t_{k+1}⟩ and |t_{k+2}⟩. To do so, she will perform the orthogonal measurement with the outcomes {|κ_k⟩, |λ_k⟩}, which symmetrically straddle the states |ψ_{k+1}⟩ and |ψ_{k+2}⟩ on the Bloch sphere. The result of Bob's measurement fixes the index of the element of ℒ that they obtain from this procedure. That is, if Bob obtains the outcome |ψ_k^⊥⟩, the element of ℒ corresponding to the outcome of the entire procedure will be either X_k or Y_k. Alice's measurement then decides between these two possibilities.
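The completeness of Bob's POVM (5.26) follows because the three states |ψ_j^⊥⟩ are themselves a trine. A quick numerical sketch (using our assumed convention for the trine angles) confirms that the elements sum to the identity:

```python
import math

def psi_perp(j):
    # state orthogonal to the trine state psi_j (assumed convention)
    t = 2 * math.pi * j / 3 + math.pi / 2
    return (math.cos(t), math.sin(t))

# Sum of the POVM elements (2/3)|psi_j-perp><psi_j-perp|, eq. (5.26)
S = [[sum((2 / 3) * psi_perp(j)[r] * psi_perp(j)[c] for j in range(3))
      for c in range(2)] for r in range(2)]
print(S)  # the 2x2 identity, up to rounding
```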
In short, the measurement ℒ definitively rules out one of the three double trine states and leaves one of the remaining two fairly likely and the other fairly unlikely. However, as we calculated above, it does not discriminate among the three states quite as well as 𝒢 does.
5.3 A Unitary Transformation Procedure
We now wish to find a procedure which uses less than 1 ebit of entanglement and which yields more information than the local procedure ℒ about the state of A and B. To do so, we make precise the notion that, as we mentioned, the outcomes of ℒ are "very close" to the outcomes of 𝒢 by finding a weak unitary transformation which maps the outcomes of 𝒢 onto the outcomes of ℒ. From this, we can find an even weaker transformation which maps the outcomes of 𝒢 "part of the way" to the outcomes of ℒ. If Alice and Bob perform this weaker transformation on their qubits and then make the local measurement ℒ, they will obtain more information than if they had performed ℒ without doing the unitary transformation, because they will have effectively done a measurement that is closer to 𝒢 than ℒ is.
5.3.1 Realizing 𝒢 and ℒ as Orthogonal Measurements

The reader may have already noticed that the inner products between the outcomes of 𝒢 are not the same as the inner products between the outcomes of ℒ, which implies that there does not exist a unitary transformation on two qubits which maps the outcomes of 𝒢 onto the outcomes of ℒ. To remedy this problem, we need to realize 𝒢 and ℒ as orthogonal measurements by introducing more dimensions into the state space of A and B.
Suppose that Bob appends a qutrit C known to be in the state |0⟩ to the AB system (he is allowed to do so because preparing an auxiliary quantum system locally is allowed under LOCC). Then, he performs a unitary transformation U on his two particles B and C which permutes the BC basis states in the following way:

U|00⟩_BC = |00⟩_BC
U|10⟩_BC = |01⟩_BC.   (5.27)

Note that it does not matter what U does to the other BC basis states, because we assumed that C is in the state |0⟩. Also, notice that, once Bob has performed this transformation, the qubit B is known to be in the state |0⟩, so Bob can discard it without losing any information about the initial state of A and B. Finally, notice that U has the effect of encoding the state of B into the first two dimensions of C. That is, once Bob has performed U, the |2⟩_C component of the state of C is 0, and the |0⟩_C and |1⟩_C components are exactly the same as the |0⟩_B and |1⟩_B components of the initial state of B.
Let |Φ_j^+⟩ be a qutrit state equal to the qubit state |φ_j^+⟩ in its first two dimensions and 0 in its third dimension, and let |Φ_j^−⟩, |Ψ_j⟩, and |Ψ_j^⊥⟩ be defined similarly. Notice that the inner product of any two of the "capital" vectors is equal to the inner product of the two corresponding "lowercase" vectors, for example ⟨Φ_0^+|Ψ_0⟩ = ⟨φ_0^+|ψ_0⟩. Let M be a measurement on A and C with the outcomes {|m_j^+⟩, |m_j^−⟩}_{j=0}^2, where

|m_j^+⟩_AC = √(2/3)|φ_j^+⟩_A|Φ_j^−⟩_C + √(1/3)|ψ_j⟩_A|2⟩_C
|m_j^−⟩_AC = √(2/3)|φ_j^−⟩_A|Φ_j^+⟩_C − √(1/3)|ψ_j⟩_A|2⟩_C,   (5.28)
and let N be a measurement on A and C with outcomes {|n_j^+⟩, |n_j^−⟩}_{j=0}^2, where

|n_j^+⟩_AC = |λ_{j+1}⟩_A (√(2/3)|Ψ_{j+1}^⊥⟩ + √(1/3)|2⟩)_C
|n_j^−⟩_AC = |κ_{j+2}⟩_A (√(2/3)|Ψ_{j+2}^⊥⟩ + √(1/3)|2⟩)_C.   (5.29)
(From here on in this chapter, we'll take addition in the subscripts to be modulo 3 and we'll omit the tensor product symbol ⊗ when possible.) It is straightforward to check that the outcomes of M are mutually orthogonal (remember that the state space of A and C has six dimensions, so there can exist six mutually orthogonal vectors). For example, using the fact that ⟨Φ_0^−|Φ_0^+⟩ = ⟨φ_0^−|φ_0^+⟩, we have that

⟨m_0^+|m_0^−⟩ = (√(2/3)⟨φ_0^+|⟨Φ_0^−| + √(1/3)⟨ψ_0|⟨2|)(√(2/3)|φ_0^−⟩|Φ_0^+⟩ − √(1/3)|ψ_0⟩|2⟩)
              = (2/3)⟨φ_0^+|φ_0^−⟩⟨Φ_0^−|Φ_0^+⟩ − (1/3)⟨ψ_0|ψ_0⟩⟨2|2⟩
              = (2/3)|⟨φ_0^+|φ_0^−⟩|² − 1/3
              = (2/3) cos²(π/4) − 1/3
              = 0.   (5.30)
Similarly, the outcomes of N are mutually orthogonal. That is, M and N are orthogonal measurements. Furthermore, notice that the outcomes of N are product states, because Alice's part can be factored out in each case. For example, |n_0^+⟩ can be written in the form |λ_1⟩_A (√(2/3)|Ψ_1^⊥⟩ + √(1/3)|2⟩)_C. Finally, notice that N can be done locally. To accomplish this, Bob first makes the orthogonal measurement with outcomes {√(2/3)|Ψ_j^⊥⟩ + √(1/3)|2⟩}_{j=0}^2 and communicates his measurement result to Alice classically. If he obtains the outcome √(2/3)|Ψ_j^⊥⟩ + √(1/3)|2⟩, Alice performs the orthogonal measurement {|κ_j⟩, |λ_j⟩}.
We now claim that performing the measurement M on A and C is exactly equivalent to performing the measurement 𝒢 on the initial state of A and B. Similarly, we claim that performing N on A and C is equivalent to performing ℒ on the initial state of A and B. To show that this is true, we need to show that the measurement statistics of M on A and C are the same as the measurement statistics of 𝒢 on A and B, and similarly for N and ℒ. Rather than carrying out all of the calculations necessary to prove this in detail here, we'll show how the calculations proceed by giving a representative example.
Let |T_j⟩_AC = |ψ_j⟩_A|w_j⟩_C; that is, the |T_j⟩'s are the double trine states with the initial state of Bob's qubit B encoded into the first two dimensions of C. Note that the probability of obtaining the outcome |m₁⁺⟩ of M given the initial state |T₀⟩ is

p(m₁⁺|T₀) = |⟨m₁⁺|T₀⟩|² = (2/3)|⟨φ₁|ψ₀⟩|²|⟨φ₀|ψ₀⟩|²,   (5.31)

which is the same as the value that expression (5.8) gives for p(E₁|t₀). Notice that the crucial fact in this calculation was that the second term in |m₁⁺⟩ is orthogonal to |T₀⟩ and thus it doesn't contribute at all to the probability. Similar calculations yield that p(m_j⁺|T_k) = p(E_j|t_k), p(m_j⁻|T_k) = p(F_j|t_k), p(n_j⁺|T_k) = p(Y_{j+1}|t_k), and p(n_j⁻|T_k) = p(X_{j+2}|t_k). Therefore, the measurement statistics of M and N are the same as the measurement statistics of G and ℰ, respectively.
5.3.2 Finding the Appropriate Unitary Transformation
Since M and N each have orthogonal outcomes, there must exist a unitary transformation W which takes the outcomes of M to the outcomes of N. Let
U = (|m₀⁺⟩ |m₀⁻⟩ |m₁⁺⟩ · · · |m₂⁻⟩)   (5.32)
and
V = (|n₀⁺⟩ |n₀⁻⟩ |n₁⁺⟩ · · · |n₂⁻⟩).   (5.33)
That is, U is a matrix whose columns are the outcomes of M and V is a matrix whose columns are the corresponding outcomes of N. Then
W = VU⁻¹   (5.34)
is the desired unitary transformation, because W|m_j⁺⟩ = |n_j⁺⟩ and W|m_j⁻⟩ = |n_j⁻⟩. This transformation can be written in the form W = e^{iπH/12}, where the Hermitian matrix H is given by
the 6 × 6 matrix shown in (5.35), whose nonzero entries are all of the form ±i/√2.
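The construction (5.32)–(5.34) is easy to check numerically. The sketch below is a toy stand-in, not the actual thesis matrices: random orthonormal columns play the role of the outcomes of M and N, and a numerical eigendecomposition replaces the explicit H of (5.35). It verifies that W = VU⁻¹ is unitary, that it maps each column of U to the corresponding column of V, and that it extends to an interpolating family W(α) = e^{iαH} of the kind used below.

```python
import numpy as np

rng = np.random.default_rng(42)

def random_orthonormal_columns(n):
    # QR of a random complex matrix: the unitary factor's columns form an
    # orthonormal basis, standing in here for the measurement outcomes.
    m = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    q, _ = np.linalg.qr(m)
    return q

U = random_orthonormal_columns(6)   # columns play the role of the outcomes of M
V = random_orthonormal_columns(6)   # columns play the role of the outcomes of N
W = V @ U.conj().T                  # W = V U^{-1}; for unitary U, U^{-1} = U^dagger

# W is unitary and maps each column of U to the corresponding column of V.
assert np.allclose(W.conj().T @ W, np.eye(6))
assert np.allclose(W @ U, V)

# Diagonalizing W (it is normal) gives an interpolating family
# W(alpha) = exp(i * alpha * H) with W(pi/12) = W.
evals, Q = np.linalg.eig(W)
phases = np.angle(evals) * 12 / np.pi   # eigenphases of the generator H

def W_alpha(alpha):
    return (Q * np.exp(1j * alpha * phases)) @ np.linalg.inv(Q)

# alpha = pi/12 recovers W, and the family composes: W(a) W(b) = W(a + b).
assert np.allclose(W_alpha(np.pi / 12), W)
assert np.allclose(W_alpha(np.pi / 24) @ W_alpha(np.pi / 24), W)
```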
Now, consider a more general transformation of the form W(α) = e^{iαH}, where H is defined as above and 0 ≤ α ≤ π/12. If α < π/12, then W(α) will map the outcomes of M "part of the way" to the outcomes of N. Notice that if Alice and Bob perform the transformation W(α) on particles A and C and then make the local measurement N, the statistics will be exactly the same as if they had performed the orthogonal measurement {W(α)†|n₀⁺⟩, W(α)†|n₀⁻⟩, . . . , W(α)†|n₂⁻⟩} on the original state of A and C. That is, applying W(α) and then making the measurement N is equivalent to performing the measurement W(α)†N on A and C, because unitary transformations preserve inner products, and measurement statistics are determined entirely by inner products with the state being measured.

5.3.3 The Entanglement Cost of W(α)

Unfortunately, W(α) is a nonlocal transformation, so Alice and Bob need to spend some entanglement to perform it. To find an upper bound on this cost, we decompose W(α) into three separate transformations for which we can find bounds on the cost. Let R(θ) be the transformation (5.36), which acts on the dimensions |0⟩ and |2⟩ as a rotation generated by the Pauli y matrix and acts trivially on the dimension |1⟩ (the transformation (5.37) is defined similarly), and let

Q(β) = e^{iβ(σ_x ⊗ σ_x^{⊗2})}.   (5.38)

We claim that for all 0 ≤ α ≤ π/12,

W(α) = R(θ)Q(β)R(θ),   (5.39)

where

cos β = (1/2)(1 + cos α),   sin β sin θ = (1/2)(1 − cos α),   (5.41)

so that

β = cos⁻¹((1 + cos α)/2),   θ = sin⁻¹((1 − cos α)/(2 sin β)).   (5.40)

To obtain these conditions, set the right and left sides of (5.39) equal to each other with α, β, and θ arbitrary; comparison of the components of the two resulting matrices yields the conditions (5.41), which can be solved to give (5.40). Plugging those expressions into R(θ)Q(β)R(θ) and simplifying the resulting matrix gives (5.39).

Using the results of Berry [18] and Childs et al. [20], there is a method Alice and Bob can use to perform a transformation of the form R(γ) or Q(γ); see Appendix A for a detailed description of the method. The method requires Alice and Bob to share an amount B(γ) = 2C(γ/2) of entanglement on average, where C(γ) is the function plotted numerically in figure 4.1. Thus, the entanglement that Alice and Bob must share in order to perform W(α) is

𝒲(α) = B(β) + 2B(θ),

where β and θ are given by (5.40).

Figure 5.4: The average entanglement (in ebits) required for performing W(α).

5.3.4 Calculating the Procedural Information

In summary, the procedure P(α) is as follows: first, encode the initial state of Bob's qubit into the qutrit C in the way discussed. Then, perform the transformation W(α) by the steps described in Appendix A. Finally, perform the measurement N. However, as mentioned earlier, applying W(α) and then measuring N is identical to performing the measurement W(α)†N = {W(α)†|n_j⁺⟩, W(α)†|n_j⁻⟩}_{j=0}^{2} on the original state, so we calculate the information gained by considering the latter measurement.
Let

|ℰ_j⁺(α)⟩ = W(α)†|n_j⁺⟩,   |ℰ_j⁻(α)⟩ = W(α)†|n_j⁻⟩,   j = 0, 1, 2.   (5.42)

Initially, the system is in one of the double trine states |T_j⟩, each of which has probability 1/3 of occurring. Therefore, the a priori probability of the outcome |ℰ₁⁺(α)⟩ is

p(ℰ₁⁺(α)) = Σ_j p(T_j) p(ℰ₁⁺(α)|T_j) = (1/3) Σ_{j=0}^{2} p(ℰ₁⁺(α)|T_j),   (5.43)

and similarly for the |ℰ_j⁻(α)⟩ outcomes. One may verify that these a priori probabilities are always 1/6 (we recommend a computer calculation). Given the symmetry properties of the ensemble and of the measurements M and N, it is natural to suspect that each outcome of W(α)†N gives the same amount of specific information. This is true, and it follows straight from equation (3.9): every outcome of W(α)†N yields the same specific information, which we denote 𝒥(α), whose expression (5.44) is given in terms of the conditional probabilities p(ℰ(α)|T_j), the state probabilities 1/3, and the a priori outcome probabilities 1/6.

Evaluating 𝒥(α) at several values of α, we obtain a parametric plot of information versus entanglement cost, parameterized by the angle α, as shown in figure 5.5. We know that with one ebit of entanglement, Alice could teleport her particle to Bob, who could then perform G locally on both particles, gaining 1.369 bits of information. An interesting fact to note from figure 5.5 is that our procedure requires more than 1.5 ebits to achieve this much information, which suggests very strongly that there exists a procedure that gives substantially more information than ours for entanglement costs near 1 ebit. Note also that the curve for P(α) lies close to the diagonal dashed line. The points on this line are achievable on average: Alice flips a weighted coin; if the coin comes up heads, she teleports her qubit to Bob, who performs G locally, and if it comes up tails, she does nothing. By adjusting the weighting of the coin, any point on the line can be achieved (on average). Although we have not proven it, this suggests that our procedure is among the best procedures available if one is only going to use a small amount of entanglement. Note also that the curve in figure 5.5 is concave down, whereas the curves considered in chapter 4 are concave up.

Figure 5.5: The information 𝒥(α) versus the cost 𝒲(α) of the procedure P(α). The horizontal line is located at 𝒥 = 1.369 bits, the conjectured maximum possible information gained about B. The open circle at (1, 1.369) marks the point that would have been achieved had Alice simply teleported her qubit to Bob, who then performed G locally, instead of using our procedure.

Chapter 6

Conclusion

6.1 Summary

In this thesis, we investigated the entanglement–information tradeoff for nonlocal measurements. For the lopsided Bell ensemble, we found the procedures P_U and the "bad teleportation" procedure P_T. In the special nonlopsided case, we showed that P_T is optimal by finding a lower bound on the entanglement cost and showing that the two agree. For the double trine ensemble, we found the family of procedures P(α) described in this chapter.

6.2 Ideas for Future Research

6.2.1 𝒢(ℐ) for Other Ensembles

We would like to be able to apply the methods of section 4.4 to ensembles other than the lopsided Bell ensemble.

Finally, let
F be the 0–1 matrix given in (A.16), let g be the 2 × 2 matrix given in (A.17), and let G = I_A ⊗ g ⊗ I_X ⊗ I_Y. It is straightforward (if messy) to compute that for any vector |η⟩ of the form

|η⟩ = (a₁, a₂, a₃, a₄, a₅, a₆)ᵀ ⊗ (1, 0)ᵀ ⊗ (1, 0)ᵀ,   (A.18)

we have that

G†F†U₃†U₂†V_whole(θ/2)U₂V_whole(θ/2)U₃FG|η⟩ = t(θ)|η⟩.   (A.19)
Since X and Y are both known to be in the state |0⟩, we know that the state of ACXY is of the form |η⟩. Notice that U₂, U₃, F, and G can all be written as a product of a transformation on Alice's system A with a transformation on Bob's system CXY. Therefore, all of these transformations can be performed locally; that is, they require no entanglement. Also, notice that the right-hand side of (A.19) is the desired transformation applied to a vector of the form we care about, and the left-hand side consists only of local unitary transformations and two transformations of the form V₂(θ/2). Since we know that Alice and Bob can perform transformations of the form V₂(γ) by the procedure described in the previous section, we therefore know that Alice and Bob can perform the transformation t(θ) on A, C, X, and Y, which amounts to performing the transformation T(θ) on A and C. The cost of performing t(θ) on A, C, X, and Y is therefore B(θ) = 2C(θ/2), as we claimed in section 5.3.3, where C(θ) is the entanglement that Alice and Bob must share to perform V₂(θ) and is plotted in figure 4.1.
To arrive at the decomposition (A.19), we used a prescription described in Ref. [20]. In Lemma 2 of this paper, Childs, Leung, and Vidal describe a method for reversibly simulating any tensor product Hamiltonian H_P with the Ising Hamiltonian H_Ising = σ_z ⊗ σ_z. Although they use the language of time evolution under Hamiltonians instead of the language of unitary transformations and angles, these two concepts are mathematically equivalent, so their results apply to our case.
In particular, Childs et al. describe a procedure for simulating the Hamiltonian H_P in two steps, namely

H_Ising → H_II → H_P,   (A.20)

where H_II is an intermediate Hamiltonian defined in equation (30) of Ref. [20], and the arrows can be read as "simulates." To apply their results to our case, we take H_P = σ_z ⊗ σ_z^{⊗2}. In the decomposition (A.19), the transformations U₂ and U₃ are responsible for the first simulation in (A.20), while the transformations F and G are responsible for the second simulation.
Appendix B
Proof of Lemma 4.2
We now prove Lemma 4.2, which we restate here for convenience.
Lemma 4.2. Let 1 ≤ I ≤ 2. Up to reordering of the second, third, and fourth components, the probability vector p = (p₁, p₂, p₃, p₄) with p₁ ≥ p₂, p₃, p₄ and h(p) ≤ 2 − I which minimizes the entanglement of formation (4.24) of the corresponding state is p = (p, 1 − p, 0, 0), where h(p, 1 − p) = 2 − I.
To prove this, we first define a convenient representation of the set of all probability vectors p = (p₁, p₂, p₃, p₄) with each pᵢ ≥ 0 and Σᵢ pᵢ = 1. Let t₁, t₂, t₃, t₄ ∈ ℝ³ be the vertices of a regular solid tetrahedron T, and let fᵢ be the face of T opposite the vertex tᵢ. For simplicity, we may choose the tᵢ's so that the distance from tᵢ to fᵢ is 1 for each i. We associate the probability vector p = (p₁, p₂, p₃, p₄) to the convex combination p₁t₁ + p₂t₂ + p₃t₃ + p₄t₄, which is contained inside T. Notice that any point in T can be represented as such a convex combination, so this association puts probability vectors of the form p into one-to-one correspondence with points in T. Also, notice that the vector (p₁, p₂, p₃, p₄) will be associated to the point a distance p₁ from f₁, p₂ from f₂, and so on.
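This barycentric correspondence is easy to verify numerically. The following sketch (one concrete choice of vertices; the helper names are ours) scales a regular tetrahedron so that the vertex-to-opposite-face height is 1 and checks that the point associated to p lies at distance pᵢ from the face fᵢ.

```python
import numpy as np

# Vertices of a regular tetrahedron centered at the origin, scaled so that
# the distance from each vertex t_i to the opposite face f_i is 1.
T = np.array([[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]], dtype=float)
T *= np.sqrt(3) / 4   # now |t_i| = 3/4 and each face plane sits at -1/4

def embed(p):
    # Barycentric embedding p -> p1 t1 + p2 t2 + p3 t3 + p4 t4
    return np.asarray(p) @ T

def dist_to_face(x, i):
    # Face f_i lies in the plane {x : x . t_i/|t_i| = -1/4}
    unit = T[i] / np.linalg.norm(T[i])
    return x @ unit + 0.25

p = np.array([0.4, 0.3, 0.2, 0.1])
x = embed(p)
for i in range(4):
    # the embedded point sits at distance p_i from face f_i
    assert np.isclose(dist_to_face(x, i), p[i])
```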
We now need an intermediate lemma.
Lemma B.1. Let p = (p₁, p₂, p₃, p₄) be a probability vector such that p₁ ≥ p₂, p₃, p₄. If h(p) ≤ 1, then p₁ ≥ 1/2.
Proof. We show the contrapositive. Fix 1/4 ≤ p₁ < 1/2 (we must have that p₁ ≥ 1/4, because otherwise we could not have p₁ ≥ p₂, p₃, p₄). Let P be a plane parallel to the face f₁ and a height p₁ above it. Then the set of all probability vectors (p₁, p₂, p₃, p₄) with this first component is exactly T ∩ P, which is a triangular piece of the plane P. The set of all vectors (p₁, p₂, p₃, p₄) with p₁ ≥ p₂, p₃, p₄ is therefore a subset of T ∩ P.
Figure B.1: The tetrahedron T.
We need to consider two cases:
(1) 1/3 < p₁ < 1/2,
(2) 1/4 ≤ p₁ ≤ 1/3.

In case (1), the set of points in T ∩ P for which one of p₂, p₃, and p₄ is equal to p₁ bounds a region H containing all the points with p₁ ≥ p₂, p₃, p₄, and the vertices {vᵢ}ᵢ₌₁⁶ of H represent, up to reordering of the last three components, probability vectors of the form (p₁, p₁, 1 − 2p₁, 0). Suppose that the global minimum of the Shannon entropy h on H occurs at a point p which is not a vertex of H. Writing p = Σᵢ aᵢvᵢ as a convex combination of the vertices, the concavity of the Shannon entropy gives

h(p) = h(Σᵢ₌₁⁶ aᵢvᵢ) > Σᵢ₌₁⁶ aᵢ h(vᵢ) = h(v₁),   (B.1)

where the last equality follows because h(vᵢ) is the same for each i and Σᵢ aᵢ = 1, and the inequality is strict because more than one aᵢ is nonzero. But this contradicts our assumption that the global minimum occurs at p. Hence, the global minimum of the Shannon entropy on H occurs at the vertices vᵢ.
The Shannon Entropy at the vertices is given by
h(vᵢ) = h(p₁, p₁, 1 − 2p₁, 0)
= h(2p₁, 1 − 2p₁) + 2p₁ h(1/2, 1/2) + (1 − 2p₁) h(1, 0)
= h(2p₁, 1 − 2p₁) + 2p₁.   (B.2)
Consider the function f(x) = h(x, 1 − x) + x. Note that f(x) is concave in x. Also, note that f(2/3) = h(2/3, 1/3) + 2/3 ≈ 1.585 and f(1) = 1. Thus, since f is concave, we have that f(x) > 1 for any 2/3 < x < 1. Therefore, for any 1/3 < p₁ < 1/2, we have that h(vᵢ) = h(2p₁, 1 − 2p₁) + 2p₁ = f(2p₁) > 1, and since the global minimum of h on H occurs at the vᵢ, we have that h(p) > 1 for any p ∈ H, which is what we wanted to show.
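The two numerical facts about f used above can be checked directly; the sketch below (helper names ours) also confirms that f stays above 1 on the whole open interval (2/3, 1).

```python
import numpy as np

def h(*ps):
    # Shannon entropy in bits, with the convention 0 log 0 = 0
    ps = np.array([p for p in ps if p > 0])
    return float(-(ps * np.log2(ps)).sum())

def f(x):
    return h(x, 1 - x) + x

assert abs(f(2/3) - 1.585) < 1e-3   # f(2/3) = h(2/3, 1/3) + 2/3
assert abs(f(1.0) - 1.0) < 1e-12    # f(1) = 1
# Concavity then forces f > 1 strictly between 2/3 and 1:
xs = np.linspace(2/3, 1, 1001)[1:-1]
assert all(f(x) > 1 for x in xs)
```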
In case (2), the set of points for which one of p₂, p₃, and p₄ is equal to p₁ consists of three line segments which intersect each other as is shown in Figure B.3. In this case,
Figure B.3: A top-down view of T ∩ P in case (2). As before, the dashed lines represent the points for which one of p₂, p₃, and p₄ is equal to p₁.
the condition that p₁ ≥ p₂, p₃, p₄ restricts us to considering only those points in the small inner triangle R formed by the three dashed lines. Let {wᵢ}ᵢ₌₁³ be the vertices of R, and note that, up to reordering of the second, third, and fourth components, these vertices represent probability vectors of the form (p₁, p₁, p₁, 1 − 3p₁). From here, the argument is essentially identical to the one used in case (1). □
Now, to prove Lemma 4.2, we need to minimize the entanglement of formation C(p₁) over all probability vectors p = (p₁, p₂, p₃, p₄) with p₁ ≥ p₂, p₃, p₄ satisfying h(p) ≤ 2 − I. Since 1 ≤ I ≤ 2, we know by Lemma B.1 that p₁ ≥ 1/2. Notice that the function C(p₁) (defined in equation (4.24)) increases monotonically as p₁ runs from 1/2 to 1. Therefore, we need to minimize p₁ over all probability vectors p = (p₁, p₂, p₃, p₄) with p₁ ≥ p₂, p₃, p₄ satisfying h(p) ≤ 2 − I. Let P₁ = (p, 1 − p, 0, 0), where p is the unique number between 1/2 and 1 such that h(p, 1 − p) = 2 − I, and let P₂ and P₃ be the two points equivalent to P₁ up to reordering of the second, third, and fourth components. These points each lie on one of the three edges of T which meet at t₁, and since their first coordinates are the same, they are the same height above the face f₁, as is shown in Figure B.4. By the concavity of the Shannon entropy, any point p ≠ P₁, P₂, P₃ with h(p) = 2 − I must lie above P₁, P₂, and P₃ and therefore must have a greater value of p₁ than the Pᵢ's. Also, note that the set of all p with p₁ ≥ p₂, p₃, p₄ and h(p) < 2 − I is exactly the set of all points above the surface h(p) = 2 − I. Therefore, P₁, P₂, and P₃ have the smallest value of p₁ of all points p with p₁ ≥ p₂, p₃, p₄ and h(p) ≤ 2 − I. Since the Pᵢ's are of the form
Figure B.4: The points PI, P2, and P3 on the tetrahedron T.
(p, 1 − p, 0, 0) up to reordering of the last three components, this completes the proof of Lemma 4.2.
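The statement of Lemma 4.2 also lends itself to a quick numerical sanity check. The sketch below (an illustrative value I = 1.5; helper names ours) computes p by bisection and then searches randomly for a vector with p₁ ≥ p₂, p₃, p₄ and h(p) ≤ 2 − I whose first component is smaller than p, finding none.

```python
import numpy as np

rng = np.random.default_rng(0)

def h(ps):
    # Shannon entropy in bits, ignoring zero components
    ps = ps[ps > 1e-15]
    return float(-(ps * np.log2(ps)).sum())

I = 1.5            # illustrative amount of information, 1 <= I <= 2
budget = 2 - I     # entropy constraint h(p) <= 2 - I

# Find p_star in [1/2, 1] with h(p_star, 1 - p_star) = 2 - I by bisection
# (h(p, 1 - p) decreases as p runs from 1/2 to 1).
lo, hi = 0.5, 1.0
for _ in range(80):
    mid = (lo + hi) / 2
    lo, hi = (mid, hi) if h(np.array([mid, 1 - mid])) > budget else (lo, mid)
p_star = (lo + hi) / 2

# Random search for a counterexample: a p with p1 >= p2, p3, p4 and
# h(p) <= 2 - I but p1 < p_star. Lemma 4.2 says none should exist.
for _ in range(20000):
    p = np.sort(rng.dirichlet([0.05] * 4))[::-1]   # sorted so p1 is largest
    if h(p) <= budget:
        assert p[0] >= p_star - 1e-9
```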
Bibliography
[1] Charles H. Bennett, David P. DiVincenzo, Christopher A. Fuchs, Tal Mor, Eric Rains, Peter W. Shor, John A. Smolin, and William K. Wootters. Quantum nonlocality without entanglement. Phys. Rev. A, 59(2):1070–1091, Feb 1999.
[2] Asher Peres and William K. Wootters. Optimal detection of quantum information. Phys. Rev. Lett., 66(9):1119–1122, Mar 1991.
[3] William K. Wootters. Distinguishing unentangled states with an unentangled measurement. International Journal of Quantum Information, 4:219–232, February 2006.
[4] A. Einstein, B. Podolsky, and N. Rosen. Can quantum-mechanical description of physical reality be considered complete? Phys. Rev., 47(10):777–780, May 1935.
[5] John S. Bell. On the Einstein Podolsky Rosen paradox. Physics, 1(3):195–200, February 1964.
[6] Charles H. Bennett, Herbert J. Bernstein, Sandu Popescu, and Benjamin Schumacher. Concentrating partial entanglement by local operations. Phys. Rev. A, 53(4):2046–2052, Apr 1996.
[7] Charles H. Bennett, David P. DiVincenzo, John A. Smolin, and William K. Wootters. Mixed-state entanglement and quantum error correction. Phys. Rev. A, 54(5):3824–3851, Nov 1996.
[8] Charles H. Bennett, Gilles Brassard, Sandu Popescu, Benjamin Schumacher, John A. Smolin, and William K. Wootters. Purification of noisy entanglement and faithful teleportation via noisy channels. Phys. Rev. Lett., 76(5):722–725, Jan 1996.
[9] William K. Wootters. Entanglement of formation of an arbitrary state of two qubits. Phys. Rev. Lett., 80(10):2245–2248, Mar 1998.
[10] Shelby Kimmel. Quantifying the entanglement cost of quantum measurements. Williams College Senior Honors Thesis, May 2008.
[11] Somshubhro Bandyopadhyay, Gilles Brassard, Shelby Kimmel, and William K. Wootters. Entanglement cost of nonlocal measurements. arXiv:0809.2264v4 [quant-ph], March 2009.
[12] Jonathan Walgate, Anthony J. Short, Lucien Hardy, and Vlatko Vedral. Local distinguishability of multipartite orthogonal quantum states. Phys. Rev. Lett., 85(23):4972–4975, Dec 2000.
[13] Sibasish Ghosh, Guruprasad Kar, Anirban Roy, Aditi Sen(De), and Ujjwal Sen. Distinguishability of Bell states. Phys. Rev. Lett., 87(27):277902, Dec 2001.
[14] Barbara M. Terhal, David P. DiVincenzo, and Debbie W. Leung. Hiding bits in Bell states. Phys. Rev. Lett., 86(25):5807–5810, Jun 2001.
[15] D. P. DiVincenzo, D. W. Leung, and B. M. Terhal. Quantum data hiding. IEEE Trans. Inf. Theory, 48:580–598, 2002.
[16] Michael A. Nielsen and Isaac L. Chuang. Quantum Computation and Quantum Information. Cambridge University Press, 2000.
[17] Charles H. Bennett, Gilles Brassard, Claude Crépeau, Richard Jozsa, Asher Peres, and William K. Wootters. Teleporting an unknown quantum state via dual classical and Einstein-Podolsky-Rosen channels. Phys. Rev. Lett., 70(13):1895–1899, Mar 1993.
[18] Dominic W. Berry. Implementation of multipartite unitary operations with limited resources. Phys. Rev. A, 75(3):032349, 2007.
[19] L. B. Levitin. Optimal quantum measurements for two pure and mixed states. In V. P. Belavkin, O. Hirota, and R. L. Hudson, editors, Quantum Communications and Measurement, pages 439–448. Plenum Press, 1995.
[20] Andrew M. Childs, Debbie W. Leung, and Guifré Vidal. Reversible simulation of bipartite product Hamiltonians. IEEE Trans. Inf. Theory, 50(6):1189–1197, 2004.