Gleason's theorem
Helena Granström
August 31, 2006
Abstract
Gleason's theorem is a central result in mathematical physics. From it one can derive the
standard method of calculating quantum probabilities: taking the trace of the product
of (the matrix representation of) the relevant projection operator and the so-called
density matrix.
In this diploma work, a thorough presentation is first given of Gleason's original argument.
We then look at a Gleason-type theorem for so-called POVMs, relaxing the
assumptions of a proof published in 2003 while reaching the same result. Thereafter, a
Gleason-type theorem is proved for two restricted classes of POVMs.
The Kochen-Specker theorem (KS), implied by the Gleason result, is then presented. The
theorem (which is easily translated into terms of colourings) holds interesting implications
for the formulation of so-called hidden variables theories.
Here, a particular (incomplete) KS colouring is explored in some depth. We investigate
how effective the colouring is in three and higher dimensions, using two different
measures of effectiveness. Finally, we show that a restricted class of POVMs does not enforce the KS
result.
Summary
Gleason's theorem is a central result in mathematical physics. From it follows
the standard way of calculating probabilities in quantum mechanics, by taking the trace of
the product of (the matrix representation of) a suitably chosen projection operator and the
so-called density matrix.
This diploma work begins with a careful walk-through of Gleason's original argument.
We present a Gleason-type theorem for so-called POVMs that was proved in 2003, and
show that the same result can be reached under weaker assumptions than those made in the
original proof. We then prove a result analogous to Gleason's for two restricted classes of
POVMs.
We then go on to present the Kochen-Specker theorem (KS), which follows from
Gleason's result. This theorem (which is easily translated into terms of colourings) has
interesting implications for theories with so-called hidden variables.
Here we explore a particular (incomplete) KS colouring more closely. We investigate how effective
this colouring is in dimension three and higher, using two different measures. Finally, we show
that a restricted class of POVMs does not enforce the Kochen-Specker result.
Acknowledgements
THANKS
Asa, Hans, Mattias, Mathias, Andreas, Patrik, Emil for handling crises of the LaTeX,
integral and colouring variety. Those who should be thanked for handling all other types
of crises know who they are. Thanks are also due to my supervisor Ingemar Bengtsson
who, in addition to being a devoted and respectful teacher, falls into both of the above
categories.
Und es wurde fertig, das Leidenswerk. Es wurde vielleicht nicht gut, aber
es wurde fertig. Und als es fertig war, siehe, da war es auch gut.
Thomas Mann
Contents
1 Gleason explained
1.1 The proof of Gleason's theorem
2 A Gleason-type theorem for POVMs
2.1 Linearity with respect to the non-negative rationals
2.2 Continuity
2.3 Linearity and the inner product
3 Gleason's theorem for informationally complete POVMs
3.1 The quantum probability rule for a first class of POVMs
3.2 The quantum probability rule for a second class of POVMs
3.3 Summary
4 The Kochen-Specker theorem
4.1 The Kochen-Specker theorem
4.2 A KS colouring in arbitrary dimensions
4.3 KS coloured bases
4.4 KS for a restricted class of SIC-POVMs
5 Conclusions and open questions
Chapter 1
Gleason explained
Gleason's theorem, formulated and proved by Andrew M. Gleason in 1957, is a statement
about measures on Hilbert spaces of dimension at least three. The theorem states
that the only possible probability measures on such spaces are measures of the form
$\mu(a) = \mathrm{Tr}(\rho P_a)$, where $\rho$ is a positive semi-definite self-adjoint operator of unit trace, and
where $P_a$ is the projection operator onto the subspace $a$.
Postulating that any orthogonal basis in some Hilbert space corresponds to a measurement
and that quantum systems can be represented by such spaces, we can understand
the projection operators as representing yes-no observables $a$¹, with commuting projectors
corresponding to yes-no questions that can be simultaneously answered (or asked). Any
(measurable) property $a$ of the system is then uniquely associated with a subspace (which
could be one-dimensional, i.e. a vector) of the system's Hilbert space. Within this framework,
Gleason's statement is one about the probability of obtaining a given outcome when
making a measurement on a quantum system. The theorem is of profound importance to
modern physics due to its strong implications for how probabilities can be introduced into
quantum mechanics. Put another way, it is a statement about the validity and uniqueness
of the quantum probability rule.
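To make the probability rule concrete, the following is a minimal sketch in Python of the rule $\mu(a) = \mathrm{Tr}(\rho P_a)$ for a three-level system. The density matrix `rho`, the orthonormal basis `basis` and the helper names are illustrative assumptions, not taken from the text; the sketch only shows that the resulting numbers are non-negative and sum to one over an orthonormal basis.

```python
# Sketch: quantum probabilities as Tr(rho P_a) for a 3-level system.
# rho, basis and all helper names are illustrative assumptions.

def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def projector(v):
    """Projection operator |v><v| onto the line spanned by the unit vector v."""
    return [[vi * vj.conjugate() for vj in v] for vi in v]

def probability(rho, v):
    """mu(a) = Tr(rho P_a) for the one-dimensional subspace a spanned by v."""
    return trace(matmul(rho, projector(v))).real

# A density matrix: positive semi-definite, self-adjoint, unit trace.
rho = [[0.5, 0.25, 0.0],
       [0.25, 0.3, 0.0],
       [0.0, 0.0, 0.2]]

s = 2 ** -0.5
basis = [[s, s, 0.0], [s, -s, 0.0], [0.0, 0.0, 1.0]]  # an orthonormal basis

probs = [probability(rho, [complex(c) for c in v]) for v in basis]
print(probs, sum(probs))
```

Each vector of the basis receives a probability that depends only on the vector itself, not on the basis it is listed in, which is the non-contextuality built into the rule.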
Gleason's original proof [1] has over the years acquired a reputation for being impenetrable.
While the proof is rather compactly written, it is by no means
impossible to understand. In the following section we will go through Gleason's argument
step by step, trying to make explicit the implicit statements made in the original. The
steps will be more or less identical to those made by Gleason himself, but some things will
be expanded upon, and the structure of the proof will hopefully be made more visible.
One aspect of the theorem, which we will return to in our discussion of the Kochen-Specker
theorem in chapter 4, is that it contains a non-contextuality assumption; that is,
the assumption that the value² assigned to a vector $v$ is independent of which basis we
consider this vector to be a part of. As Gleason's theorem shows, this is a strong assumption,
and it has turned out to be of non-trivial importance in relation to attempts to reduce
the indeterminate-probabilistic nature of quantum mechanics by means of hidden variables
theories.
¹ I.e. answers to questions like "Is the spin up in the z direction?"
² In effect, probability.
1.1 The proof of Gleason's theorem
We will start out with some definitions. First, let us remind ourselves of the definition of
a Hilbert space.
Definition 1.1.1 Any real or complex vector³ space $\mathcal{H}$ with an inner product $\langle \cdot | \cdot \rangle$ can be
ascribed a norm $| \cdot |$ according to
$$| x | = \sqrt{\langle x | x \rangle}.$$
If $\mathcal{H}$ is complete⁴ under this norm, $\mathcal{H}$ is called a Hilbert space.
We will also give some more specific definitions.
Definition 1.1.2 A frame function of weight $W$ for a separable⁵ Hilbert space $\mathcal{H}$ is a
real-valued function $f$ defined on the unit sphere of $\mathcal{H}$ such that for any orthonormal basis
$\{x_i\}$ of $\mathcal{H}$,
$$\sum_i f(x_i) = W$$ (1.1)
Definition 1.1.3 A frame function $f$ is regular if and only if there exists a self-adjoint
operator $\rho$ defined on $\mathcal{H}$ such that
$$f(x) = \langle x | \rho | x \rangle$$
for all unit vectors $x$.
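As an illustration of Definitions 1.1.2 and 1.1.3, the sketch below checks numerically that a function of the form $f(x) = \langle x | T | x \rangle$ on the unit sphere of $\mathbb{R}^3$ sums to the same value, the trace of $T$, over several randomly generated orthonormal bases. The matrix `T`, the random seed and the helper names are assumptions made for the example only.

```python
import math
import random

def random_orthonormal_basis(rng):
    """Gram-Schmidt applied to random vectors in R^3 (illustrative helper)."""
    basis = []
    while len(basis) < 3:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        for b in basis:
            dot = sum(vi * bi for vi, bi in zip(v, b))
            v = [vi - dot * bi for vi, bi in zip(v, b)]
        norm = math.sqrt(sum(vi * vi for vi in v))
        if norm > 1e-6:  # skip (unlikely) nearly dependent draws
            basis.append([vi / norm for vi in v])
    return basis

def frame_function(T, x):
    """f(x) = <x|T|x> for a symmetric matrix T and a unit vector x."""
    return sum(x[i] * T[i][j] * x[j] for i in range(3) for j in range(3))

# An arbitrary symmetric matrix; its trace plays the role of the weight W.
T = [[2.0, 0.5, -1.0],
     [0.5, 1.0, 0.3],
     [-1.0, 0.3, -0.5]]
weight = sum(T[i][i] for i in range(3))

rng = random.Random(0)
sums = [sum(frame_function(T, x) for x in random_orthonormal_basis(rng))
        for _ in range(5)]
print(weight, sums)
```

The check only confirms the easy direction (regular functions are frame functions); the content of Gleason's theorem is the converse for non-negative frame functions.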
On the space of operators, itself a Hilbert space, we will use the inner product
$$\langle A | B \rangle = \mathrm{Tr}(A^\dagger B)$$ (1.2)
which will also define the norm
$$| A | = \sqrt{\langle A | A \rangle}$$ (1.3)
On the space of vectors, the inner product will be the regular scalar product, and bracket
notation will be used. No difference in notation will be made between these two operations,
since it should be clear from the context which is referred to.
³ Throughout this section the words point and vector will be used interchangeably, so that the statement
that two points are orthogonal amounts to saying that the respective position vectors specifying the two
points are orthogonal.
⁴ A metric space $M$ is said to be complete if every Cauchy sequence of points in $M$ has a limit that is
also in $M$.
⁵ Separable in this context means that some countable subset of the space is dense in it. This means
that the space has some countable subset with which all its elements can be approached, in the sense of a
mathematical limit. An example of this is how any real number can be approximated to arbitrarily high
accuracy by rational numbers.
The crux of Gleason's proof is the realization that proving the theorem for a Hilbert space
of arbitrary dimension $n \geq 3$ is accomplished by proving it for every two-dimensional
subspace of a three-dimensional Hilbert space. This insight is indeed non-trivial.
Proving Gleason's theorem in three or higher dimensions is in effect equivalent to showing
that any non-negative frame function defined on a real or complex Hilbert space $\mathcal{H}$
of dimension at least three is regular. Gleason arrives at this conclusion by considering
subspaces of $\mathcal{H}$ of dimension two, and embedding them in a three-dimensional subspace
(which can be done, because $\dim \mathcal{H} \geq 3$). Thereafter he makes use of the fact that $f$ is
regular on any such two-dimensional subspace, and that this means that $f$ is regular on the
whole of its domain of definition. (Reaching these results, however, requires some work.)
Let's start out by investigating some properties of frame functions in $\mathcal{H}_3$, and
two-dimensional subspaces thereof.
Lemma 1.1.1 In a finite-dimensional real Hilbert space a frame function is regular if and
only if it is the restriction to the unit sphere of a quadratic form.
Proof: This is clear from the definition of regularity; the restriction to the unit sphere
accounts for the fact that the definition of regularity only involves unit vectors $x$.
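Lemma 1.1.2 A function $f$ on the unit circle in $\mathbb{R}^2$, $f = \cos n\phi$, where $\phi$ is defined as the
angle relative to the positive x-axis, is a frame function if and only if $n = 0$ or $n \equiv 2 \pmod 4$.
Proof: Assume that $f$ has weight $W$. Since $f$ is a frame function and unit vectors in the
directions $\phi$ and $\phi + \frac{\pi}{2}$ constitute an orthonormal basis in two dimensions, we must have
that
$$\cos n\phi + \cos n\left(\phi + \frac{\pi}{2}\right) = \cos n\phi + \cos n\phi \cos\frac{n\pi}{2} - \sin n\phi \sin\frac{n\pi}{2} = \left(1 + \cos\frac{n\pi}{2}\right)\cos n\phi - \sin\frac{n\pi}{2}\sin n\phi = W$$
for all $\phi$, $0 \leq \phi < 2\pi$.
This is true if and only if either $n = 0$ or
$$1 + \cos\frac{n\pi}{2} = 0 \;\Leftrightarrow\; \cos\frac{n\pi}{2} = -1 \;\Leftrightarrow\; \frac{n\pi}{2} = \pi + 2\pi k \;\Leftrightarrow\; n = 2 + 4k \;\Leftrightarrow\; n \equiv 2 \pmod 4,$$
in which case $\sin\frac{n\pi}{2} = 0$ as well, so that the sum is constant (with $W = 0$).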
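The condition of lemma 1.1.2 is easy to probe numerically. The sketch below, with hypothetical function and parameter names, samples the sum $\cos n\phi + \cos n(\phi + \pi/2)$ on a grid of angles and reports the values of $n$ for which it is constant to within a tolerance; it is a sanity check under these sampling assumptions, not a proof.

```python
import math

def is_frame_function_on_circle(n, samples=360, tol=1e-9):
    """Check whether cos(n*phi) + cos(n*(phi + pi/2)) is independent of phi."""
    values = [math.cos(n * phi) + math.cos(n * (phi + math.pi / 2))
              for phi in (2 * math.pi * k / samples for k in range(samples))]
    return max(values) - min(values) < tol

# Values of n between 0 and 12 for which the frame condition holds.
passing = [n for n in range(0, 13) if is_frame_function_on_circle(n)]
print(passing)  # [0, 2, 6, 10]
```

The surviving values are exactly $n = 0$ and $n \equiv 2 \pmod 4$ in the tested range.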
Theorem 1.1.1 Every continuous frame function on the unit sphere in $\mathbb{R}^3$ is regular.
Proof: Let $C$ denote the space of continuous functions on the unit sphere $S$ in $\mathbb{R}^3$ with
norm given by the standard inner product on $\mathbb{R}^3$ as defined above. The rotation group $G$
in $\mathbb{R}^3$ is represented by a group of linear operators acting on $C$ if we define
$$U_\sigma h = h \circ \sigma^{-1}, \quad \sigma \in G, \; h \in C$$ (1.4)
Applying the rotation $\sigma$ to $h$ is of course the same as applying the inverse rotation $\sigma^{-1}$ to
the argument of $h$ before $h$ acts on it.
Let $Q_l$ denote the space of spherical harmonics $Y_{lm}$ of degree $l$. As we know, the spherical
harmonics are solutions of the Laplace equation in $\mathbb{R}^3$. These $Q_l$ are irreducible,
rotationally invariant subspaces of $C$; in fact they are the only such subspaces. Let $F$ be
the subspace of $C$ consisting of continuous frame functions on $S$. From the definition of
a frame function we can deduce that $F$ is a closed subspace of $C$, a subspace which is
invariant under rotations.
We know that every continuous function on the unit sphere can be expressed as a sum
of spherical harmonics $Y_{lm}$, and we would like to find out which harmonics contribute to
the sums for the elements of $F$. In order to do this, we note the following properties of
the spherical harmonics. The $\theta$ and $\phi$ (or, in Cartesian coordinates, $(\hat n)_z$ and $(\hat n)_x + i(\hat n)_y$)
dependencies can be separated, according to
$$Y_{lm}(\hat n) = Y_{lm}(\theta, \phi) = \sqrt{\frac{(2l+1)}{4\pi}\frac{(l-m)!}{(l+m)!}}\; P_l^m(\cos\theta)\, e^{im\phi}$$ (1.5)
From this it follows how the spherical harmonics transform under reflection, parity and
conjugation:
$$Y_{lm}(\hat n) = (-1)^{l+m}\, Y_{lm}(\pi - \theta, \phi) = (-1)^{m}\, Y_{lm}(\theta, \phi + \pi) = (-1)^{l}\, Y_{lm}(-\hat n) = (-1)^{m}\, Y^{*}_{l,-m}(\hat n)$$ (1.6)
$Q_0$ is the space of constant functions, and since every constant function is a frame function,
we see that $Q_0 \subset F$, so that in the expansion of a frame function $f$ on the sphere, $Y_{00}$ can
contribute. $Q_1$ consists of linear functions on $\mathbb{R}^3$ restricted to $S$. These functions change
sign under parity, as can be seen from equation (1.6), which we know is not the case for
frame functions: if $\{x_i\}$ is an orthonormal basis, so of course is $\{-x_i\}$, and hence we know
them to sum to the same constant, with no change of sign. Hence $Q_1 \not\subset F$, and no $Y_{1m}$
occur in the parametrization of $f$. $Q_2$, on the other hand, contains the restrictions to $S$ of
quadratic forms of zero trace⁶. Quadratic forms of zero trace on $S$ are frame functions of
weight 0 on $S$, meaning that $Q_2 \subset F$.
⁶ The reader can convince herself that this is likely, considering that the space $Q_2$ has five linearly
independent components, namely those of different $m$, with $m \in [-2, 2]$, which is just equal to the number
of independent components of a symmetric traceless $3 \times 3$ matrix.
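To check if any $Q_l$ with $l > 2$ is a subset of $F$ we proceed as follows. As noted above, the
spherical harmonics are solutions to the Laplace equation in $\mathbb{R}^3$. In cylindrical coordinates
$(r, \phi, z)$, this equation reads
$$\frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial \psi}{\partial r}\right) + \frac{1}{r^2}\frac{\partial^2 \psi}{\partial \phi^2} + \frac{\partial^2 \psi}{\partial z^2} = 0$$ (1.7)
By insertion, it is easily verified that
$$\psi_1 = r^l \cos l\phi \quad \text{and} \quad \psi_2 = \left(r^2 - 2(l-1)z^2\right) r^{l-2} \cos (l-2)\phi$$
are both solutions of this equation of order $l$, and as such are elements of $Q_l$ and can be
written as linear combinations of the $Y_{lm}$.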
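The claim that both functions solve the Laplace equation can be checked numerically. The sketch below is an illustration under stated assumptions: it rewrites $r^k \cos k\phi$ as the real part of $(x + iy)^k$, evaluates the two functions in Cartesian coordinates, and approximates the Laplacian by central differences at one arbitrarily chosen point. The function names, the test point and the step size `h` are all illustrative choices.

```python
def psi1(x, y, z, l):
    """r^l cos(l*phi), written as Re[(x + iy)^l]."""
    return (complex(x, y) ** l).real

def psi2(x, y, z, l):
    """(r^2 - 2(l-1) z^2) r^(l-2) cos((l-2)*phi)."""
    return (x * x + y * y - 2 * (l - 1) * z * z) * (complex(x, y) ** (l - 2)).real

def laplacian(f, x, y, z, l, h=1e-3):
    """Second-order central-difference approximation of the Laplacian of f."""
    centre = f(x, y, z, l)
    total = (f(x + h, y, z, l) + f(x - h, y, z, l)
             + f(x, y + h, z, l) + f(x, y - h, z, l)
             + f(x, y, z + h, l) + f(x, y, z - h, l) - 6 * centre)
    return total / (h * h)

point = (0.3, -0.7, 0.5)  # arbitrary test point
residuals = {l: (laplacian(psi1, *point, l), laplacian(psi2, *point, l))
             for l in (3, 4, 5)}
print(residuals)
```

The printed residuals should be small compared with the size of the functions themselves, limited only by the finite-difference error.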
Under the assumption that $Q_l \subset F$ these functions would be frame functions not only
on $S$, but also on the unit circle in the xy-plane by restriction. To see this, consider a family of
orthogonal triples of vectors $\{x_1, x_2, x_3\}$ with one vector, say $x_1$, fixed. Let $f$ be a frame
function defined on this set such that
$$f(x_1) + f(x_2) + f(x_3) = W$$ (1.8)
For all orthogonal pairs of vectors $x_2$, $x_3$ lying in the plane orthogonal to $x_1$ we will have
that
$$f(x_2) + f(x_3) = W - f(x_1)$$ (1.9)
so $f$ is a frame function also on the great circle orthogonal to the fixed vector $x_1$.
The assumption that $l > 2$, however, then contradicts the statement of lemma 1.1.2,
namely that a function of the form $\cos n\phi$ (or, as in this case, proportional to $\cos n\phi$) will
be a frame function if and only if $n = 0$ or $n \equiv 2 \pmod 4$. On the unit circle in the
xy-plane ($r = 1$, $z = 0$) the two functions above reduce to $\cos l\phi$ and $\cos (l-2)\phi$. Because the
condition cannot be satisfied by both $l$ and $l - 2$ simultaneously (for $l > 2$) we conclude that
$Q_l \not\subset F$ for $l > 2$. By
this we can conclude that $F$ is the closed linear span⁷ of $Q_0$ and $Q_2$, which in this case is
equivalent to saying that $F = Q_0 + Q_2$. This means that in the parametrization of a
frame function $f$ on the unit sphere, possible under the assumption that $f$ is continuous,
only $Y_{00}$ and $Y_{2m}$, $m = -2, -1, 0, 1, 2$, can contribute.
As noted in lemma 1.1.1, a frame function is regular if and only if it is the restriction
to the unit sphere of a quadratic form. That this is the case with the functions of $Q_2$ is
clear. As for the constant functions of $Q_0$, they too can be considered as restrictions of
quadratic forms, because on $S$ the constant $1 = x^2 + y^2 + z^2$, which is a quadratic
form. So, the fact that $F$, the space of continuous frame functions on $S$, is exactly the
space $Q_0 + Q_2$ and thereby consists only of restrictions to the unit sphere of quadratic
forms, amounts, by lemma 1.1.1, to the result that all elements of $F$ are regular; that is,
every continuous frame function on $S$ is regular.
⁷ The closed linear span of a subset $S$ of some Hilbert space $\mathcal{H}$ is the smallest closed linear subspace
of $\mathcal{H}$ containing $S$.
The statement of theorem 1.1.1 is an important result, which we will be able to make use
of in proving the main theorem, but we will need it in a slightly stronger form. The next
step is to show that all non-negative frame functions on $S$ are in fact continuous, hence
regular by theorem 1.1.1. Armed with this result we will then take the step from real
to complex two-dimensional Hilbert spaces, which will then show up in our later considerations
as subspaces of a larger Hilbert space. So, we now proceed to show that every
non-negative frame function on $S$ is regular. In order to show this we will make use of
three intermediate results related to how much $f$ is allowed to vary over a small open disk,
one of which requires some rather subtle geometric arguments.
We will also need the following definition.
Definition 1.1.4 If $f$ is a real-valued function defined on the set $X$ we denote by
$\mathrm{osc}(f, X)$ the number $\sup\{f(x) \mid x \in X\} - \inf\{f(x) \mid x \in X\}$.
A great circle is defined as a circle on the sphere of maximal diameter, or equivalently, a
circle on the sphere the plane of which intersects the origin. Of course, every point of the
sphere lies on infinitely many great circles. Here, given a point $q$, we will be interested in the
particular great circle for which $q$ is the point with the smallest value of $\theta$ or, equivalently,
the great circle which has a tangent that coincides with that of a circle in the plane $\theta = \theta_q$
at the point $q$, $\theta_q$ being the value of the polar angle at the point $q$. The latter circle will,
given a point $q$, be denoted $C_q$. In the following, this specific great circle will be referred
to as the Great Circle through $q$.
Lemma 1.1.3 Suppose $p_3 \in N$, where $N$ is the set of points $n$ such that $\theta_n \leq \frac{\pi}{2}$, with
$p_3 \neq (0, 0, 1)$. Consider the set $P_1$ of all points $p_1 \in N$ such that for some point $p_2$
(a) $p_2$ is on the Great Circle through $p_1$,
(b) $p_3$ is on the Great Circle through $p_2$.
Then the set $P_1$ has a non-empty interior⁸.
Proof: The fact that the set $P_1$ is non-empty can be seen in figure 1.1. Consider two
infinitesimally separated points $p_{21}$ and $p_{22}$ that are the respective highest points of two
neighboring great circles in the continuum of great circles passing through a given point
$p_3$. With some help from figure 1.1, we can see that $p_{21}$ will give rise to a set of points
$p_{11}$ that are the highest points on great circles passing through $p_{21}$; a curve segment of
this set is sketched in the figure. If we look at the corresponding set $p_{12}$ for $p_{22}$ we see
that this set will be a curve lying infinitesimally close to the curve $p_{11}$. By continuously
varying $p_2$ in this way, we can conclude that the set $P_1$ indeed has a non-empty interior.
It is, however, possible to find out more about this set $P_1$ by deriving an analytic expression
for the set of points $p_2$ satisfying (b). Let the point $p_3$ have coordinates $(\sin\theta, 0, \cos\theta)$⁹
in some orthonormal coordinate system $(x, y, z)$, and let $p_2 = (a, b, c)$, $|p_2|^2 = a^2 + b^2 + c^2 = 1$.
⁸ The interior of a set $K$ is the union of all open sets that are subsets of $K$.
⁹ As opposed to Gleason's original, our $\theta$ is the usual polar angle of spherical polar coordinates, defined
as the angle relative to the positive z-axis.
Figure 1.1: Some of the great circles that pass through $p_3$, the highest point $p_2$ of one
of them, and a great circle with highest point $p_1$ passing through $p_2$. Also sketched in the
figure is a curve segment consisting of points that are the highest points on some great
circle through $p_2$; $p_1$ of course lies on this curve.
The tangent to the great circle connecting two points $p_2$ and $p_3$ at the point $p_2$ is a vector
in the plane in which $p_2$ and $p_3$ lie, and which is orthogonal to $p_2$. Such a vector, call it $t_c$,
can be created by the following maneuver:
$$t_c = p_3 - \frac{(p_3 \cdot p_2)\, p_2}{p_2 \cdot p_2},$$
that is, just starting from $p_3$ and eliminating its component along $p_2$.
We want to find the points $p_2$ such that this tangent coincides with the tangent of the
circle $C_{p_2}$ of points on the sphere with $\theta = \theta_{p_2}$ at the point $p_2$, call it $t_2$. This tangent we
know to lie in the xy-plane and to be orthogonal to $p_2$, which gives (normalization aside)
$$t_2 = (-b, a, 0).$$
The condition that $t_c$ and $t_2$ be parallel at the point $p_2$ means that $\lambda t_2 = t_c$ for some
number $\lambda$, giving the three equations
$$-\lambda b = \sin\theta - a^2 \sin\theta - ac\cos\theta$$ (1.10)
$$\lambda a = -ab\sin\theta - bc\cos\theta$$ (1.11)
$$0 = \cos\theta - ac\sin\theta - c^2\cos\theta = \cos\theta - ac\sin\theta - (1 - (a^2 + b^2))\cos\theta$$ (1.12)
$$0 = (a^2 + b^2)\cos\theta - ac\sin\theta$$ (1.13)
However, we need only use the last of these equations in order to specify the points $p_2$ that
satisfy (b). The reason for this is that the only way in which the tangent of a great circle
through $p_2$ can lie entirely in the xy-plane is by also being a tangent to the surface of the
sphere in the xy-plane, and thereby parallel to the tangent of $C_{p_2}$. So, the set $P_2$ of points
$p_2 = (a, b, c)$ satisfying (b) is given by
$$(a^2 + b^2)\cos\theta - ac\sin\theta = 0$$ (1.14)
Figure 1.2: Stereographic visualization of the set $P_2$ for some values of $p_3$. Each point on
the $P_2$ curves will in turn give rise to a new curve according to the same pattern; the
union of these curves will be the set $P_1$.
Lemma 1.1.4 Suppose that $f$ is a frame function on the unit sphere $S$ in $\mathbb{R}^3$ and that for
a certain neighborhood¹⁰ $U$ of a point $p$ on $S$, $\mathrm{osc}(f, U) = \alpha$. Then every point on the great
circle the plane of which is orthogonal to $p$ has a neighborhood $V$ for which $\mathrm{osc}(f, V) \leq 2\alpha$.
Proof: We choose coordinates so that $p = (0, 0, 1)$. Let us define the neighborhood $U$ of
$p$ as all points with $\theta < \delta$. Let $q_0$ be any point on the sphere for which $\theta = \frac{\pi}{2}$ (all such
points are orthogonal to $p$) and let $r$ be the point with the same $\phi$-coordinate as $q_0$, and
with $\theta_r = \frac{\pi}{2} + \frac{\delta}{2}$. Let $C_0$ be the unique great circle that connects $r$ and $q_0$ (this circle will
pass through the point $p$ as well, because $r$ has the same value of $\phi$ as $q_0$) and let $r'$ and
$q_0'$ be points in $N \cap C_0$ such that $r \perp r'$ and $q_0 \perp q_0'$. For both these points to lie in $U$, we
need to have $\frac{\delta}{2} < \delta$, which is satisfied for all $\delta > 0$, so both $r'$ and $q_0'$ lie in $U$. This will be
true also if we substitute for $q_0$ a point $q$ in some neighborhood $V$ of $q_0$, holding $r$ fixed.
So, let $q_1$ and $q_2$ be two points in $V$ and, repeating the above procedure, let $C_i$ be the
great circle connecting $r$ and $q_i$ and choose points $r_i'$ and $q_i'$ in $N \cap C_i$ such that $r \perp r_i'$ and
$q_i \perp q_i'$ ($i = 1, 2$). Then (because all the vectors specifying points on the unit sphere have
modulus one) $\{r, r_i'\}$ and $\{q_i, q_i'\}$ both form orthonormal bases for the plane of $C_i$ ($i = 1, 2$). Therefore,
if we apply the frame function $f$ to both these sets of vectors, using the frame function
condition we get
$$f(r) + f(r_i') = f(q_i) + f(q_i'), \quad i = 1, 2$$ (1.15)
Subtracting the equations obtained for $i = 1$ and $i = 2$ and taking the modulus we get
$$| f(q_1) - f(q_2) | = | f(r_1') - f(r_2') + f(q_2') - f(q_1') | \leq | f(r_1') - f(r_2') | + | f(q_2') - f(q_1') | \leq 2\alpha$$ (1.16)
where we in the last step made use of the fact that $r_1', r_2', q_1', q_2' \in U$. So, for any pair of
points $q_1$ and $q_2$ both in $V$, $| f(q_1) - f(q_2) | \leq 2\alpha$, which shows that $\mathrm{osc}(f, V) \leq 2\alpha$.
¹⁰ The neighborhood of a point $z_0$ is defined as the set of points $z$ satisfying $| z - z_0 | < \epsilon$ for some $\epsilon$,
which is an open disk with radius $\epsilon$ centered at $z_0$.
The result of lemma 1.1.4 can be generalized quite straightforwardly; this is done in
lemma 1.1.5.
Lemma 1.1.5 Suppose that $f$ is a frame function on the unit sphere $S$ in $\mathbb{R}^3$ and that for
a certain non-empty open set $U$, $\mathrm{osc}(f, U) = \alpha$. Then every point of $S$ has a neighborhood
$W$ such that $\mathrm{osc}(f, W) \leq 4\alpha$.
Proof: To realize this we need only apply lemma 1.1.4 twice, using the fact that we can,
from an arbitrary point on the sphere, reach any given point in two steps of arc length $\frac{\pi}{2}$.
Assuming that $\mathrm{osc}(f, U) = \alpha$ where $U$ is the neighborhood of some point $p$, there exists
a point $q$ orthogonal to $p$ which has a neighborhood $V$ for which $\mathrm{osc}(f, V) \leq 2\alpha$. And,
by the same reasoning there is a neighborhood $W$ of a point $w$ orthogonal to $q$ for which
$\mathrm{osc}(f, W) \leq 4\alpha$. Since any point on $S$ can be reached in this way, we are done.
Next, we will show that every non-negative frame function on $S$ in $\mathbb{R}^3$ is continuous,
which together with theorem 1.1.1 implies that every non-negative frame function on $S$ in
$\mathbb{R}^3$ is regular. As the two preceding lemmas indicate, we will go about this by showing
that a limited oscillation in the neighborhood of some point on the sphere leads to such
a limitation for all points. Showing that this oscillation is arbitrarily small amounts, of
course, to showing continuity.
Theorem 1.1.2 Every non-negative frame function on $S$ in $\mathbb{R}^3$ is regular.
Proof: Let $f$ be a non-negative frame function on $S$ with weight $W$. Subtracting a constant
from $f$ only changes its weight, but the result is still a frame function, as can easily be
verified. Thus we can, with no loss of generality, assume that $\inf f(x) = 0$ for $x \in S$.
Now, let $\epsilon$ be some small positive number and set $\delta = \frac{\epsilon}{88}$. This choice of $\delta$ is of course
made for later convenience; however, $\delta$ will also be a small positive number, provided that
$\epsilon$ is. If $\inf f = 0$ on $S$ we can, by the definition of infimum, always find a point $p \in S$ such
that $f(p) \leq \delta$. We choose coordinates so that $p = (0, 0, 1)$.
Let $\sigma$ denote the transformation $x(\theta_x, \phi_x) \mapsto \sigma x = x(\theta_x, \phi_x + \frac{\pi}{2})$, and define the function
$$g(x) = f(x) + f(\sigma x).$$
If $\{x_1, x_2, x_3\}$ is an orthogonal triple, then so is $\{\sigma x_1, \sigma x_2, \sigma x_3\}$, so that
$$\sum_{i=1}^{3} g(x_i) = \sum_{i=1}^{3} f(x_i) + \sum_{i=1}^{3} f(\sigma x_i) = W + W = 2W$$
so $g$ is a frame function with weight $2W$.
For $q$ on the equator, $q \perp \sigma q$, and $p$, $q$ and $\sigma q$ are all mutually orthogonal. This means
that
$$g(q) = f(q) + f(\sigma q) = f(q) + f(\sigma q) + f(p) - f(p) = W - f(p),$$
and since this holds for any point $q$ on the equator we can deduce that $g$ is constant on
the equator.
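The construction of $g$ can be tried out on a concrete regular frame function. In the sketch below the matrix `T` is an arbitrary positive semi-definite example (an assumption for illustration, not part of the proof), $\sigma$ is realized as the rotation by $\pi/2$ about the z-axis, and $g$ is evaluated at several points of the equator and compared with $W - f(p)$ for $p$ the north pole.

```python
import math

# Illustrative non-negative regular frame function f(x) = <x|T|x> on R^3.
T = [[0.6, 0.1, 0.2],
     [0.1, 0.3, -0.1],
     [0.2, -0.1, 0.4]]
W = T[0][0] + T[1][1] + T[2][2]  # weight of f

def f(x):
    return sum(x[i] * T[i][j] * x[j] for i in range(3) for j in range(3))

def sigma(x):
    """Rotation by pi/2 about the z-axis: phi -> phi + pi/2."""
    return (-x[1], x[0], x[2])

def g(x):
    return f(x) + f(sigma(x))

p = (0.0, 0.0, 1.0)  # north pole
equator = [(math.cos(phi), math.sin(phi), 0.0)
           for phi in (2 * math.pi * k / 8 for k in range(8))]
g_values = [g(q) for q in equator]
print(g_values, W - f(p))
```

All entries of `g_values` agree with `W - f(p)`, as the argument in the text predicts.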
Now, consider a point $r \in N \setminus \{p\}$. Let $C$ be the Great Circle through $r$. Because $r$
is the highest point on $C$, the point at which $C$ intersects the equator will be orthogonal
to $r$; call it $q$. This orthogonality means that $g(r) + g(q) \leq 2W$, and since we know that
$g(r) + g(q) = g(r) + W - f(p)$ from above, we see that $g(x) \leq W + f(p) \leq W + \delta$
$\forall x \in N \setminus \{p\}$, where the last inequality follows from how $p$ was chosen.
Consider also a point $s \in C \cap N$ and a point $t \in C \cap N$ orthogonal to $s$. Because $s$
and $t$ span the plane of $C$, as do $r$ and $q$, the third orthogonal vector, call it $w$, is the same
for these two pairs, and we have $g(r) + g(q) + g(w) = g(s) + g(t) + g(w)$, so that
$$g(r) + g(q) = g(r) + W - f(p) = g(s) + g(t) \leq g(s) + W + \delta$$ (1.17)
$$g(r) \leq g(s) + 2\delta$$ (1.18)
$\forall r \in N \setminus \{p\}$ and $s \in C \cap N$, where we have again used that $f(p) \leq \delta$.
Let $\beta = \inf\{g(x) \mid x \in N \setminus \{p\}\}$ and choose a point $z \in N \setminus \{p\}$ for which $g(z) \leq \beta + \delta$.
By definition of infimum, such a point can always be found. Now we consider a vector $x$
satisfying the conditions of lemma 1.1.3, namely that $x \in N \setminus \{p\}$ is such that for some $y$
(a) $y$ is on the Great Circle through $x$
(b) $z$ is on the Great Circle through $y$
with $z$ as defined above.
According to lemma 1.1.3 the set of points $x$ satisfying these conditions is non-empty
(in fact has a non-empty interior), so such an $x$ can always be found.
Then, by equation (1.18), we have that
$$g(x) \leq g(y) + 2\delta$$ (1.19)
and
$$g(y) \leq g(z) + 2\delta$$ (1.20)
so that
$$g(x) \leq g(z) + 2\delta + 2\delta \leq \beta + 5\delta$$ (1.21)
This means that for $U$ the non-empty interior of the set $X$ of such points $x$, $\mathrm{osc}(g, U) \leq 5\delta$,
which by lemma 1.1.5 gives that there exists a neighborhood $V$ of any point on the sphere,
hence of $p$, with $\mathrm{osc}(g, V) \leq 20\delta$. But because $p = \sigma p$ we have that
$$g(p) = 2f(p) \leq 2\delta \;\Rightarrow\; \sup\{g(x) \mid x \in V\} \leq 22\delta$$ (1.22)
Applying lemma 1.1.5 once again, we can infer that every point $u$ on the sphere has a
neighborhood $W$ such that
$$\mathrm{osc}(f, W) \leq 88\delta = \epsilon$$ (1.23)
$\forall u \in S$. Because $\epsilon$ can be arbitrarily small this proves the continuity of $f$ on $S$, and from
this regularity follows, according to theorem 1.1.1.
This concludes the second part of Gleason's proof, devoted to proving the continuity of
$f$ on $S$. Continuing, we will need the following definition.
Definition 1.1.5 We will say that a real-linear subspace $K$ of a Hilbert space $\mathcal{H}$ is
completely real if the inner product takes only real values on $K \times K$.
Because we have proved the regularity of $f$ on $S$, which can be considered as a completely
real subspace of, for instance, a complex two-dimensional Hilbert space, we are interested
in finding out if regularity on such a space has any implications for the regularity of $f$ on
a larger Hilbert space, of which the smaller space can be considered a subspace. The next
section of the proof is devoted to this investigation.
We can start off by noting that a completely real subspace of any Hilbert space $\mathcal{H}$ is
itself a Hilbert space under restriction of the inner product on $\mathcal{H}$. So, a frame function
for $\mathcal{H}$ becomes a frame function when restricted to a completely real subspace of $\mathcal{H}$.
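Lemma 1.1.6 If $f$ is a non-negative regular frame function of weight $W$ on a real Hilbert space,
then for any unit vectors $x_1$ and $x_2$
$$| f(x_1) - f(x_2) | \leq 2W\, | x_1 - x_2 |$$
Proof: That $f$ is regular means that there exists a symmetric (self-adjoint) operator $\rho$ such
that $f(x) = \langle x | \rho | x \rangle$. Because $f$ is non-negative we have $\langle x | \rho | x \rangle \geq 0$ and because of
the frame function condition, $\langle x | \rho | x \rangle \leq W$, so we have that
$$0 \leq \langle x | \rho | x \rangle \leq W$$ (1.24)
for all unit vectors $x$, and $| \rho | \leq W$.
Because we are in a real Hilbert space, $\langle x_1 | \rho | x_2 \rangle = \langle x_2 | \rho | x_1 \rangle$, so
$$\langle x_1 + x_2 | \rho | x_1 - x_2 \rangle = \langle x_1 | \rho | x_1 \rangle - \langle x_2 | \rho | x_2 \rangle = f(x_1) - f(x_2)$$ (1.25)
Therefore, we have that
$$| f(x_1) - f(x_2) | \leq | \rho |\, | x_1 + x_2 |\, | x_1 - x_2 | = | \rho | \sqrt{x_1 \cdot x_1 + 2 x_1 \cdot x_2 + x_2 \cdot x_2}\; | x_1 - x_2 |$$ (1.26)
whence, since the square root is at most 2 for unit vectors,
$$| f(x_1) - f(x_2) | \leq 2W\, | x_1 - x_2 |$$ (1.27)
for unit vectors $x_1$ and $x_2$.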
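The bound (1.27) lends itself to a quick randomized check. The following sketch assumes an arbitrary positive semi-definite matrix `T` as the operator of a regular frame function on $\mathbb{R}^3$ and counts how many random pairs of unit vectors violate the inequality; the matrix, the seed and the helper names are illustrative.

```python
import math
import random

# Illustrative positive semi-definite matrix defining f(x) = <x|T|x>.
T = [[0.6, 0.1, 0.2],
     [0.1, 0.3, -0.1],
     [0.2, -0.1, 0.4]]
W = sum(T[i][i] for i in range(3))  # weight of f

def f(x):
    return sum(x[i] * T[i][j] * x[j] for i in range(3) for j in range(3))

def random_unit_vector(rng):
    v = [rng.gauss(0.0, 1.0) for _ in range(3)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

rng = random.Random(1)
violations = 0
for _ in range(10000):
    x1, x2 = random_unit_vector(rng), random_unit_vector(rng)
    distance = math.sqrt(sum((u - v) ** 2 for u, v in zip(x1, x2)))
    # Small slack only to absorb floating-point rounding.
    if abs(f(x1) - f(x2)) > 2 * W * distance + 1e-12:
        violations += 1
print(violations)
```

By the lemma no violations can occur, so the count stays at zero.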
We will now show that the regularity of $f$ on every completely real subspace of a
two-dimensional complex Hilbert space implies regularity on the whole space, a result which
will prove very useful in our coming considerations.
Lemma 1.1.7 Suppose that $f$ is a non-negative frame function on a two-dimensional
complex Hilbert space $\mathcal{H}_2$ which is regular on every completely real subspace. Then $f$ is regular.
Proof: Let $W$ be the weight of $f$ and let $M = \sup\{f(x) \mid x \in \mathcal{H}_2\}$. We can choose a
sequence of unit vectors $x_n$ so that $\forall \epsilon > 0 \; \exists N$ such that $| f(x_n) - M | < \epsilon \; \forall n \geq N$.
Because the unit sphere of a finite-dimensional Hilbert space is a compact metric space, and every
infinite sequence in a compact metric space $E$ has a limit point in $E$, we can also assume
(passing to a subsequence if necessary) that $\forall \epsilon > 0 \; \exists N$ such that $| x_n - y | < \epsilon \; \forall n \geq N$,
for some unit vector $y \in \mathcal{H}_2$. We now construct a number
$$\lambda_n = \frac{\langle y | x_n \rangle}{| \langle y | x_n \rangle |},$$
chosen to have unit modulus, and to be such that $\langle \lambda_n x_n | y \rangle$ is real. With this
choice of $\lambda_n$ we see that $\lambda_n \to 1$, $\lambda_n x_n \to y$.
For any number $\lambda$ with $| \lambda | = 1$ we have
$$f(\lambda x) = f(x)$$ (1.28)
for all unit vectors $x$ and all frame functions $f$. This is because if $f$ is a frame function
for some Hilbert space $\mathcal{H}$ it is a frame function for any closed subspace $S$ of $\mathcal{H}$ by
restriction. If we consider a one-dimensional $S$, the frame function condition immediately
gives equation (1.28). Since $| \lambda_n | = 1$, $f(\lambda_n x_n) = f(x_n)$. Also, the inner product being
linear in its first argument,
$$\langle \lambda_n x_n | y \rangle = \frac{\langle y | x_n \rangle \langle x_n | y \rangle}{| \langle y | x_n \rangle |} = \frac{\langle y | x_n \rangle \overline{\langle y | x_n \rangle}}{| \langle y | x_n \rangle |} \in \mathbb{R}$$ (1.29)
so $\lambda_n x_n$ and $y$ span a completely real subspace as defined above. Because a closed
completely real subspace of a Hilbert space $\mathcal{H}$ is itself a real Hilbert space with respect to the
restriction of the inner product on $\mathcal{H}$, lemma 1.1.6 applies to give (using (1.28))
$$| f(y) - M | = | (f(y) - f(\lambda_n x_n)) + (f(x_n) - M) | \leq | f(y) - f(\lambda_n x_n) | + | f(x_n) - M | \leq 2W\, | y - \lambda_n x_n | + | f(x_n) - M | \to 0 \;\; (n \to \infty)$$
$$\Rightarrow f(y) = M$$ (1.30)
Let us now dene a function F on H by
F(v) =[ v [
2
f(
v
[ v [
) (1.31)
if v ,= 0,
F(0) = 0 (1.32)
The assumption that f(x) is regular, hence can be written as x [ [ x), on every com-
pletely real subspace, implies that F is a quadratic form under restriction to any completely
real subspace. Moreover, because of (1.28) we have that F(v) =[ [
2
F(v) for all scalars
and vectors v.
Let z be a unit vector orthogonal to y, and consider the completely real subspace spanned
by y and z. Since F restricts to be a frame function on this subspace, we have F(y) =
f(y) = M and F(z) = f(z) = W f(y) = W M. On the yz-subspace, F is by the
regularity assumption a quadratic form which by choice of M as supf obtains its maximum
value on the unit circle at y. Hence, the matrix for F relative to the basis (y, z) is diagonal.
This can be made likely by letting y = (1, 0) and z = (0, 1). If F were not diagonal in the
(y, z)-basis we would obtain a larger value of F than that at y at some point y obtained
by adding a component along the z-direction to y.
So,
F(y +z) =
2
F(y) +
2
F(z) =
2
M +
2
(W M), , R (1.33)
With the following maneuver we are, again using (1.28), able to take the step from real to complex Hilbert space. Let $\alpha$ and $\beta$ be non-zero complex numbers, and construct the vector
\[ z' = \frac{\beta\,|\alpha|}{\alpha\,|\beta|}\, z \]
It is easy to check that $z'$ will also be a unit vector orthogonal to y. Thus, using (1.28) and (1.33),
\[ F(\alpha y + \beta z) = F\!\left(\frac{|\alpha|}{\alpha}(\alpha y + \beta z)\right) = F(|\alpha| y + |\beta| z') = M |\alpha|^2 + (W - M) |\beta|^2 \tag{1.34} \]
Because every vector in H can be expressed as a complex linear combination of y and z we have shown that
\[ F(x) = \langle x | \rho | x \rangle \tag{1.35} \]
for all $x \in H$, where $\rho$ is the self-adjoint operator with the matrix representation
\[ \begin{pmatrix} M & 0 \\ 0 & W - M \end{pmatrix} \]
relative to the basis (y, z). This completes the proof that f is regular on all of H.
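The phase bookkeeping behind (1.34) is easy to get wrong, so a small numerical check may help. The sketch below assumes hypothetical values for M, W, α and β and only verifies that the rescaled vector has the components claimed in the text; it is an illustration, not part of the proof.

```python
M, W = 0.7, 1.0  # hypothetical values of sup f and of the weight

def F(a, b):
    """F(a*y + b*z) for the diagonal form with entries M and W - M in the (y, z) basis."""
    return M * abs(a) ** 2 + (W - M) * abs(b) ** 2

alpha = 0.3 + 0.4j   # arbitrary non-zero complex coefficients
beta = -1.2 + 0.5j

# Phase factor relating z' to z:  z' = (beta |alpha|) / (alpha |beta|) z
phase = beta * abs(alpha) / (alpha * abs(beta))

# Multiplying alpha*y + beta*z by the unit scalar |alpha|/alpha ...
unit = abs(alpha) / alpha
lhs_components = (unit * alpha, unit * beta)
# ... should give |alpha| y + |beta| z', written here in the (y, z) basis.
rhs_components = (abs(alpha), abs(beta) * phase)

value = F(alpha, beta)
```

Because the phase factor has modulus one, z' is again a unit vector, and the value of F depends on α and β only through their moduli.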
Above, the regularity was extended from every completely real two-dimensional subspace
to all two-dimensional Hilbert spaces. The next step serves to generalize regularity on
every two-dimensional subspace to regularity on any Hilbert space H for which regularity
on any two-dimensional subspace holds.
Lemma 1.1.8 Suppose that f is a non-negative frame function for a Hilbert space H
(that can be either real or complex) and suppose that f is regular when restricted to any
two-dimensional subspace of H . Then f is regular.
Proof: We repeat the definition of F from the previous lemma:
\[ F(v) = \|v\|^2 f\!\left(\frac{v}{\|v\|}\right) \tag{1.36} \]
if $v \neq 0$,
\[ F(0) = 0 \tag{1.37} \]
Because f is regular on any two-dimensional subspace S of H by assumption, there exists a bilinear or Hermitian¹¹ form $A_S$ such that
\[ F(x) = A_S(x, x) \equiv \langle x | \rho | x \rangle \tag{1.38} \]
for all $x \in S$. Now, we can define a form A on all of $H \times H$ by
\[ A(x, y) = A_S(x, y) \tag{1.39} \]
if $x \in S$, $y \in S$. A is defined on all of $H \times H$ because any pair of vectors x and y can of course be considered as elements in the subspace that they span. This subspace is two-dimensional in all cases except when x and y are parallel, so that $y = \lambda x$ for some $\lambda$. Then, on the other hand, the bilinearity of $A_S$ gives that
\[ A_S(x, y) = \bar{\lambda} A_S(x, x) = \bar{\lambda} F(x) \tag{1.40} \]
(with $\bar{\lambda} = \lambda$ in the real case), irrespective of how S is chosen.
Using the polarization identity we define $A_S(x, y)$ by
\[ A_S(x, y) = \frac{1}{4}\big(A_S(x + y, x + y) - A_S(x - y, x - y)\big) \tag{1.41} \]
when H is real and
\[ A_S(x, y) = \frac{1}{4}\Big(A_S(x + y, x + y) - A_S(x - y, x - y) + i\big(A_S(x + iy, x + iy) - A_S(x - iy, x - iy)\big)\Big) \tag{1.42} \]
when H is complex. Using that $A_S$ is linear in its first argument and conjugate linear in its second (as was used in deriving (1.40)) it is easily verified that these expressions give the desired result in the respective cases.

¹¹ Corresponding to the real and complex case, respectively.
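The complex polarization identity (1.42) can be checked numerically. The following is a minimal sketch, assuming a hypothetical 2×2 Hermitian matrix to generate a form that is linear in its first argument and conjugate linear in its second; the helper names are illustrative only.

```python
def form(rho, x, y):
    """A(x, y) = sum_ij x_i rho_ij conj(y_j): linear in x, conjugate linear in y."""
    return sum(x[i] * rho[i][j] * y[j].conjugate()
               for i in range(len(x)) for j in range(len(y)))

def add(x, y, c=1):
    """Return the vector x + c*y."""
    return [xi + c * yi for xi, yi in zip(x, y)]

def polarization(rho, x, y):
    """Right-hand side of (1.42), built from the diagonal values A(v, v) only."""
    q = lambda v: form(rho, v, v)
    return 0.25 * (q(add(x, y)) - q(add(x, y, -1))
                   + 1j * (q(add(x, y, 1j)) - q(add(x, y, -1j))))

rho = [[0.6, 0.1 - 0.2j], [0.1 + 0.2j, 0.4]]  # hypothetical Hermitian matrix
x = [1 + 2j, -0.5j]
y = [0.3 - 1j, 2 + 0j]

direct = form(rho, x, y)             # the form evaluated directly
recovered = polarization(rho, x, y)  # the form recovered from F(v) = A(v, v)
```

The point of the check is that the off-diagonal values A(x, y) are fully determined by the quadratic values F(v) = A(v, v), which is what allows A to be defined from F in the proof.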
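From (1.39), (1.41) and (1.42) we can deduce that
\[ A(\lambda x, y) = \lambda A(x, y) \]
\[ A(x, y) = \overline{A(y, x)} \]
\[ 4\,\mathrm{Re}\,A(x, y) = F(x + y) - F(x - y) \]
\[ 2F(x) + 2F(y) = F(x + y) + F(x - y) \]
Using these relations, we have that
\[ \begin{aligned}
8\,\mathrm{Re}\,A(x, z) + 8\,\mathrm{Re}\,A(y, z) &= 2F(x + z) - 2F(x - z) + 2F(y + z) - 2F(y - z) \\
&= F(x + z + y + z) - F(x - z + y - z) + F(x - y) - F(x - y) \\
&= F(x + y + 2z) - F(x + y - 2z) \\
&= 4\,\mathrm{Re}\,A(x + y, 2z) = 8\,\mathrm{Re}\,A(x + y, z)
\end{aligned} \]
so that
\[ \mathrm{Re}\,A(x, z) + \mathrm{Re}\,A(y, z) = \mathrm{Re}\,A(x + y, z) \tag{1.43} \]
Letting $x \to ix$ and $y \to iy$, we find
\[ \mathrm{Im}\,A(x, z) + \mathrm{Im}\,A(y, z) = \mathrm{Im}\,A(x + y, z) \tag{1.44} \]
\[ (1.43) + (1.44) \Rightarrow A(x, z) + A(y, z) = A(x + y, z) \tag{1.45} \]
which with the first two properties of A shows that A is bilinear or Hermitian on all of $H \times H$.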
The next step towards the proof of full regularity is showing that A is bounded, which is done as follows.
Let x and y be vectors such that $\|x\| \le 1$, $\|y\| \le 1$. We can choose $\lambda \in \mathbb{C}$, with $|\lambda| = 1$, so that $A(\lambda x, y)$ is real and non-negative, which gives
\[ 4|A(x, y)| = 4A(\lambda x, y) = 4\,\mathrm{Re}\,A(\lambda x, y) = F(\lambda x + y) - F(\lambda x - y) \le M\big(\|\lambda x + y\|^2 + \|\lambda x - y\|^2\big) = 2M\big(\|x\|^2 + \|y\|^2\big) \le 4M \]
so
\[ |A(x, y)| \le M \tag{1.46} \]
and A is a bounded sesquilinear form. Thus, we can apply the Riesz representation theorem, which states the following.
Theorem 1.1.3 Let $H_1$, $H_2$ be Hilbert spaces and let $h : H_1 \times H_2 \to k$ be a bounded sesquilinear form. Then h has a representation $h(x, y) = \langle x | S | y \rangle$ where $S : H_1 \to H_2$ is a bounded linear operator. S is uniquely determined by h and has norm $\|S\| = \|h\|$.
By this theorem, there exists a bounded self-adjoint operator, let us call it $\rho$, such that
\[ A(x, y) = \langle x | \rho | y \rangle \tag{1.47} \]
for all $x, y \in H$. Since
\[ f(x) = F(x) = A(x, x) = \langle x | \rho | x \rangle \tag{1.48} \]
for all unit vectors $x \in H$, this concludes the proof.
At this point, the results of theorem 1.1.2 and lemma 1.1.7 will be tied together in the strong statement that every non-negative frame function on a Hilbert space $H_N$ with $N \ge 3$ is regular. The method that has been used throughout the proof, namely that of transferring properties of subspaces to the Hilbert space of which they are a part, will prove useful in this step as well.
Theorem 1.1.4 Every non-negative frame function on a (real or complex) Hilbert space
H of dimension at least three is regular.
Proof: As has been noted earlier, a frame function for a Hilbert space H by restriction becomes a frame function (in general, of course, of different weight) for any completely real subspace of H. Because we have assumed that $\dim H \ge 3$, any such two-dimensional subspace can be embedded in a completely real three-dimensional subspace of H, making possible the application of theorem 1.1.2 because this three-dimensional space will be isomorphic to $\mathbb{R}^3$. This shows that any non-negative frame function f is regular on any completely real two-dimensional subspace of H. By lemma 1.1.7, if a frame function is regular on every completely real two-dimensional subspace it is in fact regular on all two-dimensional subspaces; and by lemma 1.1.8, f is then regular on all of H.
We are almost there - the crux of Gleason's theorem is contained in theorem 1.1.4. The main result, however, is the following.
Theorem 1.1.5 Let p be a measure on the closed subspaces of a separable (real or complex) Hilbert space H with $\dim H \ge 3$. There exists a positive semi-definite self-adjoint operator $\rho$ of the trace class¹² such that for all closed subspaces A of H
\[ p(A) = \mathrm{Tr}(\rho P_A) \tag{1.49} \]
where $P_A$ is the orthogonal projection of H onto A.

¹² A trace class operator is a compact operator for which a trace may be defined, such that the trace is finite and independent of the choice of basis. When the dimension is finite any operator will be of the trace class, because we can always choose a matrix representation for it and define its trace as the sum of its diagonal elements - the concept of trace class becomes meaningful only when dealing with infinite dimensional spaces.
In particular, any assignment of probabilities to the vectors in H has to be of this form.
Proof: If $B_x$ is the one-dimensional subspace spanned by the unit vector x, $f(x) = p(B_x)$ defines a non-negative frame function f. Since f is regular on all Hilbert spaces of dimension at least three, there is by definition of regularity a self-adjoint operator $\rho$ such that
\[ f(x) = p(B_x) = \langle x | \rho | x \rangle \]
for all unit vectors x. The fact that $\langle x | \rho | x \rangle \ge 0$ for all unit vectors x shows that $\rho$ has to be positive semi-definite. Given an orthonormal basis $\{x_i\}$ of H, we have that
\[ p(H) = \sum_i p(B_{x_i}) = \sum_i \langle x_i | \rho | x_i \rangle \tag{1.50} \]
The sum on the far right converges; so we see that $\rho$ is in the trace class with $\mathrm{Tr}\,\rho = p(H)$.
For any closed subspace A of H we can always extend an orthonormal basis $\{y_i\}$ for A to an orthonormal basis for H by adjoining to $\{y_i\}$ vectors $\{z_i\}$, so that $\{y_i, z_i\}$ is an orthonormal basis for H.
Then, with $P_A$ the projection operator for orthogonal projection onto the subspace A, $P_A y_i = y_i$ for all i, and of course $P_A z_j = 0$ for all j. Consequently,
\[ p(A) = \sum_i p(B_{y_i}) = \sum_i \langle y_i | \rho | y_i \rangle = \sum_i \langle P_A y_i | \rho | y_i \rangle + \sum_i \langle P_A z_i | \rho | z_i \rangle = \mathrm{Tr}(\rho P_A) \]
and we have shown that
\[ p(A) = \mathrm{Tr}(\rho P_A) \tag{1.51} \]
thereby deriving the standard quantum rule, and proving Gleason's theorem.
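The last step of the proof can be illustrated on a small real example. The sketch below assumes a hypothetical density matrix on $\mathbb{R}^3$ and a two-dimensional subspace A; it compares the sum of frame-function values over an orthonormal basis of A with the trace expression. The matrix entries and helper names are illustrative assumptions.

```python
import math

def matvec(A, v):
    """Matrix-vector product for small real matrices given as nested lists."""
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]

def inner(u, v):
    """Real inner product."""
    return sum(ui * vi for ui, vi in zip(u, v))

def expectation(rho, v):
    """<v|rho|v> for a real vector v."""
    return inner(v, matvec(rho, v))

# Hypothetical density matrix on R^3: symmetric, positive, unit trace.
rho = [[0.5, 0.1, 0.0],
       [0.1, 0.3, 0.0],
       [0.0, 0.0, 0.2]]

s = 1 / math.sqrt(2)
y1 = [s, s, 0.0]       # orthonormal basis of the subspace A
y2 = [0.0, 0.0, 1.0]

# Projector P_A = |y1><y1| + |y2><y2|
P = [[y1[i] * y1[j] + y2[i] * y2[j] for j in range(3)] for i in range(3)]

# Sum of frame-function values over the basis of A
p_from_frame = expectation(rho, y1) + expectation(rho, y2)

# Tr(rho P_A)
p_from_trace = sum(rho[i][j] * P[j][i] for i in range(3) for j in range(3))
```

Both numbers agree, as (1.51) requires; any other orthonormal basis of the same subspace would give the same value.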
It is of some importance to note that the last step is valid only for Hilbert spaces of dimension three and higher. Nowhere in this proof have we shown that a frame function has to be regular on any two-dimensional space - the statement, based primarily on theorem 1.1.2, is that every frame function is regular on the real unit sphere S, considered as a subspace of $\mathbb{R}^3$. From Gleason's result follows the uniqueness of the density matrix as the means of assigning probabilities to vectors in Hilbert spaces in a consistent way.
Chapter 2
A Gleason-type theorem for POVMs
The statement of Gleason's theorem, that any quantum state is given by a density operator, was originally formulated using frame functions defined on sets of orthogonal projection operators summing to one. The probability of outcome A when performing a measurement is then given by an inner product between the projector $P_A$ corresponding to outcome A and the density matrix $\rho$ of the system, which fully specifies (our knowledge of) its state. The assumption made was
\[ \sum_{P_j \in X} f(P_j) = 1 \tag{2.1} \]
\[ X = \Big\{ P_j \in D_d \;\Big|\; \sum_j P_j = 1,\ P_i P_j = \delta_{ij} P_j \Big\} \tag{2.2} \]
with $D_d$ the set of projection operators in d dimensions.
By allowing ourselves to make somewhat stronger assumptions about the frame function
than those originally made by Gleason, we can prove a Gleason-type theorem by much
simpler means than those available to Gleason. The frame function assumption, originally
made only for sets of orthogonal projectors, will here be made for the more general POVMs.
A POVM (Positive Operator Valued Measure) is a resolution of the identity operator into positive operators, called effects, which just as the set considered by Gleason sum to one but with the orthogonality constraint relaxed. Formally, a POVM is a set of n positive operators $E_i$ that act on an N-dimensional Hilbert space $H_N$, and that satisfy
\[ \sum_{i=1}^{n} E_i = 1, \quad E_i = E_i^{\dagger}, \quad E_i \ge 0, \quad i = 1, \ldots, n \tag{2.3} \]
Note in particular that the number of elements of a POVM need not equal the dimension of the Hilbert space. Both PVMs (Projection Valued Measures - in effect, ON bases; orthogonal resolutions of the identity) and POVMs can be said to represent measurements of some quantity, but the POVM is a realization of a more general notion of measurement. For a PVM, the results obtained are mutually exclusive (for instance, $m = j$ means $m \neq j - 1, \ldots, -j$). For a POVM, however, this is not the case. Also, while the projectors of a PVM always commute, no such assumption is made for the effects of a general POVM.
Any POVM can, however, according to what is known as Naimark's theorem, always be described in terms of a projective measurement, if the latter is performed in a higher dimension [2]. Because of this, the concept of a POVM is naturally inherent in the formalism.
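A concrete qubit example may make the definition (2.3) more tangible. The sketch below builds the three-element "trine" POVM, chosen here purely as an illustration (it is not the POVM studied at this point of the text), and checks that the effects are positive, sum to the identity, and do not commute. All helper names are hypothetical.

```python
import math

def effect(weight, n):
    """Return weight * (1 + n . sigma) as a 2x2 complex matrix, n = (nx, ny, nz)."""
    nx, ny, nz = n
    return [[weight * (1 + nz), weight * (nx - 1j * ny)],
            [weight * (nx + 1j * ny), weight * (1 - nz)]]

# Three Bloch vectors at 120 degrees in the x-z plane (the "trine").
angles = [0.0, 2 * math.pi / 3, 4 * math.pi / 3]
trine = [effect(1 / 3, (math.sin(a), 0.0, math.cos(a))) for a in angles]

# Entry-by-entry sum of the effects; should be the 2x2 identity.
total = [[sum(E[i][j] for E in trine) for j in range(2)] for i in range(2)]

def is_positive(E, tol=1e-12):
    """A Hermitian 2x2 matrix is positive iff its trace and determinant are non-negative."""
    trace = (E[0][0] + E[1][1]).real
    det = (E[0][0] * E[1][1] - E[0][1] * E[1][0]).real
    return trace >= -tol and det >= -tol

def commutator_norm(A, B):
    """Largest absolute entry of AB - BA."""
    mul = lambda X, Y: [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
                        for i in range(2)]
    AB, BA = mul(A, B), mul(B, A)
    return max(abs(AB[i][j] - BA[i][j]) for i in range(2) for j in range(2))
```

Here a two-dimensional system carries a three-outcome measurement, so the number of elements differs from the dimension, and `commutator_norm(trine[0], trine[1])` is non-zero, in contrast to the projectors of a PVM.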
A Gleason-type derivation of the standard quantum probability rule has indeed proved possible using frame functions defined on POVMs, both general and restricted. For some configurations, it has also been shown that the quantum rule is not valid. In a 2003 paper, Caves et al. [3] prove a Gleason-type theorem, originally formulated and proved by Busch [4], and also investigate some specific types of POVMs, using the assumption
\[ \sum_{E_j \in X} f(E_j) = 1 \tag{2.4} \]
\[ X = \Big\{ E_j \in K_d \;\Big|\; \sum_j E_j = 1 \Big\} \tag{2.5} \]
$K_d$ being the set of effects in d dimensions.
It turns out that making the frame function assumption only for POVMs with two or three
elements, rather than for all POVMs, still enforces the quantum probability rule. In the
following, we will stay close to what is done in [3], noting specifically when extra considerations have to be made related to the number of elements in the POVM. The proof is divided into several steps; first linearity with respect to non-negative rationals is proved in section 2.1, thereafter continuity in section 2.2. These two results add up to linearity, which is related to an inner product in section 2.3.
The assumption we make is the following.
\[ \sum_{j=1}^{n} f(E_j) = 1 \tag{2.6} \]
for
\[ E_j \in K_d, \quad \sum_{j=1}^{n} E_j = 1, \quad n = 2, 3 \tag{2.7} \]
Note that we here have chosen to normalize the frame function f so that the weight is 1, rather than ascribing to it the weight W. This, however, is only a question of normalization and means no loss of generality.
Apart from the domain of definition of the frame function f, and the fact that this result is valid also for qubits (two-dimensional systems), the theorem is fully equivalent to Gleason's original, stating that the density matrix description of a system is the only one possible. The N = 2 case is of great importance in Gleason's original proof as well, although the two-dimensional spaces he considers are always assumed to be subspaces of some larger H; in the end, his result is not valid for the qubit case. In using POVMs with three elements, corresponding to projective measurements in higher dimensions, the same assumption is in some sense implicit, which is arguably one of the reasons for the great simplification of the proof in the POVM case.
The result we set out to prove is the following.
Theorem 2.0.6 For every frame function $f : K_d \to [0, 1]$, there is a unique unit-trace positive operator $\rho$ such that $f(E) = (\rho, E) = \mathrm{Tr}(\rho E)$, where $K_d$ is the set of effects in d dimensions.
The proof follows.
2.1 Linearity with respect to the non-negative rationals
The first step is to prove additivity. This step necessitates making the frame function assumption for both two- and three-element POVMs, since we consider the two POVMs $\{E_1, E_2, E_3\}$ and $\{E_1 + E_2, E_3\}$.
Using the frame function property we obtain $f(E_1) + f(E_2) + f(E_3) = f(E_1 + E_2) + f(E_3)$ and hence
\[ f(E_1) + f(E_2) = f(E_1 + E_2). \]
From additivity, homogeneity (linearity with respect to non-negative rationals) follows. Consider the effect $\frac{n}{m}E$. Applying the frame function and making use of the additivity yields
\[ m f\!\left(\tfrac{n}{m}E\right) = \underbrace{f\!\left(\tfrac{n}{m}E\right) + f\!\left(\tfrac{n}{m}E\right) + \ldots + f\!\left(\tfrac{n}{m}E\right)}_{m \text{ times}} = f\!\left(m \tfrac{n}{m}E\right) = f(nE) = f(\underbrace{E + E + \ldots + E}_{n \text{ times}}) = \underbrace{f(E) + f(E) + \ldots + f(E)}_{n \text{ times}} = n f(E) \]
\[ \Rightarrow f\!\left(\tfrac{n}{m}E\right) = \tfrac{n}{m} f(E) \tag{2.8} \]
This linearity with respect to non-negative rationals in combination with continuity of course implies full linearity, since the set of rationals is dense in the space of real numbers, meaning that any real number s can be approximated to arbitrary accuracy by two rational numbers, one smaller and one larger than s. Hence, the next step is to prove that the frame function is continuous.
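As a consistency check, one can confirm that the quantum rule itself has the two properties just derived. The sketch below assumes a hypothetical diagonal density matrix and diagonal effects, represented by their diagonal entries as exact fractions; it shows that f(E) = Tr(ρE) is additive over a coarse-graining and homogeneous for a rational factor. It does not, of course, prove the converse direction that the theorem is about.

```python
from fractions import Fraction as Fr

rho = [Fr(2, 3), Fr(1, 3)]  # diagonal entries of a hypothetical density matrix

def f(E):
    """Quantum-rule frame function f(E) = Tr(rho E) for diagonal effects E."""
    return sum(r * e for r, e in zip(rho, E))

def scale(c, E):
    """Multiply an effect by a scalar."""
    return [c * e for e in E]

def add(E1, E2):
    """Add two effects entry by entry."""
    return [a + b for a, b in zip(E1, E2)]

# A three-element POVM of diagonal effects and its two-element coarse-graining.
E1 = [Fr(1, 2), Fr(1, 4)]
E2 = [Fr(1, 4), Fr(1, 4)]
E3 = [Fr(1, 4), Fr(1, 2)]
E12 = add(E1, E2)

n, m = 3, 5
lhs = f(scale(Fr(n, m), E1))   # f((n/m) E)
rhs = Fr(n, m) * f(E1)         # (n/m) f(E)
```

Using `Fraction` keeps the arithmetic exact, so the equalities can be tested without rounding tolerances.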
2.2 Continuity
We will show that discontinuity of the frame function would lead to a contradiction with how f is defined. Continuity in metric spaces is defined as follows.
Definition 2.2.1: f is continuous at $x_0$ if $\forall \epsilon > 0\ \exists \delta > 0$ such that $|f(x) - f(x_0)| < \epsilon$, $\forall x$ satisfying $\|x - x_0\| < \delta$. The norm on the space of operators, $\|A\| = \sqrt{(A, A)}$, is defined through the inner product $(A, B) = \mathrm{Tr}(A^{\dagger} B)$.
We begin by showing continuity at the zero operator. Additivity implies that f(0) = 0; f(E) = f(E + 0) = f(E) + f(0). Assume that f is discontinuous at the zero operator. This means that $\exists \epsilon > 0$ such that $\forall \delta > 0$ there is some effect E such that $\|E\| < \delta$ and $f(E) \ge \epsilon$. Choose $\delta = \frac{1}{N} < \epsilon$, with $N \in \mathbb{Z}^{+}$, and let E be an effect satisfying $\|E\| < \frac{1}{N}$ and $f(E) \ge \epsilon$. Multiplying E by N gives F = NE, which is also an effect since $\|F\| = N\|E\| < 1$, implying that the sum of the squares of its eigenvalues is less than 1. However, the additive property (or the linearity with respect to non-negative rationals) of the frame function gives $f(F) = N f(E) \ge N\epsilon > 1$. f, however, is a function to the closed interval [0, 1] from the set of effects that are members of POVMs with two or three elements, and since for any arbitrary effect $E_1$ we can always find $E_2$ such that $E_1 + E_2 = 1$, this is a contradiction.
From this we can conclude that f is indeed continuous at the zero operator.
To generalize this result to include any arbitrary effect $E_0$ we proceed as follows. Let E be an effect in the neighborhood of $E_0$, and consider the difference $E - E_0$. This difference can be diagonalized and divided into non-negative and negative eigenvalue parts, so that $E - E_0 = A - B$, where A is the non-negative part, and $-B$ is the part with negative eigenvalues.
We have that $\|A\|, \|B\| \le \|A - B\| = \|E - E_0\|$. This follows from
\[ \|A - B\|^2 = \mathrm{Tr}\big((A - B)^{\dagger}(A - B)\big) = \mathrm{Tr}(A^{\dagger}A - A^{\dagger}B - B^{\dagger}A + B^{\dagger}B) = \mathrm{Tr}(A^{\dagger}A + B^{\dagger}B) \ge \mathrm{Tr}(A^{\dagger}A),\ \mathrm{Tr}(B^{\dagger}B) \tag{2.9} \]
where the cross terms vanish because A and B act on orthogonal eigenspaces. From this, and from the fact that A and B are positive operators, we can conclude, provided $\|E - E_0\| \le 1$, that A and B are effects.
The frame function can be applied to the equation $E + B = E_0 + A$ (f is defined on these operators for the same reason as given above - they are positive, and can always be considered as being part of a two or three element POVM). Using additivity, this yields $f(E) - f(E_0) = f(A) - f(B)$.
Because of continuity at the zero operator as shown, we know that for every $\epsilon' = \frac{\epsilon}{2} > 0$ there exists $\delta > 0$ such that $\|A\|, \|B\| < \delta \Rightarrow f(A), f(B) < \epsilon'$.
This means that if $\|E - E_0\| = \|A - B\| < \delta$ we have $\|A\|, \|B\| < \delta$ and $|f(E) - f(E_0)| = |f(A) - f(B)| \le |f(A)| + |f(B)| < 2\epsilon' = \epsilon$.
This completes the proof that f is continuous on all of $K_d$, and taken together with the results of section 2.1 this establishes that f is a linear function on $K_d$.
2.3 Linearity and the inner product
Having proved that f is linear, we will make use of the fact that any linear function on a
vector space can be expressed by means of an inner product on this space. In order to do
so, we need to extend f to the entire vector space of operators, which is done as follows.
Let H be an arbitrary Hermitian operator. Such an operator can always be expressed as a difference between two positive operators, call them $G_1$ and $G_2$. The most obvious way to accomplish this is to diagonalize H and let $G_1$ and $G_2$ be the positive- and negative-eigenvalue parts, respectively.
Any positive operator G can be expressed as $G = \alpha E$ for some positive real number $\alpha$ and some effect E.
Now, define $f(H) = f(G_1) - f(G_2) = \alpha_1 f(E_1) - \alpha_2 f(E_2)$, using the additivity and linearity of f. Although the decomposition of H is not unique the extension is, which can be proved as follows. Assume that $H = \alpha_1 E_1 - \alpha_2 E_2 = \alpha_3 E_3 - \alpha_4 E_4$. Choose $\alpha$ such that $\alpha \ge \max_i \alpha_i$ and divide both sides of the above equation by $\alpha$ to give, after rearranging,
\[ \frac{\alpha_1}{\alpha} E_1 + \frac{\alpha_4}{\alpha} E_4 = \frac{\alpha_2}{\alpha} E_2 + \frac{\alpha_3}{\alpha} E_3 \tag{2.10} \]
It is clear that these operators are all now in the original domain of f, meaning that the frame function can be applied. This gives
\[ \alpha_1 f(E_1) + \alpha_4 f(E_4) = \alpha_2 f(E_2) + \alpha_3 f(E_3) \tag{2.11} \]
that is, $\alpha_1 f(E_1) - \alpha_2 f(E_2) = \alpha_3 f(E_3) - \alpha_4 f(E_4)$, so both decompositions give the same value of f(H).
By this we have a linear function f on the whole space of Hermitian operators. To make the extension to the space of all operators we note that any operator C can be written (uniquely) as C = A + iB for Hermitian operators A and B.
Getting back to the task of writing this linear function as an inner product on the vector space of operators, we choose an orthonormal basis of operators $\tau_j$, enabling us to expand any arbitrary operator A as the sum $A = \sum_{j=1}^{d^2} \tau_j (\tau_j, A)$, $d^2$ being the dimension of the operator space.
Applying the frame function, we get $f(A) = \sum_{j=1}^{d^2} f(\tau_j)(\tau_j, A)$.
We can now define the operator $\rho$ as being the solution of the equations $f(\tau_j) = (\rho, \tau_j)$. This is $d^2$ equations for the $d^2$ components of $\rho$, so the solution is unique.
The requirements that the frame function is non-negative and normalized (that is, $\sum_i f(x_i) = 1$) guarantee that $\rho$ has the density matrix properties positivity and unit trace. Positivity follows from the fact that for any normalized vector $|\psi\rangle$ we have, due to non-negativity of the frame function, $0 \le f(|\psi\rangle\langle\psi|) = \langle\psi|\rho|\psi\rangle$. That $\rho$ has unit trace is seen by expanding the unit matrix using the normalization of the frame function:
\[ \mathrm{Tr}\,\rho = (\rho, 1) = \Big(\rho, \sum_j E_j\Big) = \sum_j (\rho, E_j) = \sum_j f(E_j) = 1. \]
This completes the proof.
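For a qubit, the construction of ρ from the values of f on an operator basis can be carried out explicitly. The following sketch assumes the identity and the Pauli matrices, each divided by √2, as the orthonormal basis, and a hypothetical "unknown" state `rho_true` that generates the frame-function values; none of these names come from the original text.

```python
import math

S = 1 / math.sqrt(2)
# Orthonormal operator basis under (A, B) = Tr(A^dagger B): 1, sigma_x, sigma_y, sigma_z over sqrt(2).
basis = [
    [[S, 0], [0, S]],
    [[0, S], [S, 0]],
    [[0, -1j * S], [1j * S, 0]],
    [[S, 0], [0, -S]],
]

def hs_inner(A, B):
    """Hilbert-Schmidt inner product Tr(A^dagger B) for 2x2 matrices."""
    return sum(complex(A[i][j]).conjugate() * B[i][j]
               for i in range(2) for j in range(2))

# Hypothetical state that generates the frame-function values.
rho_true = [[0.7, 0.2 - 0.1j], [0.2 + 0.1j, 0.3]]
f = lambda A: hs_inner(rho_true, A)   # f(A) = (rho, A) = Tr(rho A)

# Solve f(tau_j) = (rho, tau_j): for a Hermitian orthonormal basis, rho = sum_j f(tau_j) tau_j.
coeffs = [f(tau) for tau in basis]
rho = [[sum(c * tau[i][j] for c, tau in zip(coeffs, basis)) for j in range(2)]
       for i in range(2)]

trace = (rho[0][0] + rho[1][1]).real
```

The four numbers f(τ_j) are enough to rebuild the full matrix, which is the uniqueness statement of the text in the case d = 2.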
It should be noted that this last consideration is not affected by the weaker frame function assumption, and also that the dimension of the Hilbert space of operators does not enter into these calculations.
As noted above, the POVM version of Gleason's theorem is valid for two-dimensional Hilbert spaces, in contrast to the original statement. One may say that one reason that the proof simplifies to such a great extent for POVMs is that we in this case get a frame function that is regular on two-dimensional Hilbert spaces, so that no embedding in higher dimensional spaces is needed. However, the two-dimensional case is highly present in both proofs.
Chapter 3
Gleason's theorem for informationally complete POVMs
As noted above, the POVM analogue of Gleason's theorem is valid also for two-state systems, so-called qubits. The purpose of this chapter is to investigate whether a Gleason-type theorem can be proved for two restricted classes of POVMs, namely two semi-symmetrical families of asymmetrical POVMs with four elements, in the qubit case. As we know, every POVM can be seen as representing a measurement, the outcomes of which are in general not mutually exclusive. In this particular case, we will be looking at the situation when we have two possible states for our system, but four possible outcomes of a single measurement.
The four element POVMs discussed below are particularly interesting because they are informationally complete. In dimension N the density matrix (a Hermitian operator of trace one) is represented by an $N \times N$ matrix, and is fully specified by $N^2$ real numbers, of which $N^2 - 1$ are independent. A POVM with $N^2$ elements will give exactly $N^2$ probabilities, of which $N^2 - 1$ are independent. Therefore, when the number of POVM elements is the square of the dimension of the system (the dimension being two in the qubit case) it is informationally complete, in the sense that its statistics give enough information to construct the density matrix.¹ The symmetric variety of such POVMs, the symmetric informationally complete POVM, or sic-POVM for short, has been shown by Caves et al. [2] not to necessitate the quantum probability rule. In some sense, the failure of the proof in the sic case is due to the great degree of symmetry present² - the argument appears to go through for all other types of four element POVMs.
In the following, in contrast to chapters 1 and 2, the continuity of the frame function will not be proved, but assumed.
Using the fact that (two-dimensional) Hermitian operators can be expressed in terms of the Pauli matrices, we can write the general two-dimensional effect as
\[ E = r 1 + \mathbf{s} \cdot \boldsymbol{\sigma} = r 1 + s\, \hat{n} \cdot \boldsymbol{\sigma} \tag{3.1} \]
where 1 is the unit matrix and $\hat{n}$ is a unit vector.

¹ A PVM will always have its number of elements equal to the dimension of the system, and will give $N - 1$ independent numbers upon measurement. Because of this, no PVM can be complete - we need $N + 1$ PVMs to determine the density matrix; $(N + 1)(N - 1) = N^2 - 1$.
² This was suggested by C. Fuchs, private communication.
Using this expression, we will end up working on the Bloch sphere, which is a two-sphere quite distinct from the $S^2$ of the Gleason proof. The points on this sphere are non-zero vectors in a two-dimensional complex Hilbert space taken modulo a complex number, and orthogonal vectors (states) as usually defined correspond to antipodal points on the sphere (on the sphere considered by Gleason, antipodal points were identified).
The restricted sets of POVMs considered in this section all consist of effects that are multiples of one-dimensional projectors so that $r = s \le \frac{1}{2}$ and $E = r(1 + \hat{n} \cdot \boldsymbol{\sigma})$, and all effects also have the same weight r, namely $\frac{1}{N}$ for an N outcome POVM. We see that a POVM is then fully specified by the $\hat{n}$ vectors of its elements, that sum to zero in order for the effects to sum to one, which is clear from the equality
\[ \sum_{i=1}^{N} E_i = N \frac{1}{N} 1 + \frac{1}{N} \sum_{i=1}^{N} \hat{n}_i \cdot \boldsymbol{\sigma} \tag{3.2} \]
So,
\[ \sum_{j=1}^{N} \hat{n}_j = 0 \tag{3.3} \]
for any POVM with N elements.
Moreover, we will assume rotational invariance, so that all POVMs that are the same up to a three-dimensional rotation are considered equivalent.
We will now be interested in frame functions $f\big(\frac{1}{N}(1 + \hat{n} \cdot \boldsymbol{\sigma})\big) \equiv f(\hat{n})$ defined on this set. From the right hand side of this equation it is clear that f is a function on the unit sphere in three dimensions. The frame function condition will here as in chapter 2 be normalized according to
\[ \sum_{i=1}^{N} f(\hat{n}_i) = 1 \tag{3.4} \]
Proving a Gleason-type theorem is equivalent to showing that a frame function satisfying (3.4) has to be of the form
\[ f(\hat{n}) = \mathrm{Tr}(\rho E) = \frac{1}{N}(1 + \hat{n} \cdot \mathbf{P}) \tag{3.5} \]
for some $\rho$, where $\mathbf{P} = \mathrm{Tr}(\rho \boldsymbol{\sigma})$ is a three component vector satisfying $|\mathbf{P}| \le 1$. Because the frame function is evidently a function on the unit sphere (continuous by assumption) it can be written as a sum of spherical harmonics $Y_{lm}$;
\[ f(\hat{n}) = \sum_{lm} c_{lm} Y_{lm}(\hat{n}) \tag{3.6} \]
The quantum rule for the frame function evidently contains only harmonics with l = 0 and l = 1; explicitly
\[ \frac{1}{N}(1 + \hat{n} \cdot \mathbf{P}) = \frac{\sqrt{4\pi}}{N} Y_{00} + \frac{1}{N}\sqrt{\frac{2\pi}{3}}\, P_x (Y_{1,-1} - Y_{1,1}) + \frac{i}{N}\sqrt{\frac{2\pi}{3}}\, P_y (Y_{1,-1} + Y_{1,1}) + \frac{1}{N}\sqrt{\frac{4\pi}{3}}\, P_z Y_{10} \tag{3.7} \]
Note here the difference between this and the case considered by Gleason, where the quantum rule allowed only l = 0 and l = 2. Again, this is because we are working with a different sphere than did Gleason.
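The expansion (3.7) can be verified numerically at an arbitrary point of the sphere. The sketch below assumes the standard explicit forms of the l = 0 and l = 1 spherical harmonics with the Condon-Shortley phase convention (an assumption, since the convention is not stated at this point of the text), together with a hypothetical Bloch vector P.

```python
import cmath
import math

def Y00(theta, phi):
    return 1 / math.sqrt(4 * math.pi)

def Y10(theta, phi):
    return math.sqrt(3 / (4 * math.pi)) * math.cos(theta)

def Y11(theta, phi):
    # Condon-Shortley phase convention (assumed)
    return -math.sqrt(3 / (8 * math.pi)) * math.sin(theta) * cmath.exp(1j * phi)

def Y1m1(theta, phi):
    return math.sqrt(3 / (8 * math.pi)) * math.sin(theta) * cmath.exp(-1j * phi)

def quantum_rule(theta, phi, P, N):
    """Left-hand side of (3.7): (1 + n.P) / N."""
    n = (math.sin(theta) * math.cos(phi),
         math.sin(theta) * math.sin(phi),
         math.cos(theta))
    return (1 + sum(ni * Pi for ni, Pi in zip(n, P))) / N

def harmonic_expansion(theta, phi, P, N):
    """Right-hand side of (3.7), using only l = 0 and l = 1 harmonics."""
    Px, Py, Pz = P
    a = math.sqrt(2 * math.pi / 3)
    return (math.sqrt(4 * math.pi) * Y00(theta, phi)
            + a * Px * (Y1m1(theta, phi) - Y11(theta, phi))
            + 1j * a * Py * (Y1m1(theta, phi) + Y11(theta, phi))
            + math.sqrt(4 * math.pi / 3) * Pz * Y10(theta, phi)) / N

P = (0.3, -0.5, 0.4)          # hypothetical Bloch vector with |P| <= 1
theta, phi, N = 1.1, 2.3, 4   # arbitrary point on the sphere, four outcomes
lhs = quantum_rule(theta, phi, P, N)
rhs = harmonic_expansion(theta, phi, P, N)
```

The imaginary parts of the individual terms cancel, so the right-hand side is real, as a probability has to be.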
So, what we will want to do is to expand the frame function over the particular POVM we are looking at in terms of spherical harmonics, and check what values of l contribute to the sum. In fact, it is possible to derive a property that has to hold if the lth harmonic is to be allowed in the expansion of a frame function $f(\hat{n})$, namely
\[ c_{lm} \sum_{j=1}^{N} Y_{lr}(\hat{n}_j) = \delta_{l0} \]
for all l, m and r.
For l = 0 this condition is trivial, and is accomplished simply by normalization.
For $l \ge 1$ it is equivalent to either
\[ c_{lm} = 0, \quad m = -l, \ldots, l \tag{3.8} \]
or
\[ \sum_{j=1}^{N} Y_{lr}(\hat{n}_j) = 0, \quad r = -l, \ldots, l \tag{3.9} \]
This means that if the lth harmonic contributes to a frame function $f(\hat{n})$, that is, if not all of the $c_{lm}$ are equal to zero, then the $\hat{n}$ vectors of the POVM must satisfy equation (3.9).
If we can show that for a certain POVM no l-values other than l = 0 and l = 1 can satisfy this for all m, we will have derived the quantum rule, thereby proving a Gleason-type theorem. Due to the property of the POVM $\hat{n}$-vectors (3.3), equation (3.9) is automatically satisfied for l = 1.
In investigating whether higher values of l can contribute we will make use of some properties of the spherical harmonics (restated here just as in equation (1.5) and equation (1.6) for convenience), namely the way that the $\theta$ and $\phi$, or for cartesian coordinates $(\hat{n})_z$ and $(\hat{n})_x + i(\hat{n})_y$, dependencies can be separated, according to
\[ Y_{lm}(\hat{n}) = Y_{lm}(\theta, \phi) = \sqrt{\frac{(2l + 1)}{4\pi} \frac{(l - m)!}{(l + m)!}}\; P_l^m(\cos\theta)\, e^{im\phi} \tag{3.10} \]
\[ = h_{lm}((\hat{n})_z)\, \big((\hat{n})_x + i(\hat{n})_y\big)^m \tag{3.11} \]
where
\[ h_{lm}((\hat{n})_z) = \sqrt{\frac{(2l + 1)}{4\pi} \frac{(l - m)!}{(l + m)!}}\; \frac{P_l^m((\hat{n})_z)}{\Big(\sqrt{1 - (\hat{n})_z^2}\Big)^m} \tag{3.12} \]
and also, how the spherical harmonics transform under reflection, parity and conjugation
\[ Y_{lm}(\hat{n}) = (-1)^{l+m} Y_{lm}(\pi - \theta, \phi) = (-1)^m Y_{lm}(\theta, \phi + \pi) = (-1)^l Y_{lm}(-\hat{n}) = (-1)^m Y^{*}_{l,-m}(\hat{n}) \tag{3.13} \]
It should be noted that continuity of the frame function, necessary for a spherical harmonics expansion to be possible, is here assumed, as opposed to in chapters 1 and 2.
A particularly useful form of equation (3.11) is that for m = l, in which case
\[ Y_{lm}(\hat{n}) = Y_{ll}(\hat{n}) \propto \big((\hat{n})_x + i(\hat{n})_y\big)^l \tag{3.14} \]
The function $h_{lm}((\hat{n})_z)$ will in this case be independent of $(\hat{n})_z$ because the factor $\big(\sqrt{1 - (\hat{n})_z^2}\big)^l$ in $P_l^l((\hat{n})_z)$ will cancel the explicit dependence of equation (3.12), so that
\[ \sum_{j=1}^{4} Y_{ll}(\hat{n}_j) \propto \sum_{j=1}^{4} \big((\hat{n}_j)_x + i(\hat{n}_j)_y\big)^l \tag{3.15} \]
Another result that will be of great importance in the following is
Theorem 3.0.1 Any two non-zero associated Legendre functions $P_n^m(z)$ and $P_n^s(z)$, where n is an integer such that $n \ge 1$ and $m \neq s$, have on the open interval $(-1, 1)$ either no common zero or exactly one common zero. The latter occurs if and only if $n - |m|$ and $n - |s|$ are both odd and positive.
This 1984 result is due to N.H.J. Lacroix [5] and its proof, which is purely analytic and quite elementary³, uses the fact that the associated Legendre functions are solutions of Legendre's associated equation, which is a second order differential equation.
For a POVM with four elements, the vectors $\hat{n}_i$ specify the vertices of a tetrahedron. What we will do in the following is to express this tetrahedron in terms of $Y_{lm}(\hat{n}_i)$ and make use of the fact that the sum $\sum_{i=1}^{4} Y_{lm}(\hat{n}_i)$ has to be zero for all m in order for the l:th harmonic to contribute to the expression for $f(\hat{n})$. Having concluded that only harmonics with l = 0, 1 can contribute to the sum for a given POVM a Gleason result immediately follows, because the fact that the frame function is real requires that the harmonics appear in exactly the combinations of equation (3.7).

³ It does, however, involve properties of the so-called Prüfer polar coordinates, see [6] for more on those.
3.1 The quantum probability rule for a first class of POVMs
As mentioned above, the unit vectors $\hat{n}$ of the effects of a POVM determine the POVM completely. The unit vectors of a sic-POVM form a regular tetrahedron. If this tetrahedron is stretched, using a parameter $\theta$, a class of POVMs is obtained. In this first case, we will choose the $\theta$ dependence so that $\cos\theta = 0$ and $\cos\theta = 1$ correspond to the square and the line, respectively. All values in between correspond to a specific tetrahedron (i.e. a four element POVM).
The four unit vectors specifying a POVM in this class of deformations of the sic-POVM can be expressed as
\[ \begin{aligned}
\hat{n}_1 &= (\sin\theta \cos\tfrac{3\pi}{2},\ \sin\theta \sin\tfrac{3\pi}{2},\ \cos\theta) = (0, -\sin\theta, \cos\theta) \\
\hat{n}_2 &= (\sin\theta \cos\tfrac{\pi}{2},\ \sin\theta \sin\tfrac{\pi}{2},\ \cos\theta) = (0, \sin\theta, \cos\theta) \\
\hat{n}_3 &= (\sin\theta \cos 0,\ \sin\theta \sin 0,\ -\cos\theta) = (\sin\theta, 0, -\cos\theta) \\
\hat{n}_4 &= (\sin\theta \cos\pi,\ \sin\theta \sin\pi,\ -\cos\theta) = (-\sin\theta, 0, -\cos\theta)
\end{aligned} \tag{3.16} \]
It is easily verified that the vectors sum to zero, just as they are supposed to if they are to represent a POVM. Using the symmetrical condition that the inner product between the vectors spanning the POVM is the same for any pair in the set, one sees that the regular tetrahedron corresponds to
\[ \theta_{sym} = \arccos\frac{1}{\sqrt{3}} \tag{3.17} \]
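The two geometric claims about this family, that the vectors sum to zero and that the regular tetrahedron sits at $\cos\theta = 1/\sqrt{3}$, can be checked directly. The sketch below assumes the sign reconstruction of (3.16) given above; the function names are illustrative.

```python
import math

def family_one(theta):
    """The four Bloch vectors of (3.16) for deformation parameter theta."""
    s, c = math.sin(theta), math.cos(theta)
    return [(0.0, -s, c), (0.0, s, c), (s, 0.0, -c), (-s, 0.0, -c)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def pairwise_dots(vectors):
    """Inner products of all distinct pairs."""
    return [dot(vectors[i], vectors[j])
            for i in range(len(vectors)) for j in range(i + 1, len(vectors))]

theta_sym = math.acos(1 / math.sqrt(3))
regular = family_one(theta_sym)

total = [sum(v[k] for v in regular) for k in range(3)]   # should vanish
dots = pairwise_dots(regular)                            # should all be equal

generic = family_one(0.4)                                # an arbitrary member of the family
generic_total = [sum(v[k] for v in generic) for k in range(3)]
```

At the symmetric angle all six pairwise inner products coincide (at the value −1/3), while for other angles the vectors still sum to zero but the inner products differ.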
To investigate what harmonics can contribute, we start by looking at the sum
\[ \sum_{j=1}^{4} Y_{ll}(\hat{n}_j) \]
According to equation (3.15), this sum is proportional to
\[ \sum_{j=1}^{4} \big((\hat{n}_j)_x + i(\hat{n}_j)_y\big)^l. \]
Using (3.16),
\[ \sum_{j=1}^{4} Y_{ll}(\hat{n}_j) \propto 1^l + (-1)^l + (i)^l + (-i)^l \tag{3.18} \]
We see that this sum is zero for l = 1, 2 and for all odd l, while l = 0 is always allowed. This means that harmonics with these values of l can contribute to the sum. To find out if they actually do, we have to check whether the sum $\sum_{j=1}^{4} Y_{lm}(\hat{n}_j)$ is zero for all m, not only for m = l. For l = 2, the sum will be zero for all $\theta$ and all m except for m = 0. However, for the values of $\theta$ that are solutions of $P_2^0(\cos\theta) = 0$ the l = 2 harmonic will contribute to the sum, and the quantum rule will not hold.
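Both statements can be explored with a few lines of code. The sketch below evaluates the l-dependent factor of (3.18) for small l and, as an assumption-laden shortcut, evaluates the m = 0 sum for l = 2 simply as the sum of the Legendre polynomial $P_2$ over the z-components of the four vectors in (3.16), dropping the common normalization constant.

```python
import math

def phase_sum(l):
    """The factor 1^l + (-1)^l + i^l + (-i)^l appearing in (3.18)."""
    return 1 ** l + (-1) ** l + 1j ** l + (-1j) ** l

# Values of l (up to 12) for which the Y_ll sum vanishes.
vanishing = [l for l in range(13) if abs(phase_sum(l)) < 1e-9]

def P2(x):
    """Legendre polynomial P_2(x) = (3x^2 - 1)/2."""
    return 0.5 * (3 * x * x - 1)

def m0_sum(theta):
    """Sum of P_2 over the z-components (c, c, -c, -c) of the vectors in (3.16)."""
    c = math.cos(theta)
    return sum(P2(z) for z in (c, c, -c, -c))

at_regular = m0_sum(math.acos(1 / math.sqrt(3)))   # regular tetrahedron
at_generic = m0_sum(0.4)                           # an arbitrary other member
```

In the range shown, the phase sum vanishes for every odd l and also for l = 2, 6 and 10, and the m = 0 sum for l = 2 vanishes at the regular tetrahedron but not at the generic angle.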
The only zero of $P_2^0(\cos\theta)$ for $0 \le \cos\theta \le 1$ is $\cos\theta = \frac{1}{\sqrt{3}}$, that is, the regular tetrahedron; $\cos\theta = 1$ and $\cos\theta = 0$ are the degenerate cases of the line and the square. For l = 3 the only sum which does not give zero for all values of $\theta$ is $\sum_{j=1}^{4} P_3^2(\cos\theta)$, the zeros of which are $\cos\theta = 0$ and 1.
Also for l = 5, the sum for m = 2 differs from zero for all $\theta$ but those satisfying $P_5^2(\cos\theta) = 0$. The zeros are those of $P_3^2(\cos\theta)$ together with $\cos\theta = \frac{1}{\sqrt{3}}$. This is consistent with the results of Caves et al. [2].
For any odd l, l + 2 will be odd, so that $P_l^m(\cos(\pi - \theta)) = -P_l^m(\cos\theta)$ for even m by equation (3.13). Also, the $\phi$ dependence is periodic in m with period 4. This means that $\sum_{j=1}^{4} Y_{lm}(\hat{n}_j)$, which is proportional to $P_l^m(\cos\theta)$, will in general be non-zero for all odd l and $m \equiv 2 \pmod 4$. The condition for the l:th harmonic to contribute is that the sum $\sum_{j=1}^{4} Y_{lm}(\hat{n}_j)$ is zero for all m.
For odd $l \ge 7$ (even values of l have already been excluded) only those values of $\theta$ can contribute that are zeros of both $P_l^2(\cos\theta)$, $P_l^6(\cos\theta)$ and so on all the way up to m = l − 1. Hence, in order for harmonics with $l \ge 7$, for which at least $\sum_{j=1}^{4} Y_{l2}(\hat{n}_j)$ and $\sum_{j=1}^{4} Y_{l6}(\hat{n}_j)$ are non-zero, to contribute for specific values $\theta_0$ of $\theta$ we need these $\theta_0$ to be common zeros of the l:th associated Legendre functions $P_l^m(\cos\theta)$ for different m. By theorem 3.0.1 no such common zeros of the associated Legendre functions exist, other than $\cos\theta = 1$ and 0. That 1 will always be a root of the associated Legendre functions is clear from the first factor in the expression
\[ P_l^m(x) = (1 - x^2)^{\frac{m}{2}} \frac{d^m}{dx^m}\big(P_l(x)\big) \tag{3.19} \]
So, we have found that for this family of tetrahedrons, we have a Gleason-type theorem (meaning that the quantum rule for calculating probabilities is valid and unique) for all configurations except for the (lower-dimensional) extreme points and the regular tetrahedron representing the sic-POVM.
3.2 The quantum probability rule for a second class of POVMs
The same reasoning can be applied to another family of POVMs, with similar semi-regular properties. This time, we choose the deformation parameter $\theta$ so that the unit vectors of this class of POVMs span tetrahedrons whose base is always a regular triangle (in the extremal point, one of the vectors is the zero vector, and the configuration is just a trine).
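This subset of tetrahedrons can be parametrized as
\[ \begin{aligned}
\hat{n}_1 &= (0,\ 0,\ -3\cos\theta \cos 0) = (0, 0, -3\cos\theta) \\
\hat{n}_2 &= (\sin\theta \cos\tfrac{2\pi}{3},\ \sin\theta \sin\tfrac{2\pi}{3},\ \cos\theta) = (-\tfrac{1}{2}\sin\theta,\ \tfrac{\sqrt{3}}{2}\sin\theta,\ \cos\theta) \\
\hat{n}_3 &= (\sin\theta \cos\tfrac{4\pi}{3},\ \sin\theta \sin\tfrac{4\pi}{3},\ \cos\theta) = (-\tfrac{1}{2}\sin\theta,\ -\tfrac{\sqrt{3}}{2}\sin\theta,\ \cos\theta) \\
\hat{n}_4 &= (\sin\theta \cos 0,\ \sin\theta \sin 0,\ \cos\theta) = (\sin\theta, 0, \cos\theta)
\end{aligned} \tag{3.20} \]
The symmetrical case corresponds to $\cos\theta = \frac{1}{3}$, which is seen as follows.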
The symmetrical condition of the pairwise inner product between the POVM vectors being
the same for all pairs in the set, and specically
n
1
n
2
= n
2
n
3
(3.21)
gives
3 cos
2
=
1
4
sin
2
3
4
sin
2
+ cos
2
4cos
2
=
1
2
(1 cos
2
)
9
2
cos
2
=
1
2

cos =
1
3
(3.22)
The case =

2
is just the two-dimensional regular trine, while = and = 0 both give
the straight line.
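The symmetric point can also be checked numerically. The following is a minimal sketch, assuming the parametrization (3.20) with the deformation parameter read as the polar angle $\theta$; the helper names `povm_vectors` and `dot` are ours and purely illustrative.

```python
import math

def povm_vectors(theta):
    """Vectors n_1..n_4 of the second tetrahedron family, following (3.20)."""
    s, c = math.sin(theta), math.cos(theta)
    # n_2, n_3, n_4 form the regular triangular base at azimuths 2pi/3, 4pi/3 and 0.
    base = [(s * math.cos(phi), s * math.sin(phi), c)
            for phi in (2 * math.pi / 3, 4 * math.pi / 3, 0.0)]
    # n_1 lies on the z axis and balances the other three.
    return [(0.0, 0.0, -3 * c)] + base

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Symmetric case cos(theta) = 1/3: all six pairwise inner products should coincide.
vecs = povm_vectors(math.acos(1 / 3))
pairwise = [dot(vecs[i], vecs[j]) for i in range(4) for j in range(i + 1, 4)]

# For any theta the four vectors sum to zero.
vector_sum = [sum(v[k] for v in povm_vectors(0.4)) for k in range(3)]

print(pairwise)
print(vector_sum)
```

At $\cos\theta = \frac{1}{3}$ every pairwise inner product equals $-\frac{1}{3}$, the value for a regular tetrahedron of unit vectors.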
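Proceeding in analogy with the previous case, we consider the sum
\[ \sum_{j=1}^{4} Y_{ll}(\vec{n}_j) \propto \left(-\tfrac{1}{2} + i\tfrac{\sqrt{3}}{2}\right)^l + \left(-\tfrac{1}{2} - i\tfrac{\sqrt{3}}{2}\right)^l + 1^l = e^{il\varphi_1} + e^{il\varphi_2} + 1 \]
with $\varphi_1 = \frac{2\pi}{3}$ and $\varphi_2 = \frac{4\pi}{3}$ the phases of the two complex numbers, for which $\tan\varphi_1 = -\sqrt{3}$ and $\tan\varphi_2 = \sqrt{3}$.
This expression being equal to zero of course means that its real and imaginary parts are zero separately. This leads to the conditions
\[ \cos l\varphi_1 + \cos l\varphi_2 + 1 = 0 \]
and
\[ \sin l\varphi_1 + \sin l\varphi_2 = 0 \]
which give
\[ \cos l\varphi_1 = \cos l\varphi_2, \qquad 2\cos l\varphi_1 + 1 = 0 \]
So, the condition for the $Y_{ll}$'s to sum to zero is
\[ \cos l\varphi_1 = \cos\frac{2\pi l}{3} = -\frac{1}{2} \]
which holds exactly when $l$ is not a multiple of three. Together with $l = 0$ and the exclusion of even $l > 2$, this leaves
\[ l = 0, \quad l = 2, \quad l \equiv 1 \pmod 2 \,\wedge\, l \not\equiv 0 \pmod 3 \tag{3.23} \]
Hence, the harmonics that can possibly contribute for this type of tetrahedrons have $l$ equal to zero, two or to an odd number that is not a multiple of three.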
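The vanishing condition on the $Y_{ll}$ sum is easy to check directly. The sketch below assumes the standard form $Y_{ll} \propto \sin^l\theta\, e^{il\varphi}$, so that the vector on the $z$ axis drops out for $l > 0$ and only the azimuths of the three base vectors matter; the function name `base_phase_sum` is hypothetical.

```python
import cmath
import math

def base_phase_sum(l):
    """Sum of exp(i*l*phi) over the azimuths 2pi/3, 4pi/3, 0 of the triangular base."""
    azimuths = (2 * math.pi / 3, 4 * math.pi / 3, 0.0)
    return sum(cmath.exp(1j * l * phi) for phi in azimuths)

# Values of l (1..12) for which the sum vanishes up to rounding error.
vanishing = [l for l in range(1, 13) if abs(base_phase_sum(l)) < 1e-9]
print(vanishing)
```

The sum vanishes precisely for those $l$ that are not multiples of three; for multiples of three it equals 3.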
To find out if these values of $l$ really do contribute, resulting, for any contributing $l \neq 0, 1$, in the lack of a Gleason theorem, we proceed as in the previous case. Due to the $\varphi$ dependence, the cases in which the sum $\sum_{j=1}^{4} Y_{lm}(\vec{n}_j)$ will be non-zero occur for $m = 3, 6, 9, \ldots$ and so on. For $l = 5$, $m = 3$ is the only allowed multiple of three, and for the values of $\theta$ that give $P_5^3(\cos\theta) = 0$ the $l = 5$ harmonic contributes, and we do not have a Gleason theorem. The zeros of $P_5^3$ are, apart from $0$ and $\pm 1$, $\pm\frac{1}{3}$, which as noted above is exactly the regular tetrahedron. Higher values of $l$ either will be even or will allow at least two $m$-values that are multiples of three. Due to the lack of common zeros of the associated Legendre polynomials for different $m$ as stated in theorem 3.0.1, we can deduce that only $l = 0, 2$ can contribute, so that we do get the standard quantum rule, except for the cases which give $P_5^3(\cos\theta) = 0$.
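That $\cos\theta = \pm\frac{1}{3}$ are zeros of $P_5^3$ can be confirmed with exact arithmetic. This is a small sketch using expression (3.19) and the standard closed form $P_5(x) = \frac{1}{8}(63x^5 - 70x^3 + 15x)$; the helper names are ours.

```python
from fractions import Fraction

# Coefficients of P_5(x) = (63 x^5 - 70 x^3 + 15 x) / 8, indexed by the power of x.
P5 = [Fraction(0), Fraction(15, 8), Fraction(0),
      Fraction(-70, 8), Fraction(0), Fraction(63, 8)]

def derivative(coeffs):
    """Coefficients of the derivative of a polynomial given by its coefficient list."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def evaluate(coeffs, x):
    return sum(c * x ** k for k, c in enumerate(coeffs))

# By (3.19), P_5^3(x) = (1 - x^2)^(3/2) * d^3 P_5 / dx^3, so away from x = +-1
# its zeros are the zeros of the third derivative.
d3 = derivative(derivative(derivative(P5)))

print(evaluate(d3, Fraction(1, 3)), evaluate(d3, Fraction(-1, 3)))
```

The third derivative is $\frac{1}{2}(945x^2 - 105)$, which vanishes exactly at $x = \pm\frac{1}{3}$.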
3.3 Summary
To summarize, we have found (assuming continuity, n.b., not proving it) that a Gleason-type theorem can be proved for all POVMs in these two families, apart from the regular one. These two ways of deforming a regular tetrahedron are arguably the two most symmetric ways of creating an irregular tetrahedron, and since it appears to be the high degree of symmetry that causes the proof to fail in the sic case, it is not likely that any other tetrahedrons would exhibit a behaviour like that of the regular tetrahedron.
Chapter 4
Kochen-Specker's theorem
4.1 Kochen-Specker's theorem
The theorem known as Kochen and Specker's theorem (KS) was formulated by Simon Kochen and Ernst Specker [7] in 1967. The effective statement of the theorem, sometimes referred to as the Bell-Kochen-Specker theorem [9], is that in a Hilbert space of dimension $N \geq 3$ it is impossible to assign definite values from $\{0, 1\}$ to all projection operators, i.e. vectors¹, in such a way that in each set of $N$ orthogonal vectors exactly one vector is assigned the value 1. This is, in effect, a corollary to Gleason's theorem, since it can be shown that no density matrices give rise to such probabilities, but it can be and was proved independently of Gleason's result (albeit 10 years later).
The physical implications of KS in relation to theories of so-called hidden variables have been much discussed, and the most common interpretation is that the theorem places severe restrictions on any such theory; in effect, that the theorem implies that any well-defined properties possessed by particles would necessarily have to be contextual. The original authors themselves thought their result to establish the nonexistence of hidden variables. Recently, however, arguments have been made for an understanding of KS rather as an epistemological statement about the limitation of the knowledge possible to obtain through measurement [8].
A problem that over the years has evolved into a downright contest is that of finding a finite set of KS uncolourable vectors. Kochen and Specker in their proof used 117 vectors arranged in an ingenious way; the current record (in three dimensions) is due to Conway and Kochen and lies at 31, a number that can be further reduced in higher dimensions.²
With a slight change of the rules, however, all of these records are easily broken. In two dimensions, consider four points equally spaced on a circle, and first treat them as a four element POVM. This necessitates a colouring that renders one of the points black and the remaining three white. However, the two pairs of anti-parallel vectors also form two two-dimensional PVMs, which requires exactly one vector in each pair to be coloured black, resulting in two of the four points being black, and we have a contradiction.
In section 4.4 we will consider another finite set of vectors in an attempt to prove a KS result for sic-POVMs, a set that is shown not to suffice for this purpose.
¹ We will in the following look at unit vectors but are in fact interested in rays rather than vectors, because in establishing orthogonality relations only directions are relevant; consequently, each unit vector will represent all vectors with the same direction.
² See for example Peres [10].
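The four-point argument above can be verified by exhaustive search. The following sketch assumes a labelling of our own choosing: points 0 to 3 are equally spaced on the circle, so that (0, 2) and (1, 3) are the two anti-parallel pairs.

```python
from itertools import product

# Assumed labelling: points 0..3 at angles 0, 90, 180 and 270 degrees.
ANTIPARALLEL_PAIRS = [(0, 2), (1, 3)]

def is_consistent(colouring):
    """colouring[i] is 1 if point i is black and 0 if it is white."""
    # Four-element POVM: exactly one of the four points is black.
    povm_ok = sum(colouring) == 1
    # Each anti-parallel pair is a two-dimensional PVM: exactly one black per pair.
    pvm_ok = all(colouring[a] + colouring[b] == 1 for a, b in ANTIPARALLEL_PAIRS)
    return povm_ok and pvm_ok

consistent = [c for c in product((0, 1), repeat=4) if is_consistent(c)]
print(consistent)
```

None of the 16 possible colourings satisfies both requirements, so the list is empty.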
4.2 A KS colouring in arbitrary dimensions
Let us first consider the three-dimensional case. We are interested in assigning the value 0 or 1 to vectors in $\mathcal{H}^3$ in such a way that no set of three mutually orthogonal vectors are all assigned the value 0, and no pair of orthogonal vectors both have the value 1. These conditions can be expressed as
\[ g : S^2 \to \{0, 1\} \tag{4.1} \]
\[ g(P_1) + g(P_2) + g(P_3) = 1 \tag{4.2} \]
for all sets of orthogonal vectors $P_1, P_2, P_3$, $S^2$ being the unit two-sphere. Letting white represent the value 0 and black the value 1, this problem can be translated into the problem of colouring $S^2$ in a way that satisfies the conditions just stated.
Any such assignment of truth values (probabilities from $\{0, 1\}$) to all vectors in the Hilbert space of some system would correspond to (the possibility of) the system having well-defined properties, existing independent of measurement. That is, for any possible observable the outcome of the corresponding measurement would be fully determined in advance.
However, by the Kochen-Specker theorem, a complete such assignment of truth values is impossible. Hence, what we will try to do in the following is to assign probabilities from $\{0, 1\}$ according to (4.2) to some of the vectors in $\mathcal{H}$; some vectors will necessarily remain uncoloured, by KS.
One way to go about this, suggested by Appleby [8], is to start out by colouring the two polar caps defined by $|\tan\theta| < 1$ black, and the region around the equator bounded by $|\tan\theta| = \sqrt{2}$ white, where $\theta$ is the usual polar angle. This type of colouring is sketched in figure 4.1.
These limits are derived as follows. The two polar caps are made small enough so that no two vectors in an orthogonal triple can simultaneously lie in the black region, which means that they will extend down to $\theta = \frac{\pi}{4}$. The white section around the equator is just wide enough so that not all three vectors can lie in it at the same time.
Figure 4.1: A possible (incomplete) KS colouring of the unit two-sphere.
An explicit expression for the limit of the latter section is derived as follows, see figure 4.2. The case we primarily have to guard against is when all three vectors lie at the same $\theta$ coordinate. The three points on the sphere specified by these vectors are then the corners of a regular triangle, with side length $\sqrt{2}$, because we are on the unit sphere. The radius of a circle in which such a triangle can be inscribed is
\[ R_2 = \frac{\sqrt{2}}{\sqrt{2 + 1}} = \sqrt{\frac{2}{3}} \tag{4.3} \]
which, as can be seen from the figures, is exactly $\sin\theta_w$, where $\theta_w$ is the desired limiting angle, so that $\theta_w = \arcsin R_2$. This gives the white area to be the region around the equator with $\theta$ values between $\arctan\sqrt{2}$ and $\pi - \arctan\sqrt{2}$.
This colouring of $S^2$ satisfies the Kochen-Specker criteria for $1 - \frac{1}{\sqrt{2}} + \frac{1}{\sqrt{3}} \approx 87\%$ of all vectors; that is, all orthogonal triples consisting of vectors from the so coloured areas will satisfy equation (4.2).
An analogous colouring can be done for $S^n$, yielding the percentage results 79% for $n = 3$, 74% for $n = 4$ and 71% for $n = 5$.
The above numbers are obtained using the fact that equation (4.3) generalizes to
\[ R_n = \sqrt{\frac{n}{n + 1}} = \sqrt{\frac{N - 1}{N}} \tag{4.4} \]
for the radius of the circumsphere of a regular $n$-simplex, where $n = N - 1$ is the dimension of the sphere in $N$ dimensions.
Figure 4.2: Derivation of the limiting angles for the colouring in figure 4.1.
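The quoted percentages can be reproduced numerically. The sketch below is one possible way of doing so, under the assumption that the coloured fraction is the white belt plus both black caps, each weighted by $\sin^{N-2}\theta$; the Simpson integrator and the function names are ours and not part of the original calculation.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of sub-intervals."""
    step = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * step)
    return total * step / 3

def coloured_fraction(N):
    """Fraction of the unit sphere in N dimensions covered by the black caps and the white belt."""
    weight = lambda t: math.sin(t) ** (N - 2)
    hemisphere = simpson(weight, 0.0, math.pi / 2)
    # White belt: polar angles from arcsin(sqrt((N-1)/N)) to the equator, cf. (4.4).
    white = simpson(weight, math.asin(math.sqrt((N - 1) / N)), math.pi / 2)
    # Black cap: polar angles from the pole down to pi/4.
    black = simpson(weight, 0.0, math.pi / 4)
    # By symmetry the ratio for one hemisphere equals the ratio for the whole sphere.
    return (white + black) / hemisphere

for N in (3, 4, 5, 6):
    print(N, round(100 * coloured_fraction(N)))
```

For $N = 3, 4, 5, 6$ (that is, $n = 2, 3, 4, 5$) this gives 87, 79, 74 and 71 percent respectively.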
In the above percentages, the area of the black cap is included, but already in four dimensions the contribution to the total area is close to negligible. As we will see below it will reduce further with increasing dimension, which is why we in the following will primarily be interested in looking at the area taken up by the white section.
The fraction of the sphere in $N$ dimensions that can be coloured white with the given restriction is, using equation (4.4),
\[ F = \frac{\int_{\arcsin\sqrt{\frac{N-1}{N}}}^{\frac{\pi}{2}} \sin^{N-2}\theta \, d\theta}{\int_{0}^{\frac{\pi}{2}} \sin^{N-2}\theta \, d\theta} = 2\,\frac{\mathrm{vol}(S^{N-2})}{\mathrm{vol}(S^{N-1})} \int_{\arcsin\sqrt{\frac{N-1}{N}}}^{\frac{\pi}{2}} \sin^{N-2}\theta \, d\theta \tag{4.5} \]
where $\mathrm{vol}(S^d)$ denotes the surface area of the $d$-dimensional sphere.
As for the black area, $B_N$, it will in analogy with the $N = 3$ case be located around the poles of the sphere, with limiting angle $\frac{\pi}{4}$;
\[ B_N = \mathrm{vol}(S^{N-2}) \int_{0}^{\frac{\pi}{4}} \sin^{N-2}\theta \, d\theta \tag{4.6} \]
What, one may ask, is the fraction of the sphere in $N$ dimensions that can be coloured using this method in the limit $N \to \infty$? As can be seen from the expression
\[ \mathrm{vol}(S^d) = \mathrm{vol}(S^{d-1}) \int_{0}^{\pi} \sin^{d-1}\theta \, d\theta \tag{4.7} \]
for high dimensions, the fraction of the area of the sphere that will lie around the poles is negligible, due to the increasingly sharp peak around $\theta = \frac{\pi}{2}$ of the sine function when raised to a large number. Thus the fraction of the surface area taken up by the black section will be very small.
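To determine the fraction of the sphere taken up by the white section requires a bit more careful analysis. We will need to evaluate the expression
\[ \lim_{N \to \infty} 2\,\frac{\mathrm{vol}(S^{N-2})}{\mathrm{vol}(S^{N-1})} \int_{\arcsin\sqrt{\frac{N-1}{N}}}^{\frac{\pi}{2}} \sin^{N-2}\theta \, d\theta \tag{4.8} \]
The volume of the sphere in $d$ dimensions is
\[ \mathrm{vol}(S^d) = \frac{2\pi^{\frac{d+1}{2}}}{\Gamma\!\left(\frac{d+1}{2}\right)} \tag{4.9} \]
so that
\[ \frac{\mathrm{vol}(S^{N-2})}{\mathrm{vol}(S^{N-1})} = \frac{1}{\sqrt{\pi}} \frac{\Gamma\!\left(\frac{N}{2}\right)}{\Gamma\!\left(\frac{N-1}{2}\right)} \tag{4.10} \]
Equation (4.8) can then be written as
\[ \lim_{N \to \infty} 2\,\frac{1}{\sqrt{\pi}} \frac{\Gamma\!\left(\frac{N}{2}\right)}{\Gamma\!\left(\frac{N-1}{2}\right)} \int_{\arcsin\sqrt{\frac{N-1}{N}}}^{\frac{\pi}{2}} \sin^{N-2}\theta \, d\theta \]
We will treat the Gamma function part and the integral part of the expression separately, starting out by looking at the fraction
\[ \frac{\Gamma\!\left(\frac{N}{2}\right)}{\Gamma\!\left(\frac{N-1}{2}\right)} \]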
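In the limit of large $N$ we can apply Stirling's approximation to the Gamma function
\[ \Gamma(z) = \sqrt{2\pi}\, e^{-z} z^{z - \frac{1}{2}} \left(1 + \frac{1}{12z} + O\!\left(\frac{1}{z^2}\right)\right), \quad |\arg z| < \pi, \quad |z| \to \infty \tag{4.12} \]
and consider the expression
\[ \frac{\sqrt{2\pi}\, e^{-\frac{N}{2}} \left(\frac{N}{2}\right)^{\frac{N}{2} - \frac{1}{2}} \left(1 + \frac{2}{12N} + O\!\left(\frac{1}{N^2}\right)\right)}{\sqrt{2\pi}\, e^{-\frac{N-1}{2}} \left(\frac{N-1}{2}\right)^{\frac{N-1}{2} - \frac{1}{2}} \left(1 + \frac{2}{12(N-1)} + O\!\left(\frac{1}{(N-1)^2}\right)\right)} \to \sqrt{\frac{N}{2}} \left(1 - \frac{1}{N}\right), \quad N \to \infty \tag{4.13} \]
where we have used that
\[ \left(\frac{N-1}{2}\right)^{\frac{N}{2} - 1} = \left(\frac{N}{2}\right)^{\frac{N}{2} - 1} \left(1 - \frac{1}{N}\right)^{\frac{N}{2} - 1} = \left(\frac{N}{2}\right)^{\frac{N}{2} - 1} \left(\left(1 - \frac{1}{N}\right)^{N}\right)^{\frac{1}{2}} \frac{1}{\left(1 - \frac{1}{N}\right)} \to \left(\frac{N}{2}\right)^{\frac{N}{2} - 1} \frac{1}{\sqrt{e}} \frac{1}{\left(1 - \frac{1}{N}\right)}, \quad N \to \infty \tag{4.14} \]
and
\[ \frac{1 + \frac{2}{12N} + O\!\left(\frac{1}{N^2}\right)}{1 + \frac{2}{12(N-1)} + O\!\left(\frac{1}{(N-1)^2}\right)} \to 1, \quad N \to \infty \tag{4.15} \]
Thus, we can conclude that for large $N$
\[ \frac{\Gamma\!\left(\frac{N}{2}\right)}{\Gamma\!\left(\frac{N-1}{2}\right)} \simeq \sqrt{\frac{N}{2}} \]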
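The large-$N$ behaviour of the Gamma function ratio can be illustrated numerically. This is a minimal sketch using the log-Gamma function from Python's standard library to avoid overflow; `gamma_ratio` is a name of our own choosing.

```python
import math

def gamma_ratio(N):
    """Gamma(N/2) / Gamma((N-1)/2), evaluated through log-Gamma to avoid overflow."""
    return math.exp(math.lgamma(N / 2) - math.lgamma((N - 1) / 2))

# Compare with the leading behaviour sqrt(N/2) for increasing N.
for N in (10, 100, 1000, 10000):
    print(N, gamma_ratio(N) / math.sqrt(N / 2))
```

The printed quotient approaches 1 as $N$ grows, in line with the estimate above.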
Next, let us take a look at the behaviour of the integral
\[ \int_{\arcsin\sqrt{\frac{N-1}{N}}}^{\frac{\pi}{2}} \sin^{N-2}\theta \, d\theta \]
in the limit of large $N$.
It is not hard to convince oneself that this is equivalent to looking at the integral
\[ \int_{0}^{\arccos\sqrt{\frac{N-1}{N}}} \cos^{N-2}\theta \, d\theta \]
which simplifies the calculations because all expansions can be done around zero.
In the limit of large $N$ we can use
\[ \arccos\sqrt{\frac{N-1}{N}} = \frac{1}{\sqrt{N}} + O\!\left(\frac{1}{N^{\frac{3}{2}}}\right) \tag{4.17} \]
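Expanding $\cos t$ around $t = 0$, we get
\[ \cos t = 1 - \frac{t^2}{2!} + O(t^4) \tag{4.18} \]
so that, using the regular binomial expansion and the fact that when $N$ is large $N - 2$ can be approximated with $N$,
\[ \lim_{N \to \infty} \cos^{N-2}\theta = \left(1 - \frac{\theta^2}{2}\right)^{N} + h(\theta, N) = h(\theta, N) + 1 - N\frac{\theta^2}{2} + \frac{N^2}{2!}\frac{\theta^4}{4} - \frac{N^3}{3!}\frac{\theta^6}{8} + \ldots \tag{4.19} \]
where $h(\theta, N)$ is a function such that
\[ \lim_{N \to \infty} \sqrt{N} \int_{0}^{\arccos\sqrt{\frac{N-1}{N}}} h(\theta, N) \, d\theta = 0 \]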
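Integrating term by term and using equations (4.18) and (4.17), we get
\[ \lim_{N \to \infty} \int_{0}^{\arccos\sqrt{\frac{N-1}{N}}} \cos^{N-2}\theta \, d\theta = \left[\theta - \frac{N}{6}\theta^3 + \frac{N^2}{40}\theta^5 - \frac{N^3}{336}\theta^7 + \ldots\right]_{\theta = 0}^{\theta = \arccos\sqrt{\frac{N-1}{N}}} = \frac{1}{\sqrt{N}} \sum_{k=0}^{\infty} \frac{1}{2^k} \frac{1}{k!} \frac{(-1)^k}{(2k + 1)} \tag{4.21} \]
The sum in (4.21) is in fact equal to
\[ \sqrt{\frac{\pi}{2}}\, \mathrm{erf}\!\left(\frac{1}{\sqrt{2}}\right) \]
with erf the statistic-probabilistic error function;
\[ \mathrm{erf}(z) \equiv \frac{2}{\sqrt{\pi}} \int_{0}^{z} e^{-t^2} \, dt \tag{4.22} \]
Putting all of this together, we have the result
\[ \lim_{N \to \infty} 2\,\frac{\mathrm{vol}(S^{N-2})}{\mathrm{vol}(S^{N-1})} \int_{\arcsin\sqrt{\frac{N-1}{N}}}^{\frac{\pi}{2}} \sin^{N-2}\theta \, d\theta = \mathrm{erf}\!\left(\frac{1}{\sqrt{2}}\right) \approx 0.68 \tag{4.23} \]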
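The identification of the series in (4.21) with the error function, and the resulting limit, can be checked with a few lines of code. The sketch below assumes the prefactors combine as $2 \cdot \frac{1}{\sqrt{\pi}} \cdot \sqrt{\frac{N}{2}} \cdot \frac{1}{\sqrt{N}} = \sqrt{\frac{2}{\pi}}$; the function name `series` is ours.

```python
import math

def series(terms=30):
    """Partial sum of sum_k (-1)^k / (2^k k! (2k+1)), the series appearing in (4.21)."""
    return sum((-1) ** k / (2 ** k * math.factorial(k) * (2 * k + 1))
               for k in range(terms))

closed_form = math.sqrt(math.pi / 2) * math.erf(1 / math.sqrt(2))

# Combine with the Gamma function part: 2/sqrt(pi) * sqrt(N/2) * 1/sqrt(N) = sqrt(2/pi).
limit_fraction = math.sqrt(2 / math.pi) * series()

print(series(), closed_form, limit_fraction)
```

The partial sum agrees with $\sqrt{\pi/2}\,\mathrm{erf}(1/\sqrt{2})$ to machine precision, and the limiting white fraction comes out as roughly 0.6827.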
So, approaching the limit of an infinite number of dimensions of the Hilbert space $\mathcal{H}$ in which our projective measurements are conducted, binary probabilities (corresponding to well-defined, non-contextual properties of the system with available states in $\mathcal{H}$) can be assigned to approximately 68% of the vectors in $\mathcal{H}$. The behaviour of the percentage as a function of dimension³ is given in figure 4.3.
The minimum occurs around 12.465, and the integer giving the least percentage is $N = 12$; about 66.76%.
What has been derived above is a lower limit for the area of the sphere that is KS colourable in arbitrary dimensions. The possibility remains, however, that a maximally effective colouring could cover a much larger area, possibly, in fact, as much as 99%⁴ of the sphere in $\mathbb{R}^3$ [8].
Figure 4.3: Percentage of the sphere in $N$ dimensions that is colourable using the above method, as a function of $N$.
³ Treated in the diagram as a continuous variable.
⁴ There are some measure theoretical subtleties that can be used to colour even more of the sphere, in fact almost all of it, in a very special sense of the word. For more on this, see for example [11].
4.3 KS coloured bases
The physically relevant question, however, is arguably not how large a fraction of all states can be assigned probabilities 1 or 0, but rather what percentage of all complete orthogonal bases, corresponding to measurements, can have all their basis vectors assigned binary probabilities in a consistent way. A geometrical consideration (that can likely be further generalized, albeit not without some effort) allows us to answer this question in three and four dimensions.
Let us first consider the colouring of the two-sphere proposed above, a black cap and a white equatorial belt covering in total 87% of the sphere, and make use of the regular measure on $\mathbb{R}^3$ to compare the number of properly coloured bases consisting of vectors from these sections with the total number of ordered orthonormal triples in $\mathbb{R}^3$.
In a properly coloured basis exactly one vector has to be black, so one of the three vectors in an orthogonal triple has to be chosen to lie on one of the black caps. The remaining two orthogonal vectors can then be chosen from a great circle orthogonal to the first vector. The question is how large a fraction of this great circle will lie within the white section, and also how the second vector (which, of course, completely determines the third basis vector up to a sign) can be chosen so that the third vector will also be contained within the white section.
Figure 4.4 depicts the plane of the great circle orthogonal to the first (black) vector on which the remaining two vectors in the orthogonal triple will have to lie. The circle segment bounding the striped area is the cut between the white belt and the orthogonal great circle. For any choice of second vector from this section, the third vector will be fully determined (up to a sign). Hence, we cannot choose our second vector in a satisfactorily coloured triple from any part of the circle-belt overlap in figure 4.4, but only from the sectors that will result in the third vector lying in the white belt as well. Given a second vector, the third is obtained by rotation in the great circle plane by an angle of $\frac{\pi}{2}$. The allowed choices for second vector are then the points such that the points corresponding to a $\frac{\pi}{2}$ rotation of these points are also white. This set of points is just the overlap between the white (striped) sector in figure 4.4 and the same sector rotated by $\frac{\pi}{2}$, as illustrated in figure 4.5, an overlap that can be shown to always be non-empty. Hence, what we will need to find is the total angle taken up by the striped section in figure 4.5; this will be denoted by $\beta$.
It is clear that this can be expressed in terms of the $\alpha$ of figure 4.4 as
\[ \beta = 4\alpha - 2\pi \tag{4.24} \]
Figure 4.4: A cut through the plane of the great circle orthogonal to the vector chosen to lie on the black cap.
The angle $\alpha$, in turn, can be expressed in terms of the regular polar angle $\theta$ that specifies our choice of black vector using the following procedure.
First, consider the plane spanned by the vector chosen to lie in the black section, call it $z'$, and a vector $y'$ in the plane orthogonal to $z'$; $x, y, z$ is a reference coordinate system as shown. The vector $x'$ orthogonal to $y'$ and $z'$ is chosen so that its $z$ component equals zero. From figure 4.6 it is clear that
\[ z = 0\,x' + \sin\theta\, y' + \cos\theta\, z' \tag{4.25} \]
Meanwhile, as can (hopefully) be seen from figure 4.7, a vector $v$ lying just on the boundary of the white belt can be expressed in terms of $y'$ and $x'$ as
\[ v = \cos\gamma\, x' + \sin\gamma\, y' \tag{4.26} \]
with $\gamma = \frac{\alpha}{2}$, the $z$ component of $x'$ being equal to zero. We also know that the $z$ component of our vector $v$ is just $h$, with $h = \frac{1}{\sqrt{3}}$ according to our earlier deliberations. Taken together, this gives
\[ v \cdot z = h = (\cos\gamma\, x' + \sin\gamma\, y') \cdot z = \sin\gamma\; y' \cdot z = \sin\gamma \sin\theta \tag{4.27} \]
so that
\[ \alpha = 2\arcsin\frac{h}{\sin\theta} \tag{4.28} \]
and
\[ \beta = 8\arcsin\frac{h}{\sin\theta} - 2\pi \tag{4.29} \]
When $\theta < \arcsin\frac{1}{\sqrt{3}}$ expression (4.28) for $\alpha$ will not be defined; for those angles all of the vectors orthogonal to the black section vector defined by the angle $\theta$ will lie within the white section.
Figure 4.5: Overlap between the white section of the great circle, and its rotation by $\frac{\pi}{2}$.
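The overlap angle can be cross-checked by brute force. The following sketch assumes that a point at angle $g$ on the orthogonal great circle has $z$ component $\sin g \sin\theta$, as in (4.27), and that its partner rotated by $\frac{\pi}{2}$ has $z$ component $\cos g \sin\theta$; the sampling approach and function names are ours.

```python
import math

H = 1 / math.sqrt(3)  # half-height h of the white belt

def overlap_angle_numeric(theta, samples=200_000):
    """Total angle on the great circle where a vector and its 90-degree rotation are both white."""
    s = math.sin(theta)
    step = 2 * math.pi / samples
    count = 0
    for i in range(samples):
        g = i * step
        # z components of the second vector and of the third (rotated) vector.
        if abs(math.sin(g) * s) <= H and abs(math.cos(g) * s) <= H:
            count += 1
    return count * step

def overlap_angle_formula(theta):
    """Closed form 8 arcsin(h / sin(theta)) - 2 pi, valid for arcsin(h) <= theta <= pi/4."""
    return 8 * math.asin(H / math.sin(theta)) - 2 * math.pi

for theta in (0.7, math.pi / 4):
    print(theta, overlap_angle_numeric(theta), overlap_angle_formula(theta))
```

For polar angles between $\arcsin\frac{1}{\sqrt{3}}$ and $\frac{\pi}{4}$ the sampled overlap agrees with the closed form to within the resolution of the grid.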
This enables us to express the fraction of the orthogonal great circle corresponding to every choice of vector $z'$ in terms of the angle $\theta$, making possible integration over all values of $\theta$ and thereby the comparison we have in mind.
So, the integrals we will want to evaluate are
\[ I = 2\pi \int_{0}^{\arcsin\frac{1}{\sqrt{3}}} \sin\theta \, d\theta + \int_{\arcsin\frac{1}{\sqrt{3}}}^{\frac{\pi}{4}} \left(8\arcsin\frac{h}{\sin\theta} - 2\pi\right) \sin\theta \, d\theta \tag{4.30} \]
the value of which turns out to be 1.4572.
This, multiplied by a combinatorial factor of three because what we considered the first vector could as well have been the second or third, should be compared to the value of the integral
\[ 2\pi \int_{0}^{\frac{\pi}{2}} \sin\theta \, d\theta = 2\pi \tag{4.31} \]
The result is that approximately 69% of all possible ordered bases in $\mathbb{R}^3$ can be satisfactorily KS-coloured using the given construction.
Figure 4.6: The vector $y'$ and the angle it makes with the $z$ axis.
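The value of the integral can be reproduced numerically. This is a minimal sketch, assuming $h = \frac{1}{\sqrt{3}}$ and the integration limits $\arcsin\frac{1}{\sqrt{3}}$ and $\frac{\pi}{4}$; the Simpson integrator and all names are ours and are not taken from the original calculation.

```python
import math

H = 1 / math.sqrt(3)      # half-height h of the white belt
THETA_0 = math.asin(H)    # below this polar angle the whole great circle is white

def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of sub-intervals."""
    step = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * step)
    return total * step / 3

def allowed_angle(theta):
    """Angle available to the second vector, 8 arcsin(h / sin(theta)) - 2 pi."""
    # The clamp protects asin against rounding just above 1 at the lower limit.
    ratio = min(1.0, H / math.sin(theta))
    return 8 * math.asin(ratio) - 2 * math.pi

first_term = 2 * math.pi * (1 - math.cos(THETA_0))
second_term = simpson(lambda t: allowed_angle(t) * math.sin(t), THETA_0, math.pi / 4)
I = first_term + second_term

# Combinatorial factor of three, compared with the normalisation 2 pi.
fraction = 3 * I / (2 * math.pi)
print(I, fraction)
```

The integral evaluates to about 1.4572, and the resulting fraction of ordered bases lies between 0.69 and 0.70.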
The above considerations for the three-dimensional case can with some modifications be applied also in four dimensions. In this case, the white equatorial belt will be a three-dimensional object, and the orthogonal great circle will have turned into a two-sphere.
Introducing spherical coordinates $\varphi, \theta_1, \theta_2$ on the three-sphere, we will start out by finding the intersectional area of the orthogonal two-sphere and the white belt. Let $z'$ denote the black vector, let $z$ be a reference coordinate, and let $y'$ be a vector on the orthogonal two-sphere as in figure 4.8.
Figure 4.7: Coordinates $x'$ and $y'$ are introduced in the plane of the great circle orthogonal to the vector $z'$.
The white section is the set of vectors
\[ \{u : |u \cdot z| \leq A\}, \quad A = \frac{1}{2} \tag{4.32} \]
For any vector $u$ in this set that also lies on the orthogonal two-sphere we have that
\[ u \cdot z' = 0 \tag{4.33} \]
Now, let us make the ansatz
\[ y' = az + bz' \tag{4.34} \]
Normalization together with the requirement that $y'$, too, satisfies condition (4.33) then gives
\[ a^2 + b^2 + 2ab\, z' \cdot z = 1 \quad \text{and} \quad a\, z' \cdot z + b = 0 \tag{4.35} \]
which, with $z' \cdot z = \cos\theta_2$, combines to
\[ a = \frac{1}{\sin\theta_2}, \quad b = -\frac{\cos\theta_2}{\sin\theta_2} \tag{4.36} \]
Also,
\[ u \cdot z' = 0 \;\Rightarrow\; u \cdot y' = u \cdot (az + bz') = a\, u \cdot z \tag{4.37} \]
So, using (4.32), the belt on the orthogonal two-sphere will be the set of vectors
\[ \{v : |v \cdot y'| \leq B\}, \quad B = aA = \frac{1}{2\sin\theta_2} \tag{4.38} \]
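The coefficients of the ansatz can be checked on a concrete example. The sketch below assumes an explicit coordinate choice of our own: the reference axis $z$ is the fourth basis vector of $\mathbb{R}^4$, and the black vector $z'$ lies at an arbitrary polar angle $\theta_2 = 0.7$ from it.

```python
import math

theta2 = 0.7                       # arbitrary polar angle of the black vector z'
s, c = math.sin(theta2), math.cos(theta2)

z = (0.0, 0.0, 0.0, 1.0)           # reference axis (assumed coordinate choice)
z_prime = (0.0, 0.0, s, c)         # black vector, at angle theta2 from z

a, b = 1 / s, -c / s               # coefficients from (4.36)
y_prime = tuple(a * zi + b * zpi for zi, zpi in zip(z, z_prime))

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

# A vector orthogonal to z' (not normalised; only the linear relation (4.37) is tested).
u = (0.3, -0.5, c, -s)

print(dot(y_prime, y_prime), dot(y_prime, z_prime))
print(dot(u, y_prime), a * dot(u, z))
```

The vector $y'$ comes out normalized and orthogonal to $z'$, and for $u$ orthogonal to $z'$ the two sides of (4.37) coincide.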
Figure 4.8: $y'$ lies on the two-sphere orthogonal to the vector $z'$.
For $0 \leq \theta_2 \leq \arcsin\frac{1}{2}$ the entire orthogonal two-sphere will lie within the white section.
This intersection between the orthogonal two-sphere and the white section on the three-sphere can now be treated in analogy with the previous case. Given a black first vector, when placing the second vector in the white section, the segment of the great circle orthogonal to this second vector on which we can choose the third in order for the fourth to lie in the white section is given by
\[ \beta = 8\arcsin\frac{B}{\sin\theta_1} - 2\pi \tag{4.39} \]
Also in analogy with the previous case, all of the orthogonal great circle will be white for $\arccos B \leq \theta_1 \leq \arcsin B$. To summarize, we have integration over the angle $\theta_2$ which runs between 0 and $\frac{\pi}{4}$, covering the black cap, and the possibilities available for choosing the remaining three vectors are governed by a function of $\theta_2$, obtained from an integration over the angle $\theta_1$ between $\arccos B$ and $\frac{\pi}{2}$, that is, over the white section of the two-sphere orthogonal to the first vector specified by $\theta_2$, $B$ being a function of $\theta_2$.
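To make all of this explicit, we have the following integrals
\[ I = 2\pi \int_{\arccos B}^{\arcsin B} \sin\theta_1 \, d\theta_1 + \int_{\arcsin B}^{\frac{\pi}{2}} \left(8\arcsin\frac{B}{\sin\theta_1} - 2\pi\right) \sin\theta_1 \, d\theta_1 \tag{4.40} \]
and, finally
\[ 4\pi \int_{0}^{\arcsin\frac{1}{2}} \sin^2\theta_2 \, d\theta_2 + \int_{\arcsin\frac{1}{2}}^{\frac{\pi}{4}} I \sin^2\theta_2 \, d\theta_2 \tag{4.41} \]
The result when comparing this, multiplied by an overall combinatorial factor of four, to the value of the expression
\[ 4\pi \int_{0}^{\frac{\pi}{2}} \sin^2\theta \, d\theta \tag{4.42} \]
is that 32% of the ordered orthogonal bases in $\mathbb{R}^4$ are properly coloured using the chosen method.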
4.4 KS for a restricted class of sic-POVMs
As mentioned in the introduction to this chapter, a lot of people have been working on the problem of proving the KS theorem by finite means. This quest has focused mainly on PVMs, that is, orthogonal resolutions of the identity, but some work has also been done for POVMs of different types. Cabello, for instance, has shown the KS theorem for a single qubit using eight element POVMs [12] by inscribing cubes in dodecahedrons. Masahiro Nakamura has done the same using POVMs with four elements. As noted earlier, the KS result follows trivially if one simultaneously allows POVMs with different numbers of elements.
It should be noted that the lack of a Gleason type theorem for sic-POVMs in the qubit case opens up the possibility that we do not have the restriction of the KS result either. It gives no further information, though, as to whether this is actually the case. It is therefore worthwhile to investigate some more obvious sets of four-element POVMs to see if they can be KS coloured.
Here, we will make use of the dodecahedron method proposed by Cabello. For the restricted class of regular tetrahedrons that can be inscribed in a regular dodecahedron so that the vertices of the two configurations coincide, it is possible to prove by explicit construction that the KS result is not applicable. The reader can convince herself that an inscribed regular tetrahedron will have corners lying on four mutually next-to-next-to-adjacent vertices. For a fixed dodecahedron, there are 10 possible ways to inscribe a regular tetrahedron.
A satisfactory Kochen-Specker colouring is then given, as in figure 4.9, by
\[ \begin{array}{cccc@{\qquad}cccc}
\mathbf{A} & I & M & Q & \mathbf{F} & C & P & S \\
\mathbf{A} & H & P & R & \mathbf{J} & C & L & T \\
\mathbf{E} & G & N & Q & \mathbf{J} & B & R & N \\
\mathbf{E} & H & L & U & \mathbf{K} & D & G & T \\
\mathbf{F} & D & M & U & \mathbf{K} & B & I & S
\end{array} \]
where boldface characters imply that the corresponding vertex is coloured black.
Figure 4.9: Colouring of the vertices of a regular dodecahedron, corresponding to a negation of the KS-theorem for the class of regular tetrahedrons that can be inscribed.
Consequently, the KS result is not valid for this restricted class of symmetric informationally complete POVMs, and whether or not we have the KS result for all sic-POVMs is still an open question.
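That a colouring of this kind is consistent can be checked mechanically. The sketch below takes the ten vertex quadruples listed above and assumes that the black vertices are A, E, F, J and K, i.e. the first label of each quadruple; this choice is our assumption, made because it yields exactly one black vertex per tetrahedron.

```python
# The ten inscribed tetrahedra, each written as a string of four vertex labels.
tetrahedra = ["AIMQ", "FCPS", "AHPR", "JCLT", "EGNQ",
              "JBRN", "EHLU", "KDGT", "FDMU", "KBIS"]

# Assumption: the black vertices are the first labels of the quadruples.
black = set("AEFJK")

vertices = sorted(set("".join(tetrahedra)))

# Number of black vertices in each tetrahedron; a valid colouring needs exactly one.
black_per_tetrahedron = [sum(v in black for v in t) for t in tetrahedra]

# Number of tetrahedra that each vertex of the dodecahedron belongs to.
appearances = {v: sum(v in t for t in tetrahedra) for v in vertices}

print(black_per_tetrahedron)
print(appearances)
```

Each of the 20 vertices belongs to two tetrahedra, and with the assumed choice of black vertices every tetrahedron contains exactly one black vertex, as required of a four-element POVM.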
Chapter 5
Conclusions and open questions
We have seen that it suffices to make the frame function assumption for POVMs with two and three elements in order to prove a Gleason type theorem for general POVMs. In the case of two-dimensional quantum systems, qubits, the sic-POVM does not yield the Gleason result, as has been shown by Caves et al. [3]. The considerations of this diploma work, however, have made it likely that for all other four element POVMs the quantum rule holds.
Generalizing a method of colouring proposed by Appleby in three dimensions [8], we have also found a lower limit for the area of the $n$-sphere that can be KS coloured, but we are still ignorant as to a sharp upper limit.
In three and four dimensions, we have calculated how many of all possible bases the coloured area corresponds to. In order to make the lower limit mentioned above more physically interesting, one would need to answer this question in arbitrary dimensions, something that appears to be non-trivial.
The Appleby colouring has the advantage that it generalizes easily to higher dimensions. As for a maximally effective colouring, there are no arguments to support that this would be the case. In particular, it is in no way obvious that the same method of colouring would be maximal in different dimensions.
Another question left unanswered is that about the existence of a finite set of vectors that can be used to prove the KS theorem for sic-POVMs; as of yet, no such set has been presented.
Bibliography
[1] Andrew M. Gleason, Measures on the closed subspaces of a Hilbert space, Journal of Mathematics and Mechanics, Vol. 6, No. 6, 885-893 (1957)
[2] N.I. Akhiezer, I.M. Glazman, Theory of Linear Operators in Hilbert Space (Ungar, New York, 1963)
[3] C.M. Caves, C.A. Fuchs, K. Manne and J.M. Renes, Gleason-type Derivations of the Quantum Probability Rule for Generalized Measurements, quant-ph/0306179; Found. Phys. 34, 193 (2004)
[4] Paul Busch, Quantum states and generalized observables: a simple proof of Gleason's theorem, quant-ph/9909073; Phys. Rev. Lett. 91, 120403 (2003)
[5] Norbert H.J. Lacroix, On Common Zeros of Legendre's Associated Functions, Mathematics of Computation, Vol. 43, No. 167, 243-245 (1984)
[6] Hans Sagan, Boundary and Eigenvalue Problems in Mathematical Physics (Chelsea, New York, 1955)
[7] S. Kochen and E.P. Specker, The Problem of Hidden Variables in Quantum Mechanics, J. Math. Mech., Vol. 17, No. 1, 59-87 (1967)
[8] D.M. Appleby, The Bell-Kochen-Specker theorem, quant-ph/0308114; Stud. Hist. Philos. Mod. Phys. 36 (2005)
[9] J.S. Bell, On the Problem of Hidden Variables in Quantum Mechanics, Rev. Mod. Phys., Vol. 38, No. 3, 447-452 (1966)
[10] A. Peres, Quantum Theory: Concepts and Methods (Kluwer Academic Publishers, Dordrecht, 1995)
[11] I. Pitowsky, Quantum Mechanics and Value Definiteness, Philos. Sci., Vol. 52, No. 1, 154-156 (1985)
[12] Adán Cabello, Kochen-Specker Theorem for a Single Qubit Using Positive Operator-Valued Measures, quant-ph/0210082; Phys. Rev. Lett. 90 (2003)