Operator product:
$$\langle u_j|\hat A\hat B|u_{j''}\rangle = \langle u_j|\hat A\,\hat I\,\hat B|u_{j''}\rangle = \sum_{j'}\langle u_j|\hat A|u_{j'}\rangle\langle u_{j'}|\hat B|u_{j''}\rangle = \sum_{j'} A_{jj'}B_{j'j''}. \qquad (4.52)$$
This result corresponds to the standard “row by column” rule of calculation of an arbitrary element of the matrix product
$$\mathrm{AB} = \begin{pmatrix} A_{11} & A_{12} & \ldots \\ A_{21} & A_{22} & \ldots \\ \ldots & \ldots & \ldots \end{pmatrix}\begin{pmatrix} B_{11} & B_{12} & \ldots \\ B_{21} & B_{22} & \ldots \\ \ldots & \ldots & \ldots \end{pmatrix}. \qquad (4.53)$$
Hence a product of operators may be represented (in a fixed basis!) by that of their matrices (in the same basis).
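To make the "row by column" rule tangible, here is a minimal NumPy sketch (mine, not part of the original text) verifying Eq. (4.52) for random matrices; the basis size N and the random seeding are arbitrary assumptions.

```python
# Numeric check of Eq. (4.52): in a fixed basis, the matrix of an operator
# product equals the product of the operators' matrices.
import numpy as np

rng = np.random.default_rng(0)
N = 4  # basis size (arbitrary)
A = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))
B = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))

# "Row by column" rule, element by element: (AB)_{jj"} = sum_{j'} A_{jj'} B_{j'j"}
AB = np.array([[sum(A[j, k] * B[k, m] for k in range(N)) for m in range(N)]
               for j in range(N)])
assert np.allclose(AB, A @ B)  # agrees with the built-in matrix product
```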
This is so convenient that the same language is often used to represent not only long brackets,
Long bracket as a matrix product:
$$\langle\alpha|\hat A|\beta\rangle = \sum_{j,j'}\alpha_j^* A_{jj'}\beta_{j'} = \begin{pmatrix}\alpha_1^*, & \alpha_2^*, & \ldots\end{pmatrix}\begin{pmatrix} A_{11} & A_{12} & \ldots \\ A_{21} & A_{22} & \ldots \\ \ldots & \ldots & \ldots\end{pmatrix}\begin{pmatrix}\beta_1 \\ \beta_2 \\ \ldots\end{pmatrix}, \qquad (4.54)$$
but even short brackets:
Short bracket as a matrix product:
$$\langle\alpha|\beta\rangle = \sum_j \alpha_j^*\beta_j = \begin{pmatrix}\alpha_1^*, & \alpha_2^*, & \ldots\end{pmatrix}\begin{pmatrix}\beta_1 \\ \beta_2 \\ \ldots\end{pmatrix}, \qquad (4.55)$$
although these equalities require the use of non-square matrices: rows of (complex-conjugate!) expansion coefficients for the representation of bra-vectors, and columns of these coefficients for the representation of ket-vectors. With that, the mapping of quantum states and operators on matrices becomes completely general.
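As a concrete illustration of this mapping (a sketch under my own conventions, not the author's), Eqs. (4.54)-(4.55) become ordinary row-matrix-column products in NumPy:

```python
# Brackets as matrix products: a bra is a row of conjugated coefficients,
# a ket is a column, per Eqs. (4.54)-(4.55).
import numpy as np

rng = np.random.default_rng(1)
N = 3
alpha = rng.normal(size=N) + 1j * rng.normal(size=N)  # coefficients of |alpha>
beta = rng.normal(size=N) + 1j * rng.normal(size=N)   # coefficients of |beta>
A = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))

long_bracket = alpha.conj() @ A @ beta   # <alpha|A|beta>, Eq. (4.54)
short_bracket = alpha.conj() @ beta      # <alpha|beta>,   Eq. (4.55)
assert np.isclose(short_bracket, np.vdot(alpha, beta))  # same inner product
```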
Now let us have a look at the outer product operator (26). Its matrix elements are just
$$\bigl(|\beta\rangle\langle\alpha|\bigr)_{jj'} = \langle u_j|\beta\rangle\langle\alpha|u_{j'}\rangle = \beta_j\alpha_{j'}^*. \qquad (4.56)$$
These are the elements of a very special square matrix, whose filling requires the knowledge of just $2N$ scalars (where N is the basis size), rather than $N^2$ scalars as for an arbitrary operator. However, a simple generalization of such an outer product may represent an arbitrary operator. Indeed, let us insert two identity operators (44), with different summation indices, on both sides of an arbitrary operator:
$$\hat A = \hat I\hat A\hat I = \sum_j |u_j\rangle\langle u_j|\,\hat A\,\sum_{j'}|u_{j'}\rangle\langle u_{j'}|, \qquad (4.57)$$
and then use the associative axiom to rewrite this expression as
$$\hat A = \sum_{j,j'} |u_j\rangle\,\langle u_j|\hat A|u_{j'}\rangle\,\langle u_{j'}|. \qquad (4.58)$$
But the expression in the middle long bracket is just the matrix element (47), so that we may write

Operator via its matrix elements:
$$\hat A = \sum_{j,j'} A_{jj'}\,|u_j\rangle\langle u_{j'}|. \qquad (4.59)$$
The reader has to agree that this formula, which is a natural generalization of Eq. (44), is extremely elegant.
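A short numeric sketch (assumptions mine) makes Eq. (4.59) explicit: with the basis kets taken as unit columns, each term $A_{jj'}|u_j\rangle\langle u_{j'}|$ is an outer product, and the sum over all of them reassembles the full matrix.

```python
# Rebuild an operator from its matrix elements via outer products, Eq. (4.59).
import numpy as np

N = 3
A = np.arange(1, N * N + 1).reshape(N, N) * (1 + 1j)  # any "matrix elements"
e = np.eye(N, dtype=complex)                          # basis kets |u_j>

A_rebuilt = sum(A[j, k] * np.outer(e[j], e[k].conj())
                for j in range(N) for k in range(N))
assert np.allclose(A_rebuilt, A)
```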
The matrix representation is so convenient that it makes sense to extend it to one level lower –
from state vector products to the “bare” state vectors resulting from the operator’s action upon a given state. For example, let us use Eq. (59) to represent the ket-vector (18) as
$$|\alpha'\rangle = \hat A|\alpha\rangle = \sum_{j,j'} A_{jj'}\,|u_j\rangle\langle u_{j'}|\alpha\rangle = \sum_{j,j'} A_{jj'}\langle u_{j'}|\alpha\rangle\,|u_j\rangle. \qquad (4.60)$$
According to Eq. (40), the last short bracket is just $\alpha_{j'}$, so that
$$|\alpha'\rangle = \sum_{j,j'} A_{jj'}\alpha_{j'}|u_j\rangle = \sum_j\Bigl(\sum_{j'} A_{jj'}\alpha_{j'}\Bigr)|u_j\rangle. \qquad (4.61)$$
But the expression in the parentheses is just the coefficient $\alpha'_j$ of the expansion (37) of the resulting ket-vector (60) in the same basis, so that
$$\alpha'_j = \sum_{j'} A_{jj'}\alpha_{j'}. \qquad (4.62)$$
This result corresponds to the usual rule of multiplication of a matrix by a column, so that we may represent any ket-vector by its column matrix, with the operator’s action looking like
$$\begin{pmatrix}\alpha'_1 \\ \alpha'_2 \\ \ldots\end{pmatrix} = \begin{pmatrix} A_{11} & A_{12} & \ldots \\ A_{21} & A_{22} & \ldots \\ \ldots & \ldots & \ldots\end{pmatrix}\begin{pmatrix}\alpha_1 \\ \alpha_2 \\ \ldots\end{pmatrix}. \qquad (4.63)$$
Absolutely similarly, the operator action on the bra-vector (21), represented by its row-matrix, is
$$\begin{pmatrix}\alpha_1'^*, & \alpha_2'^*, & \ldots\end{pmatrix} = \begin{pmatrix}\alpha_1^*, & \alpha_2^*, & \ldots\end{pmatrix}\begin{pmatrix} A^\dagger_{11} & A^\dagger_{12} & \ldots \\ A^\dagger_{21} & A^\dagger_{22} & \ldots \\ \ldots & \ldots & \ldots\end{pmatrix}. \qquad (4.64)$$
By the way, Eq. (64) naturally raises the following question: what are the elements of the matrix on its right-hand side, or more exactly, what is the relation between the matrix elements of an operator and its Hermitian conjugate? The simplest way to answer it is to use Eq. (25) with two arbitrary states (say, $u_j$ and $u_{j'}$) of the same basis in the role of $\alpha$ and $\beta$. Together with the orthonormality relation (38), this immediately gives¹³
Hermitian conjugate: matrix elements:
$$A^\dagger_{jj'} = A^*_{j'j}. \qquad (4.65)$$
Thus, the matrix of the Hermitian-conjugate operator is the complex conjugated and transposed matrix of the initial operator. This result exposes very clearly the difference between the Hermitian and the complex conjugation. It also shows that for the Hermitian operators, defined by Eq. (22),
$$A_{jj'} = A^*_{j'j}, \qquad (4.66)$$
i.e. any pair of their matrix elements, symmetric with respect to the main diagonal, should be the complex conjugate of each other. As a corollary, their main-diagonal elements have to be real:
$$A_{jj} = A^*_{jj}, \quad\text{i.e.}\quad \operatorname{Im}A_{jj} = 0. \qquad (4.67)$$
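These relations are easy to check numerically; the following sketch (mine) uses NumPy's conjugate transpose for Eq. (4.65) and a symmetrized matrix for Eqs. (4.66)-(4.67).

```python
# Hermitian conjugation as conjugate transposition, Eqs. (4.65)-(4.67).
import numpy as np

A = np.array([[1 + 0j, 2 - 1j],
              [3j, 4 + 5j]])
A_dag = A.conj().T                        # (A†)_{jj'} = A*_{j'j}, Eq. (4.65)

H = (A + A_dag) / 2                       # a Hermitian matrix built from A
assert np.allclose(H, H.conj().T)         # Eq. (4.66)
assert np.allclose(H.diagonal().imag, 0)  # Eq. (4.67): diagonal is real
```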
¹³ For the sake of formula compactness, below I will use the shorthand notation in which the operands of this equality are just $A^\dagger_{jj'}$ and $A^*_{j'j}$. I believe that it leaves little chance for confusion, because the Hermitian conjugation sign † may pertain only to an operator (or its matrix), while the complex conjugation sign *, to a scalar – say a matrix element.
In order to fully appreciate the special role played by Hermitian operators in quantum theory, let us introduce the key notions of eigenstates $a_j$ (described by their eigenvectors $|a_j\rangle$ and $\langle a_j|$) and eigenvalues (c-numbers) $A_j$ of an operator $\hat A$, both defined by the equation they have to satisfy:¹⁴
Operator: eigenstates and eigenvalues:
$$\hat A|a_j\rangle = A_j|a_j\rangle. \qquad (4.68)$$
Let us prove that eigenvalues of any Hermitian operator are real,¹⁵
Hermitian operator: eigenvalues:
$$A_j = A_j^*, \quad\text{for } j = 1, 2, \ldots, N, \qquad (4.69)$$
while the eigenstates corresponding to different eigenvalues are orthogonal:
Hermitian operator: eigenvectors:
$$\langle a_j|a_{j'}\rangle = 0, \quad\text{if } A_j \neq A_{j'}. \qquad (4.70)$$
The proof of both statements is surprisingly simple. Let us inner-multiply both sides of Eq. (68) by the bra-vector $\langle a_{j'}|$. On the right-hand side of the result, the eigenvalue $A_j$, as a c-number, may be taken out of the bracket, giving
$$\langle a_{j'}|\hat A|a_j\rangle = A_j\langle a_{j'}|a_j\rangle. \qquad (4.71)$$
This equality has to hold for any pair of eigenstates, so that we may swap the indices in Eq. (71), and write the complex-conjugate of the result:
$$\langle a_j|\hat A|a_{j'}\rangle^* = A_{j'}^*\langle a_j|a_{j'}\rangle^*. \qquad (4.72)$$
Now using Eqs. (14) and (25), together with the Hermitian operator’s definition (22), we may transform Eq. (72) into the following form:
$$\langle a_{j'}|\hat A|a_j\rangle = A_{j'}^*\langle a_{j'}|a_j\rangle. \qquad (4.73)$$
Subtracting this equation from Eq. (71), we get
$$0 = \bigl(A_j - A_{j'}^*\bigr)\langle a_{j'}|a_j\rangle. \qquad (4.74)$$
There are two possibilities to satisfy this relation. If the indices j and j' are equal (denote the same eigenstate), then the bracket is the state's norm squared, and cannot be equal to zero; in this case, the parentheses (with j = j') have to be zero, proving Eq. (69). On the other hand, if j and j' correspond to different eigenvalues of A, the parentheses cannot equal zero (we have just proved that all $A_j$ are real!), and hence the state vectors indexed by j and j' should be orthogonal, i.e. Eq. (70) is valid.
As will be discussed below, these properties make Hermitian operators suitable, in particular, for the description of physical observables.
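Both properties are easy to see numerically; here is a sketch (mine, with an arbitrary random Hermitian matrix) using a generic eigensolver, so that neither the realness of the eigenvalues nor the orthogonality of the eigenvectors is built in.

```python
# Real eigenvalues and orthogonal eigenvectors of a Hermitian matrix,
# Eqs. (4.69)-(4.70).
import numpy as np

rng = np.random.default_rng(2)
M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
H = M + M.conj().T                      # make the matrix Hermitian

evals, evecs = np.linalg.eig(H)         # generic solver, Hermiticity not used
assert np.allclose(evals.imag, 0)       # Eq. (4.69): all A_j are real
gram = evecs.conj().T @ evecs           # inner products <a_j|a_j'>
assert np.allclose(gram, np.eye(4), atol=1e-10)  # Eq. (4.70)
```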
¹⁴ This equation should look familiar to the reader – see the stationary Schrödinger equation (1.60), which was the focus of our studies in the first three chapters. We will see soon that that equation is just a particular (coordinate) representation of Eq. (68) for the Hamiltonian as the operator of energy.
¹⁵ The reciprocal statement is also true: if all eigenvalues of an operator are real, it is Hermitian (in any basis). This statement may be readily proved by applying Eq. (93) below to the case when $A_{kk'} = A_k\delta_{kk'}$, with $A_k^* = A_k$.
4.4. Change of basis, and matrix diagonalization
From the discussion of the last section, it may look like the matrix language is fully similar to, and in many instances more convenient than, the general bra-ket formalism. In particular, Eqs. (54)-(55) and (63)-(64) show that any part of any bra-ket expression may be directly mapped on the similar matrix expression, with the only slight inconvenience of using not only columns but also rows (with their elements complex-conjugated), for state vector representation. This invites the question: why do we need the bra-ket language at all? The answer is that the elements of the matrices depend on the particular choice of the basis set, very much like the Cartesian components of a usual geometric vector depend on the particular choice of reference frame orientation (Fig. 4), and very frequently, at problem solution, it is convenient to use two or more different basis sets for the same system. (Just a bit more patience – numerous examples will follow soon.)
Fig. 4.4. The transformation of components of a 2D vector at a reference frame's rotation.
With this motivation, let us explore what happens at the transform from one basis, {u}, to another one, {v} – both full and orthonormal. First of all, let us prove that for each such pair of bases, and an arbitrary numbering of the states of each base, there exists such an operator $\hat U$ that, first,

Unitary operator: definition:
$$|v_j\rangle = \hat U|u_j\rangle, \qquad (4.75)$$

and, second,

$$\hat U\hat U^\dagger = \hat U^\dagger\hat U = \hat I. \qquad (4.76)$$
(Due to the last property,¹⁶ $\hat U$ is called a unitary operator, and Eq. (75), a unitary transformation.) A very simple proof of both statements may be achieved by construction. Indeed, let us take
Unitary operator: construction:
$$\hat U = \sum_{j'} |v_{j'}\rangle\langle u_{j'}|, \qquad (4.77)$$
– an evident generalization of Eq. (44). Then, using Eq. (38), we obtain
$$\hat U|u_j\rangle = \sum_{j'}|v_{j'}\rangle\langle u_{j'}|u_j\rangle = \sum_{j'}|v_{j'}\rangle\delta_{j'j} = |v_j\rangle, \qquad (4.78)$$
so that Eq. (75) has been proved. Now, applying Eq. (31) to each term of the sum (77), we get
Conjugate unitary operator:
$$\hat U^\dagger = \sum_{j'} |u_{j'}\rangle\langle v_{j'}|, \qquad (4.79)$$
¹⁶ An alternative way to express Eq. (76) is to write $\hat U^\dagger = \hat U^{-1}$, but I will try to avoid this language.
so that
$$\hat U\hat U^\dagger = \sum_{j,j'}|v_j\rangle\langle u_j|u_{j'}\rangle\langle v_{j'}| = \sum_{j,j'}|v_j\rangle\delta_{jj'}\langle v_{j'}| = \sum_j |v_j\rangle\langle v_j|. \qquad (4.80)$$
But according to the closure relation (44), the last expression is just the identity operator, so that one of Eqs. (76) has been proved. (The proof of the second equality is absolutely similar.) As a by-product of our proof, we have also got another important expression – Eq. (79). It implies, in particular, that while, according to Eq. (75), the operator $\hat U$ performs the transform from the "old" basis $u_j$ to the "new" basis $v_j$, its Hermitian adjoint $\hat U^\dagger$ performs the reciprocal transform:
Reciprocal basis transform:
$$\hat U^\dagger|v_j\rangle = \sum_{j'}|u_{j'}\rangle\delta_{j'j} = |u_j\rangle. \qquad (4.81)$$
Now let us see how the matrix elements of the unitary transform operators look. Generally, as was discussed above, the operator's elements may depend on the basis we calculate them in, so let us be specific – at least initially. For example, let us calculate the desired matrix elements $U_{jj'}$ in the "old" basis {u}, using Eq. (77):
$$U_{jj'}\big|_{\text{in }u} = \langle u_j|\hat U|u_{j'}\rangle = \sum_{j''}\langle u_j|v_{j''}\rangle\langle u_{j''}|u_{j'}\rangle = \sum_{j''}\langle u_j|v_{j''}\rangle\delta_{j''j'} = \langle u_j|v_{j'}\rangle. \qquad (4.82)$$
Now performing a similar calculation in the “new” basis { v}, we get
$$U_{jj'}\big|_{\text{in }v} = \langle v_j|\hat U|v_{j'}\rangle = \sum_{j''}\langle v_j|v_{j''}\rangle\langle u_{j''}|v_{j'}\rangle = \sum_{j''}\delta_{jj''}\langle u_{j''}|v_{j'}\rangle = \langle u_j|v_{j'}\rangle. \qquad (4.83)$$
Surprisingly, the result is the same! This is of course true for the Hermitian conjugate (79) as well:
$$U^\dagger_{jj'}\big|_{\text{in }u} = U^\dagger_{jj'}\big|_{\text{in }v} = \langle v_j|u_{j'}\rangle. \qquad (4.84)$$
These expressions may be used, first of all, to rewrite Eq. (75) in a more direct form. Applying the first of Eqs. (41) to any state $v_{j'}$ of the "new" basis, and then Eq. (82), we get

Basis transforms: matrix form:
$$|v_{j'}\rangle = \sum_j |u_j\rangle\langle u_j|v_{j'}\rangle = \sum_j U_{jj'}|u_j\rangle. \qquad (4.85)$$
Similarly, the reciprocal transform is
$$|u_{j'}\rangle = \sum_j |v_j\rangle\langle v_j|u_{j'}\rangle = \sum_j U^\dagger_{jj'}|v_j\rangle. \qquad (4.86)$$
These formulas are very convenient for applications; we will use them already in this section.
Next, we may use Eqs. (83)-(84) to express the effect of the unitary transform on the expansion coefficients $\alpha_j$ of the vectors of an arbitrary state $\alpha$, defined by Eq. (37). As a reminder, in the "old" basis {u} they are given by Eqs. (40). Similarly, in the "new" basis {v},
$$\alpha_j\big|_{\text{in }v} = \langle v_j|\alpha\rangle. \qquad (4.87)$$
Again inserting the identity operator in its closure form (44) with the internal index j’, and then using Eqs. (84) and (40), we get
$$\alpha_j\big|_{\text{in }v} = \sum_{j'}\langle v_j|u_{j'}\rangle\langle u_{j'}|\alpha\rangle = \sum_{j'} U^\dagger_{jj'}\,\alpha_{j'}\big|_{\text{in }u}. \qquad (4.88)$$
The reciprocal transform is performed by matrix elements of the operator $\hat U$:
$$\alpha_j\big|_{\text{in }u} = \sum_{j'} U_{jj'}\,\alpha_{j'}\big|_{\text{in }v}. \qquad (4.89)$$
So, if the transform (75) from the "old" basis {u} to the "new" basis {v} is performed by a unitary operator, the change (88) of state vector components at this transformation requires its Hermitian conjugate. This fact is similar to the transformation of components of a usual vector at coordinate frame rotation. For example, for a 2D vector whose actual position in space is fixed (Fig. 4):
$$\begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} \cos\alpha & \sin\alpha \\ -\sin\alpha & \cos\alpha \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix}, \qquad (4.90)$$
but the reciprocal transform is performed by a different matrix, which may be obtained from that participating in Eq. (90) by the replacement $\alpha \to -\alpha$. This replacement has a clear geometric sense: if the "new" reference frame {x', y'} is obtained from the "old" frame {x, y} by a counterclockwise rotation by angle $\alpha$, the reciprocal transformation requires angle $-\alpha$. (In this analogy, the unitary property (76) of the unitary transform operators corresponds to the equality of the determinants of both rotation matrices to 1.)
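This classical analogy can be stated in one line of linear algebra; the sketch below (mine) checks that the rotation matrix of Eq. (4.90) is orthogonal with unit determinant, i.e. the classical counterpart of Eq. (4.76).

```python
# The frame-rotation matrix of Eq. (4.90): its inverse is its transpose
# (the alpha -> -alpha replacement), and its determinant equals 1.
import numpy as np

a = 0.3  # an arbitrary rotation angle
R = np.array([[np.cos(a), np.sin(a)],
              [-np.sin(a), np.cos(a)]])

assert np.allclose(R @ R.T, np.eye(2))    # classical analog of U U† = I
assert np.isclose(np.linalg.det(R), 1.0)
```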
Due to the analogy between expressions (88) and (89) on one hand, and our old friend Eq. (62) on the other hand, it is tempting to skip indices in these new results by writing
$$|\alpha\rangle\big|_{\text{in }v} = \hat U^\dagger|\alpha\rangle\big|_{\text{in }u}, \qquad |\alpha\rangle\big|_{\text{in }u} = \hat U|\alpha\rangle\big|_{\text{in }v}. \quad\text{(SYMBOLIC ONLY!)} \qquad (4.91)$$
Since the matrix elements of $\hat U$ and $\hat U^\dagger$ do not depend on the basis, such language is not too bad and is mnemonically useful. However, since in the bra-ket formalism (or at least its version presented in this course), the state vectors are basis-independent, Eq. (91) has to be treated as a symbolic one, and should not be confused with the strict Eqs. (88)-(89), and with the rigorous basis-independent vector and operator equalities discussed in Sec. 2.
Now let us use the same trick of identity operator insertion, repeated twice, to find the transformation rule for matrix elements of an arbitrary operator:
Matrix elements' transforms:
$$A_{jj'}\big|_{\text{in }v} = \langle v_j|\hat A|v_{j'}\rangle = \sum_{k,k'}\langle v_j|u_k\rangle\langle u_k|\hat A|u_{k'}\rangle\langle u_{k'}|v_{j'}\rangle = \sum_{k,k'} U^\dagger_{jk}\,A_{kk'}\big|_{\text{in }u}\,U_{k'j'}; \qquad (4.92)$$

absolutely similarly, we may also get
$$A_{jj'}\big|_{\text{in }u} = \sum_{k,k'} U_{jk}\,A_{kk'}\big|_{\text{in }v}\,U^\dagger_{k'j'}. \qquad (4.93)$$
In the spirit of Eq. (91), we may represent these results symbolically as well, in a compact form:

$$A\big|_{\text{in }v} = \hat U^\dagger\,A\big|_{\text{in }u}\,\hat U, \qquad A\big|_{\text{in }u} = \hat U\,A\big|_{\text{in }v}\,\hat U^\dagger. \quad\text{(SYMBOLIC ONLY!)} \qquad (4.94)$$
As a sanity check, let us apply Eq. (93) to the identity operator:
$$\hat I\big|_{\text{in }v} = \hat U^\dagger\,\hat I\big|_{\text{in }u}\,\hat U = \hat U^\dagger\hat U = \hat I\big|_{\text{in }u} \qquad (4.95)$$
– as it should be. One more (strict rather than symbolic) invariant of the basis change is the trace of any operator, defined as the sum of the diagonal terms of its matrix:
Operator's/matrix' trace:
$$\operatorname{Tr}\hat A \equiv \operatorname{Tr}\mathrm A = \sum_j A_{jj}. \qquad (4.96)$$
The (easy) proof of this fact, using previous relations, is left for the reader’s exercise.
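In lieu of that proof, here is a quick numeric confirmation (a sketch of mine, with a random unitary built by QR factorization): the trace survives the similarity transform of Eq. (4.92).

```python
# Invariance of the trace under a unitary change of basis, Eq. (4.96).
import numpy as np

rng = np.random.default_rng(3)
A = np.diag([1.0, 2.0, 3.0]) + 0j              # an operator's matrix
M = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
U, _ = np.linalg.qr(M)                         # QR gives a unitary Q

A_new = U.conj().T @ A @ U                     # Eq. (4.92) in matrix form
assert np.isclose(np.trace(A_new), np.trace(A))
```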
So far, I have implied that both state bases {u} and {v} are known, and the natural question is where this information comes from in quantum mechanics of actual physical systems. To get a partial answer to this question, let us return to Eq. (68), which defines the eigenstates and the eigenvalues of an operator. Let us assume that the eigenstates $a_j$ of a certain operator $\hat A$ form a full and orthonormal set, and calculate the matrix elements of the operator in the basis {a} of these states, at their arbitrary numbering. For that, it is sufficient to inner-multiply both sides of Eq. (68), written for some index j', by the bra-vector of an arbitrary state $a_j$ of the same set:

$$\langle a_j|\hat A|a_{j'}\rangle = A_{j'}\langle a_j|a_{j'}\rangle. \qquad (4.97)$$
The left-hand side of this equality is the matrix element $A_{jj'}$ we are looking for, while its right-hand side is just $A_{j'}\delta_{jj'}$. As a result, we see that the matrix is diagonal, with the diagonal consisting of the operator's eigenvalues:
Matrix elements in eigenstate basis:
$$A_{jj'} = A_j\delta_{jj'}. \qquad (4.98)$$
In particular, in the eigenstate basis (but not necessarily in an arbitrary basis!), $A_{jj}$ means the same as $A_j$. Thus the important problem of finding the eigenvalues and eigenstates of an operator is equivalent to the diagonalization of its matrix,¹⁷ i.e. finding the basis in which the operator's matrix acquires the diagonal form (98); then the diagonal elements are the eigenvalues, and the basis itself is the desirable set of eigenstates.
To see how this is done in practice, let us inner-multiply Eq. (68) by a bra-vector of the basis (say, {u}) in which we happen to know the matrix elements $A_{jj'}$:
$$\langle u_k|\hat A|a_j\rangle = A_j\langle u_k|a_j\rangle. \qquad (4.99)$$
On the left-hand side, we can (as usual :-) insert the identity operator between the operator $\hat A$ and the ket-vector, and then use the closure relation (44) in the same basis {u}, while on the right-hand side, we can move the eigenvalue $A_j$ (a c-number) out of the bracket, and then insert a summation over the same index as in the closure, compensating it with the proper Kronecker delta symbol:
$$\sum_{k'}\langle u_k|\hat A|u_{k'}\rangle\langle u_{k'}|a_j\rangle = A_j\sum_{k'}\delta_{kk'}\langle u_{k'}|a_j\rangle. \qquad (4.100)$$
Moving out the signs of summation over k', and using the definition (47) of the matrix elements, we get
$$\sum_{k'}\bigl(A_{kk'} - A_j\delta_{kk'}\bigr)\langle u_{k'}|a_j\rangle = 0. \qquad (4.101)$$

¹⁷ Note that the expression "matrix diagonalization" is a very common but dangerous jargon. (Formally, a matrix is just a matrix, an ordered set of c-numbers, and cannot be "diagonalized".) It is OK to use this jargon if you remember clearly what it actually means – see the definition above.
But the set of such equalities, for all N possible values of the index k, is just a system of linear, homogeneous equations for unknown c-numbers $\langle u_{k'}|a_j\rangle$. According to Eqs. (82)-(84), these numbers are nothing else than the matrix elements $U_{k'j}$ of a unitary matrix providing the required transformation from the initial basis {u} to the basis {a} that diagonalizes the matrix A. This system may be represented in the matrix form:
Matrix diagonalization:
$$\begin{pmatrix} A_{11} - A_j & A_{12} & \ldots \\ A_{21} & A_{22} - A_j & \ldots \\ \ldots & \ldots & \ldots \end{pmatrix}\begin{pmatrix} U_{1j} \\ U_{2j} \\ \ldots \end{pmatrix} = 0, \qquad (4.102)$$
and the condition of its consistency,
Characteristic equation for eigenvalues:
$$\begin{vmatrix} A_{11} - A_j & A_{12} & \ldots \\ A_{21} & A_{22} - A_j & \ldots \\ \ldots & \ldots & \ldots \end{vmatrix} = 0, \qquad (4.103)$$
plays the role of the characteristic equation of the system. This equation has N roots $A_j$ – the eigenvalues of the operator $\hat A$; after they have been calculated, plugging any of them back into the system (102), we can use it to find the N matrix elements $U_{kj}$ (k = 1, 2, …, N) corresponding to this particular eigenvalue. However, since the equations (102) are homogeneous, they allow finding $U_{kj}$ only to a constant multiplier. To ensure their normalization, i.e. enforce the unitary character of the matrix U, we may use the condition that all eigenvectors are normalized (just as the basis vectors are):
$$\langle a_j|a_j\rangle = \sum_k \langle a_j|u_k\rangle\langle u_k|a_j\rangle = \sum_k \bigl|U_{kj}\bigr|^2 = 1, \qquad (4.104)$$
for each j. This normalization completes the diagonalization.¹⁸
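In numerical practice, the whole sequence of Eqs. (4.102)-(4.104) is packaged in standard eigensolvers; a sketch (mine) for a Hermitian matrix:

```python
# Matrix diagonalization: numpy.linalg.eigh returns the eigenvalues A_j and
# the unitary matrix U whose columns are the normalized eigenvectors.
import numpy as np

A = np.array([[2.0, 1 - 1j],
              [1 + 1j, 3.0]])               # a Hermitian matrix to diagonalize
A_j, U = np.linalg.eigh(A)

assert np.allclose(U.conj().T @ A @ U, np.diag(A_j))  # diagonal form, Eq. (4.98)
assert np.allclose(U.conj().T @ U, np.eye(2))         # normalization, Eq. (4.104)
```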
Now (at last!) I can give the reader some examples. As a simple but very important case, let us diagonalize each of the operators described (in a certain two-function basis { u}, i.e. in two-dimensional Hilbert space) by the so-called Pauli matrices
Pauli matrices:
$$\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \qquad (4.105)$$
Though introduced by a physicist, with a specific purpose to describe electron's spin, these matrices have a general mathematical significance, because together with the 2×2 identity matrix, they provide a full, linearly-independent system – meaning that an arbitrary 2×2 matrix may be represented as
$$\begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} = b\,\mathrm I + c_x\sigma_x + c_y\sigma_y + c_z\sigma_z, \qquad (4.106)$$
¹⁸ A possible slight complication here is that the characteristic equation may give equal eigenvalues for certain groups of different eigenvectors. In such cases, the requirement of the mutual orthogonality of these degenerate states should be additionally enforced.
with a unique set of four c-number coefficients b, $c_x$, $c_y$, and $c_z$.
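The coefficients in Eq. (4.106) are readily extracted using the tracelessness and mutual orthogonality of the Pauli matrices, Tr(σᵢσⱼ) = 2δᵢⱼ; the formulas b = Tr(A)/2 and cᵢ = Tr(Aσᵢ)/2 in the sketch below are standard, though not spelled out in the text.

```python
# Decomposition of an arbitrary 2x2 matrix over {I, sigma_x, sigma_y, sigma_z},
# Eq. (4.106), with b = Tr(A)/2 and c_i = Tr(A sigma_i)/2.
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)

A = np.array([[1 + 2j, 3], [4j, 5]], dtype=complex)  # an arbitrary matrix
b = np.trace(A) / 2
c = [np.trace(A @ s) / 2 for s in (sx, sy, sz)]

assert np.allclose(b * np.eye(2) + c[0] * sx + c[1] * sy + c[2] * sz, A)
```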
Since the matrix $\sigma_z$ is already diagonal, with the evident eigenvalues ±1, let us start with diagonalizing the matrix $\sigma_x$. For it, the characteristic equation (103) is evidently
$$\begin{vmatrix} -A_j & 1 \\ 1 & -A_j \end{vmatrix} = 0, \quad\text{i.e.}\quad A_j^2 - 1 = 0, \qquad (4.107)$$
and has two roots, $A_{1,2} = \pm 1$. (Again, the state numbering is arbitrary!) So the eigenvalues of the matrix $\sigma_x$ are the same as of the matrix $\sigma_z$. (The reader may readily check that the eigenvalues of the matrix $\sigma_y$ are also the same.) However, the eigenvectors of the operators corresponding to these three matrices are different. To find them for $\sigma_x$, let us plug its first eigenvalue, $A_1 = +1$, back into equations (101) spelled out for this particular case (j = 1; k, k' = 1, 2):
$$\begin{aligned} -\langle u_1|a_1\rangle + \langle u_2|a_1\rangle &= 0, \\ \langle u_1|a_1\rangle - \langle u_2|a_1\rangle &= 0. \end{aligned} \qquad (4.108)$$
These two equations are compatible (of course, because the used eigenvalue $A_1 = +1$ satisfies the characteristic equation), and any of them gives
$$\langle u_1|a_1\rangle = \langle u_2|a_1\rangle, \quad\text{i.e.}\quad U_{11} = U_{21}. \qquad (4.109)$$
With that, the normalization condition (104) yields
$$\bigl|U_{11}\bigr|^2 = \bigl|U_{21}\bigr|^2 = \frac{1}{2}. \qquad (4.110)$$
Although the normalization is insensitive to the simultaneous multiplication of $U_{11}$ and $U_{21}$ by the same phase factor $\exp\{i\varphi\}$ with any real $\varphi$, it is convenient to keep the coefficients real, for example taking $\varphi = 0$, to get
$$U_{11} = U_{21} = \frac{1}{\sqrt 2}. \qquad (4.111)$$
Performing an absolutely similar calculation for the second characteristic value, $A_2 = -1$, we get $U_{12} = -U_{22}$, and we may choose the common phase to have
$$U_{12} = -U_{22} = \frac{1}{\sqrt 2}, \qquad (4.112)$$
so that the whole unitary matrix for diagonalization of the operator corresponding to $\sigma_x$ is¹⁹
Unitary matrix diagonalizing σx:
$$\mathrm U_x = \mathrm U_x^\dagger = \frac{1}{\sqrt 2}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}. \qquad (4.113)$$
For what follows, it will be convenient to have this result expressed in the ket-relation form – see Eqs. (85)-(86):
$$|a_1\rangle = U_{11}|u_1\rangle + U_{21}|u_2\rangle = \frac{1}{\sqrt 2}\bigl(|u_1\rangle + |u_2\rangle\bigr), \qquad |a_2\rangle = U_{12}|u_1\rangle + U_{22}|u_2\rangle = \frac{1}{\sqrt 2}\bigl(|u_1\rangle - |u_2\rangle\bigr), \qquad (4.114a)$$

$$|u_1\rangle = U^\dagger_{11}|a_1\rangle + U^\dagger_{21}|a_2\rangle = \frac{1}{\sqrt 2}\bigl(|a_1\rangle + |a_2\rangle\bigr), \qquad |u_2\rangle = U^\dagger_{12}|a_1\rangle + U^\dagger_{22}|a_2\rangle = \frac{1}{\sqrt 2}\bigl(|a_1\rangle - |a_2\rangle\bigr). \qquad (4.114b)$$

¹⁹ Note that though this particular unitary matrix is Hermitian, this is not true for an arbitrary choice of phases $\varphi$.
Now let me show that these results are already sufficient to understand the Stern-Gerlach experiments described in Sec. 1 – but with two additional postulates. The first of them is that the interaction of a particle with the external magnetic field, besides that due to its orbital motion, may be described by the following vector operator of its spin dipole magnetic moment:²⁰
Spin magnetic moment:
$$\hat{\mathbf m} = \gamma\hat{\mathbf S}, \qquad (4.115a)$$
where the constant coefficient $\gamma$, specific for every particle type, is called the gyromagnetic ratio,²¹ and $\hat{\mathbf S}$ is the vector operator of spin, with three Cartesian components:
Spin vector operator:
$$\hat{\mathbf S} = \mathbf n_x\hat S_x + \mathbf n_y\hat S_y + \mathbf n_z\hat S_z. \qquad (4.115b)$$
Here $\mathbf n_{x,y,z}$ are the usual Cartesian unit vectors in the 3D geometric space (in the quantum-mechanics sense, just c-numbers, or rather "c-vectors"), while $\hat S_{x,y,z}$ are the "usual" (scalar) operators.

For the so-called spin-½ particles (including the electron), these components may be simply expressed, as

Spin-½ operator:
$$\hat S_{x,y,z} = \frac{\hbar}{2}\hat\sigma_{x,y,z}, \qquad (4.116a)$$

via those of the Pauli vector operator $\hat{\boldsymbol\sigma} \equiv \mathbf n_x\hat\sigma_x + \mathbf n_y\hat\sigma_y + \mathbf n_z\hat\sigma_z$, so that we may also write

$$\hat{\mathbf S} = \frac{\hbar}{2}\hat{\boldsymbol\sigma}. \qquad (4.116b)$$
In turn, in the so-called z-basis, each Cartesian component of the latter operator is just the corresponding Pauli matrix (105), so that it may be also convenient to use the following 3D vector of these matrices:

Pauli matrix vector:
$$\boldsymbol\sigma \equiv \mathbf n_x\sigma_x + \mathbf n_y\sigma_y + \mathbf n_z\sigma_z = \begin{pmatrix} n_z & n_x - in_y \\ n_x + in_y & -n_z \end{pmatrix}. \qquad (4.117)$$
The z-basis, in which such matrix representation of $\hat{\boldsymbol\sigma}$ is valid, is defined as an orthonormal basis of certain two states, commonly denoted ↑ and ↓, in which the matrix of the operator $\hat\sigma_z$ is diagonal, with eigenvalues, respectively, +1 and –1, and hence the matrix $\mathrm S_z = (\hbar/2)\sigma_z$ of $\hat S_z$ is also diagonal, with the eigenvalues $+\hbar/2$ and $-\hbar/2$. Note that we do not "understand" what exactly the states ↑ and ↓ are,²² but loosely associate them with some internal rotation of a spin-½ particle about the z-axis, with either positive or negative angular momentum component $S_z$. However, attempts to use such classical interpretation for quantitative predictions run into fundamental difficulties – see Sec. 6 below.

²⁰ This was the key point in the electron spin's description, developed by W. Pauli in 1925-1927.
²¹ For the electron, with its negative charge q = –e, the gyromagnetic ratio is negative: $\gamma_\mathrm e = -g_\mathrm e e/2m_\mathrm e$, where $g_\mathrm e \approx 2$ is the dimensionless g-factor. Due to quantum-electrodynamic (relativistic) effects, this g-factor is slightly higher than 2: $g_\mathrm e = 2(1 + \alpha/2\pi + \ldots) \approx 2.002319304\ldots$, where $\alpha \equiv e^2/4\pi\varepsilon_0\hbar c \approx (E_\mathrm H/m_\mathrm e c^2)^{1/2} \approx 1/137$ is the so-called fine structure constant. (The origin of its name will be clear from the discussion in Sec. 6.3.)
²² If you think about it, the word "understand" typically means that we can express a new, more complex notion in terms of those discussed earlier and considered "known". In our current case, we cannot describe the spin states by some wavefunction $\psi(\mathbf r)$, or any other mathematical notion discussed in the previous three chapters. The bra-ket formalism has been invented exactly to enable mathematical analyses of such "new" quantum states we do not initially "understand". Gradually we get accustomed to these notions, and eventually, as we know more and more about their properties, start treating them as "known" ones.
The second necessary postulate describes the general relation between the bra-ket formalism and experiment. Namely, in quantum mechanics, each real observable A is represented by a Hermitian operator $\hat A = \hat A^\dagger$, and the result of its measurement,²³ in a quantum state described by a linear superposition of the eigenstates $a_j$ of the operator,
$$|\alpha\rangle = \sum_j \alpha_j|a_j\rangle, \quad\text{with}\quad \alpha_j = \langle a_j|\alpha\rangle, \qquad (4.118)$$
may be only one of the corresponding eigenvalues $A_j$.²⁴ Specifically, if the ket (118) and all eigenkets $|a_j\rangle$ are normalized to 1,
$$\langle\alpha|\alpha\rangle = 1, \qquad \langle a_j|a_j\rangle = 1, \qquad (4.119)$$
then the probability of a certain measurement outcome $A_j$ is²⁵
Quantum measurement postulate:
$$W_j = \bigl|\alpha_j\bigr|^2 = \alpha_j\alpha_j^* = \langle\alpha|a_j\rangle\langle a_j|\alpha\rangle. \qquad (4.120)$$
This relation is evidently a generalization of Eq. (1.22) in wave mechanics. As a sanity check, let us assume that the set of the eigenstates aj is full, and calculate the sum of the probabilities to find the system in one of these states:
$$\sum_j W_j = \sum_j \langle\alpha|a_j\rangle\langle a_j|\alpha\rangle = \langle\alpha|\hat I|\alpha\rangle = 1. \qquad (4.121)$$
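For a two-component example (a sketch of mine, with an arbitrarily chosen normalized state), the postulate reads:

```python
# Measurement probabilities from probability amplitudes, Eqs. (4.118)-(4.121).
import numpy as np

alpha = np.array([0.6, 0.8j])           # expansion coefficients alpha_j
assert np.isclose(np.vdot(alpha, alpha).real, 1)  # normalization, Eq. (4.119)

W = np.abs(alpha) ** 2                  # outcome probabilities, Eq. (4.120)
assert np.isclose(W.sum(), 1)           # Eq. (4.121): probabilities sum to 1
```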
Now returning to the Stern-Gerlach experiment, conceptually the description of the first (z-oriented) experiment shown in Fig. 1 is the hardest for us, because the statistical ensemble describing the unpolarized electron beam at its input is mixed ("incoherent"), and cannot be described by a pure ("coherent") superposition of the type (6) that have been the subject of our studies so far. (We will discuss such mixed ensembles in Chapter 7.) However, it is intuitively clear that its results are compatible with the description of the two output beams as sets of electrons in the pure states ↑ and ↓, respectively. The absorber following that first stage (Fig. 2) just takes all spin-down electrons out of the picture, producing an output beam of polarized electrons in the definite ↑ state. For such a beam, the probabilities (120) are W↑ = 1 and W↓ = 0. This is certainly compatible with the result of the "control" experiment shown on the bottom panel of Fig. 2: the repeated SG(z) stage does not split such a beam, keeping the probabilities the same.
²³ Here again, just like in Sec. 1.2, the statement implies the abstract notion of "ideal experiments", deferring the discussion of real (physical) measurements until Chapter 10.
²⁴ As a reminder, at the end of Sec. 3 we have already proved that such eigenstates corresponding to different values $A_j$ are orthogonal. If any of these values is degenerate, i.e. corresponds to several different eigenstates, they should be also selected orthogonal, in order for Eq. (118) to be valid.
²⁵ This key relation, in particular, explains the most common term for the (generally, complex) coefficients $\alpha_j$, the probability amplitudes.
Now let us discuss the double Stern-Gerlach experiment shown on the top panel of Fig. 2. For that, let us represent the z-polarized beam in another basis – of the two states (I will denote them as → and ←) in which, by definition, the matrix $\mathrm S_x$ is diagonal. But this is exactly the set we called $a_{1,2}$ in the $\sigma_x$ matrix diagonalization problem solved above. On the other hand, the states ↑ and ↓ are exactly what we called $u_{1,2}$ in that problem, because in this basis, we know the matrix $\boldsymbol\sigma$ explicitly – see Eq. (117). Hence, in the application to the electron spin problem, we may rewrite Eqs. (114) as
Relation between eigenvectors of Sx and Sz:
$$|\rightarrow\rangle = \frac{1}{\sqrt 2}\bigl(|\uparrow\rangle + |\downarrow\rangle\bigr), \qquad |\leftarrow\rangle = \frac{1}{\sqrt 2}\bigl(|\uparrow\rangle - |\downarrow\rangle\bigr), \qquad (4.122)$$

$$|\uparrow\rangle = \frac{1}{\sqrt 2}\bigl(|\rightarrow\rangle + |\leftarrow\rangle\bigr), \qquad |\downarrow\rangle = \frac{1}{\sqrt 2}\bigl(|\rightarrow\rangle - |\leftarrow\rangle\bigr). \qquad (4.123)$$
Currently for us the first of Eqs. (123) is most important, because it shows that the quantum state of electrons entering the SG(x) stage may be represented as a coherent superposition of electrons with $S_x = +\hbar/2$ and $S_x = -\hbar/2$. Notice that the beams have equal probability amplitude moduli, so that according to Eq. (120), the split beams → and ← have equal intensities, in accordance with experimental results. (The minus sign before the second ket-vector is of no consequence here, but it may have an impact on outcomes of other experiments – for example, if coherently split beams are brought together again.)
Now, let us discuss the most mysterious (from the classical point of view) multi-stage SG experiment shown on the middle panel of Fig. 2. After the second absorber has taken out all electrons in, say, the ← state, the remaining electrons, all in the → state, are passed to the final, SG(z), stage. But according to the first of Eqs. (122), this state may be represented as a (coherent) linear superposition of the ↑ and ↓ states, with equal probability amplitudes. The final stage separates electrons in these two states into separate beams, with equal probabilities W↑ = W↓ = ½ to find an electron in each of them, thus explaining the experimental results.
To conclude our discussion of the multistage Stern-Gerlach experiment, let me note that though it cannot be explained in terms of wave mechanics (which operates with scalar de Broglie waves), it has an analogy in classical theories of vector fields, such as the classical electrodynamics. Indeed, let a plane electromagnetic wave propagate normally to the plane of the drawing in Fig. 5, and pass through the linear polarizer 1.
Fig. 4.5. A light polarization sequence similar to the three-stage Stern-Gerlach experiment shown on the middle panel of Fig. 2.
Similarly to the output of the initial SG(z) stages (including the absorbers) shown in Fig. 2, the output wave is linearly polarized in one direction – the vertical direction in Fig. 5. Now its electric field vector has no horizontal component – as may be revealed by the wave's full absorption in a perpendicular polarizer 3. However, let us pass the wave through polarizer 2 first. In this case, the
output wave does acquire a horizontal component, as can be, again, revealed by passing it through polarizer 3. If the angles between the polarization directions 1 and 2, and between 2 and 3, are both equal to π/4, each polarizer reduces the wave amplitude by a factor of √2, and hence the intensity by a factor of 2, exactly like in the multistage SG experiment, with the polarizer 2 playing the role of the SG(x) stage. The "only" difference is that the necessary angle is π/4, rather than π/2 for the Stern-Gerlach experiment. In quantum electrodynamics (see Chapter 9 below), which confirms classical predictions for this experiment, this difference may be interpreted by that between the integer spin of electromagnetic field quanta (photons) and the half-integer spin of electrons.
4.5. Observables: Expectation values and uncertainties
After this particular (and hopefully inspiring) example, let us discuss the general relation between the Dirac formalism and experiment in more detail. The expectation value of an observable over any statistical ensemble (not necessarily coherent) may be always calculated using the general statistical rule (1.37). For the particular case of a coherent superposition (118), we can combine that rule with Eq. (120) and the second of Eqs. (118):
$$\langle A\rangle = \sum_j A_j W_j = \sum_j A_j\alpha_j^*\alpha_j = \sum_j\langle\alpha|a_j\rangle A_j\langle a_j|\alpha\rangle. \qquad (4.124)$$
Now using Eq. (59) for the particular case of the eigenstate basis {a}, for which Eq. (98) is valid, we arrive at a very simple and important formula²⁶
Expectation value as a long bracket:
$$\langle A\rangle = \langle\alpha|\hat A|\alpha\rangle. \qquad (4.125)$$
This is a clear analog of the wave-mechanics formula (1.23) – and as we will see soon, may be used to derive it. A big advantage of Eq. (125) is that it does not explicitly involve the eigenvector set of the corresponding operator, and allows the calculation to be performed in any convenient basis.²⁷
For example, let us consider an arbitrary coherent state $\alpha$ of spin-½,²⁸ and calculate the expectation values of its components. The calculations are easiest in the z-basis because we know the matrix elements of the spin operator components in that basis. Representing the ket- and bra-vectors of the given state as linear superpositions of the corresponding vectors of the basis states ↑ and ↓,
$$|\alpha\rangle = \alpha_\uparrow|\uparrow\rangle + \alpha_\downarrow|\downarrow\rangle, \qquad \langle\alpha| = \alpha_\uparrow^*\langle\uparrow| + \alpha_\downarrow^*\langle\downarrow|, \qquad (4.126)$$
and plugging these expressions into Eq. (125) written for the observable $S_z$, we get

$$\langle S_z\rangle = \bigl(\alpha_\uparrow^*\langle\uparrow| + \alpha_\downarrow^*\langle\downarrow|\bigr)\hat S_z\bigl(\alpha_\uparrow|\uparrow\rangle + \alpha_\downarrow|\downarrow\rangle\bigr) = \alpha_\uparrow^*\alpha_\uparrow\langle\uparrow|\hat S_z|\uparrow\rangle + \alpha_\uparrow^*\alpha_\downarrow\langle\uparrow|\hat S_z|\downarrow\rangle + \alpha_\downarrow^*\alpha_\uparrow\langle\downarrow|\hat S_z|\uparrow\rangle + \alpha_\downarrow^*\alpha_\downarrow\langle\downarrow|\hat S_z|\downarrow\rangle. \qquad (4.127)$$

²⁶ This equality reveals the full beauty of Dirac's notation. Indeed, initially in this chapter the quantum-mechanical brackets just reminded the angular brackets used for the statistical averaging. Now we see that in this particular (but most important) case, the angular brackets of these two types may be indeed equal to each other!
²⁷ Note also that Eq. (120) may be rewritten in a form similar to Eq. (125): $W_j = \langle\alpha|\hat\Lambda_j|\alpha\rangle$, where $\hat\Lambda_j$ is the operator (42) of the state's projection upon the jth eigenstate $a_j$.
²⁸ For clarity, the noun "spin-½" is used, here and below, to denote the spin degree of freedom of a spin-½ particle, independent of its orbital motion.
Now there are two equivalent ways (both very simple) to calculate the long brackets in this expression. The first one is to represent each of them in the matrix form in the z-basis, in which the bra- and ket-vectors of states ↑ and ↓ are the matrix-rows (1, 0) and (0, 1), or similar matrix-columns – the exercise highly recommended to the reader. Another (perhaps more elegant) way is to use the general Eq. (59), in the z-basis, together with the spin-½-specific Eqs. (116a) and (105), to write
Spin-½ component operators:
$$\hat S_x = \frac{\hbar}{2}\bigl(|\uparrow\rangle\langle\downarrow| + |\downarrow\rangle\langle\uparrow|\bigr), \qquad \hat S_y = i\frac{\hbar}{2}\bigl(|\downarrow\rangle\langle\uparrow| - |\uparrow\rangle\langle\downarrow|\bigr), \qquad \hat S_z = \frac{\hbar}{2}\bigl(|\uparrow\rangle\langle\uparrow| - |\downarrow\rangle\langle\downarrow|\bigr). \qquad (4.128)$$
For our particular calculation, we may plug the last of these expressions into Eq. (127), and use the orthonormality conditions (38):
$$\langle\uparrow|\uparrow\rangle = \langle\downarrow|\downarrow\rangle = 1, \qquad \langle\uparrow|\downarrow\rangle = \langle\downarrow|\uparrow\rangle = 0. \qquad (4.129)$$
Both approaches give (of course) the same result:
$$\langle S_z\rangle = \frac{\hbar}{2}\bigl(\alpha_\uparrow^*\alpha_\uparrow - \alpha_\downarrow^*\alpha_\downarrow\bigr). \qquad (4.130)$$
This particular result might be also obtained using Eq. (120) for the probabilities $W_\uparrow = \alpha_\uparrow^*\alpha_\uparrow$ and $W_\downarrow = \alpha_\downarrow^*\alpha_\downarrow$, namely:
$$\langle S_z\rangle = W_\uparrow\Bigl(+\frac{\hbar}{2}\Bigr) + W_\downarrow\Bigl(-\frac{\hbar}{2}\Bigr) = \frac{\hbar}{2}\bigl(\alpha_\uparrow^*\alpha_\uparrow - \alpha_\downarrow^*\alpha_\downarrow\bigr). \qquad (4.131)$$
The formal way (127), based on using Eq. (125), has, however, an advantage of being applicable, without any change, to finding the observables whose operators are not diagonal in the z-basis, as well.
In particular, absolutely similar calculations give
$$\langle S_x\rangle = \langle\alpha|\hat S_x|\alpha\rangle = \frac{\hbar}{2}\bigl(\alpha_\downarrow^*\alpha_\uparrow + \alpha_\uparrow^*\alpha_\downarrow\bigr), \qquad (4.132)$$

$$\langle S_y\rangle = \langle\alpha|\hat S_y|\alpha\rangle = i\frac{\hbar}{2}\bigl(\alpha_\downarrow^*\alpha_\uparrow - \alpha_\uparrow^*\alpha_\downarrow\bigr). \qquad (4.133)$$
Let us have a good look at a particular spin state, for example the spin-up state ↑. According to Eq. (126), in this state $\alpha_\uparrow = 1$ and $\alpha_\downarrow = 0$, so that Eqs. (130)-(133) yield:
$$\langle S_z\rangle = \frac{\hbar}{2}, \qquad \langle S_x\rangle = \langle S_y\rangle = 0. \qquad (4.134)$$
Now let us use the same Eq. (125) to calculate the spin component uncertainties. According to Eqs. (105) and (116)-(117), the operator of each spin component squared is equal to $(\hbar/2)^2\hat I$, so that the general Eq. (1.33) yields
$$\delta S_z \equiv \Bigl(\langle\hat S_z^2\rangle - \langle\hat S_z\rangle^2\Bigr)^{1/2} = \biggl(\Bigl(\frac{\hbar}{2}\Bigr)^2 - \Bigl(\frac{\hbar}{2}\Bigr)^2\biggr)^{1/2} = 0, \qquad (4.135a)$$

$$\delta S_x \equiv \Bigl(\langle\hat S_x^2\rangle - \langle\hat S_x\rangle^2\Bigr)^{1/2} = \biggl(\Bigl(\frac{\hbar}{2}\Bigr)^2 - 0\biggr)^{1/2} = \frac{\hbar}{2}, \qquad (4.135b)$$

$$\delta S_y \equiv \Bigl(\langle\hat S_y^2\rangle - \langle\hat S_y\rangle^2\Bigr)^{1/2} = \biggl(\Bigl(\frac{\hbar}{2}\Bigr)^2 - 0\biggr)^{1/2} = \frac{\hbar}{2}. \qquad (4.135c)$$
While Eqs. (134) and (135a) are compatible with the classical notion of the angular momentum of magnitude $\hbar/2$ being directed exactly along the z-axis, this correspondence should not be overstretched, because such a classical picture cannot explain Eqs. (135b) and (135c). The best (but still imprecise!) classical image I can offer is the spin vector $\mathbf S$ oriented, on average, in the z-direction, but still having its x- and y-components strongly "wobbling" (fluctuating) about their zero average values.
It is straightforward to verify that in the x-polarized and y-polarized states the situation is similar, with the corresponding change of axis indices. Thus, in none of these states do all three spin components have definite values. Let me show that this is not just an occasional fact, but reflects perhaps the most profound property of quantum mechanics, the uncertainty relations. For that, let us consider two measurable observables, A and B, of the same quantum system. There are two possibilities here. If the operators corresponding to these observables commute,
$$\bigl[\hat A, \hat B\bigr] = 0, \qquad (4.136)$$
then all matrix elements of the commutator in any orthogonal basis (in particular, in the basis of eigenstates $a_j$ of the operator $\hat A$) have to equal zero:
$$\langle a_j|\bigl[\hat A, \hat B\bigr]|a_{j'}\rangle = \langle a_j|\hat A\hat B|a_{j'}\rangle - \langle a_j|\hat B\hat A|a_{j'}\rangle = 0. \qquad (4.137)$$
In the first bracket of the middle expression, let us act by the (Hermitian!) operator $\hat A$ on the bra-vector, while in the second one, on the ket-vector. According to Eq. (68), such action turns the operators into the corresponding eigenvalues, which may be taken out of the long brackets, so that we get
$$A_j\langle a_j|\hat B|a_{j'}\rangle - A_{j'}\langle a_j|\hat B|a_{j'}\rangle = \bigl(A_j - A_{j'}\bigr)\langle a_j|\hat B|a_{j'}\rangle = 0. \qquad (4.138)$$
This means that if all eigenstates of the operator $\hat A$ are non-degenerate (i.e. $A_j \neq A_{j'}$ if $j \neq j'$), the matrix of the operator $\hat B$ has to be diagonal in the basis {a}, i.e. the eigenstate sets of the operators $\hat A$ and $\hat B$ coincide. Such pairs of observables (and their operators) that share their eigenstates are called compatible. For example, in the wave mechanics of a particle, its momentum (1.26) and kinetic energy (1.27) are compatible, sharing their eigenfunctions (1.29). Now we see that this is not occasional, because each Cartesian component of the kinetic energy is proportional to the square of the corresponding component of the momentum, and any operator commutes with an arbitrary integer power of itself:
$$\bigl[\hat A, \hat A^n\bigr] \equiv \hat A\underbrace{\hat A\hat A\cdots\hat A}_{n} - \underbrace{\hat A\hat A\cdots\hat A}_{n}\,\hat A = 0. \qquad (4.139)$$
Now, what if the operators $\hat A$ and $\hat B$ do not commute? Then the following general uncertainty relation is valid:
General uncertainty relation:
$$\delta A\,\delta B \geq \frac{1}{2}\Bigl|\bigl\langle\bigl[\hat A, \hat B\bigr]\bigr\rangle\Bigr|, \qquad (4.140)$$
where all expectation values are for the same but arbitrary state of the system. The proof of Eq. (140) may be divided into two steps, the first one proving the so-called Schwartz inequality for any two possible states, say $\alpha$ and $\beta$:²⁹
Schwartz inequality:
$$\langle\alpha|\alpha\rangle\langle\beta|\beta\rangle \geq \bigl|\langle\alpha|\beta\rangle\bigr|^2. \qquad (4.141)$$
Its proof may be readily achieved by applying the postulate (16) – that the norm of any legitimate state of the system cannot be negative – to the state with the following ket-vector:
$$|\delta\rangle \equiv |\alpha\rangle - \frac{\langle\beta|\alpha\rangle}{\langle\beta|\beta\rangle}|\beta\rangle, \qquad (4.142)$$

where $|\alpha\rangle$ and $|\beta\rangle$ are possible, non-null states of the system, so that the denominator in Eq. (142) is not equal to zero. For this case, Eq. (16) gives
$$\langle\delta|\delta\rangle = \biggl(\langle\alpha| - \frac{\langle\alpha|\beta\rangle}{\langle\beta|\beta\rangle}\langle\beta|\biggr)\biggl(|\alpha\rangle - \frac{\langle\beta|\alpha\rangle}{\langle\beta|\beta\rangle}|\beta\rangle\biggr) \geq 0. \qquad (4.143)$$
Opening the parentheses, we get
$$\langle\alpha|\alpha\rangle - \frac{\langle\alpha|\beta\rangle\langle\beta|\alpha\rangle}{\langle\beta|\beta\rangle} - \frac{\langle\beta|\alpha\rangle\langle\alpha|\beta\rangle}{\langle\beta|\beta\rangle} + \frac{\langle\alpha|\beta\rangle\langle\beta|\alpha\rangle}{\langle\beta|\beta\rangle^2}\langle\beta|\beta\rangle \geq 0. \qquad (4.144)$$
After the cancellation of one inner product $\langle\beta|\beta\rangle$ in the numerator and the denominator of the last term, it cancels with the 2nd (or the 3rd) term. What remains is the Schwartz inequality (141).
Now let us apply this inequality to states
$$|\alpha\rangle = \hat{\tilde A}|\gamma\rangle \quad\text{and}\quad |\beta\rangle = \hat{\tilde B}|\gamma\rangle, \qquad (4.145)$$
where, in both relations, $|\gamma\rangle$ is the same (but otherwise arbitrary) possible state of the system, and the deviation operators are defined similarly to the deviations of the observables (see Sec. 1.2):
$$\hat{\tilde A} \equiv \hat A - \langle A\rangle, \qquad \hat{\tilde B} \equiv \hat B - \langle B\rangle. \qquad (4.146)$$
With this substitution, and taking into account again that the observable operators $\hat A$ and $\hat B$ are Hermitian, Eq. (141) yields
$$\langle\gamma|\hat{\tilde A}^2|\gamma\rangle\,\langle\gamma|\hat{\tilde B}^2|\gamma\rangle \geq \Bigl|\langle\gamma|\hat{\tilde A}\hat{\tilde B}|\gamma\rangle\Bigr|^2. \qquad (4.147)$$
Since the state $\gamma$ is arbitrary, we may use Eq. (125) to rewrite this relation as an operator inequality:

²⁹ This inequality is the quantum-mechanical analog of the usual vector algebra's result $|\boldsymbol\alpha|^2|\boldsymbol\beta|^2 \geq |\boldsymbol\alpha\cdot\boldsymbol\beta|^2$.