QM: Quantum Mechanics
into Eqs. (95) and (97), we see that they are indeed satisfied, provided that a system of four coupled, linear algebraic equations for four complex c-number amplitudes $u_{1,2,3,4}$ is satisfied. The condition of its consistency yields the same dispersion relation (87), i.e. the same two-branch diagram shown in Fig. 6, as follows from the Klein-Gordon equation. The difference is that plugging each value of $E$, given by Eq. (87), back into the system of the linear equations for four amplitudes $u$, we get two solutions for their vector $u \equiv (u_1, u_2, u_3, u_4)$ for each of the two energy branches – see Fig. 6 again. In the standard z-basis of spin operators, they may be represented as follows:
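$$
\text{for } E = E_+ > 0: \quad
u_{+\uparrow} = c_+ \begin{pmatrix} 1 \\ 0 \\ \dfrac{cp_z}{E_+ + mc^2} \\ \dfrac{cp_+}{E_+ + mc^2} \end{pmatrix}, \quad
u_{+\downarrow} = c_+ \begin{pmatrix} 0 \\ 1 \\ \dfrac{cp_-}{E_+ + mc^2} \\ \dfrac{-cp_z}{E_+ + mc^2} \end{pmatrix},
\tag{9.104a}
$$
$$
\text{for } E = E_- < 0: \quad
u_{-\uparrow} = c_- \begin{pmatrix} \dfrac{cp_z}{E_- - mc^2} \\ \dfrac{cp_+}{E_- - mc^2} \\ 1 \\ 0 \end{pmatrix}, \quad
u_{-\downarrow} = c_- \begin{pmatrix} \dfrac{cp_-}{E_- - mc^2} \\ \dfrac{-cp_z}{E_- - mc^2} \\ 0 \\ 1 \end{pmatrix},
\tag{9.104b}
$$
where $p_\pm \equiv p_x \pm ip_y$, and $c_\pm$ are normalization coefficients.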
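As a quick numerical sanity check (not part of the original derivation), the columns of Eq. (104) can be substituted into the free-particle eigenproblem. The sketch below assumes the standard z-basis representation of the Dirac matrices, units with c = 1, and arbitrary illustrative values of the mass and momentum; the helper names are hypothetical.

```python
import math

# Units with c = 1; the numbers below are arbitrary illustrative values.
m = 1.0
px, py, pz = 0.3, -0.4, 1.2

def matvec(M, v):
    """Product of a square matrix (list of rows) and a column (list)."""
    return [sum(M[i][k] * v[k] for k in range(len(v))) for i in range(len(M))]

p_plus = complex(px, py)    # p_+ = p_x + i p_y
p_minus = complex(px, -py)  # p_- = p_x - i p_y

# Free-particle Dirac Hamiltonian alpha.p + beta*m in the z-basis (assumed
# standard representation), with sigma.p = [[pz, p_-], [p_+, -pz]].
H = [
    [m,      0,       pz,      p_minus],
    [0,      m,       p_plus,  -pz],
    [pz,     p_minus, -m,      0],
    [p_plus, -pz,     0,       -m],
]

def spinors(E):
    """Un-normalized columns of Eq. (9.104) for the given energy branch."""
    if E > 0:
        d = E + m
        return [[1, 0, pz / d, p_plus / d], [0, 1, p_minus / d, -pz / d]]
    d = E - m
    return [[pz / d, p_plus / d, 1, 0], [p_minus / d, -pz / d, 0, 1]]

def residual(E, u):
    """Largest component of H u - E u; it vanishes for an exact eigenvector."""
    Hu = matvec(H, u)
    return max(abs(Hu[i] - E * u[i]) for i in range(4))

# Two-branch dispersion relation: E = +/- sqrt(m^2 + p^2) in these units.
E_plus = math.sqrt(m**2 + px**2 + py**2 + pz**2)
residuals = [residual(E, u) for E in (E_plus, -E_plus) for u in spinors(E)]
print(residuals)
```

All four residuals should vanish to rounding accuracy, one for each spin direction on each energy branch.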
The simplest interpretation of these solutions is that Eq. (103), with the vectors $u_+$ given by Eq. (104a), represents a spin-½ particle (say, an electron), while with the vectors $u_-$ given by Eq. (104b), it represents an antiparticle (a positron), and the two solutions for each particle, indexed with opposite arrows, correspond to two possible directions of the spin-½: $\sigma_z = \pm 1$, i.e. $S_z = \pm\hbar/2$. This interpretation is indeed solid in the non-relativistic limit, when two last components of the vector (104a), and two first components of the vector (104b) are negligibly small:
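$$
u_{+\uparrow} \to \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}, \quad
u_{+\downarrow} \to \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix}, \quad
u_{-\uparrow} \to \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix}, \quad
u_{-\downarrow} \to \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix}, \quad
\text{for } \frac{p_{x,y,z}}{mc} \to 0.
\tag{9.105}
$$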
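To get a feeling for how small the "small" components are, the following sketch evaluates the ratio $cp/(E_+ + mc^2)$ from Eq. (104a) for a particle moving along the z-axis. The chosen speeds are illustrative assumptions, and the function name is hypothetical.

```python
import math

ALPHA = 7.2973525693e-3   # fine-structure constant (CODATA value)

def small_to_large(v_over_c):
    """c p / (E_+ + m c^2) for a free particle moving along z with speed v."""
    gamma = 1.0 / math.sqrt(1.0 - v_over_c**2)
    cp = gamma * v_over_c        # c p in units of m c^2
    energy = gamma               # E_+ in units of m c^2
    return cp / (energy + 1.0)

for label, beta in [("v/c = alpha (atomic-scale speed)", ALPHA),
                    ("v/c = 0.1", 0.1),
                    ("v/c = 0.9", 0.9)]:
    ratio = small_to_large(beta)
    print(f"{label}: ratio = {ratio:.4g}, p/2mc estimate = {beta / 2:.4g}")
```

At atomic-scale speeds the ratio is close to the non-relativistic estimate $p/2mc$, i.e. well below 1%, while at $v/c = 0.9$ the two pairs of components become comparable.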
However, at arbitrary energies, the physical picture is more complex. To show this, let us use the Dirac equation to calculate the Heisenberg-picture law of time evolution of the operator of some Cartesian component of the orbital angular momentum $\mathbf L \equiv \mathbf r \times \mathbf p$, for example of $L_x = yp_z - zp_y$, taking into account that the Dirac operators (98a) commute with those of $\mathbf r$ and $\mathbf p$, and also the Heisenberg commutation relations (2.14):
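$$
i\hbar\frac{\partial \hat L_x}{\partial t} = \left[\hat L_x, \hat H\right] = c\hat{\boldsymbol\alpha}\cdot\left[(\hat y\hat p_z - \hat z\hat p_y), \hat{\mathbf p}\right] = -i\hbar c\left(\hat\alpha_z\hat p_y - \hat\alpha_y\hat p_z\right),
\tag{9.106}
$$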
with similar relations for two other Cartesian components. Since the right-hand side of these equations is different from zero, the orbital momentum is generally not conserved – even for a free particle! Let us, however, consider the following vector operator,
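Spin operator in Dirac's theory:
$$
\hat{\mathbf S} \equiv \frac{\hbar}{2}\begin{pmatrix} \hat{\boldsymbol\sigma} & \hat 0 \\ \hat 0 & \hat{\boldsymbol\sigma} \end{pmatrix}.
\tag{9.107a}
$$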
According to Eqs. (4.105), its Cartesian components, in the z-basis, are represented by 4×4 matrices
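$$
\mathrm S_x = \frac{\hbar}{2}\begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}, \quad
\mathrm S_y = \frac{\hbar}{2}\begin{pmatrix} 0 & -i & 0 & 0 \\ i & 0 & 0 & 0 \\ 0 & 0 & 0 & -i \\ 0 & 0 & i & 0 \end{pmatrix}, \quad
\mathrm S_z = \frac{\hbar}{2}\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}.
\tag{9.107b}
$$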
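Let us calculate the Heisenberg-picture law of time evolution of these components, for example
$$
i\hbar\frac{\partial \hat S_x}{\partial t} = \left[\hat S_x, \hat H\right] = c\left[\hat S_x, \left(\hat\alpha_x\hat p_x + \hat\alpha_y\hat p_y + \hat\alpha_z\hat p_z\right)\right].
\tag{9.108}
$$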
A direct calculation of the commutators of the matrices (98) and (107) yields
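$$
\left[\hat S_x, \hat\alpha_x\right] = 0, \quad
\left[\hat S_x, \hat\alpha_y\right] = i\hbar\hat\alpha_z, \quad
\left[\hat S_x, \hat\alpha_z\right] = -i\hbar\hat\alpha_y,
\tag{9.109}
$$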
so that we finally get
$$
i\hbar\frac{\partial \hat S_x}{\partial t} = i\hbar c\left(\hat\alpha_z\hat p_y - \hat\alpha_y\hat p_z\right),
\tag{9.110}
$$
with similar expressions for the other two components of the operator. Comparing this result with Eq. (106), we see that any Cartesian component of the operator defined similarly to Eq. (5.170),
$$
\hat{\mathbf J} \equiv \hat{\mathbf L} + \hat{\mathbf S},
\tag{9.111}
$$
is an integral of motion,53 so that this operator may be interpreted as the one representing the total angular momentum of the particle. Hence, the operator (107) may be interpreted as the spin operator of a spin-½ particle (e.g., electron). As follows from the last of Eqs. (107b), in the non-relativistic limit the columns (105) represent the eigenkets of the z-component of that operator, with eigenvalues $S_z = \pm\hbar/2$, with the sign corresponding to the arrow index. So, the Dirac theory provides a justification for spin-½ – or, somewhat more humbly, replaces the Pauli Hamiltonian postulate (4.163) with that of a simpler (and hence more plausible), Lorentz-invariant Hamiltonian (97).
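The matrix part of this argument is easy to verify numerically. The sketch below assumes the standard representation of the Dirac matrices (off-diagonal Pauli blocks for $\hat{\boldsymbol\alpha}$, diagonal unit blocks of opposite signs for $\hat\beta$), uses units with $\hbar = c = 1$, and treats the momentum components as illustrative c-numbers; all helper names are hypothetical.

```python
# Units with hbar = c = 1. Momentum components are treated as c-numbers,
# which is enough for checking the matrix (spin) part of the commutators.
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def comm(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[i][j] - BA[i][j] for j in range(len(A))] for i in range(len(A))]

def combo(*terms):
    """Linear combination sum_k coeff_k * M_k of 4x4 matrices."""
    return [[sum(c * M[i][j] for c, M in terms) for j in range(4)]
            for i in range(4)]

def blocks(a, b, c, d):
    """4x4 matrix built from the 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def close(A, B, tol=1e-12):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(4) for j in range(4))

zero = [[0, 0], [0, 0]]
one = [[1, 0], [0, 1]]
minus_one = [[-1, 0], [0, -1]]
sigma = {"x": [[0, 1], [1, 0]], "y": [[0, -1j], [1j, 0]], "z": [[1, 0], [0, -1]]}

alpha = {k: blocks(zero, s, s, zero) for k, s in sigma.items()}  # assumed form
beta = blocks(one, zero, zero, minus_one)                        # assumed form
S = {k: combo((0.5, blocks(s, zero, zero, s))) for k, s in sigma.items()}
null4 = combo((0, beta))

# Commutators of Eq. (9.109)
check_109 = (close(comm(S["x"], alpha["x"]), null4)
             and close(comm(S["x"], alpha["y"]), combo((1j, alpha["z"])))
             and close(comm(S["x"], alpha["z"]), combo((-1j, alpha["y"]))))

# Right-hand side of Eq. (9.110) for illustrative momentum and mass values
px, py, pz, m = 0.3, -0.4, 1.2, 1.0
H = combo((px, alpha["x"]), (py, alpha["y"]), (pz, alpha["z"]), (m, beta))
rhs_110 = combo((1j * py, alpha["z"]), (-1j * pz, alpha["y"]))
check_110 = close(comm(S["x"], H), rhs_110)
print(check_109, check_110)
```

Since the right-hand side of Eq. (106) is exactly the negative of that of Eq. (110), the two contributions cancel in the evolution law of $\hat J_x$.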
Note, however, that this simple interpretation, fully separating a particle from its antiparticle, is not valid for the exact solutions (103)-(104), so that generally the eigenstates of the Dirac Hamiltonian are certain linear (coherent) superpositions of the components describing the particle and its antiparticle – each with both directions of spin. This fact leads to several interesting effects, including the so-called Klein paradox at the reflection of a relativistic electron from a potential barrier.54
53 It is straightforward to show that this result remains valid for a particle in any central field U(r).
54 See, e.g., A. Calogeracos and N. Dombey, Contemp. Phys. 40, 313 (1999).
9.7. Low-energy limit
The generalization of Dirac’s theory to the case of a (spin-½) particle with an electric charge q, moving in a classically-described electromagnetic field, may be obtained using the same replacement (90). As a result, Eq. (95) turns into
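Dirac equation in EM field:
$$
\left[c\hat{\boldsymbol\alpha}\cdot\left(-i\hbar\nabla - q\mathbf A\right) + mc^2\hat\beta + \left(q\phi - \hat H\right)\right]\Psi = 0,
\tag{9.112}
$$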
where the Hamiltonian operator $\hat H$ is understood in the sense of Eq. (95), i.e. as the partial time derivative with the multiplier $i\hbar$. Let us prepare this equation for a low-energy approximation by acting on its left-hand side by a similar square bracket but with the opposite sign before the last parentheses – also an operator! Using Eqs. (99) and (100), and the fact that the space- and time-independent operators $\hat{\boldsymbol\alpha}$ and $\hat\beta$ commute with the spin-independent, c-number functions $\mathbf A(\mathbf r, t)$ and $\phi(\mathbf r, t)$, as well as with the Hamiltonian operator $i\hbar\,\partial/\partial t$, the result is
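$$
\left\{c^2\left[\hat{\boldsymbol\alpha}\cdot\left(-i\hbar\nabla - q\mathbf A\right)\right]^2 + \left(mc^2\right)^2 + c\left[\hat{\boldsymbol\alpha}\cdot\left(-i\hbar\nabla - q\mathbf A\right), \left(q\phi - \hat H\right)\right] - \left(q\phi - \hat H\right)^2\right\}\Psi = 0.
\tag{9.113}
$$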
A direct calculation of the first square bracket, using Eqs. (98) and (107), yields
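$$
\left[\hat{\boldsymbol\alpha}\cdot\left(-i\hbar\nabla - q\mathbf A\right)\right]^2 = \left(-i\hbar\nabla - q\mathbf A\right)^2 - 2q\,\hat{\mathbf S}\cdot\left(\nabla\times\mathbf A\right).
\tag{9.114}
$$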
But the last vector product on the right-hand side is just the magnetic field – see, e.g., Eqs. (3.21):
$$
\mathbf B = \nabla\times\mathbf A.
\tag{9.115}
$$
Similarly, we may use the first of Eqs. (3.21), for the electric field,
$$
\boldsymbol{\mathscr E} = -\nabla\phi - \frac{\partial\mathbf A}{\partial t},
\tag{9.116}
$$
to simplify the commutator participating in Eq. (9.113):
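$$
\left[\hat{\boldsymbol\alpha}\cdot\left(-i\hbar\nabla - q\mathbf A\right), \left(q\phi - \hat H\right)\right] = -q\hat{\boldsymbol\alpha}\cdot\left[\hat H, \mathbf A\right] - i\hbar q\hat{\boldsymbol\alpha}\cdot\left[\nabla, \phi\right] = -i\hbar q\hat{\boldsymbol\alpha}\cdot\frac{\partial\mathbf A}{\partial t} - i\hbar q\hat{\boldsymbol\alpha}\cdot\nabla\phi = i\hbar q\hat{\boldsymbol\alpha}\cdot\boldsymbol{\mathscr E}.
\tag{9.117}
$$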
As a result, Eq. (113) becomes
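$$
\left\{c^2\left(-i\hbar\nabla - q\mathbf A\right)^2 - \left(q\phi - \hat H\right)^2 + \left(mc^2\right)^2 - 2qc^2\,\hat{\mathbf S}\cdot\mathbf B + i\hbar cq\,\hat{\boldsymbol\alpha}\cdot\boldsymbol{\mathscr E}\right\}\Psi = 0.
\tag{9.118}
$$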
So far, this is an exact result, equivalent to Eq. (112), but it is more convenient for an analysis of the low-energy limit, in which not only the energy offset $E - mc^2$ (which is just the energy used in the non-relativistic mechanics), but also the electrostatic energy of the particle, $q\phi$, are much smaller than the rest energy $mc^2$. In this limit, the second and third terms of Eq. (118) almost cancel, and introducing the offset Hamiltonian
$$
\hat{\tilde H} \equiv \hat H - mc^2\hat I,
\tag{9.119}
$$
we may approximate their difference, up to the first non-zero term, as
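$$
\left(q\phi\hat I - \hat H\right)^2 - \left(mc^2\right)^2\hat I \equiv \left(q\phi\hat I - mc^2\hat I - \hat{\tilde H}\right)^2 - \left(mc^2\right)^2\hat I \approx 2mc^2\left(\hat{\tilde H} - q\phi\hat I\right).
\tag{9.120}
$$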
As a result, after the division of all terms by $2mc^2$, Eq. (118) may be approximated as
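Low-energy Hamiltonian:
$$
\hat{\tilde H}\Psi = \left[\frac{1}{2m}\left(-i\hbar\nabla - q\mathbf A\right)^2 + q\phi - \frac{q}{m}\hat{\mathbf S}\cdot\mathbf B + \frac{i\hbar q}{2mc}\hat{\boldsymbol\alpha}\cdot\boldsymbol{\mathscr E}\right]\Psi.
\tag{9.121}
$$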
Let us discuss this important result. The first two terms in the square brackets give the non-relativistic Hamiltonian (3.26), which was extensively used in Chapter 3 for the discussion of charged particle motion. Note again that the contribution of the vector potential $\mathbf A$ into that Hamiltonian is essentially relativistic, in the following sense: when used for the description of magnetic interaction of two charged particles, due to their orbital motion with speed $v \ll c$, the magnetic interaction is a factor of $(v/c)^2$ smaller than the electrostatic interaction of the particles.55 The reason why we did discuss the effects of $\mathbf A$ in Chapter 3 was that it was used there to describe external magnetic fields, keeping our analysis valid even for the cases when that field is strong because of being produced by relativistic effects – such as aligned spins of a permanent magnet.
The next, third term in the square brackets of Eq. (121) should be also familiar to the reader: this is the Pauli Hamiltonian – see Eqs. (4.3), (4.5), and (4.163). When justifying this form of interaction in Chapter 4, I referred mostly to the results of Stern-Gerlach-type experiments, but it is extremely pleasing that this result56 follows from such a fundamental relativistic treatment as Dirac's theory. As we already know from the discussion of the Zeeman effect in Sec. 6.4, the magnetic field effects on the orbital motion of an electron (described by the orbital angular momentum L) and its spin S are of the same order, though quantitatively different.
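To attach a number to the Pauli term, the sketch below evaluates the energy gap between the two spin states of an electron that follows from the term $-(q/m)\hat{\mathbf S}\cdot\mathbf B$ of Eq. (121), for a field directed along the z-axis. The 1-tesla field is an illustrative assumption, and the function name is hypothetical.

```python
import math

# CODATA values in SI units
E_CHARGE = 1.602176634e-19      # elementary charge, C
HBAR = 1.054571817e-34          # reduced Planck constant, J*s
M_ELECTRON = 9.1093837015e-31   # electron mass, kg

def spin_splitting_joules(b_tesla, charge=-E_CHARGE, mass=M_ELECTRON):
    """Gap between S_z = +hbar/2 and S_z = -hbar/2 for the term -(q/m) S.B."""
    energies = [-(charge / mass) * s_z * b_tesla
                for s_z in (+HBAR / 2, -HBAR / 2)]
    return abs(energies[0] - energies[1])

gap = spin_splitting_joules(1.0)
frequency_ghz = gap / (2 * math.pi * HBAR) / 1e9
print(f"{gap:.4e} J = {gap / E_CHARGE:.4e} eV, i.e. {frequency_ghz:.2f} GHz")
```

The gap equals $\hbar|q|B/m$, i.e. two Bohr magnetons per tesla, which is the g = 2 value mentioned in the text.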
Finally, the last term in the square brackets of Eq. (121) is also not quite new for us: in particular, it describes the spin-orbit interaction. Indeed, in the case of a classical, spherically-symmetric electric field $\boldsymbol{\mathscr E}$ corresponding to the potential $\phi(r) = U(r)/q$, this term may be reduced to Eq. (6.56):
Spin-orbit coupling:
$$
\hat H_{so} = \frac{1}{2m^2c^2}\frac{1}{r}\frac{dU}{dr}\,\hat{\mathbf S}\cdot\hat{\mathbf L} \equiv -\frac{q}{2m^2c^2}\frac{\mathscr E}{r}\,\hat{\mathbf S}\cdot\hat{\mathbf L}.
\tag{9.122}
$$
The proof of this correspondence requires a bit of additional work.57 Indeed, in Eq. (121), the term responsible for the spin-orbit interaction acts on 4-component wavefunctions, while the Hamiltonian (122) is supposed to act on non-relativistic state vectors with an account of spin, whose coordinate representation may be given by 2-component spinors:58
55 This difference may be traced by classical means – see, e.g., EM Sec. 5.1.
56 Note that in this result, the g-factor of the particle is still equal to exactly 2 – see Eq. (4.115) and its discussion in Sec. 4.4. In order to describe the small deviation of $g_e$ from 2, the electromagnetic field should be quantized (just as this was discussed in Secs. 1-4 of this chapter), and its potentials $\mathbf A$ and $\phi$, participating in Eq. (121), should be treated as operators – rather than as c-number functions as was assumed above.
57 The only facts immediately evident from Eq. (121) are that the term we are discussing is proportional to the electric field, as required by Eq. (122), and that it is of the proper order of magnitude. Indeed, Eqs. (101)-(102) imply that in the Dirac theory, $c\hat{\boldsymbol\alpha}$ plays the role of the velocity operator, so that the expectation values of the term are of the order of $\hbar qv\mathscr E/2mc^2$. Since the expectation values of the operators participating in the Hamiltonian (122) scale as $S \sim \hbar/2$ and $L \sim mvr$, the spin-orbit interaction energy has the same order of magnitude.
58 In this course, the notion of spinor (popular in some textbooks) was not used much; it was introduced earlier only for two-particle states – see Eq. (8.13). For a single particle, such definition is reduced to $\psi(\mathbf r)|s\rangle$, whose representation in a particular spin-½ basis is the column (123). Note that such spinors may be used as a basis for an expansion of the spin-orbitals $\psi_j(\mathbf r)$ defined by Eq. (8.125), where the index $j$ is used for numbering both the spin's orientation (i.e. the particular component of the spinor's column) and the orbital eigenfunction.
$$
\psi = \begin{pmatrix} \psi_\uparrow \\ \psi_\downarrow \end{pmatrix}.
\tag{9.123}
$$
The simplest way to prove the equivalence of these two expressions is not to use Eq. (121) directly, but to return to the Dirac equation (112), for the particular case of motion in a static electric field but no magnetic field, when Dirac’s Hamiltonian is reduced to
$$
\hat H = c\hat{\boldsymbol\alpha}\cdot\hat{\mathbf p} + \hat\beta mc^2 + U(\mathbf r), \quad \text{with } U = q\phi.
\tag{9.124}
$$
Since this Hamiltonian is time-independent, we may look for its 4-component eigenfunctions in the form
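$$
\Psi(\mathbf r, t) = \begin{pmatrix} \psi_+(\mathbf r) \\ \psi_-(\mathbf r) \end{pmatrix}\exp\left\{-i\frac{E}{\hbar}t\right\},
\tag{9.125}
$$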
where each of $\psi_\pm$ is a 2-component column of the type (123), representing two spin states of the particle (index +) and its antiparticle (index –). Plugging Eq. (125) into Eq. (95) with the Hamiltonian (124), and using Eq. (98a), we get the following system of two linear equations:
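$$
\begin{aligned}
\left[E - mc^2 - U(\mathbf r)\right]\psi_+ - c\,\hat{\boldsymbol\sigma}\cdot\hat{\mathbf p}\,\psi_- &= 0, \\
\left[E + mc^2 - U(\mathbf r)\right]\psi_- - c\,\hat{\boldsymbol\sigma}\cdot\hat{\mathbf p}\,\psi_+ &= 0.
\end{aligned}
\tag{9.126}
$$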
Expressing $\psi_-$ from the latter equation, and plugging the result into the former one, we get the following single equation for the particle's spinor:
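$$
\left[E - mc^2 - U(\mathbf r) - c^2\,\hat{\boldsymbol\sigma}\cdot\hat{\mathbf p}\,\frac{1}{E + mc^2 - U(\mathbf r)}\,\hat{\boldsymbol\sigma}\cdot\hat{\mathbf p}\right]\psi_+ = 0.
\tag{9.127}
$$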
So far, this is an exact equation for eigenstates and eigenvalues of the Hamiltonian (124), but it may be substantially simplified in the low-energy limit when both the potential energy59 and the non-relativistic eigenenergy
$$
\tilde E \equiv E - mc^2
\tag{9.128}
$$
are much lower than $mc^2$. Indeed, in this case, the expression in the denominator of the last term in the brackets of Eq. (127) is close to $2mc^2$. Since $(\hat{\boldsymbol\sigma}\cdot\hat{\mathbf p})^2 = \hat p^2$ (each Pauli matrix squares to the identity matrix, while different ones anticommute), with that replacement, Eq. (127) is reduced to the non-relativistic Schrödinger equation, similar for both spin components of $\psi_+$, and hence giving spin-degenerate energy levels. To recover small relativistic and spin-orbit effects, we need a slightly more accurate approximation:
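$$
\frac{1}{E + mc^2 - U(\mathbf r)} \equiv \frac{1}{2mc^2 + \tilde E - U(\mathbf r)} = \frac{1}{2mc^2}\left[1 + \frac{\tilde E - U(\mathbf r)}{2mc^2}\right]^{-1} \approx \frac{1}{2mc^2}\left[1 - \frac{\tilde E - U(\mathbf r)}{2mc^2}\right],
\tag{9.129}
$$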
in which Eq. (127) is reduced to
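$$
\left[\tilde E - U(\mathbf r) - \frac{\hat p^2}{2m} + \hat{\boldsymbol\sigma}\cdot\hat{\mathbf p}\,\frac{\tilde E - U(\mathbf r)}{(2mc)^2}\,\hat{\boldsymbol\sigma}\cdot\hat{\mathbf p}\right]\psi_+ = 0.
\tag{9.130}
$$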
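A simple way to test this approximation is the special case of a free particle, $U = 0$, in a state with a definite momentum, where Eq. (130) becomes an algebraic equation for $\tilde E$ and the exact answer is known from the dispersion relation. The sketch below assumes units with m = c = 1 and an illustrative momentum value; the function names are hypothetical.

```python
import math

def kinetic_exact(p, m=1.0, c=1.0):
    """Positive-branch free-particle energy minus the rest energy."""
    return math.sqrt((m * c**2) ** 2 + (c * p) ** 2) - m * c**2

def kinetic_from_130(p, m=1.0, c=1.0):
    """Root of E~ - p^2/2m + E~ p^2/(2mc)^2 = 0, i.e. Eq. (9.130) with U = 0."""
    return (p**2 / (2 * m)) / (1 + p**2 / (2 * m * c) ** 2)

def kinetic_nonrel(p, m=1.0):
    """Non-relativistic kinetic energy p^2/2m."""
    return p**2 / (2 * m)

p = 0.1
exact, approx, nonrel = kinetic_exact(p), kinetic_from_130(p), kinetic_nonrel(p)
print(f"exact - nonrelativistic: {exact - nonrel:.3e}")
print(f"exact - Eq. (130) root:  {exact - approx:.3e}")
```

For $p = 0.1\,mc$ the root of Eq. (130) should reproduce the first relativistic correction $-p^4/8m^3c^2$, leaving only a residual of the order of $p^6$.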
As Eq. (5.34) shows, the operators of the momentum and of a function of coordinates commute as
$$
\left[\hat{\mathbf p}, U(\mathbf r)\right] = -i\hbar\nabla U,
\tag{9.131}
$$
59 Strictly speaking, this requirement is imposed on the expectation values of U(r) in the eigenstates to be found.
so that the last term in the square brackets of Eq. (130) may be rewritten as
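$$
\hat{\boldsymbol\sigma}\cdot\hat{\mathbf p}\,\frac{\tilde E - U(\mathbf r)}{(2mc)^2}\,\hat{\boldsymbol\sigma}\cdot\hat{\mathbf p} \equiv \frac{\tilde E - U(\mathbf r)}{(2mc)^2}\,\hat p^2 + \frac{i\hbar}{(2mc)^2}\left(\hat{\boldsymbol\sigma}\cdot\nabla U\right)\left(\hat{\boldsymbol\sigma}\cdot\hat{\mathbf p}\right).
\tag{9.132}
$$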
Since in the low-energy limit, both terms on the right-hand side of this relation are much smaller than the three leading terms of Eq. (130), we may replace the first term's numerator with its non-relativistic approximation $\hat p^2/2m$. With this replacement, the term coincides with the first relativistic correction to the kinetic energy operator – see Eq. (6.47). The second term, proportional to the electric field $\boldsymbol{\mathscr E} = -\nabla\phi = -\nabla U/q$, may be transformed further, using a readily verifiable identity
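$$
\left(\hat{\boldsymbol\sigma}\cdot\nabla U\right)\left(\hat{\boldsymbol\sigma}\cdot\hat{\mathbf p}\right) \equiv \left(\nabla U\right)\cdot\hat{\mathbf p} + i\hat{\boldsymbol\sigma}\cdot\left(\nabla U\times\hat{\mathbf p}\right).
\tag{9.133}
$$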
Of the two terms on the right-hand side of this relation, only the second one depends on spin,60 giving the following spin-orbital interaction contribution to the Hamiltonian,
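$$
\hat H_{so} = \frac{\hbar}{(2mc)^2}\,\hat{\boldsymbol\sigma}\cdot\left(\nabla U\times\hat{\mathbf p}\right) \equiv -\frac{q}{2m^2c^2}\,\hat{\mathbf S}\cdot\left(\boldsymbol{\mathscr E}\times\hat{\mathbf p}\right).
\tag{9.134}
$$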
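The identity (133) is a particular case of the general Pauli-matrix relation $(\boldsymbol\sigma\cdot\mathbf a)(\boldsymbol\sigma\cdot\mathbf b) = \mathbf a\cdot\mathbf b + i\boldsymbol\sigma\cdot(\mathbf a\times\mathbf b)$. The sketch below spot-checks its matrix algebra for two c-number vectors with arbitrary illustrative components; it does not model the operator nature of $\hat{\mathbf p}$, for which the same relation holds as long as the order of the two vectors is preserved. All names are hypothetical.

```python
def matmul2(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

SIGMA = ([[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]])
IDENTITY = [[1, 0], [0, 1]]

def sigma_dot(v):
    """2x2 matrix sigma . v for a 3-component vector v of c-numbers."""
    return [[sum(v[k] * SIGMA[k][i][j] for k in range(3)) for j in range(2)]
            for i in range(2)]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

a = [0.7, -1.1, 0.4]    # stands in for grad U (illustrative numbers)
b = [0.2, 0.9, -1.5]    # stands in for p (illustrative numbers)

lhs = matmul2(sigma_dot(a), sigma_dot(b))
s_cross = sigma_dot(cross(a, b))
rhs = [[dot(a, b) * IDENTITY[i][j] + 1j * s_cross[i][j] for j in range(2)]
       for i in range(2)]
max_diff = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print(max_diff)
```

The printed difference should be at the level of rounding errors, confirming that only the cross-product term carries the spin dependence.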
For a central potential $\phi(r)$, its gradient has only the radial component: $\nabla\phi = (d\phi/dr)\,\mathbf r/r = -\mathscr E\,\mathbf r/r$, and with the angular momentum definition (5.147), Eq. (134) is (finally!) reduced to Eq. (122).
As was shown in Sec. 6.3, the perturbative treatment of Eq. (122), together with the kinetic-relativistic correction (6.47), in the hydrogen-like atom/ion problem, leads to the fine structure of each Bohr level En, given by Eq. (6.60):
$$
\Delta E_{\text{fine}} = \frac{E_n^2}{2mc^2}\left(3 - \frac{4n}{j + ½}\right).
\tag{9.135}
$$
This result receives a confirmation from the surprising fact that for the hydrogen-like atom/ion problem, the Dirac equation may be solved exactly – without any assumptions. I would not have time/space to reproduce the solution,61 and will only list the final result for the energy spectrum:
H-like atom, eigenenergies:
$$
\frac{E}{mc^2} = \left\{1 + \frac{Z^2\alpha^2}{\left[n + \left((j + ½)^2 - Z^2\alpha^2\right)^{1/2} - (j + ½)\right]^2}\right\}^{-1/2}.
\tag{9.136}
$$
Here $n = 1, 2, \ldots$ is the same principal quantum number as in Bohr's theory, while $j$ is the quantum number specifying the eigenvalues (5.175) of $J^2$, in our case of a spin-½ particle taking half-integer values: $j = l \pm ½ = 1/2, 3/2, 5/2, \ldots$ – see Eq. (5.189). This is natural, because due to the spin-orbit interaction, the orbital momentum and spin are not conserved, while their vector sum, $\mathbf J = \mathbf L + \mathbf S$, is – at least in the absence of an external field. Each energy level (136) is doubly-degenerate, with two eigenstates representing two directions of the spin. (In the low-energy limit, we may say: corresponding to two values of $l = j \mp ½$, at fixed $j$.)
60 The first term gives a small spin-independent energy shift, which is very difficult to verify experimentally.
61 Good descriptions of the solution are available in many textbooks (the older the better :-) – see, e.g., Sec. 53 in L. Schiff, Quantum Mechanics, 3rd ed., McGraw-Hill (1968).
Speaking of that limit (when $E - mc^2 \sim E_H \ll mc^2$): since according to Eq. (1.13) for $E_H$, the square of the fine-structure constant $\alpha \equiv e^2/4\pi\varepsilon_0\hbar c$ may be represented as the ratio $E_H/mc^2$, we may follow this limit expanding Eq. (136) into the Taylor series in $(Z\alpha)^2 \ll 1$. The result,
$$
E \approx mc^2\left[1 - \frac{Z^2\alpha^2}{2n^2} - \frac{Z^4\alpha^4}{2n^4}\left(\frac{n}{j + ½} - \frac{3}{4}\right)\right],
\tag{9.137}
$$
has the same structure, and allows the same interpretation as Eq. (92), but with the last term coinciding with Eq. (6.60) – and with experimental results. Historically, this correct description of the fine structure of the atomic levels provided the decisive proof of Dirac’s theory.
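The agreement between the exact spectrum (136) and its expansion (137) may be illustrated numerically. The sketch below, written under the assumption of an infinitely heavy nucleus and with CODATA constants, compares the two expressions for the n = 2 level of hydrogen (Z = 1); the function names are hypothetical.

```python
import math

ALPHA = 7.2973525693e-3        # fine-structure constant (CODATA)
MC2_EV = 510998.95             # electron rest energy in eV (CODATA, rounded)

def dirac_exact(n, j, Z=1):
    """Eq. (9.136): exact eigenenergy in eV, rest energy included."""
    k = j + 0.5
    za2 = (Z * ALPHA) ** 2
    denom = n - k + math.sqrt(k**2 - za2)
    return MC2_EV / math.sqrt(1 + za2 / denom**2)

def dirac_expanded(n, j, Z=1):
    """Eq. (9.137): expansion up to the (Z alpha)^4 terms."""
    za2 = (Z * ALPHA) ** 2
    return MC2_EV * (1 - za2 / (2 * n**2)
                     - za2**2 / (2 * n**4) * (n / (j + 0.5) - 0.75))

ground_binding = dirac_exact(1, 0.5) - MC2_EV
split_exact = dirac_exact(2, 1.5) - dirac_exact(2, 0.5)
split_expanded = dirac_expanded(2, 1.5) - dirac_expanded(2, 0.5)
print(f"ground-state energy offset: {ground_binding:.4f} eV")
print(f"n = 2 splitting: exact {split_exact:.4e} eV, expanded {split_expanded:.4e} eV")
```

The two values of the n = 2 splitting between the j = 3/2 and j = 1/2 levels should differ only by a relative correction of the order of $\alpha^2$.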
However, even such an impressive theory does not have too many direct applications. The main reason for that was already discussed in brief at the end of Sec. 5: due to the possibility of creation and annihilation of particle-antiparticle pairs by an energy influx higher than $2mc^2$, the number of particles participating in high-energy interactions is not fixed. An adequate general description of such situations is given by the quantum field theory, in which the particle's wavefunction is treated as a field to be quantized, using so-called field operators $\hat\psi(\mathbf r, t)$ – very much similar to the electromagnetic field operators (16). The Dirac equation follows from such theory in the single-particle approximation.
As was mentioned above on several occasions, the quantum field theory is well beyond the time/space limits of this course, and I have to stop here, referring the interested reader to one of several excellent textbooks on this discipline.62 However, I would strongly encourage the students going in this direction to start by playing with the field operators on their own, taking clues from Eqs. (16), but replacing the creation/annihilation operators $\hat a_j^\dagger$ and $\hat a_j$ of the electromagnetic field oscillators with those of the general second quantization formalism outlined in Sec. 8.3.
9.8. Exercise problems
9.1. Prove the Casimir formula, given by Eq. (23), by calculating the net force $F = PA$ exerted by the electromagnetic field, in its ground state, on two perfectly conducting parallel plates of area $A$, separated by a vacuum gap of width $t \ll A^{1/2}$.
Hint: Calculate the field energy in the gap volume with and without the account of the plate effect, and then apply the Euler-Maclaurin formula63 to the difference between these two results.
9.2. Electromagnetic radiation by some single-mode quantum sources may have such a high degree of coherence that it is possible to observe the interference of waves from two independent sources with virtually the same frequency, incident on one detector.
(i) Generalize Eq. (29) to this case.
62 For a gradual introduction see, e.g., either L. Brown, Quantum Field Theory, Cambridge U. Press (1994) or R.
Klauber, Student Friendly Quantum Field Theory, Sandtrove (2013). On the other hand, M. Srednicki, Quantum Field Theory, Cambridge U. Press (2007) and A. Zee, Quantum Field Theory in a Nutshell, 2nd ed., Princeton (2010), among others, offer steeper learning curves.
63 See, e.g., MA Eq. (2.12a).
(ii) Use this generalized expression to show that incident waves in different Fock states do not create an interference pattern.
9.3. Calculate the zero-delay value $g^{(2)}(0)$ of the second-order correlation function of a single-mode electromagnetic field in the so-called Schrödinger-cat state:64 a coherent superposition of two Glauber states, with equal but sign-opposite parameters $\alpha$, and a certain phase shift between them.
9.4. Calculate the zero-delay value $g^{(2)}(0)$ of the second-order correlation function of a single-mode electromagnetic field in the squeezed ground state defined by Eq. (5.142).
9.5. Calculate the rate of spontaneous photon emission (into unrestricted free space) by a hydrogen atom, initially in the 2p state ($n = 2$, $l = 1$) with $m = 0$. Would the result be different for $m = \pm 1$? For the 2s state ($n = 2$, $l = 0$, $m = 0$)? Discuss the relation between these quantum-mechanical results and those given by the classical theory of radiation for the simplest classical model of the atom.
9.6. An electron has been placed on the lowest excited level of a spherically-symmetric, quadratic potential well $U(\mathbf r) = m_e\omega^2r^2/2$. Calculate the rate of its relaxation to the ground state, with the emission of a photon (into unrestricted free space). Compare the rate with that for a similar transition of the hydrogen atom, for the case when the radiation frequencies of these two systems are equal.
9.7. Derive an analog of Eq. (53) for the spontaneous photon emission into the free space, due to a change of the magnetic dipole moment m of a small-size system.
9.8. A spin-½ particle, with a gyromagnetic ratio $\gamma$, is in its orbital ground state in dc magnetic field $\mathbf B_0$. Calculate the rate of its spontaneous transition from the higher to the lower energy level, with the emission of a photon into the free space. Evaluate this rate for an electron in a field of 10 T, and discuss the implications of this result for laboratory experiments with electron spins.
9.9. Calculate the rate of spontaneous transitions between the two sublevels of the ground state of a hydrogen atom, formed as a result of its hyperfine splitting. Discuss the implications of the result for the width of the 21-cm spectral line of hydrogen.
9.10. Find the eigenstates and eigenvalues of the Jaynes-Cummings Hamiltonian (78), and discuss their behavior near the resonance point $\omega = \Omega$.
9.11. Analyze the Purcell effect, mentioned in Secs. 3 and 4, quantitatively; in particular, calculate the so-called Purcell factor $F_P$ defined as the ratio of the rate $\Gamma_s$ of atom's spontaneous emission into a resonant cavity tuned exactly to the quantum transition frequency, to that into the free space.
9.12. Prove that the Klein-Gordon equation (84) may be rewritten in the form similar to the non-relativistic Schrödinger equation (1.25), but for a two-component wavefunction, with the Hamiltonian represented (in the usual z-basis) by the following 2×2 matrix:
64 Its name stems from the well-known Schrödinger cat paradox, which is (very briefly) discussed in Sec. 10.1.
$$
\mathrm H = -\left(\sigma_z + i\sigma_y\right)\frac{\hbar^2}{2m}\nabla^2 + mc^2\sigma_z.
$$
Use your solution to discuss the physical meaning of the wavefunction’s components.
9.13. Calculate and discuss the energy spectrum of a relativistic, spinless, charged particle placed into an external uniform, time-independent magnetic field B. Use the result to formulate the condition of validity of the non-relativistic theory in this situation.
9.14. Prove Eq. (91) for the energy spectrum of a hydrogen-like atom/ion, starting from the relativistic Schrödinger equation.
Hint: A mathematical analysis of Eq. (3.193) shows that its eigenvalues are given by Eq. (3.201), $\varepsilon_n = -1/2n^2$, with $n = l + 1 + n_r$, where $n_r = 0, 1, 2, \ldots$, even if the parameter $l$ is not integer.
9.15. Derive a general expression for the differential cross-section of elastic scattering of a spinless relativistic particle by a static potential $U(\mathbf r)$, in the Born approximation, and formulate the conditions of its validity. Use these results to calculate the differential cross-section of scattering of a particle with the electric charge $-e$ by the Coulomb electrostatic potential $\phi(\mathbf r) = Ze/4\pi\varepsilon_0r$.
9.16. Starting from Eqs. (95)-(98), prove that the probability density $w$ given by Eq. (101) and the probability current density $\mathbf j$ defined by Eq. (102) do indeed satisfy the continuity equation (1.52): $\partial w/\partial t + \nabla\cdot\mathbf j = 0$.
9.17. Calculate the commutator of the operator $\hat L^2$ and Dirac's Hamiltonian of a free particle. Compare the result with that for the non-relativistic Hamiltonian, and interpret the difference.
9.18. Calculate commutators of the operators $\hat S^2$ and $\hat J^2$ with Dirac's Hamiltonian (97), and give an interpretation of the results.
9.19. In the Heisenberg picture of quantum dynamics, derive an equation describing the time evolution of free electron’s velocity in the Dirac theory. Solve the equation for the simplest state, with definite energy and momentum, and discuss the solution.
9.20. Calculate the eigenstates and eigenenergies of a relativistic spin-½ particle with charge q, placed into a uniform, time-independent external magnetic field B. Compare the calculated energy spectrum with those following from the non-relativistic theory and the relativistic Schrödinger equation.
9.21.* Following the discussion at the very end of Section 7, introduce quantum field operators $\hat\psi$ that would be related to the usual wavefunctions $\psi$ just as the electromagnetic field operators (16) are related to the classical electromagnetic fields, and explore basic properties of these operators. (For this preliminary study, consider the fixed-time situation.)
Chapter 10. Making Sense of Quantum Mechanics
This (rather brief) chapter addresses some conceptually important issues of quantum measurements and quantum state interpretation. Please note that some of these issues are still subjects of debate 1 –
fortunately not affecting quantum mechanics’ practical results, discussed in the previous chapters.
10.1. Quantum measurements
The knowledge base outlined in the previous chapters gives us a sufficient background for a (by necessity, very brief) discussion of quantum measurements.2 Let me start by reminding the reader of the only postulate of the quantum theory that relates it to experiment – so far, meaning perfect measurements. In the simplest case when the system is in a coherent (pure) quantum state, its ket-vector may be represented as a linear superposition
$$
|\alpha\rangle = \sum_j \alpha_j|a_j\rangle,
\tag{10.1}
$$
where $|a_j\rangle$ are the eigenstates of the operator of an observable $A$, related to its eigenvalues $A_j$ by Eq. (4.68):
$$
\hat A|a_j\rangle = A_j|a_j\rangle.
\tag{10.2}
$$
In such a state, the outcome of every single measurement of the observable $A$ may be uncertain, but is restricted to the set of eigenvalues $A_j$, with the $j$th outcome probability equal to
$$
W_j = |\alpha_j|^2.
\tag{10.3}
$$
As was discussed in Chapter 7, the state of the system (or rather of the statistical ensemble of macroscopically similar systems we are using for this particular series of similar experiments) may be not coherent, and hence even more uncertain than the state described by Eq. (1). Hence, the measurement postulate means that even if the system is in this (the least uncertain) state, the measurement outcomes are still probabilistic.3
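The probabilistic content of the measurement postulate may be illustrated by a short simulation of repeated perfect measurements on an ensemble of identically prepared systems. The two-state superposition, its amplitudes, and the function names below are illustrative assumptions rather than anything specified in the text.

```python
import random

def outcome_probabilities(amplitudes):
    """W_j = |alpha_j|^2, normalizing the amplitudes if needed."""
    norm = sum(abs(a) ** 2 for a in amplitudes)
    return [abs(a) ** 2 / norm for a in amplitudes]

def simulate_measurements(amplitudes, eigenvalues, shots, seed=0):
    """Draw `shots` independent outcomes A_j with the probabilities W_j."""
    rng = random.Random(seed)
    weights = outcome_probabilities(amplitudes)
    return rng.choices(eigenvalues, weights=weights, k=shots)

amplitudes = [0.6, 0.8j]      # hypothetical two-state superposition
eigenvalues = [+0.5, -0.5]    # e.g. S_z in units of hbar
W = outcome_probabilities(amplitudes)
outcomes = simulate_measurements(amplitudes, eigenvalues, shots=10_000)
frequency_up = outcomes.count(+0.5) / len(outcomes)
print(W, frequency_up)
```

Each single outcome is one of the eigenvalues and cannot be predicted, while the relative frequencies over many shots approach the probabilities given by Eq. (3).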
If we believe that a particular measurement may be done perfectly, and do not worry too much how exactly, we are subscribing to the mathematical notion of measurement, that was, rather reluctantly, used in these notes – up to this point. However, the actual (physical) measurements are always imperfect, first of all because of the huge gap between the energy-time scale $\hbar \sim 10^{-34}$ J·s of the quantum phenomena in “microscopic” systems such as atoms, and the “macroscopic” scale of the direct human perception, so that the role of the instruments bridging this gap (Fig. 1), is highly nontrivial.
1 For an excellent review of these controversies, as presented in a few leading textbooks, I highly recommend J.
Bell’s paper in the collection by A. Miller (ed.), Sixty-Two Years of Uncertainty, Plenum, 1989.
2 “Quantum measurements” is a very unfortunate and misleading term; it would be more sensible to speak about
“measurements of observables in quantum mechanical systems”. However, the former term is so common and compact that I will use it – albeit rather reluctantly.
3 The measurement outcomes become definite only in the trivial case when the system is definitely in one of the eigenstates $a_j$, say $a_0$; then $\alpha_j = \delta_{j,0}\exp\{i\varphi\}$, and $W_j = \delta_{j,0}$.
© K. Likharev
Essential Graduate Physics
[Fig. 10.1. The general scheme of a quantum measurement. Diagram labels: quantum system; interaction; back action; instrument; macroscopic pointer; to a human observer.]
Besides the famous Bohr-Einstein discussion in the mid-1930s, which will be briefly reviewed in Sec. 3, the founding fathers of quantum mechanics have not paid much attention to these issues, apparently because of the following reason. At that time it looked like the experimental instruments (at least the best of them :-) were doing exactly what the measurement postulate was telling. For example, the z-oriented Stern-Gerlach experiment (Fig. 4.1) turns two complex coefficients $\alpha_\uparrow$ and $\alpha_\downarrow$, describing the spin state of the incoming electrons, into a set of particle-counter clicks, with the rates proportional to, respectively, $|\alpha_\uparrow|^2$ and $|\alpha_\downarrow|^2$. The crude internal nature of these instruments makes more detailed questions unnatural. For example, each click of a Geiger counter involves an effective disappearance of one observed electron in a zillion-particle electric discharge avalanche it has triggered. A century ago, it looked much more important to extend the newly born quantum mechanics to more complex systems (such as atomic nuclei, etc.) than to think about the physics of such instruments.
However, since that time the experimental techniques, notably including high-vacuum and low-temperature systems, micro- and nano-fabrication, and low-noise electronics, have improved quite dramatically. In particular, we now may observe quantum-mechanical behavior of more and more macroscopic objects – such as the micromechanical oscillators mentioned in Sec. 2.9. Moreover, some
“macroscopic quantum systems” (in particular, special systems of Josephson junctions, see below) have properties enabling their use as essential parts of measurement setups. Such developments are making the line separating the “micro” and “macro” worlds finer and finer, so that more inquisitive inquiries into the physical nature of quantum measurements are not so hopeless now. In my personal scheme of things,4 these inquiries may be grouped as follows:
(i) Does a quantum measurement involve any laws besides those of quantum mechanics? In particular, should it necessarily involve a human/intelligent observer? (The last question is not as laughable as it may look – see below.)
(ii) What is the state of the measured system just after a single-shot measurement – meaning a measurement process limited to a time interval much shorter than the time scale of the measured system’s evolution? (This question is a necessary part of any discussion of repeated measurements and of their ultimate form – continuous monitoring of a certain observable.)
(iii) If a measurement of an observable A has produced a certain outcome Aj, what statements may be made about the state of the system just before the measurement? (This question is most closely related to various interpretations of quantum mechanics.)
4 Again, this list and some other issues discussed in the balance of this section are still controversial.
Let me discuss these issues in the listed order. First of all, I am happy to report that there is a virtual consensus of physicists on some aspects of these issues. According to this consensus, any reasonable quantum measurement needs to result in a certain, distinguishable state of a macroscopic output component of the measurement instrument – see Fig. 1. (Traditionally, this component is called a pointer, though its role may be played by a printer or a plotter, an electronic circuit sending out the result as a number, etc.) This requirement implies that the measurement process should have the following features:
- provide a large “signal gain”, i.e. some means of mapping the quantum state with its $\hbar$-scale of action (i.e. of the energy-by-time product) onto a macroscopic position of the pointer with a much larger action scale, and
- if we want to approach the fundamental limit of uncertainty, given by Eq. (3), the instrument should introduce as little additional fluctuations (“noise”) as permitted by the laws of physics.
Both these requirements are fulfilled in a well-designed Stern-Gerlach experiment – see Fig. 4.1
again. Indeed, the magnetic field gradient, splitting the electron beam, turns the minuscule (microscopic) energy difference (4.167) between two spin-polarized states into a macroscopic difference between the final positions of two output beams, where their detectors may be located. However, as was noted above, the internal physics of the particle detectors (say, Geiger counters) at this measurement is rather complex, and would not allow us to discuss some aspects of the measurement, in particular to answer the second of inquiries we are working on.
For this reason, let me describe the scheme of a closely similar “single-shot” measurement of a two-level quantum system, which shares the simplicity, high gain, and low internal noise of the Stern-Gerlach apparatus, but has the advantage that in certain hardware implementations,5 the measurement process allows a thorough, quantitative theoretical description. Let us measure a particle trapped in a double-well potential (Fig. 2), where x is some continuous generalized coordinate – not necessarily a mechanical displacement. Let the particle be initially in a pure quantum state, with the energy close to the well’s bottom. Then, as we know from the discussion of such systems in Secs. 2.6 and 5.1, the state may be described by a ket-vector similar to that of spin-½:
$$|\alpha\rangle = \alpha_\rightarrow |\rightarrow\rangle + \alpha_\leftarrow |\leftarrow\rangle, \tag{10.4}$$
where the component states $\rightarrow$ and $\leftarrow$ are described by wavefunctions localized near the potential well bottoms at $x \approx \pm x_0$ – see the blue lines in Fig. 2. Our goal is to measure in which well the particle resides at a certain time instant, say at t = 0. For that, let us rapidly change, at that moment, the potential profile of the system, so that at t > 0, near the origin, it may be well described by an inverted parabola:
$$U(x) = -\frac{m\lambda^2}{2}x^2, \quad \text{for } t > 0, \ |x| \ll x_f. \tag{10.5}$$
5 The scheme may be implemented, for example, using a simple Josephson-junction circuit called the balanced comparator – see, e.g., T. Walls et al., IEEE Trans. on Appl. Supercond. 17, 136 (2007), and references therein. Experiments have demonstrated that this system may have a measurement variance dominated by the theoretically expected quantum-mechanical uncertainty, at practicable experimental conditions (at temperatures below ~1 K). A conceptual advantage of this system is that it is based on externally-shunted Josephson junctions, i.e. devices whose quantum-mechanical model, including its part describing the coupling to the environment, is in quantitative agreement with experiment – see, e.g., D. Schwartz et al., Phys. Rev. Lett. 55, 1547 (1985). Colloquially, the balanced comparator is a high-gain instrument with a “well-documented Hamiltonian”, eliminating the need for speculations about the environmental effects. In particular, the dephasing process in it, and its time $T_2$, are well described by Eqs. (7.89) and (7.142), with the coefficients $\eta$ equal to the Ohmic conductances G of the shunts.
It is straightforward to verify that the Heisenberg equations of motion in such an inverted potential describe an exponential growth of the operator $\hat{x}$ in time (proportional to $\exp\{\lambda t\}$) and hence a similar, proportional growth of the expectation value $\langle x \rangle$ and its r.m.s. uncertainty $\delta x$.6 At this “inflation” stage, the coherence between the two component states $\rightarrow$ and $\leftarrow$ is still preserved, i.e. the time evolution of the system is, in principle, reversible.
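As a rough numerical illustration of this “inflation” stage, note that for the potential (5) the Heisenberg equations have the solution x(t) = x(0) cosh λt + [p(0)/mλ] sinh λt, so that at λt >> 1 both the expectation value and the r.m.s. uncertainty of x grow as exp{λt}, and their ratio tends to a constant. The Python sketch below is only an illustration under assumptions that are not part of the text: zero initial average momentum, no initial coordinate-momentum correlation, a minimum-uncertainty initial packet, and units with ħ = 1. The function name and all parameter values are hypothetical.

```python
import math

HBAR = 1.0  # illustrative units with hbar = 1 (assumption)

def inflate(x0, dx0, m, lam, t):
    """Return (<x>, delta_x) at time t in the inverted potential U = -m*lam^2*x^2/2.

    Uses x(t) = x(0)*cosh(lam*t) + p(0)/(m*lam)*sinh(lam*t).
    Assumptions: <p(0)> = 0, no initial x-p correlation, and a
    minimum-uncertainty packet with delta_p0 = hbar / (2*delta_x0).
    """
    dp0 = HBAR / (2.0 * dx0)
    c, s = math.cosh(lam * t), math.sinh(lam * t)
    mean_x = x0 * c
    delta_x = math.sqrt((dx0 * c) ** 2 + (dp0 * s / (m * lam)) ** 2)
    return mean_x, delta_x

for t in (0.0, 2.0, 4.0, 8.0):
    mean_x, delta_x = inflate(x0=1.0, dx0=0.5, m=1.0, lam=1.0, t=t)
    print(f"lam*t = {t:3.1f}:  <x> = {mean_x:10.3e},  dx = {delta_x:10.3e},  "
          f"<x>/dx = {mean_x / delta_x:.4f}")
```

With these sample values, the printed ratio ⟨x⟩/δx stops changing once λt exceeds a few units, while ⟨x⟩ and δx themselves keep growing exponentially.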
Fig. 10.2. The potential inversion, as viewed on the (a) “macroscopic” and (b) “microscopic” scales of the generalized coordinate x. [Labels recoverable from the figure: U(x, t) plotted versus x, curves for t < 0 and t > 0, and the marked points $\pm x_f$ and $\pm x_0$.]
Now let the system be weakly coupled, also at t > 0, to a dissipative (e.g., Ohmic) environment.
As we know from Chapter 7, such coupling ensures the state’s dephasing on some time scale $T_2$. If
$$x_0 \ll x_0 \exp\{\lambda T_2\} \ll x_f, \tag{10.6}$$
then the process, after the potential inversion, consists of two stages, well separated in time:
- the already discussed “inflation” stage, preserving the coherence of the component states, and
- the dephasing stage, at which the coherence of the component states $\rightarrow$ and $\leftarrow$ is gradually suppressed as described by Eq. (7.89), i.e. the density matrix of the system is gradually reduced to the diagonal form describing a classical mixture of two probability packets with the probabilities (3) equal to, respectively, $W_\rightarrow = |\alpha_\rightarrow|^2$ and $W_\leftarrow = |\alpha_\leftarrow|^2 \equiv 1 - |\alpha_\rightarrow|^2$.
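The reduction of the density matrix to the diagonal form may be illustrated with a minimal Python sketch. It assumes, for illustration only, that the off-diagonal elements decay as exp{–t/T2}; the actual law is the one given by Eq. (7.89), which is not reproduced here. The function names and the sample amplitudes are hypothetical.

```python
import math

def dephased_density_matrix(a_right, a_left, t, t2):
    """2x2 density matrix in the (right, left) basis at time t.

    Assumption (not taken from the text): the coherences are damped by the
    simple factor exp(-t/t2), while the diagonal elements are unchanged.
    """
    damp = math.exp(-t / t2)
    rho_rl = a_right * a_left.conjugate() * damp
    return [[abs(a_right) ** 2, rho_rl],
            [rho_rl.conjugate(), abs(a_left) ** 2]]

def purity(rho):
    """Tr(rho^2) of a 2x2 Hermitian matrix: 1 for a pure state, less for a mixture."""
    return abs(rho[0][0]) ** 2 + abs(rho[1][1]) ** 2 + 2.0 * abs(rho[0][1]) ** 2

a_right, a_left = complex(0.6, 0.0), complex(0.0, 0.8)  # sample amplitudes
for t in (0.0, 1.0, 5.0, 50.0):
    rho = dephased_density_matrix(a_right, a_left, t, t2=1.0)
    print(f"t/T2 = {t:4.1f}:  W_right = {rho[0][0]:.2f},  W_left = {rho[1][1]:.2f},  "
          f"|coherence| = {abs(rho[0][1]):.3e},  purity = {purity(rho):.4f}")
```

In this sketch the probabilities W→ and W← stay fixed, while the purity drops from 1 to W→² + W←², the value for a classical mixture.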
Besides dephasing, the environment gives the motion certain kinematic friction, with the drag coefficient (7.141), so that the system eventually settles to rest at one of the macroscopically separated minima $x = \pm x_f$ of the inverted potential (Fig. 2a), thus ensuring a high “signal gain” $x_f/x_0 \gg 1$. As a result, the final probability density distribution w(x) along the x-axis has two narrow, well-separated peaks. But this is just the situation that was discussed in Sec. 2.5 – see, in particular, Fig. 2.17. Since that discussion is very important, let me repeat – or rather rephrase it. The final state of the system is a classical mixture of two well-separated states, with the respective probabilities $W_\rightarrow$ and $W_\leftarrow$, whose sum equals 1. Now let us use some detector to test whether the system is in one of these states – say the right one. (If $x_f$ is sufficiently large, the noise contribution of this detector into the measurement uncertainty is negligible,7 and its physics is unimportant.) If the system has been found at this location (again, the probability of this outcome is $W_\rightarrow = |\alpha_\rightarrow|^2$), the probability to find it at the counterpart (left) location at a subsequent detection turns to zero.
6 Somewhat counter-intuitively, the latter growth improves the measurement’s fidelity. Indeed, it does not affect the intrinsic “signal-to-noise ratio” $\langle x \rangle/\delta x$, while making the intrinsic (say, quantum-mechanical) uncertainty much larger than the possible noise contribution by the later measurement stage(s).
This probability “reduction” is a purely classical (or if you like, mathematical) effect of the statistical ensemble’s re-definition: $W_\leftarrow$ equals zero not in the initial ensemble of all similar experiments (where it equals $|\alpha_\leftarrow|^2$), but only in the re-defined ensemble of experiments in which the system had been found at the right location. Of course, which ensemble to use, i.e. what probabilities to register/publish, is a purely accounting decision, which should be made by a human (or otherwise intelligent :-) observer. If we are only interested in an objective recording of results of a pre-fixed sequence of experiments (i.e. the members of a pre-defined, fixed statistical ensemble), there is no need to include such an observer in any discussion. In any case, this detection/registration process, very common in classical statistics, leaves no space for any mysterious “wave packet reduction” – understood as a hypothetical process that would not obey the regular laws of quantum mechanical evolution.
The state dephasing and ensemble re-definition at measurements are at the core of several paradoxes, of which the so-called quantum Zeno paradox is perhaps the most spectacular.8 Let us return to a two-level system with the unperturbed Hamiltonian given by Eq. (4.166), the quantum oscillation period $2\pi/\Omega$ much longer than the single-shot measurement time, and the system initially (at t = 0) definitely in one of the partial quantum states – for example, a certain potential well of the double-well potential. Then, as we know from Secs. 2.6 and 4.6, the probability to find the system in this initial state at time t > 0 is
$$W(t) = \cos^2\frac{\Omega t}{2} \equiv 1 - \sin^2\frac{\Omega t}{2}. \tag{10.7}$$
If the time is small enough ($t = dt \ll 1/\Omega$), we may use the Taylor expansion to write
$$W(dt) \approx 1 - \frac{\Omega^2 dt^2}{4}. \tag{10.8}$$
Now, let us use some good measurement scheme (say, the potential inversion discussed above) to measure whether the system is still in this initial state. If it is (as Eq. (8) shows, the probability of such an outcome is nearly 100%), then the system, after the measurement, is in the same state. Let us allow it to evolve again, with the same Hamiltonian. Then the evolution of W will follow the same law as in Eq. (7). Thus, when the system is measured again at time 2dt, the probability to find it in the same state both times is
$$W(2dt) \approx W(dt)\left(1 - \frac{\Omega^2 dt^2}{4}\right) \approx \left(1 - \frac{\Omega^2 dt^2}{4}\right)^2. \tag{10.9}$$
7 At the balanced-comparator implementation mentioned above, the final state detection may be readily performed using a “SQUID” magnetometer based on the same Josephson junction technology – see, e.g., EM Sec. 6.5. In this case, the distance between the potential minima $\pm x_f$ is close to one superconducting flux quantum (3.38), while the additional uncertainty induced by the SQUID may be as low as a few millionths of that amount.
8 This name, coined by E. Sudarshan and B. Misra in 1977 (though the paradox had been discussed in detail by A. Turing in 1954), is due to its superficial similarity to the classical paradoxes by the ancient Greek philosopher Zeno of Elea. By the way, just for fun, let us have a look at what happens when Mother Nature is discussed by people who do not understand math and physics. The most famous of the classical Zeno paradoxes is the case of Achilles and the Tortoise: the fast runner Achilles can apparently never overtake the slower Tortoise, because (in Aristotle’s words) “the pursuer must first reach the point whence the pursued started, so that the slower must always hold a lead”. For a physicist, the paradox has a trivial, obvious resolution, but here is what a philosopher writes about it – not in some year BC, but in 2010 AD: “Given the history of ‘final resolutions’, from Aristotle onwards, it’s probably foolhardy to think we’ve reached the end.” For me, this is a sad symbol of modern philosophy.
After repeating this cycle N times (with the total time t = Ndt still much less than $N^{1/2}/\Omega$), the probability that the system is still in its initial state is
$$W(Ndt) \equiv W(t) \approx \left(1 - \frac{\Omega^2 dt^2}{4}\right)^N = \left(1 - \frac{\Omega^2 t^2}{4N^2}\right)^N \approx 1 - \frac{\Omega^2 t^2}{4N}. \tag{10.10}$$
Comparing this result with Eq. (7), we see that the process of the system’s transfer to the opposite partial state has been slowed down rather dramatically, and in the limit $N \to \infty$ (at fixed t), its evolution is virtually stopped by the measurement process. There is of course nothing mysterious here; the evolution slowdown is due to the quantum state dephasing at each measurement.
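The slowdown may be checked numerically. The Python sketch below compares the unmeasured survival probability (7) with the probability of finding the system in its initial state at all N equally spaced measurements, and with the approximation (10). The function names and the sample values (Ω = 1, t = π) are chosen only for illustration.

```python
import math

def survival_free(omega, t):
    """Eq. (7): survival probability without intermediate measurements."""
    return math.cos(omega * t / 2.0) ** 2

def survival_measured(omega, t, n):
    """Probability to find the initial state at each of n measurements,
    spaced by dt = t/n, with free evolution (7) between them."""
    return survival_free(omega, t / n) ** n

def survival_approx(omega, t, n):
    """Last form of Eq. (10), valid for t << sqrt(n)/omega."""
    return 1.0 - (omega * t) ** 2 / (4.0 * n)

omega, t = 1.0, math.pi  # without measurements, the transfer is complete at this t
print(f"no measurements: W = {survival_free(omega, t):.3e}")
for n in (10, 100, 1000):
    print(f"N = {n:4d}:  W = {survival_measured(omega, t, n):.5f},  "
          f"Eq. (10) gives {survival_approx(omega, t, n):.5f}")
```

For these values, the unmeasured system has fully left its initial state, while the frequently measured one stays in it with a probability that approaches 1 as N grows.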
This may be the only acceptable occasion for me to mention, very briefly, one more famous – or rather infamous – paradox, that of the Schrödinger cat, so much overplayed in popular publications.9 For this thought experiment, there is no need to discuss the (rather complicated :-) physics of the cat. As soon as the charged particle, produced at the radioactive decay, reaches the Geiger counter, the initial coherent superposition of the two possible quantum states (“the decay has happened”/“the decay has not happened”) of the system is rapidly dephased, i.e. reduced to their classical mixture, leading, correspondingly, to the classical mixture of the final macroscopic states “cat dead”/“cat alive”. So, despite attempts by numerous authors without a proper physics background to represent this situation as a mystery whose discussion needs the involvement of professional philosophers, hopefully the reader knows enough about dephasing from Chapter 7 to ignore all this babble.
10.2. QND measurements
I hope that the above discussion has sufficiently illuminated the issues of the group (i), so let me proceed to the question group (ii), in particular to the general issue of the back action of the instrument upon the system under measurement – symbolized with the back arrow in Fig. 1. In instruments like the Geiger counter, such back action is large: the instrument essentially destroys (“demolishes”) the state of the system under measurement. Even the “cleaner” potential-inversion measurement, shown in Fig. 2, fully destroys the initial coherence of the system, i.e. perturbs it rather substantially.
However, in the 1970s it was understood that this is not really necessary. For example, in Sec. 7.3, we have already discussed an example of a two-level system coupled with its environment and described by the Hamiltonian (7.68)-(7.70):
$$\hat{H} = \hat{H}_s + \hat{H}_{\rm int} + \hat{H}_e, \quad \text{with } \hat{H}_s = c_z \hat{\sigma}_z \text{ and } \hat{H}_{\rm int} = -\hat{f}\hat{\sigma}_z, \tag{10.11}$$
so that
$$\left[\hat{H}_s, \hat{H}_{\rm int}\right] = 0. \tag{10.12}$$
9 I fully agree with S. Hawking who has been quoted to say, “When I hear about the Schrödinger cat, I reach for my gun.” The only good aspect of this popularity is that the formulation of this paradox should be so well known to the reader that I do not need to waste time/space repeating it.
Comparing this equality with Eq. (4.199), applied to the explicitly-time-independent Hamiltonian $\hat{H}_s$,
$$i\hbar\dot{\hat{H}}_s = \left[\hat{H}_s, \hat{H}\right] = \left[\hat{H}_s, \hat{H}_s + \hat{H}_{\rm int} + \hat{H}_e\right] = \left[\hat{H}_s, \hat{H}_{\rm int}\right] = 0, \tag{10.13}$$
we see that in the Heisenberg picture, the Hamiltonian operator (and hence the energy) of the system of our interest does not change in time. On the other hand, if the “environment” in this discussion is the instrument used for the measurement (see Fig. 1 again), the interaction can change its state, so it may be used to measure the system’s energy – or another observable whose operator commutes with the interaction Hamiltonian. Such measurements are called quantum non-demolition (QND), or sometimes “back-action-evading”, measurements.10 Due to the lack of back action of the instrument on the corresponding variable, such measurements allow its continuous monitoring. Let me present a fine example of an actual measurement of this kind – see Fig. 3.11
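The commutation condition (12) is easy to verify on a toy model. In the Python sketch below, the instrument is represented, purely as an assumption made for illustration, by a second two-level system with the coupling operator f = σx and its own Hamiltonian proportional to σz; all numerical coefficients and helper names are hypothetical. The sketch compares the QND coupling of the form (11) with a coupling through σx of the system, which does not commute with Hs.

```python
SIGMA_X = [[0.0, 1.0], [1.0, 0.0]]
SIGMA_Z = [[1.0, 0.0], [0.0, -1.0]]
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]

def kron(a, b):
    """Kronecker product: the first factor acts on the system, the second on the instrument."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def commutator_norm(a, b):
    """Largest absolute element of [a, b]."""
    ab, ba = matmul(a, b), matmul(b, a)
    return max(abs(x - y) for ra, rb in zip(ab, ba) for x, y in zip(ra, rb))

C_Z = 1.5                                      # hypothetical coefficient in H_s
H_S = scale(C_Z, kron(SIGMA_Z, IDENTITY))      # H_s = c_z * sigma_z
H_E = scale(0.7, kron(IDENTITY, SIGMA_Z))      # toy instrument Hamiltonian
H_INT_QND = scale(-1.0, kron(SIGMA_Z, SIGMA_X))  # -f * sigma_z, as in Eq. (11)
H_INT_BAD = scale(-1.0, kron(SIGMA_X, SIGMA_X))  # -f * sigma_x: not a QND coupling

H_TOTAL_QND = add(add(H_S, H_INT_QND), H_E)
print("QND coupling:     |[H_s, H]|max =", commutator_norm(H_S, H_TOTAL_QND))
print("non-QND coupling: |[H_s, H]|max =",
      commutator_norm(H_S, add(add(H_S, H_INT_BAD), H_E)))
```

In the first case the commutator vanishes, so that the system’s energy is conserved, while the coupling still does not commute with the toy instrument’s Hamiltonian, i.e. can change the instrument’s state.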
Fig. 10.3. QND measurements of single electron’s energy by Peil and Gabrielse: (a) the experimental setup’s core, and (b) a record of the thermal excitation and spontaneous relaxation of the Fock states. © 1999 APS; reproduced with permission.
In this experiment, a single electron is captured in a Penning trap – a combination of a (virtually) uniform magnetic field B and a quadrupole electric field.12 This electric field stabilizes the cyclotron orbits but does not have any noticeable effect on electron motion in the plane perpendicular to the magnetic field, and hence on its Landau level energies – see Eq. (3.50):
$$E_n = \hbar\omega_c\left(n + \frac{1}{2}\right), \quad \text{with } \omega_c = \frac{eB}{m_e}. \tag{10.14}$$
(In the cited work, with $B \approx 5.3$ T, the cyclic frequency $\omega_c/2\pi$ was about 147 GHz, so that the Landau level splitting $\hbar\omega_c$ was close to $10^{-22}$ J, i.e. corresponded to $k_B T$ at T ~ 10 K, while the physical temperature of the system might be reduced well below that, down to 80 mK.) Now note that the analogy between a Landau-level particle and a harmonic oscillator goes beyond the energy spectrum (14). Indeed, since the Hamiltonian of a 2D particle in a perpendicular magnetic field may be reduced to Eq. (3.47), similar to that of a 1D oscillator, we may repeat all procedures of Sec. 5.4 and rewrite this effective Hamiltonian in terms of the creation-annihilation operators – see Eq. (5.72):
$$\hat{H}_s = \hbar\omega_c\left(\hat{a}^\dagger\hat{a} + \frac{1}{2}\right). \tag{10.15}$$
10 For a detailed discussion of this field see, e.g., V. Braginsky and F. Khalili (ed. by K. Thorne), Quantum Measurement, Cambridge U. Press, 1992; for an earlier review, see V. Braginsky et al., Science 209, 547 (1980).
11 S. Peil and G. Gabrielse, Phys. Rev. Lett. 83, 1287 (1999).
12 It is similar to the 2D system discussed in EM Sec. 2.7, but with additional rotation about one of the axes.
In the Peil and Gabrielse experiment, the trapped electron had one more degree of freedom – along the magnetic field. The electric field of the Penning trap created a soft confining potential along this direction (vertical in Fig. 3a; I will take it for the z-axis), so that small electron oscillations along that axis could be well described as those of a 1D harmonic oscillator of much lower eigenfrequency, in that particular experiment with $\omega_z/2\pi \approx 64$ MHz. This frequency could be measured very accurately (with error ~1 Hz) by sensitive electronics whose electric field does affect the z-motion of the electron, but not its motion in the perpendicular plane. In an exactly uniform magnetic field, the two modes of electron motion would be completely uncoupled. However, the experimental setup included two special superconducting rings made of niobium (see Fig. 3a), which slightly distorted the magnetic field and created an interaction between the modes, which might be well approximated by the Hamiltonian13
$$\hat{H}_{\rm int} = \text{const} \times \left(\hat{a}^\dagger\hat{a} + \frac{1}{2}\right)\hat{z}^2, \tag{10.16}$$
so that the main condition (12) of a QND measurement was very closely satisfied. At the same time, the coupling (16) ensured that a change of the Landau level number n by 1 changed the z-oscillation eigenfrequency by ~12.4 Hz. Since this shift was substantially larger than the electronics’ noise, rare spontaneous changes of n (due to a weak uncontrolled coupling of the electron to the environment) could be readily measured – moreover, continuously monitored – see Fig. 3b. The record shows spontaneous excitations of the electron to higher Landau levels, with its subsequent relaxation, just as described by Eqs. (7.208)-(7.210). The detailed data statistics analysis showed that there was virtually no effect of the measuring instrument on these processes – at least on the scale of minutes, i.e. as many as ~$10^{13}$ cyclotron orbit periods.14
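The orders of magnitude quoted for this experiment follow directly from Eq. (14). The short Python sketch below evaluates them for B = 5.3 T, using the SI values of the fundamental constants; the function name and the dictionary keys are hypothetical. It gives a cyclic frequency of roughly 148 GHz, a level splitting of about 10⁻²² J, an equivalent temperature of about 7 K, and close to 10¹³ cyclotron periods per minute, consistent with the estimates above.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, C
M_E = 9.1093837015e-31       # electron mass, kg
HBAR = 1.054571817e-34       # reduced Planck constant, J*s
K_B = 1.380649e-23           # Boltzmann constant, J/K

def landau_scales(b_tesla):
    """Cyclotron frequency and Landau level splitting from Eq. (14)."""
    omega_c = E_CHARGE * b_tesla / M_E          # rad/s
    splitting = HBAR * omega_c                  # J
    return {
        "cyclic_frequency_hz": omega_c / (2.0 * math.pi),
        "splitting_joule": splitting,
        "equivalent_temperature_k": splitting / K_B,
    }

scales = landau_scales(5.3)
periods_per_minute = scales["cyclic_frequency_hz"] * 60.0

print(f"omega_c / 2 pi      = {scales['cyclic_frequency_hz'] / 1e9:.1f} GHz")
print(f"hbar * omega_c      = {scales['splitting_joule']:.2e} J")
print(f"hbar * omega_c / kB = {scales['equivalent_temperature_k']:.1f} K")
print(f"periods per minute  = {periods_per_minute:.1e}")
```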
It is important, however, to note that any measurement – QND or not – cannot avoid the uncertainty relations between incompatible variables; in the particular case described above, continuous monitoring of the Landau state number n does not allow the simultaneous monitoring of its quantum phase (which may be defined exactly as in the harmonic oscillator). In this context, it is natural to wonder whether the QND measurement concept may be extended from quadratic-form variables like energy to “usual” observables such as coordinates and momenta, whose uncertainties are bound by the ordinary Heisenberg’s relation (1.35). The answer is yes, but the required methods are a bit more tricky.
13 Here I have simplified the real situation a bit. Actually, in that experiment, there was an electron spin’s contribution to the interaction Hamiltonian as well, but since the used high magnetic field polarized the spins quite reliably, their only effect was a constant shift of the frequency $\omega_z$, which is not important for our discussion.
14 See also the conceptually similar experiments, performed by different means: G. Nogues et al., Nature 400, 239 (1999).
For example, let us place an electrically charged particle into a uniform electric field $\boldsymbol{\mathcal{E}} = \mathbf{n}_x \mathcal{E}(t)$ of an instrument, so that their interaction Hamiltonian is
$$\hat{H}_{\rm int} = -q\hat{\mathcal{E}}(t)\hat{x}. \tag{10.17}$$
Such interaction may certainly pass the information on the time evolution of the coordinate x to the instrument. However, in this case, Eq. (12) is not satisfied – at least for the kinetic-energy part of the particle’s Hamiltonian; as a result, the interaction distorts its time evolution. Indeed, writing the Heisenberg equation (4.199) for the x-component of the momentum, we get
$$\dot{\hat{p}} = \dot{\hat{p}}\Big|_{\mathcal{E}=0} + q\hat{\mathcal{E}}(t). \tag{10.18}$$
On the other hand, integrating Eq. (5.139) for the coordinate operator evolution,15 we get the expression
$$\hat{x}(t) = \hat{x}(t_0) + \frac{1}{m}\int_{t_0}^{t}\hat{p}(t')\,dt', \tag{10.19}$$
which shows that the perturbations (18) of the momentum eventually find their way to the coordinate evolution, not allowing its unperturbed sequential measurements.
However, for such an important particular system as a harmonic oscillator, the following trick is possible. For this system, Eqs. (5.139) with the addition (18) may be readily combined to give a second-order differential equation for the coordinate operator, which is absolutely similar to the classical equation of motion of the system, and has a similar solution:16
$$\hat{x}(t) = \hat{x}(t)\Big|_{\mathcal{E}=0} + \frac{q}{m\omega_0}\int_0^t \hat{\mathcal{E}}(t')\sin\omega_0(t - t')\,dt'. \tag{10.20}$$
This formula confirms that generally, the external field $\mathcal{E}(t)$ (in our case, the sensing field of the measurement instrument) affects the time evolution law – of course. However, Eq. (20) shows that if the field is applied only at moments $t'_n$ separated by intervals T/2, where $T \equiv 2\pi/\omega_0$ is the oscillation period, its effect on the coordinate vanishes at similarly spaced observation instants $t_n = t'_n + (m + 1/2)T$, where m is an integer. This is the idea of stroboscopic QND measurements. Of course, according to Eq. (18), even such a measurement strongly perturbs the oscillator momentum, so that even if the values $x_n$ are measured with high accuracy, the Heisenberg’s uncertainty relation is not violated.
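The stroboscopic cancellation may be seen from the classical counterpart of Eq. (20). For a very short field pulse at time t', with the time integral of the field equal to some impulse, the integral in Eq. (20) reduces to a single sine term, and the corresponding momentum shift is its time derivative multiplied by the mass. The Python sketch below is such a short-pulse idealization, with hypothetical function names and unit parameter values; it checks that pulses spaced by T/2 leave the coordinate unperturbed at the instants t' + (m + 1/2)T, while the momentum shift is not small.

```python
import math

def coordinate_shift(q, m, omega0, impulse, t_kick, t_obs):
    """Shift of x at t_obs due to a short field pulse at t_kick, from Eq. (20).

    impulse is the time integral of the field over the pulse (idealization).
    """
    if t_obs < t_kick:
        return 0.0
    return q * impulse / (m * omega0) * math.sin(omega0 * (t_obs - t_kick))

def momentum_shift(q, omega0, impulse, t_kick, t_obs):
    """Corresponding shift of p = m dx/dt at t_obs."""
    if t_obs < t_kick:
        return 0.0
    return q * impulse * math.cos(omega0 * (t_obs - t_kick))

def total_coordinate_shift(q, m, omega0, impulse, kick_times, t_obs):
    return sum(coordinate_shift(q, m, omega0, impulse, tk, t_obs) for tk in kick_times)

q, mass, omega0, impulse = 1.0, 1.0, 2.0, 1.0
period = 2.0 * math.pi / omega0
kicks = [k * period / 2.0 for k in range(4)]     # pulses spaced by T/2
t_strobe = kicks[-1] + 1.5 * period              # t' + (m + 1/2) T with m = 1

print("x shift at the stroboscopic instant:",
      total_coordinate_shift(q, mass, omega0, impulse, kicks, t_strobe))
print("p shift from the last pulse alone:  ",
      momentum_shift(q, omega0, impulse, kicks[-1], t_strobe))
print("x shift a quarter-period after one pulse:",
      coordinate_shift(q, mass, omega0, impulse, 0.0, period / 4.0))
```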
15 This simple relation is limited to 1D systems with Hamiltonians of the type (1.41), but by now the reader certainly knows enough to understand that this discussion may be readily generalized to many other systems.
16 Note in particular that the function $\sin\omega_0\tau$ (with $\tau \equiv t - t'$) under the integral, divided by $\omega_0$, is nothing more than the temporal Green’s function $G(\tau)$ of a loss-free harmonic oscillator – see, e.g., CM Sec. 5.1.
A direct implementation of the stroboscopic measurements is technically complicated, but this initial idea has opened a way to more practicable solutions. For example, it is straightforward to use the Heisenberg equations of motion to show that if the coupling of two harmonic oscillators, with coordinates x and X, and unperturbed frequencies $\omega$ and $\Omega$, is modulated in time as
$$\hat{H}_{\rm int} \propto \hat{x}\hat{X}\cos\omega t\,\cos\Omega t, \tag{10.21}$$
then the process in one of the oscillators (say, that with frequency $\Omega$) does not affect the dynamics of one of the quadrature components of the counterpart oscillator, defined by relations17
$$\hat{x}_1 = \hat{x}\cos\omega t - \frac{\hat{p}}{m\omega}\sin\omega t, \quad \hat{x}_2 = \hat{x}\sin\omega t + \frac{\hat{p}}{m\omega}\cos\omega t, \tag{10.22}$$
while this component’s motion does affect the dynamics of one of the quadrature components of the counterpart oscillator. (For the counterpart couple of quadrature components, the information transfer goes in the opposite direction.) This scheme has been successfully used for QND measurements.18
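The meaning of the quadrature components (22) is easy to check on classical trajectories: for unperturbed oscillations with frequency ω, both components stay constant in time. The Python sketch below verifies this for the classical counterpart of Eq. (22); the function names and the sample amplitude and phase are hypothetical.

```python
import math

def free_oscillation(amplitude, phase, m, omega, t):
    """Classical unperturbed oscillation: x = A cos(omega t - phi), p = m dx/dt."""
    x = amplitude * math.cos(omega * t - phase)
    p = -m * omega * amplitude * math.sin(omega * t - phase)
    return x, p

def quadratures(x, p, m, omega, t):
    """Classical counterpart of Eq. (22)."""
    x1 = x * math.cos(omega * t) - p / (m * omega) * math.sin(omega * t)
    x2 = x * math.sin(omega * t) + p / (m * omega) * math.cos(omega * t)
    return x1, x2

amplitude, phase, mass, omega = 2.0, 0.3, 1.5, 4.0
for t in (0.0, 0.4, 1.7, 10.0):
    x, p = free_oscillation(amplitude, phase, mass, omega, t)
    x1, x2 = quadratures(x, p, mass, omega, t)
    print(f"t = {t:5.1f}:  x = {x:+.4f},  x1 = {x1:+.6f},  x2 = {x2:+.6f}")
```

While x itself oscillates, the printed x1 and x2 keep the time-independent values A cos φ and A sin φ.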
Please note that the last two QND measurement examples are based on the idea of a periodic change of a certain parameter in time – either in the short-pulse form or the sinusoidal form. If the only goal of a QND measurement is a sensitive measurement of a weak classical force acting on a quantum probe system, i.e. a 1D oscillator of eigenfrequency $\omega_0$, it may be implemented much more simply – just by modulating an oscillator’s parameter with a frequency $\omega \approx 2\omega_0$. From the classical dynamics, we know that if the depth of such modulation exceeds a certain threshold value, it results in the excitation of the so-called degenerate parametric oscillations with frequency $\omega/2 \approx \omega_0$, and one of two opposite phases.19
In the language of Eq. (22), the parametric excitation means exponential growth of one of the quadrature components (with its sign depending on initial conditions), while the counterpart component is suppressed. Close to, but below the excitation threshold, the parameter modulation boosts all fluctuations of the almost-excited component, including its quantum-mechanical uncertainty, and suppresses (squeezes) those of the counterpart component. The result is a squeezed state, already discussed in Sec. 5.5 of this course (see in particular Eqs. (5.143) and Fig. 5.8), which allows one to notice the effect of an external force on the oscillator on the backdrop of a quantum uncertainty much smaller than the standard quantum limit (5.99).
In electrical engineering, this fact may be conveniently formulated in terms of the noise parameter $\Theta_N$ of a linear amplifier – essentially the tool for continuous monitoring of an input “signal” – e.g., a microwave or optical waveform.20 Namely, for “usual” (say, transistor or maser) amplifiers, which are equally sensitive to both quadrature components of the signal, $\Theta_N$ has the minimum value $\hbar\omega/2$, due to the quantum uncertainty pertinent to the quantum state of the amplifier itself (which therefore plays the role of its “quantum noise”) – the fact that was recognized in the early 1960s.21 On the other hand, a degenerate parametric amplifier, sensitive to just one quadrature component, may have $\Theta_N$ well below $\hbar\omega/2$, due to its ground state squeezing.22
17 The physical sense of these relations should be clear from Fig. 5.8: they define a system of coordinates rotating clockwise with the angular velocity equal to $\omega$, so that the point representing unperturbed classical oscillations with that frequency is at rest in this rotating frame. (The “probability cloud” representing a Glauber state is also stationary in the coordinates [$x_1$, $x_2$].) The reader familiar with the classical theory of oscillations may notice that the observables $x_1$ and $x_2$ so defined are just the Poincaré plane coordinates (“RWA variables”) – see, e.g., CM Sec. 5.3-5.6, and especially Fig. 5.9, where these coordinates are denoted as u and v.
18 The first, initially imperfect QND experiments were reported by R. Slusher et al., Phys. Rev. Lett. 55, 2409 (1985), and other groups soon after this, using nonlinear interactions of optical waves. Later, the results were much improved – see, e.g., P. Grangier et al., Nature 396, 537 (1998), and references therein. Recently, such experiments were extended to mechanical systems – see, e.g., F. Lecocq et al., Phys. Rev. X 5, 041037 (2015).
19 See, e.g., CM Sec. 5.5, and also Fig. 5.8 and its discussion in Sec. 5.6.
20 For a quantitative definition of the latter parameter, suitable for the quantum sensitivity range ($\Theta_N \sim \hbar\omega$) as well, see, e.g., I. Devyatov et al., J. Appl. Phys. 60, 1808 (1986). In the classical noise limit ($\Theta_N \gg \hbar\omega$), it coincides with $k_B T_N$, where $T_N$ is a more popular measure of electronics’ noise, called the noise temperature.
21 See, e.g., H. Haus and J. Mullen, Phys. Rev. 128, 2407 (1962).
Let me note that the parameter-modulation schemes of the QND measurements are not limited to harmonic oscillators, and may be applied to other important quantum systems, notably including two-level (i.e. spin-½-like) systems.23 Such measurements may be an important tool for the further progress of quantum computation and cryptography.24
Finally, let me mention that the composite systems consisting of a quantum subsystem, and a classical subsystem performing its continuous weakly-perturbing measurement and using its results for providing a specially crafted feedback to the quantum subsystem, may have some curious properties, in particular mock a quantum system detached from the environment.25
10.3. Hidden variables and local reality
Now we are ready to proceed to the discussion of the last, hardest group (iii) of the questions posed in Sec. 1, namely on the state of a quantum system just before its measurement. After a very important but inconclusive discussion of this issue by Albert Einstein and his collaborators on one side, and Niels Bohr on the other side, in the mid-1930s, such discussions resumed in the 1950s.26 They led to a key contribution by John Stewart Bell in the early 1960s, summarized as the so-called Bell’s inequalities, and then to experimental work on better and better verification of these inequalities. (Besides that work, the recent progress, in my humble view, has been rather marginal.)
The central question may be formulated as follows: what had been the “real” state of a quantum-mechanical system just before a virtually perfect single-shot measurement was performed on it, and gave a certain, documented outcome? To be specific, let us focus again on the example of Stern-Gerlach measurements of spin-½ particles – because of their conceptual simplicity.27 For a single-component system (in this case a single spin-½) the answer to the posed question may look evident. Indeed, as we know, if the spin is in a pure (least-uncertain) state $\alpha$, i.e. its ket-vector may be expressed in the form similar to Eq. (4),
$$|\alpha\rangle = \alpha_\uparrow |\uparrow\rangle + \alpha_\downarrow |\downarrow\rangle, \tag{10.23}$$
where, as usual, $\uparrow$ and $\downarrow$ denote the states with definite spin orientations along the z-axis, the probabilities of the corresponding outcomes of the z-oriented Stern-Gerlach experiment are $W_\uparrow = |\alpha_\uparrow|^2$ and $W_\downarrow = |\alpha_\downarrow|^2$. Then it looks natural to suggest that if a particular experiment gave the outcome corresponding to the state $\uparrow$, the spin had been in that state just before the experiment. For a classical system such an answer would be certainly correct, and the fact that the probability $W_\uparrow = |\alpha_\uparrow|^2$, defined for the statistical ensemble of all experiments (regardless of their outcome), may be less than 1, would merely reflect our ignorance about the real state of this particular system before the measurement – which just reveals the real situation.
22 See, e.g., the spectacular experiments by B. Yurke et al., Phys. Rev. Lett. 60, 764 (1988). Note also that the squeezed ground states of light are now used to improve the sensitivity of interferometers in gravitational wave detectors – see, e.g., the recent review by R. Schnabel, Phys. Repts. 684, 1 (2017), and the later paper by F. Acernese et al., Phys. Rev. Lett. 123, 231108 (2019).
23 See, e.g., D. Averin, Phys. Rev. Lett. 88, 207901 (2002).
24 See, e.g., G. Jaeger, Quantum Information: An Overview, Springer, 2006.
25 See, e.g., the monograph by H. Wiseman and G. Milburn, Quantum Measurement and Control, Cambridge U. Press (2009), more recent experiments by R. Vijay et al., Nature 490, 77 (2012), and references therein.
26 See, e.g., J. Wheeler and W. Zurek (eds.), Quantum Theory and Measurement, Princeton U. Press, 1983.
27 As was discussed in Sec. 1, the Stern-Gerlach-type experiments may be readily made virtually perfect, provided that we do not care about the evolution of the system after the single-shot measurement.
However, as was first argued in the famous EPR paper published in 1935 by A. Einstein, B. Podolsky, and N. Rosen, such an answer becomes impossible in the case of an entangled quantum system, if only one of its components is measured with an instrument. The original EPR paper discussed thought experiments with a pair of 1D particles prepared in a quantum state in which both the sum of their momenta and the difference of their coordinates simultaneously have definite values: $p_1 + p_2 = 0$, $x_1 - x_2 = a$.28 However, usually this discussion is recast into an equivalent Stern-Gerlach experiment shown in Fig. 4a.29 A source emits rare pairs of spin-½ particles, propagating in opposite directions. The particle spin states are random, but with the net spin of the pair definitely equal to zero. After the spatial separation of the particles has become sufficiently large (see below), the spin state of each of them is measured with a Stern-Gerlach detector, with one of them (in Fig. 4a, SG1) somewhat closer to the particle source, so it makes the measurement first, at a time $t_1 < t_2$.
Fig. 10.4. (a) General scheme of two-particle Stern-Gerlach experiments, and (b) the orientation of the detectors, assumed at Wigner’s derivation of Bell’s inequality (36). [Labels recoverable from the figure: particle pair source; Stern-Gerlach detectors SG1 and SG2; detector orientations a, b, c.]
First, let the detectors be oriented along the same direction, say the z-axis. Evidently, the probability of each detector to give any of the values $S_z = \pm\hbar/2$ is 50%. However, if the first detector had given the result $S_z = -\hbar/2$, then even before the second detector’s measurement, we know that the latter will give the result $S_z = +\hbar/2$ with the 100% probability. So far, this situation still allows for a classical interpretation, just as for the single-particle measurements: we may fancy that the second particle has a definite spin before the measurement, and the first measurement just removes our ignorance about that reality. In other words, the change of the probability of the outcome $S_z = +\hbar/2$ at the second detection from 50% to 100% is due to the statistical ensemble re-definition: the 50% probability of this detection belongs to the ensemble of all experiments, while the 100% probability, to the sub-ensemble of experiments with the $S_z = -\hbar/2$ outcome of the first experiment.
However, let the source generate the spin pairs in the entangled, singlet state (8.18),
1
s
,
(10.24)
12
2
28 This is possible because the corresponding operators commute: ˆ p ˆ p , ˆ x ˆ x p x p x .
1
2
1
2
ˆ , ˆ
1
1
ˆ , ˆ
2
2
0
29 Another equivalent but experimentally more convenient (and as a result, frequently used) technique is the degenerate parametric excitation of entangled optical photon pairs – see, e.g., the publications cited at the end of this section.
that certainly satisfies the above assumptions: the probability of each value of Sz of any particle is 50%, and the sum of both Sz is definitely zero, so that if the first detector’s result is Sz = –ħ/2, then the state of the remaining particle is ↑, with zero uncertainty. Now let us use Eqs. (4.123) to represent the same state (24) in a different form:
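$$ |s_{12}\rangle = \frac{1}{\sqrt{2}} \left[ \frac{1}{\sqrt{2}} \big( |{\rightarrow}\rangle + |{\leftarrow}\rangle \big) \frac{1}{\sqrt{2}} \big( |{\rightarrow}\rangle - |{\leftarrow}\rangle \big) - \frac{1}{\sqrt{2}} \big( |{\rightarrow}\rangle - |{\leftarrow}\rangle \big) \frac{1}{\sqrt{2}} \big( |{\rightarrow}\rangle + |{\leftarrow}\rangle \big) \right]. \tag{10.25} $$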
Opening the parentheses (carefully, without swapping the ket-vector order, which encodes the particle numbers!), we get an expression similar to Eq. (24), but now for the x-basis:
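$$ |s_{12}\rangle = -\frac{1}{\sqrt{2}} \left( |{\rightarrow\leftarrow}\rangle - |{\leftarrow\rightarrow}\rangle \right). \tag{10.26} $$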
Hence if we use the first detector (closest to the particle source) to measure Sx rather than Sz, then after it has given a certain result (say, Sx = –ħ/2), we know for sure, before the second particle’s spin measurement, that its Sx component definitely equals +ħ/2.
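The basis change of Eqs. (24)-(26) is easy to cross-check numerically. The following is only a minimal NumPy sketch: the helper names `pair` and `joint_probability` are illustrative (they are not part of these notes), and the x-basis kets are assumed to follow the convention of Eqs. (4.123), i.e. → = (↑ + ↓)/√2 and ← = (↑ – ↓)/√2.

```python
import numpy as np

# Single-spin kets in the z-basis: up = (1, 0), down = (0, 1).
up = np.array([1.0, 0.0])
down = np.array([0.0, 1.0])

# x-basis kets written in the z-basis (assumed convention, see the lead-in).
right = (up + down) / np.sqrt(2)
left = (up - down) / np.sqrt(2)

def pair(first, second):
    """Two-particle ket; the order of the factors encodes the particle number."""
    return np.kron(first, second)

# Singlet state in the z-basis, Eq. (24).
singlet = (pair(up, down) - pair(down, up)) / np.sqrt(2)

# The same state assembled from x-basis kets, Eq. (26).
singlet_x = -(pair(right, left) - pair(left, right)) / np.sqrt(2)

def joint_probability(state, first, second):
    """Probability to find particle 1 in `first` and particle 2 in `second`."""
    return abs(np.dot(pair(first, second), state)) ** 2

same_state = np.allclose(singlet, singlet_x)

# Opposite outcomes share all the probability; equal outcomes never occur.
p_zz_opposite = joint_probability(singlet, down, up)
p_zz_equal = joint_probability(singlet, up, up)
p_xx_opposite = joint_probability(singlet, left, right)
p_xx_equal = joint_probability(singlet, right, right)

print(same_state, p_zz_opposite, p_zz_equal, p_xx_opposite, p_xx_equal)
```

In both bases, each of the two opposite-outcome combinations carries probability ½, so once the first result is known, the second one is certain.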
So, depending on the experiment performed on the first particle, the second particle, before its measurement, may be in one of two states – either with a definite component Sz or with a definite component Sx, in each case with zero uncertainty. Evidently, this situation cannot be interpreted in classical terms if the particles do not interact during the measurements. A. Einstein was deeply unhappy with such a situation because it did not satisfy what, in his view, was the general requirement of any theory, which nowadays is called local reality. His definition of this requirement was as follows:
“The real factual situation of system 2 is independent of what is done with system 1 that is spatially separated from the former”. (Here the term “spatially separated” is not defined, but from the context, it is clear that Einstein meant the detector separation by a superluminal interval, i.e. by distance
$$ |\mathbf{r}_1 - \mathbf{r}_2| > c\,|t_1 - t_2|, \tag{10.27} $$
where the measurement time difference on the right-hand side includes the measurement duration.) In Einstein’s view, since quantum mechanics did not satisfy the local reality condition, it could not be considered a complete theory of Nature.
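To get a feeling for the scale set by the condition (27), here is a small Python sketch. The function name `is_superluminally_separated` and the sample numbers are illustrative assumptions only; the time difference is meant to include the measurement duration, as stated above.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s (exact SI value)

def is_superluminally_separated(r1, r2, t1, t2):
    """Check the condition (27): |r1 - r2| > c |t1 - t2|.

    r1, r2 are detector positions in meters (3-component sequences);
    t1, t2 are the measurement times in seconds.
    """
    return math.dist(r1, r2) > C * abs(t1 - t2)

# Hypothetical example: measurements 1 microsecond apart need detectors
# separated by more than c * 1e-6 s, i.e. roughly 300 m.
far_enough = is_superluminally_separated((0, 0, 0), (400, 0, 0), 0.0, 1e-6)
too_close = is_superluminally_separated((0, 0, 0), (200, 0, 0), 0.0, 1e-6)
print(far_enough, too_close)
```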
This situation naturally raises the question of whether something (usually called hidden variables) may be added to the quantum-mechanical description to enable it to satisfy the local reality requirement. The first definite statement in this regard was John von Neumann’s “proof”30 (first famous, then infamous :-) that such variables cannot be introduced; for a while, his work satisfied the quantum mechanics practitioners, who apparently did not pay much attention.31 A major new contribution to the problem was made only in the 1960s by J. Bell.32 First of all, he found an elementary (in his words, “foolish”) error in von Neumann’s logic, which voids his “proof”. Second, he demonstrated that Einstein’s local reality condition is incompatible with conclusions of quantum mechanics – that had been, by that time, confirmed by too many experiments to be seriously questioned.
30 In his very early book J. von Neumann, Mathematische Grundlagen der Quantenmechanik [Mathematical Foundations of Quantum Mechanics], Springer, 1932. (The first English translation was published only in 1955.)
31 Perhaps it would not satisfy A. Einstein, but reportedly he did not know about von Neumann’s publication before signing the EPR paper.
32 See, e. g., either J. Bell, Rev. Mod. Phys. 38, 447 (1966) or J. Bell, Foundations of Physics 12, 158 (1982).
Let me describe a particular version of Bell’s result (suggested by E. Wigner), using the same EPR pair experiment (Fig. 4a), in which each SG detector may be oriented in any of 3 directions: a, b, or c – see Fig. 4b. As we already know from Chapter 4, if a fully-polarized beam of spin-½ particles is passed through a Stern-Gerlach apparatus forming angle φ with the polarization axis, the probabilities of two alternative outcomes of the experiment are
$$ W(+) = \cos^2\frac{\varphi}{2}, \qquad W(-) = \sin^2\frac{\varphi}{2}. \tag{10.28} $$
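Equation (28) may be reproduced with a few lines of NumPy. This is just a sketch under stated assumptions: the beam is taken to be polarized along the z-axis, the detector axis is tilted by φ within the x-z plane, and the function names `spinor` and `outcome_probabilities` are illustrative.

```python
import numpy as np

def spinor(theta):
    """z-basis spinor of a spin-1/2 polarized at angle theta to z (x-z plane)."""
    return np.array([np.cos(theta / 2), np.sin(theta / 2)])

def outcome_probabilities(phi):
    """(W(+), W(-)) for a z-polarized beam and a detector tilted by phi."""
    beam = spinor(0.0)             # fully polarized along +z
    plus = spinor(phi)             # "spin along the detector axis"
    minus = spinor(phi + np.pi)    # "spin against the detector axis"
    return abs(plus @ beam) ** 2, abs(minus @ beam) ** 2

phi = 1.0  # an arbitrary test angle, in radians
w_plus, w_minus = outcome_probabilities(phi)
print(w_plus, np.cos(phi / 2) ** 2)
print(w_minus, np.sin(phi / 2) ** 2)
```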
Let us use this formula to calculate all joint probabilities of measurement outcomes, starting from the detectors 1 and 2 oriented, respectively, in the directions a and c. Since the angle between the negative direction of the a-axis and the positive direction of the c-axis is φ(a–, c+) = π – φ (see the dashed arrow in Fig. 4b), we get
$$ W(a_+ \wedge c_+) = W(a_+)\, W(c_+ \mid a_+) = W(a_+)\, W(+)\big|_{\varphi_{a_-,c_+}} = \frac{1}{2} \cos^2 \frac{\pi - \varphi}{2} = \frac{1}{2} \sin^2 \frac{\varphi}{2}, \tag{10.29} $$
where W(x ∧ y) is the joint probability of both outcomes x and y, while W(x | y) is the conditional probability of the outcome x, provided that the outcome y has happened. (The first equality in Eq. (29) is a well-known identity of probability theory.) Absolutely similarly,
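$$ W(c_+ \wedge b_+) = W(c_+)\, W(b_+ \mid c_+) = \frac{1}{2} \sin^2 \frac{\varphi}{2}, \tag{10.30} $$
$$ W(a_+ \wedge b_+) = W(a_+)\, W(b_+ \mid a_+) = \frac{1}{2} \cos^2 \frac{\pi - 2\varphi}{2} = \frac{1}{2} \sin^2 \varphi. \tag{10.31} $$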
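The joint probabilities (29)-(31) can also be obtained directly from the singlet state, without the conditional-probability argument. The sketch below assumes, as in Fig. 4b, that the directions a, c, and b lie in one plane at the angles 0, φ, and 2φ; the names `spinor`, `SINGLET`, and `joint_plus_plus` are illustrative.

```python
import numpy as np

def spinor(theta):
    """z-basis spinor of a spin-1/2 polarized at angle theta to z (x-z plane)."""
    return np.array([np.cos(theta / 2), np.sin(theta / 2)])

# Singlet state (24) in the two-particle z-basis (uu, ud, du, dd).
SINGLET = (np.kron([1.0, 0.0], [0.0, 1.0])
           - np.kron([0.0, 1.0], [1.0, 0.0])) / np.sqrt(2)

def joint_plus_plus(theta1, theta2):
    """W(+ at detector 1 along theta1  AND  + at detector 2 along theta2)."""
    amplitude = np.kron(spinor(theta1), spinor(theta2)) @ SINGLET
    return abs(amplitude) ** 2

phi = 0.7                          # an arbitrary test angle, in radians
a, c, b = 0.0, phi, 2 * phi        # assumed detector orientations

w_ac = joint_plus_plus(a, c)       # compare with Eq. (29)
w_cb = joint_plus_plus(c, b)       # compare with Eq. (30)
w_ab = joint_plus_plus(a, b)       # compare with Eq. (31)

print(w_ac, 0.5 * np.sin(phi / 2) ** 2)
print(w_cb, 0.5 * np.sin(phi / 2) ** 2)
print(w_ab, 0.5 * np.sin(phi) ** 2)
```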
Now note that for any angle φ smaller than π/2 (as in the case shown in Fig. 4b), trigonometry gives
$$ \frac{1}{2} \sin^2 \varphi \ge \sin^2 \frac{\varphi}{2} \equiv \frac{1}{2} \sin^2 \frac{\varphi}{2} + \frac{1}{2} \sin^2 \frac{\varphi}{2}. \tag{10.32} $$
(For example, for φ → 0 the left-hand side of this inequality tends to φ²/2, while the right-hand side, to φ²/4.) Hence the quantum-mechanical result gives, in particular,
$$ W(a_+ \wedge b_+) \ge W(a_+ \wedge c_+) + W(c_+ \wedge b_+), \quad \text{for } \varphi \le \pi/2. \tag{10.33} $$
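The trigonometric step behind Eq. (32) may be spelled out as follows. This is only a short worked check, using the double-angle identities:

```latex
\begin{align*}
\frac{1}{2}\sin^2\varphi
  &= \frac{1}{2}\left(2\sin\frac{\varphi}{2}\cos\frac{\varphi}{2}\right)^2
   = 2\sin^2\frac{\varphi}{2}\,\cos^2\frac{\varphi}{2}, \\
\frac{1}{2}\sin^2\varphi - \sin^2\frac{\varphi}{2}
  &= \sin^2\frac{\varphi}{2}\left(2\cos^2\frac{\varphi}{2} - 1\right)
   = \sin^2\frac{\varphi}{2}\,\cos\varphi \;\ge\; 0
   \quad \text{for } 0 \le \varphi \le \frac{\pi}{2}.
\end{align*}
```

The difference vanishes only at φ = 0 and φ = π/2, so inside this interval the inequality is strict.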
On the other hand, we can get a different inequality for these probabilities without calculating them from any particular theory, but using the local reality assumption. For that, let us ascribe some probability to each of 2³ = 8 possible outcomes of a set of three spin measurements. (Due to zero net spin of particle pairs, the probabilities of the sets shown in both columns of the table have to be equal.)

Detector 1    Detector 2    Probability    Contributes to
a+ b+ c+      a- b- c-      W1
a+ b+ c-      a- b- c+      W2             W(a+ ∧ c+)
a+ b- c+      a- b+ c-      W3             W(a+ ∧ b+), W(c+ ∧ b+)
a+ b- c-      a- b+ c+      W4             W(a+ ∧ c+), W(a+ ∧ b+)
a- b+ c+      a+ b- c-      W5
a- b+ c-      a+ b- c+      W6
a- b- c+      a+ b+ c-      W7             W(c+ ∧ b+)
a- b- c-      a+ b+ c+      W8
From the local-reality point of view, these measurement options are independent, so we may write (see the table above):
$$ W(a_+ \wedge c_+) = W_2 + W_4, \qquad W(c_+ \wedge b_+) = W_3 + W_7, \qquad W(a_+ \wedge b_+) = W_3 + W_4. \tag{10.34} $$
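The bookkeeping behind Eqs. (34) may be verified by enumerating the eight rows of the table. In this sketch, each row is the triple of pre-assigned detector-1 outcomes along (a, b, c), with detector 2 always giving the opposite values; the function name `contributing_rows` is an illustrative assumption.

```python
from itertools import product

# Rows 1..8 of the table, in the same order: detector-1 outcomes along (a, b, c).
rows = list(product("+-", repeat=3))

def contributing_rows(axis1, axis2):
    """1-based indices k of the W_k contributing to W(axis1+ AND axis2+).

    Detector 1 must give "+" along axis1. Detector 2 gives "+" along axis2
    exactly when the detector-1 entry for axis2 is "-" (zero net spin).
    """
    i, j = "abc".index(axis1), "abc".index(axis2)
    return [k + 1 for k, row in enumerate(rows)
            if row[i] == "+" and row[j] == "-"]

print(contributing_rows("a", "c"))  # [2, 4]
print(contributing_rows("c", "b"))  # [3, 7]
print(contributing_rows("a", "b"))  # [3, 4]
```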
On the other hand, since no probability may be negative (by its very definition), we may always write
$$ W_3 + W_4 \le (W_2 + W_4) + (W_3 + W_7). \tag{10.35} $$
Plugging into this inequality the values of these two parentheses, given by Eq. (34), we get
$$ W(a_+ \wedge b_+) \le W(a_+ \wedge c_+) + W(c_+ \wedge b_+). \tag{10.36} $$
This is Bell’s inequality, which has to be satisfied by any local-reality theory; it directly contradicts the quantum-mechanical result (33) – opening the issue to direct experimental testing. Such tests were started in the late 1960s, but the first results were vulnerable to two criticisms:
(i) The detectors were not fast enough and not far enough apart to have the relation (27) satisfied. This is why, as a matter of principle, there was a chance that information on the first measurement outcome had been transferred (by some, mostly implausible, means) to particles before the second measurement – the so-called locality loophole.
(ii) The particle/photon detection efficiencies were too low to have sufficiently small error bars for both parts of the inequality – the detection loophole.
Gradually, these loopholes have been closed.33 As expected, substantial violations of the Bell inequalities (36) (or their equivalent forms) have been demonstrated, essentially rejecting any possibility of reconciling quantum mechanics with Einstein’s local reality requirement.
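The contrast between Bell’s inequality (36) and the quantum-mechanical result (33) may be illustrated numerically. The sketch below draws many random sets of non-negative, normalized probabilities W1...W8 (a Dirichlet distribution is just one convenient, assumed way to sample them), evaluates both sides of Eq. (36) via Eqs. (34), and then compares them with the quantum-mechanical values (29)-(31) at a sample angle. All function and variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def bell_sides(w):
    """Left and right sides of Eq. (36) for w[0..7] = W1..W8, via Eqs. (34)."""
    w_ab = w[2] + w[3]   # W3 + W4
    w_ac = w[1] + w[3]   # W2 + W4
    w_cb = w[2] + w[6]   # W3 + W7
    return w_ab, w_ac + w_cb

# Local-reality side: random non-negative W1..W8 with unit sum.
samples = rng.dirichlet(np.ones(8), size=10_000)
local_ok = all(left <= right + 1e-12
               for left, right in map(bell_sides, samples))

def quantum_sides(phi):
    """Left and right sides of Eq. (36) with the values (29)-(31) plugged in."""
    return 0.5 * np.sin(phi) ** 2, np.sin(phi / 2) ** 2

q_left, q_right = quantum_sides(np.pi / 3)
print(local_ok)                 # every local-reality sample satisfies Eq. (36)
print(q_left, q_right)          # the quantum values have left > right
```

No choice of non-negative W1...W8 can push the left-hand side above the right-hand side, while the quantum-mechanical values at φ = π/3 do exactly that.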
10.4. Interpretations of quantum mechanics
The fact that quantum mechanics is incompatible with local reality makes its reconciliation with our (classically-bred) “common sense” rather challenging. Here is a brief list of the major interpretations of quantum mechanics that try to provide at least a partial reconciliation of this kind.
(i) The so-called Copenhagen interpretation – to which most physicists adhere. This “interpretation” does not really interpret anything; it just accepts the intrinsic stochasticity of measurement results in quantum mechanics, and the absence of local reality, essentially saying: “Do not worry; this is just how it is; live with it”. I generally subscribe to this school of thought, with the following qualification. While the Copenhagen interpretation implies statistical ensembles (otherwise, how would you define the probability? – see Sec. 1.3), its most frequently stated formulations34 do not put a sufficient emphasis on their role, in particular on the ensemble re-definition as the only point of a human observer’s involvement in a nearly-perfect measurement process – see Sec. 1 above. The most famous objection to the Copenhagen interpretation belongs to A. Einstein: “God does not play dice.” OK, when Einstein speaks, we all should listen, but perhaps when God speaks (through experimental results), we have to pay even more attention.
33 Important milestones on that way were the experiments by A. Aspect et al., Phys. Rev. Lett. 49, 91 (1982) and M. Rowe et al., Nature 409, 791 (2001). Detailed reviews of the experimental situation were given, for example, by M. Genovese, Phys. Repts. 413, 319 (2005) and A. Aspect, Physics 8, 123 (2015); see also the later paper by J. Handsteiner et al., Phys. Rev. Lett. 118, 060401 (2017). Presently, a high-fidelity demonstration of the Bell inequality violation has become a standard test in virtually every experiment with entangled qubits used for quantum encryption research – see Sec. 8.5, in particular the paper by J. Lin cited there.
34 With certain pleasant exceptions – see, e.g. L. Ballentine, Rev. Mod. Phys. 42, 358 (1970).
(ii) Non-local reality. After the dismissal of J. von Neumann’s “proof” by J. Bell, to the best of my knowledge, there has been no proof that hidden parameters could not be introduced, provided that they do not imply local reality. Of constructive approaches, perhaps the most notable contribution was made by David Joseph Bohm,35 who developed Louis de Broglie’s initial interpretation of the wavefunction as a “pilot wave”, making it quantitative. In the wave-mechanics version of this concept, the wavefunction, governed by the Schrödinger equation, just guides a “real”, point-like classical particle whose coordinates serve as hidden variables. However, this concept does not satisfy the notion of local reality. For example, the measurement of the particle’s coordinate at a certain point r1 has to instantly change the wavefunction everywhere in space, including the points r2 in the superluminal range (27). After A. Einstein’s private criticism, D. Bohm essentially abandoned his theory.36
(iii) The many-worlds interpretation, introduced in 1957 by Hugh Everett and popularized in the 1960s and 1970s by Bryce DeWitt. In this interpretation, all possible measurement outcomes do happen, splitting the Universe into the corresponding number of “parallel multiverses”, so that from one of them, other multiverses and hence other outcomes cannot be observed. Let me leave to the reader an estimate of the rate at which the parallel multiverses have to be constantly generated (say, per second), taking into account that such generation should take place not only in explicit lab experiments but in every irreversible process – such as fission of every atomic nucleus or an absorption/emission of every photon, everywhere in each multiverse – whether its result is formally recorded or not. Nicolaas van Kampen has called this a “mind-boggling fantasy”.37 Even the main proponent of this interpretation, B. DeWitt, has confessed: “The idea is not easy to reconcile with common sense.” I agree.
(iv) Quantum logic. In desperation, some physicists turned philosophers have decided to dismiss the formal logic we are using – in science and elsewhere. From what (admittedly, very little) I have read about this school of thought, it seems that from its point of view, definite statements like “the SG detector has found the spin to be directed along the magnetic field” should not necessarily be either true or false. OK, if we dismiss the formal logic, I do not know how we can use any scientific theory to make any predictions – until the quantum logic experts tell us what to replace it with. To the best of my knowledge, so far they have not done that. I personally trust the opinion by J. Bell, who certainly gave more thought to these issues: “It is my impression that the whole vast subject of Quantum Logic has arisen […] from the misuse of a word.”
As far as I know, none of these interpretations has yet provided a suggestion on how it might be tested experimentally to exclude the other ones. On the positive side, there is a virtual consensus that quantum mechanics makes correct (if sometimes probabilistic) predictions, which do not contradict any reliable experimental results we are aware of. Maybe, this is not that bad for a scientific theory.38
35 D. Bohm, Phys. Rev. 85, 165; 180 (1952).
36 See, e.g., Sec. 22.19 of his (generally very good) textbook D. Bohm, Quantum Theory, Dover, 1979.
37 N. van Kampen, Physica A 153, 97 (1988). By the way, I highly recommend the very reasonable summary of the quantum measurement issues, given in this paper, though I believe that the quantitative theory of dephasing, discussed in Chapter 7 of this course, might give additional clarity to some of van Kampen’s statements.
38 For the reader who is not satisfied with this “positivistic” approach, and wants to improve the situation, my earnest advice is to start not from square one, but from reading what other (including some very clever!) people thought about it. The review collection by J. Wheeler and W. Zurek, cited above, may be a good starting point.
SM: Statistical Mechanics
Konstantin K. Likharev
Essential Graduate Physics
Lecture Notes and Problems
Beta version
Open online access at
http://commons.library.stonybrook.edu/egp/
and
https://sites.google.com/site/likharevegp/
Part SM:
Statistical Mechanics
Last corrections: October 15, 2021
A version of this material was published in 2019 under the title
Statistical Mechanics: Lecture notes
IOPP, Essential Advanced Physics – Volume 7, ISBN 978-0-7503-1416-6,
with the model solutions of the exercise problems published under the title
Statistical Mechanics: Problems with solutions
IOPP, Essential Advanced Physics – Volume 8, ISBN 978-0-7503-1420-6
However, by now this online version of the lecture notes
and the problem solutions available from the author, have been better corrected
See also the author’s list
https://you.stonybrook.edu/likharev/essential-books-for-young-physicist/
of other essential reading recommended to young physicists
© K. Likharev
Table of Contents
Chapter 1. Review of Thermodynamics (24 pp.)
1.1. Introduction: Statistical physics and thermodynamics
1.2. The 2nd law of thermodynamics, entropy, and temperature
1.3. The 1st and 3rd laws of thermodynamics, and heat capacity
1.4. Thermodynamic potentials
1.5. Systems with a variable number of particles
1.6. Thermal machines
1.7. Exercise problems (16)
Chapter 2. Principles of Physical Statistics (44 pp.)
2.1. Statistical ensemble and probability
2.2. Microcanonical ensemble and distribution
2.3. Maxwell’s Demon, information, and computing
2.4. Canonical ensemble and the Gibbs distribution
2.5. Harmonic oscillator statistics
2.6. Two important applications
2.7. Grand canonical ensemble and distribution
2.8. Systems of independent particles
2.9. Exercise problems (32)
Chapter 3. Ideal and Not-so-Ideal Gases (34 pp.)
3.1. Ideal classical gas
3.2. Calculating μ
3.3. Degenerate Fermi gas
3.4. The Bose-Einstein condensation
3.5. Gases of weakly interacting particles
3.6. Exercise problems (29)
Chapter 4. Phase Transitions (36 pp.)
4.1. First order phase transitions
4.2. Continuous phase transitions
4.3. Landau’s mean-field theory
4.4. Ising model: Weiss molecular-field theory
4.5. Ising model: Exact and numerical results
4.6. Exercise problems (18)
Chapter 5. Fluctuations (44 pp.)
5.1. Characterization of fluctuations
5.2. Energy and the number of particles
5.3. Volume and temperature
5.4. Fluctuations as functions of time
5.5. Fluctuations and dissipation
5.6. The Kramers problem and the Smoluchowski equation
5.7. The Fokker-Planck equation
5.8. Back to the correlation function
5.9. Exercise problems (21)
Chapter 6. Elements of Kinetics (38 pp.)
6.1. The Liouville theorem and the Boltzmann equation
6.2. The Ohm law and the Drude formula
6.3. Electrochemical potential and drift-diffusion equation
6.4. Charge carriers in semiconductors: Statics and kinetics
6.5. Thermoelectric effects
6.6. Exercise problems (15)
* * *
Additional file (available from the author upon request):
Exercise and Test Problems with Model Solutions (131 + 23 = 154 problems; 232 pp.)
Chapter 1. Review of Thermodynamics
This chapter starts with a brief discussion of the subject of statistical physics and thermodynamics, and the relation between these two disciplines. Then I proceed to a review of the basic notions and relations of thermodynamics. Most of this material is supposed to be known to the reader from their undergraduate studies,1 so the discussion is rather brief.
1.1. Introduction: Statistical physics and thermodynamics
Statistical physics (alternatively called “statistical mechanics”) and thermodynamics are two different but related approaches to the same goal: an approximate description of the “internal”2 properties of large physical systems, notably those consisting of N >> 1 identical particles – or other components. The traditional example of such a system is a human-scale portion of gas, with the number N of atoms/molecules3 of the order of the Avogadro number N_A ~ 10²³ (see Sec. 4 below).
The motivation for the statistical approach to such systems is straightforward: even if the laws governing the dynamics of each particle and their interactions were exactly known, and we had infinite computing resources at our disposal, calculating the exact evolution of the system in time would be impossible, at least because it is completely impracticable to measure the exact initial state of each component – in the classical case, the initial position and velocity of each particle. The situation is further exacerbated by the phenomena of chaos and turbulence,4 and the quantum-mechanical uncertainty, which do not allow the exact calculation of final positions and velocities of the component particles even if their initial state is known with the best possible precision. As a result, in most situations, only statistical predictions about the behavior of such systems may be made, with probability theory becoming a major tool of the mathematical arsenal.
However, the statistical approach is not as bad as it may look. Indeed, it is almost self-evident that any measurable macroscopic variable characterizing a stationary system of N >> 1 particles as a whole (think, e.g., about the stationary pressure P of the gas contained in a fixed volume V) is almost constant in time. Indeed, as we will see below, besides certain exotic exceptions, the relative magnitude of fluctuations – either in time, or among many macroscopically similar systems – of such a variable is of the order of 1/N^(1/2), and for N ~ N_A is extremely small. As a result, the average values of appropriate macroscopic variables may characterize the state of the system quite well – satisfactorily for nearly all practical purposes. The calculation of relations between such average values is the only task of thermodynamics and the main task of statistical physics. (Fluctuations may be important, but due to their smallness, in most cases their analysis may be based on perturbative approaches – see Chapter 5.)
1 For remedial reading, I can recommend, for example (in the alphabetical order): C. Kittel and H. Kroemer, Thermal Physics, 2nd ed., W. H. Freeman (1980); F. Reif, Fundamentals of Statistical and Thermal Physics, Waveland (2008); D. V. Schroeder, Introduction to Thermal Physics, Addison Wesley (1999).
2 Here “internal” is an (admittedly loose) term meaning all the physics unrelated to the motion of the system as a whole. The most important example of internal dynamics is the thermal motion of atoms and molecules.
3 This is perhaps my best chance for a reverent mention of Democritus (circa 460-370 BC) – the Ancient Greek genius who was apparently the first one to conjecture the atomic structure of matter.
4 See, e.g., CM Chapters 8 and 9.
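The 1/N^(1/2) scaling of relative fluctuations, mentioned in Sec. 1.1 above, is easy to see in a toy numerical experiment. The model below is purely an illustrative assumption (independent particles whose energies are exponentially distributed with unit mean), and the function name `relative_fluctuation` is hypothetical; it is a sketch, not a model of any particular gas.

```python
import numpy as np

rng = np.random.default_rng(1)

def relative_fluctuation(n_particles, n_systems=2000):
    """Relative r.m.s. fluctuation of the total energy of N independent particles.

    Each row of `energies` is one member of an ensemble of macroscopically
    similar systems; each entry is a single-particle energy (toy model).
    """
    energies = rng.exponential(scale=1.0, size=(n_systems, n_particles))
    totals = energies.sum(axis=1)
    return totals.std() / totals.mean()

for n in (100, 2500):
    # Second column: simulated value; third column: the 1/sqrt(N) estimate.
    print(n, relative_fluctuation(n), 1 / np.sqrt(n))
```

Already for N of a few thousand, the relative fluctuation is at the percent level; extrapolating the same scaling to N ~ 10²³ gives values of the order of 10⁻¹¹–10⁻¹².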